Improve cleanup of gpg-homedirs
The `gpg-agent` that could have been spawned here dies when it sees its socket disappear. However, we sometimes seem to fail to delete the homedir, causing the `gpg-agent` to live on forever.

We noticed the deletion failing in http://gitlab.com/gitlab-org/gitlab-foss/issues/36998: there was a race condition during the deletion where `gpg-agent` would still be modifying files after we had already called `FileUtils.remove_entry`.

This change attempts to delete the directory multiple times, at least 0.1 seconds apart. It is a naive way of making sure we clean up the homedir; we then count on `gpg-agent` to notice that and exit.

On a web node we'll attempt the cleanup for at most 0.5 seconds before failing. In a sidekiq process we'll attempt the deletion for up to 2 seconds.

When the cleanup fails, we now track that exception in Sentry to gain some visibility.

This also adds counters for the creation and deletion of tmp keychains, which we should be able to correlate with the number of zombie `gpg-agent` processes.
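The retry behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the code from this change: the method name `remove_dir_with_retries` and its `max_wait:` and `interval:` parameters are hypothetical, and rescuing `SystemCallError` is an assumption about which failures the race produces.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (name and parameters are illustrative assumptions).
# Tries to delete `dir`, sleeping `interval` seconds between attempts, and
# gives up once another attempt would run past `max_wait` seconds.
# Returns the number of attempts made; re-raises the last error on failure.
def remove_dir_with_retries(dir, max_wait:, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + max_wait
  attempts = 0

  begin
    attempts += 1
    # Without force, remove_entry raises (e.g. Errno::ENOTEMPTY) if another
    # process writes into the directory while it is being removed.
    FileUtils.remove_entry(dir) if File.exist?(dir)
    attempts
  rescue SystemCallError
    # Out of time: let the caller see (and report) the failure.
    raise if Process.clock_gettime(Process::CLOCK_MONOTONIC) + interval > deadline

    sleep(interval)
    retry
  end
end
```

Under these assumptions, a web request would call it with `max_wait: 0.5` and a sidekiq job with `max_wait: 2`; the caller would rescue the final error and report it to Sentry.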