Commit fa210906 authored by Dylan Griffith's avatar Dylan Griffith

Fix ElasticDeleteProjectWorker delete commits, index_status

This `ElasticDeleteProjectWorker` is a sidekiq worker that is
responsible for clearing project data out of our Elasticsearch index. It
does this using a [Delete by
query](
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
)
which attempts to find all associated (child records) for the project
and delete them as well as the project

This MR fixes 2 problems with `ElasticDeleteProjectWorker`:

1. Previously this worker was not correctly deleting commits as they did
not have a `project_id`
https://gitlab.com/gitlab-org/gitlab/-/issues/327829. We do have a
`commit.rid` which is the `project_id` set so we need to use this in our
`delete_by_query`
2. We had another weird bug
https://gitlab.com/gitlab-org/gitlab/-/issues/259721 where we were not
correctly clearing `index_status` after running this worker. When this
worker was created this made sense because it was only used when a
project was deleted. But on GitLab.com we also use this when a project
subscription expires. Therefore any projects that had expired
subscriptions then later renewed were considered to already be indexed
based on index_status which caused most of the repository to be missed.
parent 11c635a8
......@@ -12,6 +12,7 @@ class ElasticDeleteProjectWorker
def perform(project_id, es_id)
remove_project_and_children_documents(project_id, es_id)
IndexStatus.for_project(project_id).delete_all
end
private
......@@ -50,6 +51,12 @@ class ElasticDeleteProjectWorker
project_id: project_id
}
},
{
term: {
# We never set `project_id` for commits instead they have a nested rid which is the project_id
"commit.rid" => project_id
}
},
{
term: {
target_project_id: project_id # handle merge_request which aliases project_id to target_project_id
......
---
title: Fix ElasticDeleteProjectWorker delete commits, index_status
merge_request: 59457
author:
type: fixed
......@@ -21,21 +21,25 @@ RSpec.describe ElasticDeleteProjectWorker, :elastic, :sidekiq_inline do
end
shared_examples 'delete project and objects' do
it 'deletes a project with all nested objects' do
it 'deletes a project with all nested objects and clears the index_status' do
project = create :project, :repository
issue = create :issue, project: project
milestone = create :milestone, project: project
note = create :note, project: project
merge_request = create :merge_request, target_project: project, source_project: project
ElasticCommitIndexerWorker.new.perform(project.id)
ensure_elasticsearch_index!
## All database objects + data from repository. The absolute value does not matter
expect(project.reload.index_status).not_to be_nil
expect(Project.elastic_search('*', **search_options).records).to include(project)
expect(Issue.elastic_search('*', **search_options).records).to include(issue)
expect(Milestone.elastic_search('*', **search_options).records).to include(milestone)
expect(Note.elastic_search('*', **search_options).records).to include(note)
expect(MergeRequest.elastic_search('*', **search_options).records).to include(merge_request)
expect(Repository.elastic_search('*', **search_options)[:blobs][:results].response).not_to be_empty
expect(Repository.find_commits_by_message_with_elastic('*').count).to be > 0
subject.perform(project.id, project.es_id)
......@@ -46,6 +50,8 @@ RSpec.describe ElasticDeleteProjectWorker, :elastic, :sidekiq_inline do
expect(Milestone.elastic_search('*', **search_options).total_count).to be(0)
expect(Note.elastic_search('*', **search_options).total_count).to be(0)
expect(MergeRequest.elastic_search('*', **search_options).total_count).to be(0)
expect(Repository.elastic_search('*', **search_options)[:blobs][:results].response).to be_empty
expect(Repository.find_commits_by_message_with_elastic('*').count).to be(0)
# verify that entire index is empty
# searches use joins on the parent record (project)
......@@ -53,6 +59,8 @@ RSpec.describe ElasticDeleteProjectWorker, :elastic, :sidekiq_inline do
helper = Gitlab::Elastic::Helper.default
expect(helper.documents_count).to be(0)
expect(project.reload.index_status).to be_nil
end
end
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment