Fix MR commits with missing committers/authors
In MR https://gitlab.com/gitlab-org/gitlab/-/merge_requests/63669 we introduced a new data format for storing merge request diff commit authors and committers. As part of this work we changed the import/export code to support this new format, and added a set of migrations to migrate existing data to it. At that time we supported reading and writing data in both the old and new format, allowing us to gradually migrate data over to the new format.

In https://gitlab.com/gitlab-org/gitlab/-/merge_requests/72219 we ensured all migrations were done, stopped using the old data format, and removed the columns storing this data.

Unfortunately, this chain of events uncovered a bug in our import/export logic. Consider the following timeline of events:

1. You export project "Cooking Recipes" from a GitLab instance running a version earlier than 14.1 (e.g. 14.0).
2. The instance you intend to import this project into is running 14.1 or newer. Existing data has been fully migrated already.
3. You import the project into this new instance.

At this point, the imported data is using the old format, not the new format. This is because we forgot to take into account users importing exports generated by GitLab 14.0 or older, instead only covering exports generated using GitLab 14.1 or newer. Because the background migrations had finished, or the imported data would fall in a "bucket" (= a chunk of rows to migrate) that had already been migrated, the data would never be updated to the new format.

In this commit we resolve this problem in two steps. First, we change the import/export logic to support importing data in both the old and new format. Exports still use the new format. Second, we include a background migration that processes all projects created using a GitLab import/export since the first mentioned merge request was introduced. For each such project we scan over the merge request diff commits and fix any that are missing the commit author or committer details.
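The first step (accepting both formats on import) could look roughly like the sketch below. This is an illustration only, not the code from this commit: the field names `author_name`, `author_email`, `committer_name`, `committer_email` (old format) and the nested `commit_author` / `committer` objects (new format) are assumptions about the export layout, and `normalize_diff_commit` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def _person(name: Optional[str], email: Optional[str]) -> Optional[Dict[str, Any]]:
    """Build a {name, email} record, or None when both values are missing."""
    if not name and not email:
        return None
    return {"name": name, "email": email}


def normalize_diff_commit(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an exported diff commit in the (assumed) new format.

    Old-format entries carry flat author/committer fields; new-format
    entries carry nested objects. Entries already in the new format are
    returned with their nested objects untouched.
    """
    commit = dict(raw)  # shallow copy so the caller's data is not mutated

    # Always strip the flat old-format fields from the result.
    old_author = _person(commit.pop("author_name", None),
                         commit.pop("author_email", None))
    old_committer = _person(commit.pop("committer_name", None),
                            commit.pop("committer_email", None))

    # Prefer the new-format objects; fall back to the old flat fields.
    if not commit.get("commit_author"):
        commit["commit_author"] = old_author
    if not commit.get("committer"):
        commit["committer"] = old_committer
    return commit
```

Preferring the nested objects when both are present means an export from 14.1 or newer passes through unchanged, while an export from 14.0 or older is converted on the way in.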

For small self-hosted instances this process is unlikely to take more than a few minutes. On GitLab.com, however, we expect this process to take a few days, as we have to process around 200 000 projects imported since July. This means we'll likely need manual intervention similar to the manual work needed for https://gitlab.com/gitlab-org/gitlab/-/issues/334394.

See https://gitlab.com/gitlab-org/gitlab/-/issues/344080 for additional details.

Changelog: fixed