Commit cadea9ca authored by Douwe Maan's avatar Douwe Maan Committed by Ruben Davila

Merge branch '26881-backup-fails-if-data-changes' into 'master'

Copy data before compression to prevent 'file changed as we read it'

Closes #26881

See merge request !8728
parent 88eab515
---
title: Add `copy` backup strategy to combat file changed errors
merge_request: 8728
author:
...@@ -84,6 +84,28 @@ Deleting tmp directories...[DONE] ...@@ -84,6 +84,28 @@ Deleting tmp directories...[DONE]
Deleting old backups... [SKIPPING] Deleting old backups... [SKIPPING]
``` ```
## Backup Strategy Option
> **Note:** Introduced as an option in 8.17
The default backup strategy is to essentially stream data from the respective
data locations to the backup using the Linux command `tar` and `gzip`. This works
fine in most cases, but can cause problems when data is rapidly changing.
When data changes while `tar` is reading it, the error `file changed as we read
it` may occur, and will cause the backup process to fail. To combat this, 8.17
introduces a new backup strategy called `copy`. The strategy copies data files
to a temporary location before calling `tar` and `gzip`, avoiding the error.
A side-effect is that the backup process with take up to an additional 1X disk
space. The process does its best to clean up the temporary files at each stage
so the problem doesn't compound, but it could be a considerable change for large
installations. This is why the `copy` strategy is not the default in 8.17.
To use the `copy` strategy instead of the default streaming strategy, specify
`STRATEGY=copy` in the Rake task command. For example,
`sudo gitlab-rake gitlab:backup:create STRATEGY=copy`.
## Exclude specific directories from the backup ## Exclude specific directories from the backup
You can choose what should be backed up by adding the environment variable `SKIP`. You can choose what should be backed up by adding the environment variable `SKIP`.
......
...@@ -8,6 +8,7 @@ module Backup ...@@ -8,6 +8,7 @@ module Backup
@name = name @name = name
@app_files_dir = File.realpath(app_files_dir) @app_files_dir = File.realpath(app_files_dir)
@files_parent_dir = File.realpath(File.join(@app_files_dir, '..')) @files_parent_dir = File.realpath(File.join(@app_files_dir, '..'))
@backup_files_dir = File.join(Gitlab.config.backup.path, File.basename(@app_files_dir) )
@backup_tarball = File.join(Gitlab.config.backup.path, name + '.tar.gz') @backup_tarball = File.join(Gitlab.config.backup.path, name + '.tar.gz')
end end
...@@ -15,8 +16,22 @@ module Backup ...@@ -15,8 +16,22 @@ module Backup
def dump def dump
FileUtils.mkdir_p(Gitlab.config.backup.path) FileUtils.mkdir_p(Gitlab.config.backup.path)
FileUtils.rm_f(backup_tarball) FileUtils.rm_f(backup_tarball)
if ENV['STRATEGY'] == 'copy'
cmd = %W(cp -a #{app_files_dir} #{Gitlab.config.backup.path})
output, status = Gitlab::Popen.popen(cmd)
unless status.zero?
puts output
abort 'Backup failed'
end
run_pipeline!([%W(tar -C #{@backup_files_dir} -cf - .), %W(gzip -c -1)], out: [backup_tarball, 'w', 0600])
FileUtils.rm_rf(@backup_files_dir)
else
run_pipeline!([%W(tar -C #{app_files_dir} -cf - .), %W(gzip -c -1)], out: [backup_tarball, 'w', 0600]) run_pipeline!([%W(tar -C #{app_files_dir} -cf - .), %W(gzip -c -1)], out: [backup_tarball, 'w', 0600])
end end
end
def restore def restore
backup_existing_files_dir backup_existing_files_dir
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment