Commit bc652905 authored by Brian Foster, committed by Kent Overstreet

bcachefs: flush journal to avoid invalid dev usage entries on recovery

A crash immediately after device removal can result in an
unmountable filesystem due to recovery failure. The following
command reliably reproduces on a multi-device fs:

  bcachefs device remove <dev> && xfs_io -xc shutdown <mnt>

The post-crash mount fails with an error similar to the following,
reported by fsck:

  invalid journal entry dev_usage at offset 7994/8034 seq 12: bad dev, fixing

The message points at a device usage entry in the journal that
references the index of the just-removed device. Recovery considers
this an invalid entry and fails to proceed.
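
As a rough illustration of the check that trips here, the sketch below
models a validity test for one dev usage entry against the member table.
Every name in it (toy_sb, toy_dev_usage, toy_dev_usage_valid, TOY_MAX_DEVS)
is hypothetical and chosen for this example only; it is not the bcachefs
validation code.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS 8	/* hypothetical limit for this sketch */

/* Hypothetical stand-in for the superblock's member table. */
struct toy_sb {
	bool member_present[TOY_MAX_DEVS];
};

/* Hypothetical stand-in for one dev_usage journal entry. */
struct toy_dev_usage {
	size_t dev_idx;
};

/*
 * An entry is only acceptable if its device index names a member that the
 * superblock still knows about. An entry written while the device existed
 * becomes invalid once the member slot has been freed.
 */
static bool toy_dev_usage_valid(const struct toy_sb *sb,
				const struct toy_dev_usage *u)
{
	return u->dev_idx < TOY_MAX_DEVS && sb->member_present[u->dev_idx];
}
```

In this model, an entry for a device index whose member slot has been
cleared is rejected in the same way as an out-of-range index.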

Device usage entries are added to journal buffer writes via
bch2_journal_write() -> bch2_journal_super_entries_add_common(),
which means every journal buffer write carries usage entries for
whichever member devices exist at the time of the write.

The device remove sequence already removes metadata references to
the device being removed. It then flushes any pins that refer to the
device, clears replica entries, removes the in-memory device object
and lastly updates the superblock to reflect that the device is no
longer present. The problem is that any journal writes that occur
during this sequence will include a dev usage entry so long as the
device is present. To avoid this problem, we can flush the journal
once more after the device entry is removed from the in-core
structures, but before the superblock is updated to fully remove the
device on-disk.
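
The ordering argument can be sketched with a toy model. This is a
simplified illustration under stated assumptions, not the bcachefs
implementation: all toy_* names are hypothetical, a flush is modelled as
"issue one fresh write, then retire every older write", and a crash is
modelled as running recovery immediately after the superblock update.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS	4	/* hypothetical limits for this sketch */
#define TOY_MAX_WRITES	16

struct toy_fs {
	bool in_core[TOY_MAX_DEVS];	/* in-memory device objects */
	bool on_disk[TOY_MAX_DEVS];	/* superblock member slots */
	/* usage[w][i]: journal write w carries a usage entry for device i */
	bool usage[TOY_MAX_WRITES][TOY_MAX_DEVS];
	size_t nr_writes;		/* writes issued so far */
	size_t first_live;		/* oldest write recovery still replays */
};

/* Each write records usage for every device present in-core right now. */
static bool toy_journal_write(struct toy_fs *fs)
{
	if (fs->nr_writes == TOY_MAX_WRITES)
		return false;
	for (size_t i = 0; i < TOY_MAX_DEVS; i++)
		fs->usage[fs->nr_writes][i] = fs->in_core[i];
	fs->nr_writes++;
	return true;
}

/* Assumed flush model: a fresh write lands, then older writes are retired. */
static void toy_journal_flush_all_pins(struct toy_fs *fs)
{
	if (toy_journal_write(fs))
		fs->first_live = fs->nr_writes - 1;
}

/* Recovery fails if any live write names a device the superblock lacks. */
static bool toy_recovery_ok(const struct toy_fs *fs)
{
	for (size_t w = fs->first_live; w < fs->nr_writes; w++)
		for (size_t i = 0; i < TOY_MAX_DEVS; i++)
			if (fs->usage[w][i] && !fs->on_disk[i])
				return false;
	return true;
}

/* Returns whether a crash right after the superblock update is recoverable. */
static bool toy_dev_remove(struct toy_fs *fs, size_t idx, bool flush_after_free)
{
	toy_journal_flush_all_pins(fs);		/* earlier flush, device still present */
	fs->in_core[idx] = false;		/* device object removed in-core */
	if (flush_after_free)
		toy_journal_flush_all_pins(fs);	/* the extra flush described above */
	fs->on_disk[idx] = false;		/* superblock drops the member */
	return toy_recovery_ok(fs);
}
```

With two devices present, removing device 1 without the extra flush leaves
a live write that still names index 1, so the modelled recovery fails. With
the extra flush, the only live write was issued after the in-core removal
and recovery succeeds.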

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
parent d14bfd10
@@ -1521,6 +1521,17 @@ int bch2_dev_remove(struct bch_fs *c, struct bch_dev *ca, int flags)
 	bch2_dev_free(ca);
+	/*
+	 * At this point the device object has been removed in-core, but the
+	 * on-disk journal might still refer to the device index via sb device
+	 * usage entries. Recovery fails if it sees usage information for an
+	 * invalid device. Flush journal pins to push the back of the journal
+	 * past now invalid device index references before we update the
+	 * superblock, but after the device object has been removed so any
+	 * further journal writes elide usage info for the device.
+	 */
+	bch2_journal_flush_all_pins(&c->journal);
 	/*
 	 * Free this device's slot in the bch_member array - all pointers to
 	 * this device must be gone:
......