bcachefs: flush journal to avoid invalid dev usage entries on recovery

A crash immediately after device removal can result in an unmountable
filesystem due to recovery failure. The following command sequence
reliably reproduces the problem on a multi-device fs:

  bcachefs device remove <dev> && xfs_io -xc shutdown <mnt>

The post-crash mount fails with an error similar to the following,
reported by fsck:

  invalid journal entry dev_usage at offset 7994/8034 seq 12: bad dev, fixing

This refers to a device usage entry in the journal that references the
index of the just-removed device. Recovery considers this an invalid
entry and fails to proceed.

Device usage entries are added to journal buffer writes via
bch2_journal_write() -> bch2_journal_super_entries_add_common(), which
means every journal buffer write carries content that refers to the
member devices present at the time of the write.

The device remove sequence already removes metadata references to the
device being removed. It then flushes any pins that refer to the
device, clears replica entries, removes the in-memory device object and
lastly updates the superblock to reflect that the device is no longer
present. The problem is that any journal write that occurs during this
sequence includes a dev usage entry for the device for as long as the
device is still present in memory.

To avoid this problem, flush the journal once more after the device
entry is removed from the in-core structures, but before the superblock
is updated to fully remove the device on-disk.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
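
The ordering argument above can be illustrated with a small toy model. This is only a sketch and not bcachefs code: every name in it (toy_fs, toy_journal_write(), toy_journal_flush(), toy_recovery_ok(), toy_remove_then_crash()) is hypothetical. It also assumes, as a simplification, that a journal flush makes all earlier journal writes unnecessary for recovery.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_DEVS	4
#define TOY_MAX_WRITES	16

/* Hypothetical toy state; these are NOT the bcachefs data structures. */
struct toy_fs {
	bool		in_core[TOY_MAX_DEVS];	/* device object exists in memory */
	bool		in_sb[TOY_MAX_DEVS];	/* device listed in the superblock */
	unsigned	journal[TOY_MAX_WRITES];/* per write: bitmask of dev usage entries */
	int		nr_writes;
	int		first_live;		/* oldest write recovery still reads */
};

/* Every journal write carries a usage entry for each in-core device. */
static void toy_journal_write(struct toy_fs *fs)
{
	unsigned mask = 0;

	assert(fs->nr_writes < TOY_MAX_WRITES);
	for (int i = 0; i < TOY_MAX_DEVS; i++)
		if (fs->in_core[i])
			mask |= 1U << i;
	fs->journal[fs->nr_writes++] = mask;
}

/*
 * Assumption of this model: a flush issues a new write and makes all
 * older writes unnecessary for recovery.
 */
static void toy_journal_flush(struct toy_fs *fs)
{
	toy_journal_write(fs);
	fs->first_live = fs->nr_writes - 1;
}

/* Recovery rejects a usage entry for a device the superblock does not list. */
static bool toy_recovery_ok(const struct toy_fs *fs)
{
	for (int w = fs->first_live; w < fs->nr_writes; w++)
		for (int i = 0; i < TOY_MAX_DEVS; i++)
			if ((fs->journal[w] & (1U << i)) && !fs->in_sb[i])
				return false;
	return true;
}

/* Remove device 1 from a two-device fs, crash, then attempt recovery. */
static bool toy_remove_then_crash(bool extra_flush)
{
	struct toy_fs fs = {
		.in_core = { true, true },
		.in_sb	 = { true, true },
	};
	const int dev = 1;

	toy_journal_flush(&fs);		/* flush while the device is still present */
	fs.in_core[dev] = false;	/* drop the in-memory device object */
	if (extra_flush)
		toy_journal_flush(&fs);	/* the additional flush described above */
	fs.in_sb[dev] = false;		/* superblock no longer lists the device */
	/* crash happens here */
	return toy_recovery_ok(&fs);
}
```

In this model, toy_remove_then_crash(false) returns false because the last live journal write still carries a usage entry for the removed device, while toy_remove_then_crash(true) returns true because the extra flush replaces it with a write that only references the remaining device.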