xfs: repair cannot update the summary counters when logging quota flags
While running xfs/804 (quota repairs racing with fsstress), I observed a filesystem shutdown in the primary sb write verifier: run fstests xfs/804 at 2022-05-23 18:43:48 XFS (sda4): Mounting V5 Filesystem XFS (sda4): Ending clean mount XFS (sda4): Quotacheck needed: Please wait. XFS (sda4): Quotacheck: Done. XFS (sda4): EXPERIMENTAL online scrub feature in use. Use at your own risk! XFS (sda4): SB ifree sanity check failed 0xb5 > 0x80 XFS (sda4): Metadata corruption detected at xfs_sb_write_verify+0x5e/0x100 [xfs], xfs_sb block 0x0 XFS (sda4): Unmount and run xfs_repair The "SB ifree sanity check failed" message was a debugging printk that I added to the kernel; observe that 0xb5 - 0x80 = 53, which is less than one inode chunk. I traced this to the xfs_log_sb calls from the online quota repair code, which tries to clear the CHKD flags from the superblock to force a mount-time quotacheck if the repair fails. On a V5 filesystem, xfs_log_sb updates the ondisk sb summary counters with the current contents of the percpu counters. This is done without quiescing other writer threads, which means it could be racing with a thread that has updated icount and is about to update ifree. If the other write thread had incremented ifree before updating icount, the repair thread will write icount > ifree into the logged update. If the AIL writes the logged superblock back to disk before anyone else fixes this siutation, this will lead to a write verifier failure, which causes a filesystem shutdown. Resolve this problem by updating the quota flags and calling xfs_sb_to_disk directly, which does not touch the percpu counters. While we're at it, we can elide the entire update if the selected qflags aren't set. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Showing
Please register or sign in to comment