1. 19 May, 2020 2 commits
    • Darrick J. Wong's avatar
      xfs: use ordered buffers to initialize dquot buffers during quotacheck · 78bba5c8
      Darrick J. Wong authored
      While QAing the new xfs_repair quotacheck code, I uncovered a quota
      corruption bug resulting from a bad interaction between dquot buffer
      initialization and quotacheck.  The bug can be reproduced with the
      following sequence:
      
      # mkfs.xfs -f /dev/sdf
      # mount /dev/sdf /opt -o usrquota
      # su nobody -s /bin/bash -c 'touch /opt/barf'
      # sync
      # xfs_quota -x -c 'report -ahi' /opt
      User quota on /opt (/dev/sdf)
                              Inodes
      User ID      Used   Soft   Hard Warn/Grace
      ---------- ---------------------------------
      root            3      0      0  00 [------]
      nobody          1      0      0  00 [------]
      
      # xfs_io -x -c 'shutdown' /opt
      # umount /opt
      # mount /dev/sdf /opt -o usrquota
      # touch /opt/man2
      # xfs_quota -x -c 'report -ahi' /opt
      User quota on /opt (/dev/sdf)
                              Inodes
      User ID      Used   Soft   Hard Warn/Grace
      ---------- ---------------------------------
      root            1      0      0  00 [------]
      nobody          1      0      0  00 [------]
      
      # umount /opt
      
      Notice how the initial quotacheck set the root dquot icount to 3
      (rootino, rbmino, rsumino), but after shutdown -> remount -> recovery,
      xfs_quota reports that the root dquot has only 1 icount.  We haven't
      deleted anything from the filesystem, which means that quota is now
      under-counting.  This behavior is not limited to icount or the root
      dquot, but this is the shortest reproducer.
      
      I traced the cause of this discrepancy to the way that we handle ondisk
      dquot updates during quotacheck vs. regular fs activity.  Normally, when
      we allocate a disk block for a dquot, we log the buffer as a regular
      (dquot) buffer.  Subsequent updates to the dquots backed by that block
      are done via separate dquot log item updates, which means that they
      depend on the logged buffer update being written to disk before the
      dquot items.  Because individual dquots have their own LSN fields, that
      initial dquot buffer must always be recovered.
      
      However, the story changes for quotacheck, which can cause dquot block
      allocations but persists the final dquot counter values via a delwri
      list.  Because recovery doesn't gate dquot buffer replay on an LSN, this
      means that the initial dquot buffer can be replayed over the (newer)
      contents that were delwritten at the end of quotacheck.  In effect, this
      re-initializes the dquot counters after they've been updated.  If the
      log does not contain any other dquot items to recover, the obsolete
      dquot contents will not be corrected by log recovery.
      
      Because quotacheck uses a transaction to log the setting of the CHKD
      flags in the superblock, we skip quotacheck during the second mount
      call, which allows the incorrect icount to remain.
      
      Fix this by changing the ondisk dquot initialization function to use
      ordered buffers to write out fresh dquot blocks if it detects that we're
      running quotacheck.  If the system goes down before quotacheck can
      complete, the CHKD flags will not be set in the superblock and the next
      mount will run quotacheck again, which can fix uninitialized dquot
      buffers.  This requires amending the defer code to maintaine ordered
      buffer state across defer rolls for the sake of the dquot allocation
      code.
      
      For regular operations we preserve the current behavior since the dquot
      items require properly initialized ondisk dquot records.
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      78bba5c8
    • Brian Foster's avatar
      xfs: don't fail verifier on empty attr3 leaf block · f28cef9e
      Brian Foster authored
      The attr fork can transition from shortform to leaf format while
      empty if the first xattr doesn't fit in shortform. While this empty
      leaf block state is intended to be transient, it is technically not
      due to the transactional implementation of the xattr set operation.
      
      We historically have a couple of bandaids to work around this
      problem. The first is to hold the buffer after the format conversion
      to prevent premature writeback of the empty leaf buffer and the
      second is to bypass the xattr count check in the verifier during
      recovery. The latter assumes that the xattr set is also in the log
      and will be recovered into the buffer soon after the empty leaf
      buffer is reconstructed. This is not guaranteed, however.
      
      If the filesystem crashes after the format conversion but before the
      xattr set that induced it, only the format conversion may exist in
      the log. When recovered, this creates a latent corrupted state on
      the inode as any subsequent attempts to read the buffer fail due to
      verifier failure. This includes further attempts to set xattrs on
      the inode or attempts to destroy the attr fork, which prevents the
      inode from ever being removed from the unlinked list.
      
      To avoid this condition, accept that an empty attr leaf block is a
      valid state and remove the count check from the verifier. This means
      that on rare occasions an attr fork might exist in an unexpected
      state, but is otherwise consistent and functional. Note that we
      retain the logic to avoid racing with metadata writeback to reduce
      the window where this can occur.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      f28cef9e
  2. 13 May, 2020 3 commits
  3. 08 May, 2020 27 commits
  4. 07 May, 2020 8 commits