1. 24 Mar, 2018 10 commits
    • xfs: remove xfs_buf parameter from inode scrub methods · 7e56d9ea
      Darrick J. Wong authored
      Now that we no longer do raw inode buffer scrubbing, the bp parameter is
      no longer used anywhere we're dealing with an inode, so remove it and
      all the useless NULL parameters that go with it.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: inode scrubber shouldn't bother with raw checks · d0018ad8
      Darrick J. Wong authored
      The inode scrubber tries to _iget the inode prior to running checks.
      If that _iget call fails with corruption errors that's an automatic
      fail, regardless of whether it was the inode buffer read verifier,
      the ifork verifier, or the ifork formatter that errored out.
      
      Therefore, get rid of the raw mode scrub code because it's not needed.
      Found by trying to fix some test failures in xfs/379 and xfs/415.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: bmap scrubber should do rmap xref with bmap for sparse files · 5e777b62
      Darrick J. Wong authored
      When we're scanning an extent mapping inode fork, ensure that every rmap
      record for this ifork has a corresponding bmbt record too.  This
      (mostly) provides the ability to cross-reference rmap records with bmap
      data.  The rmap scrubber cannot do the xref on its own because that
      requires taking an ilock with the agf lock held, which violates our
      locking order rules (inode, then agf).
      
      Note that, due to the increased complexity, we only do this for forks
      that are in btree format, or for forks that should have data but
      suspiciously have zero extents. The latter case matters because the
      inode could have just had its iforks zapped by the inode repair code,
      and now we need to reclaim the old extents.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: refactor inode buffer verifier error logging · 6edb1810
      Darrick J. Wong authored
      When the inode buffer verifier encounters an error, it's much more
      helpful to print a buffer from the offending inode instead of just the
      start of the inode chunk buffer.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: refactor inode verifier error logging · 90a58f95
      Darrick J. Wong authored
      Refactor some of the inode verifier failure logging call sites to use
      the new xfs_inode_verifier_error method which dumps the offending buffer
      as well as the code location of the failed check.  This trims the
      output, makes it clearer to the admin that repair must be run, and gives
      the developers more details to work from.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: refactor bmap record validation · 30b0984d
      Darrick J. Wong authored
      Refactor the bmap validator into a more complete helper that looks for
      extents that run off the end of the device, overflow into the next AG,
      or have invalid flag states.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: sanity-check the unused space before trying to use it · 6915ef35
      Darrick J. Wong authored
      In xfs_dir2_data_use_free, we examine on-disk metadata and ASSERT if
      it doesn't make sense.  Since a carefully crafted fuzzed image can cause
      the kernel to crash after blowing a bunch of assertions, let's move
      those checks into a validator function and rig everything up to return
      EFSCORRUPTED to userspace.  Found by lastbit fuzzing ltail.bestcount via
      xfs/391.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: detect agfl count corruption and reset agfl · a27ba260
      Brian Foster authored
      The struct xfs_agfl v5 header was originally introduced with
      unexpected padding that caused the AGFL to operate with one less
      slot than intended. The header has since been packed, but the fix
      left an incompatibility for users who upgrade from an old kernel
      with the unpacked header to a newer kernel with the packed header
      while the AGFL happens to wrap around the end. The newer kernel
      recognizes one extra slot at the physical end of the AGFL that the
      previous kernel did not. The new kernel will eventually attempt to
      allocate a block from that slot, which contains invalid data, and
      cause a crash.
      
      This condition can be detected by comparing the active range of the
      AGFL to the count. While this detects a padding mismatch, it can
      also trigger false positives for unrelated flcount corruption. Since
      we cannot distinguish a size mismatch due to padding from unrelated
      corruption, we can't trust the AGFL enough to simply repopulate the
      empty slot.
      
      Instead, avoid unnecessarily complex detection logic and use a
      solution that can handle any form of flcount corruption that slips
      through read verifiers: distrust the entire AGFL and reset it to an
      empty state. Any valid blocks within the AGFL are intentionally
      leaked. This requires xfs_repair to rectify (which was already
      necessary based on the state the AGFL was found in). The reset
      mitigates the side effect of the padding mismatch problem from a
      filesystem crash to a free space accounting inconsistency. The
      generic approach also means that this patch can be safely backported
      to kernels with or without a packed struct xfs_agfl.
      
      Check the AGF for an invalid freelist count on initial read from
      disk. If detected, set a flag on the xfs_perag to indicate that a
      reset is required before the AGFL can be used. In the first
      transaction that attempts to use a flagged AGFL, reset it to empty,
      warn the user about the inconsistency and allow the freelist fixup
      code to repopulate the AGFL with new blocks. The xfs_perag flag is
      cleared to eliminate the need for repeated checks on each block
      allocation operation.
      
      This allows kernels that include the packing fix commit 96f859d5
      ("libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct")
      to handle older unpacked AGFL formats without a filesystem crash.
      Suggested-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chiluk <chiluk+linuxxfs@indeed.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    • xfs: unwind the try_again loop in xfs_log_force · 3e4da466
      Christoph Hellwig authored
      Instead, split out a __xfs_log_force_lsn helper that gets called again
      with the already_slept flag set to true in case we had to sleep.
      
      This prepares for aio_fsync support.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    • xfs: refactor xfs_log_force_lsn · 93806299
      Christoph Hellwig authored
      Use the smallest possible loop as a preamble to find the correct iclog
      buffer, and then use gotos for unwinding to straighten the code.
      
      Also fix the top of function comment while we're at it.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
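
The three checks named in "xfs: refactor bmap record validation" (30b0984d) can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel helper. The struct names, the error enum, and the flat block numbering are assumptions made for illustration (XFS itself encodes the AG number inside the block address). The flag rule shown, that an unwritten extent is only acceptable in the data fork, is one example of an invalid flag state, and rejecting zero-length extents is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified geometry: a flat block range split into equal AGs. */
struct fs_geometry {
	uint64_t total_blocks;  /* blocks on the device */
	uint64_t ag_blocks;     /* blocks per allocation group */
};

/* Hypothetical, simplified extent mapping record. */
struct extent_rec {
	uint64_t start;         /* first block of the extent */
	uint64_t len;           /* number of blocks */
	bool     unwritten;     /* extent is flagged unwritten */
	bool     attr_fork;     /* record belongs to the attr fork */
};

enum extent_err {
	EXTENT_OK = 0,
	EXTENT_EMPTY,           /* zero length (assumed invalid here) */
	EXTENT_PAST_END,        /* runs off the end of the device */
	EXTENT_CROSSES_AG,      /* overflows into the next AG */
	EXTENT_BAD_FLAGS,       /* flag state not allowed for this fork */
};

static enum extent_err
validate_extent(const struct fs_geometry *geo, const struct extent_rec *ext)
{
	uint64_t end;

	assert(geo->ag_blocks > 0);

	if (ext->len == 0)
		return EXTENT_EMPTY;

	/* Compare against the remaining space to avoid start + len overflow. */
	if (ext->start >= geo->total_blocks ||
	    ext->len > geo->total_blocks - ext->start)
		return EXTENT_PAST_END;

	/* First and last block must live in the same allocation group. */
	end = ext->start + ext->len - 1;
	if (ext->start / geo->ag_blocks != end / geo->ag_blocks)
		return EXTENT_CROSSES_AG;

	/* Assumed rule: unwritten extents are only valid in the data fork. */
	if (ext->unwritten && ext->attr_fork)
		return EXTENT_BAD_FLAGS;

	return EXTENT_OK;
}
```

Returning a distinct code per failed check, rather than a bare boolean, mirrors the idea in the neighbouring logging commits that a failure report should say which check tripped.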
2. 15 Mar, 2018 6 commits
3. 14 Mar, 2018 6 commits
4. 12 Mar, 2018 18 commits
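
The detection step described in "xfs: detect agfl count corruption and reset agfl" (a27ba260), comparing the active range of the AGFL to the count, can be sketched in isolation. This is a simplified user-space illustration under stated assumptions, not the kernel code: the struct and function names are hypothetical, the fields are taken to be already converted to host byte order, and the slot counts used in the usage note (118 and 119) are illustrative values for an unpacked and a packed header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the AGF freelist fields (host-endian). */
struct agf_freelist {
	uint32_t first;  /* index of the first active AGFL slot */
	uint32_t last;   /* index of the last active AGFL slot */
	uint32_t count;  /* number of active slots recorded in the AGF */
};

/*
 * Return true if the AGFL cannot be trusted: either an index is out of
 * range, or the number of slots covered by [first, last] (allowing for
 * wraparound past the physical end) disagrees with the recorded count.
 */
static bool
agfl_needs_reset(const struct agf_freelist *fl, uint32_t agfl_size)
{
	uint32_t active;

	assert(agfl_size > 0);

	if (fl->first >= agfl_size || fl->last >= agfl_size)
		return true;
	if (fl->count > agfl_size)
		return true;

	if (fl->count == 0)
		active = 0;
	else if (fl->last >= fl->first)
		active = fl->last - fl->first + 1;              /* contiguous */
	else
		active = agfl_size - fl->first + fl->last + 1;  /* wrapped */

	return active != fl->count;
}
```

As a usage example, a wrapped list with first = 116, last = 1, count = 4 is self-consistent when the AGFL is assumed to hold 118 slots, but the same fields imply five active slots when it holds 119, so the check reports that a reset is needed. A list that does not wrap (say first = 2, last = 5, count = 4) passes under either size, which matches the commit's observation that the incompatibility only appears while the AGFL wraps around the end.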