1. 24 Feb, 2024 13 commits
    • Merge tag 'repair-rmap-btree-6.9_2024-02-23' of... · fd43925c
      Chandan Babu R authored
      Merge tag 'repair-rmap-btree-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair of rmap btrees
      
      We have now constructed the four tools that we need to scan the
      filesystem looking for reverse mappings: an inode scanner, hooks to
      receive live updates from other writer threads, the ability to construct
      btrees in memory, and a btree bulk loader.
      
      This series glues those four together, enabling us to scan the
      filesystem for mappings and keep it up to date while other writers run,
      and then commit the new btree to disk atomically.
      
      To reduce the size of each patch, the functionality is left disabled
      until the end of the series and broken up into three patches: one to
      create the mechanics of scanning the filesystem, a second to transition
      to in-memory btrees, and a third to set up the live hooks.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-rmap-btree-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: hook live rmap operations during a repair operation
        xfs: create a shadow rmap btree during rmap repair
        xfs: repair the rmapbt
        xfs: create agblock bitmap helper to count the number of set regions
        xfs: create a helper to decide if a file mapping targets the rt volume
      fd43925c
    • Merge tag 'in-memory-btrees-6.9_2024-02-23' of... · 8394a97c
      Chandan Babu R authored
      Merge tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: support in-memory btrees
      
      Online repair of the reverse-mapping btrees presents some unique
      challenges.  To construct a new reverse mapping btree, we must scan the
      entire filesystem, but we cannot afford to quiesce the entire filesystem
      for the potentially lengthy scan.
      
      For rmap btrees, therefore, we relax our requirements of totally atomic
      repairs.  Instead, repairs will scan all inodes, construct a new reverse
      mapping dataset, format a new btree, and commit it before anyone trips
      over the corruption.  This is exactly the same strategy as was used in
      the quotacheck and nlink scanners.
      
      Unfortunately, the xfarray cannot perform key-based lookups and is
      therefore unsuitable for supporting live updates.  Luckily, we already have a
      data structure that maintains an indexed rmap recordset -- the existing
      rmap btree code!  Hence we port the existing btree and buffer target
      code to be able to create a btree using the xfile we developed earlier.
      Live hooks keep the in-memory btree up to date for any resources that
      have already been scanned.
      
      This approach is not maximally memory efficient, but we can use the same
      rmap code that we do everywhere else, which provides improved stability
      without growing the code base even more.  Note that in-memory btree
      blocks are always page sized.
      
      This patchset modifies the kernel xfs buffer cache to be capable of
      using an xfile (aka a shmem file) as a backing device.  It then augments
      the btree code to support creating btree cursors with buffers that come
      from a buftarg other than the data device (namely an xfile-backed
      buftarg).  For the userspace xfs buffer cache, we instead use a memfd or
      an O_TMPFILE file as a backing device.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: launder in-memory btree buffers before transaction commit
        xfs: support in-memory btrees
        xfs: add a xfs_btree_ptrs_equal helper
        xfs: support in-memory buffer cache targets
        xfs: teach buftargs to maintain their own buffer hashtable
      8394a97c
    • Merge tag 'buftarg-cleanups-6.9_2024-02-23' of... · aa8fb4bb
      Chandan Babu R authored
      Merge tag 'buftarg-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: buftarg cleanups
      
      Clean up the buffer target code in preparation for adding the ability to
      target tmpfs files.  That will enable the creation of in memory btrees.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'buftarg-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: move setting bt_logical_sectorsize out of xfs_setsize_buftarg
        xfs: remove xfs_setsize_buftarg_early
        xfs: remove the xfs_buftarg_t typedef
      aa8fb4bb
    • Merge tag 'btree-readahead-cleanups-6.9_2024-02-23' of... · a7ade7e1
      Chandan Babu R authored
      Merge tag 'btree-readahead-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: btree readahead cleanups
      
      Minor cleanups for the btree block readahead code.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-readahead-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: split xfs_buf_rele for cached vs uncached buffers
        xfs: move and rename xfs_btree_read_bufl
        xfs: remove xfs_btree_reada_bufs
        xfs: remove xfs_btree_reada_bufl
      a7ade7e1
    • Merge tag 'btree-check-cleanups-6.9_2024-02-23' of... · 169c030a
      Chandan Babu R authored
      Merge tag 'btree-check-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: btree check cleanups
      
      Minor cleanups for the btree block pointer checking code.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-check-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: factor out a __xfs_btree_check_lblock_hdr helper
        xfs: rename btree helpers that depends on the block number representation
        xfs: consolidate btree block verification
        xfs: tighten up validation of root block in inode forks
        xfs: remove the crc variable in __xfs_btree_check_lblock
        xfs: misc cleanups for __xfs_btree_check_sblock
        xfs: consolidate btree ptr checking
        xfs: open code xfs_btree_check_lptr in xfs_bmap_btree_to_extents
        xfs: simplify xfs_btree_check_lblock_siblings
        xfs: simplify xfs_btree_check_sblock_siblings
      169c030a
    • Merge tag 'btree-remove-btnum-6.9_2024-02-23' of... · ee138217
      Chandan Babu R authored
      Merge tag 'btree-remove-btnum-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: remove bc_btnum from btree cursors
      
      From Christoph Hellwig,
      
      This series continues the migration of btree geometry information out of
      the cursor structure and into the ops structure.  This time around, we
      replace the btree type enumeration (btnum) with an explicit name string
      in the btree ops structure.  This enables easy creation of /any/ new
      btree type without having to mess with libxfs.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-remove-btnum-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: remove xfs_btnum_t
        xfs: pass a 'bool is_finobt' to xfs_inobt_insert
        xfs: split xfs_inobt_init_cursor
        xfs: split xfs_inobt_insert_sprec
        xfs: remove the which variable in xchk_iallocbt
        xfs: remove the btnum argument to xfs_inobt_count_blocks
        xfs: remove xfs_inobt_cur
        xfs: split xfs_allocbt_init_cursor
        xfs: refactor the btree cursor allocation logic in xchk_ag_btcur_init
        xfs: add a sick_mask to struct xfs_btree_ops
        xfs: add a name field to struct xfs_btree_ops
        xfs: split the agf_roots and agf_levels arrays
        xfs: remove xfs_bmbt_stage_cursor
        xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor
        xfs: make staging file forks explicit
        xfs: make full use of xfs_btree_stage_ifakeroot in xfs_bmbt_stage_cursor
        xfs: remove xfs_rmapbt_stage_cursor
        xfs: fold xfs_rmapbt_init_common into xfs_rmapbt_init_cursor
        xfs: remove xfs_refcountbt_stage_cursor
        xfs: fold xfs_refcountbt_init_common into xfs_refcountbt_init_cursor
        xfs: remove xfs_inobt_stage_cursor
        xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor
        xfs: remove xfs_allocbt_stage_cursor
        xfs: fold xfs_allocbt_init_common into xfs_allocbt_init_cursor
        xfs: don't override bc_ops for staging btrees
        xfs: add a xfs_btree_init_ptr_from_cur
        xfs: move comment about two 2 keys per pointer in the rmap btree
      ee138217
    • Merge tag 'btree-geometry-in-ops-6.9_2024-02-23' of... · 681cb87b
      Chandan Babu R authored
      Merge tag 'btree-geometry-in-ops-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: move btree geometry to ops struct
      
      This patchset prepares the generic btree code to allow for the creation
      of new btree types outside of libxfs.  The end goal here is for online
      fsck to be able to create its own in-memory btrees that will be used to
      improve the performance of (and reduce the memory requirements of) the
      refcount btree.
      
      To enable this, I decided that the btree ops structure is the ideal
      place to encode all of the geometry information about a btree. The btree
      ops structure already contains the buffer ops (and hence the btree block
      magic numbers) as well as the key and record sizes, so it doesn't seem
      all that farfetched to encode the XFS_BTREE_ flags that determine the
      geometry (ROOT_IN_INODE, LONG_PTRS, etc).
      
      The rest of the patchset cleans up the btree functions that initialize
      btree blocks and btree buffers.  The bulk of this work is to replace
      btree geometry related function call arguments with a single pointer to
      the ops structure, and then clean up everything else around that.  As a
      side effect, we rename the functions.
      
      Later, Christoph Hellwig and I merged together a bunch more cleanups
      that he had wanted to do for a while.  All the btree geometry information is
      now in the btree ops structure, we've created an explicit btree type
      (ag, inode, mem) and moved the per-btree type information to a separate
      union.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-geometry-in-ops-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: create predicate to determine if cursor is at inode root level
        xfs: split the per-btree union in struct xfs_btree_cur
        xfs: split out a btree type from the btree ops geometry flags
        xfs: store the btree pointer length in struct xfs_btree_ops
        xfs: factor out a btree block owner check
        xfs: factor out a xfs_btree_owner helper
        xfs: move the btree stats offset into struct btree_ops
        xfs: move lru refs to the btree ops structure
        xfs: set btree block buffer ops in _init_buf
        xfs: remove the unnecessary daddr paramter to _init_block
        xfs: btree convert xfs_btree_init_block to xfs_btree_init_buf calls
        xfs: rename btree block/buffer init functions
        xfs: initialize btree blocks using btree_ops structure
        xfs: extern some btree ops structures
        xfs: turn the allocbt cursor active field into a btree flag
        xfs: consolidate the xfs_alloc_lookup_* helpers
        xfs: remove bc_ino.flags
        xfs: encode the btree geometry flags in the btree ops structure
        xfs: fix imprecise logic in xchk_btree_check_block_owner
        xfs: drop XFS_BTREE_CRC_BLOCKS
        xfs: set the btree cursor bc_ops in xfs_btree_alloc_cursor
        xfs: consolidate btree block allocation tracepoints
        xfs: consolidate btree block freeing tracepoints
      681cb87b
    • Merge tag 'repair-fscounters-6.9_2024-02-23' of... · 5d1bd19d
      Chandan Babu R authored
      Merge tag 'repair-fscounters-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair for fs summary counters
      
      A longstanding deficiency in the online fs summary counter scrubbing
      code is that it hasn't any means to quiesce the incore percpu counters
      while it's running.  There is no way to coordinate with other threads
      that are reserving or freeing free space simultaneously, which leads to false
      error reports.  Right now, if the discrepancy is large, we just sort of
      shrug and bail out with an incomplete flag, but this is lame.
      
      For repair activity, we actually /do/ need to stabilize the counters to
      get an accurate reading and install it in the percpu counter.  To
      improve the former and enable the latter, allow the fscounters online
      fsck code to perform an exclusive mini-freeze on the filesystem.  The
      exclusivity prevents userspace from thawing while we're running, and the
      mini-freeze means that we don't wait for the log to quiesce, which will
      make both speedier.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-fscounters-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: repair summary counters
      5d1bd19d
    • Merge tag 'indirect-health-reporting-6.9_2024-02-23' of... · f1077579
      Chandan Babu R authored
      Merge tag 'indirect-health-reporting-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: indirect health reporting
      
      This series enables the XFS health reporting infrastructure to remember
      indirect health concerns when resources are scarce.  For example, if a
      scrub notices that there's something wrong with an inode's metadata but
      memory reclaim needs to free the incore inode, we want to record in the
      perag data the fact that there was some inode somewhere with an error.
      The perag structures never go away.
      
      The first two patches in this series set that up, and the third one
      provides a means for xfs_scrub to tell the kernel that it can forget the
      indirect problem report.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'indirect-health-reporting-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: update health status if we get a clean bill of health
        xfs: remember sick inodes that get inactivated
        xfs: add secondary and indirect classes to the health tracking system
      f1077579
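The indirect health reporting idea described in this merge can be illustrated with a small userspace C sketch. It is not the kernel implementation: the `hl_` structures, bit names, and functions are all hypothetical stand-ins. The sketch only shows the flow the message describes, assuming that a sick incore inode leaves a note in its long-lived per-AG structure when it is torn down, and that a later clean scrub may clear that note.

```c
#include <stdint.h>

/* Hypothetical health bits; the real kernel uses its own masks. */
#define HL_SICK_INODE_CORE	(1u << 0)	/* something is wrong with this inode */
#define HL_AG_INDIRECT_INODES	(1u << 0)	/* some inode in this AG was sick */

/* Stand-in for the per-AG structure, which never goes away. */
struct hl_perag {
	uint32_t sick;
};

/* Stand-in for an incore inode, which memory reclaim may free. */
struct hl_inode {
	struct hl_perag *pag;
	uint32_t sick;
};

/* Record a problem that scrub or the runtime code found in this inode. */
void hl_inode_mark_sick(struct hl_inode *ip, uint32_t mask)
{
	ip->sick |= mask;
}

/*
 * Called when the incore inode is about to be dropped.  If the inode was
 * sick, remember that fact indirectly in the per-AG structure so that the
 * report survives the loss of the incore inode.
 */
void hl_inode_inactivate(struct hl_inode *ip)
{
	if (ip->sick)
		ip->pag->sick |= HL_AG_INDIRECT_INODES;
	ip->sick = 0;	/* the incore state is gone after this point */
}

/* Called when a full check came back clean and the note can be forgotten. */
void hl_ag_forget_indirect(struct hl_perag *pag)
{
	pag->sick &= ~HL_AG_INDIRECT_INODES;
}
```

In this sketch a healthy inode leaves no trace when it is inactivated, while a sick one sets the per-AG bit until `hl_ag_forget_indirect()` clears it.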
    • Merge tag 'corruption-health-reports-6.9_2024-02-23' of... · 6fe1910e
      Chandan Babu R authored
      Merge tag 'corruption-health-reports-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: report corruption to the health trackers
      
      Any time that the runtime code thinks it has found corrupt metadata, it
      should tell the health tracking subsystem that the corresponding part of
      the filesystem is sick.  These reports come primarily from two places --
      code that is reading a buffer that fails validation, and higher level
      pieces that observe a conflict involving multiple buffers.  This
      patchset uses automated scanning to update all such callsites with a
      mark_sick call.
      
      Doing this enables the health system to record problems observed at
      runtime, which (for now) can prompt the sysadmin to run xfs_scrub, and
      (later) may enable more targeted fixing of the filesystem.
      
      Note: Earlier reviewers of this patchset suggested that the verifier
      functions themselves should be responsible for calling _mark_sick.  In a
      higher level language this would be easily accomplished with lambda
      functions and closures.  For the kernel, however, we'd have to create
      the necessary closures by hand, pass them to the buf_read calls, and
      then implement necessary state tracking to detach the xfs_buf from the
      closure at the necessary time.  This is far too much work and complexity
      and will not be pursued further.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'corruption-health-reports-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: report XFS_IS_CORRUPT errors to the health system
        xfs: report realtime metadata corruption errors to the health system
        xfs: report quota block corruption errors to the health system
        xfs: report inode corruption errors to the health system
        xfs: report symlink block corruption errors to the health system
        xfs: report dir/attr block corruption errors to the health system
        xfs: report btree block corruption errors to the health system
        xfs: report block map corruption errors to the health tracking system
        xfs: report ag header corruption errors to the health tracking system
        xfs: report fs corruption errors to the health tracking system
        xfs: separate the marking of sick and checked metadata
      6fe1910e
    • Merge tag 'scrub-nlinks-6.9_2024-02-23' of... · 128d0fd1
      Chandan Babu R authored
      Merge tag 'scrub-nlinks-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair of file link counts
      
      Now that we've created the infrastructure to perform live scans of every
      file in the filesystem and the necessary hook infrastructure to observe
      live updates, use it to scan directories to compute the correct link
      counts for files in the filesystem, and reset those link counts.
      
      This patchset creates a tailored readdir implementation for scrub
      because the regular version has to cycle ILOCKs to copy information to
      userspace.  We can't cycle the ILOCK during the nlink scan and we don't
      need all the other VFS support code (maintaining a readdir cursor and
      translating XFS structures to VFS structures and back) so it was easier
      to duplicate the code.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'scrub-nlinks-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: teach repair to fix file nlinks
        xfs: track directory entry updates during live nlinks fsck
        xfs: teach scrub to check file nlinks
        xfs: report health of inode link counts
      128d0fd1
    • Merge tag 'repair-quotacheck-6.9_2024-02-23' of... · aa03f524
      Chandan Babu R authored
      Merge tag 'repair-quotacheck-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair of quota counters
      
      This series uses the inode scanner and live update hook functionality
      introduced in the last patchset to implement quotacheck on a live
      filesystem.  The quotacheck scrubber builds an incore copy of the
      dquot resource usage counters and compares it to the live dquots to
      report discrepancies.
      
      If the user chooses to repair the quota counters, the repair function
      visits each incore dquot to update the counts from the live information.
      The live update hooks are key to keeping the incore copy up to date.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-quotacheck-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: repair dquots based on live quotacheck results
        xfs: repair cannot update the summary counters when logging quota flags
        xfs: track quota updates during live quotacheck
        xfs: implement live quotacheck inode scan
        xfs: create a sparse load xfarray function
        xfs: create a helper to count per-device inode block usage
        xfs: create a xchk_trans_alloc_empty helper for scrub
        xfs: report the health of quota counts
      aa03f524
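The interplay between the inode scan and the live update hooks in this merge can be sketched in plain C. This is a toy model under stated assumptions, not the kernel code: the `qc_` names, the fixed-size arrays, and the single scan cursor are all hypothetical. The point it illustrates is that a hook only needs to adjust the incore (shadow) counters for inodes the scan has already visited, because unvisited inodes are counted in full when the scan reaches them.

```c
#include <stdbool.h>
#include <stdint.h>

#define QC_NR_INODES	8	/* toy filesystem size (assumption) */
#define QC_NR_IDS	4	/* toy number of quota ids (assumption) */

struct qc_inode {
	uint32_t owner;		/* quota id charged for this inode */
	int64_t nblocks;	/* blocks currently used by this inode */
};

struct qc_state {
	uint64_t cursor;		/* inodes below this index were scanned */
	int64_t shadow[QC_NR_IDS];	/* incore copy of per-id block usage */
};

/* Scan one more inode into the shadow counters; false when finished. */
bool qc_scan_next(struct qc_state *qc, const struct qc_inode *inodes)
{
	if (qc->cursor >= QC_NR_INODES)
		return false;
	qc->shadow[inodes[qc->cursor].owner] += inodes[qc->cursor].nblocks;
	qc->cursor++;
	return true;
}

/*
 * Live update hook: a writer changed an inode's block count while the
 * scan is in progress.  Already-scanned inodes must have the delta
 * applied to the shadow counters; unscanned inodes will be picked up by
 * the scan itself, so the delta must not be applied twice.
 */
void qc_live_update(struct qc_state *qc, struct qc_inode *inodes,
		    uint64_t ino, int64_t delta)
{
	inodes[ino].nblocks += delta;
	if (ino < qc->cursor)
		qc->shadow[inodes[ino].owner] += delta;
}

/* Compare the finished shadow counters against the live quota counts. */
bool qc_counts_match(const struct qc_state *qc, const int64_t *live)
{
	for (int i = 0; i < QC_NR_IDS; i++)
		if (qc->shadow[i] != live[i])
			return false;
	return true;
}
```

A repair step in this model would simply copy `shadow[]` over the live counts when `qc_counts_match()` reports a discrepancy; locking between the scanner and the writers is deliberately left out of the sketch.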
    • Merge tag 'repair-inode-mode-6.9_2024-02-23' of... · 8e3ef44f
      Chandan Babu R authored
      Merge tag 'repair-inode-mode-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: repair inode mode by scanning dirs
      
      One missing piece of functionality in the inode record repair code is
      figuring out what to do with a file whose mode is so corrupt that we
      cannot tell the type of the file.  Originally this was done by
      guessing the mode from the ondisk inode contents, but Christoph didn't
      like that because it read from data fork block 0, which could be user
      controlled data.
      
      Therefore, I've replaced all that with a directory scanner that looks
      for any dirents that point to the file with the garbage mode.  If so,
      the ftype in the dirent will tell us exactly what mode to set on the
      file.  Since users cannot directly write to the ftype field of a dirent,
      this should be safe.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-inode-mode-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: repair file modes by scanning for a dirent pointing to us
        xfs: create a macro for decoding ftypes in tracepoints
        xfs: create a predicate to determine if two xfs_names are the same
        xfs: create a static name for the dot entry too
        xfs: iscan batching should handle unallocated inodes too
        xfs: cache a bunch of inodes for repair scans
        xfs: stagger the starting AG of scrub iscans to reduce contention
        xfs: allow scrub to hook metadata updates in other writers
        xfs: implement live inode scan for scrub
        xfs: speed up xfs_iwalk_adjust_start a little bit
      8e3ef44f
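The mode recovery described in this merge relies on the file type stored in a directory entry. A minimal C sketch of that mapping follows. The `ft_` enum and functions are hypothetical stand-ins rather than the kernel's own helpers, the `FT_S_*` constants mirror the usual Linux `S_IF*` values, and keeping the old permission bits is an assumption made for illustration only.

```c
#include <stdint.h>

/* File type bits of an inode mode, mirroring the Linux S_IF* values. */
#define FT_S_IFMT	0170000u
#define FT_S_IFSOCK	0140000u
#define FT_S_IFLNK	0120000u
#define FT_S_IFREG	0100000u
#define FT_S_IFBLK	0060000u
#define FT_S_IFDIR	0040000u
#define FT_S_IFCHR	0020000u
#define FT_S_IFIFO	0010000u

/* Hypothetical stand-ins for the file type codes stored in a dirent. */
enum ft_type {
	FT_UNKNOWN = 0,
	FT_REG_FILE,
	FT_DIR,
	FT_CHRDEV,
	FT_BLKDEV,
	FT_FIFO,
	FT_SOCK,
	FT_SYMLINK,
};

/*
 * Translate a dirent file type into the type bits of an inode mode.
 * Returns 0 when the dirent carries no usable type information.
 */
uint32_t ft_type_to_mode(enum ft_type ft)
{
	switch (ft) {
	case FT_REG_FILE:	return FT_S_IFREG;
	case FT_DIR:		return FT_S_IFDIR;
	case FT_CHRDEV:		return FT_S_IFCHR;
	case FT_BLKDEV:		return FT_S_IFBLK;
	case FT_FIFO:		return FT_S_IFIFO;
	case FT_SOCK:		return FT_S_IFSOCK;
	case FT_SYMLINK:	return FT_S_IFLNK;
	default:		return 0;
	}
}

/*
 * Replace only the type bits of a damaged mode with the type found in a
 * dirent that points at the file.  The mode is returned unchanged if the
 * dirent type is unknown, leaving the decision to the caller.
 */
uint32_t ft_repair_mode(uint32_t bad_mode, enum ft_type ft)
{
	uint32_t type = ft_type_to_mode(ft);

	if (!type)
		return bad_mode;
	return (bad_mode & ~FT_S_IFMT) | type;
}
```

For example, `ft_repair_mode(0170755, FT_DIR)` replaces the garbage type bits with the directory type and yields `FT_S_IFDIR | 0755`.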
  2. 22 Feb, 2024 27 commits