1. 15 Mar, 2024 1 commit
  2. 07 Mar, 2024 1 commit
    • xfs: shrink failure needs to hold AGI buffer · 75bcffbb
      Dave Chinner authored
      Chandan reported an AGI/AGF lock order hang on xfs/168 during recent
      testing. The cause of the problem was the task running xfs_growfs
      to shrink the filesystem. A failure occurred trying to remove the
      free space from the btrees that the shrink would make disappear,
      and that meant it ran the error handling for a partial failure.
      
      This error path involves restoring the per-ag block reservations,
      and that requires calculating the amount of space needed to be
      reserved for the free inode btree. The growfs operation hung here:
      
      [18679.536829]  down+0x71/0xa0
      [18679.537657]  xfs_buf_lock+0xa4/0x290 [xfs]
      [18679.538731]  xfs_buf_find_lock+0xf7/0x4d0 [xfs]
      [18679.539920]  xfs_buf_lookup.constprop.0+0x289/0x500 [xfs]
      [18679.542628]  xfs_buf_get_map+0x2b3/0xe40 [xfs]
      [18679.547076]  xfs_buf_read_map+0xbb/0x900 [xfs]
      [18679.562616]  xfs_trans_read_buf_map+0x449/0xb10 [xfs]
      [18679.569778]  xfs_read_agi+0x1cd/0x500 [xfs]
      [18679.573126]  xfs_ialloc_read_agi+0xc2/0x5b0 [xfs]
      [18679.578708]  xfs_finobt_calc_reserves+0xe7/0x4d0 [xfs]
      [18679.582480]  xfs_ag_resv_init+0x2c5/0x490 [xfs]
      [18679.586023]  xfs_ag_shrink_space+0x736/0xd30 [xfs]
      [18679.590730]  xfs_growfs_data_private.isra.0+0x55e/0x990 [xfs]
      [18679.599764]  xfs_growfs_data+0x2f1/0x410 [xfs]
      [18679.602212]  xfs_file_ioctl+0xd1e/0x1370 [xfs]
      
      trying to get the AGI lock. The AGI lock was held by an fsstress task
      trying to do an inode allocation, and it was waiting on the AGF
      lock to allocate a new inode chunk on disk. Hence deadlock.
      
      The fix for this is for the growfs code to hold the AGI over the
      transaction roll it does in the error path. It already holds the AGF
      locked across this, and that is what causes the lock order inversion
      in the xfs_ag_resv_init() call.
      Reported-by: Chandan Babu R <chandanbabu@kernel.org>
      Fixes: 46141dc8 ("xfs: introduce xfs_ag_shrink_space()")
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
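
The lock order inversion described in this commit message can be modelled outside the kernel. The following is a minimal Python sketch, not XFS code: the path names and the `lock_order_inversions` helper are hypothetical, and each code path is reduced to the order in which it takes the two buffer locks.

```python
from itertools import combinations


def lock_order_inversions(paths):
    """Return the lock pairs that two code paths acquire in opposite orders."""
    seen = {}            # (first, second) -> name of a path using that order
    inversions = set()
    for name, order in paths.items():
        for first, second in combinations(order, 2):
            seen.setdefault((first, second), name)
            if (second, first) in seen:
                inversions.add(frozenset((first, second)))
    return inversions


# Before the fix: the shrink error path still holds the AGF after the
# transaction roll and then asks for the AGI, while inode allocation
# holds the AGI and asks for the AGF.
before = {
    "inode_alloc": ["AGI", "AGF"],
    "shrink_error_path": ["AGF", "AGI"],
}

# After the fix (assumption: the shrink path originally takes the AGI before
# the AGF and now keeps both held across the roll, so it never re-locks the AGI).
after = {
    "inode_alloc": ["AGI", "AGF"],
    "shrink_error_path": ["AGI", "AGF"],
}

print(lock_order_inversions(before))
print(lock_order_inversions(after))
```

In this model the "before" table reports the AGI/AGF pair as conflicting, and the "after" table reports nothing, which matches the reasoning in the message: holding the AGI over the roll removes the need to take it while the AGF is already locked.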
  3. 29 Feb, 2024 2 commits
  4. 28 Feb, 2024 2 commits
  5. 26 Feb, 2024 1 commit
    • xfs: fix scrub stats file permissions · e610e856
      Darrick J. Wong authored
      When the kernel is in lockdown mode, debugfs will only show files that
      are world-readable and cannot be written, mmaped, or used with ioctl.
      That more or less describes the scrub stats file, except that the
      permissions are wrong -- they should be 0444, not 0644.  You can't write
      the stats file, so the 0200 makes no sense.
      
      Meanwhile, the clear_stats file is only writable, but it got mode 0400
      instead of 0200, which would make more sense.
      
      Fix both files so that they make sense.
      
      Fixes: d7a74cad ("xfs: track usage statistics of online fsck")
      Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
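
The mode bits discussed in this commit message can be checked with Python's standard `stat` module. This is only an illustration of the octal arithmetic: the `describe` and `read_only_for_everyone` helpers are hypothetical, and the predicate is a simplified reading of "world-readable and cannot be written", not the kernel's actual lockdown check.

```python
import stat


def describe(mode):
    """Render permission bits the way `ls -l` would for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


def read_only_for_everyone(mode):
    """Simplified predicate: readable by others and no write bit set at all."""
    any_write = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    return bool(mode & stat.S_IROTH) and not (mode & any_write)


OLD_STATS, NEW_STATS = 0o644, 0o444   # stats file: before and after the fix
OLD_CLEAR, NEW_CLEAR = 0o400, 0o200   # clear_stats file: before and after

for label, mode in [("stats old", OLD_STATS), ("stats new", NEW_STATS),
                    ("clear_stats old", OLD_CLEAR), ("clear_stats new", NEW_CLEAR)]:
    print(f"{label:16} {mode:04o} {describe(mode)}")
```

The only difference between 0644 and 0444 is the owner write bit (0200), and 0200 on its own is write-only for the owner, which is what a file that can only be written to calls for.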
  6. 24 Feb, 2024 19 commits
    • xfs: fix log recovery erroring out on refcount recovery failure · 1e5efd72
      Darrick J. Wong authored
      Per the comment in the error case of xfs_reflink_recover_cow, zero out
      any error (after shutting down the log) so that we actually kill any new
      intent items that might have gotten logged by later recovery steps.
      Discovered by xfs/434, which few people actually seem to run.
      
      Fixes: 2c1e31ed ("xfs: place intent recovery under NOFS allocation context")
      Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    • Merge tag 'symlink-cleanups-6.9_2024-02-23' of... · e6469b22
      Chandan Babu R authored
      Merge tag 'symlink-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: clean up symbolic link code
      
      This series cleans up a few bits of the symbolic link code as needed for
      future projects.  Online repair requires the ability to commit fixed
      fork-based filesystem metadata such as directories, xattrs, and symbolic
      links atomically, so we need to rearrange the symlink code before we
      land the atomic extent swapping.
      
      Accomplish this by moving the remote symlink target block code and
      declarations to xfs_symlink_remote.[ch].
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'symlink-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: move symlink target write function to libxfs
        xfs: move remote symlink target read function to libxfs
        xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h
    • Merge tag 'expand-bmap-intent-usage_2024-02-23' of... · 6723ca99
      Chandan Babu R authored
      Merge tag 'expand-bmap-intent-usage_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: support attrfork and unwritten BUIs
      
      In preparation for atomic extent swapping and the online repair
      functionality that wants atomic extent swaps, enhance the BUI code so
      that we can support deferred work on the extended attribute fork and on
      unwritten extents.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'expand-bmap-intent-usage_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: xfs_bmap_finish_one should map unwritten extents properly
        xfs: support deferred bmap updates on the attr fork
    • Merge tag 'realtime-bmap-intents-6.9_2024-02-23' of... · 4e3f7e7a
      Chandan Babu R authored
      Merge tag 'realtime-bmap-intents-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: widen BUI formats to support realtime
      
      Atomic extent swapping (and later, reverse mapping and reflink) on the
      realtime device needs to be able to defer file mapping and extent
      freeing work in much the same manner as is required on the data volume.
      Make the BUI log items operate on rt extents in preparation for atomic
      swapping and realtime rmap.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'realtime-bmap-intents-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: support recovering bmap intent items targetting realtime extents
        xfs: add a realtime flag to the bmap update log redo items
        xfs: fix xfs_bunmapi to allow unmapping of partial rt extents
    • Merge tag 'bmap-intent-cleanups-6.9_2024-02-23' of... · 10ea6158
      Chandan Babu R authored
      Merge tag 'bmap-intent-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: bmap log intent cleanups
      
      The next major target of online repair is metadata that is persisted
      in blocks mapped by a file fork.  In other words, we want to repair
      directories, extended attributes, symbolic links, and the realtime free
      space information.  For file-based metadata, we assume that the space
      metadata is correct, which enables repair to construct new versions of
      the metadata in a temporary file.  We then need to swap the file fork
      mappings of the two files atomically.  With this patchset, we begin
      constructing such a facility based on the existing bmap log items and a
      new extent swap log item.
      
      This series cleans up a few parts of the file block mapping log intent
      code before we start adding support for realtime bmap intents.  Most of
      it involves cleaning up tracepoints so that more of the data extraction
      logic ends up in the tracepoint code and not the tracepoint call site,
      which should reduce overhead further when tracepoints are disabled.
      There is also a change to pass bmap intents all the way back to the bmap
      code instead of unboxing the intent values and re-boxing them after the
      _finish_one function completes.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'bmap-intent-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: add a xattr_entry helper
        xfs: move xfs_bmap_defer_add to xfs_bmap_item.c
        xfs: reuse xfs_bmap_update_cancel_item
        xfs: add a bi_entry helper
        xfs: remove xfs_trans_set_bmap_flags
        xfs: clean up bmap log intent item tracepoint callsites
        xfs: split tracepoint classes for deferred items
    • Merge tag 'repair-refcount-scalability-6.9_2024-02-23' of... · 74acb705
      Chandan Babu R authored
      Merge tag 'repair-refcount-scalability-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: reduce refcount repair memory usage
      
      The refcountbt repair code has serious memory usage problems when the
      block sharing factor of the filesystem is very high.  This can happen if
      a deduplication tool has been run against the filesystem, or if the fs
      stores reflinked VM images that have been aging for a long time.
      
      Recall that the original reference counting algorithm walks the reverse
      mapping records of the filesystem to generate reference counts.  For any
      given block in the AG, the rmap bag structure contains all the rmap
      records that cover that block; the refcount is the size of that bag.
      
      For online repair, the bag doesn't need the owner, offset, or state flag
      information, so it discards those.  This halves the record size, but the
      bag structure still stores one excerpted record for each reverse
      mapping.  If the sharing count is high, this will use a LOT of memory
      storing redundant records.  In the extreme case, 100k mappings to the
      same piece of space will consume 100k*16 bytes = 1.6M of memory.
      
      For offline repair, the bag stores the owner values so that we know
      which inodes need to be marked as being reflink inodes.  If a
      deduplication tool has been run and there are many blocks within a file
      pointing to the same physical space, this will still use a lot of memory
      to store redundant records.
      
      The solution to this problem is to deduplicate the bag records when
      possible by adding a reference count to the bag record, and changing the
      bag add function to detect an existing record to bump the refcount.  In
      the above example, the 100k mappings will now use 24 bytes of memory.
      These lookups can be done efficiently with a btree, so we create a new
      refcount bag btree type (inside of online repair).  This is why we
      refactored the btree code in the previous patchset.
      
      The btree conversion also dramatically reduces the runtime of the
      refcount generation algorithm, because the code to delete all bag
      records that end at a given agblock now only has to delete one record
      instead of (using the example above) 100k records.  As an added benefit,
      record deletion now gives back the unused xfile space, which it did not
      do previously.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-refcount-scalability-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: port refcount repair to the new refcount bag structure
        xfs: create refcount bag structure for btree repairs
        xfs: define an in-memory btree for storing refcount bag info during repairs
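
The memory arithmetic and the deduplication idea from this merge message can be sketched in a few lines of Python. This is a toy model, not the kernel's refcount bag: the `CountedBag` class is hypothetical, the 16- and 24-byte record sizes are taken from the figures quoted in the message, and "ends at a given agblock" is assumed to mean `startblock + blockcount == agblock`.

```python
from collections import Counter

RAW_RECORD_BYTES = 16       # one excerpted record per mapping (from the message)
COUNTED_RECORD_BYTES = 24   # one record carrying a refcount (from the message)


class CountedBag:
    """Bag of (startblock, blockcount) records, deduplicated with a refcount."""

    def __init__(self):
        self.records = Counter()

    def add(self, startblock, blockcount):
        # An existing record just has its refcount bumped.
        self.records[(startblock, blockcount)] += 1

    def size(self):
        # The refcount of the covered block is the total bag size.
        return sum(self.records.values())

    def remove_ending_at(self, agblock):
        """Delete every record that ends at agblock; return records deleted."""
        doomed = [key for key in self.records if key[0] + key[1] == agblock]
        for key in doomed:
            del self.records[key]
        return len(doomed)

    def memory_bytes(self):
        return len(self.records) * COUNTED_RECORD_BYTES


mappings = 100_000
bag = CountedBag()
for _ in range(mappings):
    bag.add(startblock=10, blockcount=5)

print("one record per mapping:", mappings * RAW_RECORD_BYTES, "bytes")
print("deduplicated bag:      ", bag.memory_bytes(), "bytes")
```

In this model the 100k identical mappings collapse into a single counted record, and removing everything that ends at block 15 deletes one record instead of 100k, which is the runtime saving the message describes.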
    • Merge tag 'repair-rmap-btree-6.9_2024-02-23' of... · fd43925c
      Chandan Babu R authored
      Merge tag 'repair-rmap-btree-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair of rmap btrees
      
      We have now constructed the four tools that we need to scan the
      filesystem looking for reverse mappings: an inode scanner, hooks to
      receive live updates from other writer threads, the ability to construct
      btrees in memory, and a btree bulk loader.
      
      This series glues those four together, enabling us to scan the
      filesystem for mappings and keep it up to date while other writers run,
      and then commit the new btree to disk atomically.
      
      To reduce the size of each patch, the functionality is left disabled
      until the end of the series and broken up into three patches: one to
      create the mechanics of scanning the filesystem, a second to transition
      to in-memory btrees, and a third to set up the live hooks.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-rmap-btree-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: hook live rmap operations during a repair operation
        xfs: create a shadow rmap btree during rmap repair
        xfs: repair the rmapbt
        xfs: create agblock bitmap helper to count the number of set regions
        xfs: create a helper to decide if a file mapping targets the rt volume
    • Merge tag 'in-memory-btrees-6.9_2024-02-23' of... · 8394a97c
      Chandan Babu R authored
      Merge tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: support in-memory btrees
      
      Online repair of the reverse-mapping btrees presents some unique
      challenges.  To construct a new reverse mapping btree, we must scan the
      entire filesystem, but we cannot afford to quiesce the entire filesystem
      for the potentially lengthy scan.
      
      For rmap btrees, therefore, we relax our requirements of totally atomic
      repairs.  Instead, repairs will scan all inodes, construct a new reverse
      mapping dataset, format a new btree, and commit it before anyone trips
      over the corruption.  This is exactly the same strategy as was used in
      the quotacheck and nlink scanners.
      
      Unfortunately, the xfarray cannot perform key-based lookups and is
      therefore unsuitable for supporting live updates.  Luckily, we already have a
      data structure that maintains an indexed rmap recordset -- the existing
      rmap btree code!  Hence we port the existing btree and buffer target
      code to be able to create a btree using the xfile we developed earlier.
      Live hooks keep the in-memory btree up to date for any resources that
      have already been scanned.
      
      This approach is not maximally memory efficient, but we can use the same
      rmap code that we do everywhere else, which provides improved stability
      without growing the code base even more.  Note that in-memory btree
      blocks are always page sized.
      
      This patchset modifies the kernel xfs buffer cache to be capable of
      using a xfile (aka a shmem file) as a backing device.  It then augments
      the btree code to support creating btree cursors with buffers that come
      from a buftarg other than the data device (namely an xfile-backed
      buftarg).  For the userspace xfs buffer cache, we instead use a memfd or
      an O_TMPFILE file as a backing device.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: launder in-memory btree buffers before transaction commit
        xfs: support in-memory btrees
        xfs: add a xfs_btree_ptrs_equal helper
        xfs: support in-memory buffer cache targets
        xfs: teach buftargs to maintain their own buffer hashtable
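
The rule that live hooks only update the in-memory structure for resources that have already been scanned can be illustrated with a small model. This is a hedged Python sketch, not the XFS implementation: the `ShadowScan` class and its method names are hypothetical, a plain set stands in for the in-memory btree, and a dict stands in for the live filesystem.

```python
class ShadowScan:
    """Build a shadow record set while other writers keep changing mappings."""

    def __init__(self, files):
        self.files = files     # ino -> set of mappings (the "live" filesystem)
        self.scanned = set()   # inodes the scanner has already visited
        self.shadow = set()    # (ino, mapping) records collected so far

    def scan_one(self, ino):
        # The scanner records whatever the inode maps at the time of the visit.
        for mapping in self.files.get(ino, ()):
            self.shadow.add((ino, mapping))
        self.scanned.add(ino)

    def hook_map(self, ino, mapping):
        # A writer adds a mapping; only already-scanned inodes need a live
        # update, because the scanner will see the rest when it gets there.
        self.files.setdefault(ino, set()).add(mapping)
        if ino in self.scanned:
            self.shadow.add((ino, mapping))

    def hook_unmap(self, ino, mapping):
        self.files.get(ino, set()).discard(mapping)
        if ino in self.scanned:
            self.shadow.discard((ino, mapping))

    def live_records(self):
        return {(ino, m) for ino, maps in self.files.items() for m in maps}


scan = ShadowScan({1: {"a"}, 2: {"b"}})
scan.scan_one(1)
scan.hook_map(1, "c")    # inode 1 already scanned: the shadow is updated
scan.hook_map(2, "d")    # inode 2 not scanned yet: the scan will pick it up
scan.hook_unmap(1, "a")  # removal on a scanned inode is mirrored as well
scan.scan_one(2)
print(sorted(scan.shadow))
```

Once the scan completes, the shadow set in this model matches the live mappings even though writers ran concurrently with it, which is the property the new btree needs before it is committed.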
    • Merge tag 'buftarg-cleanups-6.9_2024-02-23' of... · aa8fb4bb
      Chandan Babu R authored
      Merge tag 'buftarg-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: buftarg cleanups
      
      Clean up the buffer target code in preparation for adding the ability to
      target tmpfs files.  That will enable the creation of in memory btrees.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'buftarg-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: move setting bt_logical_sectorsize out of xfs_setsize_buftarg
        xfs: remove xfs_setsize_buftarg_early
        xfs: remove the xfs_buftarg_t typedef
    • Merge tag 'btree-readahead-cleanups-6.9_2024-02-23' of... · a7ade7e1
      Chandan Babu R authored
      Merge tag 'btree-readahead-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: btree readahead cleanups
      
      Minor cleanups for the btree block readahead code.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-readahead-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: split xfs_buf_rele for cached vs uncached buffers
        xfs: move and rename xfs_btree_read_bufl
        xfs: remove xfs_btree_reada_bufs
        xfs: remove xfs_btree_reada_bufl
    • Merge tag 'btree-check-cleanups-6.9_2024-02-23' of... · 169c030a
      Chandan Babu R authored
      Merge tag 'btree-check-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: btree check cleanups
      
      Minor cleanups for the btree block pointer checking code.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-check-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: factor out a __xfs_btree_check_lblock_hdr helper
        xfs: rename btree helpers that depends on the block number representation
        xfs: consolidate btree block verification
        xfs: tighten up validation of root block in inode forks
        xfs: remove the crc variable in __xfs_btree_check_lblock
        xfs: misc cleanups for __xfs_btree_check_sblock
        xfs: consolidate btree ptr checking
        xfs: open code xfs_btree_check_lptr in xfs_bmap_btree_to_extents
        xfs: simplify xfs_btree_check_lblock_siblings
        xfs: simplify xfs_btree_check_sblock_siblings
    • Merge tag 'btree-remove-btnum-6.9_2024-02-23' of... · ee138217
      Chandan Babu R authored
      Merge tag 'btree-remove-btnum-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: remove bc_btnum from btree cursors
      
      From Christoph Hellwig,
      
      This series continues the migration of btree geometry information out of
      the cursor structure and into the ops structure.  This time around, we
      replace the btree type enumeration (btnum) with an explicit name string
      in the btree ops structure.  This enables easy creation of /any/ new
      btree type without having to mess with libxfs.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-remove-btnum-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: remove xfs_btnum_t
        xfs: pass a 'bool is_finobt' to xfs_inobt_insert
        xfs: split xfs_inobt_init_cursor
        xfs: split xfs_inobt_insert_sprec
        xfs: remove the which variable in xchk_iallocbt
        xfs: remove the btnum argument to xfs_inobt_count_blocks
        xfs: remove xfs_inobt_cur
        xfs: split xfs_allocbt_init_cursor
        xfs: refactor the btree cursor allocation logic in xchk_ag_btcur_init
        xfs: add a sick_mask to struct xfs_btree_ops
        xfs: add a name field to struct xfs_btree_ops
        xfs: split the agf_roots and agf_levels arrays
        xfs: remove xfs_bmbt_stage_cursor
        xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor
        xfs: make staging file forks explicit
        xfs: make full use of xfs_btree_stage_ifakeroot in xfs_bmbt_stage_cursor
        xfs: remove xfs_rmapbt_stage_cursor
        xfs: fold xfs_rmapbt_init_common into xfs_rmapbt_init_cursor
        xfs: remove xfs_refcountbt_stage_cursor
        xfs: fold xfs_refcountbt_init_common into xfs_refcountbt_init_cursor
        xfs: remove xfs_inobt_stage_cursor
        xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor
        xfs: remove xfs_allocbt_stage_cursor
        xfs: fold xfs_allocbt_init_common into xfs_allocbt_init_cursor
        xfs: don't override bc_ops for staging btrees
        xfs: add a xfs_btree_init_ptr_from_cur
        xfs: move comment about two 2 keys per pointer in the rmap btree
    • Merge tag 'btree-geometry-in-ops-6.9_2024-02-23' of... · 681cb87b
      Chandan Babu R authored
      Merge tag 'btree-geometry-in-ops-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: move btree geometry to ops struct
      
      This patchset prepares the generic btree code to allow for the creation
      of new btree types outside of libxfs.  The end goal here is for online
      fsck to be able to create its own in-memory btrees that will be used to
      improve the performance (and reduce the memory requirements of) the
      refcount btree.
      
      To enable this, I decided that the btree ops structure is the ideal
      place to encode all of the geometry information about a btree. The btree
      ops structure already contains the buffer ops (and hence the btree block
      magic numbers) as well as the key and record sizes, so it doesn't seem
      all that farfetched to encode the XFS_BTREE_ flags that determine the
      geometry (ROOT_IN_INODE, LONG_PTRS, etc).
      
      The rest of the patchset cleans up the btree functions that initialize
      btree blocks and btree buffers.  The bulk of this work is to replace
      btree geometry related function call arguments with a single pointer to
      the ops structure, and then clean up everything else around that.  As a
      side effect, we rename the functions.
      
      Later, Christoph Hellwig and I merged together a bunch more cleanups
      that he wanted to do for a while.  All the btree geometry information is
      now in the btree ops structure, we've created an explicit btree type
      (ag, inode, mem) and moved the per-btree type information to a separate
      union.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'btree-geometry-in-ops-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: create predicate to determine if cursor is at inode root level
        xfs: split the per-btree union in struct xfs_btree_cur
        xfs: split out a btree type from the btree ops geometry flags
        xfs: store the btree pointer length in struct xfs_btree_ops
        xfs: factor out a btree block owner check
        xfs: factor out a xfs_btree_owner helper
        xfs: move the btree stats offset into struct btree_ops
        xfs: move lru refs to the btree ops structure
        xfs: set btree block buffer ops in _init_buf
        xfs: remove the unnecessary daddr paramter to _init_block
        xfs: btree convert xfs_btree_init_block to xfs_btree_init_buf calls
        xfs: rename btree block/buffer init functions
        xfs: initialize btree blocks using btree_ops structure
        xfs: extern some btree ops structures
        xfs: turn the allocbt cursor active field into a btree flag
        xfs: consolidate the xfs_alloc_lookup_* helpers
        xfs: remove bc_ino.flags
        xfs: encode the btree geometry flags in the btree ops structure
        xfs: fix imprecise logic in xchk_btree_check_block_owner
        xfs: drop XFS_BTREE_CRC_BLOCKS
        xfs: set the btree cursor bc_ops in xfs_btree_alloc_cursor
        xfs: consolidate btree block allocation tracepoints
        xfs: consolidate btree block freeing tracepoints
    • Merge tag 'repair-fscounters-6.9_2024-02-23' of... · 5d1bd19d
      Chandan Babu R authored
      Merge tag 'repair-fscounters-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair for fs summary counters
      
      A longstanding deficiency in the online fs summary counter scrubbing
      code is that it hasn't any means to quiesce the incore percpu counters
      while it's running.  There is no way to coordinate with other threads
      that are reserving or freeing free space simultaneously, which leads to false
      error reports.  Right now, if the discrepancy is large, we just sort of
      shrug and bail out with an incomplete flag, but this is lame.
      
      For repair activity, we actually /do/ need to stabilize the counters to
      get an accurate reading and install it in the percpu counter.  To
      improve the former and enable the latter, allow the fscounters online
      fsck code to perform an exclusive mini-freeze on the filesystem.  The
      exclusivity prevents userspace from thawing while we're running, and the
      mini-freeze means that we don't wait for the log to quiesce, which will
      make both speedier.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-fscounters-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: repair summary counters
    • Merge tag 'indirect-health-reporting-6.9_2024-02-23' of... · f1077579
      Chandan Babu R authored
      Merge tag 'indirect-health-reporting-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: indirect health reporting
      
      This series enables the XFS health reporting infrastructure to remember
      indirect health concerns when resources are scarce.  For example, if a
      scrub notices that there's something wrong with an inode's metadata but
      memory reclaim needs to free the incore inode, we want to record in the
      perag data the fact that there was some inode somewhere with an error.
      The perag structures never go away.
      
      The first two patches in this series set that up, and the third one
      provides a means for xfs_scrub to tell the kernel that it can forget the
      indirect problem report.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'indirect-health-reporting-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: update health status if we get a clean bill of health
        xfs: remember sick inodes that get inactivated
        xfs: add secondary and indirect classes to the health tracking system
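
The idea of remembering an indirect health concern can be shown with a very small model. This is a hypothetical Python sketch and not the kernel's health tracking code: `PerAG`, `Inode`, `reclaim_inode`, and `forget_indirect_report` are invented names, and a single boolean stands in for whatever state the perag structure actually keeps.

```python
class PerAG:
    """Stand-in for a per-AG structure, which outlives individual inodes."""

    def __init__(self):
        self.inode_sick_somewhere = False


class Inode:
    def __init__(self, ino, sick=False):
        self.ino = ino
        self.sick = sick


def reclaim_inode(inode, perag):
    """Free the incore inode, but keep a note in the AG if it was sick."""
    if inode.sick:
        perag.inode_sick_somewhere = True


def forget_indirect_report(perag):
    """Model of the scrub tool telling the kernel the concern is resolved."""
    perag.inode_sick_somewhere = False


pag = PerAG()
reclaim_inode(Inode(128, sick=True), pag)
print("after reclaim:", pag.inode_sick_somewhere)
forget_indirect_report(pag)
print("after clean bill of health:", pag.inode_sick_somewhere)
```

The point of the model is only that the note survives the inode being freed, because it lives in a structure that never goes away.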
    • Merge tag 'corruption-health-reports-6.9_2024-02-23' of... · 6fe1910e
      Chandan Babu R authored
      Merge tag 'corruption-health-reports-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: report corruption to the health trackers
      
      Any time that the runtime code thinks it has found corrupt metadata, it
      should tell the health tracking subsystem that the corresponding part of
      the filesystem is sick.  These reports come primarily from two places --
      code that is reading a buffer that fails validation, and higher level
      pieces that observe a conflict involving multiple buffers.  This
      patchset uses automated scanning to update all such callsites with a
      mark_sick call.
      
      Doing this enables the health system to record problems observed at
      runtime, which (for now) can prompt the sysadmin to run xfs_scrub, and
      (later) may enable more targeted fixing of the filesystem.
      
      Note: Earlier reviewers of this patchset suggested that the verifier
      functions themselves should be responsible for calling _mark_sick.  In a
      higher level language this would be easily accomplished with lambda
      functions and closures.  For the kernel, however, we'd have to create
      the necessary closures by hand, pass them to the buf_read calls, and
      then implement necessary state tracking to detach the xfs_buf from the
      closure at the necessary time.  This is far too much work and complexity
      and will not be pursued further.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'corruption-health-reports-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: report XFS_IS_CORRUPT errors to the health system
        xfs: report realtime metadata corruption errors to the health system
        xfs: report quota block corruption errors to the health system
        xfs: report inode corruption errors to the health system
        xfs: report symlink block corruption errors to the health system
        xfs: report dir/attr block corruption errors to the health system
        xfs: report btree block corruption errors to the health system
        xfs: report block map corruption errors to the health tracking system
        xfs: report ag header corruption errors to the health tracking system
        xfs: report fs corruption errors to the health tracking system
        xfs: separate the marking of sick and checked metadata
      6fe1910e
    • Chandan Babu R's avatar
      Merge tag 'scrub-nlinks-6.9_2024-02-23' of... · 128d0fd1
      Chandan Babu R authored
      Merge tag 'scrub-nlinks-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair of file link counts
      
      Now that we've created the infrastructure to perform live scans of every
      file in the filesystem and the necessary hook infrastructure to observe
      live updates, use it to scan directories to compute the correct link
      counts for files in the filesystem, and reset those link counts.
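
The core idea of the scan described above can be sketched in C as follows. This is a simplified illustration, not the kernel implementation: `struct dirent_rec`, `count_links`, and `reset_links` are hypothetical names, inodes are assumed to be small array indexes, and the live-update hooks, locking, and directory backreference handling are left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: the inode number that one directory entry points to. */
struct dirent_rec {
	uint64_t ino;
};

/*
 * Pass 1: walk every directory entry found by the scan and count how many
 * of them reference each inode.  Entries pointing outside the table are
 * skipped here; a real checker would flag them as corruption.
 */
void count_links(const struct dirent_rec *ents, size_t nr_ents,
		 uint32_t *counts, size_t nr_inodes)
{
	size_t i;

	for (i = 0; i < nr_inodes; i++)
		counts[i] = 0;
	for (i = 0; i < nr_ents; i++) {
		if (ents[i].ino < nr_inodes)
			counts[ents[i].ino]++;
	}
}

/*
 * Pass 2: compare the recorded link count of each inode with the observed
 * count and reset any that disagree.  Returns how many were corrected.
 */
size_t reset_links(uint32_t *nlink, const uint32_t *counts, size_t nr_inodes)
{
	size_t i, fixed = 0;

	for (i = 0; i < nr_inodes; i++) {
		if (nlink[i] != counts[i]) {
			nlink[i] = counts[i];
			fixed++;
		}
	}
	return fixed;
}
```

In this sketch a scrub-only pass would stop after comparing the two arrays, while repair performs the reset in the second function.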
      
      This patchset creates a tailored readdir implementation for scrub
      because the regular version has to cycle ILOCKs to copy information to
      userspace.  We can't cycle the ILOCK during the nlink scan and we don't
      need all the other VFS support code (maintaining a readdir cursor and
      translating XFS structures to VFS structures and back) so it was easier
      to duplicate the code.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'scrub-nlinks-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: teach repair to fix file nlinks
        xfs: track directory entry updates during live nlinks fsck
        xfs: teach scrub to check file nlinks
        xfs: report health of inode link counts
      128d0fd1
    • Chandan Babu R's avatar
      Merge tag 'repair-quotacheck-6.9_2024-02-23' of... · aa03f524
      Chandan Babu R authored
      Merge tag 'repair-quotacheck-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: online repair of quota counters
      
      This series uses the inode scanner and live update hook functionality
      introduced in the last patchset to implement quotacheck on a live
      filesystem.  The quotacheck scrubber builds an incore copy of the
      dquot resource usage counters and compares it to the live dquots to
      report discrepancies.
      
      If the user chooses to repair the quota counters, the repair function
      visits each incore dquot to update the counts from the live information.
      The live update hooks are key to keeping the incore copy up to date.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-quotacheck-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: repair dquots based on live quotacheck results
        xfs: repair cannot update the summary counters when logging quota flags
        xfs: track quota updates during live quotacheck
        xfs: implement live quotacheck inode scan
        xfs: create a sparse load xfarray function
        xfs: create a helper to count per-device inode block usage
        xfs: create a xchk_trans_alloc_empty helper for scrub
        xfs: report the health of quota counts
      aa03f524
    • Chandan Babu R's avatar
      Merge tag 'repair-inode-mode-6.9_2024-02-23' of... · 8e3ef44f
      Chandan Babu R authored
      Merge tag 'repair-inode-mode-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
      
      xfs: repair inode mode by scanning dirs
      
      One missing piece of functionality in the inode record repair code is
      figuring out what to do with a file whose mode is so corrupt that we
      cannot tell the type of the file.  Originally this was done by
      guessing the mode from the ondisk inode contents, but Christoph didn't
      like that because it read from data fork block 0, which could be user
      controlled data.
      
      Therefore, I've replaced all that with a directory scanner that looks
      for any dirents that point to the file with the garbage mode.  If so,
      the ftype in the dirent will tell us exactly what mode to set on the
      file.  Since users cannot directly write to the ftype field of a dirent,
      this should be safe.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
      
      * tag 'repair-inode-mode-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
        xfs: repair file modes by scanning for a dirent pointing to us
        xfs: create a macro for decoding ftypes in tracepoints
        xfs: create a predicate to determine if two xfs_names are the same
        xfs: create a static name for the dot entry too
        xfs: iscan batching should handle unallocated inodes too
        xfs: cache a bunch of inodes for repair scans
        xfs: stagger the starting AG of scrub iscans to reduce contention
        xfs: allow scrub to hook metadata updates in other writers
        xfs: implement live inode scan for scrub
        xfs: speed up xfs_iwalk_adjust_start a little bit
      8e3ef44f
  7. 22 Feb, 2024 14 commits