1. 09 Jul, 2024 1 commit
    • gfs2: Clean up glock demote logic · f75efefb
      Andreas Gruenbacher authored
      The logic for determining when to demote a glock in glock_work_func(),
      introduced in commit 7cf8dcd3 ("GFS2: Automatically adjust glock min
      hold time"), doesn't make sense: inode glocks have a minimum hold time
      that delays demotion, while all other glocks are expected to be demoted
      immediately.  However, instead of demoting non-inode glocks immediately,
      glock_work_func() schedules glock work for them to be demoted.
      Get rid of that unnecessary indirection.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  2. 20 Jun, 2024 3 commits
    • gfs2: Revert "check for no eligible quota changes" · 5a1906a4
      Andreas Gruenbacher authored
      Since the previous commit, function gfs2_quota_sync() will not cause the
      sync generation to creep forward by one every time the function is
      called; this helps keep things a bit more tidy.  We also don't care that
      this function allocates a page of memory every time it is called, so
      there is no good reason for keeping qd_changed() anymore, which just
      duplicates qd_grab_sync().
      
      This reverts commit 06aa6fd3.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Be more careful with the quota sync generation · d9a75a60
      Andreas Gruenbacher authored
      The quota sync generation is only ever updated under sd_quota_sync_mutex
      by gfs2_quota_sync(), but its current value is fetched outside of that
      mutex, so use WRITE_ONCE() and READ_ONCE() when accessing it without
      holding that mutex.
      
      Pass the current sync generation to do_sync() from its callers to ensure
      that we're not recording the wrong generation when the syncing is
      done.  Also, make sure that qd->qd_sync_gen only ever moves forward.
      
      In gfs2_quota_sync(), only write the new sync generation when we know
      that there are changes.  This eliminates the need for function
      qd_changed(), which we will remove in the next commit.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
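The access pattern described in this commit message can be sketched in user-space C. This is only an illustration under stated assumptions, not the gfs2 code: the `READ_ONCE()`/`WRITE_ONCE()` macros below are simplified volatile-access stand-ins for the kernel macros, the mutex is not modeled (the writer is assumed to be called with it held), and the names `sb_sketch`, `qd_sketch`, `current_sync_gen`, `qd_record_sync_gen` and `quota_sync_sketch` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel macros (assumption: volatile access only). */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

struct sb_sketch {
    uint64_t sync_gen;  /* stands in for the filesystem-wide sync generation */
};

struct qd_sketch {
    uint64_t sync_gen;  /* stands in for qd->qd_sync_gen */
};

/* Reader that does not hold the mutex: use a marked access. */
static uint64_t current_sync_gen(struct sb_sketch *sb)
{
    return READ_ONCE(sb->sync_gen);
}

/* Record a completed sync; the per-object generation only moves forward. */
static void qd_record_sync_gen(struct qd_sketch *qd, uint64_t sync_gen)
{
    if (qd->sync_gen < sync_gen)
        qd->sync_gen = sync_gen;
}

/*
 * Writer. Assumed to be called with the sync mutex held (not modeled here),
 * so the plain read of sb->sync_gen is safe. The new generation is only
 * published when there are changes to sync.
 */
static uint64_t quota_sync_sketch(struct sb_sketch *sb, struct qd_sketch *qd,
                                  bool has_changes)
{
    uint64_t gen = sb->sync_gen;

    if (has_changes) {
        gen++;
        qd_record_sync_gen(qd, gen);    /* generation passed in by the caller */
        WRITE_ONCE(sb->sync_gen, gen);  /* visible to lockless readers */
    }
    return gen;
}
```

In this sketch, a call with `has_changes == false` leaves the generation untouched, which mirrors the stated goal of not letting the generation creep forward on every call.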
    • gfs2: Get rid of some unnecessary quota locking · 8d89e068
      Andreas Gruenbacher authored
      With the locking the previous patch has introduced for each struct
      gfs2_quota_data object, sd_quota_mutex has become largely irrelevant.
      By waiting on the buffer head instead of waiting on the mutex in
      bh_get(), it becomes completely irrelevant and can be removed.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  3. 12 Jun, 2024 1 commit
    • gfs2: Add some missing quota locking · d5563f42
      Andreas Gruenbacher authored
      The quota code is missing some locking between local quota changes and
      syncing those quota changes to the global quota file (gfs2_quota_sync);
      in particular, qd->qd_change needs to be kept in sync with the
      QDF_CHANGE change flag and the number of references held.  Use the
      qd->qd_lockref.lock spinlock for that.
      
      With the qd->qd_lockref.lock spinlock held, we can no longer call
      lockref_get(), so turn qd_hold() into a variant that assumes that the
      lock is held.  This function is really supposed to take an additional
      reference when one or more references are already held, so check for
      that instead of checking if the lockref is dead.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  4. 08 Jun, 2024 7 commits
    • gfs2: Fold qd_fish into gfs2_quota_sync · 614abc11
      Andreas Gruenbacher authored
      The split between qd_fish() and gfs2_quota_sync() is rather unfortunate
      as qd_fish() is repeatedly called to scan sdp->sd_quota_list only to
      find the next object that needs syncing; if there are multiple
      objects on the list that need syncing, it makes more sense to grab them
      all in one go.  This is relatively easy to do when qd_fish() is folded
      into gfs2_quota_sync().
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: quota need_sync cleanup · b510af07
      Andreas Gruenbacher authored
      Rename variable 'value' to 'change' as it stores a change in value.
      
      Add new 'value' and 'limit' variables for the current value and limit.
      
      Only fetch the tuning parameters when we need them.
      
      Get rid of unnecessary nesting.
      
      No change in functionality.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Fix and clean up function do_qc · 7da4d6e1
      Andreas Gruenbacher authored
      Function do_qc() is supposed to be conceptually simple: it alters the
      current in-memory and on-disk quota change values for a given uid/gid by
      a given delta.  If the on-disk record isn't defined yet, a new record is
      created.  If the on-disk record exists and the resulting change value is
      zero, there no longer is a need for that record and so the record is
      deleted.  On top of that, some reference counting is involved when
      creating and deleting records.
      
      Currently, instead of doing the above, do_qc() alters the on-disk value
      and then it sets the in-memory value to the on-disk value.  This is
      incorrect when the on-disk value differs from the in-memory value.  The
      two values are allowed to differ when quota changes are synced to the
      global quota file.  Fix by changing both values by the same amount.
      
      In addition, do_qc() currently gets confused when the delta value is 0.
      It isn't supposed to be called that way, but that assumption isn't
      mentioned and it makes the code harder to read.  Make the code more
      explicit.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
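The intended behavior of do_qc() as described in this commit message can be modeled roughly as follows. This is a hedged sketch, not the kernel function: the struct `qc_slot_sketch`, its fields, and `do_qc_sketch()` are hypothetical names, and the "reference" is reduced to a plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one slot in the per-node quota changes file. */
struct qc_slot_sketch {
    bool in_use;        /* does an on-disk record exist? */
    int64_t on_disk;    /* on-disk change value */
    int64_t in_memory;  /* in-memory change value */
    int refs;           /* "reference" held while the slot is in use */
};

/* Apply a non-zero delta to both values, creating or deleting the record. */
static void do_qc_sketch(struct qc_slot_sketch *s, int64_t delta)
{
    assert(delta != 0);  /* callers are not supposed to pass 0 */

    if (!s->in_use) {
        /* No record yet: create one and take a reference. */
        s->in_use = true;
        s->on_disk = 0;
        s->refs++;
    }

    /* Change both values by the same amount; never copy one onto the other. */
    s->on_disk += delta;
    s->in_memory += delta;

    if (s->on_disk == 0) {
        /* Nothing left to record: delete the record and drop the reference. */
        s->in_use = false;
        s->refs--;
    }
}
```

If the two values already differ before the call, they differ by the same amount afterwards. Assigning the on-disk value to the in-memory value, which is the behavior the commit message calls incorrect, would lose that difference.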
    • gfs2: Revert "Add quota_change type" · ec4b5200
      Andreas Gruenbacher authored
      Commit 432928c9 ("gfs2: Add quota_change type") makes the incorrect
      assertion that function do_qc() should behave differently in the two
      contexts it is used in, but that isn't actually true.  In all cases,
      do_qc() grabs a "reference" when it starts using a slot in the per-node
      quota changes file, and it releases that "reference" when no more
      residual changes remain.  Revert that broken commit.
      
      There are some remaining issues with function do_qc() which are
      addressed in the next commit.
      
      This reverts commit 432928c9.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Revert "ignore negated quota changes" · 4b4b6374
      Andreas Gruenbacher authored
      Commit 4c6a0812 ("gfs2: ignore negated quota changes") skips quota
      changes with qd_change == 0 instead of writing them back, which leaves
      behind non-zero qd_change values in the affected slots.  The kernel then
      assumes that those slots are unused, while the qd_change values on disk
      indicate that they are indeed still in use.  The next time the
      filesystem is mounted, those invalid slots are read in from disk, which
      will cause inconsistencies.
      
      Revert that commit to avoid filesystem corruption.
      
      This reverts commit 4c6a0812.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: qd_check_sync cleanups · 59ebc332
      Andreas Gruenbacher authored
      Rename qd_check_sync() to qd_grab_sync() and make it return a bool.
      Turn the sync_gen pointer into a regular u64 and pass in U64_MAX instead
      of a NULL pointer when sync generation checking isn't needed.
      
      Introduce a new qd_ungrab_sync() helper for undoing the effects of
      qd_grab_sync() if the subsequent bh_get() on the qd object fails.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
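The interface change described here, a plain 64-bit value with a "no check" sentinel in place of an optional pointer, might look roughly like this in user-space C. All names (`qd_grab_sketch`, `qd_grab_sync_sketch`, `qd_ungrab_sync_sketch`, `NO_SYNC_GEN_CHECK`) and the exact eligibility conditions are assumptions made for illustration; `UINT64_MAX` stands in for the kernel's `U64_MAX`.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_SYNC_GEN_CHECK UINT64_MAX  /* user-space stand-in for U64_MAX */

/* Hypothetical, heavily reduced quota data object. */
struct qd_grab_sketch {
    bool changed;       /* has unsynced changes */
    bool grabbed;       /* already claimed for syncing */
    uint64_t sync_gen;  /* generation of the last completed sync */
};

/* Try to claim the object for syncing; returns true on success. */
static bool qd_grab_sync_sketch(struct qd_grab_sketch *qd, uint64_t sync_gen)
{
    if (!qd->changed || qd->grabbed)
        return false;
    /* The sentinel disables the generation check instead of a NULL pointer. */
    if (sync_gen != NO_SYNC_GEN_CHECK && qd->sync_gen >= sync_gen)
        return false;
    qd->grabbed = true;
    return true;
}

/* Undo a successful grab, e.g. when a later setup step fails. */
static void qd_ungrab_sync_sketch(struct qd_grab_sketch *qd)
{
    qd->grabbed = false;
}
```

A caller that does not care about generations passes `NO_SYNC_GEN_CHECK`. A caller that grabbed the object but then failed a later step calls the ungrab helper to restore the previous state.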
    • gfs2: Revert "introduce qd_bh_get_or_undo" · 2aedfe84
      Andreas Gruenbacher authored
      The qd_bh_get_or_undo() helper introduced by that commit doesn't improve
      the code much, so revert it and clean things up in a more useful way in
      the next commit.
      
      This reverts commit 7dbc6ae6.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  5. 07 Jun, 2024 1 commit
    • gfs2: Check quota consistency on mount · de0d95c2
      Andreas Gruenbacher authored
      In gfs2_quota_init(), make sure that the per-node "quota_change%u" file
      doesn't contain duplicate uids/gids.  Those duplicates would cause us to
      acquire the glock corresponding to those ids repeatedly, which the glock
      code doesn't allow.
      
      When finding inconsistencies, we wipe them out and ignore them.  The
      resulting quotas will likely be inconsistent, and running quotacheck(1)
      is advised.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
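A duplicate check of the kind described in this commit message could be sketched as below. This is an illustration only: the entry layout, the name `wipe_duplicate_ids()`, the quadratic scan, and the choice to keep the first occurrence and wipe later ones are all assumptions, not details taken from gfs2_quota_init().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one per-node quota change slot. */
struct qc_entry_sketch {
    bool in_use;
    bool is_group;   /* false: uid, true: gid */
    uint32_t id;
    int64_t change;
};

/*
 * Wipe every in-use entry whose (type, id) pair already appeared earlier.
 * Returns the number of entries wiped.
 */
static size_t wipe_duplicate_ids(struct qc_entry_sketch *e, size_t n)
{
    size_t wiped = 0;

    for (size_t i = 0; i < n; i++) {
        if (!e[i].in_use)
            continue;
        for (size_t j = 0; j < i; j++) {
            if (e[j].in_use && e[j].is_group == e[i].is_group &&
                e[j].id == e[i].id) {
                /* Duplicate: wipe it out and ignore it. */
                e[i].in_use = false;
                e[i].change = 0;
                wiped++;
                break;
            }
        }
    }
    return wiped;
}
```

A uid and a gid with the same numeric value are distinct in this sketch. A non-zero return value corresponds to the case where the commit message advises running quotacheck(1).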
  6. 04 Jun, 2024 1 commit
  7. 29 May, 2024 9 commits
    • gfs2: Get rid of demote_ok checks · 713f8834
      Andreas Gruenbacher authored
      The demote_ok glock operation is only still used to prevent the inode
      glocks of the "jindex" and "rindex" directories from getting recycled
      while they are still referenced by sdp->sd_jindex and sdp->sd_rindex.
      However, the LRU walking code will no longer recycle glocks which are
      referenced, so the demote_ok glock operation is obsolete and can be
      removed.
      
      Each of a glock's holders in the gl_holders list is holding a reference
      on the glock, so when the list of holders isn't empty in demote_ok(),
      the existing reference count check will already prevent the glock from
      getting released.  This means that demote_ok() is obsolete as well.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • Revert "GFS2: Don't add all glocks to the lru" · 3f4475bf
      Andreas Gruenbacher authored
      This reverts commit e7ccaf5f.
      
      Before commit e7ccaf5f, every time a resource group glock was
      dequeued by gfs2_glock_dq(), it was added to the glock LRU list even
      though the glock was still referenced by the resource group and could
      never be evicted, anyway.  Commit e7ccaf5f added a GLOF_LRU hack to
      avoid that overhead for resource group glocks, and that hack has since
      been adopted for some other types of glocks as well.
      
      We now no longer add glocks to the glock LRU list while they are still
      referenced.  This solves the underlying problem, and obsoletes the
      GLOF_LRU hack.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      (cherry picked from commit 3e5257c810cba91e274d07f3db5cf013c7c830be)
    • gfs2: Revise glock reference counting model · 767fd5a0
      Andreas Gruenbacher authored
      In the current glock reference counting model, a bias of one is added to
      a glock's refcount when it is locked (gl->gl_state != LM_ST_UNLOCKED).
      A glock is removed from the lru_list when it is enqueued, and added back
      when it is dequeued.  This isn't a very appropriate model because most
      glocks are held for long periods of time (for example, the inode "owns"
      references to its inode and iopen glocks as long as the inode is cached
      even when the glock state changes to LM_ST_UNLOCKED), and they can only
      be freed when they are no longer referenced, anyway.
      
      Fix this by getting rid of the refcount bias for locked glocks.  That
      way, we can use lockref_put_or_lock() to efficiently drop all but the
      last glock reference, and put the glock onto the lru_list when the last
      reference is dropped.  When find_insert_glock() returns a reference to a
      cached glock, it removes the glock from the lru_list.
      
      Dumping the "glocks" and "glstats" debugfs files also takes glock
      references, but instead of removing the glocks from the lru_list in that
      case as well, we leave them on the list.  This ensures that dumping
      those files won't perturb the order of the glocks on the lru_list.
      
      In addition, when the last reference to an *unlocked* glock is dropped,
      we immediately free it; this preserves the preexisting behavior.  If it
      later turns out that caching unlocked glocks is useful in some
      situations, we can change the caching strategy.
      
      It is currently unclear if a glock that has no active references can
      have the GLF_LFLUSH flag set.  To make sure that such a glock won't
      accidentally be evicted due to memory pressure, we add a GLF_LFLUSH
      check to gfs2_dispose_glock_lru().
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
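The revised reference counting model can be illustrated with a small single-threaded model. This is a sketch under loud assumptions: there is no locking (the kernel uses a lockref and lockref_put_or_lock()), the LRU list is reduced to a boolean flag, and every name below (`glock_sketch`, `glock_get_cached`, `glock_get_for_dump`, `glock_put`) is hypothetical.

```c
#include <stdbool.h>

enum gl_state_sketch { ST_UNLOCKED, ST_LOCKED };

/* Hypothetical, heavily reduced glock. */
struct glock_sketch {
    unsigned int refcount;       /* no bias for the locked state */
    enum gl_state_sketch state;
    bool on_lru;                 /* stands in for lru_list membership */
    bool freed;
};

/* Lookup path: returning a cached glock takes it off the LRU list. */
static void glock_get_cached(struct glock_sketch *gl)
{
    gl->refcount++;
    gl->on_lru = false;
}

/* Debug-dump path: take a reference but leave the LRU order untouched. */
static void glock_get_for_dump(struct glock_sketch *gl)
{
    gl->refcount++;
}

static void glock_put(struct glock_sketch *gl)
{
    if (gl->refcount > 1) {
        /* Fast path: not the last reference. */
        gl->refcount--;
        return;
    }
    /* Last reference dropped. */
    gl->refcount = 0;
    if (gl->state == ST_UNLOCKED)
        gl->freed = true;   /* unlocked glocks are freed immediately */
    else
        gl->on_lru = true;  /* locked glocks are cached on the LRU list */
}
```

In this model only the final put touches the LRU state, which is the property the commit message relies on for dropping all but the last reference cheaply.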
    • gfs2: Switch to a per-filesystem glock workqueue · 30e388d5
      Andreas Gruenbacher authored
      Switch to a per-filesystem glock workqueue.  Additional workqueues are
      cheap nowadays, and keeping separate workqueues allows flushing the work
      of each filesystem without affecting the others.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Report when glocks cannot be freed for a long time · 51568ac2
      Andreas Gruenbacher authored
      When glocks cannot be freed for a long time, avoid the "task blocked for
      more than N seconds" messages and report how many glocks are still
      outstanding, instead.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: gfs2_glock_get cleanup · 8f6b8f14
      Andreas Gruenbacher authored
      Clean up the messy code in gfs2_glock_get().  No change in
      functionality.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Invert the GLF_INITIAL flag · c8758ad0
      Andreas Gruenbacher authored
      Invert the meaning of the GLF_INITIAL flag: right now, when GLF_INITIAL
      is set, a DLM lock exists and we have a valid identifier for it; when
      GLF_INITIAL is cleared, no DLM lock exists (yet).  This is confusing.
      In addition, it makes more sense to highlight the exceptional case
      (i.e., no DLM lock exists yet) in glock dumps and trace points than to
      highlight the common case.
      
      To avoid confusion between the "old" and the "new" meaning of the flag,
      use 'a' instead of 'I' to represent the flag.
      
      For improved code consistency, check if the GLF_INITIAL flag is cleared
      to determine whether a DLM lock exists instead of checking if the lock
      identifier is non-zero.
      
      Document what the flag is used for.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Update glocks documentation · 97d6fdcd
      Andreas Gruenbacher authored
      Rearrange the table of locking modes and associated caching capability
      to be in order of increasing caching capability.
      
      Update the description of the glock operations.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  8. 28 May, 2024 6 commits
  9. 26 May, 2024 5 commits
  10. 25 May, 2024 6 commits
    • Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 9b62e02e
      Linus Torvalds authored
      Pull misc fixes from Andrew Morton:
       "16 hotfixes, 11 of which are cc:stable.
      
        A few nilfs2 fixes, the remainder are for MM: a couple of selftests
        fixes, various singletons fixing various issues in various parts"
      
      * tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        mm/ksm: fix possible UAF of stable_node
        mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
        mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
        nilfs2: fix potential hang in nilfs_detach_log_writer()
        nilfs2: fix unexpected freezing of nilfs_segctor_sync()
        nilfs2: fix use-after-free of timer for log writer thread
        selftests/mm: fix build warnings on ppc64
        arm64: patching: fix handling of execmem addresses
        selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
        selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
        selftests/mm: compaction_test: fix bogus test success on Aarch64
        mailmap: update email address for Satya Priya
        mm/huge_memory: don't unpoison huge_zero_folio
        kasan, fortify: properly rename memintrinsics
        lib: add version into /proc/allocinfo output
        mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
    • Merge tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a0db36ed
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
      
       - Fix x86 IRQ vector leak caused by a CPU offlining race
      
       - Fix build failure in the riscv-imsic irqchip driver
         caused by an API-change semantic conflict
      
       - Fix use-after-free in irq_find_at_or_after()
      
      * tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()
        genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
        irqchip/riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict
    • Merge tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3a390f24
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - Fix regressions of the new x86 CPU VFM (vendor/family/model)
         enumeration/matching code
      
       - Fix crash kernel detection on buggy firmware with
         non-compliant ACPI MADT tables
      
       - Address Kconfig warning
      
      * tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
        crypto: x86/aes-xts - switch to new Intel CPU model defines
        x86/topology: Handle bogus ACPI tables correctly
        x86/kconfig: Select ARCH_WANT_FRAME_POINTERS again when UNWINDER_FRAME_POINTER=y
    • Merge tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi · 56676c4c
      Linus Torvalds authored
      Pull ipmi updates from Corey Minyard:
       "Mostly updates for deprecated interfaces, platform.remove and
        converting from a tasklet to a BH workqueue.
      
        Also use HAS_IOPORT for disabling inb()/outb()"
      
      * tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi:
        ipmi: kcs_bmc_npcm7xx: Convert to platform remove callback returning void
        ipmi: kcs_bmc_aspeed: Convert to platform remove callback returning void
        ipmi: ipmi_ssif: Convert to platform remove callback returning void
        ipmi: ipmi_si_platform: Convert to platform remove callback returning void
        ipmi: ipmi_powernv: Convert to platform remove callback returning void
        ipmi: bt-bmc: Convert to platform remove callback returning void
        char: ipmi: handle HAS_IOPORT dependencies
        ipmi: Convert from tasklet to BH workqueue
    • Merge tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client · 74eca356
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "A series from Xiubo that adds support for additional access checks
        based on MDS auth caps which were recently made available to clients.
      
        This is needed to prevent scenarios where the MDS quietly discards
        updates that a UID-restricted client previously (wrongfully) acked to
        the user.
      
        Other than that, just a documentation fixup"
      
      * tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client:
        doc: ceph: update userspace command to get CephFS metadata
        ceph: add CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK feature bit
        ceph: check the cephx mds auth access for async dirop
        ceph: check the cephx mds auth access for open
        ceph: check the cephx mds auth access for setattr
        ceph: add ceph_mds_check_access() helper
        ceph: save cap_auths in MDS client when session is opened
    • Merge tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3 · 89b61ca4
      Linus Torvalds authored
      Pull ntfs3 updates from Konstantin Komarov:
       "Fixes:
         - reusing of the file index (could cause the file to be trimmed)
         - infinite dir enumeration
         - taking DOS names into account during link counting
         - le32_to_cpu conversion, 32 bit overflow, NULL check
         - some code was refactored
      
        Changes:
         - removed max link count info display during driver init
      
        Remove:
         - atomic_open has been removed for lack of use"
      
      * tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3:
        fs/ntfs3: Break dir enumeration if directory contents error
        fs/ntfs3: Fix case when index is reused during tree transformation
        fs/ntfs3: Mark volume as dirty if xattr is broken
        fs/ntfs3: Always make file nonresident on fallocate call
        fs/ntfs3: Redesign ntfs_create_inode to return error code instead of inode
        fs/ntfs3: Use variable length array instead of fixed size
        fs/ntfs3: Use 64 bit variable to avoid 32 bit overflow
        fs/ntfs3: Check 'folio' pointer for NULL
        fs/ntfs3: Missed le32_to_cpu conversion
        fs/ntfs3: Remove max link count info display during driver init
        fs/ntfs3: Taking DOS names into account during link counting
        fs/ntfs3: remove atomic_open
        fs/ntfs3: use kcalloc() instead of kzalloc()