1. 20 Sep, 2024 16 commits
  2. 01 Sep, 2024 13 commits
    • nfsd: move nfsd_pool_stats_open into nfsctl.c · c9f10f81
      NeilBrown authored
      nfsd_pool_stats_open() is used in nfsctl.c, so move it there.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      c9f10f81
    • SUNRPC: make various functions static, or not exported. · f2b27e1d
      NeilBrown authored
      Various functions are only used within the sunrpc module, and several
      are only used in the one file.  So clean up:
      
      These are marked static, and any EXPORT is removed.
        svc_rpcb_setup()
        svc_rqst_alloc()
        svc_rqst_free()  - also moved before first use
        svc_rpcbind_set_version()
        svc_drop() - also moved to svc.c
      
      These are now not EXPORTed, but are not static.
        svc_authenticate()
        svc_sock_update_bufs()
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      f2b27e1d
    • lockd: discard nlmsvc_timeout · 4ed9ef32
      NeilBrown authored
      nlmsvc_timeout always has the same value as (nlm_timeout * HZ), so use
      that in the one place that nlmsvc_timeout is used.
      
      In truth it *might* not always be the same, as nlmsvc_timeout is only set
      when lockd is started while nlm_timeout can be set at any time via
      sysctl.  I think this difference is not helpful, so removing it is good.
      
      Also remove the test for nlm_timeout being 0.  This is not possible -
      unless a module parameter is used to set the minimum timeout to 0, and
      if that happens then it probably should be honoured.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      4ed9ef32
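The difference between a value cached at start-up and one computed at the point of use can be sketched in userspace C. This is only an illustration of the reasoning in the commit message above: `TOY_HZ`, `cached_timeout`, `toy_lockd_start()` and `toy_timeout_jiffies()` are hypothetical names, and the tick rate of 100 is an assumption, not the kernel's actual configuration.

```c
#include <assert.h>

#define TOY_HZ 100UL                    /* assumed tick rate, for illustration only */

static unsigned long nlm_timeout = 10;  /* seconds; stands in for the sysctl value */
static unsigned long cached_timeout;    /* models the removed nlmsvc_timeout */

/* Old scheme: snapshot the timeout once, when the service starts. */
static void toy_lockd_start(void)
{
    cached_timeout = nlm_timeout * TOY_HZ;
}

/* New scheme: derive the timeout in ticks every time it is needed. */
static unsigned long toy_timeout_jiffies(void)
{
    return nlm_timeout * TOY_HZ;
}
```

If `nlm_timeout` is changed after `toy_lockd_start()` has run, `cached_timeout` keeps the old value while `toy_timeout_jiffies()` follows the new one, which is the discrepancy the commit removes.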
    • nfsd: don't EXPORT_SYMBOL nfsd4_ssc_init_umount_work() · 8203ab8a
      NeilBrown authored
      nfsd4_ssc_init_umount_work() is only used in the nfsd module, so there
      is no need to EXPORT it.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      8203ab8a
    • NFS: trace: show TIMEDOUT instead of 0x6e · cef48236
      Chen Hanxiao authored
      __nfs_revalidate_inode may return ETIMEDOUT.
      
      Print the symbol for ETIMEDOUT in the nfs trace:
      
      before:
      cat-5191 [005] 119.331127: nfs_revalidate_inode_exit: error=-110 (0x6e)
      
      after:
      cat-1738 [004] 44.365509: nfs_revalidate_inode_exit: error=-110 (TIMEDOUT)
      Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      cef48236
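The before/after trace lines above amount to a table lookup with a hexadecimal fallback. A minimal userspace sketch of that idea follows; it is not the kernel's tracepoint code, and `err_names`, `err_symbol()` and `format_error()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table: a few errno values and the short symbols to print. */
struct err_name { int err; const char *name; };

static const struct err_name err_names[] = {
    { EPERM,     "PERM" },
    { ENOENT,    "NOENT" },
    { EIO,       "IO" },
    { ETIMEDOUT, "TIMEDOUT" },
};

/* Return the symbol for a negative error code, or NULL if it is not listed. */
static const char *err_symbol(int error)
{
    for (size_t i = 0; i < sizeof(err_names) / sizeof(err_names[0]); i++)
        if (err_names[i].err == -error)
            return err_names[i].name;
    return NULL;
}

/* Format "error=<n> (<symbol>)", falling back to hex when no symbol is known. */
static int format_error(char *buf, size_t len, int error)
{
    const char *sym = err_symbol(error);

    assert(buf != NULL);
    if (sym)
        return snprintf(buf, len, "error=%d (%s)", error, sym);
    return snprintf(buf, len, "error=%d (0x%x)", error, (unsigned int)-error);
}
```

Before the fix, ETIMEDOUT behaved like an unlisted value and took the hex fallback; adding one table entry is enough to get the symbolic form.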
    • nfsd: use system_unbound_wq for nfsd_file_gc_worker() · 4b84551a
      Youzhong Yang authored
      After many rounds of changes in filecache.c, the fix by commit
      ce7df055 ("NFSD: Make the file_delayed_close workqueue UNBOUND")
      is gone, and now we are getting syslog messages like these:
      
      [ 1618.186688] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 4 times, consider switching to WQ_UNBOUND
      [ 1638.661616] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 8 times, consider switching to WQ_UNBOUND
      [ 1665.284542] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 16 times, consider switching to WQ_UNBOUND
      [ 1759.491342] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 32 times, consider switching to WQ_UNBOUND
      [ 3013.012308] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 64 times, consider switching to WQ_UNBOUND
      [ 3154.172827] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 128 times, consider switching to WQ_UNBOUND
      [ 3422.461924] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 256 times, consider switching to WQ_UNBOUND
      [ 3963.152054] workqueue: nfsd_file_gc_worker [nfsd] hogged CPU for >13333us 512 times, consider switching to WQ_UNBOUND
      
      Consider using system_unbound_wq instead of system_wq for
      nfsd_file_gc_worker().
      Signed-off-by: Youzhong Yang <youzhong@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      4b84551a
    • nfsd: count nfsd_file allocations · 700bb4ff
      Jeff Layton authored
      We already count the frees (via nfsd_file_releases). Count the
      allocations as well. Also switch the direct call to nfsd_file_slab_free
      in nfsd_file_do_acquire to nfsd_file_free, so that the allocs and
      releases match up.
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      700bb4ff
    • nfsd: fix refcount leak when file is unhashed after being found · 8a792617
      Jeff Layton authored
      If we wait_for_construction and find that the file is no longer hashed,
      and we're going to retry the open, the old nfsd_file reference is
      currently leaked. Put the reference before retrying.
      
      Fixes: c6593366 ("nfsd: don't kill nfsd_files because of lease break error")
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Tested-by: Youzhong Yang <youzhong@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      8a792617
    • nfsd: remove unneeded EEXIST error check in nfsd_do_file_acquire · 81a95c2b
      Jeff Layton authored
      Given that we do the search and insertion while holding the i_lock, I
      don't think it's possible for us to get EEXIST here. Remove this case.
      
      Fixes: c6593366 ("nfsd: don't kill nfsd_files because of lease break error")
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Tested-by: Youzhong Yang <youzhong@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      81a95c2b
    • nfsd: add list_head nf_gc to struct nfsd_file · 8e6e2ffa
      Youzhong Yang authored
      nfsd_file_put() in one thread can race with another thread doing
      garbage collection (running nfsd_file_gc() -> list_lru_walk() ->
      nfsd_file_lru_cb()):
      
        * In nfsd_file_put(), nf->nf_ref is 1, so it tries to do nfsd_file_lru_add().
        * nfsd_file_lru_add() returns true (with NFSD_FILE_REFERENCED bit set)
        * garbage collector kicks in, nfsd_file_lru_cb() clears REFERENCED bit and
          returns LRU_ROTATE.
        * garbage collector kicks in again, nfsd_file_lru_cb() now decrements nf->nf_ref
          to 0, runs nfsd_file_unhash(), removes it from the LRU and adds to the dispose
          list [list_lru_isolate_move(lru, &nf->nf_lru, head)]
        * nfsd_file_put() detects NFSD_FILE_HASHED bit is cleared, so it tries to remove
          the 'nf' from the LRU [if (!nfsd_file_lru_remove(nf))]. The 'nf' has been added
          to the 'dispose' list by nfsd_file_lru_cb(), so nfsd_file_lru_remove(nf) simply
          treats it as part of the LRU and removes it, which leads to its removal from
          the 'dispose' list.
        * At this moment, 'nf' is unhashed with its nf_ref being 0, and not on the LRU.
          nfsd_file_put() continues its execution [if (refcount_dec_and_test(&nf->nf_ref))],
          as nf->nf_ref is already 0, nf->nf_ref is set to REFCOUNT_SATURATED, and the 'nf'
          gets no chance of being freed.
      
      nfsd_file_put() can also race with nfsd_file_cond_queue():
        * In nfsd_file_put(), nf->nf_ref is 1, so it tries to do nfsd_file_lru_add().
        * nfsd_file_lru_add() sets REFERENCED bit and returns true.
        * Some userland application runs 'exportfs -f' or something like that, which triggers
          __nfsd_file_cache_purge() -> nfsd_file_cond_queue().
        * In nfsd_file_cond_queue(), it runs [if (!nfsd_file_unhash(nf))], unhash is done
          successfully.
        * nfsd_file_cond_queue() runs [if (!nfsd_file_get(nf))], now nf->nf_ref goes to 2.
        * nfsd_file_cond_queue() runs [if (nfsd_file_lru_remove(nf))], it succeeds.
        * nfsd_file_cond_queue() runs [if (refcount_sub_and_test(decrement, &nf->nf_ref))]
          (with "decrement" being 2), so the nf->nf_ref goes to 0, the 'nf' is added to the
          dispose list [list_add(&nf->nf_lru, dispose)]
        * nfsd_file_put() detects NFSD_FILE_HASHED bit is cleared, so it tries to remove
          the 'nf' from the LRU [if (!nfsd_file_lru_remove(nf))]. Although the 'nf' is not
          in the LRU, it is linked in the 'dispose' list, so nfsd_file_lru_remove() simply
          treats it as part of the LRU and removes it. This leads to its removal from
          the 'dispose' list!
        * Now nf->nf_ref is 0 and the 'nf' is unhashed. nfsd_file_put() continues its
          execution and sets nf->nf_ref to REFCOUNT_SATURATED.
      
      As shown in the above analysis, using nf_lru for both the LRU list and dispose list
      can cause the leaks. This patch adds a new list_head nf_gc in struct nfsd_file, and uses
      it for the dispose list. This does not fix the nfsd_file leaking issue completely.
      Signed-off-by: Youzhong Yang <youzhong@gmail.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      8e6e2ffa
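The core of both races described above is that a single embedded list link can belong to only one list at a time, so an "LRU remove" that unlinks `nf_lru` also pulls the object off the dispose list if that is where `nf_lru` currently sits. The following userspace sketch illustrates this under heavy simplification: the list helpers only mimic the kernel's `list_head` behaviour, `struct toy_file`, `toy_lru_remove()` and `survives_on_dispose()` are hypothetical, and the real code goes through `list_lru` with locking and reference counts that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *n, struct list_head *head)
{
    assert(list_empty(n));      /* a link can sit on only one list at a time */
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical, stripped-down nfsd_file: only the two list links. */
struct toy_file {
    struct list_head nf_lru;    /* LRU membership */
    struct list_head nf_gc;     /* dispose-list membership (added by the patch) */
};

/* Stand-in for an LRU remove: unlinks nf_lru from whatever list holds it. */
static bool toy_lru_remove(struct toy_file *nf)
{
    if (list_empty(&nf->nf_lru))
        return false;
    list_del_init(&nf->nf_lru);
    return true;
}

/* Queue the file for disposal, then run a racing LRU remove.
 * Returns true if the dispose list still holds the file afterwards. */
static bool survives_on_dispose(bool use_nf_gc)
{
    struct list_head dispose;
    struct toy_file nf;

    list_init(&dispose);
    list_init(&nf.nf_lru);
    list_init(&nf.nf_gc);

    list_add(use_nf_gc ? &nf.nf_gc : &nf.nf_lru, &dispose);
    toy_lru_remove(&nf);
    return !list_empty(&dispose);
}
```

With the shared link (`use_nf_gc == false`) the dispose list ends up empty and the object would never be freed; with a separate `nf_gc` link the LRU remove has no effect on the dispose list.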
    • Linux 6.11-rc6 · 431c1646
      Linus Torvalds authored
      431c1646
    • Merge tag 'v6.11-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 6b9ffc45
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
      
       - copy_file_range fix
      
       - two read fixes including read past end of file rc fix and read retry
         crediting fix
      
       - falloc zero range fix
      
      * tag 'v6.11-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Fix FALLOC_FL_ZERO_RANGE to preflush buffered part of target region
        cifs: Fix copy offload to flush destination region
        netfs, cifs: Fix handling of short DIO read
        cifs: Fix lack of credit renegotiation on read retry
      6b9ffc45
    • Merge tag 'bcachefs-2024-08-21' of https://github.com/koverstreet/bcachefs · a4c76312
      Linus Torvalds authored
      Pull bcachefs fixes from Kent Overstreet:
       "The data corruption in the buffered write path is troubling; inode
        lock should not have been able to cause that...
      
         - Fix a rare data corruption in the rebalance path, caught as a nonce
           inconsistency on encrypted filesystems
      
         - Revert lockless buffered write path
      
         - Mark more errors as autofix"
      
      * tag 'bcachefs-2024-08-21' of https://github.com/koverstreet/bcachefs:
        bcachefs: Mark more errors as autofix
        bcachefs: Revert lockless buffered IO path
        bcachefs: Fix bch2_extents_match() false positive
        bcachefs: Fix failure to return error in data_update_index_update()
      a4c76312
  3. 31 Aug, 2024 11 commits
    • bcachefs: Mark more errors as autofix · 3d3020c4
      Kent Overstreet authored
      errors that are known to always be safe to fix should be autofix: this
      should be most errors even at this point, but that will need some
      thorough review.
      
      note that errors are still logged in the superblock, so we'll still know
      that they happened.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      3d3020c4
    • bcachefs: Revert lockless buffered IO path · e3e69409
      Kent Overstreet authored
      We had a report of data corruption on nixos when building installer
      images.
      
      https://github.com/NixOS/nixpkgs/pull/321055#issuecomment-2184131334
      
      It seems that writes are being dropped, but only when issued by QEMU,
      and possibly only in snapshot mode. It's undetermined whether it's write
      calls or dirty folios that are being dropped.
      
      Further testing, via minimizing the original patch to just the change
      that skips the inode lock on non appends/truncates, reveals that it
      really is just not taking the inode lock that causes the corruption: it
      has nothing to do with the other logic changes for preserving write
      atomicity in corner cases.
      
      It's also kernel config dependent: it doesn't reproduce with the minimal
      kernel config that ktest uses, but it does reproduce with nixos's distro
      config. Bisecting the kernel config initially pointed the finger at page
      migration or compaction, but it appears that was erroneous; we haven't
      yet determined what kernel config option actually triggers it.
      
      Sadly it appears this will have to be reverted since we're getting too
      close to release and my plate is full, but we'd _really_ like to fully
      debug it.
      
      My suspicion is that this patch is exposing a preexisting bug - the
      inode lock actually covers very little in IO paths, and we have a
      different lock (the pagecache add lock) that guards against races with
      truncate here.
      
      Fixes: 7e64c86c ("bcachefs: Buffered write path now can avoid the inode lock")
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      e3e69409
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 6cd90e5e
      Linus Torvalds authored
      Pull misc fixes from Guenter Roeck.
      
      These are fixes for regressions that Guenter has been reporting, and
      the maintainers haven't picked up and sent in. With rc6 fairly imminent,
      I'm taking them directly from Guenter.
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        apparmor: fix policy_unpack_test on big endian systems
        Revert "MIPS: csrc-r4k: Apply verification clocksource flags"
        microblaze: don't treat zero reserved memory regions as error
      6cd90e5e
    • Merge tag 'pwrseq-fixes-for-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 8463be84
      Linus Torvalds authored
      Pull power sequencing fix from Bartosz Golaszewski:
       "A follow-up fix for the power sequencing subsystem. It turned out the
        previous fix for this driver was incomplete and broke the WLAN support
        on some platforms. This addresses the issue.
      
         - set the direction of the wlan-enable GPIO to output after
           requesting it as-is"
      
      * tag 'pwrseq-fixes-for-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        power: sequencing: qcom-wcn: set the wlan-enable GPIO to output
      8463be84
    • power: sequencing: qcom-wcn: set the wlan-enable GPIO to output · d8b76207
      Bartosz Golaszewski authored
      Commit a9aaf1ff ("power: sequencing: request the WLAN enable GPIO
      as-is") broke WLAN on boards on which the wlan-enable GPIO enabling the
      wifi module isn't in output mode by default. We need to set direction to
      output while retaining the value that was already set to keep the ath
      module on if it's already started.
      
      Fixes: a9aaf1ff ("power: sequencing: request the WLAN enable GPIO as-is")
      Link: https://lore.kernel.org/r/20240823115500.37280-1-brgl@bgdev.pl
      Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
      d8b76207
    • Merge tag 'usb-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · e8784b0a
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB fixes for 6.11-rc6.  Included in here are:
      
         - dwc3 driver fixes for reported issues
      
         - MAINTAINER file update, marking a driver as unsupported :(
      
         - cdnsp driver fixes
      
         - USB gadget driver fix
      
         - USB sysfs fix
      
         - other tiny fixes
      
         - new device ids for usb serial driver
      
        All of these have been in linux-next this week with no reported
        issues"
      
      * tag 'usb-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        USB: serial: option: add MeiG Smart SRM825L
        usb: cdnsp: fix for Link TRB with TC
        usb: dwc3: st: add missing depopulate in probe error path
        usb: dwc3: st: fix probed platform device ref count on probe error path
        usb: dwc3: ep0: Don't reset resource alloc flag (including ep0)
        usb: core: sysfs: Unmerge @usb3_hardware_lpm_attr_group in remove_power_attributes()
        usb: typec: fsa4480: Relax CHIP_ID check
        usb: dwc3: xilinx: add missing depopulate in probe error path
        usb: dwc3: omap: add missing depopulate in probe error path
        dt-bindings: usb: microchip,usb2514: Fix reference USB device schema
        usb: gadget: uvc: queue pump work in uvcg_video_enable()
        cdc-acm: Add DISABLE_ECHO quirk for GE HealthCare UI Controller
        usb: cdnsp: fix incorrect index in cdnsp_get_hw_deq function
        usb: dwc3: core: Prevent USB core invalid event buffer address access
        MAINTAINERS: Mark UVC gadget driver as orphan
      e8784b0a
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 770b0ffe
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Minor fixes only.
      
        The sd.c one ignores a sync cache request if format is in progress
        which can happen if formatting a drive across suspend/resume"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: sd: Ignore command SYNCHRONIZE CACHE error if format in progress
        scsi: aacraid: Fix double-free on probe failure
        scsi: lpfc: Fix overflow build issue
      770b0ffe
    • Merge tag 'nfsd-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 6a2fcc51
      Linus Torvalds authored
      Pull nfsd fix from Chuck Lever:
      
       - One more write delegation fix
      
      * tag 'nfsd-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        nfsd: fix nfsd4_deleg_getattr_conflict in presence of third party lease
      6a2fcc51
    • Merge tag 'xfs-6.11-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 0efdc097
      Linus Torvalds authored
      Pull xfs fixes from Chandan Babu:
      
       - Do not call out v1 inodes with non-zero di_nlink field as being
         corrupt
      
       - Change xfs_finobt_count_blocks() to count "free inode btree" blocks
         rather than "inode btree" blocks
      
       - Don't report the number of trimmed bytes via FITRIM because the
         underlying storage isn't required to do anything and failed discard
         IOs aren't reported to the caller anyway
      
       - Fix incorrect setting of rm_owner field in an rmap query
      
       - Report missing disk offset range in an fsmap query
      
       - Obtain m_growlock when extending realtime section of the filesystem
      
       - Reset rootdir extent size hint after extending realtime section of
         the filesystem
      
      * tag 'xfs-6.11-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: reset rootdir extent size hint after growfsrt
        xfs: take m_growlock when running growfsrt
        xfs: Fix missing interval for missing_owner in xfs fsmap
        xfs: use XFS_BUF_DADDR_NULL for daddrs in getfsmap code
        xfs: Fix the owner setting issue for rmap query in xfs fsmap
        xfs: don't bother reporting blocks trimmed via FITRIM
        xfs: xfs_finobt_count_blocks() walks the wrong btree
        xfs: fix folio dirtying for XFILE_ALLOC callers
        xfs: fix di_onlink checking for V1/V2 inodes
      0efdc097
    • Merge tag 'arm-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 35667a29
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "There is a fairly large number of bug fixes for Qualcomm platforms,
        most of them addressing issues with the devicetree files for the newly
        added Snapdragon X1 based laptops to make them more reliable.
      
        The Qualcomm driver changes address a few build-time issues as well as
        runtime problems in the tzmem and scm firmware, the USB Type-C driver,
        and the cmd-db and pmic_glink soc drivers.
      
        The NXP i.MX usually gets a bunch of devicetree fixes that is
        proportional to the number of supported machines. This includes both
        warning fixes and correctness for the 64-bit i.MX9, i.MX8 and
        layerscape platforms, as well as a single fix for a 32-bit i.MX6 based
        board.
      
        The other changes are the usual minor changes, including an update to
        the MAINTAINERS file, an omap3 dts file and a SoC driver for mpfs
        (risc-v)"
      
      * tag 'arm-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (50 commits)
        firmware: microchip: fix incorrect error report of programming:timeout on success
        soc: qcom: pd-mapper: Fix singleton refcount
        firmware: qcom: tzmem: disable sdm670 platform
        soc: qcom: pmic_glink: Actually communicate when remote goes down
        usb: typec: ucsi: Move unregister out of atomic section
        soc: qcom: pmic_glink: Fix race during initialization
        firmware: qcom: qseecom: remove unused functions
        firmware: qcom: tzmem: fix virtual-to-physical address conversion
        firmware: qcom: scm: Mark get_wq_ctx() as atomic call
        arm64: dts: qcom: x1e80100: Fix Adreno SMMU global interrupt
        arm64: dts: qcom: disable GPU on x1e80100 by default
        arm64: dts: imx8mm-phygate: fix typo pinctrcl-0
        arm64: dts: imx95: correct L3Cache cache-sets
        arm64: dts: imx95: correct a55 power-domains
        arm64: dts: freescale: imx93-tqma9352-mba93xxla: fix typo
        arm64: dts: freescale: imx93-tqma9352: fix CMA alloc-ranges
        ARM: dts: imx6dl-yapp43: Increase LED current to match the yapp4 HW design
        arm64: dts: imx93: update default value for snps,clk-csr
        arm64: dts: freescale: tqma9352: Fix watchdog reset
        arm64: dts: imx8mp-beacon-kit: Fix Stereo Audio on WM8962
        ...
      35667a29
    • Merge tag 'input-for-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 1934261d
      Linus Torvalds authored
      Pull input fix from Dmitry Torokhov:
      
       - a fix for Cypress PS/2 touchpad for regression introduced in 6.11
         merge window where a timeout condition is incorrectly reported for
         all extended Cypress commands
      
      * tag 'input-for-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: cypress_ps2 - fix waiting for command response
      1934261d