1. 05 Oct, 2022 8 commits
    • SUNRPC: Replace the use of the xprtiod WQ in rpcrdma · 6b1eb3b2
      Chuck Lever authored
      While setting up a new lab, I accidentally misconfigured the
      Ethernet port for a system that tried an NFS mount using RoCE.
      This made the NFS server unreachable. The following WARNING
      popped on the NFS client while waiting for the mount attempt to
      time out:
      
      kernel: workqueue: WQ_MEM_RECLAIM xprtiod:xprt_rdma_connect_worker [rpcrdma] is flushing !WQ_MEM_RECLAI>
      kernel: WARNING: CPU: 0 PID: 100 at kernel/workqueue.c:2628 check_flush_dependency+0xbf/0xca
      kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs 8021q garp stp mrp llc rfkill rpcrdma>
      kernel: CPU: 0 PID: 100 Comm: kworker/u8:8 Not tainted 6.0.0-rc1-00002-g6229f8c054e5 #13
      kernel: Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0b 06/12/2017
      kernel: Workqueue: xprtiod xprt_rdma_connect_worker [rpcrdma]
      kernel: RIP: 0010:check_flush_dependency+0xbf/0xca
      kernel: Code: 75 2a 48 8b 55 18 48 8d 8b b0 00 00 00 4d 89 e0 48 81 c6 b0 00 00 00 48 c7 c7 65 33 2e be>
      kernel: RSP: 0018:ffffb562806cfcf8 EFLAGS: 00010092
      kernel: RAX: 0000000000000082 RBX: ffff97894f8c3c00 RCX: 0000000000000027
      kernel: RDX: 0000000000000002 RSI: ffffffffbe3447d1 RDI: 00000000ffffffff
      kernel: RBP: ffff978941315840 R08: 0000000000000000 R09: 0000000000000000
      kernel: R10: 00000000000008b0 R11: 0000000000000001 R12: ffffffffc0ce3731
      kernel: R13: ffff978950c00500 R14: ffff97894341f0c0 R15: ffff978951112eb0
      kernel: FS:  0000000000000000(0000) GS:ffff97987fc00000(0000) knlGS:0000000000000000
      kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      kernel: CR2: 00007f807535eae8 CR3: 000000010b8e4002 CR4: 00000000003706f0
      kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      kernel: Call Trace:
      kernel:  <TASK>
      kernel:  __flush_work.isra.0+0xaf/0x188
      kernel:  ? _raw_spin_lock_irqsave+0x2c/0x37
      kernel:  ? lock_timer_base+0x38/0x5f
      kernel:  __cancel_work_timer+0xea/0x13d
      kernel:  ? preempt_latency_start+0x2b/0x46
      kernel:  rdma_addr_cancel+0x70/0x81 [ib_core]
      kernel:  _destroy_id+0x1a/0x246 [rdma_cm]
      kernel:  rpcrdma_xprt_connect+0x115/0x5ae [rpcrdma]
      kernel:  ? _raw_spin_unlock+0x14/0x29
      kernel:  ? raw_spin_rq_unlock_irq+0x5/0x10
      kernel:  ? finish_task_switch.isra.0+0x171/0x249
      kernel:  xprt_rdma_connect_worker+0x3b/0xc7 [rpcrdma]
      kernel:  process_one_work+0x1d8/0x2d4
      kernel:  worker_thread+0x18b/0x24f
      kernel:  ? rescuer_thread+0x280/0x280
      kernel:  kthread+0xf4/0xfc
      kernel:  ? kthread_complete_and_exit+0x1b/0x1b
      kernel:  ret_from_fork+0x22/0x30
      kernel:  </TASK>
      
      SUNRPC's xprtiod workqueue is WQ_MEM_RECLAIM, so any workqueue that
      one of its work items tries to cancel has to be WQ_MEM_RECLAIM to
      prevent a priority inversion. The internal workqueues in the
      RDMA/core are currently non-MEM_RECLAIM.
      
      Jason Gunthorpe says this about the current state of RDMA/core:
      > If you attempt to do a reconnection/etc from within a RECLAIM
      > context it will deadlock on one of the many allocations that are
      > made to support opening the connection.
      >
      > The general idea of reclaim is that the entire task context
      > working under the reclaim is marked with an override of the gfp
      > flags to make all allocations under that call chain reclaim safe.
      >
      > But rdmacm does allocations outside this, eg in the WQs processing
      > the CM packets. So this doesn't work and we will deadlock.
      >
      > Fixing it is a big deal and needs more than poking WQ_MEM_RECLAIM
      > here and there.
      
      So we will change the ULP in this case to avoid the use of
      WQ_MEM_RECLAIM where possible. Deadlocks that were possible before
      are not fixed, but at least we no longer have a false sense of
      confidence that the stack won't allocate memory during memory
      reclaim.
      Suggested-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • NFSv4.2: Add a tracepoint for listxattr · a0b685e7
      Anna Schumaker authored
      This can be defined as simply an NFS4_INODE_EVENT() since we don't have
      the name of a specific xattr to list. This roughly matches readdir,
      which also uses an NFS4_INODE_EVENT() tracepoint.
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr · 27ffed10
      Anna Schumaker authored
      These functions take similar arguments, and can share a tracepoint class
      for common formatting.
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2 · 3a100e4d
      Anna Schumaker authored
      NFS4_CONTENT_DATA and NFS4_CONTENT_HOLE exist only when NFS v4.2 is
      enabled. Move their corresponding TRACE_DEFINE_ENUM calls under the
      CONFIG_NFS_V4_2 Kconfig option.
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR · 96369461
      Anna Schumaker authored
      We can translate this into an empty response list instead of passing an
      error up to userspace.
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration · a035618c
      Gaosheng Cui authored
      nfs_write_prepare() was removed by
      commit a4cdda59 ("NFS: Create a common pgio_rpc_prepare
      function"), so remove its stale declaration.
      
      nfs_wait_atomic_killable() was removed by
      commit 723c921e ("sched/wait, fs/nfs: Convert wait_on_atomic_t()
      usage to the new wait_var_event() API"), so remove its stale
      declaration as well.
      Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • NFSv4: remove nfs4_renewd_prepare_shutdown() declaration · 8aa7cf85
      Gaosheng Cui authored
      nfs4_renewd_prepare_shutdown() was removed by
      commit 3050141b ("NFSv4: Kill nfs4_renewd_prepare_shutdown()"),
      so remove its stale declaration.
      Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment · 74fd2ca0
      Jiangshan Yi authored
      Fix spelling typo and syntax error in comment.
      Suggested-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: k2ci <kernel-bot@kylinos.cn>
      Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn>
      Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  2. 03 Oct, 2022 8 commits
  3. 25 Sep, 2022 8 commits
  4. 24 Sep, 2022 10 commits
  5. 23 Sep, 2022 6 commits