1. 08 Feb, 2018 1 commit
    • fix parallelism for rpc tasks · f515f86b
      Olga Kornievskaia authored
      Hi folks,
      
      On a multi-core machine, is it expected that we can have parallel RPCs
      handled by each of the per-core workqueues?
      
      In testing a read workload, I observed via the "top" command that a single
      "kworker" thread services the requests (no parallelism).
      It's more prominent while doing these operations over a krb5p mount.
      
      Bruce suggested trying this change, and in my testing I then see the
      read workload spread among all the kworker threads.
      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  2. 07 Feb, 2018 2 commits
  3. 06 Feb, 2018 2 commits
  4. 02 Feb, 2018 2 commits
    • xprtrdma: Fix BUG after a device removal · e89e8d8f
      Chuck Lever authored
      Michal Kalderon reports a BUG that occurs just after device removal:
      
      [  169.112490] rpcrdma: removing device qedr0 for 192.168.110.146:20049
      [  169.143909] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      [  169.181837] IP: rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma]
      
      The RPC/RDMA client transport attempts to allocate some resources
      on demand. Registered buffers are one such resource. These are
      allocated (or re-allocated) by xprt_rdma_allocate to hold RPC Call
      and Reply messages. A hardware resource is associated with each of
      these buffers, as they can be used for a Send or Receive Work
      Request.
      
      If a device is removed from under an NFS/RDMA mount, the transport
      layer is responsible for releasing all hardware resources before
      the device can be finally unplugged. A BUG results when the NFS
      mount hasn't yet seen much activity: the transport tries to release
      resources that haven't yet been allocated.
      
      rpcrdma_free_regbuf() already checks for this case, so just move
      that check to cover the DEVICE_REMOVAL case as well.
      Reported-by: Michal Kalderon <Michal.Kalderon@cavium.com>
      Fixes: bebd0318 ("xprtrdma: Support unplugging an HCA ...")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: Michal Kalderon <Michal.Kalderon@cavium.com>
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
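The idea of moving the "not yet allocated" check so that it covers both the free path and the device-removal path can be sketched in user-space C. This is only an illustrative model, not the kernel code: `struct regbuf`, `regbuf_unmap`, `regbuf_free`, `device_removal`, and the `unmap_calls` counter are hypothetical names, and the DMA mapping is simulated with a flag.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered buffer; not the kernel's struct. */
struct regbuf {
    int mapped;              /* nonzero while a simulated DMA mapping exists */
};

static int unmap_calls;      /* counts simulated unmap operations */

/*
 * The NULL check lives in the unmap helper itself, so every caller
 * (normal free and device removal) is protected by the same test.
 */
static void regbuf_unmap(struct regbuf *rb)
{
    if (rb == NULL)          /* buffer was never allocated on demand */
        return;
    if (!rb->mapped)         /* allocated but not currently mapped */
        return;
    rb->mapped = 0;
    unmap_calls++;
}

/* Normal teardown path: unmap, then release the memory. */
static void regbuf_free(struct regbuf *rb)
{
    regbuf_unmap(rb);
    free(rb);                /* free(NULL) is a no-op */
}

/* Device-removal path: release mappings but keep the buffers themselves. */
static void device_removal(struct regbuf **bufs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        regbuf_unmap(bufs[i]);
}
```

With the check inside the helper, a lightly used mount whose buffer slots are still NULL can go through `device_removal` without dereferencing a NULL pointer.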
    • xprtrdma: Fix calculation of ri_max_send_sges · 1179e2c2
      Chuck Lever authored
      Commit 16f906d6 ("xprtrdma: Reduce required number of send
      SGEs") introduced the rpcrdma_ia::ri_max_send_sges field. This fixes
      a problem where xprtrdma would not work if the device's max_sge
      capability was small (low single digits).
      
      At least RPCRDMA_MIN_SEND_SGES are needed for the inline parts of
      each RPC. ri_max_send_sges is set to this value:
      
        ia->ri_max_send_sges = max_sge - RPCRDMA_MIN_SEND_SGES;
      
      Then when marshaling each RPC, rpcrdma_args_inline uses that value
      to determine whether the device has enough Send SGEs to convey an
      NFS WRITE payload inline, or whether instead a Read chunk is
      required.
      
      More recently, commit ae72950a ("xprtrdma: Add data structure to
      manage RDMA Send arguments") used the ri_max_send_sges value to
      calculate the size of an array, but that commit erroneously assumed
      ri_max_send_sges contains a value similar to the device's max_sge,
      and not one that was reduced by the minimum SGE count.
      
      This assumption causes the calculated size of the sendctx's
      Send SGE array to be too small. When the array is used to marshal
      an RPC, the code can write Send SGEs into the following sendctx
      element in that array, corrupting it. When the device's max_sge is
      large, this issue is entirely harmless; but it results in an oops
      in the provider's post_send method, if dev.attrs.max_sge is small.
      
      So let's straighten this out: ri_max_send_sges will now contain a
      value with the same meaning as dev.attrs.max_sge, which makes
      the code easier to understand, and enables rpcrdma_sendctx_create
      to calculate the size of the SGE array correctly.
      Reported-by: Michal Kalderon <Michal.Kalderon@cavium.com>
      Fixes: 16f906d6 ("xprtrdma: Reduce required number of send SGEs")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: Michal Kalderon <Michal.Kalderon@cavium.com>
      Cc: stable@vger.kernel.org # v4.10+
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
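The mismatch described in this commit message can be illustrated with a small arithmetic model. It is a sketch under stated assumptions rather than the kernel code: `MIN_SEND_SGES` is an illustrative value, all function names are hypothetical, and the worst case is simplified to an RPC that uses every SGE the device allows.

```c
#include <stdbool.h>

/* Illustrative value only; stands in for RPCRDMA_MIN_SEND_SGES. */
#define MIN_SEND_SGES 3u

/* Old meaning: SGEs left over for payload after reserving the minimum.
 * Callers must ensure max_sge >= MIN_SEND_SGES to avoid unsigned wrap. */
static unsigned int field_before(unsigned int max_sge)
{
    return max_sge - MIN_SEND_SGES;
}

/* New meaning: the same as the device's max_sge. */
static unsigned int field_after(unsigned int max_sge)
{
    return max_sge;
}

/* Inline decision with the old meaning: compare payload SGEs directly. */
static bool fits_inline_before(unsigned int field, unsigned int payload_sges)
{
    return payload_sges <= field;
}

/* Inline decision with the new meaning: count the reserved SGEs too. */
static bool fits_inline_after(unsigned int field, unsigned int payload_sges)
{
    return MIN_SEND_SGES + payload_sges <= field;
}

/* Simplified worst case: inline parts plus as much payload as allowed. */
static unsigned int worst_case_sges(unsigned int max_sge)
{
    return MIN_SEND_SGES + (max_sge - MIN_SEND_SGES);
}

/* True if an array sized from the field cannot hold the worst case. */
static bool array_too_small(unsigned int field, unsigned int max_sge)
{
    return worst_case_sges(max_sge) > field;
}
```

In this model both meanings give the same inline-versus-Read-chunk decision, but only the new meaning yields an array length that covers the worst case.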
  5. 29 Jan, 2018 1 commit
  6. 28 Jan, 2018 1 commit
  7. 25 Jan, 2018 2 commits
  8. 24 Jan, 2018 1 commit
  9. 23 Jan, 2018 20 commits
  10. 22 Jan, 2018 1 commit
    • NFS: reject request for id_legacy key without auxdata · 49686cbb
      Eric Biggers authored
      nfs_idmap_legacy_upcall() is supposed to be called with 'aux' pointing
      to a 'struct idmap', via the call to request_key_with_auxdata() in
      nfs_idmap_request_key().
      
      However it can also be reached via the request_key() system call in
      which case 'aux' will be NULL, causing a NULL pointer dereference in
      nfs_idmap_prepare_pipe_upcall(), assuming that the key description is
      valid enough to get that far.
      
      Fix this by making nfs_idmap_legacy_upcall() negate the key if no
      auxdata is provided.
      
      As usual, this bug was found by syzkaller.  A simple reproducer using
      the command-line keyctl program is:
      
          keyctl request2 id_legacy uid:0 '' @s
      
      Fixes: 57e62324 ("NFS: Store the legacy idmapper result in the keyring")
      Reported-by: syzbot+5dfdbcf7b3eb5912abbb@syzkaller.appspotmail.com
      Cc: <stable@vger.kernel.org> # v3.4+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Trond Myklebust <trondmy@gmail.com>
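The shape of this fix, rejecting the request before anything dereferences the missing auxdata, can be sketched in user-space C. The names `struct idmap`, `prepare_upcall`, and `legacy_upcall` are hypothetical, and returning `-ENOKEY` stands in for negating the key; the kernel's actual control flow may differ.

```c
#include <errno.h>
#include <stddef.h>

#ifndef ENOKEY
#define ENOKEY 126   /* Linux value; used only if errno.h does not define it */
#endif

/* Hypothetical stand-in for the per-client idmap state. */
struct idmap {
    int pipe_busy;
};

/* Models the later step that dereferences the idmap unconditionally. */
static int prepare_upcall(struct idmap *idmap)
{
    idmap->pipe_busy = 1;
    return 0;
}

/*
 * Entry point reachable both with auxdata (the internal caller) and
 * without it (a plain key request from user space).
 */
static int legacy_upcall(void *aux)
{
    struct idmap *idmap = aux;

    if (idmap == NULL)       /* no auxdata: reject instead of crashing */
        return -ENOKEY;

    return prepare_upcall(idmap);
}
```

Checking at the entry point keeps the guard in one place, so the helpers further down can continue to assume a valid pointer.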
  11. 18 Jan, 2018 3 commits
  12. 16 Jan, 2018 4 commits
    • xprtrdma: Introduce rpcrdma_mw_unmap_and_put · ec12e479
      Chuck Lever authored
      Clean up: Code review suggested that a common bit of code can be
      placed into a helper function, and this gives us fewer places to
      stick an "I DMA unmapped something" trace point.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Remove usage of "mw" · 96ceddea
      Chuck Lever authored
      Clean up: struct rpcrdma_mw was named after Memory Windows, but
      xprtrdma no longer supports a Memory Window registration mode.
      Rename rpcrdma_mw and its fields to reduce confusion and make
      the code more sensible to read.
      
      Renaming "mw" was suggested by Tom Talpey, the author of the
      original xprtrdma implementation. It's a good idea, but I haven't
      done this until now because it's a huge diffstat for no benefit
      other than code readability.
      
      However, I'm about to introduce static trace points that expose
      a few of xprtrdma's internal data structures. They should make sense
      in the trace report, and it's reasonable to treat trace points as a
      kernel API contract which might be difficult to change later.
      
      While I'm churning things up, two additional changes:
      - rename variables unhelpfully called "r" to "mr", to improve code
        clarity, and
      - rename the MR-related helper functions using the form
        "rpcrdma_mr_<verb>", to be consistent with other areas of the
        code.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Replace all usage of "frmr" with "frwr" · ce5b3717
      Chuck Lever authored
      Clean up: Over time, the industry has adopted the term "frwr"
      instead of "frmr". The term "frwr" is now more widely recognized.
      
      For the past couple of years I've attempted to add new code using
      "frwr", but there remains plenty of older code that still
      uses "frmr". Replace all usage of "frmr" to avoid confusion.
      
      While we're churning code, rename variables unhelpfully called "f"
      to "frwr", to improve code clarity.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • xprtrdma: Don't clear RPC_BC_PA_IN_USE on pre-allocated rpc_rqst's · 30b5416b
      Chuck Lever authored
      No need for the overhead of atomically setting and clearing this bit
      flag for every use of a pre-allocated backchannel rpc_rqst. These
      are a distinct pool of rpc_rqsts that are used only for callback
      operations, so it is safe to simply leave the bit set.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>