• Chuck Lever's avatar
    xprtrdma: Segment head and tail XDR buffers on page boundaries · 821c791a
    Chuck Lever authored
    A single memory allocation is used for the pair of buffers wherein
    the RPC client builds an RPC call message and decodes its matching
    reply. These buffers are sized based on the maximum possible size
    of the RPC call and reply messages for the operation in progress.
    
    This means that as the call buffer increases in size, the start of
    the reply buffer is pushed farther into the memory allocation.
    
    RPC requests are growing in size. It used to be that both the call
    and reply buffers fit inside a single page.
    
    But these days, thanks to NFSv4 (and especially security labels in
    NFSv4.2) the maximum call and reply sizes are large. NFSv4.0 OPEN,
    for example, now requires a 6KB allocation for a pair of call and
    reply buffers, and NFSv4 LOOKUP is not far behind.
    
    As the maximum size of a call increases, the reply buffer is pushed
    far enough into the buffer's memory allocation that a page boundary
    can appear in the middle of it.
    
    When the maximum possible reply size is larger than the client's
    RDMA receive buffers (currently 1KB), the client has to register a
    Reply chunk for the server to RDMA Write the reply into.
    
    The logic in rpcrdma_convert_iovs() assumes that xdr_buf head and
    tail buffers would always be contained on a single page. It supplies
    just one segment for the head and one for the tail.
    
    FMR, for example, registers up to a page boundary (only a portion of
    the reply buffer in the OPEN case above). But without additional
    segments, it doesn't register the rest of the buffer.
    
    When the server tries to write the OPEN reply, the RDMA Write fails
    with a remote access error since the client registered only part of
    the Reply chunk.
    
    rpcrdma_convert_iovs() must split the XDR buffer into multiple
    segments, each of which are guaranteed not to contain a page
    boundary. That way fmr_op_map is given the proper number of segments
    to register the whole reply buffer.
    Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
    Reviewed-by: default avatarDevesh Sharma <devesh.sharma@broadcom.com>
    Reviewed-by: default avatarSagi Grimberg <sagig@mellanox.com>
    Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
    821c791a
rpc_rdma.c 29.4 KB