Commits · 477687e1116ad16180caf8633dd830b296a5ce73 · Kirill Smelkov / linux

07 Mar, 2019 2 commits

SUNRPC: Fix up RPC back channel transmission · 477687e1

Trond Myklebust authored Mar 05, 2019

Now that transmissions happen through a queue, we require the RPC tasks
to handle error conditions that may have been set while they were
sleeping. The back channel does not currently do this, but assumes
that any error condition happens during its own call to xprt_transmit().

The solution is to ensure that the back channel splits out the
error handling just like the forward channel does.

Fixes: 89f90fe1 ("SUNRPC: Allow calls to xprt_transmit() to drain...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Prevent thundering herd when the socket is not connected · ed7dc973

Trond Myklebust authored Mar 04, 2019

If the socket is not connected, then we want to initiate a reconnect
rather than trying to transmit requests. If there is a large number
of requests queued and waiting for the lock in call_transmit(),
then it can take a while for one of them to loop back and retake
the lock in call_connect().

Fixes: 89f90fe1 ("SUNRPC: Allow calls to xprt_transmit() to drain...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

02 Mar, 2019 17 commits

SUNRPC: Allow dynamic allocation of back channel slots · 0d1bf340

Trond Myklebust authored Mar 02, 2019

Now that the reads happen in a process context rather than a softirq,
it is safe to allocate back channel slots using a reclaiming
allocation.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4.1: Bump the default callback session slot count to 16 · 067c4696

Trond Myklebust authored Mar 02, 2019

Users can still control this value explicitly using the
max_session_cb_slots module parameter, but let's bump the default
up to 16 for now.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpc · 12a3ad61

Trond Myklebust authored Mar 02, 2019

Convert the remaining gfp_flags arguments in sunrpc to standard reclaiming
allocations, now that we set memalloc_nofs_save() as appropriate.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Clean up mirror DS initialisation · cefa587a

Trond Myklebust authored Feb 28, 2019

Get rid of the redundant parameter and rename the function
ff_layout_mirror_valid() to ff_layout_init_mirror_ds() for clarity.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Remove dead code in ff_layout_mirror_valid() · 29a23909

Trond Myklebust authored Feb 28, 2019

nfs4_ff_alloc_deviceid_node() guarantees that if mirror->mirror_ds is
a valid pointer, then so is mirror->mirror_ds->ds.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfile: Simplify nfs4_ff_layout_select_ds_stateid() · 4cbc8a57

Trond Myklebust authored Feb 28, 2019

Pass in a pointer to the mirror rather than forcing another
array access.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfile: Simplify nfs4_ff_layout_ds_version() · 626d48b1

Trond Myklebust authored Feb 28, 2019

Pass in a pointer to the mirror rather than forcing another
array access.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Simplify ff_layout_get_ds_cred() · 312cd4cb

Trond Myklebust authored Feb 28, 2019

Pass in a pointer to the mirror rather than forcing another
array access.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Simplify nfs4_ff_find_or_create_ds_client() · 561d6f8a

Trond Myklebust authored Feb 28, 2019

Pass in a pointer to the mirror rather than forcing another
array access.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Simplify nfs4_ff_layout_select_ds_fh() · 749da527

Trond Myklebust authored Feb 28, 2019

Pass in a pointer to the mirror rather than having to retrieve it from
the array and then verify the resulting pointer.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Speed up read failover when DSes are down · 76c66905

Trond Myklebust authored Feb 14, 2019

If we notice that a DS may be down, we should attempt to read from the
other mirrors first before we go back to retry the dead DS.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Don't invalidate DS deviceids for being unresponsive · 17aaec81

Trond Myklebust authored Feb 26, 2019

If the DS is unresponsive, we want to just mark it as such, while
reporting the errors. If the server later returns the same deviceid
in a new layout, then we don't want to have to look it up again.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Remove bogus checks for invalid deviceids · d082d4b5

Trond Myklebust authored Feb 26, 2019

We already check the deviceids before we start the RPC call.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Avoid unnecessary layout invalidations · 0a156dd5

Trond Myklebust authored Feb 27, 2019

In ff_layout_mirror_valid() we may not want to invalidate the layout
segment despite the call to GETDEVICEINFO failing. The reason is that
a read may still be able to make progress on another mirror.

So instead we let the caller (in this case nfs4_ff_layout_prepare_ds())
decide whether or not it needs to invalidate.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: refactor calls to nfs4_ff_layout_prepare_ds() · 2444ff27

Trond Myklebust authored Feb 14, 2019

While we may want to skip attempting to connect to a downed mirror
when we're deciding which mirror to select for a read, we do not
want to do so once we've committed to attempting the I/O in
ff_layout_read/write_pagelist(), or ff_layout_initiate_commit().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4: Handle early exit in layoutget by returning an error · 18c0778a

Trond Myklebust authored Feb 13, 2019

If the LAYOUTGET rpc call exits early without an error, convert it to
EAGAIN.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Send LAYOUTERROR when failing over mirrored reads · f0922a6c

Trond Myklebust authored Feb 10, 2019

When a read to the preferred mirror returns an error, the flexfiles
driver records the error in the inode list and currently marks the
layout for return before failing over the attempted read to the next
mirror.
What we actually want to do is fire off a LAYOUTERROR to notify the
MDS that there is an issue with the preferred mirror, then we fail
over. Only once we've failed to read from all mirrors should we
return the layout.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
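
The failover order described above can be illustrated with a minimal user-space sketch. It is not the flexfiles driver code: the structure, the function name and the idea of modelling each mirror as a precomputed error code are assumptions made purely for illustration. The sketch only shows the ordering: one LAYOUTERROR per failed mirror, then failover, and a layout return only after every mirror has failed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome record; none of these names come from the kernel. */
struct read_outcome {
	int served_by;            /* index of the mirror that served the read, or -1 */
	size_t layouterrors_sent; /* one LAYOUTERROR per failed mirror */
	bool layout_returned;     /* set only after every mirror has failed */
};

/*
 * mirror_err[i] is 0 if a read from mirror i succeeds, or a negative errno.
 * Mirrors are tried in order, starting with the preferred one at index 0.
 */
struct read_outcome read_with_failover(const int *mirror_err, size_t nr_mirrors)
{
	struct read_outcome out = { .served_by = -1 };

	for (size_t i = 0; i < nr_mirrors; i++) {
		if (mirror_err[i] == 0) {
			out.served_by = (int)i;
			return out;
		}
		/* Tell the MDS about the bad mirror, then fail over to the next. */
		out.layouterrors_sent++;
	}
	/* No mirror could serve the read: only now return the layout. */
	out.layout_returned = true;
	return out;
}
```

With this model, a failure on the preferred mirror followed by a success on the second one yields a single LAYOUTERROR and leaves the layout in place.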

01 Mar, 2019 8 commits

NFSv4.2: Add client support for the generic 'layouterror' RPC call · 3eb86093

Trond Myklebust authored Feb 08, 2019

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated · a79f194a

Trond Myklebust authored Feb 27, 2019

If a layout segment gets invalidated while a pNFS I/O operation
is queued for transmission, then we ideally want to abort
immediately. This is particularly the case when there is a large
number of I/O related RPCs queued in the RPC layer, and the layout
segment gets invalidated due to an ENOSPC error, or an EACCES (because
the client was fenced). Otherwise we may end up spamming the MDS with a
lot of unnecessary LAYOUTERRORs after that I/O fails.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4/pnfs: Fix barriers in nfs4_mark_deviceid_unavailable() · 39a5201a

Trond Myklebust authored Feb 26, 2019

Fix the memory barriers in nfs4_mark_deviceid_unavailable() and
nfs4_test_deviceid_unavailable().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS/flexfiles: Fix up sparse RCU annotations · 762bb7e9

Trond Myklebust authored Feb 26, 2019

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4/flexfiles: Fix invalid deref in FF_LAYOUT_DEVID_NODE() · 108bb4af

Trond Myklebust authored Feb 26, 2019

If the attempt to instantiate the mirror's layout DS pointer failed,
then that pointer may hold a value of type ERR_PTR(), so we need
to check that before we dereference it.

Fixes: 65990d1a ("pNFS/flexfiles: Fix a deadlock on LAYOUTGET")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
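
The check described above follows the usual kernel ERR_PTR pattern. The sketch below is a user-space approximation, not the patched macro: ERR_PTR(), PTR_ERR() and IS_ERR() are simplified re-implementations of the kernel helpers, and the structures and the mirror_devid_node() function are hypothetical stand-ins.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's ERR_PTR helpers (simplified). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, trimmed-down structures for illustration only. */
struct devid_node { int id; };
struct mirror_ds { struct devid_node id_node; };
struct mirror { struct mirror_ds *mirror_ds; };

/* Return the device ID node, or NULL if the DS pointer is unset or an error. */
struct devid_node *mirror_devid_node(struct mirror *m)
{
	if (m == NULL || m->mirror_ds == NULL || IS_ERR(m->mirror_ds))
		return NULL;
	return &m->mirror_ds->id_node;
}
```

A pointer produced by ERR_PTR(-ENODEV) is non-NULL, so a plain NULL test would let it through; the IS_ERR() test is what keeps it from being dereferenced.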

NFS: Add missing encode / decode sequence_maxsz to v4.2 operations · 1a3466ae

Anna Schumaker authored Mar 01, 2019

These really should have been there from the beginning, but we never
noticed because there was enough slack in the RPC request for the extra
bytes. Chuck's recent patch to use au_cslack and au_rslack to compute
buffer size shrank the buffer enough that this became a problem for
SEEK operations on my test client.

Fixes: f4ac1674 ("nfs: Add ALLOCATE support")
Fixes: 2e72448b ("NFS: Add COPY nfs operation")
Fixes: cb95deea ("NFS OFFLOAD_CANCEL xdr")
Fixes: 624bd5b7 ("nfs: Add DEALLOCATE support")
Fixes: 1c6dcbe5 ("NFS: Implement SEEK")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4.1: Don't process the sequence op more than once. · c71c46f0

Trond Myklebust authored Mar 01, 2019

Ensure that if we call nfs41_sequence_process() a second time for the
same rpc_task, then we only process the results once.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
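
One common way to make such processing idempotent is to clear a marker after the first pass. The sketch below assumes, purely for illustration, that the result structure holds a slot pointer which doubles as that marker; the structures and the function name are hypothetical and may not match what the patch actually does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structures carry far more state. */
struct seq_slot { unsigned int seq_nr; };
struct seq_result { struct seq_slot *slot; };

/*
 * Process the SEQUENCE result for a task. The slot pointer doubles as the
 * "not yet processed" marker: it is cleared after the first pass, so a
 * second call for the same task is a no-op. Returns true if work was done.
 */
bool sequence_process_once(struct seq_result *res)
{
	struct seq_slot *slot = res->slot;

	if (slot == NULL)
		return false;   /* already processed, or no slot in use */

	slot->seq_nr++;         /* the side effect that must not happen twice */
	res->slot = NULL;
	return true;
}
```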

NFSv4.1: Reinitialise sequence results before retransmitting a request · c1dffe0b

Trond Myklebust authored Mar 01, 2019

If we have to retransmit a request, we should ensure that we reinitialise
the sequence results structure, since in the event of a signal
we need to treat the request as if it had not been sent.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org

26 Feb, 2019 1 commit

SUNRPC: Fix an Oops in udp_poll() · a73881c9

Trond Myklebust authored Feb 26, 2019

udp_poll() checks the struct file for the O_NONBLOCK flag, so we must not
call it with a NULL file pointer.

Fixes: 0ffe86f4 ("SUNRPC: Use poll() to fix up the socket requeue races")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

25 Feb, 2019 1 commit

Merge tag 'nfs-rdma-for-5.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs · 06b5fc3a

Trond Myklebust authored Feb 25, 2019

NFSoRDMA client updates for 5.1

New features:
- Convert rpc auth layer to use xdr_streams
- Config option to disable insecure enctypes
- Reduce size of RPC receive buffers

Bugfixes and cleanups:
- Fix sparse warnings
- Check inline size before providing a write chunk
- Reduce the receive doorbell rate
- Various tracepoint improvements

[Trond: Fix up merge conflicts]
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

23 Feb, 2019 1 commit

NFS/pnfs: Bulk destroy of layouts needs to be safe w.r.t. umount · 5085607d

Trond Myklebust authored Feb 22, 2019

If a bulk layout recall or a metadata server reboot coincides with a
umount, then holding a reference to an inode is unsafe unless we
also hold a reference to the super block.

Fixes: fd9a8d71 ("NFSv4.1: Fix bulk recall and destroy of layouts")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

21 Feb, 2019 2 commits

NFS: Fix a soft lockup in the delegation recovery code · 6f9449be

Trond Myklebust authored Feb 21, 2019

Fix a soft lockup when NFS client delegation recovery is attempted
but the inode is in the process of being freed. When the
igrab(inode) call fails, and we have to restart the recovery process,
we need to ensure that we won't attempt to recover the same delegation
again.

Fixes: 45870d69 ("NFSv4.1: Test delegation stateids when server...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4.1: Avoid false retries when RPC calls are interrupted · 3453d570

Trond Myklebust authored Jun 20, 2018

A 'false retry' in NFSv4.1 occurs when the client attempts to transmit a
new RPC call using a slot+sequence number combination that references an
already cached one. Currently, the Linux NFS client will do this if a
user process interrupts an RPC call that is in progress.
The problem with doing so is that we defeat the main mechanism used by
the server to differentiate between a new call and a replayed one. Even
if the server is able to perfectly cache the arguments of the old call,
it cannot know if the client intended to replay or send a new call.

The obvious fix is to bump the sequence number pre-emptively if an
RPC call is interrupted, but in order to deal with the corner cases
where the interrupted call is not actually received and processed by
the server, we need to interpret the error NFS4ERR_SEQ_MISORDERED
as a sign that we need to either wait or locate a correct sequence
number that lies between the value we sent, and the last value that
was acked by a SEQUENCE call on that slot.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
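
The sequence number handling described above can be sketched in user space as follows. This is a simplified model under stated assumptions, not the client code: the slot structure, its field names and the three helper functions are hypothetical, and the sketch leaves out the "wait" branch and any session recovery.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-slot bookkeeping; field names are illustrative. */
struct slot_state {
	uint32_t seq_nr;            /* sequence number for the next call */
	uint32_t seq_nr_last_acked; /* last value confirmed by a SEQUENCE reply */
};

/* A call on this slot was interrupted: never reuse its sequence number. */
void slot_call_interrupted(struct slot_state *slot)
{
	slot->seq_nr++;
}

/* A SEQUENCE reply succeeded for the call sent with sequence number 'sent'. */
void slot_call_acked(struct slot_state *slot, uint32_t sent)
{
	slot->seq_nr_last_acked = sent;
	slot->seq_nr = sent + 1;
}

/*
 * The server answered NFS4ERR_SEQ_MISORDERED for the value in seq_nr.
 * If untried values remain between the last acked value and seq_nr, step
 * back by one and report that the call should be retried. Otherwise there
 * is nothing left to probe, so the caller has to wait or recover.
 */
bool slot_retry_after_misordered(struct slot_state *slot)
{
	/* Signed difference keeps the comparison correct across wraparound. */
	int32_t gap = (int32_t)(slot->seq_nr - slot->seq_nr_last_acked);

	if (gap > 1) {
		slot->seq_nr--;
		return true;
	}
	return false;
}
```

For example, with a last acked value of 4, an interrupted call bumps the next sequence number from 5 to 6; if the server never saw the interrupted call and rejects 6 as misordered, the sketch steps back to 5 and retries.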

20 Feb, 2019 8 commits

SUNRPC: Remove the redundant 'zerocopy' argument to xs_sendpages() · 6f903b11

Trond Myklebust authored Feb 19, 2019

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Further cleanups of xs_sendpages() · c87dc4c7

Trond Myklebust authored Feb 19, 2019

Now that we send the pages using a struct msghdr, instead of
using sendpage(), we no longer need to 'prime the socket' with
an address for unconnected UDP messages.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Convert socket page send code to use iov_iter() · 0472e476

Trond Myklebust authored Feb 19, 2019

Simplify the page send code using iov_iter and bvecs.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Convert xs_send_kvec() to use iov_iter_kvec() · e791f8e9

Trond Myklebust authored Feb 19, 2019

Prepare the socket transmission code to use iov_iter.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Initiate a connection close on an ESHUTDOWN error in stream receive · 5f52a9d4

Trond Myklebust authored Feb 16, 2019

If the client stream receive code receives an ESHUTDOWN error either
because the server closed the connection, or because it sent a
callback which cannot be processed, then we should shut down
the connection.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Don't suppress socket errors when a message read completes · 727fcc64

Trond Myklebust authored Feb 15, 2019

If the message read completes, but the socket returned an error
condition, we should make sure that the error is propagated.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Handle zero length fragments correctly · e92053a5

Trond Myklebust authored Feb 15, 2019

A zero length fragment is really a bug, but let's ensure we don't
go nuts when one turns up.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Don't reset the stream record info when the receive worker is running · ae053551

Trond Myklebust authored Feb 20, 2019

To ensure that the receive worker has exclusive access to the stream record
info, we must not reset the contents other than when holding the
transport->recv_mutex.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
