Commits · a23dd544debcda4ee4a549ec7de59e85c3c8345c · Kirill Smelkov / linux

30 Jun, 2022 1 commit

SUNRPC: Fix READ_PLUS crasher · a23dd544

Chuck Lever authored Jun 30, 2022

Looks like there are still cases when "space_left - frag1bytes" can
legitimately exceed PAGE_SIZE. Ensure that xdr->end always remains
within the current encode buffer.
Reported-by: Bruce Fields <bfields@fieldses.org>
Reported-by: Zorro Lang <zlang@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216151
Fixes: 6c254bf3 ("SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

a23dd544

27 Jun, 2022 1 commit

NFSD: restore EINVAL error translation in nfsd_commit() · 8a9ffb8c

Alexey Khoroshilov authored Jun 25, 2022

commit 555dbf1a ("nfsd: Replace use of rwsem with errseq_t")
incidentally broke translation of -EINVAL to nfserr_notsupp.
The patch restores that.

Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Fixes: 555dbf1a ("nfsd: Replace use of rwsem with errseq_t")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

8a9ffb8c

08 Jun, 2022 5 commits

SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer() · da9e94fe

Chuck Lever authored Jun 07, 2022

To make the code easier to read, remove visual clutter by changing
the declared type of @p.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>

da9e94fe

SUNRPC: Clean up xdr_get_next_encode_buffer() · bd07a641

Chuck Lever authored Jun 07, 2022

The value of @p is not used until the "location of the next item" is
computed. Help human readers by moving its initial assignment to the
paragraph where that value is used and by clarifying the antecedents
in the documenting comment.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>

bd07a641

SUNRPC: Clean up xdr_commit_encode() · 90d871b3

Chuck Lever authored Jun 07, 2022

Both the kvec::iov_len field and the third parameter of memcpy() and
memmove() are size_t. There's no reason for the implicit conversion
from size_t to int and back. Change the type of @shift to make the
code easier to read and understand.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>

90d871b3

SUNRPC: Optimize xdr_reserve_space() · 62ed448c

Chuck Lever authored Jun 07, 2022

Transitioning between encode buffers is quite infrequent. It happens
about 1 time in 400 calls to xdr_reserve_space(), measured on NFSD
with a typical build/test workload.

Force the compiler to remove that code from xdr_reserve_space(),
which is a hot path on both the server and the client. This change
reduces the size of xdr_reserve_space() from 10 cache lines to 2
when compiled with -Os.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>

62ed448c

SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() · 6c254bf3

Chuck Lever authored Jun 07, 2022

I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up
right at the end of the page array. xdr_get_next_encode_buffer() does
not compute the value of xdr->end correctly:

 * The check to see if we're on the final available page in xdr->buf
   needs to account for the space consumed by @nbytes.

 * The new xdr->end value needs to account for the portion of @nbytes
   that is to be encoded into the previous buffer.

Fixes: 2825a7f9 ("nfsd4: allow encoding across page boundaries")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>

6c254bf3

02 Jun, 2022 2 commits

SUNRPC: Trap RDMA segment overflows · f012e95b

Chuck Lever authored Jun 01, 2022

Prevent svc_rdma_build_writes() from walking off the end of a Write
chunk's segment array. Caught with KASAN.

The test that this fix replaces is invalid, and might have been left
over from an earlier prototype of the PCL work.

Fixes: 7a1cbfa1 ("svcrdma: Use parsed chunk lists to construct RDMA Writes")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

f012e95b

NFSD: Fix potential use-after-free in nfsd_file_put() · b6c71c66

Chuck Lever authored May 31, 2022

nfsd_file_put_noref() can free @nf, so don't dereference @nf
immediately upon return from nfsd_file_put_noref().
Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Fixes: 99939792 ("nfsd: Clean up nfsd_file_put()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

b6c71c66

29 May, 2022 1 commit

MAINTAINERS: reciprocal co-maintainership for file locking and nfsd · 9ff9f77f

Jeff Layton authored May 24, 2022

Chuck has agreed to backstop me as maintainer of the file locking code,
and I'll do the same for him on knfsd.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

9ff9f77f

26 May, 2022 4 commits

NFSD: nfsd_file_put() can sleep · 08af54b3

Chuck Lever authored May 11, 2022

Now that there are no more callers of nfsd_file_put() that might
hold a spin lock, ensure the lockdep infrastructure can catch
newly introduced calls to nfsd_file_put() made while a spinlock
is held.

Link: https://lore.kernel.org/linux-nfs/ece7fd1d-5fb3-5155-54ba-347cfc19bd9a@oracle.com/T/#mf1855552570cf9a9c80d1e49d91438cd9085aadaSigned-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

08af54b3

NFSD: Add documenting comment for nfsd4_release_lockowner() · 043862b0

Chuck Lever authored May 22, 2022

And return explicit nfserr values that match what is documented in the
new comment / API contract.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

043862b0

NFSD: Modernize nfsd4_release_lockowner() · bd8fdb6e

Chuck Lever authored May 22, 2022

Refactor: Use existing helpers that other lock operations use. This
change removes several automatic variables, so re-organize the
variable declarations for readability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

bd8fdb6e

NFSD: Fix possible sleep during nfsd4_release_lockowner() · ce3c4ad7

Chuck Lever authored May 21, 2022

nfsd4_release_lockowner() holds clp->cl_lock when it calls
check_for_locks(). However, check_for_locks() calls nfsd_file_get()
/ nfsd_file_put() to access the backing inode's flc_posix list, and
nfsd_file_put() can sleep if the inode was recently removed.

Let's instead rely on the stateowner's reference count to gate
whether the release is permitted. This should be a reliable
indication of locks-in-use since file lock operations and
->lm_get_owner take appropriate references, which are released
appropriately when file locks are removed.
Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org

ce3c4ad7

23 May, 2022 10 commits

nfsd: destroy percpu stats counters after reply cache shutdown · fd5e363e

Julian Schroeder authored May 23, 2022

Upon nfsd shutdown any pending DRC cache is freed. DRC cache use is
tracked via a percpu counter. In the current code the percpu counter
is destroyed before. If any pending cache is still present,
percpu_counter_add is called with a percpu counter==NULL. This causes
a kernel crash.
The solution is to destroy the percpu counter after the cache is freed.

Fixes: e567b98c (“nfsd: protect concurrent access to nfsd stats counters”)
Signed-off-by: Julian Schroeder <jumaco@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

fd5e363e

nfsd: Fix null-ptr-deref in nfsd_fill_super() · 6f6f84aa

Zhang Xiaoxu authored May 21, 2022

KASAN report null-ptr-deref as follows:

  BUG: KASAN: null-ptr-deref in nfsd_fill_super+0xc6/0xe0 [nfsd]
  Write of size 8 at addr 000000000000005d by task a.out/852

  CPU: 7 PID: 852 Comm: a.out Not tainted 5.18.0-rc7-dirty #66
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x34/0x44
   kasan_report+0xab/0x120
   ? nfsd_mkdir+0x71/0x1c0 [nfsd]
   ? nfsd_fill_super+0xc6/0xe0 [nfsd]
   nfsd_fill_super+0xc6/0xe0 [nfsd]
   ? nfsd_mkdir+0x1c0/0x1c0 [nfsd]
   get_tree_keyed+0x8e/0x100
   vfs_get_tree+0x41/0xf0
   __do_sys_fsconfig+0x590/0x670
   ? fscontext_read+0x180/0x180
   ? anon_inode_getfd+0x4f/0x70
   do_syscall_64+0x35/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This can be reproduce by concurrent operations:
	1. fsopen(nfsd)/fsconfig
	2. insmod/rmmod nfsd

Since the nfsd file system is registered before than nfsd_net allocated,
the caller may get the file_system_type and use the nfsd_net before it
allocated, then null-ptr-deref occurred.

So init_nfsd() should call register_filesystem() last.

Fixes: bd5ae928 ("nfsd: register pernet ops last, unregister first")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

6f6f84aa

nfsd: Unregister the cld notifier when laundry_wq create failed · 62fdb65e

Zhang Xiaoxu authored May 21, 2022

If laundry_wq create failed, the cld notifier should be unregistered.
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

62fdb65e

SUNRPC: Use RMW bitops in single-threaded hot paths · 28df0988

Chuck Lever authored Apr 29, 2022

I noticed CPU pipeline stalls while using perf.

Once an svc thread is scheduled and executing an RPC, no other
processes will touch svc_rqst::rq_flags. Thus bus-locked atomics are
not needed outside the svc thread scheduler.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

28df0988

NFSD: Clean up the show_nf_flags() macro · bb283ca1

Chuck Lever authored Mar 27, 2022

The flags are defined using C macros, so TRACE_DEFINE_ENUM is
unnecessary.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

bb283ca1

NFSD: Trace filecache opens · 0122e882

Chuck Lever authored Mar 27, 2022

Instrument calls to nfsd_open_verified() to get a sense of the
filecache hit rate.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

0122e882

NFSD: Move documenting comment for nfsd4_process_open2() · 7e2ce0cc

Chuck Lever authored Mar 23, 2022

Clean up nfsd4_open() by converting a large comment at the only
call site for nfsd4_process_open2() to a kerneldoc comment in
front of that function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

7e2ce0cc

NFSD: Fix whitespace · 26320d7e

Chuck Lever authored Mar 21, 2022

Clean up: Pull case arms back one tab stop to conform every other
switch statement in fs/nfsd/nfs4proc.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

26320d7e

NFSD: Remove dprintk call sites from tail of nfsd4_open() · f67a16b1

Chuck Lever authored Mar 30, 2022

Clean up: These relics are not likely to benefit server
administrators.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

f67a16b1

NFSD: Instantiate a struct file when creating a regular NFSv4 file · fb70bf12

Chuck Lever authored Mar 30, 2022

There have been reports of races that cause NFSv4 OPEN(CREATE) to
return an error even though the requested file was created. NFSv4
does not provide a status code for this case.

To mitigate some of these problems, reorganize the NFSv4
OPEN(CREATE) logic to allocate resources before the file is actually
created, and open the new file while the parent directory is still
locked.

Two new APIs are added:

+ Add an API that works like nfsd_file_acquire() but does not open
the underlying file. The OPEN(CREATE) path can use this API when it
already has an open file.

+ Add an API that is kin to dentry_open(). NFSD needs to create a
file and grab an open "struct file *" atomically. The
alloc_empty_file() has to be done before the inode create. If it
fails (for example, because the NFS server has exceeded its
max_files limit), we avoid creating the file and can still return
an error to the NFS client.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: JianHong Yin <jiyin@redhat.com>

fb70bf12

20 May, 2022 7 commits

NFSD: Clean up nfsd_open_verified() · f4d84c52

Chuck Lever authored Mar 27, 2022

Its only caller always passes S_IFREG as the @type parameter. As an
additional clean-up, add a kerneldoc comment.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

f4d84c52

NFSD: Remove do_nfsd_create() · 1c388f27

Chuck Lever authored Mar 28, 2022

Now that its two callers have their own version-specific instance of
this function, do_nfsd_create() is no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

1c388f27

NFSD: Refactor NFSv4 OPEN(CREATE) · 254454a5

Chuck Lever authored Mar 28, 2022

Copy do_nfsd_create() to nfs4proc.c and remove NFSv3-specific logic.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

254454a5

NFSD: Refactor NFSv3 CREATE · df9606ab

Chuck Lever authored Mar 28, 2022

The NFSv3 CREATE and NFSv4 OPEN(CREATE) use cases are about to
diverge such that it makes sense to split do_nfsd_create() into one
version for NFSv3 and one for NFSv4.

As a first step, copy do_nfsd_create() to nfs3proc.c and remove
NFSv4-specific logic.

One immediate legibility benefit is that the logic for handling
NFSv3 createhow is now quite straightforward. NFSv4 createhow
has some subtleties that IMO do not belong in generic code.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

df9606ab

NFSD: Refactor nfsd_create_setattr() · 5f46e950

Chuck Lever authored Mar 28, 2022

I'd like to move do_nfsd_create() out of vfs.c. Therefore
nfsd_create_setattr() needs to be made publicly visible.

Note that both call sites in vfs.c commit both the new object and
its parent directory, so just combine those common metadata commits
into nfsd_create_setattr().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

5f46e950

NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() · 14ee45b7

Chuck Lever authored Mar 28, 2022

Clean up: The "out" label already invokes fh_drop_write().

Note that fh_drop_write() is already careful not to invoke
mnt_drop_write() if either it has already been done or there is
nothing to drop. Therefore no change in behavior is expected.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

14ee45b7

NFSD: Clean up nfsd3_proc_create() · e6156859

Chuck Lever authored Mar 25, 2022

As near as I can tell, mode bit masking and setting S_IFREG is
already done by do_nfsd_create() and vfs_create(). The NFSv4 path
(do_open_lookup), for example, does not bother with this special
processing.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

e6156859

19 May, 2022 9 commits

SUNRPC: Simplify synopsis of svc_pool_for_cpu() · 2059b698

Chuck Lever authored May 10, 2022

Clean up: There is one caller. The @cpu argument can be made
implicit now that a get_cpu/put_cpu pair is no longer needed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

2059b698

SUNRPC: Don't disable preemption while calling svc_pool_for_cpu(). · 586095d3

Sebastian Andrzej Siewior authored May 10, 2022

svc_xprt_enqueue() disables preemption via get_cpu() and then asks
for a pool of a specific CPU (current) via svc_pool_for_cpu().
While preemption is disabled, svc_xprt_enqueue() acquires
svc_pool::sp_lock with bottom-halfs disabled, which can sleep on
PREEMPT_RT.

Disabling preemption is not required here. The pool is protected with a
lock so the following list access is safe even cross-CPU. The following
iteration through svc_pool::sp_all_threads is under RCU-readlock and
remaining operations within the loop are atomic and do not rely on
disabled-preemption.

Use raw_smp_processor_id() as the argument for the requested CPU in
svc_pool_for_cpu().
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

586095d3

NFSD: Show state of courtesy client in client info · e9488d5a

Dai Ngo authored May 02, 2022

Update client_info_show to show state of courtesy client
and seconds since last renew.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

e9488d5a

NFSD: add support for lock conflict to courteous server · 27431aff

Dai Ngo authored May 02, 2022

This patch allows expired client with lock state to be in COURTESY
state. Lock conflict with COURTESY client is resolved by the fs/lock
code using the lm_lock_expirable and lm_expire_lock callback in the
struct lock_manager_operations.

If conflict client is in COURTESY state, set it to EXPIRABLE and
schedule the laundromat to run immediately to expire the client. The
callback lm_expire_lock waits for the laundromat to flush its work
queue before returning to caller.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

27431aff

fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict · 2443da22

Dai Ngo authored May 02, 2022

Add 2 new callbacks, lm_lock_expirable and lm_expire_lock, to
lock_manager_operations to allow the lock manager to take appropriate
action to resolve the lock conflict if possible.

A new field, lm_mod_owner, is also added to lock_manager_operations.
The lm_mod_owner is used by the fs/lock code to make sure the lock
manager module such as nfsd, is not freed while lock conflict is being
resolved.

lm_lock_expirable checks and returns true to indicate that the lock
conflict can be resolved else return false. This callback must be
called with the flc_lock held so it can not block.

lm_expire_lock is called to resolve the lock conflict if the returned
value from lm_lock_expirable is true. This callback is called without
the flc_lock held since it's allowed to block. Upon returning from
this callback, the lock conflict should be resolved and the caller is
expected to restart the conflict check from the beginnning of the list.

Lock manager, such as NFSv4 courteous server, uses this callback to
resolve conflict by destroying lock owner, or the NFSv4 courtesy client
(client that has expired but allowed to maintains its states) that owns
the lock.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

2443da22

fs/lock: add helper locks_owner_has_blockers to check for blockers · 591502c5

Dai Ngo authored May 02, 2022

Add helper locks_owner_has_blockers to check if there is any blockers
for a given lockowner.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

591502c5

NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd · d76cc46b

Dai Ngo authored May 02, 2022

This patch moves create/destroy of laundry_wq from nfs4_state_start
and nfs4_state_shutdown_net to init_nfsd and exit_nfsd to prevent
the laundromat from being freed while a thread is processing a
conflicting lock.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

d76cc46b

NFSD: add support for share reservation conflict to courteous server · 3d694271

Dai Ngo authored May 02, 2022

This patch allows expired client with open state to be in COURTESY
state. Share/access conflict with COURTESY client is resolved by
setting COURTESY client to EXPIRABLE state, schedule laundromat
to run and returning nfserr_jukebox to the request client.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

3d694271

NFSD: add courteous server support for thread with only delegation · 66af2579

Dai Ngo authored May 02, 2022

This patch provides courteous server support for delegation only.
Only expired client with delegation but no conflict and no open
or lock state is allowed to be in COURTESY state.

Delegation conflict with COURTESY/EXPIRABLE client is resolved by
setting it to EXPIRABLE, queue work for the laundromat and return
delay to the caller. Conflict is resolved when the laudromat runs
and expires the EXIRABLE client while the NFS client retries the
OPEN request. Local thread request that gets conflict is doing the
retry in _break_lease.

Client in COURTESY or EXPIRABLE state is allowed to reconnect and
continues to have access to its state. Access to the nfs4_client by
the reconnecting thread and the laundromat is serialized via the
client_lock.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

66af2579