- 17 Aug, 2015 23 commits
-
-
Trond Myklebust authored
Chuck reports seeing cases where a GETATTR that happens to race with an asynchronous WRITE is overriding the file size, despite the attribute barrier being set by the writeback code. The culprit turns out to be the check in nfs_ctime_need_update(), which sees that the ctime is newer than the cached ctime, and assumes that it is safe to override the attribute barrier. This patch removes that override, and ensures that attribute barriers are always respected. Reported-by: Chuck Lever <chuck.lever@oracle.com> Fixes: a08a8cd3 ("NFS: Add attribute update barriers to NFS writebacks") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
* layoutfixes: NFSv4.1/pnfs: Remove redundant wakeup in pnfs_send_layoutreturn() NFSv4.1/pnfs: Remove redundant check in pnfs_layoutgets_blocked() NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets in layoutreturn NFSv4.1/pnfs: Don't prevent layoutgets when doing return-on-close NFSv4.1/pnfs: Fix serialisation of layout return and layoutget NFSv4.1/pnfs: Remove redundant checks in pnfs_layoutgets_blocked() pNFS: Tighten up locking around DS commit buckets
-
Trond Myklebust authored
* bugfixes: SUNRPC: Fix a thinko in xs_connect() NFSv4.1/pNFS: Fix borken function _same_data_server_addrs_locked() NFS: nfs_set_pgio_error sometimes misses errors
-
git://git.linux-nfs.org/projects/anna/nfs-rdmaTrond Myklebust authored
NFS: NFS over RDMA Client Side Changes These patches improve both client performance and scalability, most notably by increasing the maixmum allowed rsize and wsize and by increasing the number of RDMA "credits". There are also several bugfixes, such as correcting how WRITE compounds are encoded and fixing large NFS symlink operations. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Anna Schumaker authored
And call nfs_file_clear_open_context() directly. This makes it obvious that nfs_file_release() will always return 0. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
All nfs_write_inode() does is pass its arguments to nfs_commit_unstable_pages(). Let's cut out the middle man and have nfs_write_pages() do the work directly. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
All these functions do is call nfs41_ping_server() without adding anything. Let's remove them and give nfs41_ping_server() a better name instead. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
The idmap_init() and idmap_quit() functions only exist to call the _keyring() version. Let's just call the keyring() functions directly. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
They already exist and do the exact same thing. Let's save ourselves several lines of code! Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
This function is to help determine if two sockaddrs are really the same socket. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
I'm planning on using these functions inside the client, so remove the underscores to make it feel like I'm using a public interface. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
nfs_readdir_xdr_to_array() uses both a cache array and an array of pages, so I rename these functions to make it clearer how the code works. nfs_readdir_large_page() becomes nfs_readdir_alloc_pages() because this function has absolutely nothing to do with setting up a large page. nfs_readdir_free_pagearray() becomes nfs_readdir_free_pages() to stay consistent with the new alloc_pages() function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Anna Schumaker authored
This variable is initialized to NULL and is never modified before being passed to nfs_readdir_free_large_page(). But that's okay, because nfs_readdir_free_large_page() only seems to exist as a way of calling nfs_readdir_free_pagearray() without this parameter. Let's simplify by removing pages_ptr and nfs_readdir_free_pagearray(). Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Jeff Layton authored
We already know that pg_lseg is NULL here. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Christoph Hellwig authored
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Christoph Hellwig authored
We generally want to read and write to a block device that's used by the pNFS block layout client (and even if it's read only the server has no way of telling us). Add FMODE_WRITE to the mode argument so that we don't incorrectly tell the block driver that we want a read-only open. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Christoph Hellwig authored
Instead of overwriting kernel memory reject too long signatures. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Christoph Hellwig authored
We need to replace the __be32 with a void pointer to do proper arithmentics on the virtual addresses so that we can get the right page pointers. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Christoph Hellwig authored
We need to include the first u32 for the number of entries. Add a helper for the calculation instead of opencoding it so that it's in one place. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Kinglong Mee authored
---Steps to Reproduce-- <nfs-server> # cat /etc/exports /nfs/referal *(rw,insecure,no_subtree_check,no_root_squash,crossmnt) /nfs/old *(ro,insecure,subtree_check,root_squash,crossmnt) <nfs-client> # mount -t nfs nfs-server:/nfs/ /mnt/ # ll /mnt/*/ <nfs-server> # cat /etc/exports /nfs/referal *(rw,insecure,no_subtree_check,no_root_squash,crossmnt,refer=/nfs/old/@nfs-server) /nfs/old *(ro,insecure,subtree_check,root_squash,crossmnt) # service nfs restart <nfs-client> # ll /mnt/*/ --->>>>> oops here [ 5123.102925] BUG: unable to handle kernel NULL pointer dereference at (null) [ 5123.103363] IP: [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.103752] PGD 587b9067 PUD 3cbf5067 PMD 0 [ 5123.104131] Oops: 0000 [#1] [ 5123.104529] Modules linked in: nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon parport_pc parport i2c_piix4 shpchp auth_rpcgss nfs_acl vmw_vmci lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi serio_raw scsi_transport_spi e1000 mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd] [ 5123.105887] CPU: 0 PID: 15853 Comm: ::1-manager Tainted: G OE 4.2.0-rc6+ #214 [ 5123.106358] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [ 5123.106860] task: ffff88007620f300 ti: ffff88005877c000 task.ti: ffff88005877c000 [ 5123.107363] RIP: 0010:[<ffffffffa03ed38b>] [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.107909] RSP: 0018:ffff88005877fdb8 EFLAGS: 00010246 [ 5123.108435] RAX: ffff880053f3bc00 RBX: ffff88006ce6c908 RCX: ffff880053a0d240 [ 5123.108968] RDX: ffffea0000e6d940 RSI: ffff8800399a0000 RDI: ffff88006ce6c908 [ 5123.109503] RBP: ffff88005877fe28 R08: ffffffff81c708a0 R09: 0000000000000000 [ 5123.110045] R10: 00000000000001a2 R11: ffff88003ba7f5c8 R12: ffff880054c55800 [ 5123.110618] R13: 0000000000000000 R14: ffff880053a0d240 R15: ffff880053a0d240 [ 5123.111169] FS: 0000000000000000(0000) GS:ffffffff81c27000(0000) knlGS:0000000000000000 [ 5123.111726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5123.112286] CR2: 0000000000000000 CR3: 0000000054cac000 CR4: 00000000001406f0 [ 5123.112888] Stack: [ 5123.113458] ffffea0000e6d940 ffff8800399a0000 00000000000167d0 0000000000000000 [ 5123.114049] 0000000000000000 0000000000000000 0000000000000000 00000000a7ec82c6 [ 5123.114662] ffff88005877fe18 ffffea0000e6d940 ffff8800399a0000 ffff880054c55800 [ 5123.115264] Call Trace: [ 5123.115868] [<ffffffffa03fb44b>] nfs4_try_migration+0xbb/0x220 [nfsv4] [ 5123.116487] [<ffffffffa03fcb3b>] nfs4_run_state_manager+0x4ab/0x7b0 [nfsv4] [ 5123.117104] [<ffffffffa03fc690>] ? nfs4_do_reclaim+0x510/0x510 [nfsv4] [ 5123.117813] [<ffffffff810a4527>] kthread+0xd7/0xf0 [ 5123.118456] [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160 [ 5123.119108] [<ffffffff816d9cdf>] ret_from_fork+0x3f/0x70 [ 5123.119723] [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160 [ 5123.120329] Code: 4c 8b 6a 58 74 17 eb 52 48 8d 55 a8 89 c6 4c 89 e7 e8 4a b5 ff ff 8b 45 b0 85 c0 74 1c 4c 89 f9 48 8b 55 90 48 8b 75 98 48 89 df <41> ff 55 00 3d e8 d8 ff ff 41 89 c6 74 cf 48 8b 4d c8 65 48 33 [ 5123.121643] RIP [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.122308] RSP <ffff88005877fdb8> [ 5123.122942] CR2: 0000000000000000 Fixes: ec011fe8 ("NFS: Introduce a vector of migration recovery ops") Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
It is rather pointless to test the value of transport->inet after calling xs_reset_transport(), since it will always be zero, and so we will never see any exponential back off behaviour. Also don't force early connections for SOFTCONN tasks. If the server disconnects us, we should respect the exponential backoff. Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
- Switch back to using list_for_each_entry(). Fixes an incorrect test for list NULL termination. - Do not assume that lists are sorted. - Finally, consider an existing entry to match if it consists of a subset of the addresses in the new entry. Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
We should ensure that we always set the pgio_header's error field if a READ or WRITE RPC call returns an error. The current code depends on 'hdr->good_bytes' always being initialised to a large value, which is not always done correctly by callers. When this happens, applications may end up missing important errors. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 12 Aug, 2015 13 commits
-
-
Trond Myklebust authored
pnfs_clear_layoutreturn_waitbit() should already be calling rpc_wake_up(&NFS_SERVER(ino)->roc_rpcwaitq) for us. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
layoutget now should already be serialised w.r.t. layout returns Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
The NFS_LAYOUT_RETURN bit already suffices to ensure that layoutget is blocked. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
If there is an outstanding return-on-close, then we just want new layoutget requests to wait rather than fail. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
We should always test for outstanding layout returns, whether or not pnfs_should_retry_layoutget() is true. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
If there are no valid layout segments, then we should already have checked in pnfs_update_layout() whether or not this is the first layoutget. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
I'm not aware of any bugreports around this issue, but the locking around the pnfs_commit_bucket is inconsistent at best. This patch tightens it up by ensuring that the 'bucket->committing' list is always changed atomically w.r.t. the 'bucket->clseg' layout segment tracking. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Kinglong Mee authored
The xprt created by svc_create_xprt have be added to serv->sv_permsocks. So putting the xprt directly is useless. Otherwise, there is a more svc_xprt_put after the xprt be freed. v2, same as v1. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
NeilBrown authored
It is unusual to combine the open flags O_RDONLY and O_EXCL, but it appears that libre-office does just that. [pid 3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0 [pid 3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...> NFSv4 takes O_EXCL as a sign that a setattr command should be sent, probably to reset the timestamps. When it was an O_RDONLY open, the SETATTR command does not identify any actual attributes to change. If no delegation was provided to the open, the SETATTR uses the all-zeros stateid and the request is accepted (at least by the Linux NFS server - no harm, no foul). If a read-delegation was provided, this is used in the SETATTR request, and a Netapp filer will justifiably claim NFS4ERR_BAD_STATEID, which the Linux client takes as a sign to retry - indefinitely. So only treat O_EXCL specially if O_CREAT was also given. Signed-off-by: NeilBrown <neilb@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Kinglong Mee authored
Commit 1d3d4437 "vmscan: per-node deferred work" have made register_shrinker can return an intergater error. If register_shrinker() fail, the later unregister_shrinker() will cause a NULL pointer access. v2, same as v1. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Jeff Layton authored
The current limit of 32 bytes artificially limits the name string that we end up stuffing into NFSv4.x client ID blobs. If you have multiple hosts with long hostnames that only differ near the end, then this can cause NFSv4 client ID collisions. Linux nodenames are actually limited to __NEW_UTS_LEN bytes (64), so use that as the limit instead. Also, use XDR_QUADLEN to specify the slack length, just for clarity and in case someone in the future changes this to something not evenly divisible by 4. Reported-by: Michael Skralivetsky <michael.skralivetsky@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
Prevent a potential deadlock. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
Turned out I misinterpreted the spec... Cc: Tom Haynes <thomas.haynes@primarydata.com> Reported-by: Jean Spector <jean@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 10 Aug, 2015 1 commit
-
-
Trond Myklebust authored
pnfs_layout_mark_request_commit() needs to ensure that it adds the request to the commit list atomically with all the other updates in order to prevent corruption to buckets[ds_commit_idx].wlseg due to races with pnfs_generic_clear_request_commit(). Fixes: 338d00cf ("pnfs: Refactor the *_layout_mark_request_commit...") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 05 Aug, 2015 3 commits
-
-
Devesh Sharma authored
This is a rework of the following patch sent almost a year back: http://www.mail-archive.com/linux-rdma%40vger.kernel.org/msg20730.html In presence of active mount if someone tries to rmmod vendor-driver, the command remains stuck forever waiting for destruction of all rdma-cm-id. in worst case client can crash during shutdown with active mounts. The existing code assumes that ia->ri_id->device cannot change during the lifetime of a transport. xprtrdma do not have support for DEVICE_REMOVAL event either. Lifting that assumption and adding support for DEVICE_REMOVAL event is a long chain of work, and is in plan. The community decided that preventing the hang right now is more important than waiting for architectural changes. Thus, this patch introduces a temporary workaround to acquire HCA driver module reference count during the mount of a nfs-rdma mount point. Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
The verbs are obsolete. The ib_rereg_phys_mr() verb is not used by kernel ULPs, and the last ib_reg_phys_mr() call site in the kernel tree has now been removed. Two staging tree call sites remain in the Lustre client. The Lustre team has been notified of the deprecation of reg_phys_mr. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
RDMA_NOMSG type calls are less efficient than RDMA_MSG. Count NOMSG calls so administrators can tell if they happen to be used more than expected. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-