- 24 Oct, 2010 19 commits
-
-
Andy Adamson authored
Add the ability to actually send LAYOUTGET and GETDEVICEINFO. This also adds in the machinery to handle layout state and the deviceid cache. Note that GETDEVICEINFO is not called directly by the generic layer. Instead it is called by the drivers while parsing the LAYOUTGET opaque data in response to an unknown device id embedded therein. RFC 5661 only encodes device ids within the driver-specific opaque data. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Dean Hildebrand <dhildebz@umich.edu> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Andy Adamson authored
In particular, server reboot will invalidate all layouts. Note that in order to have an active layout, we must get a successful response from the server. To avoid adding that machinery, this patch just includes a stub that fakes up a successful return. Since the layout is never referenced for io, this is not a problem. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Dean Hildebrand <dhildebz@umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Benny Halevy authored
At the start of the io paths, try to grab the relevant layout information. This will initiate the inode's layout cache, but stubs ensure the cache stays empty. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Dean Hildebrand <dhildebz@umich.edu> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn> Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Dean Hildebrand authored
This driver just registers itself and supplies trivial mount/umount functions. Signed-off-by: Dean Hildebrand <dhildebz@umich.edu> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Fred Isaman authored
Allow a module implementing a layout type to register, and have its mount/umount routines called for filesystems that the server declares support it. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Ricardo Labiaga authored
Put in the infrastructure that uses information returned from the server at mount to select a layout driver module. In this patch, a stub is used that always returns "no driver found". Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Dean Hildebrand <dhildebz@umich.edu> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Andy Adamson authored
This information will be used to determine which layout driver, if any, to use for subsequent IO on this filesystem. Each driver is assigned an integer id, with 0 reserved to indicate no driver. The server can in theory return multiple ids. However, our current client implementation only notes the first entry and ignores the rest. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Alexandros Batsakis authored
In NFSv4.1 the stateid consists of the other and seqid fields. For layout processing we need to numerically compare the seqid value of layout stateids. To do so, introduce a union to nfs4_stateid to switch between opaque(16 bytes) and opaque(12 bytes) / __be32 Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Dean Hildebrand authored
Use only layoutreturn constant for both returns and recalls. (return_* works better for recall_type rather the other way around) Signed-off-by: Dean Hildebrand <dhildebz@umich.edu> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Benny Halevy authored
A helper for decoding a fixed length opaque value. Returns a pointer to the next item in the xdr stream. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Andy Adamson authored
Already accepted by Bruce Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Roman Borisov authored
Return value of "decode_attr_bitmap()" was not checked; Signed-off-by: Roman Borisov <ext-roman.borisov@nokia.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Ricardo Labiaga authored
Used by the client to determine if the server has a granular enough time stamp. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Ricardo Labiaga authored
Instead of blindly zapping the caches, attempt to revalidate them if the server has indicated that it uses high resolution timestamps. NFSv4 should be able to always revalidate the cache since the protocol requires the update of the change attribute on modification of the data. In reality, there are servers (the Linux NFS server for example) that do not obey this requirement and use ctime as the basis for change attribute. Long term, the server needs to be fixed. At this time, and to be on the safe side, continue zapping caches if the server indicates that it does not have a high resolution timestamp. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Rob Leslie reports seeing the following Oops after his Kerberos session expired. BUG: unable to handle kernel NULL pointer dereference at 00000058 IP: [<e186ed94>] rpcauth_refreshcred+0x11/0x12c [sunrpc] *pde = 00000000 Oops: 0000 [#1] last sysfs file: /sys/devices/platform/pc87360.26144/temp3_input Modules linked in: autofs4 authenc esp4 xfrm4_mode_transport ipt_LOG ipt_REJECT xt_limit xt_state ipt_REDIRECT xt_owner xt_HL xt_hl xt_tcpudp xt_mark cls_u32 cls_tcindex sch_sfq sch_htb sch_dsmark geodewdt deflate ctr twofish_generic twofish_i586 twofish_common camellia serpent blowfish cast5 cbc xcbc rmd160 sha512_generic sha1_generic hmac crypto_null af_key rpcsec_gss_krb5 nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc ip_gre sit tunnel4 dummy ext3 jbd nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables pc8736x_gpio nsc_gpio pc87360 hwmon_vid loop aes_i586 aes_generic sha256_generic dm_crypt cs5535_gpio serio_raw cs5535_mfgpt hifn_795x des_generic geode_rng rng_core led_class ext4 mbcache jbd2 crc16 dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sd_mod crc_t10dif ide_pci_generic cs5536 amd74xx ide_core pata_cs5536 ata_generic libata usb_storage via_rhine mii scsi_mod btrfs zlib_deflate crc32c libcrc32c [last unloaded: scsi_wait_scan] Pid: 12875, comm: sudo Not tainted 2.6.36-net5501 #1 / EIP: 0060:[<e186ed94>] EFLAGS: 00010292 CPU: 0 EIP is at rpcauth_refreshcred+0x11/0x12c [sunrpc] EAX: 00000000 EBX: defb13a0 ECX: 00000006 EDX: e18683b8 ESI: defb13a0 EDI: 00000000 EBP: 00000000 ESP: de571d58 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Process sudo (pid: 12875, ti=de570000 task=decd1430 task.ti=de570000) Stack: e186e008 00000000 defb13a0 0000000d deda6000 e1868f22 e196f12b defb13a0 <0> defb13d8 00000000 00000000 e186e0aa 00000000 defb13a0 de571dac 00000000 <0> e186956c de571e34 debea5c0 de571dc8 e186967a 00000000 debea5c0 de571e34 Call Trace: [<e186e008>] ? rpc_wake_up_next+0x114/0x11b [sunrpc] [<e1868f22>] ? call_decode+0x24a/0x5af [sunrpc] [<e196f12b>] ? nfs4_xdr_dec_access+0x0/0xa2 [nfs] [<e186e0aa>] ? __rpc_execute+0x62/0x17b [sunrpc] [<e186956c>] ? rpc_run_task+0x91/0x97 [sunrpc] [<e186967a>] ? rpc_call_sync+0x40/0x5b [sunrpc] [<e1969ca2>] ? nfs4_proc_access+0x10a/0x176 [nfs] [<e19572fa>] ? nfs_do_access+0x2b1/0x2c0 [nfs] [<e186ed61>] ? rpcauth_lookupcred+0x62/0x84 [sunrpc] [<e19573b6>] ? nfs_permission+0xad/0x13b [nfs] [<c0177824>] ? exec_permission+0x15/0x4b [<c0177fbd>] ? link_path_walk+0x4f/0x456 [<c017867d>] ? path_walk+0x4c/0xa8 [<c0179678>] ? do_path_lookup+0x1f/0x68 [<c017a3fb>] ? user_path_at+0x37/0x5f [<c016359c>] ? handle_mm_fault+0x229/0x55b [<c0170a2d>] ? sys_faccessat+0x93/0x146 [<c0170aef>] ? sys_access+0xf/0x13 [<c02cf615>] ? syscall_call+0x7/0xb Code: 0f 94 c2 84 d2 74 09 8b 44 24 0c e8 6a e9 8b de 83 c4 14 89 d8 5b 5e 5f 5d c3 55 57 56 53 83 ec 1c fc 89 c6 8b 40 10 89 44 24 04 <8b> 58 58 85 db 0f 85 d4 00 00 00 0f b7 46 70 8b 56 20 89 c5 83 EIP: [<e186ed94>] rpcauth_refreshcred+0x11/0x12c [sunrpc] SS:ESP 0068:de571d58 CR2: 0000000000000058 This appears to be caused by the function rpc_verify_header() first calling xprt_release(), then doing a call_refresh. If we release the transport slot, we should _always_ jump back to call_reserve before calling anything else. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
-
Trond Myklebust authored
Also ensure we only ask for either fileid or mounted_on_fileid in the readdirplus case too... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Otherwise, we may end up reading uninitialised data from the resulting struct nfs_fattr. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 23 Oct, 2010 18 commits
-
-
Trond Myklebust authored
We don't want to have the mounted_on_fileid overwrite the true fileid. We only return the former if the server didn't supply the true fileid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
decode_attr_filehandle still needs to skip the XDR-encoded filehandle if someone passes a null pointer argument. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Also some clean ups. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
By requsting more attributes during a readdir, we can mimic the readdir plus operation that was in NFSv3. To test, I ran the command `ls -lU --color=none` on directories with various numbers of files. Without readdir plus, I see this: n files | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 --------+-----------+-----------+-----------+-----------+---------- real | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s user | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s sys | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s access | 3 | 1 | 1 | 4 | 31 getattr | 2 | 1 | 1 | 1 | 1 lookup | 104 | 1,003 | 10,003 | 100,003 | 1,000,003 readdir | 2 | 16 | 158 | 1,575 | 15,749 total | 111 | 1,021 | 10,163 | 101,583 | 1,015,784 With readdir plus enabled, I see this: n files | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 --------+-----------+-----------+-----------+-----------+---------- real | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s user | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s sys | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s access | 3 | 1 | 1 | 1 | 7 getattr | 2 | 1 | 1 | 1 | 1 lookup | 4 | 3 | 3 | 3 | 3 readdir | 6 | 62 | 630 | 6,300 | 62,993 total | 15 | 67 | 635 | 6,305 | 63,004 Readdir plus disabled has about a 16x increase in the number of rpc calls and is 4 - 5 times slower on large directories. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
Getattr should be able to decode errors and the readdir file handle. decode_getfattr_attrs does the actual attribute decoding, while decode_getfattr_generic will check the opcode before decoding. This will let other functions call decode_getfattr_attrs to decode their attributes. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
Check if the decoded entry has the eof bit set when returning from xdr_decode with an error. If it does, we should set the eof bits in the array before returning. This should keep us from looping when we expect more data but the server doesn't give us anything new. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
Check for all errors, not a specific one. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
We can use vmapped pages to read more information from the network at once. This will reduce the number of calls needed to complete a readdir. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> [trondmy: Added #include for linux/vmalloc.h> in fs/nfs/dir.c] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
Remove the page size checking code for a readdir decode. This is now done by decode_dirent with xdr_streams. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
Convert nfs*xdr.c to use an xdr stream in decode_dirent. This will prevent a kernel oops that has been occuring when reading a vmapped page. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
We sometimes need to be able to read ahead in an xdr_stream without incrementing the current pointer position. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
We will now use readdir plus even on directories that are very large. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
This patch adds readdir plus support to the cache array. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
If we're going through the loop in nfs_readdir() more than once, we usually do not want to restart searching from the beginning of the pages cache. We only want to do that if the previous search failed... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Bryan Schumaker authored
This patch adds the readdir cache array and functions to retreive the array stored on a cache page, clear the array by freeing allocated memory, add an entry to the array, and search the array for a given cookie. It then modifies readdir to make use of the new cache array. With the new cache array method, we no longer need some of this code. Finally, nfs_llseek_dir() will set file->f_pos to a value greater than 0 and desc->dir_cookie to zero. When we see this, readdir needs to find the file at position file->f_pos from the start of the directory. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Randy Dunlap authored
nfs4state.c uses interfaces from ratelimit.h. It needs to include that header file to fix build errors: fs/nfs/nfs4state.c:1195: warning: type defaults to 'int' in declaration of 'DEFINE_RATELIMIT_STATE' fs/nfs/nfs4state.c:1195: warning: parameter names (without types) in function declaration fs/nfs/nfs4state.c:1195: error: invalid storage class for function 'DEFINE_RATELIMIT_STATE' fs/nfs/nfs4state.c:1195: error: implicit declaration of function '__ratelimit' fs/nfs/nfs4state.c:1195: error: '_rs' undeclared (first use in this function) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Otherwise, we cannot recover state correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
If nfs_intent_set_file() returns an error, we usually want to pass that back up the stack. Also ensure that nfs_open_revalidate() returns '1' on success. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 19 Oct, 2010 3 commits
-
-
Trond Myklebust authored
If the server sends us an NFS4ERR_STALE_CLIENTID while the state management thread is busy reclaiming state, we do want to treat all state that wasn't reclaimed before the STALE_CLIENTID as if a network partition occurred (see the edge conditions described in RFC3530 and RFC5661). What we do not want to do is to send an nfs4_reclaim_complete(), since we haven't yet even started reclaiming state after the server rebooted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
-
Trond Myklebust authored
In the case of a server reboot, the state recovery thread starts by calling nfs4_state_end_reclaim_reboot() in order to avoid edge conditions when the server reboots while the client is in the middle of recovery. However, if the client has already marked the nfs4_state as requiring reboot recovery, then the above behaviour will cause the recovery thread to treat the open as if it was part of such an edge condition: the open will be recovered as if it was part of a lease expiration (and all the locks will be lost). Fix is to remove the call to nfs4_state_mark_reclaim_reboot from nfs4_async_handle_error(), and nfs4_handle_exception(). Instead we leave it to the recovery thread to do this for us. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
-
Trond Myklebust authored
NFSv4 open recovery is currently broken: since we do not clear the state->flags states before attempting recovery, we end up with the 'can_open_cached()' function triggering. This again leads to no OPEN call being put on the wire. Reported-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
-