1. 08 Nov, 2013 1 commit
    • Trond Myklebust's avatar
      SUNRPC: Fix a data corruption issue when retransmitting RPC calls · a6b31d18
      Trond Myklebust authored
      The following scenario can cause silent data corruption when doing
      NFS writes. It has mainly been observed when doing database writes
      using O_DIRECT.
      
      1) The RPC client uses sendpage() to do zero-copy of the page data.
      2) Due to networking issues, the reply from the server is delayed,
         and so the RPC client times out.
      
      3) The client issues a second sendpage of the page data as part of
         an RPC call retransmission.
      
      4) The reply to the first transmission arrives from the server
         _before_ the client hardware has emptied the TCP socket send
         buffer.
      5) After processing the reply, the RPC state machine rules that
         the call to be done, and triggers the completion callbacks.
      6) The application notices the RPC call is done, and reuses the
         pages to store something else (e.g. a new write).
      
      7) The client NIC drains the TCP socket send buffer. Since the
         page data has now changed, it reads a corrupted version of the
         initial RPC call, and puts it on the wire.
      
      This patch fixes the problem in the following manner:
      
      The ordering guarantees of TCP ensure that when the server sends a
      reply, then we know that the _first_ transmission has completed. Using
      zero-copy in that situation is therefore safe.
      If a time out occurs, we then send the retransmission using sendmsg()
      (i.e. no zero-copy), We then know that the socket contains a full copy of
      the data, and so it will retransmit a faithful reproduction even if the
      RPC call completes, and the application reuses the O_DIRECT buffer in
      the meantime.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
      a6b31d18
  2. 04 Nov, 2013 5 commits
  3. 01 Nov, 2013 2 commits
    • Trond Myklebust's avatar
      NFS: Fix a missing initialisation when reading the SELinux label · fcb63a9b
      Trond Myklebust authored
      Ensure that _nfs4_do_get_security_label() also initialises the
      SEQUENCE call correctly, by having it call into nfs4_call_sync().
      Reported-by: default avatarJeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      fcb63a9b
    • Jeff Layton's avatar
      nfs: fix oops when trying to set SELinux label · 12207f69
      Jeff Layton authored
      Chao reported the following oops when testing labeled NFS:
      
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffffa0568703>] nfs4_xdr_enc_setattr+0x43/0x110 [nfsv4]
      PGD 277bbd067 PUD 2777ea067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sg coretemp kvm_intel kvm crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul iTCO_wdt glue_helper ablk_helper cryptd iTCO_vendor_support bnx2 pcspkr serio_raw i7core_edac cdc_ether microcode usbnet edac_core mii lpc_ich i2c_i801 mfd_core shpchp ioatdma dca acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ata_generic ttm pata_acpi drm ata_piix libata megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod
      CPU: 4 PID: 25657 Comm: chcon Not tainted 3.10.0-33.el7.x86_64 #1
      Hardware name: IBM System x3550 M3 -[7944OEJ]-/90Y4784     , BIOS -[D6E150CUS-1.11]- 02/08/2011
      task: ffff880178397220 ti: ffff8801595d2000 task.ti: ffff8801595d2000
      RIP: 0010:[<ffffffffa0568703>]  [<ffffffffa0568703>] nfs4_xdr_enc_setattr+0x43/0x110 [nfsv4]
      RSP: 0018:ffff8801595d3888  EFLAGS: 00010296
      RAX: 0000000000000000 RBX: ffff8801595d3b30 RCX: 0000000000000b4c
      RDX: ffff8801595d3b30 RSI: ffff8801595d38e0 RDI: ffff880278b6ec00
      RBP: ffff8801595d38c8 R08: ffff8801595d3b30 R09: 0000000000000001
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801595d38e0
      R13: ffff880277a4a780 R14: ffffffffa05686c0 R15: ffff8802765f206c
      FS:  00007f2c68486800(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 000000027651a000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       0000000000000000 0000000000000000 0000000000000000 0000000000000000
       0000000000000000 ffff880277865800 ffff880278b6ec00 ffff880277a4a780
       ffff8801595d3948 ffffffffa02ad926 ffff8801595d3b30 ffff8802765f206c
      Call Trace:
       [<ffffffffa02ad926>] rpcauth_wrap_req+0x86/0xd0 [sunrpc]
       [<ffffffffa02a1d40>] ? call_connect+0xb0/0xb0 [sunrpc]
       [<ffffffffa02a1d40>] ? call_connect+0xb0/0xb0 [sunrpc]
       [<ffffffffa02a1ecb>] call_transmit+0x18b/0x290 [sunrpc]
       [<ffffffffa02a1d40>] ? call_connect+0xb0/0xb0 [sunrpc]
       [<ffffffffa02aae14>] __rpc_execute+0x84/0x400 [sunrpc]
       [<ffffffffa02ac40e>] rpc_execute+0x5e/0xa0 [sunrpc]
       [<ffffffffa02a2ea0>] rpc_run_task+0x70/0x90 [sunrpc]
       [<ffffffffa02a2f03>] rpc_call_sync+0x43/0xa0 [sunrpc]
       [<ffffffffa055284d>] _nfs4_do_set_security_label+0x11d/0x170 [nfsv4]
       [<ffffffffa0558861>] nfs4_set_security_label.isra.69+0xf1/0x1d0 [nfsv4]
       [<ffffffff815fca8b>] ? avc_alloc_node+0x24/0x125
       [<ffffffff815fcd2f>] ? avc_compute_av+0x1a3/0x1b5
       [<ffffffffa055897b>] nfs4_xattr_set_nfs4_label+0x3b/0x50 [nfsv4]
       [<ffffffff811bc772>] generic_setxattr+0x62/0x80
       [<ffffffff811bcfc3>] __vfs_setxattr_noperm+0x63/0x1b0
       [<ffffffff811bd1c5>] vfs_setxattr+0xb5/0xc0
       [<ffffffff811bd2fe>] setxattr+0x12e/0x1c0
       [<ffffffff811a4d22>] ? final_putname+0x22/0x50
       [<ffffffff811a4f2b>] ? putname+0x2b/0x40
       [<ffffffff811aa1cf>] ? user_path_at_empty+0x5f/0x90
       [<ffffffff8119bc29>] ? __sb_start_write+0x49/0x100
       [<ffffffff811bd66f>] SyS_lsetxattr+0x8f/0xd0
       [<ffffffff8160cf99>] system_call_fastpath+0x16/0x1b
      Code: 48 8b 02 48 c7 45 c0 00 00 00 00 48 c7 45 c8 00 00 00 00 48 c7 45 d0 00 00 00 00 48 c7 45 d8 00 00 00 00 48 c7 45 e0 00 00 00 00 <48> 8b 00 48 8b 00 48 85 c0 0f 84 ae 00 00 00 48 8b 80 b8 03 00
      RIP  [<ffffffffa0568703>] nfs4_xdr_enc_setattr+0x43/0x110 [nfsv4]
       RSP <ffff8801595d3888>
      CR2: 0000000000000000
      
      The problem is that _nfs4_do_set_security_label calls rpc_call_sync()
      directly which fails to do any setup of the SEQUENCE call. Have it use
      nfs4_call_sync() instead which does the right thing. While we're at it
      change the name of "args" to "arg" to better match the pattern in
      _nfs4_do_setattr.
      Reported-by: default avatarChao Ye <cye@redhat.com>
      Cc: David Quigley <dpquigl@davequigley.com>
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      12207f69
  4. 31 Oct, 2013 3 commits
    • Jeff Layton's avatar
      nfs: fix inverted test for delegation in nfs4_reclaim_open_state · 1acd1c30
      Jeff Layton authored
      commit 6686390b (NFS: remove incorrect "Lock reclaim failed!"
      warning.) added a test for a delegation before checking to see if any
      reclaimed locks failed. The test however is backward and is only doing
      that check when a delegation is held instead of when one isn't.
      
      Cc: NeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Fixes: 6686390b: NFS: remove incorrect "Lock reclaim failed!" warning.
      Cc: stable@vger.kernel.org # 3.12
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      1acd1c30
    • Trond Myklebust's avatar
      SUNRPC: Cleanup xs_destroy() · a1311d87
      Trond Myklebust authored
      There is no longer any need for a separate xs_local_destroy() helper.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      a1311d87
    • NeilBrown's avatar
      SUNRPC: close a rare race in xs_tcp_setup_socket. · 93dc41bd
      NeilBrown authored
      We have one report of a crash in xs_tcp_setup_socket.
      The call path to the crash is:
      
        xs_tcp_setup_socket -> inet_stream_connect -> lock_sock_nested.
      
      The 'sock' passed to that last function is NULL.
      
      The only way I can see this happening is a concurrent call to
      xs_close:
      
        xs_close -> xs_reset_transport -> sock_release -> inet_release
      
      inet_release sets:
         sock->sk = NULL;
      inet_stream_connect calls
         lock_sock(sock->sk);
      which gets NULL.
      
      All calls to xs_close are protected by XPRT_LOCKED as are most
      activations of the workqueue which runs xs_tcp_setup_socket.
      The exception is xs_tcp_schedule_linger_timeout.
      
      So presumably the timeout queued by the later fires exactly when some
      other code runs xs_close().
      
      To protect against this we can move the cancel_delayed_work_sync()
      call from xs_destory() to xs_close().
      
      As xs_close is never called from the worker scheduled on
      ->connect_worker, this can never deadlock.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      [Trond: Make it safe to call cancel_delayed_work_sync() on AF_LOCAL sockets]
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      93dc41bd
  5. 30 Oct, 2013 1 commit
  6. 28 Oct, 2013 28 commits