1. 03 Apr, 2014 15 commits
    • NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping · a18d7602
      Jeff Layton authored
      commit d529ef83 upstream.
      
      There is a possible race in how the nfs_invalidate_mapping function is
      handled.  Currently, we go and invalidate the pages in the file and then
      clear NFS_INO_INVALID_DATA.
      
      The problem is that it's possible for a stale page to creep into the
      mapping after the page was invalidated (i.e., via readahead). If another
      writer comes along and sets the flag after that happens but before
      invalidate_inode_pages2 returns then we could clear the flag
      without the cache having been properly invalidated.
      
      So, we must clear the flag first and then invalidate the pages. Doing
      this, however, opens another race:
      
      It's possible to have two concurrent read() calls that end up in
      nfs_revalidate_mapping at the same time. The first one clears the
      NFS_INO_INVALID_DATA flag and then goes to call nfs_invalidate_mapping.
      
      Just before calling that though, the other task races in, checks the
      flag and finds it cleared. At that point, it trusts that the mapping is
      good and gets the lock on the page, allowing the read() to be satisfied
      from the cache even though the data is no longer valid.
      
      These effects are easily manifested by running diotest3 from the LTP
      test suite on NFS. That program does a series of DIO writes and buffered
      reads. The operations are serialized and page-aligned but the existing
      code fails the test since it occasionally allows a read to come out of
      the cache incorrectly. While mixing direct and buffered I/O isn't
      recommended, I believe it's possible to hit this in other ways that just
      use buffered I/O, though that situation is much harder to reproduce.
      
      The problem is that the checking/clearing of that flag and the
      invalidation of the mapping really need to be atomic. Fix this by
      serializing concurrent invalidations with a bitlock.
      
      At the same time, we also need to allow other places that check
      NFS_INO_INVALID_DATA to check whether we might be in the middle of
      invalidating the file, so fix up a couple of places that do that
      to look for the new NFS_INO_INVALIDATING flag.
      
      Doing this requires us to be careful not to set the bitlock
      unnecessarily, so this code only does that if it believes it will
      be doing an invalidation.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
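
      The serialization described in this commit message can be modelled in
      userspace. The following is a minimal sketch, not the kernel code: the
      struct, the MODEL_* bit values and the function names are hypothetical,
      C11 atomics stand in for the kernel's bit operations, and a busy loop
      stands in for sleeping on the bitlock. It shows the key property that
      the "invalid data" flag is cleared and the "invalidating" bitlock is
      taken in a single atomic step, and that the bitlock is never taken when
      no invalidation is needed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Flag bits in a userspace stand-in for the NFS inode flags word.
 * The names echo the commit message; the bit positions are arbitrary. */
#define MODEL_INVALID_DATA  (1u << 0)  /* cached pages are stale */
#define MODEL_INVALIDATING  (1u << 1)  /* bitlock: invalidation in progress */

struct inode_model {
    atomic_uint flags;
    int invalidations;   /* counts calls to the stand-in invalidation */
};

/* Stand-in for the real page invalidation: here it only counts calls. */
static void model_invalidate_pages(struct inode_model *ino)
{
    ino->invalidations++;
}

/* Returns true if this caller performed an invalidation. */
static bool model_revalidate_mapping(struct inode_model *ino)
{
    unsigned int seen = atomic_load(&ino->flags);

    for (;;) {
        if (seen & MODEL_INVALIDATING) {
            /* Another task holds the bitlock, so the cache cannot be
             * trusted yet: re-read the flags and try again. */
            seen = atomic_load(&ino->flags);
            continue;
        }
        if (!(seen & MODEL_INVALID_DATA))
            return false;   /* nothing to do; bitlock never taken */

        /* Clear INVALID_DATA and take the bitlock in one atomic step. */
        unsigned int want = (seen & ~MODEL_INVALID_DATA) | MODEL_INVALIDATING;
        if (atomic_compare_exchange_weak(&ino->flags, &seen, want))
            break;
        /* A failed exchange refreshed 'seen'; loop and re-evaluate. */
    }

    model_invalidate_pages(ino);

    /* Drop the bitlock. If a writer set INVALID_DATA in the meantime,
     * that bit stays set and the next caller invalidates again. */
    atomic_fetch_and(&ino->flags, ~MODEL_INVALIDATING);
    return true;
}
```

      In this model a second reader that arrives while the first is still
      invalidating sees MODEL_INVALIDATING and cannot conclude that the
      cache is good, which is the window the commit message describes.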
    • pnfs: fix BUG in filelayout_recover_commit_reqs · 1b70c8f6
      Weston Andros Adamson authored
      commit 471252cd upstream.
      
      cond_resched_lock(cinfo->lock) is called everywhere else while holding
      the cinfo->lock spinlock.  Not holding this lock while calling
      transfer_commit_list in filelayout_recover_commit_reqs causes the BUG
      below.
      
      It's true that we can't hold this lock while calling pnfs_put_lseg,
      because that might try to lock the inode lock - which might be the
      same lock as cinfo->lock.
      
      To reproduce, mount a 2 DS pynfs server and run an O_DIRECT command
      that crosses a stripe boundary and is not page aligned, such as:
      
       dd if=/dev/zero of=/mnt/f bs=17000 count=1 oflag=direct
      
      BUG: sleeping function called from invalid context at linux/fs/nfs/nfs4filelayout.c:1161
      in_atomic(): 0, irqs_disabled(): 0, pid: 27, name: kworker/0:1
      2 locks held by kworker/0:1/27:
       #0:  (events){.+.+.+}, at: [<ffffffff810501d7>] process_one_work+0x175/0x3a5
       #1:  ((&dreq->work)){+.+...}, at: [<ffffffff810501d7>] process_one_work+0x175/0x3a5
      CPU: 0 PID: 27 Comm: kworker/0:1 Not tainted 3.13.0-rc3-branch-dros_testing+ #21
      Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
      Workqueue: events nfs_direct_write_schedule_work [nfs]
       0000000000000000 ffff88007a39bbb8 ffffffff81491256 ffff88007b87a130
       ffff88007a39bbd8 ffffffff8105f103 ffff880079614000 ffff880079617d40
       ffff88007a39bc20 ffffffffa011603e ffff880078988b98 0000000000000000
      Call Trace:
       [<ffffffff81491256>] dump_stack+0x4d/0x66
       [<ffffffff8105f103>] __might_sleep+0x100/0x105
       [<ffffffffa011603e>] transfer_commit_list+0x94/0xf1 [nfs_layout_nfsv41_files]
       [<ffffffffa01160d6>] filelayout_recover_commit_reqs+0x3b/0x68 [nfs_layout_nfsv41_files]
       [<ffffffffa00ba53a>] nfs_direct_write_reschedule+0x9f/0x1d6 [nfs]
       [<ffffffff810705df>] ? mark_lock+0x1df/0x224
       [<ffffffff8106e617>] ? trace_hardirqs_off_caller+0x37/0xa4
       [<ffffffff8106e691>] ? trace_hardirqs_off+0xd/0xf
       [<ffffffffa00ba8f8>] nfs_direct_write_schedule_work+0x9d/0xb7 [nfs]
       [<ffffffff810501d7>] ? process_one_work+0x175/0x3a5
       [<ffffffff81050258>] process_one_work+0x1f6/0x3a5
       [<ffffffff810501d7>] ? process_one_work+0x175/0x3a5
       [<ffffffff8105187e>] worker_thread+0x149/0x1f5
       [<ffffffff81051735>] ? rescuer_thread+0x28d/0x28d
       [<ffffffff81056d74>] kthread+0xd2/0xda
       [<ffffffff81056ca2>] ? __kthread_parkme+0x61/0x61
       [<ffffffff8149e66c>] ret_from_fork+0x7c/0xb0
       [<ffffffff81056ca2>] ? __kthread_parkme+0x61/0x61
      Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • nfs: increment i_dio_count for reads, too · 59653950
      Christoph Hellwig authored
      commit 1f90ee27 upstream.
      
      i_dio_count is used to protect dio access against truncate.  We want
      to make sure there are no dio reads pending either when doing a
      truncate.  I suspect on plain NFS things might work even without
      this, but once we use a pnfs layout driver that accesses backing devices
      directly things will go bad without the proper synchronization.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • nfs: defer inode_dio_done call until size update is done · 2f786d99
      Christoph Hellwig authored
      commit 2a009ec9 upstream.
      
      We need to have the I/O fully finished before telling the truncate code
      that we are done.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • nfs: fix size updates for aio writes · 2fae4cc0
      Christoph Hellwig authored
      commit 9811cd57 upstream.
      
      nfs_file_direct_write only updates the inode size if it succeeded and
      returned the number of bytes written.  But in the AIO case nfs_direct_wait
      turns the return value into -EIOCBQUEUED and we skip the size update.
      
      Instead the aio completion path should update it, which this patch
      does.  The implementation is a little hacky because there is no obvious
      way to find out we are called for a write in nfs_direct_complete.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • nfs: fix dead code of ipv6_addr_scope · 2e3b72ab
      Alexander Aring authored
      commit a8c22754 upstream.
      
      The correct way to check on IPV6_ADDR_SCOPE_LINKLOCAL is to check with
      the ipv6_addr_src_scope function.
      
      Currently this cannot work, because ipv6_addr_scope returns an int with
      a mask of IPV6_ADDR_SCOPE_MASK (0x00f0U) and IPV6_ADDR_SCOPE_LINKLOCAL
      is 0x02. So the condition is always false.
      Signed-off-by: Alexander Aring <alex.aring@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
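
      The "always false" claim in this commit message follows from the two
      constants it quotes. As a small worked check, using only those values
      (the helper name is hypothetical, and this is plain userspace C rather
      than kernel code), the sketch below counts how many 16-bit values
      satisfy (v & mask) == target. A target with bits outside the mask can
      never match.

```c
#include <stdint.h>

/* Values quoted in the commit message. */
#define IPV6_ADDR_SCOPE_MASK       0x00f0U
#define IPV6_ADDR_SCOPE_LINKLOCAL  0x02U

/* Counts how many 16-bit values v satisfy (v & mask) == target. */
static unsigned int count_masked_matches(unsigned int mask, unsigned int target)
{
    unsigned int matches = 0;

    for (uint32_t v = 0; v <= 0xffffU; v++) {
        if ((v & mask) == target)
            matches++;
    }
    return matches;
}
```

      Calling count_masked_matches(IPV6_ADDR_SCOPE_MASK,
      IPV6_ADDR_SCOPE_LINKLOCAL) returns 0, because bit 1 of 0x02 is cleared
      by the 0x00f0 mask before the comparison.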
    • NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery · da6bf1d4
      Trond Myklebust authored
      commit f22e5edd upstream.
      
      Andy Adamson reports:
      
      The state manager is recovering expired state and recovery OPENs are being
      processed. If kswapd is pruning inodes at the same time, a deadlock can occur
      when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the
      resultant layoutreturn gets an error that the state manager is to handle,
      causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq.
      
      At the same time an open is waiting for the inode deletion to complete in
      __wait_on_freeing_inode.
      
      If the open is either the open called by the state manager, or an open from
      the same open owner that is holding the NFSv4 sequence id which causes the
      OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue,
      then the state is deadlocked with kswapd.
      
      The fix is simply to have layoutreturn ignore all errors except NFS4ERR_DELAY.
      We already know that layouts are dropped on all server reboots, and that
      it has to be coded to deal with the "forgetful client model" that doesn't
      send layoutreturns.
      Reported-by: Andy Adamson <andros@netapp.com>
      Link: http://lkml.kernel.org/r/1385402270-14284-1-git-send-email-andros@netapp.com
      Signed-off-by: Trond Myklebust <Trond.Myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
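
      The policy stated in this commit message ("ignore all errors except
      NFS4ERR_DELAY") can be written down as a tiny decision function. This
      is only an illustration of the rule, not the patched kernel function:
      the enum and function names are hypothetical, and the numeric error
      values are the ones assigned by the NFSv4 protocol specification.

```c
/* Protocol error numbers as assigned by the NFSv4 specification. */
#define MODEL_NFS4ERR_DELAY    10008
#define MODEL_NFS4ERR_EXPIRED  10011

enum layoutreturn_action {
    LAYOUTRETURN_FINISH,  /* treat the call as done, whatever the status */
    LAYOUTRETURN_RETRY    /* server asked us to come back later */
};

/* 'status' follows the kernel convention of negated error numbers. */
static enum layoutreturn_action model_layoutreturn_done(int status)
{
    if (status == -MODEL_NFS4ERR_DELAY)
        return LAYOUTRETURN_RETRY;

    /* Every other outcome, including errors that would normally be
     * handed to the state manager, ends the layoutreturn here, so it
     * never waits for state recovery to finish. */
    return LAYOUTRETURN_FINISH;
}
```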
    • SUNRPC: do not fail gss proc NULL calls with EACCES · f5ddb3e9
      Andy Adamson authored
      commit c297c8b9 upstream.
      
      Otherwise RPCSEC_GSS_DESTROY messages are not sent.
      Signed-off-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • fbmem: really support wildcard video=options for all fbdev drivers · e8bdefeb
      Olaf Hering authored
      commit f5d2b7c2 upstream.
      
      Documentation/fb/modedb.txt states that video=option should be
      considered a global option. But video_setup and fb_get_options are not
      coded that way. Instead it's required to boot with video=driver:option to
      set a given option in a driver.  This is cumbersome because it requires
      knowing in advance which driver will be active for a given board/kernel.
      
      The following patch implements the documented catchall for the fbdev
      drivers. It is now possible to boot with video=XxY without the need to
      know the active driver in advance. The specific case it tries to fix is
      syslinux in the SUSE installer which offers a menu to set a display
      resolution. Right now this just appends the vga= option to the kernel command line. But
      in addition to vga= it should be possible to pass a generic video=XxY
      for all framebuffer/drm drivers. With this change forcing a certain
      window size of VM displays is now much easier.
      
      Today the video= option is stored in a global fb_mode_option. But
      unfortunately only drm uses it.
      
      Note: this change introduces a small memleak if video=option is actually
      used because fb_mode_option is const. Most drivers use strsep to get to
      individual options. This could be fixed in a followup patch which always
      releases the option string in every caller of fb_get_options.
      Signed-off-by: Olaf Hering <olaf@aepfle.de>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
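
      The lookup behaviour this commit message describes (a driver-specific
      video=driver:option wins, otherwise a plain video=option acts as a
      catch-all) can be sketched as follows. This is a userspace model under
      stated assumptions, not the fbmem code: the function name is
      hypothetical, the input is assumed to be the list of strings that
      followed each video= on the command line, and an argument without any
      ':' is assumed to be the global one.

```c
#include <stddef.h>
#include <string.h>

/* Returns the option string that 'driver' should use, or NULL if neither
 * a driver-specific nor a global video= argument was given. */
static const char *model_get_options(const char *driver,
                                     const char *const *args, size_t nargs)
{
    size_t name_len = strlen(driver);
    const char *global = NULL;

    for (size_t i = 0; i < nargs; i++) {
        const char *arg = args[i];

        /* Driver-specific form "driver:option": use it immediately. */
        if (strncmp(arg, driver, name_len) == 0 && arg[name_len] == ':')
            return arg + name_len + 1;

        /* No "name:" prefix at all: remember it as the catch-all. */
        if (strchr(arg, ':') == NULL)
            global = arg;
    }
    return global;
}
```

      With the arguments "1024x768" and "vesafb:ywrap", a lookup for "vesafb"
      yields "ywrap" while a lookup for any other driver name falls back to
      "1024x768".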
    • arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT · ee023577
      Toshi Kani authored
      commit a85eba88 upstream.
      
      When ACPI SLIT table has an I/O locality (i.e. a locality
      unique to an I/O device), numa_set_distance() emits this warning
      message:
      
       NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
      
      acpi_numa_slit_init() calls numa_set_distance() with
      pxm_to_node(), which assumes that all localities have been
      parsed with SRAT previously.  SRAT does not list I/O localities,
      whereas SLIT lists all localities including I/Os.  Hence,
      pxm_to_node() returns NUMA_NO_NODE (-1) for an I/O locality.
      
      I/O localities are not supported and are ignored today, but emitting
      such a warning message leads to unnecessary confusion.
      
      Change acpi_numa_slit_init() to avoid calling
      numa_set_distance() with NUMA_NO_NODE.
      Signed-off-by: Toshi Kani <toshi.kani@hp.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/n/tip-dSvpjjvp8aMzs1ybkftxohlh@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
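
      The change this commit message describes amounts to skipping any SLIT
      row or column whose locality has no NUMA node. A minimal sketch of that
      loop, as a userspace model: the model_* functions and the lookup table
      are hypothetical stand-ins for pxm_to_node() and numa_set_distance(),
      and only the -1 value of NUMA_NO_NODE is taken from the message.

```c
#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for pxm_to_node(): a lookup table in which
 * localities that SRAT never described map to NUMA_NO_NODE. */
static int model_pxm_to_node(const int *pxm_map, int pxm)
{
    return pxm_map[pxm];
}

/* Walks a localities x localities SLIT-like matrix, records distances
 * only between real nodes, and returns how many entries were recorded.
 * 'dist' is a nodes x nodes output matrix. */
static int model_slit_init(const int *pxm_map, int localities,
                           const unsigned char *slit,
                           unsigned char *dist, int nodes)
{
    int recorded = 0;

    for (int i = 0; i < localities; i++) {
        int from = model_pxm_to_node(pxm_map, i);

        if (from == NUMA_NO_NODE)
            continue;               /* I/O locality: skip the whole row */

        for (int j = 0; j < localities; j++) {
            int to = model_pxm_to_node(pxm_map, j);

            if (to == NUMA_NO_NODE)
                continue;           /* I/O locality: skip this column */

            dist[from * nodes + to] = slit[i * localities + j];
            recorded++;
        }
    }
    return recorded;
}
```

      With two memory localities and one I/O locality, only the four
      node-to-node distances are recorded and nothing is ever reported for
      node id -1.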
    • crypto: sha256_ssse3 - also test for BMI2 · 6b11f0ed
      Oliver Neukum authored
      commit 16c0c4e1 upstream.
      
      The AVX2 implementation also uses BMI2 instructions,
      but doesn't test for their availability. The assumption
      that AVX2 and BMI2 always go together is false. Some
      Haswells have AVX2 but not BMI2.
      Signed-off-by: Oliver Neukum <oneukum@suse.de>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
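
      The point of this commit is that the AVX2 code path needs two CPU
      features, not one. A minimal sketch of such a combined check, with
      hypothetical FEAT_* bits standing in for the kernel's CPU feature
      tests:

```c
#include <stdbool.h>

/* Hypothetical feature bits; the kernel uses its own CPU feature tests. */
#define FEAT_AVX2 (1u << 0)
#define FEAT_BMI2 (1u << 1)

/* The AVX2 transform is usable only when every feature it relies on
 * is present, so both bits must be set. */
static bool avx2_transform_usable(unsigned int features)
{
    const unsigned int needed = FEAT_AVX2 | FEAT_BMI2;

    return (features & needed) == needed;
}
```

      A CPU reporting AVX2 without BMI2, the case named in the commit
      message, fails this check and would fall back to another
      implementation.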
    • drm/nouveau: make vga_switcheroo code depend on VGA_SWITCHEROO · fa13abce
      Jeff Mahoney authored
      commit d0ce7b85 upstream.
      
      Commit 8116188f ("nouveau/acpi: hook up to the MXM method for mux
      switching.") broke the build on non-x86 architectures due to the new
      dependency on MXM and MXM being an x86 platform driver.
      
      It built previously since the vga switcheroo registration routines were
      zeroed out on !X86.  The code was built in but unused.
      
      This patch makes all of the DSM code depend on CONFIG_VGA_SWITCHEROO,
      allowing it to build on non-x86 and shrinking the module size as well.
      
      [rdunlap@infradead.org: fix build error when VGA_SWITCHEROO is not enabled]
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: David Airlie <airlied@linux.ie>
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • nfs: use IS_ROOT not DCACHE_DISCONNECTED · 8276a9ae
      J. Bruce Fields authored
      commit a3f432bf upstream.
      
      This check was added by Al Viro with
      d9e80b7d "nfs d_revalidate() is too
      trigger-happy with d_drop()", with the explanation that we don't want to
      remove the root of a disconnected tree, which will still be included on
      the s_anon list.
      
      But DCACHE_DISCONNECTED does *not* actually identify dentries that are
      disconnected from the dentry tree or hashed on s_anon.  IS_ROOT() is the
      way to do that.
      
      Also add a comment from Al's commit to remind us why this check is
      there.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
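
      The distinction this commit message draws can be illustrated with a toy
      dentry. The sketch below is a userspace model, not kernel code: the
      struct and the MODEL_ flag are hypothetical, and it assumes, as the
      message implies, that the disconnected flag can still be set on a
      dentry that already has a real parent. The root test mirrors the
      kernel's definition of IS_ROOT(), which is true when a dentry is its
      own parent.

```c
#include <stdbool.h>

#define MODEL_DCACHE_DISCONNECTED (1u << 0)

struct dentry_model {
    struct dentry_model *d_parent;
    unsigned int d_flags;
};

/* Same test as the kernel's IS_ROOT(): a root dentry is its own parent. */
static bool model_is_root(const struct dentry_model *d)
{
    return d == d->d_parent;
}

/* The unreliable test: the flag alone says nothing about tree position. */
static bool model_flag_says_disconnected(const struct dentry_model *d)
{
    return (d->d_flags & MODEL_DCACHE_DISCONNECTED) != 0;
}
```

      For a child dentry that points at a parent but still carries the flag,
      model_flag_says_disconnected() is true while model_is_root() is false,
      so only the second test identifies the root of a disconnected tree.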
    • X.509: struct x509_certificate needs struct tm declaring · 05cf0398
      David Howells authored
      commit 57be4a78 upstream.
      
      struct x509_certificate needs struct tm declaring by #inclusion of linux/time.h
      prior to its definition.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Josh Boyer <jwboyer@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • intel_idle: Fixed C6 state on Avoton/Rangeley processors · 34bf8801
      Bockholdt Arne authored
      commit 22e580d0 upstream.
      
      Corrected the MWAIT flag for C-State C6 on Intel Avoton/Rangeley processors.
      Signed-off-by: Arne Bockholdt <linux-kernel@bockholdt.com>
      Acked-by: Len Brown <len.brown@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
  2. 01 Apr, 2014 3 commits
  3. 31 Mar, 2014 22 commits