1. 03 Feb, 2016 13 commits
    • Dave Chinner's avatar
      xfs: inode recovery readahead can race with inode buffer creation · 848235ac
      Dave Chinner authored
      [ Upstream commit b79f4a1c ]
      
      When we do inode readahead in log recovery, we do can do the
      readahead before we've replayed the icreate transaction that stamps
      the buffer with inode cores. The inode readahead verifier catches
      this and marks the buffer as !done to indicate that it doesn't yet
      contain valid inodes.
      
      In adding buffer error notification  (i.e. setting b_error = -EIO at
      the same time as as we clear the done flag) to such a readahead
      verifier failure, we can then get subsequent inode recovery failing
      with this error:
      
      XFS (dm-0): metadata I/O error: block 0xa00060 ("xlog_recover_do..(read#2)") error 5 numblks 32
      
      This occurs when readahead completion races with icreate item replay
      such as:
      
      	inode readahead
      		find buffer
      		lock buffer
      		submit RA io
      	....
      	icreate recovery
      	    xfs_trans_get_buffer
      		find buffer
      		lock buffer
      		<blocks on RA completion>
      	.....
      	<ra completion>
      		fails verifier
      		clear XBF_DONE
      		set bp->b_error = -EIO
      		release and unlock buffer
      	<icreate gains lock>
      	icreate initialises buffer
      	marks buffer as done
      	adds buffer to delayed write queue
      	releases buffer
      
      At this point, we have an initialised inode buffer that is up to
      date but has an -EIO state registered against it. When we finally
      get to recovering an inode in that buffer:
      
      	inode item recovery
      	    xfs_trans_read_buffer
      		find buffer
      		lock buffer
      		sees XBF_DONE is set, returns buffer
      	    sees bp->b_error is set
      		fail log recovery!
      
      Essentially, we need xfs_trans_get_buf_map() to clear the error status of
      the buffer when doing a lookup. This function returns uninitialised
      buffers, so the buffer returned can not be in an error state and
      none of the code that uses this function expects b_error to be set
      on return. Indeed, there is an ASSERT(!bp->b_error); in the
      transaction case in xfs_trans_get_buf_map() that would have caught
      this if log recovery used transactions....
      
      This patch firstly changes the inode readahead failure to set -EIO
      on the buffer, and secondly changes xfs_buf_get_map() to never
      return a buffer with an error state set so this first change doesn't
      cause unexpected log recovery failures.
      
      cc: <stable@vger.kernel.org> # 3.12 - current
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      848235ac
    • Ard Biesheuvel's avatar
      s390: fix normalization bug in exception table sorting · 2826e094
      Ard Biesheuvel authored
      [ Upstream commit bcb7825a ]
      
      The normalization pass in the sorting routine of the relative exception
      table serves two purposes:
      - it ensures that the address fields of the exception table entries are
        fully ordered, so that no ambiguities arise between entries with
        identical instruction offsets (i.e., when two instructions that are
        exactly 8 bytes apart each have an exception table entry associated with
        them)
      - it ensures that the offsets of both the instruction and the fixup fields
        of each entry are relative to their final location after sorting.
      
      Commit eb608fb3 ("s390/exceptions: switch to relative exception table
      entries") ported the relative exception table format from x86, but modified
      the sorting routine to only normalize the instruction offset field and not
      the fixup offset field. The result is that the fixup offset of each entry
      will be relative to the original location of the entry before sorting,
      likely leading to crashes when those entries are dereferenced.
      
      Fixes: eb608fb3 ("s390/exceptions: switch to relative exception table entries")
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      2826e094
    • Ben Skeggs's avatar
      drm/nouveau/kms: take mode_config mutex in connector hotplug path · ccafc9df
      Ben Skeggs authored
      [ Upstream commit 0a882cad ]
      
      fdo#93634
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      ccafc9df
    • Vegard Nossum's avatar
      uml: flush stdout before forking · 498b3468
      Vegard Nossum authored
      [ Upstream commit 0754fb29 ]
      
      I was seeing some really weird behaviour where piping UML's output
      somewhere would cause output to get duplicated:
      
        $ ./vmlinux | head -n 40
        Checking that ptrace can change system call numbers...Core dump limits :
                soft - 0
                hard - NONE
        OK
        Checking syscall emulation patch for ptrace...Core dump limits :
                soft - 0
                hard - NONE
        OK
        Checking advanced syscall emulation patch for ptrace...Core dump limits :
                soft - 0
                hard - NONE
        OK
        Core dump limits :
                soft - 0
                hard - NONE
      
      This is because these tests do a fork() which duplicates the non-empty
      stdout buffer, then glibc flushes the duplicated buffer as each child
      exits.
      
      A simple workaround is to flush before forking.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      498b3468
    • Vegard Nossum's avatar
      uml: fix hostfs mknod() · 4e623240
      Vegard Nossum authored
      [ Upstream commit 9f2dfda2 ]
      
      An inverted return value check in hostfs_mknod() caused the function
      to return success after handling it as an error (and cleaning up).
      
      It resulted in the following segfault when trying to bind() a named
      unix socket:
      
        Pid: 198, comm: a.out Not tainted 4.4.0-rc4
        RIP: 0033:[<0000000061077df6>]
        RSP: 00000000daae5d60  EFLAGS: 00010202
        RAX: 0000000000000000 RBX: 000000006092a460 RCX: 00000000dfc54208
        RDX: 0000000061073ef1 RSI: 0000000000000070 RDI: 00000000e027d600
        RBP: 00000000daae5de0 R08: 00000000da980ac0 R09: 0000000000000000
        R10: 0000000000000003 R11: 00007fb1ae08f72a R12: 0000000000000000
        R13: 000000006092a460 R14: 00000000daaa97c0 R15: 00000000daaa9a88
        Kernel panic - not syncing: Kernel mode fault at addr 0x40, ip 0x61077df6
        CPU: 0 PID: 198 Comm: a.out Not tainted 4.4.0-rc4 #1
        Stack:
         e027d620 dfc54208 0000006f da981398
         61bee000 0000c1ed daae5de0 0000006e
         e027d620 dfcd4208 00000005 6092a460
        Call Trace:
         [<60dedc67>] SyS_bind+0xf7/0x110
         [<600587be>] handle_syscall+0x7e/0x80
         [<60066ad7>] userspace+0x3e7/0x4e0
         [<6006321f>] ? save_registers+0x1f/0x40
         [<6006c88e>] ? arch_prctl+0x1be/0x1f0
         [<60054985>] fork_handler+0x85/0x90
      
      Let's also get rid of the "cosmic ray protection" while we're at it.
      
      Fixes: e9193059 "hostfs: fix races in dentry_name() and inode_name()"
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      4e623240
    • Mikulas Patocka's avatar
      dm snapshot: fix hung bios when copy error occurs · 27f8627c
      Mikulas Patocka authored
      [ Upstream commit 385277bf ]
      
      When there is an error copying a chunk dm-snapshot can incorrectly hold
      associated bios indefinitely, resulting in hung IO.
      
      The function copy_callback sets pe->error if there was error copying the
      chunk, and then calls complete_exception.  complete_exception calls
      pending_complete on error, otherwise it calls commit_exception with
      commit_callback (and commit_callback calls complete_exception).
      
      The persistent exception store (dm-snap-persistent.c) assumes that calls
      to prepare_exception and commit_exception are paired.
      persistent_prepare_exception increases ps->pending_count and
      persistent_commit_exception decreases it.
      
      If there is a copy error, persistent_prepare_exception is called but
      persistent_commit_exception is not.  This results in the variable
      ps->pending_count never returning to zero and that causes some pending
      exceptions (and their associated bios) to be held forever.
      
      Fix this by unconditionally calling commit_exception regardless of
      whether the copy was successful.  A new "valid" parameter is added to
      commit_exception -- when the copy fails this parameter is set to zero so
      that the chunk that failed to copy (and all following chunks) is not
      recorded in the snapshot store.  Also, remove commit_callback now that
      it is merely a wrapper around pending_complete.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      27f8627c
    • Mike Christie's avatar
      scsi: add Synology to 1024 sector blacklist · 9d8246cd
      Mike Christie authored
      [ Upstream commit 9055082f ]
      
      Another iscsi target that cannot handle large IOs, but does not tell us
      a limit.
      
      The Synology iSCSI targets report:
      
      Block limits VPD page (SBC):
        Write same no zero (WSNZ): 0
        Maximum compare and write length: 0 blocks
        Optimal transfer length granularity: 0 blocks
        Maximum transfer length: 0 blocks
        Optimal transfer length: 0 blocks
        Maximum prefetch length: 0 blocks
        Maximum unmap LBA count: 0
        Maximum unmap block descriptor count: 0
        Optimal unmap granularity: 0
        Unmap granularity alignment valid: 0
        Unmap granularity alignment: 0
        Maximum write same length: 0x0 blocks
      
      and the size of the command it can handle seems to depend on how much
      memory it can allocate at the time. This results in IO errors when
      handling large IOs. This patch just has us use the old 1024 default
      sectors for this target by adding it to the scsi blacklist. We do not
      have good contacs with this vendors, so I have not been able to try and
      fix on their side.
      
      I have posted this a long while back, but it was not merged. This
      version just fixes it up for merge/patch failures in the original
      version.
      Reported-by: default avatarAncoron Luciferis <ancoron.luciferis@googlemail.com>
      Reported-by: default avatarMichael Meyers <steltek@tcnnet.com>
      Signed-off-by: default avatarMike Christie <mchristi@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Cc: <stable@vger.kernel.org> # 4.1+
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9d8246cd
    • Jeff Layton's avatar
      locks: fix unlock when fcntl_setlk races with a close · 2e21928b
      Jeff Layton authored
      [ Upstream commit 7f3697e2 ]
      
      Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
      fires in locks_free_lock_context when the flc_posix list isn't empty.
      
      The problem turns out to be that we're basically rebuilding the
      file_lock from scratch in fcntl_setlk when we discover that the setlk
      has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
      then we may end up with fl_start and fl_end values that differ from
      when the lock was initially set, if the file position or length of the
      file has changed in the interim.
      
      Fix this by just reusing the same lock request structure, and simply
      override fl_type value with F_UNLCK as appropriate. That ensures that
      we really are unlocking the lock that was initially set.
      
      While we're there, make sure that we do pop a WARN_ON_ONCE if the
      removal ever fails. Also return -EBADF in this event, since that's
      what we would have returned if the close had happened earlier.
      
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Fixes: c293621b (stale POSIX lock handling)
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Acked-by: default avatar"J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      2e21928b
    • Emmanuel Grumbach's avatar
      iwlwifi: pcie: properly configure the debug buffer size for 8000 · d9d42b67
      Emmanuel Grumbach authored
      [ Upstream commit 62d7476d ]
      
      8000 device family has a new debug engine that needs to be
      configured differently than 7000's.
      The debug engine's DMA works in chunks of memory and the
      size of the buffer really means the start of the last
      chunk. Since one chunk is 256-byte long, we should
      configure the device to write to buffer_size - 256.
      This fixes a situation were the device would write to
      memory it is not allowed to access.
      
      CC: <stable@vger.kernel.org> [4.1+]
      Signed-off-by: default avatarEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      d9d42b67
    • Sasha Levin's avatar
      iwlwifi: update and fix 7265 series PCI IDs · 01999150
      Sasha Levin authored
      [ Upstream commit 006bda75 ]
      
      Update and fix some 7265 PCI IDs entries.
      
      CC: <stable@vger.kernel.org> [3.13+]
      Signed-off-by: default avatarOren Givon <oren.givon@intel.com>
      Signed-off-by: default avatarEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      01999150
    • David Sterba's avatar
      btrfs: handle invalid num_stripes in sys_array · 1ecdadef
      David Sterba authored
      [ Upstream commit f5cdedd7 ]
      
      We can handle the special case of num_stripes == 0 directly inside
      btrfs_read_sys_array. The BUG_ON in btrfs_chunk_item_size is there to
      catch other unhandled cases where we fail to validate external data.
      
      A crafted or corrupted image crashes at mount time:
      
      BTRFS: device fsid 9006933e-2a9a-44f0-917f-514252aeec2c devid 1 transid 7 /dev/loop0
      BTRFS info (device loop0): disk space caching is enabled
      BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!
      Kernel panic - not syncing: BUG!
      CPU: 0 PID: 313 Comm: mount Not tainted 4.2.5-00657-ge047887-dirty #25
      Stack:
       637af890 60062489 602aeb2e 604192ba
       60387961 00000011 637af8a0 6038a835
       637af9c0 6038776b 634ef32b 00000000
      Call Trace:
       [<6001c86d>] show_stack+0xfe/0x15b
       [<6038a835>] dump_stack+0x2a/0x2c
       [<6038776b>] panic+0x13e/0x2b3
       [<6020f099>] btrfs_read_sys_array+0x25d/0x2ff
       [<601cfbbe>] open_ctree+0x192d/0x27af
       [<6019c2c1>] btrfs_mount+0x8f5/0xb9a
       [<600bc9a7>] mount_fs+0x11/0xf3
       [<600d5167>] vfs_kern_mount+0x75/0x11a
       [<6019bcb0>] btrfs_mount+0x2e4/0xb9a
       [<600bc9a7>] mount_fs+0x11/0xf3
       [<600d5167>] vfs_kern_mount+0x75/0x11a
       [<600d710b>] do_mount+0xa35/0xbc9
       [<600d7557>] SyS_mount+0x95/0xc8
       [<6001e884>] handle_syscall+0x6b/0x8e
      Reported-by: default avatarJiri Slaby <jslaby@suse.com>
      Reported-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      CC: stable@vger.kernel.org	# 3.19+
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      1ecdadef
    • Grygorii Strashko's avatar
      PCI: host: Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD · 16870da0
      Grygorii Strashko authored
      [ Upstream commit 8ff0ef99 ]
      
      On -RT and if kernel is booting with "threadirqs" cmd line parameter,
      PCIe/PCI (MSI) IRQ cascade handlers (like dra7xx_pcie_msi_irq_handler())
      will be forced threaded and, as result, will generate warnings like this:
      
        WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174()
        irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts
        Backtrace:
         (warn_slowpath_common) from (warn_slowpath_fmt+0x38/0x40)
         (warn_slowpath_fmt) from (handle_irq_event_percpu+0x14c/0x174)
         (handle_irq_event_percpu) from (handle_irq_event+0x84/0xb8)
         (handle_irq_event) from (handle_simple_irq+0x90/0x118)
         (handle_simple_irq) from (generic_handle_irq+0x30/0x44)
         (generic_handle_irq) from (dra7xx_pcie_msi_irq_handler+0x7c/0x8c)
         (dra7xx_pcie_msi_irq_handler) from (irq_forced_thread_fn+0x28/0x5c)
         (irq_forced_thread_fn) from (irq_thread+0x128/0x204)
      
      This happens because all of them invoke generic_handle_irq() from the
      requested handler.  generic_handle_irq() grabs raw_locks and thus needs to
      run in raw-IRQ context.
      
      This issue was originally reproduced on TI dra7-evem, but, as was
      identified during discussion [1], other hosts can also suffer from this
      issue.  Fix all them at once by marking PCIe/PCI (MSI) IRQ cascade handlers
      IRQF_NO_THREAD explicitly.
      
      [1] http://lkml.kernel.org/r/1448027966-21610-1-git-send-email-grygorii.strashko@ti.com
      
      [bhelgaas: add stable tag, fix typos]
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: Lucas Stach <l.stach@pengutronix.de> (for imx6)
      CC: stable@vger.kernel.org
      CC: Kishon Vijay Abraham I <kishon@ti.com>
      CC: Jingoo Han <jingoohan1@gmail.com>
      CC: Kukjin Kim <kgene@kernel.org>
      CC: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      CC: Richard Zhu <Richard.Zhu@freescale.com>
      CC: Thierry Reding <thierry.reding@gmail.com>
      CC: Stephen Warren <swarren@wwwdotorg.org>
      CC: Alexandre Courbot <gnurou@gmail.com>
      CC: Simon Horman <horms@verge.net.au>
      CC: Pratyush Anand <pratyush.anand@gmail.com>
      CC: Michal Simek <michal.simek@xilinx.com>
      CC: "Sören Brinkmann" <soren.brinkmann@xilinx.com>
      CC: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      16870da0
    • Christoph Biedl's avatar
      PCI: Fix minimum allocation address overwrite · 24ab3b8e
      Christoph Biedl authored
      [ Upstream commit 3460baa6 ]
      
      Commit 36e097a8 ("PCI: Split out bridge window override of minimum
      allocation address") claimed to do no functional changes but unfortunately
      did: The "min" variable is altered.  At least the AVM A1 PCMCIA adapter was
      no longer detected, breaking ISDN operation.
      
      Use a local copy of "min" to restore the previous behaviour.
      
      [bhelgaas: avoid gcc "?:" extension for portability and readability]
      Fixes: 36e097a8 ("PCI: Split out bridge window override of minimum allocation address")
      Signed-off-by: default avatarChristoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org      # v3.14+
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      24ab3b8e
  2. 01 Feb, 2016 27 commits