1. 13 Oct, 2015 40 commits
    • Jeffery Miller's avatar
      Add radeon suspend/resume quirk for HP Compaq dc5750. · ed2a4a92
      Jeffery Miller authored
      commit 09bfda10 upstream.
      
      With the radeon driver loaded the HP Compaq dc5750
      Small Form Factor machine fails to resume from suspend.
      Adding a quirk similar to other devices avoids
      the problem and the system resumes properly.
      Signed-off-by: default avatarJeffery Miller <jmiller@neverware.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ed2a4a92
    • Chris Wilson's avatar
      drm/i915: Always mark the object as dirty when used by the GPU · 843ab6d0
      Chris Wilson authored
      commit 51bc1404 upstream.
      
      There have been many hard to track down bugs whereby userspace forgot to
      flag a write buffer and then cause graphics corruption or a hung GPU
      when that buffer was later purged under memory pressure (as the buffer
      appeared clean, its pages would have been evicted rather than preserved
      and any changes more recent than in the backing storage would be lost).
      In retrospect this is a rare optimisation against memory pressure,
      already the slow path. If we always mark the buffer as dirty when
      accessed by the GPU, anything not used can still be evicted cheaply
      (ideal behaviour for mark-and-sweep eviction) but we do not run the risk
      of corruption. For correct read serialisation, userspace still has to
      notify when the GPU writes to an object. However, there are certain
      situations under which userspace may wish to tell white lies to the
      kernel...
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Kristian Høgsberg <krh@bitplanet.net>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: "Goel, Akash" <akash.goel@intel.co>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      843ab6d0
    • Tan, Jui Nee's avatar
      spi: spi-pxa2xx: Check status register to determine if SSSR_TINT is disabled · 42e96ab1
      Tan, Jui Nee authored
      commit 02bc933e upstream.
      
      On Intel Baytrail, there is case when interrupt handler get called, no SPI
      message is captured. The RX FIFO is indeed empty when RX timeout pending
      interrupt (SSSR_TINT) happens.
      
      Use the BIOS version where both HSUART and SPI are on the same IRQ. Both
      drivers are using IRQF_SHARED when calling the request_irq function. When
      running two separate and independent SPI and HSUART application that
      generate data traffic on both components, user will see messages like
      below on the console:
      
        pxa2xx-spi pxa2xx-spi.0: bad message state in interrupt handler
      
      This commit will fix this by first checking Receiver Time-out Interrupt,
      if it is disabled, ignore the request and return without servicing.
      Signed-off-by: default avatarTan, Jui Nee <jui.nee.tan@intel.com>
      Acked-by: default avatarJarkko Nikula <jarkko.nikula@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      42e96ab1
    • Yishai Hadas's avatar
      IB/uverbs: Fix race between ib_uverbs_open and remove_one · cf50958a
      Yishai Hadas authored
      commit 35d4a0b6 upstream.
      
      Fixes: 2a72f212 ("IB/uverbs: Remove dev_table")
      
      Before this commit there was a device look-up table that was protected
      by a spin_lock used by ib_uverbs_open and by ib_uverbs_remove_one. When
      it was dropped and container_of was used instead, it enabled the race
      with remove_one as dev might be freed just after:
      dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev) but
      before the kref_get.
      
      In addition, this buggy patch added some dead code as
      container_of(x,y,z) can never be NULL and so dev can never be NULL.
      As a result the comment above ib_uverbs_open saying "the open method
      will either immediately run -ENXIO" is wrong as it can never happen.
      
      The solution follows Jason Gunthorpe suggestion from below URL:
      https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html
      
      cdev will hold a kref on the parent (the containing structure,
      ib_uverbs_device) and only when that kref is released it is
      guaranteed that open will never be called again.
      
      In addition, fixes the active count scheme to use an atomic
      not a kref to prevent WARN_ON as pointed by above comment
      from Jason.
      Signed-off-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: default avatarShachar Raindel <raindel@mellanox.com>
      Reviewed-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      cf50958a
    • Noa Osherovich's avatar
      IB/mlx4: Use correct SL on AH query under RoCE · 9421b777
      Noa Osherovich authored
      commit 5e99b139 upstream.
      
      The mlx4 IB driver implementation for ib_query_ah used a wrong offset
      (28 instead of 29) when link type is Ethernet. Fixed to use the correct one.
      
      Fixes: fa417f7b ('IB/mlx4: Add support for IBoE')
      Signed-off-by: default avatarShani Michaeli <shanim@mellanox.com>
      Signed-off-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9421b777
    • Trond Myklebust's avatar
      SUNRPC: xs_reset_transport must mark the connection as disconnected · 9434e485
      Trond Myklebust authored
      commit 0c78789e upstream.
      
      In case the reconnection attempt fails.
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      [bwh: Backported to 3.2: add local variable xprt]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9434e485
    • Mike Marciniszyn's avatar
      IB/qib: Change lkey table allocation to support more MRs · cb463364
      Mike Marciniszyn authored
      commit d6f1c17e upstream.
      
      The lkey table is allocated with with a get_user_pages() with an
      order based on a number of index bits from a module parameter.
      
      The underlying kernel code cannot allocate that many contiguous pages.
      
      There is no reason the underlying memory needs to be physically
      contiguous.
      
      This patch:
      - switches the allocation/deallocation to vmalloc/vfree
      - caps the number of bits to 23 to insure at least 1 generation bit
        o this matches the module parameter description
      Reviewed-by: default avatarVinit Agnihotri <vinit.abhay.agnihotri@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - Add definition of qib_dev_warn(), added upstream by commit ddb88765
         ("IB/qib: Convert opcode counters to per-context")]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      cb463364
    • David Jeffery's avatar
      xfs: return errors from partial I/O failures to files · ff8c37e6
      David Jeffery authored
      commit c9eb256e upstream.
      
      There is an issue with xfs's error reporting in some cases of I/O partially
      failing and partially succeeding. Calls like fsync() can report success even
      though not all I/O was successful in partial-failure cases such as one disk of
      a RAID0 array being offline.
      
      The issue can occur when there are more than one bio per xfs_ioend struct.
      Each call to xfs_end_bio() for a bio completing will write a value to
      ioend->io_error.  If a successful bio completes after any failed bio, no
      error is reported do to it writing 0 over the error code set by any failed bio.
      The I/O error information is now lost and when the ioend is completed
      only success is reported back up the filesystem stack.
      
      xfs_end_bio() should only set ioend->io_error in the case of BIO_UPTODATE
      being clear.  ioend->io_error is initialized to 0 at allocation so only needs
      to be updated by a failed bio. Also check that ioend->io_error is 0 so that
      the first error reported will be the error code returned.
      Signed-off-by: default avatarDavid Jeffery <djeffery@redhat.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ff8c37e6
    • Grant Likely's avatar
      drivercore: Fix unregistration path of platform devices · b0e0b3d0
      Grant Likely authored
      commit 7f5dcaf1 upstream.
      
      The unregister path of platform_device is broken. On registration, it
      will register all resources with either a parent already set, or
      type==IORESOURCE_{IO,MEM}. However, on unregister it will release
      everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There
      are also cases where resources don't get registered in the first place,
      like with devices created by of_platform_populate()*.
      
      Fix the unregister path to be symmetrical with the register path by
      checking the parent pointer instead of the type field to decide which
      resources to unregister. This is safe because the upshot of the
      registration path algorithm is that registered resources have a parent
      pointer, and non-registered resources do not.
      
      * It can be argued that of_platform_populate() should be registering
        it's resources, and they argument has some merit. However, there are
        quite a few platforms that end up broken if we try to do that due to
        overlapping resources in the device tree. Until that is fixed, we need
        to solve the immediate problem.
      
      Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
      Cc: Wolfram Sang <wsa@the-dreams.de>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Signed-off-by: default avatarGrant Likely <grant.likely@linaro.org>
      Tested-by: default avatarRicardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Tested-by: default avatarWolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b0e0b3d0
    • David Daney's avatar
      of/address: Don't loop forever in of_find_matching_node_by_address(). · 3ccc6060
      David Daney authored
      commit 3a496b00 upstream.
      
      If the internal call to of_address_to_resource() fails, we end up
      looping forever in of_find_matching_node_by_address().  This can be
      caused by a defective device tree, or calling with an incorrect
      matches argument.
      
      Fix by calling of_find_matching_node() unconditionally at the end of
      the loop.
      Signed-off-by: default avatarDavid Daney <david.daney@cavium.com>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3ccc6060
    • Adrien Schildknecht's avatar
      rtlwifi: rtl8192cu: Add new device ID · e2aebb82
      Adrien Schildknecht authored
      commit 1642d09f upstream.
      
      The v2 of NetGear WNA1000M uses a different idProduct: USB ID 0846:9043
      Signed-off-by: default avatarAdrien Schildknecht <adrien+dev@schischi.me>
      Acked-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e2aebb82
    • Marek Vasut's avatar
      rtlwifi: rtl8192cu: Add new device ID · 2da6a629
      Marek Vasut authored
      commit 9374e7d2 upstream.
      
      Add new ID for ASUS N10 WiFi dongle.
      Signed-off-by: default avatarMarek Vasut <marex@denx.de>
      Tested-by: default avatarMarek Vasut <marex@denx.de>
      Cc: Larry Finger <Larry.Finger@lwfinger.net>
      Cc: John W. Linville <linville@tuxdriver.com>
      Acked-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2da6a629
    • Stephen Chandler Paul's avatar
      DRM - radeon: Don't link train DisplayPort on HPD until we get the dpcd · b012b39a
      Stephen Chandler Paul authored
      commit 924f92bf upstream.
      
      Most of the time this isn't an issue since hotplugging an adaptor will
      trigger a crtc mode change which in turn, causes the driver to probe
      every DisplayPort for a dpcd. However, in cases where hotplugging
      doesn't cause a mode change (specifically when one unplugs a monitor
      from a DisplayPort connector, then plugs that same monitor back in
      seconds later on the same port without any other monitors connected), we
      never probe for the dpcd before starting the initial link training. What
      happens from there looks like this:
      
      	- GPU has only one monitor connected. It's connected via
      	  DisplayPort, and does not go through an adaptor of any sort.
      
      	- User unplugs DisplayPort connector from GPU.
      
      	- Change in HPD is detected by the driver, we probe every
      	  DisplayPort for a possible connection.
      
      	- Probe the port the user originally had the monitor connected
      	  on for it's dpcd. This fails, and we clear the first (and only
      	  the first) byte of the dpcd to indicate we no longer have a
      	  dpcd for this port.
      
      	- User plugs the previously disconnected monitor back into the
      	  same DisplayPort.
      
      	- radeon_connector_hotplug() is called before everyone else,
      	  and tries to handle the link training. Since only the first
      	  byte of the dpcd is zeroed, the driver is able to complete
      	  link training but does so against the wrong dpcd, causing it
      	  to initialize the link with the wrong settings.
      
      	- Display stays blank (usually), dpcd is probed after the
      	  initial link training, and the driver prints no obvious
      	  messages to the log.
      
      In theory, since only one byte of the dpcd is chopped off (specifically,
      the byte that contains the revision information for DisplayPort), it's
      not entirely impossible that this bug may not show on certain monitors.
      For instance, the only reason this bug was visible on my ASUS PB238
      monitor was due to the fact that this monitor using the enhanced framing
      symbol sequence, the flag for which is ignored if the radeon driver
      thinks that the DisplayPort version is below 1.1.
      Signed-off-by: default avatarStephen Chandler Paul <cpaul@redhat.com>
      Reviewed-by: default avatarJerome Glisse <jglisse@redhat.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b012b39a
    • Jan Kara's avatar
      xfs: Fix xfs_attr_leafblock definition · 86cbc007
      Jan Kara authored
      commit ffeecc52 upstream.
      
      struct xfs_attr_leafblock contains 'entries' array which is declared
      with size 1 altough it can in fact contain much more entries. Since this
      array is followed by further struct members, gcc (at least in version
      4.8.3) thinks that the array has the fixed size of 1 element and thus
      may optimize away all accesses beyond the end of array resulting in
      non-working code. This problem was only observed with userspace code in
      xfsprogs, however it's better to be safe in kernel as well and have
      matching kernel and xfsprogs definitions.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      86cbc007
    • Tyler Hicks's avatar
      eCryptfs: Invalidate dcache entries when lower i_nlink is zero · 209a7a67
      Tyler Hicks authored
      commit 5556e7e6 upstream.
      
      Consider eCryptfs dcache entries to be stale when the corresponding
      lower inode's i_nlink count is zero. This solves a problem caused by the
      lower inode being directly modified, without going through the eCryptfs
      mount, leaving stale eCryptfs dentries cached and the eCryptfs inode's
      i_nlink count not being cleared.
      Signed-off-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Reported-by: default avatarRichard Weinberger <richard@nod.at>
      [bwh: Backported to 3.2:
       - Test d_revalidate pointer directly rather than a DCACHE_OP flag
       - Open-code d_inode()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      209a7a67
    • Matthijs Kooijman's avatar
      USB: ftdi_sio: Added custom PID for CustomWare products · 392a99c8
      Matthijs Kooijman authored
      commit 1fb8dc36 upstream.
      
      CustomWare uses the FTDI VID with custom PIDs for their ShipModul MiniPlex
      products.
      Signed-off-by: default avatarMatthijs Kooijman <matthijs@stdin.nl>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      392a99c8
    • Peter Chen's avatar
      usb: host: ehci-sys: delete useless bus_to_hcd conversion · 586491f5
      Peter Chen authored
      commit 0521cfd0 upstream.
      
      The ehci platform device's drvdata is the pointer of struct usb_hcd
      already, so we doesn't need to call bus_to_hcd conversion again.
      Signed-off-by: default avatarPeter Chen <peter.chen@freescale.com>
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: Unfortunately some EHCI platform sub-drivers 
       point drvdata to a private structure, so only create and remove the
       attributes if drvdata has been set as expected.]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      586491f5
    • Maciej S. Szmigiero's avatar
      serial: 8250: bind to ALi Fast Infrared Controller (ALI5123) · 8ba4fa58
      Maciej S. Szmigiero authored
      commit 1d700277 upstream.
      
      This way this device can be used with irtty-sir -
      at least on Toshiba Satellite A20-S103 it is not configured by default
      and needs PNP activation before it starts to respond on I/O ports.
      
      This device has actually its own driver (ali-ircc),
      but this driver seems to be non-functional for a very long time
      (see http://permalink.gmane.org/gmane.linux.irda.general/484
      http://permalink.gmane.org/gmane.network.protocols.obex.openobex.user/943
      https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=535070 ).
      Signed-off-by: default avatarMaciej Szmigiero <mail@maciej.szmigiero.name>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2:
       - Drop change to acpi_pnp.c, as there's no need to whitelist ACPI devices
         for the PNP bus
       - Adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      8ba4fa58
    • Nikhil Badola's avatar
      drivers: usb: fsl: Workaround for USB erratum-A005275 · 79fcb25f
      Nikhil Badola authored
      commit f8786a91 upstream.
      
      Incoming packets in high speed are randomly corrupted by h/w
      resulting in multiple errors. This workaround makes FS as
      default mode in all affected socs by disabling HS chirp
      signalling.This errata does not affect FS and LS mode.
      
      Forces all HS devices to connect in FS mode for all socs
      affected by this erratum:
      P3041 and P2041 rev 1.0 and 1.1
      P5020 and P5010 rev 1.0 and 2.0
      P5040, P1010 and T4240 rev 1.0
      Signed-off-by: default avatarRamneek Mehresh <ramneek.mehresh@freescale.com>
      Signed-off-by: default avatarNikhil Badola <nikhil.badola@freescale.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      79fcb25f
    • NeilBrown's avatar
      NFSv4: don't set SETATTR for O_RDONLY|O_EXCL · 090e974e
      NeilBrown authored
      commit efcbc04e upstream.
      
      It is unusual to combine the open flags O_RDONLY and O_EXCL, but
      it appears that libre-office does just that.
      
      [pid  3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0
      [pid  3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...>
      
      NFSv4 takes O_EXCL as a sign that a setattr command should be sent,
      probably to reset the timestamps.
      
      When it was an O_RDONLY open, the SETATTR command does not
      identify any actual attributes to change.
      If no delegation was provided to the open, the SETATTR uses the
      all-zeros stateid and the request is accepted (at least by the
      Linux NFS server - no harm, no foul).
      
      If a read-delegation was provided, this is used in the SETATTR
      request, and a Netapp filer will justifiably claim
      NFS4ERR_BAD_STATEID, which the Linux client takes as a sign
      to retry - indefinitely.
      
      So only treat O_EXCL specially if O_CREAT was also given.
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      [bwh: Backported to 3.2: we only check open_flags, not createmode as well]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      090e974e
    • Paul Bolle's avatar
      windfarm: decrement client count when unregistering · 48c46d4a
      Paul Bolle authored
      commit fe2b5921 upstream.
      
      wf_unregister_client() increments the client count when a client
      unregisters. That is obviously incorrect. Decrement that client count
      instead.
      
      Fixes: 75722d39 ("[PATCH] ppc64: Thermal control for SMU based machines")
      Signed-off-by: default avatarPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      48c46d4a
    • Masahiro Yamada's avatar
      devres: fix devres_get() · ebc0ae5a
      Masahiro Yamada authored
      commit 64526370 upstream.
      
      Currently, devres_get() passes devres_free() the pointer to devres,
      but devres_free() should be given with the pointer to resource data.
      
      Fixes: 9ac7849e ("devres: device resource management")
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ebc0ae5a
    • Sudip Mukherjee's avatar
      auxdisplay: ks0108: fix refcount · b3170aab
      Sudip Mukherjee authored
      commit bab383de upstream.
      
      parport_find_base() will implicitly do parport_get_port() which
      increases the refcount. Then parport_register_device() will again
      increment the refcount. But while unloading the module we are only
      doing parport_unregister_device() decrementing the refcount only once.
      We add an parport_put_port() to neutralize the effect of
      parport_get_port().
      Signed-off-by: default avatarSudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b3170aab
    • Xiao Guangrong's avatar
      KVM: MMU: fix validation of mmio page fault · 41e3025e
      Xiao Guangrong authored
      commit 6f691251 upstream.
      
      We got the bug that qemu complained with "KVM: unknown exit, hardware
      reason 31" and KVM shown these info:
      [84245.284948] EPT: Misconfiguration.
      [84245.285056] EPT: GPA: 0xfeda848
      [84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4
      [84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3
      [84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2
      [84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1
      
      This is because we got a mmio #PF and the handler see the mmio spte becomes
      normal (points to the ram page)
      
      However, this is valid after introducing fast mmio spte invalidation which
      increases the generation-number instead of zapping mmio sptes, a example
      is as follows:
      1. QEMU drops mmio region by adding a new memslot
      2. invalidate all mmio sptes
      3.
      
              VCPU 0                        VCPU 1
          access the invalid mmio spte
                                  access the region originally was MMIO before
                                  set the spte to the normal ram map
      
          mmio #PF
          check the spte and see it becomes normal ram mapping !!!
      
      This patch fixes the bug just by dropping the check in mmio handler, it's
      good for backport. Full check will be introduced in later patches
      Reported-by: default avatarPavel Shirshov <ru.pchel@gmail.com>
      Tested-by: default avatarPavel Shirshov <ru.pchel@gmail.com>
      Signed-off-by: default avatarXiao Guangrong <guangrong.xiao@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2: error code from handle_mmio_page_fault_common()
       was not named]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      41e3025e
    • Dan Carpenter's avatar
      usb: gadget: m66592-udc: forever loop in set_feature() · ef1ce106
      Dan Carpenter authored
      commit 5feb5d20 upstream.
      
      There is an "&&" vs "||" typo here so this loops 3000 times or if we get
      unlucky it could loop forever.
      
      Fixes: ceaa0a6e ('usb: gadget: m66592-udc: add support for TEST_MODE')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarFelipe Balbi <balbi@ti.com>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ef1ce106
    • Mark Rustad's avatar
      PCI: Add VPD function 0 quirk for Intel Ethernet devices · fb0a9682
      Mark Rustad authored
      commit 7aa6ca4d upstream.
      
      Set the PCI_DEV_FLAGS_VPD_REF_F0 flag on all Intel Ethernet device
      functions other than function 0, so that on multi-function devices, we will
      always read VPD from function 0 instead of from the other functions.
      
      [bhelgaas: changelog]
      Signed-off-by: default avatarMark Rustad <mark.d.rustad@intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarAlexander Duyck <alexander.h.duyck@redhat.com>
      [bwh: Backported to 3.2:
       - Put the class check in the new function as there is no
         DECLARE_PCI_FIXUP_CLASS_EARLY(
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      fb0a9682
    • Mark Rustad's avatar
      PCI: Add dev_flags bit to access VPD through function 0 · 6a3e5597
      Mark Rustad authored
      commit 932c435c upstream.
      
      Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
      function 0 to provide VPD access on other functions.  This is for hardware
      devices that provide copies of the same VPD capability registers in
      multiple functions.  Because the kernel expects that each function has its
      own registers, both the locking and the state tracking are affected by VPD
      accesses to different functions.
      
      On such devices for example, if a VPD write is performed on function 0,
      *any* later attempt to read VPD from any other function of that device will
      hang.  This has to do with how the kernel tracks the expected value of the
      F bit per function.
      
      Concurrent accesses to different functions of the same device can not only
      hang but also corrupt both read and write VPD data.
      
      When hangs occur, typically the error message:
      
        vpd r/w failed.  This is likely a firmware bug on this device.
      
      will be seen.
      
      Never set this bit on function 0 or there will be an infinite recursion.
      Signed-off-by: default avatarMark Rustad <mark.d.rustad@intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6a3e5597
    • Bob Copeland's avatar
      mac80211: enable assoc check for mesh interfaces · a661308e
      Bob Copeland authored
      commit 3633ebeb upstream.
      
      We already set a station to be associated when peering completes, both
      in user space and in the kernel.  Thus we should always have an
      associated sta before sending data frames to that station.
      
      Failure to check assoc state can cause crashes in the lower-level driver
      due to transmitting unicast data frames before driver sta structures
      (e.g. ampdu state in ath9k) are initialized.  This occurred when
      forwarding in the presence of fixed mesh paths: frames were transmitted
      to stations with whom we hadn't yet completed peering.
      Reported-by: default avatarAlexis Green <agreen@cococorp.com>
      Tested-by: default avatarJesse Jones <jjones@cococorp.com>
      Signed-off-by: default avatarBob Copeland <me@bobcopeland.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a661308e
    • Bjorn Helgaas's avatar
      PCI: Fix TI816X class code quirk · 94bd7767
      Bjorn Helgaas authored
      commit d1541dc9 upstream.
      
      In fixup_ti816x_class(), we assigned "class = PCI_CLASS_MULTIMEDIA_VIDEO".
      But PCI_CLASS_MULTIMEDIA_VIDEO is only the two-byte base class/sub-class
      and needs to be shifted to make space for the low-order interface byte.
      
      Shift PCI_CLASS_MULTIMEDIA_VIDEO to set the correct class code.
      
      Fixes: 63c44080 ("PCI: Add quirk for setting valid class for TI816X Endpoint")
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      CC: Hemant Pedanekar <hemantp@ti.com>
      [bwh: Backported to 3.2: the class check is done in this function as there
       is no DECLARE_PCI_FIXUP_CLASS_EARLY()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      94bd7767
    • David Härdeman's avatar
      rc-core: fix remove uevent generation · 38906d3d
      David Härdeman authored
      commit a66b0c41 upstream.
      
      The input_dev is already gone when the rc device is being unregistered
      so checking for its presence only means that no remove uevent will be
      generated.
      Signed-off-by: default avatarDavid Härdeman <david@hardeman.nu>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      38906d3d
    • Eric W. Biederman's avatar
      vfs: Test for and handle paths that are unreachable from their mnt_root · 6d84ade2
      Eric W. Biederman authored
      commit 397d425d upstream.
      
      In rare cases a directory can be renamed out from under a bind mount.
      In those cases without special handling it becomes possible to walk up
      the directory tree to the root dentry of the filesystem and down
      from the root dentry to every other file or directory on the filesystem.
      
      Like division by zero .. from an unconnected path can not be given
      a useful semantic as there is no predicting at which path component
      the code will realize it is unconnected.  We certainly can not match
      the current behavior as the current behavior is a security hole.
      
      Therefore when encounting .. when following an unconnected path
      return -ENOENT.
      
      - Add a function path_connected to verify path->dentry is reachable
        from path->mnt.mnt_root.  AKA to validate that rename did not do
        something nasty to the bind mount.
      
        To avoid races path_connected must be called after following a path
        component to it's next path component.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6d84ade2
    • Eric W. Biederman's avatar
      dcache: Handle escaped paths in prepend_path · 722632af
      Eric W. Biederman authored
      commit cde93be4 upstream.
      
      A rename can result in a dentry that by walking up d_parent
      will never reach it's mnt_root.  For lack of a better term
      I call this an escaped path.
      
      prepend_path is called by four different functions __d_path,
      d_absolute_path, d_path, and getcwd.
      
      __d_path only wants to see paths are connected to the root it passes
      in.  So __d_path needs prepend_path to return an error.
      
      d_absolute_path similarly wants to see paths that are connected to
      some root.  Escaped paths are not connected to any mnt_root so
      d_absolute_path needs prepend_path to return an error greater
      than 1.  So escaped paths will be treated like paths on lazily
      unmounted mounts.
      
      getcwd needs to prepend "(unreachable)" so getcwd also needs
      prepend_path to return an error.
      
      d_path is the interesting hold out.  d_path just wants to print
      something, and does not care about the weird cases.  Which raises
      the question what should be printed?
      
      Given that <escaped_path>/<anything> should result in -ENOENT I
      believe it is desirable for escaped paths to be printed as empty
      paths.  As there are not really any meaninful path components when
      considered from the perspective of a mount tree.
      
      So tweak prepend_path to return an empty path with an new error
      code of 3 when it encounters an escaped path.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      722632af
    • David S. Miller's avatar
      sparc64: Fix userspace FPU register corruptions. · 31cbb1f4
      David S. Miller authored
      commit 44922150 upstream.
      
      If we have a series of events from userpsace, with %fprs=FPRS_FEF,
      like follows:
      
      ETRAP
      	ETRAP
      		VIS_ENTRY(fprs=0x4)
      		VIS_EXIT
      		RTRAP (kernel FPU restore with fpu_saved=0x4)
      	RTRAP
      
      We will not restore the user registers that were clobbered by the FPU
      using kernel code in the inner-most trap.
      
      Traps allocate FPU save slots in the thread struct, and FPU using
      sequences save the "dirty" FPU registers only.
      
      This works at the initial trap level because all of the registers
      get recorded into the top-level FPU save area, and we'll return
      to userspace with the FPU disabled so that any FPU use by the user
      will take an FPU disabled trap wherein we'll load the registers
      back up properly.
      
      But this is not how trap returns from kernel to kernel operate.
      
      The simplest fix for this bug is to always save all FPU register state
      for anything other than the top-most FPU save area.
      
      Getting rid of the optimized inner-slot FPU saving code ends up
      making VISEntryHalf degenerate into plain VISEntry.
      
      Longer term we need to do something smarter to reinstate the partial
      save optimizations.  Perhaps the fundament error is having trap entry
      and exit allocate FPU save slots and restore register state.  Instead,
      the VISEntry et al. calls should be doing that work.
      
      This bug is about two decades old.
      Reported-by: default avatarJames Y Knight <jyknight@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2:
       - Adjust context
       - Drop changes to NG4memcpy.S and ksyms.c]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      31cbb1f4
    • lucien's avatar
      sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state · 9f9ccecb
      lucien authored
      commit f648f807 upstream.
      
      Commit f8d96052 ("sctp: Enforce retransmission limit during shutdown")
      fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING by not
      resetting the association overall_error_count.  This allowed the association
      to better enforce assoc.max_retrans limit.
      
      However, the same issue still exists when the association is in SHUTDOWN_RECEIVED
      state.  In this state, HB-ACKs will continue to reset the overall_error_count
      for the association would extend the lifetime of association unnecessarily.
      
      This patch solves this by resetting the overall_error_count whenever the current
      state is small then SCTP_STATE_SHUTDOWN_PENDING.  As a small side-effect, we
      end up also handling SCTP_STATE_SHUTDOWN_ACK_SENT and SCTP_STATE_SHUTDOWN_SENT
      states, but they are not really impacted because we disable Heartbeats in those
      states.
      
      Fixes: Commit f8d96052 ("sctp: Enforce retransmission limit during shutdown")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9f9ccecb
    • David Ahern's avatar
      net: Fix RCU splat in af_key · 3e3b6dd7
      David Ahern authored
      commit ba51b6be upstream.
      
      Hit the following splat testing VRF change for ipsec:
      
      [  113.475692] ===============================
      [  113.476194] [ INFO: suspicious RCU usage. ]
      [  113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
      [  113.477545] -------------------------------
      [  113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
      [  113.479288]
      [  113.479288] other info that might help us debug this:
      [  113.479288]
      [  113.480207]
      [  113.480207] rcu_scheduler_active = 1, debug_locks = 1
      [  113.480931] 2 locks held by setkey/6829:
      [  113.481371]  #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
      [  113.482509]  #1:  (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
      [  113.483509]
      [  113.483509] stack backtrace:
      [  113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
      [  113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [  113.486845]  0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
      [  113.487732]  ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
      [  113.488628]  0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
      [  113.489525] Call Trace:
      [  113.489813]  [<ffffffff81518af2>] dump_stack+0x4c/0x65
      [  113.490389]  [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
      [  113.491039]  [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
      [  113.491735]  [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
      [  113.492442]  [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
      [  113.493077]  [<ffffffff81064268>] __might_sleep+0x6c/0x82
      [  113.493681]  [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
      [  113.494508]  [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
      [  113.495149]  [<ffffffff814012b5>] skb_clone+0x64/0x80
      [  113.495712]  [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
      [  113.496380]  [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
      [  113.497024]  [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
      [  113.497653]  [<ffffffff814e9770>] pfkey_process+0x162/0x17e
      [  113.498274]  [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213
      
      In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
      the RCU lock.
      
      Since pfkey_broadcast takes the RCU lock the allocation argument is
      pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
      The one call outside of rcu can be done with GFP_KERNEL.
      
      Fixes: 7f6b9dbd ("af_key: locking change")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3e3b6dd7
    • Andy Lutomirski's avatar
      x86/ldt: Further fix FPU emulation · 159e99c1
      Andy Lutomirski authored
      commit 12e244f4 upstream.
      
      The previous fix confused a selector with a segment prefix.  Fix it.
      
      Compile-tested only.
      
      Cc: Juergen Gross <jgross@suse.com>
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Fixes: 4809146b ("x86/ldt: Correct FPU emulation access to LDT")
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      159e99c1
    • Herton R. Krzesinski's avatar
      ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits · a1c4fb80
      Herton R. Krzesinski authored
      commit 602b8593 upstream.
      
      The current semaphore code allows a potential use after free: in
      exit_sem we may free the task's sem_undo_list while there is still
      another task looping through the same semaphore set and cleaning the
      sem_undo list at freeary function (the task called IPC_RMID for the same
      semaphore set).
      
      For example, with a test program [1] running which keeps forking a lot
      of processes (which then do a semop call with SEM_UNDO flag), and with
      the parent right after removing the semaphore set with IPC_RMID, and a
      kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
      CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
      in the kernel log:
      
         Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
         000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
         010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
         Prev obj: start=ffff88003b45c180, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
         Next obj: start=ffff88003b45c200, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
         BUG: spinlock wrong CPU on CPU#2, test/18028
         general protection fault: 0000 [#1] SMP
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: spin_dump+0x53/0xc0
         Call Trace:
           spin_bug+0x30/0x40
           do_raw_spin_unlock+0x71/0xa0
           _raw_spin_unlock+0xe/0x10
           freeary+0x82/0x2a0
           ? _raw_spin_lock+0xe/0x10
           semctl_down.clone.0+0xce/0x160
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           SyS_semctl+0x236/0x2c0
           ? syscall_trace_leave+0xde/0x130
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
         RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
          RSP <ffff88003750fd68>
         ---[ end trace 783ebb76612867a0 ]---
         NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: native_read_tsc+0x0/0x20
         Call Trace:
           ? delay_tsc+0x40/0x70
           __delay+0xf/0x20
           do_raw_spin_lock+0x96/0x140
           _raw_spin_lock+0xe/0x10
           sem_lock_and_putref+0x11/0x70
           SYSC_semtimedop+0x7bf/0x960
           ? handle_mm_fault+0xbf6/0x1880
           ? dequeue_task_fair+0x79/0x4a0
           ? __do_page_fault+0x19a/0x430
           ? kfree_debugcheck+0x16/0x40
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           ? do_audit_syscall_entry+0x66/0x70
           ? syscall_trace_enter_phase1+0x139/0x160
           SyS_semtimedop+0xe/0x10
           SyS_semop+0x10/0x20
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
         Kernel panic - not syncing: softlockup: hung tasks
      
      I wasn't able to trigger any badness on a recent kernel without the
      proper config debugs enabled, however I have softlockup reports on some
      kernel versions, in the semaphore code, which are similar as above (the
      scenario is seen on some servers running IBM DB2 which uses semaphore
      syscalls).
      
      The patch here fixes the race against freeary, by acquiring or waiting
      on the sem_undo_list lock as necessary (exit_sem can race with freeary,
      while freeary sets un->semid to -1 and removes the same sem_undo from
      list_proc or when it removes the last sem_undo).
      
      After the patch I'm unable to reproduce the problem using the test case
      [1].
      
      [1] Test case used below:
      
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/ipc.h>
          #include <sys/sem.h>
          #include <sys/wait.h>
          #include <stdlib.h>
          #include <time.h>
          #include <unistd.h>
          #include <errno.h>
      
          #define NSEM 1
          #define NSET 5
      
          int sid[NSET];
      
          void thread()
          {
                  struct sembuf op;
                  int s;
                  uid_t pid = getuid();
      
                  s = rand() % NSET;
                  op.sem_num = pid % NSEM;
                  op.sem_op = 1;
                  op.sem_flg = SEM_UNDO;
      
                  semop(sid[s], &op, 1);
                  exit(EXIT_SUCCESS);
          }
      
          void create_set()
          {
                  int i, j;
                  pid_t p;
                  union {
                          int val;
                          struct semid_ds *buf;
                          unsigned short int *array;
                          struct seminfo *__buf;
                  } un;
      
                  /* Create and initialize semaphore set */
                  for (i = 0; i < NSET; i++) {
                          sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                          if (sid[i] < 0) {
                                  perror("semget");
                                  exit(EXIT_FAILURE);
                          }
                  }
                  un.val = 0;
                  for (i = 0; i < NSET; i++) {
                          for (j = 0; j < NSEM; j++) {
                                  if (semctl(sid[i], j, SETVAL, un) < 0)
                                          perror("semctl");
                          }
                  }
      
                  /* Launch threads that operate on semaphore set */
                  for (i = 0; i < NSEM * NSET * NSET; i++) {
                          p = fork();
                          if (p < 0)
                                  perror("fork");
                          if (p == 0)
                                  thread();
                  }
      
                  /* Free semaphore set */
                  for (i = 0; i < NSET; i++) {
                          if (semctl(sid[i], NSEM, IPC_RMID))
                                  perror("IPC_RMID");
                  }
      
                  /* Wait for forked processes to exit */
                  while (wait(NULL)) {
                          if (errno == ECHILD)
                                  break;
                  };
          }
      
          int main(int argc, char **argv)
          {
                  pid_t p;
      
                  srand(time(NULL));
      
                  while (1) {
                          p = fork();
                          if (p < 0) {
                                  perror("fork");
                                  exit(EXIT_FAILURE);
                          }
                          if (p == 0) {
                                  create_set();
                                  goto end;
                          }
      
                          /* Wait for forked processes to exit */
                          while (wait(NULL)) {
                                  if (errno == ECHILD)
                                          break;
                          };
                  }
          end:
                  return 0;
          }
      
      [akpm@linux-foundation.org: use normal comment layout]
      Signed-off-by: default avatarHerton R. Krzesinski <herton@redhat.com>
      Acked-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      CC: Aristeu Rozanski <aris@redhat.com>
      Cc: David Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a1c4fb80
    • Bart Van Assche's avatar
      libfc: Fix fc_fcp_cleanup_each_cmd() · 5813f566
      Bart Van Assche authored
      commit 8f2777f5 upstream.
      
      Since fc_fcp_cleanup_cmd() can sleep this function must not
      be called while holding a spinlock. This patch avoids that
      fc_fcp_cleanup_each_cmd() triggers the following bug:
      
      BUG: scheduling while atomic: sg_reset/1512/0x00000202
      1 lock held by sg_reset/1512:
       #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Call Trace:
       [<ffffffff816c612c>] dump_stack+0x4f/0x7b
       [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
       [<ffffffff816c87aa>] __schedule+0x71a/0xa10
       [<ffffffff816c8ad2>] schedule+0x32/0x80
       [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
       [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
       [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
       [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
       [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
       [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
       [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
       [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
       [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
       [<ffffffff811da608>] block_ioctl+0x38/0x40
       [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
       [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
       [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: default avatarVasu Dev <vasu.dev@intel.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5813f566
    • John Soni Jose's avatar
      libiscsi: Fix host busy blocking during connection teardown · 861a9633
      John Soni Jose authored
      commit 660d0831 upstream.
      
      In case of hw iscsi offload, an host can have N-number of active
      connections. There can be IO's running on some connections which
      make host->host_busy always TRUE. Now if logout from a connection
      is tried then the code gets into an infinite loop as host->host_busy
      is always TRUE.
      
       iscsi_conn_teardown(....)
       {
         .........
          /*
           * Block until all in-progress commands for this connection
           * time out or fail.
           */
           for (;;) {
            spin_lock_irqsave(session->host->host_lock, flags);
            if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
      	      spin_unlock_irqrestore(session->host->host_lock, flags);
                    break;
            }
           spin_unlock_irqrestore(session->host->host_lock, flags);
           msleep_interruptible(500);
           iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
                       "host_busy %d host_failed %d\n",
      	          atomic_read(&session->host->host_busy),
      	          session->host->host_failed);
      
      	................
      	...............
           }
        }
      
      This is not an issue with software-iscsi/iser as each cxn is a separate
      host.
      
      Fix:
      Acquiring eh_mutex in iscsi_conn_teardown() before setting
      session->state = ISCSI_STATE_TERMINATE.
      Signed-off-by: default avatarJohn Soni Jose <sony.john@avagotech.com>
      Reviewed-by: default avatarMike Christie <michaelc@cs.wisc.edu>
      Reviewed-by: default avatarChris Leech <cleech@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      861a9633
    • Joe Thornber's avatar
      dm btree: add ref counting ops for the leaves of top level btrees · a11117c7
      Joe Thornber authored
      commit b0dc3c8b upstream.
      
      When using nested btrees, the top leaves of the top levels contain
      block addresses for the root of the next tree down.  If we shadow a
      shared leaf node the leaf values (sub tree roots) should be incremented
      accordingly.
      
      This is only an issue if there is metadata sharing in the top levels.
      Which only occurs if metadata snapshots are being used (as is possible
      with dm-thinp).  And could result in a block from the thinp metadata
      snap being reused early, thus corrupting the thinp metadata snap.
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      [bwh: Backported to 3.2:
       - Drop change to remove_one()
       - Remove const pointer qualifications]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a11117c7