1. 30 Sep, 2015 16 commits
  2. 21 Sep, 2015 2 commits
    • Jonathon Jongsma's avatar
      drm/qxl: validate monitors config modes · 93401d9f
      Jonathon Jongsma authored
      commit bd3e1c7c upstream.
      
      Due to some recent changes in
      drm_helper_probe_single_connector_modes_merge_bits(), old custom modes
      were not being pruned properly. In current kernels,
      drm_mode_validate_basic() is called to sanity-check each mode in the
      list. If the sanity-check passes, the mode's status gets set to to
      MODE_OK. In older kernels this check was not done, so old custom modes
      would still have a status of MODE_UNVERIFIED at this point, and would
      therefore be pruned later in the function.
      
      As a result of this new behavior, the list of modes for a device always
      includes every custom mode ever configured for the device, with the
      largest one listed first. Since desktop environments usually choose the
      first preferred mode when a hotplug event is emitted, this had the
      result of making it very difficult for the user to reduce the size of
      the display.
      
      The qxl driver did implement the mode_valid connector function, but it
      was empty. In order to restore the old behavior where old custom modes
      are pruned, we implement a proper mode_valid function for the qxl
      driver. This function now checks each mode against the last configured
      custom mode and the list of standard modes. If the mode doesn't match
      any of these, its status is set to MODE_BAD so that it will be pruned as
      expected.
      Signed-off-by: default avatarJonathon Jongsma <jjongsma@redhat.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      93401d9f
    • Stephen Chandler Paul's avatar
      DRM - radeon: Don't link train DisplayPort on HPD until we get the dpcd · 3b895193
      Stephen Chandler Paul authored
      commit 924f92bf upstream.
      
      Most of the time this isn't an issue since hotplugging an adaptor will
      trigger a crtc mode change which in turn, causes the driver to probe
      every DisplayPort for a dpcd. However, in cases where hotplugging
      doesn't cause a mode change (specifically when one unplugs a monitor
      from a DisplayPort connector, then plugs that same monitor back in
      seconds later on the same port without any other monitors connected), we
      never probe for the dpcd before starting the initial link training. What
      happens from there looks like this:
      
      	- GPU has only one monitor connected. It's connected via
      	  DisplayPort, and does not go through an adaptor of any sort.
      
      	- User unplugs DisplayPort connector from GPU.
      
      	- Change in HPD is detected by the driver, we probe every
      	  DisplayPort for a possible connection.
      
      	- Probe the port the user originally had the monitor connected
      	  on for it's dpcd. This fails, and we clear the first (and only
      	  the first) byte of the dpcd to indicate we no longer have a
      	  dpcd for this port.
      
      	- User plugs the previously disconnected monitor back into the
      	  same DisplayPort.
      
      	- radeon_connector_hotplug() is called before everyone else,
      	  and tries to handle the link training. Since only the first
      	  byte of the dpcd is zeroed, the driver is able to complete
      	  link training but does so against the wrong dpcd, causing it
      	  to initialize the link with the wrong settings.
      
      	- Display stays blank (usually), dpcd is probed after the
      	  initial link training, and the driver prints no obvious
      	  messages to the log.
      
      In theory, since only one byte of the dpcd is chopped off (specifically,
      the byte that contains the revision information for DisplayPort), it's
      not entirely impossible that this bug may not show on certain monitors.
      For instance, the only reason this bug was visible on my ASUS PB238
      monitor was due to the fact that this monitor using the enhanced framing
      symbol sequence, the flag for which is ignored if the radeon driver
      thinks that the DisplayPort version is below 1.1.
      Signed-off-by: default avatarStephen Chandler Paul <cpaul@redhat.com>
      Reviewed-by: default avatarJerome Glisse <jglisse@redhat.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3b895193
  3. 18 Sep, 2015 11 commits
    • Marc Zyngier's avatar
      arm64: KVM: Fix host crash when injecting a fault into a 32bit guest · a246a015
      Marc Zyngier authored
      commit 126c69a0 upstream.
      
      When injecting a fault into a misbehaving 32bit guest, it seems
      rather idiotic to also inject a 64bit fault that is only going
      to corrupt the guest state. This leads to a situation where we
      perform an illegal exception return at EL2 causing the host
      to crash instead of killing the guest.
      
      Just fix the stupid bug that has been there from day 1.
      Reported-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Tested-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a246a015
    • Horia Geant?'s avatar
      crypto: caam - fix memory corruption in ahash_final_ctx · a06d0b11
      Horia Geant? authored
      commit b310c178 upstream.
      
      When doing pointer operation for accessing the HW S/G table,
      a value representing number of entries (and not number of bytes)
      must be used.
      
      Fixes: 045e3678 ("crypto: caam - ahash hmac support")
      Signed-off-by: default avatarHoria Geant? <horia.geanta@freescale.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a06d0b11
    • Guenter Roeck's avatar
      regmap: regcache-rbtree: Clean new present bits on present bitmap resize · f71e250f
      Guenter Roeck authored
      commit 8ef9724b upstream.
      
      When inserting a new register into a block, the present bit map size is
      increased using krealloc. krealloc does not clear the additionally
      allocated memory, leaving it filled with random values. Result is that
      some registers are considered cached even though this is not the case.
      
      Fix the problem by clearing the additionally allocated memory. Also, if
      the bitmap size does not increase, do not reallocate the bitmap at all
      to reduce overhead.
      
      Fixes: 3f4ff561 ("regmap: rbtree: Make cache_present bitmap per node")
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f71e250f
    • Bart Van Assche's avatar
      libfc: Fix fc_fcp_cleanup_each_cmd() · efffb6c0
      Bart Van Assche authored
      commit 8f2777f5 upstream.
      
      Since fc_fcp_cleanup_cmd() can sleep this function must not
      be called while holding a spinlock. This patch avoids that
      fc_fcp_cleanup_each_cmd() triggers the following bug:
      
      BUG: scheduling while atomic: sg_reset/1512/0x00000202
      1 lock held by sg_reset/1512:
       #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Call Trace:
       [<ffffffff816c612c>] dump_stack+0x4f/0x7b
       [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
       [<ffffffff816c87aa>] __schedule+0x71a/0xa10
       [<ffffffff816c8ad2>] schedule+0x32/0x80
       [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
       [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
       [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
       [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
       [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
       [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
       [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
       [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
       [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
       [<ffffffff811da608>] block_ioctl+0x38/0x40
       [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
       [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
       [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: default avatarVasu Dev <vasu.dev@intel.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      efffb6c0
    • Davide Italiano's avatar
      ext4: move check under lock scope to close a race. · 7e7ec82e
      Davide Italiano authored
      commit 280227a7 upstream
      
      fallocate() checks that the file is extent-based and returns
      EOPNOTSUPP in case is not. Other tasks can convert from and to
      indirect and extent so it's safe to check only after grabbing
      the inode mutex.
      
      [Nikolay Borisov: Bakported to 3.12.47
       - Adjusted context
       - Add the 'out' label]
      Signed-off-by: default avatarDavide Italiano <dccitaliano@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarNikolay Borisov <kernel@kyup.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7e7ec82e
    • Ian Abbott's avatar
      staging: comedi: adl_pci7x3x: fix digital output on PCI-7230 · bf9adcbe
      Ian Abbott authored
      commit ad83dbd9 upstream
      
      The "adl_pci7x3x" driver replaced the "adl_pci7230" and "adl_pci7432"
      drivers in commits 8f567c37 ("staging: comedi: new adl_pci7x3x
      driver") and 657f77d1 ("staging: comedi: remove adl_pci7230 and
      adl_pci7432 drivers").  Although the new driver code agrees with the
      user manuals for the respective boards, digital outputs stopped working
      on the PCI-7230.  This has 16 digital output channels and the previous
      adl_pci7230 driver shifted the 16 bit output state left by 16 bits
      before writing to the hardware register.  The new adl_pci7x3x driver
      doesn't do that.  Fix it in `adl_pci7x3x_do_insn_bits()` by checking
      for the special case of the subdevice having only 16 channels and
      duplicating the 16 bit output state into both halves of the 32-bit
      register.  That should work both for what the board actually does and
      for what the user manual says it should do.
      
      Fixes: 8f567c37 ("staging: comedi: new adl_pci7x3x driver")
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bf9adcbe
    • Ian Abbott's avatar
      staging: comedi: usbduxsigma: don't clobber ao_timer in command test · 92ca2aaa
      Ian Abbott authored
      commit c04a1f17 upstream
      
      `devpriv->ao_timer` is used while an asynchronous command is running on
      the AO subdevice.  It also gets modified by the subdevice's `cmdtest`
      handler for checking new asynchronous commands,
      `usbduxsigma_ao_cmdtest()`, which is not correct as it's allowed to
      check new commands while an old command is still running.  Fix it by
      moving the code which sets up `devpriv->ao_timer` into the subdevice's
      `cmd` handler, `usbduxsigma_ao_cmd()`.
      
      ** This backported patch also moves the code that sets up
      `devpriv->ao_sample_count` and `devpriv->ao_continuous` from
      `usbduxsigma_ao_cmdtest()` to `usbduxsigma_ao_cmd()` for the same reason
      as above.  (This was not needed in the upstream commit.) **
      
      Note that the removed code in `usbduxsigma_ao_cmdtest()` checked that
      `devpriv->ao_timer` did not end up less that 1, but that could not
      happen due because `cmd->scan_begin_arg` or `cmd->convert_arg` had
      already been range-checked.
      
      Also note that we tested the `high_speed` variable in the old code, but
      that is currently always 0 and means that we always use "scan" timing
      (`cmd->scan_begin_src == TRIG_TIMER` and `cmd->convert_src == TRIG_NOW`)
      and never "convert" (individual sample) timing (`cmd->scan_begin_src ==
      TRIG_FOLLOW` and `cmd->convert_src == TRIG_TIMER`).  The moved code
      tests `cmd->convert_src` instead to decide whether "scan" or "convert"
      timing is being used, although currently only "scan" timing is
      supported.
      
      Fixes: fb1ef622 ("staging: comedi: usbduxsigma: tidy up analog output command support")
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      92ca2aaa
    • Ian Abbott's avatar
      staging: comedi: usbduxsigma: don't clobber ai_timer in command test · 52cfd931
      Ian Abbott authored
      commit 423b24c3 upstream
      
      `devpriv->ai_timer` is used while an asynchronous command is running on
      the AI subdevice.  It also gets modified by the subdevice's `cmdtest`
      handler for checking new asynchronous commands
      (`usbduxsigma_ai_cmdtest()`), which is not correct as it's allowed to
      check new commands while an old command is still running.  Fix it by
      moving the code which sets up `devpriv->ai_timer` and
      `devpriv->ai_interval` into the subdevice's `cmd` handler,
      `usbduxsigma_ai_cmd()`.
      
      ** This backported patch also moves the code that sets up
      `devpriv->ai_sample_count` and `devpriv->ai_continuous` from
      `usbduxsigma_ai_cmdtest()` to `usbduxsigma_ai_cmd()` for the same reason
      as above. (This was not needed in the upstream commit.) **
      
      Note that the removed code in `usbduxsigma_ai_cmdtest()` checked that
      `devpriv->ai_timer` did not end up less than than 1, but that could not
      happen because `cmd->scan_begin_arg` had already been checked to be at
      least the minimum required value (at least when `cmd->scan_begin_src ==
      TRIG_TIMER`, which had also been checked to be the case).
      
      Fixes: b986be85 ("staging: comedi: usbduxsigma: tidy up analog input command support)
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      52cfd931
    • Mark Rustad's avatar
      PCI: Add VPD function 0 quirk for Intel Ethernet devices · 416473bd
      Mark Rustad authored
      commit 7aa6ca4d upstream.
      
      Set the PCI_DEV_FLAGS_VPD_REF_F0 flag on all Intel Ethernet device
      functions other than function 0, so that on multi-function devices, we will
      always read VPD from function 0 instead of from the other functions.
      
      [bhelgaas: changelog]
      Signed-off-by: default avatarMark Rustad <mark.d.rustad@intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      416473bd
    • Mark Rustad's avatar
      PCI: Add dev_flags bit to access VPD through function 0 · 4386f737
      Mark Rustad authored
      commit 932c435c upstream.
      
      Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
      function 0 to provide VPD access on other functions.  This is for hardware
      devices that provide copies of the same VPD capability registers in
      multiple functions.  Because the kernel expects that each function has its
      own registers, both the locking and the state tracking are affected by VPD
      accesses to different functions.
      
      On such devices for example, if a VPD write is performed on function 0,
      *any* later attempt to read VPD from any other function of that device will
      hang.  This has to do with how the kernel tracks the expected value of the
      F bit per function.
      
      Concurrent accesses to different functions of the same device can not only
      hang but also corrupt both read and write VPD data.
      
      When hangs occur, typically the error message:
      
        vpd r/w failed.  This is likely a firmware bug on this device.
      
      will be seen.
      
      Never set this bit on function 0 or there will be an infinite recursion.
      Signed-off-by: default avatarMark Rustad <mark.d.rustad@intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4386f737
    • Jiri Slaby's avatar
      Linux 3.12.48 · cbbdd64c
      Jiri Slaby authored
      cbbdd64c
  4. 14 Sep, 2015 2 commits
    • Pablo Neira Ayuso's avatar
      netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt · a7775d15
      Pablo Neira Ayuso authored
      [ Upstream commit e53376be ]
      
      With this patch, the conntrack refcount is initially set to zero and
      it is bumped once it is added to any of the list, so we fulfill
      Eric's golden rule which is that all released objects always have a
      refcount that equals zero.
      
      Andrey Vagin reports that nf_conntrack_free can't be called for a
      conntrack with non-zero ref-counter, because it can race with
      nf_conntrack_find_get().
      
      A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
      ref-counter says that this conntrack is used. So when we release
      a conntrack with non-zero counter, we break this assumption.
      
      CPU1                                    CPU2
      ____nf_conntrack_find()
                                              nf_ct_put()
                                               destroy_conntrack()
                                              ...
                                              init_conntrack
                                               __nf_conntrack_alloc (set use = 1)
      atomic_inc_not_zero(&ct->use) (use = 2)
                                               if (!l4proto->new(ct, skb, dataoff, timeouts))
                                                nf_conntrack_free(ct); (use = 2 !!!)
                                              ...
                                              __nf_conntrack_alloc (set use = 1)
       if (!nf_ct_key_equal(h, tuple, zone))
        nf_ct_put(ct); (use = 0)
         destroy_conntrack()
                                              /* continue to work with CT */
      
      After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
      race in nf_conntrack_find_get" another bug was triggered in
      destroy_conntrack():
      
      <4>[67096.759334] ------------[ cut here ]------------
      <2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
      ...
      <4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
      <4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>]  [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
      <4>[67096.760255] Call Trace:
      <4>[67096.760255]  [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30
      <4>[67096.760255]  [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
      <4>[67096.760255]  [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
      <4>[67096.760255]  [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
      <4>[67096.760255]  [<ffffffff81484419>] nf_iterate+0x69/0xb0
      <4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
      <4>[67096.760255]  [<ffffffff814845d4>] nf_hook_slow+0x74/0x110
      <4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
      <4>[67096.760255]  [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910
      <4>[67096.760255]  [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130
      <4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
      <4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
      <4>[67096.760255]  [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0
      <4>[67096.760255]  [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140
      <4>[67096.760255]  [<ffffffff81444f97>] sock_sendmsg+0x117/0x140
      <4>[67096.760255]  [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60
      <4>[67096.760255]  [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20
      <4>[67096.760255]  [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
      <4>[67096.760255]  [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80
      <4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
      <4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
      <4>[67096.760255]  [<ffffffff814457c9>] sys_sendto+0x139/0x190
      <4>[67096.760255]  [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200
      <4>[67096.760255]  [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290
      <4>[67096.760255]  [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210
      <4>[67096.760255]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
      
      I have reused the original title for the RFC patch that Andrey posted and
      most of the original patch description.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Andrew Vagin <avagin@parallels.com>
      Cc: Florian Westphal <fw@strlen.de>
      Reported-by: default avatarAndrew Vagin <avagin@parallels.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarAndrew Vagin <avagin@parallels.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a7775d15
    • Andrey Vagin's avatar
      netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get · 336323b7
      Andrey Vagin authored
      [ Upstream commit c6825c09 ]
      
      Lets look at destroy_conntrack:
      
      hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
      ...
      nf_conntrack_free(ct)
      	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
      
      net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
      
      The hash is protected by rcu, so readers look up conntracks without
      locks.
      A conntrack is removed from the hash, but in this moment a few readers
      still can use the conntrack. Then this conntrack is released and another
      thread creates conntrack with the same address and the equal tuple.
      After this a reader starts to validate the conntrack:
      * It's not dying, because a new conntrack was created
      * nf_ct_tuple_equal() returns true.
      
      But this conntrack is not initialized yet, so it can not be used by two
      threads concurrently. In this case BUG_ON may be triggered from
      nf_nat_setup_info().
      
      Florian Westphal suggested to check the confirm bit too. I think it's
      right.
      
      task 1			task 2			task 3
      			nf_conntrack_find_get
      			 ____nf_conntrack_find
      destroy_conntrack
       hlist_nulls_del_rcu
       nf_conntrack_free
       kmem_cache_free
      						__nf_conntrack_alloc
      						 kmem_cache_alloc
      						 memset(&ct->tuplehash[IP_CT_DIR_MAX],
      			 if (nf_ct_is_dying(ct))
      			 if (!nf_ct_tuple_equal()
      
      I'm not sure, that I have ever seen this race condition in a real life.
      Currently we are investigating a bug, which is reproduced on a few nodes.
      In our case one conntrack is initialized from a few tasks concurrently,
      we don't have any other explanation for this.
      
      <2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
      ...
      <4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
      ...
      <4>[46267.085549] Call Trace:
      <4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
      <4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
      <4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
      <4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
      <4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
      <4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
      <4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
      <4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
      <4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
      <4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
      <4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
      <4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
      <4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
      <4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
      <4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
      <4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
      <4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
      <4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
      <4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
      <4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
      <4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
      <4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
      <4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
      <4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
      <4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
      <4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
      <4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
      <4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
      <4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
      <1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: default avatarAndrey Vagin <avagin@openvz.org>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      336323b7
  5. 02 Sep, 2015 5 commits
    • Benjamin LaHaise's avatar
      aio: fix reqs_available handling · 09c59fc8
      Benjamin LaHaise authored
      commit d856f32a upstream.
      
      As reported by Dan Aloni, commit f8567a38 ("aio: fix aio request
      leak when events are reaped by userspace") introduces a regression when
      user code attempts to perform io_submit() with more events than are
      available in the ring buffer.  Reverting that commit would reintroduce a
      regression when user space event reaping is used.
      
      Fixing this bug is a bit more involved than the previous attempts to fix
      this regression.  Since we do not have a single point at which we can
      count events as being reaped by user space and io_getevents(), we have
      to track event completion by looking at the number of events left in the
      event ring.  So long as there are as many events in the ring buffer as
      there have been completion events generate, we cannot call
      put_reqs_available().  The code to check for this is now placed in
      refill_reqs_available().
      
      A test program from Dan and modified by me for verifying this bug is available
      at http://www.kvack.org/~bcrl/20140824-aio_bug.c .
      Reported-by: default avatarDan Aloni <dan@kernelim.com>
      Signed-off-by: default avatarBenjamin LaHaise <bcrl@kvack.org>
      Acked-by: default avatarDan Aloni <dan@kernelim.com>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Petr Matousek <pmatouse@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      09c59fc8
    • Heinz Mauelshagen's avatar
      dm cache mq: fix memory allocation failure for large cache devices · ca6292e4
      Heinz Mauelshagen authored
      commit 14f398ca upstream.
      
      The memory allocated for the multiqueue policy's hash table doesn't need
      to be physically contiguous.  Use vzalloc() instead of kzalloc().
      Fedora has been carrying this fix since 10/10/2013.
      
      Failure seen during creation of a 10TB cached device with a 2048 sector
      block size and 411GB cache size:
      
       dmsetup: page allocation failure: order:9, mode:0x10c0d0
       CPU: 11 PID: 29235 Comm: dmsetup Not tainted 3.10.4 #3
       Hardware name: Supermicro X8DTL/X8DTL, BIOS 2.1a       12/30/2011
        000000000010c0d0 ffff880090941898 ffffffff81387ab4 ffff880090941928
        ffffffff810bb26f 0000000000000009 000000000010c0d0 ffff880090941928
        ffffffff81385dbc ffffffff815f3840 ffffffff00000000 000002000010c0d0
       Call Trace:
        [<ffffffff81387ab4>] dump_stack+0x19/0x1b
        [<ffffffff810bb26f>] warn_alloc_failed+0x110/0x124
        [<ffffffff81385dbc>] ? __alloc_pages_direct_compact+0x17c/0x18e
        [<ffffffff810bda2e>] __alloc_pages_nodemask+0x6c7/0x75e
        [<ffffffff810bdad7>] __get_free_pages+0x12/0x3f
        [<ffffffff810ea148>] kmalloc_order_trace+0x29/0x88
        [<ffffffff810ec1fd>] __kmalloc+0x36/0x11b
        [<ffffffffa031eeed>] ? mq_create+0x1dc/0x2cf [dm_cache_mq]
        [<ffffffffa031efc0>] mq_create+0x2af/0x2cf [dm_cache_mq]
        [<ffffffffa0314605>] dm_cache_policy_create+0xa7/0xd2 [dm_cache]
        [<ffffffffa0312530>] ? cache_ctr+0x245/0xa13 [dm_cache]
        [<ffffffffa031263e>] cache_ctr+0x353/0xa13 [dm_cache]
        [<ffffffffa012b916>] dm_table_add_target+0x227/0x2ce [dm_mod]
        [<ffffffffa012e8e4>] table_load+0x286/0x2ac [dm_mod]
        [<ffffffffa012e65e>] ? dev_wait+0x8a/0x8a [dm_mod]
        [<ffffffffa012e324>] ctl_ioctl+0x39a/0x3c2 [dm_mod]
        [<ffffffffa012e35a>] dm_ctl_ioctl+0xe/0x12 [dm_mod]
        [<ffffffff81101181>] vfs_ioctl+0x21/0x34
        [<ffffffff811019d3>] do_vfs_ioctl+0x3b1/0x3f4
        [<ffffffff810f4d2e>] ? ____fput+0x9/0xb
        [<ffffffff81050b6c>] ? task_work_run+0x7e/0x92
        [<ffffffff81101a68>] SyS_ioctl+0x52/0x82
        [<ffffffff81391d92>] system_call_fastpath+0x16/0x1b
      Signed-off-by: default avatarHeinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ca6292e4
    • Akinobu Mita's avatar
      bio: fix argument of __bio_add_page() for max_sectors > 0xffff · 52c31210
      Akinobu Mita authored
      commit 34f2fd8d upstream.
      
      The data type of max_sectors and max_hw_sectors in queue settings are
      unsigned int.  But these values are passed to __bio_add_page() as an
      argument whose data type is unsigned short.  In the worst case such as
      max_sectors is 0x10000, bio_add_page() can't add a page and IOs can't
      proceed.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAkinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      52c31210
    • James Smart's avatar
      lpfc: Fix scsi prep dma buf error. · 62d80e38
      James Smart authored
      commit 5116fbf1 upstream.
      
      Didn't check for less-than-or-equal zero. Means we may later call
      scsi_dma_unmap() even though we don't have valid mappings.
      Signed-off-by: default avatarDick Kennedy <dick.kennedy@avagotech.com>
      Signed-off-by: default avatarJames Smart <james.smart@avagotech.com>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      62d80e38
    • Shirish Pargaonkar's avatar
      cifs: Send a logoff request before removing a smb session · a67172a0
      Shirish Pargaonkar authored
      commit 7f48558e upstream.
      
      Send a smb session logoff request before removing smb session off of the list.
      On a signed smb session, remvoing a session off of the list before sending
      a logoff request results in server returning an error for lack of
      smb signature.
      
      Never seen an error during smb logoff, so as per MS-SMB2 3.2.5.1,
      not sure how an error during logoff should be retried. So for now,
      if a server returns an error to a logoff request, log the error and
      remove the session off of the list.
      Signed-off-by: default avatarShirish Pargaonkar <shirishpargaonkar@gmail.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a67172a0
  6. 31 Aug, 2015 1 commit
  7. 27 Aug, 2015 3 commits
    • Dan Carpenter's avatar
      rds: fix an integer overflow test in rds_info_getsockopt() · e19d1363
      Dan Carpenter authored
      [ Upstream commit 468b732b ]
      
      "len" is a signed integer.  We check that len is not negative, so it
      goes from zero to INT_MAX.  PAGE_SIZE is unsigned long so the comparison
      is type promoted to unsigned long.  ULONG_MAX - 4095 is a higher than
      INT_MAX so the condition can never be true.
      
      I don't know if this is harmful but it seems safe to limit "len" to
      INT_MAX - 4095.
      
      Fixes: a8c879a7 ('RDS: Info and stats')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e19d1363
    • Jack Morgenstein's avatar
      net/mlx4_core: Fix wrong index in propagating port change event to VFs · 80bcc280
      Jack Morgenstein authored
      [ Upstream commit 1c1bf349 ]
      
      The port-change event processing in procedure mlx4_eq_int() uses "slave"
      as the vf_oper array index. Since the value of "slave" is the PF function
      index, the result is that the PF link state is used for deciding to
      propagate the event for all the VFs. The VF link state should be used,
      so the VF function index should be used here.
      
      Fixes: 948e306d ('net/mlx4: Add VF link state support')
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      80bcc280
    • Florian Westphal's avatar
      netlink: don't hold mutex in rcu callback when releasing mmapd ring · 6056cd78
      Florian Westphal authored
      [ Upstream commit 0470eb99 ]
      
      Kirill A. Shutemov says:
      
      This simple test-case trigers few locking asserts in kernel:
      
      int main(int argc, char **argv)
      {
              unsigned int block_size = 16 * 4096;
              struct nl_mmap_req req = {
                      .nm_block_size          = block_size,
                      .nm_block_nr            = 64,
                      .nm_frame_size          = 16384,
                      .nm_frame_nr            = 64 * block_size / 16384,
              };
              unsigned int ring_size;
      	int fd;
      
      	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
              if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
                      exit(1);
              if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
                      exit(1);
      
      	ring_size = req.nm_block_nr * req.nm_block_size;
      	mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
      	return 0;
      }
      
      +++ exited with 0 +++
      BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
      in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
      3 locks held by init/1:
       #0:  (reboot_mutex){+.+...}, at: [<ffffffff81080959>] SyS_reboot+0xa9/0x220
       #1:  ((reboot_notifier_list).rwsem){.+.+..}, at: [<ffffffff8107f379>] __blocking_notifier_call_chain+0x39/0x70
       #2:  (rcu_callback){......}, at: [<ffffffff810d32e0>] rcu_do_batch.isra.49+0x160/0x10c0
      Preemption disabled at:[<ffffffff8145365f>] __delay+0xf/0x20
      
      CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-00009-gbddf4c4818e0 #253
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
       ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102
       0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002
       ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98
      Call Trace:
       <IRQ>  [<ffffffff81929ceb>] dump_stack+0x4f/0x7b
       [<ffffffff81085a9d>] ___might_sleep+0x16d/0x270
       [<ffffffff81085bed>] __might_sleep+0x4d/0x90
       [<ffffffff8192e96f>] mutex_lock_nested+0x2f/0x430
       [<ffffffff81932fed>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
       [<ffffffff81464143>] ? __this_cpu_preempt_check+0x13/0x20
       [<ffffffff8182fc3d>] netlink_set_ring+0x1ed/0x350
       [<ffffffff8182e000>] ? netlink_undo_bind+0x70/0x70
       [<ffffffff8182fe20>] netlink_sock_destruct+0x80/0x150
       [<ffffffff817e484d>] __sk_free+0x1d/0x160
       [<ffffffff817e49a9>] sk_free+0x19/0x20
      [..]
      
      Cong Wang says:
      
      We can't hold mutex lock in a rcu callback, [..]
      
      Thomas Graf says:
      
      The socket should be dead at this point. It might be simpler to
      add a netlink_release_ring() function which doesn't require
      locking at all.
      Reported-by: default avatar"Kirill A. Shutemov" <kirill@shutemov.name>
      Diagnosed-by: default avatarCong Wang <cwang@twopensource.com>
      Suggested-by: default avatarThomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6056cd78