1. 18 Sep, 2015 4 commits
    • Ekaterina Tumanova's avatar
      KVM: s390: Zero out current VMDB of STSI before including level3 data. · cd440e0c
      Ekaterina Tumanova authored
      commit b75f4c9a upstream.
      
      s390 documentation requires words 0 and 10-15 to be reserved and stored as
      zeros. As we fill out all other fields, we can memset the full structure.
      Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
      Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      cd440e0c
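The zero-then-fill approach described in this commit can be sketched in plain C. This is only an illustration under assumptions: `struct demo_block`, its 16-word layout, and `fill_block()` are hypothetical stand-ins, not the real STSI data structure or the KVM code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-word block; words 0 and 10-15 are "reserved". */
struct demo_block {
	uint32_t words[16];
};

/*
 * Zero the whole structure first, then fill only the defined words.
 * Reserved words are guaranteed to be zero without naming each one.
 */
static void fill_block(struct demo_block *b, uint32_t value)
{
	int i;

	memset(b, 0, sizeof(*b));
	for (i = 1; i <= 9; i++)
		b->words[i] = value;
}
```

Clearing the full structure once is less error-prone than zeroing each reserved field individually, since a reserved word added later is covered automatically.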
    • Sabrina Dubroca's avatar
      e1000: add dummy allocator to fix race condition between mtu change and netpoll · 4d2837d5
      Sabrina Dubroca authored
      commit 08e83316 upstream.
      
      There is a race condition between e1000_change_mtu's cleanups and
      netpoll, when we change the MTU across jumbo size:
      
      Changing MTU frees all the rx buffers:
          e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings ->
              e1000_clean_rx_ring
      
      Then, close to the end of e1000_change_mtu:
          pr_info -> ... -> netpoll_poll_dev -> e1000_clean ->
              e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag
      
      And when we come back to do the rest of the MTU change:
          e1000_up -> e1000_configure -> e1000_configure_rx ->
              e1000_alloc_jumbo_rx_buffers
      
      alloc_jumbo finds the buffers already != NULL, since data (shared with
      page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage,
      or at least not what is expected when in jumbo state.
      
      This results in an unusable adapter (packets don't get through), and a
      NULL pointer dereference on the next call to e1000_clean_rx_ring
      (other mtu change, link down, shutdown):
      
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330
      
          [...]
      
      Call Trace:
       [<ffffffff81195445>] put_page+0x55/0x60
       [<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
       [<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
       [<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
       [<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
       [<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
       [<ffffffff81647050>] dev_set_mtu+0xa0/0x140
       [<ffffffff81664218>] do_setlink+0x218/0xac0
       [<ffffffff814459e9>] ? nla_parse+0xb9/0x120
       [<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
       [<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
       [<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
       [<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260
      
      By setting the allocator to a dummy version, netpoll can't mess up our
      rx buffers.  The allocator is set back to a sane value in
      e1000_configure_rx.
      
      Fixes: edbbb3ca ("e1000: implement jumbo receive with partial descriptors")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      4d2837d5
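The dummy-allocator idea can be illustrated with a function pointer that is swapped to a no-op while the rings are torn down. This is a userspace sketch under assumptions: every `demo_*` name is hypothetical and the ring is reduced to a buffer counter; it is not the e1000 driver code.

```c
/* Hypothetical, heavily simplified adapter state. */
struct demo_adapter {
	int rx_buffers;                                   /* buffers currently allocated */
	void (*alloc_rx_buf)(struct demo_adapter *, int); /* current allocator */
};

/* Normal allocator: refills the ring. */
static void demo_alloc_real(struct demo_adapter *a, int count)
{
	a->rx_buffers += count;
}

/* Dummy allocator: deliberately does nothing. */
static void demo_alloc_dummy(struct demo_adapter *a, int count)
{
	(void)a;
	(void)count;
}

/* Teardown: free everything and park the allocator on the dummy. */
static void demo_down(struct demo_adapter *a)
{
	a->rx_buffers = 0;
	a->alloc_rx_buf = demo_alloc_dummy;
}

/* Reconfiguration: restore a working allocator. */
static void demo_configure_rx(struct demo_adapter *a)
{
	a->alloc_rx_buf = demo_alloc_real;
}

/* Stand-in for a poll that may run at any time, e.g. from netpoll. */
static void demo_poll(struct demo_adapter *a)
{
	a->alloc_rx_buf(a, 16);
}
```

A poll that lands between `demo_down()` and `demo_configure_rx()` calls the dummy and leaves the ring empty, which is the property the commit message relies on.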
    • K. Y. Srinivasan's avatar
      Drivers: hv: vmbus: Fix a bug in the error path in vmbus_open() · a76338a9
      K. Y. Srinivasan authored
      commit 40384e4b upstream.
      
      Correctly roll back state if the failure occurs after we have handed over
      the ownership of the buffer to the host.
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      a76338a9
    • Alexander Ploumistos's avatar
      Bluetooth: ath3k: Add support Atheros AR5B195 combo Mini PCIe card · b1113711
      Alexander Ploumistos authored
      commit 2eeff0b4 upstream.
      
      Add 04f2:aff1 to ath3k.c supported devices list and btusb.c blacklist, so
      that the device can load the ath3k firmware and re-enumerate itself as an
      AR3011 device.
      
      T:  Bus=05 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
      D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=04f2 ProdID=aff1 Rev= 0.01
      C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      Signed-off-by: Alexander Ploumistos <alexpl@fedoraproject.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      b1113711
  2. 14 Sep, 2015 1 commit
    • Weilong Chen's avatar
      ipv6: add check for blackhole or prohibited entry in rt6_redire · 9a6fbaeb
      Weilong Chen authored
      There's a check for ip6_null_entry, but it's not enough if the config
      CONFIG_IPV6_MULTIPLE_TABLES is selected. Blackhole or prohibited entries
      should also be ignored.
      
      This patch is for kernels before v3.6: commit b94f1c09 switched to
      icmpv6_notify() instead of rt6_redirect(), and rt6_redirect() has
      since been deleted.
      
      The oops is as follows:
          [exception RIP: do_raw_write_lock+12]
          RIP: ffffffff8122c42c  RSP: ffff880666e45820  RFLAGS: 00010282
          RAX: ffff8801207bffd8  RBX: 0000000000000018  RCX: 0000000000000000
          RDX: 0000000000000000  RSI: ffff880666e45898  RDI: 0000000000000018
          RBP: ffff880666e45830   R8: 000000000000001e   R9: 0000000006000000
          R10: ffff88011796b8a0  R11: 0000000000000004  R12: ffff88010391ed00
          R13: 0000000000000000  R14: ffff880666e45898  R15: ffff88011796b890
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
          [ffff880666e45838] _raw_write_lock_bh at ffffffff81450b39
          [ffff880666e45858] __ip6_ins_rt at ffffffff813ed8c1
          [ffff880666e45888] ip6_ins_rt at ffffffff813eef58
          [ffff880666e458b8] rt6_redirect at ffffffff813f0b84
          [ffff880666e45958] ndisc_rcv at ffffffff813f95d8
          [ffff880666e45a08] icmpv6_rcv at ffffffff814000e8
          [ffff880666e45ae8] ip6_input_finish at ffffffff813e43bb
          [ffff880666e45b38] ip6_input at ffffffff813e4b08
          [ffff880666e45b68] ipv6_rcv at ffffffff813e4969
          [ffff880666e45bc8] __netif_receive_skb at ffffffff8135158a
          [ffff880666e45c38] dev_gro_receive at ffffffff81351cb0
          [ffff880666e45c78] napi_gro_receive at ffffffff81351fc5
          [ffff880666e45cb8] tg3_rx at ffffffffa0bfb354 [tg]
          [ffff880666e45d88] tg3_poll_work at ffffffffa0c07857 [tg]
          [ffff880666e45e18] tg3_poll_msix at ffffffffa0c07d1b [tg]
          [ffff880666e45e68] net_rx_action at ffffffff81352219
          [ffff880666e45ec8] __do_softirq at ffffffff8103e5a1
          [ffff880666e45f38] call_softirq at ffffffff81459c4c
          [ffff880666e45f50] do_softirq at ffffffff8100413d
          [ffff880666e45f80] do_IRQ at ffffffff81003cce
      This happened when ip6_route_redirect() found a rt that was set to
      blackhole. The rt had a NULL rt6i_table member, which __ip6_ins_rt()
      dereferenced when trying to lock rt6i_table->tb6_lock, causing a BUG:
      "BUG: unable to handle kernel NULL pointer"
      Signed-off-by: Weilong Chen <chenweilong@huawei.com>
      9a6fbaeb
  3. 19 Jun, 2015 35 commits
    • Zefan Li's avatar
      Linux 3.4.108 · cf1b3dad
      Zefan Li authored
      cf1b3dad
    • Ian Campbell's avatar
      xen: netback: read hotplug script once at start of day. · 366df578
      Ian Campbell authored
      commit 31a41898 upstream.
      
      When we come to tear things down in netback_remove() and generate the
      uevent it is possible that the xenstore directory has already been
      removed (details below).
      
      In such cases netback_uevent() won't be able to read the hotplug
      script and will write a xenstore error node.
      
      A recent change to the hypervisor exposed this race such that we now
      sometimes lose it (where apparently we didn't ever before).
      
      Instead read the hotplug script configuration during setup and use it
      for the lifetime of the backend device.
      
      The apparently more obvious fix of moving the transition to
      state=Closed in netback_remove() to after the uevent does not work
      because it is possible that we are already in state=Closed (in
      reaction to the guest having disconnected as it shut down). Being
      already in Closed means the toolstack is at liberty to start tearing
      down the xenstore directories. In principle it might be possible to
      arrange to unregister the device sooner (e.g. on transition to Closing)
      such that xenstore would still be there but this state machine is
      fragile and prone to anger...
      
      A modern Xen system only relies on the hotplug uevent for driver
      domains, when the backend is in the same domain as the toolstack it
      will run the necessary setup/teardown directly in the correct sequence
      wrt xenstore changes.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      366df578
    • Joonsoo Kim's avatar
      slub: refactoring unfreeze_partials() · e0483eb8
      Joonsoo Kim authored
      commit 43d77867 upstream.
      
      The current implementation of unfreeze_partials() is complicated,
      but the benefit from it is insignificant. In addition, the large amount
      of code in the do {} while loop increases the failure rate of
      cmpxchg_double_slab. The current implementation tests the status of the
      cpu partial slab and acquires list_lock inside the do {} while loop,
      so it avoids acquiring list_lock, and gains a little, when the front of
      the cpu partial slab is to be discarded, but this is a rare case.
      And when add_partial has been performed and cmpxchg_double_slab then
      fails, remove_partial has to be called case by case.
      
      I think these are disadvantages of the current implementation,
      so I refactor unfreeze_partials().
      
      Minimizing the code in the do {} while loop reduces the failure rate
      of cmpxchg_double_slab. Below is the output of 'slabinfo -r kmalloc-256'
      after running './perf stat -r 33 hackbench 50 process 4000 > /dev/null'.
      
      ** before **
      Cmpxchg_double Looping
      ------------------------
      Locked Cmpxchg Double redos   182685
      Unlocked Cmpxchg Double redos 0
      
      ** after **
      Cmpxchg_double Looping
      ------------------------
      Locked Cmpxchg Double redos   177995
      Unlocked Cmpxchg Double redos 1
      
      We can see that the cmpxchg_double_slab failure rate improves slightly.
      
      Below is the output of './perf stat -r 30 hackbench 50 process 4000 > /dev/null'.
      
      ** before **
       Performance counter stats for './hackbench 50 process 4000' (30 runs):
      
           108517.190463 task-clock                #    7.926 CPUs utilized            ( +-  0.24% )
               2,919,550 context-switches          #    0.027 M/sec                    ( +-  3.07% )
                 100,774 CPU-migrations            #    0.929 K/sec                    ( +-  4.72% )
                 124,201 page-faults               #    0.001 M/sec                    ( +-  0.15% )
         401,500,234,387 cycles                    #    3.700 GHz                      ( +-  0.24% )
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
         250,576,913,354 instructions              #    0.62  insns per cycle          ( +-  0.13% )
          45,934,956,860 branches                  #  423.297 M/sec                    ( +-  0.14% )
             188,219,787 branch-misses             #    0.41% of all branches          ( +-  0.56% )
      
            13.691837307 seconds time elapsed                                          ( +-  0.24% )
      
      ** after **
       Performance counter stats for './hackbench 50 process 4000' (30 runs):
      
           107784.479767 task-clock                #    7.928 CPUs utilized            ( +-  0.22% )
               2,834,781 context-switches          #    0.026 M/sec                    ( +-  2.33% )
                  93,083 CPU-migrations            #    0.864 K/sec                    ( +-  3.45% )
                 123,967 page-faults               #    0.001 M/sec                    ( +-  0.15% )
         398,781,421,836 cycles                    #    3.700 GHz                      ( +-  0.22% )
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
         250,189,160,419 instructions              #    0.63  insns per cycle          ( +-  0.09% )
          45,855,370,128 branches                  #  425.436 M/sec                    ( +-  0.10% )
             169,881,248 branch-misses             #    0.37% of all branches          ( +-  0.43% )
      
            13.596272341 seconds time elapsed                                          ( +-  0.22% )
      
      No regression is found; rather, the result is slightly better.
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      e0483eb8
    • Ben Hutchings's avatar
      xen-pciback: Add name prefix to global 'permissive' variable · cb990484
      Ben Hutchings authored
      commit 8014bcc8 upstream.
      
      The variable for the 'permissive' module parameter used to be static
      but was recently changed to be extern.  This puts it in the kernel
      global namespace if the driver is built-in, so its name should begin
      with a prefix identifying the driver.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Fixes: af6fc858 ("xen-pciback: limit guest control of command register")
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      cb990484
    • Tejun Heo's avatar
      writeback: use |1 instead of +1 to protect against div by zero · c905f0af
      Tejun Heo authored
      commit 464d1387 upstream.
      
      mm/page-writeback.c has several places where 1 is added to the divisor
      to prevent division by zero exceptions; however, if the original
      divisor is equivalent to -1, adding 1 leads to division by zero.
      
      There are three places where +1 is used for this purpose - one in
      pos_ratio_polynom() and two in bdi_position_ratio().  The second one
      in bdi_position_ratio() actually triggered div-by-zero oops on a
      machine running a 3.10 kernel.  The divisor is
      
        x_intercept - bdi_setpoint + 1 == span + 1
      
      span is confirmed to be (u32)-1.  It isn't clear how it ended up that
      way, but it could be from the write bandwidth calculation underflow
      fixed by c72efb65 ("writeback: fix possible underflow in write
      bandwidth calculation").
      
      At any rate, +1 isn't a proper protection against div-by-zero.  This
      patch converts all +1 protections to |1.  Note that
      bdi_update_dirty_ratelimit() was already using |1 before this patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      [lizf: Backported to 3.4: drop other two changes as there's only one
       such statement in 3.4]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      c905f0af
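The difference between the two guards is easy to see in userspace C. This is a minimal sketch, not the code in mm/page-writeback.c; the helper names `guard_add()` and `guard_or()` are hypothetical.

```c
#include <stdint.h>

/* Old guard: adding 1 wraps to 0 when span is (u32)-1. */
static uint32_t guard_add(uint32_t span)
{
	return span + 1;
}

/* New guard: setting bit 0 can never produce 0. */
static uint32_t guard_or(uint32_t span)
{
	return span | 1;
}
```

With `span == UINT32_MAX`, `guard_add()` yields a zero divisor while `guard_or()` leaves the value unchanged. The cost of `|1` is that an even divisor is rounded up by one, which is negligible for this kind of ratio calculation.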
    • Yann Droneaud's avatar
      IB/core: don't disallow registering region starting at 0x0 · 397c6496
      Yann Droneaud authored
      commit 66578b0b upstream.
      
      In a call to ib_umem_get(), if address is 0x0 and size is
      already page aligned, check added in commit 8494057a
      ("IB/uverbs: Prevent integer overflow in ib_umem_get address
      arithmetic") will refuse to register a memory region that
      could otherwise be valid (provided vm.mmap_min_addr sysctl
      and mmap_low_allowed SELinux knobs allow userspace to map
      something at address 0x0).
      
      This patch allows such registrations again: ib_umem_get()
      should probably not care about the base address, provided the
      memory can be pinned with get_user_pages().
      
      There are two possible overflows, in (addr + size) and in
      PAGE_ALIGN(addr + size). This patch still ensures that neither
      of them happens while allowing memory to be pinned at address
      0x0. The case of size equal to 0 is no longer (partially)
      handled here, as 0-length memory regions are disallowed by an
      earlier check.
      
      Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
      Cc: Shachar Raindel <raindel@mellanox.com>
      Cc: Jack Morgenstein <jackm@mellanox.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
      Reviewed-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      397c6496
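The two overflow checks named in this commit message can be sketched as follows. This is an illustration under assumptions: the 4096-byte page size, the `DEMO_*` macros, and `range_ok()` are hypothetical, and the zero-size test stands in for the "earlier check" the message mentions.

```c
#include <stdbool.h>

#define DEMO_PAGE_SIZE 4096UL
/* Round x up to the next page boundary (may wrap for very large x). */
#define DEMO_PAGE_ALIGN(x) (((x) + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1))

/*
 * Accept a region unless its end, or its page-aligned end, wraps.
 * The base address itself is not inspected, so addr == 0 is allowed.
 */
static bool range_ok(unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;

	if (size == 0)
		return false;           /* zero-length regions rejected earlier */
	if (end < addr)
		return false;           /* addr + size overflowed */
	if (DEMO_PAGE_ALIGN(end) < end)
		return false;           /* rounding up overflowed */
	return true;
}
```

Comparing each result against the value it was derived from catches the wraparound without placing any constraint on the base address.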
    • Quentin Casasnovas's avatar
      cdc-acm: prevent infinite loop when parsing CDC headers. · b383c48a
      Quentin Casasnovas authored
      commit 0d3bba02 upstream.
      
      Phil and I found out a problem with commit:
      
        7e860a6e ("cdc-acm: add sanity checks")
      
      It added some sanity checks to ignore potential garbage in CDC headers but
      also introduced a potential infinite loop.  This can happen at the first
      loop iteration (elength = 0 in that case) if the description isn't a
      DT_CS_INTERFACE or later if 'buffer[0]' is zero.
      
      It should also be noted that the wrong length was being added to 'buffer'
      in case 'buffer[1]' was not a DT_CS_INTERFACE descriptor, since elength was
      assigned after that check in the loop.
      
      A specially crafted USB device could be used to trigger this infinite loop.
      
      Fixes: 7e860a6e ("cdc-acm: add sanity checks")
      Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
      Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      CC: Oliver Neukum <oneukum@suse.de>
      CC: Adam Lee <adam8157@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      b383c48a
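A descriptor walk that cannot loop forever might look like the sketch below. It is a userspace illustration, not the cdc-acm probe code: `count_cs_descriptors()` and `DEMO_DT_CS_INTERFACE` are hypothetical names, and the extra truncation check is an assumption added to keep the example safe on arbitrary input.

```c
#include <stddef.h>

#define DEMO_DT_CS_INTERFACE 0x24 /* class-specific interface descriptor type */

/*
 * Walk a buffer of length-prefixed descriptors and count the
 * class-specific interface ones. The length is read before any other
 * check, and a zero length is treated as one garbage byte, so every
 * iteration advances by at least one byte.
 */
static int count_cs_descriptors(const unsigned char *buffer, size_t buflen)
{
	int found = 0;

	while (buflen >= 2) {
		size_t elength = buffer[0];

		if (elength == 0)
			elength = 1;    /* skip a single garbage byte */
		else if (buffer[1] == DEMO_DT_CS_INTERFACE)
			found++;

		if (elength > buflen)
			break;          /* truncated descriptor, stop */
		buffer += elength;
		buflen -= elength;
	}
	return found;
}
```

Reading the length first also addresses the second problem in the message: the amount skipped for a non-matching descriptor is that descriptor's own length rather than a stale value.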
    • Al Viro's avatar
      don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu() · 981889fb
      Al Viro authored
      commit 7bd88377 upstream.
      
      return the value instead, and have path_init() do the assignment.  Broken by
      "vfs: Fix absolute RCU path walk failures due to uninitialized seq number",
      which was Cc-stable with 2.6.38+ as destination.  This one should go where
      it went.
      
      To avoid dummy value returned in case when root is already set (it would do
      no harm, actually, since the only caller that doesn't ignore the return value
      is guaranteed to have nd->root *not* set, but it's more obvious that way),
      lift the check into callers.  And do the same to set_root(), to keep them
      in sync.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ian Jackson <ian.jackson@eu.citrix.com>
      [lizf: the previous backport of this upstream commit is buggy. fix it]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      981889fb
    • Yinghai Lu's avatar
      PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev · bded67cc
      Yinghai Lu authored
      commit fc279850 upstream.
      
      These interfaces:
      
        pcibios_resource_to_bus(struct pci_dev *dev, *bus_region, *resource)
        pcibios_bus_to_resource(struct pci_dev *dev, *resource, *bus_region)
      
      took a pci_dev, but they really depend only on the pci_bus.  And we want to
      use them in resource allocation paths where we have the bus but not a
      device, so this patch converts them to take the pci_bus instead of the
      pci_dev:
      
        pcibios_resource_to_bus(struct pci_bus *bus, *bus_region, *resource)
        pcibios_bus_to_resource(struct pci_bus *bus, *resource, *bus_region)
      
      In fact, with standard PCI-PCI bridges, they only depend on the host
      bridge, because that's the only place address translation occurs, but
      we aren't going that far yet.
      
      [bhelgaas: changelog]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Dirk Behme <dirk.behme@gmail.com>
      [lizf: Backported to 3.4:
       - make changes to pci_host_bridge() instead of find_pci_root_bus()
       - adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      bded67cc
    • Konrad Rzeszutek Wilk's avatar
      config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected · edf76233
      Konrad Rzeszutek Wilk authored
      commit a6dfa128 upstream.
      
      A huge number of NIC drivers use the DMA API, however if
      compiled under 32-bit a very important part of the DMA API can
      be omitted, leading to the drivers not working at all
      (especially if used with 'swiotlb=force iommu=soft').
      
      As Prashant Sreedharan explains it: "the driver [tg3] uses
      DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of
      the dma "mapping" and dma_unmap_addr() to get the "mapping"
      value. On most of the platforms this is a no-op, but ... with
      "iommu=soft and swiotlb=force" this house keeping is required,
      ... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_
      instead of the DMA address."
      
      As such enable this even when using 32-bit kernels.
      Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Prashant Sreedharan <prashant@broadcom.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Chan <mchan@broadcom.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: cascardo@linux.vnet.ibm.com
      Cc: david.vrabel@citrix.com
      Cc: sanjeevb@broadcom.com
      Cc: siva.kallam@broadcom.com
      Cc: vyasevich@gmail.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      edf76233
    • Kirill A. Shutemov's avatar
      perf tools: Fix build with perl 5.18 · db5a0101
      Kirill A. Shutemov authored
      commit 575bf1d0 upstream.
      
      perl.h from new Perl release doesn't like -Wundef and -Wswitch-default:
      
      /usr/lib/perl5/core_perl/CORE/perl.h:548:5: error: "SILENT_NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if SILENT_NO_TAINT_SUPPORT && !defined(NO_TAINT_SUPPORT)
           ^
      /usr/lib/perl5/core_perl/CORE/perl.h:556:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if NO_TAINT_SUPPORT
           ^
      In file included from /usr/lib/perl5/core_perl/CORE/perl.h:3471:0,
                       from util/scripting-engines/trace-event-perl.c:30:
      /usr/lib/perl5/core_perl/CORE/sv.h:1455:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if NO_TAINT_SUPPORT
           ^
      In file included from /usr/lib/perl5/core_perl/CORE/perl.h:3472:0,
                       from util/scripting-engines/trace-event-perl.c:30:
      /usr/lib/perl5/core_perl/CORE/regexp.h:436:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if NO_TAINT_SUPPORT
           ^
      In file included from /usr/lib/perl5/core_perl/CORE/hv.h:592:0,
                       from /usr/lib/perl5/core_perl/CORE/perl.h:3480,
                       from util/scripting-engines/trace-event-perl.c:30:
      /usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_siphash_2_4’:
      /usr/lib/perl5/core_perl/CORE/hv_func.h:222:3: error: switch missing default case [-Werror=switch-default]
         switch( left )
         ^
      /usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_superfast’:
      /usr/lib/perl5/core_perl/CORE/hv_func.h:274:5: error: switch missing default case [-Werror=switch-default]
           switch (rem) { \
           ^
      /usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_murmur3’:
      /usr/lib/perl5/core_perl/CORE/hv_func.h:398:5: error: switch missing default case [-Werror=switch-default]
           switch(bytes_in_carry) { /* how many bytes in carry */
           ^
      
      Let's disable the warnings for code which uses perl.h.
      Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1372063394-20126-1-git-send-email-kirill@shutemov.name
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Vinson Lee <vlee@twopensource.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      db5a0101
    • hujianyang's avatar
      UBI: fix soft lockup in ubi_check_volume() · 9f03e834
      hujianyang authored
      commit 9aa272b4 upstream.
      
      Running mtd-utils/tests/ubi-tests/io_basic.c could cause a
      soft lockup or watchdog reset. This is because *updatevol*
      performs ubi_check_volume() after the update finishes,
      and this function does a full scan of the updated LEBs if the
      volume is initialized as STATIC_VOLUME.
      
      This patch adds *cond_resched()* in the LEB scan loop
      to avoid the soft lockup.
      
      Helped by Richard Weinberger <richard@nod.at>
      
      [ 2158.067096] INFO: rcu_sched self-detected stall on CPU { 1}  (t=2101 jiffies g=1606 c=1605 q=56)
      [ 2158.172867] CPU: 1 PID: 2073 Comm: io_basic Tainted: G           O 3.10.53 #21
      [ 2158.172898] [<c000f624>] (unwind_backtrace+0x0/0x120) from [<c000c294>] (show_stack+0x10/0x14)
      [ 2158.172918] [<c000c294>] (show_stack+0x10/0x14) from [<c008ac3c>] (rcu_check_callbacks+0x1c0/0x660)
      [ 2158.172936] [<c008ac3c>] (rcu_check_callbacks+0x1c0/0x660) from [<c002b480>] (update_process_times+0x38/0x64)
      [ 2158.172953] [<c002b480>] (update_process_times+0x38/0x64) from [<c005ff38>] (tick_sched_handle+0x54/0x60)
      [ 2158.172966] [<c005ff38>] (tick_sched_handle+0x54/0x60) from [<c00601ac>] (tick_sched_timer+0x44/0x74)
      [ 2158.172978] [<c00601ac>] (tick_sched_timer+0x44/0x74) from [<c003f348>] (__run_hrtimer+0xc8/0x1b8)
      [ 2158.172992] [<c003f348>] (__run_hrtimer+0xc8/0x1b8) from [<c003fd9c>] (hrtimer_interrupt+0x128/0x2a4)
      [ 2158.173007] [<c003fd9c>] (hrtimer_interrupt+0x128/0x2a4) from [<c0246f1c>] (arch_timer_handler_virt+0x28/0x30)
      [ 2158.173022] [<c0246f1c>] (arch_timer_handler_virt+0x28/0x30) from [<c0086214>] (handle_percpu_devid_irq+0x9c/0x124)
      [ 2158.173036] [<c0086214>] (handle_percpu_devid_irq+0x9c/0x124) from [<c0082bd8>] (generic_handle_irq+0x20/0x30)
      [ 2158.173049] [<c0082bd8>] (generic_handle_irq+0x20/0x30) from [<c000969c>] (handle_IRQ+0x64/0x8c)
      [ 2158.173060] [<c000969c>] (handle_IRQ+0x64/0x8c) from [<c0008544>] (gic_handle_irq+0x3c/0x60)
      [ 2158.173074] [<c0008544>] (gic_handle_irq+0x3c/0x60) from [<c02f0f80>] (__irq_svc+0x40/0x50)
      [ 2158.173083] Exception stack(0xc4043c98 to 0xc4043ce0)
      [ 2158.173092] 3c80:                                                       c4043ce4 00000019
      [ 2158.173102] 3ca0: 1f8a865f c050ad10 1f8a864c 00000031 c04b5970 0003ebce 00000000 f3550000
      [ 2158.173113] 3cc0: bf00bc68 00000800 0003ebce c4043ce0 c0186d14 c0186cb8 80000013 ffffffff
      [ 2158.173130] [<c02f0f80>] (__irq_svc+0x40/0x50) from [<c0186cb8>] (read_current_timer+0x4/0x38)
      [ 2158.173145] [<c0186cb8>] (read_current_timer+0x4/0x38) from [<1f8a865f>] (0x1f8a865f)
      [ 2183.927097] BUG: soft lockup - CPU#1 stuck for 22s! [io_basic:2073]
      [ 2184.002229] Modules linked in: nandflash(O) [last unloaded: nandflash]
      Signed-off-by: Wang Kai <morgan.wang@huawei.com>
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      9f03e834
    • Sasha Levin's avatar
      autofs4: check dev ioctl size before allocating · a5822a08
      Sasha Levin authored
      commit e53d77eb upstream.
      
      There wasn't any check of the size passed from userspace before trying
      to allocate the memory required.
      
      This meant that userspace might request more space than allowed,
      triggering an OOM.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      a5822a08
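Validating a caller-supplied size before allocating can be sketched like this. The bounds and names here are assumptions made for illustration (`DEMO_HDR_SIZE`, `DEMO_PATH_MAX`, `alloc_request()`, and the chosen error codes); they are not taken from the autofs4 source.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_HDR_SIZE 24u   /* hypothetical fixed header size */
#define DEMO_PATH_MAX 4096u /* hypothetical cap on the trailing path */

/*
 * Allocate a request buffer only after the requested size has been
 * bounded on both sides. Returns 0 on success or a negative errno.
 */
static int alloc_request(size_t size, void **out)
{
	*out = NULL;

	if (size < DEMO_HDR_SIZE)
		return -EINVAL;          /* too small to hold the header */
	if (size > DEMO_HDR_SIZE + DEMO_PATH_MAX)
		return -ENAMETOOLONG;    /* larger than any valid request */

	*out = malloc(size);
	return *out ? 0 : -ENOMEM;
}
```

Rejecting oversized requests before the allocation call means an unprivileged caller cannot use the size field alone to put pressure on the allocator.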
    • Dan Carpenter's avatar
      ipvs: uninitialized data with IP_VS_IPV6 · 4dd86a6a
      Dan Carpenter authored
      commit 3b05ac38 upstream.
      
      The app_tcp_pkt_out() function expects "*diff" to be set and ends up
      using uninitialized data if CONFIG_IP_VS_IPV6 is turned on.
      
      The same issue is there in app_tcp_pkt_in().  Thanks to Julian Anastasov
      for noticing that.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      4dd86a6a
    • Florian Westphal's avatar
      net: make skb_gso_segment error handling more robust · 812fbfa1
      Florian Westphal authored
      commit 330966e5 upstream.
      
      skb_gso_segment has three possible return values:
      1. a pointer to the first segmented skb
      2. an errno value (IS_ERR())
      3. NULL.  This can happen when GSO is used for header verification.
      
      However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
      and would oops when NULL is returned.
      
      Note that these call sites should never actually see such a NULL return
      value; all callers mask out the GSO bits in the feature argument.
      
      However, there have been issues with some protocol handlers erroneously not
      respecting the specified feature mask in some cases.
      
      It is preferable to get 'have to turn off hw offloading, else slow' reports
      rather than 'kernel crashes'.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      [lizf: Backported to 3.4: drop some hunks as there are fewer skb_gso_segment()
       users in 3.4]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      812fbfa1
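The three-way return convention described above can be sketched in userspace as follows. The `ERR_PTR`, `PTR_ERR`, `IS_ERR` and `IS_ERR_OR_NULL` helpers here are simplified re-implementations modelled on the kernel helpers of the same names, not the kernel headers themselves; `classify_segments` and `safe_to_dereference` are hypothetical names for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error pointers occupy the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
static inline int IS_ERR_OR_NULL(const void *ptr)
{
    return !ptr || IS_ERR(ptr);
}

enum seg_result { SEG_OK, SEG_ERROR, SEG_NOTHING };

/* Distinguish the three outcomes a segmentation call may report. */
static enum seg_result classify_segments(const void *segs)
{
    if (IS_ERR(segs))
        return SEG_ERROR;      /* case 2: encoded errno */
    if (!segs)
        return SEG_NOTHING;    /* case 3: NULL, nothing was segmented */
    return SEG_OK;             /* case 1: pointer to the first segment */
}

/* NULL is not an error pointer, so IS_ERR() alone is not a safe guard. */
static int safe_to_dereference(const void *segs)
{
    return !IS_ERR_OR_NULL(segs);
}
```

A caller that tests only `IS_ERR()` treats NULL as a valid segment list, which is the oops the commit guards against.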
    • Pravin B Shelar's avatar
      openvswitch: Check currect return value from skb_gso_segment() · 4e237a3e
      Pravin B Shelar authored
      commit 92e5dfc3 upstream.
      
      Fix return check typo.
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      4e237a3e
    • Jann Horn's avatar
      fs: take i_mutex during prepare_binprm for set[ug]id executables · e661bb1c
      Jann Horn authored
      commit 8b01fc86 upstream.
      
      This prevents a race between chown() and execve(), where chowning a
      setuid-user binary to root would momentarily make the binary setuid
      root.
      
      This patch was mostly written by Linus Torvalds.
      Signed-off-by: Jann Horn <jann@thejh.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [lizf: Backported to 3.4:
       - adjust context
       - remove task_no_new_priv and user namespace stuff
       - open-code file_inode()
       - s/READ_ONCE/ACCESS_ONCE]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      e661bb1c
    • Tomas Henzl's avatar
      hpsa: fix memory leak in kdump hard reset · fcafa22d
      Tomas Henzl authored
      commit 03741d95 upstream.
      
      There is a potential memory leak in hpsa_kdump_hard_reset_controller.
      Reviewed-by: Don Brace <don.brace@pmcs.com>
      Reviewed-by: Scott Teel <scott.teel@pmcs.com>
      Signed-off-by: Tomas Henzl <thenzl@redhat.com>
      Signed-off-by: Don Brace <don.brace@pmcs.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Vinson Lee <vlee@twopensource.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      fcafa22d
    • Tomas Henzl's avatar
      hpsa: turn off interrupts when kdump starts · 52f70606
      Tomas Henzl authored
      commit 3b747298 upstream.
      
      Sometimes when the card is restarted it may cause -
      "irq 16: nobody cared (try booting with the "irqpoll" option)"
      which is likely because the card, after the hard reset
      finishes, pulls on the irq. Disabling the ints before or after
      the hpsa_kdump_hard_reset_controller fixes it.
      
      At this point we can't know in which state the card is,
      so using SA5_INTR_OFF + SA5_REPLY_INTR_MASK_OFFSET defines directly,
      instead of the function the driver provides, seems to be appropriate.
      Reviewed-by: Scott Teel <scott.teel@pmcs.com>
      Signed-off-by: Don Brace <don.brace@pmcs.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Vinson Lee <vlee@twopensource.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      52f70606
    • Tomas Henzl's avatar
      hpsa: add missing pci_set_master in kdump path · 55411793
      Tomas Henzl authored
      commit 859c75ab upstream.
      
      Add a call to pci_set_master(...)  missing in the previous
      patch "hpsa: refine the pci enable/disable handling".
      Found thanks to Rob Elliott.
      Signed-off-by: Tomas Henzl <thenzl@redhat.com>
      Reviewed-by: Robert Elliott <elliott@hp.com>
      Tested-by: Robert Elliott <elliott@hp.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Vinson Lee <vlee@twopensource.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      55411793
    • Tomas Henzl's avatar
      hpsa: refine the pci enable/disable handling · 7de2f4c1
      Tomas Henzl authored
      commit 132aa220 upstream.
      
      When a second (kdump) kernel starts and the hard reset method is used,
      the driver calls pci_disable_device without previously enabling it,
      so the kernel shows a warning -
      [   16.876248] WARNING: at drivers/pci/pci.c:1431 pci_disable_device+0x84/0x90()
      [   16.882686] Device hpsa
      disabling already-disabled device
      ...
      This patch fixes it. In addition, I tried to balance some other pairs
      of enable/disable device calls in the driver.
      Unfortunately I wasn't able to verify the functionality for the case of a sw reset,
      because of a lack of proper hw.
      Signed-off-by: Tomas Henzl <thenzl@redhat.com>
      Reviewed-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Vinson Lee <vlee@twopensource.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      7de2f4c1
    • Eli Cohen's avatar
      IB/core: Avoid leakage from kernel to user space · 54561a52
      Eli Cohen authored
      commit 377b5134 upstream.
      
      Clear the reserved field of struct ib_uverbs_async_event_desc which is
      copied to user space.
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      54561a52
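As a rough illustration of why a reserved field has to be cleared before a structure is handed to user space, the sketch below fills a descriptor whose memory previously held stale data. `struct event_desc` and `fill_event` are hypothetical stand-ins, not the real `struct ib_uverbs_async_event_desc` or the upstream fix, and the sketch zeroes the whole structure rather than only the one field.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor with a reserved member, for illustration only. */
struct event_desc {
    uint64_t element;
    uint32_t event_type;
    uint32_t reserved;
};

/*
 * Fill a descriptor that is about to be copied out to another domain.
 * Zeroing the whole structure first guarantees that the reserved member
 * (and any padding) cannot carry stale bytes from earlier use.
 */
static void fill_event(struct event_desc *d, uint64_t element,
                       uint32_t event_type)
{
    memset(d, 0, sizeof(*d));
    d->element = element;
    d->event_type = event_type;
}
```

Without the `memset()`, whatever bytes happened to be in `reserved` would be copied out verbatim.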
    • Ian Abbott's avatar
      spi: spidev: fix possible arithmetic overflow for multi-transfer message · 62211954
      Ian Abbott authored
      commit f20fbaad upstream.
      
      `spidev_message()` sums the lengths of the individual SPI transfers to
      determine the overall SPI message length.  It restricts the total
      length, returning an error if too long, but it does not check for
      arithmetic overflow.  For example, if the SPI message consisted of two
      transfers and the first has a length of 10 and the second has a length
      of (__u32)(-1), the total length would be seen as 9, even though the
      second transfer is actually very long.  If the second transfer specifies
      a null `rx_buf` and a non-null `tx_buf`, the `copy_from_user()` could
      overrun the spidev's pre-allocated tx buffer before it reaches an
      invalid user memory address.  Fix it by checking that neither the total
      nor the individual transfer lengths exceed the maximum allowed value.
      
      Thanks to Dan Carpenter for reporting the potential integer overflow.
      Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      [Ian Abbott: Note: original commit compares the lengths to INT_MAX instead
      of bufsiz due to changes in earlier commits.]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      62211954
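The wrap-around in the example above (10 plus `(__u32)(-1)` being seen as 9) and the shape of the check can be reproduced in a small userspace sketch. The function names and the `-EMSGSIZE` error code are assumptions made for this illustration, and `max_len` is assumed to be at most `INT32_MAX` so that two in-range values cannot wrap the 32-bit total.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Unchecked sum: a 32-bit total silently wraps around. */
static uint32_t total_len_unchecked(const uint32_t *lens, size_t n)
{
    uint32_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += lens[i];
    return total;
}

/*
 * Checked sum: fail as soon as either the running total or the
 * individual length exceeds max_len. Checking the individual length
 * catches the case where the addition has already wrapped.
 * Assumes max_len <= INT32_MAX.
 */
static int total_len_checked(const uint32_t *lens, size_t n,
                             uint32_t max_len, uint32_t *total_out)
{
    uint32_t total = 0;

    for (size_t i = 0; i < n; i++) {
        total += lens[i];
        if (total > max_len || lens[i] > max_len)
            return -EMSGSIZE;
    }
    *total_out = total;
    return 0;
}
```

With lengths of 10 and `UINT32_MAX`, the unchecked sum is 9 and would pass a total-only limit, while the checked variant rejects the message on the second transfer.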
    • Jim Snow's avatar
      sb_edac: Fix erroneous bytes->gigabytes conversion · a7434776
      Jim Snow authored
      commit 8c009100 upstream.
      Signed-off-by: Jim Snow <jim.snow@intel.com>
      Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: Vinson Lee <vlee@twopensource.com>
      [lizf: Backported to 3.4:
       - adjust context
       - use debugf0() instead of edac_dbg()]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      a7434776
    • Scott Wood's avatar
      powerpc/mpc85xx: Add ranges to etsec2 nodes · 5f842c0f
      Scott Wood authored
      commit bb344ca5 upstream.
      
      Commit 746c9e9f "of/base: Fix PowerPC address parsing hack" limited
      the applicability of the workaround whereby a missing ranges is treated
      as an empty ranges.  This workaround was hiding a bug in the etsec2
      device tree nodes, which have children with reg, but did not have
      ranges.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Reported-by: Alexander Graf <agraf@suse.de>
      Cc: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      5f842c0f
    • Ben Hutchings's avatar
      splice: Apply generic position and size checks to each write · bff9edd6
      Ben Hutchings authored
      3.2.67-rc1 review patch.  If anyone has any objections, please let me know.
      
      ------------------
      
      From: Ben Hutchings <ben@decadent.org.uk>
      
      We need to check the position and size of file writes against various
      limits, using generic_write_checks().  This was not being done for
      the splice write path.  It was fixed upstream by commit 8d020765
      ("->splice_write() via ->write_iter()") but we can't apply that.
      
      CVE-2014-7822
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      bff9edd6
    • Ben Greear's avatar
      Fix lockup related to stop_machine being stuck in __do_softirq. · b674b0ad
      Ben Greear authored
      commit 34376a50 upstream.
      
      The stop machine logic can lock up if all but one of the migration
      threads make it through the disable-irq step and the one remaining
      thread gets stuck in __do_softirq.  The reason __do_softirq can hang is
      that it has a bail-out based on jiffies timeout, but in the lockup case,
      jiffies itself is not incremented.
      
      To work around this, re-add the max_restart counter in __do_softirq and
      stop processing softirqs after 10 restarts.
      
      Thanks to Tejun Heo and Rusty Russell and others for helping me track
      this down.
      
      This was introduced in 3.9 by commit c10d7367 ("softirq: reduce
      latencies").
      
      It may be worth looking into ath9k to see if it has issues with its irq
      handler at a later date.
      
      The hang stack traces look something like this:
      
          ------------[ cut here ]------------
          WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
          Watchdog detected hard LOCKUP on cpu 2
          Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
          Pid: 23, comm: migration/2 Tainted: G         C   3.9.4+ #11
          Call Trace:
           <NMI>   warn_slowpath_common+0x85/0x9f
            warn_slowpath_fmt+0x46/0x48
            watchdog_overflow_callback+0x9c/0xa7
            __perf_event_overflow+0x137/0x1cb
            perf_event_overflow+0x14/0x16
            intel_pmu_handle_irq+0x2dc/0x359
            perf_event_nmi_handler+0x19/0x1b
            nmi_handle+0x7f/0xc2
            do_nmi+0xbc/0x304
            end_repeat_nmi+0x1e/0x2e
           <<EOE>>
            cpu_stopper_thread+0xae/0x162
            smpboot_thread_fn+0x258/0x260
            kthread+0xc7/0xcf
            ret_from_fork+0x7c/0xb0
          ---[ end trace 4947dfa9b0a4cec3 ]---
          BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
          Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
          irq event stamp: 835637905
          hardirqs last  enabled at (835637904): __do_softirq+0x9f/0x257
          hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
          softirqs last  enabled at (5654720): __do_softirq+0x1ff/0x257
          softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
          CPU 1
          Pid: 17, comm: migration/1 Tainted: G        WC   3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
          RIP: tasklet_hi_action+0xf0/0xf0
          Process migration/1
          Call Trace:
           <IRQ>
            __do_softirq+0x117/0x257
            irq_exit+0x5f/0xbb
            smp_apic_timer_interrupt+0x8a/0x98
            apic_timer_interrupt+0x72/0x80
           <EOI>
            printk+0x4d/0x4f
            stop_machine_cpu_stop+0x22c/0x274
            cpu_stopper_thread+0xae/0x162
            smpboot_thread_fn+0x258/0x260
            kthread+0xc7/0xcf
            ret_from_fork+0x7c/0xb0
      Signed-off-by: Ben Greear <greearb@candelatech.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Pekka Riikonen <priikone@iki.fi>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [xr: Backported to 3.4: Adjust context]
      Signed-off-by: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      b674b0ad
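The failure mode described above, a time-based bail-out that never triggers because the clock does not advance, can be modelled with a simulated clock. This is a simplified userspace sketch, not the kernel's `__do_softirq()`: `struct softirq_sim` and `run_softirq_loop` are hypothetical, the work is assumed to be re-raised on every pass, and a plain comparison stands in for the kernel's wrap-safe time helpers.

```c
#define MAX_SOFTIRQ_RESTART 10

/* Simulated state: a clock that may or may not advance between passes. */
struct softirq_sim {
    unsigned long now;    /* simulated jiffies */
    unsigned long tick;   /* clock advance per pass; 0 models a stuck clock */
    int passes;           /* number of processing passes executed */
};

/*
 * Run processing passes while work keeps being re-raised (assumed to be
 * always the case here). The loop restarts only while it is inside the
 * time budget AND the restart counter has not run out, so it terminates
 * even when the clock never moves.
 */
static int run_softirq_loop(struct softirq_sim *s, unsigned long time_budget)
{
    unsigned long end = s->now + time_budget;
    int max_restart = MAX_SOFTIRQ_RESTART;

    for (;;) {
        s->passes++;          /* one pass over the pending work */
        s->now += s->tick;    /* the clock advances, or does not */

        if (s->now < end && --max_restart)
            continue;         /* restart: budget and counter both allow it */
        break;                /* otherwise defer the remaining work */
    }
    return s->passes;
}
```

With `tick` set to 0 the time check alone would spin forever; the counter caps the loop at 10 passes.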
    • Eric Dumazet's avatar
      softirq: reduce latencies · 8c9c6ffb
      Eric Dumazet authored
      commit c10d7367 upstream.
      
      In various network workloads, __do_softirq() latencies can be up
      to 20 ms if HZ=1000, and 200 ms if HZ=100.
      
      This is because we iterate 10 times in the softirq dispatcher,
      and some actions can consume a lot of cycles.
      
      This patch changes the condition for falling back to ksoftirqd to:
      
      - A time limit of 2 ms.
      - need_resched() being set on current task
      
      When one of these conditions is met, we wake up ksoftirqd for further
      softirq processing if we still have pending softirqs.
      
      Using need_resched() as the only condition can trigger RCU stalls,
      as we can keep BH disabled for too long.
      
      I ran several benchmarks and got no significant difference in
      throughput, but a very significant reduction of latencies (one order
      of magnitude) :
      
      In following bench, 200 antagonist "netperf -t TCP_RR" are started in
      background, using all available cpus.
      
      Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC
      IRQ (hard+soft)
      
      Before patch :
      
      # netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
      RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
      to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
      RT_LATENCY=550110.424
      MIN_LATENCY=146858
      MAX_LATENCY=997109
      P50_LATENCY=305000
      P90_LATENCY=550000
      P99_LATENCY=710000
      MEAN_LATENCY=376989.12
      STDDEV_LATENCY=184046.92
      
      After patch :
      
      # netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
      RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
      to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
      RT_LATENCY=40545.492
      MIN_LATENCY=9834
      MAX_LATENCY=78366
      P50_LATENCY=33583
      P90_LATENCY=59000
      P99_LATENCY=69000
      MEAN_LATENCY=38364.67
      STDDEV_LATENCY=12865.26
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      [xr: Backported to 3.4: Adjust context]
      Signed-off-by: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      8c9c6ffb
    • Feng Tang's avatar
      x86/reboot: Fix a warning message triggered by stop_other_cpus() · b9909d50
      Feng Tang authored
      commit 55c844a4 upstream.
      
      When rebooting our 24 CPU Westmere servers with 3.4-rc6, we
      always see this warning msg:
      
      Restarting system.
      machine restart
      ------------[ cut here ]------------
      WARNING: at arch/x86/kernel/smp.c:125
      native_smp_send_reschedule+0x74/0xa7() Hardware name: X8DTN
      Modules linked in: igb [last unloaded: scsi_wait_scan]
      Pid: 1, comm: systemd-shutdow Not tainted 3.4.0-rc6+ #22
      Call Trace:
       <IRQ>  [<ffffffff8102a41f>] warn_slowpath_common+0x7e/0x96
       [<ffffffff8102a44c>] warn_slowpath_null+0x15/0x17
       [<ffffffff81018cf7>] native_smp_send_reschedule+0x74/0xa7
       [<ffffffff810561c1>] trigger_load_balance+0x279/0x2a6
       [<ffffffff81050112>] scheduler_tick+0xe0/0xe9
       [<ffffffff81036768>] update_process_times+0x60/0x70
       [<ffffffff81062f2f>] tick_sched_timer+0x68/0x92
       [<ffffffff81046e33>] __run_hrtimer+0xb3/0x13c
       [<ffffffff81062ec7>] ? tick_nohz_handler+0xd0/0xd0
       [<ffffffff810474f2>] hrtimer_interrupt+0xdb/0x198
       [<ffffffff81019a35>] smp_apic_timer_interrupt+0x81/0x94
       [<ffffffff81655187>] apic_timer_interrupt+0x67/0x70
       <EOI>  [<ffffffff8101a3c4>] ? default_send_IPI_mask_allbutself_phys+0xb4/0xc4
       [<ffffffff8101c680>] physflat_send_IPI_allbutself+0x12/0x14
       [<ffffffff81018db4>] native_nmi_stop_other_cpus+0x8a/0xd6
       [<ffffffff810188ba>] native_machine_shutdown+0x50/0x67
       [<ffffffff81018926>] machine_shutdown+0xa/0xc
       [<ffffffff8101897e>] native_machine_restart+0x20/0x32
       [<ffffffff810189b0>] machine_restart+0xa/0xc
       [<ffffffff8103b196>] kernel_restart+0x47/0x4c
       [<ffffffff8103b2e6>] sys_reboot+0x13e/0x17c
       [<ffffffff8164e436>] ? _raw_spin_unlock_bh+0x10/0x12
       [<ffffffff810fcac9>] ? bdi_queue_work+0xcf/0xd8
       [<ffffffff810fe82f>] ? __bdi_start_writeback+0xae/0xb7
       [<ffffffff810e0d64>] ? iterate_supers+0xa3/0xb7
       [<ffffffff816547a2>] system_call_fastpath+0x16/0x1b
      ---[ end trace 320af5cb1cb60c5b ]---
      
      The root cause seems to be that
      default_send_IPI_mask_allbutself_phys() takes quite some time (I
      measured it could be several ms) to complete sending NMIs to all
      the other 23 CPUs, and for a HZ=250/1000 system, the time is long
      enough for a timer interrupt to happen, which will in turn
      kick load balancing to a stopped CPU and cause this
      warning in native_smp_send_reschedule().
      
      So disabling the local irq before stop_other_cpus() can fix this
      problem (tested with 25 reboots, all ok), and it is fine as
      nobody should care about the timer interrupt at this reboot
      stage.
      
      The latest 3.4 kernel slightly changes this behavior by sending
      REBOOT_VECTOR first and only sending NMI_VECTOR if the REBOOT_VECTOR
      fails, and this patch is still needed to prevent the problem.
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120530231541.4c13433a@feng-i7
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Vinson Lee <vlee@twopensource.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      b9909d50
    • Dmitry M. Fedin's avatar
      ALSA: usb - Creative USB X-Fi Pro SB1095 volume knob support · 3f371d05
      Dmitry M. Fedin authored
      commit 3dc8523f upstream.
      
      Adds an entry for Creative USB X-Fi to the rc_config array in
      mixer_quirks.c to allow use of volume knob on the device.
      Adds support for newer X-Fi Pro card, known as "Model No. SB1095"
      with USB ID "041e:3237"
      Signed-off-by: Dmitry M. Fedin <dmitry.fedin@gmail.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      3f371d05
    • Al Viro's avatar
      ocfs2: _really_ sync the right range · dfd04b4f
      Al Viro authored
      commit 64b4e252 upstream.
      
      "ocfs2 syncs the wrong range" had been broken; prior to it the
      code was doing the wrong thing in case of O_APPEND, all right,
      but _after_ it we were syncing the wrong range in 100% cases.
      *ppos, aka iocb->ki_pos is incremented prior to that point,
      so we are always doing sync on the area _after_ the one we'd
      written to.
      
      Spotted by Joseph Qi <joseph.qi@huawei.com> back in January;
      unfortunately, I'd missed his mail back then ;-/
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      dfd04b4f
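The off-by-one-write described above comes down to simple offset arithmetic: once the file position has been advanced past the data, the written range ends just before it. The sketch below models that with hypothetical names (`struct byte_range`, `range_after_write_buggy`, `written_range`); it illustrates the symptom the message describes and is not the ocfs2 code.

```c
#include <stdint.h>

/* Inclusive byte range [start, end]. */
struct byte_range {
    int64_t start;
    int64_t end;
};

/*
 * Buggy variant: uses the position as the start of the range even
 * though it was already advanced past the written data, so the range
 * covers the area after the write.
 */
static struct byte_range range_after_write_buggy(int64_t pos_after,
                                                 int64_t written)
{
    return (struct byte_range){ pos_after, pos_after + written - 1 };
}

/* Correct variant: step back by the number of bytes written. */
static struct byte_range written_range(int64_t pos_after, int64_t written)
{
    return (struct byte_range){ pos_after - written, pos_after - 1 };
}
```

For a 100-byte write at offset 4096 the position ends up at 4196, so the data lives in 4096..4195 while the buggy variant would target 4196..4295.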
    • Bart Van Assche's avatar
      Defer processing of REQ_PREEMPT requests for blocked devices · 419d4c98
      Bart Van Assche authored
      commit bba0bdd7 upstream.
      
      SCSI transport drivers and SCSI LLDs block a SCSI device if the
      transport layer is not operational. This means that in this state
      no requests should be processed, even if the REQ_PREEMPT flag has
      been set. This patch prevents a rescan shortly after a cable
      pull from sporadically triggering the following kernel oops:
      
      BUG: unable to handle kernel paging request at ffffc9001a6bc084
      IP: [<ffffffffa04e08f2>] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib]
      Process rescan-scsi-bus (pid: 9241, threadinfo ffff88053484a000, task ffff880534aae100)
      Call Trace:
       [<ffffffffa0718135>] srp_post_send+0x65/0x70 [ib_srp]
       [<ffffffffa071b9df>] srp_queuecommand+0x1cf/0x3e0 [ib_srp]
       [<ffffffffa0001ff1>] scsi_dispatch_cmd+0x101/0x280 [scsi_mod]
       [<ffffffffa0009ad1>] scsi_request_fn+0x411/0x4d0 [scsi_mod]
       [<ffffffff81223b37>] __blk_run_queue+0x27/0x30
       [<ffffffff8122a8d2>] blk_execute_rq_nowait+0x82/0x110
       [<ffffffff8122a9c2>] blk_execute_rq+0x62/0xf0
       [<ffffffffa000b0e8>] scsi_execute+0xe8/0x190 [scsi_mod]
       [<ffffffffa000b2f3>] scsi_execute_req+0xa3/0x130 [scsi_mod]
       [<ffffffffa000c1aa>] scsi_probe_lun+0x17a/0x450 [scsi_mod]
       [<ffffffffa000ce86>] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod]
       [<ffffffffa000dc2f>] __scsi_scan_target+0xdf/0x1f0 [scsi_mod]
       [<ffffffffa000dfa3>] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod]
       [<ffffffffa000edfb>] scsi_scan+0xdb/0xe0 [scsi_mod]
       [<ffffffffa000ee13>] store_scan+0x13/0x20 [scsi_mod]
       [<ffffffff811c8d9b>] sysfs_write_file+0xcb/0x160
       [<ffffffff811589de>] vfs_write+0xce/0x140
       [<ffffffff81158b53>] sys_write+0x53/0xa0
       [<ffffffff81464592>] system_call_fastpath+0x16/0x1b
       [<00007f611c9d9300>] 0x7f611c9d92ff
      Reported-by: Max Gurtuvoy <maxg@mellanox.com>
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      419d4c98
    • John Soni Jose's avatar
      be2iscsi: Fix kernel panic when device initialization fails · 9796d87a
      John Soni Jose authored
      commit 2e7cee02 upstream.
      
      Kernel panic was happening as iscsi_host_remove() was called on
      a host which was not yet added.
      Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
      Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      9796d87a
    • Shachar Raindel's avatar
      IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic · ffaa96c7
      Shachar Raindel authored
      commit 8494057a upstream.
      
      Properly verify that the resulting page aligned end address is larger
      than both the start address and the length of the memory area requested.
      
      Both the start and length arguments for ib_umem_get are controlled by
      the user. A misbehaving user can provide values which will cause an
      integer overflow when calculating the page aligned end address.
      
      This overflow can also cause miscalculation of the number of pages
      mapped, and additional logic issues.
      
      Addresses: CVE-2014-8159
      Signed-off-by: Shachar Raindel <raindel@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      ffaa96c7
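A minimal sketch of the kind of check described above, assuming 64-bit addresses and a 4 KiB page size: both the raw end address and its page-aligned value are tested for wrap-around. `check_user_range`, `SKETCH_PAGE_SIZE` and `SKETCH_PAGE_ALIGN` are hypothetical names for this illustration and are not claimed to match the upstream patch.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096ull
#define SKETCH_PAGE_ALIGN(x) \
    (((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Validate a user-controlled (addr, size) pair before doing any page
 * arithmetic on it. Returns 0 if the page-aligned end address can be
 * computed without wrapping, or -EINVAL otherwise.
 */
static int check_user_range(uint64_t addr, uint64_t size)
{
    uint64_t end = addr + size;    /* unsigned, so wrap is well defined */

    if (end < addr)
        return -EINVAL;            /* addr + size wrapped around */
    if (SKETCH_PAGE_ALIGN(end) < end)
        return -EINVAL;            /* rounding up to a page wrapped */
    return 0;
}
```

The second test matters for ranges that end within the last page of the address space: the sum itself does not wrap, but rounding it up does.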
    • Johannes Berg's avatar
      mac80211: fix RX A-MPDU session reorder timer deletion · ffabd89c
      Johannes Berg authored
      commit 788211d8 upstream.
      
      There's an issue with the way the RX A-MPDU reorder timer is
      deleted that can cause a kernel crash like this:
      
       * tid_rx is removed - call_rcu(ieee80211_free_tid_rx)
       * station is destroyed
       * reorder timer fires before ieee80211_free_tid_rx() runs,
         accessing the station, thus potentially crashing due to
         the use-after-free
      
      The station deletion is protected by synchronize_net(), but
      that isn't enough -- ieee80211_free_tid_rx() need not have
      run when that returns (it deletes the timer.) We could use
      rcu_barrier() instead of synchronize_net(), but that's much
      more expensive.
      
      Instead, to fix this, add a field tracking that the session
      is being deleted. In this case, the only re-arming of the
      timer happens with the reorder spinlock held, so make that
      code not rearm it if the session is being deleted and also
      delete the timer after setting that field. This ensures the
      timer cannot fire after ___ieee80211_stop_rx_ba_session()
      returns, which fixes the problem.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      ffabd89c
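The "being deleted" flag described above can be sketched as a tiny state model. This is a single-threaded illustration with hypothetical names (`struct rx_session`, `maybe_rearm_timer`, `stop_session`); the real code does the equivalent under the reorder spinlock with a kernel timer, and neither the lock nor the timer is modelled here.

```c
#include <stdbool.h>

/* Hypothetical model of a session with a re-armable timer. */
struct rx_session {
    bool removed;       /* set once teardown has started */
    bool timer_armed;   /* stands in for a pending timer */
};

/*
 * Re-arm path: in the real code this runs with the reorder lock held.
 * It refuses to re-arm once the session is marked as removed.
 */
static void maybe_rearm_timer(struct rx_session *s)
{
    if (!s->removed)
        s->timer_armed = true;
}

/*
 * Teardown path: mark the session first, then cancel the timer. Any
 * later re-arm attempt sees the flag and leaves the timer off.
 */
static void stop_session(struct rx_session *s)
{
    s->removed = true;
    s->timer_armed = false;
}
```

The ordering in `stop_session` is the design point: setting the flag before cancelling means nothing can re-arm the timer once teardown returns.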