- 18 Sep, 2015 10 commits
-
-
Alex Deucher authored
commit fd99a094 upstream. Use the correct flags for atom. v2: handle DRM_MODE_FLAG_DBLCLK Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Baruch Siach authored
commit 838d030b upstream. The callback function signature has changed in commit a5818a8b (pinctrl: get_group_pins() const fixes) Fixes: a5818a8b ('pinctrl: get_group_pins() const fixes') Cc: Stephen Warren <swarren@nvidia.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Baruch Siach authored
commit b18104c0 upstream. This API has changed in commit 6e5e959d (pinctrl: API changes to support multiple states per device). Fixes: 6e5e959d ('pinctrl: API changes to support multiple states per device') Cc: Stephen Warren <swarren@nvidia.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Krzysztof Kozlowski authored
commit 1915a718 upstream. The return value of the power_supply_register() call was not checked, and even on error the probe() function returned 0. If registering failed, then during unbind the driver tried to unregister a power supply which was not actually registered. This could lead to memory corruption because power_supply_unregister() unconditionally cleans up the given power supply. Fix this by checking the return status of the power_supply_register() call. In case of failure, clean up the sysfs entries and fail the probe. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Fixes: 9be0fcb5 ("compal-laptop: add JHL90, battery & hwmon interface") Signed-off-by: Sebastian Reichel <sre@kernel.org> [lizf: Backported to 3.4: there's no "remove" label. Do cleanup inside if block] Signed-off-by: Zefan Li <lizefan@huawei.com>
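As a rough illustration of the pattern described above (the sysfs group name and surrounding probe code are hypothetical, not the driver's real identifiers):

	/* sketch: fail probe() when power_supply_register() fails, undoing
	 * the sysfs group that was created earlier in probe */
	err = sysfs_create_group(&pdev->dev.kobj, &example_attr_group);
	if (err)
		return err;

	err = power_supply_register(&pdev->dev, &data->psy);
	if (err) {
		sysfs_remove_group(&pdev->dev.kobj, &example_attr_group);
		return err;	/* probe must not report success */
	}
	return 0;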
-
Baruch Siach authored
commit 939417bd upstream. struct pinctrl_desc does not contain the maxpin member since commit 0d2006bb (pinctrl: remove unnecessary max pin number). Fixes: 0d2006bb ('pinctrl: remove unnecessary max pin number') Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Felipe Balbi authored
commit e3c93e1a upstream. As per Mentor Graphics' documentation, we should always handle TX endpoints before RX endpoints. This patch fixes that error while also updating some hard-to-read comments which were scattered around musb_interrupt(). This patch should be backported as far back as possible since this error has been in the driver since its conception. Signed-off-by: Felipe Balbi <balbi@ti.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ekaterina Tumanova authored
commit b75f4c9a upstream. s390 documentation requires words 0 and 10-15 to be reserved and stored as zeros. As we fill out all other fields, we can memset the full structure. Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Sabrina Dubroca authored
commit 08e83316 upstream. There is a race condition between e1000_change_mtu's cleanups and netpoll, when we change the MTU across jumbo size:

Changing MTU frees all the rx buffers:
  e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings -> e1000_clean_rx_ring

Then, close to the end of e1000_change_mtu:
  pr_info -> ... -> netpoll_poll_dev -> e1000_clean -> e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag

And when we come back to do the rest of the MTU change:
  e1000_up -> e1000_configure -> e1000_configure_rx -> e1000_alloc_jumbo_rx_buffers

alloc_jumbo finds the buffers already != NULL, since data (shared with page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage, or at least not what is expected when in jumbo state. This results in an unusable adapter (packets don't get through), and a NULL pointer dereference on the next call to e1000_clean_rx_ring (other mtu change, link down, shutdown):

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330
[...]
Call Trace:
[<ffffffff81195445>] put_page+0x55/0x60
[<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
[<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
[<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
[<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
[<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
[<ffffffff81647050>] dev_set_mtu+0xa0/0x140
[<ffffffff81664218>] do_setlink+0x218/0xac0
[<ffffffff814459e9>] ? nla_parse+0xb9/0x120
[<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
[<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
[<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
[<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260

By setting the allocator to a dummy version, netpoll can't mess up our rx buffers. The allocator is set back to a sane value in e1000_configure_rx. Fixes: edbbb3ca ("e1000: implement jumbo receive with partial descriptors") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
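A minimal sketch of the approach (the dummy-allocator name and hook field follow the description above; surrounding code is abbreviated and assumed):

	/* no-op allocator installed for the duration of the MTU change, so a
	 * netpoll-triggered e1000_clean() cannot repopulate the rx rings */
	static void e1000_alloc_dummy_rx_buffers(struct e1000_adapter *adapter,
						 struct e1000_rx_ring *rx_ring,
						 int cleaned_count)
	{
	}

	/* in e1000_change_mtu(), before tearing the rings down: */
	adapter->alloc_rx_buf = e1000_alloc_dummy_rx_buffers;
	/* e1000_configure_rx(), run from e1000_up(), later restores the real
	 * jumbo or standard allocator */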
-
K. Y. Srinivasan authored
commit 40384e4b upstream. Correctly rollback state if the failure occurs after we have handed over the ownership of the buffer to the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Alexander Ploumistos authored
commit 2eeff0b4 upstream. Add 04f2:aff1 to ath3k.c supported devices list and btusb.c blacklist, so that the device can load the ath3k firmware and re-enumerate itself as an AR3011 device.

T: Bus=05 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=04f2 ProdID=aff1 Rev= 0.01
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms

Signed-off-by: Alexander Ploumistos <alexpl@fedoraproject.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
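The two table entries implied above would look roughly like this (table context abbreviated; the btusb flag is an assumption based on how other AR3011-class devices are handled):

	/* drivers/bluetooth/ath3k.c: let ath3k claim the device and load firmware */
	{ USB_DEVICE(0x04F2, 0xAFF1) },

	/* drivers/bluetooth/btusb.c blacklist: keep btusb away until the
	 * device re-enumerates as an AR3011 */
	{ USB_DEVICE(0x04f2, 0xaff1), .driver_info = BTUSB_IGNORE },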
-
- 14 Sep, 2015 1 commit
-
-
Weilong Chen authored
There's a check for ip6_null_entry, but it's not enough if the config CONFIG_IPV6_MULTIPLE_TABLES is selected. Blackhole or prohibited entries should also be ignored. This patch is only needed for kernels before v3.6: commit b94f1c09 made this path use icmpv6_notify() instead of rt6_redirect(), and rt6_redirect() has since been deleted. The oops is as follows:

[exception RIP: do_raw_write_lock+12]
RIP: ffffffff8122c42c  RSP: ffff880666e45820  RFLAGS: 00010282
RAX: ffff8801207bffd8  RBX: 0000000000000018  RCX: 0000000000000000
RDX: 0000000000000000  RSI: ffff880666e45898  RDI: 0000000000000018
RBP: ffff880666e45830  R8: 000000000000001e  R9: 0000000006000000
R10: ffff88011796b8a0  R11: 0000000000000004  R12: ffff88010391ed00
R13: 0000000000000000  R14: ffff880666e45898  R15: ffff88011796b890
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
[ffff880666e45838] _raw_write_lock_bh at ffffffff81450b39
[ffff880666e45858] __ip6_ins_rt at ffffffff813ed8c1
[ffff880666e45888] ip6_ins_rt at ffffffff813eef58
[ffff880666e458b8] rt6_redirect at ffffffff813f0b84
[ffff880666e45958] ndisc_rcv at ffffffff813f95d8
[ffff880666e45a08] icmpv6_rcv at ffffffff814000e8
[ffff880666e45ae8] ip6_input_finish at ffffffff813e43bb
[ffff880666e45b38] ip6_input at ffffffff813e4b08
[ffff880666e45b68] ipv6_rcv at ffffffff813e4969
[ffff880666e45bc8] __netif_receive_skb at ffffffff8135158a
[ffff880666e45c38] dev_gro_receive at ffffffff81351cb0
[ffff880666e45c78] napi_gro_receive at ffffffff81351fc5
[ffff880666e45cb8] tg3_rx at ffffffffa0bfb354 [tg]
[ffff880666e45d88] tg3_poll_work at ffffffffa0c07857 [tg]
[ffff880666e45e18] tg3_poll_msix at ffffffffa0c07d1b [tg]
[ffff880666e45e68] net_rx_action at ffffffff81352219
[ffff880666e45ec8] __do_softirq at ffffffff8103e5a1
[ffff880666e45f38] call_softirq at ffffffff81459c4c
[ffff880666e45f50] do_softirq at ffffffff8100413d
[ffff880666e45f80] do_IRQ at ffffffff81003cce

This happened when ip6_route_redirect() found an rt that was set to blackhole. Such an rt has a NULL rt6i_table pointer, which __ip6_ins_rt() dereferences when trying to lock rt6i_table->tb6_lock, causing the BUG: "BUG: unable to handle kernel NULL pointer" Signed-off-by: Weilong Chen <chenweilong@huawei.com>
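A sketch of the kind of check being described, assuming the standard per-netns special entries (the exact placement in the pre-3.6 rt6_redirect() path differs):

	if (rt == net->ipv6.ip6_null_entry)
		goto out;
#ifdef CONFIG_IPV6_MULTIPLE_TABLES
	/* these special entries also carry no rt6i_table to lock */
	if (rt == net->ipv6.ip6_prohibit_entry ||
	    rt == net->ipv6.ip6_blk_hole_entry)
		goto out;
#endif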
-
- 19 Jun, 2015 29 commits
-
-
Zefan Li authored
-
Ian Campbell authored
commit 31a41898 upstream. When we come to tear things down in netback_remove() and generate the uevent it is possible that the xenstore directory has already been removed (details below). In such cases netback_uevent() won't be able to read the hotplug script and will write a xenstore error node. A recent change to the hypervisor exposed this race such that we now sometimes lose it (where apparently we didn't ever before). Instead read the hotplug script configuration during setup and use it for the lifetime of the backend device. The apparently more obvious fix of moving the transition to state=Closed in netback_remove() to after the uevent does not work because it is possible that we are already in state=Closed (in reaction to the guest having disconnected as it shut down). Being already in Closed means the toolstack is at liberty to start tearing down the xenstore directories. In principle it might be possible to arrange to unregister the device sooner (e.g. on transition to Closing) such that xenstore would still be there but this state machine is fragile and prone to anger... A modern Xen system only relies on the hotplug uevent for driver domains; when the backend is in the same domain as the toolstack it will run the necessary setup/teardown directly in the correct sequence wrt xenstore changes. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Joonsoo Kim authored
commit 43d77867 upstream. Current implementation of unfreeze_partials() is so complicated, but benefit from it is insignificant. In addition many code in do {} while loop have a bad influence to a fail rate of cmpxchg_double_slab. Under current implementation which test status of cpu partial slab and acquire list_lock in do {} while loop, we don't need to acquire a list_lock and gain a little benefit when front of the cpu partial slab is to be discarded, but this is a rare case. In case that add_partial is performed and cmpxchg_double_slab is failed, remove_partial should be called case by case. I think that these are disadvantages of current implementation, so I do refactoring unfreeze_partials(). Minimizing code in do {} while loop introduce a reduced fail rate of cmpxchg_double_slab. Below is output of 'slabinfo -r kmalloc-256' when './perf stat -r 33 hackbench 50 process 4000 > /dev/null' is done.

** before **
Cmpxchg_double Looping
------------------------
Locked Cmpxchg Double redos   182685
Unlocked Cmpxchg Double redos 0

** after **
Cmpxchg_double Looping
------------------------
Locked Cmpxchg Double redos   177995
Unlocked Cmpxchg Double redos 1

We can see cmpxchg_double_slab fail rate is improved slightly. Below is output of './perf stat -r 30 hackbench 50 process 4000 > /dev/null'.

** before **
Performance counter stats for './hackbench 50 process 4000' (30 runs):
108517.190463 task-clock # 7.926 CPUs utilized ( +- 0.24% )
2,919,550 context-switches # 0.027 M/sec ( +- 3.07% )
100,774 CPU-migrations # 0.929 K/sec ( +- 4.72% )
124,201 page-faults # 0.001 M/sec ( +- 0.15% )
401,500,234,387 cycles # 3.700 GHz ( +- 0.24% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
250,576,913,354 instructions # 0.62 insns per cycle ( +- 0.13% )
45,934,956,860 branches # 423.297 M/sec ( +- 0.14% )
188,219,787 branch-misses # 0.41% of all branches ( +- 0.56% )
13.691837307 seconds time elapsed ( +- 0.24% )

** after **
Performance counter stats for './hackbench 50 process 4000' (30 runs):
107784.479767 task-clock # 7.928 CPUs utilized ( +- 0.22% )
2,834,781 context-switches # 0.026 M/sec ( +- 2.33% )
93,083 CPU-migrations # 0.864 K/sec ( +- 3.45% )
123,967 page-faults # 0.001 M/sec ( +- 0.15% )
398,781,421,836 cycles # 3.700 GHz ( +- 0.22% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
250,189,160,419 instructions # 0.63 insns per cycle ( +- 0.09% )
45,855,370,128 branches # 425.436 M/sec ( +- 0.10% )
169,881,248 branch-misses # 0.37% of all branches ( +- 0.43% )
13.596272341 seconds time elapsed ( +- 0.22% )

No regression is found, but rather we can see slightly better result. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ben Hutchings authored
commit 8014bcc8 upstream. The variable for the 'permissive' module parameter used to be static but was recently changed to be extern. This puts it in the kernel global namespace if the driver is built-in, so its name should begin with a prefix identifying the driver. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: af6fc858 ("xen-pciback: limit guest control of command register") Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Tejun Heo authored
commit 464d1387 upstream. mm/page-writeback.c has several places where 1 is added to the divisor to prevent division by zero exceptions; however, if the original divisor is equivalent to -1, adding 1 leads to division by zero. There are three places where +1 is used for this purpose - one in pos_ratio_polynom() and two in bdi_position_ratio(). The second one in bdi_position_ratio() actually triggered div-by-zero oops on a machine running a 3.10 kernel. The divisor is

  x_intercept - bdi_setpoint + 1 == span + 1

span is confirmed to be (u32)-1. It isn't clear how it ended up that way, but it could be from write bandwidth calculation underflow fixed by c72efb65 ("writeback: fix possible underflow in write bandwidth calculation"). At any rate, +1 isn't a proper protection against div-by-zero. This patch converts all +1 protections to |1. Note that bdi_update_dirty_ratelimit() was already using |1 before this patch. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com> [lizf: Backported to 3.4: drop other two changes as there's only one such statement in 3.4] Signed-off-by: Zefan Li <lizefan@huawei.com>
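The arithmetic point is small enough to show directly; this is an illustration with assumed variable names, not the exact expression from bdi_position_ratio():

	u32 span = x_intercept - bdi_setpoint;	/* confirmed to be (u32)-1 here */

	/* span + 1 wraps to 0 and faults; span | 1 can never be zero and
	 * perturbs the divisor by at most 1 */
	do_div(pos_ratio, span | 1);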
-
Yann Droneaud authored
commit 66578b0b upstream. In a call to ib_umem_get(), if the address is 0x0 and the size is already page aligned, the check added in commit 8494057a ("IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic") will refuse to register a memory region that could otherwise be valid (provided the vm.mmap_min_addr sysctl and mmap_low_allowed SELinux knobs allow userspace to map something at address 0x0). This patch allows such registrations again: ib_umem_get() should probably not care about the base address, provided it can be pinned with get_user_pages(). There are two possible overflows, in (addr + size) and in PAGE_ALIGN(addr + size); this patch keeps ensuring neither of them happens while allowing memory at address 0x0 to be pinned. Anyway, the case of size equal to 0 is no longer (partially) handled, as 0-length memory regions are disallowed by an earlier check. Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com Cc: Shachar Raindel <raindel@mellanox.com> Cc: Jack Morgenstein <jackm@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
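The overflow test being described amounts to something like this sketch (names follow ib_umem_get()'s parameters; treat it as an approximation of the upstream check):

	/* reject only genuine wrap-around of addr + size (and of its
	 * page-aligned form); a 0x0 base with a page-aligned size passes */
	if (((addr + size) < addr) ||
	    PAGE_ALIGN(addr + size) < (addr + size))
		return ERR_PTR(-EINVAL);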
-
Quentin Casasnovas authored
commit 0d3bba02 upstream. Phil and I found out a problem with commit: 7e860a6e ("cdc-acm: add sanity checks") It added some sanity checks to ignore potential garbage in CDC headers but also introduced a potential infinite loop. This can happen at the first loop iteration (elength = 0 in that case) if the descriptor isn't a DT_CS_INTERFACE or later if 'buffer[0]' is zero. It should also be noted that the wrong length was being added to 'buffer' in case 'buffer[1]' was not a DT_CS_INTERFACE descriptor, since elength was assigned after that check in the loop. A specially crafted USB device could be used to trigger this infinite loop. Fixes: 7e860a6e ("cdc-acm: add sanity checks") Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com> Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> CC: Oliver Neukum <oneukum@suse.de> CC: Adam Lee <adam8157@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
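A sketch of the loop-progress rule the fix enforces (the real parser in cdc-acm handles more descriptor cases; the dev_err() text is illustrative):

	while (buflen > 0) {
		elength = buffer[0];
		if (!elength) {
			dev_err(&intf->dev, "skipping garbage byte\n");
			elength = 1;	/* always advance, even past junk */
			goto next_desc;
		}
		if (buffer[1] != USB_DT_CS_INTERFACE) {
			dev_err(&intf->dev, "skipping garbage\n");
			goto next_desc;	/* skip by the descriptor's own length */
		}
		/* ... handle the CS_INTERFACE subtypes ... */
next_desc:
		buflen -= elength;
		buffer += elength;
	}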
-
Al Viro authored
commit 7bd88377 upstream. return the value instead, and have path_init() do the assignment. Broken by "vfs: Fix absolute RCU path walk failures due to uninitialized seq number", which was Cc-stable with 2.6.38+ as destination. This one should go where it went. To avoid dummy value returned in case when root is already set (it would do no harm, actually, since the only caller that doesn't ignore the return value is guaranteed to have nd->root *not* set, but it's more obvious that way), lift the check into callers. And do the same to set_root(), to keep them in sync. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Ian Jackson <ian.jackson@eu.citrix.com> [lizf: the previous backport of this upstream commit is buggy. fix it] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Yinghai Lu authored
commit fc279850 upstream. These interfaces: pcibios_resource_to_bus(struct pci_dev *dev, *bus_region, *resource) pcibios_bus_to_resource(struct pci_dev *dev, *resource, *bus_region) took a pci_dev, but they really depend only on the pci_bus. And we want to use them in resource allocation paths where we have the bus but not a device, so this patch converts them to take the pci_bus instead of the pci_dev: pcibios_resource_to_bus(struct pci_bus *bus, *bus_region, *resource) pcibios_bus_to_resource(struct pci_bus *bus, *resource, *bus_region) In fact, with standard PCI-PCI bridges, they only depend on the host bridge, because that's the only place address translation occurs, but we aren't going that far yet. [bhelgaas: changelog] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Dirk Behme <dirk.behme@gmail.com> [lizf: Backported to 3.4: - make changes to pci_host_bridge() instead of find_pci_root_bus() - adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Konrad Rzeszutek Wilk authored
commit a6dfa128 upstream. A huge number of NIC drivers use the DMA API, however if compiled under 32-bit a very important part of the DMA API can be omitted, leading to the drivers not working at all (especially if used with 'swiotlb=force iommu=soft'). As Prashant Sreedharan explains it: "the driver [tg3] uses DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma "mapping" and dma_unmap_addr() to get the "mapping" value. On most of the platforms this is a no-op, but ... with "iommu=soft and swiotlb=force" this house keeping is required, ... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the DMA address." As such enable this even when using 32-bit kernels. Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Prashant Sreedharan <prashant@broadcom.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Chan <mchan@broadcom.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: cascardo@linux.vnet.ibm.com Cc: david.vrabel@citrix.com Cc: sanjeevb@broadcom.com Cc: siva.kallam@broadcom.com Cc: vyasevich@gmail.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Ben Hutchings <ben@decadent.org.uk> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Kirill A. Shutemov authored
commit 575bf1d0 upstream. perl.h from new Perl release doesn't like -Wundef and -Wswitch-default:

/usr/lib/perl5/core_perl/CORE/perl.h:548:5: error: "SILENT_NO_TAINT_SUPPORT" is not defined [-Werror=undef]
 #if SILENT_NO_TAINT_SUPPORT && !defined(NO_TAINT_SUPPORT)
     ^
/usr/lib/perl5/core_perl/CORE/perl.h:556:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
 #if NO_TAINT_SUPPORT
     ^
In file included from /usr/lib/perl5/core_perl/CORE/perl.h:3471:0,
                 from util/scripting-engines/trace-event-perl.c:30:
/usr/lib/perl5/core_perl/CORE/sv.h:1455:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
 #if NO_TAINT_SUPPORT
     ^
In file included from /usr/lib/perl5/core_perl/CORE/perl.h:3472:0,
                 from util/scripting-engines/trace-event-perl.c:30:
/usr/lib/perl5/core_perl/CORE/regexp.h:436:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
 #if NO_TAINT_SUPPORT
     ^
In file included from /usr/lib/perl5/core_perl/CORE/hv.h:592:0,
                 from /usr/lib/perl5/core_perl/CORE/perl.h:3480,
                 from util/scripting-engines/trace-event-perl.c:30:
/usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_siphash_2_4’:
/usr/lib/perl5/core_perl/CORE/hv_func.h:222:3: error: switch missing default case [-Werror=switch-default]
   switch( left )
   ^
/usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_superfast’:
/usr/lib/perl5/core_perl/CORE/hv_func.h:274:5: error: switch missing default case [-Werror=switch-default]
     switch (rem) { \
     ^
/usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_murmur3’:
/usr/lib/perl5/core_perl/CORE/hv_func.h:398:5: error: switch missing default case [-Werror=switch-default]
     switch(bytes_in_carry) { /* how many bytes in carry */
     ^

Let's disable the warnings for code which uses perl.h. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1372063394-20126-1-git-send-email-kirill@shutemov.name Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Vinson Lee <vlee@twopensource.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
hujianyang authored
commit 9aa272b4 upstream. Running mtd-utils/tests/ubi-tests/io_basic.c could cause soft lockup or watchdog reset. It is because *updatevol* will perform ubi_check_volume() after updating finish and this function will full scan the updated lebs if the volume is initialized as STATIC_VOLUME. This patch adds *cond_resched()* in the loop of lebs scan to avoid soft lockup. Helped by Richard Weinberger <richard@nod.at>

[ 2158.067096] INFO: rcu_sched self-detected stall on CPU { 1} (t=2101 jiffies g=1606 c=1605 q=56)
[ 2158.172867] CPU: 1 PID: 2073 Comm: io_basic Tainted: G O 3.10.53 #21
[ 2158.172898] [<c000f624>] (unwind_backtrace+0x0/0x120) from [<c000c294>] (show_stack+0x10/0x14)
[ 2158.172918] [<c000c294>] (show_stack+0x10/0x14) from [<c008ac3c>] (rcu_check_callbacks+0x1c0/0x660)
[ 2158.172936] [<c008ac3c>] (rcu_check_callbacks+0x1c0/0x660) from [<c002b480>] (update_process_times+0x38/0x64)
[ 2158.172953] [<c002b480>] (update_process_times+0x38/0x64) from [<c005ff38>] (tick_sched_handle+0x54/0x60)
[ 2158.172966] [<c005ff38>] (tick_sched_handle+0x54/0x60) from [<c00601ac>] (tick_sched_timer+0x44/0x74)
[ 2158.172978] [<c00601ac>] (tick_sched_timer+0x44/0x74) from [<c003f348>] (__run_hrtimer+0xc8/0x1b8)
[ 2158.172992] [<c003f348>] (__run_hrtimer+0xc8/0x1b8) from [<c003fd9c>] (hrtimer_interrupt+0x128/0x2a4)
[ 2158.173007] [<c003fd9c>] (hrtimer_interrupt+0x128/0x2a4) from [<c0246f1c>] (arch_timer_handler_virt+0x28/0x30)
[ 2158.173022] [<c0246f1c>] (arch_timer_handler_virt+0x28/0x30) from [<c0086214>] (handle_percpu_devid_irq+0x9c/0x124)
[ 2158.173036] [<c0086214>] (handle_percpu_devid_irq+0x9c/0x124) from [<c0082bd8>] (generic_handle_irq+0x20/0x30)
[ 2158.173049] [<c0082bd8>] (generic_handle_irq+0x20/0x30) from [<c000969c>] (handle_IRQ+0x64/0x8c)
[ 2158.173060] [<c000969c>] (handle_IRQ+0x64/0x8c) from [<c0008544>] (gic_handle_irq+0x3c/0x60)
[ 2158.173074] [<c0008544>] (gic_handle_irq+0x3c/0x60) from [<c02f0f80>] (__irq_svc+0x40/0x50)
[ 2158.173083] Exception stack(0xc4043c98 to 0xc4043ce0)
[ 2158.173092] 3c80: c4043ce4 00000019
[ 2158.173102] 3ca0: 1f8a865f c050ad10 1f8a864c 00000031 c04b5970 0003ebce 00000000 f3550000
[ 2158.173113] 3cc0: bf00bc68 00000800 0003ebce c4043ce0 c0186d14 c0186cb8 80000013 ffffffff
[ 2158.173130] [<c02f0f80>] (__irq_svc+0x40/0x50) from [<c0186cb8>] (read_current_timer+0x4/0x38)
[ 2158.173145] [<c0186cb8>] (read_current_timer+0x4/0x38) from [<1f8a865f>] (0x1f8a865f)
[ 2183.927097] BUG: soft lockup - CPU#1 stuck for 22s! [io_basic:2073]
[ 2184.002229] Modules linked in: nandflash(O) [last unloaded: nandflash]

Signed-off-by: Wang Kai <morgan.wang@huawei.com> Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Zefan Li <lizefan@huawei.com>
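The change itself is tiny; a sketch of the scan loop in ubi_check_volume() with the added rescheduling point (loop body abbreviated):

	for (i = 0; i < vol->used_ebs; i++) {
		cond_resched();	/* yield between LEB reads so a long static
				 * volume check cannot hog the CPU */
		/* ... read and CRC-check LEB i ... */
	}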
-
Sasha Levin authored
commit e53d77eb upstream. There wasn't any check of the size passed from userspace before trying to allocate the memory required. This meant that userspace might request more space than allowed, triggering an OOM. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Dan Carpenter authored
commit 3b05ac38 upstream. The app_tcp_pkt_out() function expects "*diff" to be set and ends up using uninitialized data if CONFIG_IP_VS_IPV6 is turned on. The same issue is there in app_tcp_pkt_in(). Thanks to Julian Anastasov for noticing that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
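A sketch of the fix's shape in the ftp helper functions (placement assumed from the description): initialize the output parameter before the IPv6 early return.

	/* in ip_vs_ftp_out() and ip_vs_ftp_in(): set the output parameter
	 * before the CONFIG_IP_VS_IPV6 early exit */
	*diff = 0;

#ifdef CONFIG_IP_VS_IPV6
	/* the helper doesn't support IPv6 yet; behave as a no-op */
	if (cp->af == AF_INET6)
		return 1;
#endif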
-
Florian Westphal authored
commit 330966e5 upstream. skb_gso_segment has three possible return values:
1. a pointer to the first segmented skb
2. an errno value (IS_ERR())
3. NULL. This can happen when GSO is used for header verification.
However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL and would oops when NULL is returned. Note that these call sites should never actually see such a NULL return value; all callers mask out the GSO bits in the feature argument. However, there have been issues with some protocol handlers erroneously not respecting the specified feature mask in some cases. It is preferable to get 'have to turn off hw offloading, else slow' reports rather than 'kernel crashes'. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> [lizf: Backported to 3.4: drop some hunks as there are fewer skb_gso_segment() users in 3.4] Signed-off-by: Zefan Li <lizefan@huawei.com>
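The caller-side pattern being moved to can be sketched as follows (context is illustrative, not a specific call site):

	struct sk_buff *segs;

	segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
	/* three outcomes: a segment list, an ERR_PTR(), or NULL when GSO was
	 * only used for header verification -- only the first may be walked */
	if (IS_ERR_OR_NULL(segs))
		goto out;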
-
Pravin B Shelar authored
commit 92e5dfc3 upstream. Fix return check typo. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Jann Horn authored
commit 8b01fc86 upstream. This prevents a race between chown() and execve(), where chowning a setuid-user binary to root would momentarily make the binary setuid root. This patch was mostly written by Linus Torvalds. Signed-off-by: Jann Horn <jann@thejh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [lizf: Backported to 3.4: - adjust context - remove task_no_new_priv and user namespace stuff - open-code file_inode() - s/READ_ONCE/ACCESS_ONCE] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Tomas Henzl authored
commit 03741d95 upstream. There is a potential memory leak in hpsa_kdump_hard_reset_controller. Reviewed-by: Don Brace <don.brace@pmcs.com> Reviewed-by: Scott Teel <scott.teel@pmcs.com> Signed-off-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Don Brace <don.brace@pmcs.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Vinson Lee <vlee@twopensource.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Tomas Henzl authored
commit 3b747298 upstream. Sometimes when the card is restarted it may cause - "irq 16: nobody cared (try booting with the "irqpoll" option)" - which is likely because the card pulls on the irq after the hard reset finishes. Disabling the interrupts before or after hpsa_kdump_hard_reset_controller fixes it. At this point we can't know in which state the card is, so using the SA5_INTR_OFF + SA5_REPLY_INTR_MASK_OFFSET defines directly, instead of the function the driver provides, seems appropriate. Reviewed-by: Scott Teel <scott.teel@pmcs.com> Signed-off-by: Don Brace <don.brace@pmcs.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Vinson Lee <vlee@twopensource.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Tomas Henzl authored
commit 859c75ab upstream. Add a call to pci_set_master(...) missing in the previous patch "hpsa: refine the pci enable/disable handling". Found thanks to Rob Elliot. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Robert Elliott <elliott@hp.com> Tested-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Vinson Lee <vlee@twopensource.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
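A sketch of the resulting init ordering (the warning text is assumed, not quoted from the driver):

	rc = pci_enable_device(pdev);
	if (rc) {
		dev_warn(&pdev->dev, "failed to enable device\n");
		return -ENODEV;
	}
	/* pci_disable_device() during the reset path cleared bus mastering;
	 * turn it back on before the controller is used for DMA */
	pci_set_master(pdev);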
-
Tomas Henzl authored
commit 132aa220 upstream. When a second (kdump) kernel starts and the hard reset method is used, the driver calls pci_disable_device without previously enabling it, so the kernel shows a warning -

[ 16.876248] WARNING: at drivers/pci/pci.c:1431 pci_disable_device+0x84/0x90()
[ 16.882686] Device hpsa disabling already-disabled device
...

This patch fixes it. In addition, I tried to balance some other pairs of enable/disable device calls in the driver. Unfortunately I wasn't able to verify the functionality for the case of a sw reset, because of a lack of proper hw. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Vinson Lee <vlee@twopensource.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Eli Cohen authored
commit 377b5134 upstream. Clear the reserved field of struct ib_uverbs_async_event_desc which is copied to user space. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ian Abbott authored
commit f20fbaad upstream. `spidev_message()` sums the lengths of the individual SPI transfers to determine the overall SPI message length. It restricts the total length, returning an error if too long, but it does not check for arithmetic overflow. For example, if the SPI message consisted of two transfers and the first has a length of 10 and the second has a length of (__u32)(-1), the total length would be seen as 9, even though the second transfer is actually very long. If the second transfer specifies a null `rx_buf` and a non-null `tx_buf`, the `copy_from_user()` could overrun the spidev's pre-allocated tx buffer before it reaches an invalid user memory address. Fix it by checking that neither the total nor the individual transfer lengths exceed the maximum allowed value. Thanks to Dan Carpenter for reporting the potential integer overflow. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Mark Brown <broonie@kernel.org> [Ian Abbott: Note: original commit compares the lengths to INT_MAX instead of bufsiz due to changes in earlier commits.] Signed-off-by: Zefan Li <lizefan@huawei.com>
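Sketched against the description above (this backport compares against bufsiz; variable names follow spidev_message()):

	total += k_tmp->len;
	/* check each transfer as well as the running sum, so one huge
	 * u_tmp->len cannot wrap 'total' back under the limit */
	if (total > bufsiz || k_tmp->len > bufsiz) {
		status = -EMSGSIZE;
		goto done;
	}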
-
Jim Snow authored
commit 8c009100 upstream. Signed-off-by: Jim Snow <jim.snow@intel.com> Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Vinson Lee <vlee@twopensource.com> [lizf: Backported to 3.4: - adjust context - use debugf0() instead of edac_dbg()] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Scott Wood authored
commit bb344ca5 upstream. Commit 746c9e9f "of/base: Fix PowerPC address parsing hack" limited the applicability of the workaround whereby a missing ranges is treated as an empty ranges. This workaround was hiding a bug in the etsec2 device tree nodes, which have children with reg, but did not have ranges. Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Alexander Graf <agraf@suse.de> Cc: Scott Wood <scottwood@freescale.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ben Hutchings authored
3.2.67-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Ben Hutchings <ben@decadent.org.uk> We need to check the position and size of file writes against various limits, using generic_write_checks(). This was not being done for the splice write path. It was fixed upstream by commit 8d020765 ("->splice_write() via ->write_iter()") but we can't apply that. CVE-2014-7822 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ben Greear authored
commit 34376a50 upstream. The stop machine logic can lock up if all but one of the migration threads make it through the disable-irq step and the one remaining thread gets stuck in __do_softirq. The reason __do_softirq can hang is that it has a bail-out based on jiffies timeout, but in the lockup case, jiffies itself is not incremented. To work around this, re-add the max_restart counter in __do_softirq and stop processing irqs after 10 restarts. Thanks to Tejun Heo and Rusty Russell and others for helping me track this down. This was introduced in 3.9 by commit c10d7367 ("softirq: reduce latencies"). It may be worth looking into ath9k to see if it has issues with its irq handler at a later date. The hang stack traces look something like this:

------------[ cut here ]------------
WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
Watchdog detected hard LOCKUP on cpu 2
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11
Call Trace:
 <NMI> warn_slowpath_common+0x85/0x9f
 warn_slowpath_fmt+0x46/0x48
 watchdog_overflow_callback+0x9c/0xa7
 __perf_event_overflow+0x137/0x1cb
 perf_event_overflow+0x14/0x16
 intel_pmu_handle_irq+0x2dc/0x359
 perf_event_nmi_handler+0x19/0x1b
 nmi_handle+0x7f/0xc2
 do_nmi+0xbc/0x304
 end_repeat_nmi+0x1e/0x2e
 <<EOE>> cpu_stopper_thread+0xae/0x162
 smpboot_thread_fn+0x258/0x260
 kthread+0xc7/0xcf
 ret_from_fork+0x7c/0xb0
---[ end trace 4947dfa9b0a4cec3 ]---

BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
irq event stamp: 835637905
hardirqs last enabled at (835637904): __do_softirq+0x9f/0x257
hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
softirqs last enabled at (5654720): __do_softirq+0x1ff/0x257
softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
CPU 1
Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
RIP: tasklet_hi_action+0xf0/0xf0
Process migration/1
Call Trace:
 <IRQ> __do_softirq+0x117/0x257
 irq_exit+0x5f/0xbb
 smp_apic_timer_interrupt+0x8a/0x98
 apic_timer_interrupt+0x72/0x80
 <EOI> printk+0x4d/0x4f
 stop_machine_cpu_stop+0x22c/0x274
 cpu_stopper_thread+0xae/0x162
 smpboot_thread_fn+0x258/0x260
 kthread+0xc7/0xcf
 ret_from_fork+0x7c/0xb0

Signed-off-by: Ben Greear <greearb@candelatech.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Pekka Riikonen <priikone@iki.fi> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [xr: Backported to 3.4: Adjust context] Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
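The resulting bail-out in __do_softirq() combines the 2 ms / need_resched() conditions from the latency patch below with the restored iteration cap; roughly (constants as described, surrounding code abbreviated):

#define MAX_SOFTIRQ_TIME	msecs_to_jiffies(2)
#define MAX_SOFTIRQ_RESTART	10

	/* after one pass over the pending softirq vectors: */
	pending = local_softirq_pending();
	if (pending) {
		if (time_before(jiffies, end) && !need_resched() &&
		    --max_restart)
			goto restart;	/* keep processing in-line */

		wakeup_softirqd();	/* otherwise hand off to ksoftirqd */
	}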
-
Eric Dumazet authored
commit c10d7367 upstream. In various network workloads, __do_softirq() latencies can be up to 20 ms if HZ=1000, and 200 ms if HZ=100. This is because we iterate 10 times in the softirq dispatcher, and some actions can consume a lot of cycles. This patch changes the fallback-to-ksoftirqd condition to:
- A time limit of 2 ms.
- need_resched() being set on the current task
When one of these conditions is met, we wake up ksoftirqd for further softirq processing if we still have pending softirqs. Using need_resched() as the only condition can trigger RCU stalls, as we can keep BH disabled for too long. I ran several benchmarks and got no significant difference in throughput, but a very significant reduction of latencies (one order of magnitude): In the following bench, 200 antagonist "netperf -t TCP_RR" are started in background, using all available cpus. Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC IRQ (hard+soft)

Before patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=550110.424
MIN_LATENCY=146858
MAX_LATENCY=997109
P50_LATENCY=305000
P90_LATENCY=550000
P99_LATENCY=710000
MEAN_LATENCY=376989.12
STDDEV_LATENCY=184046.92

After patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=40545.492
MIN_LATENCY=9834
MAX_LATENCY=78366
P50_LATENCY=33583
P90_LATENCY=59000
P99_LATENCY=69000
MEAN_LATENCY=38364.67
STDDEV_LATENCY=12865.26

Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Miller <davem@davemloft.net> Cc: Tom Herbert <therbert@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> [xr: Backported to 3.4: Adjust context] Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Feng Tang authored
commit 55c844a4 upstream. When rebooting our 24 CPU Westmere servers with 3.4-rc6, we always see this warning msg:

Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x74/0xa7()
Hardware name: X8DTN
Modules linked in: igb [last unloaded: scsi_wait_scan]
Pid: 1, comm: systemd-shutdow Not tainted 3.4.0-rc6+ #22
Call Trace:
 <IRQ> [<ffffffff8102a41f>] warn_slowpath_common+0x7e/0x96
 [<ffffffff8102a44c>] warn_slowpath_null+0x15/0x17
 [<ffffffff81018cf7>] native_smp_send_reschedule+0x74/0xa7
 [<ffffffff810561c1>] trigger_load_balance+0x279/0x2a6
 [<ffffffff81050112>] scheduler_tick+0xe0/0xe9
 [<ffffffff81036768>] update_process_times+0x60/0x70
 [<ffffffff81062f2f>] tick_sched_timer+0x68/0x92
 [<ffffffff81046e33>] __run_hrtimer+0xb3/0x13c
 [<ffffffff81062ec7>] ? tick_nohz_handler+0xd0/0xd0
 [<ffffffff810474f2>] hrtimer_interrupt+0xdb/0x198
 [<ffffffff81019a35>] smp_apic_timer_interrupt+0x81/0x94
 [<ffffffff81655187>] apic_timer_interrupt+0x67/0x70
 <EOI> [<ffffffff8101a3c4>] ? default_send_IPI_mask_allbutself_phys+0xb4/0xc4
 [<ffffffff8101c680>] physflat_send_IPI_allbutself+0x12/0x14
 [<ffffffff81018db4>] native_nmi_stop_other_cpus+0x8a/0xd6
 [<ffffffff810188ba>] native_machine_shutdown+0x50/0x67
 [<ffffffff81018926>] machine_shutdown+0xa/0xc
 [<ffffffff8101897e>] native_machine_restart+0x20/0x32
 [<ffffffff810189b0>] machine_restart+0xa/0xc
 [<ffffffff8103b196>] kernel_restart+0x47/0x4c
 [<ffffffff8103b2e6>] sys_reboot+0x13e/0x17c
 [<ffffffff8164e436>] ? _raw_spin_unlock_bh+0x10/0x12
 [<ffffffff810fcac9>] ? bdi_queue_work+0xcf/0xd8
 [<ffffffff810fe82f>] ? __bdi_start_writeback+0xae/0xb7
 [<ffffffff810e0d64>] ? iterate_supers+0xa3/0xb7
 [<ffffffff816547a2>] system_call_fastpath+0x16/0x1b
---[ end trace 320af5cb1cb60c5b ]---

The root cause seems to be that default_send_IPI_mask_allbutself_phys() takes quite some time (I measured it could be several ms) to complete sending NMIs to all the other 23 CPUs, and for a HZ=250/1000 system, the time is long enough for a timer interrupt to happen, which will in turn trigger a kick of load balance to a stopped CPU and cause this warning in native_smp_send_reschedule(). So disabling the local irq before stop_other_cpus() can fix this problem (tested 25 times reboot ok), and it is fine as there should be nobody caring about the timer interrupt at this stage of reboot. The latest 3.4 kernel slightly changes this behavior by sending REBOOT_VECTOR first and only sending NMI_VECTOR if the REBOOT_VECTOR fails, and this patch is still needed to prevent the problem. Signed-off-by: Feng Tang <feng.tang@intel.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120530231541.4c13433a@feng-i7 Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Vinson Lee <vlee@twopensource.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
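The fix itself is essentially two lines in the shutdown path; a sketch:

	/* mask local interrupts on the rebooting CPU before NMI-stopping the
	 * others, so no timer tick can try to kick load balancing toward a
	 * CPU that has already been stopped */
	local_irq_disable();
	stop_other_cpus();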
-