- 21 Nov, 2017 9 commits
-
-
Lu Baolu authored
commit c67678ec upstream. The DbC register set defines an interface for system software to specify the vendor id and product id for the debug device. These two values will be presented by the debug device in its device descriptor idVendor and idProduct fields. The current used product ID is a place holder. We now have a valid one. The description strings are changed accordingly. This patch should be back-ported to kernels as old as v4.12, that contain the commit aeb9dd1d ("usb/early: Add driver for xhci debug capability"). Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
raveendra padasalagi authored
commit f0e2ce58 upstream. Add support to explicity ACK mailbox message because after sending message we can know the send status via error attribute of brcm_message. This is needed to support "txdone_ack" supported in mailbox controller driver. Fixes: 9d12ba86 ("crypto: brcm - Add Broadcom SPU driver") Signed-off-by: Raveendra Padasalagi <raveendra.padasalagi@broadcom.com> Reviewed-by: Anup Patel <anup.patel@broadcom.com> Reviewed-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit ccd9888f upstream. The "qat-dh" DH implementation assumes that 'key' and 'g' can be copied into a buffer with size 'p_size'. However it was never checked that that was actually the case, which most likely allowed users to cause a buffer underflow via KEYCTL_DH_COMPUTE. Fix this by updating crypto_dh_decode_key() to verify this precondition for all DH implementations. Fixes: c9839143 ("crypto: qat - Add DH support") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit 199512b1 upstream. If 'p' is 0 for the software Diffie-Hellman implementation, then dh_max_size() returns 0. In the case of KEYCTL_DH_COMPUTE, this causes ZERO_SIZE_PTR to be passed to sg_init_one(), which with CONFIG_DEBUG_SG=y triggers the 'BUG_ON(!virt_addr_valid(buf));' in sg_set_buf(). Fix this by making crypto_dh_decode_key() reject 0 for 'p'. p=0 makes no sense for any DH implementation because 'p' is supposed to be a prime number. Moreover, 'mod 0' is not mathematically defined. Bug report: kernel BUG at ./include/linux/scatterlist.h:140! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 27112 Comm: syz-executor2 Not tainted 4.14.0-rc7-00010-gf5dbb5d0ce32-dirty #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014 task: ffff88006caac0c0 task.stack: ffff88006c7c8000 RIP: 0010:sg_set_buf include/linux/scatterlist.h:140 [inline] RIP: 0010:sg_init_one+0x1b3/0x240 lib/scatterlist.c:156 RSP: 0018:ffff88006c7cfb08 EFLAGS: 00010216 RAX: 0000000000010000 RBX: ffff88006c7cfe30 RCX: 00000000000064ee RDX: ffffffff81cf64c3 RSI: ffffc90000d72000 RDI: ffffffff92e937e0 RBP: ffff88006c7cfb30 R08: ffffed000d8f9fab R09: ffff88006c7cfd30 R10: 0000000000000005 R11: ffffed000d8f9faa R12: ffff88006c7cfd30 R13: 0000000000000000 R14: 0000000000000010 R15: ffff88006c7cfc50 FS: 00007fce190fa700(0000) GS:ffff88003ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fffc6b33db8 CR3: 000000003cf64000 CR4: 00000000000006f0 Call Trace: __keyctl_dh_compute+0xa95/0x19b0 security/keys/dh.c:360 keyctl_dh_compute+0xac/0x100 security/keys/dh.c:434 SYSC_keyctl security/keys/keyctl.c:1745 [inline] SyS_keyctl+0x72/0x2c0 security/keys/keyctl.c:1641 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4585c9 RSP: 002b:00007fce190f9bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000fa RAX: ffffffffffffffda RBX: 0000000000738020 RCX: 00000000004585c9 RDX: 000000002000d000 RSI: 0000000020000ff4 RDI: 0000000000000017 RBP: 0000000000000046 R08: 0000000020008000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff6e610cde R13: 00007fff6e610cdf R14: 00007fce190fa700 R15: 0000000000000000 Code: 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 33 5b 45 89 6c 24 14 41 5c 41 5d 41 5e 41 5f 5d c3 e8 fd 8f 68 ff <0f> 0b e8 f6 8f 68 ff 0f 0b e8 ef 8f 68 ff 0f 0b e8 e8 8f 68 ff 20 RIP: sg_set_buf include/linux/scatterlist.h:140 [inline] RSP: ffff88006c7cfb08 RIP: sg_init_one+0x1b3/0x240 lib/scatterlist.c:156 RSP: ffff88006c7cfb08 Fixes: 802c7f1c ("crypto: dh - Add DH software implementation") Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit 12d41a02 upstream. When setting the secret with the software Diffie-Hellman implementation, if allocating 'g' failed (e.g. if it was longer than MAX_EXTERN_MPI_BITS), then 'p' was freed twice: once immediately, and once later when the crypto_kpp tfm was destroyed. Fix it by using dh_free_ctx() (renamed to dh_clear_ctx()) in the error paths, as that correctly sets the pointers to NULL. KASAN report: MPI: mpi too large (32760 bits) ================================================================== BUG: KASAN: use-after-free in mpi_free+0x131/0x170 Read of size 4 at addr ffff88006c7cdf90 by task reproduce_doubl/367 CPU: 1 PID: 367 Comm: reproduce_doubl Not tainted 4.14.0-rc7-00040-g05298abde6fe #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0xb3/0x10b ? mpi_free+0x131/0x170 print_address_description+0x79/0x2a0 ? mpi_free+0x131/0x170 kasan_report+0x236/0x340 ? akcipher_register_instance+0x90/0x90 __asan_report_load4_noabort+0x14/0x20 mpi_free+0x131/0x170 ? akcipher_register_instance+0x90/0x90 dh_exit_tfm+0x3d/0x140 crypto_kpp_exit_tfm+0x52/0x70 crypto_destroy_tfm+0xb3/0x250 __keyctl_dh_compute+0x640/0xe90 ? kasan_slab_free+0x12f/0x180 ? dh_data_from_key+0x240/0x240 ? key_create_or_update+0x1ee/0xb20 ? key_instantiate_and_link+0x440/0x440 ? lock_contended+0xee0/0xee0 ? kfree+0xcf/0x210 ? SyS_add_key+0x268/0x340 keyctl_dh_compute+0xb3/0xf1 ? __keyctl_dh_compute+0xe90/0xe90 ? SyS_add_key+0x26d/0x340 ? entry_SYSCALL_64_fastpath+0x5/0xbe ? trace_hardirqs_on_caller+0x3f4/0x560 SyS_keyctl+0x72/0x2c0 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x43ccf9 RSP: 002b:00007ffeeec96158 EFLAGS: 00000246 ORIG_RAX: 00000000000000fa RAX: ffffffffffffffda RBX: 000000000248b9b9 RCX: 000000000043ccf9 RDX: 00007ffeeec96170 RSI: 00007ffeeec96160 RDI: 0000000000000017 RBP: 0000000000000046 R08: 0000000000000000 R09: 0248b9b9143dc936 R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000409670 R14: 0000000000409700 R15: 0000000000000000 Allocated by task 367: save_stack_trace+0x16/0x20 kasan_kmalloc+0xeb/0x180 kmem_cache_alloc_trace+0x114/0x300 mpi_alloc+0x4b/0x230 mpi_read_raw_data+0xbe/0x360 dh_set_secret+0x1dc/0x460 __keyctl_dh_compute+0x623/0xe90 keyctl_dh_compute+0xb3/0xf1 SyS_keyctl+0x72/0x2c0 entry_SYSCALL_64_fastpath+0x1f/0xbe Freed by task 367: save_stack_trace+0x16/0x20 kasan_slab_free+0xab/0x180 kfree+0xb5/0x210 mpi_free+0xcb/0x170 dh_set_secret+0x2d7/0x460 __keyctl_dh_compute+0x623/0xe90 keyctl_dh_compute+0xb3/0xf1 SyS_keyctl+0x72/0x2c0 entry_SYSCALL_64_fastpath+0x1f/0xbe Fixes: 802c7f1c ("crypto: dh - Add DH software implementation") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrey Konovalov authored
commit eb0c1994 upstream. dvb_detach(arg) calls symbol_put_addr(arg), where arg should be a pointer to a function. Right now a pointer to state->dib7000p_ops is passed to dvb_detach(), which causes a BUG() in symbol_put_addr() as discovered by syzkaller. Pass state->dib7000p_ops.set_wbd_ref instead. ------------[ cut here ]------------ kernel BUG at kernel/module.c:1081! invalid opcode: 0000 [#1] PREEMPT SMP KASAN Modules linked in: CPU: 1 PID: 1151 Comm: kworker/1:1 Tainted: G W 4.14.0-rc1-42251-gebb2c243 #224 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: usb_hub_wq hub_event task: ffff88006a336300 task.stack: ffff88006a7c8000 RIP: 0010:symbol_put_addr+0x54/0x60 kernel/module.c:1083 RSP: 0018:ffff88006a7ce210 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880062a8d190 RCX: 0000000000000000 RDX: dffffc0000000020 RSI: ffffffff85876d60 RDI: ffff880062a8d190 RBP: ffff88006a7ce218 R08: 1ffff1000d4f9c12 R09: 1ffff1000d4f9ae4 R10: 1ffff1000d4f9bed R11: 0000000000000000 R12: ffff880062a8d180 R13: 00000000ffffffed R14: ffff880062a8d190 R15: ffff88006947c000 FS: 0000000000000000(0000) GS:ffff88006c900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6416532000 CR3: 00000000632f5000 CR4: 00000000000006e0 Call Trace: stk7070p_frontend_attach+0x515/0x610 drivers/media/usb/dvb-usb/dib0700_devices.c:1013 dvb_usb_adapter_frontend_init+0x32b/0x660 drivers/media/usb/dvb-usb/dvb-usb-dvb.c:286 dvb_usb_adapter_init drivers/media/usb/dvb-usb/dvb-usb-init.c:86 dvb_usb_init drivers/media/usb/dvb-usb/dvb-usb-init.c:162 dvb_usb_device_init+0xf70/0x17f0 drivers/media/usb/dvb-usb/dvb-usb-init.c:277 dib0700_probe+0x171/0x5a0 drivers/media/usb/dvb-usb/dib0700_core.c:886 usb_probe_interface+0x35d/0x8e0 drivers/usb/core/driver.c:361 really_probe drivers/base/dd.c:413 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653 bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523 device_add+0xd0b/0x1660 drivers/base/core.c:1835 usb_set_configuration+0x104e/0x1870 drivers/usb/core/message.c:1932 generic_probe+0x73/0xe0 drivers/usb/core/generic.c:174 usb_probe_device+0xaf/0xe0 drivers/usb/core/driver.c:266 really_probe drivers/base/dd.c:413 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653 bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523 device_add+0xd0b/0x1660 drivers/base/core.c:1835 usb_new_device+0x7b8/0x1020 drivers/usb/core/hub.c:2457 hub_port_connect drivers/usb/core/hub.c:4903 hub_port_connect_change drivers/usb/core/hub.c:5009 port_event drivers/usb/core/hub.c:5115 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119 worker_thread+0x221/0x1850 kernel/workqueue.c:2253 kthread+0x3a1/0x470 kernel/kthread.c:231 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431 Code: ff ff 48 85 c0 74 24 48 89 c7 e8 48 ea ff ff bf 01 00 00 00 e8 de 20 e3 ff 65 8b 05 b7 2f c2 7e 85 c0 75 c9 e8 f9 0b c1 ff eb c2 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 b8 00 00 RIP: symbol_put_addr+0x54/0x60 RSP: ffff88006a7ce210 ---[ end trace b75b357739e7e116 ]--- Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arvind Yadav authored
commit 58fd55e8 upstream. It seems that the return value of usb_ifnum_to_if() can be NULL and needs to be checked. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Adam Wallis authored
commit a9df21e3 upstream. Commit adfa543e ("dmatest: don't use set_freezable_with_signal()") introduced a bug (that is in fact documented by the patch commit text) that leaves behind a dangling pointer. Since the done_wait structure is allocated on the stack, future invocations to the DMATEST can produce undesirable results (e.g., corrupted spinlocks). Ideally, this would be cleaned up in the thread handler, but at the very least, the kernel is left in a very precarious scenario that can lead to some long debug sessions when the crash comes later. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=197605Signed-off-by: Adam Wallis <awallis@codeaurora.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Qiuxu Zhuo authored
commit 15cc3ae0 upstream. Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3) server (DELL PowerEdge 730xd): EDAC sbridge: Some needed devices are missing EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0 EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0 EDAC sbridge: Couldn't find mci handler EDAC sbridge: Couldn't find mci handler EDAC sbridge: Failed to register device with error -19. The refactored sb_edac driver creates the IMC1 (the 2nd memory controller) if any IMC1 device is present. In this case only HA1_TA of IMC1 was present, but the driver expected to find HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure. The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi Zhang inserted one DIMM per channel for each CPU, and did random error address injection test with this patch: 4024 addresses fell in TOLM hole area 12715 addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 12774 addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 12798 addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 12913 addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 12674 addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 12686 addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 12882 addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 12934 addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 106400 addresses were injected totally. The test result shows that all the 4 channels belong to IMC0 per CPU, so the server really only has one IMC per CPU. In the 1st page of chapter 2 in datasheet [2], it also says 'E5-2600 v3' implements either one or two IMCs. For CPUs with one IMC, IMC1 is not used and should be ignored. Thus, do not create a second memory controller if the key HA1 is absent. [1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz [2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdfReported-and-tested-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170913104214.7325-1-qiuxu.zhuo@intel.com [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 12 Nov, 2017 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: "A set of small fixes: - make KGDB work again which got broken by the conversion of WARN() to #UD. The WARN fixup needs to run before the notifier callchain, otherwise KGDB tries to handle it and crashes. - disable KASAN in the ORC unwinder to prevent false positive KASAN warnings - prevent default mapping above 47bit when 5 level page tables are enabled - make the delay calibration optimization work correctly, which had the conditionals the wrong way around and was operating on data which was not yet updated. - remove the bogus X86_TRAP_BP trap init from the default IDT init table, which broke 32bit int3 handling by overwriting the correct int3 setup. - replace this_cpu* with boot_cpu_data access in the preemptible oprofile init code" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/debug: Handle warnings before the notifier chain, to fix KGDB crash x86/mm: Fix ELF_ET_DYN_BASE for 5-level paging x86/idt: Remove X86_TRAP_BP initialization in idt_setup_traps() x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context x86/unwind: Disable KASAN checking in the ORC unwinder x86/smpboot: Make optimization of delay calibration work correctly
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf tool fixes from Thomas Gleixner: "A small set of fixes for perf tool: - synchronize the i915 drm header to avoid the 'out of date' warning - make sure that perf trace cleans up its temporary files on exit - unbreak the build with newer flex versions - add missing braces in the eBPF parsing rules" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tooling/headers: Sync the tools/include/uapi/drm/i915_drm.h UAPI header perf trace: Call machine__exit() at exit perf tools: Fix eBPF event specification parsing perf tools: Add "reject" option for parse-events.l
-
- 11 Nov, 2017 8 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Use after free in vlan, from Cong Wang. 2) Handle NAPI poll with a zero budget properly in mlx5 driver, from Saeed Mahameed. 3) If DMA mapping fails in mlx5 driver, NULL out page, from Inbar Karmy. 4) Handle overrun in RX FIFO of sun4i CAN driver, from Gerhard Bertelsmann. 5) Missing return in mdb and vlan prepare phase of DSA layer, from Vivien Didelot. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: vlan: fix a use-after-free in vlan_device_event() net: dsa: return after vlan prepare phase net: dsa: return after mdb prepare phase can: ifi: Fix transmitter delay calculation tcp: fix tcp_fastretrans_alert warning tcp: gso: avoid refcount_t warning from tcp_gso_segment() can: peak: Add support for new PCIe/M2 CAN FD interfaces can: sun4i: handle overrun in RX FIFO can: c_can: don't indicate triple sampling support for D_CAN net/mlx5e: Increase Striding RQ minimum size limit to 4 multi-packet WQEs net/mlx5e: Set page to null in case dma mapping fails net/mlx5e: Fix napi poll with zero budget net/mlx5: Cancel health poll before sending panic teardown command net/mlx5: Loop over temp list to release delay events rds: ib: Fix NULL pointer dereference in debug code
-
David S. Miller authored
Merge tag 'linux-can-fixes-for-4.14-20171110' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2017-11-10 this is a pull request for net/master. The first patch by Richard Schütz for the c_can driver removes the false indication to support triple sampling for d_can. Gerhard Bertelsmann's patch for the sun4i driver improves the RX overrun handling. The patch by Stephane Grosjean for the peak_canfd driver adds the PCI ids for various new PCIe/M2 interfaces. Marek Vasut's patch for the ifi driver fix transmitter delay calculation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2017-11-08 The following series includes some fixes for mlx5 core and etherent driver. Sorry for the late submission but as you can see i have some very critical fixes below that i would like them merged into this RC. Please pull and let me know if there is any problem. For -stable: ('net/mlx5e: Set page to null in case dma mapping fails') kernels >= 4.13 ('net/mlx5: FPGA, return -EINVAL if size is zero') kernels >= 4.13 ('net/mlx5: Cancel health poll before sending panic teardown command') kernels >= 4.13 V1->V2: - Fix Reviewed-by tag of the 2nd patch. - Drop the FPGA 0 size fix, it needs some more change log info. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
After refcnt reaches zero, vlan_vid_del() could free dev->vlan_info via RCU: RCU_INIT_POINTER(dev->vlan_info, NULL); call_rcu(&vlan_info->rcu, vlan_info_rcu_free); However, the pointer 'grp' still points to that memory since it is set before vlan_vid_del(): vlan_info = rtnl_dereference(dev->vlan_info); if (!vlan_info) goto out; grp = &vlan_info->grp; Depends on when that RCU callback is scheduled, we could trigger a use-after-free in vlan_group_for_each_dev() right following this vlan_vid_del(). Fix it by moving vlan_vid_del() before setting grp. This is also symmetric to the vlan_vid_add() we call in vlan_device_event(). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Fixes: efc73f4b ("net: Fix memory leak - vlan_info struct") Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Girish Moodalbail <girish.moodalbail@oracle.com> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ingo Molnar authored
Last minute upstream update to one of the UAPI headers - sync it with tooling, to address this warning: Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h' Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf tooling fixes from Arnaldo Carvalho de Melo. Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Vivien Didelot authored
The current code does not return after successfully preparing the VLAN addition on every ports member of a it. Fix this. Fixes: 1ca4aa9c ("net: dsa: check VLAN capability of every switch") Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
The current code does not return after successfully preparing the MDB addition on every ports member of a multicast group. Fix this. Fixes: a1a6b7ea ("net: dsa: add cross-chip multicast support") Reported-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Nov, 2017 20 commits
-
-
git://github.com/ceph/ceph-clientLinus Torvalds authored
Pull ceph gix from Ilya Dryomov: "Memory allocation flags fix, marked for stable" * tag 'ceph-for-4.14-rc9' of git://github.com/ceph/ceph-client: rbd: use GFP_NOIO for parent stat and data requests
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
Pull input layer updates from Dmitry Torokhov: - a new ACPI ID for Elan touchpad found in yet another Ideapad model - Synaptics RMI4 will allow binding to controllers reporting SMB version 3 (note that we are not adding any new ACPI IDs to the Synaptics PS/2 drover so unless user explicitly enables intertouch support there is no user-visible change) - a fixup to TSC 2004/5 touchscreen driver to mark input devices as "direct" to help userspace identify the type of device they are dealing with * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics-rmi4 - RMI4 can also use SMBUS version 3 Input: tsc200x-core - set INPUT_PROP_DIRECT Input: elan_i2c - add ELAN060C to the ACPI table
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fix from Radim Krčmář: "Fix PPC HV host crash that can occur as a result of resizing the guest hashed page table" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates
-
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mipsLinus Torvalds authored
Pull MIPS fixes from James Hogan: "A final few MIPS fixes for 4.14: - fix BMIPS NULL pointer dereference (4.7) - fix AR7 early GPIO init allocation failure (3.19) - fix dead serial output on certain AR7 platforms (2.6.35)" * tag 'mips_fixes_4.14_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips: MIPS: AR7: Ensure that serial ports are properly set up MIPS: AR7: Defer registration of GPIO MIPS: BMIPS: Fix missing cbr address
-
Maciej W. Rozycki authored
Following my recent transition from Imagination Technologies to the=20 reincarnated MIPS company add a .mailmap mapping for my work address, so that `scripts/get_maintainer.pl' gets it right for past commits. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
This reverts commit 941f5f0f. Sadly, it turns out that we really can't just do the cross-CPU IPI to all CPU's to get their proper frequencies, because it's much too expensive on systems with lots of cores. So we'll have to revert this for now, and revisit it using a smarter model (probably doing one system-wide IPI at open time, and doing all the frequency calculations in parallel). Reported-by: WANG Chao <chao.wang@ucloud.cn> Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Last few patches to wrap up. Two i915 fixes that are on their way to stable, one vmware black screen bug, and one const patch that I was going to drop, but it was clearly a pretty safe one liner" * tag 'drm-fixes-for-v4.14-rc9' of git://people.freedesktop.org/~airlied/linux: drm/i915: Deconstruct struct sgt_dma initialiser drm/i915: Reject unknown syncobj flags drm/vmwgfx: Fix Ubuntu 17.10 Wayland black screen issue drm/vmwgfx: constify vmw_fence_ops
-
Marek Vasut authored
The CANFD transmitter delay calculation formula was updated in the latest software drop from IFI and improves the behavior of the IFI CANFD core during bitrate switching. Use the new formula to improve stability of the CANFD operation. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Markus Marb <markus@marb.org> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Yuchung Cheng authored
This patch fixes the cause of an WARNING indicatng TCP has pending retransmission in Open state in tcp_fastretrans_alert(). The root cause is a bad interaction between path mtu probing, if enabled, and the RACK loss detection. Upong receiving a SACK above the sequence of the MTU probing packet, RACK could mark the probe packet lost in tcp_fastretrans_alert(), prior to calling tcp_simple_retransmit(). tcp_simple_retransmit() only enters Loss state if it newly marks the probe packet lost. If the probe packet is already identified as lost by RACK, the sender remains in Open state with some packets marked lost and retransmitted. Then the next SACK would trigger the warning. The likely scenario is that the probe packet was lost due to its size or network congestion. The actual impact of this warning is small by potentially entering fast recovery an ACK later. The simple fix is always entering recovery (Loss) state if some packet is marked lost during path MTU probing. Fixes: a0370b3f ("tcp: enable RACK loss detection to trigger recovery") Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Reported-by: Roman Gushchin <guro@fb.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
When a GSO skb of truesize O is segmented into 2 new skbs of truesize N1 and N2, we want to transfer socket ownership to the new fresh skbs. In order to avoid expensive atomic operations on a cache line subject to cache bouncing, we replace the sequence : refcount_add(N1, &sk->sk_wmem_alloc); refcount_add(N2, &sk->sk_wmem_alloc); // repeated by number of segments refcount_sub(O, &sk->sk_wmem_alloc); by a single refcount_add(sum_of(N) - O, &sk->sk_wmem_alloc); Problem is : In some pathological cases, sum(N) - O might be a negative number, and syzkaller bot was apparently able to trigger this trace [1] atomic_t was ok with this construct, but we need to take care of the negative delta with refcount_t [1] refcount_t: saturated; leaking memory. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 8404 at lib/refcount.c:77 refcount_add_not_zero+0x198/0x200 lib/refcount.c:77 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 8404 Comm: syz-executor2 Not tainted 4.14.0-rc5-mm1+ #20 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1c4/0x1e0 kernel/panic.c:546 report_bug+0x211/0x2d0 lib/bug.c:183 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177 do_trap_no_signal arch/x86/kernel/traps.c:211 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:260 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905 RIP: 0010:refcount_add_not_zero+0x198/0x200 lib/refcount.c:77 RSP: 0018:ffff8801c606e3a0 EFLAGS: 00010282 RAX: 0000000000000026 RBX: 0000000000001401 RCX: 0000000000000000 RDX: 0000000000000026 RSI: ffffc900036fc000 RDI: ffffed0038c0dc68 RBP: ffff8801c606e430 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8801d97f5eba R11: 0000000000000000 R12: ffff8801d5acf73c R13: 1ffff10038c0dc75 R14: 00000000ffffffff R15: 00000000fffff72f refcount_add+0x1b/0x60 lib/refcount.c:101 tcp_gso_segment+0x10d0/0x16b0 net/ipv4/tcp_offload.c:155 tcp4_gso_segment+0xd4/0x310 net/ipv4/tcp_offload.c:51 inet_gso_segment+0x60c/0x11c0 net/ipv4/af_inet.c:1271 skb_mac_gso_segment+0x33f/0x660 net/core/dev.c:2749 __skb_gso_segment+0x35f/0x7f0 net/core/dev.c:2821 skb_gso_segment include/linux/netdevice.h:3971 [inline] validate_xmit_skb+0x4ba/0xb20 net/core/dev.c:3074 __dev_queue_xmit+0xe49/0x2070 net/core/dev.c:3497 dev_queue_xmit+0x17/0x20 net/core/dev.c:3538 neigh_hh_output include/net/neighbour.h:471 [inline] neigh_output include/net/neighbour.h:479 [inline] ip_finish_output2+0xece/0x1460 net/ipv4/ip_output.c:229 ip_finish_output+0x85e/0xd10 net/ipv4/ip_output.c:317 NF_HOOK_COND include/linux/netfilter.h:238 [inline] ip_output+0x1cc/0x860 net/ipv4/ip_output.c:405 dst_output include/net/dst.h:459 [inline] ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 ip_queue_xmit+0x8c6/0x18e0 net/ipv4/ip_output.c:504 tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1137 tcp_write_xmit+0x663/0x4de0 net/ipv4/tcp_output.c:2341 __tcp_push_pending_frames+0xa0/0x250 net/ipv4/tcp_output.c:2513 tcp_push_pending_frames include/net/tcp.h:1722 [inline] tcp_data_snd_check net/ipv4/tcp_input.c:5050 [inline] tcp_rcv_established+0x8c7/0x18a0 net/ipv4/tcp_input.c:5497 tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1460 sk_backlog_rcv include/net/sock.h:909 [inline] __release_sock+0x124/0x360 net/core/sock.c:2264 release_sock+0xa4/0x2a0 net/core/sock.c:2776 tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763 sock_sendmsg_nosec net/socket.c:632 [inline] sock_sendmsg+0xca/0x110 net/socket.c:642 ___sys_sendmsg+0x31c/0x890 net/socket.c:2048 __sys_sendmmsg+0x1e6/0x5f0 net/socket.c:2138 Fixes: 14afee4b ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephane Grosjean authored
This adds support for the following PEAK-System CAN FD interfaces: PCAN-cPCIe FD CAN FD Interface for cPCI Serial (2 or 4 channels) PCAN-PCIe/104-Express CAN FD Interface for PCIe/104-Express (1, 2 or 4 ch.) PCAN-miniPCIe FD CAN FD Interface for PCIe Mini (1, 2 or 4 channels) PCAN-PCIe FD OEM CAN FD Interface for PCIe OEM version (1, 2 or 4 ch.) PCAN-M.2 CAN FD Interface for M.2 (1 or 2 channels) Like the PCAN-PCIe FD interface, all of these boards run the same IP Core that is able to handle CAN FD (see also http://www.peak-system.com). Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Gerhard Bertelsmann authored
SUN4Is CAN IP has a 64 byte deep FIFO buffer. If the buffer is not drained fast enough (overrun) it's getting mangled. Already received frames are dropped - the data can't be restored. Signed-off-by: Gerhard Bertelsmann <info@gerhard-bertelsmann.de> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Richard Schütz authored
The D_CAN controller doesn't provide a triple sampling mode, so don't set the CAN_CTRLMODE_3_SAMPLES flag in ctrlmode_supported. Currently enabling triple sampling is a no-op. Signed-off-by: Richard Schütz <rschuetz@uni-koblenz.de> Cc: linux-stable <stable@vger.kernel.org> # >= v3.6 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Alexander Shishkin authored
Commit: 9a93848f ("x86/debug: Implement __WARN() using UD0") turned warnings into UD0, but the fixup code only runs after the notify_die() chain. This is a problem, in particular, with kgdb, which kicks in as if it was a BUG(). Fix this by running the fixup code before the notifier chain in the invalid op handler path. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: <stable@vger.kernel.org> # v4.12+ Link: http://lkml.kernel.org/r/20170724100428.19173-1-alexander.shishkin@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Eugenia Emantayev authored
This is to prevent the case of working with a single MPWQE (1 WQE is always reserved as RQ is linked-list). When the WQE is fully consumed, HW should still have available buffer in order not to drop packets. Fixes: 461017cb ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)") Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Inbar Karmy authored
Currently, when dma mapping fails, put_page is called, but the page is not set to null. Later, in the page_reuse treatment in mlx5e_free_rx_descs(), mlx5e_page_release() is called for the second time, improperly doing dma_unmap (for a non-mapped address) and an extra put_page. Prevent this by nullifying the page pointer when dma_map fails. Fixes: accd5883 ("net/mlx5e: Introduce RX Page-Reuse") Signed-off-by: Inbar Karmy <inbark@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Saeed Mahameed authored
napi->poll can be called with budget 0, e.g. in netpoll scenarios where the caller only wants to poll TX rings (poll_one_napi@net/core/netpoll.c). The below commit changed RX polling from "while" loop to "do {} while", which caused to ignore the initial budget and handle at least one RX packet. This fixes the following warning: [ 2852.049194] mlx5e_napi_poll+0x0/0x260 [mlx5_core] exceeded budget in poll [ 2852.049195] ------------[ cut here ]------------ [ 2852.049195] WARNING: CPU: 0 PID: 25691 at net/core/netpoll.c:171 netpoll_poll_dev+0x18a/0x1a0 Fixes: 4b7dfc99 ("net/mlx5e: Early-return on empty completion queues") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Martin KaFai Lau <kafai@fb.com> Tested-by: Martin KaFai Lau <kafai@fb.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Huy Nguyen authored
After the panic teardown firmware command, health_care detects the error in PCI bus and calls the mlx5_pci_err_detected. This health_care flow is no longer needed because the panic teardown firmware command will bring down the PCI bus communication with the HCA. The solution is to cancel the health care timer and its pending workqueue request before sending panic teardown firmware command. Kernel trace: mlx5_core 0033:01:00.0: Shutdown was called mlx5_core 0033:01:00.0: health_care:154:(pid 9304): handling bad device here mlx5_core 0033:01:00.0: mlx5_handle_bad_state:114:(pid 9304): NIC state 1 mlx5_core 0033:01:00.0: mlx5_pci_err_detected was called mlx5_core 0033:01:00.0: mlx5_enter_error_state:96:(pid 9304): start mlx5_3:mlx5_ib_event:3061:(pid 9304): warning: event on port 0 mlx5_core 0033:01:00.0: mlx5_enter_error_state:104:(pid 9304): end Unable to handle kernel paging request for data at address 0x0000003f Faulting instruction address: 0xc0080000434b8c80 Fixes: 8812c24d ('net/mlx5: Add fast unload support in shutdown flow') Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Huy Nguyen authored
list_splice_init initializing waiting_events_list after splicing it to temp list, therefore we should loop over temp list to fire the events. Fixes: 4ca637a2 ("net/mlx5: Delay events till mlx5 interface's add complete for pci resume") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Håkon Bugge authored
rds_ib_recv_refill() is a function that refills an IB receive queue. It can be called from both the CQE handler (tasklet) and a worker thread. Just after the call to ib_post_recv(), a debug message is printed with rdsdebug(): ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr); rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv, recv->r_ibinc, sg_page(&recv->r_frag->f_sg), (long) ib_sg_dma_address( ic->i_cm_id->device, &recv->r_frag->f_sg), ret); Now consider an invocation of rds_ib_recv_refill() from the worker thread, which is preemptible. Further, assume that the worker thread is preempted between the ib_post_recv() and rdsdebug() statements. Then, if the preemption is due to a receive CQE event, the rds_ib_recv_cqe_handler() will be invoked. This function processes receive completions, including freeing up data structures, such as the recv->r_frag. In this scenario, rds_ib_recv_cqe_handler() will process the receive WR posted above. That implies, that the recv->r_frag has been freed before the above rdsdebug() statement has been executed. When it is later executed, we will have a NULL pointer dereference: [ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] [ 4088.082686] PGD 0 P4D 0 [ 4088.085515] Oops: 0000 [#1] SMP [ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) [ 4088.168486] fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds] [ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G OE 4.14.0-rc7.master.20171105.ol7.x86_64 #1 [ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017 [ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm] [ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000 [ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma] [ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286 [ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050 [ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580 [ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000 [ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000 [ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000 [ 4088.280397] FS: 0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000 [ 4088.289434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0 [ 4088.303816] Call Trace: [ 4088.306557] rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma] [ 4088.312982] ? __dynamic_pr_debug+0x8c/0xb0 [ 4088.317664] ? __queue_work+0x142/0x3c0 [ 4088.321944] rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma] [ 4088.328370] cma_ib_handler+0xcd/0x280 [rdma_cm] [ 4088.333522] cm_process_work+0x25/0x120 [ib_cm] [ 4088.338580] cm_work_handler+0xd6b/0x17aa [ib_cm] [ 4088.343832] process_one_work+0x149/0x360 [ 4088.348307] worker_thread+0x4d/0x3e0 [ 4088.352397] kthread+0x109/0x140 [ 4088.355996] ? rescuer_thread+0x380/0x380 [ 4088.360467] ? kthread_park+0x60/0x60 [ 4088.364563] ret_from_fork+0x25/0x30 [ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 <4c> 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24 [ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68 [ 4088.397772] CR2: 0000000000000020 [ 4088.401505] ---[ end trace fe922e6ccf004431 ]--- This bug was provoked by compiling rds out-of-tree with EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay between the rdsdebug() and ib_ib_port_recv() statements: /* XXX when can this fail? */ ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr); + if (can_wait) + usleep_range(1000, 5000); rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv, recv->r_ibinc, sg_page(&recv->r_frag->f_sg), (long) ib_sg_dma_address( The fix is simply to move the rdsdebug() statement up before the ib_post_recv() and remove the printing of ret, which is taken care of anyway by the non-debug code. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-