- 27 Mar, 2018 2 commits
-
-
Leon Romanovsky authored
Add missing check that device is connected prior to access it. [ 55.358652] BUG: KASAN: null-ptr-deref in rdma_init_qp_attr+0x4a/0x2c0 [ 55.359389] Read of size 8 at addr 00000000000000b0 by task qp/618 [ 55.360255] [ 55.360432] CPU: 1 PID: 618 Comm: qp Not tainted 4.16.0-rc1-00071-gcaf61b1b #91 [ 55.361693] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 55.363264] Call Trace: [ 55.363833] dump_stack+0x5c/0x77 [ 55.364215] kasan_report+0x163/0x380 [ 55.364610] ? rdma_init_qp_attr+0x4a/0x2c0 [ 55.365238] rdma_init_qp_attr+0x4a/0x2c0 [ 55.366410] ucma_init_qp_attr+0x111/0x200 [ 55.366846] ? ucma_notify+0xf0/0xf0 [ 55.367405] ? _get_random_bytes+0xea/0x1b0 [ 55.367846] ? urandom_read+0x2f0/0x2f0 [ 55.368436] ? kmem_cache_alloc_trace+0xd2/0x1e0 [ 55.369104] ? refcount_inc_not_zero+0x9/0x60 [ 55.369583] ? refcount_inc+0x5/0x30 [ 55.370155] ? rdma_create_id+0x215/0x240 [ 55.370937] ? _copy_to_user+0x4f/0x60 [ 55.371620] ? mem_cgroup_commit_charge+0x1f5/0x290 [ 55.372127] ? _copy_from_user+0x5e/0x90 [ 55.372720] ucma_write+0x174/0x1f0 [ 55.373090] ? ucma_close_id+0x40/0x40 [ 55.373805] ? __lru_cache_add+0xa8/0xd0 [ 55.374403] __vfs_write+0xc4/0x350 [ 55.374774] ? kernel_read+0xa0/0xa0 [ 55.375173] ? fsnotify+0x899/0x8f0 [ 55.375544] ? fsnotify_unmount_inodes+0x170/0x170 [ 55.376689] ? __fsnotify_update_child_dentry_flags+0x30/0x30 [ 55.377522] ? handle_mm_fault+0x174/0x320 [ 55.378169] vfs_write+0xf7/0x280 [ 55.378864] SyS_write+0xa1/0x120 [ 55.379270] ? SyS_read+0x120/0x120 [ 55.379643] ? mm_fault_error+0x180/0x180 [ 55.380071] ? task_work_run+0x7d/0xd0 [ 55.380910] ? __task_pid_nr_ns+0x120/0x140 [ 55.381366] ? SyS_read+0x120/0x120 [ 55.381739] do_syscall_64+0xeb/0x250 [ 55.382143] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 55.382841] RIP: 0033:0x7fc2ef803e99 [ 55.383227] RSP: 002b:00007fffcc5f3be8 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 [ 55.384173] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc2ef803e99 [ 55.386145] RDX: 0000000000000057 RSI: 0000000020000080 RDI: 0000000000000003 [ 55.388418] RBP: 00007fffcc5f3c00 R08: 0000000000000000 R09: 0000000000000000 [ 55.390542] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400480 [ 55.392916] R13: 00007fffcc5f3cf0 R14: 0000000000000000 R15: 0000000000000000 [ 55.521088] Code: e5 4d 1e ff 48 89 df 44 0f b6 b3 b8 01 00 00 e8 65 50 1e ff 4c 8b 2b 49 8d bd b0 00 00 00 e8 56 50 1e ff 41 0f b6 c6 48 c1 e0 04 <49> 03 85 b0 00 00 00 48 8d 78 08 48 89 04 24 e8 3a 4f 1e ff 48 [ 55.525980] RIP: rdma_init_qp_attr+0x52/0x2c0 RSP: ffff8801e2c2f9d8 [ 55.532648] CR2: 00000000000000b0 [ 55.534396] ---[ end trace 70cee64090251c0b ]--- Fixes: 75216638 ("RDMA/cma: Export rdma cm interface to userspace") Fixes: d541e455 ("IB/core: Convert ah_attr from OPA to IB when copying to user") Reported-by: <syzbot+7b62c837c2516f8f38c8@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Jason Gunthorpe authored
process_one_req() can race with rdma_addr_cancel(): CPU0 CPU1 ==== ==== process_one_work() debug_work_deactivate(work); process_one_req() rdma_addr_cancel() mutex_lock(&lock); set_timeout(&req->work,..); __queue_work() debug_work_activate(work); mutex_unlock(&lock); mutex_lock(&lock); [..] list_del(&req->list); mutex_unlock(&lock); [..] // ODEBUG explodes since the work is still queued. kfree(req); Causing ODEBUG to detect the use after free: ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165 WARNING: CPU: 0 PID: 79 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288 kvm: emulating exchange as write Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 79 Comm: kworker/u4:3 Not tainted 4.16.0-rc6+ #361 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: ib_addr process_one_req Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x1f4/0x2b0 lib/bug.c:186 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986 RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288 RSP: 0000:ffff8801d966f210 EFLAGS: 00010086 RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff815acd6e RDX: 0000000000000000 RSI: 1ffff1003b2cddf2 RDI: 0000000000000000 RBP: ffff8801d966f250 R08: 0000000000000000 R09: 1ffff1003b2cddc8 R10: ffffed003b2cde71 R11: ffffffff86f39a98 R12: 0000000000000001 R13: ffffffff86f15540 R14: ffffffff86408700 R15: ffffffff8147c0a0 __debug_check_no_obj_freed lib/debugobjects.c:745 [inline] debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774 kfree+0xc7/0x260 mm/slab.c:3799 process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:592 process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113 worker_thread+0x223/0x1990 kernel/workqueue.c:2247 kthread+0x33c/0x400 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 Fixes: 5fff41e1 ("IB/core: Fix race condition in resolving IP to MAC") Reported-by: <syzbot+3b4acab09b6463472d0a@syzkaller.appspotmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 21 Mar, 2018 6 commits
-
-
Kalderon, Michal authored
Once the FW is transitioned to error, FLUSH cqes can be received. We want the driver to be aware of the fact that QP is already in error. Without this fix, a user may see false error messages in the dmesg log, mentioning that a FLUSH cqe was received while QP is not in error state. Fixes: cecbcddf ("qedr: Add support for QP verbs") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Kalderon, Michal authored
Return code wasn't set properly when CNQ allocation failed. This only affect error message logging, currently user will receive an error message that says the qedr driver load failed with rc '0', instead of ENOMEM Fixes: ec72fce4 ("qedr: Add support for RoCE HW init") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Kalderon, Michal authored
QPs that were configured with ack timeout value lower than 1 msec will not implement re-transmission timeout. This means that if a packet / ACK were dropped, the QP will not retransmit this packet. This can lead to an application hang. Fixes: cecbcddf ("qedr: Add support for QP verbs") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Chien Tin Tung authored
The option size check is using optval instead of optlen causing the set option call to fail. Use the correct field, optlen, for size check. Fixes: 6a21dfc0 ("RDMA/ucma: Limit possible option size") Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Leon Romanovsky authored
The fact that resource tracking commit 02d8883f ("RDMA/restrack: Add general infrastructure to track RDMA resources") was added immediately after commit 16c1975f ("IB/mlx5: Create profile infrastructure to add and remove stages") caused us to miss the fact that PD and CQ are created after ib_register_device, but released after ib_unregister_device() and not before as it is expected from normal flow. Fix introduced in commit 42cea83f ("IB/mlx5: Fix cleanup order on unload") revealed this fact, so this patch is needed to avoid from restrack warnings It fixes resource tracking warnings during shutdown. [ 43.473906] CPU: 5 PID: 3016 Comm: modprobe Not tainted 4.16.0-rc5-for-linust-perf-2018-03-19_07-01-58-14 #1 [ 43.473907] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014 [ 43.473919] RIP: 0010:rdma_restrack_clean+0x25/0x30 [ib_core] [ 43.473921] RSP: 0018:ffffc9000267be48 EFLAGS: 00010282 [ 43.473924] RAX: 0000000000000000 RBX: ffff88033c690070 RCX: 0000000180080006 [ 43.473925] RDX: ffff88035ce922e0 RSI: ffffea000cf1a200 RDI: ffff88033c6907c8 [ 43.473926] RBP: ffff88033c690070 R08: ffff88033c689000 R09: 0000000180080006 [ 43.473927] R10: 000000003c68a001 R11: ffff88033c689000 R12: ffff88033c690000 [ 43.473929] R13: ffff88033c69005c R14: 0000000000000000 R15: 0000000000000000 [ 43.473932] FS: 00007f5928359740(0000) GS:ffff88036c540000(0000) knlGS:0000000000000000 [ 43.473933] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.473935] CR2: 00007ffffc760cc8 CR3: 000000035620c000 CR4: 00000000000006e0 [ 43.473940] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 43.473941] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 43.473942] Call Trace: [ 43.473969] ib_unregister_device+0xf5/0x190 [ib_core] [ 43.474000] __mlx5_ib_remove+0x2e/0x40 [mlx5_ib] [ 43.474098] mlx5_remove_device+0xf5/0x120 [mlx5_core] [ 43.474132] mlx5_unregister_interface+0x37/0x90 [mlx5_core] [ 43.474142] mlx5_ib_cleanup+0xc/0x16a [mlx5_ib] [ 43.474152] SyS_delete_module+0x159/0x260 [ 43.474159] do_syscall_64+0x61/0x110 [ 43.474165] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 43.474168] RIP: 0033:0x7f59278466b7 [ 43.474170] RSP: 002b:00007ffffc763e38 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0 [ 43.474172] RAX: ffffffffffffffda RBX: 000000000130d590 RCX: 00007f59278466b7 [ 43.474173] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000000000130d5f8 [ 43.474175] RBP: 0000000000000000 R08: 00007f5927b0b060 R09: 00007f59278b6a40 [ 43.474176] R10: 00007ffffc763bc0 R11: 0000000000000202 R12: 0000000000000000 [ 43.474177] R13: 0000000000000001 R14: 000000000130d5f8 R15: 0000000000000000 [ 43.474179] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 83 c7 28 31 c0 eb 0c 48 83 c0 08 48 3d 00 08 00 00 74 0f 48 8d 14 07 48 8b 12 48 85 d2 74 e8 <0f> 0b c3 f3 c3 66 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 47 28 [ 43.474221] ---[ end trace e89771e2250ffc23 ]--- Fixes: 42cea83f ("IB/mlx5: Fix cleanup order on unload") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Mark Bloch authored
In case we failed to create UMR resources, mark them as invalid so we won't try to destroy them on the unwind path. Add the relevant checks to destroy_umrc_res(), this is done for the unlikely event ib_register_device() or create_umr_res() err out and we try to destroy invalid objects. Fixes: 42cea83f ("IB/mlx5: Fix cleanup order on unload") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 20 Mar, 2018 1 commit
-
-
Leon Romanovsky authored
Prior to access UCMA commands, the context should be initialized and connected to CM_ID with ucma_create_id(). In case user skips this step, he can provide non-valid ctx without CM_ID and cause to multiple NULL dereferences. Also there are situations where the create_id can be raced with other user access, ensure that the context is only shared to other threads once it is fully initialized to avoid the races. [ 109.088108] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 109.090315] IP: ucma_connect+0x138/0x1d0 [ 109.092595] PGD 80000001dc02d067 P4D 80000001dc02d067 PUD 1da9ef067 PMD 0 [ 109.095384] Oops: 0000 [#1] SMP KASAN PTI [ 109.097834] CPU: 0 PID: 663 Comm: uclose Tainted: G B 4.16.0-rc1-00062-g2975d5de #45 [ 109.100816] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 109.105943] RIP: 0010:ucma_connect+0x138/0x1d0 [ 109.108850] RSP: 0018:ffff8801c8567a80 EFLAGS: 00010246 [ 109.111484] RAX: 0000000000000000 RBX: 1ffff100390acf50 RCX: ffffffff9d7812e2 [ 109.114496] RDX: 1ffffffff3f507a5 RSI: 0000000000000297 RDI: 0000000000000297 [ 109.117490] RBP: ffff8801daa15600 R08: 0000000000000000 R09: ffffed00390aceeb [ 109.120429] R10: 0000000000000001 R11: ffffed00390aceea R12: 0000000000000000 [ 109.123318] R13: 0000000000000120 R14: ffff8801de6459c0 R15: 0000000000000118 [ 109.126221] FS: 00007fabb68d6700(0000) GS:ffff8801e5c00000(0000) knlGS:0000000000000000 [ 109.129468] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 109.132523] CR2: 0000000000000020 CR3: 00000001d45d8003 CR4: 00000000003606b0 [ 109.135573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 109.138716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 109.142057] Call Trace: [ 109.144160] ? ucma_listen+0x110/0x110 [ 109.146386] ? wake_up_q+0x59/0x90 [ 109.148853] ? futex_wake+0x10b/0x2a0 [ 109.151297] ? save_stack+0x89/0xb0 [ 109.153489] ? _copy_from_user+0x5e/0x90 [ 109.155500] ucma_write+0x174/0x1f0 [ 109.157933] ? ucma_resolve_route+0xf0/0xf0 [ 109.160389] ? __mod_node_page_state+0x1d/0x80 [ 109.162706] __vfs_write+0xc4/0x350 [ 109.164911] ? kernel_read+0xa0/0xa0 [ 109.167121] ? path_openat+0x1b10/0x1b10 [ 109.169355] ? fsnotify+0x899/0x8f0 [ 109.171567] ? fsnotify_unmount_inodes+0x170/0x170 [ 109.174145] ? __fget+0xa8/0xf0 [ 109.177110] vfs_write+0xf7/0x280 [ 109.179532] SyS_write+0xa1/0x120 [ 109.181885] ? SyS_read+0x120/0x120 [ 109.184482] ? compat_start_thread+0x60/0x60 [ 109.187124] ? SyS_read+0x120/0x120 [ 109.189548] do_syscall_64+0xeb/0x250 [ 109.192178] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 109.194725] RIP: 0033:0x7fabb61ebe99 [ 109.197040] RSP: 002b:00007fabb68d5e98 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 109.200294] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fabb61ebe99 [ 109.203399] RDX: 0000000000000120 RSI: 00000000200001c0 RDI: 0000000000000004 [ 109.206548] RBP: 00007fabb68d5ec0 R08: 0000000000000000 R09: 0000000000000000 [ 109.209902] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fabb68d5fc0 [ 109.213327] R13: 0000000000000000 R14: 00007fff40ab2430 R15: 00007fabb68d69c0 [ 109.216613] Code: 88 44 24 2c 0f b6 84 24 6e 01 00 00 88 44 24 2d 0f b6 84 24 69 01 00 00 88 44 24 2e 8b 44 24 60 89 44 24 30 e8 da f6 06 ff 31 c0 <66> 41 83 7c 24 20 1b 75 04 8b 44 24 64 48 8d 74 24 20 4c 89 e7 [ 109.223602] RIP: ucma_connect+0x138/0x1d0 RSP: ffff8801c8567a80 [ 109.226256] CR2: 0000000000000020 Fixes: 75216638 ("RDMA/cma: Export rdma cm interface to userspace") Reported-by: <syzbot+36712f50b0552615bf59@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 19 Mar, 2018 2 commits
-
-
Leon Romanovsky authored
XRCD object is not implemented in the restrack, so lets remove it. Fixes: 02d8883f ("RDMA/restrack: Add general infrastructure to track RDMA resources") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Leon Romanovsky authored
The error in ucma_create_id() left ctx in the list of contexts belong to ucma file descriptor. The attempt to close this file descriptor causes to use-after-free accesses while iterating over such list. Fixes: 75216638 ("RDMA/cma: Export rdma cm interface to userspace") Reported-by: <syzbot+dcfd344365a56fbebd0f@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 16 Mar, 2018 1 commit
-
-
Leon Romanovsky authored
Garbage supplied by user will cause to UCMA module provide zero memory size for memcpy(), because it wasn't checked, it will produce unpredictable results in rdma_resolve_addr(). [ 42.873814] BUG: KASAN: null-ptr-deref in rdma_resolve_addr+0xc8/0xfb0 [ 42.874816] Write of size 28 at addr 00000000000000a0 by task resaddr/1044 [ 42.876765] [ 42.876960] CPU: 1 PID: 1044 Comm: resaddr Not tainted 4.16.0-rc1-00057-gaa56a5293d7e #34 [ 42.877840] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 42.879691] Call Trace: [ 42.880236] dump_stack+0x5c/0x77 [ 42.880664] kasan_report+0x163/0x380 [ 42.881354] ? rdma_resolve_addr+0xc8/0xfb0 [ 42.881864] memcpy+0x34/0x50 [ 42.882692] rdma_resolve_addr+0xc8/0xfb0 [ 42.883366] ? deref_stack_reg+0x88/0xd0 [ 42.883856] ? vsnprintf+0x31a/0x770 [ 42.884686] ? rdma_bind_addr+0xc40/0xc40 [ 42.885327] ? num_to_str+0x130/0x130 [ 42.885773] ? deref_stack_reg+0x88/0xd0 [ 42.886217] ? __read_once_size_nocheck.constprop.6+0x10/0x10 [ 42.887698] ? unwind_get_return_address_ptr+0x50/0x50 [ 42.888302] ? replace_slot+0x147/0x170 [ 42.889176] ? delete_node+0x12c/0x340 [ 42.890223] ? __radix_tree_lookup+0xa9/0x160 [ 42.891196] ? ucma_resolve_ip+0xb7/0x110 [ 42.891917] ucma_resolve_ip+0xb7/0x110 [ 42.893003] ? ucma_resolve_addr+0x190/0x190 [ 42.893531] ? _copy_from_user+0x5e/0x90 [ 42.894204] ucma_write+0x174/0x1f0 [ 42.895162] ? ucma_resolve_route+0xf0/0xf0 [ 42.896309] ? dequeue_task_fair+0x67e/0xd90 [ 42.897192] ? put_prev_entity+0x7d/0x170 [ 42.897870] ? ring_buffer_record_is_on+0xd/0x20 [ 42.898439] ? tracing_record_taskinfo_skip+0x20/0x50 [ 42.899686] __vfs_write+0xc4/0x350 [ 42.900142] ? kernel_read+0xa0/0xa0 [ 42.900602] ? firmware_map_remove+0xdf/0xdf [ 42.901135] ? do_task_dead+0x5d/0x60 [ 42.901598] ? do_exit+0xcc6/0x1220 [ 42.902789] ? __fget+0xa8/0xf0 [ 42.903190] vfs_write+0xf7/0x280 [ 42.903600] SyS_write+0xa1/0x120 [ 42.904206] ? SyS_read+0x120/0x120 [ 42.905710] ? compat_start_thread+0x60/0x60 [ 42.906423] ? SyS_read+0x120/0x120 [ 42.908716] do_syscall_64+0xeb/0x250 [ 42.910760] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 42.912735] RIP: 0033:0x7f138b0afe99 [ 42.914734] RSP: 002b:00007f138b799e98 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 42.917134] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f138b0afe99 [ 42.919487] RDX: 000000000000002e RSI: 0000000020000c40 RDI: 0000000000000004 [ 42.922393] RBP: 00007f138b799ec0 R08: 00007f138b79a700 R09: 0000000000000000 [ 42.925266] R10: 00007f138b79a700 R11: 0000000000000287 R12: 00007f138b799fc0 [ 42.927570] R13: 0000000000000000 R14: 00007ffdbae757c0 R15: 00007f138b79a9c0 [ 42.930047] [ 42.932681] Disabling lock debugging due to kernel taint [ 42.934795] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0 [ 42.936939] IP: memcpy_erms+0x6/0x10 [ 42.938864] PGD 80000001bea92067 P4D 80000001bea92067 PUD 1bea96067 PMD 0 [ 42.941576] Oops: 0002 [#1] SMP KASAN PTI [ 42.943952] CPU: 1 PID: 1044 Comm: resaddr Tainted: G B 4.16.0-rc1-00057-gaa56a5293d7e #34 [ 42.946964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 42.952336] RIP: 0010:memcpy_erms+0x6/0x10 [ 42.954707] RSP: 0018:ffff8801c8b479c8 EFLAGS: 00010286 [ 42.957227] RAX: 00000000000000a0 RBX: ffff8801c8b47ba0 RCX: 000000000000001c [ 42.960543] RDX: 000000000000001c RSI: ffff8801c8b47bbc RDI: 00000000000000a0 [ 42.963867] RBP: ffff8801c8b47b60 R08: 0000000000000000 R09: ffffed0039168ed1 [ 42.967303] R10: 0000000000000001 R11: ffffed0039168ed0 R12: ffff8801c8b47bbc [ 42.970685] R13: 00000000000000a0 R14: 1ffff10039168f4a R15: 0000000000000000 [ 42.973631] FS: 00007f138b79a700(0000) GS:ffff8801e5d00000(0000) knlGS:0000000000000000 [ 42.976831] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.979239] CR2: 00000000000000a0 CR3: 00000001be908002 CR4: 00000000003606a0 [ 42.982060] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 42.984877] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 42.988033] Call Trace: [ 42.990487] rdma_resolve_addr+0xc8/0xfb0 [ 42.993202] ? deref_stack_reg+0x88/0xd0 [ 42.996055] ? vsnprintf+0x31a/0x770 [ 42.998707] ? rdma_bind_addr+0xc40/0xc40 [ 43.000985] ? num_to_str+0x130/0x130 [ 43.003410] ? deref_stack_reg+0x88/0xd0 [ 43.006302] ? __read_once_size_nocheck.constprop.6+0x10/0x10 [ 43.008780] ? unwind_get_return_address_ptr+0x50/0x50 [ 43.011178] ? replace_slot+0x147/0x170 [ 43.013517] ? delete_node+0x12c/0x340 [ 43.016019] ? __radix_tree_lookup+0xa9/0x160 [ 43.018755] ? ucma_resolve_ip+0xb7/0x110 [ 43.021270] ucma_resolve_ip+0xb7/0x110 [ 43.023968] ? ucma_resolve_addr+0x190/0x190 [ 43.026312] ? _copy_from_user+0x5e/0x90 [ 43.029384] ucma_write+0x174/0x1f0 [ 43.031861] ? ucma_resolve_route+0xf0/0xf0 [ 43.034782] ? dequeue_task_fair+0x67e/0xd90 [ 43.037483] ? put_prev_entity+0x7d/0x170 [ 43.040215] ? ring_buffer_record_is_on+0xd/0x20 [ 43.042990] ? tracing_record_taskinfo_skip+0x20/0x50 [ 43.045595] __vfs_write+0xc4/0x350 [ 43.048624] ? kernel_read+0xa0/0xa0 [ 43.051604] ? firmware_map_remove+0xdf/0xdf [ 43.055379] ? do_task_dead+0x5d/0x60 [ 43.058000] ? do_exit+0xcc6/0x1220 [ 43.060783] ? __fget+0xa8/0xf0 [ 43.063133] vfs_write+0xf7/0x280 [ 43.065677] SyS_write+0xa1/0x120 [ 43.068647] ? SyS_read+0x120/0x120 [ 43.071179] ? compat_start_thread+0x60/0x60 [ 43.074025] ? SyS_read+0x120/0x120 [ 43.076705] do_syscall_64+0xeb/0x250 [ 43.079006] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 43.081606] RIP: 0033:0x7f138b0afe99 [ 43.083679] RSP: 002b:00007f138b799e98 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 43.086802] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f138b0afe99 [ 43.089989] RDX: 000000000000002e RSI: 0000000020000c40 RDI: 0000000000000004 [ 43.092866] RBP: 00007f138b799ec0 R08: 00007f138b79a700 R09: 0000000000000000 [ 43.096233] R10: 00007f138b79a700 R11: 0000000000000287 R12: 00007f138b799fc0 [ 43.098913] R13: 0000000000000000 R14: 00007ffdbae757c0 R15: 00007f138b79a9c0 [ 43.101809] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 [ 43.107950] RIP: memcpy_erms+0x6/0x10 RSP: ffff8801c8b479c8 Reported-by: <syzbot+1d8c43206853b369d00c@syzkaller.appspotmail.com> Fixes: 75216638 ("RDMA/cma: Export rdma cm interface to userspace") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 14 Mar, 2018 7 commits
-
-
Arnd Bergmann authored
On 32-bit targets, we otherwise get a warning about an impossible constant integer expression: In file included from include/linux/kernel.h:11, from include/linux/interrupt.h:6, from drivers/infiniband/hw/bnxt_re/ib_verbs.c:39: drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_query_device': include/linux/bitops.h:7:24: error: left shift count >= width of type [-Werror=shift-count-overflow] #define BIT(nr) (1UL << (nr)) ^~ drivers/infiniband/hw/bnxt_re/bnxt_re.h:61:34: note: in expansion of macro 'BIT' #define BNXT_RE_MAX_MR_SIZE_HIGH BIT(39) ^~~ drivers/infiniband/hw/bnxt_re/bnxt_re.h:62:30: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE_HIGH' #define BNXT_RE_MAX_MR_SIZE BNXT_RE_MAX_MR_SIZE_HIGH ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/infiniband/hw/bnxt_re/ib_verbs.c:149:25: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE' ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE; ^~~~~~~~~~~~~~~~~~~ Fixes: 872f3578 ("RDMA/bnxt_re: Add support for MRs with Huge pages") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Arnd Bergmann authored
Building for a 32-bit target results in a couple of warnings from casting between a 32-bit pointer and a 64-bit integer: drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq': drivers/infiniband/hw/bnxt_re/qplib_fp.c:333:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] bnxt_qplib_arm_srq((struct bnxt_qplib_srq *)q_handle, ^ drivers/infiniband/hw/bnxt_re/qplib_fp.c:336:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] (struct bnxt_qplib_srq *)q_handle, ^ In file included from include/linux/byteorder/little_endian.h:5, from arch/arm/include/uapi/asm/byteorder.h:22, from include/asm-generic/bitops/le.h:6, from arch/arm/include/asm/bitops.h:342, from include/linux/bitops.h:38, from include/linux/kernel.h:11, from include/linux/interrupt.h:6, from drivers/infiniband/hw/bnxt_re/qplib_fp.c:39: drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_create_srq': include/uapi/linux/byteorder/little_endian.h:31:43: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] #define __cpu_to_le64(x) ((__force __le64)(__u64)(x)) ^ include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64' #define cpu_to_le64 __cpu_to_le64 ^~~~~~~~~~~~~ drivers/infiniband/hw/bnxt_re/qplib_fp.c:569:19: note: in expansion of macro 'cpu_to_le64' req.srq_handle = cpu_to_le64(srq); Using a uintptr_t as an intermediate works on all architectures. Fixes: 37cb11ac ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Mark Bloch authored
On load we create private CQ/QP/PD in order to be used by UMR, we create those resources after we register ourself as an IB device, and we destroy them after we unregister as an IB device. This was changed by commit 16c1975f ("IB/mlx5: Create profile infrastructure to add and remove stages") which moved the destruction before we unregistration. This allowed to trigger an invalid memory access when unloading mlx5_ib while there are open resources: BUG: unable to handle kernel paging request at 00000001002c012c ... Call Trace: mlx5_ib_post_send_wait+0x75/0x110 [mlx5_ib] __slab_free+0x9a/0x2d0 delay_time_func+0x10/0x10 [mlx5_ib] unreg_umr.isra.15+0x4b/0x50 [mlx5_ib] mlx5_mr_cache_free+0x46/0x150 [mlx5_ib] clean_mr+0xc9/0x190 [mlx5_ib] dereg_mr+0xba/0xf0 [mlx5_ib] ib_dereg_mr+0x13/0x20 [ib_core] remove_commit_idr_uobject+0x16/0x70 [ib_uverbs] uverbs_cleanup_ucontext+0xe8/0x1a0 [ib_uverbs] ib_uverbs_cleanup_ucontext.isra.9+0x19/0x40 [ib_uverbs] ib_uverbs_remove_one+0x162/0x2e0 [ib_uverbs] ib_unregister_device+0xd4/0x190 [ib_core] __mlx5_ib_remove+0x2e/0x40 [mlx5_ib] mlx5_remove_device+0xf5/0x120 [mlx5_core] mlx5_unregister_interface+0x37/0x90 [mlx5_core] mlx5_ib_cleanup+0xc/0x225 [mlx5_ib] SyS_delete_module+0x153/0x230 do_syscall_64+0x62/0x110 entry_SYSCALL_64_after_hwframe+0x21/0x86 ... We restore the original behavior by breaking the UMR stage into two parts, pre and post IB registration stages, this way we can restore the original functionality and maintain clean separation of logic between stages. Fixes: 16c1975f ("IB/mlx5: Create profile infrastructure to add and remove stages") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
Users can provide garbage while calling to ucma_join_ip_multicast(), it will indirectly cause to rdma_addr_size() return 0, making the call to ucma_process_join(), which had the right checks, but it is better to check the input as early as possible. The following crash from syzkaller revealed it. kernel BUG at lib/string.c:1052! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 4113 Comm: syz-executor0 Not tainted 4.16.0-rc5+ #261 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:fortify_panic+0x13/0x20 lib/string.c:1051 RSP: 0018:ffff8801ca81f8f0 EFLAGS: 00010286 RAX: 0000000000000022 RBX: 1ffff10039503f23 RCX: 0000000000000000 RDX: 0000000000000022 RSI: 1ffff10039503ed3 RDI: ffffed0039503f12 RBP: ffff8801ca81f8f0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000006 R11: 0000000000000000 R12: ffff8801ca81f998 R13: ffff8801ca81f938 R14: ffff8801ca81fa58 R15: 000000000000fa00 FS: 0000000000000000(0000) GS:ffff8801db200000(0063) knlGS:000000000a12a900 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000008138024 CR3: 00000001cbb58004 CR4: 00000000001606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: memcpy include/linux/string.h:344 [inline] ucma_join_ip_multicast+0x36b/0x3b0 drivers/infiniband/core/ucma.c:1421 ucma_write+0x2d6/0x3d0 drivers/infiniband/core/ucma.c:1633 __vfs_write+0xef/0x970 fs/read_write.c:480 vfs_write+0x189/0x510 fs/read_write.c:544 SYSC_write fs/read_write.c:589 [inline] SyS_write+0xef/0x220 fs/read_write.c:581 do_syscall_32_irqs_on arch/x86/entry/common.c:330 [inline] do_fast_syscall_32+0x3ec/0xf9f arch/x86/entry/common.c:392 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f9ec99 RSP: 002b:00000000ff8172cc EFLAGS: 00000282 ORIG_RAX: 0000000000000004 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020000100 RDX: 0000000000000063 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Code: 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 48 89 df e8 42 2c e3 fb eb de 55 48 89 fe 48 c7 c7 80 75 98 86 48 89 e5 e8 85 95 94 fb <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 56 RIP: fortify_panic+0x13/0x20 lib/string.c:1051 RSP: ffff8801ca81f8f0 Fixes: 5bc2b7b3 ("RDMA/ucma: Allow user space to specify AF_IB when joining multicast") Reported-by: <syzbot+2287ac532caa81900a4e@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
The attempt to join multicast group without ensuring that CMA device exists will lead to the following crash reported by syzkaller. [ 64.076794] BUG: KASAN: null-ptr-deref in rdma_join_multicast+0x26e/0x12c0 [ 64.076797] Read of size 8 at addr 00000000000000b0 by task join/691 [ 64.076797] [ 64.076800] CPU: 1 PID: 691 Comm: join Not tainted 4.16.0-rc1-00219-gb97853b65b93 #23 [ 64.076802] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-proj4 [ 64.076803] Call Trace: [ 64.076809] dump_stack+0x5c/0x77 [ 64.076817] kasan_report+0x163/0x380 [ 64.085859] ? rdma_join_multicast+0x26e/0x12c0 [ 64.086634] rdma_join_multicast+0x26e/0x12c0 [ 64.087370] ? rdma_disconnect+0xf0/0xf0 [ 64.088579] ? __radix_tree_replace+0xc3/0x110 [ 64.089132] ? node_tag_clear+0x81/0xb0 [ 64.089606] ? idr_alloc_u32+0x12e/0x1a0 [ 64.090517] ? __fprop_inc_percpu_max+0x150/0x150 [ 64.091768] ? tracing_record_taskinfo+0x10/0xc0 [ 64.092340] ? idr_alloc+0x76/0xc0 [ 64.092951] ? idr_alloc_u32+0x1a0/0x1a0 [ 64.093632] ? ucma_process_join+0x23d/0x460 [ 64.094510] ucma_process_join+0x23d/0x460 [ 64.095199] ? ucma_migrate_id+0x440/0x440 [ 64.095696] ? futex_wake+0x10b/0x2a0 [ 64.096159] ucma_join_multicast+0x88/0xe0 [ 64.096660] ? ucma_process_join+0x460/0x460 [ 64.097540] ? _copy_from_user+0x5e/0x90 [ 64.098017] ucma_write+0x174/0x1f0 [ 64.098640] ? ucma_resolve_route+0xf0/0xf0 [ 64.099343] ? rb_erase_cached+0x6c7/0x7f0 [ 64.099839] __vfs_write+0xc4/0x350 [ 64.100622] ? perf_syscall_enter+0xe4/0x5f0 [ 64.101335] ? kernel_read+0xa0/0xa0 [ 64.103525] ? perf_sched_cb_inc+0xc0/0xc0 [ 64.105510] ? syscall_exit_register+0x2a0/0x2a0 [ 64.107359] ? __switch_to+0x351/0x640 [ 64.109285] ? fsnotify+0x899/0x8f0 [ 64.111610] ? fsnotify_unmount_inodes+0x170/0x170 [ 64.113876] ? __fsnotify_update_child_dentry_flags+0x30/0x30 [ 64.115813] ? ring_buffer_record_is_on+0xd/0x20 [ 64.117824] ? __fget+0xa8/0xf0 [ 64.119869] vfs_write+0xf7/0x280 [ 64.122001] SyS_write+0xa1/0x120 [ 64.124213] ? SyS_read+0x120/0x120 [ 64.126644] ? SyS_read+0x120/0x120 [ 64.128563] do_syscall_64+0xeb/0x250 [ 64.130732] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 64.132984] RIP: 0033:0x7f5c994ade99 [ 64.135699] RSP: 002b:00007f5c99b97d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 64.138740] RAX: ffffffffffffffda RBX: 00000000200001e4 RCX: 00007f5c994ade99 [ 64.141056] RDX: 00000000000000a0 RSI: 00000000200001c0 RDI: 0000000000000015 [ 64.143536] RBP: 00007f5c99b97ec0 R08: 0000000000000000 R09: 0000000000000000 [ 64.146017] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f5c99b97fc0 [ 64.148608] R13: 0000000000000000 R14: 00007fff660e1c40 R15: 00007f5c99b989c0 [ 64.151060] [ 64.153703] Disabling lock debugging due to kernel taint [ 64.156032] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0 [ 64.159066] IP: rdma_join_multicast+0x26e/0x12c0 [ 64.161451] PGD 80000001d0298067 P4D 80000001d0298067 PUD 1dea39067 PMD 0 [ 64.164442] Oops: 0000 [#1] SMP KASAN PTI [ 64.166817] CPU: 1 PID: 691 Comm: join Tainted: G B 4.16.0-rc1-00219-gb97853b65b93 #23 [ 64.170004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-proj4 [ 64.174985] RIP: 0010:rdma_join_multicast+0x26e/0x12c0 [ 64.177246] RSP: 0018:ffff8801c8207860 EFLAGS: 00010282 [ 64.179901] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff94789522 [ 64.183344] RDX: 1ffffffff2d50fa5 RSI: 0000000000000297 RDI: 0000000000000297 [ 64.186237] RBP: ffff8801c8207a50 R08: 0000000000000000 R09: ffffed0039040ea7 [ 64.189328] R10: 0000000000000001 R11: ffffed0039040ea6 R12: 0000000000000000 [ 64.192634] R13: 0000000000000000 R14: ffff8801e2022800 R15: ffff8801d4ac2400 [ 64.196105] FS: 00007f5c99b98700(0000) GS:ffff8801e5d00000(0000) knlGS:0000000000000000 [ 64.199211] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 64.202046] CR2: 00000000000000b0 CR3: 00000001d1c48004 CR4: 00000000003606a0 [ 64.205032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 64.208221] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 64.211554] Call Trace: [ 64.213464] ? rdma_disconnect+0xf0/0xf0 [ 64.216124] ? __radix_tree_replace+0xc3/0x110 [ 64.219337] ? node_tag_clear+0x81/0xb0 [ 64.222140] ? idr_alloc_u32+0x12e/0x1a0 [ 64.224422] ? __fprop_inc_percpu_max+0x150/0x150 [ 64.226588] ? tracing_record_taskinfo+0x10/0xc0 [ 64.229763] ? idr_alloc+0x76/0xc0 [ 64.232186] ? idr_alloc_u32+0x1a0/0x1a0 [ 64.234505] ? ucma_process_join+0x23d/0x460 [ 64.237024] ucma_process_join+0x23d/0x460 [ 64.240076] ? ucma_migrate_id+0x440/0x440 [ 64.243284] ? futex_wake+0x10b/0x2a0 [ 64.245302] ucma_join_multicast+0x88/0xe0 [ 64.247783] ? ucma_process_join+0x460/0x460 [ 64.250841] ? _copy_from_user+0x5e/0x90 [ 64.253878] ucma_write+0x174/0x1f0 [ 64.257008] ? ucma_resolve_route+0xf0/0xf0 [ 64.259877] ? rb_erase_cached+0x6c7/0x7f0 [ 64.262746] __vfs_write+0xc4/0x350 [ 64.265537] ? perf_syscall_enter+0xe4/0x5f0 [ 64.267792] ? kernel_read+0xa0/0xa0 [ 64.270358] ? perf_sched_cb_inc+0xc0/0xc0 [ 64.272575] ? syscall_exit_register+0x2a0/0x2a0 [ 64.275367] ? __switch_to+0x351/0x640 [ 64.277700] ? fsnotify+0x899/0x8f0 [ 64.280530] ? fsnotify_unmount_inodes+0x170/0x170 [ 64.283156] ? __fsnotify_update_child_dentry_flags+0x30/0x30 [ 64.286182] ? ring_buffer_record_is_on+0xd/0x20 [ 64.288749] ? __fget+0xa8/0xf0 [ 64.291136] vfs_write+0xf7/0x280 [ 64.292972] SyS_write+0xa1/0x120 [ 64.294965] ? SyS_read+0x120/0x120 [ 64.297474] ? SyS_read+0x120/0x120 [ 64.299751] do_syscall_64+0xeb/0x250 [ 64.301826] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 64.304352] RIP: 0033:0x7f5c994ade99 [ 64.306711] RSP: 002b:00007f5c99b97d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 64.309577] RAX: ffffffffffffffda RBX: 00000000200001e4 RCX: 00007f5c994ade99 [ 64.312334] RDX: 00000000000000a0 RSI: 00000000200001c0 RDI: 0000000000000015 [ 64.315783] RBP: 00007f5c99b97ec0 R08: 0000000000000000 R09: 0000000000000000 [ 64.318365] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f5c99b97fc0 [ 64.320980] R13: 0000000000000000 R14: 00007fff660e1c40 R15: 00007f5c99b989c0 [ 64.323515] Code: e8 e8 79 08 ff 4c 89 ff 45 0f b6 a7 b8 01 00 00 e8 68 7c 08 ff 49 8b 1f 4d 89 e5 49 c1 e4 04 48 8 [ 64.330753] RIP: rdma_join_multicast+0x26e/0x12c0 RSP: ffff8801c8207860 [ 64.332979] CR2: 00000000000000b0 [ 64.335550] ---[ end trace 0c00c17a408849c1 ]--- Reported-by: <syzbot+e6aba77967bd72cbc9d6@syzkaller.appspotmail.com> Fixes: c8f6a362 ("RDMA/cma: Add multicast communication support") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Tatyana Nikolova authored
cma_port_is_unique() allows local port reuse if the quad (source address and port, destination address and port) for this connection is unique. However, if the destination info is zero or unspecified, it can't make a correct decision but still allows port reuse. For example, sometimes rdma_bind_addr() is called with unspecified destination and reusing the port can lead to creating a connection with a duplicate quad, after the destination is resolved. The issue manifests when MPI scale-up tests hang after the duplicate quad is used. Set the destination address family and add checks for zero destination address and port to prevent source port reuse based on invalid destination. Fixes: 19b752a1 ("IB/cma: Allow port reuse for rdma_id") Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
The failure in rereg_mr flow caused to set garbage value (error value) into mr->umem pointer. This pointer is accessed at the release stage and it causes to the following crash. There is not enough to simply change umem to point to NULL, because the MR struct is needed to be accessed during MR deregistration phase, so delay kfree too. [ 6.237617] BUG: unable to handle kernel NULL pointer dereference a 0000000000000228 [ 6.238756] IP: ib_dereg_mr+0xd/0x30 [ 6.239264] PGD 80000000167eb067 P4D 80000000167eb067 PUD 167f9067 PMD 0 [ 6.240320] Oops: 0000 [#1] SMP PTI [ 6.240782] CPU: 0 PID: 367 Comm: dereg Not tainted 4.16.0-rc1-00029-gc198fafe0453 #183 [ 6.242120] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 6.244504] RIP: 0010:ib_dereg_mr+0xd/0x30 [ 6.245253] RSP: 0018:ffffaf5d001d7d68 EFLAGS: 00010246 [ 6.246100] RAX: 0000000000000000 RBX: ffff95d4172daf00 RCX: 0000000000000000 [ 6.247414] RDX: 00000000ffffffff RSI: 0000000000000001 RDI: ffff95d41a317600 [ 6.248591] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 [ 6.249810] R10: ffff95d417033c10 R11: 0000000000000000 R12: ffff95d4172c3a80 [ 6.251121] R13: ffff95d4172c3720 R14: ffff95d4172c3a98 R15: 00000000ffffffff [ 6.252437] FS: 0000000000000000(0000) GS:ffff95d41fc00000(0000) knlGS:0000000000000000 [ 6.253887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.254814] CR2: 0000000000000228 CR3: 00000000172b4000 CR4: 00000000000006b0 [ 6.255943] Call Trace: [ 6.256368] remove_commit_idr_uobject+0x1b/0x80 [ 6.257118] uverbs_cleanup_ucontext+0xe4/0x190 [ 6.257855] ib_uverbs_cleanup_ucontext.constprop.14+0x19/0x40 [ 6.258857] ib_uverbs_close+0x2a/0x100 [ 6.259494] __fput+0xca/0x1c0 [ 6.259938] task_work_run+0x84/0xa0 [ 6.260519] do_exit+0x312/0xb40 [ 6.261023] ? __do_page_fault+0x24d/0x490 [ 6.261707] do_group_exit+0x3a/0xa0 [ 6.262267] SyS_exit_group+0x10/0x10 [ 6.262802] do_syscall_64+0x75/0x180 [ 6.263391] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 6.264253] RIP: 0033:0x7f1b39c49488 [ 6.264827] RSP: 002b:00007ffe2de05b68 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 [ 6.266049] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1b39c49488 [ 6.267187] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000 [ 6.268377] RBP: 00007f1b39f258e0 R08: 00000000000000e7 R09: ffffffffffffff98 [ 6.269640] R10: 00007f1b3a147260 R11: 0000000000000246 R12: 00007f1b39f258e0 [ 6.270783] R13: 00007f1b39f2ac20 R14: 0000000000000000 R15: 0000000000000000 [ 6.271943] Code: 74 07 31 d2 e9 25 d8 6c 00 b8 da ff ff ff c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 07 53 48 8b 5f 08 <48> 8b 80 28 02 00 00 e8 f7 d7 6c 00 85 c0 75 04 3e ff 4b 18 5b [ 6.274927] RIP: ib_dereg_mr+0xd/0x30 RSP: ffffaf5d001d7d68 [ 6.275760] CR2: 0000000000000228 [ 6.276200] ---[ end trace a35641f1c474bd20 ]--- Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters") Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 13 Mar, 2018 2 commits
-
-
Boris Pismenny authored
This patch validates user provided input to prevent integer overflow due to integer manipulation in the mlx5_ib_create_srq function. Cc: syzkaller <syzkaller@googlegroups.com> Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Boris Pismenny authored
Add a check for the length of the qpin structure to prevent out-of-bounds reads BUG: KASAN: slab-out-of-bounds in create_raw_packet_qp+0x114c/0x15e2 Read of size 8192 at addr ffff880066b99290 by task syz-executor3/549 CPU: 3 PID: 549 Comm: syz-executor3 Not tainted 4.15.0-rc2+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0x8d/0xd4 print_address_description+0x73/0x290 kasan_report+0x25c/0x370 ? create_raw_packet_qp+0x114c/0x15e2 memcpy+0x1f/0x50 create_raw_packet_qp+0x114c/0x15e2 ? create_raw_packet_qp_tis.isra.28+0x13d/0x13d ? lock_acquire+0x370/0x370 create_qp_common+0x2245/0x3b50 ? destroy_qp_user.isra.47+0x100/0x100 ? kasan_kmalloc+0x13d/0x170 ? sched_clock_cpu+0x18/0x180 ? fs_reclaim_acquire.part.15+0x5/0x30 ? __lock_acquire+0xa11/0x1da0 ? sched_clock_cpu+0x18/0x180 ? kmem_cache_alloc_trace+0x17e/0x310 ? mlx5_ib_create_qp+0x30e/0x17b0 mlx5_ib_create_qp+0x33d/0x17b0 ? sched_clock_cpu+0x18/0x180 ? create_qp_common+0x3b50/0x3b50 ? lock_acquire+0x370/0x370 ? __radix_tree_lookup+0x180/0x220 ? uverbs_try_lock_object+0x68/0xc0 ? rdma_lookup_get_uobject+0x114/0x240 create_qp.isra.5+0xce4/0x1e20 ? ib_uverbs_ex_create_cq_cb+0xa0/0xa0 ? copy_ah_attr_from_uverbs.isra.2+0xa00/0xa00 ? ib_uverbs_cq_event_handler+0x160/0x160 ? __might_fault+0x17c/0x1c0 ib_uverbs_create_qp+0x21b/0x2a0 ? ib_uverbs_destroy_cq+0x2e0/0x2e0 ib_uverbs_write+0x55a/0xad0 ? ib_uverbs_destroy_cq+0x2e0/0x2e0 ? ib_uverbs_destroy_cq+0x2e0/0x2e0 ? ib_uverbs_open+0x760/0x760 ? futex_wake+0x147/0x410 ? check_prev_add+0x1680/0x1680 ? do_futex+0x3d3/0xa60 ? sched_clock_cpu+0x18/0x180 __vfs_write+0xf7/0x5c0 ? ib_uverbs_open+0x760/0x760 ? kernel_read+0x110/0x110 ? lock_acquire+0x370/0x370 ? __fget+0x264/0x3b0 vfs_write+0x18a/0x460 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x18/0x85 RIP: 0033:0x4477b9 RSP: 002b:00007f1822cadc18 EFLAGS: 00000292 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004477b9 RDX: 0000000000000070 RSI: 000000002000a000 RDI: 0000000000000005 RBP: 0000000000708000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000292 R12: 00000000ffffffff R13: 0000000000005d70 R14: 00000000006e6e30 R15: 0000000020010ff0 Allocated by task 549: __kmalloc+0x15e/0x340 kvmalloc_node+0xa1/0xd0 create_user_qp.isra.46+0xd42/0x1610 create_qp_common+0x2e63/0x3b50 mlx5_ib_create_qp+0x33d/0x17b0 create_qp.isra.5+0xce4/0x1e20 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0x55a/0xad0 __vfs_write+0xf7/0x5c0 vfs_write+0x18a/0x460 SyS_write+0xc7/0x1a0 entry_SYSCALL_64_fastpath+0x18/0x85 Freed by task 368: kfree+0xeb/0x2f0 kernfs_fop_release+0x140/0x180 __fput+0x266/0x700 task_work_run+0x104/0x180 exit_to_usermode_loop+0xf7/0x110 syscall_return_slowpath+0x298/0x370 entry_SYSCALL_64_fastpath+0x83/0x85 The buggy address belongs to the object at ffff880066b99180 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 272 bytes inside of 512-byte region [ffff880066b99180, ffff880066b99380) The buggy address belongs to the page: page:000000006040eedd count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 0000000000000000 0000000000000000 0000000180190019 raw: ffffea00019a7500 0000000b0000000b ffff88006c403080 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880066b99180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880066b99200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff880066b99280: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880066b99300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880066b99380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Cc: syzkaller <syzkaller@googlegroups.com> Fixes: 0fb2ed66 ("IB/mlx5: Add create and destroy functionality for Raw Packet QP") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 09 Mar, 2018 2 commits
-
-
Leon Romanovsky authored
The user can provide very large cqe_size which will cause to integer overflow as it can be seen in the following UBSAN warning: ======================================================================= UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/cq.c:1192:53 signed integer overflow: 64870 * 65536 cannot be represented in type 'int' CPU: 0 PID: 267 Comm: syzkaller605279 Not tainted 4.15.0+ #90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ? dma_virt_map_sg+0x22c/0x22c ubsan_epilogue+0xe/0x81 handle_overflow+0x1f3/0x251 ? __ubsan_handle_negate_overflow+0x19b/0x19b ? lock_acquire+0x440/0x440 mlx5_ib_resize_cq+0x17e7/0x1e40 ? cyc2ns_read_end+0x10/0x10 ? native_read_msr_safe+0x6c/0x9b ? cyc2ns_read_end+0x10/0x10 ? mlx5_ib_modify_cq+0x220/0x220 ? sched_clock_cpu+0x18/0x200 ? lookup_get_idr_uobject+0x200/0x200 ? rdma_lookup_get_uobject+0x145/0x2f0 ib_uverbs_resize_cq+0x207/0x3e0 ? ib_uverbs_ex_create_cq+0x250/0x250 ib_uverbs_write+0x7f9/0xef0 ? cyc2ns_read_end+0x10/0x10 ? print_irqtrace_events+0x280/0x280 ? ib_uverbs_ex_create_cq+0x250/0x250 ? uverbs_devnode+0x110/0x110 ? sched_clock_cpu+0x18/0x200 ? do_raw_spin_trylock+0x100/0x100 ? __lru_cache_add+0x16e/0x290 __vfs_write+0x10d/0x700 ? uverbs_devnode+0x110/0x110 ? kernel_read+0x170/0x170 ? sched_clock_cpu+0x18/0x200 ? security_file_permission+0x93/0x260 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x1e/0x8b RIP: 0033:0x433549 RSP: 002b:00007ffe63bd1ea8 EFLAGS: 00000217 ======================================================================= Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 3.13 Fixes: bde51583 ("IB/mlx5: Add support for resize CQ") Reported-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Doug Ledford authored
The original commit of this patch has a munged log message that is missing several of the tags the original author intended to be on the patch. This was due to patchworks misinterpreting a cut-n-paste separator line as an end of message line and munging the mbox that was used to import the patch: https://patchwork.kernel.org/patch/10264089/ The original patch will be reapplied with a fixed commit message so the proper tags are applied. This reverts commit aa0de36a. Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 07 Mar, 2018 14 commits
-
-
Leon Romanovsky authored
The QP state is limited and declared in enum ib_qp_state, but ucma user was able to supply any possible (u32) value. Reported-by: syzbot+0df1ab766f8924b1edba@syzkaller.appspotmail.com Fixes: 75216638 ("RDMA/cma: Export rdma cm interface to userspace") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
The user can provide very large cqe_size which will cause to integer overflow as it can be seen in the following UBSAN warning: Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
Users of ucma are supposed to provide size of option level, in most paths it is supposed to be equal to u8 or u16, but it is not the case for the IB path record, where it can be multiple of struct ib_path_rec_data. This patch takes simplest possible approach and prevents providing values more than possible to allocate. Reported-by: syzbot+a38b0e9f694c379ca7ce@syzkaller.appspotmail.com Fixes: 7ce86409 ("RDMA/ucma: Allow user space to set service type") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Parav Pandit authored
resolved_dev returned might be NULL as ifindex is transient number. Ignoring NULL check of resolved_dev might crash the kernel. Therefore perform NULL check before accessing resolved_dev. Additionally rdma_resolve_ip_route() invokes addr_resolve() which performs check and address translation for loopback ifindex. Therefore, checking it again in rdma_resolve_ip_route() is not helpful. Therefore, the code is simplified to avoid IFF_LOOPBACK check. Fixes: 20029832 ("IB/core: Validate route when we init ah") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Selvin Xavier authored
Hitting the following hardlockup due to a race condition in error CQE processing. [26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req [26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa [26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4 [26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace [26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded [26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, [26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] [26156.472639] Call Trace: [26156.475379] <NMI> [<ffffffff98d0d722>] dump_stack+0x19/0x1b [26156.481833] [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140 [26156.489341] [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100 [26156.496256] [<ffffffff98787c24>] perf_event_overflow+0x14/0x20 [26156.502887] [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510 [26156.509813] [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50 [26156.516738] [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150 [26156.523273] [<ffffffff98d17be8>] do_nmi+0x218/0x460 [26156.528834] [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e [26156.534980] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.543268] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.551556] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.559842] <EOE> [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf [26156.567555] [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30 [26156.573696] [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re] [26156.581789] [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re] [26156.589493] [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re] The issue happens if RQ poll_cq or SQ poll_cq or Async error event tries to put the error QP in flush list. Since SQ and RQ of each error qp are added to two different flush list, we need to protect it using locks of corresponding CQs. Difference in order of acquiring the lock in SQ poll_cq and RQ poll_cq can cause a hard lockup. Revisits the locking strategy and removes the usage of qplib_cq.hwq.lock. Instead of this lock, introduces qplib_cq.flush_lock to handle addition/deletion of QPs in flush list. Also, always invoke the flush_lock in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential deadlock. Other than the poll_cq context, the movement of QP to/from flush list can be done in modify_qp context or from an async error event from HW. Synchronize these operations using the bnxt_re verbs layer CQ locks. To achieve this, adds a call back to the HW abstraction layer(qplib) to bnxt_re ib_verbs layer in case of async error event. Also, removes the buddy cq functions as it is no longer required. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Max Gurtovoy authored
Fix warning limit for kernel stack consumption: drivers/infiniband/core/cq.c: In function 'ib_process_cq_direct': drivers/infiniband/core/cq.c:78:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Using smaller ib_wc array on the stack brings us comfortably below that limit again. Fixes: 246d8b18 ("IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct") Reported-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sergey Gorenko <sergeygo@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Dan Carpenter authored
"err" is either zero or possibly uninitialized here. It should be -EINVAL. Fixes: 427c1e7b ("{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Mark Bloch authored
The series that introduced dual port RoCE mode assumed that we don't have a dual port HCA that use the mlx5 driver, this is not the case for Connect-IB HCAs. This reasoning led to assigning 1 as the native port index which causes issue when the second port is used. For example query_pkey() when called on the second port will return values of the first port. Make sure that we assign the right port index as the native port index. Fixes: 32f69e4b ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Jack M authored
The commit cited below added a gid_type field (RoCEv1 or RoCEv2) to GID properties. When adding GIDs, this gid_type field was copied over to the hardware gid table. However, when deleting GIDs, the gid_type field was not copied over to the hardware gid table. As a result, when running RoCEv2, all RoCEv2 gids in the hardware gid table were set to type RoCEv1 when any gid was deleted. This problem would persist until the next gid was added (which would again restore the gid_type field for all the gids in the hardware gid table). Fix this by copying over the gid_type field to the hardware gid table when deleting gids, so that the gid_type of all remaining gids is preserved when a gid is deleted. Fixes: b699a859 ("IB/mlx4: Add gid_type to GID properties") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Jack Morgenstein authored
When using IPv4 addresses in RoCEv2, the GID format for the mapped IPv4 address should be: ::ffff:<4-byte IPv4 address>. In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords zeroed out by memset, which resulted in deleting the ffff field. However, since procedure ipv6_addr_v4mapped() already verifies that the gid has format ::ffff:<ipv4 address>, no change is needed for the gid, and the memset can simply be removed. Fixes: 7e57b85c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Kalderon, Michal authored
iWARP does not support RDMA WRITE or SEND with immediate data. Driver should check this before submitting to FW and return an immediate error Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Kalderon, Michal authored
Race in qedr_poll_cq, lastest_cqe wasn't protected by lock, leading to a case where two context's accessing poll_cq at the same time lead to one of them having a pointer to an old latest_cqe and reading an invalid cqe element Signed-off-by: Amit Radzi <Amit.Radzi@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Kalderon, Michal authored
Fix iWARP connect and listen to use the mapped port for ipv4 and ipv6. Without this fixed, running on a server that has iwpmd enabled will not use the correct port Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Kalderon, Michal authored
The wrong parameter was passed to dst_neigh_lookup Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 28 Feb, 2018 3 commits
-
-
Muneendra Kumar M authored
dev_get_by_index is being called in addr_resolve function which returns NULL and NULL pointer access leads to kernel crash. Following call trace is observed while running rdma_lat test application [ 146.173149] BUG: unable to handle kernel NULL pointer dereference at 00000000000004a0 [ 146.173198] IP: addr_resolve+0x9e/0x3e0 [ib_core] [ 146.173221] PGD 0 P4D 0 [ 146.173869] Oops: 0000 [#1] SMP PTI [ 146.182859] CPU: 8 PID: 127 Comm: kworker/8:1 Tainted: G O 4.15.0-rc6+ #18 [ 146.183758] Hardware name: LENOVO System x3650 M5: -[8871AC1]-/01KN179, BIOS-[TCE132H-2.50]- 10/11/2017 [ 146.184691] Workqueue: ib_cm cm_work_handler [ib_cm] [ 146.185632] RIP: 0010:addr_resolve+0x9e/0x3e0 [ib_core] [ 146.186584] RSP: 0018:ffffc9000362faa0 EFLAGS: 00010246 [ 146.187521] RAX: 000000000000001b RBX: ffffc9000362fc08 RCX: 0000000000000006 [ 146.188472] RDX: 0000000000000000 RSI: 0000000000000096 RDI : ffff88087fc16990 [ 146.189427] RBP: ffffc9000362fb18 R08: 00000000ffffff9d R09: 00000000000004ac [ 146.190392] R10: 00000000000001e7 R11: 0000000000000001 R12: ffff88086af2e090 [ 146.191361] R13: 0000000000000000 R14: 0000000000000001 R15: 00000000ffffff9d [ 146.192327] FS: 0000000000000000(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000 [ 146.193301] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 146.194274] CR2: 00000000000004a0 CR3: 000000000220a002 CR4: 00000000003606e0 [ 146.195258] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 146.196256] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 146.197231] Call Trace: [ 146.198209] ? rdma_addr_register_client+0x30/0x30 [ib_core] [ 146.199199] rdma_resolve_ip+0x1af/0x280 [ib_core] [ 146.200196] rdma_addr_find_l2_eth_by_grh+0x154/0x2b0 [ib_core] The below patch adds the missing NULL pointer check returned by dev_get_by_index before accessing the netdev to avoid kernel crash. We observed the below crash when we try to do the below test. server client --------- --------- |1.1.1.1|<----rxe-channel--->|1.1.1.2| --------- --------- On server: rdma_lat -c -n 2 -s 1024 On client:rdma_lat 1.1.1.1 -c -n 2 -s 1024 Fixes: 20029832 ("IB/core: Validate route when we init ah") Signed-off-by: Muneendra <muneendra.kumar@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Selvin Xavier authored
Release the netdev references in the cleanup path. Invokes the cleanup routines if bnxt_re_ib_reg fails. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Devesh Sharma authored
To support host systems with non 4K page size, l2_db_size shall be calculated with 4096 instead of PAGE_SIZE. Also, supply the host page size to FW during initialization. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-