- 19 Dec, 2018 32 commits
-
-
Lorenzo Bianconi authored
Add napi_disable routine in gro_cells_destroy since starting from commit c42858ea ("gro_cells: remove spinlock protecting receive queues") gro_cell_poll and gro_cells_destroy can run concurrently on napi_skbs list producing a kernel Oops if the tunnel interface is removed while gro_cell_poll is running. The following Oops has been triggered removing a vxlan device while the interface is receiving traffic [ 5628.948853] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 5628.949981] PGD 0 P4D 0 [ 5628.950308] Oops: 0002 [#1] SMP PTI [ 5628.950748] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.20.0-rc6+ #41 [ 5628.952940] RIP: 0010:gro_cell_poll+0x49/0x80 [ 5628.955615] RSP: 0018:ffffc9000004fdd8 EFLAGS: 00010202 [ 5628.956250] RAX: 0000000000000000 RBX: ffffe8ffffc08150 RCX: 0000000000000000 [ 5628.957102] RDX: 0000000000000000 RSI: ffff88802356bf00 RDI: ffffe8ffffc08150 [ 5628.957940] RBP: 0000000000000026 R08: 0000000000000000 R09: 0000000000000000 [ 5628.958803] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000040 [ 5628.959661] R13: ffffe8ffffc08100 R14: 0000000000000000 R15: 0000000000000040 [ 5628.960682] FS: 0000000000000000(0000) GS:ffff88803ea00000(0000) knlGS:0000000000000000 [ 5628.961616] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5628.962359] CR2: 0000000000000008 CR3: 000000000221c000 CR4: 00000000000006b0 [ 5628.963188] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5628.964034] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5628.964871] Call Trace: [ 5628.965179] net_rx_action+0xf0/0x380 [ 5628.965637] __do_softirq+0xc7/0x431 [ 5628.966510] run_ksoftirqd+0x24/0x30 [ 5628.966957] smpboot_thread_fn+0xc5/0x160 [ 5628.967436] kthread+0x113/0x130 [ 5628.968283] ret_from_fork+0x3a/0x50 [ 5628.968721] Modules linked in: [ 5628.969099] CR2: 0000000000000008 [ 5628.969510] ---[ end trace 9d9dedc7181661fe ]--- [ 5628.970073] RIP: 0010:gro_cell_poll+0x49/0x80 [ 5628.972965] RSP: 0018:ffffc9000004fdd8 EFLAGS: 00010202 [ 5628.973611] RAX: 0000000000000000 RBX: ffffe8ffffc08150 RCX: 0000000000000000 [ 5628.974504] RDX: 0000000000000000 RSI: ffff88802356bf00 RDI: ffffe8ffffc08150 [ 5628.975462] RBP: 0000000000000026 R08: 0000000000000000 R09: 0000000000000000 [ 5628.976413] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000040 [ 5628.977375] R13: ffffe8ffffc08100 R14: 0000000000000000 R15: 0000000000000040 [ 5628.978296] FS: 0000000000000000(0000) GS:ffff88803ea00000(0000) knlGS:0000000000000000 [ 5628.979327] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5628.980044] CR2: 0000000000000008 CR3: 000000000221c000 CR4: 00000000000006b0 [ 5628.980929] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5628.981736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5628.982409] Kernel panic - not syncing: Fatal exception in interrupt [ 5628.983307] Kernel Offset: disabled Fixes: c42858ea ("gro_cells: remove spinlock protecting receive queues") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bryan Whitehead authored
The MAC Reset was noticed to erase important EEPROM settings. It is also unnecessary since a chip wide reset was done earlier in initialization, and that reset preserves EEPROM settings. There for this patch removes the unnecessary MAC specific reset. Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5-fixes-2018-12-19 Some fixes for the mlx5 driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alaa Hleihel authored
mlx5 driver falsely advertises support of software timestamping. Fix it by removing the false indication. Fixes: ef9814de ("net/mlx5e: Add HW timestamping (TS) support") Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Yuval Avnery authored
Expression terminated with "," instead of ";", resulted in set_fte getting bad value for modify_enable_mask field. Fixes: bd5251db ("net/mlx5_core: Introduce flow steering destination of type counter") Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Tariq Toukan authored
When the completion queue of the RQ is empty, do not immediately return. If left-over decompressed CQEs (from the previous cycle) were processed, need to go to the finalization part of the poll function. Bug exists only when CQE compression is turned ON. This solves the following issue: mlx5_core 0000:82:00.1: mlx5_eq_int:544:(pid 0): CQ error on CQN 0xc08, syndrome 0x1 mlx5_core 0000:82:00.1 p4p2: mlx5e_cq_error_event: cqn=0x000c08 event=0x04 Fixes: 4b7dfc99 ("net/mlx5e: Early-return on empty completion queues") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Cong Wang authored
syzbot reported the use of uninitialized udp6_addr::sin6_scope_id. We can just set ::sin6_scope_id to zero, as tunnels are unlikely to use an IPv6 address that needs a scope id and there is no interface to bind in this context. For net-next, it looks different as we have cfg->bind_ifindex there so we can probably call ipv6_iface_scope_id(). Same for ::sin6_flowinfo, tunnels don't use it. Fixes: 8024e028 ("udp: Add udp_sock_create for UDP tunnels to open listener socket") Reported-by: syzbot+c56449ed3652e6720f30@syzkaller.appspotmail.com Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
The current code has 2 problems. It assumes that the RX ring for the loopback packet is combined with the TX ring. This is not true if the ethtool channels are set to non-combined mode. The second problem is that it won't work on 57500 chips without adjusting the logic to get the proper completion ring (cpr) pointer. Fix both issues by locating the proper cpr pointer through the RX ring. Fixes: e44758b7 ("bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Shamir Rabinovitch says: ==================== WARNING in rds_message_alloc_sgs This patch set fix google syzbot rds bug found in linux-next. The first patch solve the syzbot issue. The second patch fix issue mentioned by Leon Romanovsky that drivers should not call WARN_ON as result from user input. syzbot bug report can be foud here: https://lkml.org/lkml/2018/10/31/28 v1->v2: - patch 1: make rds_iov_vector fields name more descriptive (Hakon) - patch 1: fix potential mem leak in rds_rm_size if krealloc fail (Hakon) v2->v3: - patch 2: harden rds_sendmsg for invalid number of sgs (Gerd) v3->v4 - Santosh a.b. on both patches + repost to net-dev ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
shamir rabinovitch authored
per comment from Leon in rdma mailing list https://lkml.org/lkml/2018/10/31/312 : Please don't forget to remove user triggered WARN_ON. https://lwn.net/Articles/769365/ "Greg Kroah-Hartman raised the problem of core kernel API code that will use WARN_ON_ONCE() to complain about bad usage; that will not generate the desired result if WARN_ON_ONCE() is configured to crash the machine. He was told that the code should just call pr_warn() instead, and that the called function should return an error in such situations. It was generally agreed that any WARN_ON() or WARN_ON_ONCE() calls that can be triggered from user space need to be fixed." in addition harden rds_sendmsg to detect and overcome issues with invalid sg count and fail the sendmsg. Suggested-by: Leon Romanovsky <leon@kernel.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
shamir rabinovitch authored
redundant copy_from_user in rds_sendmsg system call expose rds to issue where rds_rdma_extra_size walk the rds iovec and and calculate the number pf pages (sgs) it need to add to the tail of rds message and later rds_cmsg_rdma_args copy the rds iovec again and re calculate the same number and get different result causing WARN_ON in rds_message_alloc_sgs. fix this by doing the copy_from_user only once per rds_sendmsg system call. When issue occur the below dump is seen: WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+ #101 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 panic+0x2ad/0x55c kernel/panic.c:188 __warn.cold.8+0x20/0x45 kernel/panic.c:540 report_bug+0x254/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969 RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316 Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa <0f> 0b 31 ff 44 89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293 RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6 RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004 RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67 R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000 R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005 rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623 rds_cmsg_send net/rds/send.c:971 [inline] rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117 __sys_sendmsg+0x11d/0x280 net/socket.c:2155 __do_sys_sendmsg net/socket.c:2164 [inline] __se_sys_sendmsg net/socket.c:2162 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x44a859 Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859 RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003 RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c Kernel Offset: disabled Rebooting in 86400 seconds.. Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'wireless-drivers-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.20 Last set of fixes for 4.20. All (except the mt76 fix) of these are important fixes to user reported problems and pretty small in size. rtlwifi * fix skb leak mwifiex * revert a commit from v4.19 due to problems with locking mt76 * fix a potential NULL derenfence * add entry to MAINTAINERS iwlwifi * fix a firmware crash which was a regression introduced in v4.20-rc4 ath10k * fix a firmware crash with wcn3990 firmware ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'mac80211-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just three fixes: * fix a memory leak in an error path * fix TXQs in interface teardown * free fraglist if we used it internally before returning SKB ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rakesh Pillai authored
HL2.0 firmware does not support setting quiet mode. If the host driver sends the quiet mode setting command to the HL2.0 firmware, it crashes with the below signature. fatal error received: err_qdi.c:456:EX:wlan_process:1:WLAN RT:207a:PC=b001b4f0 The quiet mode command support is exposed by the firmware via thermal throttle wmi service. Enable ath10k thermal support if thermal throttle wmi service bit is set. 10.x firmware versions support this feature by default, but unfortunately do not advertise the support via service flags, hence have to manually set the service flag in ath10k_core_compat_services(). Tested on QCA988X with 10.2.4.70.9-2. Also tested on WCN3990. Co-developed-by: Govind Singh <govinds@codeaurora.org> Co-developed-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Rakesh Pillai <pillair@codeaurora.org> Signed-off-by: Govind Singh <govinds@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
-
Sara Sharon authored
mac80211 uses the frag list to build AMSDU. When freeing the skb, it may not be really freed, since someone is still holding a reference to it. In that case, when TCP skb is being retransmitted, the pointer to the frag list is being reused, while the data in there is no longer valid. Since we will never get frag list from the network stack, as mac80211 doesn't advertise the capability, we can safely free and nullify it before releasing the SKB. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Johannes Berg authored
If validate_pae_over_nl80211() were to fail in nl80211_crypto_settings(), we might leak the 'connkeys' allocation. Fix this. Fixes: 64bf3d4b ("nl80211: Add CONTROL_PORT_OVER_NL80211 attribute") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Alexei Starovoitov says: ==================== pull-request: bpf 2018-12-18 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) promote bpf_perf_event.h to mandatory UAPI header, from Masahiro. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Myungho Jung authored
clcsock can be released while kernel_accept() references it in TCP listen worker. Also, clcsock needs to wake up before released if TCP fallback is used and the clcsock is blocked by accept. Add a lock to safely release clcsock and call kernel_sock_shutdown() to wake up clcsock from accept in smc_release(). Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com Signed-off-by: Myungho Jung <mhjungk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
Currently variable data0 is not being initialized so a garbage value is being passed to vxge_hw_vpath_fw_api and this value is being written to the rts_access_steer_data0 register. There are other occurrances where data0 is being initialized to zero (e.g. in function vxge_hw_upgrade_read_version) so I think it makes sense to ensure data0 is initialized likewise to 0. Detected by CoverityScan, CID#140696 ("Uninitialized scalar variable") Fixes: 8424e00d ("vxge: serialize access to steering control register") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Juergen Gross authored
At least old Xen net backends seem to send frags with no real data sometimes. In case such a fragment happens to occur with the frag limit already reached the frontend will BUG currently even if this situation is easily recoverable. Modify the BUG_ON() condition accordingly. Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kunihiko Hayashi authored
Even though the link is down before entering hibernation, there is an issue that the network interface always links up after resuming from hibernation. If the link is still down before enabling the network interface, and after resuming from hibernation, the phydev->state is forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes up. In suspend sequence, only if the PHY is attached, mdio_bus_phy_suspend() calls phy_stop_machine(), and mdio_bus_phy_resume() calls phy_start_machine(). In resume sequence, it's enough to do the same as mdio_bus_phy_resume() because the state has been preserved. This patch fixes the issue by calling phy_start_machine() in mdio_bus_phy_restore() in the same way as mdio_bus_phy_resume(). Fixes: bc87922f ("phy: Move PHY PM operations into phy_device") Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Martinsen authored
Current state for the lan78xx driver does not allow for changing the MAC address of the interface, without either removing the module (if you compiled it that way) or rebooting the machine. If you attempt to change the MAC address, ifconfig will show the new address, however, the system/interface will not respond to any traffic using that configuration. A few short-term options to work around this are to unload the module and reload it with the new MAC address, change the interface to "promisc", or reboot with the correct configuration to change the MAC. This patch enables the ability to change the MAC address via fairly normal means... ifdown <interface> modify entry in /etc/network/interfaces OR a similar method ifup <interface> Then test via any network communication, such as ICMP requests to gateway. My only test platform for this patch has been a raspberry pi model 3b+. Signed-off-by: Jason Martinsen <jasonmartinsen@msn.com> ----- Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bryan Whitehead authored
The LAN7431 uses an external phy, and it can be found anywhere in the phy address space. This patch uses phy address 1 for LAN7430 only. And searches all addresses otherwise. Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Petr Machata says: ==================== vxlan: Various fixes This patch set contains three fixes for the vxlan driver. Patch #1 fixes handling of offload mark on replaced VXLAN FDB entries. A way to trigger this is to replace the FDB entry with one that can not be offloaded. A future patch set should make it possible to veto such FDB changes. However the FDB might still fail to be offloaded due to another issue, and the offload mark should reflect that. Patch #2 fixes problems in __vxlan_dev_create() when a call to rtnl_configure_link() fails. These failures would be tricky to hit on a real system, the most likely vector is through an error in vxlan_open(). However, with the abovementioned vetoing patchset, vetoing the created entry would trigger the same problems (and be easier to reproduce). Patch #3 fixes a problem in vxlan_changelink(). In situations where the default remote configured in the FDB table (if any) does not exactly match the remote address configured at the VXLAN device, changing the remote address breaks the default FDB entry. Patch #4 is then a self test for this issue. v3: - Patch #2: - Reuse the same errout block for both cleanup paths. Use a bool to decide whether the unregister_netdevice() call should be made. v2: - Drop former patch #3 - Patch #2: - Delete the default entry before calling unregister_netdevice(). That takes care of former patch #3, hence tweak the commit message to mention that problem as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Add a test to exercise the fix from the previous patch. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Default remotes are stored as FDB entries with an Ethernet address of 00:00:00:00:00:00. When a request is made to change a remote address of a VXLAN device, vxlan_changelink() first deletes the existing default remote, and then creates a new FDB entry. This works well as long as the list of default remotes matches exactly the configuration of a VXLAN remote address. Thus when the VXLAN device has a remote of X, there should be exactly one default remote FDB entry X. If the VXLAN device has no remote address, there should be no such entry. Besides using "ip link set", it is possible to manipulate the list of default remotes by using the "bridge fdb". It is therefore easy to break the above condition. Under such circumstances, the __vxlan_fdb_delete() call doesn't delete the FDB entry itself, but just one remote. The following vxlan_fdb_create() then creates a new FDB entry, leading to a situation where two entries exist for the address 00:00:00:00:00:00, each with a different subset of default remotes. An even more obvious breakage rooted in the same cause can be observed when a remote address is configured for a VXLAN device that did not have one before. In that case vxlan_changelink() doesn't remove any remote, and just creates a new FDB entry for the new address: $ ip link add name vx up type vxlan id 2000 dstport 4789 $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent $ ip link set dev vx type vxlan remote 192.0.2.30 $ bridge fdb sh dev vx | grep 00:00:00:00:00:00 00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst 00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts 00:00:00:00:00:00 dst 192.0.2.30 self permanent To fix this, instead of calling vxlan_fdb_create() directly, defer to vxlan_fdb_update(). That has logic to handle the duplicates properly. Additionally, it also handles notifications, so drop that call from changelink as well. Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
When a failure occurs in rtnl_configure_link(), the current code calls unregister_netdevice() to roll back the earlier call to register_netdevice(), and jumps to errout, which calls vxlan_fdb_destroy(). However unregister_netdevice() calls transitively ndo_uninit, which is vxlan_uninit(), and that already takes care of deleting the default FDB entry by calling vxlan_fdb_delete_default(). Since the entry added earlier in __vxlan_dev_create() is exactly the default entry, the cleanup code in the errout block always leads to double free and thus a panic. Besides, since vxlan_fdb_delete_default() always destroys the FDB entry with notification enabled, the deletion of the default entry is notified even before the addition was notified. Instead, move the unregister_netdevice() call after the manual destroy, which solves both problems. Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
When rdst of an offloaded FDB entry is replaced, it certainly isn't offloaded anymore. Drivers are notified about such replacements, and can re-mark the entry as offloaded again if they so wish. However until a driver does so explicitly, assume a replaced FDB entry is not offloaded. Note that replaces coming via vxlan_fdb_external_learn_add() are always immediately followed by an explicit offload marking. Fixes: 0efe1173 ("vxlan: Support marking RDSTs as offloaded") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Anssi Hannula says: ==================== net: macb: DMA race condition fixes Here are a couple of race condition fixes for the macb driver. The first two are for issues observed at runtime on real HW. v2: - added received Tested-bys and Acked-bys to the first two patches - in patch 3/3, moved the timestamp protection barrier closer to the timestamp reads - in patch 3/3, removed unnecessary move of the addr assignment in gem_rx() to keep the patch minimal for maximum clarity - in patch 3/3, clarified commit message and comments The 3/3 is the same one I improperly sent last week as a standalone patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Anssi Hannula authored
When reading buffer descriptors on RX or on TX completion, an RX_USED/TX_USED bit is checked first to ensure that the descriptors have been populated, i.e. the ownership has been transferred. However, there are no memory barriers to ensure that the data protected by the RX_USED/TX_USED bit is up-to-date with respect to that bit. Specifically: - TX timestamp descriptors may be loaded before ctrl is loaded for the TX_USED check, which is racy as the descriptors may be updated between the loads, causing old timestamp descriptor data to be used. - RX ctrl may be loaded before addr is loaded for the RX_USED check, which is racy as a new frame may be written between the loads, causing old ctrl descriptor data to be used. This issue exists for both macb_rx() and gem_rx() variants. Fix the races by adding DMA read memory barriers on those paths and reordering the reads in macb_rx(). I have not observed any actual problems in practice caused by these being missing, though. Tested on a ZynqMP based system. Fixes: 89e5785f ("[PATCH] Atmel MACB ethernet driver") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Anssi Hannula authored
Bit RX_USED set to 0 in the address field allows the controller to write data to the receive buffer descriptor. The driver does not ensure the ctrl field is ready (cleared) when the controller sees the RX_USED=0 written by the driver. The ctrl field might only be cleared after the controller has already updated it according to a newly received frame, causing the frame to be discarded in gem_rx() due to unexpected ctrl field contents. A message is logged when the above scenario occurs: macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor Fix the issue by ensuring that when the controller sees RX_USED=0 the ctrl field is already cleared. This issue was observed on a ZynqMP based system. Fixes: 4df95131 ("net/macb: change RX path for GEM") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Anssi Hannula authored
64-bit DMA addresses are split in upper and lower halves that are written in separate fields on GEM. For RX, bit 0 of the address is used as the ownership bit (RX_USED). When the RX_USED bit is unset the controller is allowed to write data to the buffer. The driver does not guarantee that the controller already sees the upper half when the RX_USED bit is cleared, possibly resulting in the controller writing an incoming frame to an address with an incorrect upper half and therefore possibly corrupting unrelated system memory. Fix that by adding the necessary DMA memory barrier between the writes. This corruption was observed on a ZynqMP based system. Fixes: fff8019a ("net: macb: Add 64 bit addressing support for GEM") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Acked-by: Harini Katakam <harini.katakam@xilinx.com> Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 Dec, 2018 8 commits
-
-
Davide Caratti authored
Herton reports the following error when building a userspace program that includes net_stamp.h: In file included from foo.c:2: /usr/include/linux/net_tstamp.h:158:2: error: unknown type name ‘clockid_t’ clockid_t clockid; /* reference clockid */ ^~~~~~~~~ Fix it by using __kernel_clockid_t in place of clockid_t. Fixes: 80b14dee ("net: Add a new socket option for a future transmit time.") Cc: Timothy Redaelli <tredaelli@redhat.com> Reported-by: Herton R. Krzesinski <herton@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Tested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Beznea authored
On some platforms (currently detected only on SAMA5D4) TX might stuck even the pachets are still present in DMA memories and TX start was issued for them. This happens due to race condition between MACB driver updating next TX buffer descriptor to be used and IP reading the same descriptor. In such a case, the "TX USED BIT READ" interrupt is asserted. GEM/MACB user guide specifies that if a "TX USED BIT READ" interrupt is asserted TX must be restarted. Restart TX if used bit is read and packets are present in software TX queue. Packets are removed from software TX queue if TX was successful for them (see macb_tx_interrupt()). Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
The function should return an error if create_singlethread_workqueue() fails. Fixes: 34877a15 ("net: stmmac: Rework and fix TX Timeout code") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
Similar to commit 143ece65 ("tipc: check tsk->group in tipc_wait_for_cond()") we have to reload grp->dests too after we re-take the sock lock. This means we need to move the dsts check after tipc_wait_for_cond() too. Fixes: 75da2163 ("tipc: introduce communication groups") Reported-and-tested-by: syzbot+99f20222fc5018d2b97a@syzkaller.appspotmail.com Cc: Ying Xue <ying.xue@windriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
We accidentally deleted the code to set "rc = -ENOMEM;" and this patch adds it back. Fixes: d2201a21 ("qed: No need for LL2 frags indication") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
The mvpp2_phylink_validate() function sets all modes that are supported by a given PPv2 port. A recent change made all ports to advertise they support 10G modes in certain cases. This is not true, as only the port #0 can do so. This patch fixes it. Fixes: 01b3fd5a ("net: mvpp2: fix detection of 10G SFP modules") Cc: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jorgen Hansen authored
If a server side socket is bound to an address, but not in the listening state yet, incoming connection requests should receive a reset control packet in response. However, the function used to send the reset silently drops the reset packet if the sending socket isn't bound to a remote address (as is the case for a bound socket not yet in the listening state). This change fixes this by using the src of the incoming packet as destination for the reset packet in this case. Fixes: d021c344 ("VSOCK: Introduce VM Sockets") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsecDavid S. Miller authored
Steffen Klassert says: ==================== pull request (net): ipsec 2018-12-18 1) Fix error return code in xfrm_output_one() when no dst_entry is attached to the skb. From Wei Yongjun. 2) The xfrm state hash bucket count reported to userspace is off by one. Fix from Benjamin Poirier. 3) Fix NULL pointer dereference in xfrm_input when skb_dst_force clears the dst_entry. 4) Fix freeing of xfrm states on acquire. We use a dedicated slab cache for the xfrm states now, so free it properly with kmem_cache_free. From Mathias Krause. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-