- 28 Nov, 2018 8 commits
-
-
Lihong Yang authored
In __i40e_del_filter function, the flag __I40E_MACVLAN_SYNC_PENDING for the PF state is wrongly set for the VSI. Deleting any of the MAC filters has caused the incorrect syncing for the PF. Fix it by setting this state flag to the intended PF. CC: stable <stable@vger.kernel.org> Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Yunjian Wang authored
This patch fixes the variable 'phy_word' may be used uninitialized. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Bryan Whitehead authored
This driver was designed to work with both LAN7430 and LAN7431. The only difference between the two is the LAN7431 has support for external phy. This change adds LAN7431 to the list of recognized devices supported by this driver. Updates for v2: changed 'fixes' tag to match defined format fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Maloy authored
We see the following lockdep warning: [ 2284.078521] ====================================================== [ 2284.078604] WARNING: possible circular locking dependency detected [ 2284.078604] 4.19.0+ #42 Tainted: G E [ 2284.078604] ------------------------------------------------------ [ 2284.078604] rmmod/254 is trying to acquire lock: [ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0 [ 2284.078604] [ 2284.078604] but task is already holding lock: [ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc] [ 2284.078604] [ 2284.078604] which lock already depends on the new lock. [ 2284.078604] [ 2284.078604] [ 2284.078604] the existing dependency chain (in reverse order) is: [ 2284.078604] [ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}: [ 2284.078604] tipc_node_timeout+0x20a/0x330 [tipc] [ 2284.078604] call_timer_fn+0xa1/0x280 [ 2284.078604] run_timer_softirq+0x1f2/0x4d0 [ 2284.078604] __do_softirq+0xfc/0x413 [ 2284.078604] irq_exit+0xb5/0xc0 [ 2284.078604] smp_apic_timer_interrupt+0xac/0x210 [ 2284.078604] apic_timer_interrupt+0xf/0x20 [ 2284.078604] default_idle+0x1c/0x140 [ 2284.078604] do_idle+0x1bc/0x280 [ 2284.078604] cpu_startup_entry+0x19/0x20 [ 2284.078604] start_secondary+0x187/0x1c0 [ 2284.078604] secondary_startup_64+0xa4/0xb0 [ 2284.078604] [ 2284.078604] -> #0 ((&n->timer)#2){+.-.}: [ 2284.078604] del_timer_sync+0x34/0xa0 [ 2284.078604] tipc_node_delete+0x1a/0x40 [tipc] [ 2284.078604] tipc_node_stop+0xcb/0x190 [tipc] [ 2284.078604] tipc_net_stop+0x154/0x170 [tipc] [ 2284.078604] tipc_exit_net+0x16/0x30 [tipc] [ 2284.078604] ops_exit_list.isra.8+0x36/0x70 [ 2284.078604] unregister_pernet_operations+0x87/0xd0 [ 2284.078604] unregister_pernet_subsys+0x1d/0x30 [ 2284.078604] tipc_exit+0x11/0x6f2 [tipc] [ 2284.078604] __x64_sys_delete_module+0x1df/0x240 [ 2284.078604] do_syscall_64+0x66/0x460 [ 2284.078604] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 2284.078604] [ 2284.078604] other info that might help us debug this: [ 2284.078604] [ 2284.078604] Possible unsafe locking scenario: [ 2284.078604] [ 2284.078604] CPU0 CPU1 [ 2284.078604] ---- ---- [ 2284.078604] lock(&(&tn->node_list_lock)->rlock); [ 2284.078604] lock((&n->timer)#2); [ 2284.078604] lock(&(&tn->node_list_lock)->rlock); [ 2284.078604] lock((&n->timer)#2); [ 2284.078604] [ 2284.078604] *** DEADLOCK *** [ 2284.078604] [ 2284.078604] 3 locks held by rmmod/254: [ 2284.078604] #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30 [ 2284.078604] #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc] [ 2284.078604] #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19 [...} The reason is that the node timer handler sometimes needs to delete a node which has been disconnected for too long. To do this, it grabs the lock 'node_list_lock', which may at the same time be held by the generic node cleanup function, tipc_node_stop(), during module removal. Since the latter is calling del_timer_sync() inside the same lock, we have a potential deadlock. We fix this letting the timer cleanup function use spin_trylock() instead of just spin_lock(), and when it fails to grab the lock it just returns so that the timer handler can terminate its execution. This is safe to do, since tipc_node_stop() anyway is about to delete both the timer and the node instance. Fixes: 6a939f36 ("tipc: Auto removal of peer down node instance") Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bryan Whitehead authored
The lan743x driver, when under heavy traffic load, has been noticed to sometimes hang, or cause a kernel panic. Debugging reveals that the TX napi poll routine was returning the wrong value, 'weight'. Most other drivers return 0. And call napi_complete, instead of napi_complete_done. Additionally when creating the tx napi poll routine. Changed netif_napi_add, to netif_tx_napi_add. Updates for v3: changed 'fixes' tag to match defined format Updates for v2: use napi_complete, instead of napi_complete_done in lan743x_tx_napi_poll use netif_tx_napi_add, instead of netif_napi_add for registration of tx napi poll routine fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
The text in array velocity_gstrings contains a spelling mistake, rename rx_frame_alignement_errors to rx_frame_alignment_errors. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
The text in array s_igu_fifo_error_strs contains a spelling mistake, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Fix a possible NULL pointer dereference in nic_remove routine removing the nicpf module if nic_probe fails. The issue can be triggered with the following reproducer: $rmmod nicvf $rmmod nicpf [ 521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014 [ 521.422777] Mem abort info: [ 521.425561] ESR = 0x96000004 [ 521.428624] Exception class = DABT (current EL), IL = 32 bits [ 521.434535] SET = 0, FnV = 0 [ 521.437579] EA = 0, S1PTW = 0 [ 521.440730] Data abort info: [ 521.443603] ISV = 0, ISS = 0x00000004 [ 521.447431] CM = 0, WnR = 0 [ 521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42 [ 521.457022] [0000000000000014] pgd=0000000000000000 [ 521.461916] Internal error: Oops: 96000004 [#1] SMP [ 521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO) [ 521.523451] pc : nic_remove+0x24/0x88 [nicpf] [ 521.527808] lr : pci_device_remove+0x48/0xd8 [ 521.532066] sp : ffff000013433cc0 [ 521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000 [ 521.540672] x27: 0000000000000000 x26: 0000000000000000 [ 521.545974] x25: 0000000056000000 x24: 0000000000000015 [ 521.551274] x23: ffff8007ff89a110 x22: ffff000001667070 [ 521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000 [ 521.561877] x19: 0000000000000000 x18: 0000000000000025 [ 521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000 [ 521.593683] x7 : 0000000000000000 x6 : 0000000000000001 [ 521.598983] x5 : 0000000000000002 x4 : 0000000000000003 [ 521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184 [ 521.609585] x1 : ffff000001662118 x0 : ffff000008557be0 [ 521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3) [ 521.621490] Call trace: [ 521.623928] nic_remove+0x24/0x88 [nicpf] [ 521.627927] pci_device_remove+0x48/0xd8 [ 521.631847] device_release_driver_internal+0x1b0/0x248 [ 521.637062] driver_detach+0x50/0xc0 [ 521.640628] bus_remove_driver+0x60/0x100 [ 521.644627] driver_unregister+0x34/0x60 [ 521.648538] pci_unregister_driver+0x24/0xd8 [ 521.652798] nic_cleanup_module+0x14/0x111c [nicpf] [ 521.657672] __arm64_sys_delete_module+0x150/0x218 [ 521.662460] el0_svc_handler+0x94/0x110 [ 521.666287] el0_svc+0x8/0xc [ 521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660) Fixes: 4863dea3 ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Nov, 2018 12 commits
-
-
Xin Long authored
I changed to count sk_wmem_alloc by skb truesize instead of 1 to fix the sk_wmem_alloc leak caused by later truesize's change in xfrm in Commit 02968ccf ("sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit"). But I should have also increased sk_wmem_alloc when head->truesize is increased in sctp_packet_gso_append() as xfrm does. Otherwise, sctp gso packet will cause sk_wmem_alloc underflow. Fixes: 02968ccf ("sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
There are spelling mistakes in debug messages, fix them. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
After switching the r8169 driver to use phylib some user reported that their network is broken. This was caused by the genphy PHY driver being used instead of the dedicated PHY driver for the RTL8211B. Users reported that loading the Realtek PHY driver module upfront fixes the issue. See also this mail thread: https://marc.info/?t=154279781800003&r=1&w=2 The issue is quite weird and the root cause seems to be somewhere in the base driver core. The patch works around the issue and may be removed once the actual issue is fixed. The Fixes tag refers to the first reported occurrence of the issue. The issue itself may have been existing much longer and it may affect users of other network chips as well. Users typically will recognize this issue only if their PHY stops working when being used with the genphy driver. Fixes: f1e911d5 ("r8169: add basic phylib support") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bernd Eckstein authored
The bug is not easily reproducable, as it may occur very infrequently (we had machines with 20minutes heavy downloading before it occurred) However, on a virual machine (VMWare on Windows 10 host) it occurred pretty frequently (1-2 seconds after a speedtest was started) dev->tx_skb mab be freed via dev_kfree_skb_irq on a callback before it is set. This causes the following problems: - double free of the skb or potential memory leak - in dmesg: 'recvmsg bug' and 'recvmsg bug 2' and eventually general protection fault Example dmesg output: [ 134.841986] ------------[ cut here ]------------ [ 134.841987] recvmsg bug: copied 9C24A555 seq 9C24B557 rcvnxt 9C25A6B3 fl 0 [ 134.841993] WARNING: CPU: 7 PID: 2629 at /build/linux-hwe-On9fm7/linux-hwe-4.15.0/net/ipv4/tcp.c:1865 tcp_recvmsg+0x44d/0xab0 [ 134.841994] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi [ 134.842046] CPU: 7 PID: 2629 Comm: python Tainted: G W OE 4.15.0-34-generic #37~16.04.1-Ubuntu [ 134.842046] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [ 134.842048] RIP: 0010:tcp_recvmsg+0x44d/0xab0 [ 134.842048] RSP: 0018:ffffa6630422bcc8 EFLAGS: 00010286 [ 134.842049] RAX: 0000000000000000 RBX: ffff997616f4f200 RCX: 0000000000000006 [ 134.842049] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff9976257d6490 [ 134.842050] RBP: ffffa6630422bd98 R08: 0000000000000001 R09: 000000000004bba4 [ 134.842050] R10: 0000000001e00c6f R11: 000000000004bba4 R12: ffff99760dee3000 [ 134.842051] R13: 0000000000000000 R14: ffff99760dee3514 R15: 0000000000000000 [ 134.842051] FS: 00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000 [ 134.842052] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 134.842053] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0 [ 134.842055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 134.842055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 134.842057] Call Trace: [ 134.842060] ? aa_sk_perm+0x53/0x1a0 [ 134.842064] inet_recvmsg+0x51/0xc0 [ 134.842066] sock_recvmsg+0x43/0x50 [ 134.842070] SYSC_recvfrom+0xe4/0x160 [ 134.842072] ? __schedule+0x3de/0x8b0 [ 134.842075] ? ktime_get_ts64+0x4c/0xf0 [ 134.842079] SyS_recvfrom+0xe/0x10 [ 134.842082] do_syscall_64+0x73/0x130 [ 134.842086] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 134.842086] RIP: 0033:0x7fe331f5a81d [ 134.842088] RSP: 002b:00007ffe8da98398 EFLAGS: 00000246 ORIG_RAX: 000000000000002d [ 134.842090] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007fe331f5a81d [ 134.842094] RDX: 00000000000003fb RSI: 0000000001e00874 RDI: 0000000000000003 [ 134.842095] RBP: 00007fe32f642c70 R08: 0000000000000000 R09: 0000000000000000 [ 134.842097] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe332347698 [ 134.842099] R13: 0000000001b7e0a0 R14: 0000000001e00874 R15: 0000000000000000 [ 134.842103] Code: 24 fd ff ff e9 cc fe ff ff 48 89 d8 41 8b 8c 24 10 05 00 00 44 8b 45 80 48 c7 c7 08 bd 59 8b 48 89 85 68 ff ff ff e8 b3 c4 7d ff <0f> 0b 48 8b 85 68 ff ff ff e9 e9 fe ff ff 41 8b 8c 24 10 05 00 [ 134.842126] ---[ end trace b7138fc08c83147f ]--- [ 134.842144] general protection fault: 0000 [#1] SMP PTI [ 134.842145] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi [ 134.842161] CPU: 7 PID: 2629 Comm: python Tainted: G W OE 4.15.0-34-generic #37~16.04.1-Ubuntu [ 134.842162] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [ 134.842164] RIP: 0010:tcp_close+0x2c6/0x440 [ 134.842165] RSP: 0018:ffffa6630422bde8 EFLAGS: 00010202 [ 134.842167] RAX: 0000000000000000 RBX: ffff99760dee3000 RCX: 0000000180400034 [ 134.842168] RDX: 5c4afd407207a6c4 RSI: ffffe868495bd300 RDI: ffff997616f4f200 [ 134.842169] RBP: ffffa6630422be08 R08: 0000000016f4d401 R09: 0000000180400034 [ 134.842169] R10: ffffa6630422bd98 R11: 0000000000000000 R12: 000000000000600c [ 134.842170] R13: 0000000000000000 R14: ffff99760dee30c8 R15: ffff9975bd44fe00 [ 134.842171] FS: 00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000 [ 134.842173] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 134.842174] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0 [ 134.842177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 134.842178] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 134.842179] Call Trace: [ 134.842181] inet_release+0x42/0x70 [ 134.842183] __sock_release+0x42/0xb0 [ 134.842184] sock_close+0x15/0x20 [ 134.842187] __fput+0xea/0x220 [ 134.842189] ____fput+0xe/0x10 [ 134.842191] task_work_run+0x8a/0xb0 [ 134.842193] exit_to_usermode_loop+0xc4/0xd0 [ 134.842195] do_syscall_64+0xf4/0x130 [ 134.842197] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 134.842197] RIP: 0033:0x7fe331f5a560 [ 134.842198] RSP: 002b:00007ffe8da982e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 134.842200] RAX: 0000000000000000 RBX: 00007fe32f642c70 RCX: 00007fe331f5a560 [ 134.842201] RDX: 00000000008f5320 RSI: 0000000001cd4b50 RDI: 0000000000000003 [ 134.842202] RBP: 00007fe32f6500f8 R08: 000000000000003c R09: 00000000009343c0 [ 134.842203] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe32f6500d0 [ 134.842204] R13: 00000000008f5320 R14: 00000000008f5320 R15: 0000000001cd4770 [ 134.842205] Code: c8 00 00 00 45 31 e4 49 39 fe 75 4d eb 50 83 ab d8 00 00 00 01 48 8b 17 48 8b 47 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 42 08 48 89 10 0f b6 57 34 8b 47 2c 2b 47 28 83 e2 01 80 [ 134.842226] RIP: tcp_close+0x2c6/0x440 RSP: ffffa6630422bde8 [ 134.842227] ---[ end trace b7138fc08c831480 ]--- The proposed patch eliminates a potential racing condition. Before, usb_submit_urb was called and _after_ that, the skb was attached (dev->tx_skb). So, on a callback it was possible, however unlikely that the skb was freed before it was set. That way (because dev->tx_skb was not set to NULL after it was freed), it could happen that a skb from a earlier transmission was freed a second time (and the skb we should have freed did not get freed at all) Now we free the skb directly in ipheth_tx(). It is not passed to the callback anymore, eliminating the posibility of a double free of the same skb. Depending on the retval of usb_submit_urb() we use dev_kfree_skb_any() respectively dev_consume_skb_any() to free the skb. Signed-off-by: Oliver Zweigle <Oliver.Zweigle@faro.com> Signed-off-by: Bernd Eckstein <3ernd.Eckstein@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Daniel Borkmann says: ==================== pull-request: bpf 2018-11-27 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix several bugs in BPF sparc JIT, that is, convergence for fused branches, initialization of frame pointer register, and moving all arguments into output registers from input registers in prologue to fix BPF to BPF calls, from David. 2) Fix a bug in arm64 JIT for fetching BPF to BPF call addresses where they are not guaranteed to fit into imm field and therefore must be retrieved through prog aux data, from Daniel. 3) Explicitly add all JITs to MAINTAINERS file with developers able to help out in feature development, fixes, review, etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Miller authored
Move all arguments into output registers from input registers. This path is exercised by test_verifier.c's "calls: two calls with args" test. Adjust BPF_TAILCALL_PROLOGUE_SKIP as needed. Let's also make the prologue length a constant size regardless of the combination of ->saw_frame_pointer and ->saw_tail_call settings. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Daniel Borkmann authored
Make the high-level BPF JIT entry a general 'catch-all' and add architecture specific entries to make it more clear who actively maintains which BPF JIT compiler. The list (L) address implies that this eventually lands in the bpf patchwork bucket. Goal is that this set of responsible developers listed here is always up to date and a point of contact for helping out in e.g. feature development, fixes, review or testing patches in order to help long-term in ensuring quality of the BPF JITs and therefore BPF core under a given architecture. Every new JIT in future /must/ have an entry here as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Zi Shen Lim <zlim.lnx@gmail.com> Acked-by: Paul Burton <paul.burton@mips.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
David Miller authored
We need to initialize the frame pointer register not just if it is seen as a source operand, but also if it is seen as the destination operand of a store or an atomic instruction (which effectively is a source operand). This is exercised by test_verifier's "non-invalid fp arithmetic" Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
David Miller authored
On T4 and later sparc64 cpus we can use the fused compare and branch instruction. However, it can only be used if the branch destination is in the range of a signed 10-bit immediate offset. This amounts to 1024 instructions forwards or backwards. After the commit referenced in the Fixes: tag, the largest possible size program seen by the JIT explodes by a significant factor. As a result of this convergance takes many more passes since the expanded "BPF_LDX | BPF_MSH | BPF_B" code sequence, for example, contains several embedded branch on condition instructions. On each pass, as suddenly new fused compare and branch instances become valid, this makes thousands more in range for the next pass. And so on and so forth. This is most greatly exemplified by "BPF_MAXINSNS: exec all MSH" which takes 35 passes to converge, and shrinks the image by about 64K. To decrease the cost of this number of convergance passes, do the convergance pass before we have the program image allocated, just like other JITs (such as x86) do. Fixes: e0cea7ce ("bpf: implement ld_abs/ld_ind in native bpf") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Daniel Borkmann says: ==================== This set contains a fix for arm64 BPF JIT. First patch generalizes ppc64 way of retrieving subprog into bpf_jit_get_func_addr() as core code and uses the same on arm64 in second patch. Tested on both arm64 and ppc64. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Borkmann authored
The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF to BPF call offset can be too far away from core kernel in that relative encoding into imm is not sufficient and could potentially be truncated, see also fd045f6c ("arm64: add support for module PLTs") which adds spill-over space for module_alloc() and therefore bpf_jit_binary_alloc(). Therefore, use the recently added bpf_jit_get_func_addr() helper for properly fetching the address through prog->aux->func[off]->bpf_func instead. This also has the benefit to optimize normal helper calls since their address can use the optimized emission. Tested on Cavium ThunderX CN8890. Fixes: db496944 ("bpf: arm64: add JIT support for multi-function programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Borkmann authored
Make fetching of the BPF call address from ppc64 JIT generic. ppc64 was using a slightly different variant rather than through the insns' imm field encoding as the target address would not fit into that space. Therefore, the target subprog number was encoded into the insns' offset and fetched through fp->aux->func[off]->bpf_func instead. Given there are other JITs with this issue and the mechanism of fetching the address is JIT-generic, move it into the core as a helper instead. On the JIT side, we get information on whether the retrieved address is a fixed one, that is, not changing through JIT passes, or a dynamic one. For the former, JITs can optimize their imm emission because this doesn't change jump offsets throughout JIT process. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Tested-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 26 Nov, 2018 1 commit
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Daniel Borkmann says: ==================== pull-request: bpf 2018-11-25 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix an off-by-one bug when adjusting subprog start offsets after patching, from Edward. 2) Fix several bugs such as overflow in size allocation in queue / stack map creation, from Alexei. 3) Fix wrong IPv6 destination port byte order in bpf_sk_lookup_udp helper, from Andrey. 4) Fix several bugs in bpftool such as preventing an infinite loop in get_fdinfo, error handling and man page references, from Quentin. 5) Fix a warning in bpf_trace_printk() that wasn't catching an invalid format string, from Martynas. 6) Fix a bug in BPF cgroup local storage where non-atomic allocation was used in atomic context, from Roman. 7) Fix a NULL pointer dereference bug in bpftool from reallocarray() error handling, from Jakub and Wen. 8) Add a copy of pkt_cls.h and tc_bpf.h uapi headers to the tools include infrastructure so that bpftool compiles on older RHEL7-like user space which does not ship these headers, from Yonghong. 9) Fix BPF kselftests for user space where to get ping test working with ping6 and ping -6, from Li. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 25 Nov, 2018 2 commits
-
-
Willem de Bruijn authored
In ip packet generation, pagedlen is initialized for each skb at the start of the loop in __ip(6)_append_data, before label alloc_new_skb. Depending on compiler options, code can be generated that jumps to this label, triggering use of an an uninitialized variable. In practice, at -O2, the generated code moves the initialization below the label. But the code should not rely on that for correctness. Fixes: 15e36f5b ("udp: paged allocation with gso") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
When a qdisc setup including pacing FQ is dismantled and recreated, some TCP packets are sent earlier than instructed by TCP stack. TCP can be fooled when ACK comes back, because the following operation can return a negative value. tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr; Some paths in TCP stack were not dealing properly with this, this patch addresses four of them. Fixes: ab408b6d ("tcp: switch tcp and sch_fq to new earliest departure time model") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 Nov, 2018 9 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Catalin Marinas:: - Fix wrong conflict resolution around CONFIG_ARM64_SSBD - Fix sparse warning on unsigned long constant * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: cpufeature: Fix mismerge of CONFIG_ARM64_SSBD block arm64: sysreg: fix sparse warnings
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Need to take mutex in ath9k_add_interface(), from Dan Carpenter. 2) Fix mt76 build without CONFIG_LEDS_CLASS, from Arnd Bergmann. 3) Fix socket wmem accounting in SCTP, from Xin Long. 4) Fix failed resume crash in ena driver, from Arthur Kiyanovski. 5) qed driver passes bytes instead of bits into second arg of bitmap_weight(). From Denis Bolotin. 6) Fix reset deadlock in ibmvnic, from Juliet Kim. 7) skb_scrube_packet() needs to scrub the fwd marks too, from Petr Machata. 8) Make sure older TCP stacks see enough dup ACKs, and avoid doing SACK compression during this period, from Eric Dumazet. 9) Add atomicity to SMC protocol cursor handling, from Ursula Braun. 10) Don't leave dangling error pointer if bpf_prog_add() fails in thunderx driver, from Lorenzo Bianconi. Also, when we unmap TSO headers, set sq->tso_hdrs to NULL. 11) Fix race condition over state variables in act_police, from Davide Caratti. 12) Disable guest csum in the presence of XDP in virtio_net, from Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (64 commits) net: gemini: Fix copy/paste error net: phy: mscc: fix deadlock in vsc85xx_default_config dt-bindings: dsa: Fix typo in "probed" net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue net: amd: add missing of_node_put() team: no need to do team_notify_peers or team_mcast_rejoin when disabling port virtio-net: fail XDP set if guest csum is negotiated virtio-net: disable guest csum during XDP set net/sched: act_police: add missing spinlock initialization net: don't keep lonely packets forever in the gro hash net/ipv6: re-do dad when interface has IFF_NOARP flag change packet: copy user buffers before orphan or clone ibmvnic: Update driver queues after change in ring size support ibmvnic: Fix RX queue buffer cleanup net: thunderx: set xdp_prog to NULL if bpf_prog_add fails net/dim: Update DIM start sample after each DIM iteration net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts net/smc: use after free fix in smc_wr_tx_put_slot() net/smc: atomic SMCD cursor handling net/smc: add SMC-D shutdown signal ...
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds authored
Pull xfs fixes from Darrick Wong: "Dave and I have continued our work fixing corruption problems that can be found when running long-term burn-in exercisers on xfs. Here are some patches fixing most of the problems, but there will likely be more. :/ - Numerous corruption fixes for copy on write - Numerous corruption fixes for blocksize < pagesize writes - Don't miscalculate AG reservations for small final AGs - Fix page cache truncation to work properly for reflink and extent shifting - Fix use-after-free when retrying failed inode/dquot buffer logging - Fix corruptions seen when using copy_file_range in directio mode" * tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: readpages doesn't zero page tail beyond EOF vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP iomap: dio data corruption and spurious errors when pipes fill iomap: sub-block dio needs to zeroout beyond EOF iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents xfs: delalloc -> unwritten COW fork allocation can go wrong xfs: flush removing page cache in xfs_reflink_remap_prep xfs: extent shifting doesn't fully invalidate page cache xfs: finobt AG reserves don't consider last AG can be a runt xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers xfs: uncached buffer tracing needs to print bno xfs: make xfs_file_remap_range() static xfs: fix shared extent data corruption due to missing cow reservation
-
Andreas Fiedler authored
The TX stats should be started with the tx_stats_syncp, there seems to be a copy/paste error in the driver. Signed-off-by: Andreas Fiedler <andreas.fiedler@gmx.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Schulz authored
The vsc85xx_default_config function called in the vsc85xx_config_init function which is used by VSC8530, VSC8531, VSC8540 and VSC8541 PHYs mistakenly calls phy_read and phy_write in-between phy_select_page and phy_restore_page. phy_select_page and phy_restore_page actually take and release the MDIO bus lock and phy_write and phy_read take and release the lock to write or read to a PHY register. Let's fix this deadlock by using phy_modify_paged which handles correctly a read followed by a write in a non-standard page. Fixes: 6a0bfbbe ("net: phy: mscc: migrate to phy_select/restore_page functions") Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fabio Estevam authored
The correct form is "can be probed", so fix the typo. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine since it is used to check if tso dma descriptor queue has been previously allocated. The issue can be triggered with the following reproducer: $ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o $ip link set dev enP2p1s0v0 xdpdrv off [ 341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0 [ 341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 341.521874] pstate: 60400005 (nZCv daif +PAN -UAO) [ 341.526654] pc : __vunmap+0x98/0xe0 [ 341.530132] lr : __vunmap+0x98/0xe0 [ 341.533609] sp : ffff00001c5db860 [ 341.536913] x29: ffff00001c5db860 x28: 0000000000020000 [ 341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000 [ 341.547515] x25: 0000000000000000 x24: 00000000fbd00000 [ 341.552816] x23: 0000000000000000 x22: ffff810feb5090b0 [ 341.558117] x21: 0000000000000000 x20: 0000000000000000 [ 341.563418] x19: ffff000017e57000 x18: 0000000000000000 [ 341.568719] x17: 0000000000000000 x16: 0000000000000000 [ 341.574020] x15: 0000000000000010 x14: ffffffffffffffff [ 341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f [ 341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510 [ 341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8 [ 341.595224] x7 : 3430303030303030 x6 : 00000000000006ef [ 341.600525] x5 : 00000000003fffff x4 : 0000000000000000 [ 341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff [ 341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038 [ 341.616428] Call trace: [ 341.618866] __vunmap+0x98/0xe0 [ 341.621997] vunmap+0x3c/0x50 [ 341.624961] arch_dma_free+0x68/0xa0 [ 341.628534] dma_direct_free+0x50/0x80 [ 341.632285] nicvf_free_resources+0x160/0x2d8 [nicvf] [ 341.637327] nicvf_config_data_transfer+0x174/0x5e8 [nicvf] [ 341.642890] nicvf_stop+0x298/0x340 [nicvf] [ 341.647066] __dev_close_many+0x9c/0x108 [ 341.650977] dev_close_many+0xa4/0x158 [ 341.654720] rollback_registered_many+0x140/0x530 [ 341.659414] rollback_registered+0x54/0x80 [ 341.663499] unregister_netdevice_queue+0x9c/0xe8 [ 341.668192] unregister_netdev+0x28/0x38 [ 341.672106] nicvf_remove+0xa4/0xa8 [nicvf] [ 341.676280] nicvf_shutdown+0x20/0x30 [nicvf] [ 341.680630] pci_device_shutdown+0x44/0x88 [ 341.684720] device_shutdown+0x144/0x250 [ 341.688640] kernel_restart_prepare+0x44/0x50 [ 341.692986] kernel_restart+0x20/0x68 [ 341.696638] __se_sys_reboot+0x210/0x238 [ 341.700550] __arm64_sys_reboot+0x24/0x30 [ 341.704555] el0_svc_handler+0x94/0x110 [ 341.708382] el0_svc+0x8/0xc [ 341.711252] ---[ end trace 3f4019c8439959c9 ]--- [ 341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4 [ 341.723872] flags: 0x1fffe000000000() [ 341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000 [ 341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000 [ 341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) where xdp_dummy.c is a simple bpf program that forwards the incoming frames to the network stack (available here: https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c) Fixes: 05c773f5 ("net: thunderx: Add basic XDP support") Fixes: 4863dea3 ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yangtao Li authored
of_find_node_by_path() acquires a reference to the node returned by it and that reference needs to be dropped by its caller. This place doesn't do that, so fix it. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hangbin Liu authored
team_notify_peers() will send ARP and NA to notify peers. team_mcast_rejoin() will send multicast join group message to notify peers. We should do this when enabling/changed to a new port. But it doesn't make sense to do it when a port is disabled. On the other hand, when we set mcast_rejoin_count to 2, and do a failover, team_port_disable() will increase mcast_rejoin.count_pending to 2 and then team_port_enable() will increase mcast_rejoin.count_pending to 4. We will send 4 mcast rejoin messages at latest, which will make user confused. The same with notify_peers.count. Fix it by deleting team_notify_peers() and team_mcast_rejoin() in team_port_disable(). Reported-by: Liang Li <liali@redhat.com> Fixes: fc423ff0 ("team: add peer notification") Fixes: 492b200e ("team: add support for sending multicast rejoins") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Nov, 2018 8 commits
-
-
Martynas Pumputis authored
A format string consisting of "%p" or "%s" followed by an invalid specifier (e.g. "%p%\n" or "%s%") could pass the check which would make format_decode (lib/vsprintf.c) to warn. Fixes: 9c959c86 ("tracing: Allow BPF programs to call bpf_trace_printk()") Reported-by: syzbot+1ec5c5ec949c4adaa0c4@syzkaller.appspotmail.com Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Jason Wang authored
We don't support partial csumed packet since its metadata will be lost or incorrect during XDP processing. So fail the XDP set if guest_csum feature is negotiated. Fixes: f600b690 ("virtio_net: Add XDP support") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Pavel Popa <pashinho1990@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Wang authored
We don't disable VIRTIO_NET_F_GUEST_CSUM if XDP was set. This means we can receive partial csumed packets with metadata kept in the vnet_hdr. This may have several side effects: - It could be overridden by header adjustment, thus is might be not correct after XDP processing. - There's no way to pass such metadata information through XDP_REDIRECT to another driver. - XDP does not support checksum offload right now. So simply disable guest csum if possible in this the case of XDP. Fixes: 3f93522f ("virtio-net: switch off offloads on demand if possible on XDP set") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Pavel Popa <pashinho1990@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
https://github.com/ceph/ceph-clientLinus Torvalds authored
Pullk ceph fix from Ilya Dryomov: "A messenger fix, marked for stable" * tag 'ceph-for-4.20-rc4' of https://github.com/ceph/ceph-client: libceph: fall back to sendmsg for slab pages
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fix from Jens Axboe: "Just a single fix for this week, fixing an issue with nvme-fc" * tag 'for-linus-20181123' of git://git.kernel.dk/linux-block: nvme-fc: resolve io failures during connect
-
Davide Caratti authored
commit f2cbd485 ("net/sched: act_police: fix race condition on state variables") introduces a new spinlock, but forgets its initialization. Ensure that tcf_police_init() initializes 'tcfp_lock' every time a 'police' action is newly created, to avoid the following lockdep splat: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. <...> Call Trace: dump_stack+0x85/0xcb register_lock_class+0x581/0x590 __lock_acquire+0xd4/0x1330 ? tcf_police_init+0x2fa/0x650 [act_police] ? lock_acquire+0x9e/0x1a0 lock_acquire+0x9e/0x1a0 ? tcf_police_init+0x2fa/0x650 [act_police] ? tcf_police_init+0x55a/0x650 [act_police] _raw_spin_lock_bh+0x34/0x40 ? tcf_police_init+0x2fa/0x650 [act_police] tcf_police_init+0x2fa/0x650 [act_police] tcf_action_init_1+0x384/0x4c0 tcf_action_init+0xf6/0x160 tcf_action_add+0x73/0x170 tc_ctl_action+0x122/0x160 rtnetlink_rcv_msg+0x2a4/0x490 ? netlink_deliver_tap+0x99/0x400 ? validate_linkmsg+0x370/0x370 netlink_rcv_skb+0x4d/0x130 netlink_unicast+0x196/0x230 netlink_sendmsg+0x2e5/0x3e0 sock_sendmsg+0x36/0x40 ___sys_sendmsg+0x280/0x2f0 ? _raw_spin_unlock+0x24/0x30 ? handle_pte_fault+0xafe/0xf30 ? find_held_lock+0x2d/0x90 ? syscall_trace_enter+0x1df/0x360 ? __sys_sendmsg+0x5e/0xa0 __sys_sendmsg+0x5e/0xa0 do_syscall_64+0x60/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f1841c7cf10 Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24 RSP: 002b:00007ffcf9df4d68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f1841c7cf10 RDX: 0000000000000000 RSI: 00007ffcf9df4dc0 RDI: 0000000000000003 RBP: 000000005bf56105 R08: 0000000000000002 R09: 00007ffcf9df8edc R10: 00007ffcf9df47e0 R11: 0000000000000246 R12: 0000000000671be0 R13: 00007ffcf9df4e84 R14: 0000000000000008 R15: 0000000000000000 Fixes: f2cbd485 ("net/sched: act_police: fix race condition on state variables") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
Eric noted that with UDP GRO and NAPI timeout, we could keep a single UDP packet inside the GRO hash forever, if the related NAPI instance calls napi_gro_complete() at an higher frequency than the NAPI timeout. Willem noted that even TCP packets could be trapped there, till the next retransmission. This patch tries to address the issue, flushing the old packets - those with a NAPI_GRO_CB age before the current jiffy - before scheduling the NAPI timeout. The rationale is that such a timeout should be well below a jiffy and we are not flushing packets eligible for sane GRO. v1 -> v2: - clarified the commit message and comment RFC -> v1: - added 'Fixes tags', cleaned-up the wording. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 3b47d303 ("net: gro: add a per device gro flush timer") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hangbin Liu authored
When we add a new IPv6 address, we should also join corresponding solicited-node multicast address, unless the interface has IFF_NOARP flag, as function addrconf_join_solict() did. But if we remove IFF_NOARP flag later, we do not do dad and add the mcast address. So we will drop corresponding neighbour discovery message that came from other nodes. A typical example is after creating a ipvlan with mode l3, setting up an ipv6 address and changing the mode to l2. Then we will not be able to ping this address as the interface doesn't join related solicited-node mcast address. Fix it by re-doing dad when interface changed IFF_NOARP flag. Then we will add corresponding mcast group and check if there is a duplicate address on the network. Reported-by: Jianlin Shi <jishi@redhat.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-