- 17 Feb, 2022 40 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski authored
Fast path bpf marge for some -next work. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski authored
Alexei Starovoitov says: ==================== pull-request: bpf 2022-02-17 We've added 8 non-merge commits during the last 7 day(s) which contain a total of 8 files changed, 119 insertions(+), 15 deletions(-). The main changes are: 1) Add schedule points in map batch ops, from Eric. 2) Fix bpf_msg_push_data with len 0, from Felix. 3) Fix crash due to incorrect copy_map_value, from Kumar. 4) Fix crash due to out of bounds access into reg2btf_ids, from Kumar. 5) Fix a bpf_timer initialization issue with clang, from Yonghong. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add schedule points in batch ops bpf: Fix crash due to out of bounds access into reg2btf_ids. selftests: bpf: Check bpf_msg_push_data return value bpf: Fix a bpf_timer initialization issue bpf: Emit bpf_timer in vmlinux BTF selftests/bpf: Add test for bpf_timer overwriting crash bpf: Fix crash due to incorrect copy_map_value bpf: Do not try bpf_msg_push_data with len 0 ==================== Link: https://lore.kernel.org/r/20220217190000.37925-1-alexei.starovoitov@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski authored
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Including fixes from wireless and netfilter. Current release - regressions: - dsa: lantiq_gswip: fix use after free in gswip_remove() - smc: avoid overwriting the copies of clcsock callback functions Current release - new code bugs: - iwlwifi: - fix use-after-free when no FW is present - mei: fix the pskb_may_pull check in ipv4 - mei: retry mapping the shared area - mvm: don't feed the hardware RFKILL into iwlmei Previous releases - regressions: - ipv6: mcast: use rcu-safe version of ipv6_get_lladdr() - tipc: fix wrong publisher node address in link publications - iwlwifi: mvm: don't send SAR GEO command for 3160 devices, avoid FW assertion - bgmac: make idm and nicpm resource optional again - atl1c: fix tx timeout after link flap Previous releases - always broken: - vsock: remove vsock from connected table when connect is interrupted by a signal - ping: change destination interface checks to match raw sockets - crypto: af_alg - get rid of alg_memory_allocated to avoid confusing semantics (and null-deref) after SO_RESERVE_MEM was added - ipv6: make exclusive flowlabel checks per-netns - bonding: force carrier update when releasing slave - sched: limit TC_ACT_REPEAT loops - bridge: multicast: notify switchdev driver whenever MC processing gets disabled because of max entries reached - wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found - iwlwifi: fix locking when "HW not ready" - phy: mediatek: remove PHY mode check on MT7531 - dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN - dsa: lan9303: - fix polarity of reset during probe - fix accelerated VLAN handling" * tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) bonding: force carrier update when releasing slave nfp: flower: netdev offload check for ip6gretap ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt ipv4: fix data races in fib_alias_hw_flags_set net: dsa: lan9303: add VLAN IDs to master device net: dsa: lan9303: handle hwaccel VLAN tags vsock: remove vsock from connected table when connect is interrupted by a signal Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname" ping: fix the dif and sdif check in ping_lookup net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990 net: sched: limit TC_ACT_REPEAT loops tipc: fix wrong notification node addresses net: dsa: lantiq_gswip: fix use after free in gswip_remove() ipv6: per-netns exclusive flowlabel checks net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled CDC-NCM: avoid overflow in sanity checking mctp: fix use after free net: mscc: ocelot: fix use-after-free in ocelot_vlan_del() bonding: fix data-races around agg_select_timer dpaa2-eth: Initialize mutex used in one step timestamping path ...
-
Zhang Changzhong authored
In __bond_release_one(), bond_set_carrier() is only called when bond device has no slave. Therefore, if we remove the up slave from a master with two slaves and keep the down slave, the master will remain up. Fix this by moving bond_set_carrier() out of if (!bond_has_slaves(bond)) statement. Reproducer: $ insmod bonding.ko mode=0 miimon=100 max_bonds=2 $ ifconfig bond0 up $ ifenslave bond0 eth0 eth1 $ ifconfig eth0 down $ ifenslave -d bond0 eth1 $ cat /proc/net/bonding/bond0 Fixes: ff59c456 ("[PATCH] bonding: support carrier state for master") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/1645021088-38370-1-git-send-email-zhangchangzhong@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
syzbot reported various soft lockups caused by bpf batch operations. INFO: task kworker/1:1:27 blocked for more than 140 seconds. INFO: task hung in rcu_barrier Nothing prevents batch ops to process huge amount of data, we need to add schedule points in them. Note that maybe_wait_bpf_programs(map) calls from generic_map_delete_batch() can be factorized by moving the call after the loop. This will be done later in -next tree once we get this fix merged, unless there is strong opinion doing this optimization sooner. Fixes: aa2e93b8 ("bpf: Add generic support for update and delete batch ops") Fixes: cb4d03ab ("bpf: Add generic support for lookup batch op") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: Brian Vazquez <brianvv@google.com> Link: https://lore.kernel.org/bpf/20220217181902.808742-1-eric.dumazet@gmail.com
-
Luis Chamberlain authored
Commit b42bc9a3 ("Fix regression due to "fs: move binfmt_misc sysctl to its own file") fixed a regression, however it failed to add a kmemleak_not_leak(). Fixes: b42bc9a3 ("Fix regression due to "fs: move binfmt_misc sysctl to its own file") Reported-by: Tong Zhang <ztong0001@gmail.com> Cc: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Merge tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix corrupt inject files when only last branch option is enabled with ARM CoreSight ETM - Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12 - Defer freeing string after possible strlen() on it in the BPF loader, found by gcc 12 - Avoid early exit in 'perf trace' due SIGCHLD from non-workload processes - Fix arm64 perf_event_attr 'perf test's wrt --call-graph initialization - Fix libperf 32-bit build for 'perf test' wrt uint64_t printf - Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to the CPU iterator - Sync linux/perf_event.h UAPI with the kernel sources - Update Jiri Olsa's email address in MAINTAINERS * tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf: Defer freeing string after possible strlen() on it perf test: Fix arm64 perf_event_attr tests wrt --call-graph initialization libsubcmd: Fix use-after-free for realloc(..., 0) libperf: Fix perf_cpu_map__for_each_cpu macro perf cs-etm: Fix corrupt inject files when only last branch option is enabled perf cs-etm: No-op refactor of synth opt usage libperf: Fix 32-bit build for tests uint64_t printf tools headers UAPI: Sync linux/perf_event.h with the kernel sources perf trace: Avoid early exit due SIGCHLD from non-workload processes MAINTAINERS: Update Jiri's email address
-
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linuxLinus Torvalds authored
Pull module fix from Luis Chamberlain: "Fixes module decompression when CONFIG_SYSFS=n The only fix trickled down for v5.17-rc cycle so far is the fix for module decompression when CONFIG_SYSFS=n. This was reported through 0-day" * tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: module: fix building with sysfs disabled
-
Danie du Toit authored
IPv6 GRE tunnels are not being offloaded, this is caused by a missing netdev offload check. The functionality of IPv6 GRE tunnel offloading was previously added but this check was not included. Adding the ip6gretap check allows IPv6 GRE tunnels to be offloaded correctly. Fixes: f7536ffb ("nfp: flower: Allow ipv6gretap interface for offloading") Signed-off-by: Danie du Toit <danie.dutoit@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220217124820.40436-1-louis.peens@corigine.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
Because fib6_info_hw_flags_set() is called without any synchronization, all accesses to gi6->offload, fi->trap and fi->offload_failed need some basic protection like READ_ONCE()/WRITE_ONCE(). BUG: KCSAN: data-race in fib6_info_hw_flags_set / fib6_purge_rt read to 0xffff8881087d5886 of 1 bytes by task 13953 on cpu 0: fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1007 [inline] fib6_purge_rt+0x4f/0x580 net/ipv6/ip6_fib.c:1033 fib6_del_route net/ipv6/ip6_fib.c:1983 [inline] fib6_del+0x696/0x890 net/ipv6/ip6_fib.c:2028 __ip6_del_rt net/ipv6/route.c:3876 [inline] ip6_del_rt+0x83/0x140 net/ipv6/route.c:3891 __ipv6_dev_ac_dec+0x2b5/0x370 net/ipv6/anycast.c:374 ipv6_dev_ac_dec net/ipv6/anycast.c:387 [inline] __ipv6_sock_ac_close+0x141/0x200 net/ipv6/anycast.c:207 ipv6_sock_ac_close+0x79/0x90 net/ipv6/anycast.c:220 inet6_release+0x32/0x50 net/ipv6/af_inet6.c:476 __sock_release net/socket.c:650 [inline] sock_close+0x6c/0x150 net/socket.c:1318 __fput+0x295/0x520 fs/file_table.c:280 ____fput+0x11/0x20 fs/file_table.c:313 task_work_run+0x8e/0x110 kernel/task_work.c:164 tracehook_notify_resume include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop kernel/entry/common.c:175 [inline] exit_to_user_mode_prepare+0x160/0x190 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300 do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae write to 0xffff8881087d5886 of 1 bytes by task 1912 on cpu 1: fib6_info_hw_flags_set+0x155/0x3b0 net/ipv6/route.c:6230 nsim_fib6_rt_hw_flags_set drivers/net/netdevsim/fib.c:668 [inline] nsim_fib6_rt_add drivers/net/netdevsim/fib.c:691 [inline] nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:756 [inline] nsim_fib6_event drivers/net/netdevsim/fib.c:853 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:886 [inline] nsim_fib_event_work+0x284f/0x2cf0 drivers/net/netdevsim/fib.c:1477 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x2c7/0x2e0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 value changed: 0x22 -> 0x2a Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 1912 Comm: kworker/1:3 Not tainted 5.16.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events nsim_fib_event_work Fixes: 0c5fcf9e ("IPv6: Add "offload failed" indication to routes") Fixes: bb3c4ab9 ("ipv6: Add "offload" and "trap" indications to routes") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amit Cohen <amcohen@nvidia.com> Cc: Ido Schimmel <idosch@nvidia.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20220216173217.3792411-2-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
fib_alias_hw_flags_set() can be used by concurrent threads, and is only RCU protected. We need to annotate accesses to following fields of struct fib_alias: offload, trap, offload_failed Because of READ_ONCE()WRITE_ONCE() limitations, make these field u8. BUG: KCSAN: data-race in fib_alias_hw_flags_set / fib_alias_hw_flags_set read to 0xffff888134224a6a of 1 bytes by task 2013 on cpu 1: fib_alias_hw_flags_set+0x28a/0x470 net/ipv4/fib_trie.c:1050 nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline] nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline] nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline] nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline] nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 process_scheduled_works kernel/workqueue.c:2370 [inline] worker_thread+0x7df/0xa70 kernel/workqueue.c:2456 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 write to 0xffff888134224a6a of 1 bytes by task 4872 on cpu 0: fib_alias_hw_flags_set+0x2d5/0x470 net/ipv4/fib_trie.c:1054 nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline] nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline] nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline] nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline] nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 process_scheduled_works kernel/workqueue.c:2370 [inline] worker_thread+0x7df/0xa70 kernel/workqueue.c:2456 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 value changed: 0x00 -> 0x02 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 4872 Comm: kworker/0:0 Not tainted 5.17.0-rc3-syzkaller-00188-g1d41d2e8-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events nsim_fib_event_work Fixes: 90b93f1b ("ipv4: Add "offload" and "trap" indications to routes") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220216173217.3792411-1-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Mans Rullgard authored
If the master device does VLAN filtering, the IDs used by the switch must be added for any frames to be received. Do this in the port_enable() function, and remove them in port_disable(). Fixes: a1292595 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220216204818.28746-1-mans@mansr.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Mans Rullgard authored
Check for a hwaccel VLAN tag on rx and use it if present. Otherwise, use __skb_vlan_pop() like the other tag parsers do. This fixes the case where the VLAN tag has already been consumed by the master. Fixes: a1292595 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220216124634.23123-1-mans@mansr.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Linus Torvalds authored
Oded Gabbay reports that enabling NUMA balancing causes corruption with his Gaudi accelerator test load: "All the details are in the bug, but the bottom line is that somehow, this patch causes corruption when the numa balancing feature is enabled AND we don't use process affinity AND we use GUP to pin pages so our accelerator can DMA to/from system memory. Either disabling numa balancing, using process affinity to bind to specific numa-node or reverting this patch causes the bug to disappear" and Oded bisected the issue to commit 09854ba9 ("mm: do_wp_page() simplification"). Now, the NUMA balancing shouldn't actually be changing the writability of a page, and as such shouldn't matter for COW. But it appears it does. Suspicious. However, regardless of that, the condition for enabling NUMA faults in change_pte_range() is nonsensical. It uses "page_mapcount(page)" to decide if a COW page should be NUMA-protected or not, and that makes absolutely no sense. The number of mappings a page has is irrelevant: not only does GUP get a reference to a page as in Oded's case, but the other mappings migth be paged out and the only reference to them would be in the page count. Since we should never try to NUMA-balance a page that we can't move anyway due to other references, just fix the code to use 'page_count()'. Oded confirms that that fixes his issue. Now, this does imply that something in NUMA balancing ends up changing page protections (other than the obvious one of making the page inaccessible to get the NUMA faulting information). Otherwise the COW simplification wouldn't matter - since doing the GUP on the page would make sure it's writable. The cause of that permission change would be good to figure out too, since it clearly results in spurious COW events - but fixing the nonsensical test that just happened to work before is obviously the CorrectThing(tm) to do regardless. Fixes: 09854ba9 ("mm: do_wp_page() simplification") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616 Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H86Q@mail.gmail.com/Reported-and-tested-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Seth Forshee authored
vsock_connect() expects that the socket could already be in the TCP_ESTABLISHED state when the connecting task wakes up with a signal pending. If this happens the socket will be in the connected table, and it is not removed when the socket state is reset. In this situation it's common for the process to retry connect(), and if the connection is successful the socket will be added to the connected table a second time, corrupting the list. Prevent this by calling vsock_remove_connected() if a signal is received while waiting for a connection. This is harmless if the socket is not in the connected table, and if it is in the table then removing it will prevent list corruption from a double add. Note for backporting: this patch requires d5afa82c ("vsock: correct removal of socket from the list"), which is in all current stable trees except 4.9.y. Fixes: d021c344 ("VSOCK: Introduce VM Sockets") Signed-off-by: Seth Forshee <sforshee@digitalocean.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20220217141312.2297547-1-sforshee@digitalocean.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jonas Gorski authored
This reverts commit 3710e809. Since idm_base and nicpm_base are still optional resources not present on all platforms, this breaks the driver for everything except Northstar 2 (which has both). The same change was already reverted once with 755f5738 ("net: broadcom: fix a mistake about ioremap resource"). So let's do it again. Fixes: 3710e809 ("net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> [florian: Added comments to explain the resources are optional] Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220216184634.2032460-1-f.fainelli@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
Before freeing the hash table in addrconf_exit_net(), we need to make sure the work queue has completed, or risk NULL dereference or UAF. Thus, use cancel_delayed_work_sync() to enforce this. We do not hold RTNL in addrconf_exit_net(), making this safe. Fixes: 8805d13f ("ipv6/addrconf: use one delayed work per netns") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220216182037.3742-1-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Sprinkle for each loops to allow netdevices to be unregistered out of order, as their refs are released. This prevents problems caused by dependencies between netdevs which want to release references in their ->priv_destructor. See commit d6ff94af ("vlan: move dev_put into vlan_dev_uninit") for example. Eric has removed the only known ordering requirement in commit c002496b ("Merge branch 'ipv6-loopback'") so let's try this and see if anything explodes... Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/20220215225310.3679266-2-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
In prep for unregistering netdevs out of order move the netdev state validation and change outside of the loop. While at it modernize this code and use WARN() instead of pr_err() + dump_stack(). Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/20220215225310.3679266-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Xin Long authored
When 'ping' changes to use PING socket instead of RAW socket by: # sysctl -w net.ipv4.ping_group_range="0 100" There is another regression caused when matching sk_bound_dev_if and dif, RAW socket is using inet_iif() while PING socket lookup is using skb->dev->ifindex, the cmd below fails due to this: # ip link add dummy0 type dummy # ip link set dummy0 up # ip addr add 192.168.111.1/24 dev dummy0 # ping -I dummy0 192.168.111.1 -c1 The issue was also reported on: https://github.com/iputils/iputils/issues/104 But fixed in iputils in a wrong way by not binding to device when destination IP is on device, and it will cause some of kselftests to fail, as Jianlin noticed. This patch is to use inet(6)_iif and inet(6)_sdif to get dif and sdif for PING socket, and keep consistent with RAW socket. Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniele Palmas authored
Add quirk CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE for Telit FN990 0x1071 composition in order to avoid bind error. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jakub Kicinski says: ==================== net: ping6: support setting basic SOL_IPV6 options via cmsg Support for IPV6_HOPLIMIT, IPV6_TCLASS, IPV6_DONTFRAG on ICMPv6 sockets and associated tests. I have no immediate plans to implement IPV6_FLOWINFO and all the extension header stuff. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Add a basic test to make sure ping sockets don't crash with IPV6_2292* options. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Test setting IPV6_HOPLIMIT via setsockopt and cmsg across socket types. Output without the kernel support (this series): Case HOPLIMIT ICMP cmsg - packet data returned 1, expected 0 Case HOPLIMIT ICMP diff - packet data returned 1, expected 0 Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Test setting IPV6_TCLASS via setsockopt and cmsg across socket types. Output without the kernel support (this series): Case TCLASS ICMP cmsg - packet data returned 1, expected 0 Case TCLASS ICMP cmsg - rejection returned 0, expected 1 Case TCLASS ICMP diff - pass returned 1, expected 0 Case TCLASS ICMP diff - packet data returned 1, expected 0 Case TCLASS ICMP diff - rejection returned 0, expected 1 Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Test setting IPV6_DONTFRAG via setsockopt and cmsg across socket types. Output without the kernel support (this series): Case DONTFRAG ICMP setsock returned 0, expected 1 Case DONTFRAG ICMP cmsg returned 0, expected 1 Case DONTFRAG ICMP both returned 0, expected 1 Case DONTFRAG ICMP diff returned 0, expected 1 FAIL - 4/24 cases failed Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Support setting IPV6_HOPLIMIT, IPV6_TCLASS, IPV6_DONTFRAG during sendmsg via SOL_IPV6 cmsgs. tclass and dontfrag are init'ed from struct ipv6_pinfo in ipcm6_init_sk(), while hlimit is inited to -1, so we need to handle it being populated via cmsg explicitly. Leave extension headers and flowlabel unimplemented. Those are slightly more laborious to test and users seem to primarily care about IPV6_TCLASS. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Vladimir Oltean says: ==================== kRemove BRENTRY checks from switchdev drivers As discussed here: https://patchwork.kernel.org/project/netdevbpf/patch/20220214233111.1586715-2-vladimir.oltean@nxp.com/#24738869 no switchdev driver makes use of VLAN port objects that lack the BRIDGE_VLAN_INFO_BRENTRY flag. Notifying them in the first place rather seems like an omission of commit 9c86ce2c ("net: bridge: Notify about bridge VLANs"). Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag") that was just merged, the bridge no longer notifies switchdev upon creation of these VLANs, so we can remove the checks from drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag"), the bridge no longer emits switchdev notifiers for VLANs that don't have the BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code. Remove them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag"), the bridge no longer emits switchdev notifiers for VLANs that don't have the BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code. Remove them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag"), the bridge no longer emits switchdev notifiers for VLANs that don't have the BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code. Remove them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag"), the bridge no longer emits switchdev notifiers for VLANs that don't have the BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code. Remove them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag"), the bridge no longer emits switchdev notifiers for VLANs that don't have the BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code. Remove them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Vladimir Oltean says: ==================== Support PTP over UDP with the ocelot-8021q DSA tagging protocol The alternative tag_8021q-based tagger for Ocelot switches, added here: https://patchwork.kernel.org/project/netdevbpf/cover/20210129010009.3959398-1-olteanv@gmail.com/ gained support for PTP over L2 here: https://patchwork.kernel.org/project/netdevbpf/cover/20210213223801.1334216-1-olteanv@gmail.com/ mostly as a minimum viable requirement. That PTP support was mostly self-contained code that installed some rules to replicate PTP packets on the CPU queue, in felix_setup_mmio_filtering(). However ocelot-8021q starts to look more interesting for general purpose usage, so it is now time to reduce the technical debt by integrating the PTP traps used by Felix for tag_8021q with the rest of the Ocelot driver. There is further consolidation of traps to be done. The cookies used by MRP traps overlap with the cookies used for tag_8021q PTP traps, so those features could not be used at the same time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
DSA inherits NETIF_F_CSUM_MASK from master->vlan_features, and the expectation is that TX checksumming is offloaded and not done in software. Normally the DSA master takes care of this, but packets handled by ocelot_defer_xmit() are a very special exception, because they are actually injected into the switch through register-based MMIO. So the DSA master is not involved at all for these packets => no one calculates the checksum. This allows PTP over UDP to work using the ocelot-8021q tagging protocol. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Historically, the felix DSA driver has installed special traps such that PTP over L2 works with the ocelot-8021q tagging protocol; commit 0a6f17c6 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping") has the details. Then the ocelot switch library also gained more comprehensive support for PTP traps through commit 96ca08c0 ("net: mscc: ocelot: set up traps for PTP packets"). Right now, PTP over L2 works using ocelot-8021q via the traps it has set for itself, but nothing else does. Consolidating the two code blocks would make ocelot-8021q gain support for PTP over L4 and tc-flower traps, and at the same time avoid some code and TCAM duplication. The traps are similar in intent, but different in execution, so some explanation is required. The traps set up by felix_setup_mmio_filtering() are VCAP IS1 filters, which have a PAG that chains them to a VCAP IS2 filter, and the IS2 is where the 'trap' action resides. The traps set up by ocelot_trap_add(), on the other hand, have a single filter, in VCAP IS2. The reason for chaining VCAP IS1 and IS2 in Felix was to ensure that the hardcoded traps take precedence and cannot be overridden by the Ocelot switch library. So in principle, the PTP traps needed for ocelot-8021q in the Felix driver can rely on ocelot_trap_add(), but the filters need to be patched to account for a quirk that LS1028A has: the quirk_no_xtr_irq described in commit 0a6f17c6 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping"). Live-patching is done by iterating through the trap list every time we know it has been updated, and transforming a trap into a redirect + CPU copy if ocelot-8021q is in use. Making the DSA ocelot-8021q tagger work with the Ocelot traps means we can eliminate the dedicated OCELOT_VCAP_IS1_TAG_8021Q_PTP_MMIO and OCELOT_VCAP_IS2_TAG_8021Q_PTP_MMIO cookies. To minimize the patch delta, OCELOT_VCAP_IS2_MRP_TRAP takes the place of OCELOT_VCAP_IS2_TAG_8021Q_PTP_MMIO (the alternative would have been to left-shift all cookie numbers by 1). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
There has been some controversy related to the sanity check that a CPU port exists, and commit e8b1d769 ("net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering") even "corrected" an apparent memory leak as static analysis tools see it. However, the check is completely dead code, since the earliest point at which felix_setup_mmio_filtering() can be called is: felix_pci_probe -> dsa_register_switch -> dsa_switch_probe -> dsa_tree_setup -> dsa_tree_setup_cpu_ports -> dsa_tree_setup_default_cpu -> contains the "DSA: tree %d has no CPU port\n" check -> dsa_tree_setup_master -> dsa_master_setup -> sysfs_create_group(&dev->dev.kobj, &dsa_group); -> makes tagging_store() callable -> dsa_tree_change_tag_proto -> dsa_tree_notify -> dsa_switch_event -> dsa_switch_change_tag_proto -> ds->ops->change_tag_protocol -> felix_change_tag_protocol -> felix_set_tag_protocol -> felix_setup_tag_8021q -> felix_setup_mmio_filtering -> breaks at first CPU port So probing would have failed earlier if there wasn't any CPU port defined. To avoid all confusion, delete the dead code and replace it with a comment. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
The ocelot switch library does not need this information, but the felix DSA driver does. As a reminder, the VSC9959 switch in LS1028A doesn't have an IRQ line for packet extraction, so to be notified that a PTP packet needs to be dequeued, it receives that packet also over Ethernet, by setting up a packet trap. The Felix driver needs to install special kinds of traps for packets in need of RX timestamps, such that the packets are replicated both over Ethernet and over the CPU port module. But the Ocelot switch library sets up more than one trap for PTP event messages; it also traps PTP general messages, MRP control messages etc. Those packets don't need PTP timestamps, so there's no reason for the Felix driver to send them to the CPU port module. By knowing which traps need PTP timestamps, the Felix driver can adjust the traps installed using ocelot_trap_add() such that only those will actually get delivered to the CPU port module. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
When using the ocelot-8021q tagging protocol, the CPU port isn't configured as an NPI port, but is a regular port. So a "trap to CPU" operation is actually a "redirect" operation. So DSA needs to set up the trapping action one way or another, depending on the tagging protocol in use. To ease DSA's work of modifying the action, keep all currently installed traps in a list, so that DSA can live-patch them when the tagging protocol changes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-