- 20 Aug, 2024 17 commits
-
-
Gal Pressman authored
When metadata_dst struct is allocated (using metadata_dst_alloc()), it reserves room for options at the end of the struct. Change the memcpy() to unsafe_memcpy() as it is guaranteed that enough room (md_size bytes) was allocated and the field-spanning write is intentional. This resolves the following warning: ------------[ cut here ]------------ memcpy: detected field-spanning write (size 104) of single field "&new_md->u.tun_info" at include/net/dst_metadata.h:166 (size 96) WARNING: CPU: 2 PID: 391470 at include/net/dst_metadata.h:166 tun_dst_unclone+0x114/0x138 [geneve] Modules linked in: act_tunnel_key geneve ip6_udp_tunnel udp_tunnel act_vlan act_mirred act_skbedit cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress sbsa_gwdt ipmi_devintf ipmi_msghandler xfrm_interface xfrm6_tunnel tunnel6 tunnel4 xfrm_user xfrm_algo nvme_fabrics overlay optee openvswitch nsh nf_conncount ib_srp scsi_transport_srp rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm libiscsi ib_ipoib scsi_transport_iscsi ib_cm uio_pdrv_genirq uio mlxbf_pmc pwr_mlxbf mlxbf_bootctl bluefield_edac nft_chain_nat binfmt_misc xt_MASQUERADE nf_nat xt_tcpmss xt_NFLOG nfnetlink_log xt_recent xt_hashlimit xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_mark xt_comment ipt_REJECT nf_reject_ipv4 nft_compat nf_tables nfnetlink sch_fq_codel dm_multipath fuse efi_pstore ip_tables btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 nvme nvme_core mlx5_ib ib_uverbs ib_core ipv6 crc_ccitt mlx5_core crct10dif_ce mlxfw psample i2c_mlxbf gpio_mlxbf2 mlxbf_gige mlxbf_tmfifo CPU: 2 PID: 391470 Comm: handler6 Not tainted 6.10.0-rc1 #1 Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.5.0.12993 Dec 6 2023 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : tun_dst_unclone+0x114/0x138 [geneve] lr : tun_dst_unclone+0x114/0x138 [geneve] sp : ffffffc0804533f0 x29: ffffffc0804533f0 x28: 000000000000024e x27: 0000000000000000 x26: ffffffdcfc0e8e40 x25: ffffff8086fa6600 x24: ffffff8096a0c000 x23: 0000000000000068 x22: 0000000000000008 x21: ffffff8092ad7000 x20: ffffff8081e17900 x19: ffffff8092ad7900 x18: 00000000fffffffd x17: 0000000000000000 x16: ffffffdcfa018488 x15: 695f6e75742e753e x14: 2d646d5f77656e26 x13: 6d5f77656e262220 x12: 646c65696620656c x11: ffffffdcfbe33ae8 x10: ffffffdcfbe1baa8 x9 : ffffffdcfa0a4c10 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001 x5 : ffffff83fdeeb010 x4 : 0000000000000000 x3 : 0000000000000027 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80913f6780 Call trace: tun_dst_unclone+0x114/0x138 [geneve] geneve_xmit+0x214/0x10e0 [geneve] dev_hard_start_xmit+0xc0/0x220 __dev_queue_xmit+0xa14/0xd38 dev_queue_xmit+0x14/0x28 [openvswitch] ovs_vport_send+0x98/0x1c8 [openvswitch] do_output+0x80/0x1a0 [openvswitch] do_execute_actions+0x172c/0x1958 [openvswitch] ovs_execute_actions+0x64/0x1a8 [openvswitch] ovs_packet_cmd_execute+0x258/0x2d8 [openvswitch] genl_family_rcv_msg_doit+0xc8/0x138 genl_rcv_msg+0x1ec/0x280 netlink_rcv_skb+0x64/0x150 genl_rcv+0x40/0x60 netlink_unicast+0x2e4/0x348 netlink_sendmsg+0x1b0/0x400 __sock_sendmsg+0x64/0xc0 ____sys_sendmsg+0x284/0x308 ___sys_sendmsg+0x88/0xf0 __sys_sendmsg+0x70/0xd8 __arm64_sys_sendmsg+0x2c/0x40 invoke_syscall+0x50/0x128 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x38/0x100 el0t_64_sync_handler+0xc0/0xc8 el0t_64_sync+0x1a4/0x1a8 ---[ end trace 0000000000000000 ]--- Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20240818114351.3612692-1-gal@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Zhang Zekun authored
There is a helper function ARRAY_SIZE() to help calculating the u32 array size, and we don't need to do it mannually. So, let's use ARRAY_SIZE() to calculate the array size, and improve the code readability. Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com> Reviewed-by: Jijie Shao<shaojijie@huawei.com> Link: https://patch.msgid.link/20240818052518.45489-1-zhangzekun11@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Looking at timestamped output of netdev CI reveals that most of the time in forwarding tests for custom route hashing is spent on a single case, namely the test which uses ping (mausezahn does not support flow labels). On a non-debug kernel we spend 714 of 730 total test runtime (97%) on this test case. While having flow label support in a traffic gen tool / mausezahn would be best, we can significantly speed up the loop by putting ip vrf exec outside of the iteration. In a test of 1000 pings using a normal loop takes 50 seconds to finish. While using: ip vrf exec $vrf sh -c "$loop-body" takes 12 seconds (1/4 of the time). Some of the slowness is likely due to our inefficient virtualization setup, but even on my laptop running "ip link help" 16k times takes 25-30 seconds, so I think it's worth optimizing even for fastest setups. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20240817203659.712085-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Zhang Zekun authored
for_each_child_of_node can help to iterate through the device_node, and we don't need to use while loop. No functional change with this conversion. Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240816015837.109627-1-zhangzekun11@huawei.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Ido Schimmel says: ==================== Preparations for FIB rule DSCP selector This patchset moves the masking of the upper DSCP bits in 'flowi4_tos' to the core instead of relying on callers of the FIB lookup API to do it. This will allow us to start changing users of the API to initialize the 'flowi4_tos' field with all six bits of the DSCP field. In turn, this will allow us to extend FIB rules with a new DSCP selector. By masking the upper DSCP bits in the core we are able to maintain the behavior of the TOS selector in FIB rules and routes to only match on the lower DSCP bits. While working on this I found two users of the API that do not mask the upper DSCP bits before performing the lookup. The first is an ancient netlink family that is unlikely to be used. It is adjusted in patch #1 to mask both the upper DSCP bits and the ECN bits before calling the API. The second user is a nftables module that differs in this regard from its equivalent iptables module. It is adjusted in patch #2 to invoke the API with the upper DSCP bits masked, like all other callers. The relevant selftest passed, but in the unlikely case that regressions are reported because of this change, we can restore the existing behavior using a new flow information flag as discussed here [1]. The last patch moves the masking of the upper DSCP bits to the core, making the first two patches redundant, but I wanted to post them separately to call attention to the behavior change for these two users of the FIB lookup API. Future patchsets (around 3) will start unmasking the upper DSCP bits throughout the networking stack before adding support for the new FIB rule DSCP selector. Changes from v1 [2]: Patch #3: Include <linux/ip.h> in <linux/in_route.h> instead of including it in net/ip_fib.h [1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/ [2] https://lore.kernel.org/netdev/20240725131729.1729103-1-idosch@nvidia.com/ ==================== Link: https://patch.msgid.link/20240814125224.972815-1-idosch@nvidia.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Ido Schimmel authored
The TOS field in the IPv4 flow information structure ('flowi4_tos') is matched by the kernel against the TOS selector in IPv4 rules and routes. The field is initialized differently by different call sites. Some treat it as DSCP (RFC 2474) and initialize all six DSCP bits, some treat it as RFC 1349 TOS and initialize it using RT_TOS() and some treat it as RFC 791 TOS and initialize it using IPTOS_RT_MASK. What is common to all these call sites is that they all initialize the lower three DSCP bits, which fits the TOS definition in the initial IPv4 specification (RFC 791). Therefore, the kernel only allows configuring IPv4 FIB rules that match on the lower three DSCP bits which are always guaranteed to be initialized by all call sites: # ip -4 rule add tos 0x1c table 100 # ip -4 rule add tos 0x3c table 100 Error: Invalid tos. While this works, it is unlikely to be very useful. RFC 791 that initially defined the TOS and IP precedence fields was updated by RFC 2474 over twenty five years ago where these fields were replaced by a single six bits DSCP field. Extending FIB rules to match on DSCP can be done by adding a new DSCP selector while maintaining the existing semantics of the TOS selector for applications that rely on that. A prerequisite for allowing FIB rules to match on DSCP is to adjust all the call sites to initialize the high order DSCP bits and remove their masking along the path to the core where the field is matched on. However, making this change alone will result in a behavior change. For example, a forwarded IPv4 packet with a DS field of 0xfc will no longer match a FIB rule that was configured with 'tos 0x1c'. This behavior change can be avoided by masking the upper three DSCP bits in 'flowi4_tos' before comparing it against the TOS selectors in FIB rules and routes. Implement the above by adding a new function that checks whether a given DSCP value matches the one specified in the IPv4 flow information structure and invoke it from the three places that currently match on 'flowi4_tos'. Use RT_TOS() for the masking of 'flowi4_tos' instead of IPTOS_RT_MASK since the latter is not uAPI and we should be able to remove it at some point. Include <linux/ip.h> in <linux/in_route.h> since the former defines IPTOS_TOS_MASK which is used in the definition of RT_TOS() in <linux/in_route.h>. No regressions in FIB tests: # ./fib_tests.sh [...] Tests passed: 218 Tests failed: 0 And FIB rule tests: # ./fib_rule_tests.sh [...] Tests passed: 116 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Ido Schimmel authored
As part of its functionality, the nftables FIB expression module performs a FIB lookup, but unlike other users of the FIB lookup API, it does so without masking the upper DSCP bits. In particular, this differs from the equivalent iptables match ("rpfilter") that does mask the upper DSCP bits before the FIB lookup. Align the module to other users of the FIB lookup API and mask the upper DSCP bits using IPTOS_RT_MASK before the lookup. No regressions in nft_fib.sh: # ./nft_fib.sh PASS: fib expression did not cause unwanted packet drops PASS: fib expression did drop packets for 1.1.1.1 PASS: fib expression did drop packets for 1c3::c01d PASS: fib expression forward check with policy based routing Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Ido Schimmel authored
The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB lookup according to user provided parameters and communicate the result back to user space. However, unlike other users of the FIB lookup API, the upper DSCP bits and the ECN bits of the DS field are not masked, which can result in the wrong result being returned. Solve this by masking the upper DSCP bits and the ECN bits using IPTOS_RT_MASK. The structure that communicates the request and the response is not exported to user space, so it is unlikely that this netlink family is actually in use [1]. [1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Wen Gu says: ==================== net/smc: introduce ringbufs usage statistics Currently, we have histograms that show the sizes of ringbufs that ever used by SMC connections. However, they are always incremental and since SMC allows the reuse of ringbufs, we cannot know the actual amount of ringbufs being allocated or actively used. So this patch set introduces statistics for the amount of ringbufs that actually allocated by link group and actively used by connections of a certain net namespace, so that we can react based on these memory usage information, e.g. active fallback to TCP. With appropriate adaptations of smc-tools, we can obtain these ringbufs usage information: $ smcr -d linkgroup LG-ID : 00000500 LG-Role : SERV LG-Type : ASYML VLAN : 0 PNET-ID : Version : 1 Conns : 0 Sndbuf : 12910592 B <- RMB : 12910592 B <- or $ smcr -d stats [...] RX Stats Data transmitted (Bytes) 869225943 (869.2M) Total requests 18494479 Buffer usage (Bytes) 12910592 (12.31M) <- [...] TX Stats Data transmitted (Bytes) 12760884405 (12.76G) Total requests 36988338 Buffer usage (Bytes) 12910592 (12.31M) <- [...] [...] Change log: v3->v2 - use new helper nla_put_uint() instead of nla_put_u64_64bit(). v2->v1 https://lore.kernel.org/r/20240807075939.57882-1-guwen@linux.alibaba.com/ - remove inline keyword in .c files. - use local variable in macros to avoid potential side effects. v1 https://lore.kernel.org/r/20240805090551.80786-1-guwen@linux.alibaba.com/ ==================== Link: https://patch.msgid.link/20240814130827.73321-1-guwen@linux.alibaba.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Wen Gu authored
The buffer size histograms in smc_stats, namely rx/tx_rmbsize, record the sizes of ringbufs for all connections that have ever appeared in the net namespace. They are incremental and we cannot know the actual ringbufs usage from these. So here introduces statistics for current ringbufs usage of existing smc connections in the net namespace into smc_stats, it will be incremented when new connection uses a ringbuf and decremented when the ringbuf is unused. Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Wen Gu authored
Currently we have the statistics on sndbuf/RMB sizes of all connections that have ever been on the link group, namely smc_stats_memsize. However these statistics are incremental and since the ringbufs of link group are allowed to be reused, we cannot know the actual allocated buffers through these. So here introduces the statistic on actual allocated ringbufs of the link group, it will be incremented when a new ringbuf is added into buf_list and decremented when it is deleted from buf_list. Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Zhang Changzhong authored
The check for '!to' is redundant here, since skb_can_coalesce() already contains this check. Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1723730983-22912-1-git-send-email-zhangchangzhong@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yue Haibing authored
Commit a1ab24e5 ("mptcp: consolidate sockopt synchronization") removed the implementation but leave declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240816100404.879598-1-yuehaibing@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yue Haibing authored
These are never implenmented since commit b691b111 ("net/mlx5: Implement devlink port function cmds to control ipsec_packet"). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240816101550.881844-1-yuehaibing@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yue Haibing authored
There is no caller and implementations in tree. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240816101638.882072-1-yuehaibing@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yue Haibing authored
Commit f13697cc ("gve: Switch to config-aware queue allocation") convert this function to gve_rx_alloc_rings_gqi(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240816101906.882743-1-yuehaibing@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Use the netlink policy to validate IPv6 address length. Destination address currently has policy for max len set, and source has no policy validation. In both cases the code does the real check. With correct policy check the code can be removed. Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240816212245.467745-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 19 Aug, 2024 1 commit
-
-
Frank Li authored
fsl,enetc-ptp is embedded pcie device. Add compatible string pci1957,ee02. Fix warning: arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-kbox-a-230-ls.dtb: ethernet@0,4: compatible:0: 'pci1957,ee02' is not one of ['fsl,etsec-ptp', 'fsl,fman-ptp-timer', 'fsl,dpaa2-ptp', 'fsl,enetc-ptp'] Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 Aug, 2024 2 commits
-
-
Simon Horman authored
In bnx2x_get_vf_config(): * The vlan field of ivi is a 32-bit integer, it is used to store a vlan ID. * The vlan field of bulletin is a 16-bit integer, it is also used to store a vlan ID. In the current code, ivi->vlan is set using memset. But in the case of setting it to the value of bulletin->vlan, this involves reading 32 bits from a 16bit source. This is likely safe, as the following 6 bytes are padding in the same structure, but none the less, it seems undesirable. However, it is entirely unclear to me how this scheme works on big-endian systems. Resolve this by simply assigning integer values to ivi->vlan. Flagged by W=1 builds. f.e. gcc-14 reports: In function 'fortify_memcpy_chk', inlined from 'bnx2x_get_vf_config' at .../bnx2x_sriov.c:2655:4: .../fortify-string.h:580:25: warning: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 580 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Link: https://patch.msgid.link/20240815-bnx2x-int-vlan-v1-1-5940b76e37ad@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christoph Paasch authored
mpls_xmit() needs to prepend the MPLS-labels to the packet. That implies one needs to make sure there is enough space for it in the headers. Calling skb_cow() implies however that one wants to change even the playload part of the packet (which is not true for MPLS). Thus, call skb_cow_head() instead, which is what other tunnelling protocols do. Running a server with this comm it entirely removed the calls to pskb_expand_head() from the callstack in mpls_xmit() thus having significant CPU-reduction, especially at peak times. Cc: Roopa Prabhu <roopa@nvidia.com> Reported-by: Craig Taylor <cmtaylor@apple.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240815161201.22021-1-cpaasch@apple.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 16 Aug, 2024 20 commits
-
-
Simon Horman authored
Remove unnecessary NULL check before freeing using kvfree(). This function will ignore a NULL argument. Flagged by Coccinelle: .../txgbe_hw.c:187:2-8: WARNING: NULL check before some freeing functions is not needed. No functional change intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240815-txgbe-kvfree-v1-1-5ecf8656f555@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Aleksander Jan Bajkowski authored
Remove a variable that has never been used. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Link: https://patch.msgid.link/20240815074956.155224-1-olek2@wp.plSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Tariq Toukan authored
Following commit 9f7e8fbb ("net/mlx5: offset comp irq index in name by one"), which fixed the index in IRQ name to start once again from 0, we change the documentation accordingly. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://patch.msgid.link/20240815142343.2254247-1-tariqt@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Frank Li authored
Change mdio.yaml nodename match pattern to '^mdio(-(bus|external))?(@.+|-([0-9]+))$' Fix mdio.yaml wrong parser mdio controller's address instead phy's address when mdio-mux exista. For example: mdio-mux-emi1@54 { compatible = "mdio-mux-mmioreg", "mdio-mux"; mdio@20 { reg = <0x20>; ^^^ This is mdio controller register ethernet-phy@2 { reg = <0x2>; ^^^ This phy's address }; }; }; Only phy's address is limited to 31 because MDIO bus definition. But CHECK_DTBS report below warning: arch/arm64/boot/dts/freescale/fsl-ls1043a-qds.dtb: mdio-mux-emi1@54: mdio@20:reg:0:0: 32 is greater than the maximum of 31 The reason is that "mdio-mux-emi1@54" match "nodename: '^mdio(@.*)?'" in mdio.yaml. Change to '^mdio(-(bus|external))?(@.+|-([0-9]+))?$' to avoid wrong match mdio mux controller's node. Signed-off-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20240815163408.4184705-1-Frank.Li@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queueJakub Kicinski authored
Tony Nguyen says: ==================== ice: iavf: add support for TC U32 filters on VFs Ahmed Zaki says: The Intel Ethernet 800 Series is designed with a pipeline that has an on-chip programmable capability called Dynamic Device Personalization (DDP). A DDP package is loaded by the driver during probe time. The DDP package programs functionality in both the parser and switching blocks in the pipeline, allowing dynamic support for new and existing protocols. Once the pipeline is configured, the driver can identify the protocol and apply any HW action in different stages, for example, direct packets to desired hardware queues (flow director), queue groups or drop. Patches 1-8 introduce a DDP package parser API that enables different pipeline stages in the driver to learn the HW parser capabilities from the DDP package that is downloaded to HW. The parser library takes raw packet patterns and masks (in binary) indicating the packet protocol fields to be matched and generates the final HW profiles that can be applied at the required stage. With this API, raw flow filtering for FDIR or RSS could be done on new protocols or headers without any driver or Kernel updates (only need to update the DDP package). These patches were submitted before [1] but were not accepted mainly due to lack of a user. Patches 9-11 extend the virtchnl support to allow the VF to request raw flow director filters. Upon receiving the raw FDIR filter request, the PF driver allocates and runs a parser lib instance and generates the hardware profile definitions required to program the FDIR stage. These were also submitted before [2]. Finally, patches 12 and 13 add TC U32 filter support to the iavf driver. Using the parser API, the ice driver runs the raw patterns sent by the user and then adds a new profile to the FDIR stage associated with the VF's VSI. Refer to examples in patch 13 commit message. [1]: https://lore.kernel.org/netdev/20230904021455.3944605-1-junfeng.guo@intel.com/ [2]: https://lore.kernel.org/intel-wired-lan/20230818064703.154183-1-junfeng.guo@intel.com/ * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: add support for offloading tc U32 cls filters iavf: refactor add/del FDIR filters ice: enable FDIR filters from raw binary patterns for VFs ice: add method to disable FDIR SWAP option virtchnl: support raw packet in protocol header ice: add API for parser profile initialization ice: add UDP tunnels support to the parser ice: support turning on/off the parser's double vlan mode ice: add parser execution main loop ice: add parser internal helper functions ice: add debugging functions for the parser sections ice: parse and init various DDP parser sections ice: add parser create and destroy skeleton ==================== Link: https://patch.msgid.link/20240813222249.3708070-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Pavan Kumar Linga authored
'req_vec_chunks' is used to store the vector info received from the device control plane. The memory for it is allocated in idpf_send_alloc_vectors_msg and returns an error if the memory allocation fails. 'req_vec_chunks' cannot be NULL in the later code flow. So remove the conditional check to extract the vector ids received from the device control plane. Smatch static checker warning: drivers/net/ethernet/intel/idpf/idpf_lib.c:417 idpf_intr_req() error: we previously assumed 'adapter->req_vec_chunks' could be null (see line 360) Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/intel-wired-lan/a355ae8a-9011-4a85-a4d1-5b2793bb5f7b@stanley.mountain/Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240814175903.4166390-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Rosen Penev says: ==================== use more devm for ag71xx Some of these were introduced after the driver got introduced. In any case, using more devm allows removal of the remove function and overall simplifies the code. All of these were tested on a TP-LINK Archer C7v2. ==================== Link: https://patch.msgid.link/20240813170516.7301-1-rosenp@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rosen Penev authored
Allows completely removing the remove function. Nothing is being done manually now. Tested on TP-LINK Archer C7v2. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240813170516.7301-4-rosenp@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rosen Penev authored
Allows removing ag71xx_mdio_remove. Removed ag.mii_bus variable. Local one can be used with devm. Easier to reason about as mii_bus is only used here now. Also shrinks the struct slightly. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240813170516.7301-3-rosenp@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rosen Penev authored
Allows removal of clk_prepare_enable to simplify the code slightly. Tested on a TP-LINK Archer C7v2. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240813170516.7301-2-rosenp@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Ido Schimmel says: ==================== selftests: fib_rule_tests: Cleanups and new tests This patchset performs some cleanups and adds new tests in preparation for upcoming FIB rule DSCP selector. ==================== Link: https://patch.msgid.link/20240814111005.955359-1-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The TOS value reaches the FIB rule core via different call paths when an input route is looked up compared to an output route. Re-test TOS matching with input routes to exercise these code paths. Pass the 'iif' and 'from' selectors separately from the 'get{,no}match' variables as otherwise the test name is too long to be printed without misalignments. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20240814111005.955359-6-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The fib_rule{4,6}_connect tests verify that locally generated traffic from a socket that specifies a DS Field using the IP_TOS / IPV6_TCLASS socket options is correctly redirected using a FIB rule that matches on the given DS Field. Add negative tests to verify that the FIB rule is not hit when the socket specifies a DS Field that differs from the one used by the FIB rule. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20240814111005.955359-5-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The fib_rule{4,6} tests verify the behavior of a given FIB rule selector (e.g., dport, sport) by redirecting to a routing table with a default route using a FIB rule with the given selector and checking that a route lookup using the selector matches this default route. Add negative tests to verify that a FIB rule is not hit when it should not. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20240814111005.955359-4-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
Clarify the test results by grouping the output of test cases belonging to the same test under a common title. This is consistent with the output of fib_tests.sh. Before: # ./fib_rule_tests.sh TEST: rule6 check: oif redirect to table [ OK ] TEST: rule6 del by pref: oif redirect to table [ OK ] [...] TEST: rule4 check: oif redirect to table [ OK ] TEST: rule4 del by pref: oif redirect to table [ OK ] [...] Tests passed: 116 Tests failed: 0 After: # ./fib_rule_tests.sh IPv6 FIB rule tests TEST: rule6 check: oif redirect to table [ OK ] TEST: rule6 del by pref: oif redirect to table [ OK ] [...] IPv4 FIB rule tests TEST: rule4 check: oif redirect to table [ OK ] TEST: rule4 del by pref: oif redirect to table [ OK ] [...] Tests passed: 116 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20240814111005.955359-3-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The functions are unused since commit 816cda9a ("selftests: net: fib_rule_tests: add support to select a test to run"). Remove them. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20240814111005.955359-2-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Simon Horman says: ==================== ipv6: Add ipv6_addr_{cpu_to_be32,be32_to_cpu} helpers This series adds and uses some new helpers, ipv6_addr_{cpu_to_be32,be32_to_cpu}, which are intended to assist in byte order manipulation of IPv6 addresses stored as as arrays. v1: https://lore.kernel.org/r/20240812-ipv6_addr-helpers-v1-0-aab5d1f35c40@kernel.org ==================== Link: https://patch.msgid.link/20240813-ipv6_addr-helpers-v2-0-5c974f8cca3e@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
Use new ipv6_addr_cpu_to_be32 and ipv6_addr_be32_to_cpu helpers, and IPV6_ADDR_WORDS. This is arguably slightly nicer. No functional change intended. Compile tested only. Suggested-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/netdev/c7684349-535c-45a4-9a74-d47479a50020@lunn.ch/Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240813-ipv6_addr-helpers-v2-3-5c974f8cca3e@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
Use ipv6_addr_cpu_to_be32 and ipv6_addr_be32_to_cpu helpers to convert address, rather than open coding the conversion. No functional change intended. Compile tested only. Suggested-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/netdev/c7684349-535c-45a4-9a74-d47479a50020@lunn.ch/Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240813-ipv6_addr-helpers-v2-2-5c974f8cca3e@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
Add helpers to convert an ipv6 addr, expressed as an array of words, from CPU to big-endian byte order, and vice versa. No functional change intended. Compile tested only. Suggested-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/netdev/c7684349-535c-45a4-9a74-d47479a50020@lunn.ch/Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240813-ipv6_addr-helpers-v2-1-5c974f8cca3e@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-