- 23 Oct, 2023 40 commits
-
-
Jakub Kicinski authored
Jiri Pirko says: ==================== devlink: finish conversion to generated split_ops This patchset converts the remaining genetlink commands to generated split_ops and removes the existing small_ops arrays entirely alongside with shared netlink attribute policy. Patches #1-#6 are just small preparations and small fixes on multiple places. Note that couple of patches contain the "Fixes" tag but no need to put them into -net tree. Patch #7 is a simple rename preparation Patch #8 is the main one in this set and adds actual definitions of cmds in to yaml file. Patches #9-#10 finalize the change removing bits that are no longer in use. ==================== Link: https://lore.kernel.org/r/20231021112711.660606-1-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
All commands are now covered by generated split_ops. Remove the small_ops entirely alongside with unified devlink netlink policy array. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-11-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
The prototypes are now generated, remove the old ones. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-10-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
Currently, some of the commands are not described in devlink yaml file and are manually filled in net/devlink/netlink.c in small_ops. To make all part of split_ops, add definitions of the rest of the commands alongside with needed attributes and enums. Note that this focuses on the kernel side. The requests are fully described in order to generate split_op alongside with policies. Follow-up will describe the replies in order to make the userspace helpers complete. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-9-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
All remaining doit and dumpit netlink callback functions are going to be used by generated split ops. They expect certain name format. Rename the callback to be aligned with generated names. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-8-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
Since this enum is going to be used in generated userspace file, name it properly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-7-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
Make dont-validate field more compact and push it into a single line. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-6-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
devlink-get command does not contain reload-action attr in reply. Remove it. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-5-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
Due to the check in RenderInfo class constructor, type_consistent flag is set to False to avoid rendering the same response parsing helper for do and dump ops. However, in case there is no do, the helper needs to be rendered for dump op. So split check to achieve that. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-4-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
Introduce support for attribute type bitfield32. Note that since the generated code works with struct nla_bitfield32, the generator adds netlink.h to the list of includes for userspace headers in case any bitfield32 is present. Note that this is added only to genetlink-legacy scheme as requested by Jakub Kicinski. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-3-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jiri Pirko authored
Currently, split ops of doit and dumpit are merged into a single iter item when they are subsequent. However, there is no guarantee that the dumpit op is for the same cmd as doit op. Fix this by checking if cmd is the same for both. This problem does not occur in existing families. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231021112711.660606-2-jiri@resnulli.usSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Jacob Keller says: ==================== Intel Wired LAN Driver Updates 2023-10-19 (idpf) This series contains two fixes for the recently merged idpf driver. Michal adds missing logic for programming the scheduling mode of completion queues. Pavan fixes a call trace caused by the mailbox work item not being canceled properly if an error occurred during initialization. ==================== Link: https://lore.kernel.org/r/20231023202655.173369-1-jacob.e.keller@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Pavan Kumar Linga authored
In idpf_vc_core_init, the mailbox work is queued on a mailbox workqueue but it is not cancelled on error. This results in a call trace when idpf_mbx_task tries to access the freed mailbox queue pointer. Fix it by cancelling the mailbox work in the error path. Fixes: 4930fbf4 ("idpf: add core init and interrupt request") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231023202655.173369-3-jacob.e.keller@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michal Kubiak authored
The HW must be programmed differently for queue-based scheduling mode. To program the completion queue context correctly, the control plane must know the scheduling mode not only for the Tx queue, but also for the completion queue. Unfortunately, currently the driver sets the scheduling mode only for the Tx queues. Propagate the scheduling mode data for the completion queue as well when sending the queue configuration messages. Fixes: 1c325aac ("idpf: configure resources for TX queues") Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Alan Brady <alan.brady@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231023202655.173369-2-jacob.e.keller@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
If packets of a TCP flows take the fast path, we need to make sure sk->sk_pacing_status is set to SK_PACING_FQ otherwise TCP might fallback to internal pacing, which is not optimal. Fixes: 076433bd ("net_sched: sch_fq: add fast path for mostly idle qdisc") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231020201254.732527-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
A last minute change went wrong. We need to look for a packet in all 3 bands, not only two. Fixes: 29f834aa ("net_sched: sch_fq: add 3 bands and WRR scheduling") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202310201422.a22b0999-oliver.sang@intel.comSigned-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Dave Taht <dave.taht@gmail.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Tested-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231020200053.675951-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rob Herring authored
Commit a243ecc3 ("net: mdio: xgene: Use device_get_match_data()") dropped the unconditional use of xgene_mdio_of_match resulting in this warning: drivers/net/mdio/mdio-xgene.c:303:34: warning: unused variable 'xgene_mdio_of_match' [-Wunused-const-variable] The fix is to drop of_match_ptr() which is not necessary because DT is always used for this driver (well, it could in theory support ACPI only, but CONFIG_OF is always enabled for arm64). Fixes: a243ecc3 ("net: mdio: xgene: Use device_get_match_data()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310170832.xnVXw1bb-lkp@intel.com/Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20231019182345.833136-1-robh@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
checkpatch gets confused and treats __attribute__ as a function call. It complains about white space before "(": WARNING:SPACING: space prohibited between function name and open parenthesis '(' + struct netdev_queue_get_rsp obj __attribute__ ((aligned (8))); No spaces wins in the kernel: $ git grep 'attribute__((.*aligned(' | wc -l 480 $ git grep 'attribute__ ((.*aligned (' | wc -l 110 $ git grep 'attribute__ ((.*aligned(' | wc -l 94 $ git grep 'attribute__((.*aligned (' | wc -l 63 So, whatever, change the codegen. Note that checkpatch also thinks we should use __aligned(), but this is user space code. Link: https://lore.kernel.org/all/202310190900.9Dzgkbev-lkp@intel.com/Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231020221827.3436697-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sabrina Dubroca authored
Prior to commit 1a074f76 ("tls: also use init_prot_info in tls_set_device_offload"), setting TLS_HW on TX didn't touch prot->aad_size and prot->tail_size. They are set to 0 during context allocation (tls_prot_info is embedded in tls_context, kzalloc'd by tls_ctx_create). When the RX key is configured, tls_set_sw_offload is called (for both TLS_SW and TLS_HW). If the TX key is configured in TLS_HW mode after the RX key has been installed, init_prot_info will now overwrite the correct values of aad_size and tail_size, breaking SW decryption and causing -EBADMSG errors to be returned to userspace. Since TLS_HW doesn't use aad_size and tail_size at all (for TLS1.2, tail_size is always 0, and aad_size is equal to TLS_HEADER_SIZE + rec_seq_size), we can simply drop this hunk. Fixes: 1a074f76 ("tls: also use init_prot_info in tls_set_device_offload") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Ran Rozenstein <ranro@nvidia.com> Link: https://lore.kernel.org/r/979d2f89a6a994d5bb49cae49a80be54150d094d.1697653889.git.sd@queasysnail.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Su Hui authored
'err' is useless after break, remove this to save space and be more clear. Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Eric Dumazet says: ==================== tcp: add optional usec resolution to TCP TS As discussed in various public places in 2016, Google adopted usec resolution in RFC 7323 TS values, at Van Jacobson suggestion. Goals were : 1) better observability of delays in networking stacks/fabrics. 2) better disambiguation of events based on TSval/ecr values. 3) building block for congestion control modules needing usec resolution. Back then we implemented a schem based on private SYN options to safely negotiate the feature. For upstream submission, we chose to use a much simpler route attribute because this feature is probably going to be used in private networks. ip route add 10/8 ... features tcp_usec_ts References: https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf https://datatracker.ietf.org/doc/draft-wang-tcpm-low-latency-opt/ First two patches are fixing old minor bugs and might be taken by stable teams (thanks to appropriate Fixes: tags) ==================== Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Add the ability to report in tcp_info.tcpi_options if a flow is using usec resolution in TCP TS val. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Back in 2015, Van Jacobson suggested to use usec resolution in TCP TS values. This has been implemented in our private kernels. Goals were : 1) better observability of delays in networking stacks. 2) better disambiguation of events based on TSval/ecr values. 3) building block for congestion control modules needing usec resolution. Back then we implemented a schem based on private SYN options to negotiate the feature. For upstream submission, we chose to use a route attribute, because this feature is probably going to be used in private networks [1] [2]. ip route add 10/8 ... features tcp_usec_ts Note that RFC 7323 recommends a "timestamp clock frequency in the range 1 ms to 1 sec per tick.", but also mentions "the maximum acceptable clock frequency is one tick every 59 ns." [1] Unfortunately RFC 7323 5.5 (Outdated Timestamps) suggests to invalidate TS.Recent values after a flow was idle for more than 24 days. This is the part making usec_ts a problem for peers following this recommendation for long living idle flows. [2] Attempts to standardize usec ts went nowhere: https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf https://datatracker.ietf.org/doc/draft-wang-tcpm-low-latency-opt/Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
tcp_paws_check() uses TCP_PAWS_24DAYS constant to detect if TCP TS values might have wrapped after a long idle period. This mechanism is described in RFC 7323 5.5 (Outdated Timestamps) TCP_PAWS_24DAYS value was based on the assumption of a clock of 1 Khz. As we want to adopt a 1 Mhz clock in the future, we reduce this constant. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
This new dst feature flag will be used to allow TCP to use usec based timestamps instead of msec ones. ip route .... feature tcp_usec_ts Also document that RTAX_FEATURE_SACK and RTAX_FEATURE_TIMESTAMP are unused. RTAX_FEATURE_ALLFRAG is also going away soon. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Before adding usec TS support, add tcp_rtt_tsopt_us() helper to factorize code. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
This helper returns a TSval from a TCP socket. It currently calls tcp_time_stamp_ms() but will soon be able to return a usec based TSval, depending on an upcoming tp->tcp_usec_ts field. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
tcp_ns_to_ts() is only used once from cookie_init_timestamp(). Also add the 'bool usec_ts' parameter to enable usec TS later. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
This helper returns a 32bit TCP TSval from skb->tstamp. As we are going to support usec or ms units soon, rename it to tcp_skb_timestamp_ts() and add a boolean to select the unit. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
In preparation of usec TCP TS support, remove tcp_time_stamp_raw() in favor of tcp_clock_ts() helper. This helper will return a suitable 32bit result to feed TS values, depending on a socket field. Also add tcp_tw_tsval() and tcp_rsk_tsval() helpers to factorize the details. We do not yet support usec timestamps. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
It delivers current TCP time stamp in ms unit, and is used in place of confusing tcp_time_stamp_raw() It is the same family than tcp_clock_ns() and tcp_clock_ms(). tcp_time_stamp_raw() will be replaced later for TSval contexts with a more descriptive name. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
In preparation of adding usec TCP TS values, add tcp_time_stamp_ms() for contexts needing ms based values. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
cookie_init_timestamp() is supposed to return a 64bit timestamp suitable for both TSval determination and setting of skb->tstamp. Unfortunately it uses 32bit fields and overflows after 2^32 * 10^6 nsec (~49 days) of uptime. Generated TSval are still correct, but skb->tstamp might be set far away in the past, potentially confusing other layers. tcp_ns_to_ts() is changed to return a full 64bit value, ts and ts_now variables are changed to u64 type, and TSMASK is removed in favor of shifts operations. While we are at it, change this sequence: ts >>= TSBITS; ts--; ts <<= TSBITS; ts |= options; to: ts -= (1UL << TSBITS); Fixes: 9a568de4 ("tcp: switch TCP TS option (RFC 7323) to 1ms clock") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
tp->rcv_tstamp should be set to tcp_jiffies, not tcp_time_stamp(). Fixes: cc35c88a ("crypto : chtls - CPL handler definition") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Beniamino Galvani says: ==================== net: consolidate IPv6 route lookup for UDP tunnels At the moment different UDP tunnels rely on different functions for IPv6 route lookup, and those functions all implement the same logic. Extend the generic lookup function so that it is suitable for all UDP tunnel implementations, and then adapt bareudp, geneve and vxlan to use it. This is similar to what already done for IPv4. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Beniamino Galvani authored
The route lookup can be done now via generic function udp_tunnel6_dst_lookup() to replace the custom implementation in vxlan6_get_route(). This is similar to what already done for IPv4 in commit 6f19b2c1 ("vxlan: use generic function for tunnel IPv4 route lookup"). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Beniamino Galvani authored
The route lookup can be done now via generic function udp_tunnel6_dst_lookup() to replace the custom implementation in geneve_get_v6_dst(). This is similar to what already done for IPv4 in commit daa2ba7e ("geneve: use generic function for tunnel IPv4 route lookup"). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Beniamino Galvani authored
We want to make the function more generic so that it can be used by other UDP tunnel implementations such as geneve and vxlan. To do that, add the following arguments: - source and destination UDP port; - ifindex of the output interface, needed by vxlan; - the tos, because in some cases it is not taken from struct ip_tunnel_info (for example, when it's inherited from the inner packet); - the dst cache, because not all tunnel types (e.g. vxlan) want to use the one from struct ip_tunnel_info. With these parameters, the function no longer needs the full struct ip_tunnel_info as argument and we can pass only the relevant part of it (struct ip_tunnel_key). This is similar to what already done for IPv4 in commit 72fc68c6 ("ipv4: add new arguments to udp_tunnel_dst_lookup()"). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Beniamino Galvani authored
The function is now UDP-specific, the protocol is always IPPROTO_UDP. This is similar to what already done for IPv4 in commit 78f3655a ("ipv4: remove "proto" argument from udp_tunnel_dst_lookup()"). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Beniamino Galvani authored
At the moment ip6_dst_lookup_tunnel() is used only by bareudp. Ideally, other UDP tunnel implementations should use it, but to do so the function needs to accept new parameters that are specific for UDP tunnels, such as the ports. Prepare for these changes by renaming the function to udp_tunnel6_dst_lookup() and move it to file net/ipv6/ip6_udp_tunnel.c. This is similar to what already done for IPv4 in commit bf3fcbf7 ("ipv4: rename and move ip_route_output_tunnel()"). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-