- 20 Feb, 2023 16 commits
-
-
Eric Dumazet authored
Change ndisc_recv_ns() to return a drop reason. For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED or SKB_CONSUMED. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
kfree_skb() includes the location, it makes sense to add it to consume_skb() as well. After patch: taskd_EventMana 8602 [004] 420.406239: skb:consume_skb: skbaddr=0xffff893a4a6d0500 location=unix_stream_read_generic swapper 0 [011] 422.732607: skb:consume_skb: skbaddr=0xffff89597f68cee0 location=mlx4_en_free_tx_desc discipline 9141 [043] 423.065653: skb:consume_skb: skbaddr=0xffff893a487e9c00 location=skb_consume_udp swapper 0 [010] 423.073166: skb:consume_skb: skbaddr=0xffff8949ce9cdb00 location=icmpv6_rcv borglet 8672 [014] 425.628256: skb:consume_skb: skbaddr=0xffff8949c42e9400 location=netlink_dump swapper 0 [028] 426.263317: skb:consume_skb: skbaddr=0xffff893b1589dce0 location=net_rx_action wget 14339 [009] 426.686380: skb:consume_skb: skbaddr=0xffff893a51b552e0 location=tcp_rcv_state_process Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xuan Zhuo authored
When we try to start AF_XDP on some machines with long running time, due to the machine's memory fragmentation problem, there is no sufficient contiguous physical memory that will cause the start failure. If the size of the queue is 8 * 1024, then the size of the desc[] is 8 * 1024 * 8 = 16 * PAGE, but we also add struct xdp_ring size, so it is 16page+. This is necessary to apply for a 4-order memory. If there are a lot of queues, it is difficult to these machine with long running time. Here, that we actually waste 15 pages. 4-Order memory is 32 pages, but we only use 17 pages. This patch replaces __get_free_pages() by vmalloc() to allocate memory to solve these problems. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
Vladimir Oltean says: ==================== taprio queueMaxSDU fixes This fixes 3 issues noticed while attempting to reoffload the dynamically calculated queueMaxSDU values. These are: - Dynamic queueMaxSDU is not calculated correctly due to a lost patch - Dynamically calculated queueMaxSDU needs to be clamped on the low end - Dynamically calculated queueMaxSDU needs to be clamped on the high end ==================== Link: https://lore.kernel.org/r/20230215224632.2532685-1-vladimir.oltean@nxp.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Vladimir Oltean authored
It makes no sense to keep randomly large max_sdu values, especially if larger than the device's max_mtu. These are visible in "tc qdisc show". Such a max_sdu is practically unlimited and will cause no packets for that traffic class to be dropped on enqueue. Just set max_sdu_dynamic to U32_MAX, which in the logic below causes taprio to save a max_frm_len of U32_MAX and a max_sdu presented to user space of 0 (unlimited). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Vladimir Oltean authored
The overhead specified in the size table comes from the user. With small time intervals (or gates always closed), the overhead can be larger than the max interval for that traffic class, and their difference is negative. What we want to happen is for max_sdu_dynamic to have the smallest non-zero value possible (1) which means that all packets on that traffic class are dropped on enqueue. However, since max_sdu_dynamic is u32, a negative is represented as a large value and oversized dropping never happens. Use max_t with int to force a truncation of max_frm_len to no smaller than dev->hard_header_len + 1, which in turn makes max_sdu_dynamic no smaller than 1. Fixes: fed87cc6 ("net/sched: taprio: automatically calculate queueMaxSDU based on TC gate durations") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Vladimir Oltean authored
taprio_calculate_gate_durations() depends on netdev_get_num_tc() and this returns 0. So it calculates the maximum gate durations for no traffic class. I had tested the blamed commit only with another patch in my tree, one which in the end I decided isn't valuable enough to submit ("net/sched: taprio: mask off bits in gate mask that exceed number of TCs"). The problem is that having this patch threw off my testing. By moving the netdev_set_num_tc() call earlier, we implicitly gave to taprio_calculate_gate_durations() the information it needed. Extract only the portion from the unsubmitted change which applies the mqprio configuration to the netdev earlier. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230130173145.475943-15-vladimir.oltean@nxp.com/ Fixes: a306a90c ("net/sched: taprio: calculate tc gate durations") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
David Howells authored
Fix three cases of overproduction of wakeups: (1) rxrpc_input_split_jumbo() conditionally notifies the app that there's data for recvmsg() to collect if it queues some data - and then its only caller, rxrpc_input_data(), goes and wakes up recvmsg() anyway. Fix the rxrpc_input_data() to only do the wakeup in failure cases. (2) If a DATA packet is received for a call by the I/O thread whilst recvmsg() is busy draining the call's rx queue in the app thread, the call will left on the recvmsg() queue for recvmsg() to pick up, even though there isn't any data on it. This can cause an unexpected recvmsg() with a 0 return and no MSG_EOR set after the reply has been posted to a service call. Fix this by discarding pending calls from the recvmsg() queue that don't need servicing yet. (3) Not-yet-completed calls get requeued after having data read from them, even if they have no data to read. Fix this by only requeuing them if they have data waiting on them; if they don't, the I/O thread will requeue them when data arrives or they fail. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/3386149.1676497685@warthog.procyon.org.ukSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Alex Elder says: ==================== net: final GSI register updates I believe this is the last set of changes required to allow IPA v5.0 to be supported. There is a little cleanup work remaining, but that can happen in the next Linux release cycle. Otherwise we just need config data and register definitions for IPA v5.0 (and DTS updates). These are ready but won't be posted without further testing. The first patch in this series fixes a minor bug in a patch just posted, which I found too late. The second eliminates the GSI memory "adjustment"; this was done previously to avoid/delay the need to implement a more general way to define GSI register offsets. Note that this patch causes "checkpatch" warnings due to indentation that aligns with an open parenthesis. The third patch makes use of the newly-defined register offsets, to eliminate the need for a function that hid a few details. The next modifies a different helper function to work properly for IPA v5.0+. The fifth patch changes the way the event ring size is specified based on how it's now done for IPA v5.0+. And the last defines a new register required for IPA v5.0+. ==================== Link: https://lore.kernel.org/r/20230215195352.755744-1-elder@linaro.orgSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alex Elder authored
Starting at IPA v5.0, the number of event rings per EE is defined in a field in a new HW_PARAM_4 GSI register rather than HW_PARAM_2. Define this new register and its fields, and update the code that checks the number of rings supported by hardware to use the proper field based on IPA version. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alex Elder authored
Starting with IPA v5.0, a channel's event ring index is encoded in a field in the CH_C_CNTXT_1 GSI register rather than CH_C_CNTXT_0. Define a new field ID for the former register and encode the event ring in the appropriate register. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alex Elder authored
The GSI channel protocol field in the CH_C_CNTXT_0 GSI register is widened starting IPA v5.0, making the CHTYPE_PROTOCOL_MSB field added in IPA v4.5 unnecessary. Update the code to reflect this. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alex Elder authored
Now that we explicitly define each register field width there is no need to have a special encoding function for the event ring length. Add a field for this to the EV_CH_E_CNTXT_1 GSI register, and use it in place of ev_ch_e_cntxt_1_length_encode() (which can be removed). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alex Elder authored
Starting at IPA v4.5, almost all GSI registers had their offsets changed by a fixed amount (shifted downward by 0xd000). Rather than defining offsets for all those registers dependent on version, an adjustment was applied for most register accesses. This was implemented in commit cdeee49f ("net: ipa: adjust GSI register addresses"). It was later modified to be a bit more obvious about the adjusment, in commit 571b1e7e ("net: ipa: use a separate pointer for adjusted GSI memory"). We now are able to define every GSI register with its own offset, so there's no need to implement this special adjustment. So get rid of the "virt_raw" pointer, and just maintain "virt" as the (non-adjusted) base address of I/O mapped GSI register memory. Redefine the offsets of all GSI registers (other than the INTER_EE ones, which were not subject to the adjustment) for IPA v4.5+, subtracting 0xd000 from their defined offsets instead. Move the ERROR_LOG and ERROR_LOG_CLR definitions further down in the register definition files so all registers are defined in order of their offset. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alex Elder authored
I spotted an error in a patch posted this week, unfortunately just after it got accepted. The effect of the bug is that time-based interrupt moderation is disabled. This is not technically a bug, but it is not what is intended. The problem is that a |= assignment got implemented as a simple assignment, so the previously assigned value was ignored. Fixes: edc6158b ("net: ipa: define fields for event-ring related registers") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Lorenzo Bianconi authored
Do not always add NETDEV_XDP_ACT_XSK_ZEROCOPY bit in xdp_features flag but check if the NIC really supports it. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/3dba6ea42dc343a9f2d7d1a6a6a6c173235e1ebf.1676471386.git.lorenzo@kernel.orgSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
- 17 Feb, 2023 2 commits
-
-
David S. Miller authored
Some of the devlink bits were tricky, but I think I got it right. Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://anongit.freedesktop.org/drm/drmLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Just a final collection of misc fixes, the biggest disables the recently added dynamic debugging support, it has a regression that needs some bigger fixes. Otherwise a bunch of fixes across the board, vc4, amdgpu and vmwgfx mostly, with some smaller i915 and ast fixes. drm: - dynamic debug disable for now fbdev: - deferred i/o device close fix amdgpu: - Fix GC11.x suspend warning - Fix display warning vc4: - YUV planes fix - hdmi display fix - crtc reduced blanking fix ast: - fix start address computation vmwgfx: - fix bo/handle races i915: - gen11 WA fix" * tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Fail atomic_check early on normalize_zpos error drm/amd/amdgpu: fix warning during suspend drm/vmwgfx: Do not drop the reference to the handle too soon drm/vmwgfx: Stop accessing buffer objects which failed init drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list drm: Disable dynamic debug as broken drm/ast: Fix start address computation fbdev: Fix invalid page access after closing deferred I/O devices drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking drm/vc4: hdmi: Always enable GCP with AVMUTE cleared drm/vc4: Fix YUV plane handling when planes are in different buffers
-
- 16 Feb, 2023 22 commits
-
-
Dave Airlie authored
Merge tag 'drm-intel-fixes-2023-02-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Moving gen11 hw wa to the right place. (Matt) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y+47eUvwbafER35/@intel.com
-
git://anongit.freedesktop.org/drm/drm-miscDave Airlie authored
Multiple fixes in vc4 to address issues with YUV planes, HDMI and CRTC; an invalid page access fix for fbdev, mark dynamic debug as broken, a double free and refcounting fix for vmwgfx. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20230216091905.i5wswy4dd74x4br5@houat
-
Dave Airlie authored
Merge tag 'amd-drm-fixes-6.2-2023-02-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.2-2023-02-15: amdgpu: - Fix GC11.x suspend warning - Fix display warning Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230216041122.7714-1-alexander.deucher@amd.com
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Fixes from the main networking tree only, probably because all sub-trees have backed off and haven't submitted their changes. None of the fixes here are particularly scary and no outstanding regressions. In an ideal world the "current release" sections would be empty at this stage but that never happens. Current release - regressions: - fix unwanted sign extension in netdev_stats_to_stats64() Current release - new code bugs: - initialize net->notrefcnt_tracker earlier - devlink: fix netdev notifier chain corruption - nfp: make sure mbox accesses in IPsec code are atomic - ice: fix check for weight and priority of a scheduling node Previous releases - regressions: - ice: xsk: fix cleaning of XDP_TX frame, prevent inf loop - igb: fix I2C bit banging config with external thermal sensor Previous releases - always broken: - sched: tcindex: update imperfect hash filters respecting rcu - mpls: fix stale pointer if allocation fails during device rename - dccp/tcp: avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions - remove WARN_ON_ONCE(sk->sk_forward_alloc) from sk_stream_kill_queues() - af_key: fix heap information leak - ipv6: fix socket connection with DSCP (correct interpretation of the tclass field vs fib rule matching) - tipc: fix kernel warning when sending SYN message - vmxnet3: read RSS information from the correct descriptor (eop)" * tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits) devlink: Fix netdev notifier chain corruption igb: conditionalize I2C bit banging on external thermal sensor support net: mpls: fix stale pointer if allocation fails during device rename net/sched: tcindex: search key must be 16 bits tipc: fix kernel warning when sending SYN message igb: Fix PPS input and output using 3rd and 4th SDP net: use a bounce buffer for copying skb->mark ixgbe: add double of VLAN header when computing the max MTU i40e: add double of VLAN header when computing the max MTU ixgbe: allow to increase MTU to 3K with XDP enabled net: stmmac: Restrict warning on disabling DMA store and fwd mode net/sched: act_ctinfo: use percpu stats net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence ice: fix lost multicast packets in promisc mode ice: Fix check for weight and priority of a scheduling node bnxt_en: Fix mqprio and XDP ring checking logic net: Fix unwanted sign extension in netdev_stats_to_stats64() net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path net: openvswitch: fix possible memory leak in ovs_meter_cmd_set() af_key: Fix heap information leak ...
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull block fixes from Jens Axboe: "Just a few NVMe fixes that should go into the 6.2 release, adding a quirk and fixing two issues introduced in this release: - NVMe fixes via Christoph: - Always return an ERR_PTR from nvme_pci_alloc_dev (Irvin Cote) - Add bogus ID quirk for ADATA SX6000PNP (Daniel Wagner) - Set the DMA mask earlier (Christoph Hellwig)" * tag 'block-6.2-2023-02-16' of git://git.kernel.dk/linux: nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev nvme-pci: set the DMA mask earlier nvme-pci: add bogus ID quirk for ADATA SX6000PNP
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spiLinus Torvalds authored
Pull spi fix from Mark Brown: "One more last minute patch for v6.2 updating the parsing of the newly added spi-cs-setup-delay-ns. It's been pointed out that due to the way DT parsing works the change in property size is ABI visible so let's not let a release go out without it being fixed. The change got split from some earlier ABI related fixes to the property since the first version sent had a build error" * tag 'spi-v6.2-rc8-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: Use a 32-bit DT property for spi-cs-setup-delay-ns
-
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linuxLinus Torvalds authored
Pull gpio fixes from Bartosz Golaszewski: - fix a potential Kconfig issue with gpio-mlxbf2 not selecting GPIOLIB_IRQCHIP - another immutable irqchip conversion, this time for gpio-vf610 - fix a wakeup issue on Clevo NH5xAx * tag 'gpio-fixes-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: mlxbf2: select GPIOLIB_IRQCHIP gpiolib: acpi: Add a ignore wakeup quirk for Clevo NH5xAx gpio: vf610: make irq_chip immutable gpiolib: acpi: remove redundant declaration
-
Christoph Hellwig authored
The uuid code is very low maintainance now that the major overhaul has completed, and doesn't need it's own tree. All the recent work has been done by Andy who'd like to stay on as a reviewer without an explicit tree. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Andy Shevchenko <andy@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Christoph Hellwig authored
This code has been stale for years and I have no way to test it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxJakub Kicinski authored
Leon Romanovsky says: ==================== mlx5-next changes Following previous conversations [1] and our clear commitment to do the TC work [2], please pull mlx5-next shared branch, which includes low-level steering logic to allow RoCEv2 traffic to be encrypted/ decrypted through IPsec. [1] https://lore.kernel.org/all/20230126230815.224239-1-saeed@kernel.org/ [2] https://lore.kernel.org/all/Y+Z7lVVWqnRBiPh2@nvidia.com/ * 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Configure IPsec steering for egress RoCEv2 traffic net/mlx5: Configure IPsec steering for ingress RoCEv2 traffic net/mlx5: Add IPSec priorities in RDMA namespaces net/mlx5: Implement new destination type TABLE_TYPE net/mlx5: Introduce new destination type TABLE_TYPE ==================== Link: https://lore.kernel.org/r/20230215095624.1365200-1-leon@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Merge tag 'wireless-next-2023-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== Major stack changes: * EHT channel puncturing support (client & AP) * some support for AP MLD without mac80211 * fixes for A-MSDU on mesh connections Major driver changes: iwlwifi * EHT rate reporting * Bump FW API to 74 for AX devices * STEP equalizer support: transfer some STEP (connection to radio on platforms with integrated wifi) related parameters from the BIOS to the firmware mt76 * switch to using page pool allocator * mt7996 EHT (Wi-Fi 7) support * Wireless Ethernet Dispatch (WED) reset support libertas * WPS enrollee support brcmfmac * Rename Cypress 89459 to BCM4355 * BCM4355 and BCM4377 support mwifiex * SD8978 chipset support rtl8xxxu * LED support ath12k * new driver for Qualcomm Wi-Fi 7 devices ath11k * IPQ5018 support * Fine Timing Measurement (FTM) responder role support * channel 177 support ath10k * store WLAN firmware version in SMEM image table * tag 'wireless-next-2023-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (207 commits) wifi: brcmfmac: p2p: Introduce generic flexible array frame member wifi: mac80211: add documentation for amsdu_mesh_control wifi: cfg80211: remove gfp parameter from cfg80211_obss_color_collision_notify description wifi: mac80211: always initialize link_sta with sta wifi: mac80211: pass 'sta' to ieee80211_rx_data_set_sta() wifi: cfg80211: Set SSID if it is not already set wifi: rtw89: move H2C of del_pkt_offload before polling FW status ready wifi: rtw89: use readable return 0 in rtw89_mac_cfg_ppdu_status() wifi: rtw88: usb: drop now unnecessary URB size check wifi: rtw88: usb: send Zero length packets if necessary wifi: rtw88: usb: Set qsel correctly wifi: mac80211: fix off-by-one link setting wifi: mac80211: Fix for Rx fragmented action frames wifi: mac80211: avoid u32_encode_bits() warning wifi: mac80211: Don't translate MLD addresses for multicast wifi: cfg80211: call reg_notifier for self managed wiphy from driver hint wifi: cfg80211: get rid of gfp in cfg80211_bss_color_notify wifi: nl80211: Allow authentication frames and set keys on NAN interface wifi: mac80211: fix non-MLO station association wifi: mac80211: Allow NSS change only up to capability ... ==================== Link: https://lore.kernel.org/r/20230216105406.208416-1-johannes@sipsolutions.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Bartosz Golaszewski authored
Merge tag 'intel-gpio-v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current intel-gpio for v6.2-2 * Ignore spurious wakeup by touchpad on Clevo NH5xAx * Miscellaneous fix(es)
-
Paolo Abeni authored
Andrea Mayer says: ==================== seg6: add PSP flavor support for SRv6 End behavior Segment Routing for IPv6 (SRv6 in short) [1] is the instantiation of the Segment Routing (SR) [2] architecture on the IPv6 dataplane. In SRv6, the segment identifiers (SID) are IPv6 addresses and the segment list (SID List) is carried in the Segment Routing Header (SRH). A segment may be bound to a specific packet processing operation called "behavior". The RFC8986 [3] defines and standardizes the most common/relevant behaviors for network operators, e.g., End, End.X and End.T and so on. The RFC8986 also introduces the "flavors" framework aiming to modify or extend the capabilities of SRv6 End, End.X and End.T behaviors. Specifically, these behaviors support the following flavors (either individually or in combinations): - Penultimate Segment Pop (PSP); - Ultimate Segment Pop (USP); - Ultimate Segment Decapsulation (USD). Such flavors enable an End/End.X/End.T behavior to pop the SRH on the penultimate/ultimate SR endpoint node listed in the SID List or to perform a full decapsulation. Currently, the Linux kernel supports a large subset of behaviors described in RFC8986, including the End, End.X and End.T. However, PSP, USP and USD flavors have not yet been implemented. In this patchset, we extend the SRv6 subsystem to implement the PSP flavor in the SRv6 End behavior. To accomplish this task, we leverage the flavor framework previously introduced by another patchset required for supporting the efficient representation of the SID List through the NEXT-C-SID mechanism [4]. In details, the patchset is made of: - patch 1/3: seg6: factor out End lookup nexthop processing to a dedicated function - patch 2/3: seg6: add PSP flavor support for SRv6 End behavior - patch 3/3: selftests: seg6: add selftest for PSP flavor in SRv6 End behavior From the user space perspective, we do not need to change the iproute2 code to support the PSP flavor. However, we provide the man page for the PSP flavor in a separate patch. Comments, improvements and suggestions are always appreciated. [1] - RFC8754: https://datatracker.ietf.org/doc/html/rfc8754 [2] - RFC8402: https://datatracker.ietf.org/doc/html/rfc8402 [3] - RFC8986: https://datatracker.ietf.org/doc/html/rfc8986 [4] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression ==================== Link: https://lore.kernel.org/r/20230215134659.7613-1-andrea.mayer@uniroma2.itSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Andrea Mayer authored
This selftest is designed for testing the PSP flavor in SRv6 End behavior. It instantiates a virtual network composed of several nodes: hosts and SRv6 routers. Each node is realized using a network namespace that is properly interconnected to others through veth pairs. The test makes use of the SRv6 End behavior and of the PSP flavor needed for removing the SRH from the IPv6 header at the penultimate node. The correct execution of the behavior is verified through reachability tests carried out between hosts. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Andrea Mayer authored
The "flavors" framework defined in RFC8986 [1] represents additional operations that can modify or extend a subset of existing behaviors such as SRv6 End, End.X and End.T. We report these flavors hereafter: - Penultimate Segment Pop (PSP); - Ultimate Segment Pop (USP); - Ultimate Segment Decapsulation (USD). Depending on how the Segment Routing Header (SRH) has to be handled, an SRv6 End* behavior can support these flavors either individually or in combinations. In this patch, we only consider the PSP flavor for the SRv6 End behavior. A PSP enabled SRv6 End behavior is used by the Source/Ingress SR node (i.e., the one applying the SRv6 Policy) when it needs to instruct the penultimate SR Endpoint node listed in the SID List (carried by the SRH) to remove the SRH from the IPv6 header. Specifically, a PSP enabled SRv6 End behavior processes the SRH by: i) decreasing the Segment Left (SL) from 1 to 0; ii) copying the Last Segment IDentifier (SID) into the IPv6 Destination Address (DA); iii) removing (i.e., popping) the outer SRH from the extension headers following the IPv6 header. It is important to note that PSP operation (steps i, ii, iii) takes place only at a penultimate SR Segment Endpoint node (i.e., when the SL=1) and does not happen at non-penultimate Endpoint nodes. Indeed, when a SID of PSP flavor is processed at a non-penultimate SR Segment Endpoint node, the PSP operation is not performed because it would not be possible to decrease the SL from 1 to 0. SL=2 SL=1 SL=0 | | | For example, given the SRv6 policy (SID List := < X, Y, Z >): - a PSP enabled SRv6 End behavior bound to SID "Y" will apply the PSP operation as Segment Left (SL) is 1, corresponding to the Penultimate Segment of the SID List; - a PSP enabled SRv6 End behavior bound to SID "X" will *NOT* apply the PSP operation as the Segment Left is 2. This behavior instance will apply the "standard" End packet processing, ignoring the configured PSP flavor at all. [1] - RFC8986: https://datatracker.ietf.org/doc/html/rfc8986Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Andrea Mayer authored
The End nexthop lookup/input operations are moved into a new helper function named input_action_end_finish(). This avoids duplicating the code needed to compute the nexthop in the different flavors of the End behavior. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Lukas Bulwahn authored
Commit 3d7316ac ("net: dsa: ocelot: add external ocelot switch control") adds config NET_DSA_MSCC_OCELOT_EXT, which selects the non-existing config MFD_OCELOT_CORE. Replace this select with the intended and existing MFD_OCELOT. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Colin Foster <colin.foster@in-advantage.com> Link: https://lore.kernel.org/r/20230215104631.31568-1-lukas.bulwahn@gmail.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Alejandro Lucero says: ==================== sfc: devlink support for ef100 This patchset adds devlink port support for ef100 allowing setting VFs mac addresses through the VF representor devlink ports. Basic devlink infrastructure is first introduced, then support for info command. Next changes for enumerating MAE ports which will be used for devlink port creation when netdevs are registered. Adding support for devlink port_function_hw_addr_get requires changes in the ef100 driver for getting the mac address based on a client handle. This allows to obtain VFs mac addresses during netdev initialization as well what is included in patch 6. Such client handle is used in patches 7 and 8 for getting and setting devlink port addresses. ==================== Link: https://lore.kernel.org/r/20230215090828.11697-1-alejandro.lucero-palau@amd.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alejandro Lucero authored
Using the builtin client handle id infrastructure, add support for setting the mac address linked to mports in ef100. This implies to execute an MCDI command for giving the address to the firmware for the specific devlink port. Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alejandro Lucero authored
Using the builtin client handle id infrastructure, add support for obtaining the mac address linked to mports in ef100. This implies to execute an MCDI command for getting the data from the firmware for each devlink port. Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alejandro Lucero authored
Getting device mac address is currently based on a specific MCDI command only available for the PF. This patch changes the MCDI command to a generic one for PFs and VFs based on a client handle. This allows both PFs and VFs to ask for their mac address during initialization using the CLIENT_HANDLE_SELF. Moreover, the patch allows other client handles which will be used by the PF to ask for mac addresses linked to VFs. This is necessary for suporting the port_function_hw_addr_get devlink function in further patches. Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Alejandro Lucero authored
Using the data when enumerating mports, create devlink ports just before netdevs are registered and remove those devlink ports after netdev has been unregistered. Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-