1. 21 Apr, 2023 40 commits
    • pds_core: health timer and workqueue · c2dbb090
      Shannon Nelson authored
      Add in the periodic health check and the related workqueue,
      as well as the handlers for when a FW reset is seen.
      
      The firmware is polled every 5 seconds to be sure that it is
      still alive and that the FW generation didn't change.
      
      The alive check looks to see that the PCI bus is still readable
      and the fw_status still has the RUNNING bit on.  If not alive,
      the driver stops activity and tears things down.  When the FW
      recovers and the alive check again succeeds, the driver sets
      back up for activity.
      
      The generation check looks at the fw_generation to see if it
      has changed, which can happen if the FW crashed and recovered
      or was updated in between health checks.  If changed, the
      driver counts that as though the alive test failed and forces
      the fw_down/fw_up cycle.
      Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
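      The alive and generation checks described in this commit can be
      sketched as a small state machine. This is a minimal illustration,
      not the driver's code: the class name, the event strings, and the
      value of FW_STATUS_RUNNING are all hypothetical.

```python
from dataclasses import dataclass, field

FW_STATUS_RUNNING = 0x1  # hypothetical bit value, for illustration only


@dataclass
class HealthMonitor:
    """Tracks firmware liveness across periodic polls (hypothetical model)."""
    fw_generation: int
    fw_up: bool = True
    events: list = field(default_factory=list)

    def poll(self, bus_readable: bool, fw_status: int, fw_generation: int) -> None:
        # Alive means the bus is readable and the RUNNING bit is still set.
        alive = bus_readable and bool(fw_status & FW_STATUS_RUNNING)
        # A changed generation counts as though the alive test failed.
        generation_changed = alive and fw_generation != self.fw_generation
        if self.fw_up and (not alive or generation_changed):
            self.events.append("fw_down")
            self.fw_up = False
        # Once the alive check succeeds again, set back up for activity.
        if not self.fw_up and alive:
            self.fw_generation = fw_generation
            self.events.append("fw_up")
            self.fw_up = True
```

      Calling poll() once per timer tick (every 5 seconds in the commit
      message) yields a fw_down/fw_up pair both when the firmware
      disappears and later returns, and when only the generation changes.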
    • pds_core: add devcmd device interfaces · 523847df
      Shannon Nelson authored
      The devcmd interface is the basic connection to the device through the
      PCI BAR for low level identification and command services.  This does
      the early device initialization and finds the identity data, and adds
      devcmd routines to be used by later driver bits.
      Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pds_core: initial framework for pds_core PF driver · 55435ea7
      Shannon Nelson authored
      This is the initial PCI driver framework for the new pds_core device
      driver and its family of devices.  This does the very basics of
      registering for the new PF PCI device 1dd8:100c, setting up debugfs
      entries, and registering with devlink.
      Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bridge-neigh-suppression' · 25c800b2
      David S. Miller authored
      Ido Schimmel says:
      
      ====================
      bridge: Add per-{Port, VLAN} neighbor suppression
      
      Background
      ==========
      
      In order to minimize the flooding of ARP and ND messages in the VXLAN
      network, EVPN includes provisions [1] that allow participating VTEPs to
      suppress such messages in case they know the MAC-IP binding and can
      reply on behalf of the remote host. In Linux, the above is implemented
      in the bridge driver using a per-port option called "neigh_suppress"
      that was added in kernel version 4.15 [2].
      
      Motivation
      ==========
      
      Some applications use ARP messages as keepalives between the application
      nodes in the network. This works perfectly well when two nodes are
      connected to the same VTEP. When a node goes down it will stop
      responding to ARP requests and the other node will notice it
      immediately.
      
      However, when the two nodes are connected to different VTEPs and
      neighbor suppression is enabled, the local VTEP will reply to ARP
      requests even after the remote node went down, until certain timers
      expire and the EVPN control plane decides to withdraw the MAC/IP
      Advertisement route for the address. Therefore, some users would like to
      be able to disable neighbor suppression on VLANs where such applications
      reside and keep it enabled on the rest.
      
      Implementation
      ==============
      
      The proposed solution is to allow user space to control neighbor
      suppression on a per-{Port, VLAN} basis, in a similar fashion to other
      per-port options that gained per-{Port, VLAN} counterparts such as
      "mcast_router". This allows users to benefit from the operational
      simplicity and scalability associated with shared VXLAN devices (i.e.,
      external / collect-metadata mode), while still allowing for per-VLAN/VNI
      neighbor suppression control.
      
      The user interface is extended with a new "neigh_vlan_suppress" bridge
      port option that allows user space to enable per-{Port, VLAN} neighbor
      suppression on the bridge port. When enabled, the existing
      "neigh_suppress" option has no effect and neighbor suppression is
      controlled using a new "neigh_suppress" VLAN option. Example usage:
      
       # bridge link set dev vxlan0 neigh_vlan_suppress on
       # bridge vlan add vid 10 dev vxlan0
       # bridge vlan set vid 10 dev vxlan0 neigh_suppress on
      
      Testing
      =======
      
      Tested using existing bridge selftests. Added a dedicated selftest in
      the last patch.
      
      Patchset overview
      =================
      
      Patches #1-#5 are preparations.
      
      Patch #6 adds per-{Port, VLAN} neighbor suppression support to the
      bridge's data path.
      
      Patches #7-#8 add the required netlink attributes to enable the feature.
      
      Patch #9 adds a selftest.
      
      iproute2 patches can be found here [3].
      
      Changelog
      =========
      
      Since RFC [4]:
      
      No changes.
      
      [1] https://www.rfc-editor.org/rfc/rfc7432#section-10
      [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a42317785c898c0ed46db45a33b0cc71b671bf29
      [3] https://github.com/idosch/iproute2/tree/submit/neigh_suppress_v1
      [4] https://lore.kernel.org/netdev/20230413095830.2182382-1-idosch@nvidia.com/
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: net: Add bridge neighbor suppression test · 7648ac72
      Ido Schimmel authored
      Add test cases for bridge neighbor suppression, testing both per-port
      and per-{Port, VLAN} neighbor suppression with both ARP and NS packets.
      
      Example truncated output:
      
       # ./test_bridge_neigh_suppress.sh
       [...]
       Tests passed: 148
       Tests failed:   0
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Allow setting per-{Port, VLAN} neighbor suppression state · 160656d7
      Ido Schimmel authored
      Add a new bridge port attribute that allows user space to enable
      per-{Port, VLAN} neighbor suppression. Example:
      
       # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
       false
       # bridge link set dev swp1 neigh_vlan_suppress on
       # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
       true
       # bridge link set dev swp1 neigh_vlan_suppress off
       # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
       false
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: vlan: Allow setting VLAN neighbor suppression state · 83f6d600
      Ido Schimmel authored
      Add a new VLAN attribute that allows user space to set the neighbor
      suppression state of the port VLAN. Example:
      
       # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
       false
       # bridge vlan set vid 10 dev swp1 neigh_suppress on
       # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
       true
       # bridge vlan set vid 10 dev swp1 neigh_suppress off
       # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
       false
      
       # bridge vlan set vid 10 dev br0 neigh_suppress on
       Error: bridge: Can't set neigh_suppress for non-port vlans.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Add per-{Port, VLAN} neighbor suppression data path support · 412614b1
      Ido Schimmel authored
      When the bridge is not VLAN-aware (i.e., VLAN ID is 0), determine if
      neighbor suppression is enabled on a given bridge port solely based on
      the existing 'BR_NEIGH_SUPPRESS' flag.
      
      Otherwise, if the bridge is VLAN-aware, first check if per-{Port, VLAN}
      neighbor suppression is enabled on the given bridge port using the
      'BR_NEIGH_VLAN_SUPPRESS' flag. If so, look up the VLAN and check whether
      it has neighbor suppression enabled based on the per-VLAN
      'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED' flag.
      
      If the bridge is VLAN-aware, but the bridge port does not have
      per-{Port, VLAN} neighbor suppression enabled, then fallback to
      determine neighbor suppression based on the 'BR_NEIGH_SUPPRESS' flag.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
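      The three cases in this commit message can be read as a single
      decision function. The sketch below models the flags as plain
      Python fields; the Port class and function name are hypothetical,
      and treating a VLAN that is not found as "not suppressed" is an
      assumption the commit message does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    neigh_suppress: bool = False       # models 'BR_NEIGH_SUPPRESS'
    neigh_vlan_suppress: bool = False  # models 'BR_NEIGH_VLAN_SUPPRESS'
    # vid -> models the per-VLAN 'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED' flag
    vlan_suppress: dict = field(default_factory=dict)


def neigh_suppress_enabled(port: Port, vid: int) -> bool:
    # Bridge is not VLAN-aware (VLAN ID 0): only the per-port flag counts.
    if vid == 0:
        return port.neigh_suppress
    # VLAN-aware with per-{Port, VLAN} suppression: consult the VLAN's flag.
    if port.neigh_vlan_suppress:
        # Assumption: an unknown VLAN is treated as not suppressed.
        return port.vlan_suppress.get(vid, False)
    # VLAN-aware without per-{Port, VLAN} suppression: fall back to the port flag.
    return port.neigh_suppress
```

      Note that with neigh_vlan_suppress set, the per-port flag is
      ignored for tagged traffic, which matches the statement that
      "neigh_suppress" has no effect once "neigh_vlan_suppress" is on.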
    • bridge: Encapsulate data path neighbor suppression logic · 3aca683e
      Ido Schimmel authored
      Currently, there are various places in the bridge data path that check
      whether neighbor suppression is enabled on a given bridge port.
      
      As a preparation for per-{Port, VLAN} neighbor suppression, encapsulate
      this logic in a function and pass the VLAN ID of the packet as an
      argument.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Take per-{Port, VLAN} neighbor suppression into account · 6be42ed0
      Ido Schimmel authored
      The bridge driver gates the neighbor suppression code behind an internal
      per-bridge flag called 'BROPT_NEIGH_SUPPRESS_ENABLED'. The flag is set
      when at least one bridge port has neighbor suppression enabled.
      
      As a preparation for per-{Port, VLAN} neighbor suppression, make sure
      the global flag is also set if per-{Port, VLAN} neighbor suppression is
      enabled. That is, when the 'BR_NEIGH_VLAN_SUPPRESS' flag is set on at
      least one bridge port.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
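      As a rough illustration of the rule in this commit, the per-bridge
      gate is on whenever any port has either per-port flag set. The
      dictionary keys and the function name below are hypothetical
      stand-ins for the kernel flags.

```python
def bridge_neigh_suppress_enabled(ports) -> bool:
    """Model of the per-bridge 'BROPT_NEIGH_SUPPRESS_ENABLED' gate.

    Each port is a dict with two booleans that stand in for the
    'BR_NEIGH_SUPPRESS' and 'BR_NEIGH_VLAN_SUPPRESS' port flags.
    """
    return any(
        port.get("neigh_suppress", False) or port.get("neigh_vlan_suppress", False)
        for port in ports
    )
```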
    • bridge: Add internal flags for per-{Port, VLAN} neighbor suppression · a714e3ec
      Ido Schimmel authored
      Add two internal flags that will be used to enable / disable per-{Port,
      VLAN} neighbor suppression:
      
      1. 'BR_NEIGH_VLAN_SUPPRESS': A per-port flag used to indicate that
      per-{Port, VLAN} neighbor suppression is enabled on the bridge port.
      When set, 'BR_NEIGH_SUPPRESS' has no effect.
      
      2. 'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED': A per-VLAN flag used to indicate
      that neighbor suppression is enabled on the given VLAN.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Pass VLAN ID to br_flood() · e408336a
      Ido Schimmel authored
      Subsequent patches are going to add per-{Port, VLAN} neighbor
      suppression, which will require br_flood() to potentially suppress ARP /
      NS packets on a per-{Port, VLAN} basis.
      
      As a preparation, pass the VLAN ID of the packet as another argument to
      br_flood().
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Reorder neighbor suppression check when flooding · 013a7ce8
      Ido Schimmel authored
      The bridge does not flood ARP / NS packets for which a reply was sent to
      bridge ports that have neighbor suppression enabled.
      
      Subsequent patches are going to add per-{Port, VLAN} neighbor
      suppression, which is going to make it more expensive to check whether
      neighbor suppression is enabled since a VLAN lookup will be required.
      
      Therefore, instead of unnecessarily performing this lookup for every
      packet, only perform it for ARP / NS packets for which a reply was sent.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
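      The reordering amounts to evaluating the cheap per-packet
      conditions before the potentially expensive suppression check. A
      minimal sketch, with a hypothetical function name and the
      suppression check passed in as a callable:

```python
def skip_flood_to_port(is_arp_or_ns: bool, reply_sent: bool, suppress_enabled) -> bool:
    """Return True if the packet should not be flooded to this port.

    `suppress_enabled` is a zero-argument callable standing in for the
    check that may need a VLAN lookup. Because `and` short-circuits, it
    only runs for ARP / NS packets for which a reply was already sent.
    """
    return is_arp_or_ns and reply_sent and suppress_enabled()
```

      For ordinary traffic the first operand is already false, so the
      callable is never invoked.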
    • Merge branch 'macsec-vlan' · 1cf3fe1c
      David S. Miller authored
      Emeel Hakim says:
      
      ====================
      Support MACsec VLAN
      
      This patch series introduces support for hardware (HW) offload MACsec
      devices with VLAN configuration. The patches address both scenarios:
      the VLAN header as the inner header and as the outer header for MACsec.
      
      The changes include:
      
      1. Adding MACsec offload operation for VLAN.
      2. Considering VLAN when accessing MACsec net device.
      3. Currently offloading MACsec when it's configured over VLAN with
      current MACsec TX steering rules would wrongly insert the MACsec sec tag
      after inserting the VLAN header. This resulted in an ETHERNET | SECTAG |
      VLAN packet when ETHERNET | VLAN | SECTAG is configured. The patch
      handles this issue when configuring steering rules.
      4. Adding MACsec rx_handler change support in case of a marked skb and a
      mismatch on the dst MAC address.
      
      Please review these changes and let me know if you have any feedback or
      concerns.
      
      Updates since v1:
      - Consult vlan_features when adding NETIF_F_HW_MACSEC.
      - Allow grep for the functions.
      - Add helper function to get the macsec operation to allow the compiler
        to make some choice.
      
      Updates since v2:
      - Don't use macros to allow direct navigation from mdo functions to their
        implementation.
      - Make the vlan_get_macsec_ops argument a const.
      - Check if the specific mdo function is available before calling it.
      - Enable NETIF_F_HW_MACSEC by default when the lower device has it enabled
        and in case the lower device currently has NETIF_F_HW_MACSEC but disabled
        let the new vlan device also have it disabled.
      
      Updates since v3:
      - Split patch ("vlan: Add MACsec offload operations for VLAN interface")
        to prevent mixing generic vlan code changes with driver changes.
      - Add mdo_open, stop and stats to support drivers which have those.
      - Don't fail if macsec offload operations are available but a specific
        function is not, to support drivers which do not implement all
        macsec offload operations.
      - Don't call find_rx_sc twice in the same loop, instead save the result
        in a parameter and re-use it.
      - Completely remove _BUILD_VLAN_MACSEC_MDO macro, to prevent returning
        from a macro.
      - Reorder the functions inside struct macsec_ops to match the struct
        declaration.
      
      Updates since v4:
      - Change subject line of ("macsec: Add MACsec rx_handler change support") and adapt commit message.
      - Don't separate the new check in patch ("macsec: Add MACsec rx_handler change support")
        from the previous if/else if.
      - Drop "_found" from the parameter naming "rx_sc_found" and move the definition to
        the relevant block.
      - Remove "{}" since not needed around a single line.
      
      Updates since v5:
      - Consider promiscuous mode case.
      
      Updates since v6:
      - Use IS_ENABLED instead of checking for ifdef.
      - Don't add inline keyword in c files, let the compiler make its own decisions.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macsec: Don't rely solely on the dst MAC address to identify destination MACsec device · 7661351a
      Emeel Hakim authored
      Offloading device drivers mark offloaded MACsec SKBs with the
      corresponding SCI in the skb_metadata_dst so that the macsec rx handler
      knows to which interface to divert those skbs. In case of a marked skb
      and a mismatch on the dst MAC address, divert the skb to the macsec
      net_device where the macsec rx_handler will be called. This covers cases
      where relying solely on the dst MAC address is insufficient.
      
      One such instance is when using MACsec with a VLAN as an inner
      header, where the packet structure is ETHERNET | SECTAG | VLAN.
      In such a scenario, the dst MAC address in the ethernet header
      will correspond to the VLAN MAC address, resulting in a mismatch.
      Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
      Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Consider VLAN interface in MACsec TX steering rules · 765f974c
      Emeel Hakim authored
      Offloading MACsec when it's configured over VLAN with the current MACsec
      TX steering rules wrongly inserts the MACsec sec tag after inserting
      the VLAN header, leading to an ETHERNET | SECTAG | VLAN packet when
      ETHERNET | VLAN | SECTAG is configured.
      
      The above issue is due to adding the SECTAG by HW which is a later
      stage compared to the VLAN header insertion stage.
      
      Detect such a case and adjust TX steering rules to insert the
      SECTAG in the correct place by using reformat_param_0 field in
      the packet reformat to indicate the offset of SECTAG from end of
      the MAC header to account for VLANs in granularity of 4 bytes.
      Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
      Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
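      To make the offset arithmetic concrete: an 802.1Q VLAN tag is 4
      bytes, so each VLAN tag ahead of the SECTAG moves the insertion
      point by 4 bytes past the two MAC addresses. The sketch below
      splices a dummy SECTAG into a byte string. The function names are
      hypothetical, and how the device encodes the value in
      reformat_param_0 is not shown here.

```python
MAC_ADDRS_LEN = 12  # destination MAC (6 bytes) + source MAC (6 bytes)
VLAN_TAG_LEN = 4    # 802.1Q tag: 2-byte TPID + 2-byte TCI


def sectag_offset(num_vlan_tags: int) -> int:
    """Byte offset of the SECTAG from the end of the MAC addresses."""
    return num_vlan_tags * VLAN_TAG_LEN


def insert_sectag(frame: bytes, sectag: bytes, num_vlan_tags: int) -> bytes:
    """Insert `sectag` after the MAC addresses and any VLAN tags."""
    offset = MAC_ADDRS_LEN + sectag_offset(num_vlan_tags)
    return frame[:offset] + sectag + frame[offset:]
```

      With one VLAN tag the result is laid out as
      ETHERNET | VLAN | SECTAG; with num_vlan_tags=0 the SECTAG lands
      directly after the source MAC address.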
    • net/mlx5: Support MACsec over VLAN · 4bba492b
      Emeel Hakim authored
      MACsec device may have a VLAN device on top of it.
      Detect MACsec state correctly under this condition,
      and return the correct net device accordingly.
      Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
      Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Enable MACsec offload feature for VLAN interface · 339ccec8
      Emeel Hakim authored
      Enable MACsec offload feature over VLAN by adding NETIF_F_HW_MACSEC
      to the device vlan_features.
      Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
      Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vlan: Add MACsec offload operations for VLAN interface · abff3e5e
      Emeel Hakim authored
      Add support for MACsec offload operations for VLAN driver
      to allow offloading MACsec when VLAN's real device supports
      MACsec offload by forwarding the offload request to it.
      Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
      Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
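      The forwarding idea can be sketched as a thin delegation layer. In
      this illustration the real device's offload operations are a
      dictionary of callables, and every name is hypothetical. Returning
      success when one specific operation is missing follows the series
      changelog ("Don't fail if macsec offload operations are available
      but a specific function is not"). Returning -EOPNOTSUPP when the
      device has no offload operations at all is an assumption.

```python
import errno


class RealDevice:
    """Hypothetical stand-in for the VLAN's underlying (real) device."""

    def __init__(self, macsec_ops=None):
        self.macsec_ops = macsec_ops  # dict of op name -> callable, or None


def vlan_macsec_forward(real_dev: RealDevice, op_name: str, ctx) -> int:
    ops = real_dev.macsec_ops
    if ops is None:
        # Assumption: no offload support on the real device is an error.
        return -errno.EOPNOTSUPP
    handler = ops.get(op_name)
    if handler is None:
        # A driver need not implement every operation; do not fail here.
        return 0
    # Forward the offload request to the real device's implementation.
    return handler(ctx)
```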
    • Merge branch 'sctp-nested-flex-arrays' · e2598dbd
      David S. Miller authored
      Xin Long says:
      
      ====================
      sctp: fix a plenty of flexible-array-nested warnings
      
      Paolo noticed a compile warning in SCTP,
      
      ../net/sctp/stream_sched_fc.c: note: in included file (through ../include/net/sctp/sctp.h):
      ../include/net/sctp/structs.h:335:41: warning: array of flexible structures
      
      But not only this, there are actually quite a lot of such warnings in
      some SCTP structs. This patchset fixes most of the warnings by deleting
      these nested flexible array members.
      
      After this patchset, there are still some warnings left:
      
        # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
        ./include/net/sctp/structs.h:1145:41: warning: nested flexible array
        ./include/uapi/linux/sctp.h:641:34: warning: nested flexible array
        ./include/uapi/linux/sctp.h:643:34: warning: nested flexible array
        ./include/uapi/linux/sctp.h:644:33: warning: nested flexible array
        ./include/uapi/linux/sctp.h:650:40: warning: nested flexible array
        ./include/uapi/linux/sctp.h:653:39: warning: nested flexible array
      
      the 1st is caused by __data[] in struct ip_options, not in SCTP;
      the others are in uapi, and we should not touch them.
      
      Note that instead of completely deleting it, we just leave it as a
      comment in the struct, signalling to the reader that we do expect
      such variable parameters over there, as Marcelo suggested.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: delete the nested flexible array payload · dbda0fba
      Xin Long authored
      This patch deletes the flexible-array payload[] from the structure
      sctp_datahdr to avoid some sparse warnings:
      
        # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
        net/sctp/socket.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
        ./include/linux/sctp.h:230:29: warning: nested flexible array
      
      This member is not even used anywhere.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: delete the nested flexible array hmac · 2ab399a9
      Xin Long authored
      This patch deletes the flexible-array hmac[] from the structure
      sctp_authhdr to avoid some sparse warnings:
      
        # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
        net/sctp/auth.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
        ./include/linux/sctp.h:735:29: warning: nested flexible array
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: delete the nested flexible array peer_init · f97278ff
      Xin Long authored
      This patch deletes the flexible-array peer_init[] from the structure
      sctp_cookie to avoid some sparse warnings:
      
        # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
        net/sctp/sm_make_chunk.c: note: in included file (through include/net/sctp/sctp.h):
        ./include/net/sctp/structs.h:1588:28: warning: nested flexible array
        ./include/net/sctp/structs.h:343:28: warning: nested flexible array
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: delete the nested flexible array variable · 9789c1c6
      Xin Long authored
      This patch deletes the flexible-array variable[] from the structure
      sctp_sackhdr and sctp_errhdr to avoid some sparse warnings:
      
        # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
        net/sctp/sm_statefuns.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
        ./include/linux/sctp.h:451:28: warning: nested flexible array
        ./include/linux/sctp.h:393:29: warning: nested flexible array
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: delete the nested flexible array skip · 73175a04
      Xin Long authored
      This patch deletes the flexible-array skip[] from the structure
      sctp_ifwdtsn/fwdtsn_hdr to avoid some sparse warnings:
      
        # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
        net/sctp/stream_interleave.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
        ./include/linux/sctp.h:611:32: warning: nested flexible array
        ./include/linux/sctp.h:628:33: warning: nested flexible array
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: delete the nested flexible array params · add7370a
      Xin Long authored
      This patch deletes the flexible-array params[] from the structure
      sctp_inithdr, sctp_addiphdr and sctp_reconf_chunk to avoid some
      sparse warnings:
      
        # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
        net/sctp/input.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
        ./include/linux/sctp.h:278:29: warning: nested flexible array
        ./include/linux/sctp.h:675:30: warning: nested flexible array
      
      This warning is reported if a structure having a flexible array
      member is included by other structures.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-extend-drop-reasons' · 2f3a247c
      Jakub Kicinski authored
      Johannes Berg says:
      
      ====================
      net: extend drop reasons
      
      Here's v4 of the extended drop reasons, with fixes to kernel-doc
      and checkpatch.
      ====================
      
      Link: https://lore.kernel.org/r/20230419125254.20789-1-johannes@sipsolutions.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mac80211: use the new drop reasons infrastructure · baa951a1
      Johannes Berg authored
      It can be really hard to analyse or debug why packets are
      going missing in mac80211, so add the needed infrastructure
      to use the new per-subsystem drop reasons.
      
      We actually use two drop reason subsystems here because of
      the different handling of frames that are dropped but still
      go to monitor for old versions of hostapd, and those that
      are just completely unusable (e.g. crypto failed.)
      
      Annotate a few reasons here just to illustrate this, we'll
      need to go through and annotate more of them later.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: extend drop reasons for multiple subsystems · 071c0fc6
      Johannes Berg authored
      Extend drop reasons to make them usable by subsystems
      other than core by reserving the high 16 bits for a
      new subsystem ID, of which 0 of course is used for the
      existing reasons immediately.
      
      To still be able to have string reasons, restructure
      that code a bit to do the lookup under RCU; the only
      user of this (right now) is drop_monitor.
      
      Link: https://lore.kernel.org/netdev/00659771ed54353f92027702c5bbb84702da62ce.camel@sipsolutions.net
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
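      The encoding described in this commit, with the high 16 bits
      carrying a subsystem ID and subsystem 0 keeping the existing core
      reasons unchanged, can be illustrated as follows. The helper names
      are hypothetical and a 32-bit reason value is assumed.

```python
SUBSYS_SHIFT = 16
REASON_MASK = 0xFFFF
CORE_SUBSYS = 0  # subsystem 0 holds the pre-existing (core) reasons


def encode_reason(subsys: int, reason: int) -> int:
    """Pack a subsystem ID into the high 16 bits of a drop reason."""
    if not 0 <= subsys <= 0xFFFF or not 0 <= reason <= REASON_MASK:
        raise ValueError("subsystem and reason must each fit in 16 bits")
    return (subsys << SUBSYS_SHIFT) | reason


def decode_reason(value: int):
    """Split a packed value into (subsystem ID, per-subsystem reason)."""
    return value >> SUBSYS_SHIFT, value & REASON_MASK
```

      Because the core subsystem ID is 0, encode_reason(CORE_SUBSYS, r)
      equals r, so existing reason values keep their numeric meaning.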
    • net: move dropreason.h to dropreason-core.h · 5b8285cc
      Johannes Berg authored
      This will, after the next patch, hold only the core
      drop reasons and minimal infrastructure. Fix a small
      kernel-doc issue while at it, to avoid the move
      triggering a checker.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: add icmpv6_error_anycast_as_unicast for ICMPv6 · 7ab75456
      Mahesh Bandewar authored
      ICMPv6 error packets are not sent to the anycast destinations and this
      prevents things like traceroute from working. So create a setting similar
      to ECHO when dealing with Anycast sources (icmpv6_echo_ignore_anycast).
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Maciej Żenczykowski <maze@google.com>
      Link: https://lore.kernel.org/r/20230419013238.2691167-1-maheshb@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'ethtool-mm-api-consolidation' · b7b871f5
      Jakub Kicinski authored
      Vladimir Oltean says:
      
      ====================
      ethtool mm API consolidation
      
      This series consolidates the behavior of the 2 drivers that implement
      the ethtool MAC Merge layer by making NXP ENETC commit its preemptible
      traffic classes to hardware only when MM TX is active (same as Ocelot).
      
      Then, after resolving an issue with the ENETC driver, it restricts user
      space from entering 2 states which don't make sense:
      
      - pmac-enabled off tx-enabled on  verify-enabled *
      - pmac-enabled *   tx-enabled off verify-enabled on
      
      Then, it introduces a selftest (ethtool_mm.sh) which puts everything
      together and tests all valid configurations known to me.
      
      This is simultaneously the v2 of "[PATCH net-next 0/2] ethtool mm API
      improvements":
      https://lore.kernel.org/netdev/20230415173454.3970647-1-vladimir.oltean@nxp.com/
      which had caused some problems to openlldp. Those were solved in the
      meantime, see:
      https://github.com/intel/openlldp/commit/11171b474f6f3cbccac5d608b7f26b32ff72c651
      
      and of "[RFC PATCH net-next] selftests: forwarding: add a test for MAC
      Merge layer":
      https://lore.kernel.org/netdev/20230210221243.228932-1-vladimir.oltean@nxp.com/
      ====================
      
      Link: https://lore.kernel.org/r/20230418111459.811553-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Vladimir Oltean's avatar
      selftests: forwarding: add a test for MAC Merge layer · e6991384
      Vladimir Oltean authored
      The MAC Merge layer (IEEE 802.3-2018 clause 99) does all the heavy
      lifting for Frame Preemption (IEEE 802.1Q-2018 clause 6.7.2), a TSN
      feature for minimizing latency.
      
      Preemptible traffic is different on the wire from normal traffic in
      incompatible ways. If we send a preemptible packet and the link partner
      doesn't support preemption, it will drop it as an error frame and we
      will never know. The MAC Merge layer has a control plane of its own,
      which can be manipulated (using ethtool) in order to negotiate this
      capability with the link partner (through LLDP).
      
      Actually the TLV format for LLDP solves this problem only partly,
      because both partners only advertise:
      - if they support preemption (RX and TX)
      - if they have enabled preemption (TX)
      so we cannot tell the link partner what to do - we cannot force it to
      enable reception of our preemptible packets.
      
      That is fully solved by the verification feature, where the local device
      generates some small probe frames which look like preemptible frames
      with no useful content, and the link partner is obliged to respond to
      them if it supports the standard. If the verification times out, we know
      that preemption isn't active in our TX direction on the link.
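
      The verification handshake described above can be sketched roughly as
      follows. This is a minimal illustration, not the standard's state
      machine: mm_verify(), send_probe and the verify_limit default of 3 are
      hypothetical names and values chosen for the example, and send_probe
      stands in for "transmit one probe frame and wait for the response".

```python
from typing import Callable


def mm_verify(send_probe: Callable[[], bool], verify_limit: int = 3) -> str:
    """Probe the link partner up to verify_limit times.

    send_probe is a hypothetical callback: it sends one probe frame and
    returns True if the link partner answered before the timeout expired.
    verify_limit=3 is an assumption made for this sketch.
    """
    for _ in range(verify_limit):
        if send_probe():
            # The partner responded, so it can receive preemptible frames.
            return "SUCCEEDED"
    # Every attempt timed out: preemption must stay inactive in TX.
    return "FAILED"
```

      A partner that never answers yields "FAILED", so the local device
      keeps its TX direction inactive and sends no preemptible frames.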
      
      Having clarified the definition, this selftest exercises the manual
      (ethtool) configuration path of 2 link partners (with and without
      verification), and the LLDP code path, using the openlldp project.
      
      The test also verifies the TX activity of the MAC Merge layer by
      sending traffic through a traffic class configured as preemptible
      (using mqprio). There isn't a good way to make this really portable
      (user space cannot find out how many traffic classes there are for
      a device), but I chose num_tc 4 here, that should work reasonably well.
      I also know that some devices (stmmac) only permit TXQ0 to be
      preemptible, so this is why PREEMPTIBLE_PRIO was strategically chosen
      as 0. Even if other hardware is more configurable, this test should
      cover the baseline.
      
      This is not really a "forwarding" selftest, but I put it near the other
      "ethtool" selftests.
      
      $ ./ethtool_mm.sh eno0 swp0
      TEST: Manual configuration with verification: eno0 to swp0          [ OK ]
      TEST: Manual configuration with verification: swp0 to eno0          [ OK ]
      TEST: Manual configuration without verification: eno0 to swp0       [ OK ]
      TEST: Manual configuration without verification: swp0 to eno0       [ OK ]
      TEST: Manual configuration with failed verification: eno0 to swp0   [ OK ]
      TEST: Manual configuration with failed verification: swp0 to eno0   [ OK ]
      TEST: LLDP                                                          [ OK ]
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e6991384
    • Vladimir Oltean's avatar
      selftests: forwarding: introduce helper for standard ethtool counters · b5bf7126
      Vladimir Oltean authored
      Counters for the MAC Merge layer and preemptible MAC have so far
      standardized on structured ethtool stats, as opposed to driver-specific
      names and meanings.
      
      Benefit from that rare opportunity and introduce a helper to lib.sh for
      querying standardized counters, in the hope that these will take off for
      other uses as well.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b5bf7126
    • Petr Machata's avatar
      selftests: forwarding: generalize bail_on_lldpad from mlxsw · 8fcac792
      Petr Machata authored
      mlxsw selftests often invoke a bail_on_lldpad() helper to make sure LLDPAD
      is not running, to prevent conflicts between the QoS configuration applied
      through TC or DCB command line tool, and the DCB configuration that LLDPAD
      might apply. This helper might be useful to others. Move the function to
      lib.sh, and parameterize it to make it reusable in other contexts.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Danielle Ratson <danieller@nvidia.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8fcac792
    • Petr Machata's avatar
      selftests: forwarding: sch_tbf_*: Add a pre-run hook · 54e906f1
      Petr Machata authored
      The driver-specific wrappers of these selftests invoke bail_on_lldpad to
      make sure that LLDPAD doesn't trample the configuration. The function
      bail_on_lldpad is going to move to lib.sh in the next patch. With that, it
      won't be visible for the wrappers before sourcing the framework script. And
      after sourcing it, it is too late: the selftest will have run by then.
      
      One option might be to source NUM_NETIFS=0 lib.sh from the wrapper, but
      even if that worked (it might, it might not), that seems cumbersome. lib.sh
      is doing a fair amount of stuff, and even if it works today, it does not
      look particularly solid as a solution.
      
      Instead, introduce a hook, sch_tbf_pre_hook(), that when available, gets
      invoked. Move the bail to the hook.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Danielle Ratson <danieller@nvidia.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      54e906f1
    • Vladimir Oltean's avatar
      net: ethtool: mm: sanitize some UAPI configurations · 35b288d6
      Vladimir Oltean authored
      The verify-enabled boolean (ETHTOOL_A_MM_VERIFY_ENABLED) was intended to
      be a sub-setting of tx-enabled (ETHTOOL_A_MM_TX_ENABLED). IOW, MAC Merge
      TX can be enabled with or without verification, but verification with TX
      disabled makes no sense.
      
      The pmac-enabled boolean (ETHTOOL_A_MM_PMAC_ENABLED) was intended to be
      a global toggle from an API perspective, whereas tx-enabled just handles
      the TX direction. IOW, the pMAC can be enabled with or without TX, but
      it doesn't make sense to enable TX if the pMAC is not enabled.
      
      Add two checks which sanitize and reject these invalid cases.
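
      The two rules can be illustrated with a small stand-alone sketch. It
      is not the kernel code: validate_mm_config() and is_valid() are
      hypothetical names, and the three booleans simply mirror the
      pmac-enabled, tx-enabled and verify-enabled settings.

```python
def validate_mm_config(pmac_enabled: bool, tx_enabled: bool,
                       verify_enabled: bool) -> None:
    """Raise ValueError for the two combinations described as invalid."""
    # verify-enabled is a sub-setting of tx-enabled.
    if verify_enabled and not tx_enabled:
        raise ValueError("verification requires TX to be enabled")
    # tx-enabled only makes sense when the pMAC itself is enabled.
    if tx_enabled and not pmac_enabled:
        raise ValueError("TX requires the pMAC to be enabled")


def is_valid(pmac_enabled: bool, tx_enabled: bool, verify_enabled: bool) -> bool:
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_mm_config(pmac_enabled, tx_enabled, verify_enabled)
    except ValueError:
        return False
    return True
```

      Under these two rules, 4 of the 8 possible combinations of the three
      booleans remain valid.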
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      35b288d6
    • Vladimir Oltean's avatar
      net: enetc: include MAC Merge / FP registers in register dump · 16a2c763
      Vladimir Oltean authored
      These have been useful in debugging various problems related to frame
      preemption, so make them available through ethtool --register-dump for
      later too.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      16a2c763
    • Vladimir Oltean's avatar
      net: enetc: only commit preemptible TCs to hardware when MM TX is active · 82714539
      Vladimir Oltean authored
      This was left as TODO in commit 01e23b2b ("net: enetc: add support
      for preemptible traffic classes") since it's relatively complicated.
      
      Where this makes a difference is with a configuration as follows:
      
      ethtool --set-mm eno0 pmac-enabled on tx-enabled on verify-enabled on
      
      Preemptible packets should only be sent when the MAC Merge TX direction
      becomes active (i.o.w. when the verification process succeeds, aka when
      the link partner confirms it can process preemptible traffic). But the
      tc qdisc with the preemptible traffic classes is offloaded completely
      asynchronously w.r.t. the MM becoming active.
      
      The ENETC manual does suggest that this should be handled in the driver:
      "On startup, software should wait for the verification process to
      complete (MMCSR[VSTS]=011) before initiating traffic".
      
      Adding the necessary logic allows future selftests to uphold the claim
      that an inactive or disabled MAC Merge layer should never send data
      packets through the pMAC.
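
      The idea of decoupling the asynchronous qdisc offload from the state
      of the MAC Merge layer can be sketched as below. This is a toy model
      under stated assumptions, not the driver's implementation: the Port
      class, its attributes and its methods are all hypothetical, and
      hw_mask stands in for whatever the hardware is actually programmed
      with.

```python
class Port:
    """Toy model of a port that defers committing preemptible TCs."""

    def __init__(self) -> None:
        self.requested_mask = 0   # preemptible TCs requested via the qdisc
        self.tx_active = False    # whether the MM TX direction is active
        self.hw_mask = 0          # what is actually committed to "hardware"

    def _commit(self) -> None:
        # Only commit the requested classes while MM TX is active;
        # otherwise all traffic classes stay express (mask 0).
        self.hw_mask = self.requested_mask if self.tx_active else 0

    def set_preemptible_tcs(self, mask: int) -> None:
        # Called when the qdisc is offloaded, at an arbitrary time.
        self.requested_mask = mask
        self._commit()

    def set_tx_active(self, active: bool) -> None:
        # Called when verification succeeds or fails, or on link changes.
        self.tx_active = active
        self._commit()
```

      In this model the order of the two events does not matter: the
      request is remembered and reaches hw_mask only once TX is active.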
      
      This change moves enetc_set_ptcfpr() from enetc.c to enetc_ethtool.c,
      where its only caller is now - enetc_mm_commit_preemptible_tcs().
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      82714539
    • Vladimir Oltean's avatar
      net: enetc: report mm tx-active based on tx-enabled and verify-status · 153b5b1d
      Vladimir Oltean authored
      The MMCSR register contains 2 fields with overlapping meaning:
      
      - LPA (Local preemption active):
      This read-only status bit indicates whether preemption is active for
      this port. This bit will be set if preemption is both enabled and has
      completed the verification process.
      - TXSTS (Merge status):
      This read-only status field provides the state of the MAC Merge sublayer
      transmit status as defined in IEEE Std 802.3-2018 Clause 99.
      00 Transmit preemption is inactive
      01 Transmit preemption is active
      10 Reserved
      11 Reserved
      
      However, neither of these 2 fields offers reliable reporting to software.
      
      When connecting ENETC to a link partner which is not capable of Frame
      Preemption, the expectation is that ENETC's verification should fail
      (VSTS=4) and its MM TX direction should be inactive (LPA=0, TXSTS=00)
      even though the MM TX is enabled (ME=1). But surprise, the LPA bit of
      MMCSR stays set even if VSTS=4 and ME=1.
      
      OTOH, the TXSTS field has the opposite problem. I cannot get its value
      to change from 0, even when connecting to a link partner capable of
      frame preemption, which does respond to its verification frames (ME=1
      and VSTS=3, "SUCCEEDED").
      
      The only option with such buggy hardware seems to be to reimplement the
      formula for calculating tx-active in software, which is for tx-enabled
      to be true, and for the verify-status to be either SUCCEEDED, or
      DISABLED.
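
      Expressed as code, the software formula reads roughly as follows.
      This is a sketch only: mm_tx_active() is a hypothetical name, and the
      verify status is modeled as a plain string rather than as the
      register values quoted above.

```python
def mm_tx_active(tx_enabled: bool, verify_status: str) -> bool:
    """Compute tx-active in software instead of trusting LPA or TXSTS.

    TX counts as active when it is enabled and verification has either
    succeeded or is not being used at all.
    """
    return tx_enabled and verify_status in ("SUCCEEDED", "DISABLED")
```

      With this formula, a failed verification against a partner that is
      not capable of frame preemption reports tx-active as false, which is
      the case the LPA bit gets wrong.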
      
      Without reliable tx-active reporting, we have no good indication when
      to commit the preemptible traffic classes to hardware, which makes it
      possible (but not desirable) to send preemptible traffic to a link
      partner incapable of receiving it. However, currently we do not have the
      logic to wait for TX to be active yet, so the impact is limited.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      153b5b1d