1. 01 Jan, 2024 14 commits
  2. 29 Dec, 2023 6 commits
    • Merge tag 'mlx5-updates-2023-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 92de776d
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2023-12-20
      
      mlx5 Socket direct support and management PF profile.
      
      Tariq Says:
      ===========
      Support Socket-Direct multi-dev netdev
      
      This series adds support for combining multiple devices (PFs) of the
      same port under one netdev instance. Passing traffic through different
      devices belonging to different NUMA sockets saves cross-NUMA traffic and
      lets apps running on the same netdev from different NUMA nodes stay
      close to a device and achieve improved performance.
      
      We achieve this by grouping PFs together, and creating the netdev only
      once all group members are probed. Symmetrically, we destroy the netdev
      once any of the PFs is removed.
      
      The channels are distributed between all devices; a proper configuration
      would use the device on the NUMA node closest to the app/CPU being served.
      
      We pick one device to be the primary (leader), and it fills a special
      role.  The other devices (secondaries) are disconnected from the network
      at the chip level (set to silent mode). All RX/TX traffic is steered
      through the primary to/from the secondaries.
      
      Currently, we limit the support to PFs only, and up to two devices
      (sockets).
      
      ===========
      
      Armen Says:
      ===========
      Management PF support and module integration
      
      This patch rolls out comprehensive support for the Management Physical
      Function (MGMT PF) within the mlx5 driver. It involves updating the
      mlx5 interface header to introduce necessary definitions for MGMT PF
      and adding a new management PF netdev profile, which will allow the host
      side to communicate with the embedded Linux on BlueField devices.
      
      ===========
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
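As a rough illustration of how channels might be spread across the grouped PFs, the sketch below maps a channel index to a device index in round-robin order and checks whether that device is local to a given NUMA node. The round-robin policy and the names `sd_channel_to_dev` and `sd_channel_is_local` are assumptions made for this example; the cover letter only states that channels are distributed between all devices.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: pick the device (PF) that serves a channel.
 * Assumes a simple round-robin split, which the cover letter does not
 * spell out. Returns -1 on invalid input. */
static int sd_channel_to_dev(int ch_ix, int num_devs)
{
	if (ch_ix < 0 || num_devs <= 0)
		return -1;
	return ch_ix % num_devs;
}

/* Hypothetical helper: true when the channel's device sits on the same
 * NUMA node as the CPU that services the channel. dev_node[] holds the
 * NUMA node of each device in the group. */
static bool sd_channel_is_local(int ch_ix, const int *dev_node,
				int num_devs, int cpu_node)
{
	int dev = sd_channel_to_dev(ch_ix, num_devs);

	return dev >= 0 && dev_node[dev] == cpu_node;
}
```

With two devices on NUMA nodes 0 and 1, odd channels would land on the second device under this assumed policy, so an app pinned to node 1 would prefer the odd channels.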
    • genetlink: Use internal flags for multicast groups · cd4d7263
      Ido Schimmel authored
      As explained in commit e0378187 ("drop_monitor: Require
      'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the
      multicast group structure reuses uAPI flags despite the field not being
      exposed to user space. This makes it impossible to extend its use
      without adding new uAPI flags, which is inappropriate for internal
      kernel checks.
      
      Solve this by adding internal flags (i.e., "GENL_MCAST_*") and convert
      the existing users to use them instead of the uAPI flags.
      
      Tested using the reproducers in commit 44ec98ea ("psample: Require
      'CAP_NET_ADMIN' when joining "packets" group") and commit e0378187
      ("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group").
      
      No functional changes intended.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
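The idea of keeping capability requirements in internal, non-uAPI flags can be modelled in user space as below. This is only a sketch: the `DEMO_` names, the bit values, and the `demo_may_join` helper are assumptions for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for internal (non-uAPI) multicast group flags.
 * The DEMO_ names and bit values are assumptions for this sketch. */
#define DEMO_MCAST_CAP_NET_ADMIN (1u << 0)
#define DEMO_MCAST_CAP_SYS_ADMIN (1u << 1)

struct demo_mcast_group {
	const char *name;
	unsigned int flags; /* internal only, never exposed to user space */
};

/* Decide whether a caller may join the group, given its capabilities. */
static bool demo_may_join(const struct demo_mcast_group *grp,
			  bool has_net_admin, bool has_sys_admin)
{
	if ((grp->flags & DEMO_MCAST_CAP_NET_ADMIN) && !has_net_admin)
		return false;
	if ((grp->flags & DEMO_MCAST_CAP_SYS_ADMIN) && !has_sys_admin)
		return false;
	return true;
}
```

Because the flag namespace is private to the kernel side in this model, a new check could be added as another bit without touching any user-visible header.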
    • iucv: make iucv_bus const · f732ba4a
      Greg Kroah-Hartman authored
      Now that the driver core can properly handle constant struct bus_type,
      move the iucv_bus variable to be a constant structure as well, placing
      it into read-only memory which cannot be modified at runtime.
      
      Cc: Wenjia Zhang <wenjia@linux.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: linux-s390@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Acked-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
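The effect of constifying a bus descriptor can be sketched with a simplified user-space stand-in. `struct demo_bus_type`, `demo_iucv_bus`, and `demo_bus_name` are hypothetical names for this example and do not reproduce the kernel's `struct bus_type`.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for a bus descriptor; not the kernel's struct. */
struct demo_bus_type {
	const char *name;
};

/* A const object with static storage is typically placed in read-only
 * data by the toolchain, so stray writes fault instead of corrupting it. */
static const struct demo_bus_type demo_iucv_bus = {
	.name = "iucv",
};

/* Consumers take a pointer-to-const, so the compiler rejects any
 * attempt to modify the descriptor through this pointer. */
static const char *demo_bus_name(const struct demo_bus_type *bus)
{
	return bus->name;
}
```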
    • ethtool: reformat kerneldoc for struct ethtool_fec_stats · 1271ca00
      Jonathan Corbet authored
      The kerneldoc comment for struct ethtool_fec_stats attempts to describe the
      "total" and "lanes" fields of the ethtool_fec_stat substructure in a way
      leading to these warnings:
      
        ./include/linux/ethtool.h:424: warning: Excess struct member 'lane' description in 'ethtool_fec_stats'
        ./include/linux/ethtool.h:424: warning: Excess struct member 'total' description in 'ethtool_fec_stats'
      
      Reformat the comment to retain the information while eliminating the
      warnings.
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
      Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: reformat kerneldoc for struct ethtool_link_settings · d0c3891d
      Jonathan Corbet authored
      The kerneldoc comment for struct ethtool_link_settings includes
      documentation for three fields that were never present there, leading to
      these docs-build warnings:
      
        ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'supported' description in 'ethtool_link_settings'
        ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'advertising' description in 'ethtool_link_settings'
        ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'lp_advertising' description in 'ethtool_link_settings'
      
      Remove the entries to make the warnings go away.  There was some
      information there on how data in ->link_mode_masks is formatted; move that
      to the body of the comment to preserve it.
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
      Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sock: remove excess structure-member documentation · 144377c3
      Jonathan Corbet authored
      Remove a couple of kerneldoc entries for struct members that do not exist,
      addressing these warnings:
      
        ./include/net/sock.h:548: warning: Excess struct member '__sk_flags_offset' description in 'sock'
        ./include/net/sock.h:548: warning: Excess struct member 'sk_padding' description in 'sock'
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
      Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 27 Dec, 2023 11 commits
  4. 26 Dec, 2023 9 commits
    • bridge: cfm: fix enum typo in br_cc_ccm_tx_parse · c2b2ee36
      Lin Ma authored
      It appears that there is a typo in the code where the nlattr array is
      being parsed with policy br_cfm_cc_ccm_tx_policy, but the instance is
      being accessed via IFLA_BRIDGE_CFM_CC_RDI_INSTANCE, which is associated
      with the policy br_cfm_cc_rdi_policy.
      
      This problem was introduced by commit 2be665c3 ("bridge: cfm: Netlink
      SET configuration Interface.").
      
      Though it seems like a harmless typo since these two enums have the exact
      same value (1 here), it is quite misleading, hence fix it by using the
      correct enum IFLA_BRIDGE_CFM_CC_CCM_TX_INSTANCE here.
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
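Why the wrong enumerator was harmless at runtime can be shown with a small stand-alone sketch. The `DEMO_` enums below are stand-ins that only mirror the property stated in the commit message (both INSTANCE members equal 1); they are not the uAPI definitions, and the attribute table is reduced to an array of strings.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in enums: both INSTANCE members have the value 1, as the
 * commit message states for the real attributes. */
enum { DEMO_CC_RDI_UNSPEC, DEMO_CC_RDI_INSTANCE };
enum { DEMO_CC_CCM_TX_UNSPEC, DEMO_CC_CCM_TX_INSTANCE };

/* Correct lookup: index the table with the enum that belongs to the
 * policy the table was parsed with. */
static const char *demo_ccm_tx_instance(const char *const *tb)
{
	return tb[DEMO_CC_CCM_TX_INSTANCE];
}

/* The typo: an enumerator from the other policy. It selects the same
 * slot only because the two values happen to coincide. */
static const char *demo_ccm_tx_instance_typo(const char *const *tb)
{
	return tb[DEMO_CC_RDI_INSTANCE];
}
```

If either enum were ever reordered, the second helper would silently read a different attribute, which is the reason to fix the name even though behaviour is unchanged today.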
    • Merge branch 'mptcp-cleanups-ephemeral-port-sockopts' · 1f62f58d
      David S. Miller authored
      Matthieu Baerts says:
      
      ====================
      mptcp: cleanup and support more ephemeral ports sockopts
      
      Patch 1 is a cleanup: the mptcp_is_tcpsk() helper was modifying sock_ops
      in some cases, which is unexpected given its name.
      
      Patches 2 to 4 add support for two socket options: IP_LOCAL_PORT_RANGE and
      IP_BIND_ADDRESS_NO_PORT. The first is a preparation patch, the
      second adds the support, and the last modifies an existing
      selftest to validate the new features.
      ====================
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE · 122db5e3
      Maxim Galaganov authored
      Since the previous commit, MPTCP has support for the IP_BIND_ADDRESS_NO_PORT
      and IP_LOCAL_PORT_RANGE sockopts.
      
      Add ip4_mptcp and ip6_mptcp fixture variants to ip_local_port_range
      selftest to provide selftest coverage for these sockopts.
      Acked-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Maxim Galaganov <max@internet.ru>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: sockopt: support IP_LOCAL_PORT_RANGE and IP_BIND_ADDRESS_NO_PORT · c85636a2
      Maxim Galaganov authored
      Support for IP_BIND_ADDRESS_NO_PORT sockopt was introduced in [1].
      Recently [2] allowed its value to be accessed without locking the
      socket.
      
      Support for (newer) IP_LOCAL_PORT_RANGE sockopt was introduced in [3].
      In the same series a selftest was added in [4]. This selftest also
      covers the IP_BIND_ADDRESS_NO_PORT sockopt.
      
      This patch enables getsockopt()/setsockopt() on MPTCP sockets for these
      socket options, syncing set values to subflows in sync_socket_options().
      The ephemeral port range is synced to subflows, enabling the NAT use case
      described in [3].
      
      [1] commit 90c337da ("inet: add IP_BIND_ADDRESS_NO_PORT to overcome
      bind(0) limitations")
      [2] commit ca571e2e ("inet: move inet->bind_address_no_port to
      inet->inet_flags")
      [3] commit 91d0b78c ("inet: Add IP_LOCAL_PORT_RANGE socket option")
      [4] commit ae543965 ("selftests/net: Cover the IP_LOCAL_PORT_RANGE
      socket option")
      Signed-off-by: Maxim Galaganov <max@internet.ru>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
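From user space, setting the two options on an MPTCP socket might look like the sketch below. The helper names `pack_local_port_range` and `mptcp_socket_with_port_range` are made up for this example; the fallback constants are included on the assumption that older libc headers may lack them, and the call simply fails on kernels without MPTCP or without this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values match the Linux uAPI. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif
#ifndef IP_BIND_ADDRESS_NO_PORT
#define IP_BIND_ADDRESS_NO_PORT 24
#endif
#ifndef IP_LOCAL_PORT_RANGE
#define IP_LOCAL_PORT_RANGE 51
#endif

/* Pack a port range into the 32-bit option value:
 * low 16 bits = lower bound, high 16 bits = upper bound. */
static uint32_t pack_local_port_range(uint16_t lo, uint16_t hi)
{
	return ((uint32_t)hi << 16) | lo;
}

/* Open an MPTCP socket and apply both options.
 * Returns the fd, or -1 if MPTCP or either option is unavailable. */
static int mptcp_socket_with_port_range(uint16_t lo, uint16_t hi)
{
	int one = 1;
	uint32_t range = pack_local_port_range(lo, hi);
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT,
		       &one, sizeof(one)) ||
	    setsockopt(fd, IPPROTO_IP, IP_LOCAL_PORT_RANGE,
		       &range, sizeof(range))) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would typically bind() to a local address with port 0 afterwards and let connect() pick an ephemeral port from the configured range.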
    • mptcp: rename mptcp_setsockopt_sol_ip_set_transparent() · 57d3117c
      Maxim Galaganov authored
      The next patch extends this function so that it's not specific to
      IP_TRANSPARENT. Change the function name to mptcp_setsockopt_sol_ip_set().
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Maxim Galaganov <max@internet.ru>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: don't overwrite sock_ops in mptcp_is_tcpsk() · 8e2b8a9f
      Davide Caratti authored
      Eric Dumazet suggests:
      
       > The fact that mptcp_is_tcpsk() was able to write over sock->ops was a
       > bit strange to me.
       > mptcp_is_tcpsk() should answer a question, with a read-only argument.
      
      Refactor the code to avoid overwriting sock_ops inside that function. Also,
      change the helper name to reflect the semantics and to disambiguate from
      its dual, sk_is_mptcp(). While at it, collapse mptcp_stream_accept() and
      mptcp_accept() into a single function, where fallback / non-fallback are
      separated by a single sk_is_mptcp() conditional.
      
      Link: https://github.com/multipath-tcp/mptcp_net-next/issues/432
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: at803x: better align function variables to open parenthesis · 7961ef1f
      Christian Marangi authored
      Better align function variables to the open parenthesis, as suggested by
      the checkpatch script for the qca808x functions, to make the code cleaner.
      
      For the cable_test_get_status function, some additional rework was needed
      to handle overly long functions.
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-sched-tc-block-ports-tracking' · 44a949ad
      David S. Miller authored
      Victor Nogueira says:
      
      ====================
      net/sched: Introduce tc block ports tracking and use
      
      __context__
      The "tc block" is a collection of netdevs/ports which allow qdiscs to share
      match-action block instances (as opposed to the traditional tc filter per
      netdev/port)[1].
      
      Up to this point in the implementation, the block is unaware of its ports.
      This patch makes the tc block ports available to the datapath.
      
      For the datapath we provide a use case of the tc block in a mirred
      action in patch 3. Users can leverage mirred to do something like
      the following:
      
      $ tc qdisc add dev ens7 ingress_block 22 clsact
      $ tc qdisc add dev ens8 ingress_block 22 clsact
      $ tc qdisc add dev ens9 ingress_block 22 clsact
      $ tc filter add block 22 protocol ip pref 25 \
        flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22
      
      In this case, if the packet arrives on ens8, it will be copied and sent to
      all ports in the block excluding ens8. Note that the packet is still in
      the pipeline at this point, meaning other actions could be added after the
      mirror because mirred copies/clones the skb. For example, the following is
      valid:
      
      $ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 \
      action mirred egress mirror blockid 22 \
      action vlan push id 123 \
      action mirred egress redirect dev dummy0
      
      The redirect behavior always steals the packet from the pipeline, and
      therefore the skb is no longer available for a subsequent action, as
      illustrated above (in redirecting to dummy0).
      
      The behavior of redirecting to a tc block is therefore adapted to work in
      the same manner. With a setup such as:
      $ tc qdisc add dev ens7 ingress_block 22 clsact
      $ tc qdisc add dev ens8 ingress_block 22 clsact
      $ tc qdisc add dev ens9 ingress_block 22 clsact
      $ tc filter add block 22 protocol ip pref 25 \
        flower dst_ip 192.168.0.0/16 action mirred egress redirect blockid 22
      
      a matching packet arriving on ens7 is first copied/cloned to ens8 (as in
      the "mirror" behavior) and then sent to ens9 as in the redirect behavior
      above. Once this processing is done, no other actions are able to process
      this skb, i.e., it is removed from the "pipeline".
      
      Patch 1 separates/exports mirror and redirect functions from act_mirred.
      Patch 2 introduces the required infra.
      Patch 3 allows mirred to target blocks.
      
      Subsequent patches will come with tdc test cases.
      
      __Acknowledgements__
      Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
      better. The idea of integrating the ports into the tc block was suggested
      by Jiri Pirko.
      
      [1] See commit ca46abd6 ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
      
      Changes in v2:
        - Remove RFC tag
        - Add more details in patch 0(Jiri)
        - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
          Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
        - Fix bad dev dereference in printk of blockcast action (Simon)
      
      Changes in v3:
        - Add missing xa_destroy (pointed out by Vlad)
        - Remove bugfix pointed by Vlad (will send in separate patch)
        - Removed ports from subject in patch #2 and typos (suggested by
          Marcelo)
        - Remove net_notice_ratelimited debug messages in error
          cases (suggested by Marcelo)
        - Minor changes to appease sparse's lock context warning
      
      Changes in v4:
        - Avoid code repetition using gotos in cast_one (suggested by Paolo)
        - Fix typo in cover letter (pointed out by Paolo)
        - Create a module description for act_blockcast
          (reported by Paolo and CI)
      
      Changes in v5:
        - Add new patch which separated mirred into mirror and redirect
          functions (suggested by Jiri)
        - Instead of repeating the code to mirror in blockcast use mirror
          exported function by patch1 (tcf_mirror_act)
        - Make Block ID into act_blockcast's parameter passed by user space
          instead of always getting it from SKB (suggested by Jiri)
        - Add tx_type parameter which will specify what transmission behaviour
          we want (as described earlier)
      
      Changes in v6:
        - Remove blockcast and make it a part of mirred (suggested by Jiri)
        - Block ID is now a mirred parameter
        - We now allow redirecting and mirroring to either ingress or egress
      
      Changes in v7:
        - Remove set but not used variable in tcf_mirred_act (pointed out by
          Jakub)
      
      Changes in v8:
        - Fix uapi issues (pointed out by Jiri)
        - Separate last patch into 3 - two as preparations for adding
          block ID to mirred and one allowing mirred to block (suggested by Jiri)
        - Remove declaration initialisation of eg_block and in_block in
          qdisc_block_add_dev (suggested by Jiri)
        - Avoid unnecessary if guards in qdisc_block_add_dev (suggested by Jiri)
        - Remove unnecessary block_index retrieval in __qdisc_destroy
          (suggested by Jiri)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
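The "send to every port in the block except the ingress port" rule from the cover letter can be sketched as a small selection function. This is a user-space model only: ports are plain integers, and `block_mirror_targets` is a hypothetical name, not a function from act_mirred.

```c
#include <assert.h>
#include <stddef.h>

/* Collect the ports that receive a copy when mirroring to a block:
 * every port in the block except the one the packet arrived on.
 * Ports are plain ifindex-like integers in this sketch.
 * Returns the number of targets written to out[] (capacity >= n). */
static size_t block_mirror_targets(const int *ports, size_t n,
				   int ingress_port, int *out)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (ports[i] != ingress_port)
			out[count++] = ports[i];
	}
	return count;
}
```

For a block holding ports 7, 8 and 9 and a packet arriving on 8, the targets are 7 and 9. In the redirect case described above, the last target would receive the original skb rather than a clone.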
    • net/sched: act_mirred: Allow mirred to block · 42f39036
      Victor Nogueira authored
      So far the mirred action has dealt with syntax that handles
      mirror/redirection for netdev. A matching packet is redirected or mirrored
      to a target netdev.
      
      In this patch we enable mirred to mirror to a tc block as well.
      IOW, the new syntax looks as follows:
      ... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> >
      
      Examples of mirroring or redirecting to a tc block:
      $ tc filter add block 22 protocol ip pref 25 \
        flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22
      
      $ tc filter add block 22 protocol ip pref 25 \
        flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22
      Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
      Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
      Signed-off-by: Victor Nogueira <victor@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>