1. 03 Apr, 2023 23 commits
  2. 02 Apr, 2023 7 commits
    • Tom Rix's avatar
      net: alteon: remove unused len variable · 51aaa682
      Tom Rix authored
      clang with W=1 reports
      drivers/net/ethernet/alteon/acenic.c:2438:10: error: variable
        'len' set but not used [-Werror,-Wunused-but-set-variable]
                      int i, len = 0;
                             ^
      This variable is not used so remove it.
      Signed-off-by: default avatarTom Rix <trix@redhat.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      51aaa682
    • David S. Miller's avatar
      Merge branch 'mlxsw-transceiver-trip-points' · f85b8824
      David S. Miller authored
      Petr Machata says:
      
      ====================
      mlxsw: Use static trip points for transceiver modules
      
      Ido Schimmel writes:
      
      See patch #1 for motivation and implementation details.
      
      Patches #2-#3 are simple cleanups as a result of the changes in the
      first patch.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f85b8824
    • Ido Schimmel's avatar
      mlxsw: core_thermal: Simplify transceiver module get_temp() callback · cc19439f
      Ido Schimmel authored
      The get_temp() callback of a thermal zone associated with a transceiver
      module no longer needs to read the temperature thresholds of the module.
      Therefore, simplify the callback by only reading the temperature.
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarVadim Pasternak <vadimp@nvidia.com>
      Signed-off-by: default avatarPetr Machata <petrm@nvidia.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cc19439f
    • Ido Schimmel's avatar
      mlxsw: core_thermal: Make mlxsw_thermal_module_init() void · c1536d85
      Ido Schimmel authored
      The function can no longer fail so make it void and remove the
      associated error path.
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarVadim Pasternak <vadimp@nvidia.com>
      Signed-off-by: default avatarPetr Machata <petrm@nvidia.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c1536d85
    • Ido Schimmel's avatar
      mlxsw: core_thermal: Use static trip points for transceiver modules · 5601ef91
      Ido Schimmel authored
      The driver registers a thermal zone for each transceiver module and
      tries to set the trip point temperatures according to the thresholds
      read from the transceiver. If a threshold cannot be read or if a
      transceiver is unplugged, the trip point temperature is set to zero,
      which means that it is disabled as far as the thermal subsystem is
      concerned.
      
      A recent change in the thermal core made it so that such trip points are
      no longer marked as disabled, which lead the thermal subsystem to
      incorrectly set the associated cooling devices to the their maximum
      state [1]. A fix to restore this behavior was merged in commit
      f1b80a38 ("thermal: core: Restore behavior regarding invalid trip
      points"). However, the thermal maintainer suggested to not rely on this
      behavior and instead always register a valid array of trip points [2].
      
      Therefore, create a static array of trip points with sane defaults
      (suggested by Vadim) and register it with the thermal zone of each
      transceiver module. User space can choose to override these defaults
      using the thermal zone sysfs interface since these files are writeable.
      
      Before:
      
       $ cat /sys/class/thermal/thermal_zone11/type
       mlxsw-module11
       $ cat /sys/class/thermal/thermal_zone11/trip_point_*_temp
       65000
       75000
       80000
      
      After:
      
       $ cat /sys/class/thermal/thermal_zone11/type
       mlxsw-module11
       $ cat /sys/class/thermal/thermal_zone11/trip_point_*_temp
       55000
       65000
       80000
      
      Also tested by reverting commit f1b80a38 ("thermal: core: Restore
      behavior regarding invalid trip points") and making sure that the
      associated cooling devices are not set to their maximum state.
      
      [1] https://lore.kernel.org/linux-pm/ZA3CFNhU4AbtsP4G@shredder/
      [2] https://lore.kernel.org/linux-pm/f78e6b70-a963-c0ca-a4b2-0d4c6aeef1fb@linaro.org/Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarVadim Pasternak <vadimp@nvidia.com>
      Signed-off-by: default avatarPetr Machata <petrm@nvidia.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5601ef91
    • Jakub Kicinski's avatar
      net: minor reshuffle of napi_struct · dd2d6604
      Jakub Kicinski authored
      napi_id is read by GRO and drivers to mark skbs, and it currently
      sits at the end of the structure, in a mostly unused cache line.
      Move it up into a hole, and separate the clearly control path
      fields from the important ones.
      
      Before:
      
      struct napi_struct {
      	struct list_head           poll_list;            /*     0    16 */
      	long unsigned int          state;                /*    16     8 */
      	int                        weight;               /*    24     4 */
      	int                        defer_hard_irqs_count; /*    28     4 */
      	long unsigned int          gro_bitmask;          /*    32     8 */
      	int                        (*poll)(struct napi_struct *, int); /*    40     8 */
      	int                        poll_owner;           /*    48     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	struct net_device *        dev;                  /*    56     8 */
      	/* --- cacheline 1 boundary (64 bytes) --- */
      	struct gro_list            gro_hash[8];          /*    64   192 */
      	/* --- cacheline 4 boundary (256 bytes) --- */
      	struct sk_buff *           skb;                  /*   256     8 */
      	struct list_head           rx_list;              /*   264    16 */
      	int                        rx_count;             /*   280     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	struct hrtimer             timer;                /*   288    64 */
      
      	/* XXX last struct has 4 bytes of padding */
      
      	/* --- cacheline 5 boundary (320 bytes) was 32 bytes ago --- */
      	struct list_head           dev_list;             /*   352    16 */
      	struct hlist_node          napi_hash_node;       /*   368    16 */
      	/* --- cacheline 6 boundary (384 bytes) --- */
      	unsigned int               napi_id;              /*   384     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	struct task_struct *       thread;               /*   392     8 */
      
      	/* size: 400, cachelines: 7, members: 17 */
      	/* sum members: 388, holes: 3, sum holes: 12 */
      	/* paddings: 1, sum paddings: 4 */
      	/* last cacheline: 16 bytes */
      };
      
      After:
      
      struct napi_struct {
      	struct list_head           poll_list;            /*     0    16 */
      	long unsigned int          state;                /*    16     8 */
      	int                        weight;               /*    24     4 */
      	int                        defer_hard_irqs_count; /*    28     4 */
      	long unsigned int          gro_bitmask;          /*    32     8 */
      	int                        (*poll)(struct napi_struct *, int); /*    40     8 */
      	int                        poll_owner;           /*    48     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	struct net_device *        dev;                  /*    56     8 */
      	/* --- cacheline 1 boundary (64 bytes) --- */
      	struct gro_list            gro_hash[8];          /*    64   192 */
      	/* --- cacheline 4 boundary (256 bytes) --- */
      	struct sk_buff *           skb;                  /*   256     8 */
      	struct list_head           rx_list;              /*   264    16 */
      	int                        rx_count;             /*   280     4 */
      	unsigned int               napi_id;              /*   284     4 */
      	struct hrtimer             timer;                /*   288    64 */
      
      	/* XXX last struct has 4 bytes of padding */
      
      	/* --- cacheline 5 boundary (320 bytes) was 32 bytes ago --- */
      	struct task_struct *       thread;               /*   352     8 */
      	struct list_head           dev_list;             /*   360    16 */
      	struct hlist_node          napi_hash_node;       /*   376    16 */
      
      	/* size: 392, cachelines: 7, members: 17 */
      	/* sum members: 388, holes: 1, sum holes: 4 */
      	/* paddings: 1, sum paddings: 4 */
      	/* forced alignments: 1 */
      	/* last cacheline: 8 bytes */
      } __attribute__((__aligned__(8)));
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dd2d6604
    • Sylwester Dziedziuch's avatar
      i40e: Add support for VF to specify its primary MAC address · ceb29474
      Sylwester Dziedziuch authored
      Currently in the i40e driver there is no implementation of different
      MAC address handling depending on whether it is a legacy or primary.
      Introduce new checks for VF to be able to specify its primary MAC
      address based on the VIRTCHNL_ETHER_ADDR_PRIMARY type.
      
      Primary MAC address are treated differently compared to legacy
      ones in a scenario where:
      1. If a unicast MAC is being added and it's specified as
      VIRTCHNL_ETHER_ADDR_PRIMARY, then replace the current
      default_lan_addr.addr.
      2. If a unicast MAC is being deleted and it's type
      is specified as VIRTCHNL_ETHER_ADDR_PRIMARY, then zero the
      hw_lan_addr.addr.
      Signed-off-by: default avatarSylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
      Signed-off-by: default avatarMateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: default avatarRafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ceb29474
  3. 01 Apr, 2023 1 commit
  4. 31 Mar, 2023 9 commits
    • Jakub Kicinski's avatar
      Merge tag 'nf-next-2023-03-30' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next · 54fd494a
      Jakub Kicinski authored
      Florian Westphal says:
      
      ====================
      netfilter updates for net-next
      
      1. No need to disable BH in nfnetlink proc handler, freeing happens
         via call_rcu.
      2. Expose classid in nfetlink_queue, from Eric Sage.
      3. Fix nfnetlink message description comments, from Matthieu De Beule.
      4. Allow removal of offloaded connections via ctnetlink, from Paul Blakey.
      
      * tag 'nf-next-2023-03-30' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
        netfilter: ctnetlink: Support offloaded conntrack entry deletion
        netfilter: Correct documentation errors in nf_tables.h
        netfilter: nfnetlink_queue: enable classid socket info retrieval
        netfilter: nfnetlink_log: remove rcu_bh usage
      ====================
      
      Link: https://lore.kernel.org/r/20230331104809.2959-1-fw@strlen.deSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      54fd494a
    • Peng Fan's avatar
    • Kuniyuki Iwashima's avatar
      tcp: Refine SYN handling for PAWS. · ee05d90d
      Kuniyuki Iwashima authored
      Our Network Load Balancer (NLB) [0] has multiple nodes with different
      IP addresses, and each node forwards TCP flows from clients to backend
      targets.  NLB has an option to preserve the client's source IP address
      and port when routing packets to backend targets. [1]
      
      When a client connects to two different NLB nodes, they may select the
      same backend target.  Then, if the client has used the same source IP
      and port, the two flows at the backend side will have the same 4-tuple.
      
      While testing around such cases, I saw these sequences on the backend
      target.
      
      IP 10.0.0.215.60000 > 10.0.3.249.10000: Flags [S], seq 2819965599, win 62727, options [mss 8365,sackOK,TS val 1029816180 ecr 0,nop,wscale 7], length 0
      IP 10.0.3.249.10000 > 10.0.0.215.60000: Flags [S.], seq 3040695044, ack 2819965600, win 62643, options [mss 8961,sackOK,TS val 1224784076 ecr 1029816180,nop,wscale 7], length 0
      IP 10.0.0.215.60000 > 10.0.3.249.10000: Flags [.], ack 1, win 491, options [nop,nop,TS val 1029816181 ecr 1224784076], length 0
      IP 10.0.0.215.60000 > 10.0.3.249.10000: Flags [S], seq 2681819307, win 62727, options [mss 8365,sackOK,TS val 572088282 ecr 0,nop,wscale 7], length 0
      IP 10.0.3.249.10000 > 10.0.0.215.60000: Flags [.], ack 1, win 490, options [nop,nop,TS val 1224794914 ecr 1029816181,nop,nop,sack 1 {4156821004:4156821005}], length 0
      
      It seems to be working correctly, but the last ACK was generated by
      tcp_send_dupack() and PAWSEstab was increased.  This is because the
      second connection has a smaller timestamp than the first one.
      
      In this case, we should send a dup ACK in tcp_send_challenge_ack()
      to increase the correct counter and rate-limit it properly.
      
      Let's check the SYN flag after the PAWS tests to avoid adding unnecessary
      overhead for most packets.
      
      Link: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html [0]
      Link: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html#client-ip-preservation [1]
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarJason Xing <kerneljasonxing@gmail.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ee05d90d
    • Herbert Xu's avatar
      macvlan: Fix mc_filter calculation · ae63ad9b
      Herbert Xu authored
      On Wed, Mar 29, 2023 at 08:10:26AM +0000, patchwork-bot+netdevbpf@kernel.org wrote:
      >
      > Here is the summary with links:
      >   - [1/2] macvlan: Skip broadcast queue if multicast with single receiver
      >     https://git.kernel.org/netdev/net-next/c/d45276e75e90
      >   - [2/2] macvlan: Add netlink attribute for broadcast cutoff
      >     https://git.kernel.org/netdev/net-next/c/954d1fa1ac93
      
      Sorry, I made an error and posted my patches from an earlier
      revision so a follow-up fix was missing:
      
      ---8<---
      The bc_cutoff patch broke the calculation of mc_filter causing
      some multicast packets to not make it through to the targeted
      device.
      
      Fix this by checking whether vlan is set instead of cutoff >= 0.
      
      Also move the cutoff < 0 logic into macvlan_recompute_bc_filter
      so that it doesn't change the mc_filter at all.
      
      Fixes: d45276e7 ("macvlan: Skip broadcast queue if multicast with single receiver")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ae63ad9b
    • Jakub Kicinski's avatar
      Merge tag 'wireless-next-2023-03-30' of... · ce7928f7
      Jakub Kicinski authored
      Merge tag 'wireless-next-2023-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
      
      Johannes Berg says:
      
      ====================
      Major stack changes:
      
       * TC offload support for drivers below mac80211
       * reduced neighbor report (RNR) handling for AP mode
       * mac80211 mesh fast-xmit and fast-rx support
       * support for another mesh A-MSDU format
         (seems nobody got the spec right)
      
      Major driver changes:
      
      Kalle moved the drivers that were just plain C files
      in drivers/net/wireless/ to legacy/ and virtual/ dirs.
      
      hwsim
       * multi-BSSID support
       * some FTM support
      
      ath11k
       * MU-MIMO parameters support
       * ack signal support for management packets
      
      rtl8xxxu
       * support for RTL8710BU aka RTL8188GU chips
      
      rtw89
       * support for various newer firmware APIs
      
      ath10k
       * enabled threaded NAPI on WCN3990
      
      iwlwifi
       * lots of work for multi-link/EHT (wifi7)
       * hardware timestamping support for some devices/firwmares
       * TX beacon protection on newer hardware
      
      * tag 'wireless-next-2023-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (181 commits)
        wifi: clean up erroneously introduced file
        wifi: iwlwifi: mvm: correctly use link in iwl_mvm_sta_del()
        wifi: iwlwifi: separate AP link management queues
        wifi: iwlwifi: mvm: free probe_resp_data later
        wifi: iwlwifi: bump FW API to 75 for AX devices
        wifi: iwlwifi: mvm: move max_agg_bufsize into host TLC lq_sta
        wifi: iwlwifi: mvm: send full STA during HW restart
        wifi: iwlwifi: mvm: rework active links counting
        wifi: iwlwifi: mvm: update mac config when assigning chanctx
        wifi: iwlwifi: mvm: use the correct link queue
        wifi: iwlwifi: mvm: clean up mac_id vs. link_id in MLD sta
        wifi: iwlwifi: mvm: fix station link data leak
        wifi: iwlwifi: mvm: initialize max_rc_amsdu_len per-link
        wifi: iwlwifi: mvm: use appropriate link for rate selection
        wifi: iwlwifi: mvm: use the new lockdep-checking macros
        wifi: iwlwifi: mvm: remove chanctx WARN_ON
        wifi: iwlwifi: mvm: avoid sending MAC context for idle
        wifi: iwlwifi: mvm: remove only link-specific AP keys
        wifi: iwlwifi: mvm: skip inactive links
        wifi: iwlwifi: mvm: adjust iwl_mvm_scan_respect_p2p_go_iter() for MLO
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20230330205612.921134-1-johannes@sipsolutions.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ce7928f7
    • Jakub Kicinski's avatar
      Merge branch 'tools-ynl-fill-in-some-gaps-of-ethtool-spec' · dee1efb3
      Jakub Kicinski authored
      Stanislav Fomichev says:
      
      ====================
      tools: ynl: fill in some gaps of ethtool spec
      
      I was trying to fill in the spec while exploring ethtool API for some
      related work. I don't think I'll have the patience to fill in the rest,
      so decided to share whatever I currently have.
      
      Patches 1-2 add the be16 + spec.
      Patches 3-4 implement an ethtool-like python tool to test the spec.
      
      Patches 3-4 are there because it felt more fun do the tool instead
      of writing the actual tests; feel free to drop it; sharing mostly
      to show that the spec is not a complete nonsense.
      
      The spec is not 100% complete, see patch 2 for what's missing.
      I was hoping to finish the stats-get message, but I'm too dump
      to implement bitmask marshaling (multi-attr).
      ====================
      
      Link: https://lore.kernel.org/r/20230329221655.708489-1-sdf@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      dee1efb3
    • Stanislav Fomichev's avatar
      tools: ynl: ethtool testing tool · f3d07b02
      Stanislav Fomichev authored
      This is what I've been using to see whether the spec makes sense.
      A small subset of getters (mostly the unprivileged ones) is implemented.
      Some setters (channels) also work.
      Setters for messages with bitmasks are not implemented.
      
      Initially I was trying to make this tool look 1:1 like real ethtool,
      but eventually gave up :-)
      
      Sample output:
      
      $ ./tools/net/ynl/ethtool enp0s31f6
      Settings for enp0s31f6:
      Supported ports: [ TP ]
      Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half
      100baseT/Full 1000baseT/Full
      Supported pause frame use: no
      Supports auto-negotiation: yes
      Supported FEC modes: Not reported
      Speed: Unknown!
      Duplex: Unknown! (255)
      Auto-negotiation: on
      Port: Twisted Pair
      PHYAD: 2
      Transceiver: Internal
      MDI-X: Unknown (auto)
      Current message level: drv probe link
      Link detected: no
      Signed-off-by: default avatarStanislav Fomichev <sdf@google.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f3d07b02
    • Stanislav Fomichev's avatar
      tools: ynl: replace print with NlError · 48993e22
      Stanislav Fomichev authored
      Instead of dumping the error on the stdout, make the callee and
      opportunity to decide what to do with it. This is mostly for the
      ethtool testing.
      Signed-off-by: default avatarStanislav Fomichev <sdf@google.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      48993e22
    • Stanislav Fomichev's avatar
      tools: ynl: populate most of the ethtool spec · a353318e
      Stanislav Fomichev authored
      Things that are not implemented:
      - cable tests
      - bitmaks in the requests don't work (needs multi-attr support in ynl.py)
      - stats-get seems to return nonsense (not passing a bitmask properly?)
      - notifications are not tested
      Signed-off-by: default avatarStanislav Fomichev <sdf@google.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      a353318e