1. 15 Dec, 2021 35 commits
    • mlxsw: Split handling of FDB tunnel entries between address families · 1fd85416
      Amit Cohen authored
      Currently, the function which adds/removes unicast tunnel FDB entries is
      shared between IPv4 and IPv6, while for IPv6 it warns because there is
      no support for it.
      
      The code for IPv6 will be more complicated because it needs to
      allocate/release a KVDL pointer for the underlay IPv6 address.
      
      As a preparation for IPv6 underlay support, split the code according to
      address family.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_nve_vxlan: Make VxLAN flags check per address family · 720d683c
      Amit Cohen authored
      As part of 'can_offload' checks, there is a check of VxLAN flags.
      
      The supported flags for IPv6 VxLAN will be different from the existing
      flags because of some limitations.
      
      As preparation for IPv6 underlay support, make this check per address
      family.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_ipip: Use common hash table for IPv6 address mapping · cf429115
      Amit Cohen authored
      Use the common hash table introduced by the previous patch instead of
      the IP-in-IP specific implementation.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Add hash table for IPv6 address mapping · e846efe2
      Amit Cohen authored
      The device supports forwarding entries such as routes and FDBs that
      perform tunnel (e.g., VXLAN, IP-in-IP) encapsulation or decapsulation.
      When the underlay is IPv6, these entries do not encode the 128 bit IPv6
      address used for encapsulation / decapsulation. Instead, these entries
      encode a 24 bit pointer to an array called KVDL where the IPv6 address
      is stored.
      
      Currently, only IP-in-IP with IPv6 underlay is supported, but subsequent
      patches will add support for VxLAN with IPv6 underlay. To avoid
      duplicating the logic required to store and retrieve these IPv6
      addresses, introduce a hash table that will store the mapping between
      IPv6 addresses and their KVDL index.
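
The mapping described above can be pictured as a table keyed by the IPv6 address. What follows is only a toy Python model of that idea, not the driver's C code: the class `Ipv6KvdlMap`, its `get()`/`put()` methods, the reference counting, and the counter standing in for KVDL allocation are all assumptions made for illustration.

```python
import ipaddress


class Ipv6KvdlMap:
    """Toy model: map an IPv6 address to a 24-bit KVDL index (hypothetical API)."""

    MAX_INDEX = (1 << 24) - 1  # forwarding entries encode a 24-bit pointer

    def __init__(self):
        self._entries = {}    # packed 128-bit address -> [kvdl_index, refcount]
        self._next_index = 0  # stand-in for a real KVDL allocator (assumption)

    def get(self, addr: str) -> int:
        """Return the KVDL index for addr, allocating one on first use."""
        key = ipaddress.IPv6Address(addr).packed
        entry = self._entries.get(key)
        if entry is None:
            if self._next_index > self.MAX_INDEX:
                raise MemoryError("no free KVDL index")
            entry = [self._next_index, 0]
            self._next_index += 1
            self._entries[key] = entry
        entry[1] += 1
        return entry[0]

    def put(self, addr: str) -> None:
        """Drop one reference; forget the mapping when the last user is gone."""
        key = ipaddress.IPv6Address(addr).packed
        entry = self._entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            del self._entries[key]
```

In this sketch two users of the same underlay address (for example an IP-in-IP tunnel and a VxLAN FDB entry) share one index, which is the duplication the commit message wants to avoid. The toy allocator never reuses freed indexes; a real allocator presumably would.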
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-updates-2021-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · f71f1bcb
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2021-12-14
      
      Parsing Infrastructure for TC actions:
      
      This series introduces a TC action infrastructure to help
      parse TC actions in a generic way for both FDB and NIC rules.
      
      To help maintain the parsing code of TC actions, we move the parsing code
      into a separate action parser per TC action type, each in its own file,
      instead of having one big switch-case loop duplicated between the FDB and
      NIC parsers, as was the case before this patchset.
      
      Each TC flow_action->id is represented by a dedicated mlx5e_tc_act handler,
      which has callbacks to check whether the specific action is supported for
      offload and to parse it.
      
      We move each case (TC action) handling into the specific handler, which is
      responsible for parsing and determining if the action is supported.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 5a21bf5b
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      100GbE Intel Wired LAN Driver Updates 2021-12-14
      
      This series contains updates to the ice driver only.
      
      Haiyue adds support to query hardware for supported PTYPEs.
      
      Jeff changes PTYPE validation to utilize the capabilities queried from
      the hardware instead of maintaining a per DDP support list.
      
      Brett refactors promiscuous functions to provide common and clear
      interfaces to call for configuration.
      
      Wojciech modifies DDP package load to simplify determining the final
      state of the load.
      
      Tony removes the use of ice_status from the driver. This involves
      removing string conversion functions, converting variables and values to
      standard errors, and clean up. He also removes an unused define.
      
      Dan Carpenter removes unneeded casts.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fec: fix system hang during suspend/resume · 0b6f65c7
      Joakim Zhang authored
      1. During a normal suspend (WoL not enabled), the system may hang.
      The root cause is a TXF interrupt arriving after the clocks are disabled:
      the system hangs when the interrupt handler accesses registers. To fix
      this issue, disable all interrupts when the system suspends.
      
      2. The system may also hang during suspend with WoL enabled. After
      entering stop mode, a magic pattern may arrive after the clocks are
      disabled; the system is woken up and the interrupt handler is called,
      and the system hangs when it accesses registers. To fix this issue,
      disable the wakeup irq in .suspend() and enable it in .resume().
      Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ocelot: add support to get port mac from device-tree · 84386995
      Clément Léger authored
      Add support for reading the port MAC address from the device tree using
      of_get_ethdev_address().
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Clément Léger <clement.leger@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sun4i-emac.c: remove unnecessary branch · 3899c928
      Conley Lee authored
      According to the current implementation of emac_rx, every packet that
      has arrived is processed in the while loop, so no packet is left over
      from the previous call. The skb_last field and the branch that deals
      with it are therefore unnecessary.
      Signed-off-by: Conley Lee <conleylee@foxmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: use ethnl_parse_header_dev_put() · 34ac17ec
      Eric Dumazet authored
      It seems I missed that most ethnl_parse_header_dev_get() callers
      declare an on-stack struct ethnl_req_info, and that they simply call
      dev_put(req_info.dev) when about to return.
      
      Add ethnl_parse_header_dev_put() helper to properly untrack
      reference taken by ethnl_parse_header_dev_get().
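
The pairing that such a helper restores can be illustrated with a toy model. This is a Python sketch under stated assumptions, not the kernel API: `RefTracker`, `Dev`, `ReqInfo`, `req_dev_get()` and `req_dev_put()` are hypothetical stand-ins. The point it shows is that a tracked reference has two pieces of state, the count and the tracker entry, and a release that drops only the count leaves the tracker entry outstanding.

```python
class RefTracker:
    """Toy tracker: each tracked reference must be released exactly once."""

    def __init__(self):
        self._live = set()
        self._next_id = 0

    def alloc(self):
        token = self._next_id
        self._next_id += 1
        self._live.add(token)
        return token

    def free(self, token):
        if token not in self._live:
            raise RuntimeError("reference already released")
        self._live.remove(token)

    def outstanding(self):
        return len(self._live)


class Dev:
    def __init__(self, name):
        self.name = name
        self.refcnt = 0
        self.tracker = RefTracker()


class ReqInfo:
    def __init__(self):
        self.dev = None
        self.dev_tracker = None


def req_dev_get(req, dev):
    """Take a tracked reference on dev and remember it in req."""
    dev.refcnt += 1
    req.dev = dev
    req.dev_tracker = dev.tracker.alloc()


def req_dev_put(req):
    """Release the reference and its tracker entry together."""
    if req.dev is None:
        return
    req.dev.tracker.free(req.dev_tracker)
    req.dev.refcnt -= 1
    req.dev = None
    req.dev_tracker = None
```

Keeping the release in one helper means callers cannot drop the count while forgetting the tracker entry.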
      
      Fixes: e4b89540 ("netlink: add net device refcount tracker to struct ethnl_req_info")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Move goto action checks into tc_action goto post parse op · 35bb5242
      Roi Dayan authored
      Move goto action checks from parse nic/fdb funcs into the tc action
      infra goto post parse op.
      While moving this part also use NL_SET_ERR_MSG_MOD() instead of
      NL_SET_ERR_MSG().
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Move vlan action chunk into tc action vlan post parse op · c2208035
      Roi Dayan authored
      Move vlan prio tag rewrite handling into tc action infra vlan post parse op.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add post_parse() op to tc action infrastructure · dd5ab6d1
      Roi Dayan authored
      The post_parse() op should be called after the parse op has been called
      for all actions. An action's state may depend on other actions, so in
      the new op an action can fail the parse if its state is no longer
      valid.
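
The two-pass flow can be sketched as follows. This is a toy Python model, not the driver code: the handler classes, the `HANDLERS` table, `parse_flow()`, and in particular the rule that a decap needs a forward action are invented for illustration only.

```python
class FwdAct:
    def parse_action(self, state, action_id):
        state["fwd"] = True

    def post_parse(self, state):
        pass  # no dependency on other actions


class DecapAct:
    def parse_action(self, state, action_id):
        state["decap"] = True

    def post_parse(self, state):
        # Hypothetical cross-action rule: only checkable once every
        # action in the flow has been parsed.
        if not state.get("fwd"):
            raise ValueError("decap requires a forward action")


HANDLERS = {"fwd": FwdAct(), "decap": DecapAct()}


def parse_flow(action_ids):
    """Pass 1 parses every action; pass 2 lets each handler re-validate."""
    state = {}
    used = []
    for action_id in action_ids:
        handler = HANDLERS[action_id]
        handler.parse_action(state, action_id)
        used.append(handler)
    for handler in used:
        handler.post_parse(state)
    return state
```

Because validation runs in the second pass, the order of the actions in the flow does not matter: a decap listed before the forward action still passes.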
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Move sample attr allocation to tc_action sample parse op · 6bcba1bd
      Roi Dayan authored
      There is no reason to defer the kmalloc until after all other actions
      have been parsed. A failure can still occur later, before the rule is
      offloaded, so allocate the memory when parsing.
      The memory is released in mlx5e_flow_put(), which is also called
      on the error flow.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: TC action parsing loop · 8333d53e
      Roi Dayan authored
      Introduce a common function to implement the generic parsing loop.
      The same function can be used for parsing NIC and FDB (Switchdev mode) flows.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add redirect ingress to tc action infra · 922d69ed
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add sample and ptype to tc_action infra · 3929ff58
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add ct to tc action infra · 758bc134
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add mirred/redirect to tc action infra · ab3f3d5e
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add mpls push/pop to tc action infra · 163b766f
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add vlan push/pop/mangle to tc action infra · 8ee72638
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add pedit to tc action infra · e36db1ee
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add csum to tc action infra · 9ca1bb2c
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add tunnel encap/decap to tc action infra · c65686d7
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add goto to tc action infra · 67d62ee7
      Roi Dayan authored
      Add parsing support by implementing struct mlx5e_tc_act
      for this action.
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Add tc action infrastructure · fad54790
      Roi Dayan authored
      Add an infrastructure to help parsing tc actions in a generic way.
      
      Supporting an action parser means implementing struct mlx5e_tc_act
      for that action.
      
      The infrastructure makes it possible to parse tc actions generically,
      i.e. in parse_tc_nic_actions() and parse_tc_fdb_actions().
      To parse tc actions, a user needs to allocate a parse_state instance
      and pass it when iterating over the tc action parsers.
      If a parser doesn't exist, a user can treat the action as unsupported.
      
      To add an action parser a user needs to implement two callbacks.
      The can_offload() callback to quickly check if an action can be offloaded.
      The parse_action() callback to do actual parsing and prepare for offload.
      
      Add implementation for drop, trap, mark and accept action parsers with this
      commit to act as examples and implement usage of the new infrastructure for
      those actions.
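
The shape of such an infrastructure can be sketched in a few lines. The following is a toy Python model under stated assumptions, not the mlx5e C code: the `ParseState` fields, the handler classes, the `TC_ACTS` table, and the mark-range check are hypothetical. Only the two callback names, can_offload() and parse_action(), and the idea of a shared parse state come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ParseState:
    """Per-flow scratch state shared by all parsers (fields are illustrative)."""
    offload_actions: list = field(default_factory=list)


class DropAct:
    def can_offload(self, state, act):
        return True

    def parse_action(self, state, act):
        state.offload_actions.append("drop")


class MarkAct:
    def can_offload(self, state, act):
        # Illustrative limit only; not the driver's real check.
        return 0 <= act.get("mark", 0) <= 0xFFFF

    def parse_action(self, state, act):
        state.offload_actions.append(("mark", act["mark"]))


# One handler per action id; a missing entry means "unsupported".
TC_ACTS = {"drop": DropAct(), "mark": MarkAct()}


def parse_tc_actions(actions):
    """Generic loop: look up the handler, check it, then parse the action."""
    state = ParseState()
    for act in actions:
        handler = TC_ACTS.get(act["id"])
        if handler is None:
            raise NotImplementedError(f"action {act['id']!r} is not supported")
        if not handler.can_offload(state, act):
            raise ValueError(f"action {act['id']!r} cannot be offloaded")
        handler.parse_action(state, act)
    return state
```

With a table like this, supporting a new action means adding one handler and one table entry, while the loop stays the same for every caller.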
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • Merge branch 'net-dsa-hellcreek-fix-handling-of-mgmt-protocols' · 6cf7a1ac
      Jakub Kicinski authored
      Kurt Kanzenbach says:
      
      ====================
      net: dsa: hellcreek: Fix handling of MGMT protocols
      
      This series fixes some minor issues with regard to management protocols
      such as PTP and STP in the hellcreek DSA driver. Configure static FDB
      entries for these protocols. The end result is:
      
      |root@tsn:~# mv88e6xxx_dump --atu
      |Using device <platform/ff240000.switch>
      |ATU:
      |FID  MAC               0123 Age OBT Pass Static Reprio Prio
      |   0 01:1b:19:00:00:00 1100   1               X       X    6
      |   1 01:00:5e:00:01:81 1100   1               X       X    6
      |   2 33:33:00:00:01:81 1100   1               X       X    6
      |   3 01:80:c2:00:00:0e 1100   1        X      X       X    6
      |   4 01:00:5e:00:00:6b 1100   1        X      X       X    6
      |   5 33:33:00:00:00:6b 1100   1        X      X       X    6
      |   6 01:80:c2:00:00:00 1100   1        X      X       X    6
      
      Previous version:
       * https://lore.kernel.org/r/20211213101810.121553-1-kurt@linutronix.de/
      
      Changes since v1:
       * Target net-next, as this never worked correctly and is not critical
       * Add STP and PTP over UDP rules
       * Use pass_blocked for PDelay messages only (Richard Cochran)
      ====================
      
      Link: https://lore.kernel.org/r/20211214134508.57806-1-kurt@linutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: hellcreek: Add missing PTP via UDP rules · 6cf01e45
      Kurt Kanzenbach authored
      The switch supports PTP for UDP transport too. Therefore, add the missing static
      FDB entries to ensure correct forwarding of these packets.
      
      Fixes: ddd56dfe ("net: dsa: hellcreek: Add PTP clock support")
      Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: hellcreek: Allow PTP P2P measurements on blocked ports · cad1798d
      Kurt Kanzenbach authored
      Allow PTP peer delay measurements on ports blocked by STP. In case of topology
      changes, the PTP stack can directly start with the correct delays.
      
      Fixes: ddd56dfe ("net: dsa: hellcreek: Add PTP clock support")
      Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: hellcreek: Add STP forwarding rule · b7ade35e
      Kurt Kanzenbach authored
      Treat STP as management traffic. STP traffic is designated for the CPU port
      only. In addition, STP traffic has to pass blocked ports.
      
      Fixes: e4b27ebc ("net: dsa: Add DSA driver for Hirschmann Hellcreek switches")
      Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: hellcreek: Fix insertion of static FDB entries · 4db4c3ea
      Kurt Kanzenbach authored
      The insertion of static FDB entries ignores the pass_blocked bit. That bit is
      evaluated with regard to STP. Add the missing functionality.
      
      Fixes: e4b27ebc ("net: dsa: Add DSA driver for Hirschmann Hellcreek switches")
      Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dev_replace_track() cleanup · 9280ac2e
      Eric Dumazet authored
      Use existing helpers (netdev_tracker_free()
      and netdev_tracker_alloc()) to remove ifdefery.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20211214151515.312535-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: linkwatch: be more careful about dev->linkwatch_dev_tracker · 123e495e
      Eric Dumazet authored
      Apparently a concurrent linkwatch_add_event() could
      run while we are in __linkwatch_run_queue().
      
      We need to free dev->linkwatch_dev_tracker tracker
      under lweventlist_lock protection to avoid this race.
      
      syzbot report:
      [   77.935949][ T3661] reference already released.
      [   77.941015][ T3661] allocated in:
      [   77.944482][ T3661]  linkwatch_fire_event+0x202/0x260
      [   77.950318][ T3661]  netif_carrier_on+0x9c/0x100
      [   77.955120][ T3661]  __ieee80211_sta_join_ibss+0xc52/0x1590
      [   77.960888][ T3661]  ieee80211_sta_create_ibss.cold+0xd2/0x11f
      [   77.966908][ T3661]  ieee80211_ibss_work.cold+0x30e/0x60f
      [   77.972483][ T3661]  ieee80211_iface_work+0xb70/0xd00
      [   77.977715][ T3661]  process_one_work+0x9ac/0x1680
      [   77.982671][ T3661]  worker_thread+0x652/0x11c0
      [   77.987371][ T3661]  kthread+0x405/0x4f0
      [   77.991465][ T3661]  ret_from_fork+0x1f/0x30
      [   77.995895][ T3661] freed in:
      [   77.999006][ T3661]  linkwatch_do_dev+0x96/0x160
      [   78.004014][ T3661]  __linkwatch_run_queue+0x233/0x6a0
      [   78.009496][ T3661]  linkwatch_event+0x4a/0x60
      [   78.014099][ T3661]  process_one_work+0x9ac/0x1680
      [   78.019034][ T3661]  worker_thread+0x652/0x11c0
      [   78.023719][ T3661]  kthread+0x405/0x4f0
      [   78.027810][ T3661]  ret_from_fork+0x1f/0x30
      [   78.042541][ T3661] ------------[ cut here ]------------
      [   78.048253][ T3661] WARNING: CPU: 0 PID: 3661 at lib/ref_tracker.c:120 ref_tracker_free.cold+0x110/0x14e
      [   78.062364][ T3661] Modules linked in:
      [   78.066424][ T3661] CPU: 0 PID: 3661 Comm: kworker/0:5 Not tainted 5.16.0-rc4-next-20211210-syzkaller #0
      [   78.076075][ T3661] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      [   78.090648][ T3661] Workqueue: events linkwatch_event
      [   78.095890][ T3661] RIP: 0010:ref_tracker_free.cold+0x110/0x14e
      [   78.102191][ T3661] Code: ea 03 48 c1 e0 2a 0f b6 04 02 84 c0 74 04 3c 03 7e 4c 8b 7b 18 e8 6b 54 e9 fa e8 26 4d 57 f8 4c 89 ee 48 89 ef e8 fb 33 36 00 <0f> 0b 41 bd ea ff ff ff e9 bd 60 e9 fa 4c 89 f7 e8 16 45 a2 f8 e9
      [   78.127211][ T3661] RSP: 0018:ffffc90002b5fb18 EFLAGS: 00010246
      [   78.133684][ T3661] RAX: 0000000000000000 RBX: ffff88807467f700 RCX: 0000000000000000
      [   78.141928][ T3661] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000001
      [   78.150087][ T3661] RBP: ffff888057e105b8 R08: 0000000000000001 R09: ffffffff8ffa1967
      [   78.158211][ T3661] R10: 0000000000000001 R11: 0000000000000000 R12: 1ffff9200056bf65
      [   78.166204][ T3661] R13: 0000000000000292 R14: ffff88807467f718 R15: 00000000c0e0008c
      [   78.174321][ T3661] FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      [   78.183310][ T3661] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   78.190156][ T3661] CR2: 000000c000208800 CR3: 000000007f7b5000 CR4: 00000000003506f0
      [   78.198235][ T3661] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   78.206214][ T3661] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   78.214328][ T3661] Call Trace:
      [   78.217679][ T3661]  <TASK>
      [   78.220621][ T3661]  ? __sanitizer_cov_trace_const_cmp4+0x1c/0x70
      [   78.226981][ T3661]  ? nlmsg_notify+0xbe/0x280
      [   78.231607][ T3661]  ? ref_tracker_dir_exit+0x330/0x330
      [   78.237654][ T3661]  ? linkwatch_do_dev+0x96/0x160
      [   78.242628][ T3661]  ? __linkwatch_run_queue+0x233/0x6a0
      [   78.248170][ T3661]  ? linkwatch_event+0x4a/0x60
      [   78.252946][ T3661]  ? process_one_work+0x9ac/0x1680
      [   78.258136][ T3661]  ? worker_thread+0x853/0x11c0
      [   78.263020][ T3661]  ? kthread+0x405/0x4f0
      [   78.267905][ T3661]  ? ret_from_fork+0x1f/0x30
      [   78.272670][ T3661]  ? netdev_state_change+0xa1/0x130
      [   78.278019][ T3661]  ? netdev_exit+0xd0/0xd0
      [   78.282466][ T3661]  ? dev_activate+0x420/0xa60
      [   78.287261][ T3661]  linkwatch_do_dev+0x96/0x160
      [   78.292043][ T3661]  __linkwatch_run_queue+0x233/0x6a0
      [   78.297505][ T3661]  ? linkwatch_do_dev+0x160/0x160
      [   78.302561][ T3661]  linkwatch_event+0x4a/0x60
      [   78.307225][ T3661]  process_one_work+0x9ac/0x1680
      [   78.312292][ T3661]  ? pwq_dec_nr_in_flight+0x2a0/0x2a0
      [   78.317757][ T3661]  ? rwlock_bug.part.0+0x90/0x90
      [   78.322726][ T3661]  ? _raw_spin_lock_irq+0x41/0x50
      [   78.327844][ T3661]  worker_thread+0x853/0x11c0
      [   78.332543][ T3661]  ? process_one_work+0x1680/0x1680
      [   78.338500][ T3661]  kthread+0x405/0x4f0
      [   78.342610][ T3661]  ? set_kthread_struct+0x130/0x130
      
      Fixes: 63f13937 ("net: linkwatch: add net device refcount tracker")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20211214051955.3569843-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mptcp: adjust to use netns refcount tracker · 1d2f3d3c
      Eric Dumazet authored
      MPTCP can change sk_net_refcnt after sock_create_kern() call.
      
      We need to change its corresponding get_net() to avoid
      a splat at release time, as in:
      
      refcount_t: decrement hit 0; leaking memory.
      WARNING: CPU: 0 PID: 3599 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31
      Modules linked in:
      CPU: 1 PID: 3599 Comm: syz-fuzzer Not tainted 5.16.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31
      Code: 1d b1 99 a1 09 31 ff 89 de e8 5d 3a 9c fd 84 db 75 e0 e8 74 36 9c fd 48 c7 c7 60 00 05 8a c6 05 91 99 a1 09 01 e8 cc 4b 27 05 <0f> 0b eb c4 e8 58 36 9c fd 0f b6 1d 80 99 a1 09 31 ff 89 de e8 28
      RSP: 0018:ffffc90001f5fab0 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: ffff888021873a00 RSI: ffffffff815f1e28 RDI: fffff520003ebf48
      RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff815ebbce R11: 0000000000000000 R12: 1ffff920003ebf5b
      R13: 00000000ffffffef R14: ffffffff8d2fcd94 R15: ffffc90001f5fd10
      FS:  000000c00008a090(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f0a5b59e300 CR3: 000000001cbe6000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       __refcount_dec include/linux/refcount.h:344 [inline]
       refcount_dec include/linux/refcount.h:359 [inline]
       ref_tracker_free+0x4fe/0x610 lib/ref_tracker.c:101
       netns_tracker_free include/net/net_namespace.h:327 [inline]
       put_net_track include/net/net_namespace.h:341 [inline]
       __sk_destruct+0x4a6/0x920 net/core/sock.c:2042
       sk_destruct+0xbd/0xe0 net/core/sock.c:2058
       __sk_free+0xef/0x3d0 net/core/sock.c:2069
       sk_free+0x78/0xa0 net/core/sock.c:2080
       sock_put include/net/sock.h:1911 [inline]
       __mptcp_close_ssk+0x435/0x590 net/mptcp/protocol.c:2276
       __mptcp_destroy_sock+0x35f/0x830 net/mptcp/protocol.c:2702
       mptcp_close+0x5f8/0x7f0 net/mptcp/protocol.c:2750
       inet_release+0x12e/0x280 net/ipv4/af_inet.c:428
       inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:476
       __sock_release+0xcd/0x280 net/socket.c:649
       sock_close+0x18/0x20 net/socket.c:1314
       __fput+0x286/0x9f0 fs/file_table.c:280
       task_work_run+0xdd/0x1a0 kernel/task_work.c:164
       tracehook_notify_resume include/linux/tracehook.h:189 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
       exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:207
       __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
       syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
       do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: ffa84b5f ("net: add netns refcount tracker to struct sock")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Cc: Florian Westphal <fw@strlen.de>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Link: https://lore.kernel.org/r/20211214043208.3543046-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: use GFP_ATOMIC in rt6_probe() · 8b40a9d5
      Eric Dumazet authored
      syzbot reminded me that rt6_probe() can run from
      atomic contexts.
      
      stack backtrace:
      
      CPU: 1 PID: 7461 Comm: syz-executor.2 Not tainted 5.16.0-rc4-next-20211210-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_usage_bug kernel/locking/lockdep.c:203 [inline]
       valid_state kernel/locking/lockdep.c:3945 [inline]
       mark_lock_irq kernel/locking/lockdep.c:4148 [inline]
       mark_lock.cold+0x61/0x8e kernel/locking/lockdep.c:4605
       mark_usage kernel/locking/lockdep.c:4500 [inline]
       __lock_acquire+0x11d5/0x54a0 kernel/locking/lockdep.c:4981
       lock_acquire kernel/locking/lockdep.c:5639 [inline]
       lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5604
       __fs_reclaim_acquire mm/page_alloc.c:4550 [inline]
       fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4564
       might_alloc include/linux/sched/mm.h:253 [inline]
       slab_pre_alloc_hook mm/slab.h:739 [inline]
       slab_alloc_node mm/slub.c:3145 [inline]
       slab_alloc mm/slub.c:3239 [inline]
       kmem_cache_alloc_trace+0x3b/0x2c0 mm/slub.c:3256
       kmalloc include/linux/slab.h:581 [inline]
       kzalloc include/linux/slab.h:715 [inline]
       ref_tracker_alloc+0xe1/0x430 lib/ref_tracker.c:74
       netdev_tracker_alloc include/linux/netdevice.h:3860 [inline]
       dev_hold_track include/linux/netdevice.h:3877 [inline]
       rt6_probe net/ipv6/route.c:661 [inline]
       find_match.part.0+0xac9/0xd00 net/ipv6/route.c:752
       find_match net/ipv6/route.c:825 [inline]
       __find_rr_leaf+0x17f/0xd20 net/ipv6/route.c:826
       find_rr_leaf net/ipv6/route.c:847 [inline]
       rt6_select net/ipv6/route.c:891 [inline]
       fib6_table_lookup+0x649/0xa20 net/ipv6/route.c:2185
       ip6_pol_route+0x1c5/0x11e0 net/ipv6/route.c:2221
       pol_lookup_func include/net/ip6_fib.h:580 [inline]
       fib6_rule_lookup+0x52a/0x6f0 net/ipv6/fib6_rules.c:120
       ip6_route_output_flags_noref+0x2e2/0x380 net/ipv6/route.c:2629
       ip6_route_output_flags+0x72/0x320 net/ipv6/route.c:2642
       ip6_route_output include/net/ip6_route.h:98 [inline]
       ip6_dst_lookup_tail+0x5ab/0x1620 net/ipv6/ip6_output.c:1070
       ip6_dst_lookup_flow+0x8c/0x1d0 net/ipv6/ip6_output.c:1200
       geneve_get_v6_dst+0x46f/0x9a0 drivers/net/geneve.c:858
       geneve6_xmit_skb drivers/net/geneve.c:991 [inline]
       geneve_xmit+0x520/0x3530 drivers/net/geneve.c:1074
       __netdev_start_xmit include/linux/netdevice.h:4685 [inline]
       netdev_start_xmit include/linux/netdevice.h:4699 [inline]
       xmit_one net/core/dev.c:3473 [inline]
       dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3489
       __dev_queue_xmit+0x2983/0x3640 net/core/dev.c:4112
       neigh_resolve_output net/core/neighbour.c:1522 [inline]
       neigh_resolve_output+0x50e/0x820 net/core/neighbour.c:1502
       neigh_output include/net/neighbour.h:541 [inline]
       ip6_finish_output2+0x56e/0x14f0 net/ipv6/ip6_output.c:126
       __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
       __ip6_finish_output+0x61e/0xe80 net/ipv6/ip6_output.c:170
       ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224
       dst_output include/net/dst.h:451 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       ndisc_send_skb+0xa99/0x17f0 net/ipv6/ndisc.c:508
       ndisc_send_rs+0x12e/0x6f0 net/ipv6/ndisc.c:702
       addrconf_rs_timer+0x3f2/0x820 net/ipv6/addrconf.c:3898
       call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1421
       expire_timers kernel/time/timer.c:1466 [inline]
       __run_timers.part.0+0x675/0xa20 kernel/time/timer.c:1734
       __run_timers kernel/time/timer.c:1715 [inline]
       run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1747
       __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
       invoke_softirq kernel/softirq.c:432 [inline]
       __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
       irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
       sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
       </IRQ>
      
      Fixes: fb67510b ("ipv6: add net device refcount tracker to rt6_probe_deferred()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20211214025806.3456382-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 14 Dec, 2021 5 commits