1. 16 Nov, 2022 6 commits
  2. 15 Nov, 2022 10 commits
    • bridge: switchdev: Fix memory leaks when changing VLAN protocol · 9d45921e
      Ido Schimmel authored
      The bridge driver can offload VLANs to the underlying hardware either
      via switchdev or the 8021q driver. When the former is used, the VLAN is
      marked in the bridge driver with the 'BR_VLFLAG_ADDED_BY_SWITCHDEV'
      private flag.
      
      To avoid the memory leaks mentioned in the cited commit, the bridge
      driver will try to delete a VLAN via the 8021q driver if the VLAN is not
      marked with the previously mentioned flag.
      
      When the VLAN protocol of the bridge changes, switchdev drivers are
      notified via the 'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' attribute, but
      the 8021q driver is also called to add the existing VLANs with the new
      protocol and delete them with the old protocol.
      
      In case the VLANs were offloaded via switchdev, the above behavior is
      both redundant and buggy. Redundant because the VLANs are already
      programmed in hardware and drivers that support VLAN protocol change
      (currently only mlx5) change the protocol upon the switchdev attribute
      notification. Buggy because the 8021q driver is called despite these
      VLANs being marked with 'BR_VLFLAG_ADDED_BY_SWITCHDEV'. This leads to
      memory leaks [1] when the VLANs are deleted.
      
      Fix by not calling the 8021q driver for VLANs that were already
      programmed via switchdev.
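
The skip described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the bridge driver's code: `struct model_vlan`, `model_vlan_vid_add()` and `model_set_proto()` are hypothetical names, and only the flag name comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the bridge's private flag named in the commit message. */
#define BR_VLFLAG_ADDED_BY_SWITCHDEV (1u << 0)

/* Hypothetical, simplified model of a bridge VLAN entry. */
struct model_vlan {
    unsigned short vid;
    unsigned int priv_flags;
};

/* Counts calls that would reach the 8021q driver in this model. */
static int model_8021q_calls;

static void model_vlan_vid_add(unsigned short vid)
{
    (void)vid;
    model_8021q_calls++;
}

/*
 * Re-add existing VLANs under a new protocol, skipping VLANs that were
 * already programmed via switchdev.
 * Returns the number of VLANs handed to the modelled 8021q driver.
 */
static int model_set_proto(const struct model_vlan *vlans, size_t n)
{
    int handed = 0;

    for (size_t i = 0; i < n; i++) {
        if (vlans[i].priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
            continue; /* hardware already holds this VLAN */
        model_vlan_vid_add(vlans[i].vid);
        handed++;
    }
    return handed;
}
```

In this model, a VLAN that never goes through the 8021q add path has nothing to leak when it is later deleted, which is the property the fix relies on.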
      
      [1]
      unreferenced object 0xffff8881f6771200 (size 256):
        comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
        hex dump (first 32 bytes):
          00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000012819ac>] vlan_vid_add+0x437/0x750
          [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
          [<000000000632b56f>] br_changelink+0x3d6/0x13f0
          [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
          [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
          [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
          [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
          [<0000000010588814>] netlink_unicast+0x438/0x710
          [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
          [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
          [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
          [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
          [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
          [<000000004538b104>] do_syscall_64+0x3d/0x90
          [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Fixes: 27973793 ("net: bridge: Fix VLANs memory leak")
      Reported-by: Vlad Buslov <vladbu@nvidia.com>
      Tested-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20221114084509.860831-1-idosch@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'net-hns3-this-series-bugfix-for-the-hns3-ethernet-driver' · 598ab4b1
      Paolo Abeni authored
      Hao Lan says:
      
      ====================
      net: hns3: This series bugfix for the HNS3 ethernet driver.
      
      This series includes some bug fixes for the HNS3 ethernet driver.
      Patch 1# fix incorrect hw rss hash type of rx packet.
      Fixes: 79664077 ("net: hns3: support RXD advanced layout")
      Fixes: 232fc64b ("net: hns3: Add HW RSS hash information to RX skb")
      Fixes: ea485867 ("net: hns3: handle the BD info on the last BD of the packet")
      
      Patch 2# fix return value check bug of rx copybreak.
      Fixes: e74a726d ("net: hns3: refactor hns3_nic_reuse_page()")
      Fixes: 99f6b5fb ("net: hns3: use bounce buffer when rx page can not be reused")
      
      Patch 3# fix setting incorrect phy link ksettings for firmware in
      resetting process.
      Fixes: f5f2b3e4 ("net: hns3: add support for imp-controlled PHYs")
      Fixes: c5ef83cb ("net: hns3: fix for phy_addr error in hclge_mac_mdio_config")
      Fixes: 2312e050 ("net: hns3: Fix for deadlock problem occurring when unregistering ae_algo")
      ====================
      
      Link: https://lore.kernel.org/r/20221114082048.49450-1-lanhao@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: hns3: fix setting incorrect phy link ksettings for firmware in resetting process · 510d7b6a
      Guangbin Huang authored
      Currently, if the driver is in phy-imp mode (PHY controlled by the IMP
      firmware), it may set incorrect PHY link ksettings for the firmware
      during the reset process. This happens because the driver does not
      update the PHY link ksettings after the initialization process, and does
      not update the advertising when getting the PHY link ksettings from the
      firmware. So fix it.
      
      Fixes: f5f2b3e4 ("net: hns3: add support for imp-controlled PHYs")
      Fixes: c5ef83cb ("net: hns3: fix for phy_addr error in hclge_mac_mdio_config")
      Fixes: 2312e050 ("net: hns3: Fix for deadlock problem occurring when unregistering ae_algo")
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: Hao Lan <lanhao@huawei.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: hns3: fix return value check bug of rx copybreak · 29df7c69
      Jie Wang authored
      The refactoring of rx copybreak modified the original return logic, which
      makes this feature unavailable. This patch fixes the return logic of
      rx copybreak.
      
      Fixes: e74a726d ("net: hns3: refactor hns3_nic_reuse_page()")
      Fixes: 99f6b5fb ("net: hns3: use bounce buffer when rx page can not be reused")
      Signed-off-by: Jie Wang <wangjie125@huawei.com>
      Signed-off-by: Hao Lan <lanhao@huawei.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: hns3: fix incorrect hw rss hash type of rx packet · a56cad69
      Jian Shen authored
      Currently, the HNS3 driver reports the rss hash type
      of each packet based on the rss hash tuples set. It
      always reports PKT_HASH_TYPE_L4, without checking the
      type of the current packet. This is incorrect.
      Fix it by reporting the hash type based on the packet type.
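
The difference between the old and the fixed behavior can be sketched as follows. This is a simplified model, not the HNS3 driver's code: the `model_*` enums and functions are hypothetical, the hash levels are local stand-ins for the kernel's L2/L3/L4 hash types, and how the real driver classifies a packet is not shown here.

```c
/* Local stand-ins for the kernel's packet hash levels. */
enum model_hash_type {
    MODEL_HASH_TYPE_NONE,
    MODEL_HASH_TYPE_L2,
    MODEL_HASH_TYPE_L3,
    MODEL_HASH_TYPE_L4,
};

/* Hypothetical packet classes, assumed for this sketch only. */
enum model_pkt_class {
    MODEL_PKT_UNHASHED, /* no RSS hash computed */
    MODEL_PKT_IP_ONLY,  /* hashed on IP addresses only */
    MODEL_PKT_IP_L4,    /* hashed on addresses and L4 ports */
};

/* Old behavior as described: always report L4, whatever the packet is. */
static enum model_hash_type model_hash_type_old(enum model_pkt_class c)
{
    (void)c;
    return MODEL_HASH_TYPE_L4;
}

/* Fixed behavior in this model: report the level matching the packet. */
static enum model_hash_type model_hash_type_fixed(enum model_pkt_class c)
{
    switch (c) {
    case MODEL_PKT_IP_L4:
        return MODEL_HASH_TYPE_L4;
    case MODEL_PKT_IP_ONLY:
        return MODEL_HASH_TYPE_L3;
    default:
        return MODEL_HASH_TYPE_NONE;
    }
}
```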
      
      Fixes: 79664077 ("net: hns3: support RXD advanced layout")
      Fixes: 232fc64b ("net: hns3: Add HW RSS hash information to RX skb")
      Fixes: ea485867 ("net: hns3: handle the BD info on the last BD of the packet")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Hao Lan <lanhao@huawei.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: phy: marvell: add sleep time after enabling the loopback bit · 18c532e4
      Aminuddin Jamaluddin authored
      Sleep time is added to ensure the PHY is ready after the loopback
      bit is set. This prevents the PHY loopback test from failing.
      
      Fixes: 020a45af ("net: phy: marvell: add Marvell specific PHY loopback")
      Cc: <stable@vger.kernel.org> # 5.15.x
      Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
      Signed-off-by: Aminuddin Jamaluddin <aminuddin.jamaluddin@intel.com>
      Link: https://lore.kernel.org/r/20221114065302.10625-1-aminuddin.jamaluddin@intel.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ena: Fix error handling in ena_init() · d349e9be
      Yuan Can authored
      ena_init() does not destroy the workqueue created by
      create_singlethread_workqueue() when pci_register_driver() fails.
      Call destroy_workqueue() when pci_register_driver() fails to prevent the
      resource leak.
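
The error path can be sketched with a userspace model that counts live workqueues. All `model_*` names are hypothetical stand-ins for the kernel calls named above, and the failure is injected through a parameter for illustration only.

```c
#include <errno.h>
#include <stdbool.h>

/* Number of workqueues currently alive in this model. */
static int model_live_workqueues;

static void model_create_workqueue(void)
{
    model_live_workqueues++;
}

static void model_destroy_workqueue(void)
{
    model_live_workqueues--;
}

/* Stand-in for driver registration; fails on request. */
static int model_register_driver(bool fail)
{
    return fail ? -ENOMEM : 0;
}

/* Init with the fix: undo the workqueue creation if registration fails. */
static int model_init(bool register_fails)
{
    int ret;

    model_create_workqueue();
    ret = model_register_driver(register_fails);
    if (ret)
        model_destroy_workqueue(); /* without this, the workqueue leaks */
    return ret;
}
```

After a failed `model_init(true)` the counter is back at zero, while a successful `model_init(false)` leaves exactly one workqueue alive for the module's lifetime.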
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Yuan Can <yuancan@huawei.com>
      Acked-by: Shay Agroskin <shayagr@amazon.com>
      Link: https://lore.kernel.org/r/20221114025659.124726-1-yuancan@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • kcm: close race conditions on sk_receive_queue · 5121197e
      Cong Wang authored
      sk->sk_receive_queue is protected by skb queue lock, but for KCM
      sockets its RX path takes mux->rx_lock to protect more than just
      skb queue. However, kcm_recvmsg() still only grabs the skb queue
      lock, so race conditions still exist.
      
      We can teach kcm_recvmsg() to grab mux->rx_lock too but this would
      introduce a potential performance regression as struct kcm_mux can
      be shared by multiple KCM sockets.
      
      So we have to enforce the skb queue lock in requeue_rx_msgs() and handle
      the skb peek case carefully in kcm_wait_data(). Fortunately,
      skb_recv_datagram() already handles it nicely and is widely used by
      other sockets, so we can just switch to skb_recv_datagram() after
      getting rid of the unnecessary sock lock in kcm_recvmsg() and
      kcm_splice_read(). Side note: SOCK_DONE is not used by KCM sockets,
      so it is safe to get rid of this check too.
      
      I ran the original syzbot reproducer for 30 min without seeing any
      issue.
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Reported-by: syzbot+278279efdd2730dd14bf@syzkaller.appspotmail.com
      Reported-by: shaozhengchao <shaozhengchao@huawei.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20221114005119.597905-1-xiyou.wangcong@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ionic: Fix error handling in ionic_init_module() · 280c0f7c
      Yuan Can authored
      Creating the ionic debugfs directory can fail with the following
      log:
      
       [  415.799514] debugfs: Directory 'ionic' with parent '/' already present!
      
      The reason is that ionic_init_module() returns the result of
      ionic_bus_register_driver() directly without checking it. If
      ionic_bus_register_driver() fails, it returns without destroying the
      newly created debugfs directory, so the ionic debugfs directory can
      never be created later.
      
       ionic_init_module()
         ionic_debugfs_create() # create debugfs directory
         ionic_bus_register_driver()
           pci_register_driver()
             driver_register()
               bus_add_driver()
                 priv = kzalloc(...) # OOM happened
         # return without destroying debugfs directory
      
      Fix by removing debugfs when ionic_bus_register_driver() returns an error.
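
The "already present" symptom on a later load attempt can be reproduced in a small userspace model. The `model_*` names are hypothetical, the directory is reduced to a single flag, and the `fixed` parameter exists only to compare both behaviors side by side.

```c
#include <errno.h>
#include <stdbool.h>

/* Models whether the 'ionic' debugfs directory currently exists. */
static bool model_dir_present;

static int model_debugfs_create(void)
{
    if (model_dir_present)
        return -EEXIST; /* corresponds to "already present" in the log */
    model_dir_present = true;
    return 0;
}

static void model_debugfs_destroy(void)
{
    model_dir_present = false;
}

/*
 * One module-load attempt. 'fixed' selects whether the error path removes
 * the directory. The directory creation result is reported via *dir_err.
 */
static int model_init_module(bool register_fails, bool fixed, int *dir_err)
{
    int ret;

    *dir_err = model_debugfs_create();
    ret = register_fails ? -ENOMEM : 0; /* injected registration failure */
    if (ret && fixed)
        model_debugfs_destroy();
    return ret;
}
```

Without the cleanup, a failed attempt followed by a second attempt reports `-EEXIST` for the directory. With the cleanup, the second attempt creates the directory normally.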
      
      Fixes: fbfb8031 ("ionic: Add hardware init and device commands")
      Signed-off-by: Yuan Can <yuancan@huawei.com>
      Acked-by: Shannon Nelson <snelson@pensando.io>
      Link: https://lore.kernel.org/r/20221113092929.19161-1-yuancan@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mlxsw: Avoid warnings when not offloaded FDB entry with IPv6 is removed · 30f5312d
      Amit Cohen authored
      FDB entries that perform VXLAN encapsulation with an IPv6 underlay hold
      a reference on a resource - the KVDL entry where the IPv6 underlay
      destination IP is stored. For that, the driver maintains two hash tables:
      1. Maps IPv6 to KVDL index
      2. Maps {MAC, FID index} to IPv6 address
      
      When an FDB entry is removed, the second table is used to find the relevant
      IPv6 address and the first table is used to drop the reference count and
      free the index if it is not used anymore.
      
      In order for a packet to be forwarded to a single remote VTEP, FDB
      entries need to be configured at both the bridge and VXLAN devices' FDB
      tables. Both entries are squashed into one {MAC, VLAN/VNI} -> IP entry
      in the hardware. Therefore, in case one entry is removed, the entry will
      be removed from the hardware and the remaining entry will be unmarked
      with 'offload' flag since it is not offloaded anymore.
      
      For example, the two FDB entries should be added to allow packets to be
      forwarded via vx10:
      $ bridge fdb add dev vx10 aa:bb:cc:dd:ee:ff self static dst 2001:db8:5::1
      $ bridge fdb add dev vx10 aa:bb:cc:dd:ee:ff master static vlan 10
      
      When one entry is removed, the other one is not offloaded
      anymore. When the first entry (in the VXLAN FDB) is removed or is no longer
      offloaded, the two mappings in the IPv6 hash tables are removed.
      
      If the second entry is removed before the first one, unexpected
      warnings[1][2] will be shown in user space as a result of removing the
      first entry. The issue is that when an entry that is not offloaded is
      removed, the driver searches for the relevant entries in the hash tables,
      does not find them and therefore warns.
      
      Do not handle removal of VXLAN FDB entries that are not offloaded, as
      their mappings were already removed when the offload flag was removed.
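
The two tables and the skipped removal can be sketched as a userspace model. This is an illustration under assumptions, not the mlxsw implementation: fixed-size arrays replace the hash tables, addresses are kept as strings, the KVDL index is just the array slot, and every `model_*` name is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MODEL_MAX 8

/* Table 1 in this model: IPv6 address -> KVDL index, reference counted. */
struct model_ip_node {
    bool used;
    char ip[40];
    unsigned int kvdl_index;
    unsigned int refs;
};

/* Table 2 in this model: {MAC, FID index} -> IPv6 address. */
struct model_mac_node {
    bool used;
    char mac[18];
    unsigned short fid;
    char ip[40];
};

static struct model_ip_node model_ip_tbl[MODEL_MAX];
static struct model_mac_node model_mac_tbl[MODEL_MAX];
static int model_warnings; /* counts lookups that found no mapping */

static struct model_ip_node *model_ip_find(const char *ip)
{
    for (int i = 0; i < MODEL_MAX; i++)
        if (model_ip_tbl[i].used && strcmp(model_ip_tbl[i].ip, ip) == 0)
            return &model_ip_tbl[i];
    return NULL;
}

static struct model_mac_node *model_mac_find(const char *mac, unsigned short fid)
{
    for (int i = 0; i < MODEL_MAX; i++)
        if (model_mac_tbl[i].used && model_mac_tbl[i].fid == fid &&
            strcmp(model_mac_tbl[i].mac, mac) == 0)
            return &model_mac_tbl[i];
    return NULL;
}

/* Offload: take a reference on the address and record {MAC, FID} -> IPv6. */
static int model_map_add(const char *mac, unsigned short fid, const char *ip)
{
    struct model_ip_node *ipn = model_ip_find(ip);
    struct model_mac_node *mn = NULL;

    for (int i = 0; i < MODEL_MAX && !mn; i++)
        if (!model_mac_tbl[i].used)
            mn = &model_mac_tbl[i];
    for (int i = 0; i < MODEL_MAX && !ipn; i++)
        if (!model_ip_tbl[i].used)
            ipn = &model_ip_tbl[i];
    if (!mn || !ipn)
        return -1; /* tables full */
    if (!ipn->used) {
        ipn->used = true;
        ipn->refs = 0;
        ipn->kvdl_index = (unsigned int)(ipn - model_ip_tbl);
        snprintf(ipn->ip, sizeof(ipn->ip), "%s", ip);
    }
    ipn->refs++;
    mn->used = true;
    mn->fid = fid;
    snprintf(mn->mac, sizeof(mn->mac), "%s", mac);
    snprintf(mn->ip, sizeof(mn->ip), "%s", ip);
    return 0;
}

/* Drop both mappings; warn if the {MAC, FID} mapping is already gone. */
static void model_map_del(const char *mac, unsigned short fid)
{
    struct model_mac_node *mn = model_mac_find(mac, fid);
    struct model_ip_node *ipn;

    if (!mn) {
        model_warnings++;
        return;
    }
    ipn = model_ip_find(mn->ip);
    if (ipn && --ipn->refs == 0)
        ipn->used = false; /* last user: free the index */
    mn->used = false;
}

/* Removal handler with the fix: entries that are not offloaded are skipped. */
static void model_fdb_del(const char *mac, unsigned short fid, bool offloaded)
{
    if (!offloaded)
        return;
    model_map_del(mac, fid);
}
```

In this model, calling `model_map_del()` a second time for the same key increments the warning counter, which mirrors the situation in the traces below. Routing removals through `model_fdb_del()` with `offloaded` set to false avoids that second lookup.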
      
      [1]:
      WARNING: CPU: 1 PID: 239 at drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c:914 mlxsw_sp_nve_ipv6_addr_map_del+0x6b/0x80 [mlxsw_spectrum]
      ...
      Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
      Workqueue: mlxsw_core_ordered mlxsw_sp_switchdev_vxlan_fdb_event_work [mlxsw_spectrum]
      RIP: 0010:mlxsw_sp_nve_ipv6_addr_map_del+0x6b/0x80 [mlxsw_spectrum]
      ...
      Call Trace:
        <TASK>
        mlxsw_sp_port_fdb_tunnel_uc_op+0x6cf/0x7b0 [mlxsw_spectrum]
        mlxsw_sp_switchdev_vxlan_fdb_event_work+0x17c/0x420 [mlxsw_spectrum]
        ? finish_task_switch.isra.0+0x8c/0x290
        process_one_work+0x1cd/0x390
        worker_thread+0x48/0x3c0
        ? process_one_work+0x390/0x390
        kthread+0xe0/0x110
        ? kthread_complete_and_exit+0x20/0x20
        ret_from_fork+0x1f/0x30
        </TASK>
      
      [2]:
      WARNING: CPU: 0 PID: 239 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3035 mlxsw_sp_ipv6_addr_put+0x142/0x220 [mlxsw_spectrum]
      ...
      Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
      Workqueue: mlxsw_core_ordered mlxsw_sp_switchdev_vxlan_fdb_event_work [mlxsw_spectrum]
      RIP: 0010:mlxsw_sp_ipv6_addr_put+0x142/0x220 [mlxsw_spectrum]
      ...
      Call Trace:
        <TASK>
        ? mlxsw_sp_port_fdb_tun_uc_op6_sfd_write+0x5c1/0x610 [mlxsw_spectrum]
        mlxsw_sp_port_fdb_tunnel_uc_op+0x6ec/0x7b0 [mlxsw_spectrum]
        mlxsw_sp_switchdev_vxlan_fdb_event_work+0x17c/0x420 [mlxsw_spectrum]
        ? finish_task_switch.isra.0+0x8c/0x290
        process_one_work+0x1cd/0x390
        worker_thread+0x48/0x3c0
        ? process_one_work+0x390/0x390
        kthread+0xe0/0x110
        ? kthread_complete_and_exit+0x20/0x20
        ret_from_fork+0x1f/0x30
        </TASK>
      
      Fixes: 0860c764 ("mlxsw: spectrum_nve: Keep track of IPv6 addresses used by FDB entries")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Link: https://lore.kernel.org/r/c186de8cbd28e3eb661e06f31f7f2f2dff30020f.1668184350.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 14 Nov, 2022 12 commits
  4. 12 Nov, 2022 8 commits
  5. 11 Nov, 2022 4 commits