1. 12 Oct, 2023 10 commits
    • Shay Drory's avatar
      net/mlx5: E-switch, register event handler before arming the event · 7624e58a
      Shay Drory authored
      Currently, mlx5 is registering event handler for vport context change
      event some time after arming the event. this can lead to missing an
      event, which will result in wrong rules in the FDB.
      Hence, register the event handler before arming the event.
      
      This solution is valid since FW is sending vport context change event
      only on vports which SW armed, and SW arming the vport when enabling
      it, which is done after the FDB has been created.
      
      Fixes: 6933a937 ("net/mlx5: E-Switch, Use async events chain")
      Signed-off-by: default avatarShay Drory <shayd@nvidia.com>
      Reviewed-by: default avatarMark Bloch <mbloch@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      7624e58a
    • Shay Drory's avatar
      net/mlx5: Perform DMA operations in the right locations · 8698cb92
      Shay Drory authored
      The cited patch change mlx5 driver so that during probe DMA
      operations were performed before pci_enable_device(), and during
      teardown DMA operations were performed after pci_disable_device().
      DMA operations require PCI to be enabled. Hence, The above leads to
      the following oops in PPC systems[1].
      
      On s390x systems, as reported by Niklas Schnelle, this is a problem
      because mlx5_pci_init() is where the DMA and coherent mask is set but
      mlx5_cmd_init() already does a dma_alloc_coherent(). Thus a DMA
      allocation is done during probe before the correct mask is set. This
      causes probe to fail initialization of the cmdif SW structs on s390x
      after that is converted to the common dma-iommu code. This is because on
      s390x DMA addresses below 4 GiB are reserved on current machines and
      unlike the old s390x specific DMA API implementation common code
      enforces DMA masks.
      
      Fix it by performing the DMA operations during probe after
      pci_enable_device() and after the dma mask is set,
      and during teardown before pci_disable_device().
      
      [1]
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
      Modules linked in: xt_MASQUERADE nf_conntrack_netlink
      nfnetlink xfrm_user iptable_nat xt_addrtype xt_conntrack nf_nat
      nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 netconsole rpcsec_gss_krb5
      auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser ib_umad
      rdma_cm ib_ipoib iw_cm libiscsi scsi_transport_iscsi ib_cm ib_uverbs
      ib_core mlx5_core(-) ptp pps_core fuse vmx_crypto crc32c_vpmsum [last
      unloaded: mlx5_ib]
      CPU: 1 PID: 8937 Comm: modprobe Not tainted 6.5.0-rc3_for_upstream_min_debug_2023_07_31_16_02 #1
      Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
      NIP:  c000000000423388 LR: c0000000001e733c CTR: c0000000001e4720
      REGS: c0000000055636d0 TRAP: 0380   Not tainted (6.5.0-rc3_for_upstream_min_debug_2023_07_31_16_02)
      MSR:  8000000000009033  CR: 24008884  XER: 20040000
      CFAR: c0000000001e7338 IRQMASK: 0
      NIP [c000000000423388] __free_pages+0x28/0x160
      LR [c0000000001e733c] dma_direct_free+0xac/0x190
      Call Trace:
      [c000000005563970] [5deadbeef0000100] 0x5deadbeef0000100 (unreliable)
      [c0000000055639b0] [c0000000003d46cc] kfree+0x7c/0x150
      [c000000005563a40] [c0000000001e47c8] dma_free_attrs+0xa8/0x1a0
      [c000000005563aa0] [c008000000d0064c] mlx5_cmd_cleanup+0xa4/0x100 [mlx5_core]
      [c000000005563ad0] [c008000000cf629c] mlx5_mdev_uninit+0xf4/0x140 [mlx5_core]
      [c000000005563b00] [c008000000cf6448] remove_one+0x160/0x1d0 [mlx5_core]
      [c000000005563b40] [c000000000958540] pci_device_remove+0x60/0x110
      [c000000005563b80] [c000000000a35e80] device_remove+0x70/0xd0
      [c000000005563bb0] [c000000000a37a38] device_release_driver_internal+0x2a8/0x330
      [c000000005563c00] [c000000000a37b8c] driver_detach+0x8c/0x160
      [c000000005563c40] [c000000000a35350] bus_remove_driver+0x90/0x110
      [c000000005563c80] [c000000000a38948] driver_unregister+0x48/0x90
      [c000000005563cf0] [c000000000957e38] pci_unregister_driver+0x38/0x150
      [c000000005563d40] [c008000000eb6140] mlx5_cleanup+0x38/0x90 [mlx5_core]
      
      Fixes: 06cd555f ("net/mlx5: split mlx5_cmd_init() to probe and reload routines")
      Signed-off-by: default avatarShay Drory <shayd@nvidia.com>
      Reviewed-by: default avatarMoshe Shemesh <moshe@nvidia.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Reviewed-by: default avatarNiklas Schnelle <schnelle@linux.ibm.com>
      Tested-by: default avatarNiklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      8698cb92
    • Paolo Abeni's avatar
      Merge branch 'rswitch-fix-issues-on-specific-conditions' · b91e8403
      Paolo Abeni authored
      Yoshihiro Shimoda says:
      
      ====================
      rswitch: Fix issues on specific conditions
      
      This patch series fix some issues of rswitch driver on specific
      condtions.
      ====================
      
      Link: https://lore.kernel.org/r/20231010124858.183891-1-yoshihiro.shimoda.uh@renesas.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      b91e8403
    • Yoshihiro Shimoda's avatar
      rswitch: Fix imbalance phy_power_off() calling · 053f13f6
      Yoshihiro Shimoda authored
      The phy_power_off() should not be called if phy_power_on() failed.
      So, add a condition .power_count before calls phy_power_off().
      
      Fixes: 5cb63092 ("net: renesas: rswitch: Add phy_power_{on,off}() calling")
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      053f13f6
    • Yoshihiro Shimoda's avatar
      rswitch: Fix renesas_eth_sw_remove() implementation · 510b18cf
      Yoshihiro Shimoda authored
      Fix functions calling order and a condition in renesas_eth_sw_remove().
      Otherwise, kernel NULL pointer dereference happens from phy_stop() if
      a net device opens.
      
      Fixes: 3590918b ("net: ethernet: renesas: Add support for "Ethernet Switch"")
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      510b18cf
    • Ratheesh Kannoth's avatar
      octeontx2-pf: Fix page pool frag allocation warning · 50e49214
      Ratheesh Kannoth authored
      Since page pool param's "order" is set to 0, will result
      in below warn message if interface is configured with higher
      rx buffer size.
      
      Steps to reproduce the issue.
      1. devlink dev param set pci/0002:04:00.0 name receive_buffer_size \
         value 8196 cmode runtime
      2. ifconfig eth0 up
      
      [   19.901356] ------------[ cut here ]------------
      [   19.901361] WARNING: CPU: 11 PID: 12331 at net/core/page_pool.c:567 page_pool_alloc_frag+0x3c/0x230
      [   19.901449] pstate: 82401009 (Nzcv daif +PAN -UAO +TCO -DIT +SSBS BTYPE=--)
      [   19.901451] pc : page_pool_alloc_frag+0x3c/0x230
      [   19.901453] lr : __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf]
      [   19.901460] sp : ffff80000f66b970
      [   19.901461] x29: ffff80000f66b970 x28: 0000000000000000 x27: 0000000000000000
      [   19.901464] x26: ffff800000d15b68 x25: ffff000195b5c080 x24: ffff0002a5a32dc0
      [   19.901467] x23: ffff0001063c0878 x22: 0000000000000100 x21: 0000000000000000
      [   19.901469] x20: 0000000000000000 x19: ffff00016f781000 x18: 0000000000000000
      [   19.901472] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
      [   19.901474] x14: 0000000000000000 x13: ffff0005ffdc9c80 x12: 0000000000000000
      [   19.901477] x11: ffff800009119a38 x10: 4c6ef2e3ba300519 x9 : ffff800000d13844
      [   19.901479] x8 : ffff0002a5a33cc8 x7 : 0000000000000030 x6 : 0000000000000030
      [   19.901482] x5 : 0000000000000005 x4 : 0000000000000000 x3 : 0000000000000a20
      [   19.901484] x2 : 0000000000001080 x1 : ffff80000f66b9d4 x0 : 0000000000001000
      [   19.901487] Call trace:
      [   19.901488]  page_pool_alloc_frag+0x3c/0x230
      [   19.901490]  __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf]
      [   19.901494]  otx2_rq_aura_pool_init+0x1c4/0x240 [rvu_nicpf]
      [   19.901498]  otx2_open+0x228/0xa70 [rvu_nicpf]
      [   19.901501]  otx2vf_open+0x20/0xd0 [rvu_nicvf]
      [   19.901504]  __dev_open+0x114/0x1d0
      [   19.901507]  __dev_change_flags+0x194/0x210
      [   19.901510]  dev_change_flags+0x2c/0x70
      [   19.901512]  devinet_ioctl+0x3a4/0x6c4
      [   19.901515]  inet_ioctl+0x228/0x240
      [   19.901518]  sock_ioctl+0x2ac/0x480
      [   19.901522]  __arm64_sys_ioctl+0x564/0xe50
      [   19.901525]  invoke_syscall.constprop.0+0x58/0xf0
      [   19.901529]  do_el0_svc+0x58/0x150
      [   19.901531]  el0_svc+0x30/0x140
      [   19.901533]  el0t_64_sync_handler+0xe8/0x114
      [   19.901535]  el0t_64_sync+0x1a0/0x1a4
      [   19.901537] ---[ end trace 678c0bf660ad8116 ]---
      
      Fixes: b2e3406a ("octeontx2-pf: Add support for page pool")
      Signed-off-by: default avatarRatheesh Kannoth <rkannoth@marvell.com>
      Reviewed-by: default avatarYunsheng Lin <linyunsheng@huawei.com>
      Link: https://lore.kernel.org/r/20231010034842.3807816-1-rkannoth@marvell.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      50e49214
    • Jeremy Cline's avatar
      nfc: nci: assert requested protocol is valid · 354a6e70
      Jeremy Cline authored
      The protocol is used in a bit mask to determine if the protocol is
      supported. Assert the provided protocol is less than the maximum
      defined so it doesn't potentially perform a shift-out-of-bounds and
      provide a clearer error for undefined protocols vs unsupported ones.
      
      Fixes: 6a2968aa ("NFC: basic NCI protocol implementation")
      Reported-and-tested-by: syzbot+0839b78e119aae1fec78@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=0839b78e119aae1fec78Signed-off-by: default avatarJeremy Cline <jeremy@jcline.org>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20231009200054.82557-1-jeremy@jcline.orgSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      354a6e70
    • Kuniyuki Iwashima's avatar
      af_packet: Fix fortified memcpy() without flex array. · e2bca487
      Kuniyuki Iwashima authored
      Sergei Trofimovich reported a regression [0] caused by commit a0ade840
      ("af_packet: Fix warning of fortified memcpy() in packet_getname().").
      
      It introduced a flex array sll_addr_flex in struct sockaddr_ll as a
      union-ed member with sll_addr to work around the fortified memcpy() check.
      
      However, a userspace program uses a struct that has struct sockaddr_ll in
      the middle, where a flex array is illegal to exist.
      
        include/linux/if_packet.h:24:17: error: flexible array member 'sockaddr_ll::<unnamed union>::<unnamed struct>::sll_addr_flex' not at end of 'struct packet_info_t'
           24 |                 __DECLARE_FLEX_ARRAY(unsigned char, sll_addr_flex);
              |                 ^~~~~~~~~~~~~~~~~~~~
      
      To fix the regression, let's go back to the first attempt [1] telling
      memcpy() the actual size of the array.
      Reported-by: default avatarSergei Trofimovich <slyich@gmail.com>
      Closes: https://github.com/NixOS/nixpkgs/pull/252587#issuecomment-1741733002 [0]
      Link: https://lore.kernel.org/netdev/20230720004410.87588-3-kuniyu@amazon.com/ [1]
      Fixes: a0ade840 ("af_packet: Fix warning of fortified memcpy() in packet_getname().")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20231009153151.75688-1-kuniyu@amazon.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      e2bca487
    • Jakub Kicinski's avatar
      net: tcp: fix crashes trying to free half-baked MTU probes · 71c299c7
      Jakub Kicinski authored
      tcp_stream_alloc_skb() initializes the skb to use tcp_tsorted_anchor
      which is a union with the destructor. We need to clean that
      TCP-iness up before freeing.
      
      Fixes: 73601329 ("tcp: let tcp_mtu_probe() build headless packets")
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20231010173651.3990234-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      71c299c7
    • Jakub Kicinski's avatar
      Merge tag 'ieee802154-for-net-2023-10-10' of... · 8bcfc9de
      Jakub Kicinski authored
      Merge tag 'ieee802154-for-net-2023-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan
      
      Stefan Schmidt says:
      
      ====================
      pull-request: ieee802154 for net 2023-10-10
      
      Just one small fix this time around.
      
      Dinghao Liu fixed a potential use-after-free in the ca8210 driver probe
      function.
      
      * tag 'ieee802154-for-net-2023-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan:
        ieee802154: ca8210: Fix a potential UAF in ca8210_probe
      ====================
      
      Link: https://lore.kernel.org/r/20231010200943.82225-1-stefan@datenfreihafen.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8bcfc9de
  2. 11 Oct, 2023 9 commits
  3. 10 Oct, 2023 11 commits
  4. 09 Oct, 2023 5 commits
  5. 08 Oct, 2023 3 commits
  6. 07 Oct, 2023 2 commits
    • Dinghao Liu's avatar
      ieee802154: ca8210: Fix a potential UAF in ca8210_probe · f990874b
      Dinghao Liu authored
      If of_clk_add_provider() fails in ca8210_register_ext_clock(),
      it calls clk_unregister() to release priv->clk and returns an
      error. However, the caller ca8210_probe() then calls ca8210_remove(),
      where priv->clk is freed again in ca8210_unregister_ext_clock(). In
      this case, a use-after-free may happen in the second time we call
      clk_unregister().
      
      Fix this by removing the first clk_unregister(). Also, priv->clk could
      be an error code on failure of clk_register_fixed_rate(). Use
      IS_ERR_OR_NULL to catch this case in ca8210_unregister_ext_clock().
      
      Fixes: ded845a7 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver")
      Signed-off-by: default avatarDinghao Liu <dinghao.liu@zju.edu.cn>
      Message-ID: <20231007033049.22353-1-dinghao.liu@zju.edu.cn>
      Signed-off-by: default avatarStefan Schmidt <stefan@datenfreihafen.org>
      f990874b
    • Daniel Borkmann's avatar
      selftests/bpf: Make seen_tc* variable tests more robust · 37345b85
      Daniel Borkmann authored
      Martin reported that on his local dev machine the test_tc_chain_mixed() fails as
      "test_tc_chain_mixed:FAIL:seen_tc5 unexpected seen_tc5: actual 1 != expected 0"
      and others occasionally, too.
      
      However, when running in a more isolated setup (qemu in particular), it works fine
      for him. The reason is that there is a small race-window where seen_tc* could turn
      into true for various test cases when there is background traffic, e.g. after the
      asserts they often get reset. In such case when subsequent detach takes place,
      unrelated background traffic could have already flipped the bool to true beforehand.
      
      Add a small helper tc_skel_reset_all_seen() to reset all bools before we do the ping
      test. At this point, everything is set up as expected and therefore no race can occur.
      All tc_{opts,links} tests continue to pass after this change.
      Reported-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-7-daniel@iogearbox.netSigned-off-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      37345b85