1. 12 Oct, 2023 13 commits
    • Dragos Tatulea's avatar
      net/mlx5e: RX, Fix page_pool allocation failure recovery for striding rq · be43b748
      Dragos Tatulea authored
      When a page allocation fails during refill in mlx5e_post_rx_mpwqes, the
      page will be released again on the next refill call. This triggers the
      page_pool negative page fragment count warning below:
      
       [ 2436.447717] WARNING: CPU: 1 PID: 2419 at include/net/page_pool/helpers.h:130 mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core]
       ...
       [ 2436.447895] RIP: 0010:mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core]
       [ 2436.447991] Call Trace:
       [ 2436.447975]  mlx5e_post_rx_mpwqes+0x1d5/0xcf0 [mlx5_core]
       [ 2436.447994]  <IRQ>
       [ 2436.447996]  ? __warn+0x7d/0x120
       [ 2436.448009]  ? mlx5e_handle_rx_cqe_mpwrq+0x109/0x1d0 [mlx5_core]
       [ 2436.448002]  ? mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core]
       [ 2436.448044]  ? mlx5e_poll_rx_cq+0x87/0x6e0 [mlx5_core]
       [ 2436.448061]  ? report_bug+0x155/0x180
       [ 2436.448065]  ? handle_bug+0x36/0x70
       [ 2436.448067]  ? exc_invalid_op+0x13/0x60
       [ 2436.448070]  ? asm_exc_invalid_op+0x16/0x20
       [ 2436.448079]  mlx5e_napi_poll+0x122/0x6b0 [mlx5_core]
       [ 2436.448077]  ? mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core]
       [ 2436.448113]  ? generic_exec_single+0x35/0x100
       [ 2436.448117]  __napi_poll+0x25/0x1a0
       [ 2436.448120]  net_rx_action+0x28a/0x300
       [ 2436.448122]  __do_softirq+0xcd/0x279
       [ 2436.448126]  irq_exit_rcu+0x6a/0x90
       [ 2436.448128]  sysvec_apic_timer_interrupt+0x6e/0x90
       [ 2436.448130]  </IRQ>
      
      This patch fixes the striding rq case by setting the skip flag on all
      the wqe pages that were expected to have new pages allocated.
      
      Fixes: 4c2a1323 ("net/mlx5e: RX, Defer page release in striding rq for better recycling")
      Tested-by: default avatarChris Mason <clm@fb.com>
      Reported-by: default avatarChris Mason <clm@fb.com>
      Closes: https://lore.kernel.org/netdev/117FF31A-7BE0-4050-B2BB-E41F224FF72F@meta.comSigned-off-by: default avatarDragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      be43b748
    • Maher Sanalla's avatar
      net/mlx5: Handle fw tracer change ownership event based on MTRC · 92fd3963
      Maher Sanalla authored
      Currently, whenever fw issues a change ownership event, the PF that owns
      the fw tracer drops its ownership directly and the other PFs try to pick
      up the ownership via what MTRC register suggests.
      
      In some cases, driver releases the ownership of the tracer and reacquires
      it later on. Whenever the driver releases ownership of the tracer, fw
      issues a change ownership event. This event can be delayed and come after
      driver has reacquired ownership of the tracer. Thus the late event will
      trigger the tracer owner PF to release the ownership again and lead to a
      scenario where no PF is owning the tracer.
      
      To prevent the scenario described above, when handling a change
      ownership event, do not drop ownership of the tracer directly, instead
      read the fw MTRC register to retrieve the up-to-date owner of the tracer
      and set it accordingly in driver level.
      
      Fixes: f53aaa31 ("net/mlx5: FW tracer, implement tracer logic")
      Signed-off-by: default avatarMaher Sanalla <msanalla@nvidia.com>
      Reviewed-by: default avatarShay Drory <shayd@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      92fd3963
    • Vlad Buslov's avatar
      net/mlx5: Bridge, fix peer entry ageing in LAG mode · 7a3ce807
      Vlad Buslov authored
      With current implementation in single FDB LAG mode all packets are
      processed by eswitch 0 rules. As such, 'peer' FDB entries receive the
      packets for rules of other eswitches and are responsible for updating the
      main entry by sending SWITCHDEV_FDB_ADD_TO_BRIDGE notification from their
      background update wq task. However, this introduces a race condition when
      non-zero eswitch instance decides to delete a FDB entry, sends
      SWITCHDEV_FDB_DEL_TO_BRIDGE notification, but another eswitch's update task
      refreshes the same entry concurrently while its async delete work is still
      pending on the workque. In such case another SWITCHDEV_FDB_ADD_TO_BRIDGE
      event may be generated and entry will remain stuck in FDB marked as
      'offloaded' since no more SWITCHDEV_FDB_DEL_TO_BRIDGE notifications are
      sent for deleting the peer entries.
      
      Fix the issue by synchronously marking deleted entries with
      MLX5_ESW_BRIDGE_FLAG_DELETED flag and skipping them in background update
      job.
      Signed-off-by: default avatarVlad Buslov <vladbu@nvidia.com>
      Reviewed-by: default avatarJianbo Liu <jianbol@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      7a3ce807
    • Shay Drory's avatar
      net/mlx5: E-switch, register event handler before arming the event · 7624e58a
      Shay Drory authored
      Currently, mlx5 is registering event handler for vport context change
      event some time after arming the event. this can lead to missing an
      event, which will result in wrong rules in the FDB.
      Hence, register the event handler before arming the event.
      
      This solution is valid since FW is sending vport context change event
      only on vports which SW armed, and SW arming the vport when enabling
      it, which is done after the FDB has been created.
      
      Fixes: 6933a937 ("net/mlx5: E-Switch, Use async events chain")
      Signed-off-by: default avatarShay Drory <shayd@nvidia.com>
      Reviewed-by: default avatarMark Bloch <mbloch@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      7624e58a
    • Shay Drory's avatar
      net/mlx5: Perform DMA operations in the right locations · 8698cb92
      Shay Drory authored
      The cited patch change mlx5 driver so that during probe DMA
      operations were performed before pci_enable_device(), and during
      teardown DMA operations were performed after pci_disable_device().
      DMA operations require PCI to be enabled. Hence, The above leads to
      the following oops in PPC systems[1].
      
      On s390x systems, as reported by Niklas Schnelle, this is a problem
      because mlx5_pci_init() is where the DMA and coherent mask is set but
      mlx5_cmd_init() already does a dma_alloc_coherent(). Thus a DMA
      allocation is done during probe before the correct mask is set. This
      causes probe to fail initialization of the cmdif SW structs on s390x
      after that is converted to the common dma-iommu code. This is because on
      s390x DMA addresses below 4 GiB are reserved on current machines and
      unlike the old s390x specific DMA API implementation common code
      enforces DMA masks.
      
      Fix it by performing the DMA operations during probe after
      pci_enable_device() and after the dma mask is set,
      and during teardown before pci_disable_device().
      
      [1]
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
      Modules linked in: xt_MASQUERADE nf_conntrack_netlink
      nfnetlink xfrm_user iptable_nat xt_addrtype xt_conntrack nf_nat
      nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 netconsole rpcsec_gss_krb5
      auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser ib_umad
      rdma_cm ib_ipoib iw_cm libiscsi scsi_transport_iscsi ib_cm ib_uverbs
      ib_core mlx5_core(-) ptp pps_core fuse vmx_crypto crc32c_vpmsum [last
      unloaded: mlx5_ib]
      CPU: 1 PID: 8937 Comm: modprobe Not tainted 6.5.0-rc3_for_upstream_min_debug_2023_07_31_16_02 #1
      Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
      NIP:  c000000000423388 LR: c0000000001e733c CTR: c0000000001e4720
      REGS: c0000000055636d0 TRAP: 0380   Not tainted (6.5.0-rc3_for_upstream_min_debug_2023_07_31_16_02)
      MSR:  8000000000009033  CR: 24008884  XER: 20040000
      CFAR: c0000000001e7338 IRQMASK: 0
      NIP [c000000000423388] __free_pages+0x28/0x160
      LR [c0000000001e733c] dma_direct_free+0xac/0x190
      Call Trace:
      [c000000005563970] [5deadbeef0000100] 0x5deadbeef0000100 (unreliable)
      [c0000000055639b0] [c0000000003d46cc] kfree+0x7c/0x150
      [c000000005563a40] [c0000000001e47c8] dma_free_attrs+0xa8/0x1a0
      [c000000005563aa0] [c008000000d0064c] mlx5_cmd_cleanup+0xa4/0x100 [mlx5_core]
      [c000000005563ad0] [c008000000cf629c] mlx5_mdev_uninit+0xf4/0x140 [mlx5_core]
      [c000000005563b00] [c008000000cf6448] remove_one+0x160/0x1d0 [mlx5_core]
      [c000000005563b40] [c000000000958540] pci_device_remove+0x60/0x110
      [c000000005563b80] [c000000000a35e80] device_remove+0x70/0xd0
      [c000000005563bb0] [c000000000a37a38] device_release_driver_internal+0x2a8/0x330
      [c000000005563c00] [c000000000a37b8c] driver_detach+0x8c/0x160
      [c000000005563c40] [c000000000a35350] bus_remove_driver+0x90/0x110
      [c000000005563c80] [c000000000a38948] driver_unregister+0x48/0x90
      [c000000005563cf0] [c000000000957e38] pci_unregister_driver+0x38/0x150
      [c000000005563d40] [c008000000eb6140] mlx5_cleanup+0x38/0x90 [mlx5_core]
      
      Fixes: 06cd555f ("net/mlx5: split mlx5_cmd_init() to probe and reload routines")
      Signed-off-by: default avatarShay Drory <shayd@nvidia.com>
      Reviewed-by: default avatarMoshe Shemesh <moshe@nvidia.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Reviewed-by: default avatarNiklas Schnelle <schnelle@linux.ibm.com>
      Tested-by: default avatarNiklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      8698cb92
    • Paolo Abeni's avatar
      Merge branch 'rswitch-fix-issues-on-specific-conditions' · b91e8403
      Paolo Abeni authored
      Yoshihiro Shimoda says:
      
      ====================
      rswitch: Fix issues on specific conditions
      
      This patch series fix some issues of rswitch driver on specific
      condtions.
      ====================
      
      Link: https://lore.kernel.org/r/20231010124858.183891-1-yoshihiro.shimoda.uh@renesas.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      b91e8403
    • Yoshihiro Shimoda's avatar
      rswitch: Fix imbalance phy_power_off() calling · 053f13f6
      Yoshihiro Shimoda authored
      The phy_power_off() should not be called if phy_power_on() failed.
      So, add a condition .power_count before calls phy_power_off().
      
      Fixes: 5cb63092 ("net: renesas: rswitch: Add phy_power_{on,off}() calling")
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      053f13f6
    • Yoshihiro Shimoda's avatar
      rswitch: Fix renesas_eth_sw_remove() implementation · 510b18cf
      Yoshihiro Shimoda authored
      Fix functions calling order and a condition in renesas_eth_sw_remove().
      Otherwise, kernel NULL pointer dereference happens from phy_stop() if
      a net device opens.
      
      Fixes: 3590918b ("net: ethernet: renesas: Add support for "Ethernet Switch"")
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      510b18cf
    • Ratheesh Kannoth's avatar
      octeontx2-pf: Fix page pool frag allocation warning · 50e49214
      Ratheesh Kannoth authored
      Since page pool param's "order" is set to 0, will result
      in below warn message if interface is configured with higher
      rx buffer size.
      
      Steps to reproduce the issue.
      1. devlink dev param set pci/0002:04:00.0 name receive_buffer_size \
         value 8196 cmode runtime
      2. ifconfig eth0 up
      
      [   19.901356] ------------[ cut here ]------------
      [   19.901361] WARNING: CPU: 11 PID: 12331 at net/core/page_pool.c:567 page_pool_alloc_frag+0x3c/0x230
      [   19.901449] pstate: 82401009 (Nzcv daif +PAN -UAO +TCO -DIT +SSBS BTYPE=--)
      [   19.901451] pc : page_pool_alloc_frag+0x3c/0x230
      [   19.901453] lr : __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf]
      [   19.901460] sp : ffff80000f66b970
      [   19.901461] x29: ffff80000f66b970 x28: 0000000000000000 x27: 0000000000000000
      [   19.901464] x26: ffff800000d15b68 x25: ffff000195b5c080 x24: ffff0002a5a32dc0
      [   19.901467] x23: ffff0001063c0878 x22: 0000000000000100 x21: 0000000000000000
      [   19.901469] x20: 0000000000000000 x19: ffff00016f781000 x18: 0000000000000000
      [   19.901472] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
      [   19.901474] x14: 0000000000000000 x13: ffff0005ffdc9c80 x12: 0000000000000000
      [   19.901477] x11: ffff800009119a38 x10: 4c6ef2e3ba300519 x9 : ffff800000d13844
      [   19.901479] x8 : ffff0002a5a33cc8 x7 : 0000000000000030 x6 : 0000000000000030
      [   19.901482] x5 : 0000000000000005 x4 : 0000000000000000 x3 : 0000000000000a20
      [   19.901484] x2 : 0000000000001080 x1 : ffff80000f66b9d4 x0 : 0000000000001000
      [   19.901487] Call trace:
      [   19.901488]  page_pool_alloc_frag+0x3c/0x230
      [   19.901490]  __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf]
      [   19.901494]  otx2_rq_aura_pool_init+0x1c4/0x240 [rvu_nicpf]
      [   19.901498]  otx2_open+0x228/0xa70 [rvu_nicpf]
      [   19.901501]  otx2vf_open+0x20/0xd0 [rvu_nicvf]
      [   19.901504]  __dev_open+0x114/0x1d0
      [   19.901507]  __dev_change_flags+0x194/0x210
      [   19.901510]  dev_change_flags+0x2c/0x70
      [   19.901512]  devinet_ioctl+0x3a4/0x6c4
      [   19.901515]  inet_ioctl+0x228/0x240
      [   19.901518]  sock_ioctl+0x2ac/0x480
      [   19.901522]  __arm64_sys_ioctl+0x564/0xe50
      [   19.901525]  invoke_syscall.constprop.0+0x58/0xf0
      [   19.901529]  do_el0_svc+0x58/0x150
      [   19.901531]  el0_svc+0x30/0x140
      [   19.901533]  el0t_64_sync_handler+0xe8/0x114
      [   19.901535]  el0t_64_sync+0x1a0/0x1a4
      [   19.901537] ---[ end trace 678c0bf660ad8116 ]---
      
      Fixes: b2e3406a ("octeontx2-pf: Add support for page pool")
      Signed-off-by: default avatarRatheesh Kannoth <rkannoth@marvell.com>
      Reviewed-by: default avatarYunsheng Lin <linyunsheng@huawei.com>
      Link: https://lore.kernel.org/r/20231010034842.3807816-1-rkannoth@marvell.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      50e49214
    • Jeremy Cline's avatar
      nfc: nci: assert requested protocol is valid · 354a6e70
      Jeremy Cline authored
      The protocol is used in a bit mask to determine if the protocol is
      supported. Assert the provided protocol is less than the maximum
      defined so it doesn't potentially perform a shift-out-of-bounds and
      provide a clearer error for undefined protocols vs unsupported ones.
      
      Fixes: 6a2968aa ("NFC: basic NCI protocol implementation")
      Reported-and-tested-by: syzbot+0839b78e119aae1fec78@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=0839b78e119aae1fec78Signed-off-by: default avatarJeremy Cline <jeremy@jcline.org>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20231009200054.82557-1-jeremy@jcline.orgSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      354a6e70
    • Kuniyuki Iwashima's avatar
      af_packet: Fix fortified memcpy() without flex array. · e2bca487
      Kuniyuki Iwashima authored
      Sergei Trofimovich reported a regression [0] caused by commit a0ade840
      ("af_packet: Fix warning of fortified memcpy() in packet_getname().").
      
      It introduced a flex array sll_addr_flex in struct sockaddr_ll as a
      union-ed member with sll_addr to work around the fortified memcpy() check.
      
      However, a userspace program uses a struct that has struct sockaddr_ll in
      the middle, where a flex array is illegal to exist.
      
        include/linux/if_packet.h:24:17: error: flexible array member 'sockaddr_ll::<unnamed union>::<unnamed struct>::sll_addr_flex' not at end of 'struct packet_info_t'
           24 |                 __DECLARE_FLEX_ARRAY(unsigned char, sll_addr_flex);
              |                 ^~~~~~~~~~~~~~~~~~~~
      
      To fix the regression, let's go back to the first attempt [1] telling
      memcpy() the actual size of the array.
      Reported-by: default avatarSergei Trofimovich <slyich@gmail.com>
      Closes: https://github.com/NixOS/nixpkgs/pull/252587#issuecomment-1741733002 [0]
      Link: https://lore.kernel.org/netdev/20230720004410.87588-3-kuniyu@amazon.com/ [1]
      Fixes: a0ade840 ("af_packet: Fix warning of fortified memcpy() in packet_getname().")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20231009153151.75688-1-kuniyu@amazon.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      e2bca487
    • Jakub Kicinski's avatar
      net: tcp: fix crashes trying to free half-baked MTU probes · 71c299c7
      Jakub Kicinski authored
      tcp_stream_alloc_skb() initializes the skb to use tcp_tsorted_anchor
      which is a union with the destructor. We need to clean that
      TCP-iness up before freeing.
      
      Fixes: 73601329 ("tcp: let tcp_mtu_probe() build headless packets")
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20231010173651.3990234-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      71c299c7
    • Jakub Kicinski's avatar
      Merge tag 'ieee802154-for-net-2023-10-10' of... · 8bcfc9de
      Jakub Kicinski authored
      Merge tag 'ieee802154-for-net-2023-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan
      
      Stefan Schmidt says:
      
      ====================
      pull-request: ieee802154 for net 2023-10-10
      
      Just one small fix this time around.
      
      Dinghao Liu fixed a potential use-after-free in the ca8210 driver probe
      function.
      
      * tag 'ieee802154-for-net-2023-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan:
        ieee802154: ca8210: Fix a potential UAF in ca8210_probe
      ====================
      
      Link: https://lore.kernel.org/r/20231010200943.82225-1-stefan@datenfreihafen.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8bcfc9de
  2. 11 Oct, 2023 9 commits
  3. 10 Oct, 2023 11 commits
  4. 09 Oct, 2023 5 commits
  5. 08 Oct, 2023 2 commits