1. 16 Nov, 2021 5 commits
    • net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove() · 76ded29d
      Valentine Fatiev authored
      Prior to this patch, if mlx5_core_destroy_cq() failed, it still proceeded
      to the rest of the destroy operations. mlx5_core_destroy_cq() could then be
      called again by the user, causing an additional call to mlx5_debug_cq_remove().
      cq->dbg was not nullified in the previous call, which caused the crash.
      
      Fix it by nullifying the cq->dbg pointer after removal.
      
      Also proceed to the destroy operations only if FW returns 0
      for the MLX5_CMD_OP_DESTROY_CQ command.
      
      general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI
      CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      RIP: 0010:lockref_get+0x1/0x60
      Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02
      00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17
      48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48
      RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206
      RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe
      RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058
      RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000
      R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0
      FS:  00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0
      Call Trace:
        simple_recursive_removal+0x33/0x2e0
        ? debugfs_remove+0x60/0x60
        debugfs_remove+0x40/0x60
        mlx5_debug_cq_remove+0x32/0x70 [mlx5_core]
        mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core]
        devx_obj_cleanup+0x151/0x330 [mlx5_ib]
        ? __pollwait+0xd0/0xd0
        ? xas_load+0x5/0x70
        ? xa_load+0x62/0xa0
        destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs]
        uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs]
        uobj_destroy+0x54/0xa0 [ib_uverbs]
        ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs]
        ? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs]
        ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs]
        __x64_sys_ioctl+0x3e4/0x8e0
      
      Fixes: 94b960b9 ("net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path")
      Signed-off-by: Valentine Fatiev <valentinef@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
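
The pattern described in this commit message can be illustrated with a minimal userspace sketch. It is not the driver code: `struct cq`, `debug_cq_remove()`, `fw_destroy_cq()` and `destroy_cq()` are hypothetical stand-ins, and the order of the calls is an assumption made for illustration. The sketch shows only the two ideas the message names: clear the pointer after removal so a repeated call is harmless, and continue teardown only when the firmware command returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures; not the real mlx5 types. */
struct dbg_entry { int id; };
struct cq {
    struct dbg_entry *dbg;
    int fw_fail_once;   /* makes the fake firmware command fail one time */
};

static int removals;    /* how many times the debug entry was actually freed */

/* Free the debug entry at most once: a repeated call sees NULL and returns. */
static void debug_cq_remove(struct cq *cq)
{
    if (!cq->dbg)
        return;
    free(cq->dbg);
    cq->dbg = NULL;     /* no dangling pointer is left for a later call */
    removals++;
}

/* Stand-in for the firmware destroy command. */
static int fw_destroy_cq(struct cq *cq)
{
    if (cq->fw_fail_once) {
        cq->fw_fail_once = 0;
        return -1;
    }
    return 0;
}

static int destroy_cq(struct cq *cq)
{
    int err;

    assert(cq != NULL);
    debug_cq_remove(cq);
    err = fw_destroy_cq(cq);
    if (err)
        return err;     /* stop here; the caller may retry later */
    /* the remaining teardown would run only after firmware success */
    return 0;
}

/* Returns 0 when a failed destroy followed by a retry frees the entry once. */
static int demo_retry(void)
{
    struct cq cq = { .dbg = malloc(sizeof(struct dbg_entry)), .fw_fail_once = 1 };
    int first, second;

    if (!cq.dbg)
        return 1;
    removals = 0;
    first = destroy_cq(&cq);
    second = destroy_cq(&cq);
    return (first == -1 && second == 0 && removals == 1 && cq.dbg == NULL) ? 0 : 1;
}
```

Calling `demo_retry()` exercises the retry path from the report: the first destroy fails in the fake firmware step, the second succeeds, and the debug entry is freed exactly once.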
    • net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdev · d7751d64
      Paul Blakey authored
      E-Switch encap mode is relevant only when in switchdev mode.
      The RDMA driver can query the encap configuration via
      mlx5_eswitch_get_encap_mode(). Make sure it returns the currently
      used mode and not the set one.
      
      This reverts the cited commit which reset the encap mode
      on entering switchdev and fixes the original issue properly.
      
      Fixes: 9a64144d ("net/mlx5: E-Switch, Fix default encap mode")
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
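
The getter behaviour described in this commit message, reporting the mode in use rather than the configured one, might look roughly like the sketch below. The enum and struct names are hypothetical and chosen for illustration; they are not the mlx5 or devlink definitions.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not the mlx5 or devlink definitions. */
enum esw_mode { ESW_MODE_LEGACY, ESW_MODE_SWITCHDEV };
enum encap_mode { ENCAP_MODE_NONE, ENCAP_MODE_BASIC };

struct eswitch {
    enum esw_mode mode;               /* current E-Switch mode */
    enum encap_mode configured_encap; /* value set by the user */
};

/* Report the encap mode actually in effect: the configured value only
 * matters in switchdev mode, otherwise encap is reported as disabled. */
static enum encap_mode get_encap_mode(const struct eswitch *esw)
{
    if (esw != NULL && esw->mode == ESW_MODE_SWITCHDEV)
        return esw->configured_encap;
    return ENCAP_MODE_NONE;
}

/* Returns 0 when the getter behaves as described above. */
static int demo_encap(void)
{
    struct eswitch esw = { ESW_MODE_LEGACY, ENCAP_MODE_BASIC };

    if (get_encap_mode(&esw) != ENCAP_MODE_NONE)
        return 1;   /* configured, but not in use outside switchdev */
    esw.mode = ESW_MODE_SWITCHDEV;
    if (get_encap_mode(&esw) != ENCAP_MODE_BASIC)
        return 2;   /* in switchdev the configured value is in use */
    if (get_encap_mode(NULL) != ENCAP_MODE_NONE)
        return 3;
    return 0;
}
```

With this shape the configured value never has to be reset on a mode change, because the getter already filters it by the current mode.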
    • net/mlx5e: Wait for concurrent flow deletion during neigh/fib events · 362980ea
      Vlad Buslov authored
      Function mlx5e_take_tmp_flow() skips flows with zero reference count. This
      can cause syndrome 0x179e84 when it is called from neigh or route update code
      and the skipped flow is not removed from the hardware by the time the
      underlying encap/decap resource is deleted. Add a new completion
      'del_hw_done' that is completed when the flow is unoffloaded. This is safe to
      do because a flow with reference count zero needs to be detached from the
      encap/decap entry before its memory is deallocated, which requires taking
      the encap_tbl_lock mutex that is held by the event handlers' code.
      
      Fixes: 8914add2 ("net/mlx5e: Handle FIB events to update tunnel endpoint device")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
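
The idea of waiting on a completion until a concurrently deleted flow has left the hardware can be modelled in userspace with POSIX threads. This is only a sketch under assumptions: `struct completion` here is a small hand-written stand-in built on a mutex and condition variable, not the kernel type, and `struct flow`, `deleter()` and `demo_wait()` are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a completion; not the kernel implementation. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool done;
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Mark the completion done and wake every waiter. */
static void complete_all(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Block until the completion has been marked done. */
static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Hypothetical flow: reference count already zero, rule still in hardware. */
struct flow {
    int refcnt;
    bool in_hw;
    struct completion del_hw_done;
};

/* Deleting thread: remove the rule from "hardware", then signal. */
static void *deleter(void *arg)
{
    struct flow *f = arg;

    f->in_hw = false;
    complete_all(&f->del_hw_done);
    return NULL;
}

/* Event-handler side: wait for the deletion instead of skipping the flow.
 * Returns 0 when the flow was out of hardware once the wait returned. */
static int demo_wait(void)
{
    struct flow f = { .refcnt = 0, .in_hw = true };
    pthread_t t;
    bool removed;

    init_completion(&f.del_hw_done);
    if (pthread_create(&t, NULL, deleter, &f) != 0)
        return -1;
    wait_for_completion(&f.del_hw_done);
    removed = !f.in_hw;  /* safe to release the shared resource now */
    pthread_join(t, NULL);
    return removed ? 0 : 1;
}
```

On toolchains where the thread library is not part of libc, the sketch needs to be built with `-pthread`.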
    • net/mlx5e: kTLS, Fix crash in RX resync flow · cc4a9cc0
      Tariq Toukan authored
      For the TLS RX resync flow, we maintain a list of TLS contexts
      that require some attention, to communicate their resync information
      to the HW.
      Here we fix list corruptions, by protecting the entries against
      movements coming from resync_handle_seq_match(), until their resync
      handling in napi is fully completed.
      
      Fixes: e9ce991b ("net/mlx5e: kTLS, Add resiliency to RX resync failures")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 848e5d66
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-11-15
      
      This series contains updates to iavf driver only.
      
      Mateusz adds a wait for reset completion when changing queue count which
      could otherwise cause issues with VF reset.
      
      Nick adds a null check for vf_res in iavf_fix_features(), corrects
      ordering of function calls to resolve dependency issues, and prevents
      possible freeing of a lock which isn't being held.
      
      Piotr fixes logic that did not allow setting all multicast mode without
      promiscuous mode.
      
      Jake prevents possible accidental freeing of filter structure.
      
      Mitch adds null checks for key and indir parameters in iavf_get_rxfh().
      
      Surabhi adds an additional check to stop the driver from printing a
      false error caused by values obtained while the VF is in reset.
      
      Grzegorz prevents a queue request of 0 which would cause queue count to
      reset to default values.
      
      Akeem restores VLAN filters when bringing the interface back up.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 15 Nov, 2021 29 commits
  3. 14 Nov, 2021 1 commit
    • net,lsm,selinux: revert the security_sctp_assoc_established() hook · 1aa3b220
      Paul Moore authored
      This patch reverts two prior patches, e7310c94
      ("security: implement sctp_assoc_established hook in selinux") and
      7c2ef024 ("security: add sctp_assoc_established hook"), which
      create the security_sctp_assoc_established() LSM hook and provide a
      SELinux implementation.  Unfortunately these two patches were merged
      without proper review (the Reviewed-by and Tested-by tags from
      Richard Haines were for previous revisions of these patches that
      were significantly different) and there are outstanding objections
      from the SELinux maintainers regarding these patches.
      
      Work is currently ongoing to correct the problems identified in the
      reverted patches, as well as others that have come up during review,
      but it is unclear at this point in time when that work will be ready
      for inclusion in the mainline kernel.  In the interest of not keeping
      objectionable code in the kernel for multiple weeks, and potentially
      a kernel release, we are reverting the two problematic patches.
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Nov, 2021 5 commits