1. 09 Feb, 2024 15 commits
    • Merge branch 'net-openvswitch-limit-the-recursions-from-action-sets' · 6a12401b
      Jakub Kicinski authored
      Aaron Conole says:
      
      ====================
      net: openvswitch: limit the recursions from action sets
      
      Open vSwitch module accepts actions as a list from the netlink socket
      and then creates a copy which it uses in the action set processing.
      During processing of the action list on a packet, the module keeps a
      count of the execution depth and exits processing if the action depth
      goes too high.
      
      However, during netlink processing the recursion depth isn't checked
      anywhere, and the copy trusts that the kernel has a large enough stack
      to accommodate it.  The OVS sample action was the original action which
      could perform this kind of recursion, and it originally checked that
      it didn't exceed the sample depth limit.  However, when sample became
      optimized to provide the clone() semantics, the recursion limit was
      dropped.
      
      This series adds a depth limit during the __ovs_nla_copy_actions() call
      that will ensure we don't exceed the max that the OVS userspace could
      generate for a clone().
      
      Additionally, this series provides a selftest in 2/2 that can be used to
      determine if the OVS module is allowing unbounded access.  It can be
      safely omitted where the ovs selftest framework isn't available.
      ====================
      
      Link: https://lore.kernel.org/r/20240207132416.1488485-1-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: openvswitch: Add validation for the recursion test · bd128f62
      Aaron Conole authored
      Add a test case into the netlink checks that will show the number of
      nested action recursions won't exceed 16.  Going to 17 on a small
      clone call isn't enough to exhaust the stack on (most) systems, so
      it should be safe to run even on systems that don't have the fix
      applied.
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240207132416.1488485-3-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: openvswitch: limit the number of recursions from action sets · 6e2f90d3
      Aaron Conole authored
      The ovs module allows for some actions to recursively contain an action
      list for complex scenarios, such as sampling, checking lengths, etc.
      When these actions are copied into the internal flow table, they are
      evaluated to validate that such actions make sense, and these calls
      happen recursively.
      
      The ovs-vswitchd userspace won't emit action lists nested more than 16
      levels deep.  However, the module has no such limit and will happily
      accept action lists nested more than 16 levels deep.  Prevent this by
      tracking the number of recursions happening and manually limiting it
      to 16 levels of nesting.
      
      The initial implementation of the sample action tracked this depth
      and prevented more than 3 levels of recursion, but the check was
      removed to support the clone use case rather than being raised to the
      current userspace limit.
      
      Fixes: 798c1661 ("openvswitch: Optimize sample action for the clone use cases")
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240207132416.1488485-2-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
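The depth check described in this commit can be modelled in user space. The following is a minimal Python sketch, not the kernel code: the names `copy_actions`, `nested_clone`, `ActionDepthError` and the tuple representation of actions are assumptions made for illustration, and only the limit of 16 comes from the commit message. The exact boundary used by the kernel may differ from this model.

```python
MAX_DEPTH = 16  # nesting limit quoted in the commit message


class ActionDepthError(ValueError):
    """Raised when an action list nests deeper than MAX_DEPTH."""


def copy_actions(actions, depth=0):
    """Validate and copy a nested action list, bounding the recursion.

    `actions` is a list of (name, nested) tuples; `nested` is None for a
    leaf action or another action list for clone/sample-style actions.
    """
    if depth > MAX_DEPTH:
        raise ActionDepthError(f"actions nested deeper than {MAX_DEPTH} levels")
    copied = []
    for name, nested in actions:
        if nested is None:
            copied.append((name, None))
        else:
            # Each nested list costs one more level of recursion.
            copied.append((name, copy_actions(nested, depth + 1)))
    return copied


def nested_clone(levels):
    """Build a leaf action wrapped in `levels` nested clone() actions."""
    actions = [("output", None)]
    for _ in range(levels):
        actions = [("clone", actions)]
    return actions
```

In this model `copy_actions(nested_clone(16))` succeeds while `copy_actions(nested_clone(17))` raises `ActionDepthError`, mirroring the 16 versus 17 levels that the selftest exercises.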
    • Merge branch 'selftests-forwarding-various-fixes' · d02bfae3
      Jakub Kicinski authored
      Ido Schimmel says:
      
      ====================
      selftests: forwarding: Various fixes
      
      Fix various problems in the forwarding selftests so that they will pass
      in the netdev CI instead of being ignored. See commit messages for
      details.
      ====================
      
      Link: https://lore.kernel.org/r/20240208155529.1199729-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Fix bridge locked port test flakiness · f97f1fcc
      Ido Schimmel authored
      The redirection test case fails in the netdev CI on debug kernels
      because an FDB entry is learned despite the presence of a tc filter that
      redirects incoming traffic [1].
      
      I am unable to reproduce the failure locally, but I can see how it can
      happen given that learning is first enabled and only then the ingress tc
      filter is configured. On debug kernels the time window between these two
      operations is longer compared to regular kernels, allowing random
      packets to be transmitted and trigger learning.
      
      Fix by reversing the order and configuring the ingress tc filter before
      enabling learning.
      
      [1]
      [...]
       # TEST: Locked port MAB redirect                                      [FAIL]
       # Locked entry created for redirected traffic
      
      Fixes: 38c43a1c ("selftests: forwarding: Add test case for traffic redirection from a locked port")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240208155529.1199729-5-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Suppress grep warnings · dd6b3458
      Ido Schimmel authored
      Suppress the following grep warnings:
      
      [...]
      INFO: # Port group entries configuration tests - (*, G)
      TEST: Common port group entries configuration tests (IPv4 (*, G))   [ OK ]
      TEST: Common port group entries configuration tests (IPv6 (*, G))   [ OK ]
      grep: warning: stray \ before /
      grep: warning: stray \ before /
      grep: warning: stray \ before /
      TEST: IPv4 (*, G) port group entries configuration tests            [ OK ]
      grep: warning: stray \ before /
      grep: warning: stray \ before /
      grep: warning: stray \ before /
      TEST: IPv6 (*, G) port group entries configuration tests            [ OK ]
      [...]
      
      They do not fail the test, but do clutter the output.
      
      Fixes: b6d00da0 ("selftests: forwarding: Add bridge MDB test")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240208155529.1199729-4-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Fix bridge MDB test flakiness · 7399e2ce
      Ido Schimmel authored
      After enabling a multicast querier on the bridge (as the test does),
      the bridge waits for the Max Response Delay before starting to forward
      according to its MDB, in order to give Membership Reports enough time
      to be received and processed.
      
      Currently, the test waits for exactly the default Max Response Delay
      (10 seconds), which is racy and leads to failures [1].
      
      Fix by reducing the Max Response Delay to 1 second.
      
      [1]
       [...]
       # TEST: IPv4 host entries forwarding tests                            [FAIL]
       # Packet locally received after flood
      
      Fixes: b6d00da0 ("selftests: forwarding: Add bridge MDB test")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240208155529.1199729-3-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Fix layer 2 miss test flakiness · 93590849
      Ido Schimmel authored
      After enabling a multicast querier on the bridge (as the test does),
      the bridge waits for the Max Response Delay before starting to forward
      according to its MDB, in order to give Membership Reports enough time
      to be received and processed.
      
      Currently, the test waits for exactly the default Max Response Delay
      (10 seconds), which is racy and leads to failures [1].
      
      Fix by reducing the Max Response Delay to 1 second.
      
      [1]
       [...]
       # TEST: L2 miss - Multicast (IPv4)                                    [FAIL]
       # Unregistered multicast filter was hit after adding MDB entry
      
      Fixes: 8c33266a ("selftests: forwarding: Add layer 2 miss test cases")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240208155529.1199729-2-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: Fix bridge backup port test flakiness · 38ee0cb2
      Ido Schimmel authored
      The test toggles the carrier of a bridge port in order to test the
      bridge backup port feature.
      
      Due to the linkwatch delayed work, the carrier change is not always
      reflected fast enough in the bridge driver, so packets are not forwarded
      as the test expects, resulting in failures [1].
      
      Fix by busy waiting on the bridge port state until it changes to the
      desired state following the carrier change.
      
      [1]
       # Backup port
       # -----------
       [...]
       # TEST: swp1 carrier off                                              [ OK ]
       # TEST: No forwarding out of swp1                                     [FAIL]
       [  641.995910] br0: port 1(swp1) entered disabled state
       # TEST: No forwarding out of vx0                                      [ OK ]
      
      Fixes: b4084530 ("selftests: net: Add bridge backup port and backup nexthop ID test")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240208123110.1063930-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
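The wait-until-state-changes approach in this fix can be illustrated with a small polling helper. This is a hedged Python sketch rather than the selftest's actual shell code: `busywait`, its parameters, and the `FakePort` class are hypothetical names, and the port states are simplified to two strings.

```python
import time


def busywait(timeout_s, predicate, interval_s=0.01):
    """Poll `predicate` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


class FakePort:
    """Hypothetical port whose state lags behind the carrier change."""

    def __init__(self, polls_until_update):
        self._remaining = polls_until_update

    def state(self):
        # Report the stale state for a few polls, then the new one.
        if self._remaining > 0:
            self._remaining -= 1
            return "forwarding"
        return "disabled"
```

A test would call something like `busywait(1.0, lambda: port.state() == "disabled")` after toggling the carrier and only then check forwarding, instead of assuming the state change is immediate.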
    • selftests: net: add more missing kernel config · 02d9009f
      Paolo Abeni authored
      The reuseport_addr_any.sh is currently skipping DCCP tests and
      pmtu.sh is skipping all the FOU/GUE related cases: add the missing
      options.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/38d3ca7f909736c1aef56e6244d67c82a9bba6ff.1707326987.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • devlink: Fix command annotation documentation · 4ab18af4
      Parav Pandit authored
      The command example string is not read as a command.
      Fix the command annotation.
      
      Fixes: a8ce7b26 ("devlink: Expose port function commands to control migratable")
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240206161717.466653-1-parav@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY · 9b0ed890
      Magnus Karlsson authored
      Do not report the XDP capability NETDEV_XDP_ACT_XSK_ZEROCOPY as the
      bonding driver does not support XDP and AF_XDP in zero-copy mode even
      if the real NIC drivers do.
      
      Note that the driver used to report everything as supported before a
      device was bonded. Instead of just masking out the zero-copy support
      from this, have the driver report that no XDP feature is supported
      until a real device is bonded. This seems to be more truthful as it is
      the real drivers that decide what XDP features are supported.
      
      Fixes: cb9e6e58 ("bonding: add xdp_features support")
      Reported-by: Prashant Batra <prbatra.mail@gmail.com>
      Link: https://lore.kernel.org/all/CAJ8uoz2ieZCopgqTvQ9ZY6xQgTbujmC6XkMTamhp68O-h_-rLg@mail.gmail.com/T/
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20240207084737.20890-1-magnus.karlsson@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
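The reporting rule described in this commit can be sketched as a small function. This is an illustrative Python model only: the flag constants below are hypothetical values, not the kernel's, and combining the real devices' features by intersection is an assumption that the commit message does not spell out. What the message does state is that zero-copy is never reported and that nothing is reported until a real device is bonded.

```python
from functools import reduce

# Hypothetical feature bits for illustration; not the kernel's values.
XDP_BASIC = 1 << 0
XDP_REDIRECT = 1 << 1
XDP_XSK_ZEROCOPY = 1 << 2


def bond_xdp_features(slave_features):
    """Return the XDP features a bond reports for its real devices."""
    if not slave_features:
        # No real device bonded yet: report no XDP support at all.
        return 0
    # Assumption: only features common to every real device are reported.
    common = reduce(lambda a, b: a & b, slave_features)
    # Zero-copy AF_XDP is never reported by the bond itself.
    return common & ~XDP_XSK_ZEROCOPY
```

For example, two real devices that both advertise every bit above would yield `XDP_BASIC | XDP_REDIRECT` for the bond in this model.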
    • net/handshake: Fix handshake_req_destroy_test1 · 4e1d71ca
      Chuck Lever authored
      Recently, handshake_req_destroy_test1 started failing:
      
      Expected handshake_req_destroy_test == req, but
          handshake_req_destroy_test == 0000000000000000
          req == 0000000060f99b40
      not ok 11 req_destroy works
      
      This is because "sock_release(sock)" was replaced with "fput(filp)"
      to address a memory leak. Note that sock_release() is synchronous
      but fput() usually delays the final close and clean-up.
      
      The delay is not consequential in the other cases that were changed
      but handshake_req_destroy_test1 is testing that handshake_req_cancel()
      followed by closing the file actually does call the ->hp_destroy
      method. Thus the PTR_EQ test at the end has to be sure that the
      final close is complete before it checks the pointer.
      
      We cannot use a completion here because if ->hp_destroy is never
      called (ie, there is an API bug) then the test will hang.
      
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Closes: https://lore.kernel.org/netdev/ZcKDd1to4MPANCrn@tissot.1015granger.net/T/#mac5c6299f86799f1c71776f3a07f9c566c7c3c40
      Fixes: 4a0f07d7 ("net/handshake: Fix memory leak in __sock_create() and sock_alloc_file()")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Link: https://lore.kernel.org/r/170724699027.91401.7839730697326806733.stgit@oracle-102.nfsv4bat.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/mlx5: DPLL, Fix possible use after free after delayed work timer triggers · aa1eec2f
      Jiri Pirko authored
      I managed to hit the following use-after-free warning recently:
      
      [ 2169.711665] ==================================================================
      [ 2169.714009] BUG: KASAN: slab-use-after-free in __run_timers.part.0+0x179/0x4c0
      [ 2169.716293] Write of size 8 at addr ffff88812b326a70 by task swapper/4/0
      
      [ 2169.719022] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 6.8.0-rc2jiri+ #2
      [ 2169.720974] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [ 2169.722457] Call Trace:
      [ 2169.722756]  <IRQ>
      [ 2169.723024]  dump_stack_lvl+0x58/0xb0
      [ 2169.723417]  print_report+0xc5/0x630
      [ 2169.723807]  ? __virt_addr_valid+0x126/0x2b0
      [ 2169.724268]  kasan_report+0xbe/0xf0
      [ 2169.724667]  ? __run_timers.part.0+0x179/0x4c0
      [ 2169.725116]  ? __run_timers.part.0+0x179/0x4c0
      [ 2169.725570]  __run_timers.part.0+0x179/0x4c0
      [ 2169.726003]  ? call_timer_fn+0x320/0x320
      [ 2169.726404]  ? lock_downgrade+0x3a0/0x3a0
      [ 2169.726820]  ? kvm_clock_get_cycles+0x14/0x20
      [ 2169.727257]  ? ktime_get+0x92/0x150
      [ 2169.727630]  ? lapic_next_deadline+0x35/0x60
      [ 2169.728069]  run_timer_softirq+0x40/0x80
      [ 2169.728475]  __do_softirq+0x1a1/0x509
      [ 2169.728866]  irq_exit_rcu+0x95/0xc0
      [ 2169.729241]  sysvec_apic_timer_interrupt+0x6b/0x80
      [ 2169.729718]  </IRQ>
      [ 2169.729993]  <TASK>
      [ 2169.730259]  asm_sysvec_apic_timer_interrupt+0x16/0x20
      [ 2169.730755] RIP: 0010:default_idle+0x13/0x20
      [ 2169.731190] Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 8b 05 9a 7f 1f 02 85 c0 7e 07 0f 00 2d cf 69 43 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 c0 93 04 00
      [ 2169.732759] RSP: 0018:ffff888100dbfe10 EFLAGS: 00000242
      [ 2169.733264] RAX: 0000000000000001 RBX: ffff888100d9c200 RCX: ffffffff8241bd62
      [ 2169.733925] RDX: ffffed109a848b15 RSI: 0000000000000004 RDI: ffffffff8127ac55
      [ 2169.734566] RBP: 0000000000000004 R08: 0000000000000000 R09: ffffed109a848b14
      [ 2169.735200] R10: ffff8884d42458a3 R11: 000000000000ba7e R12: ffffffff83d7d3a0
      [ 2169.735835] R13: 1ffff110201b7fc6 R14: 0000000000000000 R15: ffff888100d9c200
      [ 2169.736478]  ? ct_kernel_exit.constprop.0+0xa2/0xc0
      [ 2169.736954]  ? do_idle+0x285/0x290
      [ 2169.737323]  default_idle_call+0x63/0x90
      [ 2169.737730]  do_idle+0x285/0x290
      [ 2169.738089]  ? arch_cpu_idle_exit+0x30/0x30
      [ 2169.738511]  ? mark_held_locks+0x1a/0x80
      [ 2169.738917]  ? lockdep_hardirqs_on_prepare+0x12e/0x200
      [ 2169.739417]  cpu_startup_entry+0x30/0x40
      [ 2169.739825]  start_secondary+0x19a/0x1c0
      [ 2169.740229]  ? set_cpu_sibling_map+0xbd0/0xbd0
      [ 2169.740673]  secondary_startup_64_no_verify+0x15d/0x16b
      [ 2169.741179]  </TASK>
      
      [ 2169.741686] Allocated by task 1098:
      [ 2169.742058]  kasan_save_stack+0x1c/0x40
      [ 2169.742456]  kasan_save_track+0x10/0x30
      [ 2169.742852]  __kasan_kmalloc+0x83/0x90
      [ 2169.743246]  mlx5_dpll_probe+0xf5/0x3c0 [mlx5_dpll]
      [ 2169.743730]  auxiliary_bus_probe+0x62/0xb0
      [ 2169.744148]  really_probe+0x127/0x590
      [ 2169.744534]  __driver_probe_device+0xd2/0x200
      [ 2169.744973]  device_driver_attach+0x6b/0xf0
      [ 2169.745402]  bind_store+0x90/0xe0
      [ 2169.745761]  kernfs_fop_write_iter+0x1df/0x2a0
      [ 2169.746210]  vfs_write+0x41f/0x790
      [ 2169.746579]  ksys_write+0xc7/0x160
      [ 2169.746947]  do_syscall_64+0x6f/0x140
      [ 2169.747333]  entry_SYSCALL_64_after_hwframe+0x46/0x4e
      
      [ 2169.748049] Freed by task 1220:
      [ 2169.748393]  kasan_save_stack+0x1c/0x40
      [ 2169.748789]  kasan_save_track+0x10/0x30
      [ 2169.749188]  kasan_save_free_info+0x3b/0x50
      [ 2169.749621]  poison_slab_object+0x106/0x180
      [ 2169.750044]  __kasan_slab_free+0x14/0x50
      [ 2169.750451]  kfree+0x118/0x330
      [ 2169.750792]  mlx5_dpll_remove+0xf5/0x110 [mlx5_dpll]
      [ 2169.751271]  auxiliary_bus_remove+0x2e/0x40
      [ 2169.751694]  device_release_driver_internal+0x24b/0x2e0
      [ 2169.752191]  unbind_store+0xa6/0xb0
      [ 2169.752563]  kernfs_fop_write_iter+0x1df/0x2a0
      [ 2169.753004]  vfs_write+0x41f/0x790
      [ 2169.753381]  ksys_write+0xc7/0x160
      [ 2169.753750]  do_syscall_64+0x6f/0x140
      [ 2169.754132]  entry_SYSCALL_64_after_hwframe+0x46/0x4e
      
      [ 2169.754847] Last potentially related work creation:
      [ 2169.755315]  kasan_save_stack+0x1c/0x40
      [ 2169.755709]  __kasan_record_aux_stack+0x9b/0xf0
      [ 2169.756165]  __queue_work+0x382/0x8f0
      [ 2169.756552]  call_timer_fn+0x126/0x320
      [ 2169.756941]  __run_timers.part.0+0x2ea/0x4c0
      [ 2169.757376]  run_timer_softirq+0x40/0x80
      [ 2169.757782]  __do_softirq+0x1a1/0x509
      
      [ 2169.758387] Second to last potentially related work creation:
      [ 2169.758924]  kasan_save_stack+0x1c/0x40
      [ 2169.759322]  __kasan_record_aux_stack+0x9b/0xf0
      [ 2169.759773]  __queue_work+0x382/0x8f0
      [ 2169.760156]  call_timer_fn+0x126/0x320
      [ 2169.760550]  __run_timers.part.0+0x2ea/0x4c0
      [ 2169.760978]  run_timer_softirq+0x40/0x80
      [ 2169.761381]  __do_softirq+0x1a1/0x509
      
      [ 2169.761998] The buggy address belongs to the object at ffff88812b326a00
                      which belongs to the cache kmalloc-256 of size 256
      [ 2169.763061] The buggy address is located 112 bytes inside of
                      freed 256-byte region [ffff88812b326a00, ffff88812b326b00)
      
      [ 2169.764346] The buggy address belongs to the physical page:
      [ 2169.764866] page:000000000f2b1e89 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12b324
      [ 2169.765731] head:000000000f2b1e89 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      [ 2169.766484] anon flags: 0x200000000000840(slab|head|node=0|zone=2)
      [ 2169.767048] page_type: 0xffffffff()
      [ 2169.767422] raw: 0200000000000840 ffff888100042b40 0000000000000000 dead000000000001
      [ 2169.768183] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
      [ 2169.768899] page dumped because: kasan: bad access detected
      
      [ 2169.769649] Memory state around the buggy address:
      [ 2169.770116]  ffff88812b326900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 2169.770805]  ffff88812b326980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 2169.771485] >ffff88812b326a00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 2169.772173]                                                              ^
      [ 2169.772787]  ffff88812b326a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 2169.773477]  ffff88812b326b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 2169.774160] ==================================================================
      [ 2169.774845] ==================================================================
      
      I didn't manage to reproduce it, though the cause seems obvious.
      There is a chance that mlx5_dpll_remove() calls cancel_delayed_work()
      while the work is running and the work manages to re-arm itself.
      In that case, when the delay timer next fires and tries to queue the
      work, it operates on freed memory.
      
      Fix this by using cancel_delayed_work_sync() instead, which makes sure
      that the work is done when it returns.
      
      Fixes: 496fd0a2 ("mlx5: Implement SyncE support using DPLL infrastructure")
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240206164328.360313-1-jiri@resnulli.us
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
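The difference a synchronous cancel makes for self-re-arming work can be modelled in user space. The Python sketch below is an analogy built on `threading.Timer`, not the kernel workqueue API: the class `SelfRearmingWork` and its methods are hypothetical, and the model only shows why teardown must both stop re-arming and wait for an in-flight callback before the owning object is freed.

```python
import threading


class SelfRearmingWork:
    """Periodic work that re-arms itself after every run."""

    def __init__(self, fn, delay_s):
        self._fn = fn
        self._delay_s = delay_s
        self._lock = threading.Lock()
        self._timer = None
        self._stopped = False
        self._idle = threading.Event()
        self._idle.set()

    def schedule(self):
        with self._lock:
            if self._stopped:
                return  # cancelled: never re-arm
            self._timer = threading.Timer(self._delay_s, self._run)
            self._timer.daemon = True
            self._timer.start()

    def _run(self):
        with self._lock:
            if self._stopped:
                return
            self._idle.clear()  # mark the callback as in flight
        try:
            self._fn()
            self.schedule()  # the work re-arms itself
        finally:
            self._idle.set()

    def cancel_sync(self):
        """Stop re-arming and wait until no callback is running."""
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()
        self._idle.wait()  # block until an in-flight callback finishes
```

Once `cancel_sync()` returns in this model, the callback is neither running nor queued, so the state it touches can be released safely. A cancel that only stopped the pending timer would leave a window in which a running callback re-arms itself.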
    • dpll: fix possible deadlock during netlink dump operation · 53c0441d
      Jiri Pirko authored
      Recently, I've been hitting the following deadlock warning during dpll
      pin dump:
      
      [52804.637962] ======================================================
      [52804.638536] WARNING: possible circular locking dependency detected
      [52804.639111] 6.8.0-rc2jiri+ #1 Not tainted
      [52804.639529] ------------------------------------------------------
      [52804.640104] python3/2984 is trying to acquire lock:
      [52804.640581] ffff88810e642678 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}, at: netlink_dump+0xb3/0x780
      [52804.641417]
                     but task is already holding lock:
      [52804.642010] ffffffff83bde4c8 (dpll_lock){+.+.}-{3:3}, at: dpll_lock_dumpit+0x13/0x20
      [52804.642747]
                     which lock already depends on the new lock.
      
      [52804.643551]
                     the existing dependency chain (in reverse order) is:
      [52804.644259]
                     -> #1 (dpll_lock){+.+.}-{3:3}:
      [52804.644836]        lock_acquire+0x174/0x3e0
      [52804.645271]        __mutex_lock+0x119/0x1150
      [52804.645723]        dpll_lock_dumpit+0x13/0x20
      [52804.646169]        genl_start+0x266/0x320
      [52804.646578]        __netlink_dump_start+0x321/0x450
      [52804.647056]        genl_family_rcv_msg_dumpit+0x155/0x1e0
      [52804.647575]        genl_rcv_msg+0x1ed/0x3b0
      [52804.648001]        netlink_rcv_skb+0xdc/0x210
      [52804.648440]        genl_rcv+0x24/0x40
      [52804.648831]        netlink_unicast+0x2f1/0x490
      [52804.649290]        netlink_sendmsg+0x36d/0x660
      [52804.649742]        __sock_sendmsg+0x73/0xc0
      [52804.650165]        __sys_sendto+0x184/0x210
      [52804.650597]        __x64_sys_sendto+0x72/0x80
      [52804.651045]        do_syscall_64+0x6f/0x140
      [52804.651474]        entry_SYSCALL_64_after_hwframe+0x46/0x4e
      [52804.652001]
                     -> #0 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}:
      [52804.652650]        check_prev_add+0x1ae/0x1280
      [52804.653107]        __lock_acquire+0x1ed3/0x29a0
      [52804.653559]        lock_acquire+0x174/0x3e0
      [52804.653984]        __mutex_lock+0x119/0x1150
      [52804.654423]        netlink_dump+0xb3/0x780
      [52804.654845]        __netlink_dump_start+0x389/0x450
      [52804.655321]        genl_family_rcv_msg_dumpit+0x155/0x1e0
      [52804.655842]        genl_rcv_msg+0x1ed/0x3b0
      [52804.656272]        netlink_rcv_skb+0xdc/0x210
      [52804.656721]        genl_rcv+0x24/0x40
      [52804.657119]        netlink_unicast+0x2f1/0x490
      [52804.657570]        netlink_sendmsg+0x36d/0x660
      [52804.658022]        __sock_sendmsg+0x73/0xc0
      [52804.658450]        __sys_sendto+0x184/0x210
      [52804.658877]        __x64_sys_sendto+0x72/0x80
      [52804.659322]        do_syscall_64+0x6f/0x140
      [52804.659752]        entry_SYSCALL_64_after_hwframe+0x46/0x4e
      [52804.660281]
                     other info that might help us debug this:
      
      [52804.661077]  Possible unsafe locking scenario:
      
      [52804.661671]        CPU0                    CPU1
      [52804.662129]        ----                    ----
      [52804.662577]   lock(dpll_lock);
      [52804.662924]                                lock(nlk_cb_mutex-GENERIC);
      [52804.663538]                                lock(dpll_lock);
      [52804.664073]   lock(nlk_cb_mutex-GENERIC);
      [52804.664490]
      
      The issue is as follows: __netlink_dump_start() calls control->start(cb)
      with nlk->cb_mutex held. In control->start(cb) the dpll_lock is taken.
      Then nlk->cb_mutex is released and taken again in netlink_dump(), while
      dpll_lock is still held. That leads to an ABBA deadlock when another
      CPU races with the same operation.
      
      Fix this by taking dpll_lock inside the dumpit() callback, which ensures
      the correct lock ordering.
      
      Fixes: 9d71b54b ("dpll: netlink: Add DPLL framework base functions")
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Link: https://lore.kernel.org/r/20240207115902.371649-1-jiri@resnulli.us
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 08 Feb, 2024 20 commits
    • Merge tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 1f719a2f
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from WiFi and netfilter.
      
        Current release - regressions:
      
         - nic: intel: fix old compiler regressions
      
         - netfilter: ipset: missing gc cancellations fixed
      
        Current release - new code bugs:
      
         - netfilter: ctnetlink: fix filtering for zone 0
      
        Previous releases - regressions:
      
         - core: fix from address in memcpy_to_iter_csum()
      
         - netfilter: nfnetlink_queue: un-break NF_REPEAT
      
         - af_unix: fix memory leak for dead unix_(sk)->oob_skb in GC.
      
         - devlink: avoid potential loop in devlink_rel_nested_in_notify_work()
      
         - iwlwifi:
             - mvm: fix a battery life regression
             - fix double-free bug
      
         - mac80211: fix waiting for beacons logic
      
         - nic: nfp: flower: prevent re-adding mac index for bonded port
      
        Previous releases - always broken:
      
         - rxrpc: fix generation of serial numbers to skip zero
      
         - tipc: check the bearer type before calling tipc_udp_nl_bearer_add()
      
         - tunnels: fix out of bounds access when building IPv6 PMTU error
      
         - nic: hv_netvsc: register VF in netvsc_probe if NET_DEVICE_REGISTER
           missed
      
         - nic: atlantic: fix DMA mapping for PTP hwts ring
      
        Misc:
      
         - selftests: more fixes to deal with very slow hosts"
      
      * tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
        netfilter: nft_set_pipapo: remove scratch_aligned pointer
        netfilter: nft_set_pipapo: add helper to release pcpu scratch area
        netfilter: nft_set_pipapo: store index in scratch maps
        netfilter: nft_set_rbtree: skip end interval element from gc
        netfilter: nfnetlink_queue: un-break NF_REPEAT
        netfilter: nf_tables: use timestamp to check for set element timeout
        netfilter: nft_ct: reject direction for ct id
        netfilter: ctnetlink: fix filtering for zone 0
        s390/qeth: Fix potential loss of L3-IP@ in case of network issues
        netfilter: ipset: Missing gc cancellations fixed
        octeontx2-af: Initialize maps.
        net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
        net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
        netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
        netfilter: nft_compat: restrict match/target protocol to u16
        netfilter: nft_compat: reject unused compat flag
        netfilter: nft_compat: narrow down revision to unsigned 8-bits
        net: intel: fix old compiler regressions
        MAINTAINERS: Maintainer change for rds
        selftests: cmsg_ipv6: repeat the exact packet
        ...
    • Merge tag 'pinctrl-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · b0d5d0f7
      Linus Torvalds authored
      Pull pinctrl fix from Linus Walleij:
       "A single fix for the AMD driver which affects developer laptops, the
        pinctrl/GPIO driver won't probe on some systems"
      
      * tag 'pinctrl-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: amd: Add IRQF_ONESHOT to the interrupt request
    • Merge tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 63e4b9d6
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Narrow down target/match revision to u8 in nft_compat.
      
      2) Bail out with unused flags in nft_compat.
      
      3) Restrict layer 4 protocol to u16 in nft_compat.
      
      4) Remove static in pipapo get command that slipped through when
         reducing set memory footprint.
      
      5) Follow up incremental fix for the ipset performance regression,
         this includes the missing gc cancellation, from Jozsef Kadlecsik.
      
      6) Allow to filter by zone 0 in ctnetlink, do not interpret zone 0
         as no filtering, from Felix Huettner.
      
      7) Reject direction for NFT_CT_ID.
      
      8) Use timestamp to check for set element expiration while transaction
         is handled to prevent garbage collection from removing set elements
         that were just added by this transaction. Packet path and netlink
         dump/get path still use current time to check for expiration.
      
      9) Restore NF_REPEAT in nfnetlink_queue, from Florian Westphal.
      
      10) map_index needs to be percpu and per-set, not just percpu.
          At this time it's possible for a pipapo set to fill the all-zero part
          with ones and take the 'might have bits set' as 'start-from-zero' area.
          From Florian Westphal. This includes three patches:
      
          - Change scratchpad area to a structure that provides space for a
            per-set-and-cpu toggle and uses it instead of the percpu one.
      
          - Add a new free helper to prepare for the next patch.
      
          - Remove the scratch_aligned pointer and make the AVX2 implementation
            use the exact same memory addresses for read/store of the matching
            state.
      
      netfilter pull request 24-02-08
      
      * tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nft_set_pipapo: remove scratch_aligned pointer
        netfilter: nft_set_pipapo: add helper to release pcpu scratch area
        netfilter: nft_set_pipapo: store index in scratch maps
        netfilter: nft_set_rbtree: skip end interval element from gc
        netfilter: nfnetlink_queue: un-break NF_REPEAT
        netfilter: nf_tables: use timestamp to check for set element timeout
        netfilter: nft_ct: reject direction for ct id
        netfilter: ctnetlink: fix filtering for zone 0
        netfilter: ipset: Missing gc cancellations fixed
        netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
        netfilter: nft_compat: restrict match/target protocol to u16
        netfilter: nft_compat: reject unused compat flag
        netfilter: nft_compat: narrow down revision to unsigned 8-bits
      ====================
      
      Link: https://lore.kernel.org/r/20240208112834.1433-1-pablo@netfilter.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      63e4b9d6
    • Florian Westphal's avatar
      netfilter: nft_set_pipapo: remove scratch_aligned pointer · 5a8cdf6f
      Florian Westphal authored
      Use ->scratch for both avx2 and the generic implementation.
      
      After the previous change, the scratch->map member is always aligned
      properly for AVX2, so we can just use scratch->map in AVX2 too.
      
      The alignoff delta is stored in the scratchpad so we can reconstruct
      the correct address to free the area again.
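      
      The idea of keeping the alignment delta next to the data can be sketched
      in userspace C as follows. This is only an illustration, not the kernel
      code: struct scratch, SCRATCH_ALIGN, scratch_alloc() and scratch_free()
      are hypothetical names, the 32-byte alignment is an assumption, and
      malloc()/free() stand in for the kernel allocator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SCRATCH_ALIGN 32u /* assumed alignment wanted for the map member */

/* Hypothetical scratch area: the delta lives right next to the map. */
struct scratch {
    uint8_t align_off;    /* bytes skipped from the raw allocation */
    unsigned long map[4]; /* stand-in for the real, much larger map */
};

/* Over-allocate, shift the struct so that ->map is aligned, store the delta. */
static struct scratch *scratch_alloc(void)
{
    unsigned char *raw = malloc(sizeof(struct scratch) + SCRATCH_ALIGN - 1);
    uintptr_t map_addr, aligned;
    struct scratch *s;

    if (!raw)
        return NULL;

    map_addr = (uintptr_t)raw + offsetof(struct scratch, map);
    aligned = (map_addr + SCRATCH_ALIGN - 1) & ~(uintptr_t)(SCRATCH_ALIGN - 1);

    s = (struct scratch *)(raw + (aligned - map_addr));
    s->align_off = (uint8_t)(aligned - map_addr);
    return s;
}

/* Undo the shift to recover the pointer the allocator handed out. */
static void scratch_free(struct scratch *s)
{
    if (s)
        free((unsigned char *)s - s->align_off);
}
```

      With this layout, callers only ever see the shifted pointer; the free
      path subtracts align_off to get back the address that was allocated.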
      
      Fixes: 7400b063 ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      5a8cdf6f
    • Florian Westphal's avatar
      netfilter: nft_set_pipapo: add helper to release pcpu scratch area · 47b1c03c
      Florian Westphal authored
      After the next patch a simple kfree() is no longer enough, so add
      a helper for it.
      
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      47b1c03c
    • Florian Westphal's avatar
      netfilter: nft_set_pipapo: store index in scratch maps · 76313d1a
      Florian Westphal authored
      Pipapo needs a scratchpad area to keep state during matching.
      This state can be large and thus cannot reside on stack.
      
      Each set preallocates percpu areas for this.
      
      On each match stage, one scratchpad half starts with all-zero and the other
      is inited to all-ones.
      
      At the end of each stage, the half that starts with all-ones is
      always zero.  Before next field is tested, pointers to the two halves
      are swapped, i.e.  resmap pointer turns into fill pointer and vice versa.
      
      After the last field has been processed, pipapo stashes the
      index toggle in a percpu variable, with assumption that next packet
      will start with the all-zero half and sets all bits in the other to 1.
      
      This isn't reliable.
      
      There can be multiple sets, and we can't be sure that the upper
      and lower halves of every set's scratch map are always in sync (lookups
      can be conditional), so one set might have swapped while another might
      not have been queried.
      
      Thus we need to keep the index per-set-and-cpu, just like the
      scratchpad.
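      
      The failure mode described above can be reproduced with a small
      userspace model. It is a deliberately simplified sketch, not the pipapo
      code: struct set_scratch, MAP_LONGS and lookup() are hypothetical, and
      the per-field "matching" is reduced to carrying a single bit over.
      lookup() takes the toggle by pointer so the same code can run with a
      toggle shared by all sets or with one toggle per set.

```c
#include <stdbool.h>
#include <string.h>

#define MAP_LONGS 2 /* toy size; the real maps are far larger */

/* Hypothetical per-set scratch area: two halves plus a per-set toggle. */
struct set_scratch {
    bool map_index;
    unsigned long map[2][MAP_LONGS];
};

static bool half_is_zero(const unsigned long *half)
{
    for (int i = 0; i < MAP_LONGS; i++)
        if (half[i])
            return false;
    return true;
}

/*
 * Run one lookup over nfields fields, using *index as the toggle.
 * Returns true if the fill half really was all-zero on entry, which is
 * the invariant the lookup relies on.
 */
static bool lookup(struct set_scratch *s, bool *index, unsigned int nfields)
{
    bool idx = *index;
    unsigned long *res = s->map[idx];
    unsigned long *fill = s->map[!idx];
    bool fill_was_zero = half_is_zero(fill);

    /* The result half starts as all-ones. */
    memset(res, 0xff, sizeof(s->map[0]));

    for (unsigned int f = 0; f < nfields; f++) {
        unsigned long *tmp;

        fill[0] |= res[0] & 1UL;           /* stand-in for field matching */
        memset(res, 0, sizeof(s->map[0])); /* all-ones half ends up zero */

        tmp = res;                         /* swap the two halves */
        res = fill;
        fill = tmp;
        idx = !idx;
    }

    *index = idx; /* remember which half the next lookup starts from */
    return fill_was_zero;
}
```

      With one shared toggle, looking up set A, then set B, then set A again
      leaves A starting from a half that still has bits set. Passing each
      set's own map_index keeps the invariant for every set.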
      
      Note that this bug fix is incomplete, there is a related issue.
      
      avx2 and normal implementation might use slightly different areas of the
      map array space due to the avx2 alignment requirements, so
      m->scratch (generic/fallback implementation) and ->scratch_aligned
      (avx) may partially overlap. scratch and scratch_aligned are not distinct
      objects, the latter is just the aligned address of the former.
      
      After this change, a write to scratch_aligned->map_index may write to
      scratch->map, so this issue becomes more prominent: we can set to 1
      a bit in the supposedly-all-zero area of scratch->map[].
      
      A followup patch will remove scratch_aligned and make the generic and
      avx code use the same (aligned) area.
      
      It's done in a separate change to ease review.
      
      Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges")
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      76313d1a
    • Pablo Neira Ayuso's avatar
      netfilter: nft_set_rbtree: skip end interval element from gc · 60c0c230
      Pablo Neira Ayuso authored
      rbtree lazy gc on insert might collect an end interval element that has
      just been added in this transaction; skip end interval elements that
      are not yet active.
      
      Fixes: f718863a ("netfilter: nft_set_rbtree: fix overlap expiration walk")
      Cc: stable@vger.kernel.org
      Reported-by: lonial con <kongln9170@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      60c0c230
    • Florian Westphal's avatar
      netfilter: nfnetlink_queue: un-break NF_REPEAT · f82777e8
      Florian Westphal authored
      Only override userspace verdict if the ct hook returns something
      other than ACCEPT.
      
      Else, this replaces NF_REPEAT (run all hooks again) with NF_ACCEPT
      (move to next hook).
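      
      The rule stated above can be sketched as a tiny decision function. This
      is a simplified illustration rather than the nfnetlink_queue code:
      final_verdict() is a hypothetical helper, and the constants are
      redefined locally with the values from the netfilter UAPI header so the
      snippet stands alone.

```c
/* Verdict values as defined in the netfilter UAPI header. */
enum {
    NF_DROP   = 0,
    NF_ACCEPT = 1,
    NF_REPEAT = 4,
};

/*
 * Hypothetical helper: combine the verdict chosen by userspace with the
 * verdict returned by the conntrack hook. Only a non-ACCEPT hook result
 * replaces what userspace asked for, so NF_REPEAT survives an ACCEPT.
 */
static unsigned int final_verdict(unsigned int user_verdict,
                                  unsigned int ct_verdict)
{
    if (ct_verdict != NF_ACCEPT)
        return ct_verdict;

    return user_verdict;
}
```

      Unconditionally returning ct_verdict instead would turn a userspace
      NF_REPEAT into NF_ACCEPT whenever the hook accepts the packet.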
      
      Fixes: 6291b3a6 ("netfilter: conntrack: convert nf_conntrack_update to netfilter verdicts")
      Reported-by: l.6diay@passmail.com
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      f82777e8
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: use timestamp to check for set element timeout · 7395dfac
      Pablo Neira Ayuso authored
      Add a timestamp field at the beginning of the transaction, store it
      in the nftables per-netns area.
      
      Update set backend .insert, .deactivate and sync gc path to use the
      timestamp; this prevents an element from expiring while the control
      plane transaction is still unfinished.
      
      .lookup and .update, which are used from packet path, still use the
      current time to check if the element has expired. The .get path and dump
      do too, since they run lockless under rcu read side lock. Then, there is
      async gc which also needs to check the current time since it runs
      asynchronously from a workqueue.
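      
      The split between the two clocks can be sketched as below. This is a
      minimal model under stated assumptions, not the nf_tables code: struct
      elem, elem_expired_at(), ctrl_elem_expired() and pkt_elem_expired() are
      hypothetical names, and time is a plain 64-bit counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical set element: expires == 0 means "no timeout". */
struct elem {
    uint64_t expires;
};

/* Common check: has the element expired as of the given point in time? */
static bool elem_expired_at(const struct elem *e, uint64_t tstamp)
{
    return e->expires != 0 && tstamp >= e->expires;
}

/* Control plane: judge expiry against the timestamp taken at the start
 * of the transaction, so the answer cannot change while it is running. */
static bool ctrl_elem_expired(const struct elem *e, uint64_t txn_tstamp)
{
    return elem_expired_at(e, txn_tstamp);
}

/* Packet path, get/dump and async gc: judge expiry against "now". */
static bool pkt_elem_expired(const struct elem *e, uint64_t now)
{
    return elem_expired_at(e, now);
}
```

      For an element that expires at time 100, a transaction started at 90
      keeps treating it as alive even if the wall clock reaches 110 before
      the transaction finishes, while the packet path sees it as expired.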
      
      Fixes: c3e1b005 ("netfilter: nf_tables: add set element timeout support")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      7395dfac
    • Pablo Neira Ayuso's avatar
      netfilter: nft_ct: reject direction for ct id · 38ed1c70
      Pablo Neira Ayuso authored
      Direction attribute is ignored, reject it in case this ever needs to be
      supported
      
      Fixes: 3087c3f7 ("netfilter: nft_ct: Add ct id support")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      38ed1c70
    • Felix Huettner's avatar
      netfilter: ctnetlink: fix filtering for zone 0 · fa173a1b
      Felix Huettner authored
      Previously, filtering for the default zone would actually skip the zone
      filter and flush all zones.
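      
      The underlying pitfall is using a valid value as the "no filter"
      sentinel. A minimal sketch of the difference, with hypothetical names
      (struct zone_filter, zone_match_sentinel(), zone_match()) that are not
      taken from ctnetlink:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter: an explicit flag records whether a zone was given. */
struct zone_filter {
    bool zone_set;
    uint16_t zone;
};

/* Sentinel style: zone 0 doubles as "no filter", so a request for the
 * default zone matches entries in every zone. */
static bool zone_match_sentinel(uint16_t filter_zone, uint16_t entry_zone)
{
    return filter_zone == 0 || filter_zone == entry_zone;
}

/* Flag style: zone 0 is an ordinary value and only matches zone 0. */
static bool zone_match(const struct zone_filter *f, uint16_t entry_zone)
{
    return !f->zone_set || f->zone == entry_zone;
}
```

      With the explicit flag, a flush restricted to zone 0 leaves entries in
      other zones alone, and omitting the zone still matches everything.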
      
      Fixes: eff3c558 ("netfilter: ctnetlink: support filtering by zone")
      Reported-by: Ilya Maximets <i.maximets@ovn.org>
      Closes: https://lore.kernel.org/netdev/2032238f-31ac-4106-8f22-522e76df5a12@ovn.org/
      Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      fa173a1b
    • Alexandra Winter's avatar
      s390/qeth: Fix potential loss of L3-IP@ in case of network issues · 2fe8a236
      Alexandra Winter authored
      Symptom:
      In case of a bad cable connection (e.g. dirty optics) a fast sequence of
      network DOWN-UP-DOWN-UP could happen. UP triggers recovery of the qeth
      interface. In case of a second DOWN while recovery is still ongoing, it
      can happen that the IP@ of a Layer3 qeth interface is lost and will not
      be recovered by the second UP.
      
      Problem:
      When registration of IP addresses with Layer 3 qeth devices fails (e.g.
      because of bad address format), the respective IP address is deleted from
      its hash-table in the driver. If registration fails because of an ENETDOWN
      condition, the address should stay in the hashtable, so a subsequent
      recovery can restore it.
      
      3caa4af8 ("qeth: keep ip-address after LAN_OFFLINE failure")
      fixes this for registration failures during normal operation, but not
      during recovery.
      
      Solution:
      Keep L3-IP address in case of ENETDOWN in qeth_l3_recover_ip(). For
      consistency with qeth_l3_add_ip() we also keep it in case of EADDRINUSE,
      i.e. for some reason the card already/still has this address registered.
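      
      The keep-or-delete decision described in the solution can be sketched
      as a small predicate. keep_ip_in_table() is a hypothetical helper for
      illustration only; it is not the body of qeth_l3_recover_ip().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: given the return code of an address registration
 * attempt, decide whether the address stays in the driver's hash table.
 */
static bool keep_ip_in_table(int rc)
{
    switch (rc) {
    case 0:           /* registered successfully */
    case -ENETDOWN:   /* link is down, a later recovery can retry */
    case -EADDRINUSE: /* card already/still has the address registered */
        return true;
    default:          /* e.g. bad address format: drop the entry */
        return false;
    }
}
```

      Any other error still removes the entry, which matches the behaviour
      described for failures such as a bad address format.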
      
      Fixes: 4a71df50 ("qeth: new qeth device driver")
      Cc: stable@vger.kernel.org
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240206085849.2902775-1-wintera@linux.ibm.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2fe8a236
    • Jozsef Kadlecsik's avatar
      netfilter: ipset: Missing gc cancellations fixed · 27c5a095
      Jozsef Kadlecsik authored
      The patch fdb8e12cc2cc ("netfilter: ipset: fix performance regression
      in swap operation") missed adding the calls to gc cancellations
      at the error path of create operations and at module unload. Also,
      because half of the destroy operations are now executed by a
      function registered by call_rcu(), neither the NFNL_SUBSYS_IPSET mutex
      nor the rcu read lock is held, and therefore checking them results in
      false warnings.
      
      Fixes: 97f7cf1c ("netfilter: ipset: fix performance regression in swap operation")
      Reported-by: syzbot+52bbc0ad036f6f0d4a25@syzkaller.appspotmail.com
      Reported-by: Brad Spengler <spender@grsecurity.net>
      Reported-by: Стас Ничипорович <stasn77@gmail.com>
      Tested-by: Brad Spengler <spender@grsecurity.net>
      Tested-by: Стас Ничипорович <stasn77@gmail.com>
      Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      27c5a095
    • Ratheesh Kannoth's avatar
      octeontx2-af: Initialize maps. · db010ff6
      Ratheesh Kannoth authored
      kmalloc_array() without __GFP_ZERO flag does not initialize
      memory to zero. This causes issues. Use kcalloc() for maps and
      bitmap_zalloc() for bitmaps.
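      
      A userspace analogue of the same point: malloc() leaves memory
      uninitialized, calloc() returns it zeroed. The helpers below are
      hypothetical stand-ins (map_alloc() for the kcalloc() use,
      bitmap_alloc_zeroed() for the bitmap_zalloc() use) and only illustrate
      the allocation pattern, not the octeontx2-af code.

```c
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Zero-initialized array of n entries; every entry starts out as 0. */
static unsigned int *map_alloc(size_t n)
{
    return calloc(n, sizeof(unsigned int));
}

/* Zero-initialized bitmap with room for nbits bits, rounded up to
 * whole unsigned longs. */
static unsigned long *bitmap_alloc_zeroed(size_t nbits)
{
    size_t nlongs = (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;

    return calloc(nlongs, sizeof(unsigned long));
}
```

      Code that later tests "is this slot or bit already in use?" can then
      rely on a fresh allocation reading as all-free.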
      
      Fixes: dd784287 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs")
      Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
      Reviewed-by: Brett Creeley <bcreeley@amd.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240206024000.1070260-1-rkannoth@marvell.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      db010ff6
    • Sinthu Raja's avatar
      net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio · bc4ce46b
      Sinthu Raja authored
      The commit below introduced a WARN when the phy state is not one of
      PHY_HALTED, PHY_READY and PHY_UP:
      commit 744d23c7 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
      
      When cpsw resumes, there can be a port in PHY_NOLINK state, so the
      warning below is triggered. Set mac_managed_pm to true to tell mdio that
      phy resume/suspend is managed by the mac, which fixes the following warning:
      
      WARNING: CPU: 0 PID: 965 at drivers/net/phy/phy_device.c:326 mdio_bus_phy_resume+0x140/0x144
      CPU: 0 PID: 965 Comm: sh Tainted: G           O       6.1.46-g247b2535b2 #1
      Hardware name: Generic AM33XX (Flattened Device Tree)
       unwind_backtrace from show_stack+0x18/0x1c
       show_stack from dump_stack_lvl+0x24/0x2c
       dump_stack_lvl from __warn+0x84/0x15c
       __warn from warn_slowpath_fmt+0x1a8/0x1c8
       warn_slowpath_fmt from mdio_bus_phy_resume+0x140/0x144
       mdio_bus_phy_resume from dpm_run_callback+0x3c/0x140
       dpm_run_callback from device_resume+0xb8/0x2b8
       device_resume from dpm_resume+0x144/0x314
       dpm_resume from dpm_resume_end+0x14/0x20
       dpm_resume_end from suspend_devices_and_enter+0xd0/0x924
       suspend_devices_and_enter from pm_suspend+0x2e0/0x33c
       pm_suspend from state_store+0x74/0xd0
       state_store from kernfs_fop_write_iter+0x104/0x1ec
       kernfs_fop_write_iter from vfs_write+0x1b8/0x358
       vfs_write from ksys_write+0x78/0xf8
       ksys_write from ret_fast_syscall+0x0/0x54
      Exception stack(0xe094dfa8 to 0xe094dff0)
      dfa0:                   00000004 005c3fb8 00000001 005c3fb8 00000004 00000001
      dfc0: 00000004 005c3fb8 b6f6bba0 00000004 00000004 0059edb8 00000000 00000000
      dfe0: 00000004 bed918f0 b6f09bd3 b6e89a66
      
      Cc: <stable@vger.kernel.org> # v6.0+
      Fixes: 744d23c7 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
      Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
      Signed-off-by: Sinthu Raja <sinthu.raja@ti.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      bc4ce46b
    • Sinthu Raja's avatar
      net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio · 9def04e7
      Sinthu Raja authored
      The commit below introduced a WARN when the phy state is not one of
      PHY_HALTED, PHY_READY and PHY_UP:
      commit 744d23c7 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
      
      When cpsw_new resumes, there can be a port in PHY_NOLINK state, so the
      warning below is triggered. Set mac_managed_pm to true to tell mdio that
      phy resume/suspend is managed by the mac, which fixes the following warning:
      
      WARNING: CPU: 0 PID: 965 at drivers/net/phy/phy_device.c:326 mdio_bus_phy_resume+0x140/0x144
      CPU: 0 PID: 965 Comm: sh Tainted: G           O       6.1.46-g247b2535b2 #1
      Hardware name: Generic AM33XX (Flattened Device Tree)
       unwind_backtrace from show_stack+0x18/0x1c
       show_stack from dump_stack_lvl+0x24/0x2c
       dump_stack_lvl from __warn+0x84/0x15c
       __warn from warn_slowpath_fmt+0x1a8/0x1c8
       warn_slowpath_fmt from mdio_bus_phy_resume+0x140/0x144
       mdio_bus_phy_resume from dpm_run_callback+0x3c/0x140
       dpm_run_callback from device_resume+0xb8/0x2b8
       device_resume from dpm_resume+0x144/0x314
       dpm_resume from dpm_resume_end+0x14/0x20
       dpm_resume_end from suspend_devices_and_enter+0xd0/0x924
       suspend_devices_and_enter from pm_suspend+0x2e0/0x33c
       pm_suspend from state_store+0x74/0xd0
       state_store from kernfs_fop_write_iter+0x104/0x1ec
       kernfs_fop_write_iter from vfs_write+0x1b8/0x358
       vfs_write from ksys_write+0x78/0xf8
       ksys_write from ret_fast_syscall+0x0/0x54
      Exception stack(0xe094dfa8 to 0xe094dff0)
      dfa0:                   00000004 005c3fb8 00000001 005c3fb8 00000004 00000001
      dfc0: 00000004 005c3fb8 b6f6bba0 00000004 00000004 0059edb8 00000000 00000000
      dfe0: 00000004 bed918f0 b6f09bd3 b6e89a66
      
      Cc: <stable@vger.kernel.org> # v6.0+
      Fixes: 744d23c7 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
      Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
      Signed-off-by: Sinthu Raja <sinthu.raja@ti.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      9def04e7
    • Pablo Neira Ayuso's avatar
      netfilter: nft_set_pipapo: remove static in nft_pipapo_get() · ab0beafd
      Pablo Neira Ayuso authored
      This has slipped through when reducing memory footprint for set
      elements, remove it.
      
      Fixes: 9dad402b ("netfilter: nf_tables: expose opaque set element as struct nft_elem_priv")
      Reported-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      ab0beafd
    • Linus Torvalds's avatar
      Merge tag 'v6.8-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 04737196
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "Fix regressions in cbc and algif_hash, as well as an older
        NULL-pointer dereference in ccp"
      
      * tag 'v6.8-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: algif_hash - Remove bogus SGL free on zero-length error path
        crypto: cbc - Ensure statesize is zero
        crypto: ccp - Fix null pointer dereference in __sev_platform_shutdown_locked
      04737196
    • Linus Torvalds's avatar
      Merge tag 'percpu-for-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu · 860d7dcb
      Linus Torvalds authored
      Pull percpu fix from Dennis Zhou:
      
       - fix riscv wrong size passed to local_flush_tlb_range_asid()
      
      * tag 'percpu-for-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
        riscv: Fix wrong size passed to local_flush_tlb_range_asid()
      860d7dcb
  3. 07 Feb, 2024 5 commits