1. 04 Apr, 2024 22 commits
    • Merge tag 'nf-24-04-04' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · d432f7bd
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      Patch #1 unlike early commit path stage which triggers a call to abort,
               an explicit release of the batch is required on abort, otherwise
               mutex is released and commit_list remains in place.
      
      Patch #2 release mutex after nft_gc_seq_end() in commit path, otherwise
               async GC worker could collect expired objects.
      
      Patch #3 flush pending destroy work in module removal path, otherwise UaF
               is possible.
      
      Patch #4 and #6 restrict the table dormant flag with basechain updates
      	 to fix state inconsistency in the hook registration.
      
      Patch #5 adds missing RCU read side lock to flowtable type to avoid races
      	 with module removal.
      
      * tag 'nf-24-04-04' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: discard table flag update with pending basechain deletion
        netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
        netfilter: nf_tables: reject new basechain after table flag update
        netfilter: nf_tables: flush pending destroy work before exit_net release
        netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
        netfilter: nf_tables: release batch on table validation from abort path
      ====================
      
      Link: https://lore.kernel.org/r/20240404104334.1627-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · a66323e4
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2024-04-03 (ice, idpf)
      
      This series contains updates to ice and idpf drivers.
      
      Dan Carpenter initializes some pointer declarations to NULL as needed for
      resource cleanup on ice driver.
      
      Petr Oros corrects assignment of VLAN operators to fix Rx VLAN filtering
      in legacy mode for ice.
      
      Joshua calls eth_type_trans() on unknown packets to prevent possible
      kernel panic on idpf.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        idpf: fix kernel panic on unknown packet types
        ice: fix enabling RX VLAN filtering
        ice: Fix freeing uninitialized pointers
      ====================
      
      Link: https://lore.kernel.org/r/20240403201929.1945116-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/sched: act_skbmod: prevent kernel-infoleak · d313eb8b
      Eric Dumazet authored
      syzbot found that tcf_skbmod_dump() was copying four bytes
      from kernel stack to user space [1].
      
      The issue here is that 'struct tc_skbmod' has a four bytes hole.
      
      We need to clear the structure before filling fields.
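
      A minimal userspace sketch of the pattern described above, not the
      actual act_skbmod code: 'struct demo_opt' is a hypothetical stand-in
      for a structure with a four-byte hole, and both function names are
      illustrative. Clearing the whole object first means the padding bytes
      copied out later are zero rather than leftover stack contents.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: on common 64-bit ABIs the compiler inserts a
 * four-byte hole after 'index' so that 'stamp' is 8-byte aligned. */
struct demo_opt {
	uint32_t index;
	uint64_t stamp;
};

/* Clear the whole structure (holes included) before filling fields, so a
 * later byte-wise copy of the object does not carry leftover memory.
 * This relies on compilers leaving padding untouched when individual
 * members are stored, which is the usual behaviour in practice. */
void fill_demo_opt(struct demo_opt *opt, uint32_t index, uint64_t stamp)
{
	memset(opt, 0, sizeof(*opt));
	opt->index = index;
	opt->stamp = stamp;
}

/* Returns 1 if every byte between 'index' and 'stamp' is zero. */
int demo_opt_hole_is_clear(const struct demo_opt *opt)
{
	const unsigned char *p = (const unsigned char *)opt;
	size_t i;

	for (i = offsetof(struct demo_opt, index) + sizeof(opt->index);
	     i < offsetof(struct demo_opt, stamp); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

      Without the memset(), the bytes in the hole would keep whatever the
      caller's stack held, which is the kind of leak KMSAN reports in [1].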
      
      [1]
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
       BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
        instrument_copy_to_user include/linux/instrumented.h:114 [inline]
        copy_to_user_iter lib/iov_iter.c:24 [inline]
        iterate_ubuf include/linux/iov_iter.h:29 [inline]
        iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
        iterate_and_advance include/linux/iov_iter.h:271 [inline]
        _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
        copy_to_iter include/linux/uio.h:196 [inline]
        simple_copy_to_iter net/core/datagram.c:532 [inline]
        __skb_datagram_iter+0x185/0x1000 net/core/datagram.c:420
        skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546
        skb_copy_datagram_msg include/linux/skbuff.h:4050 [inline]
        netlink_recvmsg+0x432/0x1610 net/netlink/af_netlink.c:1962
        sock_recvmsg_nosec net/socket.c:1046 [inline]
        sock_recvmsg+0x2c4/0x340 net/socket.c:1068
        __sys_recvfrom+0x35a/0x5f0 net/socket.c:2242
        __do_sys_recvfrom net/socket.c:2260 [inline]
        __se_sys_recvfrom net/socket.c:2256 [inline]
        __x64_sys_recvfrom+0x126/0x1d0 net/socket.c:2256
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was stored to memory at:
        pskb_expand_head+0x30f/0x19d0 net/core/skbuff.c:2253
        netlink_trim+0x2c2/0x330 net/netlink/af_netlink.c:1317
        netlink_unicast+0x9f/0x1260 net/netlink/af_netlink.c:1351
        nlmsg_unicast include/net/netlink.h:1144 [inline]
        nlmsg_notify+0x21d/0x2f0 net/netlink/af_netlink.c:2610
        rtnetlink_send+0x73/0x90 net/core/rtnetlink.c:741
        rtnetlink_maybe_send include/linux/rtnetlink.h:17 [inline]
        tcf_add_notify net/sched/act_api.c:2048 [inline]
        tcf_action_add net/sched/act_api.c:2071 [inline]
        tc_ctl_action+0x146e/0x19d0 net/sched/act_api.c:2119
        rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
        netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
        rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
        netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
        netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
        ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
        __sys_sendmsg net/socket.c:2667 [inline]
        __do_sys_sendmsg net/socket.c:2676 [inline]
        __se_sys_sendmsg net/socket.c:2674 [inline]
        __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was stored to memory at:
        __nla_put lib/nlattr.c:1041 [inline]
        nla_put+0x1c6/0x230 lib/nlattr.c:1099
        tcf_skbmod_dump+0x23f/0xc20 net/sched/act_skbmod.c:256
        tcf_action_dump_old net/sched/act_api.c:1191 [inline]
        tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
        tcf_action_dump+0x1fd/0x460 net/sched/act_api.c:1251
        tca_get_fill+0x519/0x7a0 net/sched/act_api.c:1628
        tcf_add_notify_msg net/sched/act_api.c:2023 [inline]
        tcf_add_notify net/sched/act_api.c:2042 [inline]
        tcf_action_add net/sched/act_api.c:2071 [inline]
        tc_ctl_action+0x1365/0x19d0 net/sched/act_api.c:2119
        rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
        netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
        rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
        netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
        netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
        ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
        __sys_sendmsg net/socket.c:2667 [inline]
        __do_sys_sendmsg net/socket.c:2676 [inline]
        __se_sys_sendmsg net/socket.c:2674 [inline]
        __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Local variable opt created at:
        tcf_skbmod_dump+0x9d/0xc20 net/sched/act_skbmod.c:244
        tcf_action_dump_old net/sched/act_api.c:1191 [inline]
        tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
      
      Bytes 188-191 of 248 are uninitialized
      Memory access of size 248 starts at ffff888117697680
      Data copied to user address 00007ffe56d855f0
      
      Fixes: 86da71b5 ("net_sched: Introduce skbmod action")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20240403130908.93421-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: usb: ax88179_178a: avoid the interface always configured as random address · 2e91bb99
      Jose Ignacio Tornos Martinez authored
      After commit d2689b6a ("net: usb: ax88179_178a: avoid two
      consecutive device resets"), reset is no longer executed from the bind
      operation, so the MAC address is not read from the device registers or
      the devicetree at that point. Because the check that determines whether
      the assigned MAC address is random happens after the bind operation in
      usbnet_probe, the interface stays marked as having a random address,
      even though the address is correctly read and set during the open
      operation (now the only reset).
      
      To keep a single reset for the device and to avoid the interface always
      being marked as having a random address, set the relevant driver field
      correctly after reset if the MAC address is read successfully from the
      device registers or the devicetree. Take into account whether a locally
      administered (random) address was previously stored.
      
      cc: stable@vger.kernel.org # 6.6+
      Fixes: d2689b6a ("net: usb: ax88179_178a: avoid two consecutive device resets")
      Reported-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
      Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240403132158.344838-1-jtornosm@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45() · c120209b
      Christophe JAILLET authored
      The definition and declaration of sja1110_pcs_mdio_write_c45() don't have
      parameters in the same order.
      
      Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer
      in 'sja1105_info' structure with .pcs_mdio_write_c45, and that we have:
      
         int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
      				  int reg, u16 val);
      
      it is likely that the definition is the one to change.
      
      Found with cppcheck, funcArgOrderDifferent.
      
      Fixes: ae271547 ("net: dsa: sja1105: C45 only transactions for PCS")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: Michael Walle <mwalle@kernel.org>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ravb: Always update error counters · 101b7641
      Paul Barker authored
      The error statistics should be updated each time the poll function is
      called, even if the full RX work budget has been consumed. This prevents
      the counts from becoming stuck when RX bandwidth usage is high.
      
      This also ensures that error counters are not updated after we've
      re-enabled interrupts as that could result in a race condition.
      
      Also drop an unnecessary space.
      
      Fixes: c156633f ("Renesas Ethernet AVB driver proper")
      Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Link: https://lore.kernel.org/r/20240402145305.82148-2-paul.barker.ct@bp.renesas.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ravb: Always process TX descriptor ring · 596a4254
      Paul Barker authored
      The TX queue should be serviced each time the poll function is called,
      even if the full RX work budget has been consumed. This prevents
      starvation of the TX queue when RX bandwidth usage is high.
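
      The ordering described above can be sketched with a small userspace
      model. This is not the ravb driver: 'struct demo_queue' and every
      function name below are hypothetical, and the rings are reduced to
      plain counters.

```c
/* Hypothetical model of a NAPI-style poll context; none of these names
 * come from the ravb driver. */
struct demo_queue {
	int rx_pending;     /* packets waiting in the RX ring */
	int tx_completed;   /* finished TX descriptors not yet reclaimed */
	int tx_reclaimed;   /* running total of reclaimed descriptors */
};

/* Process up to 'budget' RX packets; returns how many were handled. */
static int demo_rx(struct demo_queue *q, int budget)
{
	int done = q->rx_pending < budget ? q->rx_pending : budget;

	q->rx_pending -= done;
	return done;
}

/* Reclaim all completed TX descriptors. */
static void demo_tx_free(struct demo_queue *q)
{
	q->tx_reclaimed += q->tx_completed;
	q->tx_completed = 0;
}

/* Returns the RX work done. TX is serviced on every call, before the
 * budget check, so a saturated RX ring cannot starve it. */
int demo_poll(struct demo_queue *q, int budget)
{
	int work_done = demo_rx(q, budget);

	demo_tx_free(q);

	if (work_done == budget)
		return budget;  /* more RX work may remain; poll again */

	/* RX drained: a real driver would re-enable interrupts here. */
	return work_done;
}
```

      If the TX reclaim sat after the early return, a poll that consumed
      the whole RX budget would skip it every time.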
      
      Fixes: c156633f ("Renesas Ethernet AVB driver proper")
      Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • netfilter: nf_tables: discard table flag update with pending basechain deletion · 1bc83a01
      Pablo Neira Ayuso authored
      Hook unregistration is deferred to the commit phase, and the same
      applies to hook updates triggered by the table dormant flag. When both
      commands are combined, the basechain is deleted while its hook is left
      registered in the core.
      
      Fixes: 179d9ba5 ("netfilter: nf_tables: fix table flag updates")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get() · 24225011
      Ziyang Xuan authored
      nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can
      run concurrently with __nft_flowtable_type_get() within
      nf_tables_newflowtable(), and there is no protection when iterating over
      the nf_tables_flowtables list in __nft_flowtable_type_get(). Therefore,
      there is a potential data-race on nf_tables_flowtables list entries.
      
      Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list
      in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller
      nft_flowtable_type_get() to protect the entire type query process.
      
      Fixes: 3b49e2e9 ("netfilter: nf_tables: add flow table netlink frontend")
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: reject new basechain after table flag update · 994209dd
      Pablo Neira Ayuso authored
      When dormant flag is toggled, hooks are disabled in the commit phase by
      iterating over current chains in table (existing and new).
      
      The following configuration allows for an inconsistent state:
      
        add table x
        add chain x y { type filter hook input priority 0; }
        add table x { flags dormant; }
        add chain x w { type filter hook input priority 1; }
      
      which triggers the following warning when trying to unregister chain w
      which is already unregistered.
      
      [  127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:501 __nf_unregister_net_hook+0x21a/0x260
      [...]
      [  127.322519] Call Trace:
      [  127.322521]  <TASK>
      [  127.322524]  ? __warn+0x9f/0x1a0
      [  127.322531]  ? __nf_unregister_net_hook+0x21a/0x260
      [  127.322537]  ? report_bug+0x1b1/0x1e0
      [  127.322545]  ? handle_bug+0x3c/0x70
      [  127.322552]  ? exc_invalid_op+0x17/0x40
      [  127.322556]  ? asm_exc_invalid_op+0x1a/0x20
      [  127.322563]  ? kasan_save_free_info+0x3b/0x60
      [  127.322570]  ? __nf_unregister_net_hook+0x6a/0x260
      [  127.322577]  ? __nf_unregister_net_hook+0x21a/0x260
      [  127.322583]  ? __nf_unregister_net_hook+0x6a/0x260
      [  127.322590]  ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables]
      [  127.322655]  nft_table_disable+0x75/0xf0 [nf_tables]
      [  127.322717]  nf_tables_commit+0x2571/0x2620 [nf_tables]
      
      Fixes: 179d9ba5 ("netfilter: nf_tables: fix table flag updates")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: flush pending destroy work before exit_net release · 24cea967
      Pablo Neira Ayuso authored
      Similar to 2c9f0293 ("netfilter: nf_tables: flush pending destroy
      work before netlink notifier") to address a race between exit_net and
      the destroy workqueue.
      
      The trace below shows an element to be released via destroy workqueue
      while exit_net path (triggered via module removal) has already released
      the set that is used in such transaction.
      
      [ 1360.547789] BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.547861] Read of size 8 at addr ffff888140500cc0 by task kworker/4:1/152465
      [ 1360.547870] CPU: 4 PID: 152465 Comm: kworker/4:1 Not tainted 6.8.0+ #359
      [ 1360.547882] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
      [ 1360.547984] Call Trace:
      [ 1360.547991]  <TASK>
      [ 1360.547998]  dump_stack_lvl+0x53/0x70
      [ 1360.548014]  print_report+0xc4/0x610
      [ 1360.548026]  ? __virt_addr_valid+0xba/0x160
      [ 1360.548040]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
      [ 1360.548054]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.548176]  kasan_report+0xae/0xe0
      [ 1360.548189]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.548312]  nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.548447]  ? __pfx_nf_tables_trans_destroy_work+0x10/0x10 [nf_tables]
      [ 1360.548577]  ? _raw_spin_unlock_irq+0x18/0x30
      [ 1360.548591]  process_one_work+0x2f1/0x670
      [ 1360.548610]  worker_thread+0x4d3/0x760
      [ 1360.548627]  ? __pfx_worker_thread+0x10/0x10
      [ 1360.548640]  kthread+0x16b/0x1b0
      [ 1360.548653]  ? __pfx_kthread+0x10/0x10
      [ 1360.548665]  ret_from_fork+0x2f/0x50
      [ 1360.548679]  ? __pfx_kthread+0x10/0x10
      [ 1360.548690]  ret_from_fork_asm+0x1a/0x30
      [ 1360.548707]  </TASK>
      
      [ 1360.548719] Allocated by task 192061:
      [ 1360.548726]  kasan_save_stack+0x20/0x40
      [ 1360.548739]  kasan_save_track+0x14/0x30
      [ 1360.548750]  __kasan_kmalloc+0x8f/0xa0
      [ 1360.548760]  __kmalloc_node+0x1f1/0x450
      [ 1360.548771]  nf_tables_newset+0x10c7/0x1b50 [nf_tables]
      [ 1360.548883]  nfnetlink_rcv_batch+0xbc4/0xdc0 [nfnetlink]
      [ 1360.548909]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
      [ 1360.548927]  netlink_unicast+0x367/0x4f0
      [ 1360.548935]  netlink_sendmsg+0x34b/0x610
      [ 1360.548944]  ____sys_sendmsg+0x4d4/0x510
      [ 1360.548953]  ___sys_sendmsg+0xc9/0x120
      [ 1360.548961]  __sys_sendmsg+0xbe/0x140
      [ 1360.548971]  do_syscall_64+0x55/0x120
      [ 1360.548982]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
      
      [ 1360.548994] Freed by task 192222:
      [ 1360.548999]  kasan_save_stack+0x20/0x40
      [ 1360.549009]  kasan_save_track+0x14/0x30
      [ 1360.549019]  kasan_save_free_info+0x3b/0x60
      [ 1360.549028]  poison_slab_object+0x100/0x180
      [ 1360.549036]  __kasan_slab_free+0x14/0x30
      [ 1360.549042]  kfree+0xb6/0x260
      [ 1360.549049]  __nft_release_table+0x473/0x6a0 [nf_tables]
      [ 1360.549131]  nf_tables_exit_net+0x170/0x240 [nf_tables]
      [ 1360.549221]  ops_exit_list+0x50/0xa0
      [ 1360.549229]  free_exit_list+0x101/0x140
      [ 1360.549236]  unregister_pernet_operations+0x107/0x160
      [ 1360.549245]  unregister_pernet_subsys+0x1c/0x30
      [ 1360.549254]  nf_tables_module_exit+0x43/0x80 [nf_tables]
      [ 1360.549345]  __do_sys_delete_module+0x253/0x370
      [ 1360.549352]  do_syscall_64+0x55/0x120
      [ 1360.549360]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
      
      (gdb) list *__nft_release_table+0x473
      0x1e033 is in __nft_release_table (net/netfilter/nf_tables_api.c:11354).
      11349           list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) {
      11350                   list_del(&flowtable->list);
      11351                   nft_use_dec(&table->use);
      11352                   nf_tables_flowtable_destroy(flowtable);
      11353           }
      11354           list_for_each_entry_safe(set, ns, &table->sets, list) {
      11355                   list_del(&set->list);
      11356                   nft_use_dec(&table->use);
      11357                   if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
      11358                           nft_map_deactivate(&ctx, set);
      (gdb)
      
      [ 1360.549372] Last potentially related work creation:
      [ 1360.549376]  kasan_save_stack+0x20/0x40
      [ 1360.549384]  __kasan_record_aux_stack+0x9b/0xb0
      [ 1360.549392]  __queue_work+0x3fb/0x780
      [ 1360.549399]  queue_work_on+0x4f/0x60
      [ 1360.549407]  nft_rhash_remove+0x33b/0x340 [nf_tables]
      [ 1360.549516]  nf_tables_commit+0x1c6a/0x2620 [nf_tables]
      [ 1360.549625]  nfnetlink_rcv_batch+0x728/0xdc0 [nfnetlink]
      [ 1360.549647]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
      [ 1360.549671]  netlink_unicast+0x367/0x4f0
      [ 1360.549680]  netlink_sendmsg+0x34b/0x610
      [ 1360.549690]  ____sys_sendmsg+0x4d4/0x510
      [ 1360.549697]  ___sys_sendmsg+0xc9/0x120
      [ 1360.549706]  __sys_sendmsg+0xbe/0x140
      [ 1360.549715]  do_syscall_64+0x55/0x120
      [ 1360.549725]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
      
      Fixes: 0935d558 ("netfilter: nf_tables: asynchronous release")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path · 0d459e2f
      Pablo Neira Ayuso authored
      The commit mutex should not be released during the critical section
      between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
      worker could collect expired objects and get the released commit lock
      within the same GC sequence.
      
      nf_tables_module_autoload() temporarily releases the mutex to load
      module dependencies, then replays the transaction. Move it to the end
      of the abort phase, after nft_gc_seq_end() is called.
      
      Cc: stable@vger.kernel.org
      Fixes: 72034434 ("netfilter: nf_tables: GC transaction race with abort path")
      Reported-by: Kuan-Ting Chen <hexrabbit@devco.re>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: release batch on table validation from abort path · a45e6889
      Pablo Neira Ayuso authored
      Unlike early commit path stage which triggers a call to abort, an
      explicit release of the batch is required on abort, otherwise mutex is
      released and commit_list remains in place.
      
      Add WARN_ON_ONCE to ensure commit_list is empty from the abort path
      before releasing the mutex.
      
      After this patch, commit_list is always assumed to be empty before
      grabbing the mutex, therefore
      
        03c1f1ef ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()")
      
      only needs to release the pending modules for registration.
      
      Cc: stable@vger.kernel.org
      Fixes: c0391b6a ("netfilter: nf_tables: missing validation from the abort path")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Revert "tg3: Remove residual error handling in tg3_suspend" · 72076fc9
      Paolo Abeni authored
      This reverts commit 9ab4ad29.
      
      I went out of coffee and applied it to the wrong tree. Blame on me.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tg3: Remove residual error handling in tg3_suspend · 9ab4ad29
      Nikita Kiryushin authored
      As of now, tg3_power_down_prepare always ends with success, but
      the error handling code from former tg3_set_power_state call is still here.
      
      This code became unreachable in commit c866b7ea ("tg3: Do not use
      legacy PCI power management").
      
      Remove (now unreachable) error handling code for simplification and change
      tg3_power_down_prepare to a void function as its result is no more checked.
      Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: mana: Fix Rx DMA datasize and skb_over_panic · c0de6ab9
      Haiyang Zhang authored
      mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to a
      multiple of 64. So a packet slightly bigger than mtu+14, say 1536,
      can be received and cause skb_over_panic.
      
      Sample dmesg:
      [ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL>
      [ 5325.243689] ------------[ cut here ]------------
      [ 5325.245748] kernel BUG at net/core/skbuff.c:192!
      [ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      [ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60
      [ 5325.302941] Call Trace:
      [ 5325.304389]  <IRQ>
      [ 5325.315794]  ? skb_panic+0x4f/0x60
      [ 5325.317457]  ? asm_exc_invalid_op+0x1f/0x30
      [ 5325.319490]  ? skb_panic+0x4f/0x60
      [ 5325.321161]  skb_put+0x4e/0x50
      [ 5325.322670]  mana_poll+0x6fa/0xb50 [mana]
      [ 5325.324578]  __napi_poll+0x33/0x1e0
      [ 5325.326328]  net_rx_action+0x12e/0x280
      
      As discussed internally, this alignment is not necessary. To fix
      this bug, remove it from the code. So oversized packets will be
      marked as CQE_RX_TRUNCATED by NIC, and dropped.
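
      To make the numbers concrete, here is the arithmetic as a small
      sketch. It assumes an MTU of 1500, which the message does not state,
      and 'demo_align_up()' and 'demo_frame_len()' are illustrative helpers
      rather than driver functions.

```c
#define DEMO_ETH_HLEN 14u  /* Ethernet header length, the "+14" above */

/* Round 'x' up to the next multiple of 'a'; 'a' must be a power of two. */
unsigned int demo_align_up(unsigned int x, unsigned int a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Largest frame the buffer is sized for at a given MTU. */
unsigned int demo_frame_len(unsigned int mtu)
{
	return mtu + DEMO_ETH_HLEN;
}
```

      Under that assumption demo_frame_len(1500) is 1514 and
      demo_align_up(1514, 64) is 1536, the len shown in the sample dmesg.
      The gap between the two values is the range of oversized packets the
      aligned datasize would let through.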
      
      Cc: stable@vger.kernel.org
      Fixes: 2fbbd712 ("net: mana: Enable RX path to handle various MTU sizes")
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Reviewed-by: Dexuan Cui <decui@microsoft.com>
      Link: https://lore.kernel.org/r/1712087316-20886-1-git-send-email-haiyangz@microsoft.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/sched: fix lockdep splat in qdisc_tree_reduce_backlog() · 7eb32236
      Eric Dumazet authored
      qdisc_tree_reduce_backlog() is called with the qdisc lock held,
      not RTNL.
      
      We must use qdisc_lookup_rcu() instead of qdisc_lookup()
      
      syzbot reported:
      
      WARNING: suspicious RCU usage
      6.1.74-syzkaller #0 Not tainted
      -----------------------------
      net/sched/sch_api.c:305 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      3 locks held by udevd/1142:
        #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
        #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
        #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: net_tx_action+0x64a/0x970 net/core/dev.c:5282
        #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:350 [inline]
        #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: net_tx_action+0x754/0x970 net/core/dev.c:5297
        #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
        #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
        #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: qdisc_tree_reduce_backlog+0x84/0x580 net/sched/sch_api.c:792
      
      stack backtrace:
      CPU: 1 PID: 1142 Comm: udevd Not tainted 6.1.74-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
      Call Trace:
       <TASK>
        [<ffffffff85b85f14>] __dump_stack lib/dump_stack.c:88 [inline]
        [<ffffffff85b85f14>] dump_stack_lvl+0x1b1/0x28f lib/dump_stack.c:106
        [<ffffffff85b86007>] dump_stack+0x15/0x1e lib/dump_stack.c:113
        [<ffffffff81802299>] lockdep_rcu_suspicious+0x1b9/0x260 kernel/locking/lockdep.c:6592
        [<ffffffff84f0054c>] qdisc_lookup+0xac/0x6f0 net/sched/sch_api.c:305
        [<ffffffff84f037c3>] qdisc_tree_reduce_backlog+0x243/0x580 net/sched/sch_api.c:811
        [<ffffffff84f5b78c>] pfifo_tail_enqueue+0x32c/0x4b0 net/sched/sch_fifo.c:51
        [<ffffffff84fbcf63>] qdisc_enqueue include/net/sch_generic.h:833 [inline]
        [<ffffffff84fbcf63>] netem_dequeue+0xeb3/0x15d0 net/sched/sch_netem.c:723
        [<ffffffff84eecab9>] dequeue_skb net/sched/sch_generic.c:292 [inline]
        [<ffffffff84eecab9>] qdisc_restart net/sched/sch_generic.c:397 [inline]
        [<ffffffff84eecab9>] __qdisc_run+0x249/0x1e60 net/sched/sch_generic.c:415
        [<ffffffff84d7aa96>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
        [<ffffffff84d85d29>] net_tx_action+0x7c9/0x970 net/core/dev.c:5313
        [<ffffffff85e002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:616
        [<ffffffff81568bca>] invoke_softirq kernel/softirq.c:447 [inline]
        [<ffffffff81568bca>] __irq_exit_rcu+0xca/0x230 kernel/softirq.c:700
        [<ffffffff81568ae9>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:712
        [<ffffffff85b89f52>] sysvec_apic_timer_interrupt+0x42/0x90 arch/x86/kernel/apic/apic.c:1107
        [<ffffffff85c00ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:656
      
      Fixes: d636fc5d ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20240402134133.2352776-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping · de99e1ea
      Horatiu Vultur authored
      There are 2 issues with the blamed commit.
      1. When the PHY is initialized, it enables the disabling of UDPv4
         checksums. The UDPv6 checksum is already enabled by default. When
         1-step is then configured, it clears these flags.
      2. If 2-step is configured after 1-step, 1-step stays configured
         because its flag is not cleared, so the sync frames still have
         origin timestamps set.
      
      Fix this by first reading the value of the register and then
      changing only bit 12, which determines whether the timestamp needs to
      be inserted in the frame, without changing any other bits.
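
The read-modify-write described above can be sketched roughly as follows. This is a standalone illustration, not the driver code: the macro `TOY_TS_INSERT_BIT` and the helper `toy_set_ts_insert()` are hypothetical names, and the register is modeled as a plain 16-bit value instead of a PHY register access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: bit 12 selects whether the timestamp is inserted
 * in the frame (1-step). All other bits belong to unrelated settings. */
#define TOY_TS_INSERT_BIT (UINT16_C(1) << 12)

/* Take the value read from the register and return the value to write
 * back, with only bit 12 changed and every other bit preserved. */
static uint16_t toy_set_ts_insert(uint16_t reg, bool one_step)
{
	if (one_step)
		return reg | TOY_TS_INSERT_BIT;

	/* Switching to 2-step must clear the bit explicitly. */
	return reg & (uint16_t)~TOY_TS_INSERT_BIT;
}
```

Passing the current register value through the helper both when enabling and when disabling 1-step addresses the two issues in this model: unrelated flags survive, and the 1-step bit is cleared when 2-step is selected.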
      
      Fixes: ece19502 ("net: phy: micrel: 1588 support for LAN8814 phy")
      Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
      Link: https://lore.kernel.org/r/20240402071634.2483524-1-horatiu.vultur@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      de99e1ea
    • Piotr Wejman's avatar
      net: stmmac: fix rx queue priority assignment · b3da86d4
      Piotr Wejman authored
      The driver should ensure that the same priority is not mapped to
      multiple rx queues. From the DesignWare Cores Ethernet Quality-of-Service
      Databook, section 17.1.29 MAC_RxQ_Ctrl2:
      "[...]The software must ensure that the content of this field is
      mutually exclusive to the PSRQ fields for other queues, that is,
      the same priority is not mapped to multiple Rx queues[...]"
      
      Previously, the rx_queue_priority() function was:
      - clearing all priorities from a queue
      - adding new priorities to that queue
      After this patch, it will:
      - first assign new priorities to a queue
      - then remove those priorities from all other queues
      - keep other priorities previously assigned to that queue
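
The new assignment order can be sketched as below. This is a simplified model under stated assumptions, not the stmmac code: each queue's PSRQ field is represented as one byte in an array, and `TOY_NUM_RX_QUEUES` and `toy_rx_queue_priority()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_RX_QUEUES 4 /* hypothetical queue count for the sketch */

/* psrq[q] is a bitmask of the priorities currently mapped to queue q. */
static void toy_rx_queue_priority(uint8_t psrq[TOY_NUM_RX_QUEUES],
				  unsigned int queue, uint8_t prio)
{
	unsigned int q;

	if (queue >= TOY_NUM_RX_QUEUES)
		return;

	/* First assign the new priorities, keeping the ones already
	 * mapped to this queue. */
	psrq[queue] |= prio;

	/* Then remove those priorities from every other queue so that
	 * no priority is mapped to more than one queue. */
	for (q = 0; q < TOY_NUM_RX_QUEUES; q++) {
		if (q != queue)
			psrq[q] &= (uint8_t)~prio;
	}
}
```

For example, starting from masks 0x03, 0x0c, 0x30, 0xc0 and assigning priority mask 0x04 to queue 0 leaves queue 0 with 0x07 and queue 1 with 0x08, so the masks stay mutually exclusive.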
      
      Fixes: a8f5102a ("net: stmmac: TX and RX queue priority configuration")
      Fixes: 2142754f ("net: stmmac: Add MAC related callbacks for XGMAC2")
      Signed-off-by: Piotr Wejman <piotrwejman90@gmail.com>
      Link: https://lore.kernel.org/r/20240401192239.33942-1-piotrwejman90@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b3da86d4
    • Duanqiang Wen's avatar
      net: txgbe: fix i2c dev name cannot match clkdev · c644920c
      Duanqiang Wen authored
      The txgbe clkdev clk_name was shortened, so the i2c_dev info_name
      needs to be shortened as well. Otherwise, i2c_dev cannot initialize
      its clock.
      
      Fixes: e30cef00 ("net: txgbe: fix clk_name exceed MAX_DEV_ID limits")
      Signed-off-by: Duanqiang Wen <duanqiangwen@net-swift.com>
      Link: https://lore.kernel.org/r/20240402021843.126192-1-duanqiangwen@net-swift.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c644920c
    • Jakub Kicinski's avatar
      Merge branch 'net-fec-fix-to-suspend-resume-with-mac_managed_pm' · 22c5e0bc
      Jakub Kicinski authored
      John Ernberg says:
      
      ====================
      net: fec: Fix to suspend / resume with mac_managed_pm
      
      Since the introduction of mac_managed_pm in the FEC driver there were some
      discrepancies regarding power management of the PHY.
      
      This failed on our board that has a permanently powered Microchip LAN8700R
      attached to the FEC. Although the root cause of the failure can be traced
      back to f166f890 ("net: ethernet: fec: Replace interrupt driven MDIO
      with polled IO") and probably even before that, we only started noticing
      the problem going from 5.10 to 6.1.
      
      557d5dc8 ("net: fec: use mac-managed PHY PM") is actually a fix
      for most of the power management sequencing problems that came with power
      managing the MDIO bus, which for the FEC meant adding a race between FEC
      resume (and phy_start() if netif was running) and PHY resume.
      
      That it worked before for us was probably just luck...
      
      Thanks to Wei's response to my report at [1] I was able to pick up his
      patch and start homing in on the remaining missing details.
      
      [1]: https://lore.kernel.org/netdev/1f45bdbe-eab1-4e59-8f24-add177590d27@actia.se/
      
      v3: https://lore.kernel.org/netdev/20240306133734.4144808-1-john.ernberg@actia.se/
      v2: https://lore.kernel.org/netdev/20240229105256.2903095-1-john.ernberg@actia.se/
      v1: https://lore.kernel.org/netdev/20240212105010.2258421-1-john.ernberg@actia.se/
      ====================
      
      Link: https://lore.kernel.org/r/20240328155909.59613-1-john.ernberg@actia.se
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      22c5e0bc
    • Wei Fang's avatar
      net: fec: Set mac_managed_pm during probe · cbc17e78
      Wei Fang authored
      Setting mac_managed_pm during interface up is too late.
      
      In situations where the link is not brought up yet and the system suspends,
      the regular PHY power management will run. Since the FEC ETHEREN control
      bit is cleared (automatically) on suspend, the controller is off in resume.
      When the regular PHY power management resume path runs in this context, it
      will write to the MII_DATA register but nothing will be transmitted on the
      MDIO bus.
      
      This can be observed by the following log:
      
          fec 5b040000.ethernet eth0: MDIO read timeout
          Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0xc8 returns -110
          Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: failed to resume: error -110
      
      The data written will however remain in the MII_DATA register.
      
      When the link later is set to administrative up it will trigger a call to
      fec_restart() which will restore the MII_SPEED register. This triggers the
      quirk explained in f166f890 ("net: ethernet: fec: Replace interrupt
      driven MDIO with polled IO") causing an extra MII_EVENT.
      
      This extra event desynchronizes all the MDIO register reads, causing them
      to complete too early. As a result, all reads return 0 because
      fec_enet_mdio_wait() returns too early.
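
The desynchronization can be illustrated with a toy model. Everything below is an assumption made for illustration: `struct toy_mdio` and the `toy_*` helpers are hypothetical and do not reflect the FEC hardware or driver code. The model only shows how one stale completion event makes every later wait return before its own transfer has finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an MDIO controller with a completion event counter. */
struct toy_mdio {
	int pending_events;  /* completion events not yet consumed */
	bool in_flight;      /* a read was started and has not completed */
	uint16_t data;       /* data register, 0 until the read completes */
	uint16_t phy_value;  /* value the PHY would answer with */
};

/* The modeled hardware finishes the transfer and raises an event. */
static void toy_hw_complete(struct toy_mdio *m)
{
	if (!m->in_flight)
		return;
	m->data = m->phy_value;
	m->in_flight = false;
	m->pending_events++;
}

/* Wait for completion: returns at once if an event is already pending. */
static void toy_wait(struct toy_mdio *m)
{
	if (m->pending_events == 0)
		toy_hw_complete(m);
	m->pending_events--;
}

static uint16_t toy_read(struct toy_mdio *m)
{
	uint16_t val;

	m->in_flight = true;
	m->data = 0;
	toy_wait(m);
	val = m->data;       /* sampled as soon as the wait returns */
	toy_hw_complete(m);  /* a late completion leaves a new stale event */
	return val;
}
```

In this model, a controller that starts with `pending_events = 0` returns the PHY value on every read, while one that starts with a single extra event returns 0 on every read, because each wait consumes the event left over from the previous transfer.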
      
      When a Microchip LAN8700R PHY is connected to the FEC, the 0 reads cause
      the PHY to be initialized incorrectly and the PHY will not transmit any
      ethernet signal in this state. It cannot be brought out of this state
      without a power cycle of the PHY.
      
      Fixes: 557d5dc8 ("net: fec: use mac-managed PHY PM")
      Closes: https://lore.kernel.org/netdev/1f45bdbe-eab1-4e59-8f24-add177590d27@actia.se/
      Signed-off-by: Wei Fang <wei.fang@nxp.com>
      [jernberg: commit message]
      Signed-off-by: John Ernberg <john.ernberg@actia.se>
      Link: https://lore.kernel.org/r/20240328155909.59613-2-john.ernberg@actia.se
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cbc17e78
  2. 03 Apr, 2024 10 commits
  3. 02 Apr, 2024 7 commits
  4. 29 Mar, 2024 1 commit