1. 27 Feb, 2024 18 commits
  2. 26 Feb, 2024 3 commits
  3. 24 Feb, 2024 1 commit
  4. 23 Feb, 2024 8 commits
    • ps3/gelic: Fix SKB allocation · b0b1210b
      Geoff Levand authored
      Commit 3ce4f9c3 ("net/ps3_gelic_net: Add gelic_descr structures") of
      6.8-rc1 had a copy-and-paste error where the pointer that holds the
      allocated SKB (struct gelic_descr.skb) was set to NULL after the SKB was
      allocated. This resulted in a kernel panic when the SKB pointer was
      accessed.
      
      This fix moves the initialization of the gelic_descr to before the SKB
      is allocated.
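
      A minimal user-space sketch of the ordering problem, assuming a
      hypothetical struct demo_descr and hypothetical demo_prepare_* helpers
      (none of these names come from the driver; malloc() stands in for the
      SKB allocation):

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a receive descriptor; not the driver's layout. */
struct demo_descr {
	void *skb;          /* buffer owned by this descriptor */
	unsigned int flags; /* any other per-descriptor state */
};

/* Buggy order: the descriptor is reset after the buffer pointer is stored. */
int demo_prepare_buggy(struct demo_descr *d, size_t len)
{
	void *buf = malloc(len);

	if (!buf)
		return -1;
	d->skb = buf;
	memset(d, 0, sizeof(*d)); /* clears d->skb, so later users see NULL */
	free(buf);                /* demo only: avoid leaking the lost buffer */
	return 0;
}

/* Fixed order: reset the descriptor first, then allocate and store. */
int demo_prepare_fixed(struct demo_descr *d, size_t len)
{
	memset(d, 0, sizeof(*d));
	d->skb = malloc(len);
	return d->skb ? 0 : -1;
}
```

      After demo_prepare_buggy() the descriptor holds a NULL pointer even
      though the allocation succeeded; after demo_prepare_fixed() the pointer
      survives.
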
      Reported-by: sambat goson <sombat3960@gmail.com>
      Fixes: 3ce4f9c3 ("net/ps3_gelic_net: Add gelic_descr structures")
      Signed-off-by: Geoff Levand <geoff@infradead.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dpaa: fman_memac: accept phy-interface-type = "10gbase-r" in the device tree · 734f06db
      Vladimir Oltean authored
      Since commit 5d93cfcf ("net: dpaa: Convert to phylink"), we support
      the "10gbase-r" phy-mode through a driver-based conversion of "xgmii",
      but we still don't actually support it when the device tree specifies
      "10gbase-r" proper.
      
      This is because boards such as LS1046A-RDB do not define pcs-handle-names
      (for whatever reason) in the ethernet@f0000 device tree node, and the
      code enters through this code path:
      
      	err = of_property_match_string(mac_node, "pcs-handle-names", "xfi");
      	// code takes neither branch and falls through
      	if (err >= 0) {
      		(...)
      	} else if (err != -EINVAL && err != -ENODATA) {
      		goto _return_fm_mac_free;
      	}
      
      	(...)
      
      	/* For compatibility, if pcs-handle-names is missing, we assume this
      	 * phy is the first one in pcsphy-handle
      	 */
      	err = of_property_match_string(mac_node, "pcs-handle-names", "sgmii");
      	if (err == -EINVAL || err == -ENODATA)
      		pcs = memac_pcs_create(mac_node, 0); // code takes this branch
      	else if (err < 0)
      		goto _return_fm_mac_free;
      	else
      		pcs = memac_pcs_create(mac_node, err);
      
      	// A default PCS is created and saved in "pcs"
      
      	// This determination fails and mistakenly saves the default PCS
      	// in memac->sgmii_pcs instead of memac->xfi_pcs, because at this
      	// stage, mac_dev->phy_if == PHY_INTERFACE_MODE_10GBASER.
      	if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII)
      		memac->xfi_pcs = pcs;
      	else
      		memac->sgmii_pcs = pcs;
      
      In other words, in the absence of pcs-handle-names, the default
      xfi_pcs assignment logic only works when in the device tree we have
      PHY_INTERFACE_MODE_XGMII.
      
      By reversing the order between the fallback xfi_pcs assignment and the
      "xgmii" overwrite with "10gbase-r", we are able to support both values
      in the device tree, with identical behavior.
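
      The effect of swapping the two steps can be sketched as follows. This
      is a simplified model, not the driver code: the demo_* enums and
      functions are hypothetical, and "names_missing" stands for the absence
      of pcs-handle-names.

```c
/* Hypothetical, simplified model of the phy mode and the two PCS slots. */
enum demo_phy_if { DEMO_SGMII, DEMO_XGMII, DEMO_10GBASER };
enum demo_slot { DEMO_SLOT_SGMII, DEMO_SLOT_XFI };

/* Old order: pick the slot first, rewrite "xgmii" to "10gbase-r" later. */
enum demo_slot demo_slot_old(enum demo_phy_if dt_mode, int names_missing)
{
	enum demo_phy_if mode = dt_mode;
	enum demo_slot slot;

	slot = (names_missing && mode == DEMO_XGMII) ? DEMO_SLOT_XFI
						     : DEMO_SLOT_SGMII;
	if (mode == DEMO_XGMII)
		mode = DEMO_10GBASER; /* too late to influence the slot */
	(void)mode;
	return slot;
}

/* New order: rewrite first, then pick the slot based on "10gbase-r". */
enum demo_slot demo_slot_new(enum demo_phy_if dt_mode, int names_missing)
{
	enum demo_phy_if mode = dt_mode;

	if (mode == DEMO_XGMII)
		mode = DEMO_10GBASER;
	return (names_missing && mode == DEMO_10GBASER) ? DEMO_SLOT_XFI
							: DEMO_SLOT_SGMII;
}
```

      In this model, the old order routes a "10gbase-r" device tree to the
      SGMII slot, while the new order sends both "xgmii" and "10gbase-r" to
      the XFI slot.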
      
      Currently, it is impossible to make the s/xgmii/10gbase-r/ device tree
      conversion, because it would break forward compatibility (new device
      tree with old kernel). The only way to modify existing device trees to
      phy-interface-mode = "10gbase-r" is to fix stable kernels to accept this
      value and handle it properly.
      
      One reason why the conversion is desirable is that with pre-phylink
      kernels, the Aquantia PHY driver used to warn about the improper use
      of PHY_INTERFACE_MODE_XGMII [1]. It is best to have a single (latest)
      device tree that works with all supported stable kernel versions.
      
      Note that the blamed commit does not constitute a regression per se.
      Older stable kernels like 6.1 still do not work with "10gbase-r", but
      for a different reason. That is a battle for another time.
      
      [1] https://lore.kernel.org/netdev/20240214-ls1046-dts-use-10gbase-r-v1-1-8c2d68547393@concurrent-rt.com/
      
      Fixes: 5d93cfcf ("net: dpaa: Convert to phylink")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Sean Anderson <sean.anderson@seco.com>
      Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mctp: take ownership of skb in mctp_local_output · 3773d65a
      Jeremy Kerr authored
      Currently, mctp_local_output only takes ownership of skb on success, and
      we may leak an skb if mctp_local_output fails in specific states; the
      skb ownership isn't transferred until the actual output routing occurs.
      
      Instead, make mctp_local_output free the skb on all error paths up to
      the route action, so it always consumes the passed skb.
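
      The "always consumes the buffer" convention can be sketched in
      user-space C. All demo_* names are hypothetical, and the live-buffer
      counter exists only so the example can show that nothing leaks.

```c
#include <stdlib.h>

struct demo_buf { int len; };

static int demo_live_bufs; /* number of buffers not yet freed */

struct demo_buf *demo_buf_alloc(int len)
{
	struct demo_buf *b = malloc(sizeof(*b));

	if (b) {
		b->len = len;
		demo_live_bufs++;
	}
	return b;
}

void demo_buf_free(struct demo_buf *b)
{
	if (b) {
		free(b);
		demo_live_bufs--;
	}
}

int demo_live_count(void)
{
	return demo_live_bufs;
}

/* Consumes buf on every path, so callers never free it themselves. */
int demo_local_output(struct demo_buf *buf, int have_route, int have_tag)
{
	int rc;

	if (!have_route) {
		rc = -1;
		goto out_free;
	}
	if (!have_tag) {
		rc = -2;
		goto out_free;
	}
	/* Success: the output step owns the buffer; modelled as a free. */
	demo_buf_free(buf);
	return 0;

out_free:
	demo_buf_free(buf);
	return rc;
}
```

      With a single out_free label, a newly added early error return is less
      likely to skip the free.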
      
      Fixes: 833ef3b9 ("mctp: Populate socket implementation")
      Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240220081053.1439104-1-jk@codeconstruct.com.au
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · e872469c
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2024-02-20 (ice)
      
      This series contains updates to ice driver only.
      
      Yochai sets parent device to properly reflect connection state between
      source DPLL and output pin.
      
      Arkadiusz fixes additional issues related to DPLL; proper reporting of
      phase_adjust value and preventing use/access of data while resetting.
      
      Amritha resolves ASSERT_RTNL() being triggered on certain reset/rebuild
      flows.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: Fix ASSERT_RTNL() warning during certain scenarios
        ice: fix pin phase adjust updates on PF reset
        ice: fix dpll periodic work data updates on PF reset
        ice: fix dpll and dpll_pin data access on PF reset
        ice: fix dpll input pin phase_adjust value updates
        ice: fix connection state of DPLL and out pin
      ====================
      Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/r/20240220214444.1039759-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ip_tunnel: prevent perpetual headroom growth · 5ae1e992
      Florian Westphal authored
      syzkaller triggered the following KASAN splat:
      BUG: KASAN: use-after-free in __skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
      Read of size 1 at addr ffff88812fb4000e by task syz-executor183/5191
      [..]
       kasan_report+0xda/0x110 mm/kasan/report.c:588
       __skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
       skb_flow_dissect_flow_keys include/linux/skbuff.h:1514 [inline]
       ___skb_get_hash net/core/flow_dissector.c:1791 [inline]
       __skb_get_hash+0xc7/0x540 net/core/flow_dissector.c:1856
       skb_get_hash include/linux/skbuff.h:1556 [inline]
       ip_tunnel_xmit+0x1855/0x33c0 net/ipv4/ip_tunnel.c:748
       ipip_tunnel_xmit+0x3cc/0x4e0 net/ipv4/ipip.c:308
       __netdev_start_xmit include/linux/netdevice.h:4940 [inline]
       netdev_start_xmit include/linux/netdevice.h:4954 [inline]
       xmit_one net/core/dev.c:3548 [inline]
       dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
       __dev_queue_xmit+0x7c1/0x3d60 net/core/dev.c:4349
       dev_queue_xmit include/linux/netdevice.h:3134 [inline]
       neigh_connected_output+0x42c/0x5d0 net/core/neighbour.c:1592
       ...
       ip_finish_output2+0x833/0x2550 net/ipv4/ip_output.c:235
       ip_finish_output+0x31/0x310 net/ipv4/ip_output.c:323
       ..
       iptunnel_xmit+0x5b4/0x9b0 net/ipv4/ip_tunnel_core.c:82
       ip_tunnel_xmit+0x1dbc/0x33c0 net/ipv4/ip_tunnel.c:831
       ipgre_xmit+0x4a1/0x980 net/ipv4/ip_gre.c:665
       __netdev_start_xmit include/linux/netdevice.h:4940 [inline]
       netdev_start_xmit include/linux/netdevice.h:4954 [inline]
       xmit_one net/core/dev.c:3548 [inline]
       dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
       ...
      
      The splat occurs because skb->data points past skb->head allocated area.
      This is because neigh layer does:
        __skb_pull(skb, skb_network_offset(skb));
      
      ... but skb_network_offset() returns a negative offset and __skb_pull()
      arg is unsigned.  IOW, skb->data gets "adjusted" by a huge value.
      
      The negative value is returned because skb->head and skb->data distance is
      more than 64k and skb->network_header (u16) has wrapped around.
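
      The wraparound can be reproduced with plain integer arithmetic. The
      sketch below assumes a hypothetical helper, demo_network_offset(), that
      models offsets from the start of the buffer instead of using real skb
      fields.

```c
#include <stdint.h>

/*
 * Hypothetical model: both arguments are true distances from the start of
 * the buffer. The network header offset is stored in 16 bits, as the text
 * describes, so values above 65535 are truncated.
 */
int demo_network_offset(uint32_t data_off, uint32_t true_nh_off)
{
	uint16_t network_header = (uint16_t)true_nh_off; /* wraps above 64k */

	return (int)network_header - (int)data_off;
}
```

      With both distances at 70000, the stored header offset wraps to 4464
      and the helper returns -65536. Passing that to a function taking an
      unsigned length turns it into a value close to 4 GiB.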
      
      The bug is in the ip_tunnel infrastructure, which can cause
      dev->needed_headroom to increment ad infinitum.
      
      The syzkaller reproducer consists of packets getting routed via a gre
      tunnel, and route of gre encapsulated packets pointing at another (ipip)
      tunnel.  The ipip encapsulation finds gre0 as next output device.
      
      This results in the following pattern:
      
      1) First packet is to be sent out via gre0.
         Route lookup found an output device, ipip0.

      2) ip_tunnel_xmit for gre0 bumps gre0->needed_headroom based on the future
         output device, rt.dev->needed_headroom (ipip0).

      3) ip output / start_xmit moves skb on to ipip0, which runs the same
         code path again (xmit recursion).

      4) Routing step for the post-gre0-encap packet finds gre0 as output device
         to use for ipip0 encapsulated packet.

         tunl0->needed_headroom is then incremented based on the (already bumped)
         gre0 device headroom.
      
      This repeats for every future packet:
      
      gre0->needed_headroom gets inflated because previous packets' ipip0 step
      incremented rt->dev (gre0) headroom, and ipip0 incremented because gre0
      needed_headroom was increased.
      
      For each subsequent packet, gre/ipip0->needed_headroom grows until
      post-expand-head reallocations result in a skb->head/data distance of
      more than 64k.
      
      Once that happens, skb->network_header (u16) wraps around when
      pskb_expand_head tries to make sure that skb_network_offset() is unchanged
      after the headroom expansion/reallocation.
      
      After this skb_network_offset(skb) returns a different (and negative)
      result post headroom expansion.
      
      The next trip to neigh layer (or anything else that would __skb_pull the
      network header) makes skb->data point to a memory location outside
      skb->head area.
      
      v2: Cap the needed_headroom update to an arbitrarily chosen upper limit to
      prevent perpetual increase instead of dropping the headroom increment
      completely.
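
      A small simulation of the feedback loop and of the capped update. The
      cap of 512 and the per-hop overheads of 24 and 20 bytes are assumptions
      made for this sketch, and the demo_* names are hypothetical.

```c
#define DEMO_MAX_HEADROOM 512u /* assumed cap, for illustration only */

/* Raise the stored headroom to "wanted", optionally clamping it first. */
unsigned int demo_adj_headroom(unsigned int current, unsigned int wanted,
			       int capped)
{
	if (capped && wanted > DEMO_MAX_HEADROOM)
		wanted = DEMO_MAX_HEADROOM;
	return wanted > current ? wanted : current;
}

/* Two tunnel devices that each size their headroom from the other one. */
unsigned int demo_simulate(unsigned int packets, int capped)
{
	unsigned int gre = 0, ipip = 0, i;

	for (i = 0; i < packets; i++) {
		gre = demo_adj_headroom(gre, ipip + 24u, capped);
		ipip = demo_adj_headroom(ipip, gre + 20u, capped);
	}
	return gre;
}
```

      Without the cap, the simulated headroom exceeds 64k after a couple of
      thousand packets. With the cap, it settles at the limit.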
      
      Reported-and-tested-by: syzbot+bfde3bef047a81b8fde6@syzkaller.appspotmail.com
      Closes: https://groups.google.com/g/syzkaller-bugs/c/fL9G6GtWskY/m/VKk_PR5FBAAJ
      Fixes: 243aad83 ("ip_gre: include route header_len in max_headroom calculation")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240220135606.4939-1-fw@strlen.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: smsc95xx: add support for SYS TEC USB-SPEmodule1 · 45532b21
      Andre Werner authored
      This patch adds support for the SYS TEC USB-SPEmodule1 10Base-T1L
      ethernet device to the existing smsc95xx driver by adding the new
      USB VID/PID pair.
      Signed-off-by: Andre Werner <andre.werner@systec-electronic.com>
      Link: https://lore.kernel.org/r/20240219053413.4732-1-andre.werner@systec-electronic.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netlink: add nla be16/32 types to minlen array · 9a0d1885
      Florian Westphal authored
      BUG: KMSAN: uninit-value in nla_validate_range_unsigned lib/nlattr.c:222 [inline]
      BUG: KMSAN: uninit-value in nla_validate_int_range lib/nlattr.c:336 [inline]
      BUG: KMSAN: uninit-value in validate_nla lib/nlattr.c:575 [inline]
      BUG: KMSAN: uninit-value in __nla_validate_parse+0x2e20/0x45c0 lib/nlattr.c:631
       nla_validate_range_unsigned lib/nlattr.c:222 [inline]
       nla_validate_int_range lib/nlattr.c:336 [inline]
       validate_nla lib/nlattr.c:575 [inline]
      ...
      
      The message in question matches this policy:
      
       [NFTA_TARGET_REV]       = NLA_POLICY_MAX(NLA_BE32, 255),
      
      but because NLA_BE32 size in minlen array is 0, the validation
      code will read past the malformed (too small) attribute.
      
      Note: Other attributes, e.g. BITFIELD32, SINT and UINT, are also missing;
      those likely should be added too.
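
      Why a missing table entry disables the length check can be shown with
      a tiny lookup table. The demo_* types and arrays are hypothetical and
      only mimic the idea of a per-type minimum length array.

```c
#include <stddef.h>
#include <stdint.h>

enum demo_type { DEMO_U32, DEMO_BE32, DEMO_TYPE_MAX };

/* Table with the big-endian type missing: its entry defaults to 0. */
const size_t demo_minlen_old[DEMO_TYPE_MAX] = {
	[DEMO_U32] = sizeof(uint32_t),
};

/* Table with the big-endian type added. */
const size_t demo_minlen_new[DEMO_TYPE_MAX] = {
	[DEMO_U32]  = sizeof(uint32_t),
	[DEMO_BE32] = sizeof(uint32_t),
};

/* Returns 1 if an attribute of attr_len bytes passes the length check. */
int demo_len_ok(const size_t *minlen, enum demo_type type, size_t attr_len)
{
	return attr_len >= minlen[type];
}
```

      With the old table, a 1-byte attribute of the big-endian type passes
      the check, so a later 4-byte read would run past it. With the new
      table, it is rejected.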
      
      Reported-by: syzbot+3f497b07aa3baf2fb4d0@syzkaller.appspotmail.com
      Reported-by: xingwei lee <xrivendell7@gmail.com>
      Closes: https://lore.kernel.org/all/CABOYnLzFYHSnvTyS6zGa-udNX55+izqkOt2sB9WDqUcEGW6n8w@mail.gmail.com/raw
      Fixes: ecaf75ff ("netlink: introduce bigendian integer types")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Link: https://lore.kernel.org/r/20240221172740.5092-1-fw@strlen.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter · 661779e1
      Ryosuke Yasuoka authored
      syzbot reported the following uninit-value access issue [1]:
      
      netlink_to_full_skb() creates a new `skb` and puts the `skb->data`
      passed as the 1st arg of netlink_to_full_skb() onto the new `skb`. The data
      size is specified as `len` and passed to skb_put_data(). This `len`
      is based on `skb->end`, which is not the data offset but the buffer offset:
      `skb->end` covers both data and tailroom. Since the tailroom is not
      initialized when the new `skb` is created, KMSAN detects an uninitialized
      memory area when copying the data.

      This patch resolves the issue by correcting the len from `skb->end` to
      `skb->len`, which is the actual data offset.
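
      The difference between copying the buffer size and copying the data
      length can be sketched as below. The demo_* names are hypothetical,
      and the 0xAA fill pattern stands in for stale tailroom contents so the
      example stays well defined.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 16
#define DEMO_STALE 0xAA /* marks bytes that were never written as data */

struct demo_skb {
	unsigned char buf[DEMO_BUF_SIZE]; /* data followed by tailroom */
	size_t len;                       /* bytes of valid data */
};

/* Store len bytes of data; the rest of the buffer keeps the stale pattern. */
void demo_skb_fill(struct demo_skb *skb, const void *data, size_t len)
{
	memset(skb->buf, DEMO_STALE, sizeof(skb->buf));
	if (len > sizeof(skb->buf))
		len = sizeof(skb->buf);
	memcpy(skb->buf, data, len);
	skb->len = len;
}

/* Length taken from the buffer size: tailroom is copied as well. */
size_t demo_copy_whole_buffer(unsigned char *dst, const struct demo_skb *skb)
{
	memcpy(dst, skb->buf, sizeof(skb->buf));
	return sizeof(skb->buf);
}

/* Length taken from the data length: only valid bytes are copied. */
size_t demo_copy_data_only(unsigned char *dst, const struct demo_skb *skb)
{
	memcpy(dst, skb->buf, skb->len);
	return skb->len;
}
```

      Only the second variant keeps the stale bytes out of the destination.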
      
      BUG: KMSAN: kernel-infoleak-after-free in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
      BUG: KMSAN: kernel-infoleak-after-free in copy_to_user_iter lib/iov_iter.c:24 [inline]
      BUG: KMSAN: kernel-infoleak-after-free in iterate_ubuf include/linux/iov_iter.h:29 [inline]
      BUG: KMSAN: kernel-infoleak-after-free in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
      BUG: KMSAN: kernel-infoleak-after-free in iterate_and_advance include/linux/iov_iter.h:271 [inline]
      BUG: KMSAN: kernel-infoleak-after-free in _copy_to_iter+0x364/0x2520 lib/iov_iter.c:186
       instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       copy_to_user_iter lib/iov_iter.c:24 [inline]
       iterate_ubuf include/linux/iov_iter.h:29 [inline]
       iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       iterate_and_advance include/linux/iov_iter.h:271 [inline]
       _copy_to_iter+0x364/0x2520 lib/iov_iter.c:186
       copy_to_iter include/linux/uio.h:197 [inline]
       simple_copy_to_iter+0x68/0xa0 net/core/datagram.c:532
       __skb_datagram_iter+0x123/0xdc0 net/core/datagram.c:420
       skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546
       skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline]
       packet_recvmsg+0xd9c/0x2000 net/packet/af_packet.c:3482
       sock_recvmsg_nosec net/socket.c:1044 [inline]
       sock_recvmsg net/socket.c:1066 [inline]
       sock_read_iter+0x467/0x580 net/socket.c:1136
       call_read_iter include/linux/fs.h:2014 [inline]
       new_sync_read fs/read_write.c:389 [inline]
       vfs_read+0x8f6/0xe00 fs/read_write.c:470
       ksys_read+0x20f/0x4c0 fs/read_write.c:613
       __do_sys_read fs/read_write.c:623 [inline]
       __se_sys_read fs/read_write.c:621 [inline]
       __x64_sys_read+0x93/0xd0 fs/read_write.c:621
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was stored to memory at:
       skb_put_data include/linux/skbuff.h:2622 [inline]
       netlink_to_full_skb net/netlink/af_netlink.c:181 [inline]
       __netlink_deliver_tap_skb net/netlink/af_netlink.c:298 [inline]
       __netlink_deliver_tap+0x5be/0xc90 net/netlink/af_netlink.c:325
       netlink_deliver_tap net/netlink/af_netlink.c:338 [inline]
       netlink_deliver_tap_kernel net/netlink/af_netlink.c:347 [inline]
       netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
       netlink_unicast+0x10f1/0x1250 net/netlink/af_netlink.c:1368
       netlink_sendmsg+0x1238/0x13d0 net/netlink/af_netlink.c:1910
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
       ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
       __sys_sendmsg net/socket.c:2667 [inline]
       __do_sys_sendmsg net/socket.c:2676 [inline]
       __se_sys_sendmsg net/socket.c:2674 [inline]
       __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
       free_pages_prepare mm/page_alloc.c:1087 [inline]
       free_unref_page_prepare+0xb0/0xa40 mm/page_alloc.c:2347
       free_unref_page_list+0xeb/0x1100 mm/page_alloc.c:2533
       release_pages+0x23d3/0x2410 mm/swap.c:1042
       free_pages_and_swap_cache+0xd9/0xf0 mm/swap_state.c:316
       tlb_batch_pages_flush mm/mmu_gather.c:98 [inline]
       tlb_flush_mmu_free mm/mmu_gather.c:293 [inline]
       tlb_flush_mmu+0x6f5/0x980 mm/mmu_gather.c:300
       tlb_finish_mmu+0x101/0x260 mm/mmu_gather.c:392
       exit_mmap+0x49e/0xd30 mm/mmap.c:3321
       __mmput+0x13f/0x530 kernel/fork.c:1349
       mmput+0x8a/0xa0 kernel/fork.c:1371
       exit_mm+0x1b8/0x360 kernel/exit.c:567
       do_exit+0xd57/0x4080 kernel/exit.c:858
       do_group_exit+0x2fd/0x390 kernel/exit.c:1021
       __do_sys_exit_group kernel/exit.c:1032 [inline]
       __se_sys_exit_group kernel/exit.c:1030 [inline]
       __x64_sys_exit_group+0x3c/0x50 kernel/exit.c:1030
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Bytes 3852-3903 of 3904 are uninitialized
      Memory access of size 3904 starts at ffff88812ea1e000
      Data copied to user address 0000000020003280
      
      CPU: 1 PID: 5043 Comm: syz-executor297 Not tainted 6.7.0-rc5-syzkaller-00047-g5bd7ef53 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
      
      Fixes: 1853c949 ("netlink, mmap: transform mmap skb into full skb on taps")
      Reported-and-tested-by: syzbot+34ad5fab48f7bf510349@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=34ad5fab48f7bf510349 [1]
      Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240221074053.1794118-1-ryasuoka@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 22 Feb, 2024 10 commits
    • Merge tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 6714ebb9
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf and netfilter.
      
        Current release - regressions:
      
         - af_unix: fix another unix GC hangup
      
        Previous releases - regressions:
      
         - core: fix a possible AF_UNIX deadlock
      
         - bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready()
      
         - netfilter: nft_flow_offload: release dst in case direct xmit path
           is used
      
         - bridge: switchdev: ensure MDB events are delivered exactly once
      
         - l2tp: pass correct message length to ip6_append_data
      
         - dccp/tcp: unhash sk from ehash for tb2 alloc failure after
           check_estalblished()
      
         - tls: fixes for record type handling with PEEK
      
         - devlink: fix possible use-after-free and memory leaks in
           devlink_init()
      
        Previous releases - always broken:
      
         - bpf: fix an oops when attempting to read the vsyscall page through
           bpf_probe_read_kernel
      
         - sched: act_mirred: use the backlog for mirred ingress
      
         - netfilter: nft_flow_offload: fix dst refcount underflow
      
         - ipv6: sr: fix possible use-after-free and null-ptr-deref
      
         - mptcp: fix several data races
      
         - phonet: take correct lock to peek at the RX queue
      
        Misc:
      
         - handful of fixes and reliability improvements for selftests"
      
      * tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
        l2tp: pass correct message length to ip6_append_data
        net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY
        selftests: ioam: refactoring to align with the fix
        Fix write to cloned skb in ipv6_hop_ioam()
        phonet/pep: fix racy skb_queue_empty() use
        phonet: take correct lock to peek at the RX queue
        net: sparx5: Add spinlock for frame transmission from CPU
        net/sched: flower: Add lock protection when remove filter handle
        devlink: fix port dump cmd type
        net: stmmac: Fix EST offset for dwmac 5.10
        tools: ynl: don't leak mcast_groups on init error
        tools: ynl: make sure we always pass yarg to mnl_cb_run
        net: mctp: put sock on tag allocation failure
        netfilter: nf_tables: use kzalloc for hook allocation
        netfilter: nf_tables: register hooks last when adding new chain/flowtable
        netfilter: nft_flow_offload: release dst in case direct xmit path is used
        netfilter: nft_flow_offload: reset dst in route object after setting up flow
        netfilter: nf_tables: set dormant flag on hook register failure
        selftests: tls: add test for peeking past a record of a different type
        selftests: tls: add test for merging of same-type control messages
        ...
    • Merge tag 'trace-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · efa80dcb
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
      
       - While working on the ring buffer I noticed that the counter used for
         knowing where the end of the data is on a sub-buffer was not a full
         "int" but just 20 bits. It was masked out to 0xfffff.
      
         With the new code that allows the user to change the size of the
         sub-buffer, it is theoretically possible to ask for a size bigger
         than 2^20. If that happens, unexpected results may occur as there's
         no code checking if the counter overflowed the 20 bits of the write
         mask. There are other checks to make sure events fit in the
         sub-buffer, but if the sub-buffer itself is too big, that is not
         checked.
      
         Add a check in the resize of the sub-buffer to make sure that it
         never goes beyond the size of the counter that holds how much data is
         on it.
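
      The kind of check described above can be sketched as follows. The
      20-bit mask comes from the text; the demo_* names and the -1 error
      return are assumptions made for this example.

```c
#define DEMO_WRITE_MASK 0xfffffUL /* 20-bit write counter, per the text */

/* The position the counter can actually represent. */
unsigned long demo_write_index(unsigned long counter)
{
	return counter & DEMO_WRITE_MASK;
}

/* Refuse sub-buffer sizes that the counter could not address. */
int demo_set_subbuf_size(unsigned long *cur_size, unsigned long requested)
{
	if (requested > DEMO_WRITE_MASK)
		return -1; /* would overflow the 20-bit counter */
	*cur_size = requested;
	return 0;
}
```

      A request for 2^20 bytes is rejected and the previous size is kept,
      because a position of 2^20 + 5 would be read back as 5 after masking.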
      
      * tag 'trace-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        ring-buffer: Do not let subbuf be bigger than write mask
    • l2tp: pass correct message length to ip6_append_data · 359e54a9
      Tom Parkin authored
      l2tp_ip6_sendmsg needs to avoid accounting for the transport header
      twice when splicing more data into an already partially-occupied skbuff.
      
      To manage this, we check whether the skbuff contains data using
      skb_queue_empty when deciding how much data to append using
      ip6_append_data.
      
      However, the code which performed the calculation was incorrect:
      
           ulen = len + skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0;
      
      ...due to C operator precedence, this ends up setting ulen to
      transhdrlen for messages with a non-zero length, which results in
      corrupted packets on the wire.
      
      Add parentheses to correct the calculation in line with the original
      intent.
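
      The precedence issue can be checked in isolation. In the sketch below
      the demo_* functions are hypothetical; the first one spells out how
      the unparenthesized expression is parsed, the second one matches the
      intended calculation.

```c
/*
 * How "len + queue_empty ? transhdrlen : 0" is parsed: '+' binds tighter
 * than '?:', so the sum becomes the condition.
 */
int demo_ulen_buggy(int len, int queue_empty, int transhdrlen)
{
	return (len + queue_empty) ? transhdrlen : 0;
}

/* Intended meaning: add the header length only when the queue is empty. */
int demo_ulen_fixed(int len, int queue_empty, int transhdrlen)
{
	return len + (queue_empty ? transhdrlen : 0);
}
```

      For len = 100 and transhdrlen = 8, the buggy form yields 8 whether or
      not the queue is empty, while the fixed form yields 108 for an empty
      queue and 100 otherwise.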
      
      Fixes: 9d4c7580 ("ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()")
      Cc: David Howells <dhowells@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240220122156.43131-1-tparkin@katalix.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge tag 'nf-24-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 9ff27943
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) If user requests to wake up a table and hook fails, restore the
         dormant flag from the error path, from Florian Westphal.
      
      2) Reset dst after transferring it to the flow object, otherwise dst
         gets released twice from the error path.
      
      3) Release dst in case the flowtable selects a direct xmit path, eg.
         transmission to bridge port. Otherwise, dst is memleaked.
      
      4) Register basechain and flowtable hooks at the end of the command.
         Error path releases these datastructure without waiting for the
         rcu grace period.
      
      5) Use kzalloc() to initialize struct nft_hook to fix a KMSAN report
         on access to hook type, also from Florian Westphal.
      
      netfilter pull request 24-02-22
      
      * tag 'nf-24-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: use kzalloc for hook allocation
        netfilter: nf_tables: register hooks last when adding new chain/flowtable
        netfilter: nft_flow_offload: release dst in case direct xmit path is used
        netfilter: nft_flow_offload: reset dst in route object after setting up flow
        netfilter: nf_tables: set dormant flag on hook register failure
      ====================
      
      Link: https://lore.kernel.org/r/20240222000843.146665-1-pablo@netfilter.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · fdcd4467
      Paolo Abeni authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2024-02-22
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 11 non-merge commits during the last 24 day(s) which contain
      a total of 15 files changed, 217 insertions(+), 17 deletions(-).
      
      The main changes are:
      
      1) Fix a syzkaller-triggered oops when attempting to read the vsyscall
         page through bpf_probe_read_kernel and friends, from Hou Tao.
      
      2) Fix a kernel panic due to uninitialized iter position pointer in
         bpf_iter_task, from Yafang Shao.
      
      3) Fix a race between bpf_timer_cancel_and_free and bpf_timer_cancel,
         from Martin KaFai Lau.
      
      4) Fix a xsk warning in skb_add_rx_frag() (under CONFIG_DEBUG_NET)
         due to incorrect truesize accounting, from Sebastian Andrzej Siewior.
      
      5) Fix a NULL pointer dereference in sk_psock_verdict_data_ready,
         from Shigeru Yoshida.
      
      6) Fix a resolve_btfids warning when bpf_cpumask symbol cannot be
         resolved, from Hari Bathini.
      
      bpf-for-netdev
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready()
        selftests/bpf: Add negtive test cases for task iter
        bpf: Fix an issue due to uninitialized bpf_iter_task
        selftests/bpf: Test racing between bpf_timer_cancel_and_free and bpf_timer_cancel
        bpf: Fix racing between bpf_timer_cancel_and_free and bpf_timer_cancel
        selftest/bpf: Test the read of vsyscall page under x86-64
        x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
        x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h
        bpf, scripts: Correct GPL license name
        xsk: Add truesize to skb_add_rx_frag().
        bpf: Fix warning for bpf_cpumask in verifier
      ====================
      
      Link: https://lore.kernel.org/r/20240221231826.1404-1-daniel@iogearbox.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY · 3489182b
      Siddharth Vadapalli authored
      Commit bb726b75 ("net: phy: realtek: add support for
      RTL8211F(D)(I)-VD-CG") extended support of the driver from the existing
      support for RTL8211F(D)(I)-CG PHY to the newer RTL8211F(D)(I)-VD-CG PHY.
      
      While that commit indicated that the RTL8211F_PHYCR2 register is not
      supported by the "VD-CG" PHY model and therefore updated the corresponding
      section in rtl8211f_config_init() to be invoked conditionally, the call to
      "genphy_soft_reset()" was left as-is, when it should have also been invoked
      conditionally. This is because the call to "genphy_soft_reset()" was first
      introduced by the commit 0a4355c2 ("net: phy: realtek: add dt property
      to disable CLKOUT clock") since the RTL8211F guide indicates that a PHY
      reset should be issued after setting bits in the PHYCR2 register.
      
      As the PHYCR2 register is not applicable to the "VD-CG" PHY model, fix the
      rtl8211f_config_init() function by invoking "genphy_soft_reset()"
      conditionally based on the presence of the "PHYCR2" register.
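
      The resulting control flow can be modelled in a few lines. This is a
      sketch with hypothetical demo_* names and counters in place of real
      register accesses, not the driver function itself.

```c
/* Hypothetical PHY model: counters replace real register accesses. */
struct demo_phy {
	int has_phycr2;    /* 0 for a "VD-CG"-like model */
	int phycr2_writes; /* how often the register was updated */
	int soft_resets;   /* how often a soft reset was issued */
};

void demo_config_init(struct demo_phy *phy)
{
	if (!phy->has_phycr2)
		return; /* no register to update, so no reset either */

	phy->phycr2_writes++; /* stands in for setting PHYCR2 bits */
	phy->soft_resets++;   /* the reset only follows a PHYCR2 update */
}
```

      A model without the register goes through init with zero resets, while
      a model with the register still gets exactly one reset after the
      write.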
      
      Fixes: bb726b75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG")
      Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240220070007.968762-1-s-vadapalli@ti.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'ioam6-fix-write-to-cloned-skb-s' · 39a4cd5a
      Paolo Abeni authored
      Justin Iurman says:
      
      ====================
      ioam6: fix write to cloned skb's
      
      Make sure the IOAM data insertion is not applied on cloned skb's. As a
      consequence, ioam selftests needed a refactoring.
      ====================
      
      Link: https://lore.kernel.org/r/20240219135255.15429-1-justin.iurman@uliege.be
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • selftests: ioam: refactoring to align with the fix · 187bbb69
      Justin Iurman authored
      ioam6_parser uses a packet socket. After the fix to prevent writing to
      cloned skb's, the receiver does not see its IOAM data anymore, which
      makes the input/forward ioam selftests fail. As a workaround,
      ioam6_parser now uses an IPv6 raw socket and leverages ancillary data to
      get hop-by-hop options. As a consequence, the hook is "after" the IOAM
      data insertion by the receiver and all tests are working again.
      Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Fix write to cloned skb in ipv6_hop_ioam() · f198d933
      Justin Iurman authored
      ioam6_fill_trace_data() writes inside the skb payload without ensuring
      it's writeable (e.g., not cloned). This function is called both from the
      input and output path. The output path (ioam6_iptunnel) already does the
      check. This commit provides a fix for the input path, inside
      ipv6_hop_ioam(). It also updates ip6_parse_tlv() to refresh the network
      header pointer ("nh") when returning from ipv6_hop_ioam().
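
      The general pattern (make the data private before writing, then
      re-read any pointer into it) can be sketched in user-space C. The
      demo_* structures and the "users" count are hypothetical and only
      mimic shared packet data.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared payload with a simple user count. */
struct demo_data {
	int users;
	unsigned char bytes[32];
};

struct demo_skb {
	struct demo_data *data;
};

/* On success, s->data is private to s (users == 1). */
int demo_ensure_writable(struct demo_skb *s)
{
	struct demo_data *copy;

	if (s->data->users == 1)
		return 0;
	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	memcpy(copy, s->data, sizeof(*copy));
	copy->users = 1;
	s->data->users--;
	s->data = copy; /* the payload now lives at a new address */
	return 0;
}

int demo_fill_trace(struct demo_skb *s, size_t off, unsigned char value)
{
	unsigned char *nh;

	if (off >= sizeof(s->data->bytes))
		return -1;
	if (demo_ensure_writable(s))
		return -1;
	nh = s->data->bytes; /* refresh: a pointer taken earlier may be stale */
	nh[off] = value;
	return 0;
}
```

      Writing through one of two handles that share a payload leaves the
      other handle's bytes untouched, and the write goes through the
      refreshed pointer.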
      
      Fixes: 9ee11f0f ("ipv6: ioam: Data plane support for Pre-allocated Trace")
      Reported-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • phonet/pep: fix racy skb_queue_empty() use · 7d2a894d
      Rémi Denis-Courmont authored
      The receive queues are protected by their respective spin-lock, not
      the socket lock. This could lead to skb_peek() unexpectedly
      returning NULL or a pointer to an already dequeued socket buffer.
      
      Fixes: 9641458d ("Phonet: Pipe End Point for Phonet Pipes protocol")
      Signed-off-by: Rémi Denis-Courmont <courmisch@gmail.com>
      Link: https://lore.kernel.org/r/20240218081214.4806-2-remi@remlab.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>