1. 29 Jul, 2022 26 commits
  2. 28 Jul, 2022 14 commits
    • Merge tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 33ea1340
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth and netfilter, no known blockers for
        the release.
      
        Current release - regressions:
      
         - wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop(), fix
           taking the lock before it's initialized
      
         - Bluetooth: mgmt: fix double free on error path
      
        Current release - new code bugs:
      
         - eth: ice: fix tunnel checksum offload with fragmented traffic
      
        Previous releases - regressions:
      
         - tcp: md5: fix IPv4-mapped support after refactoring, don't take the
           pure v6 path
      
         - Revert "tcp: change pingpong threshold to 3", improving detection
           of interactive sessions
      
         - mld: fix netdev refcount leak in mld_{query | report}_work() due to
           a race
      
         - Bluetooth:
            - always set event mask on suspend, avoid early wake ups
            - L2CAP: fix use-after-free caused by l2cap_chan_put
      
         - bridge: do not send empty IFLA_AF_SPEC attribute
      
        Previous releases - always broken:
      
         - ping6: fix memleak in ipv6_renew_options()
      
         - sctp: prevent null-deref caused by over-eager error paths
      
         - virtio-net: fix the race between refill work and close, resulting
           in NAPI scheduled after close and a BUG()
      
         - macsec:
            - fix three netlink parsing bugs
            - avoid breaking the device state on invalid change requests
            - fix a memleak in another error path
      
        Misc:
      
         - dt-bindings: net: ethernet-controller: rework 'fixed-link' schema
      
         - two more batches of sysctl data race adornment"
      
      * tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
        stmmac: dwmac-mediatek: fix resource leak in probe
        ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
        net: ping6: Fix memleak in ipv6_renew_options().
        net/funeth: Fix fun_xdp_tx() and XDP packet reclaim
        sctp: leave the err path free in sctp_stream_init to sctp_stream_free
        sfc: disable softirqs for ptp TX
        ptp: ocp: Select CRC16 in the Kconfig.
        tcp: md5: fix IPv4-mapped support
        virtio-net: fix the race between refill work and close
        mptcp: Do not return EINPROGRESS when subflow creation succeeds
        Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
        Bluetooth: Always set event mask on suspend
        Bluetooth: mgmt: Fix double free on error path
        wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
        ice: do not setup vlan for loopback VSI
        ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
        ice: Fix VSIs unable to share unicast MAC
        ice: Fix tunnel checksum offload with fragmented traffic
        ice: Fix max VLANs available for VF
        netfilter: nft_queue: only allow supported familes and hooks
        ...
    • stmmac: dwmac-mediatek: fix resource leak in probe · 4d3d3a1b
      Dan Carpenter authored
      If mediatek_dwmac_clks_config() fails, then call stmmac_remove_config_dt()
      before returning.  Otherwise it is a resource leak.
      
      Fixes: fa4b3ca6 ("stmmac: dwmac-mediatek: fix clock issue")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/YuJ4aZyMUlG6yGGa@kili
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
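      The error-path pattern behind this fix can be sketched in userspace C.
      This is a minimal sketch, not the driver code: config_dt_probe(),
      config_dt_remove(), clks_config() and probe() are hypothetical
      stand-ins for the probe-time helpers named above, and a counter
      stands in for the resources acquired during probe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; a counter models resources acquired in probe. */
static int live_resources;

static int config_dt_probe(void)   { live_resources++; return 0; }
static void config_dt_remove(void) { live_resources--; }

/* Models a clock setup step that may fail with a negative error code. */
static int clks_config(bool fail)  { return fail ? -5 : 0; }

/* Probe that releases what it acquired before returning an error. */
int probe(bool clk_fail)
{
    int ret = config_dt_probe();
    if (ret)
        return ret;

    ret = clks_config(clk_fail);
    if (ret) {
        config_dt_remove();   /* without this call the resource leaks */
        return ret;
    }
    return 0;
}

/* Runs a failing probe and reports how many resources are still held. */
int leaked_after_failed_probe(void)
{
    live_resources = 0;
    int ret = probe(true);
    assert(ret < 0);          /* the probe must report the failure */
    return live_resources;    /* 0 when the error path cleaned up */
}
```

      Removing the config_dt_remove() call from the error branch makes
      leaked_after_failed_probe() return 1, which is the leak in miniature.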
    • ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr · 85f0173d
      Ziyang Xuan authored
      Changing a net device's MTU to a value smaller than IPV6_MIN_MTU, or
      unregistering the device, while a route is being matched may trigger
      a null-ptr-deref on ip6_ptr, as follows.
      
      =========================================================
      BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134
      Read of size 4 at addr 0000000000000308 by task ping6/263
      
      CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14
      Call trace:
       dump_backtrace+0x1a8/0x230
       show_stack+0x20/0x70
       dump_stack_lvl+0x68/0x84
       print_report+0xc4/0x120
       kasan_report+0x84/0x120
       __asan_load4+0x94/0xd0
       find_match.part.0+0x70/0x134
       __find_rr_leaf+0x408/0x470
       fib6_table_lookup+0x264/0x540
       ip6_pol_route+0xf4/0x260
       ip6_pol_route_output+0x58/0x70
       fib6_rule_lookup+0x1a8/0x330
       ip6_route_output_flags_noref+0xd8/0x1a0
       ip6_route_output_flags+0x58/0x160
       ip6_dst_lookup_tail+0x5b4/0x85c
       ip6_dst_lookup_flow+0x98/0x120
       rawv6_sendmsg+0x49c/0xc70
       inet_sendmsg+0x68/0x94
      
      Reproducer:
      First, prepare the environment:
      $ ip netns add ns1
      $ ip netns add ns2
      $ ip link add veth1 type veth peer name veth2
      $ ip link set veth1 netns ns1
      $ ip link set veth2 netns ns2
      $ ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1
      $ ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2
      $ ip netns exec ns1 ifconfig veth1 up
      $ ip netns exec ns2 ifconfig veth2 up
      $ ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1
      $ ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1
      
      Second, run the following two command sequences in two separate ssh
      sessions:
      $ ip netns exec ns1 sh
      $ while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done

      $ ip netns exec ns1 sh
      $ while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done
      
      This happens because addrconf_ifdown() first sets ip6_ptr to NULL, and
      ip6_ignore_linkdown() then accesses ip6_ptr directly without a NULL check.
      
      	cpu0			cpu1
      fib6_table_lookup
      __find_rr_leaf
      			addrconf_notify [ NETDEV_CHANGEMTU ]
      			addrconf_ifdown
      			RCU_INIT_POINTER(dev->ip6_ptr, NULL)
      find_match
      ip6_ignore_linkdown
      
      Fix the null-ptr-deref by adding a NULL check for ip6_ptr in
      ip6_ignore_linkdown() before it is used.
      
      Fixes: dcd1f572 ("net/ipv6: Remove fib6_idev")
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220728013307.656257-1-william.xuanziyang@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
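      The shape of the NULL check described in this commit can be sketched
      as follows. The struct and function names are simplified, hypothetical
      stand-ins for the kernel types, and returning true when the pointer
      is NULL is an assumption of this sketch. It is single-threaded and
      does not model the RCU accessors the kernel uses to read ip6_ptr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct inet6_dev_sketch  { int ignore_routes_with_linkdown; };
struct net_device_sketch { struct inet6_dev_sketch *ip6_ptr; };

/*
 * Read the pointer once into a local, then test it before dereferencing.
 * Treating a missing IPv6 device as "ignore" is an assumption here.
 */
bool ignore_linkdown(const struct net_device_sketch *dev)
{
    assert(dev != NULL);

    const struct inet6_dev_sketch *idev = dev->ip6_ptr;

    if (!idev)
        return true;
    return idev->ignore_routes_with_linkdown != 0;
}
```

      Reading ip6_ptr into a local before the test matters: checking the
      field and then re-reading it for the dereference would reopen the
      window shown in the cpu0/cpu1 diagram.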
    • net: ping6: Fix memleak in ipv6_renew_options(). · e2732600
      Kuniyuki Iwashima authored
      When we close ping6 sockets, some resources are left unfreed because
      pingv6_prot is missing sk->sk_prot->destroy().  As reported by
      syzbot [0], just three syscalls leak 96 bytes and easily cause OOM.
      
          struct ipv6_sr_hdr *hdr;
          char data[24] = {0};
          int fd;
      
          hdr = (struct ipv6_sr_hdr *)data;
          hdr->hdrlen = 2;
          hdr->type = IPV6_SRCRT_TYPE_4;
      
          fd = socket(AF_INET6, SOCK_DGRAM, NEXTHDR_ICMP);
          setsockopt(fd, IPPROTO_IPV6, IPV6_RTHDR, data, 24);
          close(fd);
      
      To fix memory leaks, let's add a destroy function.
      
      Note the socket() syscall checks if the GID is within the range of
      net.ipv4.ping_group_range.  The default value is [1, 0] so that no
      GID meets the condition (1 <= GID <= 0).  Thus, the local DoS does
      not succeed until we change the default value.  However, at least
      Ubuntu/Fedora/RHEL loosen it.
      
          $ cat /usr/lib/sysctl.d/50-default.conf
          ...
          -net.ipv4.ping_group_range = 0 2147483647
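
      The range condition above can be checked with a small sketch. The
      gid_range struct and both helpers are hypothetical names used only
      for illustration; they are not the kernel's implementation of the
      sysctl check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdbool.h>

/* Inclusive GID range, modeled after the two values of the sysctl. */
struct gid_range { unsigned int low, high; };

/* True when gid falls within [low, high]; an inverted range is empty. */
bool gid_in_range(const struct gid_range *r, unsigned int gid)
{
    assert(r != NULL);
    return r->low <= gid && gid <= r->high;
}

/* Count how many of the GIDs 0..n-1 the range admits. */
unsigned int count_allowed(const struct gid_range *r, unsigned int n)
{
    unsigned int allowed = 0;

    for (unsigned int gid = 0; gid < n; gid++)
        if (gid_in_range(r, gid))
            allowed++;
    return allowed;
}
```

      With the default { 1, 0 } no GID passes the check, while the
      distribution setting { 0, 2147483647 } admits every GID tested.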
      
      Also, other leak paths could be reported with the options below, and
      some of them require CAP_NET_RAW.
      
        setsockopt
            IPV6_ADDRFORM (inet6_sk(sk)->pktoptions)
            IPV6_RECVPATHMTU (inet6_sk(sk)->rxpmtu)
            IPV6_HOPOPTS (inet6_sk(sk)->opt)
            IPV6_RTHDRDSTOPTS (inet6_sk(sk)->opt)
            IPV6_RTHDR (inet6_sk(sk)->opt)
            IPV6_DSTOPTS (inet6_sk(sk)->opt)
            IPV6_2292PKTOPTIONS (inet6_sk(sk)->opt)
      
        getsockopt
            IPV6_FLOWLABEL_MGR (inet6_sk(sk)->ipv6_fl_list)
      
      For the record, here is a splat that differs from syzbot's.
      
        unreferenced object 0xffff888006270c60 (size 96):
          comm "repro2", pid 231, jiffies 4294696626 (age 13.118s)
          hex dump (first 32 bytes):
            01 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00  ....D...........
            00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          backtrace:
            [<00000000f6bc7ea9>] sock_kmalloc (net/core/sock.c:2564 net/core/sock.c:2554)
            [<000000006d699550>] do_ipv6_setsockopt.constprop.0 (net/ipv6/ipv6_sockglue.c:715)
            [<00000000c3c3b1f5>] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:1024)
            [<000000007096a025>] __sys_setsockopt (net/socket.c:2254)
            [<000000003a8ff47b>] __x64_sys_setsockopt (net/socket.c:2265 net/socket.c:2262 net/socket.c:2262)
            [<000000007c409dcb>] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
            [<00000000e939c4a9>] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
      
      [0]: https://syzkaller.appspot.com/bug?extid=a8430774139ec3ab7176
      
      Fixes: 6d0bfe22 ("net: ipv6: Add IPv6 support to the ping socket.")
      Reported-by: syzbot+a8430774139ec3ab7176@syzkaller.appspotmail.com
      Reported-by: Ayushman Dutta <ayudutta@amazon.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220728012220.46918-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • watch_queue: Fix missing locking in add_watch_to_object() · e64ab2db
      Linus Torvalds authored
      If a watch is being added to a queue, it needs to guard against
      interference from addition of a new watch, manual removal of a watch and
      removal of a watch due to some other queue being destroyed.
      
      KEYCTL_WATCH_KEY guards against this for the same {key,queue} pair by
      holding the key->sem writelocked and by holding refs on both the key and
      the queue - but that doesn't prevent interaction from other {key,queue}
      pairs.
      
      While add_watch_to_object() does take the spinlock on the event queue,
      it doesn't take the lock on the source's watch list.  The assumption was
      that the caller would prevent that (say by taking key->sem) - but that
      doesn't prevent interference from the destruction of another queue.
      
      Fix this by locking the watcher list in add_watch_to_object().
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: syzbot+03d7b43290037d1f87ca@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: keyrings@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Fix missing rcu annotation · e0339f03
      David Howells authored
      Since __post_watch_notification() walks wlist->watchers with only the
      RCU read lock held, we need to use RCU methods to add to the list (we
      already use RCU methods to remove from the list).
      
      Fix add_watch_to_object() to use hlist_add_head_rcu() instead of
      hlist_add_head() for that list.
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
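      Why the add side needs a publishing primitive when readers walk the
      list without the writer's lock can be sketched in userspace with C11
      atomics. This is an analogy under stated assumptions, not the kernel
      API: list_add_head_publish() and list_sum() are hypothetical names,
      writers are assumed to be serialized by a lock that is not shown, and
      grace periods and node reclamation are not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int value;
    struct node *_Atomic next;
};

struct list {
    struct node *_Atomic first;
};

/*
 * Writer side: finish initializing the node, then publish it with a
 * release store so a concurrent reader never sees a half-built node.
 */
void list_add_head_publish(struct list *l, struct node *n)
{
    assert(l != NULL && n != NULL);

    struct node *old = atomic_load_explicit(&l->first, memory_order_relaxed);

    atomic_store_explicit(&n->next, old, memory_order_relaxed);
    atomic_store_explicit(&l->first, n, memory_order_release);
}

/* Reader side: acquire loads pair with the writer's release store. */
int list_sum(struct list *l)
{
    assert(l != NULL);

    int sum = 0;

    for (struct node *n = atomic_load_explicit(&l->first, memory_order_acquire);
         n != NULL;
         n = atomic_load_explicit(&n->next, memory_order_acquire))
        sum += n->value;
    return sum;
}
```

      A plain store to the list head gives no ordering guarantee between
      filling in the node and making it reachable, which is the gap the
      commit closes by switching to the RCU variant of the add helper.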
    • net: cdns,macb: use correct xlnx prefix for Xilinx · 623cd870
      Krzysztof Kozlowski authored
      Use correct vendor for Xilinx versions of Cadence MACB/GEM Ethernet
      controller.  The Versal compatible was not released, so it can be
      changed.  Zynq-7xxx and Ultrascale+ have to be kept in new and deprecated
      form.

      Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Acked-by: Harini Katakam <harini.katakam@amd.com>
      Link: https://lore.kernel.org/r/20220726070802.26579-2-krzysztof.kozlowski@linaro.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • dt-bindings: net: cdns,macb: use correct xlnx prefix for Xilinx · afa950b8
      Krzysztof Kozlowski authored
      Use correct vendor for Xilinx versions of Cadence MACB/GEM Ethernet
      controller.  The Versal compatible was not released, so it can be
      changed.  Zynq-7xxx and Ultrascale+ have to be kept in new and deprecated
      form.

      Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Link: https://lore.kernel.org/r/20220726070802.26579-1-krzysztof.kozlowski@linaro.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/funeth: Fix fun_xdp_tx() and XDP packet reclaim · 51a83391
      Dimitris Michailidis authored
      The current implementation of fun_xdp_tx(), used for XDP_TX, is
      incorrect in that it takes an address/length pair and later releases it
      with page_frag_free(). That is OK for XDP_TX, but the same code is used
      by ndo_xdp_xmit. In that case it loses the XDP memory type and releases
      the packet incorrectly for some of the types. Assorted breakage follows.
      
      Change fun_xdp_tx() to take xdp_frame and rely on xdp_return_frame() in
      reclaim.
      
      Fixes: db37bc17 ("net/funeth: add the data path")
      Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
      Link: https://lore.kernel.org/r/20220726215923.7887-1-dmichail@fungible.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 7d85e9cb
      Paolo Abeni authored
      Tony Nguyen says:
      
      ====================
      ice: PPPoE offload support
      
      Marcin Szycik says:
      
      Add support for dissecting PPPoE and PPP-specific fields in flow dissector:
      PPPoE session id and PPP protocol type. Add support for those fields in
      tc-flower and support offloading PPPoE. Finally, add support for hardware
      offload of PPPoE packets in switchdev mode in ice driver.
      
      Example filter:
      tc filter add dev $PF1 ingress protocol ppp_ses prio 1 flower pppoe_sid \
          1234 ppp_proto ip skip_sw action mirred egress redirect dev $VF1_PR
      
      Changes in iproute2 are required to use the new fields (will be submitted
      soon).
      
      ICE COMMS DDP package is required to create a filter in ice.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
        ice: Add support for PPPoE hardware offload
        flow_offload: Introduce flow_match_pppoe
        net/sched: flower: Add PPPoE filter
        flow_dissector: Add PPPoE dissectors
      ====================
      
      Link: https://lore.kernel.org/r/20220726203133.2171332-1-anthony.l.nguyen@intel.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • add missing includes and forward declarations to networking includes under linux/ · 5f10376b
      Jakub Kicinski authored
      Similarly to a recent include/net/ cleanup, this patch adds
      missing includes to networking headers under include/linux.
      All these problems are currently masked by the existing users
      including the missing dependency before the broken header.
      
      Link: https://lore.kernel.org/all/20220723045755.2676857-1-kuba@kernel.org/ v1
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Link: https://lore.kernel.org/r/20220726215652.158167-1-kuba@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Revert "Merge branch 'octeontx2-minor-tc-fixes'" · 4158e389
      Paolo Abeni authored
      This reverts commit 35d099da, reversing
      changes made to 58d8bcd4.
      
      I wrongly applied that to the net-next tree instead of the intended
      target tree (net). Reverting it on net-next.

      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dsa: mv88e6xxx: fix speed setting for CPU/DSA ports · cc1049cc
      Marcin Wojtas authored
      Commit 3c783b83 ("net: dsa: mv88e6xxx: get rid of SPEED_MAX setting")
      stopped relying on the SPEED_MAX constant and hardcoded speed settings
      for the switch ports, relying on phylink configuration instead.

      It turned out, however, that when the relevant code is called,
      the mac_capabilities of the CPU/DSA port remain unset.
      mv88e6xxx_setup_port() is called via mv88e6xxx_setup() in
      dsa_tree_setup_switches(), which precedes setting the caps in
      phylink_get_caps down in the chain of dsa_tree_setup_ports().

      As a result the mac_capabilities are 0 and the default speed for the
      CPU/DSA port is 10M at the start. To fix that, execute
      mv88e6xxx_get_caps() and obtain the capabilities directly.
      
      Fixes: 3c783b83 ("net: dsa: mv88e6xxx: get rid of SPEED_MAX setting")
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220726230918.2772378-1-mw@semihalf.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
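      The effect described in this commit, an empty capability mask falling
      back to the lowest speed, can be illustrated with a small sketch. The
      CAP_* bits and pick_speed() are hypothetical names for illustration
      only; they are not the mv88e6xxx or phylink API.

```c
#include <assert.h>

/* Hypothetical capability bits, one per supported link speed. */
enum {
    CAP_10    = 1u << 0,
    CAP_100   = 1u << 1,
    CAP_1000  = 1u << 2,
    CAP_10000 = 1u << 3,
    CAP_ALL   = CAP_10 | CAP_100 | CAP_1000 | CAP_10000,
};

/*
 * Pick the highest speed (in Mb/s) present in the mask. An empty mask
 * falls through every test and ends at the 10 Mb/s default.
 */
int pick_speed(unsigned int caps)
{
    assert((caps & ~(unsigned int)CAP_ALL) == 0);   /* only known bits */

    if (caps & CAP_10000)
        return 10000;
    if (caps & CAP_1000)
        return 1000;
    if (caps & CAP_100)
        return 100;
    return 10;
}
```

      Calling pick_speed() before the mask has been filled in yields 10,
      which mirrors the 10M start-up speed the commit describes; querying
      the capabilities first avoids that.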
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · bf84719d
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-07-26
      
      This series contains updates to ice driver only.
      
      Przemyslaw corrects accounting for VF VLANs to allow for the correct
      number of VLANs for an untrusted VF. He also corrects an issue with
      checksum offload on VXLAN tunnels.

      Ani allows for two VSIs to share the same MAC address.

      Maciej corrects checked bits for descriptor completion of loopback.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: do not setup vlan for loopback VSI
        ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
        ice: Fix VSIs unable to share unicast MAC
        ice: Fix tunnel checksum offload with fragmented traffic
        ice: Fix max VLANs available for VF
      ====================
      
      Link: https://lore.kernel.org/r/20220726204646.2171589-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>