1. 19 Aug, 2022 3 commits
    • Merge tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4c2d0b03
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter.
      
        Current release - regressions:
      
         - tcp: fix cleanup and leaks in tcp_read_skb() (the new way BPF
           socket maps get data out of the TCP stack)
      
         - tls: rx: react to strparser initialization errors
      
         - netfilter: nf_tables: fix scheduling-while-atomic splat
      
         - net: fix suspicious RCU usage in bpf_sk_reuseport_detach()
      
        Current release - new code bugs:
      
         - mlxsw: ptp: fix a couple of races, static checker warnings and
           error handling
      
        Previous releases - regressions:
      
         - netfilter:
            - nf_tables: fix possible module reference underflow in error path
            - make conntrack helpers deal with BIG TCP (skbs > 64kB)
            - nfnetlink: re-enable conntrack expectation events
      
         - net: fix potential refcount leak in ndisc_router_discovery()
      
        Previous releases - always broken:
      
         - sched: cls_route: disallow handle of 0
      
         - neigh: fix possible local DoS due to net iface start/stop loop
      
         - rtnetlink: fix module refcount leak in rtnetlink_rcv_msg
      
         - sched: fix adding qlen to qcpu->backlog in gnet_stats_add_queue_cpu
      
         - virtio_net: fix endian-ness for RSS
      
         - dsa: mv88e6060: prevent crash on an unused port
      
         - fec: fix timer capture timing in `fec_ptp_enable_pps()`
      
         - ocelot: stats: fix races, integer wrapping and reading incorrect
           registers (the change of register definitions here accounts for
           bulk of the changed LoC in this PR)"
      
      * tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
        net: moxa: MAC address reading, generating, validity checking
        tcp: handle pure FIN case correctly
        tcp: refactor tcp_read_skb() a bit
        tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()
        tcp: fix sock skb accounting in tcp_read_skb()
        igb: Add lock to avoid data race
        dt-bindings: Fix incorrect "the the" corrections
        net: genl: fix error path memory leak in policy dumping
        stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()
        net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run
        net/mlx5e: Allocate flow steering storage during uplink initialization
        net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats
        net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset
        net: mscc: ocelot: make struct ocelot_stat_layout array indexable
        net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work
        net: mscc: ocelot: turn stats_lock into a spinlock
        net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter
        net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters
        net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters
        net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it
        ...
    • Merge tag 'linux-kselftest-next-6.0-rc2' of... · 90b6b686
      Linus Torvalds authored
      Merge tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull Kselftest fix from Shuah Khan:
      
       - fix landlock test build regression
      
      * tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/landlock: fix broken include of linux/landlock.h
    • Merge tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 0de277d4
      Linus Torvalds authored
      Pull rtla tool fixes from Steven Rostedt:
       "Fixes for the Real-Time Linux Analysis tooling:
      
         - Fix tracer name in comments and prints
      
         - Fix setting up symlinks
      
         - Allow extra flags to be set in build
      
         - Consolidate and show all necessary libraries not found in build
           error"
      
      * tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        rtla: Consolidate and show all necessary libraries that failed for building
        tools/rtla: Build with EXTRA_{C,LD}FLAGS
        tools/rtla: Fix command symlinks
        rtla: Fix tracer name
  2. 18 Aug, 2022 26 commits
  3. 17 Aug, 2022 11 commits
    • net: Fix suspicious RCU usage in bpf_sk_reuseport_detach() · fc4aaf9f
      David Howells authored
      bpf_sk_reuseport_detach() calls __rcu_dereference_sk_user_data_with_flags()
      to obtain the value of sk->sk_user_data, but that function is only usable
      if the RCU read lock is held, and neither that function nor any of its
      callers hold it.
      
      Fix this by adding a new helper, __locked_read_sk_user_data_with_flags(),
      that checks whether sk->sk_callback_lock is held, and use it here
      instead.
      
      Alternatively, making __rcu_dereference_sk_user_data_with_flags() use
      rcu_dereference_checked() might suffice.
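
The idea of a lock-checked read can be modelled in userspace. This is only a sketch under stated assumptions, not the kernel helper: `struct model_sock`, the `USER_DATA_*` constants and `locked_read_user_data_with_flags()` are hypothetical names, the flag bits are assumed to live in the low bits of the pointer word, and a plain boolean stands in for the lockdep "is this lock held" query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: two flag bits stored in the low bits of a pointer. */
#define USER_DATA_FLAG_A  ((uintptr_t)1)
#define USER_DATA_FLAG_B  ((uintptr_t)2)
#define USER_DATA_PTRMASK (~(uintptr_t)3)

struct model_sock {
	bool callback_lock_held; /* stand-in for a lockdep "lock is held" check */
	uintptr_t user_data;     /* pointer value with flag bits in the low bits */
};

/*
 * Return the stored pointer if all requested flags are set, else NULL.
 * The caller must hold the callback lock; this model aborts if it does
 * not, where a kernel build with lock debugging would print a warning.
 */
static void *locked_read_user_data_with_flags(const struct model_sock *sk,
					      uintptr_t flags)
{
	assert(sk->callback_lock_held);
	assert((flags & USER_DATA_PTRMASK) == 0); /* flags must not overlap the pointer */

	if ((sk->user_data & flags) == flags)
		return (void *)(sk->user_data & USER_DATA_PTRMASK);
	return NULL;
}
```

The point of the design is that the protection requirement (lock held, rather than RCU read-side critical section) is stated and checked at the read site itself.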
      
      Without this, the following warning can be occasionally observed:
      
      =============================
      WARNING: suspicious RCU usage
      6.0.0-rc1-build2+ #563 Not tainted
      -----------------------------
      include/net/sock.h:592 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      5 locks held by locktest/29873:
       #0: ffff88812734b550 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: __sock_release+0x77/0x121
       #1: ffff88812f5621b0 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_close+0x1c/0x70
       #2: ffff88810312f5c8 (&h->lhash2[i].lock){+.+.}-{2:2}, at: inet_unhash+0x76/0x1c0
       #3: ffffffff83768bb8 (reuseport_lock){+...}-{2:2}, at: reuseport_detach_sock+0x18/0xdd
       #4: ffff88812f562438 (clock-AF_INET){++..}-{2:2}, at: bpf_sk_reuseport_detach+0x24/0xa4
      
      stack backtrace:
      CPU: 1 PID: 29873 Comm: locktest Not tainted 6.0.0-rc1-build2+ #563
      Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x4c/0x5f
       bpf_sk_reuseport_detach+0x6d/0xa4
       reuseport_detach_sock+0x75/0xdd
       inet_unhash+0xa5/0x1c0
       tcp_set_state+0x169/0x20f
       ? lockdep_sock_is_held+0x3a/0x3a
       ? __lock_release.isra.0+0x13e/0x220
       ? reacquire_held_locks+0x1bb/0x1bb
       ? hlock_class+0x31/0x96
       ? mark_lock+0x9e/0x1af
       __tcp_close+0x50/0x4b6
       tcp_close+0x28/0x70
       inet_release+0x8e/0xa7
       __sock_release+0x95/0x121
       sock_close+0x14/0x17
       __fput+0x20f/0x36a
       task_work_run+0xa3/0xcc
       exit_to_user_mode_prepare+0x9c/0x14d
       syscall_exit_to_user_mode+0x18/0x44
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: cf8c1e96 ("net: refactor bpf_sk_reuseport_detach()")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Hawkins Jiawei <yin31149@gmail.com>
      Link: https://lore.kernel.org/r/166064248071.3502205.10036394558814861778.stgit@warthog.procyon.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'ntfs3_for_6.0' of https://github.com/Paragon-Software-Group/linux-ntfs3 · 3b06a275
      Linus Torvalds authored
      Pull ntfs3 updates from Konstantin Komarov:
      
       - implement FALLOC_FL_INSERT_RANGE
      
       - fix some logic errors
      
       - fixed xfstests (tested on x86_64): generic/064 generic/213
         generic/300 generic/361 generic/449 generic/485
      
       - some dead code removed or refactored
      
      * tag 'ntfs3_for_6.0' of https://github.com/Paragon-Software-Group/linux-ntfs3: (39 commits)
        fs/ntfs3: uninitialized variable in ntfs_set_acl_ex()
        fs/ntfs3: Remove unused function wnd_bits
        fs/ntfs3: Make ni_ins_new_attr return error
        fs/ntfs3: Create MFT zone only if length is large enough
        fs/ntfs3: Refactoring attr_insert_range to restore after errors
        fs/ntfs3: Refactoring attr_punch_hole to restore after errors
        fs/ntfs3: Refactoring attr_set_size to restore after errors
        fs/ntfs3: New function ntfs_bad_inode
        fs/ntfs3: Make MFT zone less fragmented
        fs/ntfs3: Check possible errors in run_pack in advance
        fs/ntfs3: Added comments to frecord functions
        fs/ntfs3: Fill duplicate info in ni_add_name
        fs/ntfs3: Make static function attr_load_runs
        fs/ntfs3: Add new argument is_mft to ntfs_mark_rec_free
        fs/ntfs3: Remove unused mi_mark_free
        fs/ntfs3: Fix very fragmented case in attr_punch_hole
        fs/ntfs3: Fix work with fragmented xattr
        fs/ntfs3: Make ntfs_fallocate return -ENOSPC instead of -EFBIG
        fs/ntfs3: extend ni_insert_nonresident to return inserted ATTR_LIST_ENTRY
        fs/ntfs3: Check reserved size for maximum allowed
        ...
    • dcache: move the DCACHE_OP_COMPARE case out of the __d_lookup_rcu loop · ae2a8236
      Linus Torvalds authored
      __d_lookup_rcu() is one of the hottest functions in the kernel on
      certain loads, and it is complicated by filesystems that might want to
      have their own name compare function.
      
      We can improve code generation by moving the test of DCACHE_OP_COMPARE
      outside the loop, which makes the loop itself much simpler, at the cost
      of some code duplication.  But both cases end up being simpler, and the
      "native" direct case-sensitive compare particularly so.
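
The transformation described above (testing a loop-invariant condition once, outside the loop, and duplicating the loop) can be sketched in plain C. This is a userspace illustration only, not the dcache code: `struct entry`, `lookup()`, `lookup_custom()` and `compare_ignore_case()` are hypothetical names, and a singly linked list stands in for a hash chain.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct entry {
	struct entry *next;
	const char *name;
	size_t len;
};

/* Custom comparison hook: returns 0 on match, non-zero otherwise. */
typedef int (*compare_fn)(const struct entry *e, const char *name, size_t len);

/* Example hook: case-insensitive name comparison. */
static int compare_ignore_case(const struct entry *e, const char *name, size_t len)
{
	if (e->len != len)
		return 1;
	for (size_t i = 0; i < len; i++)
		if (tolower((unsigned char)e->name[i]) != tolower((unsigned char)name[i]))
			return 1;
	return 0;
}

/* Slow path: every candidate goes through the custom hook. */
static struct entry *lookup_custom(struct entry *head, const char *name,
				   size_t len, compare_fn cmp)
{
	for (struct entry *e = head; e; e = e->next)
		if (cmp(e, name, len) == 0)
			return e;
	return NULL;
}

static struct entry *lookup(struct entry *head, const char *name,
			    size_t len, compare_fn cmp)
{
	assert(name != NULL);

	/* The "is there a custom compare?" test happens once, not per entry. */
	if (cmp)
		return lookup_custom(head, name, len, cmp);

	/* Fast path: direct case-sensitive compare, no branch on cmp inside. */
	for (struct entry *e = head; e; e = e->next)
		if (e->len == len && memcmp(e->name, name, len) == 0)
			return e;
	return NULL;
}
```

The loop body is duplicated, but each copy has fewer branches than a single loop that re-tests the hook for every entry.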
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • net: dsa: microchip: ksz9477: fix fdb_dump last invalid entry · 36c0d935
      Arun Ramadoss authored
      ksz9477_fdb_dump() polls the ALU control register and exits the
      timeout loop when either a valid entry is found or the search is
      complete. After the loop it reads the ALU entry and reports it to
      user space regardless of whether the entry is valid. This works while
      valid entries remain, but when the loop exits because the search is
      complete, the ALU table read returns all ones and that value is
      reported to user space, so 'bridge fdb show' lists ff:ff:ff:ff:ff:ff
      as the last entry for every port. Fix this by reporting the entry
      after the loop only if it is valid.
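
The control flow of the fix can be sketched with a small userspace model. The names here are assumptions for illustration only (`struct alu_model`, `read_status()`, `fdb_dump()`, and the `ALU_VALID` / `ALU_SEARCH_DONE` bit values are not taken from the driver): a polling loop ends on either "valid entry" or "search done", and an entry is counted as reported only in the first case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits for the model. */
#define ALU_VALID       0x1u
#define ALU_SEARCH_DONE 0x2u

struct alu_model {
	const uint32_t *status; /* values returned by successive status reads */
	size_t n_status;
	size_t pos;
};

static uint32_t read_status(struct alu_model *alu)
{
	if (alu->pos < alu->n_status)
		return alu->status[alu->pos++];
	return ALU_SEARCH_DONE; /* nothing left: the search is over */
}

/* Returns the number of entries reported, or -1 if polling timed out. */
static int fdb_dump(struct alu_model *alu, int max_polls)
{
	int reported = 0;

	assert(max_polls > 0);

	for (;;) {
		uint32_t st;
		int polls = max_polls;

		/* Poll until an entry is valid or the search is complete. */
		do {
			st = read_status(alu);
		} while (!(st & (ALU_VALID | ALU_SEARCH_DONE)) && --polls > 0);

		if (!(st & (ALU_VALID | ALU_SEARCH_DONE)))
			return -1; /* timed out */

		/* Report only valid entries; "search done" alone reports nothing. */
		if (st & ALU_VALID)
			reported++;

		if (st & ALU_SEARCH_DONE)
			break;
	}
	return reported;
}
```

Without the validity check after the polling loop, the final "search done" iteration would also be counted, which corresponds to the spurious all-ones entry described above.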
      
      Fixes: b987e98e ("dsa: add DSA switch driver for Microchip KSZ9477")
      Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220816105516.18350-1-arun.ramadoss@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice: Fix VF not able to send tagged traffic with no VLAN filters · 664d4646
      Sylwester Dziedziuch authored
      VF was not able to send tagged traffic when it didn't
      have any VLAN interfaces and VLAN anti-spoofing was enabled.
      Fix this by allowing VFs with no VLAN filters to send tagged
      traffic. After VF adds a VLAN interface it will be able to
      send tagged traffic matching VLAN filters only.
      
      Testing hints:
      1. Spawn VF
      2. Send tagged packet from a VF
      3. The packet should be sent out and not dropped
      4. Add a VLAN interface on VF
      5. Send tagged packet on that VLAN interface
      6. Packet should be sent out and not dropped
      7. Send tagged packet with id different than VLAN interface
      8. Packet should be dropped
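
The policy that the testing hints exercise can be written down as a small predicate. This is a sketch of the observable behaviour only, with hypothetical names (`vf_may_send_tagged()` and its parameters); it does not show how the driver programs the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether a VF may transmit a frame tagged with VLAN ID 'vid'
 * while VLAN anti-spoofing is enabled.
 *  - No VLAN filters configured: any tagged frame is allowed.
 *  - Otherwise: only frames whose VLAN ID matches a filter are allowed.
 */
static bool vf_may_send_tagged(const uint16_t *vlan_filters, size_t n_filters,
			       uint16_t vid)
{
	assert(n_filters == 0 || vlan_filters != NULL);

	if (n_filters == 0)
		return true;

	for (size_t i = 0; i < n_filters; i++)
		if (vlan_filters[i] == vid)
			return true;

	return false;
}
```

Steps 1-3 of the hints correspond to the empty-filter case, steps 4-6 to a matching filter, and steps 7-8 to a non-matching VLAN ID.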
      
      Fixes: daf4dd16 ("ice: Refactor spoofcheck configuration functions")
      Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Ignore error message when setting same promiscuous mode · 79956b83
      Benjamin Mikailenko authored
      Commit 1273f895 ("ice: Fix broken IFF_ALLMULTI handling")
      introduced new checks when setting/clearing promiscuous mode. But if the
      requested promiscuous mode setting already exists, an -EEXIST error
      message would be printed. This is incorrect because promiscuous mode is
      either on/off and shouldn't print an error when the requested
      configuration is already set.
      
      This can happen when removing a bridge with two bonded interfaces and
      promiscuous mode isn't fully cleared from the VLAN VSI in hardware.
      
      Fix this by ignoring cases where requested promiscuous mode exists.
      
      Fixes: 1273f895 ("ice: Fix broken IFF_ALLMULTI handling")
      Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com>
      Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
      Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Fix clearing of promisc mode with bridge over bond · abddafd4
      Grzegorz Siwik authored
      When at least two interfaces are bonded and a bridge is enabled on the
      bond, an error can occur when the bridge is removed and re-added. The
      error occurs because promiscuous mode was not fully cleared from
      the VLAN VSI in the hardware. With this change, promiscuous mode is
      properly removed when the bridge disconnects from bonding.
      
      [ 1033.676359] bond1: link status definitely down for interface enp95s0f0, disabling it
      [ 1033.676366] bond1: making interface enp175s0f0 the new active one
      [ 1033.676369] device enp95s0f0 left promiscuous mode
      [ 1033.676522] device enp175s0f0 entered promiscuous mode
      [ 1033.676901] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6
      [ 1041.795662] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6
      [ 1041.944826] bond1: link status definitely down for interface enp175s0f0, disabling it
      [ 1041.944874] device enp175s0f0 left promiscuous mode
      [ 1041.944918] bond1: now running without any active interface!
      
      Fixes: c31af68a ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations")
      Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
      Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
      Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
      Tested-by: Igor Raits <igor@gooddata.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Ignore EEXIST when setting promisc mode · 11e551a2
      Grzegorz Siwik authored
      Ignore the EEXIST error when setting promiscuous mode.
      This fix is needed because the driver could try to set promiscuous
      mode while the previous setting had not yet been properly cleared.
      Promiscuous mode can be set only once, so setting it a second
      time is rejected.
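
The pattern of treating "already set" as success can be sketched as follows. This is a userspace model under stated assumptions, not driver code: `struct model_vsi`, `hw_set_promisc()` and `set_promisc()` are hypothetical names, and the first function merely imitates a lower layer that rejects a duplicate request with -EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_vsi {
	bool promisc; /* whether promiscuous mode is currently programmed */
};

/* Imitates a lower layer that refuses to set an already-set mode. */
static int hw_set_promisc(struct model_vsi *vsi)
{
	if (vsi->promisc)
		return -EEXIST;
	vsi->promisc = true;
	return 0;
}

/* Wrapper: the requested state already being in place is not an error. */
static int set_promisc(struct model_vsi *vsi)
{
	int err;

	assert(vsi != NULL);

	err = hw_set_promisc(vsi);
	if (err && err != -EEXIST)
		return err; /* genuine failure: propagate */

	return 0;
}
```

Only -EEXIST is filtered; any other error code is still returned to the caller.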
      
      Fixes: 5eda8afd ("ice: Add support for PF/VF promiscuous mode")
      Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
      Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
      Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
      Tested-by: Igor Raits <igor@gooddata.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Fix double VLAN error when entering promisc mode · ffa9ed86
      Grzegorz Siwik authored
      Avoid enabling or disabling VLAN 0 when trying to set promiscuous
      VLAN mode if double VLAN mode is enabled. This fix is needed
      because the driver tries to add the VLAN 0 filter twice (once for
      inner and once for outer) when double VLAN mode is enabled. The
      filter program is rejected by the firmware when double VLAN is
      enabled, because the promiscuous filter only needs to be set once.
      
      This issue was missed in the initial implementation of double VLAN
      mode.
      
      Fixes: 5eda8afd ("ice: Add support for PF/VF promiscuous mode")
      Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
      Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
      Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
      Tested-by: Igor Raits <igor@gooddata.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 274a2eeb
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
       "Most notably this drops the commits that trip up google cloud (turns
        out, any legacy device).
      
        Plus a kerneldoc patch"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio: kerneldocs fixes and enhancements
        virtio: Revert "virtio: find_vqs() add arg sizes"
        virtio_vdpa: Revert "virtio_vdpa: support the arg sizes of find_vqs()"
        virtio_pci: Revert "virtio_pci: support the arg sizes of find_vqs()"
        virtio-mmio: Revert "virtio_mmio: support the arg sizes of find_vqs()"
        virtio: Revert "virtio: add helper virtio_find_vqs_ctx_size()"
        virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"
    • testing: selftests: nft_flowtable.sh: rework test to detect offload failure · c8550b90
      Florian Westphal authored
      This test fails on current kernel releases because the flowtable path
      now calls dst_check from the packet path and will then remove the offload.
      
      Test script has two purposes:
      1. check that file (random content) can be sent to other netns (and vv)
      2. check that the flow is offloaded (rather than handled by classic
         forwarding path).
      
      Since dst_check is in place, 2) fails because the nftables ruleset in
      router namespace 1 intentionally blocks traffic under the assumption
      that packets are not passed via classic path at all.
      
      Rework this: Instead of blocking traffic, create two named counters, one
      for original and one for reverse direction.
      
      The first three test cases are handled by classic forwarding path
      (path mtu discovery is disabled and packets exceed MTU).
      
      But all other tests enable PMTUD, so the originator and responder are
      expected to lower packet size and flowtable is expected to do the packet
      forwarding.
      
      For those tests, check that the packet counters (which are only
      incremented for packets that are passed up to classic forward path)
      are significantly lower than the file size transferred.
      
      I've tested that the counter-checks fail as expected when the 'flow add'
      statement is removed from the ruleset.
      Signed-off-by: Florian Westphal <fw@strlen.de>