1. 19 Aug, 2022 8 commits
    • perf beauty: Update copy of linux/socket.h with the kernel sources · cf1258ac
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        7fa875b8 ("net: copy from user before calling __copy_msghdr")
        ebe73a28 ("net: Allow custom iter handler in msghdr")
        7c701d92 ("skbuff: carry external ubuf_info in msghdr")
        c0424532 ("net: make __sys_accept4_file() static")
      
      None of these result in any changes in the tables generated from that
      header.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
        diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h
      
      Cc: David Ahern <dsahern@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dylan Yudaken <dylany@fb.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Pavel Begunkov <asml.silence@gmail.com>
      Cc: Yajun Deng <yajun.deng@linux.dev>
      Link: https://lore.kernel.org/lkml/YvzYs+F+Xzq8Hvvp@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cpumap: Fix alignment for masks in event encoding · b2f10cd4
      Ian Rogers authored
      A mask encoding of a cpu map is laid out as:
      
        u16 nr
        u16 long_size
        unsigned long mask[];
      
      However, the mask may be 8-byte aligned meaning there is a 4-byte pad
      after long_size. This means 32-bit and 64-bit builds see the mask as
      being at different offsets. On top of this the structure is in the byte
      data[] encoded as:
      
        u16 type
        char data[]
      
      This means the mask's struct isn't 4- or 8-byte aligned as required, but
      is offset by 2. Consequently the long reads and writes cause undefined
      behavior, as the alignment is broken.
      
      Fix the mask struct by creating explicit 32 and 64-bit variants, use a
      union to avoid data[] and casts; the struct must be packed so the
      layout matches the existing perf.data layout. Taking an address of a
      member of a packed struct breaks alignment so pass the packed
      perf_record_cpu_map_data to functions, so they can access variables with
      the right alignment.
      
      As the 64-bit version has 4 bytes of padding, optimize writing to only
      write the 32-bit version.
      
      Committer notes:
      
      Disable warnings about 'packed' that break the build in some arches like
      riscv64, but just around that specific struct.
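
      The layout problem described above can be illustrated with a minimal
      sketch. This is not the perf code itself: every sketch_* name, the fixed
      mask array sizes and the use of fixed-width types to model the two ABIs
      are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit variant: no padding before the mask words. */
struct sketch_mask32 {
	uint16_t nr;
	uint16_t long_size;
	uint32_t mask[4];	/* fixed size only to keep the sketch simple */
} __attribute__((packed));

/* Hypothetical 64-bit variant: the 4 padding bytes are spelled out. */
struct sketch_mask64 {
	uint16_t nr;
	uint16_t long_size;
	uint8_t pad[4];
	uint64_t mask[2];
} __attribute__((packed));

/* Packed container: a u16 type followed directly by either variant. */
struct sketch_cpu_map_data {
	uint16_t type;
	union {
		struct sketch_mask32 mask32;
		struct sketch_mask64 mask64;
	} u;
} __attribute__((packed));

/*
 * Take the packed container rather than a pointer to mask[], so the
 * compiler knows the access may be unaligned and emits safe loads.
 */
static int sketch_cpu_is_set32(const struct sketch_cpu_map_data *data,
			       unsigned int cpu)
{
	assert(cpu < 32 * 4);
	return (data->u.mask32.mask[cpu / 32] >> (cpu % 32)) & 1;
}
```

      With this layout the 32-bit mask starts at byte 6 of the container and
      the 64-bit mask at byte 10, so neither is naturally aligned, which is
      why the members are read through the packed struct.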
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Colin Ian King <colin.king@intel.com>
      Cc: Dave Marchevsky <davemarchevsky@fb.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20220614143353.1559597-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cpumap: Compute mask size in constant time · 28526478
      Ian Rogers authored
      perf_cpu_map__max() computes the cpumap's maximum value, no need to
      iterate over all values.
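
      As a rough sketch of why the maximum is enough: once the largest CPU
      index is known, the number of mask words follows from one division.
      The helper name and its parameters below are hypothetical and are not
      taken from the perf sources.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of mask words needed to hold bits 0..max_cpu, computed from the
 * maximum alone instead of by visiting every CPU in the map.
 */
static size_t sketch_mask_words(int max_cpu, size_t bits_per_word)
{
	size_t nr_bits;

	assert(max_cpu >= 0 && bits_per_word > 0);
	nr_bits = (size_t)max_cpu + 1;
	return (nr_bits + bits_per_word - 1) / bits_per_word;	/* round up */
}
```

      For example, a map whose highest CPU is 64 needs two 64-bit words,
      while one whose highest CPU is 63 needs only one.
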
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Colin Ian King <colin.king@intel.com>
      Cc: Dave Marchevsky <davemarchevsky@fb.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20220614143353.1559597-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cpumap: Synthetic events and const/static · 35ae6f09
      Ian Rogers authored
      Make the cpumap arguments const to make it clearer they are in rather
      than out arguments. Make two functions static and remove external
      declarations.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Colin Ian King <colin.king@intel.com>
      Cc: Dave Marchevsky <davemarchevsky@fb.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20220614143353.1559597-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cpumap: Const map for max() · e989bc3d
      Ian Rogers authored
      Allows max() to be used with 'const struct perf_cpu_maps *'.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Colin Ian King <colin.king@intel.com>
      Cc: Dave Marchevsky <davemarchevsky@fb.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20220614143353.1559597-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4c2d0b03
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter.
      
        Current release - regressions:
      
         - tcp: fix cleanup and leaks in tcp_read_skb() (the new way BPF
           socket maps get data out of the TCP stack)
      
         - tls: rx: react to strparser initialization errors
      
         - netfilter: nf_tables: fix scheduling-while-atomic splat
      
         - net: fix suspicious RCU usage in bpf_sk_reuseport_detach()
      
        Current release - new code bugs:
      
         - mlxsw: ptp: fix a couple of races, static checker warnings and
           error handling
      
        Previous releases - regressions:
      
         - netfilter:
            - nf_tables: fix possible module reference underflow in error path
            - make conntrack helpers deal with BIG TCP (skbs > 64kB)
            - nfnetlink: re-enable conntrack expectation events
      
         - net: fix potential refcount leak in ndisc_router_discovery()
      
        Previous releases - always broken:
      
         - sched: cls_route: disallow handle of 0
      
         - neigh: fix possible local DoS due to net iface start/stop loop
      
         - rtnetlink: fix module refcount leak in rtnetlink_rcv_msg
      
         - sched: fix adding qlen to qcpu->backlog in gnet_stats_add_queue_cpu
      
         - virtio_net: fix endian-ness for RSS
      
         - dsa: mv88e6060: prevent crash on an unused port
      
         - fec: fix timer capture timing in `fec_ptp_enable_pps()`
      
         - ocelot: stats: fix races, integer wrapping and reading incorrect
           registers (the change of register definitions here accounts for
           bulk of the changed LoC in this PR)"
      
      * tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
        net: moxa: MAC address reading, generating, validity checking
        tcp: handle pure FIN case correctly
        tcp: refactor tcp_read_skb() a bit
        tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()
        tcp: fix sock skb accounting in tcp_read_skb()
        igb: Add lock to avoid data race
        dt-bindings: Fix incorrect "the the" corrections
        net: genl: fix error path memory leak in policy dumping
        stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()
        net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run
        net/mlx5e: Allocate flow steering storage during uplink initialization
        net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats
        net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset
        net: mscc: ocelot: make struct ocelot_stat_layout array indexable
        net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work
        net: mscc: ocelot: turn stats_lock into a spinlock
        net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter
        net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters
        net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters
        net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it
        ...
    • Merge tag 'linux-kselftest-next-6.0-rc2' of... · 90b6b686
      Linus Torvalds authored
      Merge tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull Kselftest fix from Shuah Khan:
      
       - fix landlock test build regression
      
      * tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/landlock: fix broken include of linux/landlock.h
    • Merge tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 0de277d4
      Linus Torvalds authored
      Pull rtla tool fixes from Steven Rostedt:
       "Fixes for the Real-Time Linux Analysis tooling:
      
         - Fix tracer name in comments and prints
      
         - Fix setting up symlinks
      
         - Allow extra flags to be set in build
      
         - Consolidate and show all necessary libraries not found in build
           error"
      
      * tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        rtla: Consolidate and show all necessary libraries that failed for building
        tools/rtla: Build with EXTRA_{C,LD}FLAGS
        tools/rtla: Fix command symlinks
        rtla: Fix tracer name
  2. 18 Aug, 2022 26 commits
  3. 17 Aug, 2022 6 commits
    • net: Fix suspicious RCU usage in bpf_sk_reuseport_detach() · fc4aaf9f
      David Howells authored
      bpf_sk_reuseport_detach() calls __rcu_dereference_sk_user_data_with_flags()
      to obtain the value of sk->sk_user_data, but that function is only usable
      if the RCU read lock is held, and neither that function nor any of its
      callers hold it.
      
      Fix this by adding a new helper, __locked_read_sk_user_data_with_flags(),
      that checks to see if sk->sk_callback_lock is held and using that here
      instead.
      
      Alternatively, making __rcu_dereference_sk_user_data_with_flags() use
      rcu_dereference_checked() might suffice.
      
      Without this, the following warning can be occasionally observed:
      
      =============================
      WARNING: suspicious RCU usage
      6.0.0-rc1-build2+ #563 Not tainted
      -----------------------------
      include/net/sock.h:592 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      5 locks held by locktest/29873:
       #0: ffff88812734b550 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: __sock_release+0x77/0x121
       #1: ffff88812f5621b0 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_close+0x1c/0x70
       #2: ffff88810312f5c8 (&h->lhash2[i].lock){+.+.}-{2:2}, at: inet_unhash+0x76/0x1c0
       #3: ffffffff83768bb8 (reuseport_lock){+...}-{2:2}, at: reuseport_detach_sock+0x18/0xdd
       #4: ffff88812f562438 (clock-AF_INET){++..}-{2:2}, at: bpf_sk_reuseport_detach+0x24/0xa4
      
      stack backtrace:
      CPU: 1 PID: 29873 Comm: locktest Not tainted 6.0.0-rc1-build2+ #563
      Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x4c/0x5f
       bpf_sk_reuseport_detach+0x6d/0xa4
       reuseport_detach_sock+0x75/0xdd
       inet_unhash+0xa5/0x1c0
       tcp_set_state+0x169/0x20f
       ? lockdep_sock_is_held+0x3a/0x3a
       ? __lock_release.isra.0+0x13e/0x220
       ? reacquire_held_locks+0x1bb/0x1bb
       ? hlock_class+0x31/0x96
       ? mark_lock+0x9e/0x1af
       __tcp_close+0x50/0x4b6
       tcp_close+0x28/0x70
       inet_release+0x8e/0xa7
       __sock_release+0x95/0x121
       sock_close+0x14/0x17
       __fput+0x20f/0x36a
       task_work_run+0xa3/0xcc
       exit_to_user_mode_prepare+0x9c/0x14d
       syscall_exit_to_user_mode+0x18/0x44
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: cf8c1e96 ("net: refactor bpf_sk_reuseport_detach()")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Hawkins Jiawei <yin31149@gmail.com>
      Link: https://lore.kernel.org/r/166064248071.3502205.10036394558814861778.stgit@warthog.procyon.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'ntfs3_for_6.0' of https://github.com/Paragon-Software-Group/linux-ntfs3 · 3b06a275
      Linus Torvalds authored
      Pull ntfs3 updates from Konstantin Komarov:
      
       - implement FALLOC_FL_INSERT_RANGE
      
       - fix some logic errors
      
       - fixed xfstests (tested on x86_64): generic/064 generic/213
         generic/300 generic/361 generic/449 generic/485
      
       - some dead code removed or refactored
      
      * tag 'ntfs3_for_6.0' of https://github.com/Paragon-Software-Group/linux-ntfs3: (39 commits)
        fs/ntfs3: uninitialized variable in ntfs_set_acl_ex()
        fs/ntfs3: Remove unused function wnd_bits
        fs/ntfs3: Make ni_ins_new_attr return error
        fs/ntfs3: Create MFT zone only if length is large enough
        fs/ntfs3: Refactoring attr_insert_range to restore after errors
        fs/ntfs3: Refactoring attr_punch_hole to restore after errors
        fs/ntfs3: Refactoring attr_set_size to restore after errors
        fs/ntfs3: New function ntfs_bad_inode
        fs/ntfs3: Make MFT zone less fragmented
        fs/ntfs3: Check possible errors in run_pack in advance
        fs/ntfs3: Added comments to frecord functions
        fs/ntfs3: Fill duplicate info in ni_add_name
        fs/ntfs3: Make static function attr_load_runs
        fs/ntfs3: Add new argument is_mft to ntfs_mark_rec_free
        fs/ntfs3: Remove unused mi_mark_free
        fs/ntfs3: Fix very fragmented case in attr_punch_hole
        fs/ntfs3: Fix work with fragmented xattr
        fs/ntfs3: Make ntfs_fallocate return -ENOSPC instead of -EFBIG
        fs/ntfs3: extend ni_insert_nonresident to return inserted ATTR_LIST_ENTRY
        fs/ntfs3: Check reserved size for maximum allowed
        ...
    • dcache: move the DCACHE_OP_COMPARE case out of the __d_lookup_rcu loop · ae2a8236
      Linus Torvalds authored
      __d_lookup_rcu() is one of the hottest functions in the kernel on
      certain loads, and it is complicated by filesystems that might want to
      have their own name compare function.
      
      We can improve code generation by moving the test of DCACHE_OP_COMPARE
      outside the loop, which makes the loop itself much simpler, at the cost
      of some code duplication.  But both cases end up being simpler, and the
      "native" direct case-sensitive compare particularly so.
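
      The general shape of this transformation can be sketched in user space
      as follows. This is only an illustration of hoisting a loop-invariant
      test out of a lookup loop; the list structure, the compare hook and all
      sketch_* names are hypothetical and do not mirror the dcache code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct sketch_entry {
	const char *name;
	size_t len;
	struct sketch_entry *next;
};

/* Optional filesystem-style compare hook: returns 0 on match. */
typedef int (*sketch_cmp_fn)(const char *a, size_t alen,
			     const char *b, size_t blen);

/* Example hook: ASCII case-insensitive compare. */
static int sketch_cmp_nocase(const char *a, size_t alen,
			     const char *b, size_t blen)
{
	if (alen != blen)
		return 1;
	for (size_t i = 0; i < alen; i++)
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return 1;
	return 0;
}

/* Slow path: every candidate goes through the hook. */
static struct sketch_entry *
sketch_lookup_op_compare(struct sketch_entry *head, const char *name,
			 size_t len, sketch_cmp_fn cmp)
{
	for (struct sketch_entry *e = head; e; e = e->next)
		if (cmp(e->name, e->len, name, len) == 0)
			return e;
	return NULL;
}

static struct sketch_entry *
sketch_lookup(struct sketch_entry *head, const char *name, size_t len,
	      sketch_cmp_fn cmp)
{
	assert(name != NULL);

	/* The hook test happens once, before the loop, not per entry. */
	if (cmp)
		return sketch_lookup_op_compare(head, name, len, cmp);

	/* Fast path: plain byte-wise compare, no indirect call in the loop. */
	for (struct sketch_entry *e = head; e; e = e->next)
		if (e->len == len && memcmp(e->name, name, len) == 0)
			return e;
	return NULL;
}
```

      The loop body is duplicated, but each copy is simpler than a single
      loop that re-tests the hook for every entry.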
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • net: dsa: microchip: ksz9477: fix fdb_dump last invalid entry · 36c0d935
      Arun Ramadoss authored
      The ksz9477_fdb_dump() function reads the ALU control register and
      exits the timeout loop when there is a valid entry or the search is
      complete. After exiting the loop, it reads the ALU entry and reports it
      to user space regardless of whether the entry is valid. This works as
      long as the entry is valid. But if the loop exits because the search is
      complete, it still reads the ALU table, which returns all ones, and
      that is reported to user space. So "bridge fdb show" gives
      ff:ff:ff:ff:ff:ff as the last entry for every port. To fix this, report
      the entry after exiting the loop only if it is valid.
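
      The control flow of the fix can be modelled with a small sketch. It
      replaces the hardware with an array of pre-recorded register reads; the
      flag values, struct and function names are hypothetical and not taken
      from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ALU_VALID      (1u << 0)	/* hypothetical flag values */
#define SKETCH_ALU_SEARCH_END (1u << 1)

/* One simulated poll of the ALU: status flags plus the table contents. */
struct sketch_alu_read {
	uint32_t status;
	uint64_t mac;
};

/* Copies only valid entries to out[]; returns how many were reported. */
static size_t sketch_fdb_dump(const struct sketch_alu_read *reads,
			      size_t nreads, uint64_t *out, size_t out_cap)
{
	size_t n = 0;

	assert(reads != NULL && out != NULL);

	for (size_t i = 0; i < nreads; i++) {
		uint32_t status = reads[i].status;

		/* Neither flag set: keep polling. */
		if (!(status & (SKETCH_ALU_VALID | SKETCH_ALU_SEARCH_END)))
			continue;

		/* Report the entry only when the hardware marked it valid. */
		if ((status & SKETCH_ALU_VALID) && n < out_cap)
			out[n++] = reads[i].mac;

		/* Search finished: stop without reporting the all-ones read. */
		if (status & SKETCH_ALU_SEARCH_END)
			break;
	}
	return n;
}
```

      In this model the final read, which carries only the search-end flag
      and an all-ones address, ends the dump without being reported.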
      
      Fixes: b987e98e ("dsa: add DSA switch driver for Microchip KSZ9477")
      Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220816105516.18350-1-arun.ramadoss@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice: Fix VF not able to send tagged traffic with no VLAN filters · 664d4646
      Sylwester Dziedziuch authored
      VF was not able to send tagged traffic when it didn't
      have any VLAN interfaces and VLAN anti-spoofing was enabled.
      Fix this by allowing VFs with no VLAN filters to send tagged
      traffic. After VF adds a VLAN interface it will be able to
      send tagged traffic matching VLAN filters only.
      
      Testing hints:
      1. Spawn VF
      2. Send tagged packet from a VF
      3. The packet should be sent out and not dropped
      4. Add a VLAN interface on VF
      5. Send tagged packet on that VLAN interface
      6. Packet should be sent out and not dropped
      7. Send tagged packet with id different than VLAN interface
      8. Packet should be dropped
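
      The policy exercised by the testing hints above can be written down as
      a small decision function. This is a sketch of the described behaviour
      only; the function name and the flat array of VLAN IDs are assumptions
      and say nothing about how the ice driver stores its filters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether a VF may transmit a frame tagged with 'vid'.
 * With no VLAN filters any tag is allowed; otherwise the tag must match.
 */
static bool sketch_vf_may_send_tagged(const uint16_t *filters,
				      size_t nfilters, uint16_t vid)
{
	assert(filters != NULL || nfilters == 0);

	if (nfilters == 0)
		return true;

	for (size_t i = 0; i < nfilters; i++)
		if (filters[i] == vid)
			return true;

	return false;
}
```

      Steps 2 and 3 correspond to the empty-filter case, steps 5 and 6 to a
      matching filter, and steps 7 and 8 to a tag that matches no filter.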
      
      Fixes: daf4dd16 ("ice: Refactor spoofcheck configuration functions")
      Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Ignore error message when setting same promiscuous mode · 79956b83
      Benjamin Mikailenko authored
      Commit 1273f895 ("ice: Fix broken IFF_ALLMULTI handling")
      introduced new checks when setting/clearing promiscuous mode. But if the
      requested promiscuous mode setting already exists, an -EEXIST error
      message would be printed. This is incorrect because promiscuous mode is
      either on/off and shouldn't print an error when the requested
      configuration is already set.
      
      This can happen when removing a bridge with two bonded interfaces and
      promiscuous mode isn't fully cleared from VLAN VSI in hardware.
      
      Fix this by ignoring cases where requested promiscuous mode exists.
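
      In outline, the fix amounts to treating "already set" as success. A
      minimal sketch, assuming errors are passed around as negative errno
      values; the helper name is hypothetical and is not a function from the
      ice driver.

```c
#include <assert.h>
#include <errno.h>

/*
 * Map the status of a promiscuous-mode request to what the caller sees:
 * -EEXIST means the requested mode is already in place, so it is not
 * an error and nothing should be printed for it.
 */
static int sketch_promisc_result(int hw_err)
{
	assert(hw_err <= 0);

	if (hw_err == -EEXIST)
		return 0;

	return hw_err;
}
```

      Any other error is still passed through unchanged, so real failures
      keep being reported.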
      
      Fixes: 1273f895 ("ice: Fix broken IFF_ALLMULTI handling")
      Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com>
      Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
      Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>