1. 09 Sep, 2024 5 commits
    • perf trace: Introduce SCA_SOCKADDR_FROM_USER() to set .from_user = true · be14a719
      Arnaldo Carvalho de Melo authored
      Paving the way for the generic BPF BTF based syscall arg augmenter.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Howard Chu <howardchu95@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Introduce SCA_PERF_ATTR_FROM_USER() to set .from_user = true · 690eda65
      Arnaldo Carvalho de Melo authored
      Paving the way for the generic BPF BTF based syscall arg augmenter.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Howard Chu <howardchu95@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Mark which syscall arguments go from user space to kernel space · 2f2e439b
      Arnaldo Carvalho de Melo authored
      The BPF augmenters need to know where to collect each argument: in the
      sys_enter hook or in the sys_exit hook.
      
      Start with SCA_FILENAME, which only goes from user space to kernel space.
      
      A better alternative, which takes more time than I have right now, is to
      use the __user information that is already attached to the syscall args
      and encoded in BTF via a tag. Do that later.
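      
      As a rough illustration of the idea (not the code in this commit), a
      per-argument flag can drive the choice of hook. Every name below
      (struct arg_fmt, ARG_FROM_USER(), ARG_TO_USER(), arg_fmt__hook()) is
      hypothetical, and treating every argument not marked from_user as
      "collect at exit" is an assumption made only for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a BPF collector would copy the argument contents (sketch only). */
enum collect_hook { COLLECT_AT_ENTER, COLLECT_AT_EXIT };

/* Hypothetical, simplified per-argument descriptor. */
struct arg_fmt {
	const char *name;
	bool from_user; /* true: user space -> kernel, valid at syscall entry */
};

/* Initializer helpers in the spirit of the *_FROM_USER() macros. */
#define ARG_FROM_USER(argname) { .name = (argname), .from_user = true }
#define ARG_TO_USER(argname)   { .name = (argname), .from_user = false }

/* Assumption: anything not marked from_user is collected at syscall exit. */
static enum collect_hook arg_fmt__hook(const struct arg_fmt *fmt)
{
	return fmt->from_user ? COLLECT_AT_ENTER : COLLECT_AT_EXIT;
}
```

      With this shape, a filename argument would be declared with
      ARG_FROM_USER("filename") and read in the sys_enter hook.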
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Howard Chu <howardchu95@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Use a common encoding for augmented arguments, with size + error + payload · c90a88d3
      Arnaldo Carvalho de Melo authored
      We were using a more compact format, without explicitly encoding the
      size and possible error in the payload for an argument.
      
      To do it generically, at least as Howard Chu did in his GSoC activities,
      it is more convenient to use the same model that was being used for
      string arguments, passing { size, error, payload }.
      
      So use that for the non string syscall args we have so far:
      
        struct timespec
        struct perf_event_attr
        struct sockaddr (this one has even a variable size)
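      
      A minimal sketch of what a { size, error, payload } record can look
      like, assuming a fixed maximum payload. The names struct aug_arg,
      AUG_PAYLOAD_MAX, struct demo_timespec and the two helpers are
      hypothetical and are not taken from the perf sources:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AUG_PAYLOAD_MAX 256 /* assumed limit, chosen only for this sketch */

/* Hypothetical common encoding: { size, error, payload }. */
struct aug_arg {
	unsigned int size;  /* bytes of payload actually filled in */
	int          err;   /* 0 on success, negative on failure */
	char         payload[AUG_PAYLOAD_MAX];
};

/* Stand-in for a fixed-size syscall argument such as a timespec. */
struct demo_timespec {
	long long sec;
	long      nsec;
};

/* Producer side: copy an object of any (bounded) size into the record. */
static int aug_arg__encode(struct aug_arg *arg, const void *src, size_t len)
{
	if (len > sizeof(arg->payload)) {
		arg->size = 0;
		arg->err = -1; /* placeholder error code */
		return -1;
	}
	memcpy(arg->payload, src, len);
	arg->size = (unsigned int)len;
	arg->err = 0;
	return 0;
}

/* Consumer side: trust the payload only if err is clear and the size fits. */
static int aug_arg__decode_ts(const struct aug_arg *arg, struct demo_timespec *ts)
{
	if (arg->err != 0 || arg->size != sizeof(*ts))
		return -1;
	memcpy(ts, arg->payload, sizeof(*ts));
	return 0;
}
```

      Carrying the size explicitly is what lets a variable-sized argument
      such as a sockaddr share the same record shape as the fixed-size ones.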
      
      With this in place we have the userspace pretty printers:
      
        perf_event_attr___scnprintf()
        syscall_arg__scnprintf_augmented_sockaddr()
        syscall_arg__scnprintf_augmented_timespec()
      
      Ready to have the generic BPF collector in tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
      sending its generic payload and thus we'll use them instead of a generic
      libbpf btf_dump interface that doesn't know about the sockaddr
      mux, perf_event_attr non-trivial fields (sample_type, etc), leaving it
      as a (useful) fallback that prints just basic types until we put in
      place a more sophisticated pretty printer infrastructure that associates
      synthesized enums to struct fields using the header scrapers we have in
      tools/perf/trace/beauty/, some of them in this list:
      
        $ ls tools/perf/trace/beauty/*.sh
        tools/perf/trace/beauty/arch_errno_names.sh
        tools/perf/trace/beauty/kcmp_type.sh
        tools/perf/trace/beauty/perf_ioctl.sh
        tools/perf/trace/beauty/statx_mask.sh
        tools/perf/trace/beauty/clone.sh
        tools/perf/trace/beauty/kvm_ioctl.sh
        tools/perf/trace/beauty/pkey_alloc_access_rights.sh
        tools/perf/trace/beauty/sync_file_range.sh
        tools/perf/trace/beauty/drm_ioctl.sh
        tools/perf/trace/beauty/madvise_behavior.sh
        tools/perf/trace/beauty/prctl_option.sh
        tools/perf/trace/beauty/usbdevfs_ioctl.sh
        tools/perf/trace/beauty/fadvise.sh
        tools/perf/trace/beauty/mmap_flags.sh
        tools/perf/trace/beauty/rename_flags.sh
        tools/perf/trace/beauty/vhost_virtio_ioctl.sh
        tools/perf/trace/beauty/fs_at_flags.sh
        tools/perf/trace/beauty/mmap_prot.sh
        tools/perf/trace/beauty/sndrv_ctl_ioctl.sh
        tools/perf/trace/beauty/x86_arch_prctl.sh
        tools/perf/trace/beauty/fsconfig.sh
        tools/perf/trace/beauty/mount_flags.sh
        tools/perf/trace/beauty/sndrv_pcm_ioctl.sh
        tools/perf/trace/beauty/fsmount.sh
        tools/perf/trace/beauty/move_mount_flags.sh
        tools/perf/trace/beauty/sockaddr.sh
        tools/perf/trace/beauty/fspick.sh
        tools/perf/trace/beauty/mremap_flags.sh
        tools/perf/trace/beauty/socket.sh
        $
      
      Testing it:
      
        root@number:~# rm -f 987654 ; touch 123456 ; perf trace -e rename* mv 123456 987654
           0.000 ( 0.031 ms): mv/1193096 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0
        root@number:~# perf trace -e *nanosleep sleep 1.2345678901
             0.000 (1234.654 ms): sleep/1192697 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567891 }, rmtp: 0x7ffe1ea80460) = 0
        root@number:~# perf trace -e perf_event_open* perf stat -e cpu-clock sleep 1
             0.000 ( 0.011 ms): perf/1192701 perf_event_open(attr_uptr: { type: 1 (software), size: 136, config: 0 (PERF_COUNT_SW_CPU_CLOCK), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 1192702 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
      
         Performance counter stats for 'sleep 1':
      
                      0.51 msec cpu-clock                        #    0.001 CPUs utilized
      
               1.001242090 seconds time elapsed
      
               0.000000000 seconds user
               0.001010000 seconds sys
      
        root@number:~# perf trace -e connect* ping -c 1 bsky.app
             0.000 ( 0.130 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0
            23.907 ( 0.006 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.20.108.158 }, addrlen: 16) = 0
            23.915 PING bsky.app (3.20.108.158) 56(84) bytes of data.
        ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16)           = 0
            23.917 ( 0.002 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.12.170.30 }, addrlen: 16) = 0
            23.921 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16)           = 0
            23.923 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 18.217.70.179 }, addrlen: 16) = 0
            23.925 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16)           = 0
            23.927 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.132.20.46 }, addrlen: 16) = 0
            23.930 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16)           = 0
            23.931 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.142.89.165 }, addrlen: 16) = 0
            23.934 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16)           = 0
            23.935 ( 0.002 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 18.119.147.159 }, addrlen: 16) = 0
            23.938 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16)           = 0
            23.940 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.22.38.164 }, addrlen: 16) = 0
            23.942 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16)           = 0
            23.944 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.13.14.133 }, addrlen: 16) = 0
            23.956 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 1025, addr: 3.20.108.158 }, addrlen: 16) = 0
        ^C
        --- bsky.app ping statistics ---
        1 packets transmitted, 0 received, 100% packet loss, time 0ms
      
        root@number:~#
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Howard Chu <howardchu95@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/CAP-5=fW4=2GoP6foAN6qbrCiUzy0a_TzHbd8rvDsakTPfdzvfg@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace augmented_syscalls.bpf: Move the renameat augmenter to renameat2, temporarily · c1632cc5
      Arnaldo Carvalho de Melo authored
      While trying to shape Howard Chu's generic BPF augmenter transition into
      the codebase I got stuck with the renameat2 syscall.
      
      Until I noticed that the attempt at reusing augmenters was making it
      use the 'openat' syscall augmenter, which collects just one string
      syscall arg, for the 'renameat2' syscall, which takes two strings.
      
      So, for the moment, just to help in this transition period, since
      'renameat2' is what is used these days in the 'mv' utility, just make
      the BPF collector be associated with the more widely used syscall,
      hopefully the transition to Howard's generic BPF augmenter will cure
      this, so get this out of the way for now!
      
      So now we still have that odd "reuse", but for something we're not
      testing so won't get in the way anymore:
      
        root@number:~# rm -f 987654 ; touch 123456 ; perf trace -vv -e rename* mv 123456 987654 |& grep renameat
        Reusing "openat" BPF sys_enter augmenter for "renameat"
             0.000 ( 0.079 ms): mv/1158612 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0
        root@number:~#
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Howard Chu <howardchu95@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/CAP-5=fXjGYs=tpBgETK-P9U-CuXssytk9pSnTXpfphrmmOydWA@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 06 Sep, 2024 8 commits
    • perf mem: Fix the wrong reference in parse_record_events() · 003265bb
      Kan Liang authored
      A segmentation fault can be triggered when running
      'perf mem record -e ldlat-loads'
      
      The commit 35b38a71 ("perf mem: Rework command option handling")
      moves the OPT_CALLBACK of event from __cmd_record() to cmd_mem().
      
      By the time __cmd_record() is invoked, 'mem' has already been
      referenced (&), so the &mem passed into parse_record_events() there was
      a double reference (&&) to the original struct perf_mem mem.
      
      In cmd_mem(), however, &mem is a single reference (&) to the original
      struct perf_mem mem.
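      
      The mismatch can be sketched outside of perf as follows. The types
      struct mem_opts and struct option_slot and both callbacks are
      hypothetical stand-ins, used only to show why a callback written for
      one level of indirection misbehaves when handed the other:

```c
#include <assert.h>
#include <stddef.h>

struct mem_opts { int ldlat; };       /* stand-in for struct perf_mem */
struct option_slot { void *value; };  /* stand-in for an option's .value */

/* Correct when .value holds the address of the struct (single reference). */
static int callback_single(const struct option_slot *opt)
{
	struct mem_opts *mem = opt->value;

	return mem->ldlat;
}

/*
 * Correct only when .value holds the address of a pointer to the struct
 * (double reference). Handing this callback a single reference would make
 * it treat the struct's own bytes as a pointer, which is the kind of
 * mistake that ends in a segmentation fault.
 */
static int callback_double(const struct option_slot *opt)
{
	struct mem_opts *mem = *(struct mem_opts **)opt->value;

	return mem->ldlat;
}
```

      Each callback is only valid for the registration style it was written
      for, so moving where the option is registered means the dereference in
      the callback has to change with it.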
      
      Fixes: 35b38a71 ("perf mem: Rework command option handling")
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240905170737.4070743-3-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf mem: Fix missed p-core mem events on ADL and RPL · 5ad7db2c
      Kan Liang authored
      The p-core mem events are missed when launching 'perf mem record' on ADL
      and RPL.
      
        root@number:~# perf mem record sleep 1
        Memory events are enabled on a subset of CPUs: 16-27
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.032 MB perf.data ]
        root@number:~# perf evlist
        cpu_atom/mem-loads,ldlat=30/P
        cpu_atom/mem-stores/P
        dummy:u
      
      A variable 'record' in the 'struct perf_mem_event' is to indicate
      whether a mem event in a mem_events[] should be recorded. The current
      code only configures the variable for the first eligible PMU.
      
      It's good enough for a non-hybrid machine or a hybrid machine which has
      the same mem_events[].
      
      However, if a different mem_events[] is used for different PMUs on a
      hybrid machine, e.g., ADL or RPL, the 'record' for the second PMU never
      gets a chance to be set.
      
      The mem_events[] of the second PMU are always ignored.
      
      'perf mem' doesn't support the per-PMU configuration now. A per-PMU
      mem_events[] 'record' variable doesn't make sense. Make it global.
      
      That could also avoid searching for the per-PMU mem_events[] via
      perf_pmu__mem_events_ptr every time.
      
      Committer testing:
      
        root@number:~# perf evlist -g
        cpu_atom/mem-loads,ldlat=30/P
        cpu_atom/mem-stores/P
        {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
        cpu_core/mem-stores/P
        dummy:u
        root@number:~#
      
      The :S for '{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}' is
      not being added by 'perf evlist -g', to be checked.
      
      Fixes: abbdd79b ("perf mem: Clean up perf_mem_events__name()")
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Closes: https://lore.kernel.org/lkml/Zthu81fA3kLC2CS2@x1/
      Link: https://lore.kernel.org/r/20240905170737.4070743-2-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf mem: Check mem_events for all eligible PMUs · 6e05d28f
      Kan Liang authored
      The current perf_pmu__mem_events_init() only checks the availability of
      the mem_events for the first eligible PMU. It works for non-hybrid
      machines and hybrid machines that have the same mem_events.
      
      However, it may cause issues if a hybrid machine has different
      mem_events on different PMUs, e.g., Alder Lake and Raptor Lake. A
      mem-loads-aux event is only required for the p-core. The mem_events on
      both e-core and p-core should be checked and marked.
      
      The issue was not found, because it's hidden by another bug, which only
      records the mem-events for the e-core. The wrong check for the p-core
      events didn't yell.
      
      Fixes: abbdd79b ("perf mem: Clean up perf_mem_events__name()")
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240905170737.4070743-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script python: Avoid buffer overflow in python PEBS register interface · 4bef6168
      Andi Kleen authored
      Running a script that processes PEBS records gives buffer overflows
      in valgrind.
      
      The problem is that the allocation of the register string doesn't
      include the terminating 0 byte. Fix this.
      
      I also replaced the very magic "28" with a more reasonable larger buffer
      that should fit all registers.  There's no need to conserve memory here.
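      
      The class of bug can be sketched like this, assuming a per-register
      budget of REG_ENTRY_MAX bytes (an arbitrary value for this example, not
      the one used in perf). The helpers count_bits() and format_regs() are
      hypothetical. The point is the "+ 1": with an empty register mask the
      budget alone is 0 bytes, so even writing the terminator would land
      outside the block:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define REG_ENTRY_MAX 64 /* assumed generous per-register budget */

/* Count the bits set in mask (one bit per sampled register). */
static unsigned int count_bits(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Returns a malloc()ed, NUL-terminated "rN:0x... " string; caller frees. */
static char *format_regs(uint64_t mask, const uint64_t *values)
{
	/* + 1 reserves room for the terminating '\0', even when mask == 0. */
	size_t size = (size_t)count_bits(mask) * REG_ENTRY_MAX + 1;
	char *bf = malloc(size);
	size_t used = 0;
	unsigned int bit, idx = 0;

	if (!bf)
		return NULL;
	bf[0] = '\0';

	for (bit = 0; bit < 64; bit++) {
		int n;

		if (!(mask & (1ULL << bit)))
			continue;
		n = snprintf(bf + used, size - used, "r%u:0x%llx ", bit,
			     (unsigned long long)values[idx++]);
		if (n < 0 || (size_t)n >= size - used)
			break; /* would not fit; keep what was written so far */
		used += (size_t)n;
	}
	return bf;
}
```

      Dropping the "+ 1" and calling format_regs(0, ...) gives a zero-sized
      allocation followed by a one-byte write, which matches the kind of
      "0 bytes after a block of size 0 alloc'd" report shown in the valgrind
      output that follows.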
      
        ==2106591== Memcheck, a memory error detector
        ==2106591== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
        ==2106591== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
        ==2106591== Command: ../perf script -i tcall.data gcov.py tcall.gcov
        ==2106591==
        ==2106591== Invalid write of size 1
        ==2106591==    at 0x713354: regs_map (trace-event-python.c:748)
        ==2106591==    by 0x7134EB: set_regs_in_dict (trace-event-python.c:784)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==    by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
        ==2106591==  Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
        ==2106591==    at 0x484280F: malloc (vg_replace_malloc.c:442)
        ==2106591==    by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==    by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
        ==2106591==
        ==2106591== Invalid read of size 1
        ==2106591==    at 0x484B6C6: strlen (vg_replace_strmem.c:502)
        ==2106591==    by 0x555D494: PyUnicode_FromString (unicodeobject.c:1899)
        ==2106591==    by 0x7134F7: set_regs_in_dict (trace-event-python.c:786)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==  Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
        ==2106591==    at 0x484280F: malloc (vg_replace_malloc.c:442)
        ==2106591==    by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==    by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
        ==2106591==
        ==2106591== Invalid write of size 1
        ==2106591==    at 0x713354: regs_map (trace-event-python.c:748)
        ==2106591==    by 0x713539: set_regs_in_dict (trace-event-python.c:789)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==    by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
        ==2106591==  Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
        ==2106591==    at 0x484280F: malloc (vg_replace_malloc.c:442)
        ==2106591==    by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==    by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
        ==2106591==
        ==2106591== Invalid read of size 1
        ==2106591==    at 0x484B6C6: strlen (vg_replace_strmem.c:502)
        ==2106591==    by 0x555D494: PyUnicode_FromString (unicodeobject.c:1899)
        ==2106591==    by 0x713545: set_regs_in_dict (trace-event-python.c:791)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==  Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
        ==2106591==    at 0x484280F: malloc (vg_replace_malloc.c:442)
        ==2106591==    by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
        ==2106591==    by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
        ==2106591==    by 0x716327: python_process_general_event (trace-event-python.c:1499)
        ==2106591==    by 0x7164E1: python_process_event (trace-event-python.c:1531)
        ==2106591==    by 0x44F9AF: process_sample_event (builtin-script.c:2549)
        ==2106591==    by 0x6294DC: evlist__deliver_sample (session.c:1534)
        ==2106591==    by 0x6296D0: machines__deliver_event (session.c:1573)
        ==2106591==    by 0x629C39: perf_session__deliver_event (session.c:1655)
        ==2106591==    by 0x625830: ordered_events__deliver_event (session.c:193)
        ==2106591==    by 0x630B23: do_flush (ordered-events.c:245)
        ==2106591==    by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
        ==2106591==
        73056 total, 29 ignored
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240905151058.2127122-2-ak@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf jevents: Ignore sys when determining a model directory · f2dbc779
      Ian Rogers authored
      Existing sys directories aren't placed under a model directory like
      skylake.
      
      Placing a sys directory there causes the `is_leaf_dir` test to fail and
      consequently no events or metrics are generated for the model.
      
      Ignore sys directories in this case and update the comments to
      reflect why.
      
      This change has no effect, but when testing with a sys directory for a
      model people have reported running into the no event/metric issue.
      Reported-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Xu Yang <xu.yang_2@nxp.com>
      Link: https://lore.kernel.org/r/20240904211705.915101-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge remote-tracking branch 'torvalds/master' into perf-tools-next · 92984e44
      Arnaldo Carvalho de Melo authored
      To pick up fixes from perf-tools/perf-tools, some of which were also in
      perf-tools-next but were then identified as being more appropriate to
      go sooner, to fix regressions in v6.11.
      
      Resolve a simple merge conflict in tools/perf/tests/pmu.c where a more
      future proof approach to initialize all fields of a struct was used in
      perf-tools-next, the one that is going into v6.11 is enough for the
      segfault it addressed (using an uninitialized test_pmu.alias field).
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge tag 'bpf-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · b831f83e
      Linus Torvalds authored
      Pull bpf fixes from Alexei Starovoitov:
      
       - Fix crash when btf_parse_base() returns an error (Martin Lau)
      
       - Fix out of bounds access in btf_name_valid_section() (Jeongjun Park)
      
      * tag 'bpf-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests/bpf: Add a selftest to check for incorrect names
        bpf: add check for invalid name in btf_name_valid_section()
        bpf: Fix a crash when btf_parse_base() returns an error pointer
    • Merge tag 'net-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · d759ee24
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from can, bluetooth and wireless.
      
        No known regressions at this point. Another calm week, but chances are
        that has more to do with vacation season than the quality of our work.
      
        Current release - new code bugs:
      
         - smc: prevent NULL pointer dereference in txopt_get
      
         - eth: ti: am65-cpsw: number of XDP-related fixes
      
        Previous releases - regressions:
      
         - Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over
           BREDR/LE", it breaks existing user space
      
         - Bluetooth: qca: if memdump doesn't work, re-enable IBS to avoid
           later problems with suspend
      
         - can: mcp251x: fix deadlock if an interrupt occurs during
           mcp251x_open
      
         - eth: r8152: fix the firmware communication error due to use of bulk
           write
      
         - ptp: ocp: fix serial port information export
      
         - eth: igb: fix not clearing TimeSync interrupts for 82580
      
         - Revert "wifi: ath11k: support hibernation", fix suspend on Lenovo
      
        Previous releases - always broken:
      
         - eth: intel: fix crashes and bugs when reconfiguration and resets
           happening in parallel
      
         - wifi: ath11k: fix NULL dereference in ath11k_mac_get_eirp_power()
      
        Misc:
      
         - docs: netdev: document guidance on cleanup.h"
      
      * tag 'net-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
        ila: call nf_unregister_net_hooks() sooner
        tools/net/ynl: fix cli.py --subscribe feature
        MAINTAINERS: fix ptp ocp driver maintainers address
        selftests: net: enable bind tests
        net: dsa: vsc73xx: fix possible subblocks range of CAPT block
        sched: sch_cake: fix bulk flow accounting logic for host fairness
        docs: netdev: document guidance on cleanup.h
        net: xilinx: axienet: Fix race in axienet_stop
        net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN
        r8152: fix the firmware doesn't work
        fou: Fix null-ptr-deref in GRO.
        bareudp: Fix device stats updates.
        net: mana: Fix error handling in mana_create_txq/rxq's NAPI cleanup
        bpf, net: Fix a potential race in do_sock_getsockopt()
        net: dqs: Do not use extern for unused dql_group
        sch/netem: fix use after free in netem_dequeue
        usbnet: modern method to get random MAC
        MAINTAINERS: wifi: cw1200: add net-cw1200.h
        ice: do not bring the VSI up, if it was down before the XDP setup
        ice: remove ICE_CFG_BUSY locking from AF_XDP code
        ...
  3. 05 Sep, 2024 24 commits
    • Merge tag 'spi-fix-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · f9535999
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A few small driver specific fixes (including some of the widespread
        work on fixing missing ID tables for module autoloading and the revert
        of some problematic PM work in spi-rockchip), some improvements to the
        MAINTAINERS information for the NXP drivers and the addition of a new
        device ID to spidev"
      
      * tag 'spi-fix-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        MAINTAINERS: SPI: Add mailing list imx@lists.linux.dev for nxp spi drivers
        MAINTAINERS: SPI: Add freescale lpspi maintainer information
        spi: spi-fsl-lpspi: Fix off-by-one in prescale max
        spi: spidev: Add missing spi_device_id for jg10309-01
        spi: bcm63xx: Enable module autoloading
        spi: intel: Add check devm_kasprintf() returned value
        spi: spidev: Add an entry for elgin,jg10309-01
        spi: rockchip: Resolve unbalanced runtime PM / system PM handling
      f9535999
    • Merge tag 'regulator-fix-v6.11-stub' of... · 2a660447
      Linus Torvalds authored
      Merge tag 'regulator-fix-v6.11-stub' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fix from Mark Brown:
       "A fix from Doug Anderson for a missing stub, required to fix the build
        for some newly added users of devm_regulator_bulk_get_const() in
        !REGULATOR configurations"
      
      * tag 'regulator-fix-v6.11-stub' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: core: Stub devm_regulator_bulk_get_const() if !CONFIG_REGULATOR
      2a660447
    • Merge tag 'rust-fixes-6.11-2' of https://github.com/Rust-for-Linux/linux · 6c5b3e30
      Linus Torvalds authored
      Pull Rust fixes from Miguel Ojeda:
       "Toolchain and infrastructure:
      
         - Fix builds for nightly compiler users now that 'new_uninit' was
           split into new features by using an alternative approach for the
           code that used what is now called the 'box_uninit_write' feature
      
         - Allow the 'stable_features' lint to preempt upcoming warnings about
           them, since soon there will be unstable features that will become
           stable in nightly compilers
      
         - Export bss symbols too
      
        'kernel' crate:
      
         - 'block' module: fix wrong usage of lockdep API
      
        'macros' crate:
      
         - Provide correct provenance when constructing 'THIS_MODULE'
      
        Documentation:
      
         - Remove unintended indentation (blockquotes) in generated output
      
         - Fix a couple typos
      
        MAINTAINERS:
      
         - Remove Wedson as Rust maintainer
      
         - Update Andreas' email"
      
      * tag 'rust-fixes-6.11-2' of https://github.com/Rust-for-Linux/linux:
        MAINTAINERS: update Andreas Hindborg's email address
        MAINTAINERS: Remove Wedson as Rust maintainer
        rust: macros: provide correct provenance when constructing THIS_MODULE
        rust: allow `stable_features` lint
        docs: rust: remove unintended blockquote in Quick Start
        rust: alloc: eschew `Box<MaybeUninit<T>>::write`
        rust: kernel: fix typos in code comments
        docs: rust: remove unintended blockquote in Coding Guidelines
        rust: block: fix wrong usage of lockdep API
        rust: kbuild: fix export of bss symbols
      6c5b3e30
    • Merge tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · e4b42053
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Fix adding a new fgraph callback after function graph tracing has
         already started.
      
         If the new caller does not initialize its hash before registering the
         fgraph_ops, it can cause a NULL pointer dereference. Fix this by
         adding a new parameter to ftrace_graph_enable_direct() passing in the
         newly added gops directly and not rely on using the fgraph_array[],
         as entries in the fgraph_array[] must be initialized.
      
         Assign the new gops to the fgraph_array[] after it goes through
         ftrace_startup_subops() as that will properly initialize the
         gops->ops and initialize its hashes.
      
       - Fix a memory leak in fgraph storage memory test.
      
         If the "multiple fgraph storage on a function" boot up selftest fails
         in the registering of the function graph tracer, it will not free the
         memory it allocated for the filter. Break the loop up into two where
         it allocates the filters first and then registers the functions where
         any errors will do the appropriate clean ups.
      
       - Only clear the timerlat timers if it has an associated kthread.
      
         In the rtla tool that uses timerlat, if it was killed just as it was
         shutting down, the signals can free the kthread and the timer. But
         the closing of the timerlat files could cause the hrtimer_cancel() to
         be called on the already freed timer. As the kthread variable is
         set to NULL when the kthreads are stopped and the timers are freed it
         can be used to know not to call hrtimer_cancel() on the timer if the
         kthread variable is NULL.
      
       - Use a cpumask to keep track of osnoise/timerlat kthreads
      
         The timerlat tracer can use user space threads for its analysis. When
         the rtla tool is killed, the kernel can get confused about whether it
         is using a user space thread for the analysis or one of its own kernel
         threads. When this confusion happens, kthread_stop() can be called on
         a user space thread and bad things happen. As the kernel threads are
         per-cpu, a bitmask can be used to know when a kernel thread is used
         or when a user space thread is used.
      
       - Add missing interface_lock to osnoise/timerlat stop_kthread()
      
         The stop_kthread() function in osnoise/timerlat clears the osnoise
         kthread variable, and if it was a user space thread does a put_task
         on it. But this can race with the closing of the timerlat files that
         also does a put_task on the kthread, and if the race happens the task
         will have put_task called on it twice and oops.
      
       - Add cond_resched() to the tracing_iter_reset() loop.
      
         The latency tracers keep writing to the ring buffer without resetting
         when it issues a new "start" event (like interrupts being disabled).
         When reading the buffer with an iterator, the tracing_iter_reset()
         sets its pointer to that start event by walking through all the
         events in the buffer until it gets to the time stamp of the start
         event. In the case of a very large buffer, the loop that looks for
         the start event has been reported to take so long on a non-preemptible
         kernel that it can trigger a soft lockup warning. Add a
         cond_resched() into that loop to make sure that doesn't happen.
      
       - Use list_del_rcu() for eventfs ei->list variable
      
         It was reported that running loops of creating and deleting kprobe
         events could cause a crash due to the eventfs list iteration hitting
         a LIST_POISON variable. This is because the list is protected by SRCU
         but when an item is deleted from the list, it was using list_del()
         which poisons the "next" pointer. This is what list_del_rcu() is meant
         to prevent.
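
      The difference between the two deletion helpers can be modelled in user
      space. What follows is a toy Python sketch, not kernel code: Node,
      POISON, make_list(), list_del(), list_del_rcu() and reader_continue()
      are hypothetical stand-ins that only mimic how the kernel helpers treat
      the "next" pointer of the removed entry.

```python
POISON = object()  # stand-in for the kernel's LIST_POISON values


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.prev = None


def make_list(names):
    """Build a doubly linked list and return its nodes in order."""
    nodes = [Node(n) for n in names]
    for a, b in zip(nodes, nodes[1:]):
        a.next, b.prev = b, a
    return nodes


def _unlink(node):
    # Make the neighbours point past the node being removed.
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev


def list_del(node):
    """Model of list_del(): unlink and poison both pointers."""
    _unlink(node)
    node.next = POISON
    node.prev = POISON


def list_del_rcu(node):
    """Model of list_del_rcu(): unlink but keep 'next' for readers."""
    _unlink(node)
    node.prev = POISON


def reader_continue(node):
    """A reader already standing on 'node' walks on to the end."""
    seen = []
    cur = node.next
    while cur is not None:
        if cur is POISON:
            raise RuntimeError("reader followed a poisoned pointer")
        seen.append(cur.name)
        cur = cur.next
    return seen
```

      In this model a reader that was standing on the removed node can still
      reach the rest of the list after list_del_rcu(), while the same walk
      after list_del() hits the poison value. In the kernel the removed entry
      must additionally stay allocated until the SRCU grace period has
      elapsed, which this sketch does not model.
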
      
      * tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()
        tracing/timerlat: Only clear timer if a kthread exists
        tracing/osnoise: Use a cpumask to know what threads are kthreads
        eventfs: Use list_del_rcu() for SRCU protected list variable
        tracing: Avoid possible softlockup in tracing_iter_reset()
        tracing: Fix memory leak in fgraph storage selftest
        tracing: fgraph: Fix to add new fgraph_ops to array after ftrace_startup_subops()
      e4b42053
    • ila: call nf_unregister_net_hooks() sooner · 031ae728
      Eric Dumazet authored
      syzbot found a use-after-free Read in ila_nf_input [1]
      
      The issue here is that ila_xlat_exit_net() frees the rhashtable,
      then calls nf_unregister_net_hooks().
      
      It should be done in the reverse order, with a synchronize_rcu().
      
      This is a good match for a pre_exit() method.
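
      The ordering problem can be illustrated with a small single-threaded
      Python model. This is only a sketch under stated assumptions: Net,
      FreedTable, teardown_buggy() and teardown_fixed() are hypothetical
      names, the dict stands in for the rhashtable, and the RCU grace period
      between the two steps is not modelled.

```python
class FreedTable:
    """Stands in for freed memory: any lookup is a use-after-free."""

    def get(self, key):
        raise RuntimeError("use-after-free: lookup in freed table")


class Net:
    def __init__(self):
        self.table = {"addr": "translated"}  # models the rhashtable
        self.hooks = [self.hook_input]       # models the registered hooks

    def hook_input(self, key):
        # The hook dereferences the table on every packet.
        return self.table.get(key)

    def deliver(self, key):
        """Run every hook that is still registered on a packet."""
        return [hook(key) for hook in self.hooks]

    def unregister_hooks(self):
        self.hooks = []

    def free_table(self):
        self.table = FreedTable()


def teardown_buggy(net, packet_during_teardown):
    # Table is freed first, so a packet arriving now hits freed memory.
    net.free_table()
    result = net.deliver(packet_during_teardown)
    net.unregister_hooks()
    return result


def teardown_fixed(net, packet_during_teardown):
    # Hooks go away first (the pre_exit step), then the table is freed.
    net.unregister_hooks()
    result = net.deliver(packet_during_teardown)
    net.free_table()
    return result
```

      With the fixed order a packet that arrives during teardown finds no
      hook to run, so nothing can touch the table once it has been freed.
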
      
      [1]
       BUG: KASAN: use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline]
       BUG: KASAN: use-after-free in __rhashtable_lookup include/linux/rhashtable.h:604 [inline]
       BUG: KASAN: use-after-free in rhashtable_lookup include/linux/rhashtable.h:646 [inline]
       BUG: KASAN: use-after-free in rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672
      Read of size 4 at addr ffff888064620008 by task ksoftirqd/0/16
      
      CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.11.0-rc4-syzkaller-00238-g2ad6d23f #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
      Call Trace:
       <TASK>
        __dump_stack lib/dump_stack.c:93 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
        print_address_description mm/kasan/report.c:377 [inline]
        print_report+0x169/0x550 mm/kasan/report.c:488
        kasan_report+0x143/0x180 mm/kasan/report.c:601
        rht_key_hashfn include/linux/rhashtable.h:159 [inline]
        __rhashtable_lookup include/linux/rhashtable.h:604 [inline]
        rhashtable_lookup include/linux/rhashtable.h:646 [inline]
        rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672
        ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:132 [inline]
        ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline]
        ila_nf_input+0x1fe/0x3c0 net/ipv6/ila/ila_xlat.c:190
        nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
        nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
        nf_hook include/linux/netfilter.h:269 [inline]
        NF_HOOK+0x29e/0x450 include/linux/netfilter.h:312
        __netif_receive_skb_one_core net/core/dev.c:5661 [inline]
        __netif_receive_skb+0x1ea/0x650 net/core/dev.c:5775
        process_backlog+0x662/0x15b0 net/core/dev.c:6108
        __napi_poll+0xcb/0x490 net/core/dev.c:6772
        napi_poll net/core/dev.c:6841 [inline]
        net_rx_action+0x89b/0x1240 net/core/dev.c:6963
        handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
        run_ksoftirqd+0xca/0x130 kernel/softirq.c:928
        smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164
        kthread+0x2f0/0x390 kernel/kthread.c:389
        ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
       </TASK>
      
      The buggy address belongs to the physical page:
      page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x64620
      flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
      page_type: 0xbfffffff(buddy)
      raw: 00fff00000000000 ffffea0000959608 ffffea00019d9408 0000000000000000
      raw: 0000000000000000 0000000000000003 00000000bfffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as freed
      page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO), pid 5242, tgid 5242 (syz-executor), ts 73611328570, free_ts 618981657187
        set_page_owner include/linux/page_owner.h:32 [inline]
        post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
        prep_new_page mm/page_alloc.c:1501 [inline]
        get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
        __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
        __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
        alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
        ___kmalloc_large_node+0x8b/0x1d0 mm/slub.c:4103
        __kmalloc_large_node_noprof+0x1a/0x80 mm/slub.c:4130
        __do_kmalloc_node mm/slub.c:4146 [inline]
        __kmalloc_node_noprof+0x2d2/0x440 mm/slub.c:4164
        __kvmalloc_node_noprof+0x72/0x190 mm/util.c:650
        bucket_table_alloc lib/rhashtable.c:186 [inline]
        rhashtable_init_noprof+0x534/0xa60 lib/rhashtable.c:1071
        ila_xlat_init_net+0xa0/0x110 net/ipv6/ila/ila_xlat.c:613
        ops_init+0x359/0x610 net/core/net_namespace.c:139
        setup_net+0x515/0xca0 net/core/net_namespace.c:343
        copy_net_ns+0x4e2/0x7b0 net/core/net_namespace.c:508
        create_new_namespaces+0x425/0x7b0 kernel/nsproxy.c:110
        unshare_nsproxy_namespaces+0x124/0x180 kernel/nsproxy.c:228
        ksys_unshare+0x619/0xc10 kernel/fork.c:3328
        __do_sys_unshare kernel/fork.c:3399 [inline]
        __se_sys_unshare kernel/fork.c:3397 [inline]
        __x64_sys_unshare+0x38/0x40 kernel/fork.c:3397
      page last free pid 11846 tgid 11846 stack trace:
        reset_page_owner include/linux/page_owner.h:25 [inline]
        free_pages_prepare mm/page_alloc.c:1094 [inline]
        free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
        __folio_put+0x2c8/0x440 mm/swap.c:128
        folio_put include/linux/mm.h:1486 [inline]
        free_large_kmalloc+0x105/0x1c0 mm/slub.c:4565
        kfree+0x1c4/0x360 mm/slub.c:4588
        rhashtable_free_and_destroy+0x7c6/0x920 lib/rhashtable.c:1169
        ila_xlat_exit_net+0x55/0x110 net/ipv6/ila/ila_xlat.c:626
        ops_exit_list net/core/net_namespace.c:173 [inline]
        cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
        process_one_work kernel/workqueue.c:3231 [inline]
        process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
        worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
        kthread+0x2f0/0x390 kernel/kthread.c:389
        ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
      Memory state around the buggy address:
       ffff88806461ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88806461ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff888064620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                            ^
       ffff888064620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888064620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      
      Fixes: 7f00feaf ("ila: Add generic ILA translation facility")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Link: https://patch.msgid.link/20240904144418.1162839-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      031ae728
    • tools/net/ynl: fix cli.py --subscribe feature · 6fda63c4
      Arkadiusz Kubalewski authored
      Execution of command:
      ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
      	--subscribe "monitor" --sleep 10
      fails with:
        File "/repo/./tools/net/ynl/cli.py", line 109, in main
          ynl.check_ntf()
        File "/repo/tools/net/ynl/lib/ynl.py", line 924, in check_ntf
          op = self.rsp_by_value[nl_msg.cmd()]
      KeyError: 19
      
      Parsing a Generic Netlink notification message looks up the op for the
      message. At that point the message has not been decoded yet and is not
      treated as a GenlMsg, so msg.cmd() returns the Generic Netlink family id
      (19) instead of the proper notification command id (i.e.:
      DPLL_CMD_PIN_CHANGE_NTF=13).
      
      Allow the op to be obtained within NetlinkProtocol.decode(..) itself if the
      op was not passed to the decode function, thus allow parsing of Generic
      Netlink notifications without causing the failure.
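
      The two numbers in the failure come from two different headers. The
      sketch below is not the ynl implementation; build_genl_msg(), nl_type(),
      genl_cmd() and lookup_ntf_op() are hypothetical helpers that pack and
      unpack the standard nlmsghdr and genlmsghdr layouts with the struct
      module, using the ids 19 and 13 quoted above.

```python
import struct

NLMSGHDR = struct.Struct("=IHHII")  # len, type, flags, seq, pid
GENLMSGHDR = struct.Struct("=BBH")  # cmd, version, reserved


def build_genl_msg(family_id, cmd, payload=b""):
    """Build a raw Generic Netlink message for the given family and cmd."""
    body = GENLMSGHDR.pack(cmd, 1, 0) + payload
    hdr = NLMSGHDR.pack(NLMSGHDR.size + len(body), family_id, 0, 0, 0)
    return hdr + body


def nl_type(raw):
    """Type field of the outer netlink header: the family id for genl."""
    return NLMSGHDR.unpack_from(raw, 0)[1]


def genl_cmd(raw):
    """Command id from the Generic Netlink header that follows nlmsghdr."""
    return GENLMSGHDR.unpack_from(raw, NLMSGHDR.size)[0]


def lookup_ntf_op(raw, ops_by_cmd):
    """Decode the genl header first, then look the op up by its cmd."""
    return ops_by_cmd[genl_cmd(raw)]
```

      For a message built with build_genl_msg(19, 13), a table keyed by
      command id has an entry for genl_cmd(raw) but not for nl_type(raw),
      which corresponds to the KeyError: 19 in the traceback.
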
      Suggested-by: Donald Hunter <donald.hunter@gmail.com>
      Link: https://lore.kernel.org/netdev/m2le0n5xpn.fsf@gmail.com/
      Fixes: 0a966d60 ("tools/net/ynl: Fix extack decoding for directional ops")
      Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
      Link: https://patch.msgid.link/20240904135034.316033-1-arkadiusz.kubalewski@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6fda63c4
    • MAINTAINERS: fix ptp ocp driver maintainers address · 20d664eb
      Vadim Fedorenko authored
      While checking the latest series for the ptp_ocp driver I realised that
      the MAINTAINERS file has a wrong entry for the email on the linux.dev
      domain.
      
      Fixes: 795fd934 ("ptp_ocp: adjust MAINTAINERS and mailmap")
      Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20240904131855.559078-1-vadim.fedorenko@linux.dev
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      20d664eb
    • selftests: net: enable bind tests · e4af74a5
      Jamie Bainbridge authored
      bind_wildcard is compiled but not run, bind_timewait is not compiled.
      
      These two tests complete in a very short time, use the test harness
      properly, and seem reasonable to enable.
      
      The author of the tests confirmed via email that these were
      intended to be run.
      
      Enable these two tests.
      
      Fixes: 13715acf ("selftest: Add test for bind() conflicts.")
      Fixes: 2c042e8e ("tcp: Add selftest for bind() and TIME_WAIT.")
      Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://patch.msgid.link/5a009b26cf5fb1ad1512d89c61b37e2fac702323.1725430322.git.jamie.bainbridge@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e4af74a5
    • MAINTAINERS: SPI: Add mailing list imx@lists.linux.dev for nxp spi drivers · c9ca76e8
      Frank Li authored
      Add mailing list imx@lists.linux.dev for nxp spi drivers (qspi, fspi and
      dspi).
      Signed-off-by: Frank Li <Frank.Li@nxp.com>
      Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
      Link: https://patch.msgid.link/20240905155230.1901787-1-Frank.Li@nxp.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      c9ca76e8
    • MAINTAINERS: SPI: Add freescale lpspi maintainer information · fb9820c5
      Frank Li authored
      Add imx@lists.linux.dev and NXP maintainer information for lpspi driver
      (drivers/spi/spi-fsl-lpspi.c).
      Signed-off-by: Frank Li <Frank.Li@nxp.com>
      Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
      Link: https://patch.msgid.link/20240905154124.1901311-1-Frank.Li@nxp.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      fb9820c5
    • Merge tag 'platform-drivers-x86-v6.11-6' of... · ad618736
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v6.11-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fixes from Ilpo Järvinen:
      
       - amd/pmf: ASUS GA403 quirk matching tweak
      
       - dell-smbios: Fix to the init function rollback path
      
      * tag 'platform-drivers-x86-v6.11-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86/amd: pmf: Make ASUS GA403 quirk generic
        platform/x86: dell-smbios: Fix error path in dell_smbios_init()
      ad618736
    • Merge tag 'linux_kselftest-kunit-fixes-6.11-rc7' of... · 120434e5
      Linus Torvalds authored
      Merge tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kunit fix from Shuah Khan:
       "One single fix to a use-after-free bug resulting from
        kunit_driver_create() failing to copy the driver name leaving it on
        the stack or freeing it"
      
      * tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        kunit: Device wrappers should also manage driver name
      120434e5
    • tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread() · 5bfbcd1e
      Steven Rostedt authored
      The timerlat interface will get and put the task that is part of the
      "kthread" field of the osn_var to keep it around until all references are
      released. But there's a race in the "stop_kthread()" code that will call
      put_task_struct() on the kthread if it is not a kernel thread. This can
      race with the releasing of the references to that task struct and the
      put_task_struct() can be called twice when it should have been called just
      once.
      
      Take the interface_lock() in stop_kthread() to synchronize this change.
      But to do so, the function stop_per_cpu_kthreads() needs to change the
      loop from for_each_online_cpu() to for_each_possible_cpu() and remove the
      cpu_read_lock(), as the interface_lock can not be taken while the cpu
      locks are held. The only side effect of this change is that it may do some
      extra work, as the per_cpu variables of the offline CPUs would not be set
      anyway, and would simply be skipped in the loop.
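
      The double-put race and its fix can be modelled with ordinary threads.
      This is a minimal sketch, assuming hypothetical Task, OsnVar and
      release_kthread() names and a threading.Lock in place of the kernel's
      interface_lock; it shows the pattern of clearing the pointer under the
      lock so that only one caller ever drops the reference.

```python
import threading


class Task:
    """Models a task with a single outstanding reference."""

    def __init__(self):
        self.refs = 1

    def put(self):
        self.refs -= 1
        if self.refs < 0:
            raise RuntimeError("put called twice")


class OsnVar:
    """Models the per-CPU variable holding the 'kthread' pointer."""

    def __init__(self, task):
        self.kthread = task


interface_lock = threading.Lock()


def release_kthread(var):
    """Clear var.kthread under the lock and drop its reference once."""
    with interface_lock:
        task = var.kthread
        var.kthread = None
    if task is not None:
        task.put()
        return True
    return False
```

      Because reading and clearing the pointer happen inside one critical
      section, at most one caller sees a non-None task, so put() runs exactly
      once no matter how many paths race to release it.
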
      
      Remove unneeded "return;" in stop_kthread().
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Tomas Glozar <tglozar@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
      Link: https://lore.kernel.org/20240905113359.2b934242@gandalf.local.home
      Fixes: e88ed227 ("tracing/timerlat: Add user-space interface")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      5bfbcd1e
    • tracing/timerlat: Only clear timer if a kthread exists · e6a53481
      Steven Rostedt authored
      The timerlat tracer can use user space threads to check for osnoise and
      timer latency. If the program using this is killed via a SIGTERM, the
      threads are shut down one at a time and another tracing instance can start
      up resetting the threads before they are fully closed. That causes the
      hrtimer assigned to the kthread to be shut down and freed twice when the
      dying thread finally closes the file descriptors, causing a use-after-free
      bug.
      
      Only cancel the hrtimer if the associated thread is still around. Also add
      the interface_lock around the resetting of the tlat_var->kthread.
      
      Note, this is just a quick fix that can be backported to stable. A real
      fix is to have a better synchronization between the shutdown of old
      threads and the starting of new ones.
      
      Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
      Link: https://lore.kernel.org/20240905085330.45985730@gandalf.local.home
      Fixes: e88ed227 ("tracing/timerlat: Add user-space interface")
      Reported-by: Tomas Glozar <tglozar@redhat.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      e6a53481
    • tracing/osnoise: Use a cpumask to know what threads are kthreads · 177e1cc2
      Steven Rostedt authored
      The start_kthread() and stop_kthread() code was not always called with the
      interface_lock held. This means that the kthread variable could be
      unexpectedly changed, causing kthread_stop() to be called on it when it
      should not have been. Running:
      
       while true; do
         rtla timerlat top -u -q & PID=$!;
         sleep 5;
         kill -INT $PID;
         sleep 0.001;
         kill -TERM $PID;
         wait $PID;
       done
      
      causes the following OOPS:
      
       Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
       KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
       CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc7-dirty #125 a533010b71dab205ad2f507188ce8c82203b0254
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
       RIP: 0010:hrtimer_active+0x58/0x300
       Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f
       RSP: 0018:ffff88811d97f940 EFLAGS: 00010202
       RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b
       RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28
       RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60
       R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d
       R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28
       FS:  0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0
       Call Trace:
        <TASK>
        ? die_addr+0x40/0xa0
        ? exc_general_protection+0x154/0x230
        ? asm_exc_general_protection+0x26/0x30
        ? hrtimer_active+0x58/0x300
        ? __pfx_mutex_lock+0x10/0x10
        ? __pfx_locks_remove_file+0x10/0x10
        hrtimer_cancel+0x15/0x40
        timerlat_fd_release+0x8e/0x1f0
        ? security_file_release+0x43/0x80
        __fput+0x372/0xb10
        task_work_run+0x11e/0x1f0
        ? _raw_spin_lock+0x85/0xe0
        ? __pfx_task_work_run+0x10/0x10
        ? poison_slab_object+0x109/0x170
        ? do_exit+0x7a0/0x24b0
        do_exit+0x7bd/0x24b0
        ? __pfx_migrate_enable+0x10/0x10
        ? __pfx_do_exit+0x10/0x10
        ? __pfx_read_tsc+0x10/0x10
        ? ktime_get+0x64/0x140
        ? _raw_spin_lock_irq+0x86/0xe0
        do_group_exit+0xb0/0x220
        get_signal+0x17ba/0x1b50
        ? vfs_read+0x179/0xa40
        ? timerlat_fd_read+0x30b/0x9d0
        ? __pfx_get_signal+0x10/0x10
        ? __pfx_timerlat_fd_read+0x10/0x10
        arch_do_signal_or_restart+0x8c/0x570
        ? __pfx_arch_do_signal_or_restart+0x10/0x10
        ? vfs_read+0x179/0xa40
        ? ksys_read+0xfe/0x1d0
        ? __pfx_ksys_read+0x10/0x10
        syscall_exit_to_user_mode+0xbc/0x130
        do_syscall_64+0x74/0x110
        ? __pfx___rseq_handle_notify_resume+0x10/0x10
        ? __pfx_ksys_read+0x10/0x10
        ? fpregs_restore_userregs+0xdb/0x1e0
        ? fpregs_restore_userregs+0xdb/0x1e0
        ? syscall_exit_to_user_mode+0x116/0x130
        ? do_syscall_64+0x74/0x110
        ? do_syscall_64+0x74/0x110
        ? do_syscall_64+0x74/0x110
        entry_SYSCALL_64_after_hwframe+0x71/0x79
       RIP: 0033:0x7ff0070eca9c
       Code: Unable to access opcode bytes at 0x7ff0070eca72.
       RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
       RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c
       RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003
       RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0
       R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003
       R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008
        </TASK>
       Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core
       ---[ end trace 0000000000000000 ]---
      
      This is because it would mistakenly call kthread_stop() on a user space
      thread making it "exit" before it actually exits.
      
      Since kthreads are created based on global behavior, use a cpumask to know
      when kthreads are running and that they need to be shut down before
      proceeding to do new work.
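
      The idea of tracking kernel threads in a per-CPU bitmask can be sketched
      as follows. This is an illustrative model only: CpuMask,
      start_kernel_thread() and stop_action() are hypothetical names, and the
      returned strings merely label which path would be taken.

```python
class CpuMask:
    """Tiny bitmask keyed by CPU number (models a kernel cpumask)."""

    def __init__(self):
        self.bits = 0

    def set(self, cpu):
        self.bits |= 1 << cpu

    def test_and_clear(self, cpu):
        """Return whether the bit was set, and clear it either way."""
        was_set = bool(self.bits & (1 << cpu))
        self.bits &= ~(1 << cpu)
        return was_set


def start_kernel_thread(mask, cpu):
    """Record that the thread on this CPU is a kernel thread."""
    mask.set(cpu)


def stop_action(mask, cpu):
    """Pick the stop path from the mask, not from the thread pointer."""
    if mask.test_and_clear(cpu):
        return "kthread_stop"
    return "user-thread path"
```

      Only a CPU whose bit was set by start_kernel_thread() takes the
      kthread_stop path, and it does so once; a CPU running a user space
      thread never has its bit set and so always takes the other path.
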
      
      Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/
      
      This was debugged by using the persistent ring buffer:
      
      Link: https://lore.kernel.org/all/20240823013902.135036960@goodmis.org/
      
      Note, locking was originally used to fix this, but that proved to cause too
      many deadlocks to work around:
      
        https://lore.kernel.org/linux-trace-kernel/20240823102816.5e55753b@gandalf.local.home/
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
      Link: https://lore.kernel.org/20240904103428.08efdf4c@gandalf.local.home
      Fixes: e88ed227 ("tracing/timerlat: Add user-space interface")
      Reported-by: Tomas Glozar <tglozar@redhat.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      177e1cc2
    • eventfs: Use list_del_rcu() for SRCU protected list variable · d2603279
      Steven Rostedt authored
      Chi Zhiling reported:
      
        We found a null pointer accessing in tracefs[1], the reason is that the
        variable 'ei_child' is set to LIST_POISON1, that means the list was
        removed in eventfs_remove_rec. so when access the ei_child->is_freed, the
        panic triggered.
      
        by the way, the following script can reproduce this panic
      
        loop1 (){
            while true
            do
                echo "p:kp submit_bio" > /sys/kernel/debug/tracing/kprobe_events
                echo "" > /sys/kernel/debug/tracing/kprobe_events
            done
        }
        loop2 (){
            while true
            do
                tree /sys/kernel/debug/tracing/events/kprobes/
            done
        }
        loop1 &
        loop2
      
        [1]:
        [ 1147.959632][T17331] Unable to handle kernel paging request at virtual address dead000000000150
        [ 1147.968239][T17331] Mem abort info:
        [ 1147.971739][T17331]   ESR = 0x0000000096000004
        [ 1147.976172][T17331]   EC = 0x25: DABT (current EL), IL = 32 bits
        [ 1147.982171][T17331]   SET = 0, FnV = 0
        [ 1147.985906][T17331]   EA = 0, S1PTW = 0
        [ 1147.989734][T17331]   FSC = 0x04: level 0 translation fault
        [ 1147.995292][T17331] Data abort info:
        [ 1147.998858][T17331]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
        [ 1148.005023][T17331]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
        [ 1148.010759][T17331]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
        [ 1148.016752][T17331] [dead000000000150] address between user and kernel address ranges
        [ 1148.024571][T17331] Internal error: Oops: 0000000096000004 [#1] SMP
        [ 1148.030825][T17331] Modules linked in: team_mode_loadbalance team nlmon act_gact cls_flower sch_ingress bonding tls macvlan dummy ib_core bridge stp llc veth amdgpu amdxcp mfd_core gpu_sched drm_exec drm_buddy radeon crct10dif_ce video drm_suballoc_helper ghash_ce drm_ttm_helper sha2_ce ttm sha256_arm64 i2c_algo_bit sha1_ce sbsa_gwdt cp210x drm_display_helper cec sr_mod cdrom drm_kms_helper binfmt_misc sg loop fuse drm dm_mod nfnetlink ip_tables autofs4 [last unloaded: tls]
        [ 1148.072808][T17331] CPU: 3 PID: 17331 Comm: ls Tainted: G        W         ------- ----  6.6.43 #2
        [ 1148.081751][T17331] Source Version: 21b3b386e948bedd29369af66f3e98ab01b1c650
        [ 1148.088783][T17331] Hardware name: Greatwall GW-001M1A-FTF/GW-001M1A-FTF, BIOS KunLun BIOS V4.0 07/16/2020
        [ 1148.098419][T17331] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
        [ 1148.106060][T17331] pc : eventfs_iterate+0x2c0/0x398
        [ 1148.111017][T17331] lr : eventfs_iterate+0x2fc/0x398
        [ 1148.115969][T17331] sp : ffff80008d56bbd0
        [ 1148.119964][T17331] x29: ffff80008d56bbf0 x28: ffff001ff5be2600 x27: 0000000000000000
        [ 1148.127781][T17331] x26: ffff001ff52ca4e0 x25: 0000000000009977 x24: dead000000000100
        [ 1148.135598][T17331] x23: 0000000000000000 x22: 000000000000000b x21: ffff800082645f10
        [ 1148.143415][T17331] x20: ffff001fddf87c70 x19: ffff80008d56bc90 x18: 0000000000000000
        [ 1148.151231][T17331] x17: 0000000000000000 x16: 0000000000000000 x15: ffff001ff52ca4e0
        [ 1148.159048][T17331] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
        [ 1148.166864][T17331] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000804391d0
        [ 1148.174680][T17331] x8 : 0000000180000000 x7 : 0000000000000018 x6 : 0000aaab04b92862
        [ 1148.182498][T17331] x5 : 0000aaab04b92862 x4 : 0000000080000000 x3 : 0000000000000068
        [ 1148.190314][T17331] x2 : 000000000000000f x1 : 0000000000007ea8 x0 : 0000000000000001
        [ 1148.198131][T17331] Call trace:
        [ 1148.201259][T17331]  eventfs_iterate+0x2c0/0x398
        [ 1148.205864][T17331]  iterate_dir+0x98/0x188
        [ 1148.210036][T17331]  __arm64_sys_getdents64+0x78/0x160
        [ 1148.215161][T17331]  invoke_syscall+0x78/0x108
        [ 1148.219593][T17331]  el0_svc_common.constprop.0+0x48/0xf0
        [ 1148.224977][T17331]  do_el0_svc+0x24/0x38
        [ 1148.228974][T17331]  el0_svc+0x40/0x168
        [ 1148.232798][T17331]  el0t_64_sync_handler+0x120/0x130
        [ 1148.237836][T17331]  el0t_64_sync+0x1a4/0x1a8
        [ 1148.242182][T17331] Code: 54ffff6c f9400676 910006d6 f9000676 (b9405300)
        [ 1148.248955][T17331] ---[ end trace 0000000000000000 ]---
      
      The issue is that list_del() is used on an SRCU protected list variable
      before the synchronization occurs. This can poison the list pointers while
      there is a reader iterating the list.
      
      This is simply fixed by using list_del_rcu() that is specifically made for
      this purpose.
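      
      As a rough userspace illustration of the difference (a sketch, not the
      kernel implementation): the model_* helpers below are hypothetical, and
      the poison constants are assumed to mirror LIST_POISON1/2 on arm64, which
      would match the dead000000000100 value seen in x24 in the trace above.
      No SRCU is modeled; the point is only which pointers get poisoned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of a kernel-style circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* Assumed to mirror LIST_POISON1/LIST_POISON2 on arm64 (64-bit only). */
#define POISON_NEXT ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define POISON_PREV ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink 'e' from its neighbours without touching e's own pointers. */
static void unlink_entry(struct list_head *e)
{
	assert(e->next != NULL && e->prev != NULL);
	e->next->prev = e->prev;
	e->prev->next = e->next;
}

/* Models list_del(): both pointers are poisoned right away. */
static void model_list_del(struct list_head *e)
{
	unlink_entry(e);
	e->next = POISON_NEXT;
	e->prev = POISON_PREV;
}

/* Models list_del_rcu(): ->next is left intact for in-flight readers. */
static void model_list_del_rcu(struct list_head *e)
{
	unlink_entry(e);
	e->prev = POISON_PREV;
}

/* A reader standing on 'pos' can only advance if ->next is not poisoned. */
static int reader_can_continue(const struct list_head *pos)
{
	return pos->next != POISON_NEXT;
}
```

      In this model, a reader that is still standing on a deleted entry can
      follow ->next back into the list only after model_list_del_rcu(); after
      model_list_del() it would dereference the poison value.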
      
      Link: https://lore.kernel.org/linux-trace-kernel/20240829085025.3600021-1-chizhiling@163.com/
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Link: https://lore.kernel.org/20240904131605.640d42b1@gandalf.local.home
      Fixes: 43aa6f97 ("eventfs: Get rid of dentry pointers without refcounts")
      Reported-by: Chi Zhiling <chizhiling@kylinos.cn>
      Tested-by: Chi Zhiling <chizhiling@kylinos.cn>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      d2603279
    • Zheng Yejian's avatar
      tracing: Avoid possible softlockup in tracing_iter_reset() · 49aa8a1f
      Zheng Yejian authored
      In __tracing_open(), when a max latency tracer has run on a CPU, the
      start time of that CPU's buffer is updated, and event entries with
      timestamps earlier than the start of the buffer are skipped
      (see tracing_iter_reset()).
      
      A softlockup can occur if the kernel is non-preemptible and too many
      entries are skipped in the loop that resets every CPU buffer, so add
      cond_resched() to avoid it.
      
      Cc: stable@vger.kernel.org
      Fixes: 2f26ebd5 ("tracing: use timestamp to determine start of latency traces")
      Link: https://lore.kernel.org/20240827124654.3817443-1-zhengyejian@huaweicloud.com
      Suggested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      49aa8a1f
    • Stefan Wahren's avatar
      spi: spi-fsl-lpspi: Fix off-by-one in prescale max · ff949d98
      Stefan Wahren authored
      The commit 783bf5d0 ("spi: spi-fsl-lpspi: limit PRESCALE bit in
      TCR register") doesn't implement the prescaler maximum as intended.
      The maximum allowed value for i.MX93 should be 1 and for i.MX7ULP
      it should be 7. This also requires an adjustment of the comparison
      in the scldiv calculation.
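      
      A minimal sketch of the off-by-one, under assumptions: pick_prescale()
      is a hypothetical helper, 'div' stands for the clock divider derived
      from the requested speed, and the "scldiv < 256" test is assumed to be
      the comparison referred to above. It is not the driver's code.

```c
#include <assert.h>

/*
 * Hypothetical helper: return the smallest prescale in [0, prescale_max]
 * for which scldiv is below 256, or -1 if no such prescale exists.
 */
static int pick_prescale(int div, int prescale_max)
{
	assert(div >= 2 && prescale_max >= 0 && prescale_max <= 7);

	/* Inclusive bound: prescale_max itself is a legal value. */
	for (int prescale = 0; prescale <= prescale_max; prescale++) {
		int scldiv = div / (1 << prescale) - 2;

		if (scldiv < 256)
			return prescale;
	}
	return -1;
}
```

      With a maximum of 1 (the i.MX93 case) and an exclusive bound, only
      prescale 0 would ever be tried; the inclusive loop also tries 1.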
      
      Fixes: 783bf5d0 ("spi: spi-fsl-lpspi: limit PRESCALE bit in TCR register")
      Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
      Link: https://patch.msgid.link/20240905111537.90389-1-wahrenst@gmx.net
      Signed-off-by: Mark Brown <broonie@kernel.org>
      ff949d98
    • Pawel Dembicki's avatar
      net: dsa: vsc73xx: fix possible subblocks range of CAPT block · 8e69c96d
      Pawel Dembicki authored
      The CAPT block (CPU Capture Buffer) has 7 subblocks: 0-3, 4, 6, 7.
      The function 'vsc73xx_is_addr_valid' currently allows only subblock 0
      to be used.
      
      This patch fixes it.
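      
      The valid range can be sketched as a small predicate. The helper name
      below is hypothetical and this is not the body of
      vsc73xx_is_addr_valid(); it only encodes the subblock list given above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true for CAPT subblocks 0-3, 4, 6 and 7. */
static bool capt_subblock_is_valid(unsigned int subblock)
{
	return subblock <= 4 || subblock == 6 || subblock == 7;
}

/* Self-check: exactly seven of the values 0..7 are accepted. */
static void capt_selfcheck(void)
{
	unsigned int n = 0;

	for (unsigned int s = 0; s < 8; s++)
		n += capt_subblock_is_valid(s);
	assert(n == 7);
}
```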
      
      Fixes: 05bd97fc ("net: dsa: Add Vitesse VSC73xx DSA router driver")
      Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
      Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Link: https://patch.msgid.link/20240903203340.1518789-1-paweldembicki@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      8e69c96d
    • Toke Høiland-Jørgensen's avatar
      sched: sch_cake: fix bulk flow accounting logic for host fairness · 546ea84d
      Toke Høiland-Jørgensen authored
      In sch_cake, we keep track of the count of active bulk flows per host,
      when running in dst/src host fairness mode, which is used as the
      round-robin weight when iterating through flows. The count of active
      bulk flows is updated whenever a flow changes state.
      
      This has a peculiar interaction with the hash collision handling: when a
      hash collision occurs (after the set-associative hashing), the state of
      the hash bucket is simply updated to match the new packet that collided,
      and if host fairness is enabled, that also means assigning new per-host
      state to the flow. For this reason, the bulk flow counters of the
      host(s) assigned to the flow are decremented, before new state is
      assigned (and the counters, which may not belong to the same host
      anymore, are incremented again).
      
      Back when this code was introduced, the host fairness mode was always
      enabled, so the decrement was unconditional. When the configuration
      flags were introduced the *increment* was made conditional, but
      the *decrement* was not. Which of course can lead to a spurious
      decrement (and associated wrap-around to U16_MAX).
      
      AFAICT, when host fairness is disabled, the decrement and wrap-around
      happen as soon as a hash collision occurs (which is not that common in
      itself, due to the set-associative hashing). However, in most cases this
      is harmless, as the value is only used when host fairness mode is
      enabled. So in order to trigger an array overflow, sch_cake has to first
      be configured with host fairness disabled, and while running in this
      mode, a hash collision has to occur to cause the overflow. Then, the
      qdisc has to be reconfigured to enable host fairness, which leads to the
      array out-of-bounds because the wrapped-around value is retained and
      used as an array index. It seems that syzbot managed to trigger this,
      which is quite impressive in its own right.
      
      This patch fixes the issue by introducing the same conditional check on
      decrement as is used on increment.
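      
      The wrap-around and the guarded variant can be sketched in isolation.
      The struct and function names below are hypothetical and do not come
      from sch_cake; only the 16-bit counter width is taken from the U16_MAX
      remark above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-host state with a 16-bit bulk flow counter. */
struct host_state {
	uint16_t bulk_flow_count;
};

/* Behaviour described as buggy: decrement regardless of the mode. */
static void release_unconditional(struct host_state *h)
{
	h->bulk_flow_count--;	/* 0 wraps around to UINT16_MAX */
}

/* Increment only when host fairness is enabled. */
static void acquire_conditional(struct host_state *h, bool host_fairness)
{
	if (host_fairness)
		h->bulk_flow_count++;
}

/* Decrement under the same condition as the increment. */
static void release_conditional(struct host_state *h, bool host_fairness)
{
	if (host_fairness) {
		assert(h->bulk_flow_count > 0);
		h->bulk_flow_count--;
	}
}
```

      With matching conditions, a counter that was never incremented is
      never decremented, so it cannot wrap while host fairness is disabled.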
      
      The original bug predates the upstreaming of cake, but the commit listed
      in the Fixes tag touched that code, meaning that this patch won't apply
      before that.
      
      Fixes: 71263992 ("sch_cake: Make the dual modes fairer")
      Reported-by: syzbot+7fe7b81d602cc1e6b94d@syzkaller.appspotmail.com
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://patch.msgid.link/20240903160846.20909-1-toke@redhat.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      546ea84d
    • Jakub Kicinski's avatar
      docs: netdev: document guidance on cleanup.h · c82299fb
      Jakub Kicinski authored
      Document what was discussed multiple times on list and various
      virtual / in-person conversations. guard() being okay in functions
      <= 20 LoC is a bit of my own invention. If the function is trivial
      it should be fine, but feel free to disagree :)
      
      We'll obviously revisit this guidance as time passes and we and other
      subsystems get more experience.
      
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Link: https://patch.msgid.link/20240830171443.3532077-1-kuba@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      c82299fb
    • Jakub Kicinski's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · f0417c50
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      ice: fix synchronization between .ndo_bpf() and reset
      
      Larysa Zaremba says:
      
      PF reset can be triggered asynchronously, by tx_timeout or by a user. With some
      unfortunate timings both ice_vsi_rebuild() and .ndo_bpf will try to access and
      modify XDP rings at the same time, causing a system crash.
      
      The first patch factors out rtnl-locked code from VSI rebuild code to avoid
      deadlock. The following changes lock rebuild and .ndo_bpf() critical sections
      with an internal mutex as well and provide complementary fixes.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: do not bring the VSI up, if it was down before the XDP setup
        ice: remove ICE_CFG_BUSY locking from AF_XDP code
        ice: check ICE_VSI_DOWN under rtnl_lock when preparing for reset
        ice: check for XDP rings instead of bpf program when unconfiguring
        ice: protect XDP configuration with a mutex
        ice: move netif_queue_set_napi to rtnl-protected sections
      ====================
      
      Link: https://patch.msgid.link/20240903183034.3530411-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f0417c50
    • Jakub Kicinski's avatar
      Merge tag 'wireless-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · 2603d315
      Jakub Kicinski authored
      Kalle Valo says:
      
      ====================
      wireless fixes for v6.11
      
      Hopefully final fixes for v6.11 and this time only fixes to ath11k
      driver. We need to revert hibernation support due to reported
      regressions and we have a fix for kernel crash introduced in
      v6.11-rc1.
      
      * tag 'wireless-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
        MAINTAINERS: wifi: cw1200: add net-cw1200.h
        Revert "wifi: ath11k: support hibernation"
        Revert "wifi: ath11k: restore country code during resume"
        wifi: ath11k: fix NULL pointer dereference in ath11k_mac_get_eirp_power()
      ====================
      
      Link: https://patch.msgid.link/20240904135906.5986EC4CECA@smtp.kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2603d315
    • Sean Anderson's avatar
      net: xilinx: axienet: Fix race in axienet_stop · 858430db
      Sean Anderson authored
      axienet_dma_err_handler can race with axienet_stop in the following
      manner:
      
      CPU 1                       CPU 2
      ======================      ==================
      axienet_stop()
          napi_disable()
          axienet_dma_stop()
                                  axienet_dma_err_handler()
                                      napi_disable()
                                      axienet_dma_stop()
                                      axienet_dma_start()
                                      napi_enable()
          cancel_work_sync()
          free_irq()
      
      Fix this by setting a flag in axienet_stop telling
      axienet_dma_err_handler not to bother doing anything. I chose not to use
      disable_work_sync to allow for easier backporting.
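      
      The flag approach can be sketched as a single-threaded model that
      replays the interleaving from the diagram. Every name below is
      hypothetical, and real code would additionally need proper
      synchronization on the flag, which this sketch does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; not the driver's private structure. */
struct fake_dev {
	bool stopping;		/* set by stop, checked by the error handler */
	bool napi_enabled;
	bool dma_running;
};

/* Models the stop path: raise the flag first, then tear down. */
static void model_stop(struct fake_dev *d)
{
	assert(d != NULL);
	d->stopping = true;
	d->napi_enabled = false;
	d->dma_running = false;
}

/* Models the error handler: do nothing once a stop is in progress. */
static void model_dma_err_handler(struct fake_dev *d)
{
	assert(d != NULL);
	if (d->stopping)
		return;

	/* Restart sequence: disable NAPI, stop DMA, start DMA, enable NAPI. */
	d->napi_enabled = false;
	d->dma_running = false;
	d->dma_running = true;
	d->napi_enabled = true;
}
```

      Replaying the diagram (stop first, error handler afterwards) leaves
      NAPI and DMA off in this model, instead of being re-enabled behind the
      back of the stop path.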
      
      Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
      Fixes: 8a3b7a25 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
      Link: https://patch.msgid.link/20240903175141.4132898-1-sean.anderson@linux.dev
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      858430db
  4. 04 Sep, 2024 3 commits
    • Jonas Gorski's avatar
      net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN · bee2ef94
      Jonas Gorski authored
      When userspace wants to take over a fdb entry by setting it as
      EXTERN_LEARNED, we set both flags BR_FDB_ADDED_BY_EXT_LEARN and
      BR_FDB_ADDED_BY_USER in br_fdb_external_learn_add().
      
      If the bridge updates the entry later because its port changed, we clear
      the BR_FDB_ADDED_BY_EXT_LEARN flag, but leave the BR_FDB_ADDED_BY_USER
      flag set.
      
      If userspace then wants to take over the entry again,
      br_fdb_external_learn_add() sees that BR_FDB_ADDED_BY_USER is set and
      skips setting the BR_FDB_ADDED_BY_EXT_LEARN flag, thus silently
      ignoring the update.
      
      Fix this by always allowing BR_FDB_ADDED_BY_EXT_LEARN to be set,
      regardless of whether this was a user fdb entry.
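      
      The sequence can be modeled with two flag bits. This is a simplified
      sketch of the behaviour described in this message, not the bridge
      code: the macro and function names are hypothetical stand-ins for the
      BR_FDB_* flags and for br_fdb_external_learn_add().

```c
#include <assert.h>

#define FDB_ADDED_BY_USER	(1u << 0)
#define FDB_ADDED_BY_EXT_LEARN	(1u << 1)

/* Old behaviour: a user entry is left untouched, so the update is lost. */
static unsigned int ext_learn_add_old(unsigned int flags)
{
	if (!(flags & FDB_ADDED_BY_USER))
		flags |= FDB_ADDED_BY_USER | FDB_ADDED_BY_EXT_LEARN;
	return flags;
}

/* New behaviour: EXT_LEARN is always set, user entry or not. */
static unsigned int ext_learn_add_new(unsigned int flags)
{
	return flags | FDB_ADDED_BY_USER | FDB_ADDED_BY_EXT_LEARN;
}

/* The bridge moving the entry to another port clears EXT_LEARN only. */
static unsigned int bridge_port_changed(unsigned int flags)
{
	unsigned int out = flags & ~FDB_ADDED_BY_EXT_LEARN;

	assert(!(out & FDB_ADDED_BY_EXT_LEARN));
	return out;
}
```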
      
      Fixes: 710ae728 ("net: bridge: Mark FDB entries that were added by user as such")
      Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Link: https://patch.msgid.link/20240903081958.29951-1-jonas.gorski@bisdn.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bee2ef94
    • Hayes Wang's avatar
      r8152: fix the firmware doesn't work · 8487b4af
      Hayes Wang authored
      generic_ocp_write() requires the "size" parameter to be 4-byte aligned.
      Therefore, writing the bp fails if mac->bp_num is odd. Align the size
      to 4 to fix it. This may write an extra bp, but
      rtl8152_is_fw_mac_ok() makes sure the value is 0 for any bp whose
      index is greater than mac->bp_num, so the firmware is not affected.
      
      Besides, check the return value of generic_ocp_write() to make sure
      everything is correct.
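      
      The size rounding can be sketched as follows, assuming each bp entry
      is 16 bits wide (an assumption; it is what makes an odd count
      misaligned). bp_write_size() is a hypothetical helper, not a function
      from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes to write for bp_num entries, rounded up to 4. */
static size_t bp_write_size(unsigned int bp_num)
{
	size_t bytes = (size_t)bp_num * sizeof(uint16_t);
	size_t aligned = (bytes + 3) & ~(size_t)3;

	/* Postcondition: 4-byte aligned, at most one extra entry written. */
	assert(aligned % 4 == 0 && aligned - bytes <= sizeof(uint16_t));
	return aligned;
}
```

      For an odd count such as 3, the raw size of 6 bytes becomes 8, which
      corresponds to the one extra bp mentioned above.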
      
      Fixes: e5c266a6 ("r8152: set bp in bulk")
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Link: https://patch.msgid.link/20240903063333.4502-1-hayeswang@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8487b4af
    • Kuniyuki Iwashima's avatar
      fou: Fix null-ptr-deref in GRO. · 7e419693
      Kuniyuki Iwashima authored
      We observed a null-ptr-deref in fou_gro_receive() while shutting down
      a host.  [0]
      
      The NULL pointer is sk->sk_user_data, and the offset 8 is of protocol
      in struct fou.
      
      When fou_release() is called due to netns dismantle or explicit tunnel
      teardown, udp_tunnel_sock_release() sets NULL to sk->sk_user_data.
      Then, the tunnel socket is destroyed after a single RCU grace period.
      
      So, in-flight udp4_gro_receive() could find the socket and execute the
      FOU GRO handler, where sk->sk_user_data could be NULL.
      
      Let's use rcu_dereference_sk_user_data() in fou_from_sock() and add NULL
      checks in FOU GRO handlers.
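      
      The shape of the NULL check can be sketched in userspace. The structs
      and functions below are hypothetical models, and RCU itself is not
      modeled; the two-member layout is an assumption chosen so that
      protocol sits at offset 8 on a 64-bit build, as in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of struct fou: a pointer followed by the protocol. */
struct fou_model {
	void *sock;
	uint8_t protocol;
};

/* Hypothetical model of the socket's user data pointer. */
struct sock_model {
	void *sk_user_data;
};

/* Models fou_from_sock(): may return NULL once the tunnel is released. */
static const struct fou_model *fou_from_sock_model(const struct sock_model *sk)
{
	assert(sk != NULL);
	return sk->sk_user_data;
}

/* Models a GRO handler: bail out instead of dereferencing NULL. */
static int gro_receive_model(const struct sock_model *sk)
{
	const struct fou_model *fou = fou_from_sock_model(sk);

	if (!fou)
		return -1;	/* tunnel already torn down */
	return fou->protocol;
}
```

      Without the check, reading fou->protocol through a NULL pointer is the
      access at address 8 shown in the trace below.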
      
      [0]:
      BUG: kernel NULL pointer dereference, address: 0000000000000008
       PF: supervisor read access in kernel mode
       PF: error_code(0x0000) - not-present page
      PGD 80000001032f4067 P4D 80000001032f4067 PUD 103240067 PMD 0
      SMP PTI
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.216-204.855.amzn2.x86_64 #1
      Hardware name: Amazon EC2 c5.large/, BIOS 1.0 10/16/2017
      RIP: 0010:fou_gro_receive (net/ipv4/fou.c:233) [fou]
      Code: 41 5f c3 cc cc cc cc e8 e7 2e 69 f4 0f 1f 80 00 00 00 00 0f 1f 44 00 00 49 89 f8 41 54 48 89 f7 48 89 d6 49 8b 80 88 02 00 00 <0f> b6 48 08 0f b7 42 4a 66 25 fd fd 80 cc 02 66 89 42 4a 0f b6 42
      RSP: 0018:ffffa330c0003d08 EFLAGS: 00010297
      RAX: 0000000000000000 RBX: ffff93d9e3a6b900 RCX: 0000000000000010
      RDX: ffff93d9e3a6b900 RSI: ffff93d9e3a6b900 RDI: ffff93dac2e24d08
      RBP: ffff93d9e3a6b900 R08: ffff93dacbce6400 R09: 0000000000000002
      R10: 0000000000000000 R11: ffffffffb5f369b0 R12: ffff93dacbce6400
      R13: ffff93dac2e24d08 R14: 0000000000000000 R15: ffffffffb4edd1c0
      FS:  0000000000000000(0000) GS:ffff93daee800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 0000000102140001 CR4: 00000000007706f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <IRQ>
       ? show_trace_log_lvl (arch/x86/kernel/dumpstack.c:259)
       ? __die_body.cold (arch/x86/kernel/dumpstack.c:478 arch/x86/kernel/dumpstack.c:420)
       ? no_context (arch/x86/mm/fault.c:752)
       ? exc_page_fault (arch/x86/include/asm/irqflags.h:49 arch/x86/include/asm/irqflags.h:89 arch/x86/mm/fault.c:1435 arch/x86/mm/fault.c:1483)
       ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:571)
       ? fou_gro_receive (net/ipv4/fou.c:233) [fou]
       udp_gro_receive (include/linux/netdevice.h:2552 net/ipv4/udp_offload.c:559)
       udp4_gro_receive (net/ipv4/udp_offload.c:604)
       inet_gro_receive (net/ipv4/af_inet.c:1549 (discriminator 7))
       dev_gro_receive (net/core/dev.c:6035 (discriminator 4))
       napi_gro_receive (net/core/dev.c:6170)
       ena_clean_rx_irq (drivers/amazon/net/ena/ena_netdev.c:1558) [ena]
       ena_io_poll (drivers/amazon/net/ena/ena_netdev.c:1742) [ena]
       napi_poll (net/core/dev.c:6847)
       net_rx_action (net/core/dev.c:6917)
       __do_softirq (arch/x86/include/asm/jump_label.h:25 include/linux/jump_label.h:200 include/trace/events/irq.h:142 kernel/softirq.c:299)
       asm_call_irq_on_stack (arch/x86/entry/entry_64.S:809)
      </IRQ>
       do_softirq_own_stack (arch/x86/include/asm/irq_stack.h:27 arch/x86/include/asm/irq_stack.h:77 arch/x86/kernel/irq_64.c:77)
       irq_exit_rcu (kernel/softirq.c:393 kernel/softirq.c:423 kernel/softirq.c:435)
       common_interrupt (arch/x86/kernel/irq.c:239)
       asm_common_interrupt (arch/x86/include/asm/idtentry.h:626)
      RIP: 0010:acpi_idle_do_entry (arch/x86/include/asm/irqflags.h:49 arch/x86/include/asm/irqflags.h:89 drivers/acpi/processor_idle.c:114 drivers/acpi/processor_idle.c:575)
      Code: 8b 15 d1 3c c4 02 ed c3 cc cc cc cc 65 48 8b 04 25 40 ef 01 00 48 8b 00 a8 08 75 eb 0f 1f 44 00 00 0f 00 2d d5 09 55 00 fb f4 <fa> c3 cc cc cc cc e9 be fc ff ff 66 66 2e 0f 1f 84 00 00 00 00 00
      RSP: 0018:ffffffffb5603e58 EFLAGS: 00000246
      RAX: 0000000000004000 RBX: ffff93dac0929c00 RCX: ffff93daee833900
      RDX: ffff93daee800000 RSI: ffff93daee87dc00 RDI: ffff93daee87dc64
      RBP: 0000000000000001 R08: ffffffffb5e7b6c0 R09: 0000000000000044
      R10: ffff93daee831b04 R11: 00000000000001cd R12: 0000000000000001
      R13: ffffffffb5e7b740 R14: 0000000000000001 R15: 0000000000000000
       ? sched_clock_cpu (kernel/sched/clock.c:371)
       acpi_idle_enter (drivers/acpi/processor_idle.c:712 (discriminator 3))
       cpuidle_enter_state (drivers/cpuidle/cpuidle.c:237)
       cpuidle_enter (drivers/cpuidle/cpuidle.c:353)
       cpuidle_idle_call (kernel/sched/idle.c:158 kernel/sched/idle.c:239)
       do_idle (kernel/sched/idle.c:302)
       cpu_startup_entry (kernel/sched/idle.c:395 (discriminator 1))
       start_kernel (init/main.c:1048)
       secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:310)
      Modules linked in: udp_diag tcp_diag inet_diag nft_nat ipip tunnel4 dummy fou ip_tunnel nft_masq nft_chain_nat nf_nat wireguard nft_ct curve25519_x86_64 libcurve25519_generic nf_conntrack libchacha20poly1305 nf_defrag_ipv6 nf_defrag_ipv4 nft_objref chacha_x86_64 nft_counter nf_tables nfnetlink poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper mousedev psmouse button ena ptp pps_core crc32c_intel
      CR2: 0000000000000008
      
      Fixes: d92283e3 ("fou: change to use UDP socket GRO")
      Reported-by: Alphonse Kurian <alkurian@amazon.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://patch.msgid.link/20240902173927.62706-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7e419693