1. 15 Mar, 2022 2 commits
  2. 09 Mar, 2022 3 commits
    • ftrace: Fix some W=1 warnings in kernel doc comments · 78cbc651
      Jiapeng Chong authored
      Clean up the following clang-w1 warnings:
      
      kernel/trace/ftrace.c:7827: warning: Function parameter or member 'ops'
      not described in 'unregister_ftrace_function'.
      
      kernel/trace/ftrace.c:7805: warning: Function parameter or member 'ops'
      not described in 'register_ftrace_function'.
      
      Link: https://lkml.kernel.org/r/20220307004303.26399-1-jiapeng.chong@linux.alibaba.com

      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      78cbc651
    • tracing/osnoise: Force quiescent states while tracing · caf4c86b
      Nicolas Saenz Julienne authored
      At the moment running osnoise on a nohz_full CPU or uncontested FIFO
      priority and a PREEMPT_RCU kernel might have the side effect of
      extending grace periods too much. This will entice RCU to force a
      context switch on the wayward CPU to end the grace period, all while
      introducing unwarranted noise into the tracer. This behaviour is
      unavoidable as overly extending grace periods might exhaust the system's
      memory.
      
      This exact problem is what extended quiescent states (EQS) were
      created for. rcu_momentary_dyntick_idle() emulates them by
      performing a zero-duration EQS, so let's make use of it.
      
      In the common case rcu_momentary_dyntick_idle() is fairly inexpensive:
      atomically incrementing a local per-CPU counter and doing a store. So it
      shouldn't affect osnoise's measurements (which have a 1us granularity),
      and we'll call it unconditionally.
      
      The uncommon cases involve calling rcu_momentary_dyntick_idle() after
      the osnoise process has:

       - Received an expedited quiescent state IPI with preemption disabled or
         during an RCU critical section (activates the rdp->cpu_no_qs.b.exp
         code-path).

       - Been preempted within an RCU critical section and had the
         subsequent outermost rcu_read_unlock() called with interrupts
         disabled (t->rcu_read_unlock_special.b.blocked code-path).
      
      Neither of those is possible at the moment, and they are unlikely to be
      in the future given osnoise's loop design. On top of this, the noise
      generated by the situations described above is unavoidable, and if not
      exposed by rcu_momentary_dyntick_idle() it will eventually be seen in
      subsequent rcu_read_unlock() calls or schedule operations.
      
      Link: https://lkml.kernel.org/r/20220307180740.577607-1-nsaenzju@redhat.com
      
      Cc: stable@vger.kernel.org
      Fixes: bce29ac9 ("trace: Add osnoise tracer")
      Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      caf4c86b
    • tracing/osnoise: Do not unregister events twice · f0cfe17b
      Daniel Bristot de Oliveira authored
      Nicolas reported that using:
      
       # trace-cmd record -e all -M 10 -p osnoise --poll
      
      Resulted in the following kernel warning:
      
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 1217 at kernel/tracepoint.c:404 tracepoint_probe_unregister+0x280/0x370
       [...]
       CPU: 0 PID: 1217 Comm: trace-cmd Not tainted 5.17.0-rc6-next-20220307-nico+ #19
       RIP: 0010:tracepoint_probe_unregister+0x280/0x370
       [...]
       CR2: 00007ff919b29497 CR3: 0000000109da4005 CR4: 0000000000170ef0
       Call Trace:
        <TASK>
        osnoise_workload_stop+0x36/0x90
        tracing_set_tracer+0x108/0x260
        tracing_set_trace_write+0x94/0xd0
        ? __check_object_size.part.0+0x10a/0x150
        ? selinux_file_permission+0x104/0x150
        vfs_write+0xb5/0x290
        ksys_write+0x5f/0xe0
        do_syscall_64+0x3b/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7ff919a18127
       [...]
       ---[ end trace 0000000000000000 ]---
      
      The warning complains about an attempt to unregister an
      unregistered tracepoint.
      
      This happens with trace-cmd because it first stops tracing and
      then switches the tracer to nop, which is equivalent to:
      
        # cd /sys/kernel/tracing/
        # echo osnoise > current_tracer
        # echo 0 > tracing_on
        # echo nop > current_tracer
      
      The osnoise tracer stops the workload when no trace instance
      is actually collecting data. This can be caused either by
      disabling tracing or by disabling the tracer itself.
      
      To avoid unregistering events twice, use the existing
      trace_osnoise_callback_enabled variable to check if the events
      (and the workload) are actually active before trying to
      deactivate them.
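
The guard described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: the class, the probe names ("irq", "softirq", "thread") and the `guarded` switch are hypothetical, and `callback_enabled` merely stands in for trace_osnoise_callback_enabled.

```python
class OsnoiseModel:
    """Toy model of the stop-path guard; not the kernel implementation."""

    PROBES = ("irq", "softirq", "thread")  # hypothetical probe names

    def __init__(self):
        self.callback_enabled = False  # stands in for trace_osnoise_callback_enabled
        self.registered = set()        # stands in for registered tracepoint probes
        self.warnings = 0              # counts would-be kernel WARNs

    def _unregister(self, name):
        if name not in self.registered:
            # Unregistering an unregistered probe is what triggers the warning.
            self.warnings += 1
            return
        self.registered.remove(name)

    def workload_start(self):
        if self.callback_enabled:
            return
        self.registered.update(self.PROBES)
        self.callback_enabled = True

    def workload_stop(self, guarded=True):
        # The guard: do nothing if the events (and workload) are not active.
        if guarded and not self.callback_enabled:
            return
        self.callback_enabled = False
        for name in self.PROBES:
            self._unregister(name)
```

Calling workload_stop() twice in a row, as happens when tracing is turned off and the tracer is then switched to nop, produces no warnings in the guarded model, while the unguarded variant warns once per probe on the second call.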
      
      Link: https://lore.kernel.org/all/c898d1911f7f9303b7e14726e7cc9678fbfb4a0e.camel@redhat.com/
      Link: https://lkml.kernel.org/r/938765e17d5a781c2df429a98f0b2e7cc317b022.1646823913.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Fixes: 2fac8d64 ("tracing/osnoise: Allow multiple instances of the same tracer")
      Reported-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      f0cfe17b
  3. 04 Mar, 2022 1 commit
  4. 02 Mar, 2022 1 commit
    • tracing/histogram: Fix sorting on old "cpu" value · 1d1898f6
      Steven Rostedt (Google) authored
      When trying to add a histogram against an event with the "cpu" field, it
      was impossible due to "cpu" being a keyword to key off of the running CPU.
      So to fix this, it was changed to "common_cpu" to match the other generic
      fields (like "common_pid"). But since some scripts used "cpu" for keying
      off of the CPU (for events that did not have "cpu" as a field, which is
      most of them), a backward compatibility trick was added such that if "cpu"
      was used as a key, and the event did not have "cpu" as a field name, then
      it would fall back and switch over to "common_cpu".
      
      This fix has a couple of subtle bugs. One was that when switching over to
      "common_cpu", it did not change the field name, it just set a flag. But
      the code still found a "cpu" field. The "cpu" field is used for filtering
      and is returned when the event does not have a "cpu" field.
      
      This was found by:
      
        # cd /sys/kernel/tracing
        # echo hist:key=cpu,pid:sort=cpu > events/sched/sched_wakeup/trigger
        # cat events/sched/sched_wakeup/hist
      
      Which showed the histogram unsorted:
      
      { cpu:         19, pid:       1175 } hitcount:          1
      { cpu:          6, pid:        239 } hitcount:          2
      { cpu:         23, pid:       1186 } hitcount:         14
      { cpu:         12, pid:        249 } hitcount:          2
      { cpu:          3, pid:        994 } hitcount:          5
      
      Instead of hard coding the "cpu" checks, take advantage of the fact that
      trace_find_event_field() returns a special field for "cpu" and "CPU" if
      the event does not have "cpu" as a field. This special field has the
      "filter_type" of "FILTER_CPU". Check that to test if the returned field is
      of the CPU type instead of doing the string compare.
      
      Also, fix the sorting bug by testing for the hist_field flag of
      HIST_FIELD_FL_CPU when setting up the sort routine. Otherwise it will use
      the special CPU field to know what compare routine to use, and since that
      special field does not have a size, it returns tracing_map_cmp_none.
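
The effect of picking a "none" compare routine can be shown with a short model. This is a sketch under assumptions, not the histogram code: `cmp_none` stands in for tracing_map_cmp_none, `pick_cmp` and its parameters are hypothetical, and a stable list sort stands in for the kernel's sort of map entries. The sample entries are the (cpu, pid) pairs from the unsorted output above.

```python
from functools import cmp_to_key


def cmp_none(a, b):
    """Stand-in for tracing_map_cmp_none: every pair compares as equal."""
    return 0


def cmp_num(a, b):
    """Ordinary three-way numeric compare."""
    return (a > b) - (a < b)


def pick_cmp(field_size, is_cpu_key):
    """Hypothetical selector: check the CPU flag before looking at the size."""
    if is_cpu_key:
        return cmp_num
    return cmp_num if field_size > 0 else cmp_none


def sort_by_cpu(entries, cmp):
    """Sort (cpu, pid) pairs on the cpu value using the chosen compare."""
    return sorted(entries, key=cmp_to_key(lambda x, y: cmp(x[0], y[0])))


entries = [(19, 1175), (6, 239), (23, 1186), (12, 249), (3, 994)]
unsorted_result = sort_by_cpu(entries, pick_cmp(0, is_cpu_key=False))
sorted_result = sort_by_cpu(entries, pick_cmp(0, is_cpu_key=True))
```

With a size of zero and no CPU flag check, every comparison returns 0 and the order is left untouched; checking the flag first selects a real numeric compare and the entries come out ordered by CPU.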
      
      Cc: stable@vger.kernel.org
      Fixes: 1e3bac71 ("tracing/histogram: Rename "cpu" to "common_cpu"")
      Reported-by: Daniel Bristot de Oliveira <bristot@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      1d1898f6
  5. 26 Feb, 2022 5 commits
  6. 25 Feb, 2022 7 commits
  7. 24 Feb, 2022 1 commit
  8. 08 Feb, 2022 3 commits
  9. 04 Feb, 2022 3 commits
  10. 28 Jan, 2022 10 commits
  11. 23 Jan, 2022 4 commits
    • Linux 5.17-rc1 · e783362e
      Linus Torvalds authored
      e783362e
    • Merge tag 'perf-tools-for-v5.17-2022-01-22' of... · 40c84321
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v5.17-2022-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
      
       - Fix printing 'phys_addr' in 'perf script'.
      
       - Fix failure to add events with 'perf probe' in ppc64 due to not
         removing leading dot (ppc64 ABIv1).
      
       - Fix cpu_map__item() python binding building.
      
       - Support event alias in form foo-bar-baz, add pmu-events and
         parse-event tests for it.
      
       - No need to setup affinities when starting a workload or attaching to
         a pid.
      
       - Use path__join() to compose a path instead of ad-hoc snprintf()
         equivalent.
      
       - Override attr->sample_period for non-libpfm4 events.
      
       - Use libperf cpumap APIs instead of accessing the internal state
         directly.
      
       - Sync x86 arch prctl headers and files changed by the new
         set_mempolicy_home_node syscall with the kernel sources.
      
       - Remove duplicate include in cpumap.h.
      
       - Remove redundant err variable.
      
      * tag 'perf-tools-for-v5.17-2022-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Remove redundant err variable
        perf test: Add parse-events test for aliases with hyphens
        perf test: Add pmu-events test for aliases with hyphens
        perf parse-events: Support event alias in form foo-bar-baz
        perf evsel: Override attr->sample_period for non-libpfm4 events
        perf cpumap: Remove duplicate include in cpumap.h
        perf cpumap: Migrate to libperf cpumap api
        perf python: Fix cpu_map__item() building
        perf script: Fix printing 'phys_addr' failure issue
        tools headers UAPI: Sync files changed by new set_mempolicy_home_node syscall
        tools headers UAPI: Sync x86 arch prctl headers with the kernel sources
        perf machine: Use path__join() to compose a path instead of snprintf(dir, '/', filename)
        perf evlist: No need to setup affinities when disabling events for pid targets
        perf evlist: No need to setup affinities when enabling events for pid targets
        perf stat: No need to setup affinities when starting a workload
        perf affinity: Allow passing a NULL arg to affinity__cleanup()
        perf probe: Fix ppc64 'perf probe add events failed' case
      40c84321
    • Merge tag 'trace-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 67bfce0e
      Linus Torvalds authored
      Pull ftrace fix from Steven Rostedt:
       "Fix s390 breakage from sorting mcount tables.
      
        The latest merge of the tracing tree sorts the mcount table at build
        time. But s390 appears to do things differently (like always) and
        replaces the sorted table back to the original unsorted one. As the
        ftrace algorithm depends on it being sorted, bad things happen when it
        is not, and s390 experienced those bad things.
      
        Add a new config to tell the boot if the mcount table is sorted or
        not, and allow s390 to opt out of it"
      
      * tag 'trace-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix assuming build time sort works for s390
      67bfce0e
    • ftrace: Fix assuming build time sort works for s390 · 6b9b6413
      Steven Rostedt (Google) authored
      mcount_loc needs to be sorted for ftrace to work properly, and sorting
      it at build time is more efficient than sorting it at boot up, saving
      milliseconds of boot time. Unfortunately, this change broke s390, which
      modifies the mcount_loc section after the sorting takes place and puts
      back the unsorted locations. Since the sorting is skipped at boot up if
      it is believed that it was sorted at build time, ftrace can crash, as
      its algorithms depend on the list being sorted.
      
      Add a new config BUILDTIME_MCOUNT_SORT that is set when
      BUILDTIME_TABLE_SORT is set, but not if S390 is set. Use this config to
      determine if sorting should take place at boot up.
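
Why an unsorted table breaks lookups, and how the config decision avoids it, can be sketched in a few lines. This is a user-space illustration only: the function names and the sample addresses are hypothetical, and a binary search over a Python list stands in for ftrace's lookups in mcount_loc.

```python
from bisect import bisect_left


def lookup(table, addr):
    """Binary search; only correct if the table is sorted."""
    i = bisect_left(table, addr)
    return i < len(table) and table[i] == addr


def needs_boot_sort(buildtime_table_sort, s390):
    """Model of the config logic: BUILDTIME_MCOUNT_SORT is set when
    BUILDTIME_TABLE_SORT is set and S390 is not; sort at boot otherwise."""
    buildtime_mcount_sort = buildtime_table_sort and not s390
    return not buildtime_mcount_sort


def init_table(table, buildtime_table_sort, s390):
    """Sort at 'boot' only when the build-time sort cannot be trusted."""
    if needs_boot_sort(buildtime_table_sort, s390):
        return sorted(table)
    return list(table)


# Hypothetical addresses, in the unsorted order s390 would put back.
unsorted_table = [0x40, 0x10, 0x30, 0x20]
found_without_sort = lookup(unsorted_table, 0x10)
found_with_boot_sort = lookup(init_table(unsorted_table, True, True), 0x10)
```

In this model the address 0x10 is present in the table but the binary search misses it while the table is unsorted; once the S390 case falls back to sorting at boot, the same lookup succeeds.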
      
      Link: https://lore.kernel.org/all/yt9dee51ctfn.fsf@linux.ibm.com/
      
      Fixes: 72b3942a ("scripts: ftrace - move the sort-processing in ftrace_init")
      Reported-by: Sven Schnelle <svens@linux.ibm.com>
      Tested-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      6b9b6413