1. 11 Feb, 2022 5 commits
    • user_events: Add minimal support for trace_event into ftrace · 7f5a08c7
      Beau Belgrave authored
      Minimal support for interacting with dynamic events, trace_event and
      ftrace. Core outline of flow between user process, ioctl and trace_event
      APIs.
      
      User mode processes that wish to use trace events to get data into
      ftrace, perf, eBPF, etc. are limited to uprobes today. The user events
      feature adds an ABI that lets user mode processes create and write to
      trace events that are isolated from kernel level trace events. This
      gives a faster path for tracing user mode data and lets managed code,
      where stub locations are dynamic, participate in trace events.
      
      User processes often want to trace only when it's useful. To enable this,
      a set of pages is mapped into the user process space that indicates the
      current state of the user events that have been registered. User
      processes can check if their event is hooked to a trace/probe, and if it
      is, emit the event data out via the write() syscall.
      
      Two new files are introduced into tracefs to accomplish this:
      user_events_status - This file is mmap'd into participating user mode
      processes to indicate event status.
      
      user_events_data - This file is opened and register/delete ioctls are
      issued on it to create/open/delete trace events that can be used for tracing.
      
      The typical scenario is to mmap user_events_status on process start. Processes
      then register the events they plan to use via the REG ioctl. The ioctl reads
      and updates the passed in user_reg struct. The status_index of the struct
      identifies the byte in the status page to check for that event. The
      write_index of the struct identifies the event when writing to the fd that
      was used for the ioctl call; every write for an event must begin with this
      index. Data can be written with either write() or writev().
      
      For example, in memory:
      int index;
      char data[];
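
The layout above (index first, payload after) can be sketched as a small packing helper for the single write() case. This is only an illustration: pack_user_event() and its parameters are hypothetical names, not part of the commit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the commit): copy the write index, then
 * the payload, into buf. Returns the number of bytes to hand to write(),
 * or 0 if buf is too small to hold both.
 */
size_t pack_user_event(void *buf, size_t cap, int write_index,
                       const void *payload, size_t payload_len)
{
	size_t total = sizeof(write_index) + payload_len;

	if (total > cap)
		return 0;

	/* The index always comes first, matching "int index; char data[];". */
	memcpy(buf, &write_index, sizeof(write_index));
	memcpy((char *)buf + sizeof(write_index), payload, payload_len);
	return total;
}
```

The returned length is what a caller would pass as the count argument of write() on the data fd.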
      
      Pseudo code example of typical usage:
      struct user_reg reg;
      
      int page_fd = open("user_events_status", O_RDWR);
      char *page_data = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, page_fd, 0);
      close(page_fd);
      
      int data_fd = open("user_events_data", O_RDWR);
      
      reg.size = sizeof(reg);
      reg.name_args = (__u64)"test";
      
      ioctl(data_fd, DIAG_IOCSREG, &reg);
      int status_id = reg.status_index;
      int write_id = reg.write_index;
      
      struct iovec io[2];
      io[0].iov_base = &write_id;
      io[0].iov_len = sizeof(write_id);
      io[1].iov_base = payload;
      io[1].iov_len = sizeof(payload);
      
      if (page_data[status_id])
      	writev(data_fd, io, 2);
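
The check-then-write step at the end of the pseudo code could be wrapped in a helper along the following lines. This is a sketch under assumptions: emit_if_enabled() is a hypothetical name, and the status page and data fd are taken as already set up by the caller as described in the commit message.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the commit). Returns 0 when the event is
 * disabled and nothing was written, otherwise the result of writev().
 */
ssize_t emit_if_enabled(const char *status_page, int status_index,
                        int data_fd, int write_index,
                        const void *payload, size_t payload_len)
{
	struct iovec io[2];

	/* A zero status byte means nothing is attached: skip the syscall. */
	if (!status_page[status_index])
		return 0;

	/* First vector: the write index. Second vector: the event payload. */
	io[0].iov_base = &write_index;
	io[0].iov_len = sizeof(write_index);
	io[1].iov_base = (void *)payload;
	io[1].iov_len = payload_len;

	return writev(data_fd, io, 2);
}
```

Using writev() here avoids copying the index and payload into one contiguous buffer before the call.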
      
      User events are also exposed via the dynamic_events tracefs file for
      both create and delete. Current status is exposed via the user_events_status
      tracefs file.
      
      Simple example to register a user event via dynamic_events:
      	echo u:test >> dynamic_events
      	cat dynamic_events
      	u:test
      
      If an event is hooked to a probe, the probe hooked shows up:
      	echo 1 > events/user_events/test/enable
      	cat user_events_status
      	1:test # Used by ftrace
      
      	Active: 1
      	Busy: 1
      	Max: 4096
      
      If an event is not hooked to a probe, no probe status shows up:
      	echo 0 > events/user_events/test/enable
      	cat user_events_status
      	1:test
      
      	Active: 1
      	Busy: 0
      	Max: 4096
      
      Users can describe the trace event format via the following format:
      	name[:FLAG1[,FLAG2...]] [field1[;field2...]]
      
      Each field has the following format:
      	type name
      
      Example for char array with a size of 20 named msg:
      	echo 'u:detailed char[20] msg' >> dynamic_events
      	cat dynamic_events
      	u:detailed char[20] msg
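
To make the "type name" field format concrete, here is a minimal user-space sketch that splits one field into its two parts. split_field() is a hypothetical helper written for illustration; it is not the kernel's parser, and splitting at the last space is an assumption made so that multi-word types stay intact.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "type name" at the last space.
 * Returns 0 on success, -1 if the field has no separator, an empty half,
 * or does not fit in the supplied buffers.
 */
int split_field(const char *field, char *type, size_t type_cap,
                char *name, size_t name_cap)
{
	const char *sep = strrchr(field, ' ');
	size_t type_len;

	if (!sep || sep == field || sep[1] == '\0')
		return -1;

	type_len = (size_t)(sep - field);
	if (type_len >= type_cap || strlen(sep + 1) >= name_cap)
		return -1;

	memcpy(type, field, type_len);
	type[type_len] = '\0';
	strcpy(name, sep + 1); /* length was checked against name_cap above */
	return 0;
}
```

For the example above, the field "char[20] msg" splits into the type "char[20]" and the name "msg".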
      
      Data offsets are based on the data written out via write() and will be
      updated to reflect the correct offset in the trace_event fields. For dynamic
      data it is recommended to use the new __rel_loc data type. This type will be
      the same as __data_loc, but the offset is relative to this entry. This allows
      user_events to not worry about what common fields are being inserted before
      the data.
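
As a rough illustration of a location word, the sketch below packs an offset and a length into 32 bits, with the offset in the low 16 bits and the length in the high 16 bits, as __data_loc is commonly encoded. The helper names are hypothetical, and the base used for the relative lookup (the first byte after the location word) is an assumption; the commit message only says the offset is "relative to this entry".

```c
#include <stdint.h>

/* Pack a location word: low 16 bits = offset, high 16 bits = length. */
uint32_t loc_pack(uint16_t offset, uint16_t len)
{
	return ((uint32_t)len << 16) | offset;
}

uint16_t loc_offset(uint32_t loc)
{
	return (uint16_t)(loc & 0xffff);
}

uint16_t loc_len(uint32_t loc)
{
	return (uint16_t)(loc >> 16);
}

/*
 * Resolve a relative location.
 * ASSUMPTION: the offset is measured from the first byte after the
 * 32-bit location word itself; check the kernel documentation for the
 * exact base before relying on this.
 */
const char *rel_loc_data(const uint32_t *loc_field)
{
	return (const char *)(loc_field + 1) + loc_offset(*loc_field);
}
```

With a relative base, the writer does not need to know how many bytes of common fields precede its own data.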
      
      The above format is valid for both the ioctl and the dynamic_events file.
      
      Link: https://lkml.kernel.org/r/20220118204326.2169-2-beaub@linux.microsoft.com
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • tracing: Save both wakee and current on wakeup events · 55bc8384
      Steven Rostedt (Google) authored
      Use the sched_switch function to save both the wakee and the waker comms
      in the saved cmdlines list when sched_wakeup is done.
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • tracing: Remove size restriction on synthetic event cmd error logging · 27c888da
      Tom Zanussi authored
      Currently, synthetic event command error strings are restricted to a
      length of MAX_FILTER_STR_VAL (256), which is too short for some
      commands already seen in the wild (with cmd strings longer than that
      showing up truncated in err_log).
      
      Remove the restriction so that no synthetic event command error string
      is ever truncated.
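
The difference between a fixed-size copy and an unrestricted one can be shown with a small user-space sketch. It is not the kernel change itself: FIXED_CMD_LEN stands in for MAX_FILTER_STR_VAL, and both function names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FIXED_CMD_LEN 256 /* stands in for MAX_FILTER_STR_VAL */

/* Fixed buffer: commands longer than 255 characters are silently cut. */
void save_cmd_fixed(char dst[FIXED_CMD_LEN], const char *cmd)
{
	snprintf(dst, FIXED_CMD_LEN, "%s", cmd);
}

/* No size limit: allocate exactly what the command needs. Caller frees. */
char *save_cmd_dynamic(const char *cmd)
{
	size_t len = strlen(cmd) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, cmd, len);
	return copy;
}
```

A 300-character command survives intact through save_cmd_dynamic() but loses its tail in save_cmd_fixed(), which is the kind of truncation the commit message describes in err_log.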
      
      Link: https://lkml.kernel.org/r/0376692396a81d0b795127c66ea92ca5bf60f481.1643399022.git.zanussi@kernel.org
      Signed-off-by: Tom Zanussi <zanussi@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • tracing: Remove size restriction on hist trigger cmd error logging · edfeed31
      Tom Zanussi authored
      Currently, hist trigger command error strings are restricted to a
      length of MAX_FILTER_STR_VAL (256), which is too short for some
      commands already seen in the wild (with cmd strings longer than that
      showing up truncated in err_log).
      
      Remove the restriction so that no hist trigger command error string is
      ever truncated.
      
      Link: https://lkml.kernel.org/r/0f9d46407222eaf6632cd3b417bc50a11f401b71.1643399022.git.zanussi@kernel.org
      Signed-off-by: Tom Zanussi <zanussi@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • tracing: Remove size restriction on tracing_log_err cmd strings · 1581a884
      Tom Zanussi authored
      Currently, tracing_log_err.cmd strings are restricted to a length of
      MAX_FILTER_STR_VAL (256), which is too short for some commands already
      seen in the wild (with cmd strings longer than that showing up
      truncated).
      
      Remove the restriction so that no command string is ever truncated.
      
      Link: https://lkml.kernel.org/r/ca965f23256b350ebd94b3dc1a319f28e8267f5f.1643319703.git.zanussi@kernel.org
      Signed-off-by: Tom Zanussi <zanussi@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
  2. 08 Feb, 2022 3 commits
  3. 04 Feb, 2022 3 commits
  4. 28 Jan, 2022 10 commits
  5. 23 Jan, 2022 6 commits
    • Linux 5.17-rc1 · e783362e
      Linus Torvalds authored
    • Merge tag 'perf-tools-for-v5.17-2022-01-22' of... · 40c84321
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v5.17-2022-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
      
       - Fix printing 'phys_addr' in 'perf script'.
      
       - Fix failure to add events with 'perf probe' in ppc64 due to not
         removing leading dot (ppc64 ABIv1).
      
       - Fix cpu_map__item() python binding building.
      
       - Support event alias in form foo-bar-baz, add pmu-events and
         parse-event tests for it.
      
       - No need to setup affinities when starting a workload or attaching to
         a pid.
      
       - Use path__join() to compose a path instead of ad-hoc snprintf()
         equivalent.
      
       - Override attr->sample_period for non-libpfm4 events.
      
       - Use libperf cpumap APIs instead of accessing the internal state
         directly.
      
       - Sync x86 arch prctl headers and files changed by the new
         set_mempolicy_home_node syscall with the kernel sources.
      
       - Remove duplicate include in cpumap.h.
      
       - Remove redundant err variable.
      
      * tag 'perf-tools-for-v5.17-2022-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Remove redundant err variable
        perf test: Add parse-events test for aliases with hyphens
        perf test: Add pmu-events test for aliases with hyphens
        perf parse-events: Support event alias in form foo-bar-baz
        perf evsel: Override attr->sample_period for non-libpfm4 events
        perf cpumap: Remove duplicate include in cpumap.h
        perf cpumap: Migrate to libperf cpumap api
        perf python: Fix cpu_map__item() building
        perf script: Fix printing 'phys_addr' failure issue
        tools headers UAPI: Sync files changed by new set_mempolicy_home_node syscall
        tools headers UAPI: Sync x86 arch prctl headers with the kernel sources
        perf machine: Use path__join() to compose a path instead of snprintf(dir, '/', filename)
        perf evlist: No need to setup affinities when disabling events for pid targets
        perf evlist: No need to setup affinities when enabling events for pid targets
        perf stat: No need to setup affinities when starting a workload
        perf affinity: Allow passing a NULL arg to affinity__cleanup()
        perf probe: Fix ppc64 'perf probe add events failed' case
    • Merge tag 'trace-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 67bfce0e
      Linus Torvalds authored
      Pull ftrace fix from Steven Rostedt:
       "Fix s390 breakage from sorting mcount tables.
      
        The latest merge of the tracing tree sorts the mcount table at build
        time. But s390 appears to do things differently (like always) and
        replaces the sorted table with the original unsorted one. As the
        ftrace algorithm depends on it being sorted, bad things happen when it
        is not, and s390 experienced those bad things.
      
        Add a new config to tell the boot if the mcount table is sorted or
        not, and allow s390 to opt out of it"
      
      * tag 'trace-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix assuming build time sort works for s390
    • ftrace: Fix assuming build time sort works for s390 · 6b9b6413
      Steven Rostedt (Google) authored
      mcount_loc needs to be sorted for ftrace to work properly. To speed up
      the boot process, it is now sorted at build time, which is more efficient
      than sorting at boot up and can save milliseconds of time. Unfortunately,
      this change broke s390, which modifies mcount_loc after the build time
      sort and puts the unsorted locations back. Since sorting is skipped at
      boot up when the table is believed to have been sorted at build time,
      ftrace can crash, as its algorithms depend on the list being sorted.
      
      Add a new config BUILDTIME_MCOUNT_SORT that is set when
      BUILDTIME_TABLE_SORT is set, but not if S390 is set. Use this config to
      determine if sorting should take place at boot up.
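
The idea of sorting at boot only when the build step could not guarantee order can be sketched in user space as follows. This is not the kernel code: prepare_table(), table_contains(), and the sorted_at_build flag are hypothetical stand-ins for the config-driven decision described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of two addresses without overflow. */
static int cmp_addr(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/* Sort at "boot" only when the build step could not guarantee order. */
void prepare_table(unsigned long *tbl, size_t n, bool sorted_at_build)
{
	if (!sorted_at_build)
		qsort(tbl, n, sizeof(*tbl), cmp_addr);
}

/* Binary search: the result is only reliable when tbl is sorted. */
bool table_contains(const unsigned long *tbl, size_t n, unsigned long addr)
{
	return bsearch(&addr, tbl, n, sizeof(*tbl), cmp_addr) != NULL;
}
```

If sorted_at_build is wrongly true for an unsorted table, the binary search can miss entries that are present, which is the class of failure the commit guards against.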
      
      Link: https://lore.kernel.org/all/yt9dee51ctfn.fsf@linux.ibm.com/
      
      Fixes: 72b3942a ("scripts: ftrace - move the sort-processing in ftrace_init")
      Reported-by: Sven Schnelle <svens@linux.ibm.com>
      Tested-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • Merge tag 'kbuild-fixes-v5.17' of... · 473aec0e
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Bring include/uapi/linux/nfc.h into the UAPI compile-test coverage
      
       - Revert the workaround of CONFIG_CC_IMPLICIT_FALLTHROUGH
      
       - Fix build errors in certs/Makefile
      
      * tag 'kbuild-fixes-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        certs: Fix build error when CONFIG_MODULE_SIG_KEY is empty
        certs: Fix build error when CONFIG_MODULE_SIG_KEY is PKCS#11 URI
        Revert "Makefile: Do not quote value for CONFIG_CC_IMPLICIT_FALLTHROUGH"
        usr/include/Makefile: add linux/nfc.h to the compile-test coverage
    • Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux · 3689f9f8
      Linus Torvalds authored
      Pull bitmap updates from Yury Norov:
      
       - introduce for_each_set_bitrange()
      
       - use find_first_*_bit() instead of find_next_*_bit() where possible
      
       - unify for_each_bit() macros
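
For readers unfamiliar with the find-bit helpers, a simplified user-space analogue of the "first" and "next" searches is sketched below. These are not the kernel implementations; first_set_bit() and next_set_bit() are hypothetical names, and both return nbits when no set bit is found.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Index of the first set bit at or after start, or nbits if none. */
size_t next_set_bit(const unsigned long *map, size_t nbits, size_t start)
{
	for (size_t i = start; i < nbits; i++)
		if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return i;
	return nbits;
}

/* Index of the first set bit, or nbits if none. Skips all-zero words. */
size_t first_set_bit(const unsigned long *map, size_t nbits)
{
	for (size_t w = 0; w * BITS_PER_WORD < nbits; w++) {
		if (!map[w])
			continue;
		for (size_t b = 0; b < BITS_PER_WORD; b++) {
			size_t i = w * BITS_PER_WORD + b;

			if (i >= nbits)
				break;
			if (map[w] & (1UL << b))
				return i;
		}
	}
	return nbits;
}
```

When a search always starts at bit 0, the "first" form needs no start offset, which is the kind of call site the series converts.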
      
      * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
        vsprintf: rework bitmap_list_string
        lib: bitmap: add performance test for bitmap_print_to_pagebuf
        bitmap: unify find_bit operations
        mm/percpu: micro-optimize pcpu_is_populated()
        Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
        find: micro-optimize for_each_{set,clear}_bit()
        include/linux: move for_each_bit() macros from bitops.h to find.h
        cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
        tools: sync tools/bitmap with mother linux
        all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
        cpumask: use find_first_and_bit()
        lib: add find_first_and_bit()
        arch: remove GENERIC_FIND_FIRST_BIT entirely
        include: move find.h from asm_generic to linux
        bitops: move find_bit_*_le functions from le.h to find.h
        bitops: protect find_first_{,zero}_bit properly
  6. 22 Jan, 2022 13 commits