    Merge tag 'perf-core-2024-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9f0c253d
    Linus Torvalds authored
    Pull perf events updates from Ingo Molnar:
    
     - Implement per-PMU context rescheduling to significantly improve
       single-PMU performance, and related cleanups/fixes (Peter Zijlstra
       and Namhyung Kim)
    
     - Fix ancient bug resulting in a lot of events being dropped
       erroneously at higher sampling frequencies (Luo Gengkun)
    
     - uprobes enhancements:
    
         - Implement RCU-protected hot path optimizations for better
           performance:
    
             "For baseline vs SRCU, peak throughput increased from 3.7 M/s
              (million uprobe triggerings per second) up to about 8 M/s. For
              uretprobes it's a bit more modest with bump from 2.4 M/s to
              5 M/s.
    
              For SRCU vs RCU Tasks Trace, peak throughput for uprobes
              increases further from 8 M/s to 10.3 M/s (+28%!), and for
              uretprobes from 5.3 M/s to 5.8 M/s (+11%), as we have more
              work to do on uretprobes side.
    
              Even single-thread (no contention) performance is slightly
              better: 3.276 M/s to 3.396 M/s (+3.5%) for uprobes, and 2.055
              M/s to 2.174 M/s (+5.8%) for uretprobes."
    
              (Andrii Nakryiko et al)
    
         - Document mmap_lock, don't abuse get_user_pages_remote() (Oleg
           Nesterov)
    
         - Cleanups & fixes to prepare for future work:
            - Remove uprobe_register_refctr()
            - Simplify error handling for alloc_uprobe()
            - Make uprobe_register() return struct uprobe *
            - Fold __uprobe_unregister() into uprobe_unregister()
            - Shift put_uprobe() from delete_uprobe() to uprobe_unregister()
            - BPF: Fix use-after-free in bpf_uprobe_multi_link_attach()
              (Oleg Nesterov)
    
     - New feature & ABI extension: allow events to use PERF_SAMPLE_READ
       with inheritance, enabling sample based profiling of a group of
       counters over a hierarchy of processes or threads (Ben Gainey)
    
     - Intel uncore & power events updates:
    
          - Add Arrow Lake and Lunar Lake support
          - Add PERF_EV_CAP_READ_SCOPE
          - Clean up and enhance cpumask and hotplug support
            (Kan Liang)
    
          - Add LNL uncore iMC freerunning support
          - Use D0:F0 as a default device
            (Zhenyu Wang)
    
     - Intel PT: fix AUX snapshot handling race (Adrian Hunter)
    
     - Misc fixes and cleanups (James Clark, Jiri Olsa, Oleg Nesterov and
       Peter Zijlstra)
    
    * tag 'perf-core-2024-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
      dmaengine: idxd: Clean up cpumask and hotplug for perfmon
      iommu/vt-d: Clean up cpumask and hotplug for perfmon
      perf/x86/intel/cstate: Clean up cpumask and hotplug
      perf: Add PERF_EV_CAP_READ_SCOPE
      perf: Generic hotplug support for a PMU with a scope
      uprobes: perform lockless SRCU-protected uprobes_tree lookup
      rbtree: provide rb_find_rcu() / rb_find_add_rcu()
      perf/uprobe: split uprobe_unregister()
      uprobes: travers uprobe's consumer list locklessly under SRCU protection
      uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks
      uprobes: protected uprobe lifetime with SRCU
      uprobes: revamp uprobe refcounting and lifetime management
      bpf: Fix use-after-free in bpf_uprobe_multi_link_attach()
      perf/core: Fix small negative period being ignored
      perf: Really fix event_function_call() locking
      perf: Optimize __pmu_ctx_sched_out()
      perf: Add context time freeze
      perf: Fix event_function_call() locking
      perf: Extract a few helpers
      perf: Optimize context reschedule for single PMU cases
      ...