1. 06 Nov, 2019 35 commits
  2. 28 Oct, 2019 5 commits
    • perf/core: Optimize perf_init_event() for TYPE_SOFTWARE · d44f821b
      Liang, Kan authored
      Andi reported that he was hitting the linear search in
      perf_init_event() a lot. Now that all !TYPE_SOFTWARE events should hit
      the IDR, make sure the TYPE_SOFTWARE events are at the head of the
      list such that we'll quickly find the right PMU (provided a valid
      event was given).
      Signed-off-by: Liang, Kan <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d44f821b
    • perf/core: Optimize perf_init_event() · 66d258c5
      Peter Zijlstra authored
      Andi reported that he was hitting the linear search in
      perf_init_event() a lot. Make more aggressive use of the IDR lookup to
      avoid hitting the linear search.
      
      With the exception of PERF_TYPE_SOFTWARE (which relies on a hideous hack),
      we can put everything in the IDR. On top of that, we can alias
      TYPE_HARDWARE and TYPE_HW_CACHE to TYPE_RAW on the lookup side.
      
      This greatly reduces the chances of hitting the linear search.
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan <kan.liang@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      66d258c5
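The lookup-side aliasing described in this commit can be illustrated with a small user-space sketch. This is not the kernel code: a direct-indexed array stands in for the IDR, and every `sk_`/`SK_` name is a hypothetical introduced for the example. The type numbers are assumed to follow the order of the `PERF_TYPE_*` constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type ids; assumed to follow the PERF_TYPE_* ordering. */
enum sk_type {
    SK_TYPE_HARDWARE = 0,
    SK_TYPE_SOFTWARE = 1,
    SK_TYPE_TRACEPOINT = 2,
    SK_TYPE_HW_CACHE = 3,
    SK_TYPE_RAW = 4,
    SK_TYPE_MAX
};

struct sk_pmu {
    const char *name;
};

/* Stand-in for the IDR: one slot per type id, NULL when nothing is registered. */
static struct sk_pmu *sk_table[SK_TYPE_MAX];

/* Register a PMU under a type id (assumption: one PMU per id). */
static void sk_register(int type, struct sk_pmu *pmu)
{
    assert(type >= 0 && type < SK_TYPE_MAX);
    sk_table[type] = pmu;
}

/*
 * Look up the PMU for an event type. HARDWARE and HW_CACHE are aliased
 * to RAW before the lookup, so all three resolve to the same entry
 * without any linear search.
 */
static struct sk_pmu *sk_lookup(int type)
{
    if (type == SK_TYPE_HARDWARE || type == SK_TYPE_HW_CACHE)
        type = SK_TYPE_RAW;
    if (type < 0 || type >= SK_TYPE_MAX)
        return NULL;
    return sk_table[type];
}
```

In this sketch, registering a single PMU under `SK_TYPE_RAW` is enough for hardware and cache events to find it directly; a type with no entry returns NULL, which is where a fallback search would begin.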
    • perf/core: Optimize perf_install_in_event() · db0503e4
      Peter Zijlstra authored
      Andi reported that when creating a lot of events, a lot of time is
      spent in IPIs and asked if it would be possible to elide some of that.
      
      When events are created disabled, as the perf tool always does, they
      do not need to be scheduled when added to the context (they are still
      disabled), so the IPI is not required. The exception is the very first
      event, which needs the IPI to set ctx->is_active.
      
      ( It might be possible to set ctx->is_active remotely for cpu_ctx, but
        we really need the IPI for task_ctx, so let's not make that
        distinction. )
      
      Also use __perf_effective_state(), since group events depend on the
      state of the leader: if the leader is OFF, the whole group is OFF.
      
      So when sibling events are created enabled (XXX check tool) then we
      only need a single IPI to create and enable the whole group (+ that
      initial IPI to initialize the context).
      Suggested-by: Andi Kleen <andi@firstfloor.org>
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      db0503e4
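The decision this commit describes, skipping the IPI when the event is effectively OFF and the context already holds events, can be modelled as a small predicate. The sketch below is a user-space illustration only: the `sk_` types, the two-value state enum, and the `nr_events` field are simplifying assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of event states; OFF sorts below INACTIVE. */
enum sk_state {
    SK_STATE_OFF = -1,
    SK_STATE_INACTIVE = 0
};

struct sk_event {
    enum sk_state state;
    struct sk_event *group_leader; /* a leader points at itself */
};

struct sk_ctx {
    int nr_events; /* events already installed in this context */
};

/* A group member is effectively OFF whenever its leader is OFF. */
static enum sk_state sk_effective_state(const struct sk_event *event)
{
    const struct sk_event *leader = event->group_leader;

    assert(leader != NULL);
    if (leader->state <= SK_STATE_OFF)
        return leader->state;
    return event->state;
}

/*
 * An IPI is assumed to be needed unless the event is effectively OFF
 * and the context is not empty. The first event in a context always
 * takes the IPI path so the context can be marked active.
 */
static bool sk_install_needs_ipi(const struct sk_event *event,
                                 const struct sk_ctx *ctx)
{
    if (sk_effective_state(event) == SK_STATE_OFF && ctx->nr_events > 0)
        return false;
    return true;
}
```

Under these assumptions, a sibling created enabled under a disabled leader is installed without an IPI, and the whole group is scheduled only once the leader is enabled.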
    • perf/x86: Synchronize PMU task contexts on optimized context switches · c2b98a86
      Alexey Budankov authored
      Install an Intel-specific PMU task context synchronization adapter and
      extend the optimized context switch path with PMU-specific task context
      synchronization to fix LBR callstack virtualization on context switches.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/9c6445a9-bdba-ef03-3859-f1f91198f27a@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c2b98a86
    • perf/x86/intel: Implement LBR callstack context synchronization · 421ca868
      Alexey Budankov authored
      Implement the intel_pmu_lbr_swap_task_ctx() method, which updates the
      counters of the events that requested LBR callstack data on a sample.
      
      The counter can be zero when the task context belongs to a thread
      that has just returned from blocking on a futex and the context
      contains saved (lbr_stack_state == LBR_VALID) LBR register values.
      
      So that the values are restored to the LBR registers on the thread's
      next switch-in event, the method swaps the counter value with the one
      in the previous equivalent task perf event context, which is expected
      to be non-zero.
      
      Using a swap ensures that the previous task perf event context
      stays consistent with the number of events that requested LBR
      callstack data on a sample.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/261ac742-9022-c3f4-5885-1eae7415b091@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      421ca868
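The counter swap described above can be sketched in a few lines. This is an illustrative model rather than the kernel implementation: `struct sk_task_ctx` and its single field are hypothetical stand-ins for the per-task LBR context data, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task LBR context data. */
struct sk_task_ctx {
    int lbr_callstack_users; /* events that requested LBR callstack data */
};

/*
 * Exchange the user counters of two equivalent task contexts. A swap,
 * rather than a one-way copy, keeps each context's count consistent
 * after the two tasks trade contexts on the optimized switch path.
 */
static void sk_lbr_swap_task_ctx(struct sk_task_ctx *prev,
                                 struct sk_task_ctx *next)
{
    int tmp;

    assert(prev != NULL && next != NULL);
    tmp = prev->lbr_callstack_users;
    prev->lbr_callstack_users = next->lbr_callstack_users;
    next->lbr_callstack_users = tmp;
}
```

In this model, a context whose counter is zero picks up the non-zero count from its counterpart, which is the condition the commit message relies on for the saved LBR values to be restored at switch-in.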