1. 12 Dec, 2023 2 commits
    • libperf cpumap: Rename perf_cpu_map__default_new() to perf_cpu_map__new_online_cpus() and prefer sysfs · 8f60f870
      Ian Rogers authored
      
      Rename perf_cpu_map__default_new() to perf_cpu_map__new_online_cpus() to
      better indicate what the implementation does.
      
      Read the online CPUs from /sys/devices/system/cpu/online first before
      using sysconf(), because sysconf() can't accurately describe holes in
      the CPU map.
      
      If sysconf() is used, warn when the configured and online processors
      disagree.
      
      When reading from a file, if the read doesn't yield a CPU map then
      return an empty map rather than the default online map. This avoids
      recursion and also makes failures easier to detect.
      
      Add more comments.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Atish Patra <atishp@rivosinc.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yang Li <yang.lee@linux.alibaba.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20231129060211.1890454-3-irogers@google.com
      [ s/syfs/sysfs/g typo ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8f60f870
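The point about holes in the CPU map can be illustrated with a small parser for the list format found in /sys/devices/system/cpu/online (for example "0-3,5,8-9"). This is only a sketch: parse_cpu_list() is a hypothetical helper written for illustration and is not the libperf implementation.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a CPU list such as "0-3,5,8-9" into cpus[]. Returns the number of
 * CPUs stored, or -1 on malformed input or if more than max CPUs are listed.
 * parse_cpu_list() is a hypothetical helper, not a libperf function.
 */
static int parse_cpu_list(const char *s, int *cpus, int max)
{
	int n = 0;

	assert(s != NULL && cpus != NULL);
	while (*s != '\0' && *s != '\n') {
		char *end;
		long first = strtol(s, &end, 10);
		long last = first;

		if (end == s || first < 0)
			return -1;
		if (*end == '-') {
			/* A range "first-last"; both ends are inclusive. */
			s = end + 1;
			last = strtol(s, &end, 10);
			if (end == s || last < first)
				return -1;
		}
		for (long cpu = first; cpu <= last; cpu++) {
			if (n >= max)
				return -1;
			cpus[n++] = (int)cpu;
		}
		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		s = end;
	}
	return n;
}
```

A list like "0-2,4" keeps the hole at CPU 3 visible, which a plain count of processors cannot express.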
    • libperf cpumap: Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() · 48219b08
      Ian Rogers authored
      Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() to
      better indicate this is creating a CPU map for the perf_event_open "any"
      CPU case.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Atish Patra <atishp@rivosinc.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yang Li <yang.lee@linux.alibaba.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20231129060211.1890454-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      48219b08
  2. 11 Dec, 2023 1 commit
  3. 07 Dec, 2023 8 commits
  4. 06 Dec, 2023 8 commits
    • perf stat: Exit perf stat if parse groups fails · 0713ab3b
      Ian Rogers authored
      Metrics were added by a callback but commit a4b8cfca ("perf
      stat: Delay metric parsing") postponed this to allow optimizations based
      on the CPU configuration.
      
      In doing so it stopped errors in metric parsing from causing 'perf stat'
      termination.
      
      This change adds the termination for bad metric names back in.
      
      Fixes: a4b8cfca ("perf stat: Delay metric parsing")
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Closes: https://lore.kernel.org/lkml/ZXByT1K6enTh2EHT@kernel.org/
      Link: https://lore.kernel.org/r/20231206183533.972028-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0713ab3b
    • perf thread: Add missing RC_CHK_EQUAL · 01261d8a
      Ian Rogers authored
      Comparing pointers without RC_CHK_ACCESS means the indirect object
      will be compared rather than the underlying maps when REFCNT_CHECKING
      is enabled. Fix by adding missing RC_CHK_EQUAL.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-15-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      01261d8a
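The RC_CHK_EQUAL fix above concerns reference count checking, where each reference is a separate indirection object pointing at the shared underlying object. The following is a simplified sketch of why comparing the indirection pointers is wrong. The struct and function names are hypothetical and stand in for the real macros; they are not perf's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct obj_data { int id; };			/* the real, shared object */
struct obj_handle { struct obj_data *orig; };	/* one wrapper per reference */

/* Each "get" hands out a fresh wrapper, so two references never share one. */
static struct obj_handle *handle_get(struct obj_data *data)
{
	struct obj_handle *h = malloc(sizeof(*h));

	assert(h != NULL);
	h->orig = data;
	return h;
}

/* Compare the underlying objects, not the wrappers themselves. */
static bool handle_equal(const struct obj_handle *a, const struct obj_handle *b)
{
	if (a == b)
		return true;
	if (a == NULL || b == NULL)
		return false;
	return a->orig == b->orig;
}
```

Two handles obtained for the same object differ as pointers but compare equal through handle_equal(), which is the behavior a plain `==` on the handles would miss.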
    • perf maps: Move symbol maps functions to maps.c · 0f6ab6a3
      Ian Rogers authored
      Move the find and certain other symbol maps__* functions to maps.c for
      better abstraction.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-14-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0f6ab6a3
    • perf map: Simplify map_ip/unmap_ip and make 'struct map' smaller · 9fa688ea
      Ian Rogers authored
      When mapping an IP it is either an identity mapping or a DSO relative
      mapping, so a single bit is required in the struct to identify
      this.
      
      The current code uses function pointers, adding 2 pointers per map and
      also pushing the size of a map beyond 1 cache line.
      
      Switch to using a byte to identify the mapping type (as well as priv and
      erange_warned), to avoid any masking.
      
      Change struct map's layout to avoid holes.
      
      Before:
      ```
      struct map {
              u64                        start;                /*     0     8 */
              u64                        end;                  /*     8     8 */
              _Bool                      erange_warned:1;      /*    16: 0  1 */
              _Bool                      priv:1;               /*    16: 1  1 */
      
              /* XXX 6 bits hole, try to pack */
              /* XXX 3 bytes hole, try to pack */
      
              u32                        prot;                 /*    20     4 */
              u64                        pgoff;                /*    24     8 */
              u64                        reloc;                /*    32     8 */
              u64                        (*map_ip)(const struct map  *, u64); /*    40     8 */
              u64                        (*unmap_ip)(const struct map  *, u64); /*    48     8 */
              struct dso *               dso;                  /*    56     8 */
              /* --- cacheline 1 boundary (64 bytes) --- */
              refcount_t                 refcnt;               /*    64     4 */
              u32                        flags;                /*    68     4 */
      
              /* size: 72, cachelines: 2, members: 12 */
              /* sum members: 68, holes: 1, sum holes: 3 */
              /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
              /* last cacheline: 8 bytes */
      };
      ```
      
      After:
      ```
      struct map {
              u64                        start;                /*     0     8 */
              u64                        end;                  /*     8     8 */
              u64                        pgoff;                /*    16     8 */
              u64                        reloc;                /*    24     8 */
              struct dso *               dso;                  /*    32     8 */
              refcount_t                 refcnt;               /*    40     4 */
              u32                        prot;                 /*    44     4 */
              u32                        flags;                /*    48     4 */
              enum mapping_type          mapping_type:8;       /*    52: 0  4 */
      
              /* Bitfield combined with next fields */
      
              _Bool                      erange_warned;        /*    53     1 */
              _Bool                      priv;                 /*    54     1 */
      
              /* size: 56, cachelines: 1, members: 11 */
              /* padding: 1 */
              /* last cacheline: 56 bytes */
      };
      ```
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-13-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9fa688ea
    • perf test shell diff: Skip test if test_loop symbol is missing in the perf binary · 407a3898
      Ian Rogers authored
      The diff test depends on finding the symbol test_loop in perf and will
      fail if perf has been stripped and no debug object is available. In that
      case, skip the test instead.
      Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231205164924.835682-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      407a3898
    • perf symbols: Parse NOTE segments until the build id is found · d0acce68
      Chengen Du authored
      An ELF file may contain multiple NOTE segments.
      To locate the build id, keep parsing NOTE segments
      until the build id is found.
      Signed-off-by: Chengen Du <chengen.du@canonical.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231130135723.17562-1-chengen.du@canonical.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d0acce68
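The idea of continuing through every NOTE segment until a build id turns up can be sketched as follows. This is not perf's code: the structs and function names are hypothetical, the note header is read in host byte order, and segments are assumed to be already loaded into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NT_GNU_BUILD_ID 3	/* note type used for GNU build ids */

struct note_hdr { uint32_t namesz, descsz, type; };
struct note_seg { const unsigned char *data; size_t len; };

/* Note names and descriptors are padded to 4-byte boundaries. */
static size_t align4(size_t n) { return (n + 3) & ~(size_t)3; }

/* Scan one NOTE segment; return the build id length, or 0 if absent. */
static size_t scan_notes(const struct note_seg *seg, unsigned char *out,
			 size_t out_len)
{
	size_t off = 0;

	while (off + sizeof(struct note_hdr) <= seg->len) {
		struct note_hdr h;
		size_t name, desc;

		memcpy(&h, seg->data + off, sizeof(h));
		name = off + sizeof(h);
		desc = name + align4(h.namesz);
		if (desc > seg->len || h.descsz > seg->len - desc)
			break;	/* truncated note */
		if (h.type == SKETCH_NT_GNU_BUILD_ID && h.namesz == 4 &&
		    memcmp(seg->data + name, "GNU", 4) == 0 &&
		    h.descsz <= out_len) {
			memcpy(out, seg->data + desc, h.descsz);
			return h.descsz;
		}
		off = desc + align4(h.descsz);
	}
	return 0;
}

/* Do not stop at the first NOTE segment: try each one in turn. */
static size_t find_build_id(const struct note_seg *segs, size_t nr,
			    unsigned char *out, size_t out_len)
{
	assert(segs != NULL && out != NULL);
	for (size_t i = 0; i < nr; i++) {
		size_t len = scan_notes(&segs[i], out, out_len);

		if (len > 0)
			return len;
	}
	return 0;
}
```

Stopping after the first segment would report no build id whenever that segment only holds other note types.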
    • perf record: Be lazier in allocating lost samples buffer · 030ac3ca
      Ian Rogers authored
      Wait until a lost sample occurs to allocate the lost samples buffer,
      as the buffer often isn't necessary. This saves a 64kb allocation and
      5.3kb of peak memory consumption.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-9-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      030ac3ca
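The lazy allocation described in the lost samples commit follows a common pattern: keep a NULL pointer until the buffer is first needed. A minimal sketch, with a hypothetical struct and accessor and a buffer size taken from the 64kb figure in the message:

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_LOST_BUF_SIZE (64 * 1024)	/* assumed from the 64kb figure */

struct sketch_record { void *lost_buf; };	/* hypothetical state holder */

/* Allocate the buffer on first use and reuse it afterwards. */
static void *sketch_lost_buf(struct sketch_record *rec)
{
	assert(rec != NULL);
	if (rec->lost_buf == NULL)
		rec->lost_buf = calloc(1, SKETCH_LOST_BUF_SIZE);
	return rec->lost_buf;	/* NULL only if the allocation failed */
}
```

Runs that never lose a sample never call the accessor, so they never pay for the allocation.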
    • perf evsel: Fallback to "task-clock" when not system wide · eb2eac0c
      Ian Rogers authored
      When the "cycles" event isn't available evsel will fall back to the
      "cpu-clock" software event.
      
      "task-clock" is similar to "cpu-clock" but only runs when the process is
      running.
      
      Falling back to "cpu-clock" when not system wide leads to confusion;
      falling back to "task-clock" instead should be less confusing.
      
      Pass the target to determine if "task-clock" is more appropriate.
      
      Update a nearby comment and debug string for the change.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ajay Kaher <akaher@vmware.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Makhalov <amakhalov@vmware.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231121000420.368075-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb2eac0c
  5. 05 Dec, 2023 9 commits
  6. 04 Dec, 2023 5 commits
  7. 30 Nov, 2023 4 commits
    • tools api fs: Avoid reading whole file for a 1 byte bool · f8846a1a
      Ian Rogers authored
      sysfs__read_bool() read the whole file into a string and then looked
      only at the first byte's value. Avoid doing this and just read the
      first byte.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f8846a1a
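Reading a boolean from a single byte, as the sysfs__read_bool() commit describes, might look like the sketch below. read_bool_fd() is a hypothetical helper that works on an already open file descriptor, and the set of accepted characters is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Read one byte from fd and interpret it; returns 0 on success, -1 otherwise. */
static int read_bool_fd(int fd, bool *value)
{
	char c;

	assert(value != NULL);
	if (read(fd, &c, 1) != 1)
		return -1;
	switch (c) {
	case '1': case 'y': case 'Y':	/* assumed "true" spellings */
		*value = true;
		return 0;
	case '0': case 'n': case 'N':	/* assumed "false" spellings */
		*value = false;
		return 0;
	default:
		return -1;
	}
}
```

No heap allocation is involved, however long the rest of the file is.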
    • tools api fs: Switch filename__read_str to use io.h · b6a15269
      Ian Rogers authored
      filename__read_str() has its own string reading code that allocates
      memory before reading into it. The memory allocated is sized at BUFSIZ,
      which is 8kb. Most strings are short, so most of this 8kb is wasted.
      
      Refactor io__getline() as io__getdelim() so that the delimiter character
      is configurable and can be ignored in the case of filename__read_str().
      
      Code like build_caches_for_cpu() in perf's header.c will read many strings
      and hold them in a data structure, in this case multiple strings per
      cache level per CPU.
      
      Using io.h's io__getline() avoids the wasted memory as strings are
      temporarily read into a buffer on the stack before being copied to a
      buffer that grows 128 bytes at a time and is never sized larger than the
      string.
      
      For a 16 hyperthread system the memory consumption of "perf record
      true" is reduced by 180kb, primarily through saving memory when
      reading the cache information.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b6a15269
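The memory saving in the filename__read_str() commit comes from reading through a small stack buffer and growing the heap string only as far as the data requires. The sketch below shows that pattern with plain read() and realloc(). It does not use io.h, and read_str_fd() and the chunk size constant are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_CHUNK 128	/* read at most this many bytes per step */

/* Read everything from fd into a NUL-terminated heap string; caller frees. */
static char *read_str_fd(int fd, size_t *len_out)
{
	char chunk[SKETCH_CHUNK];	/* temporary buffer on the stack */
	char *str = NULL;
	size_t len = 0;
	ssize_t n;

	assert(len_out != NULL);
	while ((n = read(fd, chunk, sizeof(chunk))) > 0) {
		/* Grow to exactly what is needed, plus the terminator. */
		char *tmp = realloc(str, len + (size_t)n + 1);

		if (tmp == NULL) {
			free(str);
			return NULL;
		}
		str = tmp;
		memcpy(str + len, chunk, (size_t)n);
		len += (size_t)n;
		str[len] = '\0';
	}
	if (n < 0) {
		free(str);
		return NULL;
	}
	*len_out = len;
	return str;	/* NULL with *len_out == 0 when nothing was read */
}
```

A short string therefore costs a few bytes of heap rather than a full BUFSIZ allocation.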
    • libperf: Lazily allocate/size mmap event copy · 366efbff
      Ian Rogers authored
      The event copy in the mmap is used to have storage to read an event. Not
      all users of mmaps read the events, such as perf record. The amount of
      buffer was also statically set to PERF_SAMPLE_MAX_SIZE rather than the
      amount necessary from the header's event size.
      
      Switch to a model where the event_copy is reallocated if too small to
      the event's size. This adds the potential for the event to move, so if a
      copy of the event pointer were stored it could be broken. All the
      current users do:
      
        while(event = perf_mmap__read_event()) { ... }
      
      and so they would be broken due to the event being overwritten if they
      had stored the pointer. Manual inspection and address sanitizer testing
      also shows the event pointer not being stored.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-3-irogers@google.com
      [ Replace two lines with equivalent zfree(&map->event_copy) ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      366efbff
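The "reallocate if too small" model from the mmap event copy commit can be sketched as below. The struct, its field names, and sketch_event_copy() are hypothetical; the sketch only shows why a stored event pointer can go stale once the buffer grows.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_mmap {
	void *event_copy;	/* NULL until an event is first copied */
	size_t event_copy_sz;	/* current capacity in bytes */
};

/* Return storage for an event of the given size, growing it if needed. */
static void *sketch_event_copy(struct sketch_mmap *map, size_t size)
{
	assert(map != NULL);
	if (size > map->event_copy_sz) {
		void *tmp = realloc(map->event_copy, size);

		if (tmp == NULL)
			return NULL;
		/* The buffer may have moved: older pointers are now invalid. */
		map->event_copy = tmp;
		map->event_copy_sz = size;
	}
	return map->event_copy;
}
```

Callers that use the returned pointer only within one loop iteration, as in the while loop quoted in the message, are unaffected by the buffer moving.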
    • libapi: Add missing linux/types.h header to get the __u64 type on io.h · af76b2de
      Arnaldo Carvalho de Melo authored
      There are functions using __u64, so we need to have the linux/types.h
      header, otherwise we'll break when it's not included before api/io.h.
      
      Fixes: e95770af ("tools api: Add a lightweight buffered reading api")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/ZWjDPL+IzPPsuC3X@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      af76b2de
  8. 29 Nov, 2023 3 commits
    • perf test record+probe_libc_inet_pton: Fix call chain match on powerpc · 72a2a0a4
      Likhitha Korrapati authored
      The perf test "probe libc's inet_pton & backtrace it with ping" fails on
      powerpc as below:
      
        # perf test -v "probe libc's inet_pton & backtrace it with ping"
         85: probe libc's inet_pton & backtrace it with ping                 :
        --- start ---
        test child forked, pid 96028
        ping 96056 [002] 127271.101961: probe_libc:inet_pton: (7fffa1779a60)
        7fffa1779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
        7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
        FAIL: expected backtrace entry
        "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
        got "7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
        test child finished with -1
        ---- end ----
        probe libc's inet_pton & backtrace it with ping: FAILED!
      
      This test installs a probe on libc's inet_pton function, which uses
      uprobes, and then runs perf trace on a ping to localhost. It collects a
      backtrace 3 levels deep and checks whether it matches what is expected.
      
      The test started failing on RHEL 9.4, whereas it works on the previous
      distro version (RHEL 9.2). The test expects the gaih_inet function to
      be part of the backtrace, but in the glibc version (2.34-86) shipped
      with the distro where it fails, this function is missing, hence the
      failure.
      
      From the nm and perf script output for ping we can confirm that the
      gaih_inet function is not present in the backtrace for glibc-2.34-86:
      
        [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
        00000000001273e0 t gaih_inet_serv
        00000000001cd8d8 r gaih_inet_typeproto
      
        [root@xxx perf]# perf script -i /tmp/perf.data.6E8
        ping  104048 [000] 128582.508976: probe_libc:inet_pton: (7fff83779a60)
                    7fff83779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
                    7fff8372a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
                       11dc73534 [unknown] (/usr/bin/ping)
                    7fff8362a8c4 __libc_start_call_main+0x84 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
      
        FAIL: expected backtrace entry
        "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
        got "7fff9d52a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
      
      With glibc-2.34-60 the gaih_inet function is present in the backtrace,
      so we cannot simply remove gaih_inet from the expected backtrace:
      
        [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
        0000000000130490 t gaih_inet.constprop.0
        000000000012e830 t gaih_inet_serv
        00000000001d45e4 r gaih_inet_typeproto
      
        [root@xxx perf]# ./perf script -i /tmp/perf.data.b6S
        ping   67906 [000] 22699.591699: probe_libc:inet_pton_3: (7fffbdd80820)
                    7fffbdd80820 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
                    7fffbdd31160 gaih_inet.constprop.0+0xcd0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
                    7fffbdd31c7c getaddrinfo+0x14c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
                       1140d3558 [unknown] (/usr/bin/ping)
      
      This patch solves the issue with a conditional check: if a gaih_inet
      function is present in libc, it is added to the expected backtrace;
      otherwise it is left out of the expected backtrace.
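
The conditional check described above can be sketched as follows. This is a minimal illustration in Python, not the shell code of the actual test: the helper names `has_gaih_inet` and `expected_backtrace` are hypothetical, and the regex fragments are only modelled on the patterns shown in the failure output above.

```python
import re
from typing import List

# Path taken from the console output above; any libc path would do.
LIBC = "/usr/lib64/glibc-hwcaps/power10/libc.so.6"


def has_gaih_inet(nm_output: str) -> bool:
    """Return True if nm lists gaih_inet itself or a compiler clone such as
    gaih_inet.constprop.0. Related symbols like gaih_inet_serv do not count."""
    for line in nm_output.splitlines():
        fields = line.split()
        # The symbol name is the last column of an nm line.
        if fields and re.fullmatch(r"gaih_inet(\..*)?", fields[-1]):
            return True
    return False


def expected_backtrace(nm_output: str, libc: str = LIBC) -> List[str]:
    """Build the expected backtrace regexes (hypothetical helper), adding the
    gaih_inet entry only when that symbol exists in this libc build."""
    lib = re.escape(libc)
    expected = [rf"__GI___inet_pton\+0x[0-9a-f]+\s\({lib}\)$"]
    if has_gaih_inet(nm_output):
        expected.append(rf"gaih_inet.*\+0x[0-9a-f]+\s\({lib}\)$")
    expected.append(rf"getaddrinfo\+0x[0-9a-f]+\s\({lib}\)$")
    return expected
```

Fed the nm output from glibc-2.34-86 above, the sketch yields two expected entries; fed the glibc-2.34-60 output, it yields three, with gaih_inet between inet_pton and getaddrinfo.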
      
      Output with the patch:
      
        [root@xxx perf]# ./perf test -v "probe libc's inet_pton & backtrace it
        with ping"
         83: probe libc's inet_pton & backtrace it with ping                 :
        --- start ---
        test child forked, pid 102662
        ping 102692 [000] 127935.549973: probe_libc:inet_pton: (7fff93379a60)
        7fff93379a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
        7fff9332a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
        11ef03534 [unknown] (/usr/bin/ping)
        test child finished with 0
        ---- end ----
        probe libc's inet_pton & backtrace it with ping: Ok
      Reported-by: Disha Goel <disgoel@linux.ibm.com>
      Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Likhitha Korrapati <likhitha@linux.ibm.com>
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20231126070914.175332-1-likhitha@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      perf tests sigtrap: Skip if running on a kernel with sleepable spinlocks · 650e0bde
      Arnaldo Carvalho de Melo authored
      There are reported issues on the RT kernel front that need more
      investigation; until those are addressed, skip this test.
      
      This test is already skipped for multiple hardware architectures where
      the tested kernel feature is not supported.
      Acked-by: Marco Elver <elver@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Marco Elver <elver@google.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@gmx.de/
      Link: https://lore.kernel.org/r/20231129154718.326330-3-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      perf test sigtrap: Generalize the BTF routine to reuse it in this test · a472ee42
      Arnaldo Carvalho de Melo authored
      Move the part that loads the BTF info to a "btf__available()" function
      that lazily loads the BTF info, so that it can be reused if some other
      test needs it, which will be the case in the following cset.
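
The lazy-load-and-reuse idea can be illustrated with a short sketch. This is Python rather than perf's C, and it is not the code from this commit: `LazyResource` and `load_vmlinux_btf` are hypothetical names, and the `/sys/kernel/btf/vmlinux` path is an assumption about where the running kernel exposes its BTF.

```python
from typing import Callable, Optional

# Sentinel meaning "load not attempted yet", distinct from "attempted and failed".
_UNSET = object()


class LazyResource:
    """Load a resource on first use and remember the outcome, including failure,
    so that later callers never trigger a second load."""

    def __init__(self, loader: Callable[[], Optional[bytes]]):
        self._loader = loader
        self._value = _UNSET

    def available(self) -> bool:
        if self._value is _UNSET:
            try:
                self._value = self._loader()
            except OSError:
                # Cache the failure too: a missing file will not appear later.
                self._value = None
        return self._value is not None

    def get(self) -> Optional[bytes]:
        return self._value if self.available() else None


def load_vmlinux_btf(path: str = "/sys/kernel/btf/vmlinux") -> bytes:
    """Hypothetical loader: read the raw BTF blob (path is an assumption)."""
    with open(path, "rb") as f:
        return f.read()


# Shared instance: the first test that calls available() pays for the load.
vmlinux_btf = LazyResource(load_vmlinux_btf)
```

A test that depends on BTF would check `vmlinux_btf.available()` and skip itself when that returns False; any later test making the same check reuses the cached result.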
      
      At some point this will move from this specific 'perf test' entry to be
      used in other parts of perf; do that when needed.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/20231129154718.326330-2-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>