1. 18 Nov, 2021 12 commits
    • tools build: Fix removal of feature-sync-compare-and-swap feature detection · e8c04ea0
      Arnaldo Carvalho de Melo authored
      The patch removing the feature-sync-compare-and-swap feature detection
      didn't remove the call to main_test_sync_compare_and_swap(), making the
      'test-all' case fail and all the feature tests to be performed
      individually:
      
        $ cat /tmp/build/perf/feature/test-all.make.output
        In file included from test-all.c:18:
        test-libpython-version.c:5:10: error: #error
            5 |         #error
              |          ^~~~~
        test-all.c: In function ‘main’:
        test-all.c:203:9: error: implicit declaration of function ‘main_test_sync_compare_and_swap’ [-Werror=implicit-function-declaration]
          203 |         main_test_sync_compare_and_swap(argc, argv);
              |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
        $
      
      Fix it.  Next, figure out what that test-libpython-version.c
      problem is...
      
      Fixes: 60fa754b ("tools: Remove feature-sync-compare-and-swap feature detection")
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/YZU9Fe0sgkHSXeC2@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Fix ARM SPE handling · 9e1a8d9f
      German Gomez authored
      'perf inject' is currently not working for Arm SPE. When you try to run
      'perf inject' and 'perf report' with a perf.data file that contains SPE
      traces, the tool reports a "Bad address" error:
      
        # ./perf record -e arm_spe_0/ts_enable=1,store_filter=1,branch_filter=1,load_filter=1/ -a -- sleep 1
        # ./perf inject -i perf.data -o perf.inject.data --itrace
        # ./perf report -i perf.inject.data --stdio
      
        0x42c00 [0x8]: failed to process type: 9 [Bad address]
        Error:
        failed to process sample
      
      As far as I know, the issue was first spotted in [1], but 'perf inject'
      was not yet injecting the samples. This patch does something similar to
      what cs_etm does for injecting the samples [2], but for SPE.
      
      [1] https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210412091006.468557-1-leo.yan@linaro.org/#24117339
      [2] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/cs-etm.c?h=perf/core&id=133fe2e617e48ca0948983329f43877064ffda3e#n1196
      
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20211105104130.28186-2-german.gomez@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench: Fix two memory leaks detected with ASan · 92723ea0
      Sohaib Mohamed authored
      ASan reports memory leaks while running:
      
        $ perf bench sched all
      
      Fixes: e27454cc ("perf bench: Add sched-messaging.c: Benchmark for scheduler and IPC mechanisms based on hackbench")
      Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hitoshi Mitake <h.mitake@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Russel <rusty@rustcorp.com.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pierre Gondois <pierre.gondois@arm.com>
      Link: http://lore.kernel.org/lkml/20211110022012.16620-1-sohaib.amhmd@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test sample-parsing: Fix branch_stack entry endianness check · cb5a63fe
      Thomas Richter authored
      Commit 10269a2c ("perf test sample-parsing: Add endian test for
      struct branch_flags") broke the test case 27 (Sample parsing) on s390 on
      linux-next tree:
      
        # perf test -Fv 27
        27: Sample parsing
        --- start ---
        parsing failed for sample_type 0x800
        ---- end ----
        Sample parsing: FAILED!
        #
      
      The cause of the failure is a wrong #define BS_EXPECTED_BE statement in
      the above commit.  Correct this define and the test case runs fine.
      
      Output After:
      
        # perf test -Fv 27
        27: Sample parsing                                                  :
        --- start ---
        ---- end ----
        Sample parsing: Ok
        #
      
      Fixes: 10269a2c ("perf test sample-parsing: Add endian test for struct branch_flags")
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Tested-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/54077e81-503e-3405-6cb0-6541eb5532cc@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
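      Why an expected value such as BS_EXPECTED_BE depends on endianness can
      be illustrated with a small Python sketch.  It assumes the common C ABI
      convention that bitfields are allocated from the least significant bit
      on little-endian targets and from the most significant bit on
      big-endian ones.  The field names and widths are illustrative
      assumptions loosely modeled on a branch flags bitfield, not copied from
      the perf sources, and pack_flags() is a hypothetical helper.

```python
# Illustrative field list (name, width in bits) in declaration order.
# These names and widths are assumptions for the sketch, not perf's definition.
FIELDS = [
    ("mispred", 1),
    ("predicted", 1),
    ("in_tx", 1),
    ("abort", 1),
    ("cycles", 16),
    ("type", 4),
    ("reserved", 40),
]


def pack_flags(values, big_endian):
    """Pack bitfield values into a 64-bit word.

    Little-endian layout places the first declared field at bit 0;
    big-endian layout places it at bit 63 (assumed ABI convention).
    """
    word = 0
    offset = 0  # number of bits consumed so far, in declaration order
    for name, width in FIELDS:
        value = values.get(name, 0) & ((1 << width) - 1)
        shift = (64 - offset - width) if big_endian else offset
        word |= value << shift
        offset += width
    return word


flags = {"mispred": 1, "abort": 1, "cycles": 5}
little = pack_flags(flags, big_endian=False)  # 0x59
big = pack_flags(flags, big_endian=True)      # 0x9000500000000000
```

      The same logical flags give two different 64-bit patterns, so a test
      that compares raw values needs one expected constant per byte order.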
    • tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources · 162b9445
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        828ca896 ("KVM: x86: Expose TSC offset controls to userspace")
      
      That just rebuilds kvm-stat.c on x86, no change in functionality.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
        diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
      
      Cc: Oliver Upton <oupton@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf sort: Fix the 'p_stage_cyc' sort key behavior · db4b2840
      Namhyung Kim authored
      Handle 'p_stage_cyc' (for pipeline stage cycles) sort key with the same
      rationale as for the 'weight' and 'local_weight', see the fix in this
      series for a full explanation.
      
      Not sure it also needs the local and global variants.
      
      But I couldn't actually test it because I don't have the machine.
      
      Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211105225617.151364-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf sort: Fix the 'ins_lat' sort key behavior · 4d03c753
      Namhyung Kim authored
      Handle 'ins_lat' (for instruction latency) and 'local_ins_lat' sort keys
      with the same rationale as for the 'weight' and 'local_weight', see the
      previous fix in this series for a full explanation.
      
      But I couldn't actually test it, so it is only build tested.
      
      Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211105225617.151364-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf sort: Fix the 'weight' sort key behavior · 784e8add
      Namhyung Kim authored
      Currently, the 'weight' field in the perf sample holds latency information
      for some instructions, such as memory accesses.  The perf tool has 'weight'
      and 'local_weight' sort keys to display the info.
      
      But it's somewhat confusing what it shows exactly.  In my understanding,
      'local_weight' shows a weight in a single sample, and (global) 'weight'
      shows a sum of the weights in the hist_entry.
      
      For example:
      
        $ perf mem record -t load dd if=/dev/zero of=/dev/null bs=4k count=1M
      
        $ perf report --stdio -n -s +local_weight
        ...
        #
        # Overhead  Samples  Command  Shared Object     Symbol                     Local Weight
        # ........  .......  .......  ................  .........................  ............
        #
            21.23%      313  dd       [kernel.vmlinux]  [k] lockref_get_not_zero   32
            12.43%      183  dd       [kernel.vmlinux]  [k] lockref_get_not_zero   35
            11.97%      159  dd       [kernel.vmlinux]  [k] lockref_get_not_zero   36
            10.40%      141  dd       [kernel.vmlinux]  [k] lockref_put_return     32
             7.63%      113  dd       [kernel.vmlinux]  [k] lockref_get_not_zero   33
             6.37%       92  dd       [kernel.vmlinux]  [k] lockref_get_not_zero   34
             6.15%       90  dd       [kernel.vmlinux]  [k] lockref_put_return     33
        ...
      
      So let's look at the 'lockref_get_not_zero' symbols.  The top entry
      shows that 313 samples were captured with 'local_weight' 32, so the
      total weight should be 313 x 32 = 10016.  But it's not the case:
      
        $ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero
        ...
        #
        # Overhead  Samples  Command  Shared Object     Local Weight  Weight
        # ........  .......  .......  ................  ............  ......
        #
             1.36%        4  dd       [kernel.vmlinux]  36            144
             0.47%        4  dd       [kernel.vmlinux]  37            148
             0.42%        4  dd       [kernel.vmlinux]  32            128
             0.40%        4  dd       [kernel.vmlinux]  34            136
             0.35%        4  dd       [kernel.vmlinux]  36            144
             0.34%        4  dd       [kernel.vmlinux]  35            140
             0.30%        4  dd       [kernel.vmlinux]  36            144
             0.30%        4  dd       [kernel.vmlinux]  34            136
             0.30%        4  dd       [kernel.vmlinux]  32            128
             0.30%        4  dd       [kernel.vmlinux]  32            128
        ...
      
      With the 'weight' sort key, it's divided into 4-sample entries even with the same
      info ('comm', 'dso', 'sym' and 'local_weight').  I don't think this is
      what we want.
      
      I found that this is caused by the way it aggregates the 'weight' value.
      Since it's not a period, we should not add them up in he->stat.  Otherwise,
      two 32 'weight' entries will create a 64 'weight' entry.
      
      After that, new 32 'weight' samples don't have a matching entry so it'd
      create a new entry and make it a 64 'weight' entry again and again.
      Later, they will be merged into 128 'weight' entries during the
      hists__collapse_resort() with 4 samples, multiple times like above.
      
      Let's keep the weight and display it differently.  For 'local_weight',
      it can show the weight as is, and for (global) 'weight' it can display
      the number multiplied by the number of samples.
      
      With this change, I can see the expected numbers.
      
        $ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero
        ...
        #
        # Overhead  Samples  Command  Shared Object     Local Weight  Weight
        # ........  .......  .......  ................  ............  ......
        #
            21.23%      313  dd       [kernel.vmlinux]  32            10016
            12.43%      183  dd       [kernel.vmlinux]  35            6405
            11.97%      159  dd       [kernel.vmlinux]  36            5724
             7.63%      113  dd       [kernel.vmlinux]  33            3729
             6.37%       92  dd       [kernel.vmlinux]  34            3128
             4.17%       59  dd       [kernel.vmlinux]  37            2183
             0.08%        1  dd       [kernel.vmlinux]  269           269
             0.08%        1  dd       [kernel.vmlinux]  38            38
      
      Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211105225617.151364-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
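      The aggregation problem described in the 'weight' sort key fix above
      can be sketched in a few lines of Python.  This is a simplified model,
      not perf's actual code: the function names, the list-based entries and
      the rule of matching entries by weight alone are assumptions made for
      illustration.  It shows how summing the weight into a matching entry
      changes the entry's sort key so that later samples no longer match, and
      how keeping the weight fixed and multiplying by the sample count gives
      the expected total.

```python
def aggregate_buggy(samples):
    """Model of the old behaviour: the weight is summed like a period.

    Each entry is [weight, nr_samples]; a sample matches an entry only
    if the entry's (already accumulated) weight equals the sample weight.
    """
    entries = []
    for w in samples:
        for e in entries:
            if e[0] == w:
                e[0] += w   # the sort key itself changes here
                e[1] += 1
                break
        else:
            entries.append([w, 1])
    return entries


def collapse(entries):
    """Model of a later merge pass that again matches and sums by weight."""
    merged = []
    for weight, nr in entries:
        for m in merged:
            if m[0] == weight:
                m[0] += weight
                m[1] += nr
                break
        else:
            merged.append([weight, nr])
    return merged


def aggregate_fixed(samples):
    """Model of the fix: keep the weight as is, only count samples.

    Returns (local_weight, nr_samples, weight) tuples, where the global
    weight is displayed as local_weight * nr_samples.
    """
    counts = {}
    for w in samples:
        counts[w] = counts.get(w, 0) + 1
    return [(w, n, w * n) for w, n in counts.items()]


buggy = collapse(aggregate_buggy([32] * 8))
fixed = aggregate_fixed([32] * 313)
```

      With eight samples of weight 32, the buggy model ends up with two
      separate entries of weight 128 and 4 samples each, while the fixed
      model reports a single entry whose total for 313 samples is
      313 x 32 = 10016.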
    • perf tools: Set COMPAT_NEED_REALLOCARRAY for CONFIG_AUXTRACE=1 · 70f9c9b2
      Arnaldo Carvalho de Melo authored
      reallocarray() is being used in tools/perf/arch/arm64/util/arm-spe.c, but
      COMPAT_NEED_REALLOCARRAY was only being set when CORESIGHT=1 is set.
      
      Fixes: 56c31cdf ("perf arm-spe: Implement find_snapshot callback")
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/all/YZT63mIc7iY01er3@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests wp: Remove unused functions on s390 · ccb05590
      Arnaldo Carvalho de Melo authored
      Fixing these build problems:
      
        tests/wp.c:24:12: error: 'wp_read' defined but not used [-Werror=unused-function]
         static int wp_read(int fd, long long *count, int size)
                    ^
        tests/wp.c:35:13: error: 'get__perf_event_attr' defined but not used [-Werror=unused-function]
         static void get__perf_event_attr(struct perf_event_attr *attr, int wp_type,
                     ^
          CC      /tmp/build/perf/util/print_binary.o
      
      Fixes: e47c6eca ("perf test: Convert watch point tests to test cases.")
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Brendan Higgins <brendanhiggins@google.com>
      Cc: Daniel Latypov <dlatypov@google.com>
      Cc: David Gow <davidgow@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers UAPI: Sync linux/kvm.h with the kernel sources · 346e9199
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        b5663931 ("KVM: SEV: Add support for SEV intra host migration")
        e615e355 ("KVM: x86: On emulation failure, convey the exit reason, etc. to userspace")
        a9d496d8 ("KVM: x86: Clarify the kvm_run.emulation_failure structure layout")
        c68dc1b5 ("KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK")
        dea8ee31 ("RISC-V: KVM: Add SBI v0.1 support")
      
      That just rebuilds perf, as these patches don't add any new KVM ioctl to
      be harvested for the 'perf trace' ioctl syscall argument
      beautifiers.
      
      This is also by now used by tools/testing/selftests/kvm/; a simple test
      build succeeded.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
        diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
      
      Cc: Anup Patel <anup@brainfault.org>
      Cc: Atish Patra <atish.patra@wdc.com>
      Cc: David Edmondson <david.edmondson@oracle.com>
      Cc: Oliver Upton <oupton@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Gonda <pgonda@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers cpufeatures: Sync with the kernel sources · b075c1d8
      Arnaldo Carvalho de Melo authored
      To pick the changes from:
      
        eec2113e ("x86/fpu/amx: Define AMX state components and have it used for boot-time checks")
      
      This only causes these perf files to be rebuilt:
      
        CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
        CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o
      
      And addresses this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
        diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 17 Nov, 2021 4 commits
  3. 16 Nov, 2021 4 commits
  4. 15 Nov, 2021 7 commits
  5. 14 Nov, 2021 13 commits