1. 18 Feb, 2021 3 commits
  2. 17 Feb, 2021 7 commits
  3. 16 Feb, 2021 5 commits
    • Arnaldo Carvalho de Melo's avatar
      37b3fa0e
    • Merge branch 'perf/urgent' into perf/core · c1bd8a2b
      Arnaldo Carvalho de Melo authored
      To get some fixes that didn't make it into 5.11.

      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Set sample's data source field · a89dbc9b
      Leo Yan authored
      The sample structure contains the field 'data_src', which describes
      the attributes of a data operation: whether it is a load or a store,
      the cache level, whether it involves snooping or a remote access, etc.
      In the end, 'data_src' is parsed by the perf mem/c2c tools to display
      human-readable strings.

      This patch fills the 'data_src' field in the synthesized samples based
      on the sample type.  Currently the perf tool can display statistics for
      L1/L2/L3 caches but it doesn't support the 'last level cache'.  To fit
      the current implementation, the 'data_src' field reports the last level
      cache as L3.
      
      Before this commit, perf mem report looks like this:
        # Samples: 75K of event 'l1d-miss'
        # Total weight : 75951
        # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
        #
        # Overhead  Samples  Local Weight  Memory access  Symbol                  Shared Object  Data Symbol             Data Object  Snoop  TLB access
        # ........  .......  ............  .............  ......................  .............  ......................  ...........  .....  ..........
        #
            81.56%    61945  0             N/A            [.] 0x00000000000009d8  serial_c       [.] 0000000000000000    [unknown]    N/A    N/A
            18.44%    14003  0             N/A            [.] 0x0000000000000828  serial_c       [.] 0000000000000000    [unknown]    N/A    N/A
      
      Now on a system with Arm SPE, addresses and access types are displayed:
      
        # Samples: 75K of event 'l1d-miss'
        # Total weight : 75951
        # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
        #
        # Overhead  Samples  Local Weight  Memory access  Symbol                  Shared Object  Data Symbol             Data Object  Snoop  TLB access
        # ........  .......  ............  .............  ......................  .............  ......................  ...........  .....  ..........
        #
             0.43%      324  0             L1 miss        [.] 0x00000000000009d8  serial_c       [.] 0x0000ffff80794e00  anon         N/A    Walker hit
             0.42%      322  0             L1 miss        [.] 0x00000000000009d8  serial_c       [.] 0x0000ffff80794580  anon         N/A    Walker hit

      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: James Clark <james.clark@arm.com>
      Link: https://lore.kernel.org/r/20210211133856.2137-6-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Synthesize memory event · e55ed342
      Leo Yan authored
      The memory event delivers two benefits:

      - First, it gives a global view of memory accesses.  Organizing events
        in a scattered way (e.g. a separate event for the L1 cache, the last
        level cache, etc.) can only display one event per memory type, whereas
        the memory event includes all memory accesses, so it can display data
        accesses across memory levels in the same view;

      - Second, sample generation can introduce a big overhead, leading to a
        long wait for perf reporting.  Specifying the itrace option
        '--itrace=M' filters out other events and outputs only memory events,
        which can significantly reduce the overhead caused by generating
        samples.

      This patch enables the memory event for Arm SPE.

      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: James Clark <james.clark@arm.com>
      Link: https://lore.kernel.org/r/20210211133856.2137-5-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Fill address info for samples · 54f7815e
      Leo Yan authored
      To properly handle memory and branch samples, this patch splits sample
      generation into two functions: arm_spe__synth_mem_sample() synthesizes
      memory and TLB samples; arm_spe__synth_branch_sample() synthesizes
      branch samples.

      The Arm SPE backend decoder passes virtual and physical addresses
      through packets; the address info is stored into the synthesized
      samples in the function arm_spe__synth_mem_sample().
      
      Committer notes:
      
      Fixed this:
      
        36    46.77 fedora:27                     : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
      
          util/arm-spe.c:269:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
                  struct perf_sample sample = { 0 };
                                                  ^
          util/arm-spe.c:288:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
                  struct perf_sample sample = { 0 };
      
      By using = { .ip = 0, };

      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: James Clark <james.clark@arm.com>
      Link: https://lore.kernel.org/r/20210211133856.2137-4-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 14 Feb, 2021 7 commits
  5. 13 Feb, 2021 12 commits
  6. 12 Feb, 2021 6 commits
    • Merge tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernel · 7989807d
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Four small smb3 fixes to the new mount API (including a particularly
        important one for DFS links).
      
        These were found in testing this week of additional DFS scenarios, and
        a user testing of an apache container problem"
      
      * tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernel:
        cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
        cifs: In the new mount api we get the full devname as source=
        cifs: do not disable noperm if multiuser mount option is not provided
        cifs: fix dfs-links
    • perf probe: Fix kretprobe issue caused by GCC bug · 105f75eb
      Jianlin Lv authored
      Perf fails to add a kretprobe event using the debuginfo of a vmlinux
      compiled by gcc with the -fpatchable-function-entry option enabled.
      The same issue occurs with kernel modules.
      
      Issue:
      
        # perf probe  -v 'kernel_clone%return $retval'
        ......
        Writing event: r:probe/kernel_clone__return _text+599624 $retval
        Failed to write event: Invalid argument
          Error: Failed to add events. Reason: Invalid argument (Code: -22)
      
        # cat /sys/kernel/debug/tracing/error_log
        [156.75] trace_kprobe: error: Retprobe address must be an function entry
        Command: r:probe/kernel_clone__return _text+599624 $retval
                                              ^
      
        # llvm-dwarfdump  vmlinux |grep  -A 10  -w 0x00df2c2b
        0x00df2c2b:   DW_TAG_subprogram
                      DW_AT_external  (true)
                      DW_AT_name      ("kernel_clone")
                      DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
                      DW_AT_decl_line (2423)
                      DW_AT_decl_column       (0x07)
                      DW_AT_prototyped        (true)
                      DW_AT_type      (0x00dcd492 "pid_t")
                      DW_AT_low_pc    (0xffff800010092648)
                      DW_AT_high_pc   (0xffff800010092b9c)
                      DW_AT_frame_base        (DW_OP_call_frame_cfa)
      
        # cat /proc/kallsyms |grep kernel_clone
        ffff800010092640 T kernel_clone
        # readelf -s vmlinux |grep -i kernel_clone
        183173: ffff800010092640  1372 FUNC    GLOBAL DEFAULT    2 kernel_clone
      
        # objdump -d vmlinux |grep -A 10  -w \<kernel_clone\>:
        ffff800010092640 <kernel_clone>:
        ffff800010092640:       d503201f        nop
        ffff800010092644:       d503201f        nop
        ffff800010092648:       d503233f        paciasp
        ffff80001009264c:       a9b87bfd        stp     x29, x30, [sp, #-128]!
        ffff800010092650:       910003fd        mov     x29, sp
        ffff800010092654:       a90153f3        stp     x19, x20, [sp, #16]
      
      The entry address of kernel_clone converted from debuginfo is
      _text+599624 (0x92648), which is consistent with the value of the
      DW_AT_low_pc attribute.  But the symbol address of kernel_clone from
      /proc/kallsyms is ffff800010092640.

      This issue was found on arm64, where -fpatchable-function-entry=2 is
      enabled when CONFIG_DYNAMIC_FTRACE_WITH_REGS=y.  As the objdump
      disassembly of kernel_clone shows, GCC generates 2 NOPs at the
      beginning of each function.

      kprobe_on_func_entry detects that (_text+599624) is not the entry address
      of the function, which leads to the failure to add the kretprobe event.
      
        kprobe_on_func_entry
        ->_kprobe_addr
        ->kallsyms_lookup_size_offset
        ->arch_kprobe_on_func_entry		// FALSE
      
      The cause of the issue is that the first instruction in the compile unit
      indicated by DW_AT_low_pc does not include the NOPs.
      This issue exists in all gcc versions that support the
      -fpatchable-function-entry option.
      
      I have reported it to the GCC community:
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
      
      Currently arm64 and PA-RISC may enable the -fpatchable-function-entry
      option.  The kernel compiled with clang does not have this issue.
      
      FIX:
      
      This GCC issue only causes the registration failure of the kretprobe
      event, which doesn't need debuginfo.  So stop using debuginfo for
      retprobes; the map will be used to query the probe function address.

      Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: clang-built-linux@googlegroups.com
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20210210062646.2377995-1-Jianlin.Lv@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Fix return value when loading PE DSO · 77771a97
      Nicholas Fraser authored
      The first time dso__load() was called on a PE file it always returned -1
      error. This caused the first call to map__find_symbol() to always fail
      on a PE file so the first sample from each PE file always had symbol
      <unknown>. Subsequent samples succeed however because the DSO is already
      loaded.
      
      This fixes dso__load() to return 0 when successfully loading a DSO with
      libbfd.
      
      Fixes: eac9a434 ("perf symbols: Try reading the symbol table with libbfd")
      Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Huw Davies <huw@codeweavers.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Remi Bernon <rbernon@codeweavers.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
      Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
      Link: http://lore.kernel.org/lkml/1671b43b-09c3-1911-dbf8-7f030242fbf7@codeweavers.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Make dso__load_bfd_symbols() load PE files from debug cache only · 00a34234
      Nicholas Fraser authored
      dso__load_bfd_symbols() attempts to load a DSO at its original path,
      then closes it and loads the file in the debug cache. This is incorrect.
      It should ignore the original file and work with only the debug cache.
      
      The original file may have changed or may not even exist, for example if
      the debug cache has been transferred to another machine via "perf
      archive".
      
      This fix makes it only load the file in the debug cache.
      
      Further notes from Nicholas:
      
      dso__load_bfd_symbols() is called in a loop from dso__load() for a variety
      of paths. These are generated by the various DSO_BINARY_TYPEs in the
      binary_type_symtab list at the top of util/symbol.c. In each case the
      debugfile passed to dso__load_bfd_symbols() is the path to try.
      
      One of those iterations (the first one I believe) passes the original path
      as the debugfile. If the file still exists at the original path, this is
      the one that ends up being used in case the debugcache was deleted or the
      PE file doesn't have a build-id.
      
      A later iteration (BUILD_ID_CACHE) passes debugfile as the file in the
      debugcache if it has a build-id. Even if the file was previously loaded at
      its original path, (if I understand correctly) this load will override it
      so the debugcache file ends up being used.
      
      Committer notes:
      
      So if it fails to find in the cache, it will eventually hope for the
      best and look at the path in the local filesystem, which in many cases
      is enough.
      
      At some point we need to switch from this "hope for the best" approach
      to one that warns the user that there is no guarantee, if no buildid is
      present, that just by looking at the pathname the symbolisation will
      work.

      Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Huw Davies <huw@codeweavers.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Remi Bernon <rbernon@codeweavers.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
      Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
      Link: http://lore.kernel.org/lkml/e58e1237-94ab-e1c9-a7b9-473531906954@codeweavers.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Store operation type in packet · 97ae666a
      Leo Yan authored
      This patch stores the operation type in the packet structure.

      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: James Clark <james.clark@arm.com>
      Link: https://lore.kernel.org/r/20210211133856.2137-3-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Store memory address in packet · 265cfb95
      Leo Yan authored
      This patch stores the virtual and physical memory addresses in the
      packet, which will be used for memory samples.

      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20210211133856.2137-2-james.clark@arm.com
      Signed-off-by: James Clark <james.clark@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>