1. 17 Oct, 2023 11 commits
  2. 12 Oct, 2023 23 commits
  3. 10 Oct, 2023 1 commit
  4. 05 Oct, 2023 5 commits
    • tools/perf: Update call stack check in builtin-lock.c · d7c9ae8d
      Kajol Jain authored
      The perf test named "kernel lock contention analysis test"
      fails on powerpc systems with the error below:
      
        [command]# ./perf test 81 -vv
         81: kernel lock contention analysis test                            :
         --- start ---
        test child forked, pid 2140
        Testing perf lock record and perf lock contention
        Testing perf lock contention --use-bpf
        [Skip] No BPF support
        Testing perf lock record and perf lock contention at the same time
        Testing perf lock contention --threads
        Testing perf lock contention --lock-addr
        Testing perf lock contention --type-filter (w/ spinlock)
        Testing perf lock contention --lock-filter (w/ tasklist_lock)
        Testing perf lock contention --callstack-filter (w/ unix_stream)
        [Fail] Recorded result should have a lock from unix_stream:
        test child finished with -1
         ---- end ----
        kernel lock contention analysis test: FAILED!
      
      The test fails because perf lock samples on powerpc can contain a
      callstack entry with address 0, and the code for the lock contention
      option "--callstack-filter" does not check any further entries after
      an address of 0.
      
      Below are some of the samples from the perf.data file generated by the test,
      which have a 0 address in the 2nd callstack entry:
       --------
      sched-messaging    3409 [001]  7152.904029: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN)
              c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms])
                             0 [unknown] ([unknown])
              c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms])
              c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms])
              c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms])
              c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms])
              c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms])
              c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms])
      
      sched-messaging    3408 [005]  7152.904036: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN)
              c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms])
                             0 [unknown] ([unknown])
              c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms])
              c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms])
              c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms])
              c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms])
              c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms])
              c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms])
       --------
      
      Based on commit 20002ded ("perf_counter: powerpc: Add callchain support"),
      on powerpc the callchain saved by the kernel always includes, as its first
      three entries, the NIP (next instruction pointer), the LR (link register), and
      the contents of the LR save area in the second stack frame. In certain scenarios
      it is possible to have invalid kernel instruction addresses in either the LR or the
      second stack frame's LR. In that case, the kernel stores the address as zero.
      Hence, it is possible for the 2nd or 3rd callstack entry to be 0.
      
      In the current code of the match_callstack_filter function, the callstack
      check stops as soon as it encounters a 0 address, and hence the test case fails on powerpc.
      
      Fix this issue by updating the check in the match_callstack_filter function
      so that it does not stop the callstack check if the 2nd or 3rd entry has a 0 address
      on powerpc.
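The check described above can be sketched as follows. This is a minimal illustration, not the code from builtin-lock.c: the names `zero_entry_is_expected` and `callstack_contains`, the `arch` string argument, and the plain address comparison (standing in for the real symbol-based filter match) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: on powerpc, the 2nd and 3rd callstack entries
 * (indices 1 and 2) may legitimately be 0, so a zero there is tolerated.
 */
static bool zero_entry_is_expected(const char *arch, size_t idx)
{
	return strcmp(arch, "powerpc") == 0 && (idx == 1 || idx == 2);
}

/*
 * Hypothetical stand-in for the filter walk: returns true if any entry in
 * callstack[0..depth) equals wanted. A 0 entry ends the walk, except where
 * zero_entry_is_expected() says it should be skipped instead.
 */
static bool callstack_contains(const char *arch, const uint64_t *callstack,
			       size_t depth, uint64_t wanted)
{
	assert(arch != NULL);

	if (callstack == NULL)
		return false;

	for (size_t i = 0; i < depth; i++) {
		if (callstack[i] == 0) {
			if (zero_entry_is_expected(arch, i))
				continue;	/* keep looking past the hole */
			break;			/* genuine end of the callstack */
		}
		if (callstack[i] == wanted)
			return true;
	}
	return false;
}
```

With the sample callstacks shown earlier, a walk that stops at the first 0 never reaches unix_stream_sendmsg, while the powerpc-aware walk does.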
      
      Result on powerpc after the patch:
      
        [command]# ./perf test 81 -vv
         81: kernel lock contention analysis test                            :
         --- start ---
        test child forked, pid 4570
        Testing perf lock record and perf lock contention
        Testing perf lock contention --use-bpf
        [Skip] No BPF support
        Testing perf lock record and perf lock contention at the same time
        Testing perf lock contention --threads
        Testing perf lock contention --lock-addr
        Testing perf lock contention --type-filter (w/ spinlock)
        Testing perf lock contention --lock-filter (w/ tasklist_lock)
        [Skip] Could not find 'tasklist_lock'
        Testing perf lock contention --callstack-filter (w/ unix_stream)
        Testing perf lock contention --callstack-filter with task aggregation
        Testing perf lock contention CSV output
        [Skip] No BPF support
        test child finished with 0
         ---- end ----
        kernel lock contention analysis test: Ok
      
      Fixes: ebab2916 ("perf lock contention: Support filters for different aggregation")
      Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: maddy@linux.ibm.com
      Cc: atrajeev@linux.vnet.ibm.com
      Link: https://lore.kernel.org/r/20231003092113.252380-1-kjain@linux.ibm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • tools/perf/tests: Fix object code reading to skip address that falls out of text section · 8f5b62a1
      Athira Rajeev authored
      The testcase "Object code reading" fails in some cases
      for the "fs_something" subtest, as shown below:
      
          Reading object code for memory address: 0xc008000007f0142c
          File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
          On file address is: 0x1114cc
          Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
          objdump read too few bytes: 128
          test child finished with -1
      
      This can also be reproduced by running perf record with a
      workload that exercises fs_something() code. In the test
      setup, this exercises xfs code since the root filesystem is xfs.
      
          # perf record ./a.out
          # perf report -v |grep "xfs.ko"
            0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  0xc008000007de5efc B [k] xlog_cil_commit
            0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  0xc008000007d5ae18 B [k] xfs_btree_key_offset
            0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  0xc008000007e11fd4 B [k] 0x0000000000112074
      
      Here the address "0xc008000007e11fd4" is not resolved. Since this is a
      kernel module, its offset is relative to the DSO. The xfs module is loaded
      at 0xc008000007d00000:
      
         # cat /proc/modules | grep xfs
          xfs 2228224 3 - Live 0xc008000007d00000
      
      The size is 0x220000, so it is loaded between 0xc008000007d00000
      and 0xc008000007f20000. From objdump, the text section is:
          text 0010f7bc  0000000000000000 0000000000000000 000000a0 2**4
      
      Hence the ip captured by perf maps to 0x112074, which is:
      (ip - start of module) + 0xa0
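The arithmetic above can be checked with a short sketch. The helper name `module_file_offset` and its parameters are hypothetical, chosen only for this example; the values come from the /proc/modules and objdump output quoted in this message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: translate a sampled ip inside a loaded module into an
 * offset within the module file, given the module load address and the file
 * offset of its text section.
 */
static uint64_t module_file_offset(uint64_t ip, uint64_t mod_start,
				   uint64_t text_file_off)
{
	assert(ip >= mod_start);	/* ip must lie within the module */
	return (ip - mod_start) + text_file_off;
}
```

For ip 0xc008000007e11fd4, module start 0xc008000007d00000 and text file offset 0xa0, this gives 0x111fd4 + 0xa0 = 0x112074.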
      
      This offset 0x112074 falls outside the .text section, which extends up to 0x10f7bc.
      In this case, the module address 0xc008000007e11fd4 points
      to stub instructions. This address range holds the module stubs,
      which are allocated on module load and hence are not part of the DSO offset.
      
      To address this issue in "Object code reading", skip the sample if the
      address falls outside the text section but is within the module end.
      Use the "text_end" member of "struct dso" to do this check.
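A minimal sketch of such a skip check follows. It is not the code from the test itself: `struct dso_sketch` is a hypothetical, reduced stand-in for "struct dso", `skip_sample` is an invented name, and treating an offset equal to text_end as outside the section is an assumption of this example. The caller is assumed to have already established that the address lies within the module's map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of the two "struct dso" fields used here. */
struct dso_sketch {
	bool is_kmod;		/* DSO is a kernel module */
	uint64_t text_end;	/* file offset of the end of the text section */
};

/*
 * Returns true if the sample should be skipped: the DSO is a kernel module
 * and the file offset lies at or beyond the end of its text section, i.e.
 * in the stub range rather than in code present in the module file.
 */
static bool skip_sample(const struct dso_sketch *dso, uint64_t file_off)
{
	assert(dso != NULL);
	return dso->is_kmod && file_off >= dso->text_end;
}
```

For the xfs example, text_end would be 0xa0 + 0x10f7bc = 0x10f85c, so the offset 0x112074 is skipped.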
      
      To address this issue in "perf report", we are exploring the option of
      having the stubs range as part of /proc/kallsyms, so that perf
      report can resolve addresses in the stubs range.
      
      However, this patch uses text_end to skip the stub range for the
      "Object code reading" testcase.
      Reported-by: Disha Goel <disgoel@linux.ibm.com>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: maddy@linux.ibm.com
      Cc: disgoel@linux.vnet.ibm.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20230928075213.84392-3-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • tools/perf: Add "is_kmod" to struct dso to check if it is kernel module · 6be5d828
      Athira Rajeev authored
      Update "struct dso" to include a new member "is_kmod".
      This new field indicates whether the file is a kernel
      module or not.
      
      To resolve the address from a sample, perf looks at the
      DSO maps. For addresses from a kernel module, some
      addresses were found to be unresolved. This was
      observed while running the perf test "Object code reading".
      Though the ip falls between the start address of the loaded
      module (perf map->start) and the end address (perf map->end),
      it was unresolved.
      
      This happened because, in some cases for kernel
      modules, the address from the sample points to stub instructions.
      To identify whether the DSO is a kernel module, the new field
      "is_kmod" is added to "struct dso".
      Reported-by: Disha Goel <disgoel@linux.ibm.com>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: kjain@linux.ibm.com
      Cc: maddy@linux.ibm.com
      Cc: disgoel@linux.vnet.ibm.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20230928075213.84392-2-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • tools/perf: Add text_end to "struct dso" to save .text section size · 26a5262d
      Athira Rajeev authored
      Update "struct dso" to include a new member "text_end".
      This new field represents the offset of the end of the text
      section for a dso. For ELF, this value is derived as:
      sh_size (size of section in bytes) + sh_offset (section file
      offset) of the ELF section header for text.
      
      For bfd, this value is derived as:
      1. For PE file,
      section->size + ( section->vma - dso->text_offset)
      2. Other cases:
      section->filepos (file position) + section->size (size of
      section)
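The three derivations above amount to the following arithmetic. This is a sketch only: the function names are hypothetical, and the inputs are passed as plain integers rather than read from real ELF or bfd structures.

```c
#include <assert.h>
#include <stdint.h>

/* ELF case: sh_offset + sh_size of the text section header. */
static uint64_t text_end_elf(uint64_t sh_offset, uint64_t sh_size)
{
	assert(sh_size <= UINT64_MAX - sh_offset);	/* guard against wraparound */
	return sh_offset + sh_size;
}

/* bfd, PE file: section->size + (section->vma - dso->text_offset). */
static uint64_t text_end_bfd_pe(uint64_t size, uint64_t vma,
				uint64_t text_offset)
{
	return size + (vma - text_offset);
}

/* bfd, other cases: section->filepos + section->size. */
static uint64_t text_end_bfd(uint64_t filepos, uint64_t size)
{
	return filepos + size;
}
```

For the xfs module discussed in this message (text size 0x10f7bc at file offset 0xa0), the ELF form gives 0xa0 + 0x10f7bc = 0x10f85c.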
      
      To resolve the address from a sample, perf looks at the
      DSO maps. For addresses from a kernel module, some
      addresses were found to be unresolved. This was
      observed while running the perf test "Object code reading".
      Though the ip falls between the start address of the loaded
      module (perf map->start) and the end address (perf map->end),
      it was unresolved.
      
      Example:
      
          Reading object code for memory address: 0xc008000007f0142c
          File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
          On file address is: 0x1114cc
          Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
          objdump read too few bytes: 128
          test child finished with -1
      
      Here, the module is loaded at:
          # cat /proc/modules | grep xfs
          xfs 2228224 3 - Live 0xc008000007d00000
      
      From objdump for the xfs module, the text section is:
          text 0010f7bc  0000000000000000 0000000000000000 000000a0 2**4
      
      Here the offset for 0xc008000007f0142c, i.e. 0x112074, falls outside the
      .text section, which extends up to 0x10f7bc.
      
      In this case, the module address 0xc008000007e11fd4 points
      to stub instructions. This address range holds the module stubs,
      which are allocated on module load and hence are not part of the DSO offset.
      
      To identify such addresses, which fall outside the text
      section but within the module end, add the new field "text_end" to
      "struct dso".
      Reported-by: Disha Goel <disgoel@linux.ibm.com>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: maddy@linux.ibm.com
      Cc: disgoel@linux.vnet.ibm.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20230928075213.84392-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf test: Avoid system wide when not privileged · 0ddce121
      Ian Rogers authored
      Switch the test program to sleep, which makes more sense for system-wide
      events. Only enable system-wide mode when running as root or when not
      paranoid. This avoids failures under some testing conditions, like ARM cloud.
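The "root or not paranoid" condition could look roughly like the sketch below. The function name is hypothetical, and the threshold is an assumption of this example: it follows the kernel's documented meaning of perf_event_paranoid, where values of 1 and above disallow CPU event access for unprivileged users. The exact condition used by the patch is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical check: system-wide events are attempted only when running
 * as root (euid 0) or when perf_event_paranoid is 0 or lower.
 */
static bool system_wide_allowed(unsigned int euid, int paranoid)
{
	return euid == 0 || paranoid <= 0;
}
```

A caller would pass the effective user id and the integer read from /proc/sys/kernel/perf_event_paranoid.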
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20230930060206.2353141-1-irogers@google.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>