1. 28 Aug, 2024 1 commit
  2. 26 Aug, 2024 1 commit
  3. 22 Aug, 2024 6 commits
    • perf python: Disable -Wno-cast-function-type-mismatch if present on clang · 00dc5146
      Arnaldo Carvalho de Melo authored
      The -Wcast-function-type-mismatch option was introduced in clang 19 and
      is enabled by default. Since we use -Werror, and the python bindings do
      casts that are valid but trip this warning, disable it if present.
      
      Closes: https://lore.kernel.org/all/CA+icZUXoJ6BS3GMhJHV3aZWyb5Cz2haFneX0C5pUMUUhG-UVKQ@mail.gmail.com
      Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org # To allow building with the upcoming clang 19
      Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf python: Allow checking for the existence of warning options in clang · b8116230
      Arnaldo Carvalho de Melo authored
      We'll need to check if a warning option introduced in clang 19 is
      available on the clang version being used, so cover the error message
      emitted when testing for a -W option.
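
A minimal sketch of this kind of check, assuming the compiler's stderr has already been captured as a list of byte strings. The function name `option_is_accepted` and the exact marker strings are illustrative assumptions, not necessarily what the perf build script uses.

```python
# Messages a compiler may print when it does not recognize an option.
# The exact strings are assumptions made for this illustration.
REJECTION_MARKERS = (
    b"unknown argument",
    b"is not supported",
    b"unknown warning option",
)


def option_is_accepted(stderr_lines):
    """Return True when no captured stderr line reports the option as unknown."""
    return not any(
        marker in line for line in stderr_lines for marker in REJECTION_MARKERS
    )


# Hypothetical captured output: a rejected -W option, and a clean run.
rejected = [b"warning: unknown warning option '-Wno-cast-function-type-mismatch'\n"]
accepted = []
```

Keeping the decision in a function over captured output makes it easy to cover a new message: only the marker tuple changes.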

      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Copy back variable types after move · 1cfd01eb
      Namhyung Kim authored
      In some cases, compilers don't set the location expression in DWARF
      precisely.  For instance, a compiler may assign a variable to a register
      after copying it from a different register.  The code should then use the
      new register for the new type, but it keeps using the old register.  This
      makes it hard to track the type information properly.
      
      This is an example I found in __tcp_transmit_skb().  The first argument
      (sk) of this function is a pointer to sock and there's a variable (tp)
      for tcp_sock.
      
        static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
        				int clone_it, gfp_t gfp_mask, u32 rcv_nxt)
        {
        	...
        	struct tcp_sock *tp;
      
        	BUG_ON(!skb || !tcp_skb_pcount(skb));
        	tp = tcp_sk(sk);
        	prior_wstamp = tp->tcp_wstamp_ns;
        	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
        	...
      
      So it basically calls tcp_sk(sk) to get the tcp_sock pointer from sk.
      But it turned out to be the same value because tcp_sock embeds sock as
      the first member.  The sk is located in reg5 (RDI) and tp is in reg3
      (RBX).  The offset of tcp_wstamp_ns is 0x748 and tcp_clock_cache is
      0x750.  So you need to use RBX (reg3) to access the fields in the
      tcp_sock.  But the code used RDI (reg5) as it has the same value.
      
        $ pahole --hex -C tcp_sock vmlinux | grep -e 748 -e 750
      	u64                tcp_wstamp_ns;        /* 0x748   0x8 */
      	u64                tcp_clock_cache;      /* 0x750   0x8 */
      
      And this is the disassembly of the part of the function.
      
        <__tcp_transmit_skb>:
        ...
        44:  mov    %rdi, %rbx
        47:  mov    0x748(%rdi), %rsi
        4e:  mov    0x750(%rdi), %rax
        55:  cmp    %rax, %rsi
      
      Because the compiler attached the debug info to RBX, perf only knows that
      RDI is a pointer to sock, and accessing those two fields resulted in an
      error because the offset is beyond the type size.
      
        -----------------------------------------------------------
        find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
        CU for net/ipv4/tcp_output.c (die:0x817f543)
        frame base: cfa=0 fbreg=6
        scope: [1/1] (die:81aac3e)
        bb: [0 - 30]
        var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
        var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
        var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
        var [5] reg1 type='int' size=0x4 (die:0x818059e)
        var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
        var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)                   <<<--- the first argument ('sk' at %RDI)
        mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6)
        mov [20] stack canary -> reg0
        mov [29] reg0 -> -0x30(stack) stack canary
        bb: [36 - 3e]
        mov [36] reg4 -> reg15 type='struct sk_buff*' size=0x8 (die:0x8181360)
        bb: [44 - 63]
        mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c)          <<<--- calling tcp_sk()
        var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)              <<<--- new variable ('tp' at %RBX)
        var [4e] reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
        mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
        chk [63] reg5 offset=0x748 ok=1 kind=1 (struct sock*) : offset bigger than size    <<<--- access with old variable
        final result: offset bigger than size
      
      While it's a fault in the compiler, we can work around this issue by
      using the type of the new variable when it's copied directly.  So I've
      added a copied_from field to the register state to track those direct
      register-to-register copies.  When the new register later gets a new type
      and the old register still has the same type, perf updates the type of
      the old register (copies it back).

      For example, if we can update the type of reg5 at __tcp_transmit_skb+0x47,
      we can find the target type of the instruction at 0x63 like below:
      
        -----------------------------------------------------------
        find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
        ...
        bb: [44 - 63]
        mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c)
        var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)
        var [47] copyback reg5 type='struct tcp_sock*' size=0x8 (die:0x819eead)     <<<--- here
        mov [47] 0x748(reg5) -> reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
        mov [4e] 0x750(reg5) -> reg0 type='unsigned long long' size=0x8 (die:0x8180edd)
        mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
        chk [63] reg5 offset=0x748 ok=1 kind=1 (struct tcp_sock*) : Good!           <<<--- new type
        found by insn track: 0x748(reg5) type-offset=0x748
        final result:  type='struct tcp_sock' size=0xa98 (die:0x819eeb2)
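
The copy-back rule described above can be sketched as a small model. This is an illustration under stated assumptions, not perf's implementation: `RegState`, `move_reg` and `set_var_type` are hypothetical names, and only `copied_from` comes from the commit message.

```python
class RegState:
    """Type tracked for one register (hypothetical, simplified model)."""

    def __init__(self, type_name=None, copied_from=None):
        self.type_name = type_name
        # Source register of a direct register-to-register move, if any.
        self.copied_from = copied_from


def move_reg(regs, src, dst):
    """Model 'mov %src, %dst': dst inherits src's type and remembers its origin."""
    regs[dst] = RegState(regs[src].type_name, copied_from=src)


def set_var_type(regs, reg, new_type):
    """Model a new variable appearing in `reg`, copying the type back if possible."""
    state = regs[reg]
    src = state.copied_from
    # Copy back only while the source still holds the type it had at the move.
    if src is not None and regs[src].type_name == state.type_name:
        regs[src].type_name = new_type
    state.type_name = new_type


regs = {5: RegState("struct sock*")}
move_reg(regs, src=5, dst=3)               # mov [44] reg5 -> reg3
set_var_type(regs, 3, "struct tcp_sock*")  # var [47] reg3, copied back to reg5
```

After these two steps both reg3 and reg5 carry `struct tcp_sock*`, so an access at offset 0x748 through reg5 fits within the type.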

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240821232628.353177-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Update stack slot for the store · 895891da
      Namhyung Kim authored
      When checking the matching variable at the target instruction, there
      might not be any type information if it's the first write to a stack
      slot.  In this case the instruction could be spilling a register value
      onto the stack, so the type info is in the source operand.

      But currently it's hard to get the operand from the checking function.
      Let's process the instruction and retry getting the type info from the
      stack if there's no information already.

      This is an example from __tcp_transmit_skb().  The instructions are:
      
        <__tcp_transmit_skb>:
         0: nopl   0x0(%rax, %rax, 1)
         5: push   %rbp
         6: mov    %rsp, %rbp
         9: push   %r15
         b: push   %r14
         d: push   %r13
         f: push   %r12
        11: push   %rbx
        12: sub    $0x98, %rsp
        19: mov    %r8d, -0xa8(%rbp)
        ...
      
      It cannot find any variable at -0xa8(%rbp) at this point.
        -----------------------------------------------------------
        find data type for -0xa8(reg6) at __tcp_transmit_skb+0x19
        CU for net/ipv4/tcp_output.c (die:0x817f543)
        frame base: cfa=0 fbreg=6
        scope: [1/1] (die:81aac3e)
        bb: [0 - 19]
        var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
        var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
        var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
        var [5] reg1 type='int' size=0x4 (die:0x818059e)
        var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
        var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)
        chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : no type information
        no type information
      
      And it was able to find the type after processing the 'mov' instruction.
        -----------------------------------------------------------
        find data type for -0xa8(reg6) at __tcp_transmit_skb+0x19
        CU for net/ipv4/tcp_output.c (die:0x817f543)
        frame base: cfa=0 fbreg=6
        scope: [1/1] (die:81aac3e)
        bb: [0 - 19]
        var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
        var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
        var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
        var [5] reg1 type='int' size=0x4 (die:0x818059e)
        var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
        var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)
        chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : retry                    <<<--- here
        mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6)
        chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : Good!
        found by insn track: -0xa8(reg6) type-offset=0
        final result:  type='unsigned int' size=0x4 (die:0x8180ed6)
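
The retry step can be sketched as follows. This is a simplified model with hypothetical names (`find_stack_type`, `pending_store`); it only illustrates the idea of applying the store before looking the slot up again.

```python
def find_stack_type(stack_types, offset, pending_store=None):
    """Look up the type at a stack offset, retrying once after applying a store.

    stack_types: dict mapping a stack offset to a type name
    pending_store: optional (offset, type_name) written by the target instruction
    """
    found = stack_types.get(offset)
    if found is None and pending_store is not None:
        # First write to this slot: process the store, then retry the lookup.
        store_offset, store_type = pending_store
        stack_types[store_offset] = store_type
        found = stack_types.get(offset)
    return found


stack = {-0x98: "struct tcp_out_options"}
# mov [19] reg8 -> -0xa8(stack), where reg8 holds an 'unsigned int'
result = find_stack_type(stack, -0xa8, pending_store=(-0xa8, "unsigned int"))
```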

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240821232628.353177-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Update debug messages · a0d57c60
      Namhyung Kim authored
      In check_matching_type(), it'd be easier to display the type name in
      question if it's available.
      
      For example, check out the line that starts with 'chk'.
        -----------------------------------------------------------
        find data type for 0x10(reg0) at cpuacct_charge+0x13
        CU for kernel/sched/build_utility.c (die:0x137ee0b)
        frame base: cfa=1 fbreg=7
        scope: [3/3] (die:13d9632)
        bb: [c - 13]
        var [c] reg5 type='struct task_struct*' size=0x8 (die:0x1381230)
        mov [c] 0xdf8(reg5) -> reg0 type='struct css_set*' size=0x8 (die:0x1385c56)
        chk [13] reg0 offset=0x10 ok=1 kind=1 (struct css_set*) : Good!         <<<--- here
        found by insn track: 0x10(reg0) type-offset=0x10
        final result:  type='struct css_set' size=0x250 (die:0x1385b0e)
      
      Another example:
        -----------------------------------------------------------
        find data type for 0x8(reg0) at menu_select+0x279
        CU for drivers/cpuidle/governors/menu.c (die:0x7b0fe79)
        frame base: cfa=1 fbreg=7
        scope: [2/2] (die:7b11010)
        bb: [273 - 277]
        bb: [279 - 279]
        chk [279] reg0 offset=0x8 ok=0 kind=0 cfa : no type information
        scope: [1/2] (die:7b10cbc)
        bb: [0 - 64]
        ...
        mov [26a] imm=0xffffffff -> reg15
        bb: [273 - 277]
        bb: [279 - 279]
        chk [279] reg0 offset=0x8 ok=1 kind=1 (long long unsigned int) : no/void pointer    <<<--- here
        final result: no/void pointer
      
      Also change some places to print negative offsets properly.
      
      Before:
        -----------------------------------------------------------
        find data type for 0xffffff40(reg6) at __tcp_transmit_skb+0x58
      
      After:
        -----------------------------------------------------------
        find data type for -0xc0(reg6) at __tcp_transmit_skb+0x58
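
The before/after difference comes down to interpreting the raw value as a signed quantity before printing it. A minimal sketch, assuming a 32-bit two's-complement offset (the width and the helper name `format_offset` are assumptions for illustration):

```python
def format_offset(raw, bits=32):
    """Render a raw two's-complement offset as signed hex, e.g. -0xc0."""
    raw &= (1 << bits) - 1
    if raw >= 1 << (bits - 1):
        # The top bit is set, so the value is negative in two's complement.
        raw -= 1 << bits
    return f"-{-raw:#x}" if raw < 0 else f"{raw:#x}"


before = f"{0xffffff40:#x}"        # printed as an unsigned value
after = format_offset(0xffffff40)  # printed as a signed value
```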

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240821232628.353177-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf dwarf-aux: Handle bitfield members from pointer access · a11b4222
      Namhyung Kim authored
      The __die_find_member_offset_cb() failed to handle bitfield members,
      which don't have DW_AT_data_member_location.  As when adding member
      types in __add_member_cb(), it should fall back to checking the bit
      offset when it resolves the member type for an offset.
      
      Fixes: 437683a9 ("perf dwarf-aux: Handle type transfer for memory access")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240821232628.353177-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 21 Aug, 2024 6 commits
    • perf annotate-data: Add 'typecln' sort key · fd45d52e
      Namhyung Kim authored
      Sometimes it's useful to organize member fields by cache-line boundary.
      
      The 'typecln' sort key is short for type-cacheline and shows samples
      in each cache line.  The cache-line size is fixed to 64 for now, but it
      can read the actual size once it saves the value from sysfs.
      
      For example, you may want to know which cache line in a target is hot
      or cold.  The following shows members in the cfs_rq's first cache line.
      
        $ perf report -s type,typecln,typeoff -H
        ...
        -    2.67%        struct cfs_rq
           +    1.23%        struct cfs_rq: cache-line 2
           +    0.57%        struct cfs_rq: cache-line 4
           +    0.46%        struct cfs_rq: cache-line 6
           -    0.41%        struct cfs_rq: cache-line 0
                   0.39%        struct cfs_rq +0x14 (h_nr_running)
                   0.02%        struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)
        ...
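
The grouping amounts to dividing each member offset by the cache-line size. A short sketch using the fixed size of 64 mentioned above; the helper names are hypothetical and the offsets are taken from the sample outputs in this commit message:

```python
CACHELINE_SIZE = 64  # fixed size mentioned in the commit message


def cacheline_of(offset, size=CACHELINE_SIZE):
    """Return the index of the cache line that contains a member offset."""
    return offset // size


def group_by_cacheline(offsets):
    """Group member offsets by the cache line they fall into."""
    groups = {}
    for off in offsets:
        groups.setdefault(cacheline_of(off), []).append(off)
    return groups


# 0x14 and 0x38 share cache line 0; 0x58 and 0x78 share cache line 1.
groups = group_by_cacheline([0x14, 0x38, 0x58, 0x78])
```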
      
      Committer testing:
      
        # root@number:~# perf report -s type,typecln,typeoff -H --stdio
        # Total Lost Samples: 0
        #
        # Samples: 5K of event 'cpu_atom/mem-loads,ldlat=5/P'
        # Event count (approx.): 312251
        #
        #       Overhead  Data Type / Data Type Cacheline / Data Type Offset
        # ..............  ..................................................
        #
        <SNIP>
             0.07%        struct sigaction
                0.05%        struct sigaction: cache-line 1
                   0.02%        struct sigaction +0x58 (sa_mask)
                   0.02%        struct sigaction +0x78 (sa_mask)
                0.03%        struct sigaction: cache-line 0
                   0.02%        struct sigaction +0x38 (sa_mask)
                   0.01%        struct sigaction +0x8 (sa_mask)
        <SNIP>

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240819233603.54941-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Show offset and size in hex · 7a5c2170
      Namhyung Kim authored
      It'd be better to have them in hex to check cacheline alignment.
      
       Percent     offset       size  field
        100.00          0      0x1c0  struct cfs_rq    {
          0.00          0       0x10      struct load_weight  load {
          0.00          0        0x8          long unsigned int       weight;
          0.00        0x8        0x4          u32     inv_weight;
                                          };
          0.00       0x10        0x4      unsigned int        nr_running;
         14.56       0x14        0x4      unsigned int        h_nr_running;
          0.00       0x18        0x4      unsigned int        idle_nr_running;
          0.00       0x1c        0x4      unsigned int        idle_h_nr_running;
        ...
      
      Committer notes:
      
      Justification from Namhyung when asked about why it would be "better":
      
      Cache line sizes are power of 2 so it'd be natural to use hex and
      check whether an offset is in the same boundary.  Also 'perf annotate'
      shows instruction offsets in hex.
      
      >
      > Maybe this should be selectable?
      
      I can add an option and/or a config if you want.

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240819233603.54941-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Remove redundant check that map is NULL · ce66d7c7
      Yang Ruibin authored
      The check for a NULL map is already done in bpf_map__fd(map), which
      returns an errno in that case, so no further checks are run.
      
      In addition, even if the check for map were run, it would return a
      pointer, which is not consistent with the error number returned by
      bpf_map__fd(map).

      Signed-off-by: Yang Ruibin <11162571@vivo.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: opensource.kernel@vivo.com
      Link: https://lore.kernel.org/r/20240821101500.4568-1-11162571@vivo.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Fix percpu pointer check · 4d6d6e0f
      Namhyung Kim authored
      In check_matching_type(), it checks the type state of the register in
      the wrong order.  When it's a percpu pointer, it should check the type
      for the pointer, but it checked the CFA bit first and concluded there was
      no type in the stack slot.  This resulted in no type info.
      
        -----------------------------------------------------------
        find data type for 0x28(reg1) at hrtimer_reprogram+0x88
        CU for kernel/time/hrtimer.c (die:0x18f219f)
        frame base: cfa=1 fbreg=7
        ...
        add [72] percpu 0x24500 -> reg1 pointer type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)
        bb: [7a - 7e]
        bb: [80 - 86]                        (here)
        bb: [88 - 88]                         vvv
        chk [88] reg1 offset=0x28 ok=1 kind=4 cfa : no type information
        no type information
      
      Here, the instruction at 0x72 found that reg1 holds a (percpu) pointer
      and got the correct type.  But when checking the final result, it wrongly
      treated it as a stack variable because it checked the CFA bit first.
      
      After changing the order of state check:
        -----------------------------------------------------------
        find data type for 0x28(reg1) at hrtimer_reprogram+0x88
        CU for kernel/time/hrtimer.c (die:0x18f219f)
        frame base: cfa=1 fbreg=7
        ...                                     (here)
                                              vvvvvvvvvv
        chk [88] reg1 offset=0x28 ok=1 kind=4 percpu ptr : Good!
        found by insn track: 0x28(reg1) type-offset=0x28
        final type: type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240821065408.285548-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Prefer struct/union over base type · 4a32a972
      Namhyung Kim authored
      Sometimes a compound type can have a single field and the size is the
      same as the base type.  But the compound type is still preferred, as a
      struct or union could carry more information than the base type.
      
      Also put a slight priority on the typedef for the same reason.

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240821065408.285548-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Fix missing constant copy · 922ec313
      Namhyung Kim authored
      I found that it failed to copy the immediate constant when it moves the
      register value.  This could result in wrong type inference since the
      address for the per-cpu variable would always be 0.
      
      Fixes: eb9190af ("perf annotate-data: Handle ADD instructions")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240821065408.285548-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 20 Aug, 2024 3 commits
    • perf cap: Tidy up and improve capability testing · e25ebda7
      Ian Rogers authored
      Remove dependence on libcap. libcap is only used to query whether a
      capability is supported, which is just 1 capget system call.
      
      If the capget system call fails, fall back on root permission
      checking. Previously, if libcap failed, the permission was assumed
      not to be present, which may be pessimistic/wrong.
      
      Add a used_root out argument to perf_cap__capable to say whether the
      fall back root check was used. This allows the correct error message,
      "root" vs "users with the CAP_PERFMON or CAP_SYS_ADMIN capability", to
      be selected.
      
      Tidy uses of perf_cap__capable so that tests aren't repeated if capget
      isn't supported.
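
The fallback logic can be sketched as below. This is a model rather than the perf code: the capability query is passed in as a callable standing in for capget, and `capable` and `unsupported` are hypothetical names. `CAP_PERFMON` uses its Linux capability number, 38.

```python
CAP_PERFMON = 38  # Linux capability number for CAP_PERFMON


def capable(cap_bit, read_effective_caps, euid):
    """Return (has_capability, used_root).

    read_effective_caps: callable returning the effective capability bitmask,
    or raising OSError when the query is unsupported (stand-in for capget).
    """
    try:
        effective = read_effective_caps()
    except OSError:
        # Query failed: fall back to the root check and report that it was used.
        return euid == 0, True
    return bool(effective & (1 << cap_bit)), False


def unsupported():
    """Simulate a system where the capability query fails."""
    raise OSError("capability query not supported")


with_cap = capable(CAP_PERFMON, lambda: 1 << CAP_PERFMON, euid=1000)
root_fallback = capable(CAP_PERFMON, unsupported, euid=0)
user_fallback = capable(CAP_PERFMON, unsupported, euid=1000)
```

Returning the `used_root` flag alongside the result lets a caller pick between the "root" wording and the capability wording for its error message.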

      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240806220614.831914-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Set bitfield member offset and size properly · 8b1042c4
      Namhyung Kim authored
      The bitfield members might not have DW_AT_data_member_location.  Let's
      use DW_AT_data_bit_offset to set the member offset correctly.  Also use
      DW_AT_bit_size for the name, like in a C program.
      
      Before:
        Annotate type: 'struct sk_buff' (1 samples)
              Percent     Offset       Size  Field
        -      100.00          0        232  struct sk_buff {
        +        0.00          0         24      union  ;
        +        0.00         24          8      union  ;
        +        0.00         32          8      union  ;
                 0.00         40         48      char[] cb;
        +        0.00         88         16      union  ;
                 0.00        104          8      long unsigned int      _nfct;
               100.00        112          4      unsigned int   len;
                 0.00        116          4      unsigned int   data_len;
                 0.00        120          2      __u16  mac_len;
                 0.00        122          2      __u16  hdr_len;
                 0.00        124          2      __u16  queue_mapping;
                 0.00        126          0      __u8[] __cloned_offset;
                 0.00          0          1      __u8   cloned;
                 0.00          0          1      __u8   nohdr;
                 0.00          0          1      __u8   fclone;
                 0.00          0          1      __u8   peeked;
                 0.00          0          1      __u8   head_frag;
                 0.00          0          1      __u8   pfmemalloc;
                 0.00          0          1      __u8   pp_recycle;
                 0.00        127          1      __u8   active_extensions;
        +        0.00        128         60      union  ;
                 0.00        188          4      sk_buff_data_t tail;
                 0.00        192          4      sk_buff_data_t end;
                 0.00        200          8      unsigned char* head;
      
      After:
      
        Annotate type: 'struct sk_buff' (1 samples)
              Percent     Offset       Size  Field
        -      100.00          0        232  struct sk_buff {
        +        0.00          0         24      union  ;
        +        0.00         24          8      union  ;
        +        0.00         32          8      union  ;
                 0.00         40         48      char[] cb
        +        0.00         88         16      union  ;
                 0.00        104          8      long unsigned int      _nfct;
               100.00        112          4      unsigned int   len;
                 0.00        116          4      unsigned int   data_len;
                 0.00        120          2      __u16  mac_len;
                 0.00        122          2      __u16  hdr_len;
                 0.00        124          2      __u16  queue_mapping;
                 0.00        126          0      __u8[] __cloned_offset;
                 0.00        126          1      __u8   cloned:1;
                 0.00        126          1      __u8   nohdr:1;
                 0.00        126          1      __u8   fclone:2;
                 0.00        126          1      __u8   peeked:1;
                 0.00        126          1      __u8   head_frag:1;
                 0.00        126          1      __u8   pfmemalloc:1;
                 0.00        126          1      __u8   pp_recycle:1;
                 0.00        127          1      __u8   active_extensions;
        +        0.00        128         60      union  ;
                 0.00        188          4      sk_buff_data_t tail;
                 0.00        192          4      sk_buff_data_t end;
                 0.00        200          8      unsigned char* head;
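
The change from offset 0 to offset 126 for the bitfields follows from deriving the byte offset from the bit offset. A minimal sketch, where `member_layout` is a hypothetical helper and the bit offsets 1008 and 1010 are assumed values chosen to be consistent with byte offset 126 in the table, not figures taken from the DWARF data:

```python
def member_layout(name, data_member_location=None, data_bit_offset=None,
                  bit_size=None):
    """Return (byte_offset, label) for a struct member.

    Falls back to the bit offset when the member has no byte location,
    and appends ':<bits>' to the name for bitfields.
    """
    if data_member_location is not None:
        offset = data_member_location
    elif data_bit_offset is not None:
        offset = data_bit_offset // 8  # byte that holds the first bit
    else:
        offset = 0
    label = f"{name}:{bit_size}" if bit_size is not None else name
    return offset, label


cloned = member_layout("cloned", data_bit_offset=1008, bit_size=1)
fclone = member_layout("fclone", data_bit_offset=1010, bit_size=2)
length = member_layout("len", data_member_location=112)
```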
      
      Committer notes:
      
      Collect some data:
      
        root@number:~# perf mem record -a --ldlat 5 -- ping -s 8193 -f 192.168.86.1
        Memory events are enabled on a subset of CPUs: 16-27
        PING 192.168.86.1 (192.168.86.1) 8193(8221) bytes of data.
        .^C
        --- 192.168.86.1 ping statistics ---
        13881 packets transmitted, 13880 received, 0.00720409% packet loss, time 8664ms
        rtt min/avg/max/mdev = 0.510/0.599/7.768/0.115 ms, ipg/ewma 0.624/0.593 ms
        [ perf record: Woken up 8 times to write data ]
        [ perf record: Captured and wrote 14.877 MB perf.data (46785 samples) ]
      
        root@number:~#
        root@number:~# perf evlist
        cpu_atom/mem-loads,ldlat=5/P
        cpu_atom/mem-stores/P
        dummy:u
        root@number:~# perf evlist -v
        cpu_atom/mem-loads,ldlat=5/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x7
        cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1
        dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
        root@number:~#
      
      OK, now let's see what changes between before and after this patch:
      
        root@number:~# perf annotate --data-type > /tmp/before
      
      Apply the patch, build:
      
        root@number:~# perf annotate --data-type > /tmp/after
      
      The first hunk of the diff is for a glib data structure, in userspace.
      Look at those bitfields:
      
        root@number:~# diff -u10 /tmp/before /tmp/after | head -20
        --- /tmp/before	2024-08-20 17:29:58.306765780 -0300
        +++ /tmp/after	2024-08-20 17:33:13.210582596 -0300
        @@ -163,22 +163,22 @@
      
         Annotate type: 'GHashTable' in /usr/lib64/libglib-2.0.so.0.8000.3 (1 samples):
         ============================================================================
          Percent     offset       size  field
           100.00          0         96  GHashTable	 {
             0.00          0          8      gsize	size;
             0.00          8          4      gint	mod;
           100.00         12          4      guint	mask;
             0.00         16          4      guint	nnodes;
             0.00         20          4      guint	noccupied;
        -    0.00          0          4      guint	have_big_keys;
        -    0.00          0          4      guint	have_big_values;
        +    0.00         24          1      guint	have_big_keys:1;
        +    0.00         24          1      guint	have_big_values:1;
             0.00         32          8      gpointer	keys;
             0.00         40          8      guint*	hashes;
             0.00         48          8      gpointer	values;
        root@number:~#
      
      As advertised :-)
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240815223823.2402285-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8b1042c4
    • Arnaldo Carvalho de Melo's avatar
      perf daemon: Fix the build on more 32-bit architectures · 6236ebe0
      Arnaldo Carvalho de Melo authored
      The previous attempt fixed the build on debian:experimental-x-mipsel,
      but when building on a larger set of containers I noticed it broke the
      build on some other 32-bit architectures such as:
      
        42     7.87 ubuntu:18.04-x-arm            : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
          builtin-daemon.c: In function 'cmd_session_list':
          builtin-daemon.c:692:16: error: format '%llu' expects argument of type 'long long unsigned int', but argument 4 has type 'long int' [-Werror=format=]
             fprintf(out, "%c%" PRIu64,
                          ^~~~~
          builtin-daemon.c:694:13:
              csv_sep, (curr - daemon->start) / 60);
                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~
          In file included from builtin-daemon.c:3:0:
          /usr/arm-linux-gnueabihf/include/inttypes.h:105:34: note: format string is defined here
           # define PRIu64  __PRI64_PREFIX "u"
      
      So let's cast that time_t (32-bit or 64-bit, depending on the
      architecture) to uint64_t to make sure it builds everywhere.
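
      A minimal sketch of the idea, not the actual builtin-daemon.c code: the
      width of time_t varies between architectures, so the expression is cast
      to uint64_t before it is handed to a PRIu64 conversion.  The helper name
      format_minutes() and its parameters are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: write "<sep><elapsed minutes>" into buf.
 * The (uint64_t) cast makes the argument match PRIu64 whether time_t
 * is 32-bit or 64-bit on the target.
 */
static int format_minutes(char *buf, size_t len, char sep,
			  time_t start, time_t curr)
{
	assert(curr >= start);
	return snprintf(buf, len, "%c%" PRIu64, sep,
			(uint64_t)((curr - start) / 60));
}
```

      Without the cast the argument type follows time_t, which is what made
      -Werror=format trip on some 32-bit targets.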
      
      Fixes: 4bbe6002 ("perf daemon: Fix the build on 32-bit architectures")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/ZsPmldtJ0D9Cua9_@x1
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6236ebe0
  6. 19 Aug, 2024 21 commits
    • Namhyung Kim's avatar
      perf test: Add cgroup sampling test · 5cc698ba
      Namhyung Kim authored
      Add it to the record.sh shell test to verify that it tracks cgroup
      information correctly.  It records with the --all-cgroups option and
      checks that the output has PERF_RECORD_CGROUP and that the names are
      not "unknown".
      
        $ sudo ./perf test -vv 95
         95: perf record tests:
        --- start ---
        test child forked, pid 2871922
         169c90-169cd0 g test_loop
        perf does have symbol 'test_loop'
        Basic --per-thread mode test
        Basic --per-thread mode test [Success]
        Register capture test
        Register capture test [Success]
        Basic --system-wide mode test
        Basic --system-wide mode test [Success]
        Basic target workload test
        Basic target workload test [Success]
        Branch counter test
        branch counter feature not supported on all core PMUs (/sys/bus/event_source/devices/cpu) [Skipped]
        Cgroup sampling test
        Cgroup sampling test [Success]
        ---- end(0) ----
         95: perf record tests                                               : Ok
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240818212948.2873156-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5cc698ba
    • Namhyung Kim's avatar
      perf record: Fix sample cgroup & namespace tracking · 3432bae8
      Namhyung Kim authored
      The recent change in 'struct perf_tool' constification broke the cgroup
      and/or namespace tracking by resetting tool fields.  It should set the
      values after perf_tool__init().
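
      The ordering problem can be sketched as follows.  This is a simplified
      illustration under stated assumptions: struct tool_sketch and
      tool_sketch__init() are hypothetical stand-ins for 'struct perf_tool'
      and perf_tool__init(), and only model "init resets every field".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for 'struct perf_tool'; not the real layout. */
struct tool_sketch {
	bool ordered_events;
	bool namespace_events;
	bool cgroup_events;
};

/* Stand-in for the init helper: resets every field to its default. */
static void tool_sketch__init(struct tool_sketch *tool, bool ordered_events)
{
	assert(tool != NULL);
	memset(tool, 0, sizeof(*tool));
	tool->ordered_events = ordered_events;
}

/* Broken order: the init call wipes the fields that were set before it. */
static void setup_broken(struct tool_sketch *tool)
{
	tool->namespace_events = true;
	tool->cgroup_events = true;
	tool_sketch__init(tool, true);
}

/* Fixed order: initialize first, then set the tracking fields. */
static void setup_fixed(struct tool_sketch *tool)
{
	tool_sketch__init(tool, true);
	tool->namespace_events = true;
	tool->cgroup_events = true;
}
```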
      
      Fixes: cecb1cf1 ("perf record: Use perf_tool__init()")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240818212948.2873156-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3432bae8
    • Ian Rogers's avatar
      perf inject: Combine mmap and mmap2 handling · 05c4cfeb
      Ian Rogers authored
      The handling of mmap and mmap2 events is nearly identical. Add a common
      helper function and call it from both event handling functions.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-10-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      05c4cfeb
    • Ian Rogers's avatar
      perf inject: Combine different mmap and mmap2 functions · 048a7a93
      Ian Rogers authored
      There are repipe, build ID and JIT dump variants of the mmap and mmap2
      repipe functions. The organization doesn't allow JIT dump to work with
      build ID injection and the structure is less than clear. Combine the
      functions and select the different behaviors with if statements.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-9-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      048a7a93
    • Ian Rogers's avatar
      perf inject: Combine build_ids and build_id_all into enum · 0ed4c8c3
      Ian Rogers authored
      It is clearer to have a single enum that determines how build IDs are
      injected; it also allows for future extension.
      
      Set the header build ID feature whether lazy or all are generated,
      previously only the lazy case would set it.
      
      Allow parsing of known build IDs for either the lazy or all cases.
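
      A rough sketch of replacing two booleans with one enum.  The enum and
      function names are hypothetical and may not match the patch, and letting
      "all" win when both flags are set is an assumption made for the
      illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the enum in the patch may be spelled differently. */
enum build_id_mode {
	BUILD_ID_MODE__NONE = 0,	/* no build ID injection */
	BUILD_ID_MODE__LAZY,		/* lazy injection */
	BUILD_ID_MODE__ALL,		/* injection for everything */
};

/* Map the two former booleans onto the enum (assumed: "all" wins). */
static enum build_id_mode build_id_mode_from_flags(bool build_ids,
						   bool build_id_all)
{
	if (build_id_all)
		return BUILD_ID_MODE__ALL;
	if (build_ids)
		return BUILD_ID_MODE__LAZY;
	return BUILD_ID_MODE__NONE;
}

/* The header build ID feature is wanted for both the lazy and all cases. */
static bool wants_header_build_id(enum build_id_mode mode)
{
	assert(mode <= BUILD_ID_MODE__ALL);
	return mode != BUILD_ID_MODE__NONE;
}
```

      A single enum also rules out contradictory flag combinations and leaves
      room for further modes.
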
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-8-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0ed4c8c3
    • Ian Rogers's avatar
      perf test: Expand pipe/inject test · a8656614
      Ian Rogers authored
      Test recording of call-graphs and injecting --build-all. Add/expand
      trap handler.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-7-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a8656614
    • Ian Rogers's avatar
      perf evsel: Constify evsel__id_hdr_size() argument · 63c89dc5
      Ian Rogers authored
      Allows evsel__id_hdr_size() to be used when the evsel is const.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      63c89dc5
    • Ian Rogers's avatar
      perf dso: Constify dso_id · e4bb4caa
      Ian Rogers authored
      The passed dso_id is copied and so is never an out argument. Remove
      its mutability.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e4bb4caa
    • Ian Rogers's avatar
      perf jit: Constify filename argument · 0847c193
      Ian Rogers authored
      Make it clearer the argument is just being used as a string.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0847c193
    • Ian Rogers's avatar
      perf map: API clean up · a0310736
      Ian Rogers authored
      map__init() is only used internally so make it static. Assume memory is
      zero initialized, which will better support adding fields to struct
      map in the future and was already the case for map__new2.
      
      To reduce complexity, change set_priv and set_erange_warned to not take
      a value to assign as they always assign true.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a0310736
    • Ian Rogers's avatar
      perf synthetic-events: Avoid unnecessary memset · 2aebebb8
      Ian Rogers authored
      Make sure the memset of a synthesized event only zeros the necessary
      tracing data part of the event, as a full event can be over 4kb in
      size.
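
      The general technique can be sketched like this.  The layout below is
      hypothetical and much simpler than a real synthesized event; it only
      shows zeroing the small fixed part in front of a large buffer instead
      of the whole structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event layout: small fixed part followed by a large buffer. */
struct event_sketch {
	unsigned int type;
	unsigned short misc;
	unsigned short size;
	unsigned long long start;
	unsigned long long len;
	char filename[4096];
};

/*
 * Zero only the fixed-size part in front of the filename buffer rather
 * than the whole (over 4KB) structure, then copy the NUL-terminated name.
 */
static void event_sketch__prepare(struct event_sketch *ev, const char *name)
{
	size_t n = strlen(name);

	assert(n < sizeof(ev->filename));
	memset(ev, 0, offsetof(struct event_sketch, filename));
	memcpy(ev->filename, name, n + 1);
}
```
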
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Casey Chen <cachen@purestorage.com>
      Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Link: https://lore.kernel.org/r/20240817064442.2152089-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2aebebb8
    • Xu Yang's avatar
      perf python: Fix the build on 32-bit arm by including missing "util/sample.h" · 2518e132
      Xu Yang authored
      The 32-bit arm build system will complain:
      
        tools/perf/util/python.c:75:28: error: field ‘sample’ has incomplete type
           75 |         struct perf_sample sample;
      
      However, the arm64 build system doesn't complain about this.
      
      The root cause is that arm64 defines "HAVE_KVM_STAT_SUPPORT := 1" in
      tools/perf/arch/arm64/Makefile, but the arm arch doesn't.  On the arm64
      build system this makes kvm-stat.h include other header files, notably
      "util/sample.h", which util/python.c then picks up indirectly.

      Directly include "util/sample.h" in "util/python.c" to avoid this build
      issue on the arm platform.
      Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: imx@lists.linux.dev
      Link: https://lore.kernel.org/r/20240819023403.201324-1-xu.yang_2@nxp.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2518e132
    • Namhyung Kim's avatar
      perf annotate-data: Update type stat at the end of find_data_type_die() · 023aceec
      Namhyung Kim authored
      After trying all possibilities with DWARF and instruction tracking.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-10-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      023aceec
    • Namhyung Kim's avatar
      perf annotate-data: Check variables in every scope · ba883370
      Namhyung Kim authored
      Sometimes it matches a variable in the inner scope but it fails because
      the actual access can be on a different type.  Let's try variables in
      every scope and choose the best one using is_better_type().
      
      I have an example with update_blocked_averages().  At first it found a
      variable (__mptr) but it's a void pointer.  So it moved on to the upper
      scope and found another variable (cfs_rq).
      
        $ perf --debug type-profile annotate --data-type --stdio
        ...
        -----------------------------------------------------------
        find data type for 0x140(reg14) at update_blocked_averages+0x2db
        CU for kernel/sched/fair.c (die:0x12dd892)
        frame base: cfa=1 fbreg=7
        found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer
         variable location: base=reg14, offset=0x140
         type='void*' size=0x8 (die:0x12dd8f9)
        found "cfs_rq" (die: 0x1301721) in scope=3/4 (die: 0x130171c) type_offset=0x140
         variable location: reg14
         type='struct cfs_rq' size=0x1c0 (die:0x12e37e5)
        final type: type='struct cfs_rq' size=0x1c0 (die:0x12e37e5)
      
      IIUC the scope is like below:
        1: update_blocked_averages
        2:   __update_blocked_fair
        3:     for_each_leaf_cfs_rq_safe
        4:       list_entry -> (container_of)
      
      The container_of is implemented like:
      
        #define container_of(ptr, type, member) ({				\
        	void *__mptr = (void *)(ptr);					\
        	static_assert(__same_type(*(ptr), ((type *)0)->member) ||	\
        		      __same_type(*(ptr), void),			\
        		      "pointer type mismatch in container_of()");	\
        	((type *)(__mptr - offsetof(type, member))); })
      
      That's why we see the __mptr variable first but it failed since it has
      no type information.
      
      Then for_each_leaf_cfs_rq_safe() is defined as
      
        #define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos)			\
        	list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list,	\
        				 leaf_cfs_rq_list)
      
      Note that the access was 0x140(r14), and cfs_rq has leaf_cfs_rq_list
      at offset 0x140.  So it converts the list_head pointer to a pointer to
      struct cfs_rq here.
      
        $ pahole --hex -C cfs_rq vmlinux | grep 140
        struct cfs_rq 	struct list_head           leaf_cfs_rq_list;     /* 0x140  0x10 */
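
      The pointer conversion described above can be illustrated with a small
      user-space sketch.  The structures are hypothetical and far smaller
      than the kernel ones, and container_of_sketch() omits the kernel
      macro's static type check.

```c
#include <assert.h>
#include <stddef.h>

struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Hypothetical structure with an embedded list node at a non-zero offset. */
struct cfs_rq_sketch {
	unsigned long load[4];
	struct list_head_sketch leaf_list;
};

/* Simplified container_of() without the kernel's static type check. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing structure from a pointer to its list node. */
static struct cfs_rq_sketch *entry_from_node(struct list_head_sketch *node)
{
	assert(node != NULL);
	return container_of_sketch(node, struct cfs_rq_sketch, leaf_list);
}
```

      Because the node sits at a fixed offset inside the enclosing structure,
      an access through the enclosing pointer plus that offset lands on the
      node, which matches the offset seen in the annotated instruction.
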
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-9-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ba883370
    • Namhyung Kim's avatar
      perf annotate-data: Add is_better_type() helper · c663451f
      Namhyung Kim authored
      Sometimes more than one variable is located in the same register or
      stack slot.  Or one can overwrite the existing information with another.
      I found this is not helpful in some cases, so it needs to update the
      type information from the variable only if it's better.

      But it's hard to know which one is better, so we need heuristics. :)
      
      As it deals with memory accesses, the location should have a pointer or
      something similar (like array or reference).  So if it had an integer
      type and a variable is a pointer, we can take the variable's type to
      resolve the target of the access.
      
      If it has a pointer type and a variable with the same location has a
      different pointer type, it'll take the one with the bigger target type.
      This can be useful when the target type embeds a smaller type (like list
      header or RB-tree node) at the beginning so their locations are the same.
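
      The two rules can be sketched as below.  This is only an illustration:
      struct type_sketch is a hypothetical, simplified descriptor (the patch
      works on DWARF DIEs), and the real helper may handle more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type descriptor; not the real data structure. */
struct type_sketch {
	bool is_pointer;	/* pointer or something similar (array, ...) */
	size_t target_size;	/* size of the pointed-to type */
};

/*
 * Return true if 'new_type' should replace 'old_type':
 *  - a pointer-like type beats a non-pointer type,
 *  - between two pointers, the one with the bigger target type wins.
 */
static bool is_better_type_sketch(const struct type_sketch *old_type,
				  const struct type_sketch *new_type)
{
	assert(old_type != NULL && new_type != NULL);

	if (!new_type->is_pointer)
		return false;
	if (!old_type->is_pointer)
		return true;
	return new_type->target_size > old_type->target_size;
}
```
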
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-8-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c663451f
    • Namhyung Kim's avatar
      perf annotate-data: Add is_pointer_type() helper · 98d1f1dc
      Namhyung Kim authored
      It treats pointers and arrays in the same way.  Let's add the helper and
      use it when it checks if it needs a pointer.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-7-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      98d1f1dc
    • Namhyung Kim's avatar
      perf annotate-data: Change return type of find_data_type_block() · 69e2c784
      Namhyung Kim authored
      So that it can return enum variable_match_type to be propagated to the
      find_data_type_die().  Also update the debug message to show the result
      of the check_matching_type().
      
        chk [dd] reg0 offset=0 ok=1 kind=1  : Good!
      or
        chk [177] reg4 offset=0x138 ok=0 kind=0 cfa : no type information
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      69e2c784
    • Namhyung Kim's avatar
      perf annotate-data: Add variable_state_str() · 653185d8
      Namhyung Kim authored
      So that it can show a proper debug message in the right place.  The
      check_variable() is used in other places which don't want to print the
      message.
      
        $ perf --debug type-profile annotate --data-type
      
      Before:
        -----------------------------------------------------------
        find data type for 0x140(reg14) at update_blocked_averages+0x2db
        CU for kernel/sched/fair.c (die:0x12dd892)
        frame base: cfa=1 fbreg=7
        no pointer or no type                                         <<<--- removed
        check variable "__mptr" failed (die: 0x13022f1)
         variable location: base=reg14, offset=0x140
         type='void*' size=0x8 (die:0x12dd8f9)
      
      After:
        -----------------------------------------------------------
        find data type for 0x140(reg14) at update_blocked_averages+0x2db
        CU for kernel/sched/fair.c (die:0x12dd892)
        frame base: cfa=1 fbreg=7
        found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer  <<<--- here
         variable location: base=reg14, offset=0x140
         type='void*' size=0x8 (die:0x12dd8f9)
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      653185d8
    • perf annotate-data: Add 'enum type_match_result' · 976862f8
      Namhyung Kim authored
      And let check_variable() return the enum value so that callers can know
      what the problem was.  A later patch will use this to update the
      statistics correctly and print the error message in the right place.
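
      A minimal sketch of the idea, not the actual perf code: a check
      function returns an enum describing why matching failed, and the
      caller uses that value to update per-reason counters.  All names
      here (match_result, var_info, check_var, match_stats) are
      hypothetical and chosen only for illustration; the real enumerators
      of 'enum type_match_result' are not shown in this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; the real enumerators in perf may differ. */
enum match_result {
	MATCH_OK = 0,
	MATCH_NO_POINTER,     /* variable is not a usable pointer */
	MATCH_NO_TYPE,        /* no type information available */
	MATCH_OUT_OF_BOUNDS,  /* offset does not fit in the type */
	MATCH_RESULT_MAX,
};

/* Hypothetical, simplified description of a candidate variable. */
struct var_info {
	bool is_pointer;
	bool has_type;
	size_t type_size;
	size_t offset;
};

/* Return the specific reason instead of a bare true/false. */
static enum match_result check_var(const struct var_info *v)
{
	if (!v->is_pointer)
		return MATCH_NO_POINTER;
	if (!v->has_type)
		return MATCH_NO_TYPE;
	if (v->offset >= v->type_size)
		return MATCH_OUT_OF_BOUNDS;
	return MATCH_OK;
}

/* One counter per result code, indexed by the enum value. */
struct match_stats {
	unsigned int count[MATCH_RESULT_MAX];
};

/* The caller decides what to count (and what to print) from the result. */
static enum match_result check_and_count(struct match_stats *st,
					  const struct var_info *v)
{
	enum match_result r = check_var(v);

	st->count[r]++;
	return r;
}
```

      Because the reason is returned rather than printed inside the
      check, callers that do not want a message can simply ignore it.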
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      976862f8
    • perf annotate-data: Fix off-by-one in location range check · 3ab0b8b2
      Namhyung Kim authored
      Location list entries use half-open address ranges, [start, end),
      so the end address itself is not included.  An address equal to an
      entry's end must therefore skip that entry and match the next one.
      
      An example location list looks like this (from readelf -wo):
      
          00237876 ffffffff8110d32b (base address)
          0023787f v000000000000000 v000000000000002 views at 00237868 for:
                   ffffffff8110d32b ffffffff8110d4eb (DW_OP_reg3 (rbx))     <<<--- 1
          00237885 v000000000000002 v000000000000000 views at 0023786a for:
                   ffffffff8110d4eb ffffffff8110d50b (DW_OP_reg14 (r14))    <<<--- 2
          0023788c v000000000000000 v000000000000001 views at 0023786c for:
                   ffffffff8110d50b ffffffff8110d7c4 (DW_OP_reg3 (rbx))
          00237893 v000000000000000 v000000000000000 views at 0023786e for:
                   ffffffff8110d806 ffffffff8110d854 (DW_OP_reg3 (rbx))
          0023789a v000000000000000 v000000000000000 views at 00237870 for:
                   ffffffff8110d876 ffffffff8110d88e (DW_OP_reg3 (rbx))
      
      The first entry at 0023787f has [8110d32b, 8110d4eb) (omitting the
      ffffffff at the beginning), and the second one has [8110d4eb, 8110d50b).
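
      The half-open rule can be sketched as follows.  This is an
      illustration only, not the perf implementation: struct loc_entry
      and both helper functions are hypothetical names, and the table
      simply copies the first three entries from the readelf output
      above.  With this test, address ffffffff8110d4eb belongs to the
      second entry (r14), not the first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for illustration; not the perf data structure. */
struct loc_entry {
	uint64_t start;    /* first address covered */
	uint64_t end;      /* first address NOT covered */
	const char *expr;  /* location expression, as text */
};

/* Half-open test: start is included, end is excluded. */
static bool loc_entry_covers(const struct loc_entry *e, uint64_t pc)
{
	return e->start <= pc && pc < e->end;
}

/* Return the first entry covering pc, or NULL if none does. */
static const struct loc_entry *loc_list_find(const struct loc_entry *list,
					     size_t nr, uint64_t pc)
{
	for (size_t i = 0; i < nr; i++) {
		if (loc_entry_covers(&list[i], pc))
			return &list[i];
	}
	return NULL;
}

/* First three entries from the readelf output above. */
static const struct loc_entry example_list[] = {
	{ 0xffffffff8110d32bULL, 0xffffffff8110d4ebULL, "DW_OP_reg3 (rbx)" },
	{ 0xffffffff8110d4ebULL, 0xffffffff8110d50bULL, "DW_OP_reg14 (r14)" },
	{ 0xffffffff8110d50bULL, 0xffffffff8110d7c4ULL, "DW_OP_reg3 (rbx)" },
};
```

      Using `pc <= e->end` instead of `pc < e->end` would make the first
      entry match at its own end address, which is the kind of
      off-by-one described here.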
      
      Fixes: 2bc3cf57 ("perf annotate-data: Improve debug message with location info")
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3ab0b8b2
    • perf dwarf-aux: Check allowed location expressions when collecting variables · e8bb03ed
      Namhyung Kim authored
      __die_collect_vars_cb() did not call check_allowed_ops(), so it
      could incorrectly accept variables with complex location expressions.

      For example, one variable has this location list:
      
          015d8df8 ffffffff81aacfb3 (base address)
          015d8e01 v000000000000004 v000000000000000 views at 015d8df2 for:
                   ffffffff81aacfb3 ffffffff81aacfd2 (DW_OP_fbreg: -176; DW_OP_deref;
      						DW_OP_plus_uconst: 332; DW_OP_deref_size: 4;
      						DW_OP_lit1; DW_OP_shra; DW_OP_const1u: 64;
      						DW_OP_minus; DW_OP_stack_value)
          015d8e14 v000000000000000 v000000000000000 views at 015d8df4 for:
                   ffffffff81aacfd2 ffffffff81aacfd7 (DW_OP_reg3 (rbx))
          015d8e19 v000000000000000 v000000000000000 views at 015d8df6 for:
                   ffffffff81aacfd7 ffffffff81aad020 (DW_OP_fbreg: -176; DW_OP_deref;
      						DW_OP_plus_uconst: 332; DW_OP_deref_size: 4;
      						DW_OP_lit1; DW_OP_shra; DW_OP_const1u: 64;
      						DW_OP_minus; DW_OP_stack_value)
          015d8e2c <End of list>
      
      It looks like '((int *)(-176(%rbp) + 332) >> 1) - 64' but the current
      code treated it as just -176(%rbp) and processed the variable
      incorrectly.  It should reject such a complex expression if
      check_allowed_ops() doesn't like it. :)
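
      A rough sketch of such a filter, under stated assumptions.  The
      opcode values are the ones defined by the DWARF standard, but the
      acceptance policy (only register, base-register and frame-base
      operations) is an assumption made for this example, and
      op_is_simple() and ops_allowed() are hypothetical names; the real
      check_allowed_ops() may accept a different set.  Operands are
      omitted, so each array holds opcodes only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as defined by the DWARF standard. */
enum {
	DW_OP_deref        = 0x06,
	DW_OP_const1u      = 0x08,
	DW_OP_minus        = 0x1c,
	DW_OP_plus_uconst  = 0x23,
	DW_OP_shra         = 0x26,
	DW_OP_lit1         = 0x31,
	DW_OP_reg0         = 0x50,
	DW_OP_reg3         = 0x53,
	DW_OP_reg31        = 0x6f,
	DW_OP_breg0        = 0x70,
	DW_OP_breg31       = 0x8f,
	DW_OP_fbreg        = 0x91,
	DW_OP_deref_size   = 0x94,
	DW_OP_stack_value  = 0x9f,
};

/* Assumed policy: a plain register or a register/frame-base offset. */
static bool op_is_simple(uint8_t op)
{
	if (op >= DW_OP_reg0 && op <= DW_OP_reg31)
		return true;
	if (op >= DW_OP_breg0 && op <= DW_OP_breg31)
		return true;
	return op == DW_OP_fbreg;
}

/* Reject the whole expression if any operation falls outside the policy. */
static bool ops_allowed(const uint8_t *ops, size_t nr)
{
	if (nr == 0)
		return false;
	for (size_t i = 0; i < nr; i++) {
		if (!op_is_simple(ops[i]))
			return false;
	}
	return true;
}

/* Opcodes of the complex expression from the location list above. */
static const uint8_t complex_expr[] = {
	DW_OP_fbreg, DW_OP_deref, DW_OP_plus_uconst, DW_OP_deref_size,
	DW_OP_lit1, DW_OP_shra, DW_OP_const1u, DW_OP_minus,
	DW_OP_stack_value,
};

/* The middle entry of the list: just DW_OP_reg3 (rbx). */
static const uint8_t simple_expr[] = { DW_OP_reg3 };
```

      The point is that looking only at the first operation (DW_OP_fbreg)
      is not enough; every operation in the expression has to be checked
      before the location is treated as a simple one.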
      
      Fixes: 932dcc2c ("perf dwarf-aux: Add die_collect_vars()")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240816235840.2754937-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e8bb03ed
  7. 16 Aug, 2024 2 commits
    • Merge remote-tracking branch 'torvalds/master' into perf-tools-next · 3bce87eb
      Arnaldo Carvalho de Melo authored
      To pick up the latest perf-tools merge for 6.11, i.e. to bring the
      perf tools changes that are going into 6.11 into perf-tools-next,
      which is geared towards 6.12.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3bce87eb
    • perf stat: Display iostat headers correctly · 26156393
      Yicong Yang authored
      Currently we only print metric headers for the metric leader in
      aggregation mode. As a result the `perf iostat` header is not shown,
      since iostat is aggregated globally but has no metric events:
      
        root@ubuntu204:/home/yang/linux/tools/perf# ./perf stat --iostat --timeout 1000
         Performance counter stats for 'system wide':
            port
        0000:00                    0                    0                    0                    0
        0000:80                    0                    0                    0                    0
        [...]
      
      Fix this by exempting iostat from the check that decides whether to
      print metric headers. Then we can see the headers:
      
        root@ubuntu204:/home/yang/linux/tools/perf# ./perf stat --iostat --timeout 1000
         Performance counter stats for 'system wide':
            port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB)
        0000:00                    0                    0                    0                    0
        0000:80                    0                    0                    0                    0
        [...]
      
      Fixes: 193a9e30 ("perf stat: Don't display metric header for non-leader uncore events")
      Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Junhao He <hejunhao3@huawei.com>
      Cc: linuxarm@huawei.com
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
      Cc: Zeng Tao <prime.zeng@hisilicon.com>
      Link: https://lore.kernel.org/r/20240802065800.48774-1-yangyicong@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      26156393