1. 16 Feb, 2018 38 commits
    • perf trace powerpc: Use generated syscall table · 4281da23
      Ravi Bangoria authored
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
      
      It also enables users to specify wildcards, for example, perf trace -e
      'open*', just like was already possible on x86 and s390.
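      A minimal sketch of how a wildcard such as 'open*' could be resolved
      against a generated id/name table. This is not the perf implementation:
      the SYSCALL_TABLE excerpt, its ids, and match_syscalls() are hypothetical
      names used only for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical excerpt of a generated syscall table (id -> name).
# The ids are illustrative and not copied from any real unistd.h.
SYSCALL_TABLE = {3: "read", 4: "write", 5: "open", 6: "close", 286: "openat"}


def match_syscalls(pattern, table=SYSCALL_TABLE):
    """Return the sorted ids of all syscalls whose name matches the glob."""
    return sorted(nr for nr, name in table.items() if fnmatchcase(name, pattern))


if __name__ == "__main__":
    # Resolve the glob once, then filter events by the resulting id set.
    print(match_syscalls("open*"))
```

      With a table generated at build time, a glob only has to be expanded
      once into a set of ids, which is why new syscalls become usable as soon
      as the header copy is refreshed.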
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20180129083417.31240-4-ravi.bangoria@linux.vnet.ibm.com
      [ Do it for ppc32 as well ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf powerpc: Generate system call table from asm/unistd.h · 8e2ff72a
      Ravi Bangoria authored
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
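      As a rough illustration of the idea (the real generator in the perf
      tree is not reproduced here), a table can be derived from
      '#define __NR_<name> <number>' lines. SAMPLE_UNISTD_H is an invented
      sample, and build_syscall_table() is a hypothetical helper.

```python
import re

# Invented sample in the style of asm/unistd.h; not the real header contents.
SAMPLE_UNISTD_H = """\
#define __NR_read   3
#define __NR_write  4
#define __NR_open   5
/* #define __NR_commented 7 */
"""

# Match only lines that begin with '#define __NR_<name> <number>'.
DEFINE_RE = re.compile(r"^#define[ \t]+__NR_(\w+)[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def build_syscall_table(header_text):
    """Map syscall number -> name for every matching #define line."""
    return {int(nr): name for name, nr in DEFINE_RE.findall(header_text)}


if __name__ == "__main__":
    for nr, name in sorted(build_syscall_table(SAMPLE_UNISTD_H).items()):
        print(nr, name)
```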
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20180129083417.31240-3-ravi.bangoria@linux.vnet.ibm.com
      [ Made it generate syscall_32.c as well to fix the build on 32-bit ppc ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools include powerpc: Grab a copy of arch/powerpc/include/uapi/asm/unistd.h · 1350fb7d
      Ravi Bangoria authored
      Will be used for generating the syscall id/string translation table.
      
      Committer notes:
      
      Update it already to catch up with these csets, applied since Ravi first
      submitted this patch:
      
        3350eb2e powerpc: sys_pkey_mprotect() system call
        9499ec1b powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
      
      So 'perf trace' on ppc now knows about the pkey_ syscalls.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20180129083417.31240-2-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix memory corruption in --branch-history mode · e3ebaa46
      Jiri Olsa authored
      Jin Yao reported memory corruption in perf report with
      branch info used for stack trace:
      
        > Following command lines will cause perf crash.
      
        > perf record -j call -g -a <application>
        > perf report --branch-history
        >
        > *** Error in `perf': double free or corruption (!prev): 0x00000000104aa040 ***
        > ======= Backtrace: =========
        > /lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
        > /lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
        > /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
        > perf[0x51b914]
        > perf(hist_entry_iter__add+0x1e5)[0x51f305]
        > perf[0x43cf01]
        > perf[0x4fa3bf]
        > perf[0x4fa923]
        > perf[0x4fd396]
        > perf[0x4f9614]
        > perf(perf_session__process_events+0x89e)[0x4fc38e]
        > perf(cmd_report+0x15d2)[0x43f202]
        > perf[0x4a059f]
        > perf(main+0x631)[0x427b71]
        > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
        > perf(_start+0x29)[0x427d89]
      
      For the cumulative output, we allocate the he_cache array based on the
      --max-stack option value and populate it with data from 'callchain_cursor'.
      
      The --max-stack option value currently does not limit the number of
      callchain_cursor nodes, so the cumulative iter code allocates a smaller
      array than is actually needed, which causes the corruption above.

      I think the --max-stack limit does not apply here anyway, because we add
      callchain data as normal hist entries, while --max-stack controls the
      depth limit of a single entry's callchain.

      Use callchain_cursor.nr as the he_cache array count to fix this. Also
      remove struct hist_entry_iter::max_stack, because there's no longer any
      use for it.
      
      We need more fixes to ensure that the branch stack code follows properly the
      logic of --max-stack, which is not the case at the moment.
      Original-patch-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Reported-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180216123619.GA9945@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix wrong jump arrow · b40982e8
      Jin Yao authored
      When we use perf report interactive annotate view, we can see
      the position of jump arrow is not correct. For example,
      
      1. perf record -b ...
      2. perf report
      3. In interactive mode, select Annotate 'function'
      
      Percent│ IPC Cycle
             │                                if (flag)
        1.37 │0.4┌──   1      ↓ je     82
             │   │                                    x += x / y + y / x;
        0.00 │0.4│  1310        movsd  (%rsp),%xmm0
        0.00 │0.4│   565        movsd  0x8(%rsp),%xmm4
             │0.4│              movsd  0x8(%rsp),%xmm1
             │0.4│              movsd  (%rsp),%xmm3
             │0.4│              divsd  %xmm4,%xmm0
        0.00 │0.4│   579        divsd  %xmm3,%xmm1
             │0.4│              movsd  (%rsp),%xmm2
             │0.4│              addsd  %xmm1,%xmm0
             │0.4│              addsd  %xmm2,%xmm0
        0.00 │0.4│              movsd  %xmm0,(%rsp)
             │   │                    volatile double x = 1212121212, y = 121212;
             │   │
             │   │                    s_randseed = time(0);
             │   │                    srand(s_randseed);
             │   │
             │   │                    for (i = 0; i < 2000000000; i++) {
        1.37 │0.4└─→      82:   sub    $0x1,%ebx
       28.21 │0.48    17      ↑ jne    38
      
      The jump arrow in above example is not correct. It should add the
      width of IPC and Cycle.
      
      With this patch, the result is:
      
      Percent│ IPC Cycle
             │                                if (flag)
        1.37 │0.48     1     ┌──je     82
             │               │                        x += x / y + y / x;
        0.00 │0.48  1310     │  movsd  (%rsp),%xmm0
        0.00 │0.48   565     │  movsd  0x8(%rsp),%xmm4
             │0.48           │  movsd  0x8(%rsp),%xmm1
             │0.48           │  movsd  (%rsp),%xmm3
             │0.48           │  divsd  %xmm4,%xmm0
        0.00 │0.48   579     │  divsd  %xmm3,%xmm1
             │0.48           │  movsd  (%rsp),%xmm2
             │0.48           │  addsd  %xmm1,%xmm0
             │0.48           │  addsd  %xmm2,%xmm0
        0.00 │0.48           │  movsd  %xmm0,(%rsp)
             │               │        volatile double x = 1212121212, y = 121212;
             │               │
             │               │        s_randseed = time(0);
             │               │        srand(s_randseed);
             │               │
             │               │        for (i = 0; i < 2000000000; i++) {
        1.37 │0.48        82:└─→sub    $0x1,%ebx
       28.21 │0.48    17      ↑ jne    38
      
      Committer notes:
      
      Please note that only from LBRv5 onwards (according to Jiri), i.e. >=
      Skylake, do we have the cycle counts in each branch record entry, so to
      see the Cycles and IPC columns, and to be able to test this patch, one
      needs capable hardware.

      While applying this I first tested it on a Broadwell class machine and
      couldn't get those columns. I will add code to the annotate browser to
      warn the user about that, i.e. you have branch records, but no cycles,
      use more recent hardware to get the cycles and IPC columns.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix description for --mem-mode · fc2f5237
      Andi Kleen authored
      The "mem-loads" event only works when PEBS is enabled, so add the "/p"
      ("precise") suffix to the examples.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      LPU-Reference: 20180209163909.9240-1-andi@firstfloor.org
      Link: https://lkml.kernel.org/n/tip-v0gcd4u9tktrvjjsp6y7ouv4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • coresight: Update documentation for perf usage · 6673016f
      Robert Walker authored
      Add notes on using perf to collect and analyze CoreSight trace
      Signed-off-by: Robert Walker <robert.walker@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1518607481-4059-4-git-send-email-robert.walker@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Emit instruction records on ETM trace discontinuity · 256e751c
      Robert Walker authored
      There may be discontinuities in the ETM trace stream due to overflows or
      ETM configuration for selective trace.  This patch emits an instruction
      sample with the pending branch stack when a TRACE ON packet occurs
      indicating a discontinuity in the trace data.
      
      A new packet type CS_ETM_TRACE_ON is added, which is emitted by the low
      level decoder when a TRACE ON occurs.  The higher level decoder flushes
      the branch stack when this packet is emitted.
      Signed-off-by: Robert Walker <robert.walker@arm.com>
      Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1518607481-4059-3-git-send-email-robert.walker@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Inject capability for CoreSight traces · e573e978
      Robert Walker authored
      Added user space perf functionality to translate CoreSight traces into
      instruction events with branch stack.
      
      To invoke the new functionality, use the perf inject tool with
      --itrace=il. For example, to translate the ETM trace from perf.data into
      last branch records in a new perf.data.new file:
      
          $ perf inject --itrace=i100000il128 -i perf.data -o perf.data.new
      
      The 'i' parameter to itrace generates periodic instruction events.  The
      period between instruction events can be specified as a number of
      instructions suffixed by i (default 100000).
      
      The parameter to 'l' specifies the number of entries in the branch stack
      attached to instruction events.
      
      The 'b' parameter to itrace generates events on taken branches.
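      To make the option string above concrete, here is a simplified sketch
      of how 'i100000il128' decomposes. It covers only the 'i', 'l' and 'b'
      letters described in this message and is not perf's actual parser;
      parse_itrace() and its result keys are hypothetical, and leaving the
      branch stack size unset when no number is given is an assumption.

```python
import re


def parse_itrace(opts):
    """Parse a simplified subset of --itrace options: 'i', 'l' and 'b'."""
    parsed = {
        "instructions": False,
        "period": 100000,          # default period stated in the text above
        "branches": False,
        "last_branch": False,
        "last_branch_size": None,  # assumption: unset when no number is given
    }
    pos = 0
    while pos < len(opts):
        flag = opts[pos]
        pos += 1
        number = re.match(r"\d+", opts[pos:])  # optional numeric argument
        if flag == "i":
            parsed["instructions"] = True
            if number:
                parsed["period"] = int(number.group())
                pos += number.end()
                if opts[pos:pos + 1] == "i":   # unit suffix: instructions
                    pos += 1
        elif flag == "l":
            parsed["last_branch"] = True
            if number:
                parsed["last_branch_size"] = int(number.group())
                pos += number.end()
        elif flag == "b":
            parsed["branches"] = True
        else:
            raise ValueError(f"unsupported itrace option: {flag!r}")
    return parsed


if __name__ == "__main__":
    print(parse_itrace("i100000il128"))
```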
      
      This patch also fixes the contents of the branch events used in perf
      report - previously branch events were generated for each contiguous
      range of instructions executed.  These are fixed to generate branch
      events between the last address of a range ending in an executed branch
      instruction and the start address of the next range.
      
      Based on patches by Sebastian Pop <s.pop@samsung.com> with additional fixes
      and support for specifying the instruction period.
      Originally-by: Sebastian Pop <s.pop@samsung.com>
      Signed-off-by: Robert Walker <robert.walker@arm.com>
      Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1518607481-4059-2-git-send-email-robert.walker@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf mem: Document a missing option · 7e99b197
      Sangwon Hong authored
      Add the missing --force option on the man page.
      Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Link: http://lkml.kernel.org/r/1518381517-30766-2-git-send-email-qpakzk@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmem: Document a missing option & an argument · 577980a0
      Sangwon Hong authored
      First, 'perf kmem' has a '--force' option, but didn't document it on the
      man page. So add it.
      
      Second, the '--time' option has to get a value, but isn't documented on
      the man page. Describe it.
      Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Link: http://lkml.kernel.org/r/1518381517-30766-1-git-send-email-qpakzk@gmail.com
      [ Add blank line after --force block, as requested by Namhyung ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Add missing arguments in Man page · ac2c3068
      Jaecheol Shin authored
      Some options require an argument, but the man page entries for input,
      stdio-color and cpu did not show one. Add the missing arguments.
      Signed-off-by: Jaecheol Shin <jcgod413@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Link: http://lkml.kernel.org/r/20180207095205.62715-1-jcgod413@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Properly deal with cpu maps · 796bfadd
      Mathieu Poirier authored
      This patch allows the CoreSight AUX info section to fit topologies where
      only a subset of all available CPUs are present, avoiding at the same
      time accessing the ETM configuration areas of CPUs that have been
      offlined.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1518478737-24649-1-git-send-email-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf auxtrace arm: Fixing uninitialised variable · d2785de1
      Mathieu Poirier authored
      When working natively on arm64 the compiler gets pesky and complains
      that variable 'i' is uninitialised, something that breaks the
      compilation.  Here no further checks are needed since variable
      'found_spe' can only be true if variable 'i' has been initialised as
      part of the for loop.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1518467557-18505-4-git-send-email-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Use target->per_thread and target->system_wide flags · 147c508f
      Jin Yao authored
      Mathieu Poirier reported that commit 73c0ca1e ("perf thread_map:
      Enumerate all threads from /proc") has a negative impact on 'perf
      record --per-thread': it creates a kernel event for each thread in the
      system.
      
      Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
      can fix this issue by creating a new target->all_threads flag.
      
      This patch is based on Mathieu Poirier's patch but it doesn't use a new
      target->all_threads flag. This patch just uses 'target->per_thread &&
      target->system_wide' as a condition to check for all threads case.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Fixes: 73c0ca1e ("perf thread_map: Enumerate all threads from /proc")
      Link: http://lkml.kernel.org/r/1518467557-18505-3-git-send-email-mathieu.poirier@linaro.org
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      [Fixed checkpatch warning about line over 80 characters]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Freeing allocated memory · 099c1130
      Mathieu Poirier authored
      This patch frees all the memory allocated in function
      cs_etm__alloc_queue().
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1518467557-18505-2-git-send-email-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Use arch__compare_symbol_names to compare symbols · ab6e9a99
      Jiri Olsa authored
      The symbol search called by machine__find_kernel_symbol_by_name
      internally uses the arch__compare_symbol_names function to compare two
      symbol names, because different archs have different ways of comparing
      symbols, mostly for skipping '.' prefixes and similar.

      Test 1 tries to find matching symbols in kallsyms and vmlinux, by
      address and by symbol name. When either is found, we compare the pair's
      symbol names with a simple strcmp, which is not good enough for the
      reasons explained in the previous paragraph.

      On powerpc this can cause a lockup, because even though we found the
      pair, the compared names are different and don't match under a simple
      strcmp. The following code path is executed, which leads to the lockup:
      
         - we find the pair in kallsyms by sym->start
      next_pair:
         - we compare the names and it fails
         - we find the pair by sym->name
         - the pair addresses match so we call goto next_pair
           because we assume the names match in this case
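      The difference between the two comparisons can be sketched as follows.
      This is a simplified model, assuming the arch-specific rule is only
      "ignore one leading '.' on either name"; arch_compare_symbol_names()
      here is a stand-in written for illustration, and the symbol names are
      made up.

```python
def plain_names_equal(a, b):
    """Model of a simple strcmp()-style comparison: exact match only."""
    return a == b


def arch_compare_symbol_names(a, b):
    """Simplified model: ignore a single leading '.' on either name."""
    def strip_dot(name):
        return name[1:] if name.startswith(".") else name

    return strip_dot(a) == strip_dot(b)


if __name__ == "__main__":
    kallsyms_name, vmlinux_name = ".sys_open", "sys_open"  # made-up pair
    # The exact comparison rejects the pair, the arch-aware one accepts it.
    print(plain_names_equal(kallsyms_name, vmlinux_name))
    print(arch_compare_symbol_names(kallsyms_name, vmlinux_name))
```

      Using the same comparison in the test as in the symbol search keeps the
      two code paths from disagreeing about whether a pair matches.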
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 031b84c4 ("perf probe ppc: Enable matching against dot symbols automatically")
      Link: http://lkml.kernel.org/r/20180215122635.24029-10-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Do not create kernel maps in sample__resolve() · a73e24d2
      Jiri Olsa authored
      There's no need for kernel maps to be allocated at this point - sample
      processing.
      
      We search for kernel maps using the kernel map_groups in machine::kmaps
      which is static. If vmlinux maps for any reason still don't exist, the
      search correctly fails because they are not in the map group.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-9-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Remove machine__load_kallsyms() · e8f3879f
      Jiri Olsa authored
      The current machine__load_kallsyms() function has no caller, so replace
      it directly with __machine__load_kallsyms().  Also remove the no_kcore
      argument as it was always called with a 'true' value.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-8-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Don't search for active kernel start in __machine__create_kernel_maps · 1fb87b8e
      Jiri Olsa authored
      We should not search for the kernel start address in
      __machine__create_kernel_maps(), because it's being used in the 'report'
      code path, where we are interested in kernel MMAP data address (the one
      recorded via 'perf record', possibly on another machine, or an older or
      newer kernel on the same machine where analysis is being performed)
      instead of in current kernel address.
      
      The __machine__create_kernel_maps() function serves purely for creating
      the machine's kernel maps and setting up the kmap group. The report code
      path then sets the address based on the data from kernel MMAP event in
      the machine__set_kernel_mmap() function.
      
      The kallsyms search address logic is used for test code, that calls
      machine__create_kernel_maps() to get current maps and calls
      machine__get_running_kernel_start() to get kernel starting address.
      
      Use machine__set_kernel_mmap() to set the kernel maps start address and
      move map_groups__fixup_end to be called when all maps are in place.
      
      Also make __machine__create_kernel_maps static, because there's no
      external user.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-7-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Generalize machine__set_kernel_mmap() · 05db6ff7
      Jiri Olsa authored
      So that it can be called without an event object, just with start and
      end values. It will be used in the following patch.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Move kernel mmap name into struct machine · 8c7f1bb3
      Jiri Olsa authored
      It simplifies and centralizes the code. The kernel mmap name is set for
      machine type, which we know from the beginning, so there's no reason to
      generate it every time we need it.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Free root_dir in machine__init() error path · 81f981d7
      Jiri Olsa authored
      Free root_dir in machine__init() error path.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Check if we read regular file in dso__load() · c3962961
      Jiri Olsa authored
      The current code in dso__load() calls is_regular_file(), but it checks
      its return value only after calling symsrc__init().
      
      That can make symsrc__init() block in elf_* functions on reading
      the file if the file happens to be a device and not a regular one.
      
      Call symsrc__init() only for regular files. Also remove the
      symsrc__destroy() cleanup, which is not needed now, because we call
      symsrc__init() only for regular files.
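      The reordering can be sketched like this: check the file type first,
      and only then hand the path to the (potentially blocking) reader. The
      helper names mirror the ones in the text, but this is a user-space
      model, not the perf code, and init_source is a hypothetical callback
      standing in for symsrc__init().

```python
import os
import stat
import tempfile


def is_regular_file(path):
    """True only for regular files; devices, dirs, FIFOs and errors are False."""
    try:
        return stat.S_ISREG(os.stat(path).st_mode)
    except OSError:
        return False


def load_symbols(path, init_source):
    """Call init_source(path) only after the regular-file check has passed."""
    if not is_regular_file(path):
        return None
    return init_source(path)


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    regular = os.path.join(directory, "sample.bin")
    open(regular, "wb").close()
    print(load_symbols(regular, lambda p: "initialized"))    # reader is called
    print(load_symbols(directory, lambda p: "initialized"))  # reader is skipped
```

      Because the reader is never started for a non-regular file, no cleanup
      call is needed on that path.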
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib symbol: Skip non-address kallsyms line · c53b4bb0
      Jiri Olsa authored
      Add a check for a failed attempt to parse the address and skip the line
      early in that case.

      The address can be replaced with the '(null)' string in case the user
      doesn't have enough permissions, like:
      
        $ cat /proc/kallsyms
            (null) A irq_stack_union
            (null) A __per_cpu_start
            ...
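      A minimal sketch of the skipping logic, assuming the usual
      "address type name" layout of /proc/kallsyms lines. parse_kallsyms()
      is a hypothetical helper, not the tools/lib function, and the first
      sample line below is invented.

```python
def parse_kallsyms(lines):
    """Return (address, type, name) tuples, skipping unparsable addresses."""
    symbols = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # not an "address type name" line
        try:
            address = int(fields[0], 16)
        except ValueError:
            continue  # e.g. '(null)' when addresses are hidden from the user
        symbols.append((address, fields[1], fields[2]))
    return symbols


if __name__ == "__main__":
    sample = [
        "ffffffff81000000 T _text",        # invented example line
        "    (null) A irq_stack_union",
        "    (null) A __per_cpu_start",
    ]
    print(parse_kallsyms(sample))
```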
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180215122635.24029-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Add support to print counts after a period of time · f1f8ad52
      yuzhoujian authored
      Introduce a new option to print counts after N milliseconds and update
      'perf stat' documentation accordingly.
      
      Shown below is the output of the new option for perf stat.
      
        $ perf stat --time 2000 -e cycles -a
        Performance counter stats for 'system wide':
      
              157,260,423      cycles
      
              2.003060766 seconds time elapsed
      
      We can print the count deltas after N milliseconds with this newly
      introduced option. This option is not supported with the "-I" option.

      In addition, according to Kan Liang's patch (19afd104), the
      monitoring overhead for system-wide core events could be very high if the
      interval-print parameter was below 100ms, and the lower limit is
      10ms.

      So the same warning will be displayed when the time is set between 10ms
      and 100ms, and the minimal time is limited to 10ms. Users can make a
      decision according to their specific cases.
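      The thresholds described above can be modelled as a small validation
      step. This is a sketch under the stated rules only (error below 10ms,
      warning from 10ms up to 100ms, not combinable with "-I");
      check_time_ms() and its messages are hypothetical and are not the text
      printed by perf.

```python
def check_time_ms(time_ms, interval_ms=0):
    """Return (accepted, message) for a requested time in milliseconds."""
    if time_ms and interval_ms:
        return False, "time cannot be combined with interval printing (-I)"
    if time_ms and time_ms < 10:
        return False, "time must not be less than 10ms"
    if time_ms and time_ms < 100:
        return True, "warning: monitoring overhead may be high below 100ms"
    return True, ""


if __name__ == "__main__":
    for value in (5, 10, 99, 100, 2000):
        print(value, check_time_ms(value))
```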
      
      Committer notes:
      
      This actually stops the workload after the specified time, then prints
      the counts.
      
      So I renamed the option to --timeout and updated the documentation to
      state that it will not just print the counts after the specified time,
      but will really stop the 'perf stat' session and print the counts.
      
      The rename from 'time' to 'timeout' also fixes the build in systems
      where 'time' is used by glibc and can't be used as a name of a variable,
      such as centos:5 and centos:6.
      
      Changes since v3:
      - none.
      
      Changes since v2:
      - modify the time check in __run_perf_stat func to keep some consistency
        with the workload case.
      - add the warning when the time is set between 10ms and 100ms.
      - add the pr_err when the time is set below 10ms.
      
      Changes since v1:
      - none.
      Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1517217923-8302-3-git-send-email-ufo19890607@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • yuzhoujian's avatar
      perf stat: Add support to print counts for fixed times · db06a269
      yuzhoujian authored
      Introduce a new option to print counts a fixed number of times and
      update the 'perf stat' documentation accordingly.
      
      Shown below is the output of the new option for perf stat.
      
        $ perf stat -I 1000 --interval-count 2 -e cycles -a
        #           time             counts unit events
                 1.002827089         93,884,870      cycles
                 2.004231506         56,573,446      cycles
      
      With this newly introduced option we can print the counts a fixed
      number of times. Its usage is a little like 'vmstat', and it should be
      used together with the "-I" option.
      
        $ vmstat -n 1 2
        procs ---------memory-------------- --swap- ----io-- -system-- ------cpu---
         r  b swpd   free   buff   cache    si   so  bi   bo  in   cs us sy id wa st
         0  0    0 78270544 547484 51732076  0   0   0   20    1    1  1  0 99  0 0
         0  0    0 78270512 547484 51732080  0   0   0   16  477 1555  0  0 100 0 0
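
      The behaviour can be sketched as a bounded sampling loop. This is an
      illustrative Python sketch, not how 'perf stat' is implemented. The
      function name, its parameters and the injectable sleep are
      assumptions made so the example is self-contained.

```python
import time


def print_counts(read_counter, interval_ms, interval_count, sleep=time.sleep):
    """Read a counter once per interval, stopping after interval_count reads.

    read_counter is a hypothetical callable returning the current count.
    """
    readings = []
    for _ in range(interval_count):
        sleep(interval_ms / 1000.0)
        readings.append(read_counter())
    return readings


if __name__ == "__main__":
    # Fake counter standing in for a real event count; no real sleeping.
    fake = iter([93884870, 56573446, 12345])
    print(print_counts(lambda: next(fake), 1000, 2, sleep=lambda s: None))
```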
      
      Changes since v3:
      - merge interval_count check and times check to one line.
      - fix the wrong indent in stat.h
      - use stat_config.times instead of 'times' in cmd_stat function.
      
      Changes since v2:
      - none.
      
      Changes since v1:
      - change the name of the new option "times-print" to "interval-count".
      - keep the new option interval specifically.
      Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1517217923-8302-2-git-send-email-ufo19890607@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      db06a269
    • Jiri Olsa's avatar
      perf report: Add support to display group output for non group events · ad52b8cb
      Jiri Olsa authored
      Add support to display group output if non-grouped events are
      detected and the user forces the --group option. Now for non-group
      events recorded like:
      
        $ perf record -e 'cycles,instructions' ls
      
      you can still get group output by using the --group option
      in report:
      
        $ perf report --group --stdio
        ...
        #         Overhead  Command  Shared Object     Symbol
        # ................  .......  ................  ......................
        #
            17.67%   0.00%  ls       libc-2.25.so      [.] _IO_do_write@@GLIB
            15.59%  25.94%  ls       ls                [.] calculate_columns
            15.41%  31.35%  ls       libc-2.25.so      [.] __strcoll_l
        ...
      
      Committer note:
      
      We should improve on this by making sure that the first line states that
      this is not a group, but since the user doesn't have to force group view
      when really using grouped events (e.g. '{cycles,instructions}'), the
      user better know what is being done...
      Requested-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Stephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180209092734.GB20449@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ad52b8cb
    • Jiri Olsa's avatar
      perf report: Ask for ordered events for --tasks option · 8614ada0
      Jiri Olsa authored
      If the events carry timestamps, keep them in time order.
      
      Committer notes:
      
      To be more verbose: what actual effect will this have in this particular
      case?
      
      A diff of the output before and after this patch shows the artifacts:
      
        --- /tmp/before 2018-02-06 15:40:29.536411625 -0300
        +++ /tmp/after  2018-02-06 15:40:51.963403599 -0300
        @@ -5,34 +5,34 @@
               2540     2540     1818 |   gnome-terminal-
               3489     3489     2540 |    bash
              32433    32433     3489 |     perf
        -     32434    32434    32433 |      perf
        +     32434    32434    32433 |      make
              32441    32441    32434 |       make
              32514    32514    32441 |        make
                511      511    32514 |         sh
        -       512      512      511 |          sh
        +       512      512      511 |          install
      <SNIP>
      
      We don't have 'perf' calling 'perf' calling 'make', etc, the second
      'perf' actually is 'make', i.e.  there was reordering of the relevant
      PERF_RECORD_COMM and PERF_RECORD_FORK records.
      
      Ditto for sh/install later on.
      
      Look for FORK and COMM meta events, for those tids:
      
        # perf report -D | egrep 'PERF_RECORD_(FORK|COMM)' | egrep '3243[34]'
        0 14774650990679 0x1a3cd8 [0x38]: PERF_RECORD_FORK(32433:32433):(3489:3489)
        1 14774652080381 0x1d6568 [0x30]: PERF_RECORD_COMM exec: perf:32433/32433
        1 14774742473340 0x1dbb48 [0x38]: PERF_RECORD_FORK(32434:32434):(32433:32433)
        0 14774752005779 0x1a4af8 [0x30]: PERF_RECORD_COMM exec: make:32434/32434
        0 14774753997960 0x1a5578 [0x38]: PERF_RECORD_FORK(32435:32435):(32434:32434)
        0 14774756070782 0x1a5618 [0x38]: PERF_RECORD_FORK(32438:32438):(32434:32434)
        0 14774757772939 0x1a5680 [0x38]: PERF_RECORD_FORK(32440:32440):(32434:32434)
        0 14774758230600 0x1a56e8 [0x38]: PERF_RECORD_FORK(32441:32441):(32434:32434)
        #
      
      First column is the cpu, second is the timestamp.
      
      So they are on different CPUs, and thus in different ring buffers, and
      when we don't use the ordered_events class we end up mixing them up.
      Use it to take advantage of the PERF_RECORD_FINISHED_ROUND meta events
      and order the events using the PERF_SAMPLE_TIME present in the
      PERF_RECORD_{FORK,COMM,EXIT,SAMPLE,etc} records in the ring buffer.
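
      The reordering can be reproduced with the four FORK/COMM records for
      tids 32433 and 32434 listed above. The Python sketch below is only an
      illustration: it uses heapq.merge as a stand-in for time ordering and
      is not the ordered_events class, and the record strings and function
      names are made up for the example. Reading one buffer after the other
      delivers the COMM for 32434 before the FORK that creates it, which is
      consistent with the 'perf' vs 'make' artifact; merging by timestamp
      restores the expected order.

```python
import heapq

# (cpu, timestamp, record) tuples taken from the 'perf report -D' output
# above, grouped per CPU ring buffer. Each buffer is already time ordered.
cpu0 = [
    (0, 14774650990679, "FORK 32433 <- 3489"),
    (0, 14774752005779, "COMM exec make:32434"),
]
cpu1 = [
    (1, 14774652080381, "COMM exec perf:32433"),
    (1, 14774742473340, "FORK 32434 <- 32433"),
]


def by_buffer(*buffers):
    """Process one ring buffer after another (no time ordering)."""
    return [rec for buf in buffers for rec in buf]


def by_time(*buffers):
    """Merge per-CPU buffers into one stream ordered by timestamp."""
    return list(heapq.merge(*buffers, key=lambda rec: rec[1]))


if __name__ == "__main__":
    for cpu, ts, what in by_time(cpu0, cpu1):
        print(cpu, ts, what)
```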
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180206181813.10943-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8614ada0
    • Jiri Olsa's avatar
      perf tools: Fix comment for sort__* compare functions · a7402c94
      Jiri Olsa authored
      In commit 2f15bd8c ("perf tools: Fix "Command" sort_entry's cmp and
      collapse function") we switched from pointer to string comparison.
      
      But we failed to remove the related comments. Remove them and add
      another one to warn against pointer comparison here.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180206181813.10943-18-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a7402c94
    • Jiri Olsa's avatar
      perf tests: Fix dwarf unwind for stripped binaries · fdf7c49c
      Jiri Olsa authored
      When we strip the perf binary, the dwarf unwind test stops
      working. The reason is that strip removes static function
      symbols, which we need to check for unwind.
      
      This change will keep this test working in cases where
      the global symbols are put into the dynamic symbol table,
      which is the case on x86. It still won't work on powerpc.
      
      Make those 5 local functions global and add the
      'test_dwarf_unwind__' prefix to their names.
      
      Committer testing:
      
      Before:
      
        # perf test dwarf
        58: DWARF unwind                               : Ok
        # strip ~/bin/perf
        # perf test dwarf
        58: DWARF unwind                               : FAILED!
        # perf test -v dwarf
        58: DWARF unwind                               :
        --- start ---
        test child forked, pid 6590
        unwind: thread map already set, dso=/home/acme/bin/perf
        <SNIP>
        unwind: access_mem addr 0x7ffce6c48098 val 48563f, offset 1144
        unwind: test__dwarf_unwind:ip = 0x4a54e5 (0xa54e5)
        got: test__dwarf_unwind 0xa54e5, expecting test__dwarf_unwind
        unwind: '':ip = 0x4a50bb (0xa50bb)
        failed: got unresolved address 0xa50bb
        unwind failed
        test child finished with -1
        ---- end ----
        DWARF unwind: FAILED!
        #
      
      After:
      
        # perf test dwarf
        58: DWARF unwind                               : Ok
        # strip ~/bin/perf
        # perf test dwarf
        58: DWARF unwind                               : Ok
        #
        # perf test -v dwarf
        58: DWARF unwind                               :
        --- start ---
        test child forked, pid 7219
        unwind: thread map already set, dso=/home/acme/bin/perf
        <SNIP>
        unwind: access_mem addr 0x7fff007da2c8 val 48575f, offset 1144
        unwind: test__arch_unwind_sample:ip = 0x589044 (0x189044)
        got: test__arch_unwind_sample 0x189044, expecting test__arch_unwind_sample
        unwind: test_dwarf_unwind__thread:ip = 0x4a52f7 (0xa52f7)
        got: test_dwarf_unwind__thread 0xa52f7, expecting test_dwarf_unwind__thread
        unwind: test_dwarf_unwind__compare:ip = 0x4a5468 (0xa5468)
        got: test_dwarf_unwind__compare 0xa5468, expecting test_dwarf_unwind__compare
        unwind: bsearch:ip = 0x7f6608ae94d8 (0x394d8)
        got: bsearch 0x394d8, expecting bsearch
        unwind: test_dwarf_unwind__krava_3:ip = 0x4a54d1 (0xa54d1)
        got: test_dwarf_unwind__krava_3 0xa54d1, expecting test_dwarf_unwind__krava_3
        unwind: test_dwarf_unwind__krava_2:ip = 0x4a550b (0xa550b)
        got: test_dwarf_unwind__krava_2 0xa550b, expecting test_dwarf_unwind__krava_2
        unwind: test_dwarf_unwind__krava_1:ip = 0x4a554b (0xa554b)
        got: test_dwarf_unwind__krava_1 0xa554b, expecting test_dwarf_unwind__krava_1
        unwind: test__dwarf_unwind:ip = 0x4a5605 (0xa5605)
        got: test__dwarf_unwind 0xa5605, expecting test__dwarf_unwind
        test child finished with 0
        ---- end ----
        DWARF unwind: Ok
        #
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180206181813.10943-17-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fdf7c49c
    • Jiri Olsa's avatar
      tools lib api fs: Add sysfs__read_xll function · d9c5f322
      Jiri Olsa authored
      Add the sysfs__read_xll function to read sysfs files containing hex
      numbers that do not have a 0x prefix.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180206181813.10943-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d9c5f322
    • Jiri Olsa's avatar
      tools lib api fs: Add filename__read_xll function · 6baddfc6
      Jiri Olsa authored
      Add the filename__read_xll function to read files containing hex
      numbers that do not have a 0x prefix.
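
      The idea of parsing a file's content as hex regardless of a 0x prefix
      can be sketched as follows. This is a Python illustration, not the C
      helper added by this patch; the function name read_xll and the
      temporary file are assumptions for the example. Parsing with an
      explicit base of 16 accepts both "ff" and "0xff", much as strtoull()
      with base 16 does in C.

```python
import os
import tempfile


def read_xll(filename):
    """Read a file holding one hex number, with or without a 0x prefix."""
    with open(filename) as f:
        return int(f.read().strip(), 16)


if __name__ == "__main__":
    # Write a prefix-less hex value to a temporary file and read it back.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("ff\n")
    try:
        print(read_xll(tmp.name))
    finally:
        os.unlink(tmp.name)
```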
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180206181813.10943-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6baddfc6
    • Jiri Olsa's avatar
      perf script: Add --show-round-event to display PERF_RECORD_FINISHED_ROUND · 3233b37a
      Jiri Olsa authored
      Add --show-round-events to display PERF_RECORD_FINISHED_ROUND events,
      like:
      
        # perf script --show-round-events 2>/dev/null
                     yes  8591 [002] 124177.397597:         18         cpu/mem-stores/P: ff...
                     yes  8591 [002] 124177.397615:          1 cpu/mem-loads,ldlat=30/P: ff...
        PERF_RECORD_FINISHED_ROUND
                    perf 10380 [001] 124177.397622:          6 cpu/mem-loads,ldlat=30/P: ff...
        PERF_RECORD_FINISHED_ROUND
                 swapper     0 [000] 124177.400518:         88         cpu/mem-stores/P: ff...
                 swapper     0 [000] 124177.400521:         88         cpu/mem-stores/P: ff...
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180206181813.10943-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3233b37a
    • Jiri Olsa's avatar
      perf record: Put new line after target override warning · c3dec27b
      Jiri Olsa authored
      There's no newline after the target-override warning. Now:
      
        $ perf record -a --per-thread
        Warning:
        SYSTEM/CPU switch overriding PER-THREAD^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ]
      
      With the patch:
      
        $ perf record -a --per-thread
        Warning:
        SYSTEM/CPU switch overriding PER-THREAD
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ]
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 16ad2ffb ("perf tools: Introduce perf_target__strerror()")
      Link: http://lkml.kernel.org/r/20180206181813.10943-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c3dec27b
    • Jessica Yu's avatar
      kprobes: Propagate error from disarm_kprobe_ftrace() · 297f9233
      Jessica Yu authored
      Improve error handling when disarming ftrace-based kprobes. Like with
      arm_kprobe_ftrace(), propagate any errors from disarm_kprobe_ftrace() so
      that we do not disable/unregister kprobes that are still armed. In other
      words, unregister_kprobe() and disable_kprobe() should not report success
      if the kprobe could not be disarmed.
      
      disarm_all_kprobes() keeps its current behavior and attempts to
      disarm all kprobes. It returns the last encountered error and gives a
      warning if not all probes could be disarmed.
      
      This patch is based on Petr Mladek's original patchset (patches 2 and 3)
      back in 2015, which improved kprobes error handling, found here:
      
         https://lkml.org/lkml/2015/2/26/452
      
      However, further work on this had been paused since then and the patches
      were not upstreamed.
      Based-on-patches-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Jessica Yu <jeyu@kernel.org>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180109235124.30886-3-jeyu@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      297f9233
    • Jessica Yu's avatar
      kprobes: Propagate error from arm_kprobe_ftrace() · 12310e34
      Jessica Yu authored
      Improve error handling when arming ftrace-based kprobes. Specifically, if
      we fail to arm a ftrace-based kprobe, register_kprobe()/enable_kprobe()
      should report an error instead of success. Previously, this has led to
      confusing situations where register_kprobe() would return 0 indicating
      success, but the kprobe would not be functional if ftrace registration
      during the kprobe arming process had failed. We should therefore take any
      errors returned by ftrace into account and propagate this error so that we
      do not register/enable kprobes that cannot be armed. This can happen if,
      for example, register_ftrace_function() finds an IPMODIFY conflict (since
      kprobe_ftrace_ops has this flag set) and returns an error. Such a conflict
      is possible since livepatches also set the IPMODIFY flag for their ftrace_ops.
      
      arm_all_kprobes() keeps its current behavior and attempts to arm all
      kprobes. It returns the last encountered error and gives a warning if
      not all probes could be armed.
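
      The "attempt everything, return the last error, warn on failure"
      behaviour described for arm_all_kprobes() can be sketched like this.
      It is a Python illustration of the control flow only, not the kernel
      code; arm_all, arm_one and the error values used are hypothetical.

```python
import warnings


def arm_all(probes, arm_one):
    """Try to arm every probe; return 0 or the last error encountered.

    arm_one(probe) is a hypothetical callable returning 0 on success or a
    negative errno-style value on failure.
    """
    last_err = 0
    failed = 0
    for probe in probes:
        err = arm_one(probe)
        if err:
            # Remember the error but keep going with the remaining probes.
            last_err = err
            failed += 1
    if failed:
        warnings.warn("%d of %d probes could not be armed" % (failed, len(probes)))
    return last_err


if __name__ == "__main__":
    results = {"a": 0, "b": -16, "c": 0}
    print(arm_all(["a", "b", "c"], lambda p: results[p]))
```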
      
      This patch is based on Petr Mladek's original patchset (patches 2 and 3)
      back in 2015, which improved kprobes error handling, found here:
      
         https://lkml.org/lkml/2015/2/26/452
      
      However, further work on this had been paused since then and the patches
      were not upstreamed.
      Based-on-patches-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Jessica Yu <jeyu@kernel.org>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180109235124.30886-2-jeyu@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      12310e34
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-4.17-20180215' of... · 3f9e6463
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-4.17-20180215' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core fixes from Arnaldo Carvalho de Melo:
      
      - perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf top'
        using it, making it bearable to use it in large core count systems
        such as Knights Landing/Mill Intel systems (Kan Liang)
      
      - s/390 now uses syscall.tbl, just like x86-64 to generate the syscall
        table id -> string tables used by 'perf trace' (Hendrik Brueckner)
      
      - Use strtoull() instead of home grown function (Andy Shevchenko)
      
      - Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)
      
      - Document missing 'perf data --force' option (Sangwon Hong)
      
      - Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William Cohen)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3f9e6463
  2. 15 Feb, 2018 2 commits