1. 24 Apr, 2015 2 commits
  2. 23 Apr, 2015 4 commits
    • Bobby Powers's avatar
      tools lib api: Undefine _FORTIFY_SOURCE before setting it · de28c15d
      Bobby Powers authored
      Some toolchains (like Hardened Gentoo) define _FORTIFY_SOURCE in the
      built-in, default args.  This causes perf builds to fail with:
      
      <command-line>:0:0: error: "_FORTIFY_SOURCE" redefined [-Werror]
      <built-in>: note: this is the location of the previous definition cc1:
      all warnings being treated as errors
      
      To avoid this, undefine _FORTIFY_SOURCE before (possibly re-)defining it
      in tools/lib/api.
      
      v2 applies cleanly on top of already pulled kbuild changes for 4.1-rc1.
      Signed-off-by: default avatarBobby Powers <bobbypowers@gmail.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Dirk Gouders <dirk@gouders.net>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-kbuild@vger.kernel.org
      Link: http://lkml.kernel.org/r/1429658381-3039-1-git-send-email-bobbypowers@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      de28c15d
    • Will Deacon's avatar
      perf kmem: Consistently use PRIu64 for printing u64 values · 6145c259
      Will Deacon authored
      Building the perf tool for 32-bit ARM results in the following build
      error due to a combination of an incorrect conversion specifier and
      compiling with -Werror:
      
        builtin-kmem.c: In function ‘print_page_summary’:
        builtin-kmem.c:644:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=]
                 nr_alloc_freed, (total_alloc_freed_bytes) / 1024);
                 ^
        builtin-kmem.c:647:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=]
                 (total_page_alloc_bytes - total_alloc_freed_bytes) / 1024);
                 ^
        cc1: all warnings being treated as errors
      
      This patch fixes the problem by consistently using PRIu64 for printing
      out u64 values.
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1429796437-1790-1-git-send-email-will.deacon@arm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6145c259
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Disable events and drain events when forked workload ends · 02ac5421
      Arnaldo Carvalho de Melo authored
      We were not checking in the inner event processing loop if the forked workload
      had finished, which, on a busy system, may make it take a long time trying to
      drain events, entering a seemingly neverending loop, waiting for the system to
      get idle enough to make it drain the buffers.
      
      Fix it by disabling the events when 'done' is true, in the inner loop, to start
      draining what is in the buffers.
      
      Now:
      
      [root@ssdandy ~]# time trace --filter-pids 14003 -a sleep 1 | tail
        996.748 ( 0.002 ms): sh/30296 rt_sigprocmask(how: SETMASK, nset: 0x7ffc83418160, sigsetsize: 8) = 0
        996.751 ( 0.002 ms): sh/30296 rt_sigprocmask(how: BLOCK, nset: 0x7ffc834181f0, oset: 0x7ffc83418270, sigsetsize: 8) = 0
        996.755 ( 0.002 ms): sh/30296 rt_sigaction(sig: INT, act: 0x7ffc83417f50, oact: 0x7ffc83417ff0, sigsetsize: 8) = 0
       1004.543 ( 0.362 ms): tail/30198  ... [continued]: read()) = 4096
       1004.548 ( 7.791 ms): sh/30296 wait4(upid: -1, stat_addr: 0x7ffc834181a0) ...
       1004.975 ( 0.427 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096
       1005.390 ( 0.410 ms): tail/30198 read(buf: 0x765410, count: 8192) = 4096
       1005.743 ( 0.348 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096
       1006.197 ( 0.449 ms): tail/30198 read(buf: 0x765410, count: 8192) = 4096
       1006.492 ( 0.290 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096
      
      real	0m1.219s
      user	0m0.704s
      sys	0m0.331s
      [root@ssdandy ~]#
      Reported-by: default avatarMichael Petlan <mpetlan@redhat.com>
      Suggested-by: default avatarJiri Olsa <jolsa@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-p6kpn1b26qcbe47pufpw0tex@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      02ac5421
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Enable events when doing system wide tracing and starting a workload · cb24d01d
      Arnaldo Carvalho de Melo authored
       commit f7aa222f
       Author: Arnaldo Carvalho de Melo <acme@redhat.com>
       Date:   Tue Feb 3 13:25:39 2015 -0300
      
          perf trace: No need to enable evsels for workload started from perf
      
      The assumption was that whenever a workload is specified, the
      attr.enable_on_exec evsel flag would be set, but that is not happening
      when perf_record_opts.system_wide is set, for instance
      
      That resulted in both perf_evlist__enable() and attr.enable_on_exec
      being not called/set, which made the events to remain disabled while the
      workload runs, producing no output.
      
      Fix it,  by calling perf_evlist__enable() in the 'trace' tool
      when forking and not targetting a workload started from trace
      
      v2: Test against !target__none(), as suggested by Namhyung Kim, that is
      what is used in perf_evsel__config() when deciding if the
      attr.enable_on_exec flag to be set. More work is needed to cover other
      cases such as opts->initial_delay.
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-27z7169pvfxgj8upic636syv@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cb24d01d
  3. 22 Apr, 2015 3 commits
    • Sonny Rao's avatar
      perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver · 0140e614
      Sonny Rao authored
      This keeps all the related PCI IDs together in the driver where
      they are used.
      Signed-off-by: default avatarSonny Rao <sonnyrao@chromium.org>
      Acked-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1429644791-25724-1-git-send-email-sonnyrao@chromium.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0140e614
    • Sonny Rao's avatar
      perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile... · 80bcffb3
      Sonny Rao authored
      perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs
      
      This uncore is the same as the Haswell desktop part but uses a
      different PCI ID.
      Signed-off-by: default avatarSonny Rao <sonnyrao@chromium.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1429569247-16697-1-git-send-email-sonnyrao@chromium.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      80bcffb3
    • Jiri Olsa's avatar
      perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu · 3b6e0421
      Jiri Olsa authored
      The core_pmu does not define cpu_* callbacks, which handles
      allocation of 'struct cpu_hw_events::shared_regs' data,
      initialization of debug store and PMU_FL_EXCL_CNTRS counters.
      
      While this probably won't happen on bare metal, virtual CPU can
      define x86_pmu.extra_regs together with PMU version 1 and thus
      be using core_pmu -> using shared_regs data without it being
      allocated. That could could leave to following panic:
      
      	BUG: unable to handle kernel NULL pointer dereference at (null)
      	IP: [<ffffffff8152cd4f>] _spin_lock_irqsave+0x1f/0x40
      
      	SNIP
      
      	 [<ffffffff81024bd9>] __intel_shared_reg_get_constraints+0x69/0x1e0
      	 [<ffffffff81024deb>] intel_get_event_constraints+0x9b/0x180
      	 [<ffffffff8101e815>] x86_schedule_events+0x75/0x1d0
      	 [<ffffffff810586dc>] ? check_preempt_curr+0x7c/0x90
      	 [<ffffffff810649fe>] ? try_to_wake_up+0x24e/0x3e0
      	 [<ffffffff81064ba2>] ? default_wake_function+0x12/0x20
      	 [<ffffffff8109eb16>] ? autoremove_wake_function+0x16/0x40
      	 [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90
      	 [<ffffffff811a9517>] ? __d_lookup+0xa7/0x150
      	 [<ffffffff8119db5f>] ? do_lookup+0x9f/0x230
      	 [<ffffffff811a993a>] ? dput+0x9a/0x150
      	 [<ffffffff8119c8f5>] ? path_to_nameidata+0x25/0x60
      	 [<ffffffff8119e90a>] ? __link_path_walk+0x7da/0x1000
      	 [<ffffffff8101d8f9>] ? x86_pmu_add+0xb9/0x170
      	 [<ffffffff8101d7a7>] x86_pmu_commit_txn+0x67/0xc0
      	 [<ffffffff811b07b0>] ? mntput_no_expire+0x30/0x110
      	 [<ffffffff8119c731>] ? path_put+0x31/0x40
      	 [<ffffffff8107c297>] ? current_fs_time+0x27/0x30
      	 [<ffffffff8117d170>] ? mem_cgroup_get_reclaim_stat_from_page+0x20/0x70
      	 [<ffffffff8111b7aa>] group_sched_in+0x13a/0x170
      	 [<ffffffff81014a29>] ? sched_clock+0x9/0x10
      	 [<ffffffff8111bac8>] ctx_sched_in+0x2e8/0x330
      	 [<ffffffff8111bb7b>] perf_event_sched_in+0x6b/0xb0
      	 [<ffffffff8111bc36>] perf_event_context_sched_in+0x76/0xc0
      	 [<ffffffff8111eb3b>] perf_event_comm+0x1bb/0x2e0
      	 [<ffffffff81195ee9>] set_task_comm+0x69/0x80
      	 [<ffffffff81195fe1>] setup_new_exec+0xe1/0x2e0
      	 [<ffffffff811ea68e>] load_elf_binary+0x3ce/0x1ab0
      
      Adding cpu_(prepare|starting|dying) for core_pmu to have
      shared_regs data allocated for core_pmu. AFAICS there's no harm
      to initialize debug store and PMU_FL_EXCL_CNTRS either for
      core_pmu.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20150421152623.GC13169@krava.redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      3b6e0421
  4. 18 Apr, 2015 1 commit
  5. 17 Apr, 2015 4 commits
  6. 14 Apr, 2015 1 commit
  7. 13 Apr, 2015 5 commits
    • He Kuang's avatar
      perf probe: Fix segfault when probe with lazy_line to file · f19e80c6
      He Kuang authored
      The first argument passed to find_probe_point_lazy() should be CU die,
      which will be passed to die_walk_lines() when lazy_line matches.
      Currently, when we probe with lazy_line pattern to file without function
      name, NULL pointer is passed and causes a segment fault.
      
      Can be reproduced as following:
      
        $ perf probe -k vmlinux --add='fs/super.c;s->s_count=1;'
        [ 1958.984658] perf[1020]: segfault at 10 ip 00007fc6e10d8c71 sp
        00007ffcbfaaf900 error 4 in libdw-0.161.so[7fc6e10ce000+34000]
        Segmentation fault
      
      After this patch:
      
        $ perf probe -k vmlinux --add='fs/super.c;s->s_count=1;'
        Added new event:
        probe:_stext         (on @fs/super.c)
      
        You can now use it in all perf tools, such as:
          perf record -e probe:_stext -aR sleep 1
      Signed-off-by: default avatarHe Kuang <hekuang@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1428925290-5623-3-git-send-email-hekuang@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f19e80c6
    • Naohiro Aota's avatar
      perf probe: Find compilation directory path for lazy matching · 09ed8975
      Naohiro Aota authored
      If we use lazy matching, it failed to open a souce file if perf command
      is invoked outside of compilation directory:
      
      $ perf probe -a '__schedule;clear_*'
      Failed to open kernel/sched/core.c: No such file or directory
        Error: Failed to add events. (-2)
      
      OTOH, other commands like "probe -L" can solve the souce directory by
      themselves. Let's make it possible for lazy matching too!
      Signed-off-by: default avatarNaohiro Aota <naota@elisp.net>
      Acked-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1426223923-1493-1-git-send-email-naota@elisp.netSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      09ed8975
    • He Kuang's avatar
      perf probe: Set retprobe flag when probe in address-based alternative mode · 9d7b45c5
      He Kuang authored
      When perf probe searched in a debuginfo file and failed, it tried with
      an alternative, in function get_alternative_probe_event():
      
              memcpy(tmp, &pev->point, sizeof(*tmp));
              memset(&pev->point, 0, sizeof(pev->point));
      
      In this case, it drops the retprobe flag and forgets to set it back in
      find_alternative_probe_point(), so the problem occurs.
      
      Can be reproduced as following:
      
        $ perf probe -v -k vmlinux --add='sys_write%return'
        ...
        Added new event:
        Writing event: p:probe/sys_write _stext+1584952
          probe:sys_write      (on sys_write%return)
      
        $ cat /sys/kernel/debug/tracing/kprobe_events
        p:probe/sys_write _stext+1584952
      
      After this patch:
      
        $ perf probe -v -k vmlinux --add='sys_write%return'
        Added new event:
        Writing event: r:probe/sys_write SyS_write+0
          probe:sys_write      (on sys_write%return)
      
        $ cat /sys/kernel/debug/tracing/kprobe_events
        r:probe/sys_write SyS_write
      Signed-off-by: default avatarHe Kuang <hekuang@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1428925290-5623-1-git-send-email-hekuang@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9d7b45c5
    • Namhyung Kim's avatar
      perf kmem: Analyze page allocator events also · 0d68bc92
      Namhyung Kim authored
      The perf kmem command records and analyze kernel memory allocation only
      for SLAB objects.  This patch implement a simple page allocator analyzer
      using kmem:mm_page_alloc and kmem:mm_page_free events.
      
      It adds two new options of --slab and --page.  The --slab option is for
      analyzing SLAB allocator and that's what perf kmem currently does.
      
      The new --page option enables page allocator events and analyze kernel
      memory usage in page unit.  Currently, 'stat --alloc' subcommand is
      implemented only.
      
      If none of these --slab nor --page is specified, --slab is implied.
      
      First run 'perf kmem record' to generate a suitable perf.data file:
      
        # perf kmem record --page sleep 5
      
      Then run 'perf kmem stat' to postprocess the perf.data file:
      
        # perf kmem stat --page --alloc --line 10
      
        -------------------------------------------------------------------------------
         PFN              | Total alloc (KB) | Hits     | Order | Mig.type | GFP flags
        -------------------------------------------------------------------------------
                  4045014 |               16 |        1 |     2 |  RECLAIM |  00285250
                  4143980 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3938658 |               16 |        1 |     2 |  RECLAIM |  00285250
                  4045400 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3568708 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3729824 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3657210 |               16 |        1 |     2 |  RECLAIM |  00285250
                  4120750 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3678850 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3693874 |               16 |        1 |     2 |  RECLAIM |  00285250
         ...              | ...              | ...      | ...   | ...      | ...
        -------------------------------------------------------------------------------
      
        SUMMARY (page allocator)
        ========================
        Total allocation requests     :           44,260   [          177,256 KB ]
        Total free requests           :              117   [              468 KB ]
      
        Total alloc+freed requests    :               49   [              196 KB ]
        Total alloc-only requests     :           44,211   [          177,060 KB ]
        Total free-only requests      :               68   [              272 KB ]
      
        Total allocation failures     :                0   [                0 KB ]
      
        Order     Unmovable   Reclaimable       Movable      Reserved  CMA/Isolated
        -----  ------------  ------------  ------------  ------------  ------------
            0            32             .        44,210             .             .
            1             .             .             .             .             .
            2             .            18             .             .             .
            3             .             .             .             .             .
            4             .             .             .             .             .
            5             .             .             .             .             .
            6             .             .             .             .             .
            7             .             .             .             .             .
            8             .             .             .             .             .
            9             .             .             .             .             .
           10             .             .             .             .             .
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1428298576-9785-4-git-send-email-namhyung@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0d68bc92
    • Namhyung Kim's avatar
      tracing, mm: Record pfn instead of pointer to struct page · 9fdd8a87
      Namhyung Kim authored
      The struct page is opaque for userspace tools, so it'd be better to save
      pfn in order to identify page frames.
      
      The textual output of $debugfs/tracing/trace file remains unchanged and
      only raw (binary) data format is changed - but thanks to libtraceevent,
      userspace tools which deal with the raw data (like perf and trace-cmd)
      can parse the format easily.  So impact on the userspace will also be
      minimal.
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Based-on-patch-by: default avatarJoonsoo Kim <js1304@gmail.com>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1428298576-9785-3-git-send-email-namhyung@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9fdd8a87
  8. 12 Apr, 2015 1 commit
  9. 11 Apr, 2015 1 commit
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo' of... · 5dafd7cb
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      New user visible features:
      
        - Support multiple probes on different binaries on the same command line (Masami Hiramatsu)
      
      User visible changes:
      
        - Fix synthesizing fork_event.ppid for non-main thread (David Ahern)
      
        - Fix cross-endian analysis (David Ahern)
      
        - Fix segfault in 'perf buildid-list' when show DSOs with hits (He Kuang)
      
      Infrastructure changes:
      
        - Fix type for references to data_head/tail (David Ahern)
      
        - Fix error path to do closedir() when synthesizing threads (Arnaldo Carvalho de Melo)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      5dafd7cb
  10. 10 Apr, 2015 7 commits
  11. 08 Apr, 2015 11 commits
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo' of... · 51ab7155
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Teach 'perf record' about perf_event_attr.clockid (Peter Zijlstra)
      
        - Improve 'perf sched replay' on high CPU core count machines (Yunlong Song)
      
        - Consider PERF_RECORD_ events with cpumode == 0 in 'perf top', removing one
          cause of long term memory usage buildup, i.e. not processing PERF_RECORD_EXIT
          events (Arnaldo Carvalho de Melo)
      
        - Add 'I' event modifier for perf_event_attr.exclude_idle bit (Jiri Olsa)
      
        - Respect -i option 'in perf kmem' (Jiri Olsa)
      
      Infrastructure changes:
      
        - Honor operator priority in libtraceevent (Namhyung Kim)
      
        - Merge all perf_event_attr print functions (Peter Zijlstra)
      
        - Check kmaps access to make code more robust (Wang Nan)
      
        - Fix inverted logic in perf_mmap__empty() (He Kuang)
      
        - Fix ARM 32 'perf probe' building error (Wang Nan)
      
        - Fix perf_event_attr tests (Jiri Olsa)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      51ab7155
    • Jiri Olsa's avatar
      perf tools: Add 'I' event modifier for exclude_idle bit · a1e12da4
      Jiri Olsa authored
      Adding 'I' event modifier to have complete set of modifiers for
      perf_event_attr:exclude_* bits.
      
      Any event specified with 'I' modifier will have the
      perf_event_attr:exclude_idle bit set.
      
        $ perf record -e cycles:I -vv ls 2>&1 | grep exclude_idle
        exclude_hv          0    exclude_idle        1
      
      Adding automated tests.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: William Cohen <wcohen@redhat.com>
      Link: http://lkml.kernel.org/r/1428441919-23099-2-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a1e12da4
    • Wang Nan's avatar
      perf report: Don't call map__kmap if map is NULL. · f6fcc143
      Wang Nan authored
      report__warn_kptr_restrict() calls map__kmap(kernel_map) before checking
      kernel_map againest NULL.
      
      Which is dangerous, since map__kmap() will return a invalid and not NULL
      address.
      
      It will trigger a warning message in map__kmap() after the patch "perf:
      kmaps: enforce usage of kmaps to protect futher bugs." was applied.
      
      This patch fixes it by adding the missing checking.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1428490772-135393-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f6fcc143
    • Jiri Olsa's avatar
      perf tests: Fix attr tests · 54a50f93
      Jiri Olsa authored
      Following commit:
        1a594131 perf: Add wakeup watermark control to the AUX area
      
      enlarged perf_event_attr, but did not updated attr tests.
      Reported-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Kaixu Xia <kaixu.xia@linaro.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Markus T Metzger <markus.t.metzger@intel.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Link: http://lkml.kernel.org/n/20150407171715.GA22603@krava.redhat.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      54a50f93
    • Wang Nan's avatar
      perf probe: Fix ARM 32 building error · f6c15621
      Wang Nan authored
      Commit 9b118aca ("perf probe: Fix to
      handle aliased symbols in glibc") uses an absolute format '%lx' to
      print u64 argument, which causes compiling error on ARM 32.
      
      This patch replaces it with PRIx64.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Acked-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1428459274-138470-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f6c15621
    • Peter Zijlstra's avatar
      perf tools: Merge all perf_event_attr print functions · 2c5e8c52
      Peter Zijlstra authored
      Currently there's 3 (that I found) different and incomplete
      implementations of printing perf_event_attr.
      
      This is quite silly. Merge the lot.
      
      While this patch does not retain the exact form all printing that I
      found is debug output and thus it should not be critical.
      
      Also, I cannot find a single print_event_desc() caller.
      
      Pre:
      
       $ perf record -vv -e cycles -- sleep 1
       ------------------------------------------------------------
       perf_event_attr:
        type                0
        size                104
        config              0
        sample_period       4000
        sample_freq         4000
        sample_type         0x107
        read_format         0
        disabled            1    inherit             1
        pinned              0    exclusive           0
        exclude_user        0    exclude_kernel      0
        exclude_hv          0    exclude_idle        0
        mmap                1    comm                1
        mmap2               1    comm_exec           1
        freq                1    inherit_stat        0
        enable_on_exec      1    task                1
        watermark           0    precise_ip          0
        mmap_data           0    sample_id_all       1
        exclude_host        0    exclude_guest       1
        excl.callchain_kern 0    excl.callchain_user 0
        wakeup_events       0
        wakeup_watermark    0
        bp_type             0
        bp_addr             0
        config1             0
        bp_len              0
        config2             0
        branch_sample_type  0
        sample_regs_user    0
        sample_stack_user   0
        sample_regs_intr    0
       ------------------------------------------------------------
      
       $ perf evlist  -vv
       cycles: sample_freq=4000, size: 104, sample_type: IP|TID|TIME|PERIOD,
       disabled: 1, inherit: 1, mmap: 1, mmap2: 1, comm: 1, comm_exec: 1,
       freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1
      
       Post:
      
       $ ./perf record -vv -e cycles -- sleep 1
       ------------------------------------------------------------
       perf_event_attr:
        size                             112
        { sample_period, sample_freq }   4000
        sample_type                      IP|TID|TIME|PERIOD
        disabled                         1
        inherit                          1
        mmap                             1
        comm                             1
        freq                             1
        enable_on_exec                   1
        task                             1
        sample_id_all                    1
        exclude_guest                    1
        mmap2                            1
        comm_exec                        1
      ------------------------------------------------------------
      
       $ ./perf evlist  -vv
       cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
       IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq:
       1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1,
       mmap2: 1, comm_exec: 1
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150407091150.644238729@infradead.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2c5e8c52
    • Peter Zijlstra's avatar
      perf record: Add clockid parameter · 814c8c38
      Peter Zijlstra authored
      Teach perf-record about the new perf_event_attr::{use_clockid, clockid}
      fields. Add a simple parameter to set the clock (if any) to be used for
      the events to be recorded into the data file.
      
      Since we store the entire perf_event_attr in the EVENT_DESC section we
      also already store the used clockid in the data file.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yunlong Song <yunlong.song@huawei.com>
      Link: http://lkml.kernel.org/r/20150407154851.GR23123@twins.programming.kicks-ass.net
      [ Conditionally define CLOCK_BOOTTIME, at least rhel6 doesn't have it - dsahern
        Ditto for CLOCK_MONOTONIC_RAW, sles11sp2 doesn't have it - yunlong.song ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      814c8c38
    • Yunlong Song's avatar
      perf sched replay: Use replay_repeat to calculate the runavg of cpu usage... · ff5f3bbd
      Yunlong Song authored
      perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10
      
      Since sched->replay_repeat is set to 10 as default, the sched->run_avg,
      sched->runavg_cpu_usage, and sched->runavg_parent_cpu_usage all use
      10 to calculate their value.
      
      However, the replay_repeat can be changed to other value by using -r
      option, so the calculation above should use replay_repeat to achieve
      more accurate results instead of the default value 10.
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-10-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ff5f3bbd
    • Yunlong Song's avatar
      perf sched replay: Support using -f to override perf.data file ownership · f0dd330f
      Yunlong Song authored
      Enable to use perf.data when it is not owned by current user or root.
      
      Example:
      
       $ ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 5321918 Mar 25 15:14 perf.data
       $ sudo id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       $ sudo perf sched replay -f
       run measurement overhead: 98 nsecs
       sleep measurement overhead: 52909 nsecs
       the run test took 1000015 nsecs
       the sleep test took 1054253 nsecs
       File perf.data not owned by current user or root (use -f to override)
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       $ sudo perf sched replay -f
       run measurement overhead: 221 nsecs
       sleep measurement overhead: 40514 nsecs
       the run test took 1000003 nsecs
       the sleep test took 1056098 nsecs
       nr_run_events:        10
       nr_sleep_events:      1562
       nr_wakeup_events:     5
       task      0 (                  :1:         1), nr_events: 1
       task      1 (                  :2:         2), nr_events: 1
       task      2 (                  :3:         3), nr_events: 1
       ...
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       ------------------------------------------------------------
       #1  : 50.198, ravg: 50.20, cpu: 2335.18 / 2335.18
       #2  : 219.099, ravg: 67.09, cpu: 2835.11 / 2385.17
       #3  : 238.626, ravg: 84.24, cpu: 3278.26 / 2474.48
       #4  : 200.364, ravg: 95.85, cpu: 2977.41 / 2524.77
       #5  : 176.882, ravg: 103.96, cpu: 2801.35 / 2552.43
       #6  : 191.093, ravg: 112.67, cpu: 2813.70 / 2578.56
       #7  : 189.448, ravg: 120.35, cpu: 2809.21 / 2601.62
       #8  : 200.637, ravg: 128.38, cpu: 2849.91 / 2626.45
       #9  : 248.338, ravg: 140.37, cpu: 4380.61 / 2801.87
       #10 : 511.139, ravg: 177.45, cpu: 3077.73 / 2829.45
      
      As shown above, the -f option really works now.
      
      Besides for replay, -f option can also work for latency and map.
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-9-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f0dd330f
    • Yunlong Song's avatar
      perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files · 939cda52
      Yunlong Song authored
      The soft maximum number of open files for a calling process is 1024,
      which is defined as INR_OPEN_CUR in include/uapi/linux/fs.h, and the
      hard maximum number of open files for a calling process is 4096, which
      is defined as INR_OPEN_MAX in include/uapi/linux/fs.h.
      
      Both INR_OPEN_CUR and INR_OPEN_MAX are used to limit the value of
      RLIMIT_NOFILE in include/asm-generic/resource.h.
      
      And the soft maximum number finally decides the limitation of the
      maximum files which are allowed to be opened.
      
      That is to say a process can use at most 1024 file descriptors for its
      o pened files, or an EMFILE error will happen.
      
      This error can be fixed by increasing the soft maximum number, under the
      constraint that the soft maximum number can not exceed the hard maximum
      number, or both soft and hard maximum number should be increased
      simultaneously with privilege.
      
      For perf sched replay, it uses sys_perf_event_open to create the file
      descriptor for each of the tasks in order to handle information of perf
      events.
      
      That is to say each task needs a unique file descriptor. In x86_64,
      there may be over 1024 or 4096 tasks correspoinding to the record in
      perf.data, which causes that no enough file descriptors can be used.
      
      As a result, EMFILE error happens and stops the replay process. To solve
      this problem, we adaptively increase the soft and hard maximum number of
      open files with a '-f' option.
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
       $ cat /proc/sys/kernel/pid_max
       163840
       $ cat /proc/sys/fs/file-max
       6815744
       $ ulimit -Sn
       1024
       $ ulimit -Hn
       4096
      
      Before this patch:
      
       $ perf sched replay
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
      
      After this patch:
      
       $ perf sched replay
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
       Have a try with -f option
      
       $ perf sched replay -f
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       ------------------------------------------------------------
       #1  : 54.401, ravg: 54.40, cpu: 3285.21 / 3285.21
       #2  : 199.548, ravg: 68.92, cpu: 4999.65 / 3456.66
       #3  : 170.483, ravg: 79.07, cpu: 1349.94 / 3245.99
       #4  : 192.034, ravg: 90.37, cpu: 1322.88 / 3053.67
       #5  : 182.929, ravg: 99.62, cpu: 1406.51 / 2888.96
       #6  : 152.974, ravg: 104.96, cpu: 1167.54 / 2716.82
       #7  : 155.579, ravg: 110.02, cpu: 2992.53 / 2744.39
       #8  : 130.557, ravg: 112.08, cpu: 1126.43 / 2582.59
       #9  : 138.520, ravg: 114.72, cpu: 1253.22 / 2449.65
       #10 : 134.328, ravg: 116.68, cpu: 1587.95 / 2363.48
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-8-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      939cda52
    • Yunlong Song's avatar
      perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task · 1aff59be
      Yunlong Song authored
      Since there is sem_wait for each task in the wait_for_tasks(), e.g.
      sem_wait(&task->work_done_sem).
      
      The sem_wait can continue only when work_done_sem is greater than 0, or
      it will be blocked.
      
      For perf sched replay, one task may sem_post the work_done_sem of
      another task, which causes the work_done_sem of that task processed in a
      reasonable sequence, e.g. sem_post, sem_wait, sem_wait, sem_post...
      
      This sequence simulates the sched process of the running tasks at the
      time when perf sched record runs.
      
      As a result, all the tasks are required and their threads must be
      successfully created.
      
      If any one (task A) of the tasks fails to create its thread, then
      another task (task B), whose work_done_sem needs sem_post from that
      failed task A, may likely block itself due to seg_wait.
      
      And this is a dead halt, since task B's thread_func cannot continue at
      all.
      
      To solve this problem, perf sched replay should exit once any task fails
      to create its thread.
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
      Before this patch:
      
       $ perf sched replay
       ...
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
       ------------------------------------------------------------    <- dead halt
      
      After this patch:
      
       $ perf sched replay
       ...
       task   1551 (           <unknown>:         0), nr_events: 10
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
       $
      
      As shown above, perf sched replay finishes the process after printing an
      error message and does not block itself.
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-7-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1aff59be