Commit ef35ad26 authored by Linus Torvalds

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf changes from Ingo Molnar:
 "Kernel side changes:

   - Consolidate the PMU interrupt-disabled code amongst architectures
     (Vince Weaver)

   - misc fixes

  Tooling changes (new features, user visible changes):

   - Add support for pagefault tracing in 'trace', please see multiple
     examples in the changeset messages (Stanislav Fomichev).

   - Add pagefault statistics in 'trace' (Stanislav Fomichev)

   - Add header for columns in 'top' and 'report' TUI browsers (Jiri
     Olsa)

   - Add IO mode into timechart command (Stanislav Fomichev)

   - Fallback to syscalls:* when raw_syscalls:* is not available in the
     perl and python perf scripts.  (Daniel Bristot de Oliveira)

   - Add --repeat global option to 'perf bench' to be used in benchmarks
     such as the existing 'futex' one, that was modified to use it
     instead of a local option.  (Davidlohr Bueso)

   - Fix fd -> pathname resolution in 'trace', be it using /proc or a
     vfs_getname probe point.  (Arnaldo Carvalho de Melo)

   - Add suggestion of how to set perf_event_paranoid sysctl, to help
     non-root users trying tools like 'trace' to get a working
     environment.  (Arnaldo Carvalho de Melo)

   - Updates from trace-cmd for traceevent plugin_kvm plus args cleanup
     (Steven Rostedt, Jan Kiszka)

   - Support S/390 in 'perf kvm stat' (Alexander Yarygin)

  Tooling infrastructure changes:

   - Allow reserving a row for header purposes in the hists browser
     (Arnaldo Carvalho de Melo)

   - Various fixes and prep work related to supporting Intel PT (Adrian
     Hunter)

   - Introduce multiple debug variables control (Jiri Olsa)

   - Add callchain and additional sample information for python scripts
     (Joseph Schuchart)

   - More prep work to support Intel PT: (Adrian Hunter)
     - Polishing 'script' BTS output
     - 'inject' can specify --kallsyms
     - VDSO is per machine, not a global var
     - Expose data addr lookup functions previously private to 'script'
     - Large mmap fixes in events processing

   - Include standard stringify macros in powerpc code (Sukadev
     Bhattiprolu)

  Tooling cleanups:

   - Convert open coded equivalents to asprintf() (Andy Shevchenko)

   - Remove needless reassignments in 'trace' (Arnaldo Carvalho de Melo)

   - Cache the is_exit syscall test in 'trace' (Arnaldo Carvalho de
     Melo)

   - No need to reimplement err() in 'perf bench sched-messaging', drop
     barf().  (Davidlohr Bueso).

   - Remove ev_name argument from perf_evsel__hists_browse, can be
     obtained from the other parameters.  (Jiri Olsa)

  Tooling fixes:

   - Fix memory leak in the 'sched-messaging' perf bench test.
     (Davidlohr Bueso)

   - The -o and -n 'perf bench mem' options are mutually exclusive, emit
     error when both are specified.  (Davidlohr Bueso)

   - Fix scrollbar refresh row index in the ui browser, problem exposed
     now that headers will be added and will be allowed to be switched
     on/off.  (Jiri Olsa)

   - Handle the num array type in python properly (Sebastian Andrzej
     Siewior)

   - Fix wrong condition for allocation failure (Jiri Olsa)

   - Adjust callchain based on DWARF debug info on powerpc (Sukadev
     Bhattiprolu)

   - Fix a risk for doing free on uninitialized pointer in traceevent
     lib (Rickard Strandqvist)

   - Update attr test with PERF_FLAG_FD_CLOEXEC flag (Jiri Olsa)

   - Enable close-on-exec flag on perf file descriptor (Yann Droneaud)

   - Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo)

   - Event ordering fixes (Jiri Olsa)"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (123 commits)
  Revert "perf tools: Fix jump label always changing during tracing"
  perf tools: Fix perf usage string leftover
  perf: Check permission only for parent tracepoint event
  perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds
  perf record: Always force PERF_RECORD_FINISHED_ROUND event
  perf inject: Add --kallsyms parameter
  perf tools: Expose 'addr' functions so they can be reused
  perf session: Fix accounting of ordered samples queue
  perf powerpc: Include util/util.h and remove stringify macros
  perf tools: Fix build on gcc 4.4.7
  perf tools: Add thread parameter to vdso__dso_findnew()
  perf tools: Add dso__type()
  perf tools: Separate the VDSO map name from the VDSO dso name
  perf tools: Add vdso__new()
  perf machine: Fix the lifetime of the VDSO temporary file
  perf tools: Group VDSO global variables into a structure
  perf session: Add ability to skip 4GiB or more
  perf session: Add ability to 'skip' a non-piped event stream
  perf tools: Pass machine to vdso__dso_findnew()
  perf tools: Add dso__data_size()
  ...
parents 8efb90cf f9b9f812
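
The kernel-side consolidation that dominates the arch hunks below follows a single pattern: rather than each PMU driver rejecting sampling events in its event_init hook, the driver now advertises PERF_PMU_CAP_NO_INTERRUPT once at registration time and the perf core performs the rejection. A minimal before/after sketch (the "foo" driver is a placeholder, not code from this merge):

        /* Before: each driver open-coded the same sampling check. */
        static int foo_pmu_event_init(struct perf_event *event)
        {
                if (is_sampling_event(event))   /* no overflow interrupt */
                        return -ENOENT;
                /* ... */
                return 0;
        }

        /* After: declare the capability once; the core rejects sampling events. */
        static int __init foo_pmu_init(void)
        {
                foo_pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
                return perf_pmu_register(&foo_pmu, "foo", PERF_TYPE_RAW);
        }
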
......@@ -99,10 +99,6 @@ static int arc_pmu_event_init(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int ret;
/* ARC 700 PMU does not support sampling events */
if (is_sampling_event(event))
return -ENOENT;
switch (event->attr.type) {
case PERF_TYPE_HARDWARE:
if (event->attr.config >= PERF_COUNT_HW_MAX)
......@@ -298,6 +294,9 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
.read = arc_pmu_read,
};
/* ARC 700 PMU does not support sampling events */
arc_pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
ret = perf_pmu_register(&arc_pmu->pmu, pdev->name, PERF_TYPE_RAW);
return ret;
......
......@@ -389,14 +389,6 @@ static int bfin_pmu_event_init(struct perf_event *event)
if (attr->exclude_hv || attr->exclude_idle)
return -EPERM;
/*
* All of the on-chip counters are "limited", in that they have
* no interrupts, and are therefore unable to do sampling without
* further work and timer assistance.
*/
if (hwc->sample_period)
return -EINVAL;
ret = 0;
switch (attr->type) {
case PERF_TYPE_RAW:
......@@ -490,6 +482,13 @@ static int __init bfin_pmu_init(void)
{
int ret;
/*
* All of the on-chip counters are "limited", in that they have
* no interrupts, and are therefore unable to do sampling without
* further work and timer assistance.
*/
pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
ret = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
if (!ret)
perf_cpu_notifier(bfin_pmu_notifier);
......
......@@ -567,16 +567,6 @@ static int _hw_perf_event_init(struct perf_event *event)
if (mapping == -1)
return -EINVAL;
/*
* Early cores have "limited" counters - they have no overflow
* interrupts - and so are unable to do sampling without extra work
* and timer assistance.
*/
if (metag_pmu->max_period == 0) {
if (hwc->sample_period)
return -EINVAL;
}
/*
* Don't assign an index until the event is placed into the hardware.
* -1 signifies that we're still deciding where to put it. On SMP
......@@ -866,6 +856,15 @@ static int __init init_hw_perf_events(void)
pr_info("enabled with %s PMU driver, %d counters available\n",
metag_pmu->name, metag_pmu->max_events);
/*
* Early cores have "limited" counters - they have no overflow
* interrupts - and so are unable to do sampling without extra work
* and timer assistance.
*/
if (metag_pmu->max_period == 0) {
metag_pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
}
/* Initialise the active events and reservation mutex */
atomic_set(&metag_pmu->active_events, 0);
mutex_init(&metag_pmu->reserve_mutex);
......
......@@ -387,8 +387,7 @@ static int h_24x7_event_init(struct perf_event *event)
event->attr.exclude_hv ||
event->attr.exclude_idle ||
event->attr.exclude_host ||
event->attr.exclude_guest ||
is_sampling_event(event)) /* no sampling */
event->attr.exclude_guest)
return -EINVAL;
/* no branch sampling */
......@@ -513,6 +512,9 @@ static int hv_24x7_init(void)
if (!hv_page_cache)
return -ENOMEM;
/* sampling not supported */
h_24x7_pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
r = perf_pmu_register(&h_24x7_pmu, h_24x7_pmu.name, -1);
if (r)
return r;
......
......@@ -210,8 +210,7 @@ static int h_gpci_event_init(struct perf_event *event)
event->attr.exclude_hv ||
event->attr.exclude_idle ||
event->attr.exclude_host ||
event->attr.exclude_guest ||
is_sampling_event(event)) /* no sampling */
event->attr.exclude_guest)
return -EINVAL;
/* no branch sampling */
......@@ -284,6 +283,9 @@ static int hv_gpci_init(void)
return -ENODEV;
}
/* sampling not supported */
h_gpci_pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
r = perf_pmu_register(&h_gpci_pmu, h_gpci_pmu.name, -1);
if (r)
return r;
......
......@@ -16,6 +16,7 @@ header-y += ioctls.h
header-y += ipcbuf.h
header-y += kvm.h
header-y += kvm_para.h
header-y += kvm_perf.h
header-y += kvm_virtio.h
header-y += mman.h
header-y += monwriter.h
......
/*
* Definitions for perf-kvm on s390
*
* Copyright 2014 IBM Corp.
* Author(s): Alexander Yarygin <yarygin@linux.vnet.ibm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License (version 2 only)
* as published by the Free Software Foundation.
*/
#ifndef __LINUX_KVM_PERF_S390_H
#define __LINUX_KVM_PERF_S390_H
#include <asm/sie.h>
#define DECODE_STR_LEN 40
#define VCPU_ID "id"
#define KVM_ENTRY_TRACE "kvm:kvm_s390_sie_enter"
#define KVM_EXIT_TRACE "kvm:kvm_s390_sie_exit"
#define KVM_EXIT_REASON "icptcode"
#endif
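
These small per-arch kvm_perf.h headers exist so that the arch-independent 'perf kvm stat' code can name the entry/exit tracepoints and their fields uniformly. A hedged sketch of the consuming side (the helper below is illustrative, not taken from this merge):

        #include <string.h>
        #include <stdbool.h>
        #include <asm/kvm_perf.h>  /* KVM_ENTRY_TRACE, KVM_EXIT_TRACE, KVM_EXIT_REASON */

        /* Matches "kvm:kvm_exit" on x86 and "kvm:kvm_s390_sie_exit" on s390. */
        static bool is_kvm_exit_event(const char *event_name)
        {
                return strcmp(event_name, KVM_EXIT_TRACE) == 0;
        }
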
......@@ -411,12 +411,6 @@ static int cpumf_pmu_event_init(struct perf_event *event)
case PERF_TYPE_HARDWARE:
case PERF_TYPE_HW_CACHE:
case PERF_TYPE_RAW:
/* The CPU measurement counter facility does not have overflow
* interrupts to do sampling. Sampling must be provided by
* external means, for example, by timers.
*/
if (is_sampling_event(event))
return -ENOENT;
err = __hw_perf_event_init(event);
break;
default:
......@@ -681,6 +675,12 @@ static int __init cpumf_pmu_init(void)
goto out;
}
/* The CPU measurement counter facility does not have overflow
* interrupts to do sampling. Sampling must be provided by
* external means, for example, by timers.
*/
cpumf_pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
cpumf_pmu.attr_groups = cpumf_cf_event_group();
rc = perf_pmu_register(&cpumf_pmu, "cpum_cf", PERF_TYPE_RAW);
if (rc) {
......
......@@ -128,14 +128,6 @@ static int __hw_perf_event_init(struct perf_event *event)
if (!sh_pmu_initialized())
return -ENODEV;
/*
* All of the on-chip counters are "limited", in that they have
* no interrupts, and are therefore unable to do sampling without
* further work and timer assistance.
*/
if (hwc->sample_period)
return -EINVAL;
/*
* See if we need to reserve the counter.
*
......@@ -392,6 +384,13 @@ int register_sh_pmu(struct sh_pmu *_pmu)
pr_info("Performance Events: %s support registered\n", _pmu->name);
/*
* All of the on-chip counters are "limited", in that they have
* no interrupts, and are therefore unable to do sampling without
* further work and timer assistance.
*/
pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
WARN_ON(_pmu->num_events > MAX_HWEVENTS);
perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
......
......@@ -22,6 +22,7 @@ header-y += ipcbuf.h
header-y += ist.h
header-y += kvm.h
header-y += kvm_para.h
header-y += kvm_perf.h
header-y += ldt.h
header-y += mce.h
header-y += mman.h
......
#ifndef _ASM_X86_KVM_PERF_H
#define _ASM_X86_KVM_PERF_H
#include <asm/svm.h>
#include <asm/vmx.h>
#include <asm/kvm.h>
#define DECODE_STR_LEN 20
#define VCPU_ID "vcpu_id"
#define KVM_ENTRY_TRACE "kvm:kvm_entry"
#define KVM_EXIT_TRACE "kvm:kvm_exit"
#define KVM_EXIT_REASON "exit_reason"
#endif /* _ASM_X86_KVM_PERF_H */
......@@ -294,31 +294,41 @@ static struct amd_uncore *amd_uncore_alloc(unsigned int cpu)
cpu_to_node(cpu));
}
static void amd_uncore_cpu_up_prepare(unsigned int cpu)
static int amd_uncore_cpu_up_prepare(unsigned int cpu)
{
struct amd_uncore *uncore;
struct amd_uncore *uncore_nb = NULL, *uncore_l2;
if (amd_uncore_nb) {
uncore = amd_uncore_alloc(cpu);
uncore->cpu = cpu;
uncore->num_counters = NUM_COUNTERS_NB;
uncore->rdpmc_base = RDPMC_BASE_NB;
uncore->msr_base = MSR_F15H_NB_PERF_CTL;
uncore->active_mask = &amd_nb_active_mask;
uncore->pmu = &amd_nb_pmu;
*per_cpu_ptr(amd_uncore_nb, cpu) = uncore;
uncore_nb = amd_uncore_alloc(cpu);
if (!uncore_nb)
goto fail;
uncore_nb->cpu = cpu;
uncore_nb->num_counters = NUM_COUNTERS_NB;
uncore_nb->rdpmc_base = RDPMC_BASE_NB;
uncore_nb->msr_base = MSR_F15H_NB_PERF_CTL;
uncore_nb->active_mask = &amd_nb_active_mask;
uncore_nb->pmu = &amd_nb_pmu;
*per_cpu_ptr(amd_uncore_nb, cpu) = uncore_nb;
}
if (amd_uncore_l2) {
uncore = amd_uncore_alloc(cpu);
uncore->cpu = cpu;
uncore->num_counters = NUM_COUNTERS_L2;
uncore->rdpmc_base = RDPMC_BASE_L2;
uncore->msr_base = MSR_F16H_L2I_PERF_CTL;
uncore->active_mask = &amd_l2_active_mask;
uncore->pmu = &amd_l2_pmu;
*per_cpu_ptr(amd_uncore_l2, cpu) = uncore;
uncore_l2 = amd_uncore_alloc(cpu);
if (!uncore_l2)
goto fail;
uncore_l2->cpu = cpu;
uncore_l2->num_counters = NUM_COUNTERS_L2;
uncore_l2->rdpmc_base = RDPMC_BASE_L2;
uncore_l2->msr_base = MSR_F16H_L2I_PERF_CTL;
uncore_l2->active_mask = &amd_l2_active_mask;
uncore_l2->pmu = &amd_l2_pmu;
*per_cpu_ptr(amd_uncore_l2, cpu) = uncore_l2;
}
return 0;
fail:
kfree(uncore_nb);
return -ENOMEM;
}
static struct amd_uncore *
......@@ -441,7 +451,7 @@ static void uncore_dead(unsigned int cpu, struct amd_uncore * __percpu *uncores)
if (!--uncore->refcnt)
kfree(uncore);
*per_cpu_ptr(amd_uncore_nb, cpu) = NULL;
*per_cpu_ptr(uncores, cpu) = NULL;
}
static void amd_uncore_cpu_dead(unsigned int cpu)
......@@ -461,7 +471,8 @@ amd_uncore_cpu_notifier(struct notifier_block *self, unsigned long action,
switch (action & ~CPU_TASKS_FROZEN) {
case CPU_UP_PREPARE:
amd_uncore_cpu_up_prepare(cpu);
if (amd_uncore_cpu_up_prepare(cpu))
return notifier_from_errno(-ENOMEM);
break;
case CPU_STARTING:
......@@ -501,20 +512,33 @@ static void __init init_cpu_already_online(void *dummy)
amd_uncore_cpu_online(cpu);
}
static void cleanup_cpu_online(void *dummy)
{
unsigned int cpu = smp_processor_id();
amd_uncore_cpu_dead(cpu);
}
static int __init amd_uncore_init(void)
{
unsigned int cpu;
unsigned int cpu, cpu2;
int ret = -ENODEV;
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
return -ENODEV;
goto fail_nodev;
if (!cpu_has_topoext)
return -ENODEV;
goto fail_nodev;
if (cpu_has_perfctr_nb) {
amd_uncore_nb = alloc_percpu(struct amd_uncore *);
perf_pmu_register(&amd_nb_pmu, amd_nb_pmu.name, -1);
if (!amd_uncore_nb) {
ret = -ENOMEM;
goto fail_nb;
}
ret = perf_pmu_register(&amd_nb_pmu, amd_nb_pmu.name, -1);
if (ret)
goto fail_nb;
printk(KERN_INFO "perf: AMD NB counters detected\n");
ret = 0;
......@@ -522,20 +546,28 @@ static int __init amd_uncore_init(void)
if (cpu_has_perfctr_l2) {
amd_uncore_l2 = alloc_percpu(struct amd_uncore *);
perf_pmu_register(&amd_l2_pmu, amd_l2_pmu.name, -1);
if (!amd_uncore_l2) {
ret = -ENOMEM;
goto fail_l2;
}
ret = perf_pmu_register(&amd_l2_pmu, amd_l2_pmu.name, -1);
if (ret)
goto fail_l2;
printk(KERN_INFO "perf: AMD L2I counters detected\n");
ret = 0;
}
if (ret)
return -ENODEV;
goto fail_nodev;
cpu_notifier_register_begin();
/* init cpus already online before registering for hotplug notifier */
for_each_online_cpu(cpu) {
amd_uncore_cpu_up_prepare(cpu);
ret = amd_uncore_cpu_up_prepare(cpu);
if (ret)
goto fail_online;
smp_call_function_single(cpu, init_cpu_already_online, NULL, 1);
}
......@@ -543,5 +575,30 @@ static int __init amd_uncore_init(void)
cpu_notifier_register_done();
return 0;
fail_online:
for_each_online_cpu(cpu2) {
if (cpu2 == cpu)
break;
smp_call_function_single(cpu, cleanup_cpu_online, NULL, 1);
}
cpu_notifier_register_done();
/* amd_uncore_nb/l2 should have been freed by cleanup_cpu_online */
amd_uncore_nb = amd_uncore_l2 = NULL;
if (cpu_has_perfctr_l2)
perf_pmu_unregister(&amd_l2_pmu);
fail_l2:
if (cpu_has_perfctr_nb)
perf_pmu_unregister(&amd_nb_pmu);
if (amd_uncore_l2)
free_percpu(amd_uncore_l2);
fail_nb:
if (amd_uncore_nb)
free_percpu(amd_uncore_nb);
fail_nodev:
return ret;
}
device_initcall(amd_uncore_init);
......@@ -2947,10 +2947,7 @@ nhmex_rbox_get_constraint(struct intel_uncore_box *box, struct perf_event *event
* extra registers. If we failed to take an extra
* register, try the alternative.
*/
if (idx % 2)
idx--;
else
idx++;
idx ^= 1;
if (idx != reg1->idx % 6) {
if (idx == 2)
config1 >>= 8;
......
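
The change above replaces a four-line branch with a single XOR: for any index, idx ^ 1 flips the low bit, mapping each even index to the following odd one and vice versa, exactly like the removed if/else. A standalone check of the equivalence:

        #include <assert.h>

        int main(void)
        {
                for (int idx = 0; idx < 8; idx++) {
                        int old = (idx % 2) ? idx - 1 : idx + 1;  /* removed branch */
                        assert((idx ^ 1) == old);                 /* new expression */
                }
                return 0;
        }
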
......@@ -5266,6 +5266,12 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
goto got_name;
} else {
if (vma->vm_ops && vma->vm_ops->name) {
name = (char *) vma->vm_ops->name(vma);
if (name)
goto cpy_name;
}
name = (char *)arch_vma_name(vma);
if (name)
goto cpy_name;
......@@ -7804,7 +7810,7 @@ inherit_task_group(struct perf_event *event, struct task_struct *parent,
/*
* Initialize the perf_event context in task_struct
*/
int perf_event_init_context(struct task_struct *child, int ctxn)
static int perf_event_init_context(struct task_struct *child, int ctxn)
{
struct perf_event_context *child_ctx, *parent_ctx;
struct perf_event_context *cloned_ctx;
......
......@@ -30,6 +30,18 @@ static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
return ret;
}
/*
* We checked and allowed to create parent,
* allow children without checking.
*/
if (p_event->parent)
return 0;
/*
* It's ok to check current process (owner) permissions in here,
* because code below is called only via perf_event_open syscall.
*/
/* The ftrace function trace is allowed only for root. */
if (ftrace_event_is_function(tp_event)) {
if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
......
......@@ -2395,7 +2395,7 @@ process_flags(struct event_format *event, struct print_arg *arg, char **tok)
{
struct print_arg *field;
enum event_type type;
char *token;
char *token = NULL;
memset(arg, 0, sizeof(*arg));
arg->type = PRINT_FLAGS;
......@@ -2448,7 +2448,7 @@ process_symbols(struct event_format *event, struct print_arg *arg, char **tok)
{
struct print_arg *field;
enum event_type type;
char *token;
char *token = NULL;
memset(arg, 0, sizeof(*arg));
arg->type = PRINT_SYMBOL;
......@@ -2487,7 +2487,7 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)
{
struct print_arg *field;
enum event_type type;
char *token;
char *token = NULL;
memset(arg, 0, sizeof(*arg));
arg->type = PRINT_HEX;
......
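
The three token = NULL initializations above all close the same hole, flagged in the commit message as "a risk for doing free on uninitialized pointer": an early error path reaches the cleanup code before token is ever assigned. Since free(NULL) is defined as a no-op while freeing an indeterminate pointer is undefined behavior, the NULL initialization makes the error path safe. A minimal illustration (the parser below is a placeholder, not code from the library):

        #include <stdlib.h>
        #include <string.h>

        static int parse_something(const char *input)
        {
                char *token = NULL;     /* the fix: previously uninitialized */
                int ret = -1;

                if (!input)
                        goto out_free;  /* early error: token never assigned */

                token = strdup(input);
                if (token)
                        ret = 0;        /* real parsing would happen here */

        out_free:
                free(token);            /* safe: free(NULL) does nothing */
                return ret;
        }
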
......@@ -5,8 +5,7 @@
#include "event-parse.h"
static unsigned long long
process___le16_to_cpup(struct trace_seq *s,
unsigned long long *args)
process___le16_to_cpup(struct trace_seq *s, unsigned long long *args)
{
uint16_t *val = (uint16_t *) (unsigned long) args[0];
return val ? (long long) le16toh(*val) : 0;
......
......@@ -30,8 +30,7 @@
#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK))
static unsigned long long
process_jbd2_dev_to_name(struct trace_seq *s,
unsigned long long *args)
process_jbd2_dev_to_name(struct trace_seq *s, unsigned long long *args)
{
unsigned int dev = args[0];
......@@ -40,8 +39,7 @@ process_jbd2_dev_to_name(struct trace_seq *s,
}
static unsigned long long
process_jiffies_to_msecs(struct trace_seq *s,
unsigned long long *args)
process_jiffies_to_msecs(struct trace_seq *s, unsigned long long *args)
{
unsigned long long jiffies = args[0];
......
......@@ -240,25 +240,38 @@ static const char *find_exit_reason(unsigned isa, int val)
for (i = 0; strings[i].val >= 0; i++)
if (strings[i].val == val)
break;
if (strings[i].str)
return strings[i].str;
return "UNKNOWN";
return strings[i].str;
}
static int kvm_exit_handler(struct trace_seq *s, struct pevent_record *record,
struct event_format *event, void *context)
static int print_exit_reason(struct trace_seq *s, struct pevent_record *record,
struct event_format *event, const char *field)
{
unsigned long long isa;
unsigned long long val;
unsigned long long info1 = 0, info2 = 0;
const char *reason;
if (pevent_get_field_val(s, event, "exit_reason", record, &val, 1) < 0)
if (pevent_get_field_val(s, event, field, record, &val, 1) < 0)
return -1;
if (pevent_get_field_val(s, event, "isa", record, &isa, 0) < 0)
isa = 1;
trace_seq_printf(s, "reason %s", find_exit_reason(isa, val));
reason = find_exit_reason(isa, val);
if (reason)
trace_seq_printf(s, "reason %s", reason);
else
trace_seq_printf(s, "reason UNKNOWN (%llu)", val);
return 0;
}
static int kvm_exit_handler(struct trace_seq *s, struct pevent_record *record,
struct event_format *event, void *context)
{
unsigned long long info1 = 0, info2 = 0;
if (print_exit_reason(s, record, event, "exit_reason") < 0)
return -1;
pevent_print_num_field(s, " rip 0x%lx", event, "guest_rip", record, 1);
......@@ -313,6 +326,29 @@ static int kvm_emulate_insn_handler(struct trace_seq *s,
return 0;
}
static int kvm_nested_vmexit_inject_handler(struct trace_seq *s, struct pevent_record *record,
struct event_format *event, void *context)
{
if (print_exit_reason(s, record, event, "exit_code") < 0)
return -1;
pevent_print_num_field(s, " info1 %llx", event, "exit_info1", record, 1);
pevent_print_num_field(s, " info2 %llx", event, "exit_info2", record, 1);
pevent_print_num_field(s, " int_info %llx", event, "exit_int_info", record, 1);
pevent_print_num_field(s, " int_info_err %llx", event, "exit_int_info_err", record, 1);
return 0;
}
static int kvm_nested_vmexit_handler(struct trace_seq *s, struct pevent_record *record,
struct event_format *event, void *context)
{
pevent_print_num_field(s, "rip %llx ", event, "rip", record, 1);
return kvm_nested_vmexit_inject_handler(s, record, event, context);
}
union kvm_mmu_page_role {
unsigned word;
struct {
......@@ -409,6 +445,12 @@ int PEVENT_PLUGIN_LOADER(struct pevent *pevent)
pevent_register_event_handler(pevent, -1, "kvm", "kvm_emulate_insn",
kvm_emulate_insn_handler, NULL);
pevent_register_event_handler(pevent, -1, "kvm", "kvm_nested_vmexit",
kvm_nested_vmexit_handler, NULL);
pevent_register_event_handler(pevent, -1, "kvm", "kvm_nested_vmexit_inject",
kvm_nested_vmexit_inject_handler, NULL);
pevent_register_event_handler(pevent, -1, "kvmmmu", "kvm_mmu_get_page",
kvm_mmu_get_page_handler, NULL);
......@@ -443,6 +485,12 @@ void PEVENT_PLUGIN_UNLOADER(struct pevent *pevent)
pevent_unregister_event_handler(pevent, -1, "kvm", "kvm_emulate_insn",
kvm_emulate_insn_handler, NULL);
pevent_unregister_event_handler(pevent, -1, "kvm", "kvm_nested_vmexit",
kvm_nested_vmexit_handler, NULL);
pevent_unregister_event_handler(pevent, -1, "kvm", "kvm_nested_vmexit_inject",
kvm_nested_vmexit_inject_handler, NULL);
pevent_unregister_event_handler(pevent, -1, "kvmmmu", "kvm_mmu_get_page",
kvm_mmu_get_page_handler, NULL);
......
......@@ -16,6 +16,10 @@ This 'perf bench' command is a general framework for benchmark suites.
COMMON OPTIONS
--------------
-r::
--repeat=::
Specify amount of times to repeat the run (default 10).
-f::
--format=::
Specify format style.
......
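
The new global option combines with any suite, for instance the futex ones (illustrative invocations, not taken from the docs above):

        $ perf bench --repeat=100 futex wake
        $ perf bench -r 20 futex requeue
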
......@@ -41,6 +41,9 @@ OPTIONS
tasks slept. sched_switch contains a callchain where a task slept and
sched_stat contains a timeslice how long a task slept.
--kallsyms=<file>::
kallsyms pathname
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-archive[1]
......@@ -51,9 +51,9 @@ There are a couple of variants of perf kvm:
'perf kvm stat <command>' to run a command and gather performance counter
statistics.
Especially, perf 'kvm stat record/report' generates a statistical analysis
of KVM events. Currently, vmexit, mmio and ioport events are supported.
'perf kvm stat record <command>' records kvm events and the events between
start and end <command>.
of KVM events. Currently, vmexit, mmio (x86 only) and ioport (x86 only)
events are supported. 'perf kvm stat record <command>' records kvm events
and the events between start and end <command>.
And this command produces a file which contains tracing results of kvm
events.
......@@ -103,8 +103,8 @@ STAT REPORT OPTIONS
analyze events which occur on this vcpu. (default: all vcpus)
--event=<value>::
event to be analyzed. Possible values: vmexit, mmio, ioport.
(default: vmexit)
event to be analyzed. Possible values: vmexit, mmio (x86 only),
ioport (x86 only). (default: vmexit)
-k::
--key=<value>::
Sorting key. Possible values: sample (default, sort by samples
......@@ -138,7 +138,8 @@ STAT LIVE OPTIONS
--event=<value>::
event to be analyzed. Possible values: vmexit, mmio, ioport.
event to be analyzed. Possible values: vmexit,
mmio (x86 only), ioport (x86 only).
(default: vmexit)
-k::
......@@ -147,7 +148,8 @@ STAT LIVE OPTIONS
number), time (sort by average time).
--duration=<value>::
Show events other than HLT that take longer than duration usecs.
Show events other than HLT (x86 only) or Wait state (s390 only)
that take longer than duration usecs.
SEE ALSO
--------
......
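
Putting the documented pieces together, a typical session looks the same on either architecture (illustrative invocation):

        $ perf kvm stat record sleep 10
        $ perf kvm stat report --event=vmexit --key=time
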
......@@ -15,10 +15,20 @@ DESCRIPTION
There are two variants of perf timechart:
'perf timechart record <command>' to record the system level events
of an arbitrary workload.
of an arbitrary workload. By default timechart records only scheduler
and CPU events (task switches, running times, CPU power states, etc),
but it's possible to record IO (disk, network) activity using the -I argument.
'perf timechart' to turn a trace into a Scalable Vector Graphics file,
that can be viewed with popular SVG viewers such as 'Inkscape'.
that can be viewed with popular SVG viewers such as 'Inkscape'. Depending
on the events in the perf.data file, timechart will contain scheduler/cpu
events or IO events.
In IO mode, every bar has two charts: upper and lower.
Upper bar shows incoming events (disk reads, ingress network packets).
Lower bar shows outgoing events (disk writes, egress network packets).
There are also poll bars which show how much time the application spent
in poll/epoll/select syscalls.
TIMECHART OPTIONS
-----------------
......@@ -54,6 +64,19 @@ TIMECHART OPTIONS
duration or tasks with given name. If number is given it's interpreted
as number of nanoseconds. If non-numeric string is given it's
interpreted as task name.
--io-skip-eagain::
Don't draw EAGAIN IO events.
--io-min-time=<nsecs>::
Draw small events as if they lasted min-time. Useful when you need
to see very small and fast IO. It's possible to specify ms or us
suffix to specify time in milliseconds or microseconds.
Default value is 1ms.
--io-merge-dist=<nsecs>::
Merge events that are merge-dist nanoseconds apart.
Reduces number of figures on the SVG and makes it more render-friendly.
It's possible to specify ms or us suffix to specify time in
milliseconds or microseconds.
Default value is 1us.
RECORD OPTIONS
--------------
......@@ -63,6 +86,9 @@ RECORD OPTIONS
-T::
--tasks-only::
Record only tasks-related events
-I::
--io-only::
Record only io-related events
-g::
--callchain::
Do call-graph (stack chain/backtrace) recording
......@@ -87,6 +113,14 @@ Record system-wide timechart:
$ perf timechart --highlight gcc
Record system-wide IO events:
$ perf timechart record -I
then generate timechart:
$ perf timechart
SEE ALSO
--------
linkperf:perf-record[1]
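
For very dense IO charts, the rendering options documented above can be combined when generating the SVG (values here are illustrative):

        $ perf timechart --io-skip-eagain --io-min-time=1ms --io-merge-dist=10us
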
......@@ -107,6 +107,52 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
Show tool stats such as number of times fd->pathname was discovered thru
hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc.
-F=[all|min|maj]::
--pf=[all|min|maj]::
Trace pagefaults. Optionally, you can specify whether you want minor,
major or all pagefaults. Default value is maj.
--syscalls::
Trace system calls. This option is enabled by default.
PAGEFAULTS
----------
When tracing pagefaults, the format of the trace is as follows:
<min|maj>fault [<ip.symbol>+<ip.offset>] => <addr.dso@addr.offset> (<map type><addr level>).
- min/maj indicates whether the fault event is minor or major;
- ip.symbol shows the symbol for the instruction pointer (the code that generated the
fault); if no debug symbols are available, perf trace will print the raw IP;
- addr.dso shows DSO for the faulted address;
- map type is either 'd' for non-executable maps or 'x' for executable maps;
- addr level is either 'k' for kernel dso or '.' for user dso.
For symbols resolution you may need to install debugging symbols.
Please be aware that duration is currently always 0 and doesn't reflect the
actual time it took for the fault to be handled!
When --verbose is specified, perf trace tries to print all available information
for both IP and fault address in the form of dso@symbol+offset.
EXAMPLES
--------
Trace only major pagefaults:
$ perf trace --no-syscalls -F
Trace syscalls, major and minor pagefaults:
$ perf trace -F all
1416.547 ( 0.000 ms): python/20235 majfault [CRYPTO_push_info_+0x0] => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0@0x61be0 (x.)
As you can see, there was a major pagefault in the python process, from the
CRYPTO_push_info_ routine, which faulted somewhere in libcrypto.so.
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script[1]
......@@ -8,7 +8,15 @@ perf - Performance analysis tools for Linux
SYNOPSIS
--------
[verse]
'perf' [--version] [--help] COMMAND [ARGS]
'perf' [--version] [--help] [OPTIONS] COMMAND [ARGS]
OPTIONS
-------
--debug::
Setup debug variable (just verbose for now) in value
range (0, 10). Use like:
--debug verbose # sets verbose = 1
--debug verbose=2 # sets verbose = 2
DESCRIPTION
-----------
......
......@@ -37,3 +37,6 @@ arch/x86/include/asm/kvm_host.h
arch/x86/include/uapi/asm/svm.h
arch/x86/include/uapi/asm/vmx.h
arch/x86/include/uapi/asm/kvm.h
arch/x86/include/uapi/asm/kvm_perf.h
arch/s390/include/uapi/asm/sie.h
arch/s390/include/uapi/asm/kvm_perf.h
......@@ -295,11 +295,13 @@ LIB_H += util/intlist.h
LIB_H += util/perf_regs.h
LIB_H += util/unwind.h
LIB_H += util/vdso.h
LIB_H += util/tsc.h
LIB_H += ui/helpline.h
LIB_H += ui/progress.h
LIB_H += ui/util.h
LIB_H += ui/ui.h
LIB_H += util/data.h
LIB_H += util/kvm-stat.h
LIB_OBJS += $(OUTPUT)util/abspath.o
LIB_OBJS += $(OUTPUT)util/alias.o
......@@ -373,6 +375,8 @@ LIB_OBJS += $(OUTPUT)util/stat.o
LIB_OBJS += $(OUTPUT)util/record.o
LIB_OBJS += $(OUTPUT)util/srcline.o
LIB_OBJS += $(OUTPUT)util/data.o
LIB_OBJS += $(OUTPUT)util/tsc.o
LIB_OBJS += $(OUTPUT)util/cloexec.o
LIB_OBJS += $(OUTPUT)ui/setup.o
LIB_OBJS += $(OUTPUT)ui/helpline.o
......
......@@ -3,3 +3,4 @@ PERF_HAVE_DWARF_REGS := 1
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
endif
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/skip-callchain-idx.o
......@@ -5,9 +5,7 @@
#include <string.h>
#include "../../util/header.h"
#define __stringify_1(x) #x
#define __stringify(x) __stringify_1(x)
#include "../../util/util.h"
#define mfspr(rn) ({unsigned long rval; \
asm volatile("mfspr %0," __stringify(rn) \
......
/*
* Use DWARF Debug information to skip unnecessary callchain entries.
*
* Copyright (C) 2014 Sukadev Bhattiprolu, IBM Corporation.
* Copyright (C) 2014 Ulrich Weigand, IBM Corporation.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <inttypes.h>
#include <dwarf.h>
#include <elfutils/libdwfl.h>
#include "util/thread.h"
#include "util/callchain.h"
/*
* When saving the callchain on Power, the kernel conservatively saves
* excess entries in the callchain. A few of these entries are needed
* in some cases but not others. If the unnecessary entries are not
* ignored, we end up with duplicate arcs in the call-graphs. Use
* DWARF debug information to skip over any unnecessary callchain
* entries.
*
* See function header for arch_adjust_callchain() below for more details.
*
* The libdwfl code in this file is based on code from elfutils
* (libdwfl/argp-std.c, libdwfl/tests/addrcfi.c, etc).
*/
static char *debuginfo_path;
static const Dwfl_Callbacks offline_callbacks = {
.debuginfo_path = &debuginfo_path,
.find_debuginfo = dwfl_standard_find_debuginfo,
.section_address = dwfl_offline_section_address,
};
/*
* Use the DWARF expression for the Call-frame-address and determine
* if return address is in LR and if a new frame was allocated.
*/
static int check_return_reg(int ra_regno, Dwarf_Frame *frame)
{
Dwarf_Op ops_mem[2];
Dwarf_Op dummy;
Dwarf_Op *ops = &dummy;
size_t nops;
int result;
result = dwarf_frame_register(frame, ra_regno, ops_mem, &ops, &nops);
if (result < 0) {
pr_debug("dwarf_frame_register() %s\n", dwarf_errmsg(-1));
return -1;
}
/*
* Check if return address is on the stack.
*/
if (nops != 0 || ops != NULL)
return 0;
/*
* Return address is in LR. Check if a frame was allocated
* but not-yet used.
*/
result = dwarf_frame_cfa(frame, &ops, &nops);
if (result < 0) {
pr_debug("dwarf_frame_cfa() returns %d, %s\n", result,
dwarf_errmsg(-1));
return -1;
}
/*
* If call frame address is in r1, no new frame was allocated.
*/
if (nops == 1 && ops[0].atom == DW_OP_bregx && ops[0].number == 1 &&
ops[0].number2 == 0)
return 1;
/*
* A new frame was allocated but has not yet been used.
*/
return 2;
}
/*
* Get the DWARF frame from the .eh_frame section.
*/
static Dwarf_Frame *get_eh_frame(Dwfl_Module *mod, Dwarf_Addr pc)
{
int result;
Dwarf_Addr bias;
Dwarf_CFI *cfi;
Dwarf_Frame *frame;
cfi = dwfl_module_eh_cfi(mod, &bias);
if (!cfi) {
pr_debug("%s(): no CFI - %s\n", __func__, dwfl_errmsg(-1));
return NULL;
}
result = dwarf_cfi_addrframe(cfi, pc, &frame);
if (result) {
pr_debug("%s(): %s\n", __func__, dwfl_errmsg(-1));
return NULL;
}
return frame;
}
/*
* Get the DWARF frame from the .debug_frame section.
*/
static Dwarf_Frame *get_dwarf_frame(Dwfl_Module *mod, Dwarf_Addr pc)
{
Dwarf_CFI *cfi;
Dwarf_Addr bias;
Dwarf_Frame *frame;
int result;
cfi = dwfl_module_dwarf_cfi(mod, &bias);
if (!cfi) {
pr_debug("%s(): no CFI - %s\n", __func__, dwfl_errmsg(-1));
return NULL;
}
result = dwarf_cfi_addrframe(cfi, pc, &frame);
if (result) {
pr_debug("%s(): %s\n", __func__, dwfl_errmsg(-1));
return NULL;
}
return frame;
}
/*
* Return:
* 0 if return address for the program counter @pc is on stack
* 1 if return address is in LR and no new stack frame was allocated
* 2 if return address is in LR and a new frame was allocated (but not
* yet used)
* -1 in case of errors
*/
static int check_return_addr(const char *exec_file, Dwarf_Addr pc)
{
int rc = -1;
Dwfl *dwfl;
Dwfl_Module *mod;
Dwarf_Frame *frame;
int ra_regno;
Dwarf_Addr start = pc;
Dwarf_Addr end = pc;
bool signalp;
dwfl = dwfl_begin(&offline_callbacks);
if (!dwfl) {
pr_debug("dwfl_begin() failed: %s\n", dwarf_errmsg(-1));
return -1;
}
if (dwfl_report_offline(dwfl, "", exec_file, -1) == NULL) {
pr_debug("dwfl_report_offline() failed %s\n", dwarf_errmsg(-1));
goto out;
}
mod = dwfl_addrmodule(dwfl, pc);
if (!mod) {
pr_debug("dwfl_addrmodule() failed, %s\n", dwarf_errmsg(-1));
goto out;
}
/*
* To work with split debug info files (eg: glibc), check both
* .eh_frame and .debug_frame sections of the ELF header.
*/
frame = get_eh_frame(mod, pc);
if (!frame) {
frame = get_dwarf_frame(mod, pc);
if (!frame)
goto out;
}
ra_regno = dwarf_frame_info(frame, &start, &end, &signalp);
if (ra_regno < 0) {
pr_debug("Return address register unavailable: %s\n",
dwarf_errmsg(-1));
goto out;
}
rc = check_return_reg(ra_regno, frame);
out:
dwfl_end(dwfl);
return rc;
}
/*
* The callchain saved by the kernel always includes the link register (LR).
*
* 0: PERF_CONTEXT_USER
* 1: Program counter (Next instruction pointer)
* 2: LR value
* 3: Caller's caller
* 4: ...
*
* The value in LR is only needed when it holds a return address. If the
* return address is on the stack, we should ignore the LR value.
*
* Further, when the return address is in the LR, if a new frame was just
* allocated but the LR was not saved into it, then the LR contains the
* caller, slot 4: contains the caller's caller and the contents of slot 3:
* (chain->ips[3]) is undefined and must be ignored.
*
* Use DWARF debug information to determine if any entries need to be skipped.
*
* Return:
* index: of callchain entry that needs to be ignored (if any)
* -1 if no entry needs to be ignored or in case of errors
*/
int arch_skip_callchain_idx(struct machine *machine, struct thread *thread,
struct ip_callchain *chain)
{
struct addr_location al;
struct dso *dso = NULL;
int rc;
u64 ip;
u64 skip_slot = -1;
if (chain->nr < 3)
return skip_slot;
ip = chain->ips[2];
thread__find_addr_location(thread, machine, PERF_RECORD_MISC_USER,
MAP__FUNCTION, ip, &al);
if (al.map)
dso = al.map->dso;
if (!dso) {
pr_debug("%" PRIx64 " dso is NULL\n", ip);
return skip_slot;
}
rc = check_return_addr(dso->long_name, ip);
pr_debug("DSO %s, nr %" PRIx64 ", ip 0x%" PRIx64 "rc %d\n",
dso->long_name, chain->nr, ip, rc);
if (rc == 0) {
/*
* Return address on stack. Ignore LR value in callchain
*/
skip_slot = 2;
} else if (rc == 2) {
/*
* New frame allocated but return address still in LR.
* Ignore the caller's caller entry in callchain.
*/
skip_slot = 3;
}
return skip_slot;
}
......@@ -2,3 +2,6 @@ ifndef NO_DWARF
PERF_HAVE_DWARF_REGS := 1
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
endif
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
HAVE_KVM_STAT_SUPPORT := 1
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/kvm-stat.o
/*
* Implementation of get_cpuid().
*
* Copyright 2014 IBM Corp.
* Author(s): Alexander Yarygin <yarygin@linux.vnet.ibm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License (version 2 only)
* as published by the Free Software Foundation.
*/
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include "../../util/header.h"
int get_cpuid(char *buffer, size_t sz)
{
const char *cpuid = "IBM/S390";
if (strlen(cpuid) + 1 > sz)
return -1;
strcpy(buffer, cpuid);
return 0;
}
/*
* Arch specific functions for perf kvm stat.
*
* Copyright 2014 IBM Corp.
* Author(s): Alexander Yarygin <yarygin@linux.vnet.ibm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License (version 2 only)
* as published by the Free Software Foundation.
*/
#include "../../util/kvm-stat.h"
#include <asm/kvm_perf.h>
define_exit_reasons_table(sie_exit_reasons, sie_intercept_code);
define_exit_reasons_table(sie_icpt_insn_codes, icpt_insn_codes);
define_exit_reasons_table(sie_sigp_order_codes, sigp_order_codes);
define_exit_reasons_table(sie_diagnose_codes, diagnose_codes);
define_exit_reasons_table(sie_icpt_prog_codes, icpt_prog_codes);
static void event_icpt_insn_get_key(struct perf_evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
unsigned long insn;
insn = perf_evsel__intval(evsel, sample, "instruction");
key->key = icpt_insn_decoder(insn);
key->exit_reasons = sie_icpt_insn_codes;
}
static void event_sigp_get_key(struct perf_evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
key->key = perf_evsel__intval(evsel, sample, "order_code");
key->exit_reasons = sie_sigp_order_codes;
}
static void event_diag_get_key(struct perf_evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
key->key = perf_evsel__intval(evsel, sample, "code");
key->exit_reasons = sie_diagnose_codes;
}
static void event_icpt_prog_get_key(struct perf_evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
key->key = perf_evsel__intval(evsel, sample, "code");
key->exit_reasons = sie_icpt_prog_codes;
}
static struct child_event_ops child_events[] = {
{ .name = "kvm:kvm_s390_intercept_instruction",
.get_key = event_icpt_insn_get_key },
{ .name = "kvm:kvm_s390_handle_sigp",
.get_key = event_sigp_get_key },
{ .name = "kvm:kvm_s390_handle_diag",
.get_key = event_diag_get_key },
{ .name = "kvm:kvm_s390_intercept_prog",
.get_key = event_icpt_prog_get_key },
{ NULL, NULL },
};
static struct kvm_events_ops exit_events = {
.is_begin_event = exit_event_begin,
.is_end_event = exit_event_end,
.child_ops = child_events,
.decode_key = exit_event_decode_key,
.name = "VM-EXIT"
};
const char * const kvm_events_tp[] = {
"kvm:kvm_s390_sie_enter",
"kvm:kvm_s390_sie_exit",
"kvm:kvm_s390_intercept_instruction",
"kvm:kvm_s390_handle_sigp",
"kvm:kvm_s390_handle_diag",
"kvm:kvm_s390_intercept_prog",
NULL,
};
struct kvm_reg_events_ops kvm_reg_events_ops[] = {
{ .name = "vmexit", .ops = &exit_events },
{ NULL, NULL },
};
const char * const kvm_skip_events[] = {
"Wait state",
NULL,
};
int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid)
{
if (strstr(cpuid, "IBM/S390")) {
kvm->exit_reasons = sie_exit_reasons;
kvm->exit_reasons_isa = "SIE";
} else
return -ENOTSUP;
return 0;
}
......@@ -15,3 +15,5 @@ endif
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/tsc.o
LIB_H += arch/$(ARCH)/util/tsc.h
HAVE_KVM_STAT_SUPPORT := 1
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/kvm-stat.o
......@@ -3,6 +3,7 @@
#include "thread.h"
#include "map.h"
#include "event.h"
#include "debug.h"
#include "tests/tests.h"
#define STACK_SIZE 8192
......
#include "../../util/kvm-stat.h"
#include <asm/kvm_perf.h>
define_exit_reasons_table(vmx_exit_reasons, VMX_EXIT_REASONS);
define_exit_reasons_table(svm_exit_reasons, SVM_EXIT_REASONS);
static struct kvm_events_ops exit_events = {
.is_begin_event = exit_event_begin,
.is_end_event = exit_event_end,
.decode_key = exit_event_decode_key,
.name = "VM-EXIT"
};
/*
* For the mmio events, we treat:
* the time of MMIO write: kvm_mmio(KVM_TRACE_MMIO_WRITE...) -> kvm_entry
* the time of MMIO read: kvm_exit -> kvm_mmio(KVM_TRACE_MMIO_READ...).
*/
static void mmio_event_get_key(struct perf_evsel *evsel, struct perf_sample *sample,
struct event_key *key)
{
key->key = perf_evsel__intval(evsel, sample, "gpa");
key->info = perf_evsel__intval(evsel, sample, "type");
}
#define KVM_TRACE_MMIO_READ_UNSATISFIED 0
#define KVM_TRACE_MMIO_READ 1
#define KVM_TRACE_MMIO_WRITE 2
static bool mmio_event_begin(struct perf_evsel *evsel,
struct perf_sample *sample, struct event_key *key)
{
/* MMIO read begin event in kernel. */
if (kvm_exit_event(evsel))
return true;
/* MMIO write begin event in kernel. */
if (!strcmp(evsel->name, "kvm:kvm_mmio") &&
perf_evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_WRITE) {
mmio_event_get_key(evsel, sample, key);
return true;
}
return false;
}
static bool mmio_event_end(struct perf_evsel *evsel, struct perf_sample *sample,
struct event_key *key)
{
/* MMIO write end event in kernel. */
if (kvm_entry_event(evsel))
return true;
/* MMIO read end event in kernel.*/
if (!strcmp(evsel->name, "kvm:kvm_mmio") &&
perf_evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_READ) {
mmio_event_get_key(evsel, sample, key);
return true;
}
return false;
}
static void mmio_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
struct event_key *key,
char *decode)
{
scnprintf(decode, DECODE_STR_LEN, "%#lx:%s",
(unsigned long)key->key,
key->info == KVM_TRACE_MMIO_WRITE ? "W" : "R");
}
static struct kvm_events_ops mmio_events = {
.is_begin_event = mmio_event_begin,
.is_end_event = mmio_event_end,
.decode_key = mmio_event_decode_key,
.name = "MMIO Access"
};
/* The time of emulation pio access is from kvm_pio to kvm_entry. */
static void ioport_event_get_key(struct perf_evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
key->key = perf_evsel__intval(evsel, sample, "port");
key->info = perf_evsel__intval(evsel, sample, "rw");
}
static bool ioport_event_begin(struct perf_evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
if (!strcmp(evsel->name, "kvm:kvm_pio")) {
ioport_event_get_key(evsel, sample, key);
return true;
}
return false;
}
static bool ioport_event_end(struct perf_evsel *evsel,
struct perf_sample *sample __maybe_unused,
struct event_key *key __maybe_unused)
{
return kvm_entry_event(evsel);
}
static void ioport_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
struct event_key *key,
char *decode)
{
scnprintf(decode, DECODE_STR_LEN, "%#llx:%s",
(unsigned long long)key->key,
key->info ? "POUT" : "PIN");
}
static struct kvm_events_ops ioport_events = {
.is_begin_event = ioport_event_begin,
.is_end_event = ioport_event_end,
.decode_key = ioport_event_decode_key,
.name = "IO Port Access"
};
const char * const kvm_events_tp[] = {
"kvm:kvm_entry",
"kvm:kvm_exit",
"kvm:kvm_mmio",
"kvm:kvm_pio",
NULL,
};
struct kvm_reg_events_ops kvm_reg_events_ops[] = {
{ .name = "vmexit", .ops = &exit_events },
{ .name = "mmio", .ops = &mmio_events },
{ .name = "ioport", .ops = &ioport_events },
{ NULL, NULL },
};
const char * const kvm_skip_events[] = {
"HLT",
NULL,
};
int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid)
{
if (strstr(cpuid, "Intel")) {
kvm->exit_reasons = vmx_exit_reasons;
kvm->exit_reasons_isa = "VMX";
} else if (strstr(cpuid, "AMD")) {
kvm->exit_reasons = svm_exit_reasons;
kvm->exit_reasons_isa = "SVM";
} else
return -ENOTSUP;
return 0;
}
......@@ -6,29 +6,9 @@
#include "../../perf.h"
#include <linux/types.h>
#include "../../util/debug.h"
#include "../../util/tsc.h"
#include "tsc.h"
u64 perf_time_to_tsc(u64 ns, struct perf_tsc_conversion *tc)
{
u64 t, quot, rem;
t = ns - tc->time_zero;
quot = t / tc->time_mult;
rem = t % tc->time_mult;
return (quot << tc->time_shift) +
(rem << tc->time_shift) / tc->time_mult;
}
u64 tsc_to_perf_time(u64 cyc, struct perf_tsc_conversion *tc)
{
u64 quot, rem;
quot = cyc >> tc->time_shift;
rem = cyc & ((1 << tc->time_shift) - 1);
return tc->time_zero + quot * tc->time_mult +
((rem * tc->time_mult) >> tc->time_shift);
}
int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
struct perf_tsc_conversion *tc)
{
......@@ -57,3 +37,12 @@ int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
return 0;
}
u64 rdtsc(void)
{
unsigned int low, high;
asm volatile("rdtsc" : "=a" (low), "=d" (high));
return low | ((u64)high) << 32;
}
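
The two conversion helpers removed here move to the shared util/tsc.c (added to LIB_OBJS in the Makefile hunk earlier) so non-x86 code can link against them. Both directions implement the mult/shift scaling the kernel exports in the perf mmap page:

        ns  = time_zero + ((cyc * time_mult) >> time_shift)
        cyc = ((ns - time_zero) << time_shift) / time_mult

The quotient/remainder split in perf_time_to_tsc(), i.e. (quot << shift) + (rem << shift) / mult, evaluates the second line without overflowing 64 bits when ns - time_zero is large. As a worked example, with time_mult = 1000000 and time_shift = 20, a delta of one second (10^9 ns) converts to (10^9 << 20) / 10^6 = 1048576000 cycles, corresponding to a ~1.05 GHz TSC.
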
......@@ -14,7 +14,4 @@ struct perf_event_mmap_page;
int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
struct perf_tsc_conversion *tc);
u64 perf_time_to_tsc(u64 ns, struct perf_tsc_conversion *tc);
u64 tsc_to_perf_time(u64 cyc, struct perf_tsc_conversion *tc);
#endif /* TOOLS_PERF_ARCH_X86_UTIL_TSC_H__ */
......@@ -3,6 +3,7 @@
#include <libunwind.h>
#include "perf_regs.h"
#include "../../util/unwind.h"
#include "../../util/debug.h"
#ifdef HAVE_ARCH_X86_64_SUPPORT
int libunwind__arch_reg_id(int regnum)
......
......@@ -43,5 +43,6 @@ extern int bench_futex_requeue(int argc, const char **argv, const char *prefix);
#define BENCH_FORMAT_UNKNOWN -1
extern int bench_format;
extern unsigned int bench_repeat;
#endif
......@@ -29,13 +29,6 @@ static u_int32_t futex1 = 0, futex2 = 0;
*/
static unsigned int nrequeue = 1;
/*
* There can be significant variance from run to run,
* the more repeats, the more exact the overall avg and
* the better idea of the futex latency.
*/
static unsigned int repeat = 10;
static pthread_t *worker;
static bool done = 0, silent = 0;
static pthread_mutex_t thread_lock;
......@@ -46,7 +39,6 @@ static unsigned int ncpus, threads_starting, nthreads = 0;
static const struct option options[] = {
OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),
OPT_UINTEGER('q', "nrequeue", &nrequeue, "Specify amount of threads to requeue at once"),
OPT_UINTEGER('r', "repeat", &repeat, "Specify amount of times to repeat the run"),
OPT_BOOLEAN( 's', "silent", &silent, "Silent mode: do not display data/details"),
OPT_END()
};
......@@ -146,7 +138,7 @@ int bench_futex_requeue(int argc, const char **argv,
pthread_cond_init(&thread_parent, NULL);
pthread_cond_init(&thread_worker, NULL);
for (j = 0; j < repeat && !done; j++) {
for (j = 0; j < bench_repeat && !done; j++) {
unsigned int nrequeued = 0;
struct timeval start, end, runtime;
......
......@@ -30,15 +30,8 @@ static u_int32_t futex1 = 0;
*/
static unsigned int nwakes = 1;
/*
* There can be significant variance from run to run,
* the more repeats, the more exact the overall avg and
* the better idea of the futex latency.
*/
static unsigned int repeat = 10;
pthread_t *worker;
static bool done = 0, silent = 0;
static bool done = false, silent = false;
static pthread_mutex_t thread_lock;
static pthread_cond_t thread_parent, thread_worker;
static struct stats waketime_stats, wakeup_stats;
......@@ -47,7 +40,6 @@ static unsigned int ncpus, threads_starting, nthreads = 0;
static const struct option options[] = {
OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),
OPT_UINTEGER('w', "nwakes", &nwakes, "Specify amount of threads to wake at once"),
OPT_UINTEGER('r', "repeat", &repeat, "Specify amount of times to repeat the run"),
OPT_BOOLEAN( 's', "silent", &silent, "Silent mode: do not display data/details"),
OPT_END()
};
......@@ -149,7 +141,7 @@ int bench_futex_wake(int argc, const char **argv,
pthread_cond_init(&thread_parent, NULL);
pthread_cond_init(&thread_worker, NULL);
for (j = 0; j < repeat && !done; j++) {
for (j = 0; j < bench_repeat && !done; j++) {
unsigned int nwoken = 0;
struct timeval start, end, runtime;
......
......@@ -10,6 +10,7 @@
#include "../util/util.h"
#include "../util/parse-options.h"
#include "../util/header.h"
#include "../util/cloexec.h"
#include "bench.h"
#include "mem-memcpy-arch.h"
......@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
static void init_cycle(void)
{
cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
perf_event_open_cloexec_flag());
if (cycle_fd < 0 && errno == ENOSYS)
die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
......@@ -189,6 +191,11 @@ int bench_mem_memcpy(int argc, const char **argv,
argc = parse_options(argc, argv, options,
bench_mem_memcpy_usage, 0);
if (no_prefault && only_prefault) {
fprintf(stderr, "Invalid options: -o and -n are mutually exclusive\n");
return 1;
}
if (use_cycle)
init_cycle();
......
......@@ -10,6 +10,7 @@
#include "../util/util.h"
#include "../util/parse-options.h"
#include "../util/header.h"
#include "../util/cloexec.h"
#include "bench.h"
#include "mem-memset-arch.h"
......@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
static void init_cycle(void)
{
cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
perf_event_open_cloexec_flag());
if (cycle_fd < 0 && errno == ENOSYS)
die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
......@@ -181,6 +183,11 @@ int bench_mem_memset(int argc, const char **argv,
argc = parse_options(argc, argv, options,
bench_mem_memset_usage, 0);
if (no_prefault && only_prefault) {
fprintf(stderr, "Invalid options: -o and -n are mutually exclusive\n");
return 1;
}
if (use_cycle)
init_cycle();
......
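
perf_event_open_cloexec_flag() comes from the new util/cloexec.c object added in the tools Makefile hunk earlier. A hedged sketch of the probe idea behind such a helper (simplified; the real implementation differs in details such as errno handling):

        #include <unistd.h>
        #include <sys/syscall.h>
        #include <linux/perf_event.h>

        #ifndef PERF_FLAG_FD_CLOEXEC
        #define PERF_FLAG_FD_CLOEXEC (1UL << 3)  /* absent from older uapi headers */
        #endif

        unsigned long perf_event_open_cloexec_flag(void)
        {
                static unsigned long flag = -1UL;
                struct perf_event_attr attr = {
                        .type = PERF_TYPE_SOFTWARE,
                        .config = PERF_COUNT_SW_CPU_CLOCK,
                        .size = sizeof(attr),
                        .exclude_kernel = 1,
                };
                int fd;

                if (flag != -1UL)
                        return flag;    /* probe only once */

                /* Kernels without the flag reject it with EINVAL. */
                fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1,
                             PERF_FLAG_FD_CLOEXEC);
                if (fd >= 0) {
                        close(fd);
                        flag = PERF_FLAG_FD_CLOEXEC;
                } else {
                        flag = 0;       /* callers fall back to plain flags */
                }
                return flag;
        }
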
......@@ -28,6 +28,7 @@
#include <sys/time.h>
#include <sys/poll.h>
#include <limits.h>
#include <err.h>
#define DATASIZE 100
......@@ -50,12 +51,6 @@ struct receiver_context {
int wakefd;
};
static void barf(const char *msg)
{
fprintf(stderr, "%s (error: %s)\n", msg, strerror(errno));
exit(1);
}
static void fdpair(int fds[2])
{
if (use_pipes) {
......@@ -66,7 +61,7 @@ static void fdpair(int fds[2])
return;
}
barf(use_pipes ? "pipe()" : "socketpair()");
err(EXIT_FAILURE, use_pipes ? "pipe()" : "socketpair()");
}
/* Block until we're ready to go */
......@@ -77,11 +72,11 @@ static void ready(int ready_out, int wakefd)
/* Tell them we're ready. */
if (write(ready_out, &dummy, 1) != 1)
barf("CLIENT: ready write");
err(EXIT_FAILURE, "CLIENT: ready write");
/* Wait for "GO" signal */
if (poll(&pollfd, 1, -1) != 1)
barf("poll");
err(EXIT_FAILURE, "poll");
}
/* Sender sprays loops messages down each file descriptor */
......@@ -101,7 +96,7 @@ static void *sender(struct sender_context *ctx)
ret = write(ctx->out_fds[j], data + done,
sizeof(data)-done);
if (ret < 0)
barf("SENDER: write");
err(EXIT_FAILURE, "SENDER: write");
done += ret;
if (done < DATASIZE)
goto again;
......@@ -131,7 +126,7 @@ static void *receiver(struct receiver_context* ctx)
again:
ret = read(ctx->in_fds[0], data + done, DATASIZE - done);
if (ret < 0)
barf("SERVER: read");
err(EXIT_FAILURE, "SERVER: read");
done += ret;
if (done < DATASIZE)
goto again;
......@@ -144,14 +139,14 @@ static pthread_t create_worker(void *ctx, void *(*func)(void *))
{
pthread_attr_t attr;
pthread_t childid;
int err;
int ret;
if (!thread_mode) {
/* process mode */
/* Fork the receiver. */
switch (fork()) {
case -1:
barf("fork()");
err(EXIT_FAILURE, "fork()");
break;
case 0:
(*func) (ctx);
......@@ -165,19 +160,17 @@ static pthread_t create_worker(void *ctx, void *(*func)(void *))
}
if (pthread_attr_init(&attr) != 0)
barf("pthread_attr_init:");
err(EXIT_FAILURE, "pthread_attr_init:");
#ifndef __ia64__
if (pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN) != 0)
barf("pthread_attr_setstacksize");
err(EXIT_FAILURE, "pthread_attr_setstacksize");
#endif
err = pthread_create(&childid, &attr, func, ctx);
if (err != 0) {
fprintf(stderr, "pthread_create failed: %s (%d)\n",
strerror(err), err);
exit(-1);
}
ret = pthread_create(&childid, &attr, func, ctx);
if (ret != 0)
err(EXIT_FAILURE, "pthread_create failed");
return childid;
}
......@@ -207,14 +200,14 @@ static unsigned int group(pthread_t *pth,
+ num_fds * sizeof(int));
if (!snd_ctx)
barf("malloc()");
err(EXIT_FAILURE, "malloc()");
for (i = 0; i < num_fds; i++) {
int fds[2];
struct receiver_context *ctx = malloc(sizeof(*ctx));
if (!ctx)
barf("malloc()");
err(EXIT_FAILURE, "malloc()");
/* Create the pipe between client and server */
......@@ -281,7 +274,7 @@ int bench_sched_messaging(int argc, const char **argv,
pth_tab = malloc(num_fds * 2 * num_groups * sizeof(pthread_t));
if (!pth_tab)
barf("main:malloc()");
err(EXIT_FAILURE, "main:malloc()");
fdpair(readyfds);
fdpair(wakefds);
......@@ -294,13 +287,13 @@ int bench_sched_messaging(int argc, const char **argv,
/* Wait for everyone to be ready */
for (i = 0; i < total_children; i++)
if (read(readyfds[0], &dummy, 1) != 1)
barf("Reading for readyfds");
err(EXIT_FAILURE, "Reading for readyfds");
gettimeofday(&start, NULL);
/* Kick them off */
if (write(wakefds[1], &dummy, 1) != 1)
barf("Writing to start them");
err(EXIT_FAILURE, "Writing to start them");
/* Reap them all */
for (i = 0; i < total_children; i++)
......@@ -332,5 +325,7 @@ int bench_sched_messaging(int argc, const char **argv,
break;
}
free(pth_tab);
return 0;
}
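
The barf() removal relies on err() from <err.h>, which prints the program name, the supplied message, and strerror(errno), then exits, which is exactly what barf() reimplemented. For reference:

        #include <err.h>
        #include <stdio.h>
        #include <stdlib.h>

        int main(void)
        {
                FILE *f = fopen("/nonexistent", "r");

                if (!f) /* prints "<prog>: fopen: No such file or directory" */
                        err(EXIT_FAILURE, "fopen");
                fclose(f);
                return 0;
        }
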
......@@ -104,9 +104,11 @@ static const char *bench_format_str;
/* Output/formatting style, exported to benchmark modules: */
int bench_format = BENCH_FORMAT_DEFAULT;
unsigned int bench_repeat = 10; /* default number of times to repeat the run */
static const struct option bench_options[] = {
OPT_STRING('f', "format", &bench_format_str, "default", "Specify format style"),
OPT_UINTEGER('r', "repeat", &bench_repeat, "Specify amount of times to repeat the run"),
OPT_END()
};
......@@ -226,6 +228,11 @@ int cmd_bench(int argc, const char **argv, const char *prefix __maybe_unused)
goto end;
}
if (bench_repeat == 0) {
printf("Invalid repeat option: Must specify a positive value\n");
goto end;
}
if (argc < 1) {
print_usage();
goto end;
......
......@@ -125,7 +125,8 @@ static int build_id_cache__kcore_existing(const char *from_dir, char *to_dir,
return ret;
}
static int build_id_cache__add_kcore(const char *filename, const char *debugdir)
static int build_id_cache__add_kcore(const char *filename, const char *debugdir,
bool force)
{
char dir[32], sbuildid[BUILD_ID_SIZE * 2 + 1];
char from_dir[PATH_MAX], to_dir[PATH_MAX];
......@@ -144,7 +145,8 @@ static int build_id_cache__add_kcore(const char *filename, const char *debugdir)
scnprintf(to_dir, sizeof(to_dir), "%s/[kernel.kcore]/%s",
debugdir, sbuildid);
if (!build_id_cache__kcore_existing(from_dir, to_dir, sizeof(to_dir))) {
if (!force &&
!build_id_cache__kcore_existing(from_dir, to_dir, sizeof(to_dir))) {
pr_debug("same kcore found in %s\n", to_dir);
return 0;
}
......@@ -389,7 +391,7 @@ int cmd_buildid_cache(int argc, const char **argv,
}
if (kcore_filename &&
build_id_cache__add_kcore(kcore_filename, debugdir))
build_id_cache__add_kcore(kcore_filename, debugdir, force))
pr_warning("Couldn't add %s\n", kcore_filename);
return ret;
......
......@@ -15,6 +15,7 @@
#include "util/parse-options.h"
#include "util/session.h"
#include "util/data.h"
#include "util/debug.h"
static int __cmd_evlist(const char *file_name, struct perf_attr_details *details)
{
......
......@@ -11,6 +11,7 @@
#include "util/parse-options.h"
#include "util/run-command.h"
#include "util/help.h"
#include "util/debug.h"
static struct man_viewer_list {
struct man_viewer_list *next;
......
......@@ -389,6 +389,9 @@ static int __cmd_inject(struct perf_inject *inject)
ret = perf_session__process_events(session, &inject->tool);
if (!file_out->is_pipe) {
if (inject->build_ids)
perf_header__set_feat(&session->header,
HEADER_BUILD_ID);
session->header.data_size = inject->bytes_written;
perf_session__write_header(session, session->evlist, file_out->fd, true);
}
......@@ -436,6 +439,8 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
"where and how long tasks slept"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show build ids, etc)"),
OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name, "file",
"kallsyms pathname"),
OPT_END()
};
const char * const inject_usage[] = {
......

......@@ -238,6 +238,7 @@ static struct perf_event_header finished_round_event = {
static int record__mmap_read_all(struct record *rec)
{
u64 bytes_written = rec->bytes_written;
int i;
int rc = 0;
......@@ -250,7 +251,11 @@ static int record__mmap_read_all(struct record *rec)
}
}
if (perf_header__has_feat(&rec->session->header, HEADER_TRACING_DATA))
/*
* Mark the round finished in case we wrote
* at least one event.
*/
if (bytes_written != rec->bytes_written)
rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
out:
......
......@@ -10,6 +10,7 @@
#include "util/header.h"
#include "util/session.h"
#include "util/tool.h"
#include "util/cloexec.h"
#include "util/parse-options.h"
#include "util/trace-event.h"
......@@ -434,7 +435,8 @@ static int self_open_counters(void)
attr.type = PERF_TYPE_SOFTWARE;
attr.config = PERF_COUNT_SW_TASK_CLOCK;
fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
fd = sys_perf_event_open(&attr, 0, -1, -1,
perf_event_open_cloexec_flag());
if (fd < 0)
pr_err("Error: sys_perf_event_open() syscall returned "
......@@ -935,8 +937,8 @@ static int latency_switch_event(struct perf_sched *sched,
return -1;
}
sched_out = machine__findnew_thread(machine, 0, prev_pid);
sched_in = machine__findnew_thread(machine, 0, next_pid);
sched_out = machine__findnew_thread(machine, -1, prev_pid);
sched_in = machine__findnew_thread(machine, -1, next_pid);
out_events = thread_atoms_search(&sched->atom_root, sched_out, &sched->cmp_pid);
if (!out_events) {
......@@ -979,7 +981,7 @@ static int latency_runtime_event(struct perf_sched *sched,
{
const u32 pid = perf_evsel__intval(evsel, sample, "pid");
const u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
struct thread *thread = machine__findnew_thread(machine, 0, pid);
struct thread *thread = machine__findnew_thread(machine, -1, pid);
struct work_atoms *atoms = thread_atoms_search(&sched->atom_root, thread, &sched->cmp_pid);
u64 timestamp = sample->time;
int cpu = sample->cpu;
......@@ -1012,7 +1014,7 @@ static int latency_wakeup_event(struct perf_sched *sched,
struct thread *wakee;
u64 timestamp = sample->time;
wakee = machine__findnew_thread(machine, 0, pid);
wakee = machine__findnew_thread(machine, -1, pid);
atoms = thread_atoms_search(&sched->atom_root, wakee, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, wakee))
......@@ -1072,7 +1074,7 @@ static int latency_migrate_task_event(struct perf_sched *sched,
if (sched->profile_cpu == -1)
return 0;
migrant = machine__findnew_thread(machine, 0, pid);
migrant = machine__findnew_thread(machine, -1, pid);
atoms = thread_atoms_search(&sched->atom_root, migrant, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, migrant))
......@@ -1290,7 +1292,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
return -1;
}
sched_in = machine__findnew_thread(machine, 0, next_pid);
sched_in = machine__findnew_thread(machine, -1, next_pid);
sched->curr_thread[this_cpu] = sched_in;
......
......@@ -358,27 +358,6 @@ static void print_sample_start(struct perf_sample *sample,
}
}
static bool is_bts_event(struct perf_event_attr *attr)
{
return ((attr->type == PERF_TYPE_HARDWARE) &&
(attr->config & PERF_COUNT_HW_BRANCH_INSTRUCTIONS) &&
(attr->sample_period == 1));
}
static bool sample_addr_correlates_sym(struct perf_event_attr *attr)
{
if ((attr->type == PERF_TYPE_SOFTWARE) &&
((attr->config == PERF_COUNT_SW_PAGE_FAULTS) ||
(attr->config == PERF_COUNT_SW_PAGE_FAULTS_MIN) ||
(attr->config == PERF_COUNT_SW_PAGE_FAULTS_MAJ)))
return true;
if (is_bts_event(attr))
return true;
return false;
}
static void print_sample_addr(union perf_event *event,
struct perf_sample *sample,
struct machine *machine,
......@@ -386,24 +365,13 @@ static void print_sample_addr(union perf_event *event,
struct perf_event_attr *attr)
{
struct addr_location al;
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
printf("%16" PRIx64, sample->addr);
if (!sample_addr_correlates_sym(attr))
return;
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
sample->addr, &al);
if (!al.map)
thread__find_addr_map(thread, machine, cpumode, MAP__VARIABLE,
sample->addr, &al);
al.cpu = sample->cpu;
al.sym = NULL;
if (al.map)
al.sym = map__find_symbol(al.map, al.addr, NULL);
perf_event__preprocess_sample_addr(event, sample, machine, thread, &al);
if (PRINT_FIELD(SYM)) {
printf(" ");
......@@ -427,25 +395,35 @@ static void print_sample_bts(union perf_event *event,
struct addr_location *al)
{
struct perf_event_attr *attr = &evsel->attr;
bool print_srcline_last = false;
/* print branch_from information */
if (PRINT_FIELD(IP)) {
if (!symbol_conf.use_callchain)
printf(" ");
else
unsigned int print_opts = output[attr->type].print_ip_opts;
if (symbol_conf.use_callchain && sample->callchain) {
printf("\n");
perf_evsel__print_ip(evsel, sample, al,
output[attr->type].print_ip_opts,
} else {
printf(" ");
if (print_opts & PRINT_IP_OPT_SRCLINE) {
print_srcline_last = true;
print_opts &= ~PRINT_IP_OPT_SRCLINE;
}
}
perf_evsel__print_ip(evsel, sample, al, print_opts,
PERF_MAX_STACK_DEPTH);
}
printf(" => ");
/* print branch_to information */
if (PRINT_FIELD(ADDR) ||
((evsel->attr.sample_type & PERF_SAMPLE_ADDR) &&
!output[attr->type].user_set))
!output[attr->type].user_set)) {
printf(" => ");
print_sample_addr(event, sample, al->machine, thread, attr);
}
if (print_srcline_last)
map__fprintf_srcline(al->map, al->addr, "\n ", stdout);
printf("\n");
}
......
......@@ -184,7 +184,7 @@ static void perf_evsel__reset_stat_priv(struct perf_evsel *evsel)
static int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel)
{
evsel->priv = zalloc(sizeof(struct perf_stat));
if (evsel == NULL)
if (evsel->priv == NULL)
return -ENOMEM;
perf_evsel__reset_stat_priv(evsel);
return 0;
......
......@@ -48,6 +48,10 @@ ifneq ($(ARCH),$(filter $(ARCH),x86 arm))
NO_LIBDW_DWARF_UNWIND := 1
endif
ifeq ($(ARCH),powerpc)
CFLAGS += -DHAVE_SKIP_CALLCHAIN_IDX
endif
ifeq ($(LIBUNWIND_LIBS),)
NO_LIBUNWIND := 1
else
......@@ -160,6 +164,7 @@ CORE_FEATURE_TESTS = \
backtrace \
dwarf \
fortify-source \
sync-compare-and-swap \
glibc \
gtk2 \
gtk2-infobar \
......@@ -195,6 +200,7 @@ LIB_FEATURE_TESTS = \
VF_FEATURE_TESTS = \
backtrace \
fortify-source \
sync-compare-and-swap \
gtk2-infobar \
libelf-getphdrnum \
libelf-mmap \
......@@ -268,6 +274,10 @@ CFLAGS += -I$(LIB_INCLUDE)
CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
ifeq ($(feature-sync-compare-and-swap), 1)
CFLAGS += -DHAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
endif
ifndef NO_BIONIC
$(call feature_check,bionic)
ifeq ($(feature-bionic), 1)
......@@ -590,6 +600,10 @@ ifndef NO_LIBNUMA
endif
endif
ifdef HAVE_KVM_STAT_SUPPORT
CFLAGS += -DHAVE_KVM_STAT_SUPPORT
endif
# Among the variables below, these:
# perfexecdir
# template_dir
......
......@@ -5,6 +5,7 @@ FILES= \
test-bionic.bin \
test-dwarf.bin \
test-fortify-source.bin \
test-sync-compare-and-swap.bin \
test-glibc.bin \
test-gtk2.bin \
test-gtk2-infobar.bin \
......@@ -141,6 +142,9 @@ test-timerfd.bin:
test-libdw-dwarf-unwind.bin:
$(BUILD)
test-sync-compare-and-swap.bin:
$(BUILD) -Werror
-include *.d
###############################
......
......@@ -89,6 +89,10 @@
# include "test-libdw-dwarf-unwind.c"
#undef main
#define main main_test_sync_compare_and_swap
# include "test-sync-compare-and-swap.c"
#undef main
int main(int argc, char *argv[])
{
main_test_libpython();
......@@ -111,6 +115,7 @@ int main(int argc, char *argv[])
main_test_timerfd();
main_test_stackprotector_all();
main_test_libdw_dwarf_unwind();
main_test_sync_compare_and_swap(argc, argv);
return 0;
}
#include <stdint.h>
volatile uint64_t x;
int main(int argc, char *argv[])
{
uint64_t old, new = argc;
argv = argv;
do {
old = __sync_val_compare_and_swap(&x, 0, 0);
} while (!__sync_bool_compare_and_swap(&x, old, new));
return old == new;
}
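If this probe compiles and links, the build defines HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT (see the Makefile hunk above), which can then gate lock-free code paths. A minimal sketch of such a consumer; stack_push() and struct node are hypothetical, not code from this series:

struct node {
	struct node *next;
};

#ifdef HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
/* Push 'n' onto a lock-free stack; retry if another thread races us. */
static void stack_push(struct node **head, struct node *n)
{
	struct node *old;

	do {
		old = *head;		/* snapshot the current head */
		n->next = old;
	} while (!__sync_bool_compare_and_swap(head, old, n));
}
#endif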
......@@ -54,6 +54,7 @@
#define mb() asm volatile("bcr 15,0" ::: "memory")
#define wmb() asm volatile("bcr 15,0" ::: "memory")
#define rmb() asm volatile("bcr 15,0" ::: "memory")
#define CPUINFO_PROC "vendor_id"
#endif
#ifdef __sh__
......
......@@ -13,11 +13,12 @@
#include "util/quote.h"
#include "util/run-command.h"
#include "util/parse-events.h"
#include "util/debug.h"
#include <api/fs/debugfs.h>
#include <pthread.h>
const char perf_usage_string[] =
"perf [--version] [--help] COMMAND [ARGS]";
"perf [--version] [--help] [OPTIONS] COMMAND [ARGS]";
const char perf_more_info_string[] =
"See 'perf help COMMAND' for more information on a specific command.";
......@@ -212,6 +213,16 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
printf("%s ", p->cmd);
}
exit(0);
} else if (!strcmp(cmd, "--debug")) {
if (*argc < 2) {
fprintf(stderr, "No variable specified for --debug.\n");
usage(perf_usage_string);
}
if (perf_debug_option((*argv)[1]))
usage(perf_usage_string);
(*argv)++;
(*argc)--;
} else {
fprintf(stderr, "Unknown option: %s\n", cmd);
usage(perf_usage_string);
......
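The handler consumes one extra argument, so 'perf --debug verbose=2 report' (or a bare 'perf --debug verbose report') enables a debug variable before the subcommand runs. Below is a toy name[=value] parser in the same spirit; toy_debug_option() is illustrative only, not the actual perf_debug_option():

#include <stdlib.h>
#include <string.h>

static int verbose;	/* the only variable this toy parser knows about */

/* Accepts "verbose" (meaning 1) or "verbose=<n>"; returns -1 if unknown. */
static int toy_debug_option(const char *str)
{
	const char *eq = strchr(str, '=');
	size_t len = eq ? (size_t)(eq - str) : strlen(str);

	if (len == strlen("verbose") && !strncmp(str, "verbose", len)) {
		verbose = eq ? atoi(eq + 1) : 1;
		return 0;
	}
	return -1;
}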
#!/bin/bash
perf record -e raw_syscalls:sys_exit $@
(perf record -e raw_syscalls:sys_exit $@ || \
perf record -e syscalls:sys_exit $@) 2> /dev/null
......@@ -26,6 +26,11 @@ sub raw_syscalls::sys_exit
}
}
sub syscalls::sys_exit
{
raw_syscalls::sys_exit(@_)
}
sub trace_end
{
printf("\nfailed syscalls by comm:\n\n");
......
......@@ -107,12 +107,13 @@ def taskState(state):
class EventHeaders:
def __init__(self, common_cpu, common_secs, common_nsecs,
common_pid, common_comm):
common_pid, common_comm, common_callchain):
self.cpu = common_cpu
self.secs = common_secs
self.nsecs = common_nsecs
self.pid = common_pid
self.comm = common_comm
self.callchain = common_callchain
def ts(self):
return (self.secs * (10 ** 9)) + self.nsecs
......
#!/bin/bash
perf record -e raw_syscalls:sys_exit $@
(perf record -e raw_syscalls:sys_exit $@ || \
perf record -e syscalls:sys_exit $@) 2> /dev/null
#!/bin/bash
perf record -e raw_syscalls:sys_enter $@
(perf record -e raw_syscalls:sys_enter $@ || \
perf record -e syscalls:sys_enter $@) 2> /dev/null
#!/bin/bash
perf record -e raw_syscalls:sys_enter $@
(perf record -e raw_syscalls:sys_enter $@ || \
perf record -e syscalls:sys_enter $@) 2> /dev/null
#!/bin/bash
perf record -e raw_syscalls:sys_enter $@
(perf record -e raw_syscalls:sys_enter $@ || \
perf record -e syscalls:sys_enter $@) 2> /dev/null
......@@ -27,7 +27,7 @@ def trace_end():
def irq__softirq_entry(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
vec):
common_callchain, vec):
print_header(event_name, common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
......@@ -38,7 +38,7 @@ def irq__softirq_entry(event_name, context, common_cpu,
def kmem__kmalloc(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
call_site, ptr, bytes_req, bytes_alloc,
common_callchain, call_site, ptr, bytes_req, bytes_alloc,
gfp_flags):
print_header(event_name, common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
......
......@@ -39,7 +39,7 @@ def trace_end():
def raw_syscalls__sys_exit(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, ret):
common_callchain, id, ret):
if (for_comm and common_comm != for_comm) or \
(for_pid and common_pid != for_pid ):
return
......@@ -50,6 +50,11 @@ def raw_syscalls__sys_exit(event_name, context, common_cpu,
except TypeError:
syscalls[common_comm][common_pid][id][ret] = 1
def syscalls__sys_exit(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, ret):
raw_syscalls__sys_exit(**locals())
def print_error_totals():
if for_comm is not None:
print "\nsyscall errors for %s:\n\n" % (for_comm),
......
......@@ -21,7 +21,7 @@ thread_blocktime = {}
lock_waits = {} # long-lived stats on (tid,lock) blockage elapsed time
process_names = {} # long-lived pid-to-execname mapping
def syscalls__sys_enter_futex(event, ctxt, cpu, s, ns, tid, comm,
def syscalls__sys_enter_futex(event, ctxt, cpu, s, ns, tid, comm, callchain,
nr, uaddr, op, val, utime, uaddr2, val3):
cmd = op & FUTEX_CMD_MASK
if cmd != FUTEX_WAIT:
......@@ -31,7 +31,7 @@ def syscalls__sys_enter_futex(event, ctxt, cpu, s, ns, tid, comm,
thread_thislock[tid] = uaddr
thread_blocktime[tid] = nsecs(s, ns)
def syscalls__sys_exit_futex(event, ctxt, cpu, s, ns, tid, comm,
def syscalls__sys_exit_futex(event, ctxt, cpu, s, ns, tid, comm, callchain,
nr, ret):
if thread_blocktime.has_key(tid):
elapsed = nsecs(s, ns) - thread_blocktime[tid]
......
......@@ -66,7 +66,7 @@ def trace_end():
print_drop_table()
# called from perf, when it finds a corresponding event
def skb__kfree_skb(name, context, cpu, sec, nsec, pid, comm,
def skb__kfree_skb(name, context, cpu, sec, nsec, pid, comm, callchain,
skbaddr, location, protocol):
slocation = str(location)
try:
......
......@@ -224,75 +224,75 @@ def trace_end():
(len(rx_skb_list), of_count_rx_skb_list)
# called from perf, when it finds a corresponding event
def irq__softirq_entry(name, context, cpu, sec, nsec, pid, comm, vec):
def irq__softirq_entry(name, context, cpu, sec, nsec, pid, comm, callchain, vec):
if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
return
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm, vec)
all_event_list.append(event_info)
def irq__softirq_exit(name, context, cpu, sec, nsec, pid, comm, vec):
def irq__softirq_exit(name, context, cpu, sec, nsec, pid, comm, callchain, vec):
if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
return
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm, vec)
all_event_list.append(event_info)
def irq__softirq_raise(name, context, cpu, sec, nsec, pid, comm, vec):
def irq__softirq_raise(name, context, cpu, sec, nsec, pid, comm, callchain, vec):
if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
return
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm, vec)
all_event_list.append(event_info)
def irq__irq_handler_entry(name, context, cpu, sec, nsec, pid, comm,
irq, irq_name):
callchain, irq, irq_name):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
irq, irq_name)
all_event_list.append(event_info)
def irq__irq_handler_exit(name, context, cpu, sec, nsec, pid, comm, irq, ret):
def irq__irq_handler_exit(name, context, cpu, sec, nsec, pid, comm, callchain, irq, ret):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm, irq, ret)
all_event_list.append(event_info)
def napi__napi_poll(name, context, cpu, sec, nsec, pid, comm, napi, dev_name):
def napi__napi_poll(name, context, cpu, sec, nsec, pid, comm, callchain, napi, dev_name):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
napi, dev_name)
all_event_list.append(event_info)
def net__netif_receive_skb(name, context, cpu, sec, nsec, pid, comm, skbaddr,
def net__netif_receive_skb(name, context, cpu, sec, nsec, pid, comm, callchain, skbaddr,
skblen, dev_name):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
skbaddr, skblen, dev_name)
all_event_list.append(event_info)
def net__netif_rx(name, context, cpu, sec, nsec, pid, comm, skbaddr,
def net__netif_rx(name, context, cpu, sec, nsec, pid, comm, callchain, skbaddr,
skblen, dev_name):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
skbaddr, skblen, dev_name)
all_event_list.append(event_info)
def net__net_dev_queue(name, context, cpu, sec, nsec, pid, comm,
def net__net_dev_queue(name, context, cpu, sec, nsec, pid, comm, callchain,
skbaddr, skblen, dev_name):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
skbaddr, skblen, dev_name)
all_event_list.append(event_info)
def net__net_dev_xmit(name, context, cpu, sec, nsec, pid, comm,
def net__net_dev_xmit(name, context, cpu, sec, nsec, pid, comm, callchain,
skbaddr, skblen, rc, dev_name):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
skbaddr, skblen, rc ,dev_name)
all_event_list.append(event_info)
def skb__kfree_skb(name, context, cpu, sec, nsec, pid, comm,
def skb__kfree_skb(name, context, cpu, sec, nsec, pid, comm, callchain,
skbaddr, protocol, location):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
skbaddr, protocol, location)
all_event_list.append(event_info)
def skb__consume_skb(name, context, cpu, sec, nsec, pid, comm, skbaddr):
def skb__consume_skb(name, context, cpu, sec, nsec, pid, comm, callchain, skbaddr):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
skbaddr)
all_event_list.append(event_info)
def skb__skb_copy_datagram_iovec(name, context, cpu, sec, nsec, pid, comm,
def skb__skb_copy_datagram_iovec(name, context, cpu, sec, nsec, pid, comm, callchain,
skbaddr, skblen):
event_info = (name, context, cpu, nsecs(sec, nsec), pid, comm,
skbaddr, skblen)
......
......@@ -369,93 +369,92 @@ def trace_end():
def sched__sched_stat_runtime(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, runtime, vruntime):
common_callchain, comm, pid, runtime, vruntime):
pass
def sched__sched_stat_iowait(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, delay):
common_callchain, comm, pid, delay):
pass
def sched__sched_stat_sleep(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, delay):
common_callchain, comm, pid, delay):
pass
def sched__sched_stat_wait(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, delay):
common_callchain, comm, pid, delay):
pass
def sched__sched_process_fork(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
parent_comm, parent_pid, child_comm, child_pid):
common_callchain, parent_comm, parent_pid, child_comm, child_pid):
pass
def sched__sched_process_wait(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, prio):
common_callchain, comm, pid, prio):
pass
def sched__sched_process_exit(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, prio):
common_callchain, comm, pid, prio):
pass
def sched__sched_process_free(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, prio):
common_callchain, comm, pid, prio):
pass
def sched__sched_migrate_task(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, prio, orig_cpu,
common_callchain, comm, pid, prio, orig_cpu,
dest_cpu):
headers = EventHeaders(common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
common_pid, common_comm, common_callchain)
parser.migrate(headers, pid, prio, orig_cpu, dest_cpu)
def sched__sched_switch(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
common_secs, common_nsecs, common_pid, common_comm, common_callchain,
prev_comm, prev_pid, prev_prio, prev_state,
next_comm, next_pid, next_prio):
headers = EventHeaders(common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
common_pid, common_comm, common_callchain)
parser.sched_switch(headers, prev_comm, prev_pid, prev_prio, prev_state,
next_comm, next_pid, next_prio)
def sched__sched_wakeup_new(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, prio, success,
common_callchain, comm, pid, prio, success,
target_cpu):
headers = EventHeaders(common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
common_pid, common_comm, common_callchain)
parser.wake_up(headers, comm, pid, success, target_cpu, 1)
def sched__sched_wakeup(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, prio, success,
common_callchain, comm, pid, prio, success,
target_cpu):
headers = EventHeaders(common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
common_pid, common_comm, common_callchain)
parser.wake_up(headers, comm, pid, success, target_cpu, 0)
def sched__sched_wait_task(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid, prio):
common_callchain, comm, pid, prio):
pass
def sched__sched_kthread_stop_ret(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
ret):
common_callchain, ret):
pass
def sched__sched_kthread_stop(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
comm, pid):
common_callchain, comm, pid):
pass
def trace_unhandled(event_name, context, common_cpu, common_secs, common_nsecs,
common_pid, common_comm):
def trace_unhandled(event_name, context, event_fields_dict):
pass
......@@ -44,7 +44,7 @@ def trace_begin():
def raw_syscalls__sys_enter(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, args):
common_callchain, id, args):
if for_comm is not None:
if common_comm != for_comm:
return
......@@ -53,6 +53,11 @@ def raw_syscalls__sys_enter(event_name, context, common_cpu,
except TypeError:
syscalls[id] = 1
def syscalls__sys_enter(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, args):
raw_syscalls__sys_enter(**locals())
def print_syscall_totals(interval):
while 1:
clear_term()
......
......@@ -38,7 +38,7 @@ def trace_end():
def raw_syscalls__sys_enter(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, args):
common_callchain, id, args):
if (for_comm and common_comm != for_comm) or \
(for_pid and common_pid != for_pid ):
......@@ -48,6 +48,11 @@ def raw_syscalls__sys_enter(event_name, context, common_cpu,
except TypeError:
syscalls[common_comm][common_pid][id] = 1
def syscalls__sys_enter(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, args):
raw_syscalls__sys_enter(**locals())
def print_syscall_totals():
if for_comm is not None:
print "\nsyscall events for %s:\n\n" % (for_comm),
......
......@@ -35,7 +35,7 @@ def trace_end():
def raw_syscalls__sys_enter(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, args):
common_callchain, id, args):
if for_comm is not None:
if common_comm != for_comm:
return
......@@ -44,6 +44,11 @@ def raw_syscalls__sys_enter(event_name, context, common_cpu,
except TypeError:
syscalls[id] = 1
def syscalls__sys_enter(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
id, args):
raw_syscalls__sys_enter(**locals())
def print_syscall_totals():
if for_comm is not None:
print "\nsyscall events for %s:\n\n" % (for_comm),
......
[event]
fd=1
group_fd=-1
flags=0
# 0 or PERF_FLAG_FD_CLOEXEC flag
flags=0|8
cpu=*
type=0|1
size=96
......
[event]
fd=1
group_fd=-1
flags=0
# 0 or PERF_FLAG_FD_CLOEXEC flag
flags=0|8
cpu=*
type=0
size=96
......
......@@ -25,6 +25,7 @@
#include "tests.h"
#include "debug.h"
#include "perf.h"
#include "cloexec.h"
static int fd1;
static int fd2;
......@@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
pe.exclude_kernel = 1;
pe.exclude_hv = 1;
fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
fd = sys_perf_event_open(&pe, 0, -1, -1,
perf_event_open_cloexec_flag());
if (fd < 0) {
pr_debug("failed opening event %llx\n", pe.config);
return TEST_FAIL;
......
......@@ -24,6 +24,7 @@
#include "tests.h"
#include "debug.h"
#include "perf.h"
#include "cloexec.h"
static int overflows;
......@@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
pe.exclude_kernel = 1;
pe.exclude_hv = 1;
fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
fd = sys_perf_event_open(&pe, 0, -1, -1,
perf_event_open_cloexec_flag());
if (fd < 0) {
pr_debug("failed opening event %llx\n", pe.config);
return TEST_FAIL;
......
......@@ -10,6 +10,7 @@
#include "machine.h"
#include "symbol.h"
#include "tests.h"
#include "debug.h"
static char *test_file(int size)
{
......
......@@ -2,6 +2,7 @@
#include "evsel.h"
#include "parse-events.h"
#include "tests.h"
#include "debug.h"
static int perf_evsel__roundtrip_cache_name_test(void)
{
......
#include <traceevent/event-parse.h>
#include "evsel.h"
#include "tests.h"
#include "debug.h"
static int perf_evsel__test_field(struct perf_evsel *evsel, const char *name,
int size, bool should_be_signed)
......
......@@ -3,6 +3,7 @@
#include "evsel.h"
#include "thread_map.h"
#include "tests.h"
#include "debug.h"
int test__syscall_open_tp_fields(void)
{
......
......@@ -5,6 +5,7 @@
#include <api/fs/fs.h>
#include <api/fs/debugfs.h>
#include "tests.h"
#include "debug.h"
#include <linux/hw_breakpoint.h>
#define PERF_TP_SAMPLE_TYPE (PERF_SAMPLE_RAW | PERF_SAMPLE_TIME | \
......
......@@ -7,6 +7,7 @@
#include "evlist.h"
#include "header.h"
#include "util.h"
#include "debug.h"
static int process_event(struct perf_evlist **pevlist, union perf_event *event)
{
......
......@@ -8,10 +8,9 @@
#include "evsel.h"
#include "thread_map.h"
#include "cpumap.h"
#include "tsc.h"
#include "tests.h"
#include "../arch/x86/util/tsc.h"
#define CHECK__(x) { \
while ((x) < 0) { \
pr_debug(#x " failed!\n"); \
......@@ -26,15 +25,6 @@
} \
}
static u64 rdtsc(void)
{
unsigned int low, high;
asm volatile("rdtsc" : "=a" (low), "=d" (high));
return low | ((u64)high) << 32;
}
/**
* test__perf_time_to_tsc - test converting perf time to TSC.
*
......
......@@ -6,6 +6,7 @@
#include "perf.h"
#include "debug.h"
#include "tests.h"
#include "cloexec.h"
#if defined(__x86_64__) || defined(__i386__)
......@@ -104,7 +105,8 @@ static int __test__rdpmc(void)
sa.sa_sigaction = segfault_handler;
sigaction(SIGSEGV, &sa, NULL);
fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
fd = sys_perf_event_open(&attr, 0, -1, -1,
perf_event_open_cloexec_flag());
if (fd < 0) {
pr_err("Error: sys_perf_event_open() syscall returned "
"with %d (%s)\n", fd, strerror(errno));
......
......@@ -4,6 +4,7 @@
#include "util.h"
#include "event.h"
#include "evsel.h"
#include "debug.h"
#include "tests.h"
......
......@@ -2,6 +2,7 @@
#include "machine.h"
#include "thread.h"
#include "map.h"
#include "debug.h"
int test__thread_mg_share(void)
{
......
......@@ -150,7 +150,7 @@ unsigned int ui_browser__rb_tree_refresh(struct ui_browser *browser)
while (nd != NULL) {
ui_browser__gotorc(browser, row, 0);
browser->write(browser, nd, row);
if (++row == browser->height)
if (++row == browser->rows)
break;
nd = rb_next(nd);
}
......@@ -166,7 +166,7 @@ bool ui_browser__is_current_entry(struct ui_browser *browser, unsigned row)
void ui_browser__refresh_dimensions(struct ui_browser *browser)
{
browser->width = SLtt_Screen_Cols - 1;
browser->height = SLtt_Screen_Rows - 2;
browser->height = browser->rows = SLtt_Screen_Rows - 2;
browser->y = 1;
browser->x = 0;
}
......@@ -250,7 +250,10 @@ int ui_browser__show(struct ui_browser *browser, const char *title,
int err;
va_list ap;
ui_browser__refresh_dimensions(browser);
if (browser->refresh_dimensions == NULL)
browser->refresh_dimensions = ui_browser__refresh_dimensions;
browser->refresh_dimensions(browser);
pthread_mutex_lock(&ui__lock);
__ui_browser__show_title(browser, title);
......@@ -279,7 +282,7 @@ static void ui_browser__scrollbar_set(struct ui_browser *browser)
{
int height = browser->height, h = 0, pct = 0,
col = browser->width,
row = browser->y - 1;
row = 0;
if (browser->nr_entries > 1) {
pct = ((browser->index * (browser->height - 1)) /
......@@ -367,7 +370,7 @@ int ui_browser__run(struct ui_browser *browser, int delay_secs)
if (key == K_RESIZE) {
ui__refresh_dimensions(false);
ui_browser__refresh_dimensions(browser);
browser->refresh_dimensions(browser);
__ui_browser__show_title(browser, browser->title);
ui_helpline__puts(browser->helpline);
continue;
......@@ -389,7 +392,7 @@ int ui_browser__run(struct ui_browser *browser, int delay_secs)
if (browser->index == browser->nr_entries - 1)
break;
++browser->index;
if (browser->index == browser->top_idx + browser->height) {
if (browser->index == browser->top_idx + browser->rows) {
++browser->top_idx;
browser->seek(browser, +1, SEEK_CUR);
}
......@@ -405,10 +408,10 @@ int ui_browser__run(struct ui_browser *browser, int delay_secs)
break;
case K_PGDN:
case ' ':
if (browser->top_idx + browser->height > browser->nr_entries - 1)
if (browser->top_idx + browser->rows > browser->nr_entries - 1)
break;
offset = browser->height;
offset = browser->rows;
if (browser->index + offset > browser->nr_entries - 1)
offset = browser->nr_entries - 1 - browser->index;
browser->index += offset;
......@@ -419,10 +422,10 @@ int ui_browser__run(struct ui_browser *browser, int delay_secs)
if (browser->top_idx == 0)
break;
if (browser->top_idx < browser->height)
if (browser->top_idx < browser->rows)
offset = browser->top_idx;
else
offset = browser->height;
offset = browser->rows;
browser->index -= offset;
browser->top_idx -= offset;
......@@ -432,7 +435,7 @@ int ui_browser__run(struct ui_browser *browser, int delay_secs)
ui_browser__reset_index(browser);
break;
case K_END:
offset = browser->height - 1;
offset = browser->rows - 1;
if (offset >= browser->nr_entries)
offset = browser->nr_entries - 1;
......@@ -462,7 +465,7 @@ unsigned int ui_browser__list_head_refresh(struct ui_browser *browser)
if (!browser->filter || !browser->filter(browser, pos)) {
ui_browser__gotorc(browser, row, 0);
browser->write(browser, pos, row);
if (++row == browser->height)
if (++row == browser->rows)
break;
}
}
......@@ -587,7 +590,7 @@ unsigned int ui_browser__argv_refresh(struct ui_browser *browser)
if (!browser->filter || !browser->filter(browser, *pos)) {
ui_browser__gotorc(browser, row, 0);
browser->write(browser, pos, row);
if (++row == browser->height)
if (++row == browser->rows)
break;
}
......@@ -623,7 +626,7 @@ static void __ui_browser__line_arrow_up(struct ui_browser *browser,
SLsmg_set_char_set(1);
if (start < browser->top_idx + browser->height) {
if (start < browser->top_idx + browser->rows) {
row = start - browser->top_idx;
ui_browser__gotorc(browser, row, column);
SLsmg_write_char(SLSMG_LLCORN_CHAR);
......@@ -633,7 +636,7 @@ static void __ui_browser__line_arrow_up(struct ui_browser *browser,
if (row-- == 0)
goto out;
} else
row = browser->height - 1;
row = browser->rows - 1;
if (end > browser->top_idx)
end_row = end - browser->top_idx;
......@@ -675,8 +678,8 @@ static void __ui_browser__line_arrow_down(struct ui_browser *browser,
} else
row = 0;
if (end >= browser->top_idx + browser->height)
end_row = browser->height - 1;
if (end >= browser->top_idx + browser->rows)
end_row = browser->rows - 1;
else
end_row = end - browser->top_idx;
......@@ -684,7 +687,7 @@ static void __ui_browser__line_arrow_down(struct ui_browser *browser,
SLsmg_draw_vline(end_row - row + 1);
ui_browser__gotorc(browser, end_row, column);
if (end < browser->top_idx + browser->height) {
if (end < browser->top_idx + browser->rows) {
SLsmg_write_char(SLSMG_LLCORN_CHAR);
ui_browser__gotorc(browser, end_row, column + 1);
SLsmg_write_char(SLSMG_HLINE_CHAR);
......
......@@ -14,11 +14,12 @@
struct ui_browser {
u64 index, top_idx;
void *top, *entries;
u16 y, x, width, height;
u16 y, x, width, height, rows;
int current_color;
void *priv;
const char *title;
char *helpline;
void (*refresh_dimensions)(struct ui_browser *browser);
unsigned int (*refresh)(struct ui_browser *browser);
void (*write)(struct ui_browser *browser, void *entry, int row);
void (*seek)(struct ui_browser *browser, off_t offset, int whence);
......
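Splitting 'rows' out of 'height' and turning refresh_dimensions into an overridable hook is what lets a browser reserve screen lines for a header. A hedged sketch of a derived browser doing so; the example_browser__* names are hypothetical:

static void example_browser__refresh_dimensions(struct ui_browser *browser)
{
	ui_browser__refresh_dimensions(browser);	/* default sizing */
	browser->rows = browser->height - 1;		/* row 0: column header */
}

static void example_browser__init(struct ui_browser *browser)
{
	/* ui_browser__show() will now call our hook instead of the default */
	browser->refresh_dimensions = example_browser__refresh_dimensions;
}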
......@@ -479,7 +479,7 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
if (h->ms.map == NULL && verbose > 1) {
__map_groups__fprintf_maps(h->thread->mg,
MAP__FUNCTION, verbose, fp);
MAP__FUNCTION, fp);
fprintf(fp, "%.10s end\n", graph_dotted_line);
}
}
......
......@@ -626,7 +626,7 @@ int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent
int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *sample)
{
if (!symbol_conf.use_callchain)
if (!symbol_conf.use_callchain || sample->callchain == NULL)
return 0;
return callchain_append(he->callchain, &callchain_cursor, sample->period);
}
......
......@@ -176,4 +176,17 @@ static inline void callchain_cursor_snapshot(struct callchain_cursor *dest,
dest->first = src->curr;
dest->nr -= src->pos;
}
#ifdef HAVE_SKIP_CALLCHAIN_IDX
extern int arch_skip_callchain_idx(struct machine *machine,
struct thread *thread, struct ip_callchain *chain);
#else
static inline int arch_skip_callchain_idx(struct machine *machine __maybe_unused,
struct thread *thread __maybe_unused,
struct ip_callchain *chain __maybe_unused)
{
return -1;
}
#endif
#endif /* __PERF_CALLCHAIN_H */
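Architectures that define HAVE_SKIP_CALLCHAIN_IDX (powerpc, per the Makefile hunk above) can tell generic code which callchain entry to drop; everyone else gets the -1 stub. A sketch of a hypothetical consumer; resolve_chain() is illustrative, not the actual resolution code:

static void resolve_chain(struct machine *machine, struct thread *thread,
			  struct ip_callchain *chain)
{
	int skip_idx = arch_skip_callchain_idx(machine, thread, chain);
	u64 i;

	for (i = 0; i < chain->nr; i++) {
		if (skip_idx >= 0 && i == (u64)skip_idx)
			continue;	/* entry flagged invalid by DWARF info */
		/* ... resolve chain->ips[i] as usual ... */
	}
}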
#include "util.h"
#include "../perf.h"
#include "cloexec.h"
#include "asm/bug.h"
static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
static int perf_flag_probe(void)
{
/* use 'safest' configuration as used in perf_evsel__fallback() */
struct perf_event_attr attr = {
.type = PERF_TYPE_SOFTWARE,
.config = PERF_COUNT_SW_CPU_CLOCK,
};
int fd;
int err;
/* check cloexec flag */
fd = sys_perf_event_open(&attr, 0, -1, -1,
PERF_FLAG_FD_CLOEXEC);
err = errno;
if (fd >= 0) {
close(fd);
return 1;
}
WARN_ONCE(err != EINVAL,
"perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with unexpected error %d (%s)\n",
err, strerror(err));
/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
err = errno;
if (WARN_ONCE(fd < 0,
"perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
err, strerror(err)))
return -1;
close(fd);
return 0;
}
unsigned long perf_event_open_cloexec_flag(void)
{
static bool probed;
if (!probed) {
if (perf_flag_probe() <= 0)
flag = 0;
probed = true;
}
return flag;
}
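Call sites never need to know whether the running kernel accepts PERF_FLAG_FD_CLOEXEC; they simply forward whatever the probe returned, as the converted hunks above do:

	fd = sys_perf_event_open(&attr, /*pid=*/0, /*cpu=*/-1, /*group_fd=*/-1,
				 perf_event_open_cloexec_flag());
	if (fd < 0)
		pr_debug("perf_event_open failed\n");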