    perf/x86/intel/cqm: Return cached counter value from IRQ context · 2c534c0d
    Matt Fleming authored
    Peter reported the following potential crash, which I was able to
    reproduce with his test program:
    
    [  148.765788] ------------[ cut here ]------------
    [  148.765796] WARNING: CPU: 34 PID: 2840 at kernel/smp.c:417 smp_call_function_many+0xb6/0x260()
    [  148.765797] Modules linked in:
    [  148.765800] CPU: 34 PID: 2840 Comm: perf Not tainted 4.2.0-rc1+ #4
    [  148.765803]  ffffffff81cdc398 ffff88085f105950 ffffffff818bdfd5 0000000000000007
    [  148.765805]  0000000000000000 ffff88085f105990 ffffffff810e413a 0000000000000000
    [  148.765807]  ffffffff82301080 0000000000000022 ffffffff8107f640 ffffffff8107f640
    [  148.765809] Call Trace:
    [  148.765810]  <NMI>  [<ffffffff818bdfd5>] dump_stack+0x45/0x57
    [  148.765818]  [<ffffffff810e413a>] warn_slowpath_common+0x8a/0xc0
    [  148.765822]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
    [  148.765824]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
    [  148.765825]  [<ffffffff810e422a>] warn_slowpath_null+0x1a/0x20
    [  148.765827]  [<ffffffff811613f6>] smp_call_function_many+0xb6/0x260
    [  148.765829]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
    [  148.765831]  [<ffffffff81161748>] on_each_cpu_mask+0x28/0x60
    [  148.765832]  [<ffffffff8107f6ef>] intel_cqm_event_count+0x7f/0xe0
    [  148.765836]  [<ffffffff811cdd35>] perf_output_read+0x2a5/0x400
    [  148.765839]  [<ffffffff811d2e5a>] perf_output_sample+0x31a/0x590
    [  148.765840]  [<ffffffff811d333d>] ? perf_prepare_sample+0x26d/0x380
    [  148.765841]  [<ffffffff811d3497>] perf_event_output+0x47/0x60
    [  148.765843]  [<ffffffff811d36c5>] __perf_event_overflow+0x215/0x240
    [  148.765844]  [<ffffffff811d4124>] perf_event_overflow+0x14/0x20
    [  148.765847]  [<ffffffff8107e7f4>] intel_pmu_handle_irq+0x1d4/0x440
    [  148.765849]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
    [  148.765853]  [<ffffffff81219bad>] ? vunmap_page_range+0x19d/0x2f0
    [  148.765854]  [<ffffffff81219d11>] ? unmap_kernel_range_noflush+0x11/0x20
    [  148.765859]  [<ffffffff814ce6fe>] ? ghes_copy_tofrom_phys+0x11e/0x2a0
    [  148.765863]  [<ffffffff8109e5db>] ? native_apic_msr_write+0x2b/0x30
    [  148.765865]  [<ffffffff8109e44d>] ? x2apic_send_IPI_self+0x1d/0x20
    [  148.765869]  [<ffffffff81065135>] ? arch_irq_work_raise+0x35/0x40
    [  148.765872]  [<ffffffff811c8d86>] ? irq_work_queue+0x66/0x80
    [  148.765875]  [<ffffffff81075306>] perf_event_nmi_handler+0x26/0x40
    [  148.765877]  [<ffffffff81063ed9>] nmi_handle+0x79/0x100
    [  148.765879]  [<ffffffff81064422>] default_do_nmi+0x42/0x100
    [  148.765880]  [<ffffffff81064563>] do_nmi+0x83/0xb0
    [  148.765884]  [<ffffffff818c7c0f>] end_repeat_nmi+0x1e/0x2e
    [  148.765886]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
    [  148.765888]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
    [  148.765890]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
    [  148.765891]  <<EOE>>  [<ffffffff8110ab66>] finish_task_switch+0x156/0x210
    [  148.765898]  [<ffffffff818c1671>] __schedule+0x341/0x920
    [  148.765899]  [<ffffffff818c1c87>] schedule+0x37/0x80
    [  148.765903]  [<ffffffff810ae1af>] ? do_page_fault+0x2f/0x80
    [  148.765905]  [<ffffffff818c1f4a>] schedule_user+0x1a/0x50
    [  148.765907]  [<ffffffff818c666c>] retint_careful+0x14/0x32
    [  148.765908] ---[ end trace e33ff2be78e14901 ]---
    
    The CQM task events are not safe to read from within interrupt
    context because reading them requires an IPI to fetch the counter value
    on all sockets, and sending IPIs from within IRQ context is a
    "no-no".
    
    Make do with the last read counter value currently stored in
    event->count when we're invoked in this context.
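
    The idea can be illustrated with a small user-space sketch. This is
    not the kernel patch itself: sim_in_interrupt, sim_ipi_count,
    struct sim_event, sim_read_all_sockets() and sim_event_count() are
    hypothetical names invented for this illustration. The flag stands
    in for an interrupt-context check, and the helper stands in for the
    cross-socket read that would need an IPI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an "are we in interrupt context?" check. */
static bool sim_in_interrupt;

/* Number of simulated cross-socket reads (the operation that needs an IPI). */
static unsigned int sim_ipi_count;

/* Hypothetical event: a cached count plus the value a fresh read would see. */
struct sim_event {
	uint64_t count;    /* last value read, analogous to event->count */
	uint64_t hw_value; /* what an up-to-date cross-socket read returns */
};

/* Simulated "read the counter on all sockets"; unsafe from IRQ context. */
static uint64_t sim_read_all_sockets(struct sim_event *ev)
{
	sim_ipi_count++;
	return ev->hw_value;
}

/*
 * Return the event count. In (simulated) interrupt context, skip the
 * cross-socket read and return the cached value; otherwise refresh
 * the cache first.
 */
static uint64_t sim_event_count(struct sim_event *ev)
{
	if (sim_in_interrupt)
		return ev->count; /* possibly stale, but safe */

	ev->count = sim_read_all_sockets(ev);
	return ev->count;
}
```

    The trade-off in this sketch is that a caller in interrupt context
    may see a stale value, but never triggers the cross-socket read.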

    Reported-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Matt Fleming <matt.fleming@intel.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vikas Shivappa <vikas.shivappa@intel.com>
    Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
    Cc: Will Auld <will.auld@intel.com>
    Cc: <stable@vger.kernel.org>
    Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.uk
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
perf_event_intel_cqm.c 33.4 KB