    perf/x86/intel/lbr: Support XSAVES for arch LBR read · c085fb87
    Kan Liang authored
    Reading the LBR registers in a perf NMI handler for a non-PEBS event
    incurs high overhead because there are many LBR registers. To reduce
    the overhead, use the XSAVES instruction to read the LBR registers
    instead.
    
    The XSAVES buffer used for the LBR read has to be per-CPU because
    lbr_read() is invoked from the NMI handler. The existing task_ctx_data
    buffer cannot be used because it is per-task and is only allocated for
    LBR call stack mode. A new lbr_xsave pointer is introduced in
    cpu_hw_events as the XSAVES buffer for the LBR read.
    
    The XSAVES buffer should be allocated only when LBR is used by a
    non-PEBS event on the CPU because the total size of the lbr_xsave is
    not small (~1.4KB).
    
    The XSAVES buffer is allocated when a non-PEBS event is added, but it
    is released lazily in x86_release_hardware(), when perf releases the
    entire PMU hardware resource. Perf may schedule the event frequently,
    e.g. under a high context switch rate, so the lazy release avoids the
    overhead of repeatedly allocating and freeing the buffer.
    
    If lbr_xsave cannot be allocated, fall back to the normal Arch LBR
    lbr_read().
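
The buffer lifecycle described above (allocate on first add, keep across event removal, free only on hardware release, fall back when allocation fails) can be sketched in user-space C. This is a minimal illustration, not the kernel code: only lbr_xsave, cpu_hw_events, and the ~1.4KB size come from the commit message. Every other name (cpu_hw_events_sketch, lbr_event_add, lbr_event_del, lbr_read_path, release_hardware, simulate_alloc_failure, NR_CPUS_SKETCH, LBR_XSAVE_SIZE) is hypothetical, and calloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4     /* hypothetical CPU count for the sketch */
#define LBR_XSAVE_SIZE 1400  /* "~1.4KB" per the commit; exact value assumed */

/* Hypothetical stand-in for the per-CPU struct cpu_hw_events. */
struct cpu_hw_events_sketch {
    void *lbr_xsave;         /* per-CPU XSAVES buffer, NULL until needed */
};

static struct cpu_hw_events_sketch cpu_events[NR_CPUS_SKETCH];

/* Which read method a CPU would use (names are illustrative only). */
enum lbr_read_path { LBR_READ_MSR, LBR_READ_XSAVES };

/* Test knob: pretend the allocator is out of memory. */
static bool simulate_alloc_failure;

/* A non-PEBS LBR event is added: allocate the buffer once and keep it. */
static void lbr_event_add(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    struct cpu_hw_events_sketch *cpuc = &cpu_events[cpu];

    if (cpuc->lbr_xsave)
        return;              /* still there from an earlier add */
    if (simulate_alloc_failure)
        return;              /* leave NULL; reads fall back */
    cpuc->lbr_xsave = calloc(1, LBR_XSAVE_SIZE);
}

/* Removing the event deliberately leaves the buffer in place. */
static void lbr_event_del(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
}

/* Pick the read method: XSAVES if a buffer exists, else the normal read. */
static enum lbr_read_path lbr_read_path(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    return cpu_events[cpu].lbr_xsave ? LBR_READ_XSAVES : LBR_READ_MSR;
}

/* Lazy release: free every CPU's buffer when the PMU is given up. */
static void release_hardware(void)
{
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        free(cpu_events[cpu].lbr_xsave);
        cpu_events[cpu].lbr_xsave = NULL;
    }
}
```

In this sketch a del/add cycle on the same CPU reuses the existing buffer, which is the cost the lazy release is meant to avoid, and a CPU whose allocation failed simply keeps reporting the non-XSAVES read path.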
    
    Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Dave Hansen <dave.hansen@intel.com>
    Link: https://lkml.kernel.org/r/1593780569-62993-24-git-send-email-kan.liang@linux.intel.com
lbr.c 47.8 KB