    perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch · ce711ea3
    Kan Liang authored
    In the LBR call stack mode, LBR information is used to reconstruct a
    call stack. To get the complete call stack, perf has to save/restore
    all LBR registers during a context switch. Because there are so many
    LBR registers, this process incurs high CPU overhead. To reduce the
    CPU overhead during a context switch, use the XSAVES/XRSTORS
    instructions.
    
    Every XSAVE area must follow a canonical format: the legacy region, an
    XSAVE header and the extended region. Although the LBR information is
    only kept in the extended region, space for the legacy region and
    the XSAVE header is still required. Add a new dedicated structure for LBR
    XSAVES support.
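    
    The layout described above can be sketched in plain C as follows. This
    is an illustration only, not the structure added by this patch: all
    names ending in _sketch, the field list of the LBR state, and the
    maximum depth of 32 are assumptions. The 512-byte legacy region and the
    64-byte XSAVE header are the architectural sizes. The sketch assumes the
    LBR component is the only one saved, so it starts right after the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XSAVE_LEGACY_SIZE     512 /* legacy (FXSAVE) region */
#define XSAVE_HEADER_SIZE     64  /* XSAVE header */
#define SKETCH_LBR_MAX_DEPTH  32  /* assumed maximum LBR depth */

/* One LBR record: branch source, branch target, and metadata. */
struct lbr_entry_sketch {
	uint64_t from;
	uint64_t to;
	uint64_t info;
};

/* Hypothetical LBR state component kept in the extended region. */
struct lbr_state_sketch {
	uint64_t lbr_ctl;
	uint64_t lbr_depth;
	uint64_t ler_from;
	uint64_t ler_to;
	uint64_t ler_info;
	struct lbr_entry_sketch entries[SKETCH_LBR_MAX_DEPTH];
};

/*
 * Dedicated buffer: the legacy region and header carry no LBR data,
 * but the canonical XSAVE format still requires room for them.
 */
struct lbr_xsave_area_sketch {
	uint8_t legacy[XSAVE_LEGACY_SIZE];
	uint8_t header[XSAVE_HEADER_SIZE];
	struct lbr_state_sketch lbr;
};

static_assert(offsetof(struct lbr_xsave_area_sketch, header) == 512,
	      "header must follow the legacy region");
static_assert(offsetof(struct lbr_xsave_area_sketch, lbr) == 576,
	      "extended region must follow the header");

/* Total bytes the dedicated buffer occupies in this sketch. */
static size_t lbr_xsave_area_size(void)
{
	return sizeof(struct lbr_xsave_area_sketch);
}
```

    With the assumed depth of 32, the LBR state takes 5 * 8 + 32 * 24 = 808
    bytes, so the whole buffer in this sketch is 576 + 808 = 1384 bytes.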
    
    Before enabling XSAVES support, the size of the LBR state has to be
    sanity checked, because:
    - The size of the software structure is calculated from the maximum
      LBR depth, which is enumerated by the CPUID leaf for Arch LBR.
      The size of the LBR state is enumerated by the CPUID leaf for XSAVE
      support of Arch LBR. If the values from the two CPUID leaves are
      inconsistent, a buffer overflow may result. For example, a hypervisor
      may unintentionally set inconsistent values for the two emulated
      CPUID leaves.
    - Unlike other state components, the size of the LBR state depends on
      the maximum number of LBRs, which may vary from generation to
      generation.
    
    Expose the function xfeature_size() for the sanity check.
    The LBR XSAVES support will be disabled if the size of the LBR state
    enumerated by CPUID doesn't match the size of the software
    structure.
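    
    A minimal user-space sketch of this sanity check is shown below. It is
    not the kernel code: the function names, the struct lbr_record type and
    the five-word fixed part of the state are assumptions made for
    illustration. In the kernel, the enumerated size would come from
    xfeature_size() and the depth from the Arch LBR CPUID leaf; here both
    are passed in as plain parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One LBR record (from, to, info); hypothetical stand-in type. */
struct lbr_record {
	uint64_t from;
	uint64_t to;
	uint64_t info;
};

/* Assumed fixed part of the LBR state that precedes the records. */
#define LBR_STATE_FIXED_SIZE (5 * sizeof(uint64_t))

/* Size of the software structure, derived from the maximum LBR depth. */
static size_t lbr_state_size_from_depth(unsigned int max_depth)
{
	return LBR_STATE_FIXED_SIZE + max_depth * sizeof(struct lbr_record);
}

/*
 * Use XSAVES for LBR only if the CPU supports it and the size enumerated
 * for the XSAVE state component equals the size of the software structure.
 * A mismatch means the two CPUID leaves disagree, so fall back.
 */
static bool lbr_xsaves_usable(bool cpu_has_xsaves,
			      size_t enumerated_state_size,
			      unsigned int max_depth)
{
	if (!cpu_has_xsaves)
		return false;

	return enumerated_state_size == lbr_state_size_from_depth(max_depth);
}
```

    For example, with a depth of 32 the software structure in this sketch
    is 808 bytes, so an enumerated size of 808 passes the check, while the
    same enumerated size paired with a depth of 64 is rejected.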
    
    The XSAVE instruction requires 64-byte alignment for state buffers. A
    new macro is added to reflect the alignment requirement. A 64-byte
    aligned kmem_cache is created for architecture LBR.
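    
    The alignment requirement can be illustrated in user space as below.
    This is a sketch under assumptions, not the kernel allocation path: the
    macro XSAVE_ALIGNMENT_SKETCH and the helper names are hypothetical, and
    C11 aligned_alloc() stands in for the 64-byte aligned kmem_cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the alignment macro added by the patch. */
#define XSAVE_ALIGNMENT_SKETCH 64

/* Round a size up to the next multiple of the XSAVE alignment. */
static size_t round_up_to_xsave_alignment(size_t size)
{
	return (size + XSAVE_ALIGNMENT_SKETCH - 1) &
	       ~(size_t)(XSAVE_ALIGNMENT_SKETCH - 1);
}

/*
 * Allocate a zeroed buffer whose address is 64-byte aligned.
 * aligned_alloc() expects the size to be a multiple of the alignment,
 * so the requested size is padded first. Returns NULL on failure.
 */
static void *alloc_xsave_buffer(size_t size)
{
	size_t padded = round_up_to_xsave_alignment(size);
	void *buf = aligned_alloc(XSAVE_ALIGNMENT_SKETCH, padded);

	if (buf)
		memset(buf, 0, padded);
	return buf;
}

/* Check that a pointer satisfies the XSAVE alignment requirement. */
static bool is_xsave_aligned(const void *p)
{
	return ((uintptr_t)p % XSAVE_ALIGNMENT_SKETCH) == 0;
}
```

    A caller would obtain a buffer with alloc_xsave_buffer(), confirm it
    with is_xsave_aligned() before handing it to a save/restore routine,
    and release it with free().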
    
    Currently, the structure for each state component is maintained in
    fpu/types.h. The structure for the new LBR state component should be
    maintained in the same place. Move structure lbr_entry to fpu/types.h as
    well for broader sharing.
    
    Add dedicated lbr_save/lbr_restore functions for LBR XSAVES support,
    which invoke the corresponding xstate helpers to XSAVES/XRSTORS LBR
    information at context switch when call stack mode is enabled.
    Since the XSAVES/XRSTORS instructions will eventually be invoked, the
    dedicated functions are named with the '_xsaves'/'_xrstors' suffix.
    
    Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Dave Hansen <dave.hansen@intel.com>
    Link: https://lkml.kernel.org/r/1593780569-62993-23-git-send-email-kan.liang@linux.intel.com
lbr.c 46.8 KB