    x86/virt/tdx: Allocate and set up PAMTs for TDMRs · ac3a2208
    Kai Huang authored
    The TDX module uses additional metadata to record things like which
    guest "owns" a given page of memory.  This metadata, referred to as the
    Physical Address Metadata Table (PAMT), essentially serves as the
    'struct page' for the TDX module.  PAMTs are not reserved by hardware
    up front.  They must be allocated by the kernel and then given to the
    TDX module during module initialization.
    
    TDX supports 3 page sizes: 4K, 2M, and 1G.  Each "TD Memory Region"
    (TDMR) has 3 PAMTs to track the 3 supported page sizes.  Each PAMT must
    be a physically contiguous area from a Convertible Memory Region (CMR).
    However, the PAMTs which track pages in one TDMR do not need to reside
    within that TDMR but can be anywhere in CMRs.  If one PAMT overlaps with
    any TDMR, the overlapping part must be reported as a reserved area in
    that particular TDMR.
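
The overlap rule above amounts to an interval intersection. The following is a minimal userspace sketch, not the kernel's code: `struct phys_range` and `pamt_overlap_in_tdmr()` are hypothetical names introduced only for illustration, and ranges are assumed to be half-open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open physical range [start, end); not a kernel type. */
struct phys_range {
	uint64_t start;
	uint64_t end;
};

/*
 * If the PAMT overlaps the TDMR, store the overlapping part in *rsvd
 * and return true.  That part is what would have to be reported as a
 * reserved area of the TDMR.  Return false when there is no overlap.
 */
static bool pamt_overlap_in_tdmr(struct phys_range tdmr,
				 struct phys_range pamt,
				 struct phys_range *rsvd)
{
	uint64_t start = pamt.start > tdmr.start ? pamt.start : tdmr.start;
	uint64_t end = pamt.end < tdmr.end ? pamt.end : tdmr.end;

	if (start >= end)
		return false;

	rsvd->start = start;
	rsvd->end = end;
	return true;
}
```

A PAMT that straddles the end of a TDMR yields only the part inside the TDMR; a PAMT placed entirely outside it consumes no reserved area of that TDMR.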
    
    Use alloc_contig_pages() since a PAMT must be a physically contiguous
    area and may be large (~1/256th of the size of the given TDMR).  The
    downside is that alloc_contig_pages() may fail at runtime.  One (bad)
    mitigation is to launch a TDX guest early during system boot so that
    the PAMTs are allocated early, but the only real fix is to add a boot
    option to allocate or reserve PAMTs during kernel boot.
    
    This approach is imperfect but will be improved later.
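
To see where the ~1/256th figure comes from, here is a rough sketch of the size arithmetic. It assumes a 16-byte PAMT entry per tracked page (an assumption for illustration; the real entry size is reported by the TDX module), a TDMR size that is a multiple of 1G, and each PAMT rounded up to a whole 4K page. The macro and function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE_4K	4096ULL
#define PAMT_ENTRY_SIZE	16ULL	/* assumed entry size, for illustration only */

/* The three page sizes TDX tracks: 4K, 2M and 1G. */
static const uint64_t tdx_page_sizes[3] = {
	4096ULL, 2ULL << 20, 1ULL << 30,
};

/* Size of the PAMT tracking one page size, rounded up to a 4K page. */
static uint64_t pamt_size(uint64_t tdmr_size, uint64_t page_size)
{
	uint64_t bytes = (tdmr_size / page_size) * PAMT_ENTRY_SIZE;

	return (bytes + PAGE_SIZE_4K - 1) & ~(PAGE_SIZE_4K - 1);
}

/* Combined size of the three PAMTs of one TDMR. */
static uint64_t pamt_total_size(uint64_t tdmr_size)
{
	uint64_t total = 0;

	for (int i = 0; i < 3; i++)
		total += pamt_size(tdmr_size, tdx_page_sizes[i]);

	return total;
}
```

Under these assumptions the 4K PAMT dominates: 16 bytes per 4096-byte page is exactly 1/256th of the TDMR, and the 2M and 1G PAMTs add only a few more pages.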
    
    TDX only supports a limited number of reserved areas per TDMR to cover
    both PAMTs and memory holes within the given TDMR.  If many PAMTs are
    allocated within a single TDMR, the reserved areas may not be sufficient
    to cover all of them.
    
    Adopt the following policies when allocating PAMTs for a given TDMR:
    
      - Allocate the three PAMTs of the TDMR in one contiguous chunk to
        minimize the total number of reserved areas consumed for PAMTs.
      - Try to allocate the PAMTs from the local node of the TDMR first,
        for better NUMA locality.
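
The first policy can be pictured as carving one contiguous allocation into three back-to-back PAMTs, so that the chunk needs at most one reserved area instead of three. This is only a sketch under assumed names: `struct pamt_layout` and `pamt_carve()` are hypothetical, and the allocation itself (and the NUMA preference) is left out.

```c
#include <stdint.h>

/* Hypothetical description of where each of the three PAMTs lives. */
struct pamt_layout {
	uint64_t base[3];	/* 4K, 2M, 1G PAMT base addresses */
	uint64_t size[3];	/* 4K, 2M, 1G PAMT sizes in bytes */
};

/*
 * Split one contiguous chunk starting at chunk_base into three PAMTs
 * placed back to back, in the order given by size[].
 * Returns the address just past the end of the chunk.
 */
static uint64_t pamt_carve(uint64_t chunk_base, const uint64_t size[3],
			   struct pamt_layout *out)
{
	uint64_t next = chunk_base;

	for (int i = 0; i < 3; i++) {
		out->base[i] = next;
		out->size[i] = size[i];
		next += size[i];
	}

	return next;
}
```

Because the three PAMTs are adjacent, any overlap between the chunk and a TDMR is a single range rather than up to three separate ones.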
    
    Also dump out how many pages are allocated for PAMTs when the TDX module
    is initialized successfully.  This helps answer the eternal "where did
    all my memory go?" questions.
    
    [ dhansen: merge in error handling cleanup ]
    Signed-off-by: Kai Huang <kai.huang@intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
    Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reviewed-by: Yuan Yao <yuan.yao@intel.com>
    Link: https://lore.kernel.org/all/20231208170740.53979-11-dave.hansen%40intel.com