    KVM: PPC: Optimize clearing TCEs for sparse tables · 6e301a8e
    Alexey Kardashevskiy authored
    The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE
    table and a table with userspace addresses. These tables are radix trees
    whose indirect levels are allocated when they are first written to. Since
    memory allocation is problematic in real mode, we have 2 accessors
    to the entries:
    - for virtual mode: it allocates the memory and it is always expected
    to return non-NULL;
    - for real mode: it does not allocate and can return NULL.
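
The two accessors described above can be sketched roughly as follows. This is a simplified userspace illustration, not the kernel code: the names sparse_table, entry_alloc and entry_ro, the two-level layout and the 512-entry level size are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LEVEL_BITS 9                       /* hypothetical: 512 entries per level */
#define LEVEL_SIZE (1UL << LEVEL_BITS)

/* Hypothetical two-level table; leaves are allocated on first write. */
struct sparse_table {
	uint64_t *leaves[LEVEL_SIZE];
};

/* Allocating accessor (virtual mode): creates the leaf if it is missing. */
static uint64_t *entry_alloc(struct sparse_table *t, unsigned long idx)
{
	unsigned long hi = idx >> LEVEL_BITS, lo = idx & (LEVEL_SIZE - 1);

	assert(hi < LEVEL_SIZE);
	if (!t->leaves[hi]) {
		t->leaves[hi] = calloc(LEVEL_SIZE, sizeof(uint64_t));
		if (!t->leaves[hi])
			abort();   /* sketch only: callers expect non-NULL */
	}
	return &t->leaves[hi][lo];
}

/* Non-allocating accessor (real mode): NULL means the leaf was never written. */
static uint64_t *entry_ro(const struct sparse_table *t, unsigned long idx)
{
	unsigned long hi = idx >> LEVEL_BITS, lo = idx & (LEVEL_SIZE - 1);

	if (hi >= LEVEL_SIZE || !t->leaves[hi])
		return NULL;
	return &t->leaves[hi][lo];
}
```

A NULL result from the non-allocating accessor tells the caller that nothing in that part of the table was ever written.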
    
    Also, DMA windows can span up to 55 bits of the address space and since
    we never have this much RAM, such windows are sparse. However, currently
    the SPAPR TCE IOMMU driver walks through all TCEs to unpin DMA memory.
    
    Since we maintain a userspace addresses table for VFIO which is a mirror
    of the hardware table, we can use it to know which parts of the DMA
    window have not been mapped and skip them; this patch does exactly that.
    
    Bare metal systems do not have this problem as they use a bypass mode
    of a PHB which maps RAM directly.
    
    This helps a lot with sparse DMA windows, reducing the shutdown time from
    about 3 minutes per 1 billion TCEs to a few seconds for a 32GB sparse guest.
    Just skipping the last level seems to be good enough.
    
    As the non-allocating accessor is now used in virtual mode as well, rename
    it from IOMMU_TABLE_USERSPACE_ENTRY_RM (real mode) to _RO (read only).
    Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
vfio_iommu_spapr_tce.c 32.7 KB