    KVM: PPC: Book3S HV: Handle page fault for a nested guest · fd10be25
    Suraj Jitindar Singh authored
    Consider a normal (L1) guest running under the main hypervisor (L0),
    and then a nested guest (L2) running under the L1 guest which is acting
    as a nested hypervisor. L0 has page tables to map the address space for
    L1, providing the translation from L1 real address -> L0 real address:
    
    	L1
    	|
    	| (L1 -> L0)
    	|
    	----> L0
    
    There are also page tables in L1 used to map the address space for L2,
    providing the translation from L2 real address -> L1 real address. Since
    the hardware can only walk a single level of page table, we need to
    maintain in L0 a "shadow_pgtable" for L2 which provides the translation
    from L2 real address -> L0 real address. This looks like:
    
    	L2				L2
    	|				|
    	| (L2 -> L1)			|
    	|				|
    	----> L1			| (L2 -> L0)
    	      |				|
    	      | (L1 -> L0)		|
    	      |				|
    	      ----> L0			--------> L0
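    
    The right-hand side of the diagram is the composition of the two
    left-hand translations. A minimal sketch of that idea, assuming flat
    single-level tables indexed by page number (the real radix tables are
    multi-level); every name here (toy_pgtable, toy_lookup, toy_compose) is
    hypothetical and not taken from the kernel source:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NPAGES  8           /* toy address space: 8 pages per level */
#define TOY_INVALID UINT32_MAX  /* marks an unmapped entry */

/* Flat toy "page table": index is the source page, value the target page. */
struct toy_pgtable {
	uint32_t pfn[TOY_NPAGES];
};

/* Mark every entry as unmapped. */
static void toy_init(struct toy_pgtable *t)
{
	for (int i = 0; i < TOY_NPAGES; i++)
		t->pfn[i] = TOY_INVALID;
}

/* Single-level lookup; returns false if the page is out of range or unmapped. */
static bool toy_lookup(const struct toy_pgtable *t, uint32_t page,
		       uint32_t *out)
{
	if (page >= TOY_NPAGES || t->pfn[page] == TOY_INVALID)
		return false;
	*out = t->pfn[page];
	return true;
}

/*
 * Compose L2 -> L1 with L1 -> L0 to obtain the L2 -> L0 translation that
 * a shadow table entry would hold. Fails if either level is unmapped.
 */
static bool toy_compose(const struct toy_pgtable *l2_to_l1,
			const struct toy_pgtable *l1_to_l0,
			uint32_t l2_page, uint32_t *l0_page)
{
	uint32_t l1_page;

	if (!toy_lookup(l2_to_l1, l2_page, &l1_page))
		return false;
	return toy_lookup(l1_to_l0, l1_page, l0_page);
}
```

    With L2 page 2 mapped to L1 page 5 and L1 page 5 mapped to L0 page 7,
    toy_compose() yields L0 page 7 for L2 page 2, which is the entry the
    hardware needs to find in a single walk.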
    
    When a page fault occurs while running a nested (L2) guest we need to
    insert a pte into this "shadow_pgtable" for the L2 -> L0 mapping. To
    do this we need to:
    
    1. Walk the pgtable in L1 memory to find the L2 -> L1 mapping, and
       provide a page fault to L1 if this mapping doesn't exist.
    2. Use our L1 -> L0 pgtable to convert this L1 address to an L0 address,
       or try to insert a pte for that mapping if it doesn't exist.
    3. Now that we have an L2 -> L0 mapping, insert this into our
       shadow_pgtable.
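    
    The three steps above can be sketched as a toy fault handler. This is
    an illustration only, assuming flat arrays in place of radix tables and
    a trivial bump allocator for L0 pages; nf_state, nf_handle_fault and the
    result codes are hypothetical names, not the functions added by this
    patch:

```c
#include <stdint.h>

#define NF_NPAGES  8
#define NF_INVALID UINT32_MAX

enum nf_result {
	NF_RESUME_L2,     /* shadow pte inserted, L2 can retry the access */
	NF_REFLECT_TO_L1, /* no L2 -> L1 mapping: L1 must handle the fault */
	NF_FAIL,          /* bad address or L0 could not back the page */
};

struct nf_state {
	uint32_t l2_to_l1[NF_NPAGES]; /* stands in for the pgtable in L1 memory */
	uint32_t l1_to_l0[NF_NPAGES]; /* L0's pgtable for L1 */
	uint32_t shadow[NF_NPAGES];   /* L0's shadow_pgtable for L2 */
	uint32_t next_free_l0_page;   /* trivial allocator for L0 pages */
};

/* Start with every table empty. */
static void nf_init(struct nf_state *s)
{
	for (int i = 0; i < NF_NPAGES; i++) {
		s->l2_to_l1[i] = NF_INVALID;
		s->l1_to_l0[i] = NF_INVALID;
		s->shadow[i] = NF_INVALID;
	}
	s->next_free_l0_page = 0;
}

static enum nf_result nf_handle_fault(struct nf_state *s, uint32_t l2_page)
{
	uint32_t l1_page, l0_page;

	if (l2_page >= NF_NPAGES)
		return NF_FAIL;

	/* Step 1: walk L1's table; a missing entry is L1's fault to handle. */
	l1_page = s->l2_to_l1[l2_page];
	if (l1_page == NF_INVALID)
		return NF_REFLECT_TO_L1;
	if (l1_page >= NF_NPAGES)
		return NF_FAIL;

	/* Step 2: translate L1 -> L0, inserting a mapping on demand. */
	l0_page = s->l1_to_l0[l1_page];
	if (l0_page == NF_INVALID) {
		if (s->next_free_l0_page >= NF_NPAGES)
			return NF_FAIL;
		l0_page = s->next_free_l0_page++;
		s->l1_to_l0[l1_page] = l0_page;
	}

	/* Step 3: record the combined L2 -> L0 mapping in the shadow table. */
	s->shadow[l2_page] = l0_page;
	return NF_RESUME_L2;
}
```

    Note the asymmetry between the two levels in this sketch: a miss in
    L1's table is handed to L1, while a miss in L0's own table is something
    L0 tries to resolve itself before giving up.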
    
    Once this mapping exists we can take rc faults when hardware is unable
    to automatically set the reference and change bits in the pte. On these
    we need to:
    
    1. Check the rc bits on the L2 -> L1 pte match, and otherwise reflect
       the fault down to L1.
    2. Set the rc bits in the L1 -> L0 pte which corresponds to the same
       host page.
    3. Set the rc bits in the L2 -> L0 pte.
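    
    A rough sketch of the rc fault steps, under the assumption that "match"
    means L1's pte already carries every bit the access requires (reference
    for any access, change as well for a write). The bit values and the
    names rc_ptes and rc_handle_fault are hypothetical and do not reflect
    the real pte layout:

```c
#include <stdbool.h>
#include <stdint.h>

#define RC_R 0x1u /* toy "reference" bit */
#define RC_C 0x2u /* toy "change" bit */

enum rc_result {
	RC_HANDLED,       /* bits set by L0, L2 can retry the access */
	RC_REFLECT_TO_L1, /* L1's pte lacks the bits: L1 must see the fault */
};

/* Flag words of the three ptes that describe the same host page. */
struct rc_ptes {
	uint32_t l2_to_l1; /* pte in L1's table */
	uint32_t l1_to_l0; /* pte in L0's table for L1 */
	uint32_t shadow;   /* pte in L0's shadow_pgtable for L2 */
};

static enum rc_result rc_handle_fault(struct rc_ptes *p, bool is_write)
{
	/* A read needs R; a write needs both R and C. */
	uint32_t need = RC_R | (is_write ? RC_C : 0u);

	/* Step 1: L1's pte must already have the bits, else reflect to L1. */
	if ((p->l2_to_l1 & need) != need)
		return RC_REFLECT_TO_L1;

	/* Step 2: set the bits in the L1 -> L0 pte for the same host page. */
	p->l1_to_l0 |= need;

	/* Step 3: set the bits in the L2 -> L0 (shadow) pte. */
	p->shadow |= need;
	return RC_HANDLED;
}
```

    In this sketch nothing in L0's tables changes when the fault is
    reflected, so L1 stays in charge of the rc state it exposes to L2.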
    
    As we reuse a large number of functions in book3s_64_mmu_radix.c for
    this, we also needed to refactor a number of these functions to take
    an lpid parameter so that the correct lpid is used for tlb invalidations.
    The functionality, however, remains the same.
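    
    The shape of that refactoring might look like the sketch below, where a
    shared helper receives the lpid from its caller rather than assuming a
    single one. This is an assumption-laden illustration: inv_tlb_page,
    inv_unmap_pte and the recording struct are hypothetical stand-ins and
    not the signatures in book3s_64_mmu_radix.c:

```c
#include <stdint.h>

/* Records the most recent invalidation so the behaviour can be inspected. */
struct inv_record {
	unsigned int lpid;
	uint64_t addr;
	int count;
};

static struct inv_record inv_last;

/* Stand-in for a tlb invalidation of one page in one partition. */
static void inv_tlb_page(unsigned int lpid, uint64_t addr)
{
	inv_last.lpid = lpid;
	inv_last.addr = addr;
	inv_last.count++;
}

/*
 * Shared helper: clear a pte and invalidate the translation. The lpid is
 * passed in explicitly, so one helper can serve both the L1 -> L0 table
 * and a shadow table that is tagged with a different lpid.
 */
static void inv_unmap_pte(uint64_t *ptep, uint64_t addr, unsigned int lpid)
{
	*ptep = 0;
	inv_tlb_page(lpid, addr);
}
```

    A caller working on L1's own table would pass L1's lpid, and a caller
    working on a shadow table would pass the lpid associated with that
    table; the helper body is identical in both cases.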
    
    Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
    Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
    Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>