    KVM: arm64: Infer the PA offset from IPA in stage-2 map walker · 1f0f4a2e
    Oliver Upton authored
    Until now, the page table walker counted increments to the PA and IPA
    of a walk in two separate places. While the PA is incremented as soon as
    a leaf PTE is installed in stage2_map_walker_try_leaf(), the IPA is
    actually bumped in the generic table walker context. Critically,
    __kvm_pgtable_visit() rereads the PTE after the LEAF callback returns
    to work out if a table or leaf was installed, and only bumps the IPA for
    a leaf PTE.
    
    This arrangement worked fine when we handled faults behind the write lock,
    as the walker had exclusive access to the stage-2 page tables. However,
    commit 1577cb58 ("KVM: arm64: Handle stage-2 faults in parallel")
    started handling all stage-2 faults behind the read lock, opening up a
    race where a walker could increment the PA but not the IPA of a walk.
    Nothing good ensues, as the walker starts mapping with the incorrect
    IPA -> PA relationship.
    
    For example, assume that two vCPUs took a data abort on the same IPA.
    One observes that dirty logging is disabled, and the other observes that
    it is enabled:
    
      vCPU attempting PMD mapping		  vCPU attempting PTE mapping
      ======================================  =====================================
      /* install PMD */
      stage2_make_pte(ctx, leaf);
      data->phys += granule;
      					  /* replace PMD with a table */
      					  stage2_try_break_pte(ctx, data->mmu);
    					  stage2_make_pte(ctx, table);
      /* table is observed */
      ctx.old = READ_ONCE(*ptep);
      table = kvm_pte_table(ctx.old, level);
    
      /*
       * map walk continues w/o incrementing
       * IPA.
       */
      __kvm_pgtable_walk(..., level + 1);
    
    Bring an end to the whole mess by using the IPA as the single source of
    truth for how far along a walk has gotten. Work out the correct PA to
    map by calculating the IPA offset from the beginning of the walk and add
    that to the starting physical address.
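
    As a minimal sketch of that calculation (not the patch itself), the
    PA can be derived on demand from the walk's current IPA. The struct
    and function names below (walk_ctx, map_data, walker_phys_addr) are
    hypothetical, simplified stand-ins and not the kernel's own types;
    the only assumption is that the walker records the IPA at which the
    walk started and the PA that corresponds to it.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-visit walker context. */
struct walk_ctx {
	uint64_t start;	/* IPA at which the walk began */
	uint64_t addr;	/* IPA currently being visited */
};

/* Hypothetical stand-in for the map walker's private data. */
struct map_data {
	uint64_t phys;	/* PA backing walk_ctx.start; never incremented */
};

/*
 * Derive the PA to map from how far the IPA has advanced. The IPA is the
 * only cursor, so the PA cannot drift away from it.
 */
static uint64_t walker_phys_addr(const struct walk_ctx *ctx,
				 const struct map_data *data)
{
	return data->phys + (ctx->addr - ctx->start);
}
```

    With this shape, a visit that does not advance the IPA (for example,
    because a concurrent walker replaced the leaf with a table) cannot
    advance the PA either, since there is no second counter to bump.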
    
    Cc: stable@vger.kernel.org
    Fixes: 1577cb58 ("KVM: arm64: Handle stage-2 faults in parallel")
    Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
    Signed-off-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20230421071606.1603916-2-oliver.upton@linux.dev
kvm_pgtable.h 23.5 KB