    mm: fix old/young bit handling in the faulting path · 4cd7ba16
    Ram Tummala authored
    Commit 3bd786f7 ("mm: convert do_set_pte() to set_pte_range()")
    replaced do_set_pte() with set_pte_range() and introduced a regression
    in the following fault path for non-anonymous vmas: the PTE for the
    faulting address is marked old instead of young.
    
    handle_pte_fault()
      do_pte_missing()
        do_fault()
          do_read_fault() || do_cow_fault() || do_shared_fault()
            finish_fault()
              set_pte_range()
    
    The polarity of the prefault calculation is inverted, so prefault is
    incorrectly set for the faulting address itself.  With prefault set, the
    following check marks the PTE old rather than young.  On some
    architectures this causes a second fault, taken only to mark the PTE
    young, when the access is retried.
    
        if (prefault && arch_wants_old_prefaulted_pte())
            entry = pte_mkold(entry);
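
    The inverted polarity can be illustrated with a small userspace sketch.
    It assumes that prefault is derived from whether the faulting address
    lies inside the range being mapped; the diff itself is not reproduced
    here, so that shape is an assumption.  PAGE_SIZE is fixed at 4096 for
    the example, and addr_in_range(), prefault_buggy(), prefault_fixed()
    and pte_marked_old() are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE 4096UL /* assumed page size for this sketch */

/* Hypothetical helper: true if val lies in [start, start + len). */
static bool addr_in_range(unsigned long val, unsigned long start,
                          unsigned long len)
{
    return val >= start && val - start < len;
}

/* Inverted polarity: a range that contains the faulting address is
 * treated as prefaulted. */
static bool prefault_buggy(unsigned long fault_addr, unsigned long addr,
                           unsigned int nr)
{
    assert(nr > 0);
    return addr_in_range(fault_addr, addr, nr * PAGE_SIZE);
}

/* Intended polarity: only a range that does not contain the faulting
 * address counts as prefaulted. */
static bool prefault_fixed(unsigned long fault_addr, unsigned long addr,
                           unsigned int nr)
{
    assert(nr > 0);
    return !addr_in_range(fault_addr, addr, nr * PAGE_SIZE);
}

/* Mirrors the quoted check: true if the PTE would be marked old. */
static bool pte_marked_old(bool prefault, bool arch_wants_old)
{
    return prefault && arch_wants_old;
}
```

    For a single page mapped at the faulting address, prefault_buggy()
    returns true and the PTE is marked old on architectures that want old
    prefaulted PTEs, while prefault_fixed() returns false and the PTE stays
    young.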
    
    On a subsequent fault on the same address, the faulting path sees a
    non-NULL vmf->pte and does not take the do_pte_missing() path; the PTE
    is then correctly marked young in handle_pte_fault() itself.
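
    The cost of the extra fault can be seen in a toy model of the two
    paths.  This is only a sketch, assuming an architecture on which an
    access through an old PTE traps; struct toy_pte and the toy_*()
    functions are hypothetical and do not correspond to kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy PTE: only the two bits relevant to this discussion. */
struct toy_pte {
    bool present;
    bool young;
};

/* Models the missing-PTE path: install the entry, marking it old when
 * it is (rightly or wrongly) considered prefaulted. */
static void toy_install_pte(struct toy_pte *pte, bool prefault,
                            bool arch_wants_old)
{
    pte->present = true;
    pte->young = !(prefault && arch_wants_old);
}

/* Models the present-PTE path: the entry is simply marked young. */
static void toy_mark_young(struct toy_pte *pte)
{
    pte->young = true;
}

/* Counts the faults taken until an access to the address succeeds.
 * prefault is the value computed for the faulting address itself. */
static int toy_faults_until_access(bool prefault, bool arch_wants_old)
{
    struct toy_pte pte = { false, false };
    int faults = 0;

    while (!pte.present || !pte.young) {
        faults++;
        if (!pte.present)
            toy_install_pte(&pte, prefault, arch_wants_old);
        else
            toy_mark_young(&pte);
    }

    /* In this model an access needs at most two faults. */
    assert(faults >= 1 && faults <= 2);
    return faults;
}
```

    In this model, toy_faults_until_access(true, true) takes two faults
    (the buggy case), while toy_faults_until_access(false, true) takes one.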
    
    This bug degrades performance in the fault handling path because of the
    unnecessary double faulting.
    
    Link: https://lkml.kernel.org/r/20240710014539.746200-1-rtummala@nvidia.com
    Fixes: 3bd786f7 ("mm: convert do_set_pte() to set_pte_range()")
    Signed-off-by: Ram Tummala <rtummala@nvidia.com>
    Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Yin Fengwei <fengwei.yin@intel.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>