    [PATCH] Convert i_shared_sem back to a spinlock · c0868962
    Andrew Morton authored
    Having a semaphore in there causes modest performance regressions on heavily
    mmap-intensive workloads on some hardware.  Specifically, up to 30% in SDET on
    NUMAQ and big PPC64.
    
    So switch it back to being a spinlock.  This does mean that unmap_vmas() needs
    to be told whether or not it is allowed to schedule away; that's simple to do
    via the zap_details structure.
    
This change means that there will be high scheduling latencies when someone
    truncates a large file which is currently mmapped, but nobody does that
    anyway.  The scheduling points in unmap_vmas() are mainly for munmap() and
    exit(), and they still will work OK for that.
    
    From: Hugh Dickins <hugh@veritas.com>
    
      Sorry, my premature optimizations (trying to pass down NULL zap_details
      except when needed) have caught you out doubly: unmap_mapping_range_list was
      NULLing the details even though atomic was set; and if it hadn't, then
      zap_pte_range would have missed free_swap_and_cache and pte_clear when pte
      not present.  Moved the optimization into zap_pte_range itself.  Plus
      massive documentation update.
    
    From: Hugh Dickins <hugh@veritas.com>
    
      Here's a second patch to add to the first: mremap's cows can't come home
      without releasing the i_mmap_lock, better move the whole "Subtle point"
      locking from move_vma into move_page_tables.  And it's possible for the file
      that was behind an anonymous page to be truncated while we drop that lock,
      don't want to abort mremap because of VM_FAULT_SIGBUS.
    
      (Eek, should we be checking do_swap_page of a vm_file area against the
      truncate_count sequence?  Technically yes, but I doubt we need bother.)
    
    
    - We cannot hold i_mmap_lock across move_one_page() because
      move_one_page() needs to perform __GFP_WAIT allocations of pagetable pages.
    
    - Move the cond_resched() out so we test it once per page rather than only
      when move_one_page() returns -EAGAIN.