• Vlastimil Babka's avatar
    mm, thp: tweak reclaim/compaction effort of local-only and all-node allocations · cc638f32
    Vlastimil Babka authored
    THP page faults now attempt a __GFP_THISNODE allocation first, which
    should only compact existing free memory, followed by another attempt
    that can allocate from any node using reclaim/compaction effort
    specified by global defrag setting and madvise.
    
    This patch makes the following changes to the scheme:
    
     - Before the patch, the first allocation relies on a check for
       pageblock order and __GFP_IO to prevent excessive reclaim. This
       however affects also the second attempt, which is not limited to
       single node.
    
       Instead of that, reuse the existing check for costly order
       __GFP_NORETRY allocations, and make sure the first THP attempt uses
       __GFP_NORETRY. As a side-effect, all costly order __GFP_NORETRY
       allocations will bail out if compaction needs reclaim, while
       previously they only bailed out when compaction was deferred due to
       previous failures.
    
       This should be still acceptable within the __GFP_NORETRY semantics.
    
     - Before the patch, the second allocation attempt (on all nodes) was
       passing __GFP_NORETRY. This is redundant as the check for pageblock
       order (discussed above) was stronger. It's also contrary to
       madvise(MADV_HUGEPAGE) which means some effort to allocate THP is
       requested.
    
       After this patch, the second attempt doesn't pass __GFP_THISNODE nor
       __GFP_NORETRY.
    
    To sum up, THP page faults now try the following attempts:
    
    1. local node only THP allocation with no reclaim, just compaction.
    2. for madvised VMA's or when synchronous compaction is enabled always - THP
       allocation from any node with effort determined by global defrag setting
       and VMA madvise
    3. fallback to base pages on any node
    
    Link: http://lkml.kernel.org/r/08a3f4dd-c3ce-0009-86c5-9ee51aba8557@suse.cz
    Fixes: b39d0ee2 ("mm, page_alloc: avoid expensive reclaim when compaction may not succeed")
    Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
    Acked-by: default avatarMichal Hocko <mhocko@suse.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
    Cc: David Rientjes <rientjes@google.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    cc638f32
mempolicy.c 74.8 KB