    mm: page_alloc: unreserve highatomic page blocks before oom · ac3f3b0a
    Charan Teja Kalla authored
    __alloc_pages_direct_reclaim() is called from the slowpath allocation,
    where high atomic reserves can be unreserved if reclaim makes progress
    and yet no suitable page is found.  Later, should_reclaim_retry() is
    called from the slowpath allocation to decide whether reclaim should be
    retried before the OOM kill path is taken.
    
    should_reclaim_retry() checks the available memory (reclaimable + free
    pages) against the min wmark level of a zone and returns:
    
    a) true, if it is above the min wmark, so that the slowpath allocation
       retries reclaim.
    
    b) false otherwise, so that the slowpath allocation takes the OOM kill
       path.
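    
    The check described above can be modelled in a few lines of userspace C.
    This is only a sketch: struct zone_model and model_should_retry() are
    hypothetical names, sizes are in kB rather than pages, and the kernel's
    real check takes more inputs (allocation order, lowmem reserves, retry
    counters).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of one zone; not the kernel's struct zone. */
struct zone_model {
	long free_kb;        /* free memory, in kB */
	long reclaimable_kb; /* memory that reclaim could still free, in kB */
	long min_wmark_kb;   /* min watermark, in kB */
};

/*
 * Returns true when retrying reclaim still looks worthwhile, i.e. when
 * free + reclaimable memory is above the min watermark.  Returning false
 * corresponds to the slowpath taking the OOM kill path.
 */
static bool model_should_retry(const struct zone_model *z)
{
	assert(z->free_kb >= 0 && z->reclaimable_kb >= 0);

	long available_kb = z->free_kb + z->reclaimable_kb;

	return available_kb > z->min_wmark_kb;
}
```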
    
    should_reclaim_retry() can also unreserve the high atomic reserves **but
    only after all the reclaim retries are exhausted.**
    
    Consider a case where there is almost no reclaimable memory and the free
    pages consist mostly of high atomic reserves that the allocation context
    cannot use.  The available memory is then below the min wmark level, so
    should_reclaim_retry() returns false and the allocation request takes the
    OOM kill path.  This can turn into an early OOM kill if the high atomic
    reserves are holding a lot of free memory and unreserving them is not
    attempted.
    
    An (early) OOM was encountered on a VM with the following state:
    [  295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB
    high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB
    active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB
    present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB
    local_pcp:492kB free_cma:0kB
    [  295.998656] lowmem_reserve[]: 0 32
    [  295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH)
    33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
    0*4096kB = 7752kB
    
    Per the above log, the ~7MB of free memory that sits in the high atomic
    reserves is not freed up before falling back to the OOM kill path.
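    
    The numbers in the log can be checked with a short calculation.  The
    sketch below is illustrative only: usable_free_kb() and buddy_total_kb()
    are hypothetical helpers, and clamping the usable amount at zero is a
    modelling assumption rather than a statement about how the kernel
    computes it.

```c
#include <assert.h>

/*
 * Free memory an allocation can use when it is not allowed to dip into
 * the high atomic reserves (assumption: clamp at zero).
 */
static long usable_free_kb(long free_kb, long reserved_highatomic_kb)
{
	assert(free_kb >= 0 && reserved_highatomic_kb >= 0);

	long usable = free_kb - reserved_highatomic_kb;

	return usable > 0 ? usable : 0;
}

/*
 * Sum a buddy free-list line: counts[order] blocks of (4kB << order) each,
 * assuming a 4kB base page as in the log.
 */
static long buddy_total_kb(const long *counts, int nr_orders)
{
	long total = 0;

	for (int order = 0; order < nr_orders; order++)
		total += counts[order] * (4L << order);

	return total;
}
```

    With the logged values, the free list 508*4kB + 241*8kB + 143*16kB +
    33*32kB + 7*64kB sums to 7752kB, while free:7728kB minus
    reserved_highatomic:8192KB leaves nothing usable.  Even if all 52kB on
    the anon and file LRU lists were counted as reclaimable, the total would
    stay well below min:804kB.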
    
    Fix it by trying to unreserve the high atomic reserves in
    should_reclaim_retry() before __alloc_pages_direct_reclaim() can fall back
    to the OOM kill path.
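    
    The intended control flow can be sketched as a userspace model.  This is
    not the patch itself: struct zone_state, above_min_wmark(),
    release_highatomic() and retry_decision() are hypothetical names, sizes
    are in kB, and releasing the whole reserve in one step is a
    simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical zone snapshot, in kB; not a kernel structure. */
struct zone_state {
	long free_kb;
	long reclaimable_kb;
	long reserved_highatomic_kb;
	long min_wmark_kb;
};

/* Watermark check for a context that cannot use the high atomic reserves. */
static bool above_min_wmark(const struct zone_state *z)
{
	long usable = z->free_kb - z->reserved_highatomic_kb;

	if (usable < 0)
		usable = 0;

	return usable + z->reclaimable_kb > z->min_wmark_kb;
}

/* Stand-in for unreserving; returns true if anything was released. */
static bool release_highatomic(struct zone_state *z)
{
	if (z->reserved_highatomic_kb == 0)
		return false;

	z->reserved_highatomic_kb = 0;
	return true;
}

/*
 * Retry decision with the fix applied: before answering "no" (which
 * leads to the OOM kill path), try to release the high atomic reserves
 * and ask for a retry if that released anything.
 */
static bool retry_decision(struct zone_state *z)
{
	assert(z->free_kb >= 0 && z->reserved_highatomic_kb >= 0);

	if (above_min_wmark(z))
		return true;

	return release_highatomic(z);
}
```

    Fed with the logged state (free 7728kB, reserved 8192kB, min 804kB), the
    model first fails the watermark check, releases the reserve, and asks
    for a retry instead of going to OOM.  Only a zone with nothing left to
    unreserve produces a "no retry" answer.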
    
    Link: https://lkml.kernel.org/r/1700823445-27531-1-git-send-email-quic_charante@quicinc.com
    Fixes: 0aaa29a5 ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand")
    Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Reported-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
    Suggested-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Cc: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>