Commit 50694c28 authored by Mel Gorman, committed by Linus Torvalds

mm: vmscan: check for fatal signals iff the process was throttled

Commit 5515061d ("mm: throttle direct reclaimers if PF_MEMALLOC
reserves are low and swap is backed by network storage") introduced a
check for fatal signals after a process gets throttled for network
storage.  The intention was that if a throttled process was killed, it
should not trigger the OOM killer.  As pointed out by Minchan Kim and
David Rientjes, this check is in the wrong place and too broad.  If a
system is in an OOM situation and a process is exiting, it can loop in
__alloc_pages_slowpath(), calling direct reclaim repeatedly.  Because
the fatal signal is pending, direct reclaim returns 1 as if it were
making forward progress, and the task can effectively deadlock.
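
The failure mode can be modelled outside the kernel.  The stand-alone C
program below is only an illustrative sketch (direct_reclaim_old(), the
flag variables and the retry cap are inventions for the demo, not kernel
code): it shows how a reclaim path that reports progress purely because
a fatal signal is pending keeps the allocation retry loop spinning even
though nothing is ever freed and the OOM killer is never invoked.

#include <stdbool.h>
#include <stdio.h>

static bool fatal_signal_pending_flag = true; /* the allocating task was killed */
static bool page_reclaimed = false;           /* reclaim never frees anything   */

/* Mimics the old try_to_free_pages() behaviour: claims progress whenever
 * a fatal signal is pending, even though no page has been reclaimed. */
static unsigned long direct_reclaim_old(void)
{
	if (fatal_signal_pending_flag)
		return 1;
	return page_reclaimed ? 1 : 0;
}

int main(void)
{
	unsigned long progress;
	int attempt = 0;

	/* Simplified stand-in for the __alloc_pages_slowpath() retry loop.
	 * With the old check it spins forever; cap it so the demo exits. */
	do {
		progress = direct_reclaim_old();
		printf("attempt %d: progress=%lu, page_reclaimed=%d\n",
		       ++attempt, progress, page_reclaimed);
	} while (!page_reclaimed && progress && attempt < 5);

	printf("the real loop has no such cap and never breaks out\n");
	return 0;
}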

This patch moves the fatal_signal_pending() check made after throttling
into throttle_direct_reclaim(), where it belongs.  If the process is
killed while throttled, it will return immediately without performing
direct reclaim, except that it will now have TIF_MEMDIE set and will be
able to use the PFMEMALLOC reserves.
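
For context, the reason a killed task can still make forward progress is
that a task with TIF_MEMDIE set is allowed to allocate below the
watermarks.  The toy model below sketches that policy only; the flag
values and the name gfp_to_alloc_flags_model() are invented for
illustration and this is not the kernel's gfp_to_alloc_flags().

#include <stdbool.h>
#include <stdio.h>

#define ALLOC_WMARK_MIN		0x1	/* illustrative values only */
#define ALLOC_NO_WATERMARKS	0x4

/* Toy model: a dying (TIF_MEMDIE) task may ignore the watermarks and
 * dip into the reserves so it can free its memory and exit quickly. */
static unsigned int gfp_to_alloc_flags_model(bool tif_memdie)
{
	unsigned int alloc_flags = ALLOC_WMARK_MIN;

	if (tif_memdie)
		alloc_flags |= ALLOC_NO_WATERMARKS;
	return alloc_flags;
}

int main(void)
{
	printf("normal task flags: 0x%x\n", gfp_to_alloc_flags_model(false));
	printf("killed task flags: 0x%x\n", gfp_to_alloc_flags_model(true));
	return 0;
}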

Minchan pointed out that it may be better to perform direct reclaim
before returning, because there may be pages that can easily be
reclaimed and that would avoid dipping into the reserves.  However, we
do no such targeted reclaim and there is no guarantee that suitable
pages are available.  As this throttling is expected to happen when
swap-over-NFS is used, there is a possibility that the process will
instead swap, which may allocate network buffers from the PFMEMALLOC
reserves.  Hence, in the swap-over-NFS case, where a process can be
throttled and killed, it can either use the reserves to exit or
potentially use the reserves to swap a few pages and then exit.  This
patch takes the option of using the reserves if necessary to allow the
process to exit quickly.

If this patch passes review it should be considered a -stable candidate
for 3.6.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 82b212f4
@@ -2207,9 +2207,12 @@ static bool pfmemalloc_watermark_ok(pg_data_t *pgdat)
  * Throttle direct reclaimers if backing storage is backed by the network
  * and the PFMEMALLOC reserve for the preferred node is getting dangerously
  * depleted. kswapd will continue to make progress and wake the processes
- * when the low watermark is reached
+ * when the low watermark is reached.
+ *
+ * Returns true if a fatal signal was delivered during throttling. If this
+ * happens, the page allocator should not consider triggering the OOM killer.
  */
-static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
+static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
 					nodemask_t *nodemask)
 {
 	struct zone *zone;
@@ -2224,13 +2227,20 @@ static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
 	 * processes to block on log_wait_commit().
 	 */
 	if (current->flags & PF_KTHREAD)
-		return;
+		goto out;
+
+	/*
+	 * If a fatal signal is pending, this process should not throttle.
+	 * It should return quickly so it can exit and free its memory
+	 */
+	if (fatal_signal_pending(current))
+		goto out;
 
 	/* Check if the pfmemalloc reserves are ok */
 	first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
 	pgdat = zone->zone_pgdat;
 	if (pfmemalloc_watermark_ok(pgdat))
-		return;
+		goto out;
 
 	/* Account for the throttling */
 	count_vm_event(PGSCAN_DIRECT_THROTTLE);
@@ -2246,12 +2256,20 @@ static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
 	if (!(gfp_mask & __GFP_FS)) {
 		wait_event_interruptible_timeout(pgdat->pfmemalloc_wait,
 			pfmemalloc_watermark_ok(pgdat), HZ);
-		return;
+
+		goto check_pending;
 	}
 
 	/* Throttle until kswapd wakes the process */
 	wait_event_killable(zone->zone_pgdat->pfmemalloc_wait,
 		pfmemalloc_watermark_ok(pgdat));
+
+check_pending:
+	if (fatal_signal_pending(current))
+		return true;
+
+out:
+	return false;
 }
 
 unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
@@ -2273,13 +2291,12 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		.gfp_mask = sc.gfp_mask,
 	};
 
-	throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
-
 	/*
-	 * Do not enter reclaim if fatal signal is pending. 1 is returned so
-	 * that the page allocator does not consider triggering OOM
+	 * Do not enter reclaim if fatal signal was delivered while throttled.
+	 * 1 is returned so that the page allocator does not OOM kill at this
+	 * point.
 	 */
-	if (fatal_signal_pending(current))
+	if (throttle_direct_reclaim(gfp_mask, zonelist, nodemask))
 		return 1;
 
 	trace_mm_vmscan_direct_reclaim_begin(order,
...