Commit 2a11ff06 authored by Christoph Lameter, committed by Linus Torvalds

[PATCH] zone_reclaim: configurable off node allocation period.

Currently the zone_reclaim code allows off node allocations for a fixed
window of 30 seconds once a local zone has no unused pagecache pages left.
Reclaim is attempted again only after this timeout, which avoids repeated
useless scans for memory.  The window is also useful for establishing
sufficiently large off node allocation chunks to relieve the local node.

It may be beneficial to adjust that time period for some special situations.
For example, if memory use exceeds node capacity, one may want to give up
for longer periods of time.  If memory use spikes intermittently, one may want
to shorten the time period to reduce the number of off node allocations.

This patch allows just that.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent a92f7126
@@ -28,6 +28,7 @@ Currently, these files are in /proc/sys/vm:
 - block_dump
 - drop-caches
 - zone_reclaim_mode
+- zone_reclaim_interval
 ==============================================================
@@ -137,4 +138,15 @@ of memory should be used for caching files from disk.
 It may be beneficial to switch this on if one wants to do zone
 reclaim regardless of the numa distances in the system.
+================================================================
+zone_reclaim_interval:
+The time allowed for off node allocations after zone reclaim
+has failed to reclaim enough pages to allow a local allocation.
+Time is set in seconds and set by default to 30 seconds.
+Reduce the interval if undesired off node allocations occur. However, too
+frequent scans will have a negative impact on off node allocation performance.
@@ -178,6 +178,7 @@ extern int vm_swappiness;
 #ifdef CONFIG_NUMA
 extern int zone_reclaim_mode;
+extern int zone_reclaim_interval;
 extern int zone_reclaim(struct zone *, gfp_t, unsigned int);
 #else
 #define zone_reclaim_mode 0
......
@@ -182,7 +182,8 @@ enum
 VM_SWAP_TOKEN_TIMEOUT=28, /* default time for token time out */
 VM_DROP_PAGECACHE=29, /* int: nuke lots of pagecache */
 VM_PERCPU_PAGELIST_FRACTION=30,/* int: fraction of pages in each percpu_pagelist */
-VM_ZONE_RECLAIM_MODE=31,/* reclaim local zone memory before going off node */
+VM_ZONE_RECLAIM_MODE=31, /* reclaim local zone memory before going off node */
+VM_ZONE_RECLAIM_INTERVAL=32, /* time period to wait after reclaim failure */
 };
......
@@ -881,6 +881,15 @@ static ctl_table vm_table[] = {
 .strategy = &sysctl_intvec,
 .extra1 = &zero,
 },
+{
+.ctl_name = VM_ZONE_RECLAIM_INTERVAL,
+.procname = "zone_reclaim_interval",
+.data = &zone_reclaim_interval,
+.maxlen = sizeof(zone_reclaim_interval),
+.mode = 0644,
+.proc_handler = &proc_dointvec_jiffies,
+.strategy = &sysctl_jiffies,
+},
 #endif
 { .ctl_name = 0 }
 };
......
@@ -1595,7 +1595,7 @@ int zone_reclaim_mode __read_mostly;
 /*
  * Mininum time between zone reclaim scans
  */
-#define ZONE_RECLAIM_INTERVAL 30*HZ
+int zone_reclaim_interval __read_mostly = 30*HZ;
 /*
  * Priority for ZONE_RECLAIM. This determines the fraction of pages
@@ -1617,7 +1617,7 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 int node_id;
 if (time_before(jiffies,
-zone->last_unsuccessful_zone_reclaim + ZONE_RECLAIM_INTERVAL))
+zone->last_unsuccessful_zone_reclaim + zone_reclaim_interval))
 return 0;
 if (!(gfp_mask & __GFP_WAIT) ||
......
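
The early return in `zone_reclaim()` skips reclaim while the current tick count is still inside the window that started at the last unsuccessful attempt. Here is a rough userspace model of that check, assuming the usual signed-difference comparison so that it keeps working when the tick counter wraps around. `struct sketch_zone`, `sketch_time_before`, and `sketch_reclaim_suppressed` are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-in for time_before(a, b): true if tick a is earlier than
 * tick b. The unsigned subtraction wraps, and the cast to long relies on the
 * two's-complement conversion used by common compilers (an assumption here).
 */
int sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical stand-in for the one struct zone field the check reads. */
struct sketch_zone {
	unsigned long last_unsuccessful_zone_reclaim;
};

/* Nonzero while reclaim should be skipped and off node allocation allowed. */
int sketch_reclaim_suppressed(const struct sketch_zone *zone,
			      unsigned long now, unsigned long interval)
{
	assert(zone != NULL);
	return sketch_time_before(now,
		zone->last_unsuccessful_zone_reclaim + interval);
}
```

With the interval held in a variable rather than a compile-time constant, the same comparison picks up a new value on the next call, which is what makes the sysctl take effect without a rebuild.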