Andrew Morton authored
This function is called a lot: on every brk(). The atomic_add() against a global counter hurts on large SMP machines. The patch simply reduces the rate at which that atomic operation is performed, by accumulating a per-cpu count which is spilled into the global counter when the local counter overflows. It trades a little accuracy for efficiency.

I tried various implementations involving kmalloc_percpu() and open-coded per-cpu arrays in a generic "per-cpu counter" thing. They were all surprisingly sucky: the additional cache misses involved in walking the more complex data structures really showed up.
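
The accumulate-and-spill idea described above can be sketched in userspace C. This is only an illustration, not the code from the patch: the names acct_memory(), read_committed(), committed_pages and local_pages, the NR_CPUS value, the ACCT_THRESHOLD value and the 64-byte padding are all assumptions made for the example, and the CPU index is passed in explicitly because userspace has no equivalent of running with preemption disabled.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 4           /* hypothetical CPU count for this sketch */
#define ACCT_THRESHOLD 16   /* hypothetical spill threshold, in pages */

/* Global counter: the shared atomic that is expensive on large SMP. */
static atomic_long committed_pages;

/* One plain counter per CPU, padded so each sits on its own cache line
 * (64 bytes is an assumed line size). */
static struct {
    long count;
    char pad[64 - sizeof(long)];
} local_pages[NR_CPUS];

/*
 * Account `pages` (may be negative) on behalf of `cpu`.
 * The common case touches only the local counter. The global atomic is
 * updated only when the local count exceeds the threshold in either
 * direction. In a kernel this would run with preemption disabled so the
 * CPU cannot change underneath us; here the caller supplies `cpu`.
 */
static void acct_memory(int cpu, long pages)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    long *local = &local_pages[cpu].count;

    *local += pages;
    if (*local > ACCT_THRESHOLD || *local < -ACCT_THRESHOLD) {
        atomic_fetch_add(&committed_pages, *local);  /* spill */
        *local = 0;
    }
}

/* Approximate global value: ignores whatever is still held locally. */
static long read_committed(void)
{
    return atomic_load(&committed_pages);
}
```

In this sketch each CPU can hold back at most ACCT_THRESHOLD pages in either direction, so the value returned by read_committed() can differ from the true total by at most NR_CPUS * ACCT_THRESHOLD. Keeping the per-cpu state as a flat, fixed-size array rather than a separately allocated structure is meant to mirror the point about avoiding extra cache misses.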
29580832