    powerpc/numa: Reset node_possible_map to only node_online_map · 3af229f2
    Nishanth Aravamudan authored
    Raghu noticed excessive memory allocation on power with a simple cgroup
    test. Specifically, the path mem_cgroup_css_alloc -> for_each_node ->
    alloc_mem_cgroup_per_zone_info() ends up blowing up the kmalloc-2048
    slab (on the order of 200MB for 400 cgroup directories).
    
    The underlying issue is that NODES_SHIFT on power is 8 (256 NUMA nodes
    possible), which defines node_possible_map, which in turn defines the
    value of nr_node_ids in setup_nr_node_ids and the iteration of
    for_each_node.
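
The reported figure can be checked with simple arithmetic. The sketch below is a userspace illustration, not kernel code: the helper name `cgroup_per_node_bytes` is hypothetical, and it assumes that each cgroup ends up holding one kmalloc-2048 object for every node that for_each_node visits.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): total slab bytes consumed
 * if each of `cgroups` cgroups allocates one object of `obj_size` bytes
 * for each of `nodes` iterated NUMA nodes.
 */
static unsigned long long cgroup_per_node_bytes(unsigned long long cgroups,
                                                unsigned long long nodes,
                                                unsigned long long obj_size)
{
    return cgroups * nodes * obj_size;
}
```

Under that assumption, 400 cgroups, 256 possible nodes, and 2048-byte objects come to 209,715,200 bytes, which is 200 MiB and consistent with the number above. With, say, 4 online nodes the same workload would need about 3 MiB.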
    
    In practice, we never see a system with 256 NUMA nodes, and in fact, we
    do not support node hotplug on power in the first place, so the nodes
    that are online when we come up are the nodes that will be present for
    the lifetime of this kernel. So let's, at least, drop the NUMA possible
    map down to the online map at runtime. This is similar to what x86 does
    in its initialization routines.
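
As a rough illustration of the effect, here is a small userspace model. Everything prefixed `model_` or `MODEL_` is hypothetical and only stands in for the kernel's node masks; it is not the code in this patch. It models the reset as an intersection of the possible mask with the online mask, which gives the same result as a plain copy as long as every online node is also possible.

```c
#include <stdint.h>

#define MODEL_NODES_SHIFT  8
#define MODEL_MAX_NUMNODES (1 << MODEL_NODES_SHIFT)   /* 256 nodes */
#define MODEL_WORDS        (MODEL_MAX_NUMNODES / 64)

/* Userspace stand-in for a node mask; not the kernel's nodemask_t. */
struct model_nodemask {
    uint64_t bits[MODEL_WORDS];
};

static void model_node_set(struct model_nodemask *m, int node)
{
    m->bits[node / 64] |= (uint64_t)1 << (node % 64);
}

static int model_node_isset(const struct model_nodemask *m, int node)
{
    return (int)((m->bits[node / 64] >> (node % 64)) & 1);
}

/* How many nodes a for_each_node-style loop over this mask would visit. */
static int model_count_nodes(const struct model_nodemask *m)
{
    int count = 0;

    for (int n = 0; n < MODEL_MAX_NUMNODES; n++)
        if (model_node_isset(m, n))
            count++;
    return count;
}

/* Highest set node plus one, the role nr_node_ids plays in the text. */
static int model_nr_node_ids(const struct model_nodemask *m)
{
    int highest = -1;

    for (int n = 0; n < MODEL_MAX_NUMNODES; n++)
        if (model_node_isset(m, n))
            highest = n;
    return highest + 1;
}

/* Drop every possible node that is not online: possible &= online. */
static void model_reset_possible(struct model_nodemask *possible,
                                 const struct model_nodemask *online)
{
    for (int w = 0; w < MODEL_WORDS; w++)
        possible->bits[w] &= online->bits[w];
}
```

Starting from a possible mask with all 256 bits set and an online mask containing only nodes 0 and 1, calling `model_reset_possible()` reduces both the iteration count and the modeled nr_node_ids from 256 to 2.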
    
    mem_cgroup_css_alloc should also be fixed to only iterate over
    memory-populated nodes and handle hotplug, but that is a separate
    change.
    
    Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Anton Blanchard <anton@samba.org>
    Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>