    mm: memcontrol: account "kmem" consumers in cgroup2 memory controller · 52c29b04
    Johannes Weiner authored
    The original cgroup memory controller has an extension to account slab
    memory (and other "kernel memory" consumers) in a separate "kmem"
    counter, once the user sets an explicit limit on that "kmem" pool.
    
    However, this includes various consumers whose sizes are directly linked
    to userspace activity.  Accounting them as an optional "kmem" extension
    is problematic for several reasons:
    
    1. It leaves the main memory interface with incomplete semantics. A
       user who puts their workload into a cgroup and configures a memory
       limit does not expect us to leave holes in the containment as big
       as the dentry and inode cache, or the kernel stack pages.
    
    2. If the limit set on this random historical subgroup of consumers is
       reached, subsequent allocations will fail even when the main memory
       pool available to the cgroup is not yet exhausted and/or has
       reclaimable memory in it.
    
    3. Calling it 'kernel memory' is misleading. The dentry and inode
       caches are no more 'kernel' (or no less 'user') memory than the
       page cache itself. Treating these consumers as different classes is
       a historical implementation detail that should not leak to users.
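    
    Point 2 can be illustrated with a small userspace model. This is only
    a sketch, not kernel code: struct cgroup_model, charge_split() and
    charge_unified() are hypothetical names, and "reclaim" is reduced to
    subtracting bytes from a counter. It assumes that in the split scheme
    a kernel allocation must fit under both the "kmem" limit and the main
    limit, with no reclaim on the "kmem" side.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of cgroup counters; not kernel code. */
struct counter {
	size_t usage;
	size_t limit;
};

struct cgroup_model {
	struct counter memory;	/* main memory pool */
	struct counter kmem;	/* separate "kmem" pool (split scheme only) */
	size_t reclaimable;	/* part of memory.usage that could be reclaimed */
};

/*
 * Split scheme (assumed simplification): the charge fails as soon as
 * the "kmem" limit is hit, no matter how much room or reclaimable
 * memory the main pool still has.
 */
static bool charge_split(struct cgroup_model *cg, size_t bytes)
{
	if (cg->kmem.usage + bytes > cg->kmem.limit)
		return false;
	if (cg->memory.usage + bytes > cg->memory.limit)
		return false;
	cg->kmem.usage += bytes;
	cg->memory.usage += bytes;
	return true;
}

/*
 * Unified scheme: a single limit. If the charge would exceed it, try
 * to make room by reclaiming from the same pool before giving up.
 */
static bool charge_unified(struct cgroup_model *cg, size_t bytes)
{
	if (cg->memory.usage + bytes > cg->memory.limit) {
		size_t over = cg->memory.usage + bytes - cg->memory.limit;

		if (over > cg->reclaimable)
			return false;
		cg->memory.usage -= over;
		cg->reclaimable -= over;
	}
	cg->memory.usage += bytes;
	return true;
}
```

    With a main limit of 1000, a "kmem" limit of 100 and 90 bytes already
    charged, charge_split() rejects a 20-byte charge although the main
    pool is almost empty, while charge_unified() accepts it.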
    
    So, in addition to page cache, anonymous memory, and network socket
    memory, account the following memory consumers by default in the
    cgroup2 memory controller:
    
         - threadinfo
         - task_struct
         - task_delay_info
         - pid
         - cred
         - mm_struct
         - vm_area_struct and vm_region (nommu)
         - anon_vma and anon_vma_chain
         - signal_struct
         - sighand_struct
         - fs_struct
         - files_struct
         - fdtable and fdtable->full_fds_bits
         - dentry and external_name
         - inode for all filesystems.
    
    This should give us reasonable memory isolation for most common
    workloads out of the box.
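    
    The effect of listing specific consumers can be sketched with another
    toy model. Everything here is hypothetical and for illustration only:
    MODEL_ACCOUNT, struct cache_model and charged_bytes() are invented
    names, and the model simply assumes that each object cache carries an
    opt-in flag deciding whether its objects count against the cgroup's
    memory limit.

```c
#include <stddef.h>

/* Hypothetical opt-in flag for this model; not a kernel definition. */
#define MODEL_ACCOUNT 0x1u

/* Hypothetical description of one object cache (e.g. "dentry"). */
struct cache_model {
	const char *name;
	size_t object_size;	/* bytes per object, illustrative only */
	unsigned int flags;	/* MODEL_ACCOUNT if the cache is accounted */
};

/*
 * Bytes charged to the owning cgroup for nr_objects allocations from
 * this cache: the full size if the cache opted in, nothing otherwise.
 */
static size_t charged_bytes(const struct cache_model *cache, size_t nr_objects)
{
	if (!(cache->flags & MODEL_ACCOUNT))
		return 0;
	return cache->object_size * nr_objects;
}
```

    In this model, a cache on the list above would carry MODEL_ACCOUNT and
    contribute to the cgroup's usage, while an unlisted cache would not.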
    
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Tejun Heo <tj@kernel.org>
    Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>