    netprio_cgroup: Fix unlimited memory leak of v2 cgroups · 090e28b2
    Zefan Li authored
    If systemd is configured to use hybrid mode, which enables both
    cgroup v1 and v2, systemd creates a new cgroup on both the default
    root (v2) and the netprio_cgroup hierarchy (v1) for each new session and
    attaches the task to both cgroups. If the task then does any networking,
    the v2 cgroup can never be freed after the session exits.
    
    One of our machines ran into OOM due to this memory leak.
    
    In the scenario described above, when sk_alloc() is called,
    cgroup_sk_alloc() assumes v2 mode, so it stores the cgroup
    pointer in sk->sk_cgrp_data and increments the cgroup refcnt.
    sock_update_netprioidx() then assumes v1 mode and stores the
    netprioidx value in sk->sk_cgrp_data, overwriting the pointer,
    so the cgroup reference is never dropped.
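The leak can be reproduced in a small userspace model. This is only a sketch: every `model_*` name below is hypothetical, and the structures are a simplification of the idea that one field holds either a cgroup pointer (v2) or v1 data, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cgroup: only the reference count matters here. */
struct model_cgroup {
    int refcnt;
};

/* Hypothetical stand-in for sk->sk_cgrp_data: one word, two interpretations. */
struct model_sock {
    bool is_data;   /* true: val holds v1 data; false: val is a cgroup pointer */
    uintptr_t val;
};

/* v2 path: remember the cgroup and take a reference on it. */
static void model_cgroup_sk_alloc(struct model_sock *sk, struct model_cgroup *cg)
{
    cg->refcnt++;
    sk->is_data = false;
    sk->val = (uintptr_t)cg;
}

/* v1 path: store the priority index in the same word, losing the pointer. */
static void model_update_netprioidx(struct model_sock *sk, uint16_t prioidx)
{
    sk->is_data = true;
    sk->val = prioidx;
}

/* Socket teardown: the reference can only be dropped if the pointer survived. */
static void model_cgroup_sk_free(struct model_sock *sk)
{
    struct model_cgroup *cg;

    if (sk->is_data)
        return; /* no pointer left to drop the reference on */

    cg = (struct model_cgroup *)sk->val;
    assert(cg->refcnt > 0);
    cg->refcnt--;
}

/* Returns the number of references still held after the socket is freed. */
int model_leaked_refs(bool touch_netprio)
{
    struct model_cgroup session = { .refcnt = 0 };
    struct model_sock sk = { .is_data = false, .val = 0 };

    model_cgroup_sk_alloc(&sk, &session);
    if (touch_netprio)
        model_update_netprioidx(&sk, 1);
    model_cgroup_sk_free(&sk);

    return session.refcnt;
}
```

In this model, `model_leaked_refs(false)` returns 0, while `model_leaked_refs(true)` returns 1: the reference taken at allocation has no remaining owner, which corresponds to the v2 cgroup that can never be freed.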
    
    Currently we do the mode switch when someone writes to the ifpriomap
    cgroup control file. The easiest fix is to also do the switch when
    a task is attached to a new cgroup.
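The effect of doing the switch at attach time can be sketched the same way. The `fix_*` names are hypothetical and the global flag is an assumed simplification of the mode switch; the point is only that once the switch has happened before the socket is created, the v2 path no longer takes a reference that the v1 path would later orphan.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cgroup. */
struct fix_cgroup {
    int refcnt;
};

/* Assumed model of the mode switch: once set, sockets no longer pin v2 cgroups. */
static bool fix_sk_alloc_disabled;

static void fix_sk_alloc_disable(void)
{
    fix_sk_alloc_disabled = true;
}

/* Model of the attach callback: switch modes as soon as a task is attached. */
static void fix_net_prio_attach(void)
{
    fix_sk_alloc_disable();
    /* the real callback would go on to update the task's sockets */
}

/* v2 path: takes a reference only while the mode switch has not happened. */
static bool fix_cgroup_sk_alloc(struct fix_cgroup *cg)
{
    if (fix_sk_alloc_disabled)
        return false;
    cg->refcnt++;
    return true;
}

/* Returns the references held by a socket created after the attach. */
int fix_refs_after_attach(void)
{
    struct fix_cgroup session = { .refcnt = 0 };

    fix_net_prio_attach();
    assert(fix_sk_alloc_disabled);
    fix_cgroup_sk_alloc(&session);

    return session.refcnt;
}
```

Here `fix_refs_after_attach()` returns 0, so there is nothing left to leak when the priority index is later written into the socket.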
    
    Fixes: bd1060a1 ("sock, cgroup: add sock->sk_cgroup")
    Reported-by: Yang Yingliang <yangyingliang@huawei.com>
    Tested-by: Yang Yingliang <yangyingliang@huawei.com>
    Signed-off-by: Zefan Li <lizefan@huawei.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
netprio_cgroup.c 6.59 KB