    memcg: free mem_cgroup by RCU to fix oops · 59927fb9
    Hugh Dickins authored
    After fixing the GPF in mem_cgroup_lru_del_list(), one machine running a
    similar load (moving and removing memcgs while swapping) has oopsed three
    times in mem_cgroup_zone_nr_lru_pages(), when retrieving memcg zone
    numbers for get_scan_count() for shrink_mem_cgroup_zone(): this is where
    a struct mem_cgroup is first accessed after being chosen by
    mem_cgroup_iter().
    
    Just what protects a struct mem_cgroup from being freed, in between
    mem_cgroup_iter()'s css_get_next() and its css_tryget()? css_tryget()
    fails once css->refcnt is zero with CSS_REMOVED set in flags, yes: but
    what if that memory is freed and reused for something else, which sets
    "refcnt" non-zero? Hmm, and scope for an indefinite freeze if refcnt is
    left at zero but flags are cleared.
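
The three outcomes described above can be reproduced in a small userspace model. This is only a sketch: struct css_model, tryget_model() and the bounded spin count are hypothetical stand-ins, and the retry loop is an assumption inferred from the wording of this message, not a copy of the kernel's css_tryget().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CSS_REMOVED 0x1u  /* stand-in for the flag bit named in the message */

/* Hypothetical cut-down css: only the two fields under discussion. */
struct css_model {
    atomic_int  refcnt;
    atomic_uint flags;
};

static void css_model_init(struct css_model *css, int refcnt, unsigned flags)
{
    atomic_init(&css->refcnt, refcnt);
    atomic_init(&css->flags, flags);
}

/* Take a reference unless the count is zero; true if one was taken. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

enum tryget_result { TRYGET_OK, TRYGET_FAILED, TRYGET_STUCK };

/*
 * Assumed shape of the lookup: retry until a reference is taken, or
 * give up once the REMOVED flag is seen.  The model bounds the spin
 * so that the "indefinite freeze" case shows up as TRYGET_STUCK.
 */
static enum tryget_result tryget_model(struct css_model *css, int max_spins)
{
    assert(max_spins > 0);

    for (int i = 0; i < max_spins; i++) {
        if (inc_not_zero(&css->refcnt))
            return TRYGET_OK;
        if (atomic_load(&css->flags) & CSS_REMOVED)
            return TRYGET_FAILED;
    }
    return TRYGET_STUCK;
}
```

With refcnt at zero and CSS_REMOVED set, the model fails cleanly. If the same memory has been reused so that "refcnt" reads non-zero, it wrongly succeeds, and with refcnt zero but flags cleared it never terminates on its own.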
    
    It's tempting to move the css_tryget() into css_get_next(), to make it
    really "get" the css, but I don't think that actually solves anything:
    the same difficulty in moving from css_id found to stable css remains.
    
    But we already have rcu_read_lock() around the two, so it's easily fixed
    if __mem_cgroup_free() just uses kfree_rcu() to free mem_cgroup.
    
    However, a big struct mem_cgroup is allocated with vzalloc() instead of
    kzalloc(), and we're not allowed to vfree() at interrupt time: there
    doesn't appear to be a general vfree_rcu() to help with this, so roll
    our own using schedule_work().  The compiler decently removes
    vfree_work() and vfree_rcu() when the config doesn't need them.
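
The two-stage deferral described above (an RCU callback that only queues a work item, which then frees from process context) can be modelled in userspace as follows. This is a sketch under stated assumptions, not the patch itself: struct cb_node, the two callback lists, model_vfree() and the struct big_object members are hypothetical stand-ins for rcu_head, work_struct, call_rcu(), schedule_work() and vfree().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* One node type stands in for both rcu_head and work_struct. */
struct cb_node {
    struct cb_node *next;
    void (*func)(struct cb_node *);
};

/* Hypothetical large object with both deferral hooks embedded. */
struct big_object {
    struct cb_node rcu_freeing;
    struct cb_node work_freeing;
    char payload[8192];
};

static struct cb_node *rcu_list;   /* callbacks waiting for a grace period */
static struct cb_node *work_list;  /* work items waiting for a worker */
static int in_callback_context;    /* models "interrupt time" */
static int freed_count;

static void push(struct cb_node **list, struct cb_node *n,
                 void (*func)(struct cb_node *))
{
    n->func = func;
    n->next = *list;
    *list = n;
}

static void run_all(struct cb_node **list)
{
    while (*list) {
        struct cb_node *n = *list;

        *list = n->next;  /* unlink first: func may free n */
        n->func(n);
    }
}

/* Stand-in for vfree(): must not run from callback context. */
static void model_vfree(void *p)
{
    assert(!in_callback_context);
    free(p);
    freed_count++;
}

/* Stage 2: runs in (modelled) process context and really frees. */
static void vfree_work_model(struct cb_node *work)
{
    model_vfree(container_of(work, struct big_object, work_freeing));
}

/* Stage 1: runs after the grace period and only queues stage 2. */
static void vfree_rcu_model(struct cb_node *rcu)
{
    struct big_object *obj = container_of(rcu, struct big_object, rcu_freeing);

    push(&work_list, &obj->work_freeing, vfree_work_model);
}

static void free_big_object(struct big_object *obj)
{
    push(&rcu_list, &obj->rcu_freeing, vfree_rcu_model);
}

static void grace_period_ends(void)
{
    in_callback_context = 1;
    run_all(&rcu_list);
    in_callback_context = 0;
}

static void worker_runs(void)
{
    run_all(&work_list);
}
```

In this model, free_big_object() releases nothing by itself: the object survives until grace_period_ends() and then worker_runs() have both been called, and the assertion in model_vfree() would fire if the free were attempted directly from the first-stage callback.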

    Signed-off-by: Hugh Dickins <hughd@google.com>
    Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Ying Han <yinghan@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcontrol.c 144 KB