    mm/swap.c: piggyback lru_add_drain_all() calls · eef1a429
    Konstantin Khlebnikov authored
    lru_add_drain_all() is a very slow operation.  Right now
    POSIX_FADV_DONTNEED is the top user because it has to freeze page
    references when removing pages from the cache.  invalidate_bdev() calls
    it for the same reason.  Both are triggered from userspace, so it's easy
    to generate a storm.
    
    mlock/mlockall no longer call lru_add_drain_all(); I've seen serious
    slowdowns there on older kernels.
    
    There are some less obvious paths in memory migration/CMA/offlining
    which shouldn't be called frequently.
    
    The worst case requires a non-trivial workload because
    lru_add_drain_all() skips cpus where vectors are empty.  Something must
    constantly generate a flow of pages for each cpu.  Also, the cpus must
    be busy so that scheduling the per-cpu works is slower.  And the machine
    must be big enough (64+ cpus in our case).
    
    In our case that was a massive series of mlock calls in map-reduce while
    other tasks wrote logs (and generated flows of new pages in per-cpu
    vectors).  Mlock calls were serialized by a mutex and accumulated
    latency of up to 10 seconds or more.
    
    The kernel does not call lru_add_drain_all on mlock paths since 4.15,
    but the same scenario could be triggered by fadvise(POSIX_FADV_DONTNEED)
    or any other remaining user.
    
    There is no reason to do the drain again if somebody else already
    drained all the per-cpu vectors while we waited for the lock.
    
    Piggyback on a drain starting and finishing while we wait for the lock:
    all pages pending at the time of our entry were drained from the
    vectors.
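
The piggyback idea can be sketched in userspace C.  This is only an
illustration under stated assumptions, not the code in mm/swap.c: the
names drain_gen, drain_begin(), drain_finish() and drain_every_cpu() are
hypothetical, and a plain generation counter stands in for whatever
primitive the kernel uses.  The caller snapshots the counter before
taking the lock; if the counter has moved by the time the lock is held,
a full drain both started and finished after the snapshot, so the
caller can return without draining again.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t drain_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_uint drain_gen;     /* bumped under drain_lock at the start of each drain */
static unsigned int full_drains;  /* protected by drain_lock; counts real drains */

/* Hypothetical stand-in for the expensive "drain every cpu" step. */
static void drain_every_cpu(void)
{
	full_drains++;
}

/* Step 1: remember which drain generation was current on entry. */
static unsigned int drain_begin(void)
{
	return atomic_load(&drain_gen);
}

/*
 * Step 2: take the lock.  If the generation changed while we waited,
 * somebody started a drain after our snapshot and finished it before
 * releasing the lock, so everything pending at our entry is gone.
 * Returns true only if this caller performed the drain itself.
 */
static bool drain_finish(unsigned int seen)
{
	bool did_work = false;

	pthread_mutex_lock(&drain_lock);
	if (atomic_load(&drain_gen) == seen) {
		atomic_fetch_add(&drain_gen, 1);
		drain_every_cpu();
		did_work = true;
	}
	pthread_mutex_unlock(&drain_lock);
	return did_work;
}

/* The usual entry point: snapshot, then lock and maybe drain. */
static bool drain_all(void)
{
	return drain_finish(drain_begin());
}
```

Splitting the entry point into drain_begin() and drain_finish() is only
done here so that "waiting for the lock" can be simulated in a single
thread: two callers that both snapshot first and then finish one after
the other result in one real drain, not two.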
    
    Callers like POSIX_FADV_DONTNEED retry their operations once after
    draining per-cpu vectors when pages have unexpected references.
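
The retry-once pattern mentioned above can be modelled with a toy page
cache.  Everything in this sketch is hypothetical (struct toy_page and
the toy_*() helpers are not kernel APIs); it only shows why a caller
drains at most once and why a skipped duplicate drain is harmless: the
second pass just needs the per-cpu references that existed on entry to
be gone.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	int refs;         /* 1 means only the page cache holds the page */
	bool cached;      /* still present in the toy page cache */
	bool in_pcp_vec;  /* one extra reference is held by a per-cpu vector */
};

static unsigned int toy_drain_calls;

/* Drop the references held by the (simulated) per-cpu vectors. */
static void toy_drain_all(struct toy_page *pages, size_t n)
{
	toy_drain_calls++;
	for (size_t i = 0; i < n; i++) {
		if (pages[i].in_pcp_vec) {
			pages[i].in_pcp_vec = false;
			pages[i].refs--;
		}
	}
}

/* Evict cached pages nobody else references; return how many went. */
static size_t toy_invalidate(struct toy_page *pages, size_t n)
{
	size_t dropped = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].cached && pages[i].refs == 1) {
			pages[i].cached = false;
			pages[i].refs = 0;
			dropped++;
		}
	}
	return dropped;
}

/* Try once; if some pages were skipped, drain and retry exactly once. */
static size_t toy_dontneed(struct toy_page *pages, size_t n)
{
	size_t dropped = toy_invalidate(pages, n);

	if (dropped < n) {
		toy_drain_all(pages, n);
		dropped += toy_invalidate(pages, n);
	}
	return dropped;
}
```

A page that is still referenced by some other user after the drain
stays cached; the caller does not loop.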
    
    Link: http://lkml.kernel.org/r/157019456205.3142.3369423180908482020.stgit@buzz
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>