• Andrew Morton's avatar
    [PATCH] always update page->flags atomically · a2b41d23
    Andrew Morton authored
    move_from_swap_cache() and move_to_swap_cache() are playing with
    page->flags nonatomically.  The page is on the LRU at the time and
    another CPU could be altering page->flags concurrently.
    
    The patch converts those functions to use atomic operations.
    
    It also rationalises the number of bits which are cleared.  It's not
    really clear to me what page flags we really want to set to a known
    state in there.
    
    It had no right to go clearing PG_arch_1.  I'm now clearing PG_arch_1
    inside rmqueue() which is still a bit presumptious.
    
    btw: shmem uses PAGE_CACHE_SIZE and swapper_space uses PAGE_SIZE.  I've
    been carefully maintaining the distinction, but it looks like shmem
    will break if we ever do make these values different.
    
    
    Also, __add_to_page_cache() was performing a non-atomic RMW against
    page->flags, under the assumption that it was a newly allocated page
    which no other CPU would look at.  Not true - this function is used for
    moving anon pages into swapcache.  Those anon pages are on the LRU -
    other CPUs can be performing operations against page->flags while
    __add_to_swap_cache is stomping on them.  This had me running around in
    circles for two days.
    
    So let's move the initialisation of the page state into rmqueue(),
    where the page really is new (could do it in page_cache_alloc,
    perhaps).
    
    The SetPageLocked() in __add_to_page_cache() is also rather curious.
    Seems OK for both pagecache and swapcache so I covered that with a
    comment.
    
    
    2.4 has the same problem.  Basically, add_to_swap_cache() can stomp on
    another CPU's manipulation of page->flags.  After a quick review of the
    code there, it is barely conceivable that a concurrent refill_inactve()
    could get its PG_referenced and PG_active bits scribbled on.  Rather
    unlikely because swap_out() will probably see PageActive() and bale
    out.
    
    Also, mark_dirty_kiobuf() could have its PG_dirty bit accidentally
    cleared (but try_to_swap_out() sets it again later).
    
    But there may be other code paths.  Really, I think this needs fixing
    in 2.4 - it's horrid.
    a2b41d23
swap_state.c 8.24 KB