• Mel Gorman's avatar
    mm: compaction: check for overlapping nodes during isolation for migration · dc908600
    Mel Gorman authored
    When isolating pages for migration, migration starts at the start of a
    zone while the free scanner starts at the end of the zone.  Migration
    avoids entering a new zone by never going beyond the free scanned.
    
    Unfortunately, in very rare cases nodes can overlap.  When this happens,
    migration isolates pages without the LRU lock held, corrupting lists
    which will trigger errors in reclaim or during page free such as in the
    following oops
    
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: [<ffffffff810f795c>] free_pcppages_bulk+0xcc/0x450
      PGD 1dda554067 PUD 1e1cb58067 PMD 0
      Oops: 0000 [#1] SMP
      CPU 37
      Pid: 17088, comm: memcg_process_s Tainted: G            X
      RIP: free_pcppages_bulk+0xcc/0x450
      Process memcg_process_s (pid: 17088, threadinfo ffff881c2926e000, task ffff881c2926c0c0)
      Call Trace:
        free_hot_cold_page+0x17e/0x1f0
        __pagevec_free+0x90/0xb0
        release_pages+0x22a/0x260
        pagevec_lru_move_fn+0xf3/0x110
        putback_lru_page+0x66/0xe0
        unmap_and_move+0x156/0x180
        migrate_pages+0x9e/0x1b0
        compact_zone+0x1f3/0x2f0
        compact_zone_order+0xa2/0xe0
        try_to_compact_pages+0xdf/0x110
        __alloc_pages_direct_compact+0xee/0x1c0
        __alloc_pages_slowpath+0x370/0x830
        __alloc_pages_nodemask+0x1b1/0x1c0
        alloc_pages_vma+0x9b/0x160
        do_huge_pmd_anonymous_page+0x160/0x270
        do_page_fault+0x207/0x4c0
        page_fault+0x25/0x30
    
    The "X" in the taint flag means that external modules were loaded but but
    is unrelated to the bug triggering.  The real problem was because the PFN
    layout looks like this
    
      Zone PFN ranges:
        DMA      0x00000010 -> 0x00001000
        DMA32    0x00001000 -> 0x00100000
        Normal   0x00100000 -> 0x01e80000
      Movable zone start PFN for each node
      early_node_map[14] active PFN ranges
          0: 0x00000010 -> 0x0000009b
          0: 0x00000100 -> 0x0007a1ec
          0: 0x0007a354 -> 0x0007a379
          0: 0x0007f7ff -> 0x0007f800
          0: 0x00100000 -> 0x00680000
          1: 0x00680000 -> 0x00e80000
          0: 0x00e80000 -> 0x01080000
          1: 0x01080000 -> 0x01280000
          0: 0x01280000 -> 0x01480000
          1: 0x01480000 -> 0x01680000
          0: 0x01680000 -> 0x01880000
          1: 0x01880000 -> 0x01a80000
          0: 0x01a80000 -> 0x01c80000
          1: 0x01c80000 -> 0x01e80000
    
    The fix is straight-forward.  isolate_migratepages() has to make a
    similar check to isolate_freepage to ensure that it never isolates pages
    from a zone it does not hold the LRU lock for.
    
    This was discovered in a 3.0-based kernel but it affects 3.1.x, 3.2.x
    and current mainline.
    Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
    Acked-by: default avatarMichal Nazarewicz <mina86@mina86.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    dc908600
compaction.c 20.5 KB