    mm: hugetlbfs: fix hugetlbfs optimization · 50d8f1b5
    Andrea Arcangeli authored
    commit 27c73ae7 upstream.
    
    Commit 7cb2ef56 ("mm: fix aio performance regression for database
    caused by THP") can cause dereference of a dangling pointer if
    split_huge_page runs during PageHuge() and there are updates to the
    tail_page->private field.
    
    It also runs compound_head twice for hugetlbfs, and runs
    compound_head+compound_trans_head for THP, when a single call is
    needed in both cases.
    
    The new code within the PageSlab() check doesn't need to verify that the
    THP page size is never bigger than the smallest hugetlbfs page size, to
    avoid memory corruption.
    
    A longstanding theoretical race condition was found while fixing the
    above (see the change right after the skip_unlock label, that is
    relevant for the compound_lock path too).
    
    By re-establishing the _mapcount tail refcounting for all compound
    pages, this also fixes the below problem:
    
      echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
    
      BUG: Bad page state in process bash  pfn:59a01
      page:ffffea000139b038 count:0 mapcount:10 mapping:          (null) index:0x0
      page flags: 0x1c00000000008000(tail)
      Modules linked in:
      CPU: 6 PID: 2018 Comm: bash Not tainted 3.12.0+ #25
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Call Trace:
        dump_stack+0x55/0x76
        bad_page+0xd5/0x130
        free_pages_prepare+0x213/0x280
        __free_pages+0x36/0x80
        update_and_free_page+0xc1/0xd0
        free_pool_huge_page+0xc2/0xe0
        set_max_huge_pages.part.58+0x14c/0x220
        nr_hugepages_store_common.isra.60+0xd0/0xf0
        nr_hugepages_store+0x13/0x20
        kobj_attr_store+0xf/0x20
        sysfs_write_file+0x189/0x1e0
        vfs_write+0xc5/0x1f0
        SyS_write+0x55/0xb0
        system_call_fastpath+0x16/0x1b
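
The "mapcount:10" on a tail page in the report above suggests tail pins that were taken but never dropped. A minimal userspace sketch of that bookkeeping follows, assuming a toy model in which pinning a tail page bumps both the head's reference count and the tail's mapcount. `struct toy_page` and every `toy_*` function are hypothetical names for illustration; this is not the kernel's struct page layout nor the code in swap.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page descriptor; NOT the kernel's struct page. */
struct toy_page {
    int count;              /* reference count (only meaningful on the head) */
    int mapcount;           /* starts at -1, mirroring the kernel's _mapcount */
    struct toy_page *head;  /* every page points back at the compound head */
};

/* Set up a compound page of nr pages; pages[0] is the head. */
static void toy_init_compound(struct toy_page *pages, int nr)
{
    for (int i = 0; i < nr; i++) {
        pages[i].count = (i == 0) ? 1 : 0;
        pages[i].mapcount = -1;
        pages[i].head = &pages[0];
    }
}

/* Pin a tail page: the head takes the reference, the tail records the pin. */
static void toy_get_tail(struct toy_page *tail)
{
    tail->head->count++;
    tail->mapcount++;
}

/* Symmetric release: undo both halves of toy_get_tail(). */
static void toy_put_tail(struct toy_page *tail)
{
    assert(tail->mapcount > -1);  /* there must be a pin to drop */
    tail->mapcount--;
    tail->head->count--;
}

/* Asymmetric release (assumed failure mode): only the head reference is
 * dropped, so the tail's mapcount stays elevated. */
static void toy_put_tail_head_only(struct toy_page *tail)
{
    tail->head->count--;
}

/* Free-time sanity check, loosely modelled on the "Bad page state" test:
 * every tail must be back at its initial mapcount of -1. */
static bool toy_tails_clean(const struct toy_page *pages, int nr)
{
    for (int i = 1; i < nr; i++)
        if (pages[i].mapcount != -1)
            return false;
    return true;
}
```

In this model a get/put pair through `toy_get_tail()` and `toy_put_tail()` leaves `toy_tails_clean()` true, while releasing through `toy_put_tail_head_only()` leaves the tail's mapcount raised, which is the kind of state a free-time check would flag.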
    Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Tested-by: Khalid Aziz <khalid.aziz@oracle.com>
    Cc: Pravin Shelar <pshelar@nicira.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Ben Hutchings <bhutchings@solarflare.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Johannes Weiner <jweiner@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Guillaume Morin <guillaume@morinfr.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    
swap.c 22.4 KB