    mm, hugetlb: add VM_NORESERVE check in vma_has_reserves() · 72231b03
    Joonsoo Kim authored
    If a region is mapped with both MAP_NORESERVE and MAP_SHARED, the reserve
    accounting check is skipped, so there is no guarantee that a huge page
    can be allocated at fault time.  The following example code reproduces
    this situation.
    
    Assume a 2MB huge page size and nr_hugepages = 100.
    
            fd = hugetlbfs_unlinked_fd();
            if (fd < 0)
                    return 1;
    
            /* Shared mapping of 200MB: reserves all 100 huge pages. */
            size = 200 * MB;
            flag = MAP_SHARED;
            p = mmap(NULL, size, PROT_READ|PROT_WRITE, flag, fd, 0);
            if (p == MAP_FAILED) {
                    fprintf(stderr, "mmap() failed: %s\n", strerror(errno));
                    return -1;
            }
    
            /* Non-reserved shared mapping of a single huge page. */
            size = 2 * MB;
            flag = MAP_ANONYMOUS | MAP_SHARED | MAP_HUGETLB | MAP_NORESERVE;
            p = mmap(NULL, size, PROT_READ|PROT_WRITE, flag, -1, 0);
            if (p == MAP_FAILED) {
                    fprintf(stderr, "mmap() failed: %s\n", strerror(errno));
                    return -1;
            }
            /* Fault in the page; this should not consume a reserved page. */
            p[0] = '0';
            sleep(10);
    
    While sleep(10) is executing, run 'cat /proc/meminfo' from another process.
    
    HugePages_Free:       99
    HugePages_Rsvd:      100
    
    The number of free pages should be greater than or equal to the number
    of reserved pages, but here it is not.  This shows that the non-reserved
    shared mapping stole a reserved page.  A non-reserved shared mapping
    should not eat into the reserve space.
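
The invariant described above can be checked mechanically. The following is a minimal sketch, assuming the /proc/meminfo text has already been read into a NUL-terminated buffer; the helper names meminfo_field() and hugetlb_reserve_ok() are hypothetical and are not part of the kernel or of libhugetlbfs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up a "Key:   value" counter in /proc/meminfo-style text.
 * Returns 0 and stores the value on success, -1 if the key is absent
 * or its value cannot be parsed.  (Hypothetical helper.)
 */
static int meminfo_field(const char *text, const char *key, long *value)
{
        const char *line = text;
        size_t keylen;

        assert(text && key && value);
        keylen = strlen(key);

        while (line && *line) {
                /* A matching line starts with the key followed by ':'. */
                if (strncmp(line, key, keylen) == 0 && line[keylen] == ':')
                        return sscanf(line + keylen + 1, "%ld", value) == 1 ? 0 : -1;

                /* Advance to the start of the next line, if any. */
                line = strchr(line, '\n');
                if (line)
                        line++;
        }
        return -1;
}

/*
 * Returns 1 if HugePages_Free >= HugePages_Rsvd, 0 if the invariant is
 * violated, and -1 if either counter is missing.  (Hypothetical helper.)
 */
static int hugetlb_reserve_ok(const char *text)
{
        long nr_free, nr_rsvd;

        if (meminfo_field(text, "HugePages_Free", &nr_free) ||
            meminfo_field(text, "HugePages_Rsvd", &nr_rsvd))
                return -1;
        return nr_free >= nr_rsvd;
}
```

Passing the two lines quoted above (HugePages_Free: 99, HugePages_Rsvd: 100) to hugetlb_reserve_ok() yields 0, which flags the stolen reserve page.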
    
    If vma_has_reserves() takes VM_NORESERVE into account and returns 0,
    meaning that the mapping has no reserved pages, then
    dequeue_huge_page_vma() checks whether enough free pages remain beyond
    the reserve.  This prevents a reserved page from being stolen.
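
The effect of the check can be illustrated with a simplified userspace model. This is a sketch under stated assumptions, not the kernel code: the model_* names, the flag values, and the reduced structures are hypothetical stand-ins for the real vm_area_struct and hstate, and only the decision logic described above is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; the kernel's VM_* values differ. */
#define MODEL_VM_MAYSHARE   0x1u
#define MODEL_VM_NORESERVE  0x2u

/* Reduced stand-in for the huge page pool counters. */
struct model_pool {
        unsigned long free_huge_pages;
        unsigned long resv_huge_pages;
};

/* Reduced stand-in for a mapping. */
struct model_vma {
        unsigned int flags;
        bool resv_owner;        /* private mapping that owns a reservation */
};

/* Does this mapping have reserved pages it may draw from? */
static bool model_vma_has_reserves(const struct model_vma *vma)
{
        /* The added check: a NORESERVE mapping never has reserves. */
        if (vma->flags & MODEL_VM_NORESERVE)
                return false;
        /* Shared mappings reserve their pages at mmap() time. */
        if (vma->flags & MODEL_VM_MAYSHARE)
                return true;
        return vma->resv_owner;
}

/* May a fault on this mapping take a page from the pool? */
static bool model_can_dequeue(const struct model_pool *pool,
                              const struct model_vma *vma)
{
        assert(pool->free_huge_pages >= pool->resv_huge_pages);

        /* Without reserves, only pages beyond the reserve may be used. */
        if (!model_vma_has_reserves(vma) &&
            pool->free_huge_pages - pool->resv_huge_pages == 0)
                return false;
        return true;
}
```

With 100 free and 100 reserved pages, as in the example, model_can_dequeue() refuses a mapping flagged MODEL_VM_MAYSHARE | MODEL_VM_NORESERVE but still allows a plain MODEL_VM_MAYSHARE mapping, which draws on its own reservation.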
    
    With this change, the test above generates a SIGBUS, which is correct:
    all free pages are reserved, so the non-reserved shared mapping cannot
    get a free page.
    Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
    Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Acked-by: Hillf Danton <dhillf@gmail.com>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Cc: David Gibson <david@gibson.dropbear.id.au>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hugetlb.c 88.4 KB