• Tarun Sahu's avatar
    mm/folio: avoid special handling for order value 0 in folio_set_order · e3b7bf97
    Tarun Sahu authored
    folio_set_order(folio, 0) is used in kernel at two places
    __destroy_compound_gigantic_folio and __prep_compound_gigantic_folio.
    Currently, It is called to clear out the folio->_folio_nr_pages and
    folio->_folio_order.
    
    For __destroy_compound_gigantic_folio:
    In past, folio_set_order(folio, 0) was needed because page->mapping used
    to overlap with _folio_nr_pages and _folio_order. So if these fields were
    left uncleared during freeing gigantic hugepages, they were causing
    "BUG: bad page state" due to non-zero page->mapping. Now, After
    Commit a01f4390 ("hugetlb: be sure to free demoted CMA pages to
    CMA") page->mapping has explicitly been cleared out for tail pages. Also,
    _folio_order and _folio_nr_pages no longer overlaps with page->mapping.
    
    So, folio_set_order(folio, 0) can be removed from freeing gigantic
    folio path (__destroy_compound_gigantic_folio).
    
    Another place, folio_set_order(folio, 0) is called inside
    __prep_compound_gigantic_folio during error path. Here,
    folio_set_order(folio, 0) can also be removed if we move
    folio_set_order(folio, order) after for loop.
    
    The patch also moves _folio_set_head call in __prep_compound_gigantic_folio()
    such that we avoid clearing them in the error path.
    
    Also, as Mike pointed out:
    "It would actually be better to move the calls _folio_set_head and
    folio_set_order in __prep_compound_gigantic_folio() as suggested here. Why?
    In the current code, the ref count on the 'head page' is still 1 (or more)
    while those calls are made. So, someone could take a speculative ref on the
    page BEFORE the tail pages are set up."
    
    This way, folio_set_order(folio, 0) is no more needed. And it will also
    helps removing the confusion of folio order being set to 0 (as _folio_order
    field is part of first tail page).
    
    Testing: I have run LTP tests, which all passes. and also I have written
    the test in LTP which tests the bug caused by compound_nr and page->mapping
    overlapping.
    
    https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/hugetlb/hugemmap/hugemmap32.c
    
    Running on older kernel ( < 5.10-rc7) with the above bug this fails while
    on newer kernel and, also with this patch it passes.
    
    Link: https://lkml.kernel.org/r/20230609162907.111756-1-tsahu@linux.ibm.comSigned-off-by: default avatarTarun Sahu <tsahu@linux.ibm.com>
    Reviewed-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    e3b7bf97
internal.h 34.1 KB