    hugetlb: code clean for hugetlb_hstate_alloc_pages · fc37bbb3
    Gang Li authored
    Patch series "hugetlb: parallelize hugetlb page init on boot", v6.
    
    Introduction
    ------------
    Hugetlb initialization during boot takes up a considerable amount of time.
    For instance, on a 2TB system, initializing 1,800 1GB huge pages takes
    1-2 seconds out of 10 seconds.  Initializing 11,776 1GB pages on a 12TB
    Intel host takes more than 1 minute[1].  This is a noteworthy figure.
    
    Inspired by [2] and [3], hugetlb initialization can also be accelerated
    through parallelization.  The kernel already provides infrastructure such
    as padata_do_multithreaded(); this patch uses it to achieve effective
    results with minimal modifications.
    
    [1] https://lore.kernel.org/all/783f8bac-55b8-5b95-eb6a-11a583675000@google.com/
    [2] https://lore.kernel.org/all/20200527173608.2885243-1-daniel.m.jordan@oracle.com/
    [3] https://lore.kernel.org/all/20230906112605.2286994-1-usama.arif@bytedance.com/
    [4] https://lore.kernel.org/all/76becfc1-e609-e3e8-2966-4053143170b6@google.com/
    
    max_threads
    -----------
    This patch uses `padata_do_multithreaded` like this:
    
    ```
    job.max_threads	= num_node_state(N_MEMORY) * multiplier;
    padata_do_multithreaded(&job);
    ```
    
    To fully utilize the CPU, the number of parallel threads needs to be
    carefully considered.  `max_threads = num_node_state(N_MEMORY)` does not
    fully utilize the CPU, so we need to multiply it by a multiplier.
    
    The tests below indicate that a multiplier of 2 significantly improves
    performance.  Larger values also provide improvements, but the gains are
    marginal.
    
      multiplier     1       2       3       4       5
     ------------ ------- ------- ------- ------- -------
      256G 2node   358ms   215ms   157ms   134ms   126ms
      2T   4node   979ms   679ms   543ms   489ms   481ms
      50G  2node   71ms    44ms    37ms    30ms    31ms
    
    Therefore, choosing 2 as the multiplier strikes a good balance between
    enhancing parallel processing capabilities and maintaining efficient
    resource management.
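    
    One way to read the table is to look at how many milliseconds each
    additional multiplier step saves.  The following is a minimal user-space
    sketch, not part of the patch; the names `bench_row`, `bench_rows`,
    `step_saving_ms` and `print_step_savings` are hypothetical, and the
    numbers are copied from the table above.
    
    ```c
    #include <stdio.h>
    
    #define NUM_MULTIPLIERS 5
    
    /* Measured times in ms, copied from the table above.
     * Column i holds the result for multiplier i + 1. */
    struct bench_row {
        const char *config;
        int ms[NUM_MULTIPLIERS];
    };
    
    const struct bench_row bench_rows[] = {
        { "256G 2node", { 358, 215, 157, 134, 126 } },
        { "2T   4node", { 979, 679, 543, 489, 481 } },
        { "50G  2node", {  71,  44,  37,  30,  31 } },
    };
    
    /* ms saved by raising the multiplier from m to m + 1 (m in 1..4). */
    int step_saving_ms(const struct bench_row *row, int m)
    {
        return row->ms[m - 1] - row->ms[m];
    }
    
    /* Print the per-step savings for every row of the table. */
    void print_step_savings(void)
    {
        size_t n = sizeof(bench_rows) / sizeof(bench_rows[0]);
    
        for (size_t i = 0; i < n; i++) {
            printf("%s:", bench_rows[i].config);
            for (int m = 1; m < NUM_MULTIPLIERS; m++)
                printf(" %d->%d: %dms", m, m + 1,
                       step_saving_ms(&bench_rows[i], m));
            printf("\n");
        }
    }
    ```
    
    For the 256G row the successive savings are 143ms, 58ms, 23ms and 8ms,
    so the step from 1 to 2 is the largest single gain.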
    
    Test result
    -----------
          test case       no patch(ms)   patched(ms)   saved
     ------------------- -------------- ------------- --------
      256c2T(4 node) 1G           4745          2024   57.34%
      128c1T(2 node) 1G           3358          1712   49.02%
         12T         1G          77000         18300   76.23%
    
      256c2T(4 node) 2M           3336          1051   68.52%
      128c1T(2 node) 2M           1943           716   63.15%
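    
    The "saved" column appears to be the relative reduction
    (no patch - patched) / no patch.  A minimal sketch of that arithmetic,
    assuming this formula and round-to-nearest; `saved_percent_x100` is a
    hypothetical helper, not something the patch adds:
    
    ```c
    #include <stdio.h>
    
    /* Relative time saved, in hundredths of a percent (5734 means 57.34%).
     * Assumes saved = (before - after) / before, rounded to nearest.
     * Returns 0 when the input cannot describe a saving. */
    unsigned long saved_percent_x100(unsigned long before_ms,
                                     unsigned long after_ms)
    {
        if (before_ms == 0 || after_ms > before_ms)
            return 0;
    
        /* Adding before_ms / 2 rounds to nearest instead of truncating. */
        return ((before_ms - after_ms) * 10000 + before_ms / 2) / before_ms;
    }
    
    /* Print one value in the same "57.34%" style as the table. */
    void print_saved(unsigned long before_ms, unsigned long after_ms)
    {
        unsigned long s = saved_percent_x100(before_ms, after_ms);
    
        printf("%lu.%02lu%%\n", s / 100, s % 100);
    }
    ```
    
    With the 1G rows above, `saved_percent_x100(4745, 2024)` gives 5734 and
    `saved_percent_x100(77000, 18300)` gives 7623, matching the 57.34% and
    76.23% entries.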
    
    
    This patch (of 8):
    
    The readability of `hugetlb_hstate_alloc_pages` is poor.  By cleaning the
    code, its readability can be improved, facilitating future modifications.
    
    This patch extracts two functions to reduce the complexity of
    `hugetlb_hstate_alloc_pages` and makes no functional changes.
    
    - hugetlb_hstate_alloc_pages_node_specific() iterates through each online
      node and performs allocation if necessary.
    - hugetlb_hstate_alloc_pages_report() reports errors during allocation and
      updates the value of h->max_huge_pages accordingly.
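    
    The shape of this split can be illustrated with a user-space mock.  This
    is only a sketch under stated assumptions, not the kernel code: every
    `mock_` name, the `MOCK_MAX_NODES` constant, the `available` parameter
    and the "allocation always succeeds" behaviour are hypothetical stand-ins
    chosen so that the example is self-contained.
    
    ```c
    #include <stdbool.h>
    #include <stdio.h>
    
    #define MOCK_MAX_NODES 4    /* hypothetical node count for the mock */
    
    /* Hypothetical stand-in for the two hstate fields the sketch needs. */
    struct mock_hstate {
        unsigned long max_huge_pages;
        unsigned long max_huge_pages_node[MOCK_MAX_NODES];
    };
    
    /* Helper 1: walk the nodes and "allocate" where a per-node count was
     * requested.  Returns true if any node-specific request was handled. */
    bool mock_alloc_pages_node_specific(struct mock_hstate *h,
                                        unsigned long *allocated)
    {
        bool node_specific = false;
    
        for (int nid = 0; nid < MOCK_MAX_NODES; nid++) {
            if (h->max_huge_pages_node[nid] > 0) {
                /* The mock pretends every per-node request succeeds. */
                *allocated += h->max_huge_pages_node[nid];
                node_specific = true;
            }
        }
        return node_specific;
    }
    
    /* Helper 2: report a shortfall and update max_huge_pages to match. */
    void mock_alloc_pages_report(unsigned long allocated,
                                 struct mock_hstate *h)
    {
        if (allocated < h->max_huge_pages) {
            fprintf(stderr, "mock: requested %lu pages, allocated only %lu\n",
                    h->max_huge_pages, allocated);
            h->max_huge_pages = allocated;
        }
    }
    
    /* Caller after the split: node-specific requests first, otherwise a
     * generic path (mocked as min(requested, available)) plus the report. */
    void mock_hstate_alloc_pages(struct mock_hstate *h, unsigned long available)
    {
        unsigned long allocated = 0;
    
        if (mock_alloc_pages_node_specific(h, &allocated))
            return;
    
        allocated = h->max_huge_pages < available ? h->max_huge_pages
                                                  : available;
        mock_alloc_pages_report(allocated, h);
    }
    ```
    
    With the two helpers factored out, the caller reads as a short sequence
    of steps, which is the readability gain this patch is after.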
    
    Link: https://lkml.kernel.org/r/20240222140422.393911-1-gang.li@linux.dev
    Link: https://lkml.kernel.org/r/20240222140422.393911-2-gang.li@linux.dev
    Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
    Tested-by: David Rientjes <rientjes@google.com>
    Reviewed-by: Muchun Song <muchun.song@linux.dev>
    Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
    Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jane Chu <jane.chu@oracle.com>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Randy Dunlap <rdunlap@infradead.org>
    Cc: Steffen Klassert <steffen.klassert@secunet.com>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>