Commit 88ececc2 authored by Mike Rapoport, committed by Jonathan Corbet

docs/vm: hugetlbfs_reserv.txt: convert to ReST format

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent 148723f7
.. _hugetlbfs_reserve:

=====================
Hugetlbfs Reservation
=====================

Overview
========

Huge pages as described at :ref:`hugetlbpage` are typically
preallocated for application use. These huge pages are instantiated in a
task's address space at page fault time if the VMA indicates huge pages are
to be used. If no huge page exists at page fault time, the task is sent
@@ -17,47 +24,55 @@ describe how huge page reserve processing is done in the v4.10 kernel.

Audience
========

This description is primarily targeted at kernel developers who are modifying
hugetlbfs code.


The Data Structures
===================

resv_huge_pages
        This is a global (per-hstate) count of reserved huge pages. Reserved
        huge pages are only available to the task which reserved them.
        Therefore, the number of huge pages generally available is computed
        as (``free_huge_pages - resv_huge_pages``).
Reserve Map
        A reserve map is described by the structure::

                struct resv_map {
                        struct kref refs;
                        spinlock_t lock;
                        struct list_head regions;
                        long adds_in_progress;
                        struct list_head region_cache;
                        long region_cache_count;
                };

        There is one reserve map for each huge page mapping in the system.
        The regions list within the resv_map describes the regions within
        the mapping. A region is described as::

                struct file_region {
                        struct list_head link;
                        long from;
                        long to;
                };

        The 'from' and 'to' fields of the file region structure are huge page
        indices into the mapping. Depending on the type of mapping, a
        region in the resv_map may indicate reservations exist for the
        range, or reservations do not exist.
Flags for MAP_PRIVATE Reservations
        These are stored in the bottom bits of the reservation map pointer.

        ``#define HPAGE_RESV_OWNER    (1UL << 0)``
                Indicates this task is the owner of the reservations
                associated with the mapping.
        ``#define HPAGE_RESV_UNMAPPED (1UL << 1)``
                Indicates task originally mapping this range (and creating
                reserves) has unmapped a page from this task (the child)
                due to a failed COW.
Page Flags
        The PagePrivate page flag is used to indicate that a huge page
        reservation must be restored when the huge page is freed. More
@@ -65,12 +80,14 @@ Page Flags


Reservation Map Location (Private or Shared)
============================================

A huge page mapping or segment is either private or shared. If private,
it is typically only available to a single address space (task). If shared,
it can be mapped into multiple address spaces (tasks). The location and
semantics of the reservation map are significantly different for the two types
of mappings. Location differences are:

- For private mappings, the reservation map hangs off the VMA structure.
  Specifically, vma->vm_private_data. This reserve map is created at the
  time the mapping (mmap(MAP_PRIVATE)) is created.
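
As an illustration of how flags can share a word with the reservation map
pointer for a private mapping, the following is a minimal user-space sketch,
not the kernel implementation. It assumes the pointer is aligned so that its
two low bits are free. ``struct resv_map_stub``, ``MODEL_RESV_MASK``,
``pack_resv()``, ``unpack_map()`` and ``has_flag()`` are hypothetical names
introduced only for this example; only the two ``HPAGE_RESV_*`` values come
from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define HPAGE_RESV_OWNER    (1UL << 0)
#define HPAGE_RESV_UNMAPPED (1UL << 1)
/* Hypothetical mask covering both flag bits. */
#define MODEL_RESV_MASK     (HPAGE_RESV_OWNER | HPAGE_RESV_UNMAPPED)

/* Stand-in for struct resv_map; only its alignment matters here. */
struct resv_map_stub {
    long adds_in_progress;
};

/* Combine a map pointer and flags into one word (models vm_private_data). */
static uintptr_t pack_resv(struct resv_map_stub *map, unsigned long flags)
{
    /* Assumption: the pointer's two low bits are zero. */
    assert(((uintptr_t)map & MODEL_RESV_MASK) == 0);
    assert((flags & ~MODEL_RESV_MASK) == 0);
    return (uintptr_t)map | flags;
}

/* Recover the map pointer by clearing the flag bits. */
static struct resv_map_stub *unpack_map(uintptr_t word)
{
    return (struct resv_map_stub *)(word & ~(uintptr_t)MODEL_RESV_MASK);
}

/* Test a single flag bit in the packed word. */
static int has_flag(uintptr_t word, unsigned long flag)
{
    return (word & flag) != 0;
}
```

Packing a map with ``HPAGE_RESV_OWNER`` and unpacking it again yields the
original pointer, with the owner flag reported as set and the unmapped flag
as clear.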
@@ -82,15 +99,15 @@ of mappings. Location differences are:


Creating Reservations
=====================

Reservations are created when a huge page backed shared memory segment is
created (shmget(SHM_HUGETLB)) or a mapping is created via mmap(MAP_HUGETLB).
These operations result in a call to the routine hugetlb_reserve_pages()::

        int hugetlb_reserve_pages(struct inode *inode,
                                  long from, long to,
                                  struct vm_area_struct *vma,
                                  vm_flags_t vm_flags)

The first thing hugetlb_reserve_pages() does is check whether the NORESERVE
flag was specified in either the shmget() or mmap() call. If NORESERVE
@@ -105,6 +122,7 @@ the 'from' and 'to' arguments have been adjusted by this offset.
One of the big differences between PRIVATE and SHARED mappings is the way
in which reservations are represented in the reservation map.

- For shared mappings, an entry in the reservation map indicates a reservation
  exists or did exist for the corresponding page. As reservations are
  consumed, the reservation map is not modified.
@@ -121,12 +139,13 @@ to indicate this VMA owns the reservations.

The reservation map is consulted to determine how many huge page reservations
are needed for the current mapping/segment. For private mappings, this is
always the value (to - from). However, for shared mappings it is possible
that some reservations may already exist within the range (to - from). See the
section :ref:`Reservation Map Modifications <resv_map_modifications>`
for details on how this is accomplished.

The mapping may be associated with a subpool. If so, the subpool is consulted
to ensure there is sufficient space for the mapping. It is possible that the
subpool has set aside reservations that can be used for the mapping. See the
section :ref:`Subpool Reservations <sub_pool_resv>` for more details.

After consulting the reservation map and subpool, the number of needed new
reservations is known. The routine hugetlb_acct_memory() is called to check
@@ -135,9 +154,11 @@ calls into routines that potentially allocate and adjust surplus page counts.
However, within those routines the code is simply checking to ensure there
are enough free huge pages to accommodate the reservation. If there are,
the global reservation count resv_huge_pages is adjusted something like the
following::

        if (resv_needed <= (free_huge_pages - resv_huge_pages))
                resv_huge_pages += resv_needed;

Note that the global lock hugetlb_lock is held when checking and adjusting
these counters.
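
The check-and-adjust step can be modelled in user space as follows. This is
a sketch under stated assumptions rather than kernel code: it uses the
available-page formula (``free_huge_pages - resv_huge_pages``) from the data
structures section, and it leaves locking to the caller, who is assumed to
hold whatever lock protects the counters. ``struct hstate_model``,
``reserve_pages()`` and the ``0``/``-1`` return convention are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the two per-hstate counters used by the check. */
struct hstate_model {
    long free_huge_pages;
    long resv_huge_pages;
};

/*
 * Try to reserve resv_needed huge pages.
 * Assumption: the caller holds the lock protecting both counters.
 * Returns 0 on success, -1 if too few unreserved free pages exist.
 */
static int reserve_pages(struct hstate_model *h, long resv_needed)
{
    assert(resv_needed >= 0);

    /* Pages that are free and not already promised to another task. */
    long available = h->free_huge_pages - h->resv_huge_pages;

    if (resv_needed > available)
        return -1;

    h->resv_huge_pages += resv_needed;
    return 0;
}
```

With 10 free pages and 4 already reserved, a request for 6 succeeds and
raises the reserve count to 10. A further request for 1 page then fails and
leaves the counters unchanged.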
@@ -152,14 +173,18 @@ If hugetlb_reserve_pages() was successful, the global reservation count and
reservation map associated with the mapping will be modified as required to
ensure reservations exist for the range 'from' - 'to'.

.. _consume_resv:

Consuming Reservations/Allocating a Huge Page
=============================================

Reservations are consumed when huge pages associated with the reservations
are allocated and instantiated in the corresponding mapping. The allocation
is performed within the routine alloc_huge_page()::

        struct page *alloc_huge_page(struct vm_area_struct *vma,
                                     unsigned long addr, int avoid_reserve)

alloc_huge_page is passed a VMA pointer and a virtual address, so it can
consult the reservation map to determine if a reservation exists. In addition,
alloc_huge_page takes the argument avoid_reserve which indicates reserves
@@ -170,8 +195,9 @@ page are being allocated.

The helper routine vma_needs_reservation() is called to determine if a
reservation exists for the address within the mapping(vma). See the section
:ref:`Reservation Map Helper Routines <resv_map_helpers>` for detailed
information on what this routine does.
The value returned from vma_needs_reservation() is generally
0 or 1. 0 if a reservation exists for the address, 1 if no reservation exists.
If a reservation does not exist, and there is a subpool associated with the
mapping the subpool is consulted to determine if it contains reservations.
@@ -180,21 +206,25 @@ However, in every case the avoid_reserve argument overrides the use of
a reservation for the allocation. After determining whether a reservation
exists and can be used for the allocation, the routine dequeue_huge_page_vma()
is called. This routine takes two arguments related to reservations:

- avoid_reserve, this is the same value/argument passed to alloc_huge_page()
- chg, even though this argument is of type long only the values 0 or 1 are
  passed to dequeue_huge_page_vma. If the value is 0, it indicates a
  reservation exists (see the section "Reservations and Memory Policy" for
  possible issues). If the value is 1, it indicates a reservation does not
  exist and the page must be taken from the global free pool if possible.

The free lists associated with the memory policy of the VMA are searched for
a free page. If a page is found, the value free_huge_pages is decremented
when the page is removed from the free list. If there was a reservation
associated with the page, the following adjustments are made::

        SetPagePrivate(page);   /* Indicates allocating this page consumed
                                 * a reservation, and if an error is
                                 * encountered such that the page must be
                                 * freed, the reservation will be restored. */
        resv_huge_pages--;      /* Decrement the global reservation count */

Note, if no huge page can be found that satisfies the VMA's memory policy
an attempt will be made to allocate one using the buddy allocator. This
brings up the issue of surplus huge pages and overcommit which is beyond
@@ -222,12 +252,14 @@ mapping. In such cases, the reservation count and subpool free page count
will be off by one. This rare condition can be identified by comparing the
return value from vma_needs_reservation and vma_commit_reservation. If such
a race is detected, the subpool and global reserve counts are adjusted to
compensate. See the section
:ref:`Reservation Map Helper Routines <resv_map_helpers>` for more
information on these routines.


Instantiate Huge Pages
======================

After huge page allocation, the page is typically added to the page tables
of the allocating task. Before this, pages in a shared mapping are added
to the page cache and pages in private mappings are added to an anonymous
@@ -237,7 +269,8 @@ to the global reservation count (resv_huge_pages).


Freeing Huge Pages
==================

Huge page freeing is performed by the routine free_huge_page(). This routine
is the destructor for hugetlbfs compound pages. As a result, it is only
passed a pointer to the page struct. When a huge page is freed, reservation
@@ -247,7 +280,8 @@ on an error path where a global reserve count must be restored.

The page->private field points to any subpool associated with the page.
If the PagePrivate flag is set, it indicates the global reserve count should
be adjusted (see the section
:ref:`Consuming Reservations/Allocating a Huge Page <consume_resv>`
for information on how these are set).

The routine first calls hugepage_subpool_put_pages() for the page. If this
@@ -259,9 +293,11 @@ Therefore, the global resv_huge_pages counter is incremented in this case.
If the PagePrivate flag was set in the page, the global resv_huge_pages counter
will always be incremented.

.. _sub_pool_resv:

Subpool Reservations
====================

There is a struct hstate associated with each huge page size. The hstate
tracks all huge pages of the specified size. A subpool represents a subset
of pages within a hstate that is associated with a mounted hugetlbfs
@@ -295,7 +331,8 @@ the global pools.


COW and Reservations
====================

Since shared mappings all point to and use the same underlying pages, the
biggest reservation concern for COW is private mappings. In this case,
two tasks can be pointing at the same previously allocated page. One task
@@ -326,30 +363,36 @@ faults on a non-present page. But, the original owner of the
mapping/reservation will behave as expected.


.. _resv_map_modifications:

Reservation Map Modifications
=============================

The following low level routines are used to make modifications to a
reservation map. Typically, these routines are not called directly. Rather,
a reservation map helper routine is called which calls one of these low level
routines. These low level routines are fairly well documented in the source
code (mm/hugetlb.c). These routines are::

        long region_chg(struct resv_map *resv, long f, long t);
        long region_add(struct resv_map *resv, long f, long t);
        void region_abort(struct resv_map *resv, long f, long t);
        long region_count(struct resv_map *resv, long f, long t);

Operations on the reservation map typically involve two operations:

1) region_chg() is called to examine the reserve map and determine how
   many pages in the specified range [f, t) are NOT currently represented.

   The calling code performs global checks and allocations to determine if
   there are enough huge pages for the operation to succeed.

2)
  a) If the operation can succeed, region_add() is called to actually modify
     the reservation map for the same range [f, t) previously passed to
     region_chg().
  b) If the operation can not succeed, region_abort is called for the same
     range [f, t) to abort the operation.
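
The two-step protocol above can be sketched with a toy user-space model.
This is an illustration only and makes several assumptions: the reserve map
is modelled as a fixed array of per-page booleans rather than a list of
``file_region`` structures, ``adds_in_progress`` is treated as a plain count
of pending step-2 calls, and all ``model_*`` names and ``MODEL_PAGES`` are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGES 64  /* hypothetical size of the toy mapping */

/* Toy reserve map: one flag per huge page index. */
struct resv_map_model {
    bool present[MODEL_PAGES];
    long adds_in_progress;
};

/* Step 1: count pages in [f, t) NOT represented; the map is unchanged. */
static long model_region_chg(struct resv_map_model *resv, long f, long t)
{
    long chg = 0;

    assert(0 <= f && f <= t && t <= MODEL_PAGES);
    for (long i = f; i < t; i++)
        if (!resv->present[i])
            chg++;
    resv->adds_in_progress++;   /* a commit or abort must follow */
    return chg;
}

/* Step 2a: commit the range; returns the number of pages newly added. */
static long model_region_add(struct resv_map_model *resv, long f, long t)
{
    long added = 0;

    assert(resv->adds_in_progress > 0);
    for (long i = f; i < t; i++) {
        if (!resv->present[i]) {
            resv->present[i] = true;
            added++;
        }
    }
    resv->adds_in_progress--;
    return added;
}

/* Step 2b: abandon the pending operation without touching the map. */
static void model_region_abort(struct resv_map_model *resv, long f, long t)
{
    (void)f;
    (void)t;
    assert(resv->adds_in_progress > 0);
    resv->adds_in_progress--;
}
```

In this model, a caller would compare the value returned by
``model_region_chg()`` against the available huge pages, then call either
``model_region_add()`` or ``model_region_abort()`` for the same range.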

Note that this is a two step process where region_add() and region_abort()
are guaranteed to succeed after a prior call to region_chg() for the same
@@ -371,6 +414,7 @@ and make the appropriate adjustments.
The routine region_del() is called to remove regions from a reservation map.
It is typically called in the following situations:

- When a file in the hugetlbfs filesystem is being removed, the inode will
  be released and the reservation map freed. Before freeing the reservation
  map, all the individual file_region structures must be freed. In this case
@@ -384,6 +428,7 @@ It is typically called in the following situations:
  removed, region_del() is called to remove the corresponding entry from the
  reservation map. In this case, region_del is passed the range
  [page_idx, page_idx + 1).

In every case, region_del() will return the number of pages removed from the
reservation map. In VERY rare cases, region_del() can fail. This can only
happen in the hole punch case where it has to split an existing file_region
@@ -403,9 +448,11 @@ outstanding (outstanding = (end - start) - region_count(resv, start, end)).
Since the mapping is going away, the subpool and global reservation counts
are decremented by the number of outstanding reservations.
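
The arithmetic for outstanding reservations can be shown with a small
user-space sketch. It is a simplification under stated assumptions: the
result of region_count() is passed in as a plain number, and both counters
are decremented by the same amount as the text describes.
``outstanding_reservations()``, ``struct pool_counts`` and
``release_outstanding()`` are hypothetical names.

```c
#include <assert.h>

/*
 * outstanding = (end - start) - region_count(resv, start, end)
 * The region_count result is supplied by the caller in this model.
 */
static long outstanding_reservations(long start, long end, long region_count)
{
    assert(start <= end);
    assert(region_count >= 0 && region_count <= end - start);
    return (end - start) - region_count;
}

/* Hypothetical pair of counters adjusted when the mapping goes away. */
struct pool_counts {
    long subpool_resv;
    long global_resv;
};

/* Give back reservations that were never consumed. */
static void release_outstanding(struct pool_counts *c, long outstanding)
{
    c->subpool_resv -= outstanding;
    c->global_resv -= outstanding;
}
```

For a mapping covering huge page indices [0, 10) in which region_count()
reports 3 pages, 7 reservations are outstanding and both counters drop by 7.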

.. _resv_map_helpers:

Reservation Map Helper Routines
===============================

Several helper routines exist to query and modify the reservation maps.
These routines are only concerned with reservations for a specific huge
page, so they just pass in an address instead of a range. In addition,
@@ -414,32 +461,40 @@ or shared) and the location of the reservation map (inode or VMA) can be
determined. These routines simply call the underlying routines described
in the section "Reservation Map Modifications". However, they do take into
account the 'opposite' meaning of reservation map entries for private and
shared mappings and hide this detail from the caller::

        long vma_needs_reservation(struct hstate *h,
                                   struct vm_area_struct *vma,
                                   unsigned long addr)

This routine calls region_chg() for the specified page. If no reservation
exists, 1 is returned. If a reservation exists, 0 is returned::

        long vma_commit_reservation(struct hstate *h,
                                    struct vm_area_struct *vma,
                                    unsigned long addr)

This calls region_add() for the specified page. As in the case of region_chg
and region_add, this routine is to be called after a previous call to
vma_needs_reservation. It will add a reservation entry for the page. It
returns 1 if the reservation was added and 0 if not. The return value should
be compared with the return value of the previous call to
vma_needs_reservation. An unexpected difference indicates the reservation
map was modified between calls::

        void vma_end_reservation(struct hstate *h,
                                 struct vm_area_struct *vma,
                                 unsigned long addr)

This calls region_abort() for the specified page. As in the case of region_chg
and region_abort, this routine is to be called after a previous call to
vma_needs_reservation. It will abort/end the in progress reservation add
operation::

        long vma_add_reservation(struct hstate *h,
                                 struct vm_area_struct *vma,
                                 unsigned long addr)

This is a special wrapper routine to help facilitate reservation cleanup
on error paths. It is only called from the routine restore_reserve_on_error().
This routine is used in conjunction with vma_needs_reservation in an attempt
@@ -453,8 +508,10 @@ be done on error paths.


Reservation Cleanup in Error Paths
==================================

As mentioned in the section
:ref:`Reservation Map Helper Routines <resv_map_helpers>`, reservation
map modifications are performed in two steps. First vma_needs_reservation
is called before a page is allocated. If the allocation is successful,
then vma_commit_reservation is called. If not, vma_end_reservation is called.
@@ -494,13 +551,14 @@ so that a reservation will not be leaked when the huge page is freed.


Reservations and Memory Policy
==============================

Per-node huge page lists existed in struct hstate when git was first used
to manage Linux code. The concept of reservations was added some time later.
When reservations were added, no attempt was made to take memory policy
into account. While cpusets are not exactly the same as memory policy, this
comment in hugetlb_acct_memory sums up the interaction between reservations
and cpusets/memory policy::

        /*
         * When cpuset is configured, it breaks the strict hugetlb page
         * reservation as the accounting is done on a global variable. Such
......