Commit 023a019a authored by Jérôme Glisse, committed by Linus Torvalds

mm/hmm: add default fault flags to avoid the need to pre-fill pfns arrays

The HMM mirror API can be used in two fashions.  In the first, the HMM
user coalesces multiple page faults into one request and sets flags per
pfn for each of those faults.  In the second, the HMM user wants to
pre-fault a range with specific flags.  For the latter it is a waste
to have the user pre-fill the pfns array with a default flags value.

This patch adds a default flags value, allowing the user to set flags for a
range without having to pre-fill the pfns array.

Link: http://lkml.kernel.org/r/20190403193318.16478-8-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent a3e0d41c
@@ -276,6 +276,41 @@ report commands as executed is serialized (there is no point in doing this
 concurrently).
+Leverage default_flags and pfn_flags_mask
+=========================================
+The hmm_range struct has 2 fields, default_flags and pfn_flags_mask, that allow
+setting the fault or snapshot policy for a whole range instead of having to set
+it for each entry in the range.
+For instance, if the device flags for device entries are:
+    VALID (1 << 63)
+    WRITE (1 << 62)
+Now let's say that the device driver wants to fault a range with at least read
+permission; then it sets:
+    range->default_flags = (1 << 63);
+    range->pfn_flags_mask = 0;
+and calls hmm_range_fault() as described above. This will fault in all pages
+in the range with at least read permission.
+Now let's say the driver wants to do the same except for one page in the range,
+for which it wants write permission. Now the driver sets:
+    range->default_flags = (1 << 63);
+    range->pfn_flags_mask = (1 << 62);
+    range->pfns[index_of_write] = (1 << 62);
+With this, HMM will fault in all pages with at least read (i.e. valid), and for
+the address == range->start + (index_of_write << PAGE_SHIFT) it will fault with
+write permission, i.e. if the CPU pte does not have write permission set then
+HMM will call handle_mm_fault().
+Note that HMM will populate the pfns array with write permission for any entry
+that has write permission within the CPU pte, no matter what values are set
+in default_flags or pfn_flags_mask.
 Represent and manage device memory from core kernel point of view
 =================================================================
......
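The two scenarios in the documentation hunk can be modelled outside the kernel. The following is a minimal user-space sketch, not kernel code: `DEV_VALID`, `DEV_WRITE`, `struct range_model` and the three helper functions are hypothetical names, and the combination rule is the expression this commit adds to hmm_pte_need_fault(). The constants are written as `1ULL << 63` because a plain `1 << 63` overflows `int` in real C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device flag encoding, taken from the documentation example. */
#define DEV_VALID (1ULL << 63)
#define DEV_WRITE (1ULL << 62)

/* Hypothetical stand-in for the hmm_range fields involved. */
struct range_model {
    uint64_t *pfns;          /* per-page request flags */
    size_t npages;
    uint64_t default_flags;  /* OR-ed into every entry */
    uint64_t pfn_flags_mask; /* selects which per-page bits are honoured */
};

/* Flags actually requested for page i: (entry & mask) | default. */
static uint64_t effective_request(const struct range_model *r, size_t i)
{
    assert(r != NULL && i < r->npages);
    return (r->pfns[i] & r->pfn_flags_mask) | r->default_flags;
}

/* Scenario 1: pre-fault the whole range with at least read permission. */
static void request_read_all(struct range_model *r)
{
    r->default_flags = DEV_VALID;
    r->pfn_flags_mask = 0; /* per-page entries are ignored entirely */
}

/* Scenario 2: read everywhere, plus write for one chosen page. */
static void request_read_all_write_one(struct range_model *r, size_t idx)
{
    assert(idx < r->npages);
    r->default_flags = DEV_VALID;
    r->pfn_flags_mask = DEV_WRITE; /* only the WRITE bit of an entry counts */
    r->pfns[idx] = DEV_WRITE;
}
```

One consequence of the rule, visible in this model: with a zero mask the contents of the pfns array do not matter, whereas in the second scenario any entry that still carries a stale WRITE bit would also be treated as a write request, so the sketch assumes the other entries have that bit cleared.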
@@ -165,6 +165,8 @@ enum hmm_pfn_value_e {
  * @pfns: array of pfns (big enough for the range)
  * @flags: pfn flags to match device driver page table
  * @values: pfn value for some special case (none, special, error, ...)
+ * @default_flags: default flags for the range (write, read, ... see hmm doc)
+ * @pfn_flags_mask: allows to mask pfn flags so that only default_flags matter
  * @pfn_shifts: pfn shift value (should be <= PAGE_SHIFT)
  * @valid: pfns array did not change since it has been fill by an HMM function
  */
@@ -177,6 +179,8 @@ struct hmm_range {
 	uint64_t *pfns;
 	const uint64_t *flags;
 	const uint64_t *values;
+	uint64_t default_flags;
+	uint64_t pfn_flags_mask;
 	uint8_t pfn_shift;
 	bool valid;
 };
@@ -448,6 +452,15 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
 {
 	long ret;
+	/*
+	 * With the old API the driver must set each individual entry with
+	 * the requested flags (valid, write, ...). So here we set the mask to
+	 * keep intact the entries provided by the driver and zero out the
+	 * default_flags.
+	 */
+	range->default_flags = 0;
+	range->pfn_flags_mask = -1UL;
 	ret = hmm_range_register(range, range->vma->vm_mm,
 				 range->start, range->end);
 	if (ret)
......
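The settings chosen in hmm_vma_fault() make the new combination step a no-op for drivers on the old API. A small user-space sketch (hypothetical helper names, not kernel code) checks that identity. It uses `UINT64_MAX` for the all-ones mask: the patch writes `-1UL`, which is all ones in a `uint64_t` only when `unsigned long` is 64 bits wide. On an ABI with a 32-bit `unsigned long` the same expression would widen to 0x00000000FFFFFFFF and mask off the high flag bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the expression added by the patch. */
static uint64_t combine(uint64_t entry, uint64_t mask, uint64_t default_flags)
{
    return (entry & mask) | default_flags;
}

/* Legacy settings: keep every driver-provided bit, add nothing. */
static uint64_t legacy_request(uint64_t entry)
{
    return combine(entry, UINT64_MAX, 0);
}

/* Aborts if any entry would be altered by the legacy settings. */
static void check_legacy_identity(const uint64_t *entries, size_t n)
{
    for (size_t i = 0; i < n; i++)
        assert(legacy_request(entries[i]) == entries[i]);
}
```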
@@ -419,6 +419,18 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 	if (!hmm_vma_walk->fault)
 		return;
+	/*
+	 * So we not only consider the individual per page request we also
+	 * consider the default flags requested for the range. The API can
+	 * be used in 2 fashions. The first one where the HMM user coalesces
+	 * multiple page faults into one request and sets flags per pfn for
+	 * each of those faults. The second one where the HMM user wants to
+	 * pre-fault a range with specific flags. For the latter one it is a
+	 * waste to have the user pre-fill the pfns array with a default
+	 * flags value.
+	 */
+	pfns = (pfns & range->pfn_flags_mask) | range->default_flags;
 	/* We aren't ask to do anything ... */
 	if (!(pfns & range->flags[HMM_PFN_VALID]))
 		return;
......
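How the combined flags feed the fault decision can be sketched as follows. This is a simplified user-space model, not the kernel function: the diff elides the rest of hmm_pte_need_fault(), so the write-fault rule below is taken from the documentation text in this commit (a fault is needed when write is requested and the CPU pte lacks write permission), and the rule for a CPU entry that is not valid is an assumption. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEV_VALID (1ULL << 63) /* hypothetical encoding from the doc example */
#define DEV_WRITE (1ULL << 62)

struct fault_decision {
    bool fault;       /* some fault is needed */
    bool write_fault; /* the fault must request write access */
};

static struct fault_decision need_fault(uint64_t entry, uint64_t mask,
                                        uint64_t default_flags,
                                        uint64_t cpu_flags)
{
    struct fault_decision d = { false, false };
    uint64_t req = (entry & mask) | default_flags;

    /* Nothing was requested for this page. */
    if (!(req & DEV_VALID))
        return d;

    /* Assumption: a page the CPU has not mapped must be faulted in. */
    d.fault = !(cpu_flags & DEV_VALID);

    /* Write requested but the CPU entry is not writable: write fault. */
    if ((req & DEV_WRITE) && !(cpu_flags & DEV_WRITE)) {
        d.fault = true;
        d.write_fault = true;
    }
    return d;
}

/* Usage: page mapped read-only by the CPU, request read plus write. */
static void demo(void)
{
    struct fault_decision d =
        need_fault(DEV_WRITE, DEV_WRITE, DEV_VALID, DEV_VALID);
    assert(d.fault && d.write_fault);
}
```

In this model a per-page WRITE request without VALID in either the entry or default_flags results in no fault at all, which matches the early return on HMM_PFN_VALID shown in the hunk.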