1. 20 Aug, 2019 7 commits
    • nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy · dea027f2
      Christoph Hellwig authored
      When we start a new batch of dma_map operations we need to reset dma_nr,
      as we start filling a newly allocated array.
      
      Link: https://lore.kernel.org/r/20190814075928.23766-3-hch@lst.de
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
      Tested-by: Ralph Campbell <rcampbell@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
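
      A minimal sketch of the shape of the fix (the struct and function names
      below are stand-ins, not nouveau's actual ones): the fill cursor has to
      be reset whenever the array it indexes is reallocated, otherwise a count
      left over from the previous batch indexes past the new allocation.

          struct dmem_migrate {            /* stand-in for the migrate state */
                  dma_addr_t *dma;         /* addresses mapped for this batch */
                  unsigned long dma_nr;    /* how many entries are filled */
          };

          static int dmem_migrate_alloc(struct dmem_migrate *m,
                                        unsigned long npages)
          {
                  m->dma = kmalloc_array(npages, sizeof(*m->dma), GFP_KERNEL);
                  if (!m->dma)
                          return -ENOMEM;
                  m->dma_nr = 0;           /* the fix: restart for the new array */
                  return 0;
          }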
    • mm: turn migrate_vma upside down · a7d1f22b
      Christoph Hellwig authored
      There isn't any good reason to pass callbacks to migrate_vma.  Instead
      we can just export the three steps done by this function to drivers and
      let them sequence the operation without callbacks.  This removes a lot
      of boilerplate code as-is, and will allow the drivers to drastically
      improve code flow and error handling further on.
      
      Link: https://lore.kernel.org/r/20190814075928.23766-2-hch@lst.de
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
      Tested-by: Ralph Campbell <rcampbell@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
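
      The three exported steps are migrate_vma_setup(), migrate_vma_pages()
      and migrate_vma_finalize(). A condensed sketch of the resulting
      driver-side flow (array sizing and the actual copy elided):

          unsigned long src_pfns[NPAGES], dst_pfns[NPAGES];
          struct migrate_vma args = {
                  .vma   = vma,
                  .start = start,
                  .end   = end,
                  .src   = src_pfns,
                  .dst   = dst_pfns,
          };

          if (migrate_vma_setup(&args))   /* collect and isolate src pages */
                  return -EINVAL;

          /* For each args.src[i] with MIGRATE_PFN_MIGRATE set, the driver
           * allocates a destination page, copies the data, and stores
           * migrate_pfn(new_page) into args.dst[i]. */

          migrate_vma_pages(&args);       /* install the new pages */
          migrate_vma_finalize(&args);    /* unlock and release all pages */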
    • Merge 'notifier_get_put' into hmm.git · f4fb3b9c
      Jason Gunthorpe authored
      Jason Gunthorpe says:
      
      ====================
      Add mmu_notifier_get/put for managing mmu notifier registrations
      
      This series introduces a new registration flow for mmu_notifiers based on
      the idea that the user would like to get a single refcounted piece of
      memory for a mm, keyed to its use.
      
      For instance many users of mmu_notifiers use an interval tree or similar
      to dispatch notifications to some object. There are many objects but only
      one notifier subscription per mm holding the tree.
      
      Of the 12 places that call mmu_notifier_register:
       - 7 are maintaining some kind of obvious mapping of mm_struct to
         mmu_notifier registration, ie in some linked list or hash table. Of
         the 7 this series converts 4 (gru, hmm, RDMA, radeon)
      
       - 3 (hfi1, gntdev, vhost) are registering multiple notifiers, but each
         one immediately does some VA range filtering, ie with an interval tree.
         These would be better with a global subsystem-wide range filter and
         could convert to this API.
      
       - 2 (kvm, amd_iommu) are deliberately using a single mm at a time, and
         really can't use this API. One of intel-svm's modes is also in this
         list.
      
      The 3 of the 7 drivers left unconverted are:
       - intel-svm
         This driver tracks mm's in a global linked list 'global_svm_list'
         and would benefit from this API.
      
         Its flow is a bit complex, since it also wants a set of non-shared
         notifiers.
      
       - i915_gem_userptr
         This driver tracks mm's in a per-device hash
         table (dev_priv->mm_structs), but only has an optional use of
         mmu_notifiers.  Since it still seems to need the hash table it is
         difficult to convert.
      
       - amdkfd/kfd_process
         This driver is using a global SRCU hash table to track mm's.
      
         The control flow here is very complicated and the driver is relying on
         this hash table to be fast on the ioctl syscall path.
      
         It would definitely benefit, but only if the ioctl path didn't need to
         do the search so often.
      ====================
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
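
      The resulting usage pattern, sketched with a hypothetical my_ctx driver
      object (mmu_notifier_get()/mmu_notifier_put() and the alloc_notifier/
      free_notifier ops are the API this series adds):

          struct my_ctx {                  /* hypothetical per-mm driver state */
                  struct mmu_notifier mn;
                  /* interval tree, locks, ... */
          };

          static struct mmu_notifier *my_alloc_notifier(struct mm_struct *mm)
          {
                  struct my_ctx *ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);

                  if (!ctx)
                          return ERR_PTR(-ENOMEM);
                  return &ctx->mn;         /* the core ties it to the mm */
          }

          static void my_free_notifier(struct mmu_notifier *mn)
          {
                  kfree(container_of(mn, struct my_ctx, mn));
          }

          static const struct mmu_notifier_ops my_ops = {
                  .alloc_notifier = my_alloc_notifier,
                  .free_notifier  = my_free_notifier,
                  /* .invalidate_range_start = ..., etc. */
          };

          /* Returns the one existing registration for this mm keyed by
           * &my_ops, or allocates it via my_alloc_notifier(): */
          mn = mmu_notifier_get(&my_ops, mm);
          if (IS_ERR(mn))
                  return PTR_ERR(mn);
          ...
          mmu_notifier_put(mn);  /* my_free_notifier() runs after a SRCU grace period */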
    • drm/amdkfd: use mmu_notifier_put · 471f3902
      Jason Gunthorpe authored
      The sequence of mmu_notifier_unregister_no_release() followed by
      mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
      free_notifier callback.
      
      As this is the last user of those APIs, converting it means we can drop
      them.
      
      Link: https://lore.kernel.org/r/20190806231548.25242-11-jgg@ziepe.ca
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
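
      In other words, for a driver object p that embeds the notifier (names
      hypothetical), the conversion looks roughly like:

          /* Before: open-coded SRCU-deferred free. */
          mmu_notifier_unregister_no_release(&p->mn, p->mm);
          mmu_notifier_call_srcu(&p->rcu, p_free_cb);  /* kfree(p) in the cb */

          /* After: a single put; the free_notifier op is invoked for us
           * once the SRCU grace period has elapsed. */
          mmu_notifier_put(&p->mn);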
    • drm/amdkfd: fix a use after free race with mmu_notifier unregister · 0029cab3
      Jason Gunthorpe authored
      When using mmu_notifier_unregister_no_release() the caller must ensure
      there is a SRCU synchronize before the mn memory is freed, otherwise use
      after free races are possible, for instance:
      
           CPU0                                      CPU1
                                            invalidate_range_start
                                               hlist_for_each_entry_rcu(..)
       mmu_notifier_unregister_no_release(&p->mn)
       kfree(mn)
                                            if (mn->ops->invalidate_range_end)
      
      The error unwind in amdkfd misses the SRCU synchronization.
      
      amdkfd keeps the kfd_process around until the mm is released, so split the
      flow to first fully initialize the kfd_process, register it for
      find_process, and register it with the notifier. Past this point the
      kfd_process does not need to be cleaned up, as it is fully ready.
      
      The final failable step does a vm_mmap() and does not seem to impact the
      kfd_process global state. Since it also cannot be undone (and already has
      problems with undo if it internally fails), it has to be last.
      
      This way we don't have to try to unwind the mmu_notifier_register() and
      avoid the problem with the SRCU.
      
      Along the way this also fixes various other error unwind bugs in the flow.
      
      Fixes: 45102048 ("amdkfd: Add process queue manager module")
      Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.ca
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
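
      A sketch of the reordered flow (helper names hypothetical): everything
      that can fail and be unwound happens before registration, and the one
      step that cannot be unwound goes last:

          err = init_process_state(p);     /* all undoable setup first */
          if (err)
                  goto err_free;

          err = mmu_notifier_register(&p->mn, p->mm);
          if (err)
                  goto err_uninit;         /* never armed; plain unwind is safe */

          /* p is now fully ready and lives until the mm is released, so no
           * failure past this point has to unregister the notifier and race
           * with SRCU readers. */
          err = final_vm_mmap_step(p);     /* cannot be undone; must be last */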
    • drm/radeon: use mmu_notifier_get/put for struct radeon_mn · 534e5f84
      Jason Gunthorpe authored
      radeon is using a device global hash table to track what mmu_notifiers
      have been registered on struct mm. This is better served with the new
      get/put scheme instead.
      
      radeon has a bug where it was not blocking notifier release() until all
      the BOs had been invalidated. This could result in a use after free of
      the pages backing the BOs. This is tied into a second bug where radeon
      left the notifiers running endlessly even after the interval tree became
      empty, which could result in a use after free at module unload.
      
      Both are fixed by changing the lifetime model: the BOs exist in the
      interval tree with their natural lifetimes, independent of the mm_struct
      lifetime, using the get/put scheme. The release runs synchronously and just
      does invalidate_start across the entire interval tree to create the
      required DMA fence.
      
      Additions to the interval tree after release are already impossible as
      only current->mm is used during the add.
      
      Link: https://lore.kernel.org/r/20190806231548.25242-9-jgg@ziepe.ca
      Acked-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • hmm: use mmu_notifier_get/put for 'struct hmm' · c7d8b782
      Jason Gunthorpe authored
      This is a significant simplification: it eliminates all the remaining
      'hmm' stuff in mm_struct, eliminates krefing along the critical notifier
      paths, and takes away all the ugly locking and abuse of page_table_lock.
      
      mmu_notifier_get() provides the single struct hmm per struct mm which
      eliminates mm->hmm.
      
      It also directly guarantees that no mmu_notifier op callback is callable
      while concurrent free is possible, which eliminates all the krefs inside
      the mmu_notifier callbacks.
      
      The remaining krefs in the range code were overly cautious, drivers are
      already not permitted to free the mirror while a range exists.
      
      Link: https://lore.kernel.org/r/20190806231548.25242-6-jgg@ziepe.ca
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
      Tested-by: Ralph Campbell <rcampbell@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 16 Aug, 2019 4 commits
  3. 07 Aug, 2019 13 commits
  4. 26 Jul, 2019 6 commits
  5. 25 Jul, 2019 7 commits
  6. 21 Jul, 2019 3 commits