Commit 2780025e authored by Joao Martins, committed by Jason Gunthorpe

iommufd/iova_bitmap: Handle recording beyond the mapped pages

IOVA bitmap is a zero-copy scheme for recording dirty bits that iterates
over the bitmap user pages in chunks of at most
PAGE_SIZE/sizeof(struct page*) pages.
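
As a rough illustration of where the 64G iteration size mentioned in this
message comes from, the arithmetic can be sketched as follows. The values
are assumptions (4K base pages, 8-byte pointers, one bitmap bit per IOVA
page of 1 << pgshift bytes), and the ex_* names are hypothetical helpers,
not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only (typical 64-bit configuration). */
#define EX_PAGE_SIZE     4096ULL /* CPU base page size in bytes */
#define EX_PTR_SIZE      8ULL    /* sizeof(struct page *) on a 64-bit kernel */
#define EX_BITS_PER_BYTE 8ULL

/* Maximum number of bitmap user pages handled in one iteration. */
static uint64_t ex_pages_per_chunk(void)
{
	return EX_PAGE_SIZE / EX_PTR_SIZE;
}

/*
 * IOVA range (in bytes) covered by one iteration, assuming each bitmap
 * bit tracks one IOVA page of (1 << pgshift) bytes.
 */
static uint64_t ex_iova_per_chunk(unsigned int pgshift)
{
	uint64_t bits;

	/* The chunk holds 2^24 bits; keep the shift below from overflowing. */
	assert(pgshift < 40);
	bits = ex_pages_per_chunk() * EX_PAGE_SIZE * EX_BITS_PER_BYTE;
	return bits << pgshift;
}
```

Under these assumptions one iteration pins 512 bitmap pages, and with
pgshift = 12 those pages describe 64G of IOVA space.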

When the iterations are split up into 64G chunks, the end of a chunk may
land inside a PTE that is larger than the base page size. This leads to
only part of the huge page being recorded in the bitmap. Note that in
practice this is only a problem for IOMMU dirty tracking, i.e. when the
backing PTEs are IOMMU hugepages and the bitmap is in base page
granularity. So far this is not something that affects VF dirty trackers
(which report and record at the same granularity).
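
To make the failure mode concrete, the sketch below computes how much of a
dirty range lies past the end of the currently mapped window. It is only an
illustration of the arithmetic; ex_bytes_past_window() and its parameters
are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of the dirty range [dirty_iova, dirty_iova + dirty_len) that lie
 * past the end of the window [win_iova, win_iova + win_len).
 * Hypothetical helper, for illustration only.
 */
static uint64_t ex_bytes_past_window(uint64_t win_iova, uint64_t win_len,
				     uint64_t dirty_iova, uint64_t dirty_len)
{
	uint64_t win_end = win_iova + win_len;       /* exclusive */
	uint64_t dirty_end = dirty_iova + dirty_len; /* exclusive */

	/* The dirty range is assumed to start inside the window. */
	assert(dirty_iova >= win_iova && dirty_iova < win_end);
	return dirty_end > win_end ? dirty_end - win_end : 0;
}
```

For example, if the tracked range starts at 512M, a 64G window ends at
64G + 512M. A dirty 1G hugepage at IOVA 64G then has its first 512M inside
the window and its last 512M past it, and that second half is the part
that would otherwise go unrecorded.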

To fix that, if there is a remainder of bits left to set that the current
IOVA bitmap doesn't cover, make a copy of the bitmap structure and
iterate-and-set the remaining bits. Finally, when advancing the iterator,
skip all the bits that were set ahead.
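
The set-ahead idea can be sketched with a small userspace model. This is
not the kernel code: the toy_* names, the 8-bit window and the 32-bit
bitmap are all assumptions chosen to keep the example short, and a plain
array stands in for the pinned user pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_TOTAL_BITS  32 /* bits in the whole (user) bitmap */
#define TOY_WINDOW_BITS 8  /* bits "pinned" per iteration */

struct toy_bitmap {
	bool bits[TOY_TOTAL_BITS]; /* stands in for the user bitmap */
	size_t window_start;       /* first bit of the current window */
	size_t set_ahead;          /* bits still to set past the window */
};

static size_t toy_min(size_t a, size_t b)
{
	return a < b ? a : b;
}

static void toy_init(struct toy_bitmap *b)
{
	memset(b, 0, sizeof(*b));
}

/* Set [first, first + count), but only inside the current window. */
static void toy_set(struct toy_bitmap *b, size_t first, size_t count)
{
	size_t window_end = toy_min(b->window_start + TOY_WINDOW_BITS,
				    TOY_TOTAL_BITS);
	size_t last = first + count; /* exclusive */
	size_t cur;

	assert(first >= b->window_start && first < window_end);
	assert(last <= TOY_TOTAL_BITS);

	for (cur = first; cur < last && cur < window_end; cur++)
		b->bits[cur] = true;

	/* Record the remainder that the window could not hold. */
	if (cur < last)
		b->set_ahead = last - cur;
}

/* Move to the next window, first setting and skipping carried-over bits. */
static void toy_advance(struct toy_bitmap *b)
{
	b->window_start += TOY_WINDOW_BITS;

	while (b->set_ahead > 0 && b->window_start < TOY_TOTAL_BITS) {
		size_t n = toy_min(b->set_ahead, TOY_WINDOW_BITS);
		size_t i;

		n = toy_min(n, TOY_TOTAL_BITS - b->window_start);
		for (i = 0; i < n; i++)
			b->bits[b->window_start + i] = true;

		b->set_ahead -= n;
		b->window_start += n; /* skip what was set ahead */
	}
	b->set_ahead = 0;
}
```

In this model, setting 12 bits starting at bit 6 fills only bits 6 and 7
of the first window and leaves a remainder of 10. The next toy_advance()
sets bits 8 through 17 and resumes iteration at bit 18.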

Link: https://lore.kernel.org/r/20240202133415.23819-5-joao.m.martins@oracle.com
Reported-by: Avihai Horon <avihaih@nvidia.com>
Fixes: f35f22cc ("iommu/vt-d: Access/Dirty bit support for SS domains")
Fixes: 421a511a ("iommu/amd: Access/Dirty bit support in IOPTEs")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
parent 42af9511
@@ -113,6 +113,9 @@ struct iova_bitmap {

 	/* length of the IOVA range for the whole bitmap */
 	size_t length;
+
+	/* length of the IOVA range set ahead the pinned pages */
+	unsigned long set_ahead_length;
 };

 /*
@@ -341,6 +344,32 @@ static bool iova_bitmap_done(struct iova_bitmap *bitmap)
 	return bitmap->mapped_base_index >= bitmap->mapped_total_index;
 }

+static int iova_bitmap_set_ahead(struct iova_bitmap *bitmap,
+				 size_t set_ahead_length)
+{
+	int ret = 0;
+
+	while (set_ahead_length > 0 && !iova_bitmap_done(bitmap)) {
+		unsigned long length = iova_bitmap_mapped_length(bitmap);
+		unsigned long iova = iova_bitmap_mapped_iova(bitmap);
+
+		ret = iova_bitmap_get(bitmap);
+		if (ret)
+			break;
+
+		length = min(length, set_ahead_length);
+		iova_bitmap_set(bitmap, iova, length);
+
+		set_ahead_length -= length;
+		bitmap->mapped_base_index +=
+			iova_bitmap_offset_to_index(bitmap, length - 1) + 1;
+		iova_bitmap_put(bitmap);
+	}
+
+	bitmap->set_ahead_length = 0;
+	return ret;
+}
+
 /*
  * Advances to the next range, releases the current pinned
  * pages and pins the next set of bitmap pages.
@@ -357,6 +386,15 @@ static int iova_bitmap_advance(struct iova_bitmap *bitmap)
 	if (iova_bitmap_done(bitmap))
 		return 0;

+	/* Iterate, set and skip any bits requested for next iteration */
+	if (bitmap->set_ahead_length) {
+		int ret;
+
+		ret = iova_bitmap_set_ahead(bitmap, bitmap->set_ahead_length);
+		if (ret)
+			return ret;
+	}
+
 	/* When advancing the index we pin the next set of bitmap pages */
 	return iova_bitmap_get(bitmap);
 }
@@ -426,5 +464,10 @@ void iova_bitmap_set(struct iova_bitmap *bitmap,
 		kunmap_local(kaddr);
 		cur_bit += nbits;
 	} while (cur_bit <= last_bit);
+
+	if (unlikely(cur_bit <= last_bit)) {
+		bitmap->set_ahead_length =
+			((last_bit - cur_bit + 1) << bitmap->mapped.pgshift);
+	}
 }
 EXPORT_SYMBOL_NS_GPL(iova_bitmap_set, IOMMUFD);