iommu/vt-d: per-cpu deferred invalidation queues
The IOMMU's IOTLB invalidation is a costly process. When iommu mode is not set to "strict", it is done asynchronously. Current code amortizes the cost of invalidating IOTLB entries by batching all the invalidations in the system and performing a single global invalidation instead. The code queues pending invalidations in a global queue that is accessed under the global "async_umap_flush_lock" spinlock, which can result is significant spinlock contention. This patch splits this deferred queue into multiple per-cpu deferred queues, and thus gets rid of the "async_umap_flush_lock" and its contention. To keep existing deferred invalidation behavior, it still invalidates the pending invalidations of all CPUs whenever a CPU reaches its watermark or a timeout occurs. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Showing
Please register or sign in to comment