- 26 Jul, 2022 4 commits
-
-
Logan Gunthorpe authored
Add EREMOTEIO error return to dma_map_sgtable() which will be used by .map_sg() implementations that detect P2PDMA pages that the underlying DMA device cannot access.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
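A minimal driver-side sketch of how a dma_map_sgtable() caller might handle the new error code; the function and variable names are illustrative, not from the patch:

    #include <linux/dma-mapping.h>

    /* Map an sg_table and distinguish "P2PDMA not reachable" from other errors. */
    static int example_map(struct device *dev, struct sg_table *sgt)
    {
            int ret = dma_map_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);

            if (ret == -EREMOTEIO) {
                    /* The DMA device cannot reach the P2PDMA pages in this SGL. */
                    return ret;     /* caller may fall back to a non-P2P path */
            }
            return ret;             /* 0 on success, other -errno on failure */
    }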
-
Logan Gunthorpe authored
Add pci_p2pdma_map_segment() as a helper for dma_map_sg() implementations. It takes a scatterlist segment that must point to a pci_p2pdma struct page and will map it if the mapping requires a bus address. The return value indicates whether the mapping required a bus address or whether the caller still needs to map the segment normally. If the segment should not be mapped, -EREMOTEIO is returned. This helper uses a state structure to track changes to the pgmap across calls and avoid an xarray lookup for every page. The prototype for the helper is added to dma-map-ops.h, as it is only useful to DMA map implementations and doesn't need to pollute the public pci-p2pdma header.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
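A rough sketch of the caller pattern described above, as it might look inside a .map_sg() implementation. The exact signature and return convention are assumptions inferred from this changelog, not the final API:

    #include <linux/dma-map-ops.h>
    #include <linux/pci-p2pdma.h>

    /*
     * Assumed convention per the changelog: a non-negative return says whether
     * the segment was mapped to a bus address, -EREMOTEIO means "don't map it".
     */
    static int example_map_sg(struct device *dev, struct scatterlist *sgl, int nents)
    {
            struct pci_p2pdma_map_state p2pdma_state = {};
            struct scatterlist *sg;
            int i, ret;

            for_each_sg(sgl, sg, nents, i) {
                    if (is_pci_p2pdma_page(sg_page(sg))) {
                            ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg);
                            if (ret < 0)
                                    return ret;     /* unmappable P2PDMA page */
                            if (ret)
                                    continue;       /* already mapped to a bus address */
                    }
                    /* ... otherwise fall through to the normal streaming mapping ... */
            }
            return nents;
    }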
-
Logan Gunthorpe authored
Attempt to find the mapping type for P2PDMA pages on the first DMA map attempt if it has not been done ahead of time. Previously, the mapping type was expected to be calculated ahead of time, but if pages come from userspace there is no way to ensure the path was checked ahead of time. With this change the mapping type is calculated on first use if it hasn't been pre-calculated, so it is no longer invalid to call pci_p2pdma_map_sg() before the mapping type is known; drop the WARN_ON for that case.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Logan Gunthorpe authored
Introduce a dma_flags field in struct scatterlist. These flags will be used by dma_[un]map_sg_p2pdma() to determine when a given SGL segment's dma_address points to a PCI bus address. dma_unmap_sg_p2pdma() will need to perform different cleanup when a segment is marked as a bus address. The dma_flags field fits in the existing padding on 64BIT systems (assuming CONFIG_NEED_SG_DMA_LENGTH is also set). The new bit will only be used when CONFIG_PCI_P2PDMA is set, which means PCI P2PDMA will require CONFIG_64BIT. This should be acceptable, as the majority of P2PDMA use cases are restricted to newer root complexes and roughly require the extra address space for the memory BARs used in the transactions.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
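A sketch of the unmap-side check this flag enables. The helper names follow the pattern used in this series (e.g. sg_is_dma_bus_address()), but treat the exact names as an assumption; the normal per-segment unmap is left as a comment since it is implementation specific:

    #include <linux/scatterlist.h>
    #include <linux/dma-direction.h>

    /* Roughly what an .unmap_sg implementation does once segments can be bus addresses. */
    static void example_unmap_sg(struct device *dev, struct scatterlist *sgl, int nents,
                                 enum dma_data_direction dir)
    {
            struct scatterlist *sg;
            int i;

            for_each_sg(sgl, sg, nents, i) {
                    if (sg_is_dma_bus_address(sg)) {
                            /* A PCI bus address: no IOMMU/swiotlb teardown is needed. */
                            sg_dma_unmark_bus_address(sg);
                            continue;
                    }
                    /* ... otherwise perform the implementation's normal per-segment unmap ... */
            }
    }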
-
- 22 Jul, 2022 3 commits
-
-
Tianyu Lan authored
- Fix the used field of struct io_tlb_area not being initialized
- Set the area number to 0 if the input area number parameter is 0
- Use array_size() to calculate the io_tlb_area array size
- Make the parameters of swiotlb_do_find_slots() more reasonable
Fixes: 26ffb91fa5e0 ("swiotlb: split up the global swiotlb lock")
Signed-off-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Lukas Bulwahn authored
Commit e3217540 ("ARM/dma-mapping: remove dmabounce") removed the config DMABOUNCE. A comment on the function __dma_page_cpu_to_dev() still refers to this removed config. Remove the obsolete explanation, but keep the recommendation not to use __dma_page_cpu_to_dev() and to use the dma_sync_* functions instead.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
John Garry authored
Add a comment explaining that the SCSI disk request_queue's default max_sectors initial value is limited to the SCSI host's optimal sectors limit.
Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 19 Jul, 2022 6 commits
-
-
John Garry authored
ATA devices (struct ata_device) have a max_sectors field which is configured internally in libata. This is then used to (re)configure the associated sdev request queue's max_sectors value, overriding what was set earlier in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value is set according to shost limits, which include the host DMA mapping limits. Cap the ata_device max_sectors according to shost->max_sectors to respect this shost limit.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
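The change boils down to a single cap when libata configures the SCSI device; a minimal sketch of the idea, with the exact call site in libata left as an assumption:

    #include <linux/libata.h>
    #include <scsi/scsi_device.h>
    #include <scsi/scsi_host.h>

    /* Sketch: respect the SCSI host limit, which reflects DMA mapping limits. */
    static void example_cap_ata_max_sectors(struct ata_device *dev, struct scsi_device *sdev)
    {
            dev->max_sectors = min_t(unsigned int, dev->max_sectors,
                                     sdev->host->max_sectors);
    }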
-
John Garry authored
Streaming DMA mappings may be considerably slower when mappings go through an IOMMU and the total mapping length is somewhat long. This is because the IOMMU IOVA code allocates and frees an IOVA for each mapping, which may affect performance. For performance reasons, set the request queue max_sectors from dma_opt_mapping_size(), which knows this mapping limit.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
John Garry authored
Streaming DMA mappings may be considerably slower when mappings go through an IOMMU and the total mapping length is somewhat long. This is because the IOMMU IOVA code allocates and frees an IOVA for each mapping, which may affect performance. Add a new member, Scsi_Host.opt_sectors, which is the optimal host max_sectors, and use this value to cap the request queue max_sectors when it is set. It could be argued that the request queue's io_opt value should initially be set to Scsi_Host.opt_sectors in __scsi_init_queue(), but that is not really the purpose of io_opt. Finally, even though the Scsi_Host.opt_sectors value should never be greater than the request queue's max_hw_sectors value, continue to limit against max_hw_sectors for safety.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
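A sketch of how a SCSI low-level driver whose device sits behind an IOMMU might fill in the new limit; the opt_sectors field name comes from this patch, while the surrounding helper and where it is called from are illustrative assumptions:

    #include <linux/blkdev.h>
    #include <linux/dma-mapping.h>
    #include <scsi/scsi_host.h>

    /* Report the DMA-optimal transfer size as the host's optimal max_sectors. */
    static void example_set_opt_sectors(struct Scsi_Host *shost, struct device *dma_dev)
    {
            shost->opt_sectors = min_t(unsigned int, shost->max_sectors,
                                       dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
    }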
-
John Garry authored
The shost->max_sectors is repeatedly capped according to the host DMA mapping limit for each sdev in __scsi_init_queue(). This is unnecessary, so set it only once when adding the host.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
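A minimal sketch of the one-time cap, assuming it moves to the host-add path; the function name and guard are illustrative:

    #include <linux/blkdev.h>
    #include <linux/dma-mapping.h>
    #include <scsi/scsi_host.h>

    /* Cap the host-wide max_sectors once, against the DMA mapping limit. */
    static void example_cap_shost_max_sectors(struct Scsi_Host *shost, struct device *dma_dev)
    {
            if (dma_dev->dma_mask)
                    shost->max_sectors = min_t(unsigned int, shost->max_sectors,
                                               dma_max_mapping_size(dma_dev) >> SECTOR_SHIFT);
    }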
-
John Garry authored
Add the IOMMU callback for the DMA mapping API dma_opt_mapping_size(), which allows drivers to know the optimal mapping limit and thus limit their requested IOVA lengths. This value is based on the IOVA rcache range limit, as IOVAs allocated above this limit must always be newly allocated, which may be quite slow.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
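For reference, the rcache limit this is based on works out as follows; a minimal sketch assuming 4 KiB pages and the current IOVA_RANGE_CACHE_MAX_SIZE of 6 (both assumptions about unchanged defaults):

    #include <linux/sizes.h>

    /*
     * The IOVA rcaches cover allocation orders 0 .. IOVA_RANGE_CACHE_MAX_SIZE - 1,
     * so the largest cached mapping is:
     *
     *   PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1) = 4 KiB << 5 = 128 KiB
     *
     * which is the value the IOMMU opt_mapping_size callback would report here.
     */
    #define EXAMPLE_IOVA_RANGE_CACHE_MAX_SIZE  6
    #define EXAMPLE_OPT_MAPPING_SIZE \
            (SZ_4K << (EXAMPLE_IOVA_RANGE_CACHE_MAX_SIZE - 1))  /* 128 KiB */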
-
John Garry authored
Streaming DMA mappings involving an IOMMU may be much slower for larger total mapping sizes. This is because every IOMMU DMA mapping requires an IOVA to be allocated and freed. IOVA sizes above a certain limit are not cached, which can have a big impact on DMA mapping performance. Provide an API for device drivers to know this "optimal" limit, such that they may try to produce mappings which don't exceed it.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
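A minimal sketch of how a block driver might consume the new API to size its transfers; the helper name and queue-limit plumbing are illustrative:

    #include <linux/blkdev.h>
    #include <linux/dma-mapping.h>

    /* Keep per-request DMA mappings within the size the IOMMU can map cheaply. */
    static unsigned int example_optimal_max_sectors(struct device *dma_dev,
                                                    unsigned int hw_max_sectors)
    {
            size_t opt = dma_opt_mapping_size(dma_dev);

            return min_t(unsigned int, hw_max_sectors, opt >> SECTOR_SHIFT);
    }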
-
- 18 Jul, 2022 5 commits
-
-
Christoph Hellwig authored
No need to expose this structure definition in the header.
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Chao Gao authored
Free slots tracking assumes that slots in a segment can be allocated to fulfill a request. This implies that slots in a segment should belong to the same area. Although the possibility of a violation is low, it is better to explicitly enforce that segments won't span multiple areas by adjusting the number of slabs when configuring areas.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
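One way to guarantee that invariant is to round the slab count to a multiple of the segment size times the area count; a sketch of the arithmetic only, using the usual IO_TLB_SEGSIZE of 128 slots (the helper name and exact rounding used by the patch are assumptions):

    #include <linux/kernel.h>
    #include <linux/swiotlb.h>      /* IO_TLB_SEGSIZE */

    /* Round nslabs so that each area holds a whole number of segments. */
    static unsigned long example_adjust_nslabs(unsigned long nslabs, unsigned int nareas)
    {
            return roundup(nslabs, (unsigned long)IO_TLB_SEGSIZE * nareas);
    }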
-
Chao Gao authored
default_nslabs is rounded up in two places with exactly the same comment. Add a simple wrapper to reduce the duplicated code/comments. This is preparatory to adding more logic to the round-up. No functional change intended.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Chao Gao authored
Commit 20347fca ("swiotlb: split up the global swiotlb lock") split io_tlb_mem into multiple areas. Each area has its own lock and index. The global ones are no longer used, so remove them.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Dan Carpenter authored
Don't dereference "mem" after it has been freed. Flip the two kfree()s around to address this bug.
Fixes: 26ffb91fa5e0 ("swiotlb: split up the global swiotlb lock")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
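The pattern being fixed is the classic ordering bug below; the structure and field names are illustrative, not the exact swiotlb ones:

    #include <linux/slab.h>

    struct example_pool {
            void *slots;    /* illustrative member that is separately allocated */
    };

    static void example_free_pool(struct example_pool *mem)
    {
            /* Free the members first, then the containing structure: */
            /* reversing these two calls would dereference freed memory. */
            kfree(mem->slots);
            kfree(mem);
    }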
-
- 13 Jul, 2022 1 commit
-
-
Tianyu Lan authored
Traditionally swiotlb was not performance critical because it was only used for slow devices. But in some setups, like TDX/SEV confidential guests, all IO has to go through swiotlb. Currently swiotlb only has a single lock, and under high IO load with multiple CPUs this can lead to significant lock contention on the swiotlb lock. This patch splits the swiotlb bounce buffer pool into individual areas which have their own lock. Each CPU tries to allocate in its own area first; only if that fails does it search other areas. On freeing, the allocation is returned to the area the memory was originally allocated from. The number of areas can be set via the swiotlb kernel parameter and defaults to the number of possible CPUs. If the possible CPU count is not a power of 2, the area number is rounded up to the next power of 2. This idea comes from an Andi Kleen patch (https://github.com/intel/tdx/commit/4529b5784c141782c72ec9bd9a92df2b68cb7d45).
Based-on-idea-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
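As a concrete example of the tunable described above, the area count rides on the existing swiotlb= boot parameter as a second comma-separated value; the slab count shown is just the usual 64 MiB default, and treating the second field as the area count is based on this changelog:

    swiotlb=32768,4

i.e. a 32768-slab bounce buffer pool split into 4 areas, each with its own lock.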
-
- 12 Jul, 2022 1 commit
-
-
Robin Murphy authored
In the failure case of trying to use a buffer which we'd previously failed to allocate, the "!mem" condition is no longer sufficient since io_tlb_default_mem became static and assigned by default. Update the condition to work as intended per the rest of that conversion.
Fixes: 463e862a ("swiotlb: Convert io_default_tlb_mem to static allocation")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 07 Jul, 2022 10 commits
-
-
Robin Murphy authored
The dma_sync_* operations are now the only difference between the coherent and non-coherent IOMMU ops. With some minor tweaks to make those safe for coherent devices at minimal overhead, we can condense down to a single set of DMA ops.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Marc Zyngier <maz@kernel.org>
-
Robin Murphy authored
Merge the coherent and non-coherent callbacks down to a single implementation each, relying on the generic dev->dma_coherent flag at the points where the difference matters.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Marc Zyngier <maz@kernel.org>
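A sketch of the shape a merged sync callback can take, using the generic coherency helper; this is a minimal illustration, not the exact ARM code:

    #include <linux/dma-map-ops.h>
    #include <linux/dma-direction.h>

    /* In a merged sync path, coherent devices simply skip the cache maintenance. */
    static void example_sync_for_cpu(struct device *dev, phys_addr_t phys, size_t size,
                                     enum dma_data_direction dir)
    {
            if (dev_is_dma_coherent(dev))
                    return;
            arch_sync_dma_for_cpu(phys, size, dir);
    }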
-
Robin Murphy authored
When an IOMMU is present, we trust that it should be capable of remapping any physical memory, and since the device masks represent the input (virtual) addresses to the IOMMU it makes no sense to validate them against physical PFNs anyway.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Marc Zyngier <maz@kernel.org>
-
Christoph Hellwig authored
Use dma-direct unconditionally on arm. It has already been used for some time for LPAE and nommu configurations. This mostly changes the streaming mapping implementation and the (simple) coherent allocator for devices that are DMA coherent. The existing complex allocator for uncached mappings for non-coherent devices is still used, as is, via the arch_dma_alloc/arch_dma_free hooks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Andre Przywara <andre.przywara@arm.com> [highbank]
Tested-by: Marc Zyngier <maz@kernel.org>
-
Christoph Hellwig authored
Only the footbridge platforms provide their own DMA address translation helpers, so switch to the generic version for all other platforms, and consolidate the footbridge implementation to remove two levels of indirection.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Marc Zyngier <maz@kernel.org>
-
Christoph Hellwig authored
Use the helpers as expected by the dma-direct code in the old arm dma-mapping code to ease a gradual switch to the common DMA code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Marc Zyngier <maz@kernel.org>
-
Christoph Hellwig authored
virt_to_dma was only used by the now removed dmabounce code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Marc Zyngier <maz@kernel.org>
-
Christoph Hellwig authored
With the dmabounce removal these aren't used outside of dma-mapping.c, so mark them static. Move the dma_map_ops declarations down a bit to avoid lots of forward declarations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Marc Zyngier <maz@kernel.org>
-
Christoph Hellwig authored
Remove the now unused dmabounce code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
-
Arnd Bergmann authored
The sa1111 platform is one of the two remaining users of the old Arm specific "dmabounce" code, which is an earlier implementation of the generic swiotlb. Linus Walleij submitted a patch that removes dmabounce support from the ixp4xx, and I had a look at the other user, which is the sa1111 companion chip. Looking at how dmabounce is used, I could narrow it down to one driver on three machines:
- dmabounce is only initialized on assabet/neponset, jornada720 and badge4, which are the platforms that have an sa1111 and support DMA on it.
- All three of these suffer from "erratum #7" that requires only doing DMA to half the memory sections based on one of the address lines; in addition, the neponset also can't DMA to the RAM that is connected to the sa1111 itself.
- The pxa lubbock machine also has an sa1111, but does not support DMA on it and does not set dmabounce.
- Only the OHCI and audio devices on the sa1111 support DMA, but as there is no audio driver for this hardware, only OHCI remains.
In the OHCI code, I noticed that two other platforms already have local bounce buffer support in the form of the "local_mem" allocator. Specifically, TMIO and SM501 use this on a few other ARM boards with 16KB or 128KB of local SRAM that can be accessed from the OHCI and from the CPU. While this is not the same problem as on the sa1111, I could not find a reason why we can't re-use the existing implementation but replace the physical SRAM address mapping with a locally allocated DMA buffer. There are two main downsides:
- Rather than using a dynamically sized pool, this buffer needs to be allocated at probe time using a fixed size. Without having any idea of what it should be, I picked a size of 64KB, which is between what the other two OHCI front-ends use in their SRAM. If anyone has a better idea of what size is reasonable, this can be trivially changed.
- Previously, only USB transfers to unaddressable memory needed to go through the bounce buffer; now all of them do, which may impact runtime performance for USB endpoints that do a lot of transfers. On the upside, the local_mem support uses write-combining buffers, which should be a bit faster for transfers to the device compared to the normal uncached coherent memory used in dmabounce.
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: linux-usb@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Christoph Hellwig <hch@lst.de>
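A rough sketch of the local_mem approach described above: give the OHCI HCD a fixed 64 KiB pool to bounce all transfers through. The usb_hcd_setup_local_mem() helper is the existing local_mem entry point used by the other OHCI front-ends; the zero physical/DMA addresses for a dynamically allocated pool are an assumption for illustration:

    #include <linux/usb/hcd.h>
    #include <linux/sizes.h>

    /* Set up a fixed-size local bounce pool for the OHCI controller. */
    static int example_setup_ohci_pool(struct usb_hcd *hcd)
    {
            /* 0/0: no fixed SRAM mapping; back the pool with allocated memory. */
            return usb_hcd_setup_local_mem(hcd, 0, 0, SZ_64K);
    }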
-
- 22 Jun, 2022 4 commits
-
-
Dongli Zhang authored
Panic on purpose if nslabs is too small, in order to sync with the remap retry logic. In addition, print the number of bytes for tlb alloc failure.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Dongli Zhang authored
Fix the usage of the swiotlb param in the kernel doc.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Dongli Zhang authored
Both swiotlb_init_remap() and swiotlb_init() have return type void.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Dongli Zhang authored
'swiotlb_force' has been removed since commit c6af2aa9 ("swiotlb: make the swiotlb_init interface more useful").
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 19 Jun, 2022 6 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
Merge tag 'x86-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
- Make RESERVE_BRK() work again with older binutils. The recent 'simplification' broke that.
- Make early #VE handling increment RIP when successful.
- Make the #VE code consistent vs. the RIP adjustments and add comments.
- Handle load_unaligned_zeropad() across page boundaries correctly in #VE when the second page is shared.
* tag 'x86-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  x86/tdx: Clarify RIP adjustments in #VE handler
  x86/tdx: Fix early #VE handling
  x86/mm: Fix RESERVE_BRK() for older binutils
-
Linus Torvalds authored
Merge tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull build tooling updates from Thomas Gleixner:
- Remove obsolete CONFIG_X86_SMAP reference from objtool
- Fix overlapping text section failures in faddr2line for real
- Remove OBJECT_FILES_NON_STANDARD usage from x86 ftrace and replace it with fine-grained annotations so objtool can validate that code correctly.
* tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ftrace: Remove OBJECT_FILES_NON_STANDARD usage
  faddr2line: Fix overlapping text section failures, the sequel
  objtool: Fix obsolete reference to CONFIG_X86_SMAP
-
Linus Torvalds authored
Merge tag 'sched-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
"A single scheduler fix plugging a race between sched_setscheduler() and balance_push(). sched_setscheduler() spliced the balance callbacks across a lock break which makes it possible for an interleaving schedule() to observe an empty list."
* tag 'sched-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix balance_push() vs __sched_setscheduler()
-
Linus Torvalds authored
Merge tag 'locking-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull lockdep fix from Thomas Gleixner:
"A RT fix for lockdep. lockdep invokes prandom_u32() to create cookies. This worked until prandom_u32() was switched to the real random generator, which takes a spinlock for extraction, which does not work on RT when invoked from atomic contexts. lockdep has no requirement for real random numbers and it turns out sched_clock() is good enough to create the cookie. That works everywhere and is faster."
* tag 'locking-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Use sched_clock() for random numbers
-
Linus Torvalds authored
Merge tag 'irq-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A set of interrupt subsystem updates:
Core:
- Ensure runtime power management for chained interrupts
Drivers:
- A collection of OF node refcount fixes
- Unbreak MIPS uniprocessor builds
- Fix xilinx interrupt controller Kconfig dependencies
- Add a missing compatible string to the Uniphier driver"
* tag 'irq-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/loongson-liointc: Use architecture register to get coreid
  irqchip/uniphier-aidet: Add compatible string for NX1 SoC
  dt-bindings: interrupt-controller/uniphier-aidet: Add bindings for NX1 SoC
  irqchip/realtek-rtl: Fix refcount leak in map_interrupts
  irqchip/gic-v3: Fix refcount leak in gic_populate_ppi_partitions
  irqchip/gic-v3: Fix error handling in gic_populate_ppi_partitions
  irqchip/apple-aic: Fix refcount leak in aic_of_ic_init
  irqchip/apple-aic: Fix refcount leak in build_fiq_affinity
  irqchip/gic/realview: Fix refcount leak in realview_gic_of_init
  irqchip/xilinx: Remove microblaze+zynq dependency
  genirq: PM: Use runtime PM for chained interrupts
-