Commit 79e68821 authored by Adrian Chadd, committed by Kalle Valo

ath10k: go back to using dma_alloc_coherent() for firmware scratch memory

This reverts commit b0578865 ("ath10k: do not use coherent memory for
allocated device memory chunks") from 2015, which converted this allocation
from dma_alloc_coherent() to kzalloc() / dma_map_single().

The current problem manifests when using later-model NICs with larger
(>700KiB) scratch spaces in memory.  Although the kzalloc() call
succeeds, the software IO TLB (swiotlb) code reached via dma_map_single()
panics because it can't find 700KiB of physically contiguous bounce
buffers for DMA.  Now, this is a bit of a silly failure mode for the
dma map API, but it's what we currently have to play with.
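
To put the 700KiB figure in perspective, here is a minimal userspace sketch of the size limit on a single bounce-buffer mapping. The two constants are assumptions modelled on IO_TLB_SHIFT and IO_TLB_SEGSIZE from include/linux/swiotlb.h and may differ between kernel versions; the sketch_ names are hypothetical and are not part of ath10k or the DMA API.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * ASSUMED values, modelled on IO_TLB_SHIFT and IO_TLB_SEGSIZE in
 * include/linux/swiotlb.h. Check them against your own kernel tree.
 */
#define SKETCH_IO_TLB_SHIFT   11   /* each bounce slot is 2 KiB */
#define SKETCH_IO_TLB_SEGSIZE 128  /* max contiguous slots per mapping */

/* Largest buffer that one bounce mapping could cover under the assumptions. */
static size_t sketch_swiotlb_max_mapping(void)
{
	return (size_t)SKETCH_IO_TLB_SEGSIZE << SKETCH_IO_TLB_SHIFT;
}

/* True if a buffer of len bytes could be bounced as one contiguous segment. */
static bool sketch_fits_in_one_bounce_segment(size_t len)
{
	return len <= sketch_swiotlb_max_mapping();
}
```

Under these assumed values a single mapping tops out at 256KiB, so a 700KiB request cannot be bounced in one piece regardless of how much memory is free.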

In these cases, doing kzalloc() works fine, but the dma_map_single()
call fails.

After chatting with Linus briefly about this, the driver should indeed
be using dma_alloc_coherent() for larger device memory allocations
that require some kind of physical address mapping.

You're not supposed to be using kzalloc and dma_map_* calls for
large memory regions, and I'm guessing not for long-held mappings
either.  Typically dma mappings should be temporary for DMA,
not long held like these.

Now, since hopefully the major annoying underlying problem has also been
addressed (ie, ath10k no longer tears down all of these allocations
and reallocates them every time the vdevs are brought down), fragmentation
should stop being such a touchy issue.  If it is though, using
dma_alloc_coherent() gets us access to the CMA APIs too relatively
easily, and ideally we would be allocating memory early in boot for
exactly these reasons.

Signed-off-by: Adrian Chadd <adrian@FreeBSD.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
parent d86d4716
@@ -4482,31 +4482,17 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
 				  u32 num_units, u32 unit_len)
 {
 	dma_addr_t paddr;
-	u32 pool_size = 0;
+	u32 pool_size;
 	int idx = ar->wmi.num_mem_chunks;
-	void *vaddr = NULL;
+	void *vaddr;
 
-	if (ar->wmi.num_mem_chunks == ARRAY_SIZE(ar->wmi.mem_chunks))
-		return -ENOMEM;
-
-	while (!vaddr && num_units) {
-		pool_size = num_units * round_up(unit_len, 4);
-		if (!pool_size)
-			return -EINVAL;
-
-		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
-		if (!vaddr)
-			num_units /= 2;
-	}
-
-	if (!num_units)
+	pool_size = num_units * round_up(unit_len, 4);
+	vaddr = dma_alloc_coherent(ar->dev, pool_size, &paddr, GFP_KERNEL);
+
+	if (!vaddr)
 		return -ENOMEM;
 
-	paddr = dma_map_single(ar->dev, vaddr, pool_size, DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(ar->dev, paddr)) {
-		kfree(vaddr);
-		return -ENOMEM;
-	}
+	memset(vaddr, 0, pool_size);
 
 	ar->wmi.mem_chunks[idx].vaddr = vaddr;
 	ar->wmi.mem_chunks[idx].paddr = paddr;
@@ -8281,11 +8267,10 @@ void ath10k_wmi_free_host_mem(struct ath10k *ar)
 
 	/* free the host memory chunks requested by firmware */
 	for (i = 0; i < ar->wmi.num_mem_chunks; i++) {
-		dma_unmap_single(ar->dev,
-				 ar->wmi.mem_chunks[i].paddr,
-				 ar->wmi.mem_chunks[i].len,
-				 DMA_BIDIRECTIONAL);
-		kfree(ar->wmi.mem_chunks[i].vaddr);
+		dma_free_coherent(ar->dev,
+				  ar->wmi.mem_chunks[i].len,
+				  ar->wmi.mem_chunks[i].vaddr,
+				  ar->wmi.mem_chunks[i].paddr);
 	}
 
 	ar->wmi.num_mem_chunks = 0;
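
The pool size arithmetic in ath10k_wmi_alloc_chunk() can be checked in isolation. What follows is a userspace sketch only: sketch_round_up() is a hypothetical stand-in for the kernel's round_up() macro (power-of-two alignment assumed), and uint32_t stands in for u32.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's round_up().
 * Only valid when align is a power of two.
 */
static uint32_t sketch_round_up(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Mirrors: pool_size = num_units * round_up(unit_len, 4); */
static uint32_t sketch_pool_size(uint32_t num_units, uint32_t unit_len)
{
	return num_units * sketch_round_up(unit_len, 4);
}
```

For example, 3 units of 5 bytes each come to 3 * 8 = 24 bytes, because each unit is padded up to a 4-byte boundary before the multiplication.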