arm64: dts: qcom: sc8280xp: fix PCIe DMA coherency
The devices on the SC8280XP PCIe buses are cache coherent and must be marked as such to avoid data corruption. A coherent device can, for example, end up snooping stale data from the caches instead of using data written by the CPU through the non-cacheable mapping which is used for consistent DMA buffers for non-coherent devices. Note that this is much more likely to happen since commit c44094ee ("arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()") that was added in 6.1 and which removed the cache invalidation when setting up the non-cacheable mapping. Marking the PCIe devices as coherent specifically fixes the intermittent NVMe probe failures observed on the Thinkpad X13s, which was due to corruption of the submission and completion queues. This was typically observed as corruption of the admin submission queue (with well-formed completion): could not locate request for tag 0x0 nvme nvme0: invalid id 0 completed on queue 0 or corruption of the admin or I/O completion queues (malformed completion): could not locate request for tag 0x45f nvme nvme0: invalid id 25695 completed on queue 25965 presumably as these queues are small enough to not be allocated using CMA which in turn make them more likely to be cached (e.g. due to accesses to nearby pages through the cacheable linear map). Increasing the buffer sizes to two pages to force CMA allocation also appears to make the problem go away. Fixes: 813e8315 ("arm64: dts: qcom: sc8280xp/sa8540p: add PCIe2-4 nodes") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221124142501.29314-1-johan+linaro@kernel.org
Showing
Please register or sign in to comment