    iommu: arm-smmu-impl: Add sdm845 implementation hook · 759aaa10
    Vivek Gautam authored
    Add reset hook for sdm845 based platforms to turn off
    the wait-for-safe sequence.
    
    Understanding how wait-for-safe logic affects USB and UFS performance
    on MTP845 and DB845 boards:
    
    Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic
    to address under-performance issues in real-time clients, such as
    display and camera.
    On receiving an invalidation request, the SMMU forwards a SAFE request
    to these clients and waits for the SAFE ack signal from them.
    The SAFE signal from such clients is used to qualify the start of
    invalidation.
    This logic is controlled by chicken bits, one each for MDP (display),
    IFE0, and IFE1 (camera), which can be accessed only from secure software
    on sdm845.
    
    This configuration, however, degrades the performance of non-real-time
    clients, such as USB and UFS. This happens because, with wait-for-safe
    logic enabled, the hardware tries to throttle non-real-time clients while
    waiting for SAFE ack signals from real-time clients.
    
    On mtp845 and db845 devices, with wait-for-safe logic enabled by the
    bootloaders, we see degraded performance of USB and UFS when the kernel
    enables the SMMU stage-1 translations for these clients.
    Turning off this wait-for-safe logic from the kernel restores the
    performance of USB and UFS devices. This can be revisited if performance
    issues show up on display/camera on upstream-supported SDM845 platforms.
    The bootloaders on these boards implement secure monitor callbacks to
    handle a specific command, QCOM_SCM_SVC_SMMU_PROGRAM, with which the
    logic can be toggled.
    
    There are other boards such as cheza whose bootloaders don't enable this
    logic. Such boards don't implement callbacks to handle the specific SCM
    call, so disabling this logic on such boards is a no-op.
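
The behaviour described above can be sketched in user-space C. This is an illustrative mock under stated assumptions, not the kernel code: `struct mock_smmu`, `mock_scm_wait_safe_toggle()` and `sdm845_reset_sketch()` are hypothetical names, the `fw_has_handler` flag is an assumed way to model firmware that lacks the SCM handler (as on cheza), and the `-1` error value is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of one board: firmware capability plus logic state. */
struct mock_smmu {
	bool fw_has_handler;    /* firmware implements the SCM toggle handler */
	bool wait_safe_enabled; /* state left behind by the bootloader */
};

/*
 * Stand-in for the secure monitor call. On firmware without a handler the
 * request fails and the modelled hardware state is left untouched.
 */
static int mock_scm_wait_safe_toggle(struct mock_smmu *smmu, bool enable)
{
	if (!smmu->fw_has_handler)
		return -1; /* assumed error code, not a real SCM return value */

	smmu->wait_safe_enabled = enable;
	return 0;
}

/* Sketch of a reset hook that asks firmware to turn wait-for-safe off. */
static int sdm845_reset_sketch(struct mock_smmu *smmu)
{
	int ret;

	assert(smmu != NULL);

	ret = mock_scm_wait_safe_toggle(smmu, false);
	if (ret)
		fprintf(stderr, "failed to turn off wait-for-safe logic\n");

	return ret;
}
```

In this model, a board whose bootloader enabled the logic and provides the handler ends up with `wait_safe_enabled == false`, while a board without the handler keeps whatever state it had, which is the no-op case. How the failure is reported to the caller is a choice of this mock only.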
    
    This change is inspired by the downstream change from Patrick Daly
    to address performance issues with display and camera by handling
    this wait-for-safe within separate io-pagetable ops to do TLB
    maintenance. So a big thanks to him for the change and for all the
    offline discussions.
    
    Without this change the UFS reads are pretty slow:
    $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
    10+0 records in
    10+0 records out
    10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
    real    0m 22.39s
    user    0m 0.00s
    sys     0m 0.01s
    
    With this change they are back to rock!
    $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
    300+0 records in
    300+0 records out
    314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
    real    0m 1.03s
    user    0m 0.00s
    sys     0m 0.54s
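
The two dd runs copy different amounts of data (10 MB and 300 MB), so the fair comparison is between the reported rates. A small sketch of that arithmetic follows, using only the byte counts and times printed above. The helper names are hypothetical, and the 1024-based units are inferred from the fact that they reproduce dd's reported 457.2KB/s and 291.1MB/s.

```c
#include <assert.h>
#include <stdio.h>

/* Throughput in bytes per second for `bytes` copied in `seconds`. */
static double throughput_bps(double bytes, double seconds)
{
	assert(seconds > 0.0);
	return bytes / seconds;
}

/* Ratio of the "after" throughput to the "before" throughput. */
static double speedup(double before_bps, double after_bps)
{
	return after_bps / before_bps;
}

/* Print both rates with KB = 1024 B and MB = 1024 * 1024 B. */
static void print_summary(void)
{
	double before = throughput_bps(10485760.0, 22.394903);
	double after = throughput_bps(314572800.0, 1.030541);

	printf("before:  %.1f KB/s\n", before / 1024.0);
	printf("after:   %.1f MB/s\n", after / (1024.0 * 1024.0));
	printf("speedup: %.0fx\n", speedup(before, after));
}
```

Calling `print_summary()` shows the read rate improving by a factor of several hundred on this single measurement.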
    Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
    Reviewed-by: Robin Murphy <robin.murphy@arm.com>
    Reviewed-by: Stephen Boyd <swboyd@chromium.org>
    Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
    Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
    Signed-off-by: Will Deacon <will@kernel.org>
arm-smmu-impl.c 4.39 KB