    drm/nouveau/instmem/gk20a: use direct CPU access · 69c49382
    Alexandre Courbot authored
    The Great Nouveau Refactoring Take II brought us a lot of goodness,
    including acquire/release methods that are called before and after an
    instobj is modified. These functions can be used as synchronization
    points to manage CPU/GPU coherency if we modify an instobj using the
    CPU.
    
    This patch replaces the legacy and slow PRAMIN access for gk20a instmem
    with CPU mappings and writes. An LRU list is used to unmap unused
    mappings after a certain threshold (currently 1MB) of mapped instobjs is
    reached. This allows mappings to be reused most of the time.
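
The LRU scheme described above can be sketched in plain user-space C as follows. This is a minimal illustration, not the driver's code: `struct obj`, `struct cache`, `obj_acquire()`, `obj_release()` and `cache_gc()` are hypothetical names, `malloc()`/`free()` stand in for creating and destroying a CPU mapping, and only the 1MB threshold comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAP_THRESHOLD (1024 * 1024) /* 1MB, the threshold quoted in the text */

/* Hypothetical stand-in for an instobj; not the driver's real structure. */
struct obj {
    size_t size;
    void *vaddr;             /* NULL while unmapped */
    int use_cnt;             /* > 0 between acquire and release */
    int on_lru;              /* set while linked on the LRU list */
    struct obj *prev, *next;
};

struct cache {
    struct obj *head, *tail; /* head is the least recently used mapping */
    size_t mapped_bytes;
};

static void lru_del(struct cache *c, struct obj *o)
{
    if (o->prev) o->prev->next = o->next; else c->head = o->next;
    if (o->next) o->next->prev = o->prev; else c->tail = o->prev;
    o->prev = o->next = NULL;
    o->on_lru = 0;
}

static void lru_add_tail(struct cache *c, struct obj *o)
{
    o->prev = c->tail;
    o->next = NULL;
    if (c->tail) c->tail->next = o; else c->head = o;
    c->tail = o;
    o->on_lru = 1;
}

/* Unmap idle objects, oldest first, until `need` more bytes fit. */
static void cache_gc(struct cache *c, size_t need)
{
    while (c->mapped_bytes + need > MAP_THRESHOLD && c->head) {
        struct obj *victim = c->head;

        lru_del(c, victim);
        free(victim->vaddr);          /* stand-in for unmapping */
        victim->vaddr = NULL;
        c->mapped_bytes -= victim->size;
    }
}

void *obj_acquire(struct cache *c, struct obj *o)
{
    if (o->vaddr) {
        /* Mapping still alive: reuse it and protect it from eviction. */
        if (o->on_lru)
            lru_del(c, o);
    } else {
        cache_gc(c, o->size);
        o->vaddr = malloc(o->size);   /* stand-in for mapping */
        if (!o->vaddr)
            return NULL;
        c->mapped_bytes += o->size;
    }
    o->use_cnt++;
    return o->vaddr;
}

void obj_release(struct cache *c, struct obj *o)
{
    assert(o->use_cnt > 0);
    /* Keep the mapping, but make it eligible for eviction once idle. */
    if (--o->use_cnt == 0)
        lru_add_tail(c, o);
}
```

In this sketch only objects that nobody currently holds sit on the LRU list, so an object in use is never unmapped; if every mapping is in use, the threshold is simply exceeded rather than failing the acquire.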
    
    Accessing instobjs using the CPU requires maintaining the GPU L2 cache,
    which we do in the acquire/release functions. This triggers a lot of L2
    flushes/invalidates, but most of them are performed on an empty cache
    (and thus return immediately), and overall context setup performance
    greatly benefits from this (from 250ms to 160ms on Jetson TK1 for a
    simple libdrm program).
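
To illustrate why L2 maintenance at the acquire/release points keeps the CPU and GPU views coherent, here is a toy user-space model. Everything in it is an assumption made for illustration: `struct model`, `gpu_write()`, `gpu_read()`, `l2_flush()`, `l2_invalidate()`, `cpu_acquire()` and `cpu_release()` are hypothetical names, and the choice of flushing on acquire and invalidating on release is one plausible arrangement, not something the text above specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NWORDS 16

/* Toy model: "vram" is backing memory, "l2" is a write-back GPU cache. */
struct model {
    uint32_t vram[NWORDS];
    uint32_t l2[NWORDS];
    bool valid[NWORDS];
    bool dirty[NWORDS];
    unsigned flush_work;  /* lines actually written back by flushes */
};

/* GPU accesses go through L2. */
void gpu_write(struct model *m, int i, uint32_t v)
{
    assert(i >= 0 && i < NWORDS);
    m->l2[i] = v;
    m->valid[i] = true;
    m->dirty[i] = true;
}

uint32_t gpu_read(struct model *m, int i)
{
    assert(i >= 0 && i < NWORDS);
    if (!m->valid[i]) {           /* miss: fill the line from memory */
        m->l2[i] = m->vram[i];
        m->valid[i] = true;
        m->dirty[i] = false;
    }
    return m->l2[i];
}

/* Write dirty lines back to memory; does no work on a clean cache. */
void l2_flush(struct model *m)
{
    for (int i = 0; i < NWORDS; i++) {
        if (m->valid[i] && m->dirty[i]) {
            m->vram[i] = m->l2[i];
            m->dirty[i] = false;
            m->flush_work++;
        }
    }
}

/* Drop every line so the next GPU read refetches from memory. */
void l2_invalidate(struct model *m)
{
    for (int i = 0; i < NWORDS; i++)
        m->valid[i] = m->dirty[i] = false;
}

/* CPU access bypasses L2, so flush first to see the GPU's latest writes. */
uint32_t *cpu_acquire(struct model *m)
{
    l2_flush(m);
    return m->vram;
}

/*
 * After CPU writes, invalidate so the GPU does not read stale lines.
 * Real hardware would also need a write barrier before this point so
 * that the CPU stores have reached memory; the toy model has no
 * reordering, so none is shown.
 */
void cpu_release(struct model *m)
{
    l2_invalidate(m);
}
```

In this model a value written by the GPU is invisible to the CPU until `cpu_acquire()` flushes it, and a value written by the CPU is invisible to the GPU until `cpu_release()` invalidates the stale line. A flush that finds no dirty lines leaves `flush_work` unchanged, which mirrors the observation that most maintenance operations hit an empty cache.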
    
    Making L2 management more explicit should allow us to grab some more
    performance in the future.
    
    Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gk20a.c 16.5 KB