• Alex Sierra's avatar
    drm/amdkfd: rm BO resv on validation to avoid deadlock · ec6abe83
    Alex Sierra authored
    This fix the deadlock with the BO reservations during SVM_BO evictions
    while allocations in VRAM are concurrently performed. More specific,
    while the ttm waits for the fence to be signaled (ttm_bo_wait), it
    already has the BO reserved. In parallel, the restore worker might be
    running, prefetching memory to VRAM. This also requires to reserve the
    BO, but blocks the mmap semaphore first. The deadlock happens when the
    SVM_BO eviction worker kicks in and waits for the mmap semaphore held
    in restore worker. Preventing signal the fence back, causing the
    deadlock until the ttm times out.
    
    We don't need to hold the BO reservation anymore during validation
    and mapping. Now the physical addresses are taken from hmm_range_fault.
    We also take migrate_mutex to prevent range migration while
    validate_and_map update GPU page table.
    Signed-off-by: default avatarAlex Sierra <alex.sierra@amd.com>
    Signed-off-by: default avatarFelix Kuehling <Felix.Kuehling@amd.com>
    Reviewed-by: default avatarPhilip Yang <philip.yang@amd.com>
    Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
    ec6abe83
kfd_svm.c 89.1 KB