    drm/amdgpu: stop resubmitting jobs for GPU reset v2 · 6868a2c4
    Christian König authored
    Re-submitting IBs from the kernel has many problems because the
    prerequisite state is not automatically re-created as well. In
    other words, neither binary semaphores nor things like ring
    buffer pointers are in the state they should be in when the
    hardware starts to work on the IBs again.
    
    In addition, even after more than 5 years of developing this
    feature it is still not stable, and we have massive problems
    getting the reference counts right.
    
    As discussed with userspace developers, this behavior is not
    helpful in the first place. For graphics and multimedia
    workloads it makes much more sense to either completely
    re-create the context or at least re-submit the IBs
    from userspace.
    
    For compute use cases re-submitting is also not very
    helpful since userspace must rely on the accuracy of
    the result.
    
    Because of this we stop this practice and instead just
    properly note that the fence submission was canceled.
    Re-submission is kept for now only for SRIOV and function
    level resets.
    
    v2: as suggested by Sshaoyun stop resubmitting jobs even for SRIOV
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
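
The behavior described in the commit message (mark the fence as canceled instead of re-queuing the job) can be illustrated with a small userspace sketch. This is not the code from the patch: `struct sketch_fence`, `struct sketch_job` and every `sketch_*` function are hypothetical stand-ins invented for this example, and the choice of `-ECANCELED` as the error code is an assumption. In the kernel, an error is recorded on a real fence with `dma_fence_set_error()` before the fence is signaled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence; NOT the kernel's struct dma_fence. */
struct sketch_fence {
	bool signaled;	/* true once the fence has completed */
	int error;	/* 0 on success, negative errno otherwise */
};

/* Hypothetical stand-in for a job that was handed to the hardware. */
struct sketch_job {
	struct sketch_fence fence;
};

/* Record an error on a fence; only meaningful before it has signaled. */
static void sketch_fence_set_error(struct sketch_fence *f, int error)
{
	if (!f->signaled)
		f->error = error;
}

/* Mark the fence as completed so that waiters stop waiting. */
static void sketch_fence_signal(struct sketch_fence *f)
{
	f->signaled = true;
}

/*
 * Called after a (simulated) GPU reset. Jobs whose fences already
 * signaled are left untouched. Every other job is NOT re-queued:
 * its fence gets -ECANCELED (assumed error code) and is signaled.
 * Returns the number of jobs that were canceled.
 */
static size_t sketch_cancel_pending(struct sketch_job *jobs, size_t count)
{
	size_t canceled = 0;

	assert(jobs != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		struct sketch_fence *f = &jobs[i].fence;

		if (f->signaled)
			continue;

		sketch_fence_set_error(f, -ECANCELED);
		sketch_fence_signal(f);
		canceled++;
	}

	return canceled;
}
```

In this model a waiter that finds a signaled fence with a non-zero error knows the work never completed, and can then re-create its context or re-submit from userspace, as the commit message suggests.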
amdgpu_device.c 161 KB