    drm/i915/guc: Fix the fix for reset lock confusion · 3563d855
    John Harrison authored
    The previous fix for the circular lock splat about the busyness
    worker wasn't quite complete. Even though the reset-in-progress flag
    is cleared at the start of intel_uc_reset_finish, the entire function
    is still inside the reset mutex lock. It is not clear why the patch
    appeared to fix the issue both locally and in CI, but the splat is
    now back again.
    
    There is a further complication that the wedge code path within
    intel_gt_reset() jumps around so much that it results in nested
    reset_prepare/_finish calls. That is, the call sequence is:
      intel_gt_reset
      | reset_prepare
      | __intel_gt_set_wedged
      | | reset_prepare
      | | reset_finish
      | reset_finish
    
    The nested finish means that even if the clear of the in-progress flag
    was moved to the end of _finish, it would still be clear for the
    entire second call. Surprisingly, this does not seem to be causing any
    other problems at present.
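
    The nesting problem can be illustrated with a small user-space model.
    This is only a sketch: the model_* names and the plain bool are
    hypothetical stand-ins, not the driver's actual functions or state.
    It assumes the variant discussed above, where the flag is cleared at
    the very end of the finish call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the reset-in-progress flag. */
static bool reset_in_progress;

static void model_reset_prepare(void)
{
	reset_in_progress = true;
}

/*
 * Reports the flag value seen on entry, then clears the flag at the
 * very end, as in the "move the clear to the end of _finish" idea.
 */
static bool model_reset_finish(void)
{
	bool seen_on_entry = reset_in_progress;

	/* ... finish work would run here ... */
	reset_in_progress = false;
	return seen_on_entry;
}

/* Replays the nested call sequence and returns what the outer finish saw. */
static bool model_nested_reset(void)
{
	bool outer_seen;

	model_reset_prepare();             /* intel_gt_reset: reset_prepare */
	model_reset_prepare();             /* wedge path: reset_prepare */
	(void)model_reset_finish();        /* wedge path: reset_finish */
	outer_seen = model_reset_finish(); /* intel_gt_reset: reset_finish */

	return outer_seen;
}
```

    In this model, model_nested_reset() returns false: the inner finish
    has already cleared the flag, so the outer finish runs its entire
    body with the flag clear.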
    
    As an aside, a wedge on fini does not call the finish functions at
    all. The reset_in_progress flag is left set (twice).
    
    So instead of trying to cancel the worker anywhere at all in the reset
    path, just add a cancel to intel_guc_submission_fini instead. Note
    that it is not a problem if the worker is still active during a reset.
    Either it runs before the reset path starts locking things and
    simply blocks the reset code for a tiny amount of time, or it runs
    after the locks have been acquired and exits early due to the
    try-lock.
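
    The try-lock behaviour described above can be sketched in user space
    as follows. This is an illustration under assumptions, not the driver
    code: a C11 atomic_flag stands in for the kernel reset lock, and all
    model_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the reset lock. */
static atomic_flag model_reset_lock = ATOMIC_FLAG_INIT;

/* Counts how many times the worker actually did its update. */
static int model_updates;

/* Returns true if the lock was acquired, false if it was already held. */
static bool model_trylock(void)
{
	return !atomic_flag_test_and_set(&model_reset_lock);
}

static void model_unlock(void)
{
	atomic_flag_clear(&model_reset_lock);
}

/*
 * Model of the worker: if the reset path already holds the lock, exit
 * early without touching anything. Otherwise do the update while
 * holding the lock, which briefly delays any reset that starts now.
 */
static bool model_busyness_worker(void)
{
	if (!model_trylock())
		return false;

	model_updates++;
	model_unlock();
	return true;
}
```

    Because the worker never waits for the lock, it cannot deadlock
    against the reset path in this model. The worst case is a skipped
    update.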
    
    Also, do not use the reset-in-progress flag to decide whether a
    synchronous cancel is safe (from a lockdep perspective) or not.
    Instead, use the actual reset mutex state (both the genuine one and
    the custom-rolled BACKOFF one).
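
    As a rough sketch of that decision rule, assuming the two lock states
    can be sampled as booleans (the struct and function names below are
    hypothetical, not the driver's API):

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two reset lock states. */
struct model_reset_state {
	bool mutex_held;  /* stand-in for the genuine reset mutex being locked */
	bool backoff_set; /* stand-in for the custom-rolled BACKOFF lock */
};

/*
 * A synchronous cancel waits for the worker to finish, so in this model
 * it is only treated as safe when neither reset lock is held.
 */
static bool model_sync_cancel_is_safe(const struct model_reset_state *rs)
{
	return !rs->mutex_held && !rs->backoff_set;
}
```

    When this returns false, the caller in this model would fall back to
    a non-waiting cancel and rely on the worker's own early exit.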
    
    Fixes: 0e00a881 ("drm/i915/guc: Avoid circular locking issue on busyness flush")
    Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
    Cc: Zhanjun Dong <zhanjun.dong@intel.com>
    Cc: John Harrison <John.C.Harrison@Intel.com>
    Cc: Andi Shyti <andi.shyti@linux.intel.com>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Nirmoy Das <nirmoy.das@intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
    Cc: Andrzej Hajda <andrzej.hajda@intel.com>
    Cc: Matt Roper <matthew.d.roper@intel.com>
    Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
    Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
    Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
    Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
    Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
    Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
    Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240329235306.1559639-1-John.C.Harrison@Intel.com
intel_guc_submission.c 164 KB