    drm/i915: Fix request ref counting during error capture & debugfs dump · 3700e353
    John Harrison authored
    When GuC support was added to error capture, the reference counting
    around the request object was broken. Fix it up.
    
    The context based search manages the spinlocking around the search
    internally. So it needs to grab the reference count internally as
    well. The execlist only request based search relies on external
    locking, so it needs an external reference count but within the
    spinlock not outside it.
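
    The two cases above can be sketched in a small user-space model. This is
    only an illustration under stated assumptions: every toy_* type and
    function below is hypothetical, the "lock" is a plain flag rather than a
    kernel spinlock, and none of this is the actual i915 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); these are not the i915 structures. */
struct toy_lock { bool held; };
struct toy_request { int refcount; bool active; };
struct toy_context { struct toy_lock lock; struct toy_request *rq; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = false; }

static struct toy_request *toy_request_get(struct toy_request *rq)
{
	rq->refcount++;
	return rq;
}

static void toy_request_put(struct toy_request *rq)
{
	assert(rq->refcount > 0);
	rq->refcount--;
}

/*
 * Context based search: the lock is taken and dropped internally, so the
 * reference must also be taken internally, before the lock is dropped.
 */
static struct toy_request *toy_context_get_active_request(struct toy_context *ce)
{
	struct toy_request *found = NULL;

	toy_lock_acquire(&ce->lock);
	if (ce->rq && ce->rq->active)
		found = toy_request_get(ce->rq);
	toy_lock_release(&ce->lock);

	return found; /* caller owns a reference, or gets NULL */
}

/*
 * Request based search: relies on the caller holding the lock and returns
 * a borrowed pointer with no reference taken.
 */
static struct toy_request *toy_engine_find_active_request(struct toy_context *ce)
{
	assert(ce->lock.held);
	return (ce->rq && ce->rq->active) ? ce->rq : NULL;
}

/*
 * Caller of the request based search: the reference is taken externally,
 * but while the lock is still held, not after it has been released.
 */
static struct toy_request *toy_capture_with_external_lock(struct toy_context *ce)
{
	struct toy_request *rq;

	toy_lock_acquire(&ce->lock);
	rq = toy_engine_find_active_request(ce);
	if (rq)
		rq = toy_request_get(rq);
	toy_lock_release(&ce->lock);

	return rq;
}
```

    In both paths the caller ends up owning a reference (or NULL) and is
    expected to drop it with the matching put once it is done.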
    
    The only other caller of the context based search is the code for
    dumping engine state to debugfs. That code wasn't previously getting
    an explicit reference at all as it does everything while holding the
    execlist specific spinlock. So, that needs updating as well as that
    spinlock doesn't help when using GuC submission. Rather than trying to
    conditionally get/put depending on submission model, just change it to
    always do the get/put.
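
    A minimal sketch of the "always get/put" pattern for a dump path follows.
    The dbg_* names, the seqno field, and the output strings are assumptions
    made for illustration and do not come from the driver. The point is that
    the dump takes and drops a reference unconditionally, and copes with no
    request being found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical request with a plain integer reference count. */
struct dbg_request { int refcount; unsigned int seqno; };

/* Hypothetical lookup: returns a referenced request, or NULL if none. */
static struct dbg_request *dbg_get_active_request(struct dbg_request *candidate)
{
	if (!candidate)
		return NULL;
	candidate->refcount++;
	return candidate;
}

static void dbg_request_put(struct dbg_request *rq)
{
	assert(rq->refcount > 0);
	rq->refcount--;
}

/*
 * Dump path: get/put regardless of submission model, so no external lock
 * is needed to keep the request alive while it is printed.
 * Returns the snprintf() result.
 */
static int dbg_dump_engine(struct dbg_request *candidate, char *buf, size_t len)
{
	struct dbg_request *rq = dbg_get_active_request(candidate);
	int n;

	if (!rq) /* nothing found: do not dereference, nothing to put */
		return snprintf(buf, len, "no active request\n");

	n = snprintf(buf, len, "active request seqno=%u\n", rq->seqno);
	dbg_request_put(rq); /* balance the get on every path that found one */

	return n;
}
```

    The reference count is the same before and after the dump, which is the
    property the leaked-request fixes in the v2 notes are about.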
    
    v2: Explicitly document adding an extra blank line in some dense code
    (Andy Shevchenko). Fix multiple potential null pointer derefs in case
    of no request found (some spotted by Tvrtko, but there were more!).
    Also fix a leaked request in case of !started and another in
    __guc_reset_context now that intel_context_find_active_request is
    actually reference counting the returned request.
    v3: Add a _get suffix to intel_context_find_active_request now that it
    grabs a reference (Daniele).
    v4: Split the intel_guc_find_hung_context change to a separate patch
    and rename intel_context_find_active_request_get to
    intel_context_get_active_request (Tvrtko).
    v5: s/locking/reference counting/ in commit message (Tvrtko)
    
    Fixes: dc0dad36 ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
    Fixes: 573ba126 ("drm/i915/guc: Capture error state on context reset")
    Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
    Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: Jani Nikula <jani.nikula@linux.intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Andrzej Hajda <andrzej.hajda@intel.com>
    Cc: Matthew Auld <matthew.auld@intel.com>
    Cc: Matt Roper <matthew.d.roper@intel.com>
    Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
    Cc: Michael Cheng <michael.cheng@intel.com>
    Cc: Lucas De Marchi <lucas.demarchi@intel.com>
    Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
    Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
    Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
    Cc: Bruce Chang <yu.bruce.chang@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
intel_context.h 9.83 KB