    drm/i915: Support replaying GPU hangs with captured context image · 0f1bb41b
    Tvrtko Ursulin authored
    When debugging GPU hangs, Mesa developers find it useful to replay the
    captured error state against the simulator. But due to various simulator
    limitations which prevent replicating all hangs, one step further is being
    able to replay against a real GPU.
    
    This is almost doable today, with the missing part being the ability to
    upload the captured context image into the driver state prior to executing
    the uploaded hanging batch and all the buffers.
    
    To enable this last part we add a new context parameter called
    I915_CONTEXT_PARAM_CONTEXT_IMAGE. It follows the existing SSEU
    configuration pattern of being able to select which context to apply
    against, paired with the actual image and its size.
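
    As an illustration only, the shape described above (an engine selector
    paired with the image and its size) could be sketched as below. All
    struct, field and function names here are hypothetical stand-ins and the
    layout is an assumption modelled on the SSEU pattern; the authoritative
    definition is the one in the i915 uapi header added by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the uapi engine selector (class + instance). */
struct sketch_engine_class_instance {
    uint16_t engine_class;
    uint16_t engine_instance;
};

/*
 * Assumed argument layout: which context (engine) to apply the image
 * against, paired with the image itself and its size.
 */
struct sketch_context_image {
    struct sketch_engine_class_instance engine;
    uint32_t flags;  /* assumed: no flags set in this sketch */
    uint32_t size;   /* size of the captured image in bytes */
    uint32_t mbz;    /* assumed padding, must be zero */
    uint64_t image;  /* user pointer to the captured image */
};

/*
 * Fill the argument for a given engine and captured image.
 * Returns 0 on success, -1 if the image is missing or empty.
 */
static int sketch_fill_context_image(struct sketch_context_image *arg,
                                     uint16_t engine_class,
                                     uint16_t engine_instance,
                                     const void *image, uint32_t size)
{
    assert(arg != NULL);

    if (!image || size == 0)
        return -1;

    memset(arg, 0, sizeof(*arg));
    arg->engine.engine_class = engine_class;
    arg->engine.engine_instance = engine_instance;
    arg->size = size;
    arg->image = (uint64_t)(uintptr_t)image;

    return 0;
}
```

    In a real client the filled argument would be passed through the
    context set param path with I915_CONTEXT_PARAM_CONTEXT_IMAGE as the
    parameter; that call is left out here because it needs the kernel
    headers and a device node.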
    
    Since this adds the new concept of debug-only uapi, we hide it behind
    a new kconfig option and also require activation with a module parameter.
    Together with a warning banner printed at driver load, all those combined
    should be sufficient to guard against inadvertently enabling the feature.
    
    In terms of implementation we allow the legacy context set param to be
    used since that removes the need to record the per context data in the
    proto context, while still allowing flexibility of specifying context
    images for any context.
    
    Mesa MR using the uapi can be seen at:
      https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594
    
    v2:
     * Fix whitespace alignment as per checkpatch.
     * Added warning on userspace misuse.
     * Rebase for extracting ce->default_state shadowing.
    
    v3:
     * Rebase for I915_CONTEXT_PARAM_LOW_LATENCY.
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
    Cc: Carlos Santa <carlos.santa@intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Tested-by: Carlos Santa <carlos.santa@intel.com>
    Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
    Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240514145939.87427-2-tursulin@igalia.com
intel_lrc.c 45.3 KB