Commit beecec90 authored by Chris Wilson

drm/i915/execlists: Preemption!

When we write to ELSP, it triggers a context preemption at the earliest
arbitration point (3DPRIMITIVE, some PIPECONTROLs, a few other
operations and the explicit MI_ARB_CHECK). If this is to the same
context, it triggers a LITE_RESTORE where the RING_TAIL is merely
updated (used currently to chain requests from the same context
together, avoiding bubbles). However, if it is to a different context, a
full context-switch is performed and it will start to execute the new
context, saving the image of the old for later execution.
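
To make the lite-restore vs full-switch distinction concrete, here is a small
self-contained model in plain C (not driver code; the struct and field names
are invented for illustration):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical names, for illustration only. */
struct hw_state {
	uint32_t lrca;	/* identifies the running context image */
	uint32_t tail;	/* RING_TAIL the hardware will execute up to */
};

/* Model of an ELSP write taking effect at an arbitration point: a matching
 * context only has its RING_TAIL bumped (lite restore); a different context
 * forces a save of the old image and a switch to the new one. */
static const char *elsp_write(struct hw_state *hw, uint32_t lrca, uint32_t tail)
{
	if (hw->lrca == lrca) {
		hw->tail = tail;
		return "lite restore (RING_TAIL updated)";
	}
	hw->lrca = lrca;	/* old image saved (not modelled), new context runs */
	hw->tail = tail;
	return "full context switch";
}

int main(void)
{
	struct hw_state hw = { .lrca = 0x1000, .tail = 0x80 };

	printf("%s\n", elsp_write(&hw, 0x1000, 0xc0)); /* same context, chained request */
	printf("%s\n", elsp_write(&hw, 0x2000, 0x40)); /* different context */
	return 0;
}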

Previously we avoided preemption by only submitting a new context when
the old was idle. But now we wish to embrace it, and if the new request has
a higher priority than the currently executing request, we write to the
ELSP regardless, thus triggering preemption, but we tell the GPU to
switch to our special preemption context (not the target). In the
context-switch interrupt handler, we know that the previous contexts
have finished execution and so can unwind all the incomplete requests
and compute the new highest priority request to execute.
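
As a rough, self-contained sketch of that unwind-and-recompute step (a toy
queue, not the driver's implementation; all names are invented):

#include <stdio.h>

/* Toy request record; the real driver tracks requests per context with a
 * priority tree. */
struct request {
	int ctx;
	int prio;
	int completed;
};

/* After preempting to the idle context, every incomplete request is back in
 * the queue; pick the highest priority one to submit next. */
static int pick_next(const struct request *q, int n)
{
	int best = -1;

	for (int i = 0; i < n; i++)
		if (!q[i].completed && (best < 0 || q[i].prio > q[best].prio))
			best = i;
	return best;
}

int main(void)
{
	struct request queue[] = {
		{ .ctx = 1, .prio = 0, .completed = 1 }, /* finished before the switch */
		{ .ctx = 1, .prio = 0, .completed = 0 }, /* unwound, still pending */
		{ .ctx = 2, .prio = 5, .completed = 0 }, /* the preempting request */
	};
	int next = pick_next(queue, 3);

	if (next >= 0)
		printf("run next: ctx %d (prio %d)\n", queue[next].ctx, queue[next].prio);
	return 0;
}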

It would be feasible to avoid the switch-to-idle intermediate by
programming the ELSP with the target context. The difficulty is in
tracking which request it should be whilst maintaining the dependency
chain; the error comes in with coalesced requests. As we only track the
most recent request and its priority, we may run into the issue of being
tricked into preempting a high priority request that was followed by a
low priority request from the same context (e.g. for PI); worse still,
that earlier request may be our own dependency and the order would then
be broken by preemption. By injecting the switch-to-idle and then recomputing the
priority queue, we avoid the issue with tracking in-flight coalesced
requests. Having tried the preempt-to-busy approach, and failed to find
a way around the coalesced priority issue, Michal's original proposal to
inject an idle context (based on handling GuC preemption) succeeds.
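
A toy example of the coalescing hazard (invented numbers, not driver code):
the port's last request has low priority, yet the work coalesced ahead of it
does not, so a naive preempt-to-busy comparison is fooled.

#include <stdio.h>

/* A port with coalesced requests from one context: only the priority of the
 * last request is tracked, the maximum across the port is lost. */
struct port_summary {
	int last_prio;
	int max_prio;
};

int main(void)
{
	/* ctx A: prio 10 request coalesced with a later prio 0 request. */
	struct port_summary port_a = { .last_prio = 0, .max_prio = 10 };
	int incoming_prio = 5;	/* ctx B asks to run */

	/* A check against the tracked value is fooled... */
	printf("check vs last_prio: %s\n",
	       incoming_prio > port_a.last_prio ? "preempt (wrong)" : "keep running");
	/* ...because the port still contains higher-priority (or dependent) work. */
	printf("check vs max_prio:  %s\n",
	       incoming_prio > port_a.max_prio ? "preempt" : "keep running");
	return 0;
}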

The current heuristic for deciding when to preempt is that the new
request must be of higher priority than the currently executing request
and must have a privileged priority greater than 0. Note that the
scheduler remains unfair!
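
A minimal sketch of that check, with assumed names (not the driver's exact
code):

#include <stdbool.h>
#include <stdio.h>

#define DEFAULT_PRIORITY 0	/* assumed name for the unprivileged default */

/* Preempt only if the queued priority beats both the executing priority and
 * the default: an unprivileged (<= 0) request never triggers preemption. */
static bool need_preempt(int queued_prio, int executing_prio)
{
	int floor = executing_prio > DEFAULT_PRIORITY ?
		    executing_prio : DEFAULT_PRIORITY;

	return queued_prio > floor;
}

int main(void)
{
	printf("%d\n", need_preempt(5, 0));	/* 1: privileged and higher */
	printf("%d\n", need_preempt(0, -1));	/* 0: not above the default */
	return 0;
}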

v2: Disable for gen8 (bdw/bsw) as we need additional w/a for GPGPU.
Since the feature is now conditional and not always available when we
have a scheduler, make it known via the HAS_SCHEDULER GETPARAM (now a
capability mask).
v3: Stylistic tweaks.
v4: Appease Joonas with a snippet of kerneldoc, only to fuel the fire of
the preempt vs preempting debate.
Suggested-by: Michal Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-8-chris@chris-wilson.co.uk
parent bf64e0b0
@@ -368,9 +368,16 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		break;
 	case I915_PARAM_HAS_SCHEDULER:
 		value = 0;
-		if (dev_priv->engine[RCS] && dev_priv->engine[RCS]->schedule)
+		if (dev_priv->engine[RCS] && dev_priv->engine[RCS]->schedule) {
 			value |= I915_SCHEDULER_CAP_ENABLED;
+			if (INTEL_INFO(dev_priv)->has_logical_ring_preemption &&
+			    i915_modparams.enable_execlists &&
+			    !i915_modparams.enable_guc_submission)
+				value |= I915_SCHEDULER_CAP_PREEMPTION;
+		}
 		break;
 	case I915_PARAM_MMAP_VERSION:
 		/* Remember to bump this if the version changes! */
 	case I915_PARAM_HAS_GEM:
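
For reference, user space can inspect the new capability mask with a plain
GETPARAM call. This is a hedged sketch: the device node path and the fallback
defines are assumptions, the I915_SCHEDULER_CAP_* names come from the hunk
above, and it assumes the kernel uAPI header drm/i915_drm.h is installed.

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <drm/i915_drm.h>	/* kernel uAPI header, assumed installed */

#ifndef I915_PARAM_HAS_SCHEDULER
#define I915_PARAM_HAS_SCHEDULER	41
#endif
#ifndef I915_SCHEDULER_CAP_ENABLED
#define I915_SCHEDULER_CAP_ENABLED	(1ul << 0)	/* fallback, assumed bit */
#endif
#ifndef I915_SCHEDULER_CAP_PREEMPTION
#define I915_SCHEDULER_CAP_PREEMPTION	(1ul << 2)	/* fallback, assumed bit */
#endif

int main(void)
{
	int value = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_SCHEDULER,
		.value = &value,
	};
	int fd = open("/dev/dri/card0", O_RDWR);	/* assumed device node */

	if (fd < 0 || ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp)) {
		perror("I915_GETPARAM");
		return 1;
	}

	printf("scheduler enabled: %s, preemption: %s\n",
	       value & I915_SCHEDULER_CAP_ENABLED ? "yes" : "no",
	       value & I915_SCHEDULER_CAP_PREEMPTION ? "yes" : "no");
	close(fd);
	return 0;
}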
@@ -1382,11 +1382,9 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;

 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		if (port_count(&execlists->port[0])) {
-			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-			tasklet = true;
-		}
+		__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+		tasklet = true;
 	}

 	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
 		notify_ring(engine);
@@ -424,6 +424,7 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
 #define GEN9_FEATURES \
 	GEN8_FEATURES, \
+	.has_logical_ring_preemption = 1, \
 	.has_csr = 1, \
 	.has_guc = 1, \
 	.has_ipc = 1, \
@@ -477,6 +478,7 @@ static const struct intel_device_info intel_skylake_gt4_info __initconst = {
 	.has_rc6 = 1, \
 	.has_dp_mst = 1, \
 	.has_logical_ring_contexts = 1, \
+	.has_logical_ring_preemption = 1, \
 	.has_guc = 1, \
 	.has_aliasing_ppgtt = 1, \
 	.has_full_ppgtt = 1, \
@@ -238,6 +238,11 @@ struct intel_engine_execlists {
 #define EXECLIST_MAX_PORTS 2
 	} port[EXECLIST_MAX_PORTS];
+	/**
+	 * @preempt: are we currently handling a preempting context switch?
+	 */
+	bool preempt;
 	/**
 	 * @port_mask: number of execlist ports - 1
 	 */