drm/i915/execlists: Always reset the context's RING registers
During reset, we try and stop the active ring. This has the consequence that we often clobber the RING registers within the context image. When we find an active request, we update the context image to rerun that request (if it was guilty, we replace the hanging user payload with NOPs). However, we were ignoring an active context if the request had completed, with the consequence that the next submission on that request would start with RING_HEAD==0 and not the tail of the previous request, causing all requests still in the ring to be rerun. Rare, but occasionally seen within CI where we would spot that the context seqno would reverse and complain that we were retiring an incomplete request. <0> [412.390350] <idle>-0 3d.s2 408373352us : __i915_request_submit: rcs0 fence 1e95b:3640 -> current 3638 <0> [412.390350] <idle>-0 3d.s2 408373353us : __i915_request_submit: rcs0 fence 1e95b:3642 -> current 3638 <0> [412.390350] <idle>-0 3d.s2 408373354us : __i915_request_submit: rcs0 fence 1e95b:3644 -> current 3638 <0> [412.390350] <idle>-0 3d.s2 408373354us : __i915_request_submit: rcs0 fence 1e95b:3646 -> current 3638 <0> [412.390350] <idle>-0 3d.s2 408373356us : __execlists_submission_tasklet: rcs0 in[0]: ctx=2.1, fence 1e95b:3646 (current 3638), prio=4 <0> [412.390350] i915_sel-4613 0.... 408373374us : __i915_request_commit: rcs0 fence 1e95b:3648 <0> [412.390350] i915_sel-4613 0d..1 408373377us : process_csb: rcs0 cs-irq head=2, tail=3 <0> [412.390350] i915_sel-4613 0d..1 408373377us : process_csb: rcs0 csb[3]: status=0x00000001:0x00000000, active=0x1 <0> [412.390350] i915_sel-4613 0d..1 408373378us : __i915_request_submit: rcs0 fence 1e95b:3648 -> current 3638 <0> [412.390350] <idle>-0 3..s1 408373378us : execlists_submission_tasklet: rcs0 awake?=1, active=5 <0> [412.390350] i915_sel-4613 0d..1 408373379us : __execlists_submission_tasklet: rcs0 in[0]: ctx=2.2, fence 1e95b:3648 (current 3638), prio=4 <0> [412.390350] i915_sel-4613 0.... 408373381us : i915_reset_engine: rcs0 flags=4 <0> [412.390350] i915_sel-4613 0.... 408373382us : execlists_reset_prepare: rcs0: depth<-0 <0> [412.390350] <idle>-0 3d.s2 408373390us : process_csb: rcs0 cs-irq head=3, tail=4 <0> [412.390350] <idle>-0 3d.s2 408373390us : process_csb: rcs0 csb[4]: status=0x00008002:0x00000002, active=0x1 <0> [412.390350] <idle>-0 3d.s2 408373390us : process_csb: rcs0 out[0]: ctx=2.2, fence 1e95b:3648 (current 3640), prio=4 <0> [412.390350] i915_sel-4613 0.... 408373401us : intel_engine_stop_cs: rcs0 <0> [412.390350] i915_sel-4613 0d..1 408373402us : process_csb: rcs0 cs-irq head=4, tail=4 <0> [412.390350] i915_sel-4613 0.... 408373403us : intel_gpu_reset: engine_mask=1 <0> [412.390350] i915_sel-4613 0d..1 408373408us : execlists_cancel_port_requests: rcs0:port0 fence 1e95b:3648, (current 3648) <0> [412.390350] i915_sel-4613 0.... 408373442us : intel_engine_cancel_stop_cs: rcs0 <0> [412.390350] i915_sel-4613 0.... 408373442us : execlists_reset_finish: rcs0: depth->0 <0> [412.390350] ksoftirq-26 3..s. 408373442us : execlists_submission_tasklet: rcs0 awake?=1, active=0 <0> [412.390350] ksoftirq-26 3d.s1 408373443us : process_csb: rcs0 cs-irq head=5, tail=5 <0> [412.390350] i915_sel-4613 0.... 408373475us : i915_request_retire: rcs0 fence 1e95b:3640, current 3648 <0> [412.390350] i915_sel-4613 0.... 408373476us : i915_request_retire: __retire_engine_request(rcs0) fence 1e95b:3640, current 3648 <0> [412.390350] i915_sel-4613 0.... 408373494us : __i915_request_commit: rcs0 fence 1e95b:3650 <0> [412.390350] i915_sel-4613 0d..1 408373496us : process_csb: rcs0 cs-irq head=5, tail=5 <0> [412.390350] i915_sel-4613 0d..1 408373496us : __i915_request_submit: rcs0 fence 1e95b:3650 -> current 3648 <0> [412.390350] i915_sel-4613 0d..1 408373498us : __execlists_submission_tasklet: rcs0 in[0]: ctx=2.1, fence 1e95b:3650 (current 3648), prio=6 <0> [412.390350] i915_sel-4613 0.... 408373500us : i915_request_retire_upto: rcs0 fence 1e95b:3648, current 3648 <0> [412.390350] i915_sel-4613 0.... 408373500us : i915_request_retire: rcs0 fence 1e95b:3642, current 3648 <0> [412.390350] i915_sel-4613 0.... 408373501us : i915_request_retire: __retire_engine_request(rcs0) fence 1e95b:3642, current 3648 <0> [412.390350] i915_sel-4613 0.... 408373514us : i915_request_retire: rcs0 fence 1e95b:3644, current 3648 <0> [412.390350] i915_sel-4613 0.... 408373515us : i915_request_retire: __retire_engine_request(rcs0) fence 1e95b:3644, current 3648 <0> [412.390350] i915_sel-4613 0.... 408373527us : i915_request_retire: rcs0 fence 1e95b:3646, current 3640 <0> [412.390350] <idle>-0 3..s1 408373569us : execlists_submission_tasklet: rcs0 awake?=1, active=1 <0> [412.390350] <idle>-0 3d.s2 408373569us : process_csb: rcs0 cs-irq head=5, tail=1 <0> [412.390350] <idle>-0 3d.s2 408373570us : process_csb: rcs0 csb[0]: status=0x00000001:0x00000000, active=0x1 <0> [412.390350] <idle>-0 3d.s2 408373570us : process_csb: rcs0 csb[1]: status=0x00000018:0x00000002, active=0x5 <0> [412.390350] <idle>-0 3d.s2 408373570us : process_csb: rcs0 out[0]: ctx=2.1, fence 1e95b:3650 (current 3650), prio=6 <0> [412.390350] <idle>-0 3d.s2 408373571us : process_csb: rcs0 completed ctx=2 <0> [412.390350] i915_sel-4613 0.... 408373621us : i915_request_retire: i915_request_retire:253 GEM_BUG_ON(!i915_request_completed(request)) v2: Fixup the cancellation path to drain the CSB and reset the pointers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190411130515.20716-2-chris@chris-wilson.co.uk
Showing
Please register or sign in to comment