    drm/i915/gt: Delay execlist processing for tgl · 6ca7217d
    Chris Wilson authored
    gem_exec_nop floods the system with many requests (with the goal of
    userspace submitting faster than the HW can process a single empty
    batch). This causes the driver to continually resubmit new requests
    onto the end of an active context, producing a flood of lite-restore
    preemptions. If we time this just right, Tigerlake hangs.
    
    Inserting a small delay between processing the CS events and
    submitting the next context prevents the hang. Naturally, the hang
    does not occur with debugging enabled. The suspicion then is that
    this is related to the issues with the CS event buffer, and inserting
    an mmio read of the CS pointer status appears to be very successful in
    preventing the hang. Other registers, or uncached reads, or a plain
    mb, do not prevent the hang, suggesting that this register is key.
    However, that the hang can also be prevented by a simple udelay
    suggests it is just a timing issue like that encountered by commit
    233c1ae3 ("drm/i915/gt: Wait for CSB entries on Tigerlake"). Also
    note that the hang is not prevented by
    applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
    between requests.
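
The ordering described above can be sketched in plain userspace C. This is only an illustration of the idea (drain the CS events, read the CS pointer status register once, then submit), not the actual change to intel_lrc.c. Every name here (fake_engine, fake_mmio_read, process_events, submit_next_context, process_then_submit, CSB_ENTRIES) is hypothetical, and the "register" is an ordinary struct field rather than real mmio.

```c
#include <assert.h>
#include <stdint.h>

#define CSB_ENTRIES 12 /* assumed ring size, for illustration only */

/* Hypothetical stand-in for an engine; not the driver's real structure. */
struct fake_engine {
	volatile uint32_t csb_status_ptr; /* models the CS pointer status register */
	uint32_t csb[CSB_ENTRIES];        /* models the CS event buffer */
	unsigned int head;                /* last entry consumed by the driver */
	unsigned int tail;                /* last entry written by the "hardware" */
	unsigned int mmio_reads;          /* number of simulated register reads */
	unsigned int submits;             /* number of simulated submissions */
	unsigned int reads_at_last_submit;
};

/* Simulated register read; a real driver would read mmio here. */
static uint32_t fake_mmio_read(struct fake_engine *e)
{
	e->mmio_reads++;
	return e->csb_status_ptr;
}

/* Consume every pending CS event and return how many were handled. */
static unsigned int process_events(struct fake_engine *e)
{
	unsigned int handled = 0;

	assert(e->head < CSB_ENTRIES && e->tail < CSB_ENTRIES);
	while (e->head != e->tail) {
		e->head = (e->head + 1) % CSB_ENTRIES;
		(void)e->csb[e->head]; /* a real handler would decode the event */
		handled++;
	}
	return handled;
}

/* Simulated submission; records how many register reads preceded it. */
static void submit_next_context(struct fake_engine *e)
{
	e->submits++;
	e->reads_at_last_submit = e->mmio_reads;
}

/*
 * The ordering from the commit message: process the events, read the
 * CS pointer status register, and only then submit the next context.
 */
static unsigned int process_then_submit(struct fake_engine *e)
{
	unsigned int handled = process_events(e);

	(void)fake_mmio_read(e); /* the read acts as the small delay */
	submit_next_context(e);
	return handled;
}
```

Calling process_then_submit() on a zero-initialised fake_engine whose tail has been advanced drains the pending entries, and the recorded reads_at_last_submit shows that the register read happened before the submission. The sketch cannot reproduce the hardware timing issue; it only makes the intended order of operations explicit.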
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Cc: Bruce Chang <yu.bruce.chang@intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: stable@vger.kernel.org
    Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20201015195023.32346-1-chris@chris-wilson.co.uk
intel_lrc.c 166 KB