1. 22 Oct, 2020 8 commits
  2. 21 Oct, 2020 13 commits
  3. 20 Oct, 2020 2 commits
  4. 19 Oct, 2020 11 commits
    • Chris Wilson's avatar
      drm/i915/gt: Wait for CSB entries on Tigerlake · 4a9bb58a
      Chris Wilson authored
      On Tigerlake, we are seeing a repeat of commit d8f50531 ("drm/i915/icl:
      Forcibly evict stale csb entries") where, presumably, due to a missing
      Global Observation Point synchronisation, the write pointer of the CSB
      ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
      we see the GPU report more context-switch entries for us to parse, but
      those entries have not been written, leading us to process stale events,
      and eventually report a hung GPU.
      
      However, this effect appears to be much more severe than we previously
      saw on Icelake (though it might be best if we try the same approach
      there as well and measure), and Bruce suggested the good idea of resetting
      the CSB entry after use so that we can detect when it has been updated by
      the GPU. By instrumenting how long that may be, we can set a reliable
      upper bound for how long we should wait for:
      
          513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
      References: d8f50531 ("drm/i915/icl: Forcibly evict stale csb entries")
      References: HSDES#22011327657, HSDES#1508287568
      Suggested-by: default avatarBruce Chang <yu.bruce.chang@intel.com>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Bruce Chang <yu.bruce.chang@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: stable@vger.kernel.org # v5.4
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-2-chris@chris-wilson.co.uk
      (cherry picked from commit 233c1ae3c83f21046c6c4083da904163ece8f110)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      4a9bb58a
    • Chris Wilson's avatar
      drm/i915/gt: Widen CSB pointer to u64 for the parsers · ca05277e
      Chris Wilson authored
      A CSB entry is 64b, and it is simpler for us to treat it as an array of
      64b entries than as an array of pairs of 32b entries.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-1-chris@chris-wilson.co.uk
      (cherry picked from commit f24a44e52fbc9881fc5f3bcef536831a15a439f3)
      (cherry picked from commit 3d4dbe0e0f0d04ebcea917b7279586817da8cf46)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      ca05277e
    • Chris Wilson's avatar
      drm/i915: Use the active reference on the vma while capturing · db9bc2d3
      Chris Wilson authored
      During error capture, we need to take a reference to the vma from before
      the reset in order to catpure the contents of the vma later. Currently
      we are using both an active reference and a kref, but due to nature of
      the i915_vma reference handling, that kref is on the vma->obj and not
      the vma itself. This means the vma may be destroyed as soon as it is
      idle, that is in between the i915_active_release(&vma->active) and the
      i915_vma_put(vma):
      
      <3> [197.866181] BUG: KASAN: use-after-free in intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <3> [197.866339] Read of size 8 at addr ffff8881258cb800 by task gem_exec_captur/1041
      <3> [197.866467]
      <4> [197.866512] CPU: 2 PID: 1041 Comm: gem_exec_captur Not tainted 5.9.0-g5e4234f97efba-kasan_200+ #1
      <4> [197.866521] Hardware name: Intel Corp. Broxton P/Apollolake RVP1A, BIOS APLKRVPA.X64.0150.B11.1608081044 08/08/2016
      <4> [197.866530] Call Trace:
      <4> [197.866549]  dump_stack+0x99/0xd0
      <4> [197.866760]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.866783]  print_address_description.constprop.8+0x3e/0x60
      <4> [197.866797]  ? kmsg_dump_rewind_nolock+0xd4/0xd4
      <4> [197.866819]  ? lockdep_hardirqs_off+0xd4/0x120
      <4> [197.867037]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867249]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867270]  kasan_report.cold.10+0x1f/0x37
      <4> [197.867492]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867710]  intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867949]  i915_gpu_coredump.part.29+0x150/0x7b0 [i915]
      <4> [197.868186]  i915_capture_error_state+0x5e/0xc0 [i915]
      <4> [197.868396]  intel_gt_handle_error+0x6eb/0xa20 [i915]
      <4> [197.868624]  ? intel_gt_reset_global+0x370/0x370 [i915]
      <4> [197.868644]  ? check_flags+0x50/0x50
      <4> [197.868662]  ? __lock_acquire+0xd59/0x6b00
      <4> [197.868678]  ? register_lock_class+0x1ad0/0x1ad0
      <4> [197.868944]  i915_wedged_set+0xcf/0x1b0 [i915]
      <4> [197.869147]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869371]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869398]  simple_attr_write+0x153/0x1c0
      <4> [197.869428]  full_proxy_write+0xee/0x180
      <4> [197.869442]  ? __sb_start_write+0x1f3/0x310
      <4> [197.869465]  vfs_write+0x1a3/0x640
      <4> [197.869492]  ksys_write+0xec/0x1c0
      <4> [197.869507]  ? __ia32_sys_read+0xa0/0xa0
      <4> [197.869525]  ? lockdep_hardirqs_on_prepare+0x32b/0x4e0
      <4> [197.869541]  ? syscall_enter_from_user_mode+0x1c/0x50
      <4> [197.869566]  do_syscall_64+0x33/0x80
      <4> [197.869579]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <4> [197.869590] RIP: 0033:0x7fd8b7aee281
      <4> [197.869604] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
      <4> [197.869613] RSP: 002b:00007ffea3b72008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      <4> [197.869625] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd8b7aee281
      <4> [197.869633] RDX: 0000000000000002 RSI: 00007fd8b81a82e7 RDI: 000000000000000d
      <4> [197.869641] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000034
      <4> [197.869650] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd8b81a82e7
      <4> [197.869658] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
      <3> [197.869707]
      <3> [197.869757] Allocated by task 1041:
      <4> [197.869833]  kasan_save_stack+0x19/0x40
      <4> [197.869843]  __kasan_kmalloc.constprop.5+0xc1/0xd0
      <4> [197.869853]  kmem_cache_alloc+0x106/0x8e0
      <4> [197.870059]  i915_vma_instance+0x212/0x1930 [i915]
      <4> [197.870270]  eb_lookup_vmas+0xe06/0x1d10 [i915]
      <4> [197.870475]  i915_gem_do_execbuffer+0x131d/0x4080 [i915]
      <4> [197.870682]  i915_gem_execbuffer2_ioctl+0x103/0x5d0 [i915]
      <4> [197.870701]  drm_ioctl_kernel+0x1d2/0x270
      <4> [197.870710]  drm_ioctl+0x40d/0x85c
      <4> [197.870721]  __x64_sys_ioctl+0x10d/0x170
      <4> [197.870731]  do_syscall_64+0x33/0x80
      <4> [197.870740]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <3> [197.870748]
      <3> [197.870798] Freed by task 22:
      <4> [197.870865]  kasan_save_stack+0x19/0x40
      <4> [197.870875]  kasan_set_track+0x1c/0x30
      <4> [197.870884]  kasan_set_free_info+0x1b/0x30
      <4> [197.870894]  __kasan_slab_free+0x111/0x160
      <4> [197.870903]  kmem_cache_free+0xcd/0x710
      <4> [197.871109]  i915_vma_parked+0x618/0x800 [i915]
      <4> [197.871307]  __gt_park+0xdb/0x1e0 [i915]
      <4> [197.871501]  ____intel_wakeref_put_last+0xb1/0x190 [i915]
      <4> [197.871516]  process_one_work+0x8dc/0x15d0
      <4> [197.871525]  worker_thread+0x82/0xb30
      <4> [197.871535]  kthread+0x36d/0x440
      <4> [197.871545]  ret_from_fork+0x22/0x30
      <3> [197.871553]
      <3> [197.871602] The buggy address belongs to the object at ffff8881258cb740
       which belongs to the cache i915_vma of size 968
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2553
      Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v5.5+
      Reviewed-by: default avatarMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201016092527.29039-1-chris@chris-wilson.co.uk
      (cherry picked from commit 178536b8292ecd118f59d2fac4509c7e70b99854)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      db9bc2d3
    • Chris Wilson's avatar
      drm/i915/gt: Undo forced context restores after trivial preemptions · 64402570
      Chris Wilson authored
      We may try to preempt the currently executing request, only to find that
      after unravelling all the dependencies that the original executing
      context is still the earliest in the topological sort and re-submitted
      back to HW (if we do detect some change in the ELSP that requires
      re-submission). However, due to the way we check for wrap-around during
      the unravelling, we mark any context that has been submitted just once
      (i.e. with the rq->wa_tail set, but the ring->tail earlier) as
      potentially wrapping and requiring a forced restore on resubmission.
      This was expected to be not a problem, as it was anticipated that most
      unwinding for preemption would result in a context switch and the few
      that did not would be lost in the noise. It did not take long for
      someone to find one particular workload where the cost of those extra
      context restores was measurable.
      
      However, since we know the wa_tail is of fixed size, and we know that a
      request must be larger than the wa_tail itself, we can safely maintain
      the check for request wrapping and check against a slightly future point
      in the ring that includes an expected wa_tail. (That is if the
      ring->tail is already set to rq->wa_tail, including another 8 bytes in
      the check does not invalidate the incremental wrap detection.)
      
      Fixes: 8ab3a381 ("drm/i915/gt: Incrementally check for rewinding")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Bruce Chang <yu.bruce.chang@intel.com>
      Cc: Ramalingam C <ramalingam.c@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v5.4+
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201002083425.4605-1-chris@chris-wilson.co.uk
      (cherry picked from commit bb65548e)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      64402570
    • Chris Wilson's avatar
      drm/i915/gt: Delay execlist processing for tgl · 9b99e5ba
      Chris Wilson authored
      When running gem_exec_nop, it floods the system with many requests (with
      the goal of userspace submitting faster than the HW can process a single
      empty batch). This causes the driver to continually resubmit new
      requests onto the end of an active context, a flood of lite-restore
      preemptions. If we time this just right, Tigerlake hangs.
      
      Inserting a small delay between the processing of CS events and
      submitting the next context, prevents the hang. Naturally it does not
      occur with debugging enabled. The suspicion then is that this is related
      to the issues with the CS event buffer, and inserting an mmio read of
      the CS pointer status appears to be very successful in preventing the
      hang. Other registers, or uncached reads, or plain mb, do not prevent
      the hang, suggesting that register is key -- but that the hang can be
      prevented by a simple udelay, suggests it is just a timing issue like
      that encountered by commit 233c1ae3c83f ("drm/i915/gt: Wait for CSB
      entries on Tigerlake"). Also note that the hang is not prevented by
      applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
      between requests.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Bruce Chang <yu.bruce.chang@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: stable@vger.kernel.org
      Acked-by: default avatarMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201015195023.32346-1-chris@chris-wilson.co.uk
      (cherry picked from commit 6ca7217dffaf1abba91558e67a2efb655ac91405)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      9b99e5ba
    • Chris Wilson's avatar
      drm/i915/gem: Support parsing of oversize batches · d5e87821
      Chris Wilson authored
      Matthew Auld noted that on more recent systems (such as the parser for
      gen9) we may have objects that are larger than expected by the GEM uAPI
      (i.e. greater than u32). These objects would have incorrect implicit
      batch lengths, causing the parser to reject them for being incomplete,
      or worse.
      
      Based on a patch by Matthew Auld.
      Reported-by: default avatarMatthew Auld <matthew.auld@intel.com>
      Fixes: 435e8fc0 ("drm/i915: Allow parsing of unsized batches")
      Testcase: igt/gem_exec_params/larger-than-life-batch
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Jon Bloomfield <jon.bloomfield@intel.com>
      Reviewed-by: default avatarMatthew Auld <matthew.auld@intel.com>
      Cc: stable@vger.kernel.org
      Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
      (cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      d5e87821
    • Ville Syrjälä's avatar
      drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init · 1664ffee
      Ville Syrjälä authored
      Currently we leave the cache_level of the initial fb obj
      set to NONE. This means on eLLC machines the first pin_to_display()
      will try to switch it to WT which requires a vma unbind+bind.
      If that happens during the fbdev initialization rcu does not
      seem operational which causes the unbind to get stuck. To
      most appearances this looks like a dead machine on boot.
      
      Avoid the unbind by already marking the object cache_level
      as WT when creating it. We still do an excplicit ggtt pin
      which will rewrite the PTEs anyway, so they will match whatever
      cache level we set.
      
      Cc: <stable@vger.kernel.org> # v5.7+
      Suggested-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com
      Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk
      (cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      1664ffee
    • Ayaz A Siddiqui's avatar
      drm/i915/gt: Initialize reserved and unspecified MOCS indices · 849c0fe9
      Ayaz A Siddiqui authored
      In order to avoid functional breakage of mis-programmed applications that
      have grown to depend on unused MOCS entries, we are programming
      those entries to be equal to fully cached ("L3 + LLC") entry.
      
      These reserved and unspecified entries should not be used as they may be
      changed to less performant variants with better coherency in the future
      if more entries are needed.
      
      v2: As suggested by Lucas De Marchi to utilise __init_mocs_table for
      programming default value, setting I915_MOCS_PTE index of tgl_mocs_table
      with desired value.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Lucas De Marchi <lucas.demarchi@intel.com>
      Cc: Tomasz Lis <tomasz.lis@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Francisco Jerez <currojerez@riseup.net>
      Cc: Mathew Alwin <alwin.mathew@intel.com>
      Cc: Mcguire Russell W <russell.w.mcguire@intel.com>
      Cc: Spruit Neil R <neil.r.spruit@intel.com>
      Cc: Zhou Cheng <cheng.zhou@intel.com>
      Cc: Benemelis Mike G <mike.g.benemelis@intel.com>
      Signed-off-by: default avatarAyaz A Siddiqui <ayaz.siddiqui@intel.com>
      Reviewed-by: default avatarLucas De Marchi <lucas.demarchi@intel.com>
      Acked-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200729102539.134731-2-ayaz.siddiqui@intel.com
      Cc: stable@vger.kernel.org
      (cherry picked from commit 4d8a5cfe3b131f60903949f998c5961cc922e0b0)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      849c0fe9
    • Sean Paul's avatar
      drm/i915/dp: Tweak initial dpcd backlight.enabled value · 354842df
      Sean Paul authored
      In commit 79946723 ("drm/i915: Assume 100% brightness when not in
      DPCD control mode"), we fixed the brightness level when DPCD control was
      not active to max brightness. This is as good as we can guess since most
      backlights go on full when uncontrolled.
      
      However in doing so we changed the semantics of the initial
      'backlight.enabled' value. At least on Pixelbooks, they  were relying
      on the brightness level in DP_EDP_BACKLIGHT_BRIGHTNESS_MSB to be 0 on
      boot such that enabled would be false. This causes the device to be
      enabled when the brightness is set. Without this, brightness control
      doesn't work. So by changing brightness to max, we also flipped enabled
      to be true on boot.
      
      To fix this, make enabled a function of brightness and backlight control
      mechanism.
      
      Fixes: 79946723 ("drm/i915: Assume 100% brightness when not in DPCD control mode")
      Cc: Lyude Paul <lyude@redhat.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
      Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Kevin Chowski <chowski@chromium.org>>
      Signed-off-by: default avatarSean Paul <seanpaul@chromium.org>
      Reviewed-by: default avatarLyude Paul <lyude@redhat.com>
      Signed-off-by: default avatarLyude Paul <lyude@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200918002845.32766-1-sean@poorly.run
      (cherry picked from commit 4ade8f31c25bef7ce7ed4d7cbac17df7c4bad850)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      354842df
    • Peilin Ye's avatar
      Fonts: Support FONT_EXTRA_WORDS macros for font_6x8 · 272d7089
      Peilin Ye authored
      Recently, in commit 6735b463 ("Fonts: Support FONT_EXTRA_WORDS macros
      for built-in fonts"), we wrapped each of our built-in data buffers in a
      `font_data` structure, in order to use the following macros on them, see
      include/linux/font.h:
      
      	#define REFCOUNT(fd)	(((int *)(fd))[-1])
      	#define FNTSIZE(fd)	(((int *)(fd))[-2])
      	#define FNTCHARCNT(fd)	(((int *)(fd))[-3])
      	#define FNTSUM(fd)	(((int *)(fd))[-4])
      
      	#define FONT_EXTRA_WORDS 4
      
      Do the same thing to our new 6x8 font. For built-in fonts, currently we
      only use FNTSIZE(). Since this is only a temporary solution for an
      out-of-bounds issue in the framebuffer layer (see commit 5af08640
      ("fbcon: Fix global-out-of-bounds read in fbcon_get_font()")), all the
      three other fields are intentionally set to zero in order to discourage
      using these negative-indexing macros.
      Signed-off-by: default avatarPeilin Ye <yepeilin.cs@gmail.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/926453876c92caac34cba8545716a491754d04d5.1603037079.git.yepeilin.cs@gmail.com
      272d7089
    • Peilin Ye's avatar
      docs: fb: Add font_6x8 to available built-in fonts · eda4a7bf
      Peilin Ye authored
      Recently we added a new 6x8 font in commit e2028c8e ("lib/fonts: add
      font 6x8 for OLED display"). Add its name to the "compiled-in fonts"
      list.
      Signed-off-by: default avatarPeilin Ye <yepeilin.cs@gmail.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201018205401.698242-1-yepeilin.cs@gmail.com
      eda4a7bf
  5. 18 Oct, 2020 3 commits
  6. 14 Oct, 2020 3 commits