1. 14 Nov, 2016 6 commits
  2. 10 Nov, 2016 11 commits
  3. 09 Nov, 2016 6 commits
  4. 08 Nov, 2016 8 commits
  5. 07 Nov, 2016 9 commits
    • drm/i915: Mark CPU cache as dirty when used for rendering · 7aa6ca61
      Chris Wilson authored
      On LLC, or even snooped, machines, rendering via the GPU ends up in the
      CPU cache. These dirty cachelines also need to be flushed to main memory
      when moving to an incoherent domain, such as the display's scanout engine.
      Mostly, the flush already happens because the object is marked as dirty
      from its first use, or it is avoided by setting the object into the
      display domain from the start.
      
      v2: Treat WT as not requiring a clflush prior to use on the display
      engine as well.
      
      Fixes: 0f71979a ("drm/i915: Performed deferred clflush inside set-cache-level")
      References: https://bugs.freedesktop.org/show_bug.cgi?id=95414
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v4.0+
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161107165204.7008-1-chris@chris-wilson.co.uk
    • drm/i915: Add assert for no pending GPU requests during suspend/resume in LR mode · 31ab49ab
      Imre Deak authored
      During resume we reset the SW/HW tracking of each ring's head/tail
      pointers, so we are not prepared to replay any pending requests (unlike
      at GPU reset time). Add an assert for this to both the suspend and the
      resume code.
      
      v2:
      - Check for ELSP port idle already during suspend and check !gt.awake
        during resume. (Chris)
      v3:
      - Move the !gt.awake check to i915_gem_resume().
      v4:
      - s/intel_lr_engines_idle/intel_execlists_idle/ (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-4-git-send-email-imre.deak@intel.com
    • drm/i915: Make sure engines are idle during GPU idling in LR mode · 0cb5670b
      Imre Deak authored
      We assume that the GPU is idle once we receive the seqno via the last
      request's user interrupt. In execlist mode, however, the corresponding
      context completed interrupt can be delayed, and until this latter
      interrupt arrives we consider the request to be pending on the ELSP
      submit port. This can cause a problem during system suspend, where this
      last request will be seen by the resume code as still pending. Such
      pending requests are normally replayed after a GPU reset, but during
      resume we reset both SW and HW tracking of the ring head/tail pointers,
      so replaying the pending request with its stale tail pointer will leave
      the ring in an inconsistent state. A subsequent request submission can
      then lead to the GPU executing from an uninitialized area in the ring
      behind the above stale tail pointer.
      
      Fix this by making sure any pending request on the ELSP port is
      completed before suspending. I used a polling wait since the completion
      time I measured was <1ms and since normally we only need to wait during
      system suspend. GPU idling during runtime suspend is scheduled with a
      delay (currently 50-100ms) after the retirement of the last request at
      which point the context completed interrupt must have arrived already.
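
      The polling wait described above can be illustrated with a small
      user-space sketch. This is not the driver's code: `poll_until_idle`,
      `idle_fn` and `countdown_idle` are hypothetical names, and the
      predicate simply stands in for "no request is pending on the ELSP
      port". It only shows the shape of a bounded busy-poll with a timeout.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical predicate: returns true once nothing is pending. */
typedef bool (*idle_fn)(void *ctx);

/* Monotonic clock reading in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Busy-poll is_idle() until it returns true or timeout_ms has elapsed. */
static bool poll_until_idle(idle_fn is_idle, void *ctx, long timeout_ms)
{
    long long deadline;

    assert(is_idle != NULL);
    deadline = now_ns() + (long long)timeout_ms * 1000000LL;

    do {
        if (is_idle(ctx))
            return true;
    } while (now_ns() < deadline);

    /* One last check in case the condition turned true at the deadline. */
    return is_idle(ctx);
}

/* Demo predicate: reports busy for the first *remaining polls, then idle. */
static bool countdown_idle(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}
```

      A busy-poll like this only makes sense when the expected wait is very
      short, which matches the sub-millisecond completion time reported above.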
      
      The chance of this bug was increased by
      
      commit 1c777c5d
      Author: Imre Deak <imre.deak@intel.com>
      Date:   Wed Oct 12 17:46:37 2016 +0300
      
          drm/i915/hsw: Fix GPU hang during resume from S3-devices state
      
      but it could happen even without the explicit GPU reset, since we
      disable interrupts afterwards during the suspend sequence.
      
      v2:
      - Do an unlocked poll-wait first. (Chris)
      v3-4:
      - s/intel_lr_engines_idle/intel_execlists_idle/ and move
        i915.enable_execlists check to the new helper. (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98470
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-3-git-send-email-imre.deak@intel.com
    • drm/i915: Avoid early GPU idling due to race with new request · 93c97dc1
      Imre Deak authored
      There is a small race where a new request can be submitted and retired
      after the idle worker has started to run, which leads to idling the GPU
      too early. Fix this by deferring the idling to the pending instance of
      the worker.
      
      This scenario was pointed out by Chris.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-2-git-send-email-imre.deak@intel.com
    • drm/i915: Avoid early GPU idling due to already pending idle work · 5bd11a34
      Imre Deak authored
      Atm, if an idle work handler is already pending but hasn't yet started
      to run, retiring a new request will not extend the active period as
      required; it simply leaves the pending idle work to run at the original
      expiration time. This may lead to idling the GPU too early. Fix this by
      using the delayed-work scheduler alternative which makes sure the
      handler's expiration time is extended in this case.
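
      The difference between the two scheduling behaviours can be shown with
      a toy model. This is a sketch, not kernel code: `toy_delayed_work`,
      `toy_schedule` and `toy_mod` are made-up names. They are modelled on
      the semantics of the kernel's schedule_delayed_work() (a no-op when the
      work is already pending) and mod_delayed_work() (re-arms the timer);
      the message above does not name the exact helper the patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy delayed work item: only tracks pending state and expiry time. */
struct toy_delayed_work {
    bool pending;
    unsigned long expires; /* absolute time in arbitrary ticks */
};

/* Queue only if not already pending; a pending item keeps its old expiry. */
static bool toy_schedule(struct toy_delayed_work *w, unsigned long now,
                         unsigned long delay)
{
    assert(w != NULL);
    if (w->pending)
        return false;
    w->pending = true;
    w->expires = now + delay;
    return true;
}

/* Queue or re-arm: the expiry always moves to now + delay. */
static bool toy_mod(struct toy_delayed_work *w, unsigned long now,
                    unsigned long delay)
{
    bool was_pending;

    assert(w != NULL);
    was_pending = w->pending;
    w->pending = true;
    w->expires = now + delay;
    return was_pending;
}
```

      With a request retired at tick 0 and another at tick 60, both using a
      delay of 100 ticks, the first variant still expires at tick 100 while
      the second expires at tick 160, which is the extension of the active
      period that the fix is after.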
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Requested-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-1-git-send-email-imre.deak@intel.com
    • drm/i915: Limit Valleyview and earlier to only using mappable scanout · 767a222e
      Chris Wilson authored
      Valleyview appears to be limited to only scanning out from the first 512MiB
      of the Global GTT. Let's presume that this behaviour was inherited from the
      display block copied from g4x (not Ironlake) and all earlier generations
      are similarly affected, though testing suggests different symptoms. For
      simplicity, impose that these platforms must scanout from the mappable
      region. (For extra simplicity, use HAS_GMCH_DISPLAY even though this
      catches Cherryview which does not appear to be limited to the low
      aperture for its scanout.)
      
      v2: Use HAS_GMCH_DISPLAY() to more clearly convey my intent about
      limiting this workaround to the old style of display engine.
      
      v3: Update changelog to reflect testing by Ville Syrjälä
      v4: Include the changes to the comments as well
      Reported-by: Luis Botello <luis.botello.ortega@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98036
      Fixes: 2efb813d ("drm/i915: Fallback to using unmappable memory for scanout")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Akash Goel <akash.goel@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+
      Link: http://patchwork.freedesktop.org/patch/msgid/20161107110128.28762-1-chris@chris-wilson.co.uk
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    • drm/i915: Round tile chunks up for constructing partial VMAs · 0ef723cb
      Chris Wilson authored
      When we split a large object up into chunks for GTT faulting (because we
      can't fit the whole object into the aperture) we have to align our cuts
      with the fence registers. Each partial VMA must cover a complete set of
      tile rows or the offset into each partial VMA is not aligned with the
      whole image. Currently we enforce a minimum size on each partial VMA,
      but this minimum size itself was not aligned to the tile row causing
      distortion.
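
      The alignment requirement can be sketched as plain arithmetic. This is
      an illustration only: `round_up_to` and `partial_chunk_pages` are
      hypothetical helpers, and the page counts in the usage note are made
      up. It assumes a minimum chunk size and a tile row size, both in
      pages, and rounds the former up to a whole number of tile rows.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of step (step need not be a power of two). */
static size_t round_up_to(size_t value, size_t step)
{
    assert(step > 0);
    return ((value + step - 1) / step) * step;
}

/* Hypothetical helper: minimum chunk size, in pages, covering whole tile rows. */
static size_t partial_chunk_pages(size_t min_chunk_pages, size_t tile_row_pages)
{
    return round_up_to(min_chunk_pages, tile_row_pages);
}
```

      For example, a 256-page minimum with a 96-page tile row becomes 288
      pages (three full rows), whereas an unrounded 256-page chunk would end
      partway through the third row.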
      Reported-by: Andreas Reis <andreas.reis@gmail.com>
      Reported-by: Chris Clayton <chris2553@googlemail.com>
      Reported-by: Norbert Preining <preining@logic.at>
      Tested-by: Norbert Preining <preining@logic.at>
      Tested-by: Chris Clayton <chris2553@googlemail.com>
      Fixes: 03af84fe ("drm/i915: Choose partial chunksize based on tile row size")
      Fixes: a61007a8 ("drm/i915: Fix partial GGTT faulting") # enabling patch
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98402
      Testcase: igt/gem_mmap_gtt/medium-copy-odd
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+
      Link: http://patchwork.freedesktop.org/patch/msgid/20161107105443.27855-1-chris@chris-wilson.co.uk
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    • drm/i915: Remove the vma from the object list upon close · dfd2812e
      Chris Wilson authored
      Currently, the vma is being unlinked from the object lookup on destroy.
      However, we are meant to be decoupling it upon close so that the user
      cannot access the closed vma whilst it remains active on the GPU.
      
      [   34.074858] kernel BUG at drivers/gpu/drm/i915/i915_gem_gtt.c:3561!
      [   34.074875] invalid opcode: 0000 [#1] PREEMPT SMP
      [   34.074888] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich mei_me mei snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core i2c_designware_platform i2c_designware_core snd_pcm e1000e ptp pps_core sdhci_acpi sdhci mmc_core i2c_hid [last unloaded: i915]
      [   34.075010] CPU: 1 PID: 6224 Comm: gem_close_race Tainted: G     U          4.9.0-rc3-CI-CI_DRM_1800+ #1
      [   34.075034] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0355.2016.0224.1501 02/24/2016
      [   34.075057] task: ffff8802459a8040 task.stack: ffffc90000524000
      [   34.075074] RIP: 0010:[<ffffffffa0392cbc>]  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
      [   34.075118] RSP: 0018:ffffc90000527b68  EFLAGS: 00010202
      [   34.075135] RAX: ffff8802426c5e40 RBX: 0000000000000000 RCX: ffff8802447fc2a8
      [   34.075158] RDX: 0000000000000000 RSI: ffff8802447fc2a8 RDI: ffff880248a4a880
      [   34.075181] RBP: ffffc90000527b88 R08: 0000000000000008 R09: 0000000000000000
      [   34.075203] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880248a4a880
      [   34.075225] R13: ffff8802447fc2a8 R14: ffff880243e9afa8 R15: ffff880248a4a9c8
      [   34.075248] FS:  00007f9b43e59740(0000) GS:ffff880256c80000(0000) knlGS:0000000000000000
      [   34.075273] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   34.075292] CR2: 00007f9b43419140 CR3: 000000024455d000 CR4: 00000000003406e0
      [   34.075314] Stack:
      [   34.075323]  0000000000000000 ffffc90000527bd0 ffff880243cb8008 ffff880243e9afa8
      [   34.075353]  ffffc90000527c08 ffffffffa03874c7 ffffc90000527bb8 ffff880243e9afa8
      [   34.075383]  ffff880243e9afb0 ffffc90000527e10 ffff8802447fc2a8 ffff880243cb8040
      [   34.075414] Call Trace:
      [   34.075435]  [<ffffffffa03874c7>] eb_lookup_vmas.isra.7+0x247/0x330 [i915]
      [   34.075468]  [<ffffffffa0388c34>] i915_gem_do_execbuffer.isra.15+0x604/0x1a10 [i915]
      [   34.075507]  [<ffffffffa039c957>] ? i915_gem_object_get_sg+0x347/0x380 [i915]
      [   34.075532]  [<ffffffff811a69ce>] ? __might_fault+0x3e/0x90
      [   34.075562]  [<ffffffffa038a430>] i915_gem_execbuffer2+0xc0/0x250 [i915]
      [   34.075585]  [<ffffffff81552926>] drm_ioctl+0x1f6/0x480
      [   34.075604]  [<ffffffff8100107a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
      [   34.075635]  [<ffffffffa038a370>] ? i915_gem_execbuffer+0x330/0x330 [i915]
      [   34.075658]  [<ffffffff81202d2e>] do_vfs_ioctl+0x8e/0x690
      [   34.075677]  [<ffffffff8181582d>] ? _raw_spin_unlock_irqrestore+0x3d/0x60
      [   34.075700]  [<ffffffff810fcd51>] ? SyS_timer_settime+0x141/0x1e0
      [   34.075721]  [<ffffffff810d6de2>] ? trace_hardirqs_on_caller+0x122/0x1b0
      [   34.075742]  [<ffffffff8120336c>] SyS_ioctl+0x3c/0x70
      [   34.075760]  [<ffffffff8181602e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [   34.075781] Code: 44 a0 48 c7 c2 9a 7e 43 a0 be e0 0d 00 00 48 c7 c7 a0 45 44 a0 e8 55 b8 ce e0 48 85 db 74 a3 49 83 bd f8 03 00 00 00 74 99 0f 0b <0f> 0b 48 89 da 4c 89 ee 4c 89 e7 e8 04 a9 ff ff 48 89 da 49 89
      [   34.075955] RIP  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
      [   34.075994]  RSP <ffffc90000527b68>
      
      Testcase: igt/gem_close_race/basic-threads
      Fixes: db6c2b41 ("drm/i915: Store the vma in an rbtree...")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161104161241.25871-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    • drm/i915/gvt: implement scratch page table tree for shadow PPGTT · 3b6411c2
      Ping Gao authored
      All the unused entries in the page table tree (PML4E->PDPE->PDE->PTE)
      should point to a scratch page table or scratch page, to avoid page
      walk errors caused by page prefetching.
      When removing an entry from the shadow PPGTT, it also needs to be mapped
      to a scratch page. The older implementation assigned a single scratch
      page to entries at every level, which does not match the page walk
      behaviour when the removed entry is in the PML4, PDP or PD. To avoid
      potential page walk errors, this patch implements a scratch page tree
      to replace the single scratch page.
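
      One way to picture the scratch tree is as a per-level mapping from a
      removed entry to the scratch object it should point at. The sketch
      below is a conceptual model, not the GVT implementation: the enum
      names and both functions are hypothetical, and it assumes that every
      entry of a scratch table points to the scratch object one level down.

```c
#include <assert.h>

/* Level at which a page table entry lives. */
enum pt_level { PT_LEVEL_PTE = 0, PT_LEVEL_PDE, PT_LEVEL_PDPE, PT_LEVEL_PML4E };

/* Scratch object an entry can point to. */
enum scratch_kind { SCRATCH_PAGE = 0, SCRATCH_PT, SCRATCH_PD, SCRATCH_PDP };

/* An unused or removed entry points at the scratch object of the level below. */
static enum scratch_kind scratch_for_entry(enum pt_level level)
{
    switch (level) {
    case PT_LEVEL_PTE:   return SCRATCH_PAGE;
    case PT_LEVEL_PDE:   return SCRATCH_PT;
    case PT_LEVEL_PDPE:  return SCRATCH_PD;
    case PT_LEVEL_PML4E: return SCRATCH_PDP;
    }
    assert(0 && "unknown page table level");
    return SCRATCH_PAGE;
}

/* Count the hops a page walk takes from a removed entry to the scratch page. */
static int walk_hops_to_scratch_page(enum pt_level level)
{
    enum scratch_kind at = scratch_for_entry(level);
    int hops = 1;

    while (at != SCRATCH_PAGE) {
        /* Every entry of a scratch table points one level further down. */
        at = (enum scratch_kind)(at - 1);
        hops++;
    }
    return hops;
}
```

      In this model a walk from a removed PML4E still passes through a table
      at each remaining level before reaching the scratch page, which a
      single shared scratch page cannot provide.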
      
      v2: Add more details to the commit message to address Kevin's comments.
      Signed-off-by: Ping Gao <ping.a.gao@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>