Commits · fcd46e53449c4d659ffbedcd2823ea2f73e39927 · nexedi / linux

12 Jan, 2017 12 commits

drm/i915: Declare i915_gem_object_create_internal() as taking phys_addr_t size · fcd46e53

Chris Wilson authored Jan 12, 2017

The internal object is a collection of struct pages and so is
intrinsically linked to the available physical memory on the machine,
and not an arbitrary type from the uabi. Use phys_addr_t so the link
between size and memory consumption is clear, and then double check that
we don't overflow the maximum object size.

v2: Also assert that size is not zero - a mistake I made a few times
while writing selftests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112130431.1844-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

fcd46e53

drm/i915: Invalidate the guc ggtt TLB upon insertion · 7c3f86b6

Chris Wilson authored Jan 12, 2017

Move the GuC invalidation of its ggtt TLB to where we perform the ggtt
modification rather than proliferate it into all the callers of the
insert (which may or may not in fact have to do the insertion).

v2: Just do the guc invalidate unconditionally, (afaict) it has no impact
without the guc loaded on gen8+
v3: Conditionally invalidate the guc - just in case that register has
not been validated for other modes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112110050.25333-1-chris@chris-wilson.co.uk

7c3f86b6

drm/i915: Remove useless casts to intel_plane_state · 1d4258db

Maarten Lankhorst authored Jan 12, 2017

The visible member used to be in intel_plane_state->visible,
but has been moved to drm_plane_state->visible. In the conversion
some casts were left in that are now useless.

to_intel_plane_state(x)->base.visible is the same as x->visible,
so use the latter to clear up the code a little.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484214225-30328-1-git-send-email-maarten.lankhorst@linux.intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

1d4258db

drm/i915: Update i915_reset parameter for kerneldoc · df210574

Michel Thierry authored Jan 11, 2017

Since commit c033666a ("drm/i915: Store a i915 backpointer from
engine, and use it") i915_reset receives dev_priv, but the kerneldoc
was not updated.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112041817.1102-3-michel.thierry@intel.com

df210574

drm/i915: Keep i915_handle_error kerneldoc parameters together · 87c390b6

Michel Thierry authored Jan 11, 2017

And before the function description.
Tidy up from commit 14bb2c11 ("drm/i915: Fix a buch of kerneldoc
warnings"), all others kerneldoc blocks look ok.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112041817.1102-2-michel.thierry@intel.com

87c390b6

drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround. · 8726f2fa

Francisco Jerez authored Jan 12, 2017

The WaDisableLSQCROPERFforOCL workaround has the side effect of
disabling an L3SQ optimization that has huge performance implications
and is unlikely to be necessary for the correct functioning of usual
graphic workloads.  Userspace is free to re-enable the workaround on
demand, and is generally in a better position to determine whether the
workaround is necessary than the DRM is (e.g. only during the
execution of compute kernels that rely on both L3 fences and HDC R/W
requests).

The same workaround seems to apply to BDW (at least to production
stepping G1) and SKL as well (the internal workaround database claims
that it does for all steppings, while the BSpec workaround table only
mentions pre-production steppings), but the DRM doesn't do anything
beyond whitelisting the L3SQCREG4 register so userspace can enable it
when it sees fit.  Do the same on KBL platforms.

Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
This is followed by a regression of 35% and 10% respectively for the
same benchmarks and platform caused by my recent patch series
switching userspace to use the dataport constant cache instead of the
sampler to implement uniform pull constant loads, which caused us to
hit more heavily the L3 cache (and on platforms other than KBL had the
opposite effect of improving performance of the same two benchmarks).
The overall effect on KBL of this change combined with the recent
userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
was affected by the constant cache changes (though it improved as it
did on other platforms rather than regressing), but is not
significantly affected by this patch (with statistical significance of
5% and sample size 20).

v2: Drop some more code to avoid unused variable warning.

Fixes: 738fa1b3 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: beignet@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[Removed double Fixes tag]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com

8726f2fa

drm/i915/guc: Make sure vma containing firmware is GuC mappable · 83796f26

Michał Winiarski authored Jan 11, 2017

Since commit 4741da92 ("drm/i915/guc: Assert that all GGTT offsets used
by the GuC are mappable"), we're asserting that GuC firmware is in the
GuC mappable range.
Except we're not pinning the object with bias, which means it's possible
to trigger this assert. Let's add a proper bias.

Fixes: 4741da92 ("drm/i915/guc: Assert that all GGTT offsets used by the GuC are mappable")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111151739.28965-1-michal.winiarski@intel.com

83796f26

ALSA: Documentation about HDA DP MST pin init and connection · dd48e8ed

Libin Yang authored Jan 12, 2017

Add the documentation about HD-audio DP MST:
1. pin initialization
2. device entry connection list
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Libin Yang <libin.yang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1484208294-8637-4-git-send-email-libin.yang@intel.com

dd48e8ed

ALSA: hda - add DP MST audio support · 9152085d

Libin Yang authored Jan 12, 2017

This patch adds the DP MST audio support on i915 platform and
it will enable dyn_pcm_assign feature.

DP MST supports several device entry on the same port and each
device entry can map to one pcm stream. For example, on i915,
there are 3 pins, and each pin has 3 device entries. This means
there should be 3x3 pcms. However, there is only 3 pipe lines in
i915. This means 3 pcms are actived at most at the same moment.
We will create 5 pcms (pin number + dev entry num - 1) in this case.
For the details, please refer commit a76056f2
("ALSA: hda - hdmi dynamically bind PCM to pin when monitor hotplug")

Each device entry is a virtual pin. It is described by pin_nid and dev_id
in struct hdmi_spec_per_pin.
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1484208294-8637-3-git-send-email-libin.yang@intel.com

9152085d

ALSA: hda - add DP mst verb support · 13800f39

Libin Yang authored Jan 12, 2017

Add snd_hda_get_dev_select() and snd_hda_set_dev_select() functions
for DP MST audio support.
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1484208294-8637-2-git-send-email-libin.yang@intel.com

13800f39

drm/i915: check ppgtt validity when init reg state · 34869776

Zhenyu Wang authored Jan 09, 2017

Check if ppgtt is valid for context when init reg state. For gvt
context which has no i915 allocated ppgtt, failed to check that
would cause kernel null ptr reference error.

v2: remove !48bit ppgtt case as we'll always update before submit (Chris)
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109131453.3943-1-zhenyuw@linux.intel.com

34869776

drm/i915: Detect vma reserved for execbuf in evict-for-node · 16ee2061

Chris Wilson authored Jan 11, 2017

The vma->exec_list is still the only means we have for both reserving an
object in execbuf, and for constructing the eviction list. So during the
construction of the eviction list, we must treat anything already on the
exec_list as being pinned.

Yes, this sharing of two semantically different lists will be fixed! But
in the meantime, we have the issue that this is tripping up CI since we
started using i915_gem_gtt_reserve_node() + i915_gem_evict_for_node()
from the regular execbuf reservation path in commit 606fec95
("drm/i915: Prefer random replacement before eviction search"):

[  108.424063] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:254!
[  108.424072] invalid opcode: 0000 [#1] PREEMPT SMP
[  108.424079] Modules linked in: snd_hda_intel i915 intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me snd_pcm lpc_ich mei sdhci_pci sdhci mmc_core e1000e ptp pps_core [last unloaded: i915]
[  108.424132] CPU: 1 PID: 6865 Comm: gem_cs_tlb Tainted: G     U          4.10.0-rc3-CI-CI_DRM_2049+ #1
[  108.424143] Hardware name: Hewlett-Packard HP EliteBook 8440p/172A, BIOS 68CCU Ver. F.24 09/13/2013
[  108.424154] task: ffff88012ae22600 task.stack: ffffc90000a14000
[  108.424220] RIP: 0010:i915_gem_evict_for_node+0x237/0x410 [i915]
[  108.424229] RSP: 0018:ffffc90000a17a58 EFLAGS: 00010202
[  108.424237] RAX: 0000000000005871 RBX: ffff88012d1ad778 RCX: 0000000000000000
[  108.424246] RDX: 000000007ffff000 RSI: ffffc90000a17a68 RDI: ffff880127e694d8
[  108.424255] RBP: ffffc90000a17aa0 R08: ffffc90000a17a68 R09: 0000000000000000
[  108.424264] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000080000000
[  108.424273] R13: ffffc90000a17a68 R14: ffff880127e694d8 R15: ffffffffa0387330
[  108.424283] FS:  00007f8236e3d8c0(0000) GS:ffff880137c40000(0000) knlGS:0000000000000000
[  108.424293] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  108.424305] CR2: 00007f82347a2000 CR3: 000000012c866000 CR4: 00000000000006e0
[  108.424317] Call Trace:
[  108.424368]  i915_gem_gtt_reserve+0x67/0x80 [i915]
[  108.424424]  __i915_vma_do_pin+0x248/0x620 [i915]
[  108.424487]  ? __i915_vma_do_pin+0x162/0x620 [i915]
[  108.424540]  i915_gem_execbuffer_reserve_vma.isra.8+0x153/0x1f0 [i915]
[  108.424591]  i915_gem_execbuffer_reserve.isra.9+0x40e/0x440 [i915]
[  108.424643]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
[  108.424696]  i915_gem_execbuffer2+0xc0/0x250 [i915]
[  108.424712]  drm_ioctl+0x200/0x450
[  108.424760]  ? i915_gem_execbuffer+0x330/0x330 [i915]
[  108.424776]  do_vfs_ioctl+0x90/0x6e0
[  108.424789]  ? up_read+0x1a/0x40
[  108.424800]  ? trace_hardirqs_on_caller+0x122/0x1b0
[  108.424813]  SyS_ioctl+0x3c/0x70
[  108.424828]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  108.424839] RIP: 0033:0x7f8235867357
[  108.424848] RSP: 002b:00007ffdc14504c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  108.424866] RAX: ffffffffffffffda RBX: 00007ffdc1450600 RCX: 00007f8235867357
[  108.424878] RDX: 00007ffdc14505a0 RSI: 0000000040406469 RDI: 0000000000000003
[  108.424890] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
[  108.424903] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
[  108.424915] R13: 0000000000419101 R14: 00007ffdc1450600 R15: 00007ffdc14505f0
[  108.424928] Code: 45 b8 8b 4d c0 4c 89 f2 48 89 de ff d0 49 8b 07 4c 8b 45 b8 48 85 c0 75 dd 65 ff 0d d4 a1 c8 5f 0f 84 47 01 00 00 e9 0d fe ff ff <0f> 0b 45 31 f6 4c 8b 65 c8 49 8b 04 24 4d 39 ec 49 8d 9c 24 28
[  108.425055] RIP: i915_gem_evict_for_node+0x237/0x410 [i915] RSP: ffffc90000a17a58

Fixes: 172ae5b4 ("drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)")
Fixes: 606fec95 ("drm/i915: Prefer random replacement before eviction search")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111182132.19174-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

16ee2061

11 Jan, 2017 4 commits

drm/i915: Add a sanity check that no request is submitted in the middle · c781c978

Chris Wilson authored Jan 11, 2017

It is an error to start a new request on the same timeline (ringbuffer)
as the current one before the current is submitted. If there are two
requests emitting to the ringbuffer at the same time, the operation is
undefined. We can catch this by checking for the timeline having a later
seqno than ours when we come to submit our request.

Currently we have this check at the end of __i915_add_request, but
having an early check as well isolates a failure in the caller versus a
failure in sealing the request (i.e. from inside __i915_add_request
itself). For example, CI is currently tripping over this late assertion
on ctg/ilk:

[  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
[  100.336333] ------------[ cut here ]------------
[  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
[  100.336347] invalid opcode: 0000 [#1] PREEMPT SMP
[  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
[  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G     U          4.10.0-rc3-CI-CI_DRM_2045+ #1
[  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
[  100.336386] task: ffff88012b738040 task.stack: ffffc90000560000
[  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
[  100.336445] RSP: 0018:ffffc90000563ac0 EFLAGS: 00010212
[  100.336451] RAX: 0000000000005d52 RBX: ffff880133bb84c0 RCX: 0000000000000001
[  100.336456] RDX: 0000000080000001 RSI: ffff88012b738860 RDI: 00000000ffffffff
[  100.336461] RBP: ffffc90000563b00 R08: ffff880133bb8780 R09: 0000000000000000
[  100.336466] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012f53d950
[  100.336472] R13: ffff88012a2b0af8 R14: ffff88012a5b0008 R15: ffff88012f53d960
[  100.336477] FS:  00007f0d19da38c0(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
[  100.336483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.336488] CR2: 00007f0d17706000 CR3: 000000012aa3e000 CR4: 00000000000406f0
[  100.336496] Call Trace:
[  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
[  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
[  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
[  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
[  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
[  100.336666]  drm_ioctl+0x200/0x450
[  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
[  100.336708]  do_vfs_ioctl+0x90/0x6e0
[  100.336716]  ? up_read+0x1a/0x40
[  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
[  100.336730]  SyS_ioctl+0x3c/0x70
[  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  100.336745] RIP: 0033:0x7f0d187cb357
[  100.336750] RSP: 002b:00007ffe0b2f7c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  100.336761] RAX: ffffffffffffffda RBX: 00007ffe0b2f7d60 RCX: 00007f0d187cb357
[  100.336768] RDX: 00007ffe0b2f7d00 RSI: 0000000040406469 RDI: 0000000000000003
[  100.336775] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
[  100.336782] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
[  100.336789] R13: 0000000000419101 R14: 00007ffe0b2f7d60 R15: 00007ffe0b2f7d50
[  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
[  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: ffffc90000563ac0
[  100.336886] ---[ end trace 22b36545479e5eb7 ]---
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111140858.1922-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

c781c978

drm/i915: Prefer random replacement before eviction search · 606fec95

Chris Wilson authored Jan 11, 2017

Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (several orders of runtime improvement for gem_concurrent_blit
4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
(later) mitigated firstly by only doing replacement if we find no
freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).

v2: Ascii-art, and check preconditionst
v3: Rephrase final sentence in comment to explain why we don't bother
with if (i915_is_ggtt(vm)) for preferring random replacement.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-3-chris@chris-wilson.co.uk

606fec95

drm/i915: Extract reserving space in the GTT to a helper · 625d988a

Chris Wilson authored Jan 11, 2017

Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its
own routine so that it can be shared rather than duplicated.

v2: Kerneldoc
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: igvt-g-dev@lists.01.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk

625d988a

drm/i915: Use the MRU stack search after evicting · e007b19d

Chris Wilson authored Jan 11, 2017

When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implements (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.
v3: Detect invalid calls to i915_gem_gtt_insert()
v4: kerneldoc
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk

e007b19d

10 Jan, 2017 24 commits

drm/i915/psr: disable psr2 for resolution greater than 32X20 · acf45d11

Nagaraju, Vathsala authored Jan 10, 2017

PSR2 is restricted to work with panel resolutions upto 3200x2000,
move the check to intel_psr_match_conditions and fully block psr.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484031746-20874-1-git-send-email-vathsala.nagaraju@intel.com

acf45d11

drm/i915/psr: program vsc header for psr2 · 97da2ef4

Nagaraju, Vathsala authored Jan 02, 2017

Function hsw_psr_setup handles vsc header setup for psr1 and
skl_psr_setup_vsc handles vsc header setup for psr2.

Setup VSC header in function skl_psr_setup_vsc for psr2 support,
as per edp 1.4 spec, table 6-11:VSC SDP HEADER Extension for psr2
operation.

v2: (Jani)
- Initialize variables to 0
- intel_dp_get_y_cord_status and intel_dp_get_y_cord_status made static
- Correct indentation for continuation lines
- Change DP_PSR_Y_COORDINATE to  DP_PSR2_SU_Y_COORDINATE_REQUIRED
- Change DPRX_FEATURE_ENUMERATION_LIST to DP_DPRX_*
- Change VSC_SDP_EXT_FOR_COLORIMETRY_SUPPORTED to DP_VSC_*

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1483356663-32668-3-git-send-email-vathsala.nagaraju@intel.com

97da2ef4

drm : adds Y-coordinate and Colorimetry Format · d0ce9062

Nagaraju, Vathsala authored Jan 02, 2017

PSR2 vsc revision number hb2( as per table 6-11)is updated to
4 or 5 based on Y cordinate and Colorimetry Format as below
04h = 3D stereo + PSR/PSR2 + Y-coordinate.
05h = -3D stereo- + PSR/PSR2 + Y-coordinate + Pixel Encoding/Colorimetry
Format indication. A DP Source device is allowed to indicate the pixel
encoding/colorimetry format to the DP Sink device with VSC SDP only when
the DP Sink device supports it (
i.e.,VSC_SDP_EXTENSION_FOR_COLORIMETRY_SUPPORTED bit in the
DPRX_FEATURE_ENUMERATION_LIST register (DPCD Address 02210h, bit 3;
is set to 1).

v2: (Jani)
- Change DP_PSR_Y_COORDINATE to DP_PSR2_SU_Y_COORDINATE_REQUIRED.
- Add DP_PSR2_SU_GRANULARITY_REQUIRED.
- Change DPRX_FEATURE_ENUMERATION_LIST to DP_DPRX.
- Add GTC_CAP and AV_SYNC_CAP, other bits in DPRX_FEATURE_ENUMERATION_LIST.

v3: (Jani)
- Add support for bits 7:4 and 1 as per DP v1.4 for
  DPRX_FEATURE_ENUMERATION_LIST.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1483356663-32668-2-git-send-email-vathsala.nagaraju@intel.com

d0ce9062

drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE · f51455d4

Chris Wilson authored Jan 10, 2017

Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
code comprehension and future changes. In the future, we may want to use
variable GTT page sizes and so have the challenge of knowing which
hardcoded values were used to represent a physical page vs the virtual
page.

v2: Look for a few more 4096s to convert, discover IS_ALIGNED().
v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep
bdw stolen w/a as 4096 until we know better.
v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page
sizes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

f51455d4

drm/i915: Rename i915_gem_engine_cleanup() to engine_set_wedged() · 2a20d6f8

Chris Wilson authored Jan 10, 2017

It has been some time since i915_gem_engine_cleanup was only called from
the module unload path, and now it is only called when the GPU is
wedged. Mika complained that the name is confusing, especially in light
of the existence of i915_gem_cleanup_engines().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-5-chris@chris-wilson.co.uk

2a20d6f8

drm/i915: Mark all incomplete requests as -EIO when wedged · 3cd9442f

Chris Wilson authored Jan 10, 2017

Similarly to a normal reset, after we mark the GPU as wedged (completely
fubar and no more requests can be executed), set the error status on all
the in flight requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-4-chris@chris-wilson.co.uk

3cd9442f

drm/i915: Set an error status for a resubmitted request · 3c1b2847

Chris Wilson authored Jan 10, 2017

Let userspace know if its request was resubmitted due to it being
executed at the time of a global reset. In this case, the reset was for
a guilty request on another engine, and this request was an innocent
victim that will be re-executed upon restarting. However, since it was
running at the time of the reset, we can not guarantee that it suffered
no ill-effects from the reset (e.g. some context state may be lost, or
some self-modifying fragment shaders will be restarted from the final
state not their initial state), to let userspace know that it has been
corrupted set a special value on the fence->error, -EAGAIN.

If the request does hang on resubmission, the error will be overwritten
with -EIO.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-3-chris@chris-wilson.co.uk

3c1b2847

drm/i915: Set guilty-flag on fence after detecting a hang · c0d5f32c

Chris Wilson authored Jan 10, 2017

The struct dma_fence carries a status field exposed to userspace by
sync_file. This is inspected after the fence is signaled and can convey
whether or not the request completed successfully, or in our case if we
detected a hang during the request (signaled via -EIO in
SYNC_IOC_FILE_INFO).

v2: Mark all cancelled requests as failed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-2-chris@chris-wilson.co.uk

c0d5f32c

drm/i915: Consolidate reset_request() · 2edc6e0d

Chris Wilson authored Jan 10, 2017

Always reset the requests of the guilty context, including the hung
request that we tell the hardware to skip. This should help if the
reprogram fails entirely, but more importantly makes the guilty path
more uniform (and simplifies the subsequent patch to tweak the cancelled
requests).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-1-chris@chris-wilson.co.uk

2edc6e0d

drm/i915: Put "cooked" vlank counters in frame CRC lines · 246ee524

Tomeu Vizoso authored Jan 10, 2017

Use drm_accurate_vblank_count so we have the full 32 bit to represent
the frame counter and userspace has a simpler way of knowing when the
counter wraps around.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110134305.26326-3-tomeu.vizoso@collabora.com

246ee524

drm/i915: Use new CRC debugfs API · 8c6b709d

Tomeu Vizoso authored Jan 10, 2017

The core provides now an ABI to userspace for generation of frame CRCs,
so implement the ->set_crc_source() callback and reuse as much code as
possible with the previous ABI implementation.

When handling the pageflip interrupt, we skip 1 or 2 frames depending on
the HW because they contain wrong values. For the legacy ABI for
generating frame CRCs, this was done in userspace but now that we have a
generic ABI it's better if it's not exposed by the kernel.

v2:
- Leave the legacy implementation in place as the ABI implementation
in the core is incompatible with it.
v3:
- Use the "cooked" vblank counter so we have a whole 32 bits.
- Make sure we don't mess with the state of the legacy CRC capture
ABI implementation.
v4:
- Keep use of get_vblank_counter as in the legacy code, will be
changed in a followup commit.

v5:
- Skip first frame or two as it's known that they contain wrong
data.
- A few fixes suggested by Emil Velikov.

v6:
- Rework programming of the HW registers to preserve previous
behavior.

v7:
- Address whitespace issue.
- Added a comment on why in the implementation of the new ABI we
skip the 1st or 2nd frames.

v9:
- Add stub for intel_crtc_set_crc_source.

v12:
- Rebased.
- Remove stub for intel_crtc_set_crc_source and instead set the
callback to NULL (Jani Nikula).

v15:
- Rebased.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>

irq
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110134305.26326-2-tomeu.vizoso@collabora.com

8c6b709d

Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued · 05adb185

Daniel Vetter authored Jan 10, 2017

Pull in latest drm-next from Dave Airlie to get at all the drm-misc
goodies, specifically:
- dma_fence error state handling rework (Chris needs that for error
  recovery)
- crc support locking changes (Tomeu's i915 crc patches need that).
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

05adb185

drm/i915: Split out i915_gem_object_set_tiling() · 957870f9

Chris Wilson authored Jan 10, 2017

Expose an interface for changing the tiling and stride on an object,
that includes the complexity of checking for conflicting bindings and
fence registers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110121045.27144-2-chris@chris-wilson.co.uk

957870f9

drm/i915: Include ioctl in set-tiling and get-tiling function names · 111dbcab

Chris Wilson authored Jan 10, 2017

Make it clear that these functions are the user entry points for
the tiling/fence registers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110121045.27144-1-chris@chris-wilson.co.uk

111dbcab

drm/i915: Clip the partial view against the object not vma · 8201c1fa

Chris Wilson authored Jan 10, 2017

The VMA is later clipped against the vm_area_struct before insertion of
the faulting PTE so we are free to create the partial view as we desire.
If we use the object as the extents rather than the area, this partial
can then be used for other areas.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110095633.6612-2-chris@chris-wilson.co.uk

8201c1fa

drm/i915: Extract compute_partial_view() · 2d4281bb

Chris Wilson authored Jan 10, 2017

In order to reuse the partial view for selftesting, extract the common
function for computing the view.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110095633.6612-1-chris@chris-wilson.co.uk

2d4281bb

drm/i915/glk: Convert a few more IS_BROXTON() to IS_GEN9_LP() · 254e0931

Michel Thierry authored Jan 09, 2017

Commit cc3f90f0 ("drm/i915/glk: Reuse broxton code for geminilake")
missed a few of occurences of IS_BROXTON() that should have been
coverted to IS_GEN9_LP().

v2: Cite the right commit. (Ander)

Fixes: cc3f90f0 ("drm/i915/glk: Reuse broxton code for geminilake")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> (v1)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1483973495-15138-1-git-send-email-ander.conselvan.de.oliveira@intel.com

254e0931

drm/i915/glk: Add missing bits to allow runtime pm suspend on GLK. · b9fd799e

Rodrigo Vivi authored Dec 16, 2016

Besides having the DMC firmware in place and loaded let's
handle runtime suspend and dc9 as we do for Broxton.

Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1481902946-18593-2-git-send-email-ander.conselvan.de.oliveira@intel.com

b9fd799e

drm/i915/DMC/GLK: Load DMC on GLK · dbb28b5c

Anusha Srivatsa authored Dec 16, 2016

This patch loads the DMC on GLK. There is a single
firmware image for all steppings on a GLK.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1481902946-18593-1-git-send-email-ander.conselvan.de.oliveira@intel.com

dbb28b5c

drm/i915: Move ggtt fence/alignment to i915_gem_tiling.c · 91d4e0aa

Chris Wilson authored Jan 09, 2017

Rename i915_gem_get_ggtt_size() and i915_gem_get_ggtt_alignment() to
i915_gem_fence_size() and i915_gem_fence_alignment() respectively to
better match usage. Similarly move the pair of functions into
i915_gem_tiling.c next to the fence restrictions.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-6-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

91d4e0aa

drm/i915: Remove the rounding down of the gen4+ fence region · cea84d16

Chris Wilson authored Jan 09, 2017

Restricting the fence to the end of the previous tile-row breaks access
to the final portion of the object. On gen2/3 we employed lazy fencing
to pad out the fence with scratch page to provide access to the tail,
and now we also pad out the object on gen4+ we can apply the same fix.

Fixes: af1a7301 ("drm/i915: Only fence tiled region of object.")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-5-chris@chris-wilson.co.uk

cea84d16

drm/i915: Store required fence size/alignment for GGTT vma · 944397f0

Chris Wilson authored Jan 09, 2017

The fence size/alignment is a combination of the vma size plus object
tiling parameters. Those parameters are rarely changed, making the fence
size/alignemnt roughly constant for the lifetime of the VMA. We can
simplify subsequent calculations by precalculating the size/alignment
required for GGTT vma taking fencing into account (with an update if we
do change the tiling or stride).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-4-chris@chris-wilson.co.uk

944397f0

drm/i915: Replace WARNs in fence register writes with extensive asserts · 0d4e8f1d

Chris Wilson authored Jan 09, 2017

All of these conditions are prechecked by i915_tiling_ok() before we
allow setting the tiling/stride on the object and so we should never
fail asserting those conditions before writing the register.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-3-chris@chris-wilson.co.uk

0d4e8f1d

drm/i915: Align GGTT sizes to a fence tile row · 5b30694b

Chris Wilson authored Jan 09, 2017

Ensure the view occupies the full tile row so that reads/writes into the
VMA do not escape (via fenced detiling) into neighbouring objects - we
will pad the object with scratch pages to satisfy the fence. This
applies the lazy-tiling we employed on gen2/3 to gen4+.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-2-chris@chris-wilson.co.uk

5b30694b