Commits · 90d27a1b180e51ef071350a302648b41fe884ff2 · Kirill Smelkov / linux

14 Nov, 2016 1 commit

drm/i915/gvt: fix deadlock in workload_thread · 90d27a1b

Pei Zhang authored Nov 14, 2016

It's a classical abba type deadlock when using 2 mutex objects, which
are gvt.lock(a) and drm.struct_mutex(b). Deadlock happens in threads:
1. intel_gvt_create/destroy_vgpu: P(a)->P(b)
2. workload_thread: P(b)->P(a)

Fix solution is align the lock acquire sequence in both threads. This
patch choose to adjust the sequence in workload_thread function.

This fixed lockup symptom for guest-reboot stress test.

v2: adjust sequence in workload_thread based on zhenyu's suggestion.
    adjust sequence in create/destroy_vgpu function.
v3: fix to still require struct_mutex for dispatch_workload()
Signed-off-by: Pei Zhang <pei.zhang@intel.com>
[zhenyuw: fix unused variables warnings.]
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

90d27a1b

10 Nov, 2016 11 commits

Merge tag 'gvt-next-kvmgt-framework' of https://github.com/01org/gvt-linux ... · 8be8f4a9

Daniel Vetter authored Nov 10, 2016

Merge tag 'gvt-next-kvmgt-framework' of https://github.com/01org/gvt-linux into drm-intel-next-queued

Zhenyu Wang writes:

gvt-next-kvmgt-framework

This adds initial KVMGT framework based on GVT-g MPT(Mediated Passthrough) interface.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

8be8f4a9

drm/i915: Trim the object sg table · 0c40ce13

Tvrtko Ursulin authored Nov 09, 2016

At the moment we allocate enough sg table entries assuming we
will not be able to do any coalescing. But since in practice
we most often can, and more so very effectively, this ends up
wasting a lot of memory.

A simple and effective way of trimming the over-allocated
entries is to copy the table over to a new one allocated to the
exact size.

Experiments on my freshly logged and idle desktop (KDE) showed
that by doing this we can save approximately 1 MiB of RAM, or
when running a typical benchmark like gl_manhattan I have
even seen a 6 MiB saving.

More complicated techniques such as only copying the last used
page and freeing the rest are left to the reader.

v2:
 * Update commit message.
 * Use temporary sg_table on stack. (Chris Wilson)

v3:
 * Commit message update.
 * Comment added.
 * Replace memcpy with copy assignment.
   (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478704423-7447-1-git-send-email-tvrtko.ursulin@linux.intel.com

0c40ce13

drm/i915/gvt: add KVMGT support · f30437c5

Jike Song authored Nov 09, 2016

KVMGT is the MPT implementation based on VFIO/KVM. It provides
a kvmgt_mpt ops to gvt for vGPU access mediation, e.g. to
mediate and emulate the MMIO accesses, to inject interrupts
to vGPU user, to intercept the GTT writing and replace it with
DMA-able address, to write-protect guest PPGTT table for
shadowing synchronization, etc. This patch provides the MPT
implementation for GVT, not yet functional due to theabsence
of mdev.

It's built as kvmgt.ko, depends on vfio.ko, kvm.ko and mdev.ko,
and being required by i915.ko. To not introduce hard dependency
in i915.ko, we used indirect symbol reference. But that means
users have to include kvmgt.ko into init ramdisk if their
i915.ko is included.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

f30437c5

drm/i915/gvt: refactor intel_gvt_io_emulation_ops to be intel_gvt_ops · 9ec1e66b

Jike Song authored Nov 03, 2016

There are currently 4 methods in intel_gvt_io_emulation_ops
to emulate CFG/MMIO reading/writing for intel vGPU. A possibly
better scope is: add 3 more methods for vgpu create/destroy/reset
respectively, and rename the ops to 'intel_gvt_ops', then pass
it to the MPT module (say the future kvmgt) to use: they are
all methods for external usage.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

9ec1e66b

drm/i915/gvt: allow several MPT methods to be NULL · 7b3343b7

Jike Song authored Nov 03, 2016

Hypervisors are different, the MPT ops is a only superset of
all possibly supported hypervisors. There might be other way
out of the MPT to achieve same target. e.g. vfio-based kvmgt
won't provide map_gfn_to_mfn method to establish guest EPT
mapping for aperture, since it will be done in QEMU/KVM, MMIO
is also trapped elsewhere, etc.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

7b3343b7

drm/i915/gvt: introduce host_init/host_exit to MPT · 40df6ea0

Jike Song authored Nov 03, 2016

GVT host needs init/exit hooks to do some initialization/cleanup
work, e.g.: vfio mdev host device register/unregister.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

40df6ea0

drm/i915/gvt: remove obsolete code for old kvmgt opregion · 8f89743b

Jike Song authored Nov 03, 2016

Current GVT contains some obsolete logic originally cooked to
support the old, non-vfio kvmgt, which is actually workarounds.
We don't support that anymore, so it's safe to remove it and
make a better framework.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

8f89743b

drm/i915/gvt: add intel vgpu types support · 1f31c829

Zhenyu Wang authored Nov 03, 2016

By providing predefined vGPU types, users can choose which type a vgpu
to create and use, without specifying detailed parameters.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>

1f31c829

drm/i915/gvt: use kmap instead of kmap_atomic around guest memory access · c754936f

Xiaoguang Chen authored Nov 03, 2016

kmap_atomic doesn't allow sleep until unmapped. However,
it's necessary to allow sleep during reading/writing guest
memory, so use kmap instead.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

c754936f

drm/i915/gvt: don't rely on guest PPGTT entry to free old shadow data · 9baf0920

Bing Niu authored Nov 07, 2016

On guest writing a PPGTT entry, if it contains value and the old
entry is valid, gvt will read it and find & free the corresponding
old data for it. However, with the KVM write protection provided
by page_track, the guest entry will be written with new value
before gvt handling. To avoid that, we should use the shadow
entry instead.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

9baf0920

Merge tag 'for-kvmgt' of git://git.kernel.org/pub/scm/virt/kvm/kvm into drm-intel-next-queued · 11840e2f
Daniel Vetter authored Nov 10, 2016
```
Paulo Bonzini writes:

The three KVM patches that KVMGT needs.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
```
11840e2f

09 Nov, 2016 6 commits

drm/i915: Spin until breadcrumb threads are complete · 6a5d1db9

Chris Wilson authored Nov 08, 2016

When we need to reset the global seqno on wraparound, we have to wait
until the current rbtrees are drained (or otherwise the next waiter will
be out of sequence). The current mechanism to kick and spin until
complete, may exit too early as it would break if the target thread was
currently running. Instead, we must wake up the threads, but keep
spinning until the trees have been deleted.

In order to appease Tvrtko, busy spin rather than yield().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161108143719.32215-1-chris@chris-wilson.co.ukReviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

6a5d1db9

drm/i915: Pass atomic state to verify_connector_state · 677100ce

Maarten Lankhorst authored Nov 08, 2016

This gets rid of a warning that the connectors are used without locking
when doing a nonblocking modeset.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-11-git-send-email-maarten.lankhorst@linux.intel.com

677100ce

drm/i915: Update atomic modeset state synchronously, v2. · c3b32658

Maarten Lankhorst authored Nov 08, 2016

All of this state should be updated as soon as possible. It shouldn't be
done later because then future updates may not depend on it.

Changes since v1:
- Move the modeset update to before drm_atomic_state_get. (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-10-git-send-email-maarten.lankhorst@linux.intel.comReviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

c3b32658

drm/edid: Remove drm_select_eld · 1f4faefe

Maarten Lankhorst authored Nov 08, 2016

The only user was i915, which is now gone.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com> #irc
Cc: dri-devel@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-9-git-send-email-maarten.lankhorst@linux.intel.com

1f4faefe

drm/i915: Pass atomic state to intel_audio_codec_enable, v2. · bbf35e9d

Maarten Lankhorst authored Nov 08, 2016

drm_select_eld requires mode_config.mutex and connection_mutex
because it looks at the connector list and at the legacy encoders.

This is not required, because when we call audio_codec_enable we know
which connector it was called for, so pass the state.

This also removes having to look at crtc->config.

Changes since v1:
- Use intel_crtc->pipe instead of drm_crtc_index. (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-8-git-send-email-maarten.lankhorst@linux.intel.comReviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

bbf35e9d

drm/i915: Convert intel_hdmi to use atomic state · df18e721

Maarten Lankhorst authored Nov 08, 2016

This is the last connector still looking at crtc->config. Fix this.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-7-git-send-email-maarten.lankhorst@linux.intel.com

df18e721

08 Nov, 2016 8 commits

drm/i915: avoid harmless empty-body warning · 71d5895a

Arnd Bergmann authored Nov 08, 2016

The newly added assert_kernel_context_is_current introduces a warning
when built with W=1:

drivers/gpu/drm/i915/i915_gem.c: In function ‘assert_kernel_context_is_current’:
drivers/gpu/drm/i915/i915_gem.c:4417:63: error: suggest braces around empty body in an ‘else’ statement [-Werror=empty-body]

Changing the GEM_BUG_ON() macro from an empty definition to "do { } while (0)"
makes the macro more robust to use and avoids the warning.

Fixes: 3033acab ("drm/i915: Queue the idling context switch after all other timelines")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161108135834.2166677-1-arnd@arndb.de

71d5895a

Merge tag 'gvt-next-2016-11-07' of https://github.com/01org/gvt-linux into drm-intel-next-queued · d7931c18

Daniel Vetter authored Nov 08, 2016

gvt-next-2016-11-07

- Fix regression from e95433c7
- Some MMIO handler fixes
- Add better handling for guest reset control
- stratch page table tree for shadow ppgtt
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

d7931c18

drm/i915: Use intel_fb_gtt_offset() also for gen2/3 primary plane · bfb81049

Ville Syrjälä authored Nov 07, 2016

The code to determine the primary plane offset for gen2/3 looks
different than the code for gen4+, but in fact it's doing the same
thing. Let's make it uniform. Allows us to eliminate the 'obj' from
the list of local variables as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478550057-24864-6-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

bfb81049

drm/i915: Fix error handling for cursor/sprite plane create failure · d2b2cbce

Ville Syrjälä authored Nov 07, 2016

intel_cursor_plane_create() and intel_sprite_plane_create() return
an error pointer, so let's not mistakenly look for a NULL pointer.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://lists.freedesktop.org/archives/intel-gfx/2016-November/110690.html
Fixes: b079bd17 ("drm/i915: Bail if plane/crtc init fails")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478550057-24864-5-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

d2b2cbce

drm/i915: Grab the rotation from the passed plane state for VLV sprites · 11df4d95

Ville Syrjälä authored Nov 07, 2016

Use the passed in plane_state instead of plane->state in
vlv_update_plane(). Currently the two are one and the same, but if we
start queuing up multiple plane updates they might not be.

Looks like this was rebase fail on my part.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: 8d0deca8 ("drm/i915: Pass 90/270 vs. 0/180 rotation info for intel_gen4_compute_page_offset()")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478550057-24864-4-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

11df4d95

drm/i915: Remove chipset flush after cache flush · d0da48cf

Chris Wilson authored Nov 06, 2016

We always flush the chipset prior to executing with the GPU, so we can
skip the flush during ordinary domain management.

This should help mitigate some of the potential performance regressions,
but likely trivial, from doing the flush unconditionally before execbuf
introduced in commit dcd79934 ("drm/i915: Unconditionally flush any
chipset buffers before execbuf")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161106130001.9509-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

d0da48cf

drm/i915: Remove two sloppy inline functions from .h · 24327f83

Joonas Lahtinen authored Nov 08, 2016

Get rid of sloppy inline functions now that we don't have more users:

i915_gem_request_get_seqno
i915_gem_request_get_engine

v2:
- request->engine is always non-NULL (Chris)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478589108-3702-1-git-send-email-joonas.lahtinen@linux.intel.com

24327f83

drm/i915: Update DRIVER_DATE to 20161108 · 58e197d6
Daniel Vetter authored Nov 08, 2016
```
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
```
58e197d6

07 Nov, 2016 14 commits

drm/i915: Mark CPU cache as dirty when used for rendering · 7aa6ca61

Chris Wilson authored Nov 07, 2016

On LLC, or even snooped, machines rendering via the GPU ends up in the CPU
cache. This cacheline dirt also needs to be flushed to main memory when
moving to an incoherent domain, such as the display's scanout engine.
Mostly, this happens because either the object is marked as dirty from
its first use or is avoided by setting the object into the display
domain from the start.

v2: Treat WT as not requiring a clflush prior to use on the display
engine as well.

Fixes: 0f71979a ("drm/i915: Performed deferred clflush inside set-cache-level")
References: https://bugs.freedesktop.org/show_bug.cgi?id=95414Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161107165204.7008-1-chris@chris-wilson.co.uk

7aa6ca61

drm/i915: Add assert for no pending GPU requests during suspend/resume in LR mode · 31ab49ab

Imre Deak authored Nov 07, 2016

During resume we will reset the SW/HW tracking for each ring head/tail
pointers and so are not prepared to replay any pending requests (as
opposed to GPU reset time). Add an assert for this both to the suspend
and the resume code.

v2:
- Check for ELSP port idle already during suspend and check !gt.awake
  during resume. (Chris)
v3:
- Move the !gt.awake check to i915_gem_resume().
v4:
- s/intel_lr_engines_idle/intel_execlists_idle/ (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-4-git-send-email-imre.deak@intel.com

31ab49ab

drm/i915: Make sure engines are idle during GPU idling in LR mode · 0cb5670b

Imre Deak authored Nov 07, 2016

We assume that the GPU is idle once receiving the seqno via the last
request's user interrupt. In execlist mode the corresponding context
completed interrupt can be delayed though and until this latter
interrupt arrives we consider the request to be pending on the ELSP
submit port. This can cause a problem during system suspend where this
last request will be seen by the resume code as still pending. Such
pending requests are normally replayed after a GPU reset, but during
resume we reset both SW and HW tracking of the ring head/tail pointers,
so replaying the pending request with its stale tail pointer will leave
the ring in an inconsistent state. A subsequent request submission can
lead then to the GPU executing from uninitialized area in the ring
behind the above stale tail pointer.

Fix this by making sure any pending request on the ELSP port is
completed before suspending. I used a polling wait since the completion
time I measured was <1ms and since normally we only need to wait during
system suspend. GPU idling during runtime suspend is scheduled with a
delay (currently 50-100ms) after the retirement of the last request at
which point the context completed interrupt must have arrived already.

The chance of this bug was increased by

commit 1c777c5d
Author: Imre Deak <imre.deak@intel.com>
Date:   Wed Oct 12 17:46:37 2016 +0300

    drm/i915/hsw: Fix GPU hang during resume from S3-devices state

but it could happen even without the explicit GPU reset, since we
disable interrupts afterwards during the suspend sequence.

v2:
- Do an unlocked poll-wait first. (Chris)
v3-4:
- s/intel_lr_engines_idle/intel_execlists_idle/ and move
  i915.enable_execlists check to the new helper. (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98470Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-3-git-send-email-imre.deak@intel.com

0cb5670b

drm/i915: Avoid early GPU idling due to race with new request · 93c97dc1

Imre Deak authored Nov 07, 2016

There is a small race where a new request can be submitted and retired
after the idle worker started to run which leads to idling the GPU too
early. Fix this by deferring the idling to the pending instance of the
worker.

This scenario was pointed out by Chris.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-2-git-send-email-imre.deak@intel.com

93c97dc1

drm/i915: Avoid early GPU idling due to already pending idle work · 5bd11a34

Imre Deak authored Nov 07, 2016

Atm, in case an idle work handler is already pending but haven't yet
started to run, retiring a new request will not extend the active period
as required, rather simply leaves the pending idle work to be scheduled
at the original expiration time. This may lead to idling the GPU too
early. Fix this by using the delayed-work scheduler alternative which
makes sure the handler's expiration time is extended in this case.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Requested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-1-git-send-email-imre.deak@intel.com

5bd11a34

drm/i915: Limit Valleyview and earlier to only using mappable scanout · 767a222e

Chris Wilson authored Nov 07, 2016

Valleyview appears to be limited to only scanning out from the first 512MiB
of the Global GTT. Lets presume that this behaviour was inherited from the
display block copied from g4x (not Ironlake) and all earlier generations
are similarly affected, though testing suggests different symptoms. For
simplicity, impose that these platforms must scanout from the mappable
region. (For extra simplicity, use HAS_GMCH_DISPLAY even though this
catches Cherryview which does not appear to be limited to the low
aperture for its scanout.)

v2: Use HAS_GMCH_DISPLAY() to more clearly convey my intent about
limiting this workaround to the old style of display engine.

v3: Update changelog to reflect testing by Ville Syrjälä
v4: Include the changes to the comments as well
Reported-by: Luis Botello <luis.botello.ortega@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98036
Fixes: 2efb813d ("drm/i915: Fallback to using unmappable memory for scanout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+
Link: http://patchwork.freedesktop.org/patch/msgid/20161107110128.28762-1-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com

767a222e

drm/i915: Round tile chunks up for constructing partial VMAs · 0ef723cb

Chris Wilson authored Nov 07, 2016

When we split a large object up into chunks for GTT faulting (because we
can't fit the whole object into the aperture) we have to align our cuts
with the fence registers. Each partial VMA must cover a complete set of
tile rows or the offset into each partial VMA is not aligned with the
whole image. Currently we enforce a minimum size on each partial VMA,
but this minimum size itself was not aligned to the tile row causing
distortion.
Reported-by: Andreas Reis <andreas.reis@gmail.com>
Reported-by: Chris Clayton <chris2553@googlemail.com>
Reported-by: Norbert Preining <preining@logic.at>
Tested-by: Norbert Preining <preining@logic.at>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Fixes: 03af84fe ("drm/i915: Choose partial chunksize based on tile row size")
Fixes: a61007a8 ("drm/i915: Fix partial GGTT faulting") # enabling patch
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98402
Testcase: igt/gem_mmap_gtt/medium-copy-odd
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+
Link: http://patchwork.freedesktop.org/patch/msgid/20161107105443.27855-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

0ef723cb

drm/i915: Remove the vma from the object list upon close · dfd2812e

Chris Wilson authored Nov 04, 2016

Currently, the vma is being unlink from the object lookup on destroy.
However, we are meant to be decoupling it upon close so that the user
cannot access the closed vma whilst it remains active on the GPU.

[   34.074858] kernel BUG at drivers/gpu/drm/i915/i915_gem_gtt.c:3561!
[   34.074875] invalid opcode: 0000 [#1] PREEMPT SMP
[   34.074888] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich mei_me mei snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core i2c_designware_platform i2c_designware_core snd_pcm e1000e ptp pps_core sdhci_acpi sdhci mmc_core i2c_hid [last unloaded: i915]
[   34.075010] CPU: 1 PID: 6224 Comm: gem_close_race Tainted: G     U          4.9.0-rc3-CI-CI_DRM_1800+ #1
[   34.075034] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0355.2016.0224.1501 02/24/2016
[   34.075057] task: ffff8802459a8040 task.stack: ffffc90000524000
[   34.075074] RIP: 0010:[<ffffffffa0392cbc>]  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[   34.075118] RSP: 0018:ffffc90000527b68  EFLAGS: 00010202
[   34.075135] RAX: ffff8802426c5e40 RBX: 0000000000000000 RCX: ffff8802447fc2a8
[   34.075158] RDX: 0000000000000000 RSI: ffff8802447fc2a8 RDI: ffff880248a4a880
[   34.075181] RBP: ffffc90000527b88 R08: 0000000000000008 R09: 0000000000000000
[   34.075203] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880248a4a880
[   34.075225] R13: ffff8802447fc2a8 R14: ffff880243e9afa8 R15: ffff880248a4a9c8
[   34.075248] FS:  00007f9b43e59740(0000) GS:ffff880256c80000(0000) knlGS:0000000000000000
[   34.075273] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.075292] CR2: 00007f9b43419140 CR3: 000000024455d000 CR4: 00000000003406e0
[   34.075314] Stack:
[   34.075323]  0000000000000000 ffffc90000527bd0 ffff880243cb8008 ffff880243e9afa8
[   34.075353]  ffffc90000527c08 ffffffffa03874c7 ffffc90000527bb8 ffff880243e9afa8
[   34.075383]  ffff880243e9afb0 ffffc90000527e10 ffff8802447fc2a8 ffff880243cb8040
[   34.075414] Call Trace:
[   34.075435]  [<ffffffffa03874c7>] eb_lookup_vmas.isra.7+0x247/0x330 [i915]
[   34.075468]  [<ffffffffa0388c34>] i915_gem_do_execbuffer.isra.15+0x604/0x1a10 [i915]
[   34.075507]  [<ffffffffa039c957>] ? i915_gem_object_get_sg+0x347/0x380 [i915]
[   34.075532]  [<ffffffff811a69ce>] ? __might_fault+0x3e/0x90
[   34.075562]  [<ffffffffa038a430>] i915_gem_execbuffer2+0xc0/0x250 [i915]
[   34.075585]  [<ffffffff81552926>] drm_ioctl+0x1f6/0x480
[   34.075604]  [<ffffffff8100107a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[   34.075635]  [<ffffffffa038a370>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[   34.075658]  [<ffffffff81202d2e>] do_vfs_ioctl+0x8e/0x690
[   34.075677]  [<ffffffff8181582d>] ? _raw_spin_unlock_irqrestore+0x3d/0x60
[   34.075700]  [<ffffffff810fcd51>] ? SyS_timer_settime+0x141/0x1e0
[   34.075721]  [<ffffffff810d6de2>] ? trace_hardirqs_on_caller+0x122/0x1b0
[   34.075742]  [<ffffffff8120336c>] SyS_ioctl+0x3c/0x70
[   34.075760]  [<ffffffff8181602e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[   34.075781] Code: 44 a0 48 c7 c2 9a 7e 43 a0 be e0 0d 00 00 48 c7 c7 a0 45 44 a0 e8 55 b8 ce e0 48 85 db 74 a3 49 83 bd f8 03 00 00 00 74 99 0f 0b <0f> 0b 48 89 da 4c 89 ee 4c 89 e7 e8 04 a9 ff ff 48 89 da 49 89
[   34.075955] RIP  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[   34.075994]  RSP <ffffc90000527b68>

Testcase: igt/gem_close_race/basic-threads
Fixes: db6c2b41 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161104161241.25871-1-chris@chris-wilson.co.ukReviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

dfd2812e

drm/i915/gvt: implement scratch page table tree for shadow PPGTT · 3b6411c2

Ping Gao authored Nov 04, 2016

All the unused entries in the page table tree(PML4E->PDPE->PDE->PTE)
should point to scratch page table/scratch page to avoid page walk error
due to the page prefetching.
When removing an entry in shadow PPGTT,  it need map to scratch page
also, the older implementation use single scratch page to assign to all
level entries, it doesn't align the page walk behavior when removed
entry is in PML, PDP, PD.  To avoid potential page walk error this patch
implement a scratch page tree to replace the single scratch page.

v2: more details in commit message address Kevin's comments.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

3b6411c2

drm/i915/gvt: emulate vgpu engine reset control behavior · 2fb39fad

Du, Changbin authored Nov 04, 2016

When SW wishes to reset the render engine, it will program
engine's reset control register and wait response from HW.
We need emulate the behavior of this register so guest i915
driver could walk through the engine reset flow. The registers
are not emulated in gvt yet, this patch add the emulation
logic.

v2: add more desc info in commit message.
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

2fb39fad

drm/i915/gvt: Fix workload status after wait · 9b172345

Zhenyu Wang authored Nov 02, 2016

From commit e95433c7, workload status setting
was changed to only capture on error path, but we need to set it properly in
normal path too, otherwise we'll fail to complete workload which could lead
guest VM vGPU reset.

v2: uses braces and add Fixes tag.

Fixes: e95433c7 ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

9b172345

drm/i915/gvt: update misc ctl regs base on stepping info · d4362225

Ping Gao authored Oct 28, 2016

Misc ctl related registers are for WA purpose, should detect the
stepping info first before updating HW value.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

d4362225

drm/i915/gvt: correct the emulation in TLB control handler · f24940e0

Ping Gao authored Oct 27, 2016

Need a explicit write_vreg in TLB MMIO write handler, beside that
TLB vreg should update correspondingly following HW status to do
correct emulation.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

f24940e0

drm/i915/gvt: add write vreg in MMIO DMA_CTRL handler · 5f399f11

Ping Gao authored Oct 27, 2016

Missing write_vreg in DMA_CTRL write handler would make obsolete
value return when read vreg.

v2: get data from vreg after updating it.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

5f399f11