Commits · d59b21ec6f33b6df7933e86b9906e9f4ee9b218d · Kirill Smelkov / linux

22 Feb, 2017 8 commits

drm/i915: Remove 'retire' parameter from intel_fb_obj_flush · d59b21ec

Chris Wilson authored Feb 22, 2017

Setting retire=true is identical to using origin=ORIGIN_CS, so make the
same simplification to intel_fb_obj_flush() as already employed for
intel_fb_obj_invalidate().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-6-chris@chris-wilson.co.uk

d59b21ec

drm/i915: Perform object clflushing asynchronously · 57822dc6

Chris Wilson authored Feb 22, 2017

Flushing the cachelines for an object is slow, can be as much as 100ms
for a large framebuffer. We currently do this under the struct_mutex BKL
on execution or on pageflip. But now with the ability to add fences to
obj->resv for both flips and execbuf (and we naturally wait on the fence
before CPU access), we can move the clflush operation to a workqueue and
signal a fence for completion, thereby doing the work asynchronously and
not blocking the driver or its clients.

v2: Introduce i915_gem_clflush.h and use a new name, split out some
extras into separate patches.
Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-5-chris@chris-wilson.co.uk

57822dc6

drm/i915: Skip clflushes for all non-page backed objects · f6aaba4d

Chris Wilson authored Feb 22, 2017

Generalise the skip for physical and stolen objects by skipping anything
we do not have a valid address for inside the sg.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-4-chris@chris-wilson.co.uk

f6aaba4d

drm/i915: Amalgamate flushing of display objects · 5a97bcc6

Chris Wilson authored Feb 22, 2017

We have three different paths by which userspace wants to flush the
display plane (i.e. objects with obj->pin_display). Use a common helper
to identify those paths and to simplify a later change.

v2: Include the conditional in the name, i915_gem_object_flush_if_display
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-3-chris@chris-wilson.co.uk

5a97bcc6

drm/i915: Move cpu_cache_is_coherent() to header · e59dc172

Chris Wilson authored Feb 22, 2017

For use in the next patch, take the current is-coherent helper and add
it to i915_gem_object.h
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-2-chris@chris-wilson.co.uk

e59dc172

drm/i915: Remove change_domain tracepoint · 208b84a3

Chris Wilson authored Feb 22, 2017

The change_domain tracepoint has been inaccurate for a few years - it
doesn't fully capture the domains, especially with userspace bypassing
them. It is defunct, misleading and time to be removed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-1-chris@chris-wilson.co.uk

208b84a3

drm/i915: Add i915_param charp macro magic · 1d6aa7a3

Chris Wilson authored Feb 21, 2017

Handling the dynamic charp module parameter requires us to copy it for
the error state, or remember to lock it when reading (in case it used
with 0600).

v2: Use __always_inline and __builtin_strcmp
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221162619.15954-1-chris@chris-wilson.co.ukReviewed-by: Jani Nikula <jani.nikula@intel.com>

1d6aa7a3

drm/i915/gvt: set ring buffer size to default for guc submission · 718e884a

Chuanxiao Dong authored Feb 16, 2017

When not using GuC submission, the ring buffer size for GVT context is
512KB which is the max size. When switching to GuC submission, the ring
buffer size is required to be less than 16KB. So use the GVT context
default ring buffer size if GuC submission is enabled.
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170216063639.GA17107@intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

718e884a

21 Feb, 2017 10 commits

drm/i915/tracepoints: Add hw_id to context tracepoints · 99c181a0

Tvrtko Ursulin authored Feb 21, 2017

It is useful to provide this info to match the one provided
in the request tracepoints.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221091350.14605-1-tvrtko.ursulin@linux.intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

99c181a0

drm/i915/tracepoints: Add backend level request in and out tracepoints · d7d96833

Tvrtko Ursulin authored Feb 21, 2017

Two new tracepoints placed at the call sites where requests are
actually passed to the GPU enable userspace to track engine
utilisation.

These tracepoints are only enabled when the
DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option is enabled.

v2: Fix compilation with !CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS.

v3: Name global seqno consistently across tracepoints.

v4: Remove port info from request out tracepoint. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

d7d96833

drm/i915/tracepoints: Rename i915_gem_request_notify · dffabc8f

Tvrtko Ursulin authored Feb 21, 2017

i915_gem_ring_notify is more appropriate since we do not have
the request information at this point, but it is simply a
signal from the engine that some request has been completed.

v2:
  * Always trace and log if there were any waiters.
  * Rename to intel_engine_notify. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

dffabc8f

drm/i915/tracepoints: Add request submit and execute tracepoints · 354d036f

Tvrtko Ursulin authored Feb 21, 2017

These new tracepoints are emitted once the request is ready to
be submitted to the GPU and once the request is about to
be submitted to the GPU, respectively.

Former condition triggers as soon as all the fences and
dependencies have been resolved, and the latter once the
backend is about to submit it to the GPU.

New tracepoint are enabled via the new
DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option which is disabled
by default to alleviate the performance impact concerns.

v2: Move execute tracepoint to __i915_gem_request_submit.
    (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

354d036f

drm/i915/tracepoints: Remove unused i915_gem_request_complete · 90aa412d

Tvrtko Ursulin authored Feb 21, 2017

Tracepoint is not used and won't be suitable for its replacement.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

90aa412d

drm/i915/tracepoints: Tidy i915_gem_request_wait_begin · 93692502

Tvrtko Ursulin authored Feb 21, 2017

Provide the same information as the other request event classes.

v2: Pass in flags so we can properly report the blocking status.
    (Chris Wilson)

v3: Log hex with 0x prefix for clarity.

v4: Derive blocking status from flags. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

93692502

drm/i915/tracepoints: Adjust i915_gem_ring_dispatch · 1cce8922

Tvrtko Ursulin authored Feb 21, 2017

Rename it to i915_gem_request_queue and fix the logged info
equivalent to the i915_gem_request even class. Also moved it
a bit further apart from the i915_gem_request_add tracepoint
since they otherwise provide similar information too close in
time.

v2: Remove sw fence singalling. We will rely on the soon to
    come GuC scheduling backend to enable that. (Chris Wilson)

v3: Log hex with 0x prefix for clarity.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

1cce8922

drm/i915/tracepoints: Tidy request event class · e235b530

Tvrtko Ursulin authored Feb 21, 2017

At the moment only the global seqno is logged which is not set
until the request is ready for submission.

Add the per-contex seqno and the context hardware id which are
both interesting data points.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

e235b530

drm/i915: Tidy execlists_init_reg_state · 56e51bf0

Tvrtko Ursulin authored Feb 21, 2017

Compact the name of the macro and reg_state variable, and cache
some data in local variables to make the function more compact
and more readable.

v2: Fixup some checkpatch warnings.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221095839.30525-1-tvrtko.ursulin@linux.intel.com

56e51bf0

drm/i915: Use reservation_object_lock() · e2989f14

Chris Wilson authored Feb 21, 2017

Replace the calls to ww_mutex_lock(&resv->lock) with the helper
reservation_object_lock(resv) and similarly for unlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221091723.6219-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

e2989f14

20 Feb, 2017 14 commits

drm/i915: Assert that the request->tail is always qword aligned · 944a36d4

Chris Wilson authored Feb 17, 2017

The hardware requires that the tail pointer only advance in qword units,
so assert that the value we write is aligned to qwords, and similarly
enforce this restriction onto the request->tail.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217163833.731-1-chris@chris-wilson.co.ukReviewed-by: Michał Winiarski <michal.winiarski@intel.com>