Commits · 7a61a6dec3dfb9f2e8c39a337580a3c3036c5cdf · Kirill Smelkov / linux

29 Jan, 2019 15 commits

drm/i915: always return something on DDI clock selection · 7a61a6de

Lucas De Marchi authored Jan 25, 2019

Even if we don't have the correct clock and get a warning, we should not
skip the return.

v2: improve commit message (from Joonas)

Fixes: 1fa11ee2 ("drm/i915/icl: start adding the TBT pll")
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125222444.19926-3-lucas.demarchi@intel.com

7a61a6de

drm/i915/icl: use tc_port in MG_PLL macros · 584fca11

Lucas De Marchi authored Jan 25, 2019

Fix the TODO leftover in the code by changing the argument in MG_PLL
macros. The MG_PLL ids used to access the register values can be
converted from tc_port rather than port.

All these registers can use the TC port to calculate the right offsets
because they are only available for TC ports. The range (PORT_C onwards)
may not be stable and change from platform to platform. So by using the
TC id directly we avoid having to check for the platform in the "leaf
functions" and thus passing dev_priv around.

The helper functions were also renamed to use "tc" as prefix to make
them more generic.

v2: Improve commit message and fix checkpatch warning (from Paulo)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125222444.19926-2-lucas.demarchi@intel.com

584fca11

drm/i915: Drop fake breadcrumb irq · 789659f4

Chris Wilson authored Jan 29, 2019

Missed breadcrumb detection is defunct due to the tight coupling with
dma_fence signaling and the myriad ways we may signal fences from
everywhere but from an interrupt, i.e. we frequently signal a fence
before we even see its interrupt. This means that even if we miss an
interrupt for a fence, it still is signaled before our breadcrumb
hangcheck fires, so simplify the breadcrumb hangchecking by moving it
into the GPU hangcheck and forgo fake interrupts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-3-chris@chris-wilson.co.uk

789659f4

drm/i915: Replace global breadcrumbs with per-context interrupt tracking · 52c0fdb2

Chris Wilson authored Jan 29, 2019

A few years ago, see commit 688e6c72 ("drm/i915: Slaughter the
thundering i915_wait_request herd"), the issue of handling multiple
clients waiting in parallel was brought to our attention. The
requirement was that every client should be woken immediately upon its
request being signaled, without incurring any cpu overhead.

To handle certain fragility of our hw meant that we could not do a
simple check inside the irq handler (some generations required almost
unbounded delays before we could be sure of seqno coherency) and so
request completion checking required delegation.

Before commit 688e6c72, the solution was simple. Every client
waiting on a request would be woken on every interrupt and each would do
a heavyweight check to see if their request was complete. Commit
688e6c72 introduced an rbtree so that only the earliest waiter on
the global timeline would woken, and would wake the next and so on.
(Along with various complications to handle requests being reordered
along the global timeline, and also a requirement for kthread to provide
a delegate for fence signaling that had no process context.)

The global rbtree depends on knowing the execution timeline (and global
seqno). Without knowing that order, we must instead check all contexts
queued to the HW to see which may have advanced. We trim that list by
only checking queued contexts that are being waited on, but still we
keep a list of all active contexts and their active signalers that we
inspect from inside the irq handler. By moving the waiters onto the fence
signal list, we can combine the client wakeup with the dma_fence
signaling (a dramatic reduction in complexity, but does require the HW
being coherent, the seqno must be visible from the cpu before the
interrupt is raised - we keep a timer backup just in case).

Having previously fixed all the issues with irq-seqno serialisation (by
inserting delays onto the GPU after each request instead of random delays
on the CPU after each interrupt), we can rely on the seqno state to
perfom direct wakeups from the interrupt handler. This allows us to
preserve our single context switch behaviour of the current routine,
with the only downside that we lose the RT priority sorting of wakeups.
In general, direct wakeup latency of multiple clients is about the same
(about 10% better in most cases) with a reduction in total CPU time spent
in the waiter (about 20-50% depending on gen). Average herd behaviour is
improved, but at the cost of not delegating wakeups on task_prio.

v2: Capture fence signaling state for error state and add comments to
warm even the most cold of hearts.
v3: Check if the request is still active before busywaiting
v4: Reduce the amount of pointer misdirection with list_for_each_safe
and using a local i915_request variable inside the loops
v5: Add a missing pluralisation to a purely informative selftest message.

References: 688e6c72 ("drm/i915: Slaughter the thundering i915_wait_request herd")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk

52c0fdb2

drm/i915: Remove the intel_engine_notify tracepoint · 3df0bd19

Chris Wilson authored Jan 29, 2019

The global seqno is defunct and so we have no meaningful indicator of
forward progress for an engine. You need to listen to the request
signaling tracepoints instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-1-chris@chris-wilson.co.uk

3df0bd19

drm/i915/tv: Bypass the vertical filter if possible · 68e94f62

Ville Syrjälä authored Jan 29, 2019

Let's switch the pipe into interlaced mode and switch off
the TV encoder vertical filter if the pipe vdisplay
matches the TV YSIZE exactly.

While I didn't measure it I presume this might reduce
the power consumption a little bit, and the pixel rate
is halved as the pipe will now fetching in interlaced
mode rather than in progressive mode (effectively the
same difference as between IF-ID vs. PF-ID pfit modes
on more modern hardware) so a bit easier on the memory
bandwidth.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129141913.5515-2-ville.syrjala@linux.intel.comAcked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

68e94f62

drm/i915/tv: Fix adjusted_mode dotclock for interlaced modes · addc80f0

Ville Syrjälä authored Jan 29, 2019

intel_tv_mode_to_mode() assumes the pipe will be in progressive
fetch mode, and thus when programming the pipe into interlaced
mode we have to halve the calculated dotclock to get the correct
field duration.

This becomes more important when we start to program the pipe
into interlaced mode on i965gm as we depend on the timestamps
to get accurate frame counter values. Withot halving the clock
our guesstimated frame counter would tick at twice the expected
speed.

Cc: Imre Deak <imre.deak@intel.com>
Fixes: 690157f0 ("drm/i915/tv: Fix >1024 modes on gen3")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129141913.5515-1-ville.syrjala@linux.intel.comAcked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

addc80f0

drm: Constify drm_color_lut_check() · 5a3db6f0

Ville Syrjälä authored Jan 29, 2019

drm_color_lut_check() doens't modify the passed in blob so
let's make it const.

Also s/uint32_t/u32/ while at it.

v2: Reduce line wraps (Sam)

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129170609.5718-1-ville.syrjala@linux.intel.comReviewed-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5a3db6f0

drm/i915/execlists: Suppress preempting self · c9a64622

Chris Wilson authored Jan 29, 2019

In order to avoid preempting ourselves, we currently refuse to schedule
the tasklet if we reschedule an inflight context. However, this glosses
over a few issues such as what happens after a CS completion event and
we then preempt the newly executing context with itself, or if something
else causes a tasklet_schedule triggering the same evaluation to
preempt the active context with itself.

However, when we avoid preempting ELSP[0], we still retain the preemption
value as it may match a second preemption request within the same time period
that we need to resolve after the next CS event. However, since we only
store the maximum preemption priority seen, it may not match the
subsequent event and so we should double check whether or not we
actually do need to trigger a preempt-to-idle by comparing the top
priorities from each queue. Later, this gives us a hook for finer
control over deciding whether the preempt-to-idle is justified.

The sequence of events where we end up preempting for no avail is:

1. Queue requests/contexts A, B
2. Priority boost A; no preemption as it is executing, but keep hint
3. After CS switch, B is less than hint, force preempt-to-idle
4. Resubmit B after idling

v2: We can simplify a bunch of tests based on the knowledge that PI will
ensure that earlier requests along the same context will have the highest
priority.
v3: Demonstrate the stale preemption hint with a selftest

References: a2bf92e8 ("drm/i915/execlists: Avoid kicking priority on the current context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-4-chris@chris-wilson.co.uk

c9a64622

drm/i915: Rename execlists->queue_priority to queue_priority_hint · 4d97cbe0

Chris Wilson authored Jan 29, 2019

After noticing that we trigger preemption events for currently executing
requests, as well as requests that complete before the preemption and
attempting to suppress those preemption events, it is wise to not
consider the queue_priority to be authoritative. As we only track the
maximum priority seen between dequeue passes, if the maximum priority
request is no longer available for dequeuing (it completed or is even
executing on another engine), we have no knowledge of the previous
queue_priority as it would require us to keep a full history of enqueued
requests -- but we already have that history in the priolists!

Rename the queue_priority to queue_priority_hint so that we do not
confuse it as being exactly the maximum priority in the queue, but merely
an indication that we have seen a new maximum priority value and as such
we should check whether it should preempt the currently running request.

v2: s/preempt_priority_hint/queue_priority_hint/ as preempt implies it
being only used for the singular task of preemption and not the wider
question of waking up due to a change in the queue.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-3-chris@chris-wilson.co.uk

4d97cbe0

drm/i915: Identify active requests · 85474441

Chris Wilson authored Jan 29, 2019

To allow requests to forgo a common execution timeline, one question we
need to be able to answer is "is this request running?". To track
whether a request has started on HW, we can emit a breadcrumb at the
beginning of the request and check its timeline's HWSP to see if the
breadcrumb has advanced past the start of this request. (This is in
contrast to the global timeline where we need only ask if we are on the
global timeline and if the timeline has advanced past the end of the
previous request.)

There is still confusion from a preempted request, which has already
started but relinquished the HW to a high priority request. For the
common case, this discrepancy should be negligible. However, for
identification of hung requests, knowing which one was running at the
time of the hang will be much more important.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-2-chris@chris-wilson.co.uk

85474441

drm/i915/selftests: Apply a subtest filter · 06039d98

Chris Wilson authored Jan 29, 2019

In bringup on simulated HW even rudimentary tests are slow, and so many
may fail that we want to be able to filter out the noise to focus on the
specific problem. Even just the tests groups provided for igt is not
specific enough, and we would like to isolate one particular subtest
(and probably subsubtests!). For simplicity, allow the user to provide a
command line parameter such as

	i915.st_filter=i915_timeline_mock_selftests/igt_sync

to restrict ourselves to only running on subtest. The exact name to use
is given during a normal run, highlighted as an error if it failed,
debug otherwise. The test group is optional, and then all subtests are
compared for an exact match with the filter (most subtests have unique
names). The filter can be negated, e.g. i915.st_filter=!igt_sync and
then all tests but those that match will be run. More than one match can
be supplied separated by a comma, e.g.

	i915.st_filter=igt_vma_create,igt_vma_pin1

to only run those specified, or

	i915.st_filter=!igt_vma_create,!igt_vma_pin1

to run all but those named. Mixing a blacklist and whitelist will only
execute those subtests matching the whitelist so long as they are
previously excluded in the blacklist.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-1-chris@chris-wilson.co.uk

06039d98

Merge drm/drm-next into drm-intel-next-queued · 8716ae72
Rodrigo Vivi authored Jan 29, 2019
```
A backmerge to unblock gen8+ semaphores.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
```
8716ae72

drm/i915: Fix skl srckey mask bits · 968bf969

Ville Syrjälä authored Jan 25, 2019

We're incorrectly masking off the R/V channel enable bit from
KEYMSK. Fix it up.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: b2081525 ("drm/i915: Add plane alpha blending support, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125183846.28755-1-ville.syrjala@linux.intel.comReviewed-by: Matt Roper <matthew.d.roper@intel.com>

968bf969

drm/i915: Enable fastboot by default on Skylake and newer · 3d6535cb

Hans de Goede authored Jan 24, 2019

We really want to have fastboot enabled by default to avoid an ugly
modeset during boot.

Rather then enabling it everywhere, lets start with enabling it on
Skylake and newer.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190124130114.3967-1-maarten.lankhorst@linux.intel.com

3d6535cb

28 Jan, 2019 15 commits

drm/i915: Track active timelines · 9407d3bd

Chris Wilson authored Jan 28, 2019

Now that we pin timelines around use, we have a clearly defined lifetime
and convenient points at which we can track only the active timelines.
This allows us to reduce the list iteration to only consider those
active timelines and not all.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-6-chris@chris-wilson.co.uk

9407d3bd

drm/i915: Track the context's seqno in its own timeline HWSP · 5013eb8c

Chris Wilson authored Jan 28, 2019

Now that we have allocated ourselves a cacheline to store a breadcrumb,
we can emit a write from the GPU into the timeline's HWSP of the
per-context seqno as we complete each request. This drops the mirroring
of the per-engine HWSP and allows each context to operate independently.
We do not need to unwind the per-context timeline, and so requests are
always consistent with the timeline breadcrumb, greatly simplifying the
completion checks as we no longer need to be concerned about the
global_seqno changing mid check.

One complication though is that we have to be wary that the request may
outlive the HWSP and so avoid touching the potentially danging pointer
after we have retired the fence. We also have to guard our access of the
HWSP with RCU, the release of the obj->mm.pages should already be RCU-safe.

At this point, we are emitting both per-context and global seqno and
still using the single per-engine execution timeline for resolving
interrupts.

v2: s/fake_complete/mark_complete/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-5-chris@chris-wilson.co.uk

5013eb8c

drm/i915: Share per-timeline HWSP using a slab suballocator · 8ba306a6

Chris Wilson authored Jan 28, 2019

If we restrict ourselves to only using a cacheline for each timeline's
HWSP (we could go smaller, but want to avoid needless polluting
cachelines on different engines between different contexts), then we can
suballocate a single 4k page into 64 different timeline HWSP. By
treating each fresh allocation as a slab of 64 entries, we can keep it
around for the next 64 allocation attempts until we need to refresh the
slab cache.

John Harrison noted the issue of fragmentation leading to the same worst
case performance of one page per timeline as before, which can be
mitigated by adopting a freelist.

v2: Keep all partially allocated HWSP on a freelist

This is still without migration, so it is possible for the system to end
up with each timeline in its own page, but we ensure that no new
allocation would needless allocate a fresh page!

v3: Throw a selftest at the allocator to try and catch invalid cacheline
reuse.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-4-chris@chris-wilson.co.uk

8ba306a6

drm/i915: Allocate a status page for each timeline · 52954edd

Chris Wilson authored Jan 28, 2019

Allocate a page for use as a status page by a group of timelines, as we
only need a dword of storage for each (rounded up to the cacheline for
safety) we can pack multiple timelines into the same page. Each timeline
will then be able to track its own HW seqno.

v2: Reuse the common per-engine HWSP for the solitary ringbuffer
timeline, so that we do not have to emit (using per-gen specialised
vfuncs) the breadcrumb into the distinct timeline HWSP and instead can
keep on using the common MI_STORE_DWORD_INDEX. However, to maintain the
sleight-of-hand for the global/per-context seqno switchover, we will
store both temporarily (and so use a custom offset for the shared timeline
HWSP until the switch over).

v3: Keep things simple and allocate a page for each timeline, page
sharing comes next.

v4: I was caught repeating the same MI_STORE_DWORD_IMM over and over
again in selftests.

v5: And caught red handed copying create timeline + check.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-3-chris@chris-wilson.co.uk

52954edd

drm/i915: Enlarge vma->pin_count · b18fe4be

Chris Wilson authored Jan 28, 2019

Previously we only accommodated having a vma pinned by a small number of
users, with the maximum being pinned for use by the display engine. As
such, we used a small bitfield only large enough to allow the vma to
be pinned twice (for back/front buffers) in each scanout plane. Keeping
the maximum permissible pin_count small allows us to quickly catch a
potential leak. However, as we want to split a 4096B page into 64
different cachelines and pin each cacheline for use by a different
timeline, we will exceed the current maximum permissible vma->pin_count
and so time has come to enlarge it.

Whilst we are here, try to pull together the similar bits:

Address/layout specification:
 - bias, mappable, zone_4g: address limit specifiers
 - fixed: address override, limits still apply though
 - high: not strictly an address limit, but an address direction to search

Search controls:
 - nonblock, nonfault, noevict

v2: Rewrite the guideline comment on bit consumption.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <john.C.Harrison@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-2-chris@chris-wilson.co.uk

b18fe4be

drm/i915: Introduce concept of per-timeline (context) HWSP · 3adac468

Chris Wilson authored Jan 28, 2019

Supplement the per-engine HWSP with a per-timeline HWSP. That is a
per-request pointer through which we can check a local seqno,
abstracting away the presumption of a global seqno. In this first step,
we point each request back into the engine's HWSP so everything
continues to work with the global timeline.

v2: s/i915_request_hwsp/hwsp_seqno/ to emphasis that this is the current
HW value and that we are accessing it via i915_request merely as a
convenience.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-1-chris@chris-wilson.co.uk

3adac468

drm/i915: Move list of timelines under its own lock · 1e345568

Chris Wilson authored Jan 28, 2019

Currently, the list of timelines is serialised by the struct_mutex, but
to alleviate difficulties with using that mutex in future, move the
list management under its own dedicated mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-5-chris@chris-wilson.co.uk

1e345568

drm/i915: Always allocate an object/vma for the HWSP · 0ca88ba0

Chris Wilson authored Jan 28, 2019

Currently we only allocate an object and vma if we are using a GGTT
virtual HWSP, and a plain struct page for a physical HWSP. For
convenience later on with global timelines, it will be useful to always
have the status page being tracked by a struct i915_vma. Make it so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-4-chris@chris-wilson.co.uk

0ca88ba0

drm/i915: Move vma lookup to its own lock · 528cbd17

Chris Wilson authored Jan 28, 2019

Remove the struct_mutex requirement for looking up the vma for an
object.

v2: Highlight how the race for duplicate vma creation is resolved on
reacquiring the lock with a short comment.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk

528cbd17

drm/i915: Pull VM lists under the VM mutex. · 09d7e46b

Chris Wilson authored Jan 28, 2019

A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk

09d7e46b

drm/i915: Stop tracking MRU activity on VMA · 499197dc

Chris Wilson authored Jan 28, 2019

Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk

499197dc

drm/i915: Try to sanitize bogus DPLL state left over by broken SNB BIOSen · 7bed8adc

Ville Syrjälä authored Jan 11, 2019

Certain SNB machines (eg. ASUS K53SV) seem to have a broken BIOS
which misprograms the hardware badly when encountering a suitably
high resolution display. The programmed pipe timings are somewhat
bonkers and the DPLL is totally misprogrammed (P divider == 0).
That will result in atomic commit timeouts as apparently the pipe
is sufficiently stuck to not signal vblank interrupts.

IIRC something like this was also observed on some other SNB
machine years ago (might have been a Dell XPS 8300) but a BIOS
update cured it. Sadly looks like this was never fixed for the
ASUS K53SV as the latest BIOS (K53SV.320 11/11/2011) is still
broken.

The quickest way to deal with this seems to be to shut down
the pipe+ports+DPLL. Unfortunately doing this during the
normal sanitization phase isn't quite soon enough as we
already spew several WARNs about the bogus hardware state.
But it's better than hanging the boot for a few dozen seconds.
Since this is limited to a few old machines it doesn't seem
entirely worthwile to try and rework the readout+sanitization
code to handle it more gracefully.

v2: Fix potential NULL deref (kbuild test robot)
    Constify has_bogus_dpll_config()

Cc: stable@vger.kernel.org # v4.20+
Cc: Daniel Kamil Kozar <dkk089@gmail.com>
Reported-by: Daniel Kamil Kozar <dkk089@gmail.com>
Tested-by: Daniel Kamil Kozar <dkk089@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109245
Fixes: 516a49cc ("drm/i915: Fix assert_plane() warning on bootup with external display")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190111174950.10681-1-ville.syrjala@linux.intel.comReviewed-by: Mika Kahola <mika.kahola@intel.com>

7bed8adc

drm/i915/tv: Use the scanline counter for timestamps on i965gm TV output · 8a920e24

Ville Syrjälä authored Jan 25, 2019

Just like the frame counter, the pixel counter also reads zero
all the time when the TV encoder is used. Fortunately the
scanline counter still works sufficiently well so let's use that
to correct the vblank timestamps. Otherwise the timestamps may
en up out of whack, and since we use them to guesstimate the
vblank counter value that may end up incorrect as well.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125181931.19482-2-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

8a920e24

drm/i915/tv: Fix return value for intel_tv_compute_config() · 6a2a9404

Ville Syrjälä authored Jan 25, 2019

Ever since commit 204474a6 ("drm/i915: Pass down rc in
intel_encoder->compute_config()") we're supposed to return an
errno from .compute_config(). I failed to notice that when
pushing the TV encoder fixes which were written before said
commmit. Fix up the return value for the error case.

Cc: Imre Deak <imre.deak@intel.com>
Fixes: 690157f0 ("drm/i915/tv: Fix >1024 modes on gen3")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125181931.19482-1-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

6a2a9404

drm/i915: Wait for a moment before forcibly resetting the device · ad4062da

Chris Wilson authored Jan 28, 2019

During igt, we ask to reset the device if any requests are still
outstanding at the end of a test, as this quickly kills off any
erroneous hanging request streams that may escape a test. However, since
it may take the device a few milliseconds to flush itself after the end
of a normal test, *cough* guc *cough*, we may accidentally tell the
device to reset itself after it idles. If we wait a moment, our usual
I915_IDLE_ENGINES_TIMEOUT of 200ms (seems a bit high, but still better
than umpteen hangchecks!), we can differentiate better between a stuck
engine and a healthy one, and so avoid prematurely forcing the reset and
any extra complications that may entail.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128010245.20148-1-chris@chris-wilson.co.uk

ad4062da

26 Jan, 2019 1 commit

drm/i915: Disable -Wuninitialized · c5627461

Nathan Chancellor authored Jan 26, 2019

This warning is disabled by default in scripts/Makefile.extrawarn when
W= is not provided but this Makefile adds -Wall after this warning is
disabled so it shows up in the build when it shouldn't:

In file included from drivers/gpu/drm/i915/intel_breadcrumbs.c:895:
drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c:350:34: error:
variable 'wq' is uninitialized when used within its own initialization
[-Werror,-Wuninitialized]
        DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
                                        ^~
./include/linux/wait.h:74:63: note: expanded from macro
'DECLARE_WAIT_QUEUE_HEAD_ONSTACK'
        struct wait_queue_head name = __WAIT_QUEUE_HEAD_INIT_ONSTACK(name)
                               ~~~~                                  ^~~~
./include/linux/wait.h:72:33: note: expanded from macro
'__WAIT_QUEUE_HEAD_INIT_ONSTACK'
        ({ init_waitqueue_head(&name); name; })
                                       ^~~~
1 error generated.

Explicitly disable the warning like commit 46e20680 ("drm/i915:
Disable some extra clang warnings").

Link: https://github.com/ClangBuiltLinux/linux/issues/220Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190126071122.24557-1-natechancellor@gmail.com

c5627461

25 Jan, 2019 9 commits

drm/i915: correct the pitch check for NV12 framebuffer · 29214e8c

P Raviraj Sitaram authored Dec 19, 2018

framebuffer for NV12 requires the pitch to the multiplier of 4, instead
of the width. This patch corrects it.

For instance, a 480p video, whose width and pitch are 854 and 896
respectively, is excluded for NV12 plane so far.

Changes since v1:
    - Removed check for NV12 buffer dimensions since additional checks
      are done for viewport size in intel_sprite.c
Signed-off-by: Dongseong Hwang <dongseong.hwang@intel.com>
Signed-off-by: P Raviraj Sitaram <raviraj.p.sitaram@intel.com>
Cc: Chandra Konduru <chandra.konduru@intel.com>
Cc: Vidya Srinivas <vidya.srinivas@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1545208152-22658-1-git-send-email-raviraj.p.sitaram@intel.com

29214e8c

drm/i915: Clean up intel_plane_atomic_check_with_state() · 790cc994

Ville Syrjälä authored Jan 11, 2019

Rename some of the state variables in
intel_plane_atomic_check_with_state() to make it less confusing.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190111170823.4441-2-ville.syrjala@linux.intel.comReviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>

790cc994

drm/i915/tv: Filter out >1024 wide modes that would need vertical scaling on gen3 · 0bb1ffe4

Ville Syrjälä authored Nov 12, 2018

Since gen3 can't handle >1024 wide sources with vertical scaling
let's not advertize such modes in the mode list. Less tempetation
to the user to try out things that won't work.

v2: s/IS_GEN3(dev_priv/IS_GEN(dev_priv, 3)/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-17-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

0bb1ffe4

drm/i915/tv: Fix >1024 modes on gen3 · 690157f0

Ville Syrjälä authored Nov 12, 2018

On gen3 we must disable the TV encoder vertical filter for >1024
pixel wide sources. Once that's done all we can is try to center
the image on the screen. Naturally the TV mode vertical resolution
must be equal or larger than the user mode vertical resolution
or else we'd have to cut off part of the user mode.

And while we may not be able to respect the user's choice of
top and bottom borders exactly (or we'd have to reject he mode
most likely), we can try to maintain the relative sizes of the
top and bottom border with respect to each orher.

Additionally we must configure the pipe as interlaced if the
TV mode is interlaced.

v2: Make +intel_tv_connector_duplicate_state() static and drop
    the badly copy pasted kerneldoc
    s/IS_GEN3(dev_priv/IS_GEN(dev_priv, 3)/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-16-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

690157f0

drm/i915/tv: Generate better pipe timings for TV encoder · e3bb355c

Ville Syrjälä authored Nov 12, 2018

To make vblank timestamps work better with the TV encoder let's
scale the pipe timings such that the relationship between the
TV active and TV blanking periods is mirrored in the
corresponding pipe timings.

Note that in reality the pipe runs at a faster speed during the
TV vblank, and correspondigly there are periods when the pipe
is enitrely stopped. We pretend that this isn't the case and
as such we incur some error in the vblank timestamps during
the TV vblank. Further explanation of the issues in a big
comment in the code.

This makes the vblank timestamps good enough to make
i965gm (which doesn't have a working frame counter with
the TV encoder) report correct frame numbers. Previously
you could get all kinds of nonsense which resulted in
eg. glxgears reporting that it's running at twice the
actual framerate in most cases.

v2: s/IS_GEN4(dev_priv)/IS_GEN(dev_priv, 4)/ in the comment
    for consistency
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-15-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

e3bb355c

drm/i915/tv: Add 1080p30/50/60 TV modes · a0ff6779

Ville Syrjälä authored Nov 12, 2018

Add the missing 1080p TV modes. On gen4 all of them work just fine,
whereas on gen3 only the 30Hz mode actually works correctly.

v2: s/IS_GEN3(dev_priv)/IS_GEN(dev_priv, 3)/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-14-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

a0ff6779

drm/i915/tv: Nuke reported_modes[] · 528132a3

Ville Syrjälä authored Nov 12, 2018

Remove the silly reported_modes[] array. I suppse once upon a time
this actually had something to do with modes we reported to userspace.
Now it is just the placeholder for the mode we use for load detection.
We don't need it even for that, and instead we can just rely on
the fallback mode in intel_get_load_detect_pipe().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-13-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

528132a3

drm/i915/tv: Make TV mode autoselection actually useable · e94390aa

Ville Syrjälä authored Nov 12, 2018

The current code insists on picking a new TV mode when
switching between component and non-component cables.
That's super annoying. Let's just keep the current TV
mode unless the new cable type actually disagrees with it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-12-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

e94390aa

drm/i915/tv: Use drm_mode_set_name() to name TV modes · 5023520f

Ville Syrjälä authored Nov 12, 2018

No point in storing the mode names in the array. drm_mode_set_name()
will give us the same names without wasting space for these string
constants.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-11-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

5023520f