Commits · d1b32c32e8943d18a104f347fbe40c46276011d9 · Kirill Smelkov / linux

23 May, 2016 30 commits

drm/i915: Set BXT cdclk to minimum initially · d1b32c32

Ville Syrjälä authored May 13, 2016

In case the driver is initialized without active displays, we should
just drop the cdclk to the minimum frequency right off the bat. There
might not be a modeset to drop it to the minimum late rafter all.

With DMC supposedly we should always have the cdclk up and running.
The DMC will shut the DE PLL down when appropriate, so let's nuke
the related FIXMEs as well. Trying to do anything different would
go against the expectations of the DMC firmware, and we all know
how fragile the DMC firmware is.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-22-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

d1b32c32

drm/i915: Replace bxt_verify_cdclk_state() with a more generic cdclk check · 342be926

Ville Syrjälä authored May 13, 2016

Rather than having a BXT specific function to make sure the DE PLL is
enabled after disabling DC6, let's just make sure the current cdclk
is the same as what we last programmed.

Having another check in bxt_display_core_init() almost immediately after
the cdclk init seems redundant, so let's just kill that one.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-21-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

342be926

drm/i915: Make bxt_set_cdclk() operate in terms of the current vs target DE PLL vco · 5f199dfa

Ville Syrjälä authored May 13, 2016

Make bxt_set_cdclk() more readable by looking at current vs. target
DE PLL vco to determine if the DE PLL needs disabling and/or enabling.
We can also calculate the CD2X divider simply as (vco/cdclk) instead of
depending on magic numbers.

The magic numbers are still needed though, but only to map the supported
cdclk frequencies to corresponding DE PLL frequencies.

Note that w'll now program CDCLK_CTL correctly even for the bypass case.
Actually the CD2X divider should not matter in that case since the
hardware will bypass it too, but the "decimal" part should matter (if we
want to do gmbus/aux with the bypass enabled).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-20-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

5f199dfa

drm/i915: Rewrite broxton_get_display_clock_speed() in terms of the DE PLL vco/refclk · f5986242

Ville Syrjälä authored May 13, 2016

Now that we've read out the DE PLL vco and refclk, we can just use them
in the cdclk calculation. While at it switch over to
DIV_ROUND_CLOSEST().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-19-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

f5986242

drm/i915: Update cached cdclk state from broxton_init_cdclk() · 089c6fd5

Ville Syrjälä authored May 13, 2016

Let's make sure our cached cdclk state is accurate right after
broxton_init_cdclk() whether or not we end up changing the cdclk
frequency.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-18-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

089c6fd5

drm/i915: Store BXT DE PLL vco and ref clocks in dev_priv · 83d7c81f

Ville Syrjälä authored May 13, 2016

We have need to know the DE PLL refclk and output frequency in various
cdclk calculations, so let's store those in dev_priv.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-17-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

83d7c81f

drm/i915: Extract bxt DE PLL enable/disable from broxton_set_cdclk() · 2b73001e

Ville Syrjälä authored May 13, 2016

Enabling and disalbing the DE PLL are two nice self contained
operations, so let's move them into a few small helper functions.
Makes it easier to see the forest from the trees in broxton_set_cdclk().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-16-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

2b73001e

drm/i915: Store cdclk PLL reference clock under dev_priv · 709e05c3

Ville Syrjälä authored May 13, 2016

Future platforms will have multiple options for the cdclk PLL reference
clock, so let's start tracking that under dev_priv alreday on SKL,
although on SKL it's always 24 MHz.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-15-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

709e05c3

drm/i915: Rename skl_vco_freq to cdclk_pll.vco · 63911d72

Ville Syrjälä authored May 13, 2016

We'll want to store the cdclk PLL (whatever PLL that is in reality) vco
frequency somewhere on other platforms too, so let's rename the
skl_vco_freq to cdclk_pll.vco, and let's store it in kHz instead of MHz
to match most of the other clocks.

v2: Drop the spurious > vs != change (Imre)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-14-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

63911d72

drm/i915: Make 308 and 671 MHz cdclks more accurate on SKL · 487ed2e4

Ville Syrjälä authored May 13, 2016

The SKL 308.57 MHz cdclk is probably 8640/28 = ~308.571 Mhz.
Similartly the 617.14 MHz cdclk is probably 8640/14 = ~617.143 MHz.
Let's use the slightly more accurate numbers. Potentially we might
change to computing all of these based on dividers, but let's
stick to the current theme for now..
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-13-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

487ed2e4

drm/i915: Move SKL+ DBUF enable/disable to display core init/uninit · 70c2c184

Ville Syrjälä authored May 13, 2016

SKL and BXT have the same snippets of code for enabling disabling the
DBUF. Extract those into helpers and move the calls from
init/unit_cdclk() to the display core init/init since this stuff isn't
really about cdclk. Also doing the enable twice shouldn't hurt since
you're just setting the request bit again when it was already set.

We can also toss in a few WARNs about the register values into
skl_get_dpll0_vco() now that we know that things should always be
sane there.

Flatten skl_init_cdclk() while at it.

v2: s/skl/gen9/ in function names (Imre)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-12-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

70c2c184

drm/i915: Unify SKL cdclk init paths · 9f7eb31a

Ville Syrjälä authored May 13, 2016

Currently we initialize cdclk on SKL from two different places,
depending on whether it's during driver init or resume. Let's
unify it to happen from the same place always, and that place will be
the display core init function.

To do this we first run through the cdclk sanitation code, which will
first verify that the PLL is programmed correctly, after which we can
read out the current cdclk frequency, and once the cdclk is known we
verify that the cdclk "decimal" frequency is programmed correctly. If
any of these fail we will force a cdclk change, and to be safe we also
force the PLL to be turned off and on again. If the sanitation step
didn't notice anything amiss, we'll skip the cdclk programming which
will prevent cdclk reprogramming when the displays might be active.

We can also toss in a few WARNs about the register values into
skl_update_dpll0() since we now know that the PLL state should
always be sane when that function is called.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-11-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

9f7eb31a

drm/i915: Beef up skl_sanitize_cdclk() a bit · 09492498

Ville Syrjälä authored May 13, 2016

Also verify the DPLL_CTRL1 register value in skl_sanitize_cdclk(), throw
out a few unneeded variables, and write the CDCLK_CTL check a bit more
legible way.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-10-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

09492498

drm/i915: Keep track of preferred cdclk vco frequency on SKL · b2045352

Ville Syrjälä authored May 13, 2016

Now that skl_vco_freq tracks the actual DPLL0 vco frequency, we'll need
something that keeps track of which vco frequency we want to use in case
the current vco is 0. This would be important across supend/resume since
we'll disable DPLL0 around those parts.

We'll also update our idea of max cdclk/dotclock when the preferred
vco changes. That could happen if out initial guess was wrong, and
later eDP would force us to change it. One issue here could be that
changing the max dotclock could cause our mode list to change during
next time the displays get probed. But I don't see a good way to avoid
that, except perhaps by allowing either vco frequency to be used as
needed. But the docs suggest that such usage wasn't really inteded.

Also need to make sure we don't update our max_cdclk value before we
have a preferred vco value, which means moving that to happen after
the cdclk sanitation.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-9-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

b2045352

drm/i915: Allow enable/disable of DPLL0 around cdclk changes on SKL · 1cd593e0

Ville Syrjälä authored May 13, 2016

In case we originally guessed wrong which lcpll vco frequency to use,
we will need to shut down the pll and restart it when reprogamming the
cdclk.

This also allows us to track the actual vco frequency in dev_priv
instead of just a guess.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-8-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

1cd593e0

drm/i915: Report the current DPLL0 vco on SKL/KBL · 2f2a121a

Ville Syrjälä authored May 13, 2016

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-7-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

2f2a121a

drm/i915: Actually read out DPLL0 vco on skl from hardware · ea61791e

Ville Syrjälä authored May 13, 2016

Currently we're trying to guess which lcpll vco frequency is used
use based on the cdclk. That doesn't work for cdclk==540 since
both vco frequencies can generate a 540 Mhz output. Let's stop
guessing and just read the actual vco frequency from the
hardware.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-6-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

ea61791e

drm/i915: Extract skl_calc_cdclk() · a8ca4934

Ville Syrjälä authored May 13, 2016

We have many places where we want to pick a suitable cdclk frequency for
skl based on the dotclock and lcpll vco. Split that code into a small
helper and call it from all over.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-5-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

a8ca4934

drm/i915: Move the SKL DPLL0 VCO computation into intel_dp_compute_config() · 14d41b3b

Ville Syrjälä authored May 13, 2016

Shared plls won't get assigned until the .compute_clocks() hook gets
called, which happens from the crtc .atomic_check hook. That's too late
as the cdclk computation has already happened. So let's move the DPLL0
VCO computation into intel_dp_compute_config() so that it's done when
the cdclk computation happens. Also only do it for eDP since we only
pick DPLL0 for eDP.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-4-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

14d41b3b

drm/i915/skl: SKL CDCLK change on modeset tracking VCO · c89e39f3

Clint Taylor authored May 13, 2016

WARNING: Using ChromeOS with an eDP panel and a 4K@60 DP monitor connected
to DDI1 the system will hard hang during a cold boot. Occurs when DDI1
is enabled when the cdclk is less then required. DP connected to DDI2
and HPD on either port works correctly.

Set cdclk based on the max required pixel clock based on VCO
selected. Track boot vco instead of boot cdclk.

The vco is now tracked at the atomic level and all CRTCs updated if
the required vco is changed. Not tested with eDP v1.4 panels that
require 8640 vco due to availability.

V1: initial version
V2: add vco tracking in intel_dp_compute_config(), rename
skl_boot_cdclk.
V3: rebase, V2 feedback not possible as encoders are not aware of
atomic.
V4: track target vco is atomic state. modeset all CRTCs if vco changes
V5: rename atomic variable, cleaner if/else logic, use existing vco if
encoder does not return a new vco value. check_patch.pl cleanup
V6: simplify logic in intel_modeset_checks.
V7: reorder an IF for readability and whitespace fix.
V8: use dev_cdclk for tracking new cdclk during atomic
V9: correctly handle vco 8640 when crtcs==0
V10: Clean up if else in crtcs==0
V11: Rebase for new intel_dpll_mgr.c
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
[vsyrjala: rebased due to churn]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-3-git-send-email-ville.syrjala@linux.intel.com

c89e39f3

drm/i915: Fix BXT min_pixclk after state readout · 9558d15d

Ville Syrjälä authored May 13, 2016

commit 4e5ca60f ("drm/i915: Use ilk_max_pixel_rate() for BXT cdclk calculation")
tried to change BXT to use ilk_max_pixel_rate() to compute the
pipe pixel rate. I failed to notice that there was another place
in the state readout code that needs the same treatment. So let's
change that one too.

Should probably just change things to always compuyte the pipe pixel
rates, instead of just doing on platforms that can change cdclk
dynamically. But for now let's just move BXT fully over to the
side that uses ilk_pipe_pixel_rate().

Cc: Jani Nikula <jani.nikula@intel.com>
Fixes: 4e5ca60f ("drm/i915: Use ilk_max_pixel_rate() for BXT cdclk calculation")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-2-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

9558d15d

drm/i915/opregion: Rename init/fini functions to register/unregister · 03d92e47

Chris Wilson authored May 23, 2016

Current intel_opregion_init is called during the driver registration
phase and intel_opregion_fini from the unregistration phase. Rename the
functions so that this is clear from their names. The phases tell us
what we expect the existing hw state to be, e.g. whether interrupts are
still enabled etc.

It should be noted that the opregion init/fini routines are asymmetric
and this is carried across into their new names. Indeed, their new names
make it even clearer that perhaps all is not well in the opregion
suspend/resume sequence (as well in the module unload).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464012490-30961-2-git-send-email-chris@chris-wilson.co.ukReviewed-by: Jani Nikula <jani.nikula@linux.intel.com>

03d92e47

drm/i915/opregion: Convert to using native drm_i915_private · 6f9f4b7a

Chris Wilson authored May 23, 2016

Prefer passing struct drm_i915_private to internal interfaces as this
saves us having to dance between drm_device and our native struct. The
savings hare are small (only 70 bytes of unrequired dancing), but
progressive!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464012490-30961-1-git-send-email-chris@chris-wilson.co.ukReviewed-by: Jani Nikula <jani.nikula@linux.intel.com>

6f9f4b7a

drm/i915: Enable GSE interrupt on BDW+ · 11825b0d

Ville Syrjälä authored May 19, 2016

We've never actually enabled or unmasked the GSE interrupt on BDW+,
even though the interrupt handler was always prepared for it.
Let's enable it and see what happens.

Credit to Mark Kettenis who fixed this in the OpenBSD fork of the
driver. He reports that it fixed the "ACPI _BCM/_BCQ-based
brightness mechanism on a MacBookPro12,1 and a 3rd gen Lenovo X1
Carbon" for them.

Mark says:
"FWIW, this *is* needed if you want ACPI-based backlight control to
work. On Linux you probably don't notice, since "hardware" backlight
control is preferred over "firmware" or "platform" backlight control.

It would help me if this did land in the Linux tree though, as it will
make future imports of the i915 driver into OpenBSD easier."

So even though we don't really need this, let's put it in to help Mark
with future porting efforts. Should be harmless to have it enabled in
any case.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: http://lists.freedesktop.org/archives/intel-gfx/2015-December/081799.htmlReported-by: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463649283-28698-1-git-send-email-ville.syrjala@linux.intel.comAcked-by: Jani Nikula <jani.nikula@intel.com>

11825b0d

drm/i915/guc: rework guc_add_workqueue_item() · 0a31afbc

Dave Gordon authored May 13, 2016

Mostly little optimisations and future-proofing against code breakage.
For instance, if the driver is correctly following the submission
protocol, the "out of space" condition is impossible, so the previous
runtime WARN_ON() is promoted to a GEM_BUG_ON() for a more dramatic
effect in development and less impact in end-user systems.

Similarly we can make alignment checking more stringent and replace
other WARN_ON() conditions that don't relate to the runtime hardware
state with either BUILD_BUG_ON() for compile-time-detectable issues, or
GEM_BUG_ON() for logical "can't happen" errors.

With those changes, we can convert it to void, as suggested by Chris
Wilson, and update the calling code appropriately.

v2:
    Note that we're now putting the request seqno in the "fence_id"
    field of each GuC-work-item, in case it turns up somewhere useful
    (e.g. in a GuC log) [Tvrtko Ursulin].
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

0a31afbc

drm/i915/guc: don't spinwait if the GuC's workqueue is full · 551aaecd

Dave Gordon authored May 13, 2016

Rather than wait to see whether more space becomes available in the GuC
submission workqueue, we can just return -EAGAIN and let the caller try
again in a little while. This gets rid of an uninterruptable sleep in
the polling code :)

We'll also add a counter to the GuC client statistics, to see how often
we find the WQ full.

v2:
    Flag the likely() code path (Tvtrko Ursulin).

v4:
    Add/update comments about failure counters (Tvtrko Ursulin).
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

551aaecd

drm/i915/guc: pass request (not client) to i915_guc_{wq_check_space, submit}() · 7c2c270d

Dave Gordon authored May 13, 2016

The knowledge of how to derive the relevant client from the request
should be localised within i915_guc_submission.c; the LRC code shouldn't
have to know about the internal details of the GuC submission process.
And all the information the GuC code needs should be encapsulated in (or
reachable from) the request.

v2:
    GEM_BUG_ON() for bad GuC client (Tvrtko Ursulin).
    Add/update kerneldoc explaining check_space/submit protocol
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

7c2c270d

drm/i915/guc: add enable_guc_loading parameter · fce91f22

Dave Gordon authored May 20, 2016

Split the function of "enable_guc_submission" into two separate
options.  The new one ("enable_guc_loading") controls only the
*fetching and loading* of the GuC firmware image. The existing
one is redefined to control only the *use* of the GuC for batch
submission once the firmware is loaded.

In addition, the degree of control has been refined from a simple
bool to an integer key, allowing several options:
-1 (default)     whatever the platform default is
 0  DISABLE      don't load/use the GuC
 1  BEST EFFORT  try to load/use the GuC, fallback if not available
 2  REQUIRE      must load/use the GuC, else leave the GPU wedged

The new platform default (as coded here) will be to attempt to
load the GuC iff the device has a GuC that requires firmware,
but not yet to use it for submission. A later patch will change
to enable it if appropriate.

v4:
    Changed some error-message levels, mostly ERROR->INFO, per
    review comments by Tvrtko Ursulin.

v5:
    Dropped one more error message, disabled GuC submission on
    hypothetical firmware-free devices [Tvrtko Ursulin].

v6:
    Logging tidy by Tvrtko Ursulin:
     * Do not log falling back to execlists when wedging the GPU.
     * Do not log fw load errors when load was disabled by user.
     * Pass down some error code from fw load for log message to
       make more sense.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v5)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Tested-by: Fiedorowicz, Lukasz <lukasz.fiedorowicz@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v5)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> (v6)

fce91f22

drm/i915/guc: distinguish HAS_GUC() from HAS_GUC_UCODE/HAS_GUC_SCHED · 1a3d1898

Dave Gordon authored May 13, 2016

For now, anything with a GuC requires uCode loading, and then supports
command submission once loaded. But these are logically distinct from
simply "having a GuC", so we need a separate macro for the latter. Then,
various tests should use this new macro rather than HAS_GUC_UCODE() or
testing enable_guc_submission.

v4:
    Added a couple more uses of the new macro.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

1a3d1898

drm/i915/guc: rename loader entry points · f09d675f

Dave Gordon authored May 13, 2016

The GuC initialisation code could do other things apart from loading
firmware, so here we rename the three primary entry points to remove any
specific reference to "ucode" (no functional changes, just renaming).
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

f09d675f

22 May, 2016 1 commit
- drm/i915: Update DRIVER_DATE to 20160522 · 9cce4431
  Daniel Vetter authored May 22, 2016
```
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
```
  9cce4431
20 May, 2016 9 commits

drm/i915: Inline sg_next() for the optimised SGL iterator · 63d15326

Dave Gordon authored May 20, 2016

Avoiding the out-of-line call to sg_next() reduces the kernel execution
overhead by 10% in some workloads (for example the Unreal Engine 4 demo
Atlantis on 2GiB GTTs) which are dominated by the cost of inserting PTEs
due to texture thrashing. We can demonstrate this in a microbenchmark
that forces us to rebind the object on every execbuf, where we can
measure a 25% improvement, in the time required to execute an execbuf
requiring a texture to be rebound, for inlining the sg_next() for large
texture sizes.

Benchmark: igt/benchmarks/gem_exec_fault
Benchmark: igt/benchmarks/gem_exec_trace/Atlantis
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-5-git-send-email-chris@chris-wilson.co.uk

63d15326

drm/i915: Introduce & use new lightweight SGL iterators · 85d1225e

Dave Gordon authored May 20, 2016

The existing for_each_sg_page() iterator is somewhat heavyweight, and is
limiting i915 driver performance in a few benchmarks. So here we
introduce somewhat lighter weight iterators, primarily for use with GEM
objects or other case where we need only deal with whole aligned pages.

Unlike the old iterator, the new iterators use an internal state
structure which is not intended to be accessed by the caller; instead
each takes as a parameter an output variable which is set before each
iteration. This makes them particularly simple to use :)

One of the new iterators provides the caller with the DMA address of
each page in turn; the other provides the 'struct page' pointer required
by many memory management operations.

Various uses of for_each_sg_page() are then converted to the new macros.

v2: Force inlining of the sg_iter constructor and make the union
anonymous.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-4-git-send-email-chris@chris-wilson.co.uk

85d1225e

drm/i915: optimise i915_gem_object_map() for small objects · b338fa47

Dave Gordon authored May 20, 2016

We're using this function for ringbuffers and other "small" objects, so
it's worth avoiding an extra malloc()/free() cycle if the page array is
small enough to put on the stack. Here we've chosen an arbitrary cutoff
of 32 (4k) pages, which is big enough for a ringbuffer (4 pages) or a
context image (currently up to 22 pages).

v5:
    change name of local array [Chris Wilson]
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-3-git-send-email-chris@chris-wilson.co.uk

b338fa47

drm/i915: refactor i915_gem_object_pin_map() · dd6034c6

Dave Gordon authored May 20, 2016

The recently-added i915_gem_object_pin_map() can be further optimised
for "small" objects. To facilitate this, and simplify the error paths
before adding the new code, this patch pulls out the "mapping" part of
the operation (involving local allocations which must be undone before
return) into its own subfunction.

The next patch will then insert the new optimisation into the middle of
the now-separated subfunction.

This reorganisation will probably not affect the generated code, as the
compiler will most likely inline it anyway, but it makes the logical
structure a bit clearer and easier to modify.

v2:
    Restructure loop-over-pages & error check [Chris Wilson]

v3:
    Add page count to debug messages [Chris Wilson]
    Convert WARN_ON() to GEM_BUG_ON()

v4:
    Drop the DEBUG messages [Tvrtko Ursulin]
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-2-git-send-email-chris@chris-wilson.co.uk

dd6034c6

drm/i915/psr: Implement PSR2 w/a for gen9 · dc00b6a0

Daniel Vetter authored May 19, 2016

Found this while browsing Bspec. Looks like it applies to both skl and
kbl.

v2: Also for bxt (Art).

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sonika Jindal <sonika.jindal@intel.com>
Cc: Durgadoss R <durgadoss.r@intel.com>
Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com>
Cc: "Runyan, Arthur J" <arthur.j.runyan@intel.com>
Reviewed-by: Sonika Jindal<sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463642060-30728-1-git-send-email-daniel.vetter@ffwll.ch

dc00b6a0

drm/i915/psr: Use ->get_aux_send_ctl functions · d4dcbdce

Daniel Vetter authored May 18, 2016

I just wanted to get rid of the rmw cycle for gen9, but this also
fixes some bugs we haven't carried over, like using recommended
precharge and timeout values.

Also I noticed that we don't set the fastwake sync length on skl, and
that's used by PSR2 selective updates. Fix that.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sonika Jindal <sonika.jindal@intel.com>
Cc: Durgadoss R <durgadoss.r@intel.com>
Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463590036-17824-6-git-send-email-daniel.vetter@ffwll.ch

d4dcbdce

drm/i915/psr: Order DP aux transactions correctly · 6f32ea7e

Daniel Vetter authored May 18, 2016

On bdw/hsw we have a separate psr dp aux registers to set up, but on
bdw it's shared with the main dp aux thing. Which means any subsequent
dp aux transaction will trample over it, and hence must be done
beforehand.

Also this means we can't do any dp aux transactions while PSR is
active, or at least we must restore the old state.

Probably need a psr disable/enable pair around dp aux transactions in
general.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sonika Jindal <sonika.jindal@intel.com>
Cc: Durgadoss R <durgadoss.r@intel.com>
Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463590036-17824-5-git-send-email-daniel.vetter@ffwll.ch

6f32ea7e

drm/i915/psr: Make idle_frames sensible again · 1c80c25f

Daniel Vetter authored May 18, 2016

This reverts

commit dfaf37ba
Author: Rodrigo Vivi <rodrigo.vivi@intel.com>
Date:   Mon Dec 7 14:45:20 2015 -0800

    drm/i915: Fix idle_frames counter.

and

commit 97173eaf
Author: Rodrigo Vivi <rodrigo.vivi@intel.com>
Date:   Tue Jul 7 16:28:55 2015 -0700

    drm/i915: PSR: Increase idle_frames

and implements

commit d44b4dcb
Author: Rodrigo Vivi <rodrigo.vivi@intel.com>
Date:   Fri Nov 14 08:52:31 2014 -0800

    drm/i915: HSW/BDW PSR Set idle_frames = VBT + 1

without the hack to use 2 idle frames when VBT says 1. We keep the + 1
just for safety, although I haven't really figured out why that one
exists.

It's nonsense. idle_frames = number of frames where the screen is
entirely idle before we think about entering PSR.

idle_patter = part of link training, and we probably totally butchered
link training because we told the hw to entirely skip it. No wonder
PSR occasionally just fell over.

I suspect the reason we've increased idle frames is that it makes PSR
entry slightly less likely, and more likely to happen in a quite
system, which probably increased the changes the panel came back up
without link training. The proper fix is to implement link training
for PSR.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sonika Jindal <sonika.jindal@intel.com>
Cc: Durgadoss R <durgadoss.r@intel.com>
Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463590036-17824-3-git-send-email-daniel.vetter@ffwll.ch

1c80c25f

drm/i915/psr: Try to program link training times correctly · 50db1390

Daniel Vetter authored May 18, 2016

The default of 0 is 500us of link training, but that's not enough for
some platforms. Decoding this correctly means we're using 2.5ms of
link training on these platforms, which fixes flickering issues
associated with enabling PSR.

v2: Unbotch the math a bit.

v3: Drop debug hunk.

v4: Improve commit message.
Tested-by: Lyude <cpaul@redhat.com>
Cc: Lyude <cpaul@redhat.com>
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95176
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sonika Jindal <sonika.jindal@intel.com>
Cc: Durgadoss R <durgadoss.r@intel.com>
Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: fritsch@kodi.tv
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463590036-17824-2-git-send-email-daniel.vetter@ffwll.ch

50db1390