Commits · 699fc401da1c9cc8c6bda578ca3d6310924276a2 · nexedi / linux

13 Oct, 2015 6 commits

drm/i915: Include gpio_mmio_base in GMBUS reg defines · 699fc401

Ville Syrjälä authored Sep 18, 2015

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

699fc401

drm/i915: Parametrize HSW video DIP data registers · 436c6d4a

Ville Syrjälä authored Sep 18, 2015

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

436c6d4a

drm/i915: Eliminate weird parameter inversion from BXT PPS registers · 03999f04

Ville Syrjälä authored Oct 12, 2015

v2: Keep using the same registers (PCH_*) instead of accidentally
    starting to use the other ones (BXT_*)2) (Jesse)
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

03999f04

drm/i915: Hold dev->event_lock whilst inspecting intel_crtc->unpin_work · 6042639c

Chris Wilson authored Oct 10, 2015

We should serialise access to the intel_crtc->unpin_work through the
dev->event_lock spinlock. It should not be possible for it to disappear
without severe error as the mmio_flip worker has not tagged the
unpin_work pending flip-completion. Similarly if the error exists, just
taking the unpin_work whilst holding the spinlock and then using it
unserialised just masks the race. (It is supposed to be valid as the
unpin_work exists until the flip completion interrupt which should not
fire until we flush the mmio writes to update the display base which is
the last time we access the unpin_work from the kthread.)

References: https://bugs.freedesktop.org/show_bug.cgi?id=92335Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6042639c

i915: switch from acpi_os_ioremap to memremap · 115719fc

Williams, Dan J authored Oct 12, 2015

i915 expects the OpRegion to be cached (i.e. not __iomem), so explicitly
map it with memremap rather than the implied cache setting of
acpi_os_ioremap().

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

115719fc

drm/i915: Drop unnecessary #include <linux/vga_switcheroo.h> · adfb4672

Lukas Wunner authored Oct 12, 2015

Commit 599bbb9d ("drm/i915: i915 cannot provide switcher services.")
removed all remaining vga_switcheroo symbols from intel_acpi.c but left
the include. Drop it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

adfb4672

10 Oct, 2015 1 commit
- drm/i915: Update DRIVER_DATE to 20151010 · 80bea189
  Daniel Vetter authored Oct 10, 2015
```
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
```
  80bea189
09 Oct, 2015 8 commits

drm/i915: Partial revert of atomic watermark series · 261a27d1

Matt Roper authored Oct 08, 2015

It's been reported that the atomic watermark series triggers some
regressions on SKL, which we haven't been able to track down yet. Let's
temporarily revert these patches while we track down the root cause.

This commit squashes the reverts of:
76305b1a drm/i915: Calculate watermark configuration during atomic check (v2)
a4611e44 drm/i915: Don't set plane visible during HW readout if CRTC is off
a28170f3 drm/i915: Calculate ILK-style watermarks during atomic check (v3)
de4a9f83 drm/i915: Calculate pipe watermarks into CRTC state (v3)
de165e0b drm/i915: Refactor ilk_update_wm (v3)

Reference: http://lists.freedesktop.org/archives/intel-gfx/2015-October/077190.html
Cc: "Zanoni, Paulo R" <paulo.r.zanoni@intel.com>
Cc: "Vetter, Daniel" <daniel.vetter@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

261a27d1

drm/i915: Early exit from semaphore_waits_for for execlist mode. · 381e8ae3

Tomas Elf authored Oct 08, 2015

When submitting semaphores in execlist mode the hang checker crashes in this
function because it is only runnable in ring submission mode. The reason this
is of particular interest to the TDR patch series is because we use semaphores
as a mean to induce hangs during testing (which is the recommended way to
induce hangs for gen8+). It's not clear how this is supposed to work in
execlist mode since:

1. This function requires a ring buffer.

2. Retrieving a ring buffer in execlist mode requires us to retrieve the
corresponding context, which we get from a request.

3. Retieving a request from the hang checker is not straight-forward since that
requires us to grab the struct_mutex in order to synchronize against the
request retirement thread.

4. Grabbing the struct_mutex from the hang checker is nothing that we will do
since that puts us at risk of deadlock since a hung thread might be holding the
struct_mutex already.

Therefore it's not obvious how we're supposed to deal with this. For now, we're
doing an early exit from this function, which avoids any kernel panic situation
when running our own internal TDR ULT.

* v2: (Chris Wilson)
Turned the execlist mode check into a ringbuffer NULL check to make it more
submission mode agnostic and less of a layering violation.
Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

381e8ae3

drm/i915: Remove wrong warning from i915_gem_context_clean · 61fb5881

Tvrtko Ursulin authored Oct 08, 2015

commit e9f24d5f
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Mon Oct 5 13:26:36 2015 +0100

    drm/i915: Clean up associated VMAs on context destruction

Introduced a wrong assumption that all contexts have a ppgtt
instance. This is not true when full PPGTT is not active so
remove the WARN_ON_ONCE from the context cleanup code.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

61fb5881

drm/i915: Determine the stolen memory base address on gen2 · 0ad98c74

Ville Syrjälä authored Oct 08, 2015

There isn't an explicit stolen memory base register on gen2.
Some old comment in the i915 code suggests we should get it via
max_low_pfn_mapped, but that's clearly a bad idea on my MGM.

The e820 map in said machine looks like this:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009f7ff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000ce000-0x00000000000cffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000dc000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001f6effff] usable
[    0.000000] BIOS-e820: [mem 0x000000001f6f0000-0x000000001f6f7fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000001f6f8000-0x000000001f6fffff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000001f700000-0x000000001fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec10000-0x00000000fec1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ffb00000-0x00000000ffbfffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved

That makes max_low_pfn_mapped = 1f6f0000, so assuming our stolen memory
would start there would place it on top of some ACPI memory regions.
So not a good idea as already stated.

The 9MB region after the ACPI regions at 0x1f700000 however looks
promising given that the macine reports the stolen memory size to be
8MB. Looking at the PGTBL_CTL register, the GTT entries are at offset
0x1fee00000, and given that the GTT entries occupy 128KB, it looks like
the stolen memory could start at 0x1f700000 and the GTT entries would
occupy the last 128KB of the stolen memory.

After some more digging through chipset documentation, I've determined
the BIOS first allocates space for something called TSEG (something to
do with SMM) from the top of memory, and then it allocates the graphics
stolen memory below that. Accordind to the chipset documentation TSEG
has a fixed size of 1MB on 855. So that explains the top 1MB in the
e820 region. And it also confirms that the GTT entries are in fact at
the end of the the stolen memory region.

Derive the stolen memory base address on gen2 the same as the BIOS does
(TOM-TSEG_SIZE-stolen_size). There are a few differences between the
registers on various gen2 chipsets, so a few different codepaths are
required.

865G is again bit more special since it seems to support enough memory
to hit 4GB address space issues. This means the PCI allocations will
also affect the location of the stolen memory. Fortunately there
appears to be the TOUD register which may give us the correct answer
directly. But the chipset docs are a bit unclear, so I'm not 100%
sure that the graphics stolen memory is always the last thing the
BIOS steals. Someone would need to verify it on a real system.

I tested this on the my 830 and 855 machines, and so far everything
looks peachy.

v2: Rewrite to use the TOM-TSEG_SIZE-stolen_size and TOUD methods
v3: Fix TSEG size for 830
v4: Add missing 'else' (Chris)
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

0ad98c74

drm/i915: fix FBC buffer size checks · 856312ae

Paulo Zanoni authored Oct 01, 2015

According to my experiments (and later confirmation from the hardware
developers), the maximum sizes mentioned in the specification delimit
how far in the buffer the hardware tracking can go. And the hardware
calculates the size based on the plane address we provide - and the
provided plane address might not be the real x:0,y:0 point due to the
compute_page_offset() function.

On platforms that do the x/y offset adjustment trick it will be really
hard to reproduce a bug, but on the current SKL we can reproduce the
bug with igt/kms_frontbuffer_tracking/fbc-farfromfence. With this
patch, we'll go from "CRC assertion failure" to "FBC unexpectedly
disabled", which is still a failure on the test suite but is not a
perceived user bug - you will just not save as much power as you could
if FBC is disabled.

v2, rewrite patch after clarification from the Hadware guys:
  - Rename function so it's clear what the check is for.
  - Use the new intel_fbc_get_plane_source_sizes() function in order
    to get the proper sizes as seen by FBC.
v3:
  - Rebase after the s/sizes/size/ on the previous patch.
  - Adjust comment wording (Ville).
  - s/used_/effective_/ (Ville).

Testcase: igt/kms_frontbuffer_tracking/fbc-farfromfence (SKL)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

856312ae

drm/i915: fix CFB size calculation · c4ffd409

Paulo Zanoni authored Oct 01, 2015

We were considering the whole framebuffer height, but the spec says we
should only consider the active display height size. There were still
some unclear questions based on the spec, but the hardware guys
clarified them for us. According to them:

- CFB size = CFB stride * Number of lines FBC writes to CFB
- CFB stride = plane stride / compression limit
- Number of lines FBC writes to CFB = MIN(plane source height, maximum
  number of lines FBC writes to CFB)
- Plane source height =
  - pipe source height (PIPE_SRCSZ register) (before SKL)
  - plane size register height (PLANE_SIZE register) (SKL+)
- Maximum number of lines FBC writes to CFB =
  - plane source height (before HSW)
  - 2048 (HSW+)

For the plane source height, I could just have made our code do
I915_READ() in order to be more future proof, but since it's not cool
to do register reads I decided to just recalculate the values we use
when we actually write to those registers.

With this patch, depending on your machine configuration, a lot of the
kms_frontbuffer_tracking subtests that used to result in a SKIP due to
not enough stolen memory still start resulting in a PASS.

v2: Use the clipped src size instead of pipe_src_h (Ville).
v3: Use the appropriate information provided by the hardware guys.
v4: Bikesheds: s/sizes/size/, s/fb_cpp/cpp/ (Ville).
v5: - Don't use crtc->config->pipe_src_x for BDW- (Ville).
    - Fix the register name written in the comment.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c4ffd409

drm/i915: remove pre-atomic check from SKL update_primary_plane · a42e5a23

Paulo Zanoni authored Sep 30, 2015

The comment suggests the check was there for some non-fully-atomic
case, and I couldn't find a case where we wouldn't correctly
initialize plane_state, so remove the check.

Let's leave a WARN there just in case.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

a42e5a23

drm/i915: don't allocate fbcon from stolen memory if it's too big · 3badb49f

Paulo Zanoni authored Sep 23, 2015

Technology has evolved and now we have eDP panels with 3200x1800
resolution. In the meantime, the BIOS guys didn't change the default
32mb for stolen memory. On top of that, we can't assume our users will
be able to increase the default stolen memory size to more than 32mb -
I'm not even sure all BIOSes allow that.

So just the fbcon buffer alone eats 22mb of my stolen memroy, and due
to the BDW/SKL restriction of not using the last 8mb of stolen memory,
all that's left for FBC is 2mb! Since fbcon is not the coolest feature
ever, I think it's better to save our precious stolen resource to FBC
and the other guys.

On the other hand, we really want to use as much stolen memory as
possible, since on some older systems the stolen memory may be a
considerable percentage of the total available memory.

This patch tries to achieve a little balance using a simple heuristic:
if the fbcon wants more than half of the available stolen memory,
don't use stolen memory in order to leave some for FBC and the other
features.

The long term plan should be to implement a way to set priorities for
stolen memory allocation and then evict low priority users when the
high priority ones need the memory. While we still don't have that,
let's try to make FBC usable with the simple solution.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

3badb49f

08 Oct, 2015 5 commits

Revert "drm/i915: Call encoder hotplug for init and resume cases" · 3a2fb2c3

Daniel Vetter authored Oct 08, 2015

This reverts commit 5d250b05.

It results on a deadlock on platforms where we need to (at least
partially) re-init hpd interrupts from power domain code, since
->hot_plug might again grab a power well reference (to do edid/dp_aux
transactions. At least chv is affected.
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
References: http://mid.gmane.org/20151008133548.GX26517@intel.comSigned-off-by: Daniel Vetter <daniel.vetter@intel.com>

3a2fb2c3

Revert "drm/i915: Add hot_plug hook for hdmi encoder" · 8166fcea

Daniel Vetter authored Oct 08, 2015

This reverts commit 0b5e88dc.

It completely breaks booting on at least bsw (and maybe more).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88081Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

8166fcea

drm/i915: use error path · 11aee0f6

Sudip Mukherjee authored Oct 08, 2015

Use goto to handle the error path to avoid duplicating the same code. In
the error path intel_dig_port is the last one to be released as it was
the first one to be allocated and ideally the error path should be the
reverse of the execution path.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

11aee0f6

drm/i915/irq: Fix misspelled word register in kernel-doc · aafd8581

Javier Martinez Canillas authored Oct 08, 2015

There is a typo in the function i915_handle_error()
kernel-doc and the word register is spelled wrongly.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

aafd8581

drm/i915/irq: Fix kernel-doc warnings · 468f9d29

Javier Martinez Canillas authored Oct 08, 2015

Add the dev parameter for the functions i915_enable_asle_pipestat() and
i915_reset_and_wakeup() to the kernel-doc to fix the following warnings:

.//drivers/gpu/drm/i915/i915_irq.c:586: warning: No description found for parameter 'dev'
.//drivers/gpu/drm/i915/i915_irq.c:2400: warning: No description found for parameter 'dev'
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

468f9d29

07 Oct, 2015 16 commits

drm/i915: Hook up ring workaround writes at context creation time on Gen6-7. · 4f91fc6d

Francisco Jerez authored Oct 07, 2015

intel_rcs_ctx_init() emits all workaround register writes on the list
to the ring, in addition to calling i915_gem_render_state_init().  The
workaround list is currently empty on Gen6-7 so this shouldn't cause
any functional changes.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

4f91fc6d

drm/i915: Don't warn if the workaround list is empty. · 02235808

Francisco Jerez authored Oct 07, 2015

It's not an error for the workaround list to be empty if no
workarounds are needed.  This will avoid spamming the logs
unnecessarily on Gen6 after the workaround list is hooked up on
pre-Gen8 hardware by the following commits.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

02235808

drm/i915: Resurrect golden context on gen6/7 · 6a8aadc4

Daniel Vetter authored Oct 01, 2015

In

commit 8f0e2b9d
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Dec 2 16:19:07 2014 +0100

    drm/i915: Move golden context init into ->init_context

I've shuffled around per-ctx init code a bit for legacy contexts but
accidentally dropped the render state init call on gen6/7. Resurrect
it.
Reported-by: Francisco Jerez <currojerez@riseup.net>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6a8aadc4

drm/i915/chv: remove pre-production hardware workarounds · 5b5929cb

Jani Nikula authored Oct 07, 2015

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Appease gcc and remove the unused variable.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5b5929cb

drm/i915/snb: remove pre-production hardware workaround · d39398f5

Jani Nikula authored Oct 07, 2015

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

d39398f5

drm/i915/bxt: Set time interval unit to 0.833us · 26148bd3

Akash Goel authored Sep 18, 2015

Note that in Bspec you have to dig around in a section called
"Timestamp bases" and Bspec update request is filed.
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Add note about state of Bspec.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

26148bd3

drm/i915: Kill DRI1 cliprects · 2f5945bc

Chris Wilson authored Oct 06, 2015

Passing cliprects into the kernel for it to re-execute the batch buffer
with different CMD_DRAWRECT died out long ago. As DRI1 support has been
removed from the kernel, we can now simply reject any execbuf trying to
use this "feature".

To keep Daniel happy with the prospect of being able to reuse these
fields in the next decade, continue to ensure that current userspace is
not passing garbage in through the dead fields.

v2: Fix the cliprects_ptr check
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

2f5945bc

drm/i915: Avoid GPU stalls from kswapd · 5763ff04

Chris Wilson authored Oct 01, 2015

Exclude active GPU pages from the purview of the background shrinker
(kswapd), as these cause uncontrollable GPU stalls. Given that the
shrinker is rerun until the freelists are satisfied, we should have
opportunity in subsequent passes to recover the pages once idle. If the
machine does run out of memory entirely, we have the forced idling in the
oom-notifier as a means of releasing all the pages we can before an oom
is prematurely executed.

Note that this relies upon an up-front retire_requests to keep the
inactive list in shape, which was added in a previous patch, mostly as
execlist ctx pinning band-aids.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add note about retire_requests.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5763ff04

drm/i915: Remove dead i915_gem_evict_everything() · ce8daef3

Chris Wilson authored Oct 01, 2015

With UMS gone, we no longer use it during suspend. And with the last
user removed from the shrinker, we can remove the dead code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ce8daef3

drm/i915: During shrink_all we only need to idle the GPU · c9c0f5ea

Chris Wilson authored Oct 01, 2015

We can forgo an evict-everything here as the shrinker operation itself
will unbind any vma as required. If we explicitly idle the GPU through a
switch to the default context, we not only create a request in an
illegal context (e.g. whilst shrinking during execbuf with a request
already allocated), but switching to the default context will not free
up the memory backing the active contexts - unless in the unlikely
situation that context had already been closed (and just kept arrive by
being the current context). The saving is near zero and the danger real.

To compensate for the loss of the forced retire, add a couple of
retire-requests to i915_gem_shirnk() - this should help free up any
transitive cache from the requests.

Note that the second retire_requests is for the benefit of the
hand-rolled execlist ctx active tracking: We need to manually kick
requests to get those unpinned again. Once that's fixed we can try to
remove this again.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add summary of why we need a pile of retire_requests.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c9c0f5ea

drm/i915: Add a tracepoint for the shrinker · 3abafa53

Chris Wilson authored Oct 01, 2015

Often it is very useful to know why we suddenly purge vast tracts of
memory and surprisingly up until now we didn't even have a tracepoint
for when we shrink our memory.

Note that there are slab_start/end tracepoints already, but those
don't cover the internal recursion when we directly call into our
shrinker code. Hence a separate tracepoint seems justified. Also note
that we don't really need a separate tracepoint for the actual amount
of pages freed since we already have an unbind tracpoint for that.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add a note that there's also slab_start/end and why they're
insufficient.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

3abafa53

drm/i915: DocBook add i915_component.h support · cb422619

Libin Yang authored Oct 01, 2015

Add the item of i915_component.h in DocBook and add the DOC for
i915_component.h. Explain the struct i915_audio_component_ops and
struct i915_audio_component_audio_ops usage.
Signed-off-by: Libin Yang <libin.yang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

cb422619

drm/i915: add kerneldoc for i915_audio_component · 3f4c90bd

Libin Yang authored Oct 01, 2015

Add the kerneldoc for i915_audio_component in i915_component.h
Signed-off-by: Libin Yang <libin.yang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

3f4c90bd

Merge remote-tracking branch 'takashi/topic/drm-sync-audio-rate' into drm-intel-next-queued · 28446598

Daniel Vetter authored Oct 07, 2015

Pull in the i915/hda changes for N/CTS setting so I can apply the
follow-up documentation work for drm/i915.

Some conflicts because ofc we had to rework i915 while that N/CTS work
was going on. But not more than adjacent changes really.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

28446598

drm/i915: shrinker_control->nr_to_scan is now unsigned long · 14387540

Chris Wilson authored Oct 01, 2015

As the shrinker_control now passes us unsigned long targets, update our
shrinker functions to match.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

14387540

drm/i915: Fix kerneldoc for i915_gem_shrink_all · 1f2449cd

Daniel Vetter authored Oct 06, 2015

I've botched this, so let's fix it.

Botched in

commit eb0b44ad
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Mar 18 14:47:59 2015 +0100

    drm/i915: kerneldoc for i915_gem_shrinker.c

v2: Be a good citizen^Wmaintainer and add the proper commit citation.
Noticed by Jani.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

1f2449cd

06 Oct, 2015 4 commits

drm/i915: Use a task to cancel the userptr on invalidate_range · 380996aa

Chris Wilson authored Oct 01, 2015

Whilst discussing possible ways to trigger an invalidate_range on a
userptr with an aliased GGTT mmapping (and so cause a struct_mutex
deadlock), the conclusion is that we can, and we must, prevent any
possible deadlock by avoiding taking the mutex at all during
invalidate_range. This has numerous advantages all of which stem from
avoid the sleeping function from inside the unknown context. In
particular, it simplifies the invalidate_range because we no longer
have to juggle the spinlock/mutex and can just hold the spinlock
for the entire walk. To compensate, we have to make get_pages a bit more
complicated in order to serialise with a pending cancel_userptr worker.
As we hold the struct_mutex, we have no choice but to return EAGAIN and
hope that the worker is then flushed before we retry after reacquiring
the struct_mutex.

The important caveat is that the invalidate_range itself is no longer
synchronous. There exists a small but definite period in time in which
the old PTE's page remain accessible via the GPU. Note however that the
physical pages themselves are not invalidated by the mmu_notifier, just
the CPU view of the address space. The impact should be limited to a
delay in pages being flushed, rather than a possibility of writing to
the wrong pages. The only race condition that this worsens is remapping
an userptr active on the GPU where fresh work may still reference the
old pages due to struct_mutex contention. Given that userspace is racing
with the GPU, it is fair to say that the results are undefined.

v2: Only queue (and importantly only take one refcnt) the worker once.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

380996aa

drm/i915: Fix userptr deadlock with aliased GTT mmappings · e4b946bf

Chris Wilson authored Oct 01, 2015

Michał Winiarski found a really evil way to trigger a struct_mutex
deadlock with userptr. He found that if he allocated a userptr bo and
then GTT mmaped another bo, or even itself, at the same address as the
userptr using MAP_FIXED, he could then cause a deadlock any time we then
had to invalidate the GTT mmappings (so at will). Tvrtko then found by
repeatedly allocating GTT mmappings he could alias with an old userptr
mmap and also trigger the deadlock.

To counter act the deadlock, we make the observation that we only need
to take the struct_mutex if the object has any pages to revoke, and that
before userspace can alias with the userptr address space, it must have
invalidated the userptr->pages. Thus if we can check for those pages
outside of the struct_mutex, we can avoid the deadlock. To do so we
introduce a separate flag for userptr objects that we can inspect from
the mmu-notifier underneath its spinlock.

The patch makes one eye-catching change. That is the removal serial=0
after detecting a to-be-freed object inside the invalidate walker. I
felt setting serial=0 was a questionable pessimisation: it denies us the
chance to reuse the current iterator for the next loop (before it is
freed) and being explicit makes the reader question the validity of the
locking (since the object-free race could occur elsewhere). The
serialisation of the iterator is through the spinlock, if the object is
freed before the next loop then the notifier.serial will be incremented
and we start the walk from the beginning as we detect the invalid cache.

To try and tame the error paths and interactions with the userptr->active
flag, we have to do a fair amount of rearranging of get_pages_userptr().

v2: Grammar fixes
v3: Reorder set-active so that it is only set when obj->pages is set
(and so needs cancellation). Only the order of setting obj->pages and
the active-flag is crucial. Calling gup after invalidate-range begin
means the userptr sees the new set of backing storage (and so will not
need to invalidate its new pages), but we have to be careful not to set
the active-flag prior to successfully establishing obj->pages.
v4: Take the active->flag early so we know in the mmu-notifier when we
have to cancel a pending gup-worker.
v5: Rearrange the error path so that is not so convoluted
v6: Set pinned to 0 when negative before calling release_pages()
Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Testcase: igt/gem_userptr_blits/map-fixed*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

e4b946bf

drm/i915: Only update the current userptr worker · 68d6c840

Chris Wilson authored Oct 01, 2015

The userptr worker allows for a slight race condition where upon there
may two or more threads calling get_user_pages for the same object. When
we have the array of pages, then we serialise the update of the object.
However, the worker should only overwrite the obj->userptr.work pointer
if and only if it is the active one. Currently we clear it for a
secondary worker with the effect that we may rarely force a second
lookup.

v2: Rebase and rename a variable to avoid 80cols
v3: Mention v2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

68d6c840

drm/i915: prevent out of range pt in the PDE macros (take 3) · 24dfd073

Michel Thierry authored Oct 02, 2015

We tried to fix this in commit fdc454c1 ("drm/i915: Prevent out of
range pt in gen6_for_each_pde").

But the static analyzer still complains that, just before we break due
to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
iter value that is bigger than I915_PDES. Of course, this isn't really
a problem since no one uses pt outside the macro. Still, every single
new usage of the macro will create a new issue for us to mark as a
false positive.

Also, Paulo re-started the discussion a while ago [1], but didn't end up
implemented.

In order to "solve" this "problem", this patch takes the ideas from
Chris and Dave, but that check would change the desired behavior of the
code, because the object (for example pdp->page_directory[iter]) can be
null during init/alloc, and C would take this as false, breaking the for
loop immediately.

This has been already verified with "static analysis tools".

[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html

v2: Make it a single statement, while preventing the common subexpression
elimination (Chris)

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

24dfd073