Commits · 60c0a66ee96cbdb22e029da70ec42e249f1996a5 · nexedi / linux

12 Jul, 2018 5 commits

drm/i915: Tidy error handling in i915_gem_init_hw · 60c0a66e

Michał Winiarski authored Jul 12, 2018

Let's reorder things so that we can do onion teardown rather than double
goto.

References: b96f6ebf ("drm/i915: Correctly handle error path in i915_gem_init_hw")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712124810.25241-1-michal.winiarski@intel.com

60c0a66e

drm/i915/guc: Skip cleaning up the doorbells on error-before-allocate · 5bfbeacf

Chris Wilson authored Jul 12, 2018

If we fail the module load, we may try and cleanup before we even
allocate the GuC clients. KISS in order to try and re-enable
drv_module_reload for BAT.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712105830.20390-1-chris@chris-wilson.co.uk

5bfbeacf

drm/i915: Silence warning for no vlv powercontext · 818fed4f

Chris Wilson authored Jul 12, 2018

Along a module load error path, we may try to cleanup the powercontext
even before we have allocated it.  Reorganising GT powermanagement is an
 on going process, so for simplicity handle it.

[  522.733832] WARN_ON(!dev_priv->vlv_pctx)
[  522.733986] WARNING: CPU: 1 PID: 3856 at drivers/gpu/drm/i915/intel_pm.c:7350 intel_cleanup_gt_powersave+0x5f/0x70 [i915]
[  522.733991] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb btrtl btbcm btintel intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul bluetooth snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core ecdh_generic lpc_ich r8169 snd_pcm mii i2c_hid prime_numbers [last unloaded: i915]
[  522.734105] CPU: 1 PID: 3856 Comm: drv_module_relo Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4474+ #1
[  522.734110] Hardware name: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff/DN2820FYK, BIOS FYBYT10H.86A.0059.2017.0607.2130 06/07/2017
[  522.734193] RIP: 0010:intel_cleanup_gt_powersave+0x5f/0x70 [i915]
[  522.734197] Code: 00 74 0d 48 c7 83 68 a6 00 00 00 00 00 00 eb c8 e8 36 6f 37 e1 eb ec 48 c7 c6 c5 7a 3d a0 48 c7 c7 b5 78 3d a0 e8 71 04 e0 e0 <0f> 0b eb aa 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 c3 0f 1f 40
[  522.734445] RSP: 0018:ffffc900004f3af0 EFLAGS: 00010282
[  522.734453] RAX: 0000000000000000 RBX: ffff880106360000 RCX: 0000000000000001
[  522.734458] RDX: 0000000080000001 RSI: ffffffff820c65c4 RDI: 00000000ffffffff
[  522.734463] RBP: ffff880106360000 R08: 000000009f79baee R09: 0000000000000000
[  522.734467] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013b3133f8
[  522.734472] R13: 00000000ffffffed R14: ffff880106360d58 R15: ffff88013b3133f8
[  522.734477] FS:  00007f43f70af980(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[  522.734481] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  522.734486] CR2: 000055a13a787580 CR3: 00000001325e6000 CR4: 00000000001006e0
[  522.734490] Call Trace:
[  522.734595]  intel_modeset_cleanup+0xcf/0x140 [i915]
[  522.734682]  i915_driver_load+0xc85/0x10a0 [i915]
[  522.734694]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
[  522.734703]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[  522.734790]  i915_pci_probe+0x29/0x90 [i915]
[  522.734801]  pci_device_probe+0xa1/0x130
[  522.734813]  driver_probe_device+0x306/0x480
[  522.734824]  __driver_attach+0xdb/0x100
[  522.734830]  ? driver_probe_device+0x480/0x480
[  522.734836]  ? driver_probe_device+0x480/0x480
[  522.734844]  bus_for_each_dev+0x74/0xc0
[  522.734855]  bus_add_driver+0x15f/0x250
[  522.734863]  ? 0xffffffffa0793000
[  522.734870]  driver_register+0x56/0xe0
[  522.734877]  ? 0xffffffffa0793000
[  522.734883]  do_one_initcall+0x58/0x370
[  522.734893]  ? do_init_module+0x1d/0x1ea
[  522.734900]  ? rcu_read_lock_sched_held+0x6f/0x80
[  522.734906]  ? kmem_cache_alloc_trace+0x282/0x2e0
[  522.734918]  do_init_module+0x56/0x1ea
[  522.734927]  load_module+0x2435/0x2b20
[  522.734965]  ? __se_sys_finit_module+0xd3/0xf0
[  522.734972]  __se_sys_finit_module+0xd3/0xf0
[  522.734995]  do_syscall_64+0x55/0x190
[  522.735003]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  522.735009] RIP: 0033:0x7f43f675d839
[  522.735014] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[  522.735260] RSP: 002b:00007ffe69384238 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  522.735269] RAX: ffffffffffffffda RBX: 000056100e387090 RCX: 00007f43f675d839
[  522.735273] RDX: 0000000000000000 RSI: 000056100e37bff0 RDI: 0000000000000003
[  522.735278] RBP: 000056100e37bff0 R08: 0000000000000000 R09: 0000000000000000
[  522.735282] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
[  522.735286] R13: 000056100e37c890 R14: 0000000000000020 R15: 0000000000000027
[  522.735309] irq event stamp: 1389594
[  522.735316] hardirqs last  enabled at (1389593): [<ffffffff810f896c>] console_unlock+0x3fc/0x600
[  522.735323] hardirqs last disabled at (1389594): [<ffffffff81a0111c>] error_entry+0x7c/0x100
[  522.735329] softirqs last  enabled at (13893567): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505
[  522.735336] softirqs last disabled at (1389335): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0
[  522.735432] WARNING: CPU: 1 PID: 3856 at drivers/gpu/drm/i915/intel_pm.c:7350 intel_cleanup_gt_powersave+0x5f/0x70 [i915]

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712105454.16091-1-chris@chris-wilson.co.uk

818fed4f

drm/i915/tv: fix strncpy truncation warning · d6b4ea86

Dominique Martinet authored Jul 12, 2018

Change it to use strlcpy instead
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712074103.21571-1-chris@chris-wilson.co.ukSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>

d6b4ea86

Merge tag 'gvt-next-2018-07-11' of https://github.com/intel/gvt-linux into drm-intel-next-queued · 91045034

Rodrigo Vivi authored Jul 12, 2018

gvt-next-2018-07-11

- vGPU huge page support (Changbin)
- BXT display irq warning fix (Colin)
- Handle GVT dependency well (Henry)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711023353.GU1267@zhen-hp.sh.intel.com

91045034

11 Jul, 2018 3 commits

drm/i915/execlists: Switch to rb_root_cached · 655250a8

Chris Wilson authored Jun 29, 2018

The kernel recently gained an augmented rbtree with the purpose of
cacheing the leftmost element of the rbtree, a frequent optimisation to
avoid calls to rb_first() which is also employed by the
execlists->queue. Switch from our open-coded cache to the library.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180629075348.27358-9-chris@chris-wilson.co.uk

655250a8

drm/i915/selftests: Add a safety net to live_workarounds · cb4dc8da

Chris Wilson authored Jul 11, 2018

Since live_workarounds poke around the w/a registers and checks to see
if they survive across a reset, we are prone to fouling the machine and
leaving it in a non-recoverable state. Wrap the probe inside a timeout
to abort the test if the reset fails.

v2: Include GEM_TRACE on declaring wedged.
v3: Add a few includes to make the header look standalone.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711122952.18448-1-chris@chris-wilson.co.uk

cb4dc8da

drm/i915: Introduce i915_address_space.mutex · 19bb33c7

Chris Wilson authored Jul 11, 2018

Add a mutex into struct i915_address_space to be used while operating on
the vma and their lists for a particular vm. As this may be called from
the shrinker, we taint the mutex with fs_reclaim so that from the start
lockdep warns us if we are caught holding the mutex across an
allocation. (With such small steps we will eventually rid ourselves of
struct_mutex recursion!)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711073608.20286-2-chris@chris-wilson.co.uk

19bb33c7

10 Jul, 2018 11 commits

drm/i915: use the ICL stolen memory · 185441e0

Paulo Zanoni authored May 04, 2018

Now that our stolen memory is already reserved by the x86 subsystem
(since commit "x86/gpu: reserve ICL's graphics stolen memory"), make
use of it.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: x86@kernel.org
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180504203252.28048-2-paulo.r.zanoni@intel.com

185441e0

x86/gpu: reserve ICL's graphics stolen memory · db0c8d8b

Paulo Zanoni authored May 04, 2018

ICL changes the registers and addresses to 64 bits.

I also briefly looked at implementing an u64 version of the PCI config
read functions, but I concluded this wouldn't be trivial, so it's not
worth doing it for a single user that can't have any racing problems
while reading the register in two separate operations.

v2:
 - Scrub the development (non-public) changelog (Joonas).
 - Remove the i915.ko bits so this can be easily backported in order
   to properly avoid stolen memory even on machines without i915.ko
   (Joonas).
 - CC stable for the reasons above.

Issue: VIZ-9250
CC: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Fixes: 41231001 ("drm/i915/icl: Add initial Icelake definitions.")
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180504203252.28048-1-paulo.r.zanoni@intel.com

db0c8d8b

drm/i915: Unwind HW init after GVT setup failure · 7ab87ede

Chris Wilson authored Jul 10, 2018

Following intel_gvt_init() failure, we missed unwinding our setup
leaving pointers dangling past the module unload. For our example, the
pm_qos:

[  441.057615] top: 000000006b3baf1c, n: 0000000054d8ef33, p: 0000000097cdf1a2
               prev: 0000000054d8ef33, n: 0000000097cdf1a2, p: 000000006b3baf1c
               next: 0000000097cdf1a2, n: 000000006de8fc8b, p: 0000000081087253
[  441.057627] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40
[  441.057628] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers [last unloaded: i915]
[  441.057652] CPU: 4 PID: 9277 Comm: drv_selftest Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4464+ #1
[  441.057653] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
[  441.057656] RIP: 0010:plist_check_prev_next+0x2d/0x40
[  441.057657] Code: 08 48 39 f0 74 2b 49 89 f0 48 8b 4f 08 50 ff 32 52 48 89 fe 41 ff 70 08 48 8b 17 48 c7 c7 d8 ae 14 82 4d 8b 08 e8 63 0e 76 ff <0f> 0b 48 83 c4 20 c3 48 39 10 75 d0 f3 c3 0f 1f 44 00 00 41 54 55
[  441.057717] RSP: 0018:ffffc900003a3a68 EFLAGS: 00010082
[  441.057720] RAX: 0000000000000000 RBX: ffff8802193978c0 RCX: 0000000000000002
[  441.057721] RDX: 0000000080000002 RSI: ffffffff820c65a4 RDI: 00000000ffffffff
[  441.057722] RBP: ffff8802193978c0 R08: 0000000000000000 R09: 0000000000000001
[  441.057724] R10: ffffc900003a3a70 R11: 0000000000000000 R12: ffffffff82243de0
[  441.057725] R13: ffffffff82243de0 R14: ffff88021a6c78c0 R15: 0000000077359400
[  441.057726] FS:  00007fc23a4a9980(0000) GS:ffff880236d00000(0000) knlGS:0000000000000000
[  441.057728] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  441.057729] CR2: 0000563e4503d038 CR3: 0000000138f86005 CR4: 00000000003606e0
[  441.057730] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  441.057731] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  441.057732] Call Trace:
[  441.057736]  plist_check_list+0x2e/0x40
[  441.057738]  plist_add+0x23/0x130
[  441.057743]  pm_qos_update_target+0x1bd/0x2f0
[  441.057771]  i915_driver_load+0xec4/0x1060 [i915]
[  441.057775]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[  441.057800]  i915_pci_probe+0x29/0x90 [i915]
[  441.057804]  pci_device_probe+0xa1/0x130
[  441.057807]  driver_probe_device+0x306/0x480
[  441.057810]  __driver_attach+0xdb/0x100
[  441.057812]  ? driver_probe_device+0x480/0x480
[  441.057813]  ? driver_probe_device+0x480/0x480
[  441.057816]  bus_for_each_dev+0x74/0xc0
[  441.057819]  bus_add_driver+0x15f/0x250
[  441.057821]  ? 0xffffffffa0696000
[  441.057823]  driver_register+0x56/0xe0
[  441.057825]  ? 0xffffffffa0696000
[  441.057827]  do_one_initcall+0x58/0x370
[  441.057830]  ? do_init_module+0x1d/0x1ea
[  441.057832]  ? rcu_read_lock_sched_held+0x6f/0x80
[  441.057834]  ? kmem_cache_alloc_trace+0x282/0x2e0
[  441.057838]  do_init_module+0x56/0x1ea
[  441.057841]  load_module+0x2435/0x2b20
[  441.057852]  ? __se_sys_finit_module+0xd3/0xf0
[  441.057854]  __se_sys_finit_module+0xd3/0xf0
[  441.057861]  do_syscall_64+0x55/0x190
[  441.057863]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  441.057865] RIP: 0033:0x7fc239d75839
[  441.057866] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[  441.057927] RSP: 002b:00007fffb7825d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  441.057930] RAX: ffffffffffffffda RBX: 0000563e45035dd0 RCX: 00007fc239d75839
[  441.057931] RDX: 0000000000000000 RSI: 0000563e4502f8a0 RDI: 0000000000000004
[  441.057932] RBP: 0000563e4502f8a0 R08: 0000000000000004 R09: 0000000000000000
[  441.057933] R10: 00007fffb7825ea0 R11: 0000000000000246 R12: 0000000000000000
[  441.057934] R13: 0000563e4502f690 R14: 0000000000000000 R15: 000000000000003f
[  441.057940] irq event stamp: 231338
[  441.057943] hardirqs last  enabled at (231337): [<ffffffff8193e3fc>] _raw_spin_unlock_irqrestore+0x4c/0x60
[  441.057944] hardirqs last disabled at (231338): [<ffffffff8193e26d>] _raw_spin_lock_irqsave+0xd/0x50
[  441.057947] softirqs last  enabled at (231024): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505
[  441.057949] softirqs last disabled at (231005): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0
[  441.057951] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40

v2: Add a load failure point to intel_gvt_init() so that we always
exercise this path in future.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107129Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710143821.1889-1-chris@chris-wilson.co.uk

7ab87ede

drm/i915: Cleanup modesetting on load-error path · 73bad7ca

Chris Wilson authored Jul 10, 2018

After handling a critical failure initialising GEM we need to unwind the
modesetting setup.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710094421.16223-2-chris@chris-wilson.co.ukReviewed-by: Matthew Auld <matthew.auld@intel.com>

73bad7ca

drm/i915: Flush the residual parking on emergency shutdown · 8bcf9f70

Chris Wilson authored Jul 10, 2018

On unwinding following a critical failure inside GEM init, we also need
to be sure to flush the workers before unloading the module.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710094421.16223-1-chris@chris-wilson.co.uk

8bcf9f70

drm/i915: Tidy i915_gem_suspend() · bf06112f

Chris Wilson authored Jul 09, 2018

In the next patch, we will make a fairly minor change to flush
outstanding resets before suspend. In order to keep churn to a minimum
in that functional patch, we fix up the comments and coding style now.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709130208.11730-7-chris@chris-wilson.co.uk

bf06112f

drm/i915: Only reset hangcheck at the start of an activity cycle · b7bb6138

Chris Wilson authored Jul 09, 2018

Across a reset, the seqno (and thus hangcheck) should restart and the
hangcheck naturally progress, for when it does not, we want to declare an
emergency. Currently, we only detect if reset and reinit fails, but we
do not detect if the call to reinit succeeds but the HW is fried - as we
are resetting hangcheck on initialisation the engine. Remove that and
rely on the natural progress to reset the hangcheck timer.

References: e21b1413 ("drm/i915: Mark the hangcheck as idle when unparking the engines")
References: 1fd00c0f ("drm/i915: Declare the driver wedged if hangcheck makes no progress")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709130208.11730-2-chris@chris-wilson.co.uk

b7bb6138

drm/i915/selftests: Filter out both physical address swizzles · cecb368d

Chris Wilson authored Jul 09, 2018

In our swizzling selftests, we cannot predict the physical address of
the target page (at least not simply!) and so skip bit17 swizzles.
However, there are two bit17 swizzle modes and we only skipped one, with
the second being observed on the lab gdg causing the test to fail,
as soon as we hit a page with bit17 set in its address.

Testcase: igt/drv_selftest/live_objects #gdg
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709194915.5789-1-chris@chris-wilson.co.uk

cecb368d

drm/i915/selftests: Constrain mock_gtt tests to fit within RAM · ebfa7944

Chris Wilson authored Jul 10, 2018

Be pessimistic and presume that we actually allocate every page we
exercise via the mock_gtt (e.g. for gvt). In which case we have to keep
our working set under the available physical memory to prevent oom.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710080424.7821-1-chris@chris-wilson.co.uk

ebfa7944

drm/i915: Remove function details from device error messages · 8cff1f4a

Chris Wilson authored Jul 09, 2018

Error messages are intended to be addressed to the user; be clear,
succinct, instructive and unambiguous. Adding the function name to
that message does not add any information the user requires and in
the process makes the message less clear.

E.g.

[  245.539711] i915 0000:00:02.0: [drm:i915_gem_init [i915]] Failed to initialize GPU, declaring it wedged!

becomes

[  245.539711] i915 0000:00:02.0: Failed to initialize GPU, declaring it wedged!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709134858.12446-1-chris@chris-wilson.co.uk

8cff1f4a

drm/i915/gvt: declare gvt as i915's soft dependency · 279ce5d1

Hang Yuan authored Jul 09, 2018

This helps initramfs builder and other tools to know the full dependencies
of i915 and have gvt module loaded with i915.

v2: add condition and change to pre-dependency (Chris)
v3: move declaration to gvt.c. (Chris)
v4: remove xengt (Zhenyu)
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

279ce5d1

09 Jul, 2018 21 commits

drm/i915: Update DRIVER_DATE to 20180709 · 82edc7e8
Rodrigo Vivi authored Jul 09, 2018
```
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
```
82edc7e8

drm/i915/selftests: Prevent background reaping of active objects · 932cac10

Chris Wilson authored Jul 09, 2018

igt_mmap_offset_exhaustion() wants to test what happens when the mmap
space is filled with zombie objects, objects discarded by userspace but
still active on the GPU. As they are only protected by the active
reference, we have to be certain that active reference is kept while we
peek into our dangling pointer. That active reference should not be
freed until we retire, but we do that retirement from a background
thread. This leaves us with a subtle timing problem, exacerbated and
highlighted by KASAN:

<3>[  132.380399] BUG: KASAN: use-after-free in drm_gem_create_mmap_offset+0x8c/0xd0
<3>[  132.380430] Read of size 8 at addr ffff8801e13245f8 by task drv_selftest/5822

<4>[  132.380470] CPU: 0 PID: 5822 Comm: drv_selftest Tainted: G     U            4.18.0-rc3-g7ae7763aa2be-kasan_48+ #1
<4>[  132.380473] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
<4>[  132.380475] Call Trace:
<4>[  132.380481]  dump_stack+0x7c/0xbb
<4>[  132.380487]  print_address_description+0x65/0x270
<4>[  132.380493]  kasan_report+0x25b/0x380
<4>[  132.380497]  ? drm_gem_create_mmap_offset+0x8c/0xd0
<4>[  132.380503]  drm_gem_create_mmap_offset+0x8c/0xd0
<4>[  132.380584]  i915_gem_object_create_mmap_offset+0x6d/0x100 [i915]
<4>[  132.380650]  igt_mmap_offset_exhaustion+0x462/0x940 [i915]
<4>[  132.380714]  ? i915_gem_close_object+0x740/0x740 [i915]
<4>[  132.380784]  ? igt_gem_huge+0x269/0x3d0 [i915]
<4>[  132.380865]  __i915_subtests+0x5a/0x160 [i915]
<4>[  132.380936]  __run_selftests+0x1a2/0x2f0 [i915]
<4>[  132.381008]  i915_live_selftests+0x4e/0x80 [i915]
<4>[  132.381071]  i915_pci_probe+0xd8/0x1b0 [i915]
<4>[  132.381077]  pci_device_probe+0x1c5/0x3a0
<4>[  132.381087]  driver_probe_device+0x6b6/0xcb0
<4>[  132.381094]  __driver_attach+0x22d/0x2c0
<4>[  132.381100]  ? driver_probe_device+0xcb0/0xcb0
<4>[  132.381103]  bus_for_each_dev+0x113/0x1a0
<4>[  132.381108]  ? check_flags.part.24+0x450/0x450
<4>[  132.381112]  ? subsys_dev_iter_exit+0x10/0x10
<4>[  132.381123]  bus_add_driver+0x38b/0x6e0
<4>[  132.381131]  driver_register+0x189/0x400
<4>[  132.381136]  ? 0xffffffffc12d8000
<4>[  132.381140]  do_one_initcall+0xa0/0x4c0
<4>[  132.381145]  ? initcall_blacklisted+0x180/0x180
<4>[  132.381152]  ? do_init_module+0x4a/0x54c
<4>[  132.381156]  ? rcu_lockdep_current_cpu_online+0xdc/0x130
<4>[  132.381161]  ? kasan_unpoison_shadow+0x30/0x40
<4>[  132.381169]  do_init_module+0x1b5/0x54c
<4>[  132.381177]  load_module+0x619e/0x9b70
<4>[  132.381202]  ? module_frob_arch_sections+0x20/0x20
<4>[  132.381211]  ? vfs_read+0x257/0x2f0
<4>[  132.381214]  ? vfs_read+0x257/0x2f0
<4>[  132.381221]  ? kernel_read+0x8b/0x130
<4>[  132.381231]  ? copy_strings_kernel+0x120/0x120
<4>[  132.381244]  ? __se_sys_finit_module+0x17c/0x1a0
<4>[  132.381248]  __se_sys_finit_module+0x17c/0x1a0
<4>[  132.381252]  ? __ia32_sys_init_module+0xa0/0xa0
<4>[  132.381261]  ? __se_sys_newstat+0x77/0xd0
<4>[  132.381265]  ? cp_new_stat+0x590/0x590
<4>[  132.381269]  ? kmem_cache_free+0x2f0/0x340
<4>[  132.381285]  do_syscall_64+0x97/0x400
<4>[  132.381292]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  132.381295] RIP: 0033:0x7eff4af46839
<4>[  132.381297] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
<4>[  132.381426] RSP: 002b:00007ffcd84f4cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
<4>[  132.381432] RAX: ffffffffffffffda RBX: 000055dfdeb429a0 RCX: 00007eff4af46839
<4>[  132.381435] RDX: 0000000000000000 RSI: 000055dfdeb43670 RDI: 0000000000000004
<4>[  132.381437] RBP: 000055dfdeb43670 R08: 0000000000000004 R09: 0000000000000000
<4>[  132.381440] R10: 00007ffcd84f4e60 R11: 0000000000000246 R12: 0000000000000000
<4>[  132.381442] R13: 000055dfdeb3bec0 R14: 0000000000000000 R15: 000000000000003b

<3>[  132.381466] Allocated by task 5822:
<4>[  132.381485]  kmem_cache_alloc+0xdf/0x2e0
<4>[  132.381546]  i915_gem_object_create_internal+0x24/0x1e0 [i915]
<4>[  132.381609]  igt_mmap_offset_exhaustion+0x257/0x940 [i915]
<4>[  132.381677]  __i915_subtests+0x5a/0x160 [i915]
<4>[  132.381742]  __run_selftests+0x1a2/0x2f0 [i915]
<4>[  132.381806]  i915_live_selftests+0x4e/0x80 [i915]
<4>[  132.381865]  i915_pci_probe+0xd8/0x1b0 [i915]
<4>[  132.381868]  pci_device_probe+0x1c5/0x3a0
<4>[  132.381871]  driver_probe_device+0x6b6/0xcb0
<4>[  132.381874]  __driver_attach+0x22d/0x2c0
<4>[  132.381877]  bus_for_each_dev+0x113/0x1a0
<4>[  132.381880]  bus_add_driver+0x38b/0x6e0
<4>[  132.381884]  driver_register+0x189/0x400
<4>[  132.381886]  do_one_initcall+0xa0/0x4c0
<4>[  132.381889]  do_init_module+0x1b5/0x54c
<4>[  132.381892]  load_module+0x619e/0x9b70
<4>[  132.381895]  __se_sys_finit_module+0x17c/0x1a0
<4>[  132.381898]  do_syscall_64+0x97/0x400
<4>[  132.381901]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

<3>[  132.381914] Freed by task 150:
<4>[  132.381931]  kmem_cache_free+0xb7/0x340
<4>[  132.381995]  __i915_gem_free_objects+0x875/0xf50 [i915]
<4>[  132.382054]  __i915_gem_free_work+0x69/0xb0 [i915]
<4>[  132.382058]  process_one_work+0x78b/0x1740
<4>[  132.382061]  worker_thread+0x82/0xb80
<4>[  132.382064]  kthread+0x30c/0x3d0
<4>[  132.382067]  ret_from_fork+0x3a/0x50

<3>[  132.382081] The buggy address belongs to the object at ffff8801e1324500
                   which belongs to the cache drm_i915_gem_object of size 1168
<3>[  132.382133] The buggy address is located 248 bytes inside of
                   1168-byte region [ffff8801e1324500, ffff8801e1324990)
<3>[  132.382179] The buggy address belongs to the page:
<0>[  132.382202] page:ffffea000784c800 count:1 mapcount:0 mapping:ffff8801dedf6500 index:0xffff8801e1323ec0 compound_mapcount: 0
<0>[  132.382251] flags: 0x8000000000008100(slab|head)
<1>[  132.382274] raw: 8000000000008100 ffff8801d6317440 ffff8801d6317440 ffff8801dedf6500
<1>[  132.382307] raw: ffff8801e1323ec0 0000000000140013 00000001ffffffff 0000000000000000
<1>[  132.382339] page dumped because: kasan: bad access detected

<3>[  132.382373] Memory state around the buggy address:
<3>[  132.382395]  ffff8801e1324480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
<3>[  132.382426]  ffff8801e1324500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[  132.382457] >ffff8801e1324580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[  132.382488]                                                                 ^
<3>[  132.382517]  ffff8801e1324600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[  132.382548]  ffff8801e1324680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

This patch tricks the system into running without the background retire
thread, until after we finish the test. The only reaping should then be
performed by the mmap offset routine to reclaim the space as required.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709130208.11730-1-chris@chris-wilson.co.uk

932cac10

drm/i915/selftests: Replace wait-on-timeout with explicit timeout · d9a13ce3

Chris Wilson authored Jul 09, 2018

In igt_flush_test() we install a background timer in order to ensure
that the wait completes within a certain time. We can now tell the wait
that it has to complete within a timeout, and so no longer need the
background timer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-3-chris@chris-wilson.co.uk

d9a13ce3

drm/i915: Provide a timeout to i915_gem_wait_for_idle() on setup · 2621cefa

Chris Wilson authored Jul 09, 2018

With a broken GPU we expect it to fail during the initial
GPU setup where do a couple of context switches to record the defaults.
This is a task that takes a few milliseconds even on the slowest of
devices, but we may have to wait 60s for hangcheck to give in and
declare the machine inoperable. In this a case where any gpu hang is
unacceptable, both from a timeliness and practical standpoint.

We can therefore set a timeout on our wait-for-idle that is shorter than
the hangcheck (which may be up to 60s for a declaring a wedged driver)
and so detect the broken GPU much more quickly during driver load (and
so prevent stalling userspace for ages).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-2-chris@chris-wilson.co.uk

2621cefa

drm/i915: Provide a timeout to i915_gem_wait_for_idle() · ec625fb9

Chris Wilson authored Jul 09, 2018

Usually we have no idea about the upper bound we need to wait to catch
up with userspace when idling the device, but in a few situations we
know the system was idle beforehand and can provide a short timeout in
order to very quickly catch a failure, long before hangcheck kicks in.

In the following patches, we will use the timeout to curtain two overly
long waits, where we know we can expect the GPU to complete within a
reasonable time or declare it broken.

In particular, with a broken GPU we expect it to fail during the initial
GPU setup where do a couple of context switches to record the defaults.
This is a task that takes a few milliseconds even on the slowest of
devices, but we may have to wait 60s for hangcheck to give in and
declare the machine inoperable. In this a case where any gpu hang is
unacceptable, both from a timeliness and practical standpoint.

The other improvement is that in selftests, we do not need to arm an
independent timer to inject a wedge, as we can just limit the timeout on
the wait directly.

v2: Include the timeout parameter in the trace.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-1-chris@chris-wilson.co.uk

ec625fb9

drm/i915/selftests: Magic numbers for old Y-tiling · e1479132

Chris Wilson authored Jul 07, 2018

i915g has a slightly different tiling layout, and so requires a
different reference swizzle pattern.

Testcase: igt/drv_selftests/live_objects #gdg
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180707100405.817-1-chris@chris-wilson.co.uk

e1479132

drm/i915/gvt: Handle EDP_PSR_IMR and EDP_PSR_IIR for BXT. · 93d68b25

Colin Xu authored Jul 09, 2018

BXT supports EDP. However since GVT-g only simulate DP monitor
to guest and handles EDP_PSR_IMR and EDP_PSR_IIR as default MMIO
r/w. If guest r/w these IMR/IIR, GVT-g won't simulate the real
HW behavior and below warning is printed:
--------
Interrupt register 0x64838 is not zero: 0xffffffff
WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/i915/i915_irq.c:161
gen3_assert_iir_is_zero+0x34/0xa0

Call Trace:
gen8_de_irq_postinstall+0xad/0x330
gen8_irq_postinstall+0x23/0x80
drm_irq_install+0xb5/0x130
i915_driver_load+0xafd/0xf70
--------
Since GVT-g won't simulate EDP to guest, always set EDP_PSR_IMR
and EDP_PSR_IIR IMR/IIR to 0.
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

93d68b25

drm/i915: Enable platform support for vGPU huge gtt pages · aa36ed6d

Changbin Du authored May 15, 2018

Now GVTg supports shadowing both 2M/64K huge gtt pages. So let's turn on
the cap info bit VGT_CAPS_HUGE_GTT.

v2: Split changes in i915 side into a separated patch.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

aa36ed6d

drm/i915/gvt: Fix error handling in ppgtt_populate_spt_by_guest_entry · 80e76ea6

Changbin Du authored May 15, 2018

Don't forget to free allocated spt if shadowing failed.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

80e76ea6

drm/i915/gvt: Handle special sequence on PDE IPS bit · 54c81653

Changbin Du authored May 15, 2018

If the guest update the 64K gtt entry before changing IPS bit of PDE, we
need to re-shadow the whole page table. Because we have ignored all
updates to unused entries.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

54c81653

drm/i915/gvt: Add 2M huge gtt support · b901b252

Changbin Du authored May 15, 2018

This add 2M huge gtt support for GVTg. Unlike 64K gtt entry, we can
shadow 2M guest entry with real huge gtt. But before that, we have to
check memory physical continuous, alignment and if it is supported on
the host. We can get all supported page sizes from
intel_device_info.page_sizes.

Finally we must split the 2M page into smaller pages if we cannot
satisfy guest Huge Page.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

b901b252

drm/i915/kvmgt: Support setting dma map for huge pages · 79e542f5

Changbin Du authored May 15, 2018

To support huge gtt, we need to support huge pages in kvmgt first.
This patch adds a 'size' param to the intel_gvt_mpt::dma_map_guest_page
API and implements it in kvmgt.

v2: rebase.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

79e542f5

drm/i915/gvt: Add 64K huge gtt support · eb3a3530

Changbin Du authored May 15, 2018

Finally, this add the first huge gtt support for GVTg - 64K pages. Since
64K page and 4K page cannot be mixed on the same page table, so we always
split a 64K entry into small 4K page. And when unshadow guest 64K entry,
we need ensure all the shadowed entries in shadow page table also get
cleared.

For page table which has 64K gtt entry, only PTE#0, PTE#16, PTE#32, ...
PTE#496 are used. Unused PTEs update should be ignored.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

eb3a3530

drm/i915/gvt: Make PTE iterator 64K entry aware · 4c9414d7

Changbin Du authored May 15, 2018

64K PTE is special, only PTE#0, PTE#16, PTE#32, ... PTE#496 are used in
the page table.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

4c9414d7

drm/i915/gvt: Split ppgtt_alloc_spt into two parts · 155521c9

Changbin Du authored May 15, 2018

We need a interface to allocate a pure shadow page which doesn't have
a guest page associated with. Such shadow page is used to shadow 2M
huge gtt entry.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

155521c9

drm/i915/gvt: Add GTT clear_pse operation · c3e69763

Changbin Du authored May 15, 2018

Add clear_pse operation in case we need to split huge gtt into small pages.

v2: correct description.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

c3e69763

drm/i915/gvt: Add software PTE flag to mark special 64K splited entry · 71634848

Changbin Du authored May 15, 2018

This add a software PTE flag on the Ignored bit of PTE. It will be used
to identify splited 64K shadow entries.

v2: fix mask definition.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

71634848

drm/i915/gvt: Detect 64K gtt entry by IPS bit of PDE · 40b27176

Changbin Du authored May 15, 2018

This change help us detect the real entry type per PSE and IPS setting.
For 64K entry, we also need to check reg GEN8_GAMW_ECO_DEV_RW_IA.

v2: Extend IPS mmio control to Gen10. (Matthew Auld)
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

40b27176

drm/i915/gvt: Handle MMIO GEN8_GAMW_ECO_DEV_RW_IA for 64K GTT · 52ca14e6

Changbin Du authored May 15, 2018

The register RENDER_HWS_PGA_GEN7 is renamed to GEN8_GAMW_ECO_DEV_RW_IA
from GEN8 which can control IPS enabling.

v3: MMIO control for IPS is not removed from gen9 but gen10 (Matthew Auld)
v2: IPS of all engines must be enabled together for gen9.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

52ca14e6

drm/i915/gvt: Add PTE IPS bit operations · 6fd79378

Changbin Du authored May 15, 2018

Add three IPS operation functions to test/set/clear IPS in PDE.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

6fd79378

drm/i915/gvt: Add new 64K entry type · b294657d

Changbin Du authored May 15, 2018

Add a new entry type GTT_TYPE_PPGTT_PTE_64K_ENTRY. 64K entry is very
different from 2M/1G entry. 64K entry is controlled by IPS bit in upper
PDE. To leverage the current logic, I take IPS bit as 'PSE' for PTE
level. Which means, 64K entries can also processed by get_pse_type().

v2: Make it bisectable.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

b294657d