Commits · 1c66c0f391da32534cf143e6a0f6391776aa9bf8 · Kirill Smelkov / linux

21 Dec, 2023 40 commits

drm/xe: fix submissions without vm · 1c66c0f3

Daniele Ceraolo Spurio authored Aug 22, 2023

Kernel queues can submit privileged batches directly in GGTT, so they
don't always need a vm. The submission front-end already supports
creating and submitting jobs without a vm, but some parts of the
back-end assume the vm is always there. Fix this by handling a lack of
vm in the back-end as well.

v2: s/XE_BUG_ON/XE_WARN_ON, s/engine/exec_queue
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230822173334.1664332-2-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1c66c0f3

drm/xe: Drop xe_mmio_write64() · 486b2ef2

Matt Roper authored Aug 22, 2023

The only possible 64-bit register writes in the driver come from the
highly questionable MMIO ioctl. That ioctl's register write support
only operates for userspace running as root and cannot be used by any
real userspace; it exists solely to support the "xe_reg" debug tool in
IGT. Since the spec indicates that hardware does not officially support
64-bit register accesses, there's no reason to allow such 64-bit writes,
even for debugging.

Bspec: 60027
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://lore.kernel.org/r/20230823003312.1356779-4-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

486b2ef2

drm/xe: Avoid 64-bit register reads · 07431945

Matt Roper authored Aug 22, 2023

Intel hardware officially only supports GTTMMADR register accesses of
32-bits or less (although 64-bit accesses to device memory and PTEs in
the GSM are fine).  Even though we do usually seem to get back
reasonable values when performing readq() operations on registers in
BAR0, we shouldn't rely on this violation of the spec working
consistently.  It's likely that even when we do get proper register
values back the hardware is internally satisfying the request via a
non-atomic sequence of two 32-bit reads, which can be problematic for
timestamps and counters if rollover of the lower bits is not considered.

Replace xe_mmio_read64() with xe_mmio_read64_2x32() that implements
64-bit register reads as two 32-bit reads and attempts to ensure that
the upper dword has stabilized to avoid problematic rollovers for
counter and timestamp registers.

v2:
 - Move function from xe_mmio.h to xe_mmio.c.  (Lucas)
 - Convert comment to kerneldoc and note that it shouldn't be used on
   registers where reads may trigger side effects.  (Lucas)

Bspec: 60027
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://lore.kernel.org/r/20230823003312.1356779-3-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

07431945

drm/xe/lnl: Hook up MOCS table · 770576f1

Balasubramani Vivekanandan authored Aug 11, 2023

LNL uses the Xe2 MOCS table introduced in an earlier patch.

Bspec: 71582
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

770576f1

drm/xe/lnl: Add GuC firmware definition · 943c01b7

Matt Roper authored Aug 11, 2023

Define the GuC firmware to load on the platform.

Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

943c01b7

drm/xe/lnl: Add LNL platform definition · 33303615

Matt Roper authored Aug 11, 2023

LNL is an integrated GPU based on the Xe2 architecture.

Bspec: 70821
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

33303615

drm/xe/xe2: Program GuC's MOCS on Xe2 and beyond · 0993b22f

Matt Roper authored Aug 11, 2023

As with PVC, Xe2 platforms require that the index of an uncached MOCS
entry be programmed into the GUC_SHIM_CONTROL register.  This will
likely be needed on future platforms as well.

Xe2 also extends the size of the MOCS index register field from two bits
to four bits.  Since these extra bits were unused on PVC, it should be
safe to just increase the size of the mask.

Bspec: 60592
Cc: Haridhar Kalvala <haridhar.kalvala@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0993b22f

drm/xe/xe2: Add MOCS table · e4751ab5

Balasubramani Vivekanandan authored Aug 11, 2023

Additional minor change to remove L4_2_RESERVED, which will never be
required.

v2: Make L3/L4 names consistent for GLOB_MOCS defines (Matt Roper)

Bspec: 71582
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

e4751ab5

drm/xe/xe2: Track VA bits independently of max page table level · e9bb0891

Matt Roper authored Aug 11, 2023

Starting with Xe2, a 5-level page table is always used, regardless of
the actual virtual address range supported by the platform.  The two
values need to be tracked separately in the device descriptor since Xe2
platforms only have a 48 bit virtual address range.

Bspec: 59505, 65637, 70817
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

e9bb0891

drm/xe/xe2: Define Xe2_LPM IP features · 595e4a3a

Matt Roper authored Aug 11, 2023

Xe2_LPM media is represented by GMD_ID value 20.00.
It provides 1 VD + 1 VE + 1 SFC.

Bspec: 70821, 70819
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

595e4a3a

drm/xe/xe2: Define Xe2_LPG IP features · 2985bedc

Matt Roper authored Aug 11, 2023

Define a common set of Xe2 graphics feature flags and definitions that
will be used for all platforms in this family.

Several of the feature flags are inherited unchanged from Xe_HP and/or
Xe_HPC platforms:
 - dma_mask_size remains 46   (Bspec 70817)
 - supports_usm=1             (Bspec 59651)
 - has_flatccs=1              (Bspec 58797)
 - has_asid=1                 (Bspec 59654, 59265, 60288)
 - has_range_tlb_invalidate=1 (Bspec 71126)

However some of them still need proper implementation in the driver to
be used, so they are disabled.

Notable Xe2-specific changes:
 - All Xe2 platforms use a five-level page table, regardless of the
   virtual address space for the platform.  (Bspec 59505)

The graphics engine mask represents the Xe2 architecture engines (Bspec
60149), but individual platforms may have a reduced set of usable
engines, as reflected by their fusing.

Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2985bedc

drm/xe/xe2: AuxCCS is no longer used · be6dd3c8

Matt Roper authored Aug 11, 2023

Starting with Xe2, all platforms (including igpu platforms) use FlatCCS
compression rather than AuxCCS.  Similar to PVC, any future platforms
that don't support FlatCCS should not attempt to fall back to AuxCCS
programming.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

be6dd3c8

drm/xe/xe2: Handle fused-off CCS engines · 53497182

Matt Roper authored Aug 11, 2023

On Xe2 platforms, availability of the CCS engines is reflected in the
FUSE4 register.

Bspec: 62483
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

53497182

drm/xe/xe2: Update context image layouts · c5fa5814

Matt Roper authored Aug 11, 2023

Engine register state layout has changed a bit on Xe2.  We'll also
explicitly define a BCS layout to ensure BLIT_SWCTL and BLIT_CCTL are
included.

Bspec: 65182, 60184, 55793
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

c5fa5814

drm/xe/xe2: Add MCR register steering for media GT · 8e99b545

Matt Roper authored Aug 11, 2023

Xe2 media has a few types of MCR registers, but all except for "GPMXMT"
can safely steer to instance 0,0.  GPMXMT follows the same rules that
MTL's OADDRM ranges did, so it can re-use the same enum value.

Bspec: 71186
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8e99b545

drm/xe/xe2: Add MCR register steering for primary GT · 5c82000f

Matt Roper authored Aug 11, 2023

Xe2 uses the same steering control register and steering semaphore
register as MTL.  As with recent platforms, group/instance 0,0 is
sufficient to target a non-terminated instance for most classes of MCR
registers; the only types of ranges that need to consider platform
fusing to find a non-terminated instance are SLICE/DSS ranges and a new
SQIDI_PSMI type of range.

Note that the range of valid bits in XE2_NODE_ENABLE_MASK may be reduced
for some Xe2 SKUs.  However the lowest bits are always valid and only
the lowest instance is obtained via __ffs(), so there's no need to
complicate the masking with extra platform/subplatform checks.

Also note that Wa_14017387313 suggests skipping MCR lock acquisition
around GAM and GAMWKR registers to prevent MCR register accesses in an
interrupt handler from deadlocking when the steering semaphore is
already held outside the interrupt context.  At this time Xe never
issues MCR accesses from within an interrupt handler so the workaround
is not currently needed.

v2:
  - [0x008700-0x0087FF] range to extend up to 0x887F (Matt Attwood)
  - [0x00EF00-0x00F4FF] -> [0x00F000, 0xFFFF] to follow latest
    bspec version (Bala)

Bspec: 71185
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5c82000f

drm/xe/xe2: Add GT topology readout · 015906ff

Matt Roper authored Aug 11, 2023

Xe2 platforms have three DSS fuse registers for both geometry and
compute.

Bspec: 67171, 67537, 67401, 67536
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

015906ff

drm/xe/xe2: Update render/compute context image sizes · 13a3398b

Matt Roper authored Aug 11, 2023

The render and compute context are significantly smaller on Xe2 than on
previous platforms.

Registers:
 - Render:  3008 dwords -> 12032 bytes -> round to 3 pages
 - Compute: 1424 dwords ->  5696 bytes -> round to 2 pages

We also allocate one additional page for the HWSP, so the total
allocation sizes for render and compute are 4 and 3 pages respectively.

Bspec: 65182, 56578, 55793
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

13a3398b

drm/xe: Stop tracking 4-tile support · 0aef9ff7

Matt Roper authored Aug 17, 2023

The choice of Y-major tiling format (either the legacy "TileY" or the
newer "Tile4") is based on graphics IP version (12.50 and beyond have
Tile4, earlier platforms have TileY). The tracking in xe was originally
added to allow re-using display from i915. However as of i915 commit
4ebf43d0 ("drm/i915: Eliminate has_4tile feature flag"), the display
code determines TileY vs Tile4 itself, so this can be removed from xe.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230817230407.909816-10-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0aef9ff7

drm/xe: enable idle msg and set hysteresis for GSCCS · 07d7ba13

Daniele Ceraolo Spurio authored Aug 17, 2023

On MTL (and only on MTL) the GSCCS defaults with idle messaging
disabled. This means that, once awoken, the GSCCS will never signal its
idleness to the GT. To allow the GT to enter the proper low-power state,
we need therefore to turn idle messaging on. As part of this, we also
need to set a proper hysteresis value for the engine.

v2: use MEDIA_VERSION() and CLR() for the RTP rule and action, add reg
bit define in descending order (Matt)

Bspec: 71496
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817221707.1602873-1-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

07d7ba13

drm/xe: don't expose the GSCCS to users · 92939935

Daniele Ceraolo Spurio authored Aug 17, 2023

The kernel is the only expected user of the GSCCS, so we don't want to
expose it to userspace.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817201831.1583172-7-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

92939935

drm/xe: GSC forcewake support · f4c33ae8

Daniele Ceraolo Spurio authored Aug 17, 2023

The ID for the GSC forcewake domain already exists, but we're missing
the register definitions and the domain intialization, so add that in.

v2: move reg definition to be in address order (Matt)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817201831.1583172-6-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

f4c33ae8

drm/xe: add GSCCS ring ops · aef61349

Daniele Ceraolo Spurio authored Aug 17, 2023

Like the BCS, the GSCCS doesn't have any special HW that needs handling
when emitting commands, so we can re-use the same emit_job code. To make
it clear that this is now a shared low-level function, it has been
renamed to use the "simple" postfix, instead of "copy", to indicate that
it applies to all engines that don't need any additional engine-specific
handling.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817201831.1583172-5-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

aef61349

drm/xe: add GSCCS irq support · 3d2b5d4e

Daniele Ceraolo Spurio authored Aug 17, 2023

The GSCCS has its own enable and mask registers. The interrupt identity
for the GSCCS shows OTHER_CLASS instance 6.

Bspec: 54029, 54030
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817201831.1583172-4-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3d2b5d4e

drm/xe: base definitions for the GSCCS · 29654910

Daniele Ceraolo Spurio authored Aug 17, 2023

The first step in introducing the GSCCS is to add all the basic defs for
it (name, mmio base, class/instance, lrc size etc).

Bspec: 60149, 60421, 63752
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817201831.1583172-3-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

29654910

drm/xe: common function to assign queue name · 0b1d1473

Daniele Ceraolo Spurio authored Aug 17, 2023

The queue name assignment is identical in both GuC and execlists
backends, so we can move it to a common function. This will make adding
a new entry in the next patch slightly cleaner.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817201831.1583172-2-daniele.ceraolospurio@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0b1d1473

drm/xe: Add CONFIG_DRM_XE_PREEMPT_TIMEOUT · a863b416

Niranjana Vishwanathapura authored Aug 07, 2023

Allow preemption timeout to be specified as a config option.

v2: Change unit to microseconds (Tejas)
v3: Remove get_default_preempt_timeout()
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a863b416

drm/xe: Simplify engine class sched_props setting · 50b09903

Niranjana Vishwanathapura authored Aug 07, 2023

Shortens the too long code lines.
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

50b09903

drm/xe/dg2: Remove Wa_15010599737 · 0955d3be

Shekhar Chauhan authored Aug 14, 2023

Since this is specific to DirectX, we don't need it on Linux.
Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230814150323.874033-1-shekhar.chauhan@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0955d3be

drm/xe: Improve vram info debug printing · 286089ce

Oak Zeng authored Jul 14, 2023

Print both device physical address range and CPU io range
of vram. Also print vram's actual size, usable size excluding
stolen memory, and CPU io accessible size.
V1:
  - Add back small BAR device info (Matt)
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

286089ce

drm/xe: Make xe_mem_region struct · 0887a2e7

Oak Zeng authored Jul 11, 2023

Make a xe_mem_region structure which will be used in the
coming patches. The new structure is used in both xe device
level (xe->mem.vram) and xe_tile level (tile->vram).

Make the definition of xe_mem_region.dpa_base to be the DPA
base of this memory region and change codes according to
this new definition.

v1:
  - rename xe_mem_region.base to dpa_base per conversation with Mike
    Ruhl
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0887a2e7

drm/xe: Call __guc_exec_queue_fini_async direct for KERNEL exec_queues · a20c75db

Matthew Brost authored Aug 11, 2023

Usually we call __guc_exec_queue_fini_async via a worker as the
exec_queue fini can be done from within the GPU scheduler which creates
a circular dependency without a worker. Kernel exec_queues are fini'd at
driver unload (not from within the GPU scheduler) so it is safe to
directly call __guc_exec_queue_fini_async.
Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a20c75db

drm/xe: skip rebind_list if vma destroyed · ca8656a2

Matthew Auld authored Aug 08, 2023

If we are closing a vm, mark each vma as XE_VMA_DESTROYED and skip
touching the rebind_list if this is seen on the eviction path. That way
we can safely drop the vm dma-resv lock on the close path without
needing to worry about racing with the eviction path trying to add stuff
to the rebind_list which can corrupt our contended list, since the
destroy and rebind links are the same list entry underneath.

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/514Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ca8656a2

drm/xe/guc_submit: fixup deregister in job timeout · ef6ea972

Matthew Auld authored Aug 09, 2023

Rather check if the engine is still registered before proceeding with
deregister steps. Also the engine being marked as disabled doesn't mean
the engine has been disabled or deregistered from GuC pov, and here we
are signalling fences so we need to be sure GuC is not still using this
context.

v2:
 - Drop the read_stopped() for this path. Since we are signalling
   fences on error here, best play it safe and wait for the GT reset to
   mark the engine as disabled, rather than it just being queued.
v3 (Matt Brost):
 - Keep the read_stopped() on the wait event, since there is no need to
   wait for an already scheduled GT reset. If it is set we can then just
   bail without signalling anything.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ef6ea972

drm/xe/pm: Add vram_d3cold_threshold for d3cold capable device · 3d4b0bfc

Anshuman Gupta authored Aug 02, 2023

Do not register vram_d3cold_threshold device sysfs universally
for each gfx device, only register sysfs and set the threshold
value for d3cold capable devices.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/all/20230802070449.2426563-1-anshuman.gupta@intel.com/Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3d4b0bfc

drm/xe: Add Wa_14015150844 for DG2 and Xe_LPG · c7e4a611

Matt Roper authored Jul 27, 2023

The workaround database tells us to set this bit, even though the bspec
indicates the bit doesn't exist on these platforms. Since this is a
write-only register, we also can't read back its value to verify whether
it's actually working or not. For now we'll trust that the workaround
database knows what it's talking about; if not, the hardware will just
ignore the attempt to write to a non-existent bit and it shouldn't cause
any problems.
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://lore.kernel.org/r/20230727220920.2291913-2-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

c7e4a611

drm/xe: don't warn for bogus pagefaults · 17d28aa8

Matthew Auld authored Aug 09, 2023

This appears to be easily user triggerable so warning is perhaps too
much. Rather just make it debug print.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/534Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

17d28aa8

drm/xe: Implement HW workaround 14016763929 · 7f6c6e50

Oak Zeng authored Aug 03, 2023

To workaround a HW bug on DG2, driver is required to map the whole
ppgtt virtual address space before GPU workload submission. Thus
set the XE_VM_FLAG_SCRATCH_PAGE flag during vm create so the whole
address space is mapped to point to scratch page.

v1:
  - Move the workaround implementation from xe_vm_create to
    xe_vm_create_ioctl - Brian
  - Reorder error checking in xe_vm_create_ioctl - Jose
  - Implement WA only for DG2-G10 and DG2-G12
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7f6c6e50

drm/xe: Update ARL-S DevIDs to the latest BSpec · de4651d6

Lucas De Marchi authored Aug 04, 2023

BSpec changed with regard the DevIDs for ARL-S. Update the define
accordingly.

Bspec: 55420
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://lore.kernel.org/r/20230804231709.1065087-3-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

de4651d6

drm/xe: Set max pte size when skipping rebinds · c47794bd

Matthew Brost authored Aug 02, 2023

When a rebind is skipped, we must set the max pte size of the newly
created vma to value of the old vma as we do not pte walk for the new
vma. Without this future rebinds may be incorrectly skipped due to the
wrong max pte size. Null binds are more likely to expose this bug as
larger ptes are more frequently used compared to normal bindings.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Testcase: dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Fixes: 8f33b4f0 ("drm/xe: Avoid doing rebinds")
Reference: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23045Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

c47794bd