Commits · 451028644775a5e07aaab3f147fda583e7054de6 · Kirill Smelkov / linux

21 Dec, 2023 40 commits

drm/xe/pat: Prefer the arch/IP names · 45102864

Lucas De Marchi authored Sep 27, 2023

Both DG2 and PVC are derived from XeHP, but DG2 should not really
re-use something introduced by PVC, so it's odd to have DG2 re-using the
PVC programming for PAT. Let's prefer using the architecture and/or IP
names.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-8-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

45102864

drm/xe/dg2: Fix using wrong PAT table · 194bdb85

Lucas De Marchi authored Sep 27, 2023

DG2 should use the MCR variant to program the PAT registers, like PVC,
but shouldn't use the same table as PVC.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-7-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

194bdb85

drm/xe: Use vfunc to initialize PAT · b445be57

Lucas De Marchi authored Sep 27, 2023

Split the PAT initialization between SW-only and HW. The _early() only
sets up the ops and data structure that are used later to program the
tables. This allows the PAT to be easily extended to other platforms.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-6-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

b445be57

drm/xe/migrate: Do not hand-encode pte · 23c8495e

Lucas De Marchi authored Sep 27, 2023

Instead of encoding the pte, call a new vfunc from xe_vm to handle that.
The encoding may not be the same on every platform, so keeping it in one
place helps to better support them.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-5-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

23c8495e

drm/xe: Use vfunc for pte/pde ppgtt encoding · 0e5e77bd

Lucas De Marchi authored Sep 27, 2023

Move the function to encode pte/pde to be vfuncs inside struct xe_vm.
This will allow to easily extend to platforms that don't have a
compatible encoding.

v2: Fix kunit build
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-4-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0e5e77bd

drm/xe: Remove check for vma == NULL · b3bb7d9c

Lucas De Marchi authored Sep 27, 2023

vma at this point can never be NULL as otherwise it would crash earlier
in the only caller, xe_pt_stage_bind_entry(). Remove the extra check and
avoid adding and removing the bits from the pte.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-3-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

b3bb7d9c

drm/xe: Normalize pte/pde encoding · f24081cd

Lucas De Marchi authored Sep 27, 2023

Split functions that do only part of the pde/pte encoding and that can
be called by the different places. This normalizes how pde/pte are
encoded so they can be moved elsewhere in a subsequent change.

xe_pte_encode() was calling __pte_encode() with a NULL vma, which is the
opposite of what xe_pt_stage_bind_entry() does. Stop passing a NULL vma
and just split another function that deals with a vma rather than a bo.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-2-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

f24081cd

drm/xe: Infer service copy functionality from engine list · 1951dad5

Matt Roper authored Sep 27, 2023

On platforms with multiple BCS engines (i.e., PVC and Xe2), not all BCS
engines are created equal.  The BCS0 engine is what the specs refer to
as a "resource copy engine," which supports the platform's full set of
copy/fill instructions.  In contast, the non-BCS0 "service copy" engines
are more streamlined and only support a subset of the GPU instructions
supported by the resource copy engine.  Platforms with both types of
copy engines always support the MEM_COPY and MEM_SET instructions which
can be used for simple copy and fill operations on either type of BCS
engine.  Since the simple MEM_SET instruction meets the needs of Xe's
migrate code (and since the more elaborate XY_FAST_COLOR_BLT instruction
isn't available to use on service copy engines), we always prefer to use
MEM_SET for clearing buffers on our newer platforms.

We've been using a 'has_link_copy_engine' feature flag to keep track of
which platforms should use MEM_SET for fills.  However a feature flag
like this is unnecessary since we can already derive the presence of
service copy engines (and in turn the MEM_SET instruction) just by
looking at the platform's pre-fusing engine list.  Utilizing the engine
list for this also avoids mistakes like we've made on Xe2 where we
forget to set the feature flag in the IP definition.

For clarity, "service copy" is a general term that covers any blitter
engines that support a limited subset of the overall blitter instruction
set (in practice this is any non-BCS0 blitter engine).  The "link copy
engines" introduced on PVC and the "paging copy engine" present in Xe2
are both instances of service copy engines.

v2:
 - Rewrite / expand the commit message.  (Bala)
 - Fix checkpatch whitespace error.

Bspec: 65019
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com>
Link: https://lore.kernel.org/r/20230927205143.2695089-2-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1951dad5

drm/xe/irq: Clear GFX_MSTR_IRQ as part of IRQ reset · 51a5d656

Gustavo Sousa authored Sep 26, 2023

Starting with Xe_LP+, GFX_MSTR_IRQ contains status bits that have W1C
behavior. If we do not properly reset them, we would miss delivery of
interrupts if a pending bit is set when enabling IRQs.

As an example, the display part of our probe routine contains paths
where we wait for vblank interrupts. If a display interrupt was already
pending when enabling IRQs, we would time out waiting for the vblank.

That in fact happened recently when modprobing Xe on a Lunar Lake with a
specific configuration; and that's how we found out we were missing this
step in the IRQ enabling logic.

Fix the issue by clearing GFX_MSTR_IRQ as part of the IRQ reset.

v2:
  - Make resetting GFX_MSTR_IRQ be the last step to avoid bit
    re-latching. (Ville)
v3:
  - Swap nesting order: guard loop with the IP version check instead of
    doing the check at each iteration. (Lucas)
v4:
  - Add braces for the "if" statement guarding the loop to make the
    compiler happy. (Gustavo)

BSpec: 50875, 54028, 62357
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230926221914.106843-2-gustavo.sousa@intel.comSigned-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

51a5d656

drm/xe: Add Wa_18028616096 · 5fdd4b21

Shekhar Chauhan authored Sep 25, 2023

Drop UGM per set fragment threshold to 3

BSpec: 54833
Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230925160543.915217-1-shekhar.chauhan@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5fdd4b21

drm/xe: Align size to PAGE_SIZE · 02cadbb5

Pallavi Mishra authored Sep 21, 2023

Ensure alignment with PAGE_SIZE for the size parameter
passed to __xe_bo_create_locked()

v2: move size alignment under else condition (Lucas)
Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230920213259.3458968-1-pallavi.mishra@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

02cadbb5

drm/xe: Accept a const xe device · babba646

Lucas De Marchi authored Sep 22, 2023

Depending on the context, it's preferred to have a const pointer to make
sure nothing is modified underneath. The assert macros only ever read
data from xe/tile/gt for printing, so they can be made const by default.
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20230922174320.2372617-1-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

babba646

drm/xe: add msix support · bc18dae5

Dani Liberman authored Sep 18, 2023

In future devices we will need to support msix interrupts.
Reviewed-by: Ohad Sharabi <osharabi@habana.ai>
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

bc18dae5

drm/xe: change old msi irq api to a new one · 14d25d8d

Dani Liberman authored Sep 18, 2023

As a preparation for msix support, changing for new msi irq api
which supports both msi and msix.
Reviewed-by: Ohad Sharabi <osharabi@habana.ai>
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rebase fixes by Rodrigo]

14d25d8d

drm/xe: proper setting of irq enabled flag · dbac286d

Dani Liberman authored Sep 18, 2023

IRQ enabled flag should be set only after request irq succeeds.
Reviewed-by: Ohad Sharabi <osharabi@habana.ai>
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

dbac286d

drm/xe: Implement fdinfo memory stats printing · 08452333

Tejas Upadhyay authored Sep 15, 2023

Use the newly added drm_print_memory_stats helper to show memory
utilisation of our objects in drm/driver specific fdinfo output.

To collect the stats we walk the per memory regions object lists
and accumulate object size into the respective drm_memory_stats
categories.

Objects with multiple possible placements are reported in multiple
regions for total and shared sizes, while other categories are
counted only for the currently active region.

V4:
  - Remove rcu lock - Auld/Thomas
  - take refcnt only if its non-zero - Auld
  - DMA_RESV_USAGE_BOOKKEEP covers all fences - Auld
  - covert to xe_bo for public objects
V3:
  - dont use xe_bo_get/put, not needed
  - use designated initializer - Jani
  - use list_for_each_entry_rcu
  - Fix Checkpatch err - CI
V2:
  - Use static initializer for mem_type - Himal/Jani
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

08452333

drm/xe: Account ring buffer and context state storage · 303fb116

Tejas Upadhyay authored Aug 29, 2023

Account ring buffers and logical context space against the owning client
memory usage stats.
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

303fb116

drm/xe: Track page table memory usage for client · 2ff00c4f

Tejas Upadhyay authored Sep 14, 2023

Account page table memory usage in the owning client
memory usage stats.

V2:
  - Minor tweak to if (vm->pt_root[id]) check - Himal
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2ff00c4f

drm/xe: Record each drm client with its VM · 9e4e9761

Tejas Upadhyay authored Aug 10, 2023

Enable accounting of indirect client memory usage.
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

9e4e9761

drm/xe: Add tracking support for bos per client · b27970f3

Tejas Upadhyay authored Sep 21, 2023

In order to show per client memory consumption, we
need tracking support APIs to add at every bo consumption
and removal. Adding APIs here to add tracking calls at
places wherever it is applicable.

V5:
  - Rebase
V4:
  - remove client bo before vm_put
  - spin_lock_irqsave not required - Auld
V3:
  - update .h to return xe_drm_client_remove_bo void
  - protect xe_drm_client_remove_bo under CONFIG_PROC_FS check - Himal
  - Fixed Checkpatch error - CI
V2:
  - make xe_drm_client_remove_bo return void - Himal
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

b27970f3

drm/xe: Interface xe drm client with fdinfo interface · 85c6ad1a

Tejas Upadhyay authored Sep 14, 2023

DRM core driver has introduced recently fdinfo interface to
show memory stats of individual drm client. Lets interface
xe drm client to fdinfo interface.

V2:
  - cover call to xe_drm_client_fdinfo under PROC_FS
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

85c6ad1a

drm/xe: Add drm-client infrastructure · 8f965392

Tejas Upadhyay authored Sep 14, 2023

Add drm-client infrastructure to record stats of consumption
done by individual drm client.

V2:
  - Typo - CI
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8f965392

drm/xe: Add child contexts to the GuC context lookup · cb90d469

Daniele Ceraolo Spurio authored Sep 14, 2023

The CAT_ERROR message from the GuC provides the guc id of the context
that caused the problem, which can be a child context. We therefore
need to be able to match that id to the exec_queue that owns it, which
we do by adding child context to the context lookup.

While at it, fix the error path of the guc id allocation code to
correctly free the ids allocated for parallel queues.

v2: rebase on s/XE_WARN_ON/xe_assert

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/590Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

cb90d469

drm/xe/wa: Apply tile workarounds at probe/resume · 0d053475

Matt Roper authored Sep 13, 2023

Although the vast majority of workarounds the driver needs to implement
are either GT-based or display-based, there are occasionally workarounds
that reside outside those parts of the hardware (i.e., in they target
registers in the sgunit/soc); we can consider these to be "tile"
workarounds since there will be instance of these registers per tile.
The registers in question should only lose their values during a
function-level reset, so they only need to be applied during probe and
resume; the registers will not be affected by GT/engine resets.

Tile workarounds are rare (there's only one, 22010954014, that's
relevant to Xe at the moment) so it's probably not worth updating the
xe_rtp design to handle tile-level workarounds yet, although we may want
to consider that in the future if/when more of these show up on future
platforms.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20230913231411.291933-13-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0d053475

drm/xe: Disallow pinning dma-bufs in VRAM · 7764222d

Thomas Hellström authored Sep 20, 2023

For now only support pinning in TT memory, for two reasons:
1) Avoid pinning in a placement not accessible to some importers.
2) Pinning in VRAM requires PIN accounting which is a to-do.

v2:
- Adjust the dma-buf kunit test accordingly.
Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230920095001.5539-1-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7764222d

drm/xe: Simplify final return from xe_irq_install() · d435a039

Gustavo Sousa authored Sep 15, 2023

At the end of the function, we will always return err no matter it's
value. Simplify this by just returning the result of
drmm_add_action_or_reset().
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230915220233.59736-1-gustavo.sousa@intel.comSigned-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

d435a039

drm/xe: Reinstate pipelined fence enable_signaling · fc678ec7

Thomas Hellström authored Sep 15, 2023

With the GPUVA conversion, the xe_bo::vmas member became replaced with
drm_gem_object::gpuva.list, however there was a couple of usage instances
left using the old member. Most notably the pipelined fence
enable_signaling.

Remove the xe_bo::vmas member completely, fix usage instances and
also enable this pipelined fence enable_signaling even for faulting
VM:s since we actually wait for bind fences to complete.

v2:
- Rebase.
v3:
- Fix display code build error.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230915172606.14436-1-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

fc678ec7

drm/xe/uc: Add GuC/HuC firmware path overrides · a455ed04

Daniele Ceraolo Spurio authored Sep 13, 2023

When testing a new binary and/or debugging binary-related issues, it is
useful to have the option to change which binary is loaded without
having to update and re-compile the kernel. To support this option, this
patch adds 2 new modparams to override the FW path for GuC and HuC. The
HuC modparam can also be set to an empty string to disable HuC loading.

Note that those modparams only take effect on platforms where we already
have a default FW, so we're sure there is support for FW loading and the
kernel isn't going to explode in an undefined path.

v2: simplify comment (John),
    rebase on s/guc_submission_enabled/uc_enabled
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a455ed04

drm/xe/uc: Fix uC status tracking · 75730847

Daniele Ceraolo Spurio authored Sep 13, 2023

The current uC status tracking has a few issues:

1) the HuC is moved to "disabled" instead of "not supported"

2) the status is left uninitialized instead of "disabled" when the
   modparam is used to disable support

3) due to #1, a number of checks are done against "disabled" instead of
   the appropriate status.

Address all of those by making sure to follow the appropriate state
transition and checking against the required state.

v2: rebase on s/guc_submission_enabled/uc_enabled/
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

75730847

drm/xe/uc: Rename guc_submission_enabled() to uc_enabled() · c4991ee0

Daniele Ceraolo Spurio authored Sep 13, 2023

The guc_submission_enabled() function is being used as a boolean toggle
for all firmwares and all related features, not just GuC submission. We
could add additional flags/functions to distinguish and allow different
use-cases (e.g. loading HuC but not using GuC submission), but given
that not using GuC is a debug-only scenario having a global switch for
all FWs is enough. However, we want to make it clear that this switch
turns off everything, so rename it to uc_enabled().

v2: rebase on s/XE_WARN_ON/xe_assert
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

c4991ee0

drm/xe/pmu: Enable PMU interface · 3856b0f7

Aravind Iddamsetty authored Aug 30, 2023

There are a set of engine group busyness counters provided by HW which are
perfect fit to be exposed via PMU perf events.

BSPEC: 46559, 46560, 46722, 46729, 52071, 71028

events can be listed using:
perf list
  xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
  xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
  xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
  xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
  xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]

and can be read using:

perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
           time             counts unit events
     1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/

The pmu base implementation is taken from i915.

v2:
Store last known value when device is awake return that while the GT is
suspended and then update the driver copy when read during awake.

v3:
1. drop init_samples, as storing counters before going to suspend should
be sufficient.
2. ported the "drm/i915/pmu: Make PMU sample array two-dimensional" and
dropped helpers to store and read samples.
3. use xe_device_mem_access_get_if_ongoing to check if device is active
before reading the OA registers.
4. dropped format attr as no longer needed
5. introduce xe_pmu_suspend to call engine_group_busyness_store
6. few other nits.

v4: minor nits.

v5: take forcewake when accessing the OAG registers

v6:
1. drop engine_busyness_sample_type
2. update UAPI documentation

v7:
1. update UAPI documentation
2. drop MEDIA_GT specific change for media busyness counter.
Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3856b0f7

drm/xe: Use spinlock in forcewake instead of mutex · cd853419

Aravind Iddamsetty authored Aug 30, 2023

In PMU we need to access certain registers which fall under GT power
domain for which we need to take forcewake. But as PMU being an atomic
context can't expect to have any sleeping calls.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

cd853419

drm/xe: Get GT clock to nanosecs · 8d07691c

Aravind Iddamsetty authored Aug 30, 2023

Helper to convert GT clock cycles to nanoseconds.

v2: Use DIV_ROUND_CLOSEST_ULL helper(Ashutosh)
v3: rename xe_gt_clock_interval_to_ns to xe_gt_clock_cycles_to_ns
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8d07691c

drm/xe/guc: Switch to major-only GuC FW tracking for MTL · 430003b8

Daniele Ceraolo Spurio authored Sep 06, 2023

Newer HuC binaries for MTL (8.5.1+) require GuC 70.7 or newer, so we
need to move on from 70.6.4. Given that the MTL GuC uses major-only
version matching in i915, we can do the same here instead of just
bumping the version (and having to push the versioned binaries,
because they're not there already for i915).
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

430003b8

drm/xe: Use Xe assert macros instead of XE_WARN_ON macro · c73acc1e

Francois Dugast authored Sep 12, 2023

The XE_WARN_ON macro maps to WARN_ON which is not justified
in many cases where only a simple debug check is needed.
Replace the use of the XE_WARN_ON macro with the new xe_assert
macros which relies on drm_*. This takes a struct drm_device
argument, which is one of the main changes in this commit. The
other main change is that the condition is reversed, as with
XE_WARN_ON a message is displayed if the condition is true,
whereas with xe_assert it is if the condition is false.

v2:
- Rebase
- Keep WARN splats in xe_wopcm.c (Matt Roper)

v3:
- Rebase
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

c73acc1e

drm/xe: Introduce Xe assert macros · 1975b591

Michal Wajdeczko authored Sep 12, 2023

As we are moving away from the controversial XE_BUG_ON macro,
relying just on WARN_ON or drm_err does not cover the cases
where we want to annotate functions with additional detailed
debug checks to assert that all prerequisites are satisfied,
without paying footprint or performance penalty on non-debug
builds, where all misuses introduced during code integration
were already fixed.

Introduce family of Xe assert macros that try to follow classic
assert() utility and can be compiled out on non-debug builds.

Macros are based on drm_WARN, but unlikely to origin, disallow
use in expressions since we will compile that code out.

As we are operating on the xe pointers, we can print additional
information about the device, like tile or GT identifier, that
is not available from generic WARN report:

[ ] xe 0000:00:02.0: [drm] Assertion `true == false` failed!
    platform: 1 subplatform: 1
    graphics: Xe_LP 12.00 step B0
    media: Xe_M 12.00 step B0
    display: enabled step D0
    tile: 0 VRAM 0 B
    GT: 0 type 1

[ ] xe 0000:b3:00.0: [drm] Assertion `true == false` failed!
    platform: 7 subplatform: 3
    graphics: Xe_HPG 12.55 step A1
    media: Xe_HPM 12.55 step A1
    display: disabled step **
    tile: 0 VRAM 14.0 GiB
    GT: 0 type 1

[ ] WARNING: CPU: 0 PID: 2687 at drivers/gpu/drm/xe/xe_device.c:281 xe_device_probe+0x374/0x520 [xe]
[ ] RIP: 0010:xe_device_probe+0x374/0x520 [xe]
[ ] Call Trace:
[ ]  ? __warn+0x7b/0x160
[ ]  ? xe_device_probe+0x374/0x520 [xe]
[ ]  ? report_bug+0x1c3/0x1d0
[ ]  ? handle_bug+0x42/0x70
[ ]  ? exc_invalid_op+0x14/0x70
[ ]  ? asm_exc_invalid_op+0x16/0x20
[ ]  ? xe_device_probe+0x374/0x520 [xe]
[ ]  ? xe_device_probe+0x374/0x520 [xe]
[ ]  xe_pci_probe+0x6e3/0x950 [xe]
[ ]  ? lockdep_hardirqs_on+0xc7/0x140
[ ]  pci_device_probe+0x9e/0x160
[ ]  really_probe+0x19d/0x400

v2: use lowercase names
v3: apply xe coding style
v4: fix non-debug build and improve kernel-doc
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1975b591

drm/xe: Replace XE_WARN_ON with drm_warn when just printing a string · 5c0553cd

Francois Dugast authored Sep 12, 2023

Use the generic drm_warn instead of the driver-specific XE_WARN_ON
in cases where XE_WARN_ON is used to unconditionally print a debug
message.

v2: Rebase
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5c0553cd

drm/xe: Fix fence reservation accouting · 30278e29

Matthew Brost authored Sep 11, 2023

Both execs and the preempt rebind worker can issue rebinds. Rebinds
require a fence, per tile, inserted into dma-resv slots of the VM and
BO (if external). The fence reservation accouting did not take into
account the number of fences required for rebinds, fix this.

v2: Rebase
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reported-by: Christopher Snowhill <kode54@gmail.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/518Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

30278e29

drm/xe: Convert remaining instances of ttm_eu_reserve_buffers to drm_exec · 1f727182

Thomas Hellström authored Sep 08, 2023

The VM_BIND functionality and vma destruction was locking
potentially multiple dma_resv objects using the
ttm_eu_reserve_buffers() function. Rework those to use the drm_exec
helper, taking care that any calls to xe_bo_validate() ends up
inside an unsealed locking transaction.

v4:
- Remove an unbalanced xe_bo_put() (igt and Matthew Brost)
v5:
- Rebase conflict
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230908091716.36984-7-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1f727182

drm/xe: Convert pagefaulting code to use drm_exec · 2714d509

Thomas Hellström authored Sep 08, 2023

Replace the calls into ttm_eu_reserve_buffers with the drm_exec helpers.
Also reuse some code.

v4:
- Kerneldoc xe_vm_prepare_vma().
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230908091716.36984-6-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2714d509