Commits · 83f000784844cb9d4669ef1a3366479db3197b33 · Kirill Smelkov / linux

18 Oct, 2024 1 commit

Merge tag 'drm-xe-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes · 83f00078

Dave Airlie authored Oct 18, 2024

Driver Changes:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usag (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/jkldrex5733ldxrla75b4ayvhujjhw2kccmasl5rotoufoacj4@pkvlrrv4orc7

83f00078

17 Oct, 2024 7 commits

Merge tag 'drm-misc-fixes-2024-10-17' of... · 49ff3e79

Dave Airlie authored Oct 18, 2024

Merge tag 'drm-misc-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

ast:
- Clear EDID on unplugged connectors

host1x:
- Fix boot on Tegra186
- Set DMA parameters

mgag200:
- Revert VBLANK support

panel:
- himax-hx83192: Adjust power and gamma

qaic:
- Sgtable loop fixes

vmwgfx:
- Limit display layout allocatino size
- Handle allocation errors in connector checks
- Clean up KMS code for 2d-only setup
- Report surface-check errors correctly
- Remove NULL test around kvfree()
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017115516.GA196624@linux.fritz.box

49ff3e79

Merge tag 'drm-intel-fixes-2024-10-17' of... · 7626b4e9

Dave Airlie authored Oct 18, 2024

Merge tag 'drm-intel-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Two DP bandwidth related MST fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZxDLdML9Dwqkb1AW@jlahtine-mobl.ger.corp.intel.com

7626b4e9

Merge tag 'amd-drm-fixes-6.12-2024-10-16' of... · 01541a87

Dave Airlie authored Oct 18, 2024

Merge tag 'amd-drm-fixes-6.12-2024-10-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.12-2024-10-16:

amdgpu:
- SR-IOV fix
- CS chunk handling fix
- MES fixes
- SMU13 fixes

amdkfd:
- VRAM usage reporting fix

radeon:
- Fix possible_clones handling
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241016200514.3520286-1-alexander.deucher@amd.com

01541a87

Merge tag 'drm-msm-fixes-2024-10-16' of https://gitlab.freedesktop.org/drm/msm into drm-fixes · 4cd33d97

Dave Airlie authored Oct 17, 2024

Fixes for v6.12

Display:
- move CRTC resource assignment to atomic_check otherwise to make
  consecutive calls to atomic_check() consistent
- fix rounding / sign-extension issues with pclk calculation in
  case of DSC
- cleanups to drop incorrect null checks in dpu snapshots
- fix to use kvzalloc in dpu snapshot to avoid allocation issues
  in heavily loaded system cases
- Fix to not program merge_3d block if dual LM is not being used
- Fix to not flush merge_3d block if its not enabled otherwise
  this leads to false timeouts

GPU:
- a7xx: add a fence wait before SMMU table update
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsp3Zbd_H3FhHdRz9yCYA4wxX4SenpYRSk=Mx2d8GMSuQ@mail.gmail.com

4cd33d97

drm/ast: vga: Clear EDID if no display is connected · c09c4f2a

Thomas Zimmermann authored Oct 15, 2024

Do not keep the obsolete EDID around after unplugging the display
from the connector.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 2a2391f8 ("drm/ast: vga: Transparently handle BMC support")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015065113.11790-3-tzimmermann@suse.de

c09c4f2a

drm/ast: sil164: Clear EDID if no display is connected · 5b3c0209

Thomas Zimmermann authored Oct 15, 2024

Do not keep the obsolete EDID around after unplugging the display
from the connector.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: d20c2f84 ("drm/ast: sil164: Transparently handle BMC support")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015065113.11790-2-tzimmermann@suse.de

5b3c0209

Revert "drm/mgag200: Add vblank support" · e5a3c24b

Thomas Zimmermann authored Oct 15, 2024

This reverts commit 6c9e14ee.
This reverts commit d5070c9b.
This reverts commit 89c6ea20.

The VLINE interrupt doesn't work correctly on G200SE-A (at least). We
have also seen missing interrupts on G200ER. So revert vblank support.
Fixes frozen displays and warnings about missed vblanks.

[   33.818362] [CRTC:34:crtc-0] vblank wait timed out

From the vblank code, the driver only keeps the register constants and
the line that disables all interrupts in mgag200_device_init(). Both
is still useful without vblank handling.
Reported-by: Tony Luck <tony.luck@intel.com>
Closes: https://lore.kernel.org/dri-devel/Zvx6lSi7oq5xvTZb@agluck-desk3.sc.intel.com/rawTested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015063932.8620-1-tzimmermann@suse.de

e5a3c24b

16 Oct, 2024 15 commits

drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs · ec1aab78

Alex Deucher authored Oct 03, 2024

This uses more aggressive hueristics than the the bootup default
profile.  On windows the OS has a special fullscreen 3D mode
where this is used.  Since we don't have the equivalent on Linux
default to this profile for dGPUs.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1500
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 336568de918e08c825b3b1cbe2ec809f2fc26d94)

ec1aab78

drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater · ffafd126

Juha-Pekka Heikkila authored Oct 07, 2024

On display ver 20 onwards tile4 is not supported with horizontal flip

Bspec: 69853
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241007182841.2104740-1-juhapekka.heikkila@gmail.com
(cherry picked from commit 73e8e2f9a358caa005ed6e52dcb7fa2bca59d132)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

ffafd126

drm/xe/bmg: improve cache flushing behaviour · 6df106e9

Matthew Auld authored Oct 07, 2024

The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled
on for manual global invalidation to take effect and actually flush
device cache, however this also turns on flushing for things like
pipecontrol, which occurs between submissions for compute/render. This
sounds like massive overkill for our needs, where we already have the
manual flushing on the display side with the global invalidation. Some
observations on BMG:

1. Disabling l2 caching for host writes and stubbing out the driver
   global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has
   no impact on wb-transient-vs-display IGT, which makes sense since the
   pipecontrol is now flushing the device cache after the render copy.
   Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also
   expected since device cache is now dirty and display engine can't see
   the writes.

2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global
   invalidation also has no impact on wb-transient-vs-display. This
   suggests that the global invalidation still works as expected and is
   flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on.

With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since
we no longer flush the device cache between submissions as part of
pipecontrol.

Edit: We now also have clarification from HW side that BSpec was indeed
wrong here.

v2:
  - Rebase and update commit message.

BSpec: 71718
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Vitasta Wattal <vitasta.wattal@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew.auld@intel.com
(cherry picked from commit 67ec9f87bd6c57db1251bb2244d242f7ca5a0b6a)
[ Fix conflict due to changed xe_mmio_write32() signature ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

6df106e9

drm/xe/xe_sync: initialise ufence.signalled · 816b186c

Matthew Auld authored Oct 11, 2024

We can incorrectly think that the fence has signalled, if we get a
non-zero value here from the kmalloc, which is quite plausible. Just use
kzalloc to prevent stuff like this.

Fixes: 977e5b82 ("drm/xe: Expose user fence from xe_sync_entry")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v6.10+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241011133633.388008-2-matthew.auld@intel.com
(cherry picked from commit 26f69e88dcc95fffc62ed2aea30ad7b1fdf31fdb)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

816b186c

drm/xe/ufence: ufence can be signaled right after wait_woken · 4e8b5a16

Nirmoy Das authored Oct 11, 2024

do_comapre() can return success after a timedout wait_woken() which was
treated as -ETIME. The loop calling wait_woken() sets correct err so
there is no need to re-evaluate err.

v2: Remove entire check that reevaluate err at the end(Matt)

Fixes: e670f0b4 ("drm/xe/uapi: Return correct error code for xe_wait_user_fence_ioctl")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630
Cc: stable@vger.kernel.org # v6.8+
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241011151029.4160630-1-nirmoy.das@intel.comSigned-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit ec7e6a1d527755fc3c7a3303eaa5577aac5cf6be)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

4e8b5a16

drm/xe: Use bookkeep slots for external BO's in exec IOCTL · e7518276

Matthew Brost authored Sep 11, 2024

Fix external BO's dma-resv usage in exec IOCTL using bookkeep slots
rather than write slots. This leaves syncing to user space rather than
the KMD blindly enforcing write semantics on every external BO.

Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Kenneth Graunke <kenneth.w.graunke@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reported-by: Simona Vetter <simona.vetter@ffwll.ch>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2673Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240911152622.903058-1-matthew.brost@intel.com
(cherry picked from commit b8b1163248759ba18509f7443a2d19b15b4c1df8)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

e7518276

drm/xe/query: Increase timestamp width · 477d665e

Lucas De Marchi authored Oct 10, 2024

Starting with Xe2 the timestamp is a full 64 bit counter, contrary to
the 36 bit that was available before. Although 36 should be sufficient
for any reasonable delta calculation (for Xe2, of about 30min), it's
surprising to userspace to get something truncated. Also if the
timestamp being compared to is coming from the GPU and the application
is not careful enough to apply the width there, a delta calculation
would be wrong.

Extend it to full 64-bits starting with Xe2.

v2: Expand width=64 to media gt, as it's just a wrong tagging in the
spec - empirical tests show it goes beyond 36 bits and match the engines
for the main gt

Bspec: 60411
Cc: Szymon Morek <szymon.morek@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241011035618.1057602-1-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 9d559cdcb21f42188d4c3ff3b4fe42b240f4af5d)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

477d665e

drm/xe: Don't free job in TDR · 82926f52

Matthew Brost authored Oct 02, 2024

Freeing job in TDR is not safe as TDR can pass the run_job thread
resulting in UAF. It is only safe for free job to naturally be called by
the scheduler. Rather free job in TDR, add to pending list.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2811
Cc: Matthew Auld <matthew.auld@intel.com>
Fixes: e275d61c ("drm/xe/guc: Handle timing out of signaled jobs gracefully")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-3-matthew.brost@intel.com
(cherry picked from commit ea2f6a77d0c40d97f4a4dc93fee4afe15d94926d)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

82926f52

drm/xe: Take job list lock in xe_sched_add_pending_job · ed931fb4

Matthew Brost authored Oct 02, 2024

A fragile micro optimization in xe_sched_add_pending_job relied on both
the GPU scheduler being stopped and fence signaling stopped to safely
add a job to the pending list without the job list lock in
xe_sched_add_pending_job. Remove this optimization and just take the job
list lock.

Fixes: 7ddb9403 ("drm/xe: Sample ctx timestamp to determine if jobs have timed out")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-2-matthew.brost@intel.com
(cherry picked from commit 90521df5fc43980e4575bd8c5b1cb62afe1a9f5f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

ed931fb4

drm/xe: fix unbalanced rpm put() with declare_wedged() · 761f916a

Matthew Auld authored Oct 09, 2024

Technically the or_reset() means we call the action on failure, however
that would lead to unbalanced rpm put(). Move the get() earlier to fix
this. It should be extremely unlikely to ever trigger this in practice.

Fixes: 90936a0a ("drm/xe: Don't suspend device upon wedge")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-4-matthew.auld@intel.com
(cherry picked from commit a187c1b0a800565a4db6372268692aff99df7f53)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

761f916a

drm/xe: fix unbalanced rpm put() with fence_fini() · 03a86c24

Matthew Auld authored Oct 09, 2024

Currently we can call fence_fini() twice if something goes wrong when
sending the GuC CT for the tlb request, since we signal the fence and
return an error, leading to the caller also calling fini() on the error
path in the case of stack version of the flow, which leads to an extra
rpm put() which might later cause device to enter suspend when it
shouldn't. It looks like we can just drop the fini() call since the
fence signaller side will already call this for us.

There are known mysterious splats with device going to sleep even with
an rpm ref, and this could be one candidate.

v2 (Matt B):
  - Prefer warning if we detect double fini()

Fixes: f0027022 ("drm/xe: Hold a PM ref when GT TLB invalidations are inflight")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-3-matthew.auld@intel.com
(cherry picked from commit cfcbc0520d5055825f0647ab922b655688605183)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

03a86c24

drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg · 4ceead37

Aradhya Bhatia authored Oct 09, 2024

Add workaround (wa) 15016589081 which applies to Xe2_v3_LPG_MD.

Xe2_v3_LPG_MD is a Lunar Lake platform with GFX version: 20.04.
This wa is type: permanent, and hence is applicable on all steppings.
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009065542.283151-1-aradhya.bhatia@intel.com
(cherry picked from commit 8fb1da9f9bfb02f710a7f826d50781b0b030cf53)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

4ceead37

drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode · 2f54e713

Imre Deak authored Oct 09, 2024

If an MST branch device doesn't support DSC for a given mode, but the
MST link has enough BW for the mode, assume that the branch device does
support the mode using an uncompressed stream.

Fixes: 55eaef16 ("drm/i915/dp_mst: Handle the Synaptics HBlank expansion quirk")
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009110135.1216498-2-imre.deak@intel.com
(cherry picked from commit 4e75c3e208a06ad6fd9b3517fb77337460d7c2b0)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

2f54e713

drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation · 69b3d872

Imre Deak authored Oct 09, 2024

The MST branch device may not support the number of DSC slices a mode
requires, handle the error in this case.

Fixes: 4e0837a8 ("drm/i915/dp_mst: Account for FEC and DSC overhead during BW allocation")
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009110135.1216498-1-imre.deak@intel.com
(cherry picked from commit 802a69b6b8a0502a9e2309afec7e1b77f67874f2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

69b3d872

drm/msm/a6xx+: Insert a fence wait before SMMU table update · 77ad507d

Rob Clark authored Oct 15, 2024

The CP_SMMU_TABLE_UPDATE _should_ be waiting for idle, but on some
devices (x1-85, possibly others), it seems to pass that barrier while
there are still things in the event completion FIFO waiting to be
written back to memory.

Work around that by adding a fence wait before context switch.  The
CP_EVENT_WRITE that writes the fence is the last write from a submit,
so seeing this value hit memory is a reliable indication that it is
safe to proceed with the context switch.

v2: Only emit CP_WAIT_TIMESTAMP on a7xx, as it is not supported on a6xx.
    Conversely, I've not been able to reproduce this issue on a6xx, so
    hopefully it is limited to a7xx, or perhaps just certain a7xx
    devices.

Fixes: af66706a ("drm/msm/a6xx: Add skeleton A7xx support")
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/63Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

77ad507d

15 Oct, 2024 12 commits

drm/msm/dpu: don't always program merge_3d block · f87f3b80

Jessica Zhang authored Oct 09, 2024

Only program the merge_3d block for the video phys encoder when the 3d
blend mode is not NONE

Fixes: 3e79527a ("drm/msm/dpu: enable merge_3d support on sm8150/sm8250")
Suggested-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/619095/
Link: https://lore.kernel.org/r/20241009-merge3d-fix-v1-1-0d0b6f5c244e@quicinc.comSigned-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

f87f3b80

drm/msm/dpu: Don't always set merge_3d pending flush · 40dad89c

Jessica Zhang authored Oct 09, 2024

Don't set the merge_3d pending flush bits if the mode_3d is
BLEND_3D_NONE.

Always flushing merge_3d can cause timeout issues when there are
multiple commits with concurrent writeback enabled.

This is because the video phys enc waits for the hw_ctl flush register
to be completely cleared [1] in its wait_for_commit_done(), but the WB
encoder always sets the merge_3d pending flush during each commit
regardless of if the merge_3d is actually active.

This means that the hw_ctl flush register will never be 0 when there are
multiple CWB commits and the video phys enc will hit vblank timeout
errors after the first CWB commit.

[1] commit fe9df3f5 ("drm/msm/dpu: add real wait_for_commit_done()")

Fixes: 3e79527a ("drm/msm/dpu: enable merge_3d support on sm8150/sm8250")
Fixes: d7d0e73f ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback")
Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/619092/
Link: https://lore.kernel.org/r/20241009-mode3d-fix-v1-1-c0258354fadc@quicinc.comSigned-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

40dad89c

gpu: host1x: Set up device DMA parameters · eb0c0621

Thierry Reding authored Sep 16, 2024

In order to store device DMA parameters, the DMA framework depends on
the device's dma_parms field to point at a valid memory location. Add
backing storage for this in struct host1x_memory_context and point to
it.
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240916133320.368620-1-thierry.reding@gmail.com
(cherry picked from commit b4ad4ef374d66cc8df3188bb1ddb65bce5fc9e50)
Signed-off-by: Thierry Reding <treding@nvidia.com>

eb0c0621

drm/amdgpu/swsmu: Only force workload setup on init · cb07c833

Alex Deucher authored Oct 02, 2024

Needed to set the workload type at init time so that
we can apply the navi3x margin optimization.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Fixes: c50fe289 ("drm/amdgpu/swsmu: always force a state reprogram on init")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 580ad7cbd4b7be8d2cb5ab5c1fca6bb76045eb0e)
Cc: stable@vger.kernel.org

cb07c833

drm/radeon: Fix encoder->possible_clones · 28127dba

Ville Syrjälä authored Oct 14, 2024

Include the encoder itself in its possible_clones bitmask.
In the past nothing validated that drivers were populating
possible_clones correctly, but that changed in commit
74d2aacb ("drm: Validate encoder->possible_clones").
Looks like radeon never got the memo and is still not
following the rules 100% correctly.

This results in some warnings during driver initialization:
Bogus possible_clones: [ENCODER:46:TV-46] possible_clones=0x4 (full encoder mask=0x7)
WARNING: CPU: 0 PID: 170 at drivers/gpu/drm/drm_mode_config.c:615 drm_mode_config_validate+0x113/0x39c
...

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Fixes: 74d2aacb ("drm: Validate encoder->possible_clones")
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Closes: https://lore.kernel.org/dri-devel/20241009000321.418e4294@yea/Tested-by: Erhard Furtner <erhard_f@mailbox.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3b6e7d40649c0d75572039aff9d0911864c689db)
Cc: stable@vger.kernel.org

28127dba

drm/amdgpu/smu13: always apply the powersave optimization · 7a1613e4

Alex Deucher authored Oct 03, 2024

It can avoid margin issues in some very demanding applications.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Fixes: c50fe289 ("drm/amdgpu/swsmu: always force a state reprogram on init")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 62f38b4ccaa6aa063ca781d80b10aacd39dc5c76)
Cc: stable@vger.kernel.org

7a1613e4

drm/amdkfd: Accounting pdd vram_usage for svm · 68d26c10

Philip Yang authored Oct 04, 2024

Process device data pdd->vram_usage is read by rocm-smi via sysfs, this
is currently missing the svm_bo usage accounting, so "rocm-smi
--showpids" per process VRAM usage report is incorrect.

Add pdd->vram_usage accounting when svm_bo allocation and release,
change to atomic64_t type because it is updated outside process mutex
now.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 98c0b0efcc11f2a5ddf3ce33af1e48eedf808b04)

68d26c10

drm/amd/amdgpu: Fix double unlock in amdgpu_mes_add_ring · e7457532

Srinivasan Shanmugam authored Oct 08, 2024

This patch addresses a double unlock issue in the amdgpu_mes_add_ring
function. The mutex was being unlocked twice under certain error
conditions, which could lead to undefined behavior.

The fix ensures that the mutex is unlocked only once before jumping to
the clean_up_memory label. The unlock operation is moved to just before
the goto statement within the conditional block that checks the return
value of amdgpu_ring_init. This prevents the second unlock attempt after
the clean_up_memory label, which is no longer necessary as the mutex is
already unlocked by this point in the code flow.

This change resolves the potential double unlock and maintains the
correct mutex handling throughout the function.

Fixes below:
Commit d0c423b6 ("drm/amdgpu/mes: use ring for kernel queue
submission"), leads to the following Smatch static checker warning:

	drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1240 amdgpu_mes_add_ring()
	warn: double unlock '&adev->mes.mutex_hidden' (orig line 1213)

drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
    1143 int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
    1144                         int queue_type, int idx,
    1145                         struct amdgpu_mes_ctx_data *ctx_data,
    1146                         struct amdgpu_ring **out)
    1147 {
    1148         struct amdgpu_ring *ring;
    1149         struct amdgpu_mes_gang *gang;
    1150         struct amdgpu_mes_queue_properties qprops = {0};
    1151         int r, queue_id, pasid;
    1152
    1153         /*
    1154          * Avoid taking any other locks under MES lock to avoid circular
    1155          * lock dependencies.
    1156          */
    1157         amdgpu_mes_lock(&adev->mes);
    1158         gang = idr_find(&adev->mes.gang_id_idr, gang_id);
    1159         if (!gang) {
    1160                 DRM_ERROR("gang id %d doesn't exist\n", gang_id);
    1161                 amdgpu_mes_unlock(&adev->mes);
    1162                 return -EINVAL;
    1163         }
    1164         pasid = gang->process->pasid;
    1165
    1166         ring = kzalloc(sizeof(struct amdgpu_ring), GFP_KERNEL);
    1167         if (!ring) {
    1168                 amdgpu_mes_unlock(&adev->mes);
    1169                 return -ENOMEM;
    1170         }
    1171
    1172         ring->ring_obj = NULL;
    1173         ring->use_doorbell = true;
    1174         ring->is_mes_queue = true;
    1175         ring->mes_ctx = ctx_data;
    1176         ring->idx = idx;
    1177         ring->no_scheduler = true;
    1178
    1179         if (queue_type == AMDGPU_RING_TYPE_COMPUTE) {
    1180                 int offset = offsetof(struct amdgpu_mes_ctx_meta_data,
    1181                                       compute[ring->idx].mec_hpd);
    1182                 ring->eop_gpu_addr =
    1183                         amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
    1184         }
    1185
    1186         switch (queue_type) {
    1187         case AMDGPU_RING_TYPE_GFX:
    1188                 ring->funcs = adev->gfx.gfx_ring[0].funcs;
    1189                 ring->me = adev->gfx.gfx_ring[0].me;
    1190                 ring->pipe = adev->gfx.gfx_ring[0].pipe;
    1191                 break;
    1192         case AMDGPU_RING_TYPE_COMPUTE:
    1193                 ring->funcs = adev->gfx.compute_ring[0].funcs;
    1194                 ring->me = adev->gfx.compute_ring[0].me;
    1195                 ring->pipe = adev->gfx.compute_ring[0].pipe;
    1196                 break;
    1197         case AMDGPU_RING_TYPE_SDMA:
    1198                 ring->funcs = adev->sdma.instance[0].ring.funcs;
    1199                 break;
    1200         default:
    1201                 BUG();
    1202         }
    1203
    1204         r = amdgpu_ring_init(adev, ring, 1024, NULL, 0,
    1205                              AMDGPU_RING_PRIO_DEFAULT, NULL);
    1206         if (r)
    1207                 goto clean_up_memory;
    1208
    1209         amdgpu_mes_ring_to_queue_props(adev, ring, &qprops);
    1210
    1211         dma_fence_wait(gang->process->vm->last_update, false);
    1212         dma_fence_wait(ctx_data->meta_data_va->last_pt_update, false);
    1213         amdgpu_mes_unlock(&adev->mes);
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    1214
    1215         r = amdgpu_mes_add_hw_queue(adev, gang_id, &qprops, &queue_id);
    1216         if (r)
    1217                 goto clean_up_ring;
                         ^^^^^^^^^^^^^^^^^^

    1218
    1219         ring->hw_queue_id = queue_id;
    1220         ring->doorbell_index = qprops.doorbell_off;
    1221
    1222         if (queue_type == AMDGPU_RING_TYPE_GFX)
    1223                 sprintf(ring->name, "gfx_%d.%d.%d", pasid, gang_id, queue_id);
    1224         else if (queue_type == AMDGPU_RING_TYPE_COMPUTE)
    1225                 sprintf(ring->name, "compute_%d.%d.%d", pasid, gang_id,
    1226                         queue_id);
    1227         else if (queue_type == AMDGPU_RING_TYPE_SDMA)
    1228                 sprintf(ring->name, "sdma_%d.%d.%d", pasid, gang_id,
    1229                         queue_id);
    1230         else
    1231                 BUG();
    1232
    1233         *out = ring;
    1234         return 0;
    1235
    1236 clean_up_ring:
    1237         amdgpu_ring_fini(ring);
    1238 clean_up_memory:
    1239         kfree(ring);
--> 1240         amdgpu_mes_unlock(&adev->mes);
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    1241         return r;
    1242 }

Fixes: d0c423b6 ("drm/amdgpu/mes: use ring for kernel queue submission")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Reported by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bfaf1883605fd0c0dbabacd67ed49708470d5ea4)

e7457532

drm/amdgpu/mes: fix issue of writing to the same log buffer from 2 MES pipes · 7760d7f9

Michael Chen authored Oct 08, 2024

With Unified MES enabled in gfx12, need separate event log buffer for the
2 MES pipes to avoid data overwrite.
Signed-off-by: Michael Chen <michael.chen@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 144df260f3daab42c4611021f929b3342de516e5)
Cc: stable@vger.kernel.org # 6.11.x

7760d7f9

drm/amdgpu: prevent BO_HANDLES error from being overwritten · c0ec082f

Mohammed Anees authored Oct 09, 2024

Before this patch, if multiple BO_HANDLES chunks were submitted,
the error -EINVAL would be correctly set but could be overwritten
by the return value from amdgpu_cs_p1_bo_handles(). This patch
ensures that if there are multiple BO_HANDLES, we stop.

Fixes: fec5f8e8 ("drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit")
Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 40f2cd98828f454bdc5006ad3d94330a5ea164b7)
Cc: stable@vger.kernel.org

c0ec082f

drm/amdgpu: enable enforce_isolation sysfs node on VFs · d2c72d96

Alex Deucher authored Oct 08, 2024

It should be enabled on both bare metal and VFs.

Fixes: e189be9b ("drm/amdgpu: Add enforce_isolation sysfs attribute")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Cc: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
(cherry picked from commit dc8847b054fd6679866ed4ee861e069e54c10799)

d2c72d96

gpu: host1x: Fix boot regression for Tegra · c8347f91

Jon Hunter authored Sep 25, 2024

Commit 4c27ac45 ("gpu: host1x: Request syncpoint IRQs only during
probe") caused a boot regression for the Tegra186 device. Following this
update the function host1x_intr_init() now calls
host1x_hw_intr_disable_all_syncpt_intrs() during probe. However,
host1x_intr_init() is called before runtime power-management is enabled
for Host1x and the function host1x_hw_intr_disable_all_syncpt_intrs() is
accessing hardware registers. So if the Host1x hardware is not enabled
prior to probing then the device will now hang on attempting to access
the registers. So far this is only observed on Tegra186, but potentially
could be seen on other devices.

Fix this by moving the call to the function host1x_intr_init() in probe
to after enabling the runtime power-management in the probe and update
the failure path in probe as necessary.

Fixes: 4c27ac45 ("gpu: host1x: Request syncpoint IRQs only during probe")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240925160504.60221-1-jonathanh@nvidia.com
(cherry picked from commit dc56f8428e5f34418f3243a60cec13166efe4fdb)
Signed-off-by: Thierry Reding <treding@nvidia.com>

c8347f91

14 Oct, 2024 5 commits

drm/msm: Allocate memory for disp snapshot with kvzalloc() · e4a45582

Douglas Anderson authored Oct 14, 2024

With the "drm/msm: add a display mmu fault handler" series [1] we saw
issues in the field where memory allocation was failing when
allocating space for registers in msm_disp_state_dump_regs().
Specifically we were seeing an order 5 allocation fail. It's not
surprising that order 5 allocations will sometimes fail after the
system has been up and running for a while.

There's no need here for contiguous memory. Change the allocation to
kvzalloc() which should make it much less likely to fail.

[1] https://lore.kernel.org/r/20240628214848.4075651-1-quic_abhinavk@quicinc.com/

Fixes: 98659487 ("drm/msm: add support to take dpu snapshot")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/619658/
Link: https://lore.kernel.org/r/20241014093605.2.I72441365ffe91f3dceb17db0a8ec976af8139590@changeidSigned-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

e4a45582

drm/msm: Avoid NULL dereference in msm_disp_state_print_regs() · 293f5326

Douglas Anderson authored Oct 14, 2024

If the allocation in msm_disp_state_dump_regs() failed then
`block->state` can be NULL. The msm_disp_state_print_regs() function
_does_ have code to try to handle it with:

  if (*reg)
    dump_addr = *reg;

...but since "dump_addr" is initialized to NULL the above is actually
a noop. The code then goes on to dereference `dump_addr`.

Make the function print "Registers not stored" when it sees a NULL to
solve this. Since we're touching the code, fix
msm_disp_state_print_regs() not to pointlessly take a double-pointer
and properly mark the pointer as `const`.

Fixes: 98659487 ("drm/msm: add support to take dpu snapshot")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/619657/
Link: https://lore.kernel.org/r/20241014093605.1.Ia1217cecec9ef09eb3c6d125360cc6c8574b0e73@changeidSigned-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

293f5326

drm/msm/dsi: fix 32-bit signed integer extension in pclk_rate calculation · 358b7624

Jonathan Marek authored Oct 07, 2024

When (mode->clock * 1000) is larger than (1<<31), int to unsigned long
conversion will sign extend the int to 64 bits and the pclk_rate value
will be incorrect.

Fix this by making the result of the multiplication unsigned.

Note that above (1<<32) would still be broken and require more changes, but
its unlikely anyone will need that anytime soon.

Fixes: c4d8cfe5 ("drm/msm/dsi: add implementation for helper functions")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/618434/
Link: https://lore.kernel.org/r/20241007050157.26855-2-jonathan@marek.caSigned-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

358b7624

drm/msm/dsi: improve/fix dsc pclk calculation · 24436a54

Jonathan Marek authored Oct 07, 2024

drm_mode_vrefresh() can introduce a large rounding error, avoid it.

Fixes: 7c9e4a55 ("drm/msm/dsi: Reduce pclk rate for compression")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/618432/
Link: https://lore.kernel.org/r/20241007050157.26855-1-jonathan@marek.caSigned-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

24436a54

drm/msm/hdmi: drop pll_cmp_to_fdata from hdmi_phy_8998 · f260ed88

Dmitry Baryshkov authored Sep 22, 2024

The pll_cmp_to_fdata() was never used by the working code. Drop it to
prevent warnings with W=1 and clang.
Reported-by: Jani Nikula <jani.nikula@intel.com>
Closes: https://lore.kernel.org/dri-devel/3553b1db35665e6ff08592e35eb438a574d1ad65.1725962479.git.jani.nikula@intel.comSigned-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Fixes: caedbf17 ("drm/msm: add msm8998 hdmi phy/pll support")
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/615348/
Link: https://lore.kernel.org/r/20240922-msm-drop-unused-func-v1-1-c5dc083415b8@linaro.orgSigned-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

f260ed88