Commits · b5b561621d5d6bc0ddd6cc442893f6768d151c27 · Kirill Smelkov / linux

05 Jun, 2024 40 commits

drm/amdkfd: remove dead code in kfd_create_vcrat_image_gpu · b5b56162

Jesse Zhang authored May 30, 2024

kfd_create_vcrat_image_gpu itself checks the avail_size at the start.
So the value of avail_size is at least VCRAT_SIZE_FOR_GPU(16384),
minus struct crat_header(40UL) and struct crat_subtype_compute(40UL) it cannot be less than 0.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b5b56162

drm/amdkfd: fix the kdf debugger issue · 8178cfb0

Jesse Zhang authored May 31, 2024

The expression caps | HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED
and  caps | HSA_CAP_TRAP_DEBUG_PRECISE_ALU_OPERATIONS_SUPPORTED
are always 1/true regardless of the values of its operand.

Fixes: 9243240b ("drm/amdkfd: enable single alu ops for gfx12")
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8178cfb0

drm/amdkfd: Comment out the unused variable use_static in pm_map_queues_v9 · 46eb63ec

Jesse Zhang authored May 31, 2024

To fix the warning about unused value,
remove the use_static and use the parameter is_static directly.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

46eb63ec

drm/amd/display: Move fpo_in_use to stream_status · e69d4335

Alvin Lee authored Apr 26, 2024

[Description]
Refactor code and move fpo_in_use into stream_status to avoid
unexpected changes to previous dc_state (i.e., current_state).
Since stream pointers are shared between current and new dc_states,
updating parameters of one stream will update the other as well
which causes unexpected behaviors (i.e., checking that fpo_in_use
isn't set in previous state and set in the new state is invalid).
To avoid incorrect updates to current_state, move the fpo_in_use flag
into dc_stream_status since stream_status is owned by dc and are not
shared between different dc_states.
Reviewed-by: Samson Tam <samson.tam@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e69d4335

drm/amd/display: Only program P-State force if pipe config changed · 4621e10e

Alvin Lee authored Apr 26, 2024

[Description]
Today for MED update type we do not call update clocks. However, for FPO
the assumption is that update clocks should be called to disable P-State
switch before any HW programming since FPO in FW and driver are not
synchronized. This causes an issue where on a MED update, an FPO P-State
switch could be taking place, then driver forces P-State disallow in the below
code and prevents FPO from completing the sequence. In this case we add a check
to avoid re-programming (and thus re-setting) the P-State force register by
only reprogramming if the pipe was not previously Subvp or FPO. The assumption
is that the P-State force register should be programmed correctly the first
time SubVP / FPO was enabled, so there's no need to update / reset it if the
pipe config has never exited SubVP / FPO.
Reviewed-by: Samson Tam <samson.tam@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4621e10e

drm/amd/display: Add retires when read DPCD · 2770b915

Joan Lee authored May 15, 2024

[why & how]
Sometimes read DPCD return fail while result not retrieved yet. Add
retries mechanism in Replay handle hpd irq to get real result.
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Joan Lee <joan.lee@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2770b915

drm/amd/display: Fix DML2 logic to set clk state to min · cc4d6ea0

Nicholas Susanto authored May 14, 2024

[Why]

When an eDP with high clock states is going into s0i3, stream_count is
0. This causes DML to not update the clks to the lowest state and
blocking us to enter s0i3 since eDP is out of vmin.

[How]

When stream_count is 0, set all the clocks to the lowest state.
Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Nicholas Susanto <nicholas.susanto@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cc4d6ea0

drm/amdkfd: Handle deallocated VPGRs in gfx11+ trap handler · c5afb313

Jay Cornwall authored May 28, 2024

A wavefront may deallocate its VGPRs at the end of a program while
waiting for memory transactions to complete. If it subsequently
receives a context save exception it will be unable to save,
since this requires VGPRs. In this case the trap handler should
terminate the wavefront.

Fixes intermittent VM faults under context switching load.

V2: Use S_ENDPGM instead of S_ENDPGM_SAVED for performance counters
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c5afb313

drm/amd/display: Support new VA page table block size · 4002a6c5

Chris Park authored May 14, 2024

[Why]
Page table definition increased up to 2MB.

[How]
Define new use case of page table for VA.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4002a6c5

drm/amd/display: workaround for oled eDP not lighting up on DCN401 · e902dd7f

Joshua Aberback authored May 13, 2024

[Why]
Currently there's an issue on DCN401 that prevents oled eDP panels from
being lit up that is still under investigation. To unblock dev work
while investigating, we can work around the issue by skipping toggling
the enablement of the backlight.

[How]
 - new debug bit that will skip touching backlight enable DPCD for oled
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e902dd7f

drm/amd/display: Add params of set_abm_event for VB Scaling · 1349db15

Chun-LiangChang authored May 10, 2024

[Why]
Add parameters for set_abm_event to enable varibright scaling.
VariBright Scaling is a feature to refer to system states like

1. Power mode
2. Battery Life percent
3. FullScreen video
4. Backlight slider

to adjust variBright strength to get low power or user experience.

[How]
Add parameters of set_abm_event for VB Scaling
Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Chun-LiangChang <chuchang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1349db15

drm/amd/display: Fix swapped dimension calculations · e86e8798

Joshua Aberback authored May 13, 2024

[Why]
The values calculated in optc1_get_otg_active_size are assigned to the
wrong output parameters, vertical blank is being used for horizontal size
and vice versa. This results in DPG test pattern looking wrong during
hardware init, as the DPG dimensions get assigned from this output, and
potentially other issues.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e86e8798

drm/amd/display: Wait for hardmins to complete on dcn401 · 57c49821

Dillon Varone authored May 13, 2024

[WHY&HOW]
When updating clocks via SMU, DAL needs to wait for requests to be fulfilled
before proceeding.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

57c49821

drm/amd/display: turn on symclk for dio virtual stream in dpms sequence · 7069484d

Wenjing Liu authored May 10, 2024

[why]
In order to support glitchless display clock ramping for virtual stream,
we must
turn on symclk for stream encoder. The code will power on phy and enable
symclk
for dio encoder during virtual stream dpms sequence.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7069484d

drm/amd/display: Keep VBios pixel rate div setting until next mode set · 975507d7

yi-lchen authored Apr 29, 2024

[why]
Vbios & Driver have difference pixel rate div policy.
When enabling fast boot & performing blank & unblank w/o timing setting,
pixel clock & pixel rate dividor are not match.
It would cause too high pixel reate and eDP would be black screen.

[How]
We would keep pixel rate div setting by Vbios until next timing setting.
Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: yi-lchen <yi-lchen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

975507d7

Revert "drm/amdgpu/gfx11: enable gfx pipe1 hardware support" · a9ebd104

Alex Deucher authored May 31, 2024

This reverts commit 6670142d.

Pierre-Eric reported problems with this on his navi33.  Revert
for now until we understand what is going wrong.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Pierre-eric.Pelloux-prayer@amd.com

a9ebd104

drm/amdgpu: update gc_12_0_0 headers · b592d01d

Alex Deucher authored May 31, 2024

Add some additional registers.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b592d01d

drm/amdgpu: define new gfx12 uapi flags · 73e1d104

Marek Olšák authored Apr 29, 2023

define new gfx12 uapi flags
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

73e1d104

drm/amdkfd: remove dead code in the function svm_range_get_pte_flags · dfe190af

Jesse Zhang authored May 30, 2024

The varible uncached set false, the condition uncached cannot be true.
So remove the dead code, mapping flags will set the flag AMDGPU_VM_MTYPE_UC in else.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dfe190af

drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_NV10 · 39282901

Shane Xiao authored May 29, 2024

This patch changes the implementation of AMDGPU_PTE_MTYPE_NV10,
clear the bits before setting the new one.
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: longlyao <Longlong.Yao@amd.com>
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

39282901

drm/amdgpu: disable lane0 L1TLB and enable lane1 L1TLB · 301dfbfc

Yifan Zhang authored May 23, 2024

This patch to disable lane0 L1TLB and enable lane1 L1TLB.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

301dfbfc

drm/amdgpu: init SAW registers for mmhub v3.3 · b7e2170b

Yifan Zhang authored May 06, 2024

This patch to configure mmhub3.3 SAW registers
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b7e2170b

drm/amdgpu/pptable: Fix UBSAN array-index-out-of-bounds · 98f9e5ea

Tasos Sahanidis authored May 31, 2024

Flexible arrays used [1] instead of []. Replace the former with the latter
to resolve multiple UBSAN warnings observed on boot with a BONAIRE card.

In addition, use the __counted_by attribute where possible to hint the
length of the arrays to the compiler and any sanitizers.
Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

98f9e5ea

drm/amd/display: Fix a handful of spelling mistakes · 34a6aa4e

Colin Ian King authored May 31, 2024

There are a few spelling mistakes in dml2_printf messages. Fix them.
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

34a6aa4e

drm/amdgpu: fix comments and error message for ipdump · a1a049bd

Sunil Khatri authored May 31, 2024

Fix comments and error messages to rightly represent
the information.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a1a049bd

drm/amdgpu: rename ip_dump_cp_queues to compute queues · 33837d62

Sunil Khatri authored May 31, 2024

Rename the variable ip_dump_cp_queues to ip_dump_compute_queue
as it represent compute queues.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

33837d62

drm/amdgpu: add cp queue registers for gfx9 ipdump · 34b8d94b

Sunil Khatri authored May 29, 2024

Add gfx9 support of CP queue registers for all queues
to be used by devcoredump.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

34b8d94b

drm/amdgpu: add print support for gfx9 ipdump · 173ef918

Sunil Khatri authored May 29, 2024

Add support of gfx9 ipdump print so devcoredump
could trigger it to dump the captured registers
in devcoredump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

173ef918

drm/amdgpu: add gfx9 register support in ipdump · 514dc965

Sunil Khatri authored May 29, 2024

Add general registers of gfx9 in ipdump for
devcoredump support.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

514dc965

drm/amdgpu: Fix type mismatch in amdgpu_gfx_kiq_init_ring · 745f7170

Srinivasan Shanmugam authored May 25, 2024

This commit fixes a type mismatch in the amdgpu_gfx_kiq_init_ring
function triggered by the snprintf function expecting unsigned char
arguments due to the '%hhu' format specifier, but receiving int and u32
arguments.

The issue occurred because the arguments xcc_id, ring->me, ring->pipe,
and ring->queue were of type int and u32, not unsigned char. This led to
a type mismatch when these arguments were passed to snprintf.

To resolve this, the snprintf line was modified to cast these arguments
to unsigned char. This ensures that the arguments are of the correct
type for the '%hhu' format specifier and resolves the warning.

Fixes the below:
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:4: warning: format
>> specifies type 'unsigned char' but the argument has type 'int'
>> [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                    ^~~~~~
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:12: warning: format
>> specifies type 'unsigned char' but the argument has type 'u32' (aka
>> 'unsigned int') [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                            ^~~~~~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:22: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                                      ^~~~~~~~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:34: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                                                  ^~~~~~~~~~~
   4 warnings generated.

Fixes: 0ea55445 ("drm/amdgpu: Fix snprintf usage in amdgpu_gfx_kiq_init_ring")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405250446.XeaWe66u-lkp@intel.com/
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

745f7170

drm/amdgu: fix Unintentional integer overflow for mall size · c09d2eff

Jesse Zhang authored May 29, 2024

Potentially overflowing expression mall_size_per_umc * adev->gmc.num_umc with type unsigned int (32 bits, unsigned)
is evaluated using 32-bit arithmetic,and then used in a context that expects an expression of type u64 (64 bits, unsigned).
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c09d2eff

drm/amdgpu: Update programming for boot error reporting · a474161e

Hawking Zhang authored May 30, 2024

AMDGPU_RAS_GPU_ERR_BOOT_STATUS field is no longer valid.
The polling sequence is also simplifed according to
the latest firmware change.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a474161e

drm/radeon: Remove __counted_by from StateArray.states[] · 2bac0844

Bill Wendling authored May 29, 2024

Work for __counted_by on generic pointers in structures (not just
flexible array members) has started landing in Clang 19 (current tip of
tree). During the development of this feature, a restriction was added
to __counted_by to prevent the flexible array member's element type from
including a flexible array member itself such as:

  struct foo {
    int count;
    char buf[];
  };

  struct bar {
    int count;
    struct foo data[] __counted_by(count);
  };

because the size of data cannot be calculated with the standard array
size formula:

  sizeof(struct foo) * count

This restriction was downgraded to a warning but due to CONFIG_WERROR,
it can still break the build. The application of __counted_by on the
states member of 'struct _StateArray' triggers this restriction,
resulting in:

  drivers/gpu/drm/radeon/pptable.h:442:5: error: 'counted_by' should not be applied to an array with element of unknown size because 'ATOM_PPLIB_STATE_V2' (aka 'struct _ATOM_PPLIB_STATE_V2') is a struct type with a flexible array member. This will be an error in a future compiler version [-Werror,-Wbounds-safety-counted-by-elt-type-unknown-size]
    442 |     ATOM_PPLIB_STATE_V2 states[] __counted_by(ucNumEntries);
        |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Remove this use of __counted_by to fix the warning/error. However,
rather than remove it altogether, leave it commented, as it may be
possible to support this in future compiler releases.

Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2028
Fixes: efade6fe ("drm/radeon: silence UBSAN warning (v3)")
Signed-off-by: Bill Wendling <morbo@google.com>
Co-developed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2bac0844

drm/amdgpu/soc24: use common nbio callback to set remap offset · 7d3b9668

Alex Deucher authored May 08, 2024

This fixes HDP flushes on systems with non-4K pages.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7d3b9668

drm/amd/display: Convert some legacy DRM debug macros into appropriate categories · 730ac573

Tvrtko Ursulin authored May 28, 2024

Currently when one enables driver debugging dmesg gets spammed, at I
suspect vblank rate, with messages like:

 [drm:amdgpu_dm_atomic_check [amdgpu]] MPO enablement requested on crtc:[00000000f073c3bb]

Fix if by converting some logging from deprecated and incorrect
DRM_DEBUG_DRIVER to drm_dbg_atomic. Plus some localized drive-by changes
to drm_dbg_kms.

By no means an exhaustive conversion but at least it allows turning on
driver debug selectively.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

730ac573

drm/amdgpu: Estimate RAS reservation when report capacity v2 · 473af28d

Hawking Zhang authored May 28, 2024

Add estimate of how much vram we need to reserve for RAS
when caculating the total available vram.

v2: apply the change to MP0 v13_0_2 and v13_0_14
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

473af28d

drm/amdgpu: use u32 for buf size in __amdgpu_eeprom_xfer · 76bec2a0

Tao Zhou authored May 20, 2024

And also make sure the value of msg[1].len should be in the range of u16.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

76bec2a0

drm/amdkfd: gfx12 context save/restore trap handler fixes · fda812eb

Jay Cornwall authored May 23, 2024

Fix LDS size interpretation: 512 bytes (>= gfx12) vs 256 (< gfx12).

Ensure STATE_PRIV.BARRIER_COMPLETE cannot change after reading or
before writing. Other waves in the threadgroup may cause this field
to assert if they complete the barrier.

Do not overwrite EXCP_FLAG_PRIV.{SAVE_CONTEXT,HOST_TRAP} when
restoring this register. Both of these fields can assert while the
wavefront is running the trap handler.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fda812eb

drm/amdgpu: drop some kernel messages in VCN code · 813e7d4c

David (Ming Qiang) Wu authored May 23, 2024

We have messages when the VCN fails to initialize and
there is no need to report on success.
Also PSP loading FWs is the default for production.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sonny Jiang <sonjiang@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

813e7d4c

drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_GFX12 · eba791dc

Shane Xiao authored May 29, 2024

This patch changes the implementation of AMDGPU_PTE_MTYPE_GFX12,
clear the bits before setting the new one.
This fixed the potential issue that GFX12 setting memory to NC.

v2: Clear mtype field before setting the new one (Alex)
v3: Fix typo (Felix)
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: longlyao <Longlong.Yao@amd.com>
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

eba791dc