Commits · 5cb818b861be114148e8dbeb4259698148019dd1 · nexedi / linux

14 Jul, 2017 40 commits

drm/amd/amdgpu: fix si_enable_smc_cac() failed issue · 5cb818b8

Jim Qu authored Jul 12, 2017

Signed-off-by: Jim Qu <Jim.Qu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5cb818b8

drm/amdgpu: Off by one sanity checks · 1d11ee89

Dan Carpenter authored Jul 11, 2017

This is just future proofing code, not something that can be triggered
in real life.  We're testing to make sure we don't shift wrap when we
do "1ull << i" so "i" has to be in the 0-63 range.  If it's 64 then we
have gone too far.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1d11ee89

drm/amdgpu: implement si_read_bios_from_rom · 6d949d24

Alex Deucher authored Jul 12, 2017

This allows us to read the vbios image directly from ROM.
This is already implemented for other asics, but was not
yet available for SI.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6d949d24

drm/amdgpu/soc15: drop dead function · 1a6ec7ed

Alex Deucher authored Jul 11, 2017

Maybe a leftover from bringup?
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1a6ec7ed

drm/amdgpu: call atomfirmware get_clock_info for atomfirmware systems · 88b64e95

Alex Deucher authored Jul 10, 2017

Rather than the legacy atombios version.
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

88b64e95

drm/amdgpu: add get_clock_info for atomfirmware · 79077ee1

Alex Deucher authored Jul 10, 2017

The information has moved to different tables, notably
smu_info for core refclk and umc_info for mem refclk.
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

79077ee1

drm/amdgpu: Send no-retry XNACK for all fault types · 9f57f7b4

Jay Cornwall authored Apr 26, 2017

A subset of VM fault types currently send retry XNACK to the client.
This causes a storm of interrupts from the VM to the host.

Until the storm is throttled by other means send no-retry XNACK for
all fault types instead. No change in behavior to the client which
will stall indefinitely with the current configuration in any case.
Improves system stability under GC or MMHUB faults.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: John Bridgman <John.Bridgman@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9f57f7b4

drm/amdgpu: Correctly establish the suspend/resume hook for amdkfd · ba997709

Yong Zhao authored Nov 09, 2015

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ba997709

drm/amdgpu: Make SDMA phase quantum configurable · a667386c

Felix Kuehling authored Jul 15, 2016

Set a configurable SDMA phase quantum when enabling SDMA context
switching. The default value significantly reduces SDMA latency
in page table updates when user-mode SDMA queues have concurrent
activity, compared to the initial HW setting.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Andres Rodriguez <andres.rodriguez@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a667386c

drm/amdgpu: Enable SDMA context switching for CIK · 763dbbfa

Felix Kuehling authored Jun 15, 2016

Enable SDMA context switching on CIK (copied from sdma_v3_0.c).
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

763dbbfa

drm/amdgpu: Enable SDMA_CNTL.ATC_L1_ENABLE for SDMA on CZ · 4048f0f0

shaoyunl authored Dec 04, 2015

For GFX context, the ATC bit in SDMA*_GFX_VIRTUAL_ADDRESS can be cleared
to perform in VM mode. For RLC context, to support ATC mode , ATC bit in
SDMA*_RLC*_VIRTUAL_ADDRESS should be set. SDMA_CNTL.ATC_L1_ENABLE bit is
global setting that enables the L1-L2 translation for ATC address.
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4048f0f0

drm/amdgpu: Try evicting from CPU visible to invisible VRAM first · cb2dd1a6

Michel Dänzer authored Jul 04, 2017

This gives BOs which haven't been accessed by the CPU since they were
moved to visible VRAM another chance to stay in VRAM when another BO
needs to go to visible VRAM.

This should allow BOs to stay in VRAM longer in some cases.

v2:
* Only do this for BOs which don't have the
  AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag set.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cb2dd1a6

drm/amdgpu: Don't force BOs into visible VRAM for page faults · 41d9a6a7

John Brooks authored Jun 27, 2017

There is no need for page faults to force BOs into visible VRAM if it's
full, and the time it takes to do so is great enough to cause noticeable
stuttering. Add GTT as a possible placement so that if visible VRAM is
full, page faults move BOs to GTT instead of evicting other BOs from VRAM.
Suggested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

41d9a6a7

drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM · 96cf8271

John Brooks authored Jun 30, 2017

When a BO is moved to VRAM, clear AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED.
This allows it to potentially later move to invisible VRAM if the CPU
does not access it again.

Setting the CPU_ACCESS flag in amdgpu_bo_fault_reserve_notify() also means
that we can remove the loop to restrict lpfn to the end of visible VRAM,
because amdgpu_ttm_placement_init() will do it for us.

v3 [Michel Dänzer]
* Use AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED instead of a new flag
  (Christian König)
* Clear flag in amdgpu_bo_move instead of amdgpu_move_ram_vram
  (Christian)
* Explicitly mention amdgpu_bo_fault_reserve_notify in amdgpu_bo_move
* Also clear flag in amdgpu_bo_create_restricted
Suggested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

96cf8271

drm/amdgpu: Throttle visible VRAM moves separately · 00f06b24

John Brooks authored Jun 27, 2017

The BO move throttling code is designed to allow VRAM to fill quickly if it
is relatively empty. However, this does not take into account situations
where the visible VRAM is smaller than total VRAM, and total VRAM may not
be close to full but the visible VRAM segment is under pressure. In such
situations, visible VRAM would experience unrestricted swapping and
performance would drop.

Add a separate counter specifically for moves involving visible VRAM, and
check it before moving BOs there.

v2: Only perform calculations for separate counter if visible VRAM is
    smaller than total VRAM. (Michel Dänzer)
v3: [Michel Dänzer]
* Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
  flag to determine whether to account a move for visible VRAM in most
  cases.
* Use a single

	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {

  block in amdgpu_cs_get_threshold_for_moves.

Fixes: 95844d20 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2))
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

00f06b24

drm/amdgpu: Add vis_vramlimit module parameter · 218b5dcd

John Brooks authored Jun 27, 2017

Allow specifying a limit on visible VRAM via a module parameter. This is
helpful for testing performance under visible VRAM pressure.

v2: Add cast to 64-bit (Christian König)
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

218b5dcd

drm/amdgpu: change gartsize default to 256MB · f9321cc4

Christian König authored Jul 07, 2017

Limit the default GART size and save a lot of VRAM.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f9321cc4

drm/amdgpu: add new gttsize module parameter v2 · 36d38372

Christian König authored Jul 07, 2017

This allows setting the gtt size independent of the gart size.

v2: fix copy and paste typo
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

36d38372

drm/amdgpu: limit the GTT manager address space · bb84284e

Christian König authored Jul 07, 2017

We should only cover the GART size with the GTT manager.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bb84284e

drm/amdgpu: consistent name all GART related parts · 6f02a696

Christian König authored Jul 07, 2017

Rename symbols from gtt_ to gart_ as appropriate.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6f02a696

drm/amdgpu: remove gtt_base_align handling · ed21c047

Christian König authored Jul 06, 2017

Not used any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ed21c047

drm/amdgpu: move GART struct and function into amdgpu_gart.h v2 · 3490bdb5

Christian König authored Jul 06, 2017

No functional change, just cleanup.

v2: rebased, keep gart name.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3490bdb5

drm/amdgpu: check scratch registers to see if we need post (v2) · 70d17a25

Alex Deucher authored Jun 30, 2017

Rather than checking the CONGIG_MEMSIZE register as that may
not be reliable on some APUs.

v2: The scratch register is only used on CIK+
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

70d17a25

drm/amdgpu/soc15: init nbio registers for vega10 · 833fa075

Alex Deucher authored Jul 06, 2017

Call nbio init registers on hw_init to set up any
nbio registers that need initialization at hw init time.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

833fa075

drm/amdgpu: add nbio 6.1 register init function · 12097c6d

Alex Deucher authored Jul 06, 2017

Used for nbio registers that need to be initialized.  Currently
only used for a golden setting that got missed on some boards.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12097c6d

drm/amd/powerplay: added didt support for vega10 · 9b7b8154

Evan Quan authored Jul 05, 2017

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9b7b8154

drm/amd/powerplay: added grbm_idx_mutex lock/unlock to cgs v2 · 209ee27e

Evan Quan authored Jul 04, 2017

  - v2: rename param 'en' as 'lock'
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

209ee27e

drm/amd/powerplay: added support for new se_cac_idx APIs to cgs · c62a59d0

Evan Quan authored Jul 04, 2017

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c62a59d0

drm/amd/powerplay: added soc15 support for new se_cac_idx APIs · 2f11fb02

Evan Quan authored Jul 04, 2017

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2f11fb02

drm/amd/powerplay: added new se_cac_idx r/w APIs v2 · 16abb5d2

Evan Quan authored Jul 04, 2017

  - v2: added missing spinlock init
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

16abb5d2

drm/amd/powerplay: added index gc cac read/write apis for vega10 · 560460f2

Evan Quan authored Jul 03, 2017

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

560460f2

drm/amdgpu: use TTM values instead of MC values for the info queries · 09628c3f

Christian König authored Jun 30, 2017

Use the TTM values instead of the hardware config here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

09628c3f

drm/amdgpu: remove maximum BO size limitation v2 · 935eefb3

Christian König authored Jun 30, 2017

We can finally remove this now.

v2: remove now unused max_size variable as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

935eefb3

drm/amdgpu: stop mapping BOs to GTT · 5e7e8396

Christian König authored Jun 30, 2017

No need to map BOs to GTT on eviction and intermediate transfers any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5e7e8396

drm/amdgpu: use the GTT windows for BO moves v2 · abca90f1

Christian König authored Jun 30, 2017

This way we don't need to map the full BO at a time any more.

v2: use fixed windows for src/dst
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

abca90f1

drm/amdgpu: add amdgpu_gart_map function v2 · 0c2c421e

Christian König authored Jun 29, 2017

This allows us to write the mapped PTEs into
an IB instead of the table directly.

v2: fix build with debugfs enabled, remove unused assignment
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0c2c421e

drm/amdgpu: reserve the first 2x512 pages of GART · cc25188a

Christian König authored Jun 28, 2017

We want to use them as remap address space.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cc25188a

drm/amdgpu: make arrays pctl0_data and pctl1_data static · 606ce3c0

Colin Ian King authored Jul 06, 2017

The arrays pctl0_data and pctl1_data do not need to be in global scope,
so them both static.

Cleans up sparse warnings:
symbol 'pctl0_data' was not declared. Should it be static?
symbol 'pctl1_data' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

606ce3c0

drm/amdgpu/gmc9: get vram width from atom for Raven · 8d6a5230

Alex Deucher authored Jul 05, 2017

Get it from the system info table.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8d6a5230

drm/amdgpu/atomfirmware: implement vram_width for APUs · 21f6bcb6

Alex Deucher authored Jul 05, 2017

Implement support using the new atomfirmware system info table.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

21f6bcb6