Commits · 4f2937bfff1057700c402438b83d66179283675e · nexedi / linux

09 Dec, 2017 15 commits

drm/amdkfd: sync IOLINK defines to thunk spec · 4f2937bf

Harish Kasiviswanathan authored Dec 08, 2017

Current thunk spec v1.07 dated Feb 1, 2016

v2: fix indentation
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Support enumerating non-GPU devices · 6d82eb0e

Harish Kasiviswanathan authored Dec 08, 2017

Modify kfd_topology_enum_kfd_devices(..) function to support non-GPU
nodes. The function returned NULL when it encountered non-GPU (say CPU)
nodes. This caused kfd_ioctl_create_event and kfd_init_apertures to fail
for Intel + Tonga.

kfd_topology_enum_kfd_devices will now parse all the nodes and return
valid kfd_dev for nodes with GPU.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
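
The enumeration behaviour described in this commit can be illustrated with a small userspace sketch. It is not the driver code: the types `gpu_dev` and `topo_node` and the functions `enum_devices` and `count_gpus` are hypothetical stand-ins, and the return convention (0 with a possibly NULL device, -1 past the end of the list) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's kfd_dev and topology node types. */
struct gpu_dev { int id; };
struct topo_node { struct gpu_dev *gpu; };  /* gpu is NULL for a CPU-only node */

/*
 * Inspect the node at position idx.
 * Returns 0 and stores that node's GPU (NULL for a CPU node) in *out,
 * or returns -1 once idx runs past the last node.
 */
static int enum_devices(const struct topo_node *nodes, size_t count,
                        size_t idx, struct gpu_dev **out)
{
    assert(out != NULL);
    *out = NULL;
    if (idx >= count)
        return -1;
    *out = nodes[idx].gpu;
    return 0;
}

/* Walk every node and count only the ones that carry a GPU. */
static size_t count_gpus(const struct topo_node *nodes, size_t count)
{
    struct gpu_dev *dev;
    size_t idx = 0, gpus = 0;

    while (enum_devices(nodes, count, idx++, &dev) == 0)
        if (dev != NULL)
            gpus++;
    return gpus;
}
```

With this shape a CPU node in the middle of the list no longer ends the walk early; callers keep iterating until the end-of-list value and skip NULL devices.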

drm/amdkfd: Decouple CRAT parsing from device list update · 4f449311

Harish Kasiviswanathan authored Dec 08, 2017

Currently, CRAT parsing is intertwined with topology_device_list and
hence repeated calls to kfd_parse_crat_table() will fail. Decouple
kfd_parse_crat_table() and topology_device_list.

kfd_parse_crat_table() will parse CRAT and add topology devices to a
temporary list temp_topology_device_list and then
kfd_topology_update_device_list will move contents from temporary list to
master list.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
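
The parse-then-move pattern described above can be sketched in plain C. This is a userspace illustration under stated assumptions, not the kernel implementation: the list type, `parse_table`, `list_splice_tail` and `update_topology` are hypothetical names, and a negative id stands in for a malformed table entry.

```c
#include <assert.h>
#include <stdlib.h>

struct topo_dev { int id; struct topo_dev *next; };
struct dev_list { struct topo_dev *head, *tail; };

static void list_append(struct dev_list *l, struct topo_dev *d)
{
    d->next = NULL;
    if (l->tail)
        l->tail->next = d;
    else
        l->head = d;
    l->tail = d;
}

static void list_free(struct dev_list *l)
{
    struct topo_dev *d = l->head;

    while (d) {
        struct topo_dev *next = d->next;
        free(d);
        d = next;
    }
    l->head = l->tail = NULL;
}

static size_t list_len(const struct dev_list *l)
{
    size_t n = 0;

    for (const struct topo_dev *d = l->head; d; d = d->next)
        n++;
    return n;
}

/* Hypothetical parser: fills only the temporary list, never the master. */
static int parse_table(const int *ids, size_t n, struct dev_list *tmp)
{
    for (size_t i = 0; i < n; i++) {
        struct topo_dev *d;

        if (ids[i] < 0)          /* assumed marker for a malformed entry */
            return -1;
        d = malloc(sizeof(*d));
        if (!d)
            return -1;
        d->id = ids[i];
        list_append(tmp, d);
    }
    return 0;
}

/* Move all entries from tmp to the end of master; tmp ends up empty. */
static void list_splice_tail(struct dev_list *master, struct dev_list *tmp)
{
    assert(master != NULL && tmp != NULL);
    if (!tmp->head)
        return;
    if (master->tail)
        master->tail->next = tmp->head;
    else
        master->head = tmp->head;
    master->tail = tmp->tail;
    tmp->head = tmp->tail = NULL;
}

/* Parse into a temporary list, then commit to the master list on success. */
static int update_topology(struct dev_list *master, const int *ids, size_t n)
{
    struct dev_list tmp = { NULL, NULL };

    if (parse_table(ids, n, &tmp) != 0) {
        list_free(&tmp);         /* master list is left untouched */
        return -1;
    }
    list_splice_tail(master, &tmp);
    return 0;
}
```

Because the parser never touches the master list, a failed parse leaves it unchanged and the parser can be called repeatedly.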

drm/amdkfd: Reorganize CRAT fetching from ACPI · 8e05247d

Harish Kasiviswanathan authored Dec 08, 2017

Reorganize and rename kfd_topology_get_crat_acpi function. In this way
acpi_get_table(..) needs to be called only once. This will also aid in
dGPU topology implementation.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Group up CRAT related functions · 174de876

Felix Kuehling authored Dec 08, 2017

Take CRAT related functions out of kfd_topology.c and place them in
kfd_crat.c. This is the initial step of supporting more CRAT features,
i.e. creating virtual CRAT table for KFD devices without CRAT.

v2: Minor cleanup that was missed previously because code moved around
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Fix memory leaks in kfd topology · 5108d768

Yong Zhao authored Dec 08, 2017

A kobject created using kobject_create_and_add() can be freed using
kobject_put() when there is no reference any more. However, a kobject
whose memory is allocated with kzalloc() has to set up a release
callback in order to free it when the counter decreases to 0.
Otherwise it causes a memory leak.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
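
The release-callback rule can be shown with a userspace analogue. This is a minimal sketch, not the kobject API: `struct obj`, `obj_get`, `obj_put`, `struct node` and `node_release` are hypothetical names, and the plain integer counter is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct obj;
typedef void (*release_fn)(struct obj *);

/* Minimal reference-counted object with a release callback. */
struct obj {
    int refcount;
    release_fn release;
};

static void obj_init(struct obj *o, release_fn release)
{
    o->refcount = 1;
    o->release = release;
}

static void obj_get(struct obj *o)
{
    o->refcount++;
}

/* Drop one reference; call the release callback when the count hits 0. */
static void obj_put(struct obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0 && o->release)
        o->release(o);
}

/* A container that embeds the object, allocated by its owner. */
struct node {
    int id;
    struct obj kobj;
};

static int nodes_freed;  /* counter used only to observe the release */

/* Without this callback nothing would ever free the container. */
static void node_release(struct obj *o)
{
    struct node *n = (struct node *)((char *)o - offsetof(struct node, kobj));

    nodes_freed++;
    free(n);
}

static struct node *node_create(int id)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    n->id = id;
    obj_init(&n->kobj, node_release);
    return n;
}
```

The container is freed only from the callback, so the memory lives exactly as long as the last reference.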

drm/amdkfd: Topology: Fix location_id · d63f0ba2

Harish Kasiviswanathan authored Dec 08, 2017

Fix location_id format to match Thunk specification.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Update number of compute unit from KGD · f7ce2fad

Flora Cui authored Dec 08, 2017

Overwrite the active simd_count from KGD at driver loading time. This is
based on the assumption that register GC_USER_SHADER_ARRAY_CONFIG won’t get
changed.

V2: remove the incorrect simd_count reported at loading module.
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amd: Remove get_vmem_size from KGD-KFD interface · 4248bd0b

Harish Kasiviswanathan authored Dec 08, 2017

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Remove deprecated get_vmem_size · b4ec7757

Harish Kasiviswanathan authored Dec 08, 2017

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Stop using get_vmem_size KGD-KFD interface · 0504cccf

Harish Kasiviswanathan authored Dec 08, 2017

get_vmem_size() is deprecated. Instead use get_local_mem_info().
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Implement get_local_mem_info · 30f1c042

Harish Kasiviswanathan authored Dec 08, 2017

Implement new kgd-kfd interface function get_local_mem_info.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amd: Add get_local_mem_info to KGD-KFD interface · 4073ed78

Harish Kasiviswanathan authored Dec 08, 2017

Add get_local_mem_info which provides more information about local
memory than get_vmem_size:
- public and private framebuffer size
- memory clock
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: add amdgpu interface to query cu info · ebdebf42

Flora Cui authored Dec 08, 2017

Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amd: add new interface to query cu info · 8cce58fe

Flora Cui authored Dec 08, 2017

Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

27 Nov, 2017 12 commits

drm/amdkfd: Simplify locking during process creation · c0ede1f8

Yong Zhao authored Nov 27, 2017

Also fixes error handling if kfd_process_init_cwsr fails.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release · de1450a5

Felix Kuehling authored Nov 27, 2017

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Reduce nesting in kfd_create_process_device_data · 2d9b36f9

Felix Kuehling authored Nov 27, 2017

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails · 82c16b42

Yong Zhao authored Nov 27, 2017

If no matching process is found, return NULL instead of a pointer
to the last process in the kfd_processes_table.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
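
The class of bug fixed here is easy to reproduce in any lookup loop that returns its iteration cursor. The sketch below is a hypothetical userspace version (an array instead of the driver's process table; `struct proc` and `lookup_by_pasid` are illustrative names) that keeps the result in a separate variable so a miss yields NULL.

```c
#include <assert.h>
#include <stddef.h>

struct proc { unsigned int pasid; };

/*
 * Return the entry whose pasid matches, or NULL when there is none.
 * The result lives in its own variable; returning the loop cursor
 * instead would hand back the last entry visited on a miss.
 */
static struct proc *lookup_by_pasid(struct proc *table, size_t n,
                                    unsigned int pasid)
{
    struct proc *found = NULL;

    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].pasid == pasid) {
            found = &table[i];
            break;
        }
    }
    return found;
}
```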

drm/amdkfd: Use ref count to prevent kfd_process destruction · abb208a8

Felix Kuehling authored Nov 27, 2017

Use a reference counter instead of a lock to prevent process
destruction while functions running out of process context are using
the kfd_process structure. In many cases these functions don't need
the structure to be locked. In the few cases that really do need the
process lock, take it explicitly.

This helps simplify lock dependencies between the process lock and
other locks, particularly amdgpu and mm_struct locks. This will be
important when amdgpu calls back to amdkfd for memory evictions.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Make kfd_process reference counted · 5ce10687

Felix Kuehling authored Nov 27, 2017

This will be used to eliminate the use of the process lock for
preventing concurrent process destruction. This will simplify lock
dependencies between KFD and KGD.

This also simplifies the process destruction in a few ways:
* Don't allocate work struct dynamically
* Remove unnecessary hack that increments mm reference counter
* Remove unnecessary process locking during destruction
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Get reference to lead_thread task struct · c7b1243e

Felix Kuehling authored Nov 27, 2017

Increment the kfd_process.lead_thread's reference counter to make
it safe to dereference. This is needed for getting a safe reference
to the process' mm_struct.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Add debugfs support to KFD · 851a645e

Felix Kuehling authored Nov 27, 2017

This commit adds several debugfs entries for kfd:

kfd/hqds: dumps all HQDs on all GPUs for KFD-controlled compute and
    SDMA RLC queues

kfd/mqds: dumps all MQDs of all KFD processes on all GPUs

kfd/rls: dumps HWS runlists on all GPUs
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Add kfd2kgd APIs for dumping HQDs · 80c195f5

Felix Kuehling authored Nov 27, 2017

This can be used by KFD for debugging features, such as dumping
HQDs in debugfs.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Fix definition of KFD_CIK_SDMA_QUEUE_OFFSET · fdcba29c

Felix Kuehling authored Nov 27, 2017

This counts the queue offset in register index, not register address.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Fix oversubscription accounting · 36582fa5

Felix Kuehling authored Nov 27, 2017

Don't count SDMA queues towards compute HQD oversubscription when
deciding whether to create a chained runlist.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
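
The accounting change can be pictured with a small sketch. The structure and field names below are hypothetical, and the real check in the driver may consider more inputs than this; the example only shows SDMA queues being subtracted before the comparison against the compute HQD slots.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of scheduler state; field names are illustrative. */
struct sched_state {
    unsigned int total_queues;  /* compute and SDMA queues together */
    unsigned int sdma_queues;   /* SDMA queues only */
    unsigned int compute_hqds;  /* hardware queue slots usable for compute */
};

/* Only compute queues compete for compute HQD slots. */
static bool compute_oversubscribed(const struct sched_state *s)
{
    unsigned int compute_queues;

    assert(s->sdma_queues <= s->total_queues);
    compute_queues = s->total_queues - s->sdma_queues;
    return compute_queues > s->compute_hqds;
}
```

For example, with 30 queues of which 8 are SDMA and 24 compute slots, only 22 queues compete for the slots, so this check reports no oversubscription even though the raw total exceeds 24.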

drm/amdkfd: map multiple processes to HW scheduler · a99c6d4f

Felix Kuehling authored Nov 27, 2017

Allow HWS to execute multiple processes on the hardware
concurrently. The number of concurrent processes is limited by
the number of VMIDs allocated to the HWS.

A module parameter can be used to limit this further or to turn
it off altogether (mainly for debugging purposes).
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

04 Dec, 2017 1 commit

drm/amdkfd: Fix printing pointer cast · 8f8fb9b9

Kent Russell authored Dec 04, 2017

Just print a pointer instead of casting

v2: Remove the 0x prefix, since %p prints that automatically, and remove
it from one other spot as well
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

27 Nov, 2017 2 commits

drm/amdkfd: Add crash protection in debugger register path · 3c0b4280

Philip Yang authored Nov 27, 2017

After a debugger is registered, pqm_destroy_queue fails because is_debug
is true. In that case the queue should not be removed from
process_queue_list, since the count is not reduced.

A test application may call debugger unregister without first registering
a debugger. Add a NULL pointer check to avoid a crash in this case.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: fix get_max_engine_clock_in_mhz · a9efcc19

Felix Kuehling authored Nov 27, 2017

Use proper powerplay function. This fixes OpenCL initialization
problems.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

24 Nov, 2017 1 commit

drm/amdkfd: Delete a useless parameter from create_queue function pointer · b46cb7d7

Yong Zhao authored Nov 24, 2017

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

14 Nov, 2017 4 commits

drm/amdkfd: Add support for user-mode trap handlers · d7b9bd22

Felix Kuehling authored Nov 14, 2017

A second-level user mode trap handler can be installed. The CWSR trap
handler jumps to the secondary trap handler conditionally for any
conditions not handled by it. This can be used e.g. for debugging or
catching math exceptions.

When CWSR is disabled, the user mode trap handler is installed as
first level trap handler.
Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Add CWSR support · 373d7080

Felix Kuehling authored Nov 14, 2017

This hardware feature allows the GPU to preempt shader execution in
the middle of a compute wave, save the state and restore it later
to resume execution.

Memory for saving the state is allocated per queue in user mode and
the address and size passed to the create_queue ioctl. The size
depends on the number of waves that can be in flight simultaneously
on a given ASIC.
Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Add trap handler for CWSR · 449fea61

Felix Kuehling authored Nov 14, 2017

The trap handler is like an interrupt handler running on the GPU
compute unit. It is needed for supporting CWSR (compute wave
save/restore).

This file defines an array with the pre-compiled GFXv8 shader ISA.
The assembly code is included for reference in #if 0 ... #endif.
Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Cleanup qpd.pqm initialization · b20cd0df

Felix Kuehling authored Nov 14, 2017

The PQM doesn't change after process creation. So initialize it in
kfd_create_process_device_data.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

06 Nov, 2017 2 commits

drm/amdkfd: Use order_base_2 to get log2 of buffer sizes · 115c8c41

Felix Kuehling authored Nov 06, 2017

Replace (ffs(size) - 1) with order_base_2(size) as a more
straightforward way to get log2 of buffer sizes.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
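
The two expressions agree for power-of-two sizes and differ otherwise, which a userspace sketch can show. `ffs()` here is the POSIX function from `<strings.h>`; `order_base_2_sketch` is a hypothetical reimplementation written for this example (rounding log2 up, with 0 and 1 mapping to 0), not the kernel macro itself.

```c
#include <assert.h>
#include <limits.h>
#include <strings.h>

/* Old approach: index of the lowest set bit, exact only for powers of two. */
static int log2_via_ffs(int size)
{
    return ffs(size) - 1;
}

/*
 * Userspace stand-in for order_base_2(): the smallest order such that
 * (1 << order) >= n. Written as a loop for clarity rather than speed.
 */
static unsigned int order_base_2_sketch(unsigned long n)
{
    unsigned int order = 0;

    /* Keep the shift below in range for very large inputs. */
    assert(n <= (ULONG_MAX >> 1) + 1);
    while ((1UL << order) < n)
        order++;
    return order;
}
```

For a 4096-byte buffer both return 12. For a size that is not a power of two, such as 24, the `ffs` form returns 3 (the lowest set bit) while the rounding-up form returns 5.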

drm/amdkfd: Hardware DWORD size is 4 bytes · 6d566930

Felix Kuehling authored Nov 06, 2017

Don't use sizeof(uint32_t) or similar types for hardware or firmware
DWORD size. The hardware and firmware don't care about Linux types.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

01 Nov, 2017 3 commits

drm/amdkfd: Implement amdkfd SDMA functions for VI · 5aaf2bef

Philip Cox authored Nov 01, 2017

Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Use ASIC-specific SDMA MQD type · 97b9ad12

Felix Kuehling authored Nov 01, 2017

Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Implement amdgpu SDMA functions for VI · 9807c366

Philip Cox authored Nov 01, 2017

Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
