1. 24 Jul, 2024 4 commits
    • Tim Huang's avatar
      drm/amdgpu: add missed harvest check for VCN IP v4/v5 · 0b071245
      Tim Huang authored
      To prevent below probe failure, add a check for models with VCN
      IP v4.0.6 where VCN1 may be harvested.
      
      v2:
      Apply the same check to VCN IP v4.0 and v5.0.
      
      [   54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
      [   54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43
      c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02
      00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00
      [   54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286
      [   54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX:
      0000000000000000
      [   54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI:
      ffff99a82f680000
      [   54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09:
      0000000000000000
      [   54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12:
      0000000000000000
      [   54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15:
      0000000000000001
      [   54.075400] FS:  00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000)
      knlGS:0000000000000000
      [   54.075988] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4:
      0000000000750ef0
      [   54.076927] PKRU: 55555554
      [   54.077132] Call Trace:
      [   54.077319]  <TASK>
      [   54.077484]  ? show_regs+0x69/0x80
      [   54.077747]  ? __die+0x28/0x70
      [   54.077979]  ? page_fault_oops+0x180/0x4b0
      [   54.078286]  ? do_user_addr_fault+0x2d2/0x680
      [   54.078610]  ? exc_page_fault+0x84/0x190
      [   54.078910]  ? asm_exc_page_fault+0x2b/0x30
      [   54.079224]  ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
      [   54.079941]  ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu]
      [   54.080617]  vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu]
      [   54.081316]  amdgpu_device_ip_set_powergating_state+0x64/0xc0
      [amdgpu]
      [   54.082057]  amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu]
      [   54.082727]  amdgpu_ring_alloc+0x44/0x70 [amdgpu]
      [   54.083351]  amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu]
      [   54.084054]  amdgpu_ring_test_helper+0x22/0x90 [amdgpu]
      [   54.084698]  vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu]
      [   54.085307]  amdgpu_device_init+0x1f96/0x2780 [amdgpu]
      [   54.085951]  amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu]
      [   54.086591]  amdgpu_pci_probe+0x19f/0x550 [amdgpu]
      [   54.087215]  local_pci_probe+0x48/0xa0
      [   54.087509]  pci_device_probe+0xc9/0x250
      [   54.087812]  really_probe+0x1a4/0x3f0
      [   54.088101]  __driver_probe_device+0x7d/0x170
      [   54.088443]  driver_probe_device+0x24/0xa0
      [   54.088765]  __driver_attach+0xdd/0x1d0
      [   54.089068]  ? __pfx___driver_attach+0x10/0x10
      [   54.089417]  bus_for_each_dev+0x8e/0xe0
      [   54.089718]  driver_attach+0x22/0x30
      [   54.090000]  bus_add_driver+0x120/0x220
      [   54.090303]  driver_register+0x62/0x120
      [   54.090606]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
      [   54.091255]  __pci_register_driver+0x62/0x70
      [   54.091593]  amdgpu_init+0x67/0xff0 [amdgpu]
      [   54.092190]  do_one_initcall+0x5f/0x330
      [   54.092495]  do_init_module+0x68/0x240
      [   54.092794]  load_module+0x201c/0x2110
      [   54.093093]  init_module_from_file+0x97/0xd0
      [   54.093428]  ? init_module_from_file+0x97/0xd0
      [   54.093777]  idempotent_init_module+0x11c/0x2a0
      [   54.094134]  __x64_sys_finit_module+0x64/0xc0
      [   54.094476]  do_syscall_64+0x58/0x120
      [   54.094767]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
      Signed-off-by: default avatarTim Huang <tim.huang@amd.com>
      Reviewed-by: default avatarSaleemkhan Jamadar <saleemkhan.jamadar@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      0b071245
    • Yifan Zhang's avatar
      drm/amdgpu: skip kfd init if GFX is not ready. · 3b37e272
      Yifan Zhang authored
      avoid kfd init crash in that case.
      Signed-off-by: default avatarYifan Zhang <yifan1.zhang@amd.com>
      Acked-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Tested-by: default avatarJesse Zhang <Jesse.Zhang@amd.com>
      Reviewed-by: default avatarJesse Zhang <Jesse.Zhang@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      3b37e272
    • Philip Yang's avatar
      drm/amdkfd: Validate user queue update · 305cd109
      Philip Yang authored
      Ensure update queue new ring buffer is mapped on GPU with correct size.
      
      Decrease queue old ring_bo queue_refcount and increase new ring_bo
      queue_refcount.
      Signed-off-by: default avatarPhilip Yang <Philip.Yang@amd.com>
      Reviewed-by: default avatarFelix Kuehling <felix.kuehling@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      305cd109
    • Philip Yang's avatar
      drm/amdkfd: Validate user queue svm memory residency · b049504e
      Philip Yang authored
      Queue CWSR area maybe registered to GPU as svm memory, create queue to
      ensure svm mapped to GPU with KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag.
      
      Add queue_refcount to struct svm_range, to track queue CWSR area usage.
      
      Because unmap mmu notifier callback return value is ignored, if
      application unmap the CWSR area while queue is active, pr_warn message
      in dmesg log. To be safe, evict user queue.
      Signed-off-by: default avatarPhilip Yang <Philip.Yang@amd.com>
      Reviewed-by: default avatarFelix Kuehling <felix.kuehling@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      b049504e
  2. 23 Jul, 2024 36 commits