    accel/ivpu: Do not use wait event interruptible · b0873eea
    Stanislaw Gruszka authored
    If we receive a signal while waiting for an IPC message response in
    ivpu_ipc_receive(), we return an error and continue to operate.
    The driver can then send another IPC message and re-use the occupied
    slot of a message still being processed by the firmware. This can
    corrupt firmware memory and lead to a FW crash with messages like:
    
    [ 3698.569719] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x1103, ret -512
    [ 3698.569747] intel_vpu 0000:00:0b.0: [drm] ivpu_jsm_unregister_db(): Failed to unregister doorbell 3: -512
    [ 3698.569756] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): IPC message vpu:0x88980000 not released by firmware
    [ 3698.569763] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): JSM message vpu:0x88980040 not released by firmware
    [ 3698.570234] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x110e, ret -512
    [ 3698.570318] intel_vpu 0000:00:0b.0: [drm] *ERROR* ivpu_mmu_dump_event(): MMU EVTQ: 0x10 (Translation fault) SSID: 0 SID: 3, e[2] 00000000, e[3] 00000208, in addr: 0x88988000, fetch addr: 0x0
    
    To fix the issue, don't use the interruptible variant of wait event,
    so that the firmware is allowed to finish IPC processing.
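
The failure mode can be illustrated with a minimal userspace sketch. This is not driver code: `struct slot`, `wait_for_reply()`, `send_next()`, `simulate()` and the tick values are hypothetical names and numbers chosen for illustration. The only value taken from the kernel is 512, the numeric value of ERESTARTSYS, which is what "ret -512" in the log above corresponds to.

```c
#include <stdbool.h>

/* Kernel-internal errno; shows up as "ret -512" in the log. */
#define ERESTARTSYS 512

/* Hypothetical model of one IPC message slot shared with the firmware. */
struct slot {
	bool held_by_fw;	/* firmware has not released the slot yet */
	bool reused;		/* driver overwrote the slot while it was held */
};

/*
 * Model of waiting for the firmware reply. The firmware needs fw_ticks
 * ticks to finish; a signal arrives at tick signal_at. Only the
 * interruptible variant returns early on the signal.
 */
static int wait_for_reply(struct slot *s, int fw_ticks, int signal_at,
			  bool interruptible)
{
	for (int tick = 0; tick < fw_ticks; tick++) {
		if (interruptible && tick == signal_at)
			return -ERESTARTSYS;	/* slot is still held by firmware */
	}
	s->held_by_fw = false;	/* firmware finished and released the slot */
	return 0;
}

/* Model of the driver sending the next message through the same slot. */
static void send_next(struct slot *s)
{
	if (s->held_by_fw)
		s->reused = true;	/* this is the memory-corruption case */
	s->held_by_fw = true;
}

/* Returns true if the slot was re-used while the firmware still held it. */
bool simulate(bool interruptible)
{
	struct slot s = { .held_by_fw = true, .reused = false };

	(void)wait_for_reply(&s, 10, 3, interruptible);
	send_next(&s);
	return s.reused;
}
```

In this model `simulate(true)` reports a re-used slot, because the wait is abandoned at tick 3 while the firmware still owns the slot, and `simulate(false)` does not, because the wait only ends once the firmware has released it.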
    
    Fixes: 5d7422cf ("accel/ivpu: Add IPC driver and JSM messages")
    Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
    Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
    Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230925121137.872158-2-stanislaw.gruszka@linux.intel.com
ivpu_ipc.c 13.1 KB