    drm/lima: mask irqs in timeout path before hard reset · a421cc7a
    Erico Nunes authored
    There is a race condition in which a rendering job might take just long
    enough to trigger the drm sched job timeout handler but still complete
    before the timeout handler performs the hard reset.
    The timeout handler does not expect the job to complete at that point.
    In some very specific cases this currently results in a refcount
    imbalance on lima_pm_idle, with a stack dump such as:
    
    [10136.669170] WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/lima/lima_devfreq.c:205 lima_devfreq_record_idle+0xa0/0xb0
    ...
    [10136.669459] pc : lima_devfreq_record_idle+0xa0/0xb0
    ...
    [10136.669628] Call trace:
    [10136.669634]  lima_devfreq_record_idle+0xa0/0xb0
    [10136.669646]  lima_sched_pipe_task_done+0x5c/0xb0
    [10136.669656]  lima_gp_irq_handler+0xa8/0x120
    [10136.669666]  __handle_irq_event_percpu+0x48/0x160
    [10136.669679]  handle_irq_event+0x4c/0xc0
    
    We can prevent that race condition entirely by masking the irqs at the
    beginning of the timeout handler, at which point we give up on waiting
    for that job.
    The irqs will be enabled again by the next hard reset, which the timeout
    handler already performs as a recovery step.
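
    The race and the fix can be illustrated with a small userspace model.
    This is only a sketch under stated assumptions: every name below
    (model_gpu, completion_irq, run_scenario, ...) is hypothetical and not
    taken from lima_sched.c, and the late completion is simulated as a plain
    function call inside the timeout handler rather than a real interrupt.

```c
#include <stdbool.h>

/* Hypothetical model, not the lima driver's data structures. */
struct model_gpu {
    bool irq_masked;  /* models the interrupt mask of the GPU block */
    int busy_count;   /* models the busy/idle refcount */
    int warnings;     /* counts idle calls made with no matching busy */
};

static void record_busy(struct model_gpu *g)
{
    g->busy_count++;
}

static void record_idle(struct model_gpu *g)
{
    if (g->busy_count == 0) {
        g->warnings++;  /* refcount imbalance: idle without busy */
        return;
    }
    g->busy_count--;
}

/* Models the completion interrupt: a masked irq never reaches the handler. */
static void completion_irq(struct model_gpu *g)
{
    if (g->irq_masked)
        return;
    record_idle(g);
}

/* Models the hard reset, assumed here to unmask the irqs again. */
static void hard_reset(struct model_gpu *g)
{
    g->irq_masked = false;
}

static void timeout_handler(struct model_gpu *g, bool mask_first)
{
    if (mask_first)
        g->irq_masked = true;  /* give up on the job from here on */

    completion_irq(g);  /* simulated: the job completes inside the race window */

    record_idle(g);     /* the timeout path accounts the job as finished */
    hard_reset(g);
}

/* Returns the number of imbalance warnings seen for one timed-out job. */
int run_scenario(bool mask_first)
{
    struct model_gpu g = { false, 0, 0 };

    record_busy(&g);  /* job submitted */
    timeout_handler(&g, mask_first);
    return g.warnings;
}
```

    In this model, run_scenario(false) records the job as idle twice and
    reports one imbalance, while run_scenario(true) drops the late
    completion and reports none.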

    Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
    Reviewed-by: Qiang Yu <yuq825@gmail.com>
    Signed-off-by: Qiang Yu <yuq825@gmail.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240405152951.1531555-4-nunes.erico@gmail.com
lima_sched.c 13.5 KB