habanalabs: soft-reset device if context-switch fails
This patch fix a bug in the driver, where if the TPC or MME remains in non-IDLE even after all the command submissions are done (due to user bug or malicious user), then future command submissions will fail in the context-switch stage and the driver will remain in "stuck" mode. The fix is to do a soft-reset of the device in case the context-switch fails, because the device should be IDLE during context-switch. If it is not IDLE, then something is wrong and we should reset the compute engines. Signed-off-by:Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Showing
Please register or sign in to comment