-
Sagi Grimberg authored
When we delete a controller, we execute the following: 1. nvme_stop_ctrl() - stop some work elements that may be inflight or scheduled (specifically also .stop_ctrl which cancels ctrl error recovery work) 2. nvme_remove_namespaces() - which first flushes scan_work to avoid competing ns addition/removal 3. continue to teardown the controller However, if err_work was scheduled to run in (1), it is designed to cancel any inflight I/O, particularly I/O that is originating from ns scan_work in (2), but because it is cancelled in .stop_ctrl(), we can prevent forward progress of (2) as ns scanning is blocking on I/O (that will never be cancelled). The race is: 1. transport layer error observed -> err_work is scheduled 2. scan_work executes, discovers ns, generate I/O to it 3. nvme_ctop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work) - err_work never executed 4. nvme_remove_namespaces() -> flush_work(scan_work) --> deadlock, because scan_work is blocked on I/O that was supposed to be cancelled by err_work, but was cancelled before executing. Fix this by flushing err_work instead of cancelling it, to force it to execute and cancel all inflight I/O. Fixes: b435ecea ("nvme: Add .stop_ctrl to nvme ctrl ops") Fixes: f6c8e432 ("nvme: flush namespace scanning work just before removing namespaces") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
a1ae8d4d