    scsi: ufs: Fix a deadlock between PM and the SCSI error handler · 7029e215
    Bart Van Assche authored
    The following deadlock has been observed on multiple test setups:
    
     * ufshcd_wl_suspend() is waiting for blk_execute_rq(START STOP UNIT) to
       complete while ufshcd_wl_suspend() holds host_sem.
    
     * The SCSI error handler is activated and changes the host state to
       SHOST_RECOVERY. ufshcd_eh_host_reset_handler() and ufshcd_err_handler()
       are then called, and the latter function tries to obtain host_sem.
    
    This is a deadlock: blk_execute_rq() can't execute SCSI commands while
    the host is in the SHOST_RECOVERY state, and the error handler cannot
    make progress because host_sem is held by another thread.
    
    Fix this deadlock as follows:
    
     * Fail attempts to suspend the system while the SCSI error handler is in
       progress by setting the SCMD_FAIL_IF_RECOVERING flag for START STOP UNIT
       commands.
    
     * If the system is suspending and a START STOP UNIT command times out,
       handle the SCSI command timeout from inside the context of the SCSI
       timeout handler instead of activating the SCSI error handler.
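
The two rules above can be sketched as a small user-space model. This is
only an illustration under stated assumptions, not the code of this patch:
every model_* name, the system_suspending field and the enum values are
hypothetical and are not kernel APIs. The model captures only the two
decisions: what happens to a command submitted while the host is
recovering, and who handles a command timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum model_host_state { MODEL_HOST_RUNNING, MODEL_HOST_RECOVERY };

enum model_submit_result {
	MODEL_SUBMIT_QUEUED,	/* host is running, command is executed */
	MODEL_SUBMIT_BLOCKED,	/* command waits until recovery finishes */
	MODEL_SUBMIT_FAILED,	/* command fails immediately */
};

enum model_timeout_action {
	MODEL_TIMEOUT_ESCALATE_TO_EH,	/* activate the SCSI error handler */
	MODEL_TIMEOUT_HANDLE_INLINE,	/* recover from the timeout handler */
};

struct model_host {
	enum model_host_state state;
	bool system_suspending;	/* assumed flag: system suspend in progress */
};

/*
 * Rule 1: a command marked "fail if recovering" (the role that
 * SCMD_FAIL_IF_RECOVERING plays for START STOP UNIT) fails while the host
 * is recovering instead of waiting with host_sem held.
 */
static inline enum model_submit_result
model_submit(const struct model_host *host, bool fail_if_recovering)
{
	assert(host != NULL);
	if (host->state != MODEL_HOST_RECOVERY)
		return MODEL_SUBMIT_QUEUED;
	return fail_if_recovering ? MODEL_SUBMIT_FAILED : MODEL_SUBMIT_BLOCKED;
}

/*
 * Rule 2: while the system is suspending, a timeout is handled from the
 * timeout handler itself, so the error handler (which needs host_sem) is
 * never activated.
 */
static inline enum model_timeout_action
model_timeout(const struct model_host *host)
{
	assert(host != NULL);
	return host->system_suspending ? MODEL_TIMEOUT_HANDLE_INLINE
				       : MODEL_TIMEOUT_ESCALATE_TO_EH;
}
```

In this model the deadlock corresponds to MODEL_SUBMIT_BLOCKED being
returned to a caller that holds host_sem. With both rules applied, the
suspend path never reaches that combination.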
    
    The runtime power management code is not affected by this deadlock since
    hba->host_sem is not touched by the runtime power management functions in
    the UFS driver.
    
    Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Link: https://lore.kernel.org/r/20221018202958.1902564-11-bvanassche@acm.org
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>