• Adrian Hunter's avatar
    scsi: ufs: core: Revert "scsi: ufs: Synchronize SCSI and UFS error handling" · 88b09900
    Adrian Hunter authored
    This reverts commit a113eaaf.
    
    There are a couple of issues with the commit:
    
     1. It causes deadlocks.
    
     2. It causes the shost->eh_cmd_q list of failed requests not to be
        processed, ever.
    
    So revert it.
    
    1. Deadlocks
    
    The SCSI error handler runs with requests blocked beginning when
    scsi_schedule_eh() sets SHOST_RECOVERY state, continuing through
    scsi_error_handler() callback ->eh_strategy_handler() until
    scsi_restart_operations() is called.  By setting eh_strategy_handler to
    ufshcd_err_handler, the patch changed the UFS error handler to run with
    requests blocked, including PM requests, for the entire run of the error
    handler.
    
    That conflicts with UFS error handler existing synchronization with UFS
    device PM operations.  The UFS error handler synchronizes with runtime PM
    by doing pm_runtime_get_sync() prior to blocking requests itself.  It
    synchronizes with system PM by use of hba->host_sem, again before blocking
    requests itself.  However, if requests are already blocked, then PM
    operations will block.  So:
    
       the UFS error handler blocks waiting on PM
     + PM blocks waiting on SCSI PM requests to process or fail
     + PM requests are blocked waiting on error handling to finish
     =  deadlock
    
    This happens both for runtime PM and system PM.
    
    Prior to the patch, these deadlocks could not happen even if SCSI error
    handling was running, because the presence of requests in shost->eh_cmd_q
    would mean the queues could not be suspended, which would mean that, should
    the UFS error handler run at the same time, it would not need to wait for
    PM or vice versa.
    
    Please note these scenarios are not just theoretical, they were found
    during testing on a Samsung Galaxy Book S.
    
    2. ->eh_strategy_handler() must process shost->eh_cmd_q list of failed
    requests, as all other eh_strategy_handler's do except UFS error handler.
    Refer for example: scsi_unjam_host(), ata_scsi_error() and
    sas_scsi_recover_host().
    
    Link: https://lore.kernel.org/r/20210917144349.14058-1-adrian.hunter@intel.com
    Fixes: a113eaaf ("scsi: ufs: Synchronize SCSI and UFS error handling")
    Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
    Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
    Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
    88b09900
ufshcd.h 41.4 KB