    sd: medium access timeout counter fails to reset · 2a863ba8
    David Jeffery authored
    There is an error in the medium access timeout feature of the sd driver.
    sd_done() resets sdkp->medium_access_timed_out to zero in the wrong place:
    currently the reset happens only when a command returns sense data.  As a
    result, the medium access check can trigger falsely on timed-out commands
    that are hours or days apart.
    
    For example, an I/O command times out and is aborted.  It is then retried
    and succeeds.  But because no sense data is generated and returned, the
    medium_access_timed_out value is not reset.  If no sd command returns sense
    data in the meantime, the next command to time out (however long after the
    first failure) will trigger the medium access timeout and put the device
    offline.
    
    The resetting of sdkp->medium_access_timed_out should occur before the check
    for sense data.
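
The ordering problem can be illustrated with a small user-space model. This is
only a sketch, not the driver code: the struct, the model_* function names, and
the threshold of 2 are hypothetical stand-ins chosen for illustration, and the
real sd_done() and error-handling paths do considerably more.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-disk state in the driver. */
struct disk_model {
    unsigned int medium_access_timed_out;    /* timeouts counted so far */
    unsigned int max_medium_access_timeouts; /* offline threshold (assumed) */
    bool offline;
};

/* Models the timeout path: count the timeout, go offline at the threshold. */
static void model_timeout(struct disk_model *d)
{
    d->medium_access_timed_out++;
    if (d->medium_access_timed_out >= d->max_medium_access_timeouts)
        d->offline = true;
}

/* Ordering described as buggy: the reset is only reached with sense data. */
static void model_done_buggy(struct disk_model *d, bool has_sense)
{
    if (!has_sense)
        return;                      /* successful command: counter untouched */
    d->medium_access_timed_out = 0;
}

/* Ordering described as the fix: reset before the sense data check. */
static void model_done_fixed(struct disk_model *d, bool has_sense)
{
    d->medium_access_timed_out = 0;  /* any completed command clears it */
    if (!has_sense)
        return;
    /* sense data handling would follow here in a real driver */
}

/*
 * Scenario from the description: one timeout, a successful retry that
 * returns no sense data, then a second, unrelated timeout.
 * Returns true if the modelled device ends up offline.
 */
static bool model_scenario(bool use_fix)
{
    struct disk_model d = { 0, 2, false }; /* threshold of 2 is an assumption */

    model_timeout(&d);
    if (use_fix)
        model_done_fixed(&d, false);
    else
        model_done_buggy(&d, false);
    model_timeout(&d);

    return d.offline;
}
```

In this model, model_scenario(false) leaves the device offline because the
counter survives the successful retry, while model_scenario(true) keeps it
online because the counter is cleared on every completion.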
    
    To reproduce using scsi_debug, use SCSI_DEBUG_OPT_TIMEOUT or
    SCSI_DEBUG_OPT_MAC_TIMEOUT to force an I/O command to time out.  Then remove
    the opt value so the I/O will succeed on retry.  Perform more I/O as desired.
    Finally, repeat the process to make a new I/O command time out.  Without the
    patch, the device will be marked offline even though many I/O commands have
    succeeded between the two timed-out commands.

    Signed-off-by: David Jeffery <djeffery@redhat.com>
    Reviewed-by: Ewan D. Milne <emilne@redhat.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
sd.c 85.1 KB