    IB/srp: Fail I/O requests if the transport is offline · 2ce19e72
    Bart Van Assche authored
    If an SRP target is no longer reachable and srp_reset_host() fails to
    reconnect then ib_srp will invoke scsi_remove_host().  That function
    will invoke __scsi_remove_device() for each LUN.  And that last
    function will change the device state from SDEV_TRANSPORT_OFFLINE into
    SDEV_CANCEL.  Certain user space software, e.g. older versions of
    multipathd, continues queueing I/O to SCSI devices that are in the
    SDEV_CANCEL state.
    
    If these I/O requests are submitted via SG_IO, the REQ_PREEMPT flag
    will be set and hence these requests will be passed to
    srp_queuecommand().  These requests will time out.  If new requests
    are queued fast enough from user space, these active requests will
    prevent __scsi_remove_device() from finishing.
    
    Avoid this by failing I/O requests in the SDEV_CANCEL state if the
    transport is offline.  Introduce a new variable to keep track of the
    transport state instead of failing requests if (!target->connected ||
    target->qp_in_error), so that the SCSI error handler has a chance to
    retry commands after a transport layer failure occurred.
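
As a rough illustration of the approach described above, the following user-space sketch models the decision made at queue time. It is not the patch itself: the struct names, the `transport_offline` field name, the helper functions and the `MODEL_DID_NO_CONNECT` constant (a stand-in for the kernel's DID_NO_CONNECT host byte) are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's DID_NO_CONNECT host byte (assumed value 0x01). */
#define MODEL_DID_NO_CONNECT 0x01

/* Hypothetical, simplified model of the per-target state. */
struct model_target {
    bool connected;          /* connection currently established */
    bool qp_in_error;        /* queue pair hit an error */
    bool transport_offline;  /* the "new variable"; field name assumed here */
};

/* Hypothetical, simplified model of a SCSI command. */
struct model_cmd {
    int  result;     /* host byte is stored in bits 16..23 */
    bool completed;  /* true once the command has been completed */
};

/* Called when reconnecting has failed for good (models a failed host reset). */
static void model_reconnect_failed(struct model_target *t)
{
    assert(t != NULL);
    t->transport_offline = true;
}

/*
 * Returns true if the command was handed to the transport,
 * false if it was failed immediately.
 *
 * Only transport_offline is checked here. connected and qp_in_error are
 * deliberately NOT used to fail commands, so that an error handler still
 * gets a chance to retry after a transient transport failure.
 */
static bool model_queuecommand(const struct model_target *t,
                               struct model_cmd *cmd)
{
    assert(t != NULL && cmd != NULL);

    if (t->transport_offline) {
        cmd->result = MODEL_DID_NO_CONNECT << 16;
        cmd->completed = true;   /* complete right away instead of timing out */
        return false;
    }

    cmd->result = 0;
    cmd->completed = false;      /* in flight; completion comes later */
    return true;
}
```

In this model, a target with `connected == false` and `qp_in_error == true` still accepts commands until `model_reconnect_failed()` has been called; afterwards every command is completed immediately with the no-connect result, so no new request can stay active and hold up device removal.
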
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Cc: <stable@vger.kernel.org> # 3.8
    Signed-off-by: Roland Dreier <roland@purestorage.com>