    scsi: iscsi: Fix in-kernel conn failure handling · 23d6fefb
    Mike Christie authored
    Commit 0ab71045 ("scsi: iscsi: Perform connection failure entirely in
    kernel space") has the following regressions/bugs that this patch fixes:
    
    1. It can return cmds to upper layers like dm-multipath, which can then
    retry them. After the retries succeed, the fs/app can send new I/O to the
    same sectors, but we've left the original cmds running in FW or in the
    net layer. We need to call ep_disconnect if userspace is not up.
    
    This patch only fixes the issue for offload drivers. iscsi_tcp will be
    fixed in a separate commit because it doesn't have an ep_disconnect call.
    
    2. The drivers that implement ep_disconnect expect that it's called before
    conn_stop. Besides crashes, if the cleanup_task callout is called before
    ep_disconnect, it might free driver/card resources for session1, which
    could then be allocated to session2. But because the driver's
    ep_disconnect has not been called, it has not cleaned up the firmware, so
    the card is still using the resources for the original cmd.
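
The ordering hazard in item 2 can be illustrated with a small user-space model. This is only a sketch: every `mock_*` name below is hypothetical and is not a kernel or driver API. It assumes firmware holds a cmd's resources until the endpoint is disconnected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one offload connection; not a kernel structure. */
struct mock_conn {
    bool fw_active;   /* firmware still uses the original cmd's resources */
    bool res_freed;   /* driver returned those resources to the free pool */
};

/* Stands in for the driver's ep_disconnect: firmware lets go of the cmd. */
void mock_ep_disconnect(struct mock_conn *c) { c->fw_active = false; }

/* Stands in for the cleanup_task callout: resources become reusable. */
void mock_cleanup_task(struct mock_conn *c) { c->res_freed = true; }

/* True when freed resources are still in use by firmware. */
bool mock_hazard(const struct mock_conn *c)
{
    return c->res_freed && c->fw_active;
}

/* Expected order: disconnect the endpoint, then clean up tasks. */
bool teardown_expected_order(void)
{
    struct mock_conn c = { .fw_active = true, .res_freed = false };
    mock_ep_disconnect(&c);
    mock_cleanup_task(&c);
    return mock_hazard(&c);
}

/* Regressed order: tasks are cleaned up while firmware is still active. */
bool teardown_regressed_order(void)
{
    struct mock_conn c = { .fw_active = true, .res_freed = false };
    mock_cleanup_task(&c);
    bool hazard = mock_hazard(&c);   /* sampled before the late disconnect */
    mock_ep_disconnect(&c);
    return hazard;
}
```

In this model only the regressed order reports a hazard, which corresponds to the window in which session2 could be handed resources the card still uses for session1.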
    
    3. The stop_conn_work_fn can run after userspace has done its recovery and
    we are happily using the session. We will then end up with various bugs
    depending on what is going on at the time.
    
    We may also run stop_conn_work_fn late after userspace has called stop_conn
    and ep_disconnect and is now going to call start/bind conn. If
    stop_conn_work_fn runs after bind but before start, we would leave the conn
    in an unbound but sort of started state where I/O might be allowed even
    though the drivers have been set in a state where they no longer expect
    I/O.
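
One way to picture the late-work problem in item 3 is a deferred stop that re-checks connection state before acting. The sketch below is a user-space illustration under that assumption. The `mock_*` names and the state enum are hypothetical and do not reflect how the patch itself tracks state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical connection states; not the kernel's definitions. */
enum mock_state { MOCK_FAILED, MOCK_BOUND, MOCK_STARTED };

struct mock_wconn {
    enum mock_state state;
    int kernel_stops;   /* how many times the in-kernel stop took effect */
};

/* Userspace finished stop_conn/ep_disconnect and has re-bound the conn. */
void mock_userspace_recover(struct mock_wconn *c) { c->state = MOCK_BOUND; }

/* Deferred stop: only acts if the conn is still in the failed state. */
bool mock_stop_conn_work(struct mock_wconn *c)
{
    if (c->state != MOCK_FAILED)
        return false;   /* stale work item, recovery already happened */
    c->kernel_stops++;
    return true;
}

/* Work runs after userspace recovery: it must not touch the conn. */
int run_late_work(void)
{
    struct mock_wconn c = { .state = MOCK_FAILED, .kernel_stops = 0 };
    mock_userspace_recover(&c);
    mock_stop_conn_work(&c);
    return c.kernel_stops;
}

/* Work runs while the conn is still failed: the stop takes effect. */
int run_prompt_work(void)
{
    struct mock_wconn c = { .state = MOCK_FAILED, .kernel_stops = 0 };
    mock_stop_conn_work(&c);
    return c.kernel_stops;
}
```

Without the state check, the late work item would stop a conn that userspace has already bound, which is the "unbound but sort of started" state described above.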
    
    4. Returning -EAGAIN in iscsi_if_destroy_conn if we haven't yet run the
    in-kernel stop_conn function is breaking userspace. The kernel should
    have been running the stop on the caller's behalf instead.
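
The behavioural difference in item 4 can be sketched as follows. This is a user-space model with hypothetical `mock_*` names, not the actual iscsi_if_destroy_conn code. It only contrasts returning -EAGAIN with performing the pending stop for the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical connection record; not a kernel structure. */
struct mock_dconn {
    bool stopped;
    bool destroyed;
};

void mock_stop_conn(struct mock_dconn *c) { c->stopped = true; }

/* Regressed behaviour: push the retry burden onto userspace. */
int mock_destroy_conn_old(struct mock_dconn *c)
{
    if (!c->stopped)
        return -EAGAIN;
    c->destroyed = true;
    return 0;
}

/* Intended behaviour: run the pending stop, then destroy. */
int mock_destroy_conn_new(struct mock_dconn *c)
{
    if (!c->stopped)
        mock_stop_conn(c);
    c->destroyed = true;
    return 0;
}

/* Helpers that exercise each variant on a conn that was never stopped. */
int destroy_unstopped_old(void)
{
    struct mock_dconn c = { .stopped = false, .destroyed = false };
    return mock_destroy_conn_old(&c);
}

int destroy_unstopped_new(void)
{
    struct mock_dconn c = { .stopped = false, .destroyed = false };
    int ret = mock_destroy_conn_new(&c);
    return (ret == 0 && c.stopped && c.destroyed) ? 0 : -1;
}
```

A caller that does not expect -EAGAIN from destroy would treat the old variant's return as a hard failure, which is the userspace breakage the item describes.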
    
    Link: https://lore.kernel.org/r/20210525181821.7617-8-michael.christie@oracle.com
    Fixes: 0ab71045 ("scsi: iscsi: Perform connection failure entirely in kernel space")
    Reviewed-by: Lee Duncan <lduncan@suse.com>
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi_transport_iscsi.h 17 KB