    nvme: fix use after free when disconnecting a reconnecting ctrl · 8b77fa6f
    Ruozhu Li authored
    A crash happens when trying to disconnect a reconnecting ctrl:
    
     1) The network was cut off just after the connection was established,
        and the scan work hung waiting for some I/Os to complete.  Those
        I/Os were retried because we return BLK_STS_RESOURCE to the block
        layer while reconnecting.
     2) After a while, I tried to disconnect this connection.  This
        procedure also hung because it tried to obtain ctrl->scan_lock.
        Note that by this point the controller state had already been
        switched to NVME_CTRL_DELETING.
     3) In nvme_check_ready(), we always return true when ctrl->state is
        NVME_CTRL_DELETING, so those retrying I/Os were issued to the
        underlying device, which had already been freed.
    
    To fix this, when ctrl->state is NVME_CTRL_DELETING, issue the command to
    the underlying device only when the queue state is live.  Otherwise,
    return a host path error to the block layer.
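
The decision described above can be sketched as a small user-space model.
This is not the kernel code: the enum names and the functions
model_check_old() and model_check_new() are hypothetical and exist only
for illustration, and the real nvme_check_ready() handles more states and
command types than shown here.  The sketch assumes just three controller
states and a single "queue is live" flag.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the controller states. */
enum model_ctrl_state {
    MODEL_CTRL_LIVE,
    MODEL_CTRL_CONNECTING,
    MODEL_CTRL_DELETING,
};

/* Possible outcomes for a request in this model. */
enum model_verdict {
    MODEL_ISSUE,           /* pass the command down to the transport queue */
    MODEL_RETRY,           /* like BLK_STS_RESOURCE: block layer retries later */
    MODEL_HOST_PATH_ERROR, /* fail the request back to the block layer */
};

/* Behavior described in step 3: DELETING always lets the command through. */
static enum model_verdict model_check_old(enum model_ctrl_state state,
                                          bool queue_live)
{
    (void)queue_live; /* ignored in the old behavior for DELETING */
    switch (state) {
    case MODEL_CTRL_LIVE:
    case MODEL_CTRL_DELETING:
        return MODEL_ISSUE;
    default:
        return MODEL_RETRY;
    }
}

/* Behavior described by the fix: DELETING issues only on a live queue. */
static enum model_verdict model_check_new(enum model_ctrl_state state,
                                          bool queue_live)
{
    switch (state) {
    case MODEL_CTRL_LIVE:
        return MODEL_ISSUE;
    case MODEL_CTRL_DELETING:
        return queue_live ? MODEL_ISSUE : MODEL_HOST_PATH_ERROR;
    default:
        return MODEL_RETRY;
    }
}
```

In this model, a retried I/O that arrives while the controller is deleting
and the queue is not live is issued by model_check_old() but failed with a
host path error by model_check_new(), which is the case the crash report
describes.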

    Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Christoph Hellwig <hch@lst.de>