    nvme-rdma: fix timeout handler · 0475a8dc
    Sagi Grimberg authored
    When a request times out in a LIVE state, we simply trigger error
    recovery and let the error recovery handle the request cancellation.
    When a request times out in a non-LIVE state, however, we make sure to
    complete it immediately, as it might block controller setup or teardown
    and prevent forward progress.
    
    However, tearing down the entire set of I/O and admin queues causes a
    freeze/unfreeze imbalance (q->mq_freeze_depth) and is really overkill
    for what we actually need, which is to just fence a controller
    teardown that may be running, stop the queue, and cancel the request if
    it is not already completed.
    
    Now that we have the controller teardown_lock, we can safely serialize
    request cancellation. This addresses a hang caused by an extra queue
    freeze on the controller namespaces, which prevented the matching
    unfreeze from completing correctly.
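    
    The flow described above can be modelled in userspace as a minimal
    sketch. This is not the driver code: every type and function name here
    (model_ctrl, model_request, model_timeout, model_complete_timed_out) is
    hypothetical, a pthread mutex stands in for the controller
    teardown_lock, and the two return values are assumed stand-ins for
    "request handled" and "rearm the timer".

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only. All names below are hypothetical stand-ins and
 * none of them are taken from the kernel's nvme-rdma driver.
 */
enum ctrl_state { CTRL_LIVE, CTRL_CONNECTING, CTRL_DELETING };
enum eh_return { EH_DONE, EH_RESET_TIMER };

#define MODEL_STATUS_ABORTED 1

struct model_ctrl {
    enum ctrl_state state;
    pthread_mutex_t teardown_lock; /* serializes teardown and cancellation */
    bool queue_live;               /* false once the queue is stopped */
    int recovery_scheduled;        /* number of error recovery triggers */
};

struct model_request {
    struct model_ctrl *ctrl;
    bool completed;
    int status;                    /* 0 = success, nonzero = aborted */
};

static void model_ctrl_init(struct model_ctrl *ctrl, enum ctrl_state state)
{
    ctrl->state = state;
    pthread_mutex_init(&ctrl->teardown_lock, NULL);
    ctrl->queue_live = true;
    ctrl->recovery_scheduled = 0;
}

/* Fence a concurrent teardown, stop the queue, cancel if still pending. */
static void model_complete_timed_out(struct model_request *rq)
{
    struct model_ctrl *ctrl = rq->ctrl;

    assert(ctrl != NULL);
    pthread_mutex_lock(&ctrl->teardown_lock);
    ctrl->queue_live = false;
    if (!rq->completed) {
        rq->status = MODEL_STATUS_ABORTED;
        rq->completed = true;
    }
    pthread_mutex_unlock(&ctrl->teardown_lock);
}

/* Timeout handler: the controller state decides which path is taken. */
static enum eh_return model_timeout(struct model_request *rq)
{
    struct model_ctrl *ctrl = rq->ctrl;

    if (ctrl->state != CTRL_LIVE) {
        /* Setup or teardown may be waiting on this request: finish it now. */
        model_complete_timed_out(rq);
        return EH_DONE;
    }

    /* LIVE: leave cancellation to error recovery and rearm the timer. */
    ctrl->recovery_scheduled++;
    return EH_RESET_TIMER;
}
```

    In this model, a request that times out on a non-LIVE controller is
    completed with an aborted status under the lock and no queue is frozen,
    while a request on a LIVE controller is left untouched and only bumps
    the recovery counter.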
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: James Smart <james.smart@broadcom.com>
    Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
rdma.c 64.2 KB