    nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout · 7674073b
    Chao Leng authored
    A crash occurs when request completion is delayed for a long time
    (nearly 30s) through fault injection.
    Each namespace has its own request queue, so when completions are
    delayed that long, several request queues can have timed-out requests
    at the same time and nvme_rdma_timeout executes concurrently. Requests
    from different request queues can be queued on the same rdma queue, so
    several nvme_rdma_timeout instances can call nvme_rdma_stop_queue on
    that queue at the same time.
    The first nvme_rdma_timeout clears NVME_RDMA_Q_LIVE and goes on to stop
    the rdma queue (drain the qp), but the others see that NVME_RDMA_Q_LIVE
    is already cleared and complete their requests directly. Completing a
    request before the qp is fully drained can lead to a use-after-free
    condition.
    
    Add a mutex to serialize nvme_rdma_stop_queue.
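
The serialization described above can be sketched in userspace with POSIX threads. This is only an illustration under stated assumptions, not the kernel change itself: struct demo_queue, demo_drain, demo_stop_queue, demo_timeout and demo_run are hypothetical names, a plain bool stands in for the NVME_RDMA_Q_LIVE bit, and a busy loop stands in for draining the qp. The point it tries to show is that a caller that loses the race blocks on the mutex until the drain has finished, instead of returning early and completing its request.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_TIMEOUT_THREADS 4

/* Hypothetical userspace stand-in for an rdma queue; not the kernel struct. */
struct demo_queue {
	pthread_mutex_t lock;         /* serializes demo_stop_queue() callers */
	bool live;                    /* plays the role of NVME_RDMA_Q_LIVE */
	bool drained;                 /* set once the simulated drain is done */
	int drain_count;              /* number of callers that ran the drain */
	atomic_int early_completions; /* completions seen before the drain ended */
};

/* Simulated slow drain; only ever called with q->lock held. */
static void demo_drain(struct demo_queue *q)
{
	for (volatile int i = 0; i < 1000000; i++)
		;
	q->drain_count++;
	q->drained = true;
}

/*
 * The flag check and the drain sit under one mutex, so a second caller
 * cannot observe "already stopped" while the first is still draining.
 */
static void demo_stop_queue(struct demo_queue *q)
{
	pthread_mutex_lock(&q->lock);
	if (q->live) {
		q->live = false;
		demo_drain(q);
	}
	pthread_mutex_unlock(&q->lock);
}

/* Simulated timeout handler: stop the queue, then "complete" the request. */
static void *demo_timeout(void *arg)
{
	struct demo_queue *q = arg;

	demo_stop_queue(q);
	/* Completing is only safe once the drain has finished. */
	if (!q->drained)
		atomic_fetch_add(&q->early_completions, 1);
	return NULL;
}

/* Runs several concurrent timeouts; returns the number of early completions. */
int demo_run(void)
{
	struct demo_queue q = { .live = true };
	pthread_t threads[DEMO_TIMEOUT_THREADS];
	int i, rc;

	pthread_mutex_init(&q.lock, NULL);
	atomic_init(&q.early_completions, 0);

	for (i = 0; i < DEMO_TIMEOUT_THREADS; i++) {
		rc = pthread_create(&threads[i], NULL, demo_timeout, &q);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < DEMO_TIMEOUT_THREADS; i++)
		pthread_join(threads[i], NULL);

	pthread_mutex_destroy(&q.lock);
	assert(q.drain_count == 1); /* exactly one caller performed the drain */
	return atomic_load(&q.early_completions);
}
```

In this sketch demo_run() should return 0, since every caller leaves demo_stop_queue only after the single drain has completed. Without the mutex, a caller could see live already cleared and fall through to the completion step while the drain is still in progress, which is the window the commit message describes.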

    Signed-off-by: Chao Leng <lengchao@huawei.com>
    Tested-by: Israel Rukshin <israelr@nvidia.com>
    Reviewed-by: Israel Rukshin <israelr@nvidia.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
rdma.c 64.8 KB