• Sagi Grimberg's avatar
    nvmet-rdma: fix possible bad dereference when freeing rsps · 73964c1d
    Sagi Grimberg authored
    It is possible that the host connected and saw a cm established
    event and started sending nvme capsules on the qp, however the
    ctrl did not yet see an established event. This is why the
    rsp_wait_list exists (for async handling of these cmds, we move
    them to a pending list).
    
    Furthermore, it is possible that the ctrl cm times out, resulting
    in a connect-error cm event. in this case we hit a bad deref [1]
    because in nvmet_rdma_free_rsps we assume that all the responses
    are in the free list.
    
    We are freeing the cmds array anyways, so don't even bother to
    remove the rsp from the free_list. It is also guaranteed that we
    are not racing anything when we are releasing the queue so no
    other context accessing this array should be running.
    
    [1]:
    --
    Workqueue: nvmet-free-wq nvmet_rdma_free_queue_work [nvmet_rdma]
    [...]
    pc : nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma]
    lr : nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma]
     Call trace:
     nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma]
     nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma]
     process_one_work+0x1ec/0x4a0
     worker_thread+0x48/0x490
     kthread+0x158/0x160
     ret_from_fork+0x10/0x18
    --
    Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Signed-off-by: default avatarKeith Busch <kbusch@kernel.org>
    73964c1d
rdma.c 52.2 KB