    nvme-multipath: fix possible I/O hang when paths are updated · 504db087
    Anton Eidelman authored
    When nvme_state_set_live() makes a path available, it triggers
    requeue_work in order to resubmit requests that ended up on
    requeue_list while no paths were available.
    
    This requeue_work may race with concurrent calls to
    nvme_ns_head_make_request() that do not observe the live path yet.
    Such concurrent requests may be made by either:
    - New I/O submission.
    - requeue_work triggered by nvme_failover_req() or another ana_work.
    
    The race may cause requeue_work to capture the state of requeue_list
    before more requests get onto the list. These requests will stay on
    the list forever unless requeue_work is triggered again.
    
    In order to prevent this race, nvme_state_set_live() should call
    synchronize_srcu(&head->srcu) before triggering requeue_work, so that
    no nvme_ns_head_make_request() call is still referencing an old
    snapshot of the path list.
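
    The ordering problem described above can be sketched with a small,
    deterministic userspace model. This is only an illustration, not the
    kernel code: struct model_head, struct model_reader, reader_begin(),
    reader_end() and simulate() are hypothetical names, and the SRCU grace
    period is modelled simply by letting the in-flight submitter finish
    before requeue_work runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REQS 8

/* Hypothetical userspace model of one namespace head; not the kernel struct. */
struct model_head {
	bool path_live;              /* is any path usable right now? */
	int requeue_list[MAX_REQS];  /* requests parked while no path was usable */
	size_t requeue_len;
	int completed;               /* requests that reached a live path */
};

/* A submitter that entered its read-side section and sampled the path state. */
struct model_reader {
	int req;
	bool saw_live_path;          /* snapshot taken when the read side began */
	bool in_flight;
};

/* Models the start of nvme_ns_head_make_request(): snapshot the path state. */
static void reader_begin(const struct model_head *h, struct model_reader *r, int req)
{
	r->req = req;
	r->saw_live_path = h->path_live;
	r->in_flight = true;
}

/* Models the end of the read side: submit on the snapshot or park the request. */
static void reader_end(struct model_head *h, struct model_reader *r)
{
	if (!r->in_flight)
		return;
	if (r->saw_live_path) {
		h->completed++;
	} else {
		assert(h->requeue_len < MAX_REQS);
		h->requeue_list[h->requeue_len++] = r->req;
	}
	r->in_flight = false;
}

/* Models requeue_work: capture what is on the list now and resubmit it. */
static void requeue_work(struct model_head *h)
{
	size_t n = h->requeue_len;   /* the captured state of the list */

	h->requeue_len = 0;
	for (size_t i = 0; i < n; i++) {
		if (h->path_live)
			h->completed++;
		else
			h->requeue_list[h->requeue_len++] = h->requeue_list[i];
	}
}

/* Returns how many requests are left stranded on the requeue list. */
static size_t simulate(bool sync_before_requeue)
{
	struct model_head h = { .path_live = false };
	struct model_reader r = { 0 };

	reader_begin(&h, &r, 1);     /* submitter samples: no live path yet */
	h.path_live = true;          /* the path becomes live */
	if (sync_before_requeue)
		reader_end(&h, &r);  /* stand-in for synchronize_srcu() */
	requeue_work(&h);            /* captures the list as it is now */
	reader_end(&h, &r);          /* without the sync, parked too late */
	return h.requeue_len;
}
```

    In this model, simulate(false) leaves one request stranded on the
    list, while simulate(true) leaves none, because the submitter that
    sampled the old path state has already parked its request by the time
    requeue_work captures the list.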

    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
    Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
multipath.c 18.6 KB