    rbd: take header_rwsem in rbd_dev_refresh() only when updating · 0b207d02
    Ilya Dryomov authored
    rbd_dev_refresh() has been holding header_rwsem across header and
    parent info read-in unnecessarily for ages.  With commit 870611e4
    ("rbd: get snapshot context after exclusive lock is ensured to be
    held"), the potential for deadlocks became much more real owing to
    a) header_rwsem now nesting inside lock_rwsem and b) rw_semaphores
    not allowing new readers after a writer is registered.
    
    For example, assuming that I/O request 1, I/O request 2 and header
    read-in request all target the same OSD:
    
    1. I/O request 1 comes in and gets submitted
    2. watch error occurs
    3. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and
       releases lock_rwsem
    4. after reestablishing the watch, rbd_reregister_watch() calls
       rbd_dev_refresh() which takes header_rwsem for write and submits
       a header read-in request
    5. I/O request 2 comes in: after taking lock_rwsem for read in
       __rbd_img_handle_request(), it blocks trying to take header_rwsem
       for read in rbd_img_object_requests()
    6. another watch error occurs
    7. rbd_watch_errcb() blocks trying to take lock_rwsem for write
    8. I/O request 1 completion is received by the messenger but can't be
       processed because lock_rwsem won't be granted anymore
    9. header read-in request completion can't be received, let alone
       processed, because the messenger is stranded
    
    Change rbd_dev_refresh() to take header_rwsem only for actually
    updating rbd_dev->header.  Header and parent info read-in don't need
    any locking.
    
    Cc: stable@vger.kernel.org # 0b035401: rbd: move rbd_dev_refresh() definition
    Cc: stable@vger.kernel.org # 510a7330: rbd: decouple header read-in from updating rbd_dev->header
    Cc: stable@vger.kernel.org # c1031177: rbd: decouple parent info read-in from updating rbd_dev
    Cc: stable@vger.kernel.org
    Fixes: 870611e4 ("rbd: get snapshot context after exclusive lock is ensured to be held")
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
rbd.c 187 KB