    drbd: fix race between role change and handshake · a8821531
    Philipp Reisner authored
    Symptoms:
    If DRBD was "cleanly shut down" (all in sync, both Secondary before
    disconnect, identical data generation uuids), and then one side was
    promoted *during* the next connection handshake, the role change
    could confuse the handshake.
    
    The Primary would get stuck in WFBitmapS; the Secondary would log
    "unexpected cstate (Connected) in receive_bitmap"
    and get stuck in WFBitmapT.
    
    Fix:
    The test in is_valid_soft_transition was wrong. The corrected test
    works because the disallowed actions (promote/attach) do not touch
    the cstate. The previous condition failed to demand a cstate change
    in one clause.
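
As a rough illustration of the point above, the following sketch models a handshake-time check in plain C. It is not the code in drbd_state.c: the enum values, the struct, and both function names are hypothetical, and the state set is reduced to three connection states. It only shows why a clause that inspects the old cstate alone lets a promotion slip through, while a clause that also demands a cstate change rejects it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced connection states, ordered as a handshake advances. */
enum cstate { WF_CONNECTION, WF_REPORT_PARAMS, CONNECTED };
enum role { SECONDARY, PRIMARY };

struct state {
	enum cstate conn;
	enum role role;
};

/* Illustrative "old" check: the first clause looks only at the old cstate,
 * so any transition requested while in WF_REPORT_PARAMS is accepted,
 * including one that changes the role and leaves the cstate alone. */
static bool allowed_during_handshake_old(struct state os, struct state ns)
{
	return os.conn == WF_REPORT_PARAMS ||
	       (os.conn == WF_CONNECTION && ns.conn == WF_REPORT_PARAMS);
}

/* Illustrative "new" check: every clause demands that the cstate advances.
 * Promote and attach do not touch the cstate, so they can never match. */
static bool allowed_during_handshake_new(struct state os, struct state ns)
{
	return (os.conn == WF_CONNECTION && ns.conn == WF_REPORT_PARAMS) ||
	       (os.conn == WF_REPORT_PARAMS && ns.conn >= CONNECTED);
}

/* Small self-check that exercises both variants. */
void self_check(void)
{
	struct state os = { WF_REPORT_PARAMS, SECONDARY };
	struct state promote = { WF_REPORT_PARAMS, PRIMARY };  /* role change only */
	struct state connect = { CONNECTED, SECONDARY };       /* cstate change only */

	assert(allowed_during_handshake_old(os, promote));   /* the race */
	assert(!allowed_during_handshake_new(os, promote));  /* now refused */
	assert(allowed_during_handshake_new(os, connect));   /* handshake proceeds */
}
```

In this model a refused request would be retried once the handshake has finished, which is where the waiting described in the next paragraph comes in.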
    
    To avoid deadlocks, give up the state_mutex while waiting
    for the transient state to go away.
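
The locking idea can be sketched in userspace with POSIX threads. This is an analogy under stated assumptions, not the kernel implementation: the variable and function names are made up for the example, and pthread_cond_wait stands in for whatever wait primitive the patch uses. The point it shows is that the waiter releases the mutex while it sleeps, so the thread that must end the transient state can take the same mutex and make progress.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shared state and its lock. */
static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_wait = PTHREAD_COND_INITIALIZER;
static bool in_transient_state = true;  /* handshake still in progress */
static int role;                        /* 0 = Secondary, 1 = Primary */

/* Handshake side: needs state_mutex to leave the transient state. */
static void *finish_handshake(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&state_mutex);
	in_transient_state = false;
	pthread_cond_broadcast(&state_wait);
	pthread_mutex_unlock(&state_mutex);
	return NULL;
}

/* Role-change side: waits for the transient state to go away.
 * pthread_cond_wait unlocks state_mutex while sleeping and re-locks it
 * before returning. Sleeping with the mutex held would block
 * finish_handshake forever. */
int promote(void)
{
	int new_role;

	pthread_mutex_lock(&state_mutex);
	while (in_transient_state)
		pthread_cond_wait(&state_wait, &state_mutex);
	role = 1;
	new_role = role;
	pthread_mutex_unlock(&state_mutex);
	return new_role;
}

/* Runs both sides once; returns the resulting role, or -1 on error. */
int run_demo(void)
{
	pthread_t handshake;
	int r;

	if (pthread_create(&handshake, NULL, finish_handshake, NULL) != 0)
		return -1;
	r = promote();
	pthread_join(handshake, NULL);
	assert(!in_transient_state);
	return r;
}
```

Calling run_demo() returns 1 regardless of which thread gets the mutex first, because the waiter never holds it while blocked.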
    
    Conflicts:
    	drbd/drbd_state.c
    	drbd/drbd_state.h
    	drbd/drbd_wrappers.h
    Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
    Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
drbd_state.c 57.6 KB