    ocfs2/dlm: do not purge lockres that is queued for assert master · ac4fef4d
    Xue jiufei authored
    When the workqueue is delayed, a lockres may be purged while it is still
    queued for assert master.  This can trigger BUG() as follows.
    
    N1                                         N2
    dlm_get_lockres()
    ->dlm_do_master_requery
                                      is the master of lockres,
                                      so queue assert_master work
    
                                      dlm_thread() start running
                                      and purge the lockres
    
                                      dlm_assert_master_worker()
                                      send assert master message
                                      to other nodes
    receiving the assert_master
    message, set master to N2
    
    On N1, dlmlock_remote() then sends a create_lock message to N2 but
    receives DLM_IVLOCKID.  If the lockres is a RECOVERY lockres, this
    triggers the BUG().
    
    Another BUG() is triggered when N3 becomes the new master and sends
    assert_master to N1: N1 triggers the BUG() because the owner doesn't
    match.  So we should not purge a lockres while it is queued for assert
    master.

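The rule stated above (a lockres must not be purged while assert-master work is still queued for it) can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel patch itself: the struct, its fields, and all three function names are hypothetical, and the locking that real DLM code would need around the counter is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for a DLM lock resource. All names are hypothetical. */
struct lockres_model {
    /* Assert-master work items queued for this lockres but not yet finished. */
    unsigned int inflight_assert_workers;
    /* True while any lock is still attached to the lockres. */
    bool has_locks;
};

/* Called when assert-master work is queued for this lockres. */
static void lockres_grab_assert_worker(struct lockres_model *res)
{
    res->inflight_assert_workers++;
}

/* Called by the worker once the assert-master message has been sent. */
static void lockres_drop_assert_worker(struct lockres_model *res)
{
    /* Dropping without a matching grab would be a bookkeeping bug. */
    assert(res->inflight_assert_workers > 0);
    res->inflight_assert_workers--;
}

/* The purge path would only purge a lockres that passes this check. */
static bool lockres_purgeable(const struct lockres_model *res)
{
    if (res->has_locks)
        return false;
    /* Still queued for assert master: purging now would open the race. */
    if (res->inflight_assert_workers != 0)
        return false;
    return true;
}
```

In this model, the node that queues the assert-master work calls lockres_grab_assert_worker() before queuing and lockres_drop_assert_worker() after the message is sent, so a delayed workqueue keeps the lockres from being reported as purgeable in between.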
    Signed-off-by: joyce.xue <xuejiufei@huawei.com>
    Reviewed-by: Mark Fasheh <mfasheh@suse.de>
    Cc: Joel Becker <jlbec@evilplan.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dlmthread.c 20.5 KB