    gfs2: Close timing window with GLF_INVALIDATE_IN_PROGRESS · d99724c3
    Bob Peterson authored
    This patch closes a timing window in which two processes compete
    and overlap in the execution of do_xmote for the same glock:
    
                 Process A                              Process B
       ------------------------------------   -----------------------------
    1. Grabs gl_lockref and calls do_xmote
    2.                                        Grabs gl_lockref but is blocked
    3. Sets GLF_INVALIDATE_IN_PROGRESS
    4. Unlocks gl_lockref
    5.                                        Calls do_xmote
    6. Calls glops->go_sync
    7. test_and_clear_bit GLF_DIRTY
    8. Calls gfs2_log_flush                   Calls glops->go_sync
    9. (slow IO, so it blocks a long time)    test_and_clear_bit GLF_DIRTY
                                              It's not dirty (step 7), so it returns
    10.                                       Tests GLF_INVALIDATE_IN_PROGRESS
    11.                                       Calls go_inval (rgrp_go_inval)
    12.                                       gfs2_rgrp_relse does brelse
    13.                                       truncate_inode_pages_range
    14.                                       Calls lm_lock UN
    
    In step 14, process B has just told dlm to give the glock to another
    node when, in fact, process A has not yet finished the IO, synced all
    buffer_heads to disk, and made sure their revokes are done.
    
    This patch fixes the problem by setting GLF_INVALIDATE_IN_PROGRESS
    with test_and_set_bit. If the bit is already set, process B does not
    proceed and trusts that process A will carry out the do_xmote steps
    in the proper order.

    Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
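
The claim-or-back-off idea described in the message can be sketched in user-space C. This is a minimal model under stated assumptions, not the code from the patch: `struct model_glock`, `model_test_and_set_bit`, `model_begin_xmote`, and `model_finish_xmote` are hypothetical names, C11 atomics stand in for the kernel's bit operations, and the point at which the flag is cleared is an assumption of the model.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number; it only needs to be stable within this model. */
enum { MODEL_INVALIDATE_IN_PROGRESS = 0 };

/* Hypothetical stand-in for a glock: one flags word plus a counter. */
struct model_glock {
    atomic_ulong flags;  /* models the glock's flag word */
    int xmote_runs;      /* how many callers ran the sync/invalidate sequence */
};

/* Models test_and_set_bit(): atomically set the bit, return its old value. */
static bool model_test_and_set_bit(int nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << nr;
    return (atomic_fetch_or(addr, mask) & mask) != 0;
}

/* Models clear_bit(): atomically clear the bit. */
static void model_clear_bit(int nr, atomic_ulong *addr)
{
    atomic_fetch_and(addr, ~(1UL << nr));
}

/*
 * Returns true if this caller claimed the flag and now owns the
 * sync/invalidate sequence; returns false if another caller already
 * holds the flag, in which case this caller backs off.
 */
static bool model_begin_xmote(struct model_glock *gl)
{
    if (model_test_and_set_bit(MODEL_INVALIDATE_IN_PROGRESS, &gl->flags))
        return false;  /* the "process B" case: someone else is in progress */
    gl->xmote_runs++;  /* only the flag owner reaches this line */
    return true;
}

/* Assumption of this model: the owner clears the flag once it is done. */
static void model_finish_xmote(struct model_glock *gl)
{
    model_clear_bit(MODEL_INVALIDATE_IN_PROGRESS, &gl->flags);
}
```

With a plain set_bit, a second caller cannot tell whether it set the flag or merely found it set, so both callers continue. The atomic test-and-set returns the previous value, which lets exactly one caller take ownership of the sequence.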
glock.c 56.6 KB