• Bob Peterson's avatar
    gfs2: Ignore dlm recovery requests if gfs2 is withdrawn · 03678a99
    Bob Peterson authored
    When a node fails, user space informs dlm of the node failure,
    and dlm instructs gfs2 on the surviving nodes to perform journal
    recovery. It does this by calling various callback functions in
    lock_dlm.c. To mark its progress, it keeps generation numbers
    and recover bits in a dlm "control" lock lvb, which is seen by
    all nodes to determine which journals need to be replayed.
    
    The gfs2 on all nodes get the same recovery requests from dlm,
    so they all try to do the recovery, but only one will be
    granted the exclusive lock on the journal. The others fail
    with a "Busy" message on their "try lock."
    
    However, when a node is withdrawn, it cannot safely do any
    recovery or replay any journals. To make matters worse,
    gfs2 might withdraw as a result of attempting recovery. For
    example, this might happen if the device goes offline, or if
    an hba fails. But in today's gfs2 code, it doesn't check for
    being withdrawn at any step in the recovery process. What's
    worse is that these callbacks from dlm have no return code,
    so there is no way to indicate failure back to dlm. We can
    send a "Recovery failed" uevent eventually, but that tells
    user space what happened, not dlm's kernel code.
    
    Before this patch, lock_dlm would perform its recovery steps but
    ignore the result, and eventually it would still update its
    generation number in the lvb, despite the fact that it may have
    withdrawn or encountered an error. The other nodes would then
    see the newer generation number in the lvb and conclude that
    they don't need to do recovery because the generation number
    is newer than the last one they saw. They think a different
    node has already recovered the journal.
    
    This patch adds checks to several of the callbacks used by dlm
    in its recovery state machine so that the functions are ignored
    and skipped if an io error has occurred or if the file system
    is withdrawn. That prevents the lvb bits from being updated, and
    therefore dlm and user space still see the need for recovery to
    take place.
    Signed-off-by: default avatarBob Peterson <rpeterso@redhat.com>
    Reviewed-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
    03678a99
recovery.c 11.2 KB