    GFS2: Fix recovery issues for spectators · 4a772772
    Bob Peterson authored
    This patch fixes a couple of problems with spectators that still
    have gfs2 mounts after the last non-spectator node fails.
    
    Before this patch, spectator mounts would try to acquire the dlm's
    mounted lock EX as part of their normal recovery sequence.
    The mounted lock is only used to determine whether the node is
    the first mounter, the first node to mount the file system, for
    the purposes of file system recovery and journal replay.
    
    It's not necessary for spectators: they should never do journal
    recovery. If a spectator acquires the lock, it prevents a later
    "real" first mounter from acquiring the lock in EX mode. That node
    then doesn't think it's the first node to mount the file system,
    so it cannot do journal recovery either.
    
    This patch checks if the mounter is a spectator, and if so, avoids
    grabbing the mounted lock. This allows a later mounter that is
    really the first non-spectator mounter to do journal recovery:
    since the spectator doesn't hold the lock, the non-spectator can
    grab it in EX mode and therefore consider itself the first mounter,
    both as a "real" first mount and as a first-real-after-spectator.
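
A minimal user-space sketch of the decision described above, under stated assumptions: `mount_role_for`, `enum mount_role` and the `mounted_lock_taken` flag are hypothetical names for illustration and are not the kernel's data structures or the actual patch. The flag stands in for "some node already holds the mounted lock".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical roles, for illustration only. */
enum mount_role {
    ROLE_SPECTATOR,      /* never touches the mounted lock, never replays */
    ROLE_FIRST_MOUNTER,  /* got the mounted lock in EX: does journal recovery */
    ROLE_LATER_MOUNTER   /* lock already held by another non-spectator */
};

/*
 * Model of the mounted-lock check. *mounted_lock_taken is an assumed
 * stand-in for the cluster-wide state of the mounted lock.
 */
enum mount_role mount_role_for(bool spectator, bool *mounted_lock_taken)
{
    assert(mounted_lock_taken != NULL);

    /* Spectators skip the lock entirely, so they cannot block a real
     * first mounter from getting it in EX mode later. */
    if (spectator)
        return ROLE_SPECTATOR;

    if (!*mounted_lock_taken) {
        *mounted_lock_taken = true;
        return ROLE_FIRST_MOUNTER;
    }
    return ROLE_LATER_MOUNTER;
}
```

In this model, a spectator that mounts first leaves the flag untouched, so the next non-spectator still becomes the first mounter.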
    
    Note that the control lock still needs to be taken in PR mode
    in order to fetch the lvb value, so the node has the current
    recovery status of all journals. This is used as it is today by a first
    mounter to replay the journals. For spectators, it's merely
    used to fetch the status bits. All recovery is bypassed and the
    node waits until recovery is completed by a non-spectator node.
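
To illustrate how a spectator might use the status bits only for checking, here is a small sketch. The bitmap layout (one bit per journal ID, set while that journal still needs recovery) is an assumption made for this example, and the function names are hypothetical; the real lvb layout is defined by the gfs2 dlm locking code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumed layout: bit (jid % 8) of byte (jid / 8) is set while
 * journal `jid` still needs recovery.
 */
bool journal_needs_recovery(const unsigned char *bits, size_t nbytes,
                            unsigned int jid)
{
    assert(bits != NULL);

    size_t byte = jid / 8;
    if (byte >= nbytes)
        return false;   /* jid outside the bitmap: nothing recorded */
    return ((bits[byte] >> (jid % 8)) & 1u) != 0;
}

/* A spectator only waits: it proceeds once no bit is set any more. */
bool any_recovery_pending(const unsigned char *bits, size_t nbytes)
{
    assert(bits != NULL);

    for (size_t i = 0; i < nbytes; i++) {
        if (bits[i] != 0)
            return true;
    }
    return false;
}
```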
    
    I also improved the cryptic message given by control_mount when
    a spectator is waiting for a non-spectator to perform recovery.
    
    It also fixes a problem in gfs2_recover_set whereby spectators
    were never queueing recovery work for their own journal.
    They cannot do recovery themselves, but they still need to queue
    the work so they can check the recovery bits and clear the
    DFL_BLOCK_LOCKS bit once the recovery happens on another node.
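
The queueing behavior can be modeled roughly as follows. This is a sketch under assumptions, not the kernel code: `struct node_model`, `recover_set_model` and `run_recovery_work` are hypothetical, `block_locks` stands in for the DFL_BLOCK_LOCKS bit, and the non-spectator branch is simplified to "replay succeeds immediately".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node state, for illustration only. */
struct node_model {
    bool spectator;
    bool block_locks;   /* stands in for DFL_BLOCK_LOCKS */
    int  queued_work;   /* pending recovery work items */
};

/* After the fix, spectators queue work for their own journal too. */
void recover_set_model(struct node_model *n)
{
    assert(n != NULL);
    n->queued_work++;
}

/*
 * Run one queued work item. Returns true only if this node replayed
 * the journal itself. A spectator never replays: it checks whether the
 * recovery bits are clear and, if not, requeues itself ("busy").
 */
bool run_recovery_work(struct node_model *n, bool recovery_bits_clear)
{
    assert(n != NULL);

    if (n->queued_work == 0)
        return false;
    n->queued_work--;

    if (!n->spectator) {
        n->block_locks = false;   /* assumed: local replay succeeded */
        return true;
    }
    if (recovery_bits_clear)
        n->block_locks = false;   /* another node finished recovery */
    else
        n->queued_work++;         /* try again later */
    return false;
}
```

Without the queued work item, the spectator in this model would never reach the check that clears `block_locks`, which mirrors the problem the commit describes.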
    
    When the work queue runs on a spectator, it bypasses most of the
    work so it won't print a bunch of annoying messages. All it
    prints are messages like these, repeated until recovery
    completes on the non-spectator node:
    
    GFS2: fsid=mycluster:scratch.s: recover generation 3 jid 0
    GFS2: fsid=mycluster:scratch.s: recover jid 0 result busy
    
    These continue every 1.5 seconds until the recovery is done by
    the non-spectator, at which time it says:
    
    GFS2: fsid=mycluster:scratch.s: recover generation 4 done
    
    Then it proceeds with its mount.
    
    If the file system is mounted in spectator mode and the last
    remaining non-spectator is fenced, any IO to the file system is
    blocked by dlm and the spectator waits until recovery is
    performed by a non-spectator.
    
    If a spectator tries to mount the file system before any
    non-spectators, it blocks and repeatedly gives this kernel
    message:
    
    GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.
    GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.
    Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>