    md/r5cache: r5cache recovery: part 1 · b4c625c6
    Song Liu authored
    Recovery of the write-back cache uses different logic from recovery of
    the write-through-only cache. Specifically, for the write-back cache,
    recovery needs to scan through all active journal entries before
    flushing data out. Therefore, a large portion of the recovery logic is
    rewritten here.
    
    To make the diffs cleaner, we split the rewrite as follows:
    
    1. In this patch, we:
          - add new data to r5l_recovery_ctx
          - add new functions to recover the write-back cache
       The new functions are not used in this patch, so this patch does not
       change the behavior of recovery.
    
    2. In the next patch, we:
          - modify main recovery procedure r5l_recovery_log() to call new
            functions
          - remove old functions
    
    With the cache feature, there are two recovery scenarios:
    1. Data-Parity stripe: a stripe with complete parity in journal.
    2. Data-Only stripe: a stripe with only data in journal (or partial
       parity).
    
    The code differentiates Data-Parity stripes from Data-Only stripes
    using the flag STRIPE_R5C_CACHING.
    
    For Data-Parity stripes, we use the same procedure as raid5 journal,
    where all the data and parity are replayed to the RAID devices.
    
    For Data-Only stripes, we need to calculate the parity and finish the
    full reconstruct write or RMW write. For simplicity, during recovery
    we load the stripe into the stripe cache. Once the array is started,
    the stripe cache state machine handles these stripes through the
    normal write path.
    
    r5c_recovery_flush_log contains the main procedure of recovery. The
    recovery code first scans through the journal and loads data into the
    stripe cache. The code keeps track of all these stripes in a list
    (using sh->lru and ctx->cached_list); stripes in the list are ordered
    by their first appearance in the journal. During the scan, the
    recovery code assesses each stripe as Data-Parity or Data-Only.
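
    The scan described above can be sketched in plain user-space C. This
    is only an illustration, not the kernel code: every sketch_* name is
    hypothetical, a fixed array stands in for the sh->lru list, and the
    sketch ignores what happens when new data arrives for a stripe whose
    parity was already seen.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_STRIPES 8

/* Hypothetical, simplified journal entry: one payload for one stripe. */
struct sketch_entry {
    unsigned long stripe_sector; /* stripe the payload belongs to */
    bool is_parity;              /* true when the payload carries complete parity */
};

/* Hypothetical stand-in for a stripe tracked on ctx->cached_list. */
struct sketch_stripe {
    unsigned long sector;
    bool data_parity;            /* false means Data-Only */
};

/*
 * Walk the entries in journal order. A stripe is appended the first time
 * it is seen, so the output keeps first-appearance order. Returns the
 * number of stripes tracked, or -1 when the table is full.
 */
static int sketch_scan(const struct sketch_entry *log, size_t n,
                       struct sketch_stripe *out)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        int j;

        /* Look for a stripe that is already tracked. */
        for (j = 0; j < count; j++)
            if (out[j].sector == log[i].stripe_sector)
                break;

        if (j == count) {
            /* First appearance: append as Data-Only for now. */
            if (count == SKETCH_MAX_STRIPES)
                return -1;
            out[count].sector = log[i].stripe_sector;
            out[count].data_parity = false;
            count++;
        }

        /* Complete parity in the journal upgrades the stripe. */
        if (log[i].is_parity)
            out[j].data_parity = true;
    }
    return count;
}
```

    A linear search keeps the sketch short; it says nothing about how the
    kernel looks up stripes.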
    
    During the scan, the array may run out of stripe cache. In that case,
    the recovery code calls raid5_set_cache_size to increase the stripe
    cache size. If the array still runs out of stripe cache because there
    isn't enough memory, the array will not assemble.
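
    A minimal sketch of this grow-then-fail behaviour, in user-space C.
    The sketch_* names and the mem_limit field are hypothetical, and
    doubling the cache size is an assumption made for the example; the
    text above only says that the size is increased.

```c
#include <stdbool.h>

/* Hypothetical model of the stripe cache as a pool with a size limit. */
struct sketch_cache {
    int in_use;    /* stripes currently held by recovery */
    int size;      /* current stripe cache size */
    int mem_limit; /* largest size that available memory allows */
};

/* Stand-in for growing the cache (raid5_set_cache_size in the text). */
static bool sketch_grow(struct sketch_cache *c, int new_size)
{
    if (new_size <= c->size || new_size > c->mem_limit)
        return false;          /* cannot grow: treat as out of memory */
    c->size = new_size;
    return true;
}

/*
 * Take one stripe for recovery. When the cache is exhausted, try to
 * double it (an assumed policy). A false return means the caller should
 * abort recovery, so the array does not assemble.
 */
static bool sketch_get_stripe(struct sketch_cache *c)
{
    if (c->in_use == c->size && !sketch_grow(c, c->size * 2))
        return false;
    c->in_use++;
    return true;
}
```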
    
    At the end of scan, the recovery code replays all Data-Parity
    stripes, and sets proper states for Data-Only stripes. The recovery
    code also increases the seq number by 10 and rewrites all Data-Only
    stripes to the journal. This is to avoid confusion after repeated
    crashes. More details are given in raid5-cache.c, in the comment
    before r5c_recovery_rewrite_data_only_stripes().
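
    The end-of-scan step can be sketched as follows, again in user-space
    C with hypothetical sketch_* names. Boolean fields stand in for the
    real replay and rewrite I/O, and how the seq number advances between
    rewritten blocks is deliberately not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a stripe collected during the scan. */
struct sketch_stripe {
    bool data_parity;     /* false means Data-Only */
    bool replayed;        /* set when written to the RAID devices */
    bool rewritten;       /* set when rewritten to the journal */
    uint64_t rewrite_seq; /* seq number in effect at rewrite time */
};

/* Hypothetical recovery context holding only the seq number. */
struct sketch_ctx {
    uint64_t seq;
};

static void sketch_finish(struct sketch_ctx *ctx,
                          struct sketch_stripe *s, size_t n)
{
    /* Replay every Data-Parity stripe to the RAID devices. */
    for (size_t i = 0; i < n; i++)
        if (s[i].data_parity)
            s[i].replayed = true;

    /* Leave a gap of 10 before anything is written to the journal. */
    ctx->seq += 10;

    /* Rewrite every Data-Only stripe under the new seq number. */
    for (size_t i = 0; i < n; i++) {
        if (!s[i].data_parity) {
            s[i].rewritten = true;
            s[i].rewrite_seq = ctx->seq;
        }
    }
}
```
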
    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Shaohua Li <shli@fb.com>
raid5-cache.c 73.1 KB