    md/raid10: don't clear bitmap bit when bad-block-list write fails. · c340702c
    NeilBrown authored
    When a write fails and a bad-block-list is present, we can
    update the bad-block-list instead of writing the data.  If
    this succeeds then it is OK to clear the relevant bitmap-bit as
    no further 'sync' of the block is needed.
    
    However if writing the bad-block-list fails then we need to
    treat the write as failed and particularly must not clear
    the bitmap bit.  Otherwise the device can be re-added (after
    any hardware connection issues are resolved) and because the
    relevant bit in the bitmap is clear, that block will not be
    resynced.  This leads to data corruption.
    
    We already delay the final bio_endio() on the write until
    the bad-block-list is written so that when the write
    returns: either that data is safe, the bad-block record is
    safe, or the fact that the device is faulty is safe.
    However we *don't* delay the clearing of the bitmap, so the
    bitmap bit can be recorded as cleared before we know if the
    bad-block-list was written safely.
    
    So: delay the bitmap clearing until the write really is safe,
    i.e. move the call to close_write() to just before the call
    to bio_endio(), and recheck the 'is array degraded' status
    before making that call.
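    
    The ordering rule above can be sketched as a small standalone
    model.  This is only an illustration under stated assumptions:
    toy_may_clear_bitmap() and its parameters are hypothetical names,
    not code from raid10.c, and the real driver works with r10bio
    state bits rather than plain booleans.

```c
#include <stdbool.h>

/*
 * Hypothetical toy model of the decision described above; none of
 * these names exist in raid10.c.  It is meant to be evaluated only at
 * the point where the write is about to be completed, i.e. after any
 * bad-block-list update has been attempted.
 *
 * write_ok:        the data write itself succeeded
 * badblocks_saved: the write failed, but the bad-block-list update
 *                  was written safely
 * array_degraded:  the array is degraded at completion time (for
 *                  example because the device was marked faulty after
 *                  the bad-block-list write failed)
 *
 * Returns true when the bitmap bit may be cleared, i.e. when no
 * further resync of the block is needed.
 */
static bool toy_may_clear_bitmap(bool write_ok, bool badblocks_saved,
                                 bool array_degraded)
{
    /* Recheck the degraded status last, just before completion. */
    if (array_degraded)
        return false;

    /* Either the data is safe or the bad-block record is safe. */
    return write_ok || badblocks_saved;
}
```

    In this model a failed write whose bad-block record could not be
    saved always leaves the bit set, so a later re-add of the device
    would still resync that block.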
    
    This bug goes back to v3.1 when bad-block-lists were
    introduced, though it only affects arrays created with
    mdadm-3.3 or later as only those have bad-block lists.
    
    Backports will require at least
    Commit: 95af587e ("md/raid10: ensure device failure recorded before write request returns.")
    as well.  I'll send that to 'stable' separately.
    
    Note that of the two tests of R10BIO_WriteError that this
    patch adds, the first is certain to fail and the second is
    certain to succeed.  However doing it this way makes the
    patch more obviously correct.  I will tidy the code up in a
    future merge window.
    Reported-by: Nate Dailey <nate.dailey@stratus.com>
    Fixes: bd870a16 ("md/raid10:  Handle write errors by updating badblock log.")
    Signed-off-by: NeilBrown <neilb@suse.com>
raid10.c 127 KB