    md: avoid endless recovery loop when waiting for failed device to complete. · 4274215d
    NeilBrown authored
    If a device fails in a way that causes pending requests to take a while
    to complete, md will not be able to immediately remove it from the
    array in remove_and_add_spares.
    It will then incorrectly look like a spare device and md will try to
    recover it even though it is failed.
    This leads to a recovery process starting and instantly aborting over
    and over again.
    
    We should check if the device is faulty before considering it to be a
    spare.  This will avoid trying to start a recovery that cannot
    proceed.
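
    The check described above can be illustrated with a small user-space
    model. This is only a sketch under stated assumptions, not the kernel
    change itself: struct model_rdev, its fields, and
    count_recoverable_spares() are hypothetical names invented for the
    example. A device that still occupies a slot but is marked faulty is
    not counted as a spare, so no recovery is started on its behalf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a member device; NOT the kernel's
 * rdev structure. Field names are assumptions made for illustration. */
struct model_rdev {
    int  raid_disk;  /* slot in the array, or -1 if not attached */
    bool in_sync;    /* device holds up-to-date data */
    bool faulty;     /* device has failed (may still be attached while
                      * its pending requests drain) */
    bool blocked;    /* device is temporarily blocked */
};

/* Count devices that need recovery AND can actually be recovered.
 * Without the !faulty test, a failed device that has not yet been
 * removed from its slot would be counted as a spare. */
static int count_recoverable_spares(const struct model_rdev *devs, size_t n)
{
    int spares = 0;

    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct model_rdev *d = &devs[i];

        if (d->raid_disk >= 0 && !d->in_sync && !d->faulty && !d->blocked)
            spares++;
    }
    return spares;
}
```

    In this model, a caller would start recovery only when the returned
    count is greater than zero, which is what prevents the start-and-abort
    loop for a device that has failed but is still attached.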
    
    This bug was introduced in 2.6.26 so this patch is suitable for any
    kernel since then.
    
    Cc: stable@kernel.org
    Reported-by: Jim Paradis <james.paradis@stratus.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
md.c 193 KB