    md: Fix race when creating a new md device. · b0140891
    NeilBrown authored
    There is a race when creating an md device by opening /dev/mdXX.
    
    If two processes do this at much the same time they will follow the
    call path
      __blkdev_get -> get_gendisk -> kobj_lookup
    
    The first will call
      -> md_probe -> md_alloc -> add_disk -> blk_register_region
    
    and the race happens when the second gets to kobj_lookup after
    add_disk has called blk_register_region but before it returns to
    md_alloc.
    
    In this case the second will not call md_probe (as the probe is already
    done) but will get a handle on the gendisk and return to __blkdev_get,
    which will then call md_open (via the ->open pointer).
    
    As mddev->gendisk hasn't been set yet, md_open will think something is
    wrong and return with ERESTARTSYS.
    
    This can loop endlessly while the first thread makes no progress
    through add_disk.  Nothing is blocking it, but due to scheduler
    behaviour it doesn't get a turn.
    So this is essentially a live-lock.
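
    The interleaving described above can be modelled in user space.  This
    is only a single-threaded sketch, not kernel code: struct old_dev,
    old_register, old_assign_gendisk, old_open and OLD_ERESTART are
    hypothetical stand-ins for the md state, the two halves of the old
    ordering, md_open and -ERESTARTSYS.

```c
#include <stddef.h>

#define OLD_ERESTART (-1)   /* hypothetical stand-in for -ERESTARTSYS */

/* Hypothetical model of the state involved; not the kernel's struct mddev. */
struct old_dev {
    int registered;   /* models blk_register_region having run in add_disk */
    void *gendisk;    /* models mddev->gendisk */
};

static int old_disk;  /* dummy object standing in for the gendisk */

/* The point inside add_disk where lookups start finding the device. */
static void old_register(struct old_dev *d)
{
    d->registered = 1;
}

/* What the old md_alloc did only after add_disk had returned. */
static void old_assign_gendisk(struct old_dev *d)
{
    d->gendisk = &old_disk;
}

/* Models the second opener reaching md_open. */
static int old_open(const struct old_dev *d)
{
    if (!d->registered)
        return 1;              /* would go down the probe path itself */
    if (d->gendisk == NULL)
        return OLD_ERESTART;   /* "something is wrong", caller retries */
    return 0;
}

/* Count how many of 'attempts' opens fail while the first thread is stalled. */
static int old_retries_while_stalled(const struct old_dev *d, int attempts)
{
    int failed = 0;
    for (int i = 0; i < attempts; i++)
        if (old_open(d) == OLD_ERESTART)
            failed++;
    return failed;
}
```

    In this model every open attempted between old_register() and
    old_assign_gendisk() fails, however many times it is retried, which
    is the window the second process spins in.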
    
    We fix this by simply moving the assignment to mddev->gendisk before
    the call to add_disk() so md_open doesn't get confused.
    Also move blk_queue_flush earlier because add_disk should be as late
    as possible.
    
    To make sure that md_open doesn't complete until md_alloc has done all
    that is needed, we take mddev->open_mutex during the last part of
    md_alloc.  md_open will wait for this.
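
    The ordering described above can be sketched as a user-space model
    using pthreads.  It is an illustration under assumptions, not the
    actual md.c change: struct model_dev, model_alloc, model_open,
    model_demo and MODEL_ERESTART are hypothetical names, the 'visible'
    flag stands in for add_disk making the device findable, and
    'setup_done' stands in for the rest of md_alloc.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_ERESTART (-1)   /* hypothetical stand-in for -ERESTARTSYS */

/* Hypothetical model of the device state; not the kernel's struct mddev. */
struct model_dev {
    pthread_mutex_t open_mutex;
    void *gendisk;        /* assigned before the device becomes visible */
    atomic_int visible;   /* models add_disk having registered the region */
    int setup_done;       /* models the remaining work in md_alloc */
};

static int model_disk;    /* dummy object standing in for the gendisk */

/* Models the tail of md_alloc with the new ordering. */
static void model_alloc(struct model_dev *d)
{
    d->gendisk = &model_disk;             /* assign first */
    pthread_mutex_lock(&d->open_mutex);   /* hold openers back */
    atomic_store(&d->visible, 1);         /* "add_disk": now findable */
    d->setup_done = 1;                    /* finish the remaining setup */
    pthread_mutex_unlock(&d->open_mutex);
}

/* Models md_open: waits on open_mutex until the allocator has finished. */
static int model_open(struct model_dev *d)
{
    int done;

    if (d->gendisk == NULL)
        return MODEL_ERESTART;
    pthread_mutex_lock(&d->open_mutex);
    done = d->setup_done;
    pthread_mutex_unlock(&d->open_mutex);
    return done ? 0 : MODEL_ERESTART;
}

struct opener_arg {
    struct model_dev *dev;
    int result;
};

/* Second process: opens as soon as the device can be looked up. */
static void *opener_thread(void *p)
{
    struct opener_arg *arg = p;

    while (!atomic_load(&arg->dev->visible))
        sched_yield();
    arg->result = model_open(arg->dev);
    return NULL;
}

/* Runs one allocator and one opener; returns the opener's result. */
static int model_demo(void)
{
    struct model_dev dev = { .gendisk = NULL, .setup_done = 0 };
    struct opener_arg arg = { &dev, MODEL_ERESTART };
    pthread_t tid;
    int rc;

    pthread_mutex_init(&dev.open_mutex, NULL);
    atomic_init(&dev.visible, 0);
    rc = pthread_create(&tid, NULL, opener_thread, &arg);
    assert(rc == 0);
    model_alloc(&dev);
    pthread_join(tid, NULL);
    pthread_mutex_destroy(&dev.open_mutex);
    return arg.result;
}
```

    In this model the opener can only observe the device after gendisk
    has been assigned and the mutex is already held, so it blocks until
    setup is complete instead of failing and retrying.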
    
    This can cause a lock-up on boot so Cc:ing for stable.
    For 2.6.36 and earlier a different patch will be needed as the
    'blk_queue_flush' call isn't there.
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
    Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
    Cc: stable@kernel.org