    md: fix a potential deadlock of raid5/raid10 reshape · 8876391e
    BingJing Chang authored
    There is a potential deadlock if mount/umount happens when
    raid5_finish_reshape() tries to grow the size of emulated disk.
    
    How does the deadlock happen?
    1) The raid5 resync thread finished reshape (expanding array).
    2) The mount or umount thread holds the VFS sb->s_umount lock and tries
       to write critical data through to the raid5 emulated block device.
       So it waits for the raid5 kernel thread to handle stripes in order
       to finish its I/Os.
    3) In the routine of the raid5 kernel thread, md_check_recovery() is
       called first in order to reap the raid5 resync thread. That is,
       raid5_finish_reshape() is called. This function tries to update
       conf and calls VFS revalidate_disk() to grow the raid5 emulated
       block device, which in turn tries to acquire the VFS sb->s_umount
       lock.
    The raid5 kernel thread cannot continue, so no one can handle the
    mount/umount I/Os (stripes). Since the write-through I/Os cannot
    finish, mount/umount never releases the sb->s_umount lock, and the
    deadlock occurs.
    
    The raid5 kernel thread serves the emulated block device. It is
    responsible for handling I/Os (stripes) from upper layers. The
    emulated block device should not request any I/Os on itself. That is,
    it should not call VFS layer functions. (If it did, it would try to
    acquire VFS locks to guarantee the I/O ordering.) That is why we have
    the resync thread: it sends resync I/O requests and waits for the
    results.
    
    To solve this potential deadlock, we can make the size growth of the
    emulated block device the final step of the reshape thread.
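    
    The wait-for relationships described above can be sketched as a tiny
    graph model. This is only an illustration under stated assumptions:
    the actor names, struct wait_graph, has_deadlock(), before_fix() and
    after_fix() are hypothetical and are not kernel symbols. Each actor
    waits on at most one other actor, and a cycle in that graph stands
    for the deadlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Actors in the toy model (illustrative names, not kernel symbols). */
enum actor { MOUNT_THREAD, RAID5D_THREAD, RESHAPE_THREAD, NUM_ACTORS };

/* waits_for[a] is the actor that a is blocked on, or -1 if a can run. */
struct wait_graph {
	int waits_for[NUM_ACTORS];
};

/* Return true if following the wait-for edges from any actor leads back to it. */
bool has_deadlock(const struct wait_graph *g)
{
	assert(g != NULL);
	for (int start = 0; start < NUM_ACTORS; start++) {
		int cur = start;
		/* A cycle through 'start' must close within NUM_ACTORS steps. */
		for (int steps = 0; steps < NUM_ACTORS && cur != -1; steps++) {
			cur = g->waits_for[cur];
			if (cur == start)
				return true;
		}
	}
	return false;
}

/* Before the fix: the raid5 kernel thread grows the device itself. */
struct wait_graph before_fix(void)
{
	struct wait_graph g = { { -1, -1, -1 } };
	/* mount/umount holds sb->s_umount and waits for stripes to be handled */
	g.waits_for[MOUNT_THREAD] = RAID5D_THREAD;
	/* the raid5 kernel thread wants sb->s_umount via revalidate_disk() */
	g.waits_for[RAID5D_THREAD] = MOUNT_THREAD;
	return g;
}

/* After the fix: the reshape thread grows the device as its final step. */
struct wait_graph after_fix(void)
{
	struct wait_graph g = { { -1, -1, -1 } };
	/* mount/umount still waits for the raid5 kernel thread */
	g.waits_for[MOUNT_THREAD] = RAID5D_THREAD;
	/* only the reshape thread waits for sb->s_umount now */
	g.waits_for[RESHAPE_THREAD] = MOUNT_THREAD;
	return g;
}
```

    In this model, has_deadlock() reports a cycle for before_fix() and
    none for after_fix(): once the raid5 kernel thread no longer waits
    on sb->s_umount, it keeps handling stripes, mount/umount finishes
    and drops the lock, and the reshape thread can then grow the device.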
    
    2017/12/29:
    Thanks to Guoqing Jiang <gqjiang@suse.com>,
    we confirmed that the same deadlock issue exists in raid10. It is
    reproducible and can be fixed by this patch. For raid10.c, we can
    also remove the similar code to prevent the deadlock, since it has
    already been called before.
    Reported-by: Alex Wu <alexwu@synology.com>
    Reviewed-by: Alex Wu <alexwu@synology.com>
    Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
    Signed-off-by: BingJing Chang <bingjingc@synology.com>
    Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>