• Yu Kuai's avatar
    dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape · 41425f96
    Yu Kuai authored
    For raid456, if reshape is still in progress, then IO across reshape
    position will wait for reshape to make progress. However, for dm-raid,
    in following cases reshape will never make progress hence IO will hang:
    
    1) the array is read-only;
    2) MD_RECOVERY_WAIT is set;
    3) MD_RECOVERY_FROZEN is set;
    
    After commit c467e97f ("md/raid6: use valid sector values to determine
    if an I/O should wait on the reshape") fix the problem that IO across
    reshape position doesn't wait for reshape, the dm-raid test
    shell/lvconvert-raid-reshape.sh start to hang:
    
    [root@fedora ~]# cat /proc/979/stack
    [<0>] wait_woken+0x7d/0x90
    [<0>] raid5_make_request+0x929/0x1d70 [raid456]
    [<0>] md_handle_request+0xc2/0x3b0 [md_mod]
    [<0>] raid_map+0x2c/0x50 [dm_raid]
    [<0>] __map_bio+0x251/0x380 [dm_mod]
    [<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
    [<0>] __submit_bio+0xc2/0x1c0
    [<0>] submit_bio_noacct_nocheck+0x17f/0x450
    [<0>] submit_bio_noacct+0x2bc/0x780
    [<0>] submit_bio+0x70/0xc0
    [<0>] mpage_readahead+0x169/0x1f0
    [<0>] blkdev_readahead+0x18/0x30
    [<0>] read_pages+0x7c/0x3b0
    [<0>] page_cache_ra_unbounded+0x1ab/0x280
    [<0>] force_page_cache_ra+0x9e/0x130
    [<0>] page_cache_sync_ra+0x3b/0x110
    [<0>] filemap_get_pages+0x143/0xa30
    [<0>] filemap_read+0xdc/0x4b0
    [<0>] blkdev_read_iter+0x75/0x200
    [<0>] vfs_read+0x272/0x460
    [<0>] ksys_read+0x7a/0x170
    [<0>] __x64_sys_read+0x1c/0x30
    [<0>] do_syscall_64+0xc6/0x230
    [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74
    
    This is because reshape can't make progress.
    
    For md/raid, the problem doesn't exist because register new sync_thread
    doesn't rely on the IO to be done any more:
    
    1) If array is read-only, it can switch to read-write by ioctl/sysfs;
    2) md/raid never set MD_RECOVERY_WAIT;
    3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
       'reconfig_mutex', hence it can be cleared and reshape can continue by
       sysfs api 'sync_action'.
    
    However, I'm not sure yet how to avoid the problem in dm-raid yet. This
    patch on the one hand make sure raid_message() can't change
    sync_thread() through raid_message() after presuspend(), on the other
    hand detect the above 3 cases before wait for IO do be done in
    dm_suspend(), and let dm-raid requeue those IO.
    
    Cc: stable@vger.kernel.org # v6.7+
    Signed-off-by: default avatarYu Kuai <yukuai3@huawei.com>
    Signed-off-by: default avatarXiao Ni <xni@redhat.com>
    Acked-by: default avatarMike Snitzer <snitzer@kernel.org>
    Signed-off-by: default avatarSong Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
    41425f96
md.h 29.7 KB