1. 07 Mar, 2024 1 commit
    • Merge tag 'nvme-6.9-2024-03-07' of git://git.infradead.org/nvme into for-6.9/block · 0f7223a3
      Jens Axboe authored
      Pull NVMe updates from Keith:
      
      "nvme updates for Linux 6.9
      
       - RDMA target enhancements (Max)
       - Fabrics fixes (Max, Guixin, Hannes)
       - Atomic queue_limits usage (Christoph)
       - Const use for class_register (Ricardo)
       - Identification error handling fixes (Shin'ichiro, Keith)"
      
      * tag 'nvme-6.9-2024-03-07' of git://git.infradead.org/nvme: (31 commits)
        nvme: clear caller pointer on identify failure
        nvme: host: fix double-free of struct nvme_id_ns in ns_update_nuse()
        nvme: fcloop: make fcloop_class constant
        nvme: fabrics: make nvmf_class constant
        nvme: core: constify struct class usage
        nvme-fabrics: typo in nvmf_parse_key()
        nvme-multipath: use atomic queue limits API for stacking limits
        nvme-multipath: pass queue_limits to blk_alloc_disk
        nvme: use the atomic queue limits update API
        nvme: cleanup nvme_configure_metadata
        nvme: don't query identify data in configure_metadata
        nvme: split out a nvme_identify_ns_nvm helper
        nvme: move common logic into nvme_update_ns_info
        nvme: move setting the write cache flags out of nvme_set_queue_limits
        nvme: move a few things out of nvme_update_disk_info
        nvme: don't use nvme_update_disk_info for the multipath disk
        nvme: move blk_integrity_unregister into nvme_init_integrity
        nvme: cleanup the nvme_init_integrity calling conventions
        nvme: move max_integrity_segments handling out of nvme_init_integrity
        nvme: remove nvme_revalidate_zones
        ...
  2. 06 Mar, 2024 36 commits
  3. 05 Mar, 2024 3 commits
    • Merge branch 'dmraid-fix-6.9' into md-6.9 · 3a889fdc
      Song Liu authored
      This is the second half of fixes for dmraid. The first half is available
      at [1].
      
      This set contains the following fixes:
       - reshape can start unexpectedly and cause data corruption (patches 1, 5, 6);
       - a deadlock when reshape runs concurrently with IO (patch 8);
       - a lockdep warning (patch 9).
      
      For all the dmraid-related tests in the lvm2 suite, there are no new
      regressions compared against the 6.6 kernels (a good baseline from
      before the recent regressions).
      
      [1] https://lore.kernel.org/all/CAPhsuW7u1UKHCDOBDhD7DzOVtkGemDz_QnJ4DUq_kSN-Q3G66Q@mail.gmail.com/
      
      * dmraid-fix-6.9:
        dm-raid: fix lockdep warning in "pers->hot_add_disk"
        dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape
        dm-raid: add a new helper prepare_suspend() in md_personality
        md/dm-raid: don't call md_reap_sync_thread() directly
        dm-raid: really frozen sync_thread during suspend
        md: add a new helper reshape_interrupted()
        md: export helper md_is_rdwr()
        md: export helpers to stop sync_thread
        md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resume
    • dm-raid: fix lockdep warning in "pers->hot_add_disk" · 95009ae9
      Yu Kuai authored
      The lockdep assert was added by commit a448af25 ("md/raid10: remove
      rcu protection to access rdev from conf") in print_conf(), and I didn't
      notice that dm-raid calls "pers->hot_add_disk" without holding
      'reconfig_mutex'.

      "pers->hot_add_disk" reads and writes many fields that are protected by
      'reconfig_mutex', and raid_resume() already grabs the lock in another
      context. Hence fix this problem by protecting "pers->hot_add_disk"
      with the lock, as sketched below.
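
      A minimal sketch of that locking pattern (hot_add_disk_locked() is a
      hypothetical wrapper for illustration, not the literal patch;
      mddev_lock_nointr() and mddev_unlock() are the existing md helpers
      that take and release mddev->reconfig_mutex):

      /* Hypothetical wrapper: hold 'reconfig_mutex' around the callback. */
      static int hot_add_disk_locked(struct mddev *mddev, struct md_rdev *rdev)
      {
              int ret;

              mddev_lock_nointr(mddev);       /* acquire reconfig_mutex */
              ret = mddev->pers->hot_add_disk(mddev, rdev);
              mddev_unlock(mddev);            /* release reconfig_mutex */
              return ret;
      }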
      
      Fixes: 9092c02d ("DM RAID: Add ability to restore transiently failed devices on resume")
      Fixes: a448af25 ("md/raid10: remove rcu protection to access rdev from conf")
      Cc: stable@vger.kernel.org # v6.7+
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com
    • dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape · 41425f96
      Yu Kuai authored
      For raid456, if reshape is still in progress, then IO across the reshape
      position will wait for reshape to make progress. However, for dm-raid,
      in the following cases reshape will never make progress, hence IO will hang:
      
      1) the array is read-only;
      2) MD_RECOVERY_WAIT is set;
      3) MD_RECOVERY_FROZEN is set;
      
      After commit c467e97f ("md/raid6: use valid sector values to determine
      if an I/O should wait on the reshape") fixed the problem that IO across
      the reshape position doesn't wait for reshape, the dm-raid test
      shell/lvconvert-raid-reshape.sh started to hang:
      
      [root@fedora ~]# cat /proc/979/stack
      [<0>] wait_woken+0x7d/0x90
      [<0>] raid5_make_request+0x929/0x1d70 [raid456]
      [<0>] md_handle_request+0xc2/0x3b0 [md_mod]
      [<0>] raid_map+0x2c/0x50 [dm_raid]
      [<0>] __map_bio+0x251/0x380 [dm_mod]
      [<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
      [<0>] __submit_bio+0xc2/0x1c0
      [<0>] submit_bio_noacct_nocheck+0x17f/0x450
      [<0>] submit_bio_noacct+0x2bc/0x780
      [<0>] submit_bio+0x70/0xc0
      [<0>] mpage_readahead+0x169/0x1f0
      [<0>] blkdev_readahead+0x18/0x30
      [<0>] read_pages+0x7c/0x3b0
      [<0>] page_cache_ra_unbounded+0x1ab/0x280
      [<0>] force_page_cache_ra+0x9e/0x130
      [<0>] page_cache_sync_ra+0x3b/0x110
      [<0>] filemap_get_pages+0x143/0xa30
      [<0>] filemap_read+0xdc/0x4b0
      [<0>] blkdev_read_iter+0x75/0x200
      [<0>] vfs_read+0x272/0x460
      [<0>] ksys_read+0x7a/0x170
      [<0>] __x64_sys_read+0x1c/0x30
      [<0>] do_syscall_64+0xc6/0x230
      [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74
      
      This is because reshape can't make progress.
      
      For md/raid, the problem doesn't exist because registering a new
      sync_thread doesn't rely on the IO being done anymore:

      1) If the array is read-only, it can be switched to read-write via
         ioctl/sysfs;
      2) md/raid never sets MD_RECOVERY_WAIT;
      3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
         'reconfig_mutex', hence the flag can be cleared and reshape can
         continue via the sysfs API 'sync_action'.
      
      However, it's not yet clear how to avoid the problem in dm-raid
      entirely. This patch, on the one hand, makes sure the sync_thread
      can't be changed through raid_message() after presuspend(); on the
      other hand, it detects the above three cases (sketched below) before
      waiting for IO to be done in dm_suspend(), and lets dm-raid requeue
      that IO.
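
      A hedged sketch of detecting those three cases (reshape_is_stuck() is
      an illustrative name, not the helper this series actually adds;
      md_is_rdwr() is exported by this series, and MD_RECOVERY_WAIT /
      MD_RECOVERY_FROZEN are existing md recovery flags):

      /* Illustrative only: true if reshape cannot make progress. */
      static bool reshape_is_stuck(struct mddev *mddev)
      {
              return !md_is_rdwr(mddev) ||                           /* case 1 */
                     test_bit(MD_RECOVERY_WAIT, &mddev->recovery) || /* case 2 */
                     test_bit(MD_RECOVERY_FROZEN, &mddev->recovery); /* case 3 */
      }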
      
      Cc: stable@vger.kernel.org # v6.7+
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com