1. 24 Nov, 2022 1 commit
    • Damien Le Moal's avatar
      block: mq-deadline: Fix dd_finish_request() for zoned devices · 2820e5d0
      Damien Le Moal authored
      dd_finish_request() tests if the per prio fifo_list is not empty to
      determine if request dispatching must be restarted for handling blocked
      write requests to zoned devices with a call to
      blk_mq_sched_mark_restart_hctx(). While simple, this implementation has
      2 problems:
      
      1) Only the priority level of the completed request is considered.
         However, writes to a zone may be blocked due to other writes to the
         same zone using a different priority level. While this is unlikely to
         happen in practice, as writing a zone with different IO priorirites
         does not make sense, nothing in the code prevents this from
         happening.
      2) The use of list_empty() is dangerous as dd_finish_request() does not
         take dd->lock and may run concurrently with the insert and dispatch
         code.
      
      Fix these 2 problems by testing the write fifo list of all priority
      levels using the new helper dd_has_write_work(), and by testing each
      fifo list using list_empty_careful().
      
      Fixes: c807ab52 ("block/mq-deadline: Add I/O priority support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarDamien Le Moal <damien.lemoal@opensource.wdc.com>
      Reviewed-by: default avatarJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.wdc.comSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      2820e5d0
  2. 23 Nov, 2022 9 commits
  3. 22 Nov, 2022 1 commit
  4. 21 Nov, 2022 3 commits
  5. 16 Nov, 2022 23 commits
  6. 14 Nov, 2022 3 commits
    • Jens Axboe's avatar
      Merge branch 'md-next' of... · 5626196a
      Jens Axboe authored
      Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.2/block
      
      Pull MD fixes from Song.
      
      * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid1: stop mdx_raid1 thread when raid1 array run failed
        md/raid5: use bdev_write_cache instead of open coding it
        md: fix a crash in mempool_free
        md/raid0, raid10: Don't set discard sectors for request queue
        md/bitmap: Fix bitmap chunk size overflow issues
        md: introduce md_ro_state
        md: factor out __md_set_array_info()
        lib/raid6: drop RAID6_USE_EMPTY_ZERO_PAGE
        raid5-cache: use try_cmpxchg in r5l_wake_reclaim
        drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()
      5626196a
    • Jiang Li's avatar
      md/raid1: stop mdx_raid1 thread when raid1 array run failed · b611ad14
      Jiang Li authored
      fail run raid1 array when we assemble array with the inactive disk only,
      but the mdx_raid1 thread were not stop, Even if the associated resources
      have been released. it will caused a NULL dereference when we do poweroff.
      
      This causes the following Oops:
          [  287.587787] BUG: kernel NULL pointer dereference, address: 0000000000000070
          [  287.594762] #PF: supervisor read access in kernel mode
          [  287.599912] #PF: error_code(0x0000) - not-present page
          [  287.605061] PGD 0 P4D 0
          [  287.607612] Oops: 0000 [#1] SMP NOPTI
          [  287.611287] CPU: 3 PID: 5265 Comm: md0_raid1 Tainted: G     U            5.10.146 #0
          [  287.619029] Hardware name: xxxxxxx/To be filled by O.E.M, BIOS 5.19 06/16/2022
          [  287.626775] RIP: 0010:md_check_recovery+0x57/0x500 [md_mod]
          [  287.632357] Code: fe 01 00 00 48 83 bb 10 03 00 00 00 74 08 48 89 ......
          [  287.651118] RSP: 0018:ffffc90000433d78 EFLAGS: 00010202
          [  287.656347] RAX: 0000000000000000 RBX: ffff888105986800 RCX: 0000000000000000
          [  287.663491] RDX: ffffc90000433bb0 RSI: 00000000ffffefff RDI: ffff888105986800
          [  287.670634] RBP: ffffc90000433da0 R08: 0000000000000000 R09: c0000000ffffefff
          [  287.677771] R10: 0000000000000001 R11: ffffc90000433ba8 R12: ffff888105986800
          [  287.684907] R13: 0000000000000000 R14: fffffffffffffe00 R15: ffff888100b6b500
          [  287.692052] FS:  0000000000000000(0000) GS:ffff888277f80000(0000) knlGS:0000000000000000
          [  287.700149] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          [  287.705897] CR2: 0000000000000070 CR3: 000000000320a000 CR4: 0000000000350ee0
          [  287.713033] Call Trace:
          [  287.715498]  raid1d+0x6c/0xbbb [raid1]
          [  287.719256]  ? __schedule+0x1ff/0x760
          [  287.722930]  ? schedule+0x3b/0xb0
          [  287.726260]  ? schedule_timeout+0x1ed/0x290
          [  287.730456]  ? __switch_to+0x11f/0x400
          [  287.734219]  md_thread+0xe9/0x140 [md_mod]
          [  287.738328]  ? md_thread+0xe9/0x140 [md_mod]
          [  287.742601]  ? wait_woken+0x80/0x80
          [  287.746097]  ? md_register_thread+0xe0/0xe0 [md_mod]
          [  287.751064]  kthread+0x11a/0x140
          [  287.754300]  ? kthread_park+0x90/0x90
          [  287.757974]  ret_from_fork+0x1f/0x30
      
      In fact, when raid1 array run fail, we need to do
      md_unregister_thread() before raid1_free().
      Signed-off-by: default avatarJiang Li <jiang.li@ugreen.com>
      Signed-off-by: default avatarSong Liu <song@kernel.org>
      b611ad14
    • Christoph Hellwig's avatar
      md/raid5: use bdev_write_cache instead of open coding it · ad831a16
      Christoph Hellwig authored
      Use the bdev_write_cache instead of two equivalent open coded checks.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSong Liu <song@kernel.org>
      ad831a16