1. 24 Nov, 2022 3 commits
    • drbd: remove call to memset before free device/resource/connection · 6e7b854e
      Wang ShaoBo authored
      This reverts c2258ffc ("drbd: poison free'd device, resource and
      connection structs"). Adding a memset here for debugging is odd; there
      are other methods that accurately show what happened, such as kdump.
      Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Link: https://lore.kernel.org/r/20221124015817.2729789-2-bobo.shaobowang@huawei.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: mq-deadline: Do not break sequential write streams to zoned HDDs · 015d02f4
      Damien Le Moal authored
      mq-deadline ensures in-order dispatching of write requests to zoned
      block devices using a per-zone lock (a bit). This implies that for any
      purely sequential write workload, the drive is exercised most of the
      time at a maximum queue depth of one.
      
      However, when such a sequential write workload crosses a zone boundary
      (when sequentially writing multiple contiguous zones), zone write
      locking may prevent the last write to one zone from being issued (as the
      previous write is still being executed) but allow the first write to the
      following zone to be issued (as that zone is not yet being written and
      not locked). This results in an out-of-order delivery of the sequential
      write commands to the device every time a zone boundary is crossed.
      
      While such behavior does not break the sequential write constraint of
      zoned block devices (and does not generate any write error), some zoned
      hard-disks react badly to seeing these out of order writes, resulting in
      lower write throughput.
      
      This problem can be addressed by always dispatching the first request
      of a stream of sequential write requests, regardless of the zones
      targeted by these sequential writes. To do so, the function
      deadline_skip_seq_writes() is introduced and used in
      deadline_next_request() to select the next write command to issue if the
      target device is an HDD (blk_queue_nonrot() being false).
      deadline_fifo_request() is modified using the new
      deadline_earlier_request() and deadline_is_seq_write() helpers to ignore
      requests in the fifo list that have a preceding request in LBA order
      that is sequential.
      
      With this fix, a sequential write workload executed with the following
      fio command:
      
      fio  --name=seq-write --filename=/dev/sda --zonemode=zbd --direct=1 \
           --size=68719476736  --ioengine=libaio --iodepth=32 --rw=write \
           --bs=65536
      
      results in the write throughput of an SMR HDD rising from 225 MB/s to
      250 MB/s (an 11% increase).
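
The sequential-write test that the commit message describes can be illustrated with a small userspace sketch. It is not the kernel code: `struct sketch_rq`, `sketch_is_seq_write()` and `sketch_stream_head()` are hypothetical names, and the assumption is simply that a write counts as sequential when it starts at the sector where the preceding request (in LBA order) ends, regardless of which zone either request targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a block layer request. */
struct sketch_rq {
	uint64_t pos;     /* start sector */
	uint32_t sectors; /* length in sectors */
};

/*
 * Return true if rq starts exactly where prev ends, i.e. rq continues a
 * sequential write stream. prev is the preceding request in LBA order
 * and may be NULL when there is none.
 */
static bool sketch_is_seq_write(const struct sketch_rq *prev,
				const struct sketch_rq *rq)
{
	assert(rq != NULL);
	if (prev == NULL)
		return false;
	return prev->pos + prev->sectors == rq->pos;
}

/*
 * Given n writes sorted by start sector, return the index of the first
 * request of the sequential stream containing rqs[idx], by walking back
 * while the predecessor is contiguous. Zone boundaries are deliberately
 * ignored here.
 */
static size_t sketch_stream_head(const struct sketch_rq *rqs, size_t n,
				 size_t idx)
{
	assert(idx < n);
	while (idx > 0 && sketch_is_seq_write(&rqs[idx - 1], &rqs[idx]))
		idx--;
	return idx;
}
```

Under these assumptions, a scheduler that always picks the request at `sketch_stream_head()` never sends the first write of the next zone ahead of the last write of the current zone.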
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Link: https://lore.kernel.org/r/20221124021208.242541-3-damien.lemoal@opensource.wdc.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: mq-deadline: Fix dd_finish_request() for zoned devices · 2820e5d0
      Damien Le Moal authored
      dd_finish_request() tests if the per prio fifo_list is not empty to
      determine if request dispatching must be restarted for handling blocked
      write requests to zoned devices with a call to
      blk_mq_sched_mark_restart_hctx(). While simple, this implementation has
      2 problems:
      
      1) Only the priority level of the completed request is considered.
         However, writes to a zone may be blocked due to other writes to the
         same zone using a different priority level. While this is unlikely to
         happen in practice, as writing a zone with different IO priorities
         does not make sense, nothing in the code prevents this from
         happening.
      2) The use of list_empty() is dangerous as dd_finish_request() does not
         take dd->lock and may run concurrently with the insert and dispatch
         code.
      
      Fix these 2 problems by testing the write fifo list of all priority
      levels using the new helper dd_has_write_work(), and by testing each
      fifo list using list_empty_careful().
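
As a rough illustration of the fix described above, the following userspace sketch checks the write fifo of every priority level with a "careful" emptiness test. All `sketch_` names are hypothetical, the count of three priority levels is an assumption, and the memory-ordering primitives that the kernel's `list_empty_careful()` relies on are omitted, so this only shows the shape of the logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PRIO_COUNT 3 /* assumed number of priority levels */

/* Minimal circular doubly linked list node, modelled on list_head. */
struct sketch_list {
	struct sketch_list *next, *prev;
};

static void sketch_list_init(struct sketch_list *head)
{
	head->next = head;
	head->prev = head;
}

static void sketch_list_add_tail(struct sketch_list *entry,
				 struct sketch_list *head)
{
	assert(entry != head);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Check both links of the head, in the spirit of list_empty_careful().
 * The kernel version additionally uses ordered loads; this plain C
 * version is only safe single-threaded.
 */
static bool sketch_list_empty_careful(const struct sketch_list *head)
{
	const struct sketch_list *next = head->next;

	return next == head && next == head->prev;
}

/* Hypothetical scheduler data: one write fifo per priority level. */
struct sketch_dd {
	struct sketch_list write_fifo[SKETCH_PRIO_COUNT];
};

/* True if any priority level has a queued write. */
static bool sketch_has_write_work(const struct sketch_dd *dd)
{
	assert(dd != NULL);
	for (int p = 0; p < SKETCH_PRIO_COUNT; p++)
		if (!sketch_list_empty_careful(&dd->write_fifo[p]))
			return true;
	return false;
}
```

Checking every level rather than only the completed request's level is what addresses problem 1; the two-link emptiness check stands in for the approach taken for problem 2.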
      
      Fixes: c807ab52 ("block/mq-deadline: Add I/O priority support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.wdc.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 23 Nov, 2022 9 commits
  3. 22 Nov, 2022 1 commit
  4. 21 Nov, 2022 3 commits
  5. 16 Nov, 2022 23 commits
  6. 14 Nov, 2022 1 commit
    • Merge branch 'md-next' of... · 5626196a
      Jens Axboe authored
      Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.2/block
      
      Pull MD fixes from Song.
      
      * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid1: stop mdx_raid1 thread when raid1 array run failed
        md/raid5: use bdev_write_cache instead of open coding it
        md: fix a crash in mempool_free
        md/raid0, raid10: Don't set discard sectors for request queue
        md/bitmap: Fix bitmap chunk size overflow issues
        md: introduce md_ro_state
        md: factor out __md_set_array_info()
        lib/raid6: drop RAID6_USE_EMPTY_ZERO_PAGE
        raid5-cache: use try_cmpxchg in r5l_wake_reclaim
        drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()