1. 02 Oct, 2020 17 commits
  2. 29 Sep, 2020 1 commit
    • null_blk: add support for max open/active zone limit for zoned devices · dc4d137e
      Niklas Cassel authored
      Add support for user space to set a max open zone limit and a max active
      zone limit via configfs. Both limits default to 0, which means no limit.
      
      Call the block layer API functions used for exposing the configured
      limits to sysfs.
      
      Add accounting in null_blk_zoned so that these new limits are respected.
      Performing an operation that would exceed these limits results in a
      standard I/O error.
      
      A max open zone limit exists in the ZBC standard.
      While null_blk_zoned is used to test the Zoned Block Device model in
      Linux, when it comes to differences between ZBC and ZNS, null_blk_zoned
      mostly follows ZBC.
      
      Therefore, implement the manage open zone resources function from ZBC,
      but additionally add support for max active zones.
      This enables user space not only to test against a device with an open
      zone limit, but also to test against a device with an active zone limit.
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
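The accounting described in the null_blk commit above can be sketched roughly as follows. This is a minimal illustration, not the null_blk_zoned code: the struct, its fields, and the function names are all hypothetical, and it assumes that a zone counts as active while it is either open or closed, and that a limit of 0 means no limit.

```c
#include <stdbool.h>

/* Hypothetical per-device accounting state; all names are illustrative. */
struct zone_limits {
    unsigned int max_open;    /* 0 means no limit */
    unsigned int max_active;  /* 0 means no limit */
    unsigned int nr_open;     /* zones currently open */
    unsigned int nr_closed;   /* zones currently closed (still active) */
};

/* Assumption: active zones are those that are open or closed. */
bool can_activate_zone(const struct zone_limits *l)
{
    if (l->max_active == 0)
        return true;
    return l->nr_open + l->nr_closed < l->max_active;
}

bool can_open_zone(const struct zone_limits *l)
{
    if (l->max_open == 0)
        return true;
    return l->nr_open < l->max_open;
}

/*
 * Open an empty zone. Returns 0 on success, or -1 (standing in for an
 * I/O error) if either limit would be exceeded.
 */
int open_empty_zone(struct zone_limits *l)
{
    if (!can_activate_zone(l) || !can_open_zone(l))
        return -1;
    l->nr_open++;
    return 0;
}

/* Close an open zone: it frees an open slot but remains active. */
void close_zone(struct zone_limits *l)
{
    l->nr_open--;
    l->nr_closed++;
}
```

With max_open = 1 and max_active = 2, a second open fails until the first zone is closed, and a third zone cannot be opened at all because two zones are already active.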
  3. 28 Sep, 2020 1 commit
    • Merge tag 'nvme-5.10-2020-09-27' of git://git.infradead.org/nvme into for-5.10/drivers · 1ed4211d
      Jens Axboe authored
      Pull NVMe updates from Christoph:
      
      "nvme updates for 5.10
      
       - fix keep alive timer modification (Amit Engel)
       - order the PCI ID list more sensibly (Andy Shevchenko)
       - cleanup the open by controller helper (Chaitanya Kulkarni)
       - use an xarray for the CSE log lookup (Chaitanya Kulkarni)
       - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni)
       - fix nvme_ns_report_zones (me)
       - add a sanity check to nvmet-fc (James Smart)
       - fix interrupt allocation when too many polled queues are specified
         (Jeffle Xu)
       - small nvmet-tcp optimization (Mark Wunderlich)"
      
      * tag 'nvme-5.10-2020-09-27' of git://git.infradead.org/nvme:
        nvme-pci: allocate separate interrupt for the reserved non-polled I/O queue
        nvme: fix error handling in nvme_ns_report_zones
        nvmet-fc: fix missing check for no hostport struct
        nvmet: add passthru ZNS support
        nvmet: handle keep-alive timer when kato is modified by a set features cmd
        nvmet-tcp: have queue io_work context run on sock incoming cpu
        nvme-pci: Move enumeration by class to be last in the table
        nvme: use an xarray to lookup the Commands Supported and Effects log
        nvme: lift the file open code from nvme_ctrl_get_by_path
  4. 27 Sep, 2020 9 commits
  5. 25 Sep, 2020 1 commit
    • Merge branch 'md-next' of... · 163090c1
      Jens Axboe authored
      Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.10/drivers
      
      Pull MD updates from Song.
      
      * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid10: improve discard request for far layout
        md/raid10: improve raid10 discard request
        md/raid10: pull codes that wait for blocked dev into one function
        md/raid10: extend r10bio devs to raid disks
        md: add md_submit_discard_bio() for submitting discard bio
        md: Simplify code with existing definition RESYNC_SECTORS in raid10.c
        md/raid5: reallocate page array after setting new stripe_size
        md/raid5: resize stripe_head when reshape array
        md/raid5: let multiple devices of stripe_head share page
        md/raid6: let async recovery function support different page offset
        md/raid6: let syndrome computor support different page offset
        md/raid5: convert to new xor compution interface
        md/raid5: add new xor function to support different page offset
        md/raid5: make async_copy_data() to support different page offset
        md/raid5: add a new member of offset into r5dev
        md: only calculate blocksize once and use i_blocksize()
  6. 24 Sep, 2020 11 commits
    • md/raid10: improve discard request for far layout · d3ee2d84
      Xiao Ni authored
      For the far layout, the discard region is not contiguous on the disks, so
      one r10bio per far copy is needed to cover all regions, and there must be
      a way to tell whether all of the r10bios have finished. As in
      raid10_sync_request, only the master_bio of the first r10bio records the
      discard bio; the master_bio of every other r10bio records the first
      r10bio. The first r10bio finishes only after the other r10bios have
      finished, and then returns the discard bio.
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
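The completion scheme described in the far-layout commit above can be illustrated with a small counting sketch. It is a simplification under stated assumptions: the struct and function names are hypothetical, a plain int stands in for whatever atomic counter the driver uses, and the original discard bio is reduced to a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an r10bio taking part in one discard. */
struct discard_req {
    struct discard_req *first; /* NULL on the first request, else points to it */
    int remaining;             /* outstanding requests; used on the first only */
    bool bio_done;             /* set on the first request when the discard ends */
};

/* Link n requests (n >= 1) so that every later one refers to the first. */
void discard_chain_init(struct discard_req *reqs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        reqs[i].first = (i == 0) ? NULL : &reqs[0];
        reqs[i].remaining = 0;
        reqs[i].bio_done = false;
    }
    reqs[0].remaining = (int)n;
}

/*
 * Complete one request. Returns true when this completion was the last
 * one, at which point the first request reports the discard as done.
 */
bool discard_req_complete(struct discard_req *r)
{
    struct discard_req *first = r->first ? r->first : r;

    if (--first->remaining == 0) {
        first->bio_done = true;
        return true;
    }
    return false;
}
```

The order in which the requests complete does not matter here: the discard is reported as done only when the last of them finishes.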
    • md/raid10: improve raid10 discard request · bcc90d28
      Xiao Ni authored
      Currently the discard request is split by chunk size, so mkfs takes a
      long time on disks that support discard. This patch improves the handling
      of raid10 discard requests, using an approach similar to patch
      29efc390 (md/md0: optimize raid0 discard handling).

      However, it is a little more complex than raid0 because raid10 has
      different layouts. If raid10 uses the offset layout and the discard
      request is smaller than the stripe size, there are holes when we submit
      the discard bio to the underlying disks.
      
      For example: five disks (disk1 - disk5)
      D01 D02 D03 D04 D05
      D05 D01 D02 D03 D04
      D06 D07 D08 D09 D10
      D10 D06 D07 D08 D09
      The discard bio only wants to discard from D03 to D10. For disk3, there
      is a hole between D03 and D08. For disk4, there is a hole between D04 and
      D09. D03 is a single chunk, and raid10_write_request can handle one chunk
      perfectly, so the part that is not aligned with the stripe size is still
      handled by raid10_write_request.

      If a reshape is running when the discard bio arrives and the discard bio
      spans the reshape position, raid10_write_request is responsible for
      handling this discard bio.
      
      I did a test with this patch set.
      Without patch:
      time mkfs.xfs /dev/md0
      real    4m39.775s
      user    0m0.000s
      sys     0m0.298s

      With patch:
      time mkfs.xfs /dev/md0
      real    0m0.105s
      user    0m0.000s
      sys     0m0.007s
      
      nvme3n1           259:1    0   477G  0 disk
      └─nvme3n1p1       259:10   0    50G  0 part
      nvme4n1           259:2    0   477G  0 disk
      └─nvme4n1p1       259:11   0    50G  0 part
      nvme5n1           259:6    0   477G  0 disk
      └─nvme5n1p1       259:12   0    50G  0 part
      nvme2n1           259:9    0   477G  0 disk
      └─nvme2n1p1       259:15   0    50G  0 part
      nvme0n1           259:13   0   477G  0 disk
      └─nvme0n1p1       259:14   0    50G  0 part
      Reviewed-by: Coly Li <colyli@suse.de>
      Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
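The idea of leaving the stripe-unaligned part of a discard to raid10_write_request, as described in the commit above, amounts to splitting the request at stripe boundaries. The following is only a sketch of that arithmetic: the struct and function names are hypothetical, units are left abstract (sectors or chunks), and it does not reproduce how raid10.c actually derives the stripe size from the array geometry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of splitting a discard range at stripe boundaries. */
struct discard_split {
    uint64_t aligned_start; /* first unit of the stripe-aligned middle */
    uint64_t aligned_end;   /* one past the last unit of the middle */
    bool has_aligned;       /* false if the range covers no whole stripe */
};

/*
 * Split the half-open range [start, end) into an unaligned head, a
 * stripe-aligned middle, and an unaligned tail. Only the middle is
 * returned; the head and tail would go through the regular write path.
 * stripe_size must be non-zero.
 */
struct discard_split split_discard(uint64_t start, uint64_t end,
                                   uint64_t stripe_size)
{
    struct discard_split s;

    /* Round start up and end down to a stripe boundary. */
    s.aligned_start = (start + stripe_size - 1) / stripe_size * stripe_size;
    s.aligned_end = end / stripe_size * stripe_size;
    s.has_aligned = s.aligned_start < s.aligned_end;
    return s;
}
```

For the five-disk example in the commit message, counting chunks from zero and taking a stripe as five chunks, a discard of D03 through D10 is the range [2, 10). The aligned middle is [5, 10), that is D06 through D10, and D03 through D05 is the unaligned head.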
    • md/raid10: pull codes that wait for blocked dev into one function · f046f5d0
      Xiao Ni authored
      The following patch will reuse this logic, so pull the duplicated code
      into one function.
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • md/raid10: extend r10bio devs to raid disks · 8650a889
      Xiao Ni authored
      Currently r10bio->devs[conf->copies] is allocated. A discard bio must be
      submitted to all member disks and needs to use an r10bio, so extend the
      array to r10bio->devs[geo.raid_disks].
      Reviewed-by: Coly Li <colyli@suse.de>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • md: add md_submit_discard_bio() for submitting discard bio · 2628089b
      Xiao Ni authored
      Move this logic from raid0.c to md.c so that we can also use it in
      raid10.c.
      Reviewed-by: Coly Li <colyli@suse.de>
      Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • md: Simplify code with existing definition RESYNC_SECTORS in raid10.c · e287308b
      Zhen Lei authored
      #define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9)
      
      "RESYNC_BLOCK_SIZE/512" is equal to "RESYNC_BLOCK_SIZE >> 9", so replace
      it with RESYNC_SECTORS.
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
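The equivalence this commit relies on is that dividing by 512 and shifting right by 9 give the same result for a non-negative size. A small check, assuming a RESYNC_BLOCK_SIZE of 64 KiB purely for illustration (the commit message does not state the value) and using hypothetical helper names:

```c
/* Assumed value, for illustration only. */
#define RESYNC_BLOCK_SIZE (64 * 1024)
#define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9)

/* The open-coded form that the commit replaces. */
unsigned long resync_sectors_by_division(void)
{
    return RESYNC_BLOCK_SIZE / 512;
}

/* The form using the existing definition. */
unsigned long resync_sectors_by_macro(void)
{
    return RESYNC_SECTORS;
}
```

Both functions return the number of 512-byte sectors in one resync block.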
    • md/raid5: reallocate page array after setting new stripe_size · 38912584
      Yufen Yu authored
      When resizing stripe_size, we also need to free the old shared page
      array and allocate a new one.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • md/raid5: resize stripe_head when reshape array · f16acaf3
      Yufen Yu authored
      When reshaping an array, we try to reuse the shared pages of the old
      stripe_head and allocate more for the new one if needed.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • md/raid5: let multiple devices of stripe_head share page · 046169f0
      Yufen Yu authored
      In the current implementation, grow_buffers() uses alloc_page() to
      allocate the buffers for each stripe_head, i.e. it allocates one page
      for each dev[i] in stripe_head.

      Now that stripe_size is configurable through a sysfs entry, this means
      that on arm64 with 64KB pages we always allocate 64K buffers but use
      only 4K of each when stripe_size is 4K.

      To avoid wasting memory, we let multiple sh->dev entries share one real
      page. That is, multiple sh->dev[i].page pointers refer to the same page
      at different offsets. An example with 64K PAGE_SIZE and 4K stripe_size
      follows:
      
                          64K PAGE_SIZE
                +---+---+---+---+------------------------------+
                |   |   |   |   |
                |   |   |   |   |
                +-+-+-+-+-+-+-+-+------------------------------+
                  ^   ^   ^   ^
                  |   |   |   +----------------------------+
                  |   |   |                                |
                  |   |   +-------------------+            |
                  |   |                       |            |
                  |   +----------+            |            |
                  |              |            |            |
                  +-+            |            |            |
                    |            |            |            |
              +-----+-----+------+-----+------+-----+------+------+
      sh      | offset(0) | offset(4K) | offset(8K) | offset(12K) |
       +      +-----------+------------+------------+-------------+
       +----> dev[0].page  dev[1].page  dev[2].page  dev[3].page
      
      A new 'pages' array is added to stripe_head to record the shared pages
      used by this stripe_head. They are allocated in grow_buffers() and freed
      in shrink_buffers().

      Once pages are shared, the users of sh->dev[i].page must take the
      corresponding page offset into account: the page of an issued bio and
      the pages passed to the xor computation functions. Thanks to the earlier
      patches that added support for different page offsets, here we only need
      to set the correct dev[i].offset.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
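The mapping drawn in the diagram of the commit above, from a device index to a shared page and an offset within it, can be sketched as plain arithmetic. The struct and function names below are hypothetical and are not the helpers used in raid5.c; the sketch assumes that page_size is a multiple of stripe_size.

```c
#include <stddef.h>

/* Hypothetical location of one dev[i] buffer inside the shared pages. */
struct dev_slot {
    size_t page_index; /* index into the stripe_head 'pages' array */
    size_t offset;     /* byte offset of the buffer within that page */
};

/* Number of dev[i] buffers that fit into one page. */
size_t devs_per_page(size_t page_size, size_t stripe_size)
{
    return page_size / stripe_size;
}

/* Number of shared pages needed for nr_devs buffers (rounded up). */
size_t shared_pages_needed(size_t nr_devs, size_t page_size,
                           size_t stripe_size)
{
    size_t per_page = devs_per_page(page_size, stripe_size);

    return (nr_devs + per_page - 1) / per_page;
}

/* Shared page and offset used by the buffer of device dev_index. */
struct dev_slot dev_slot_for(size_t dev_index, size_t page_size,
                             size_t stripe_size)
{
    size_t per_page = devs_per_page(page_size, stripe_size);
    struct dev_slot slot;

    slot.page_index = dev_index / per_page;
    slot.offset = (dev_index % per_page) * stripe_size;
    return slot;
}
```

With a 64K page and a 4K stripe_size, dev[3] lands in page 0 at offset 12K, matching the diagram. When page_size equals stripe_size, every device gets its own page at offset 0, which is the behavior before sharing.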
    • md/raid6: let async recovery function support different page offset · 4f86ff55
      Yufen Yu authored
      Currently, the asynchronous raid6 recovery functions require a common
      offset for all pages. With shared stripe pages, we expect them to
      support a different offset for each page. Do that by simply adding the
      page offset wherever a page address is referenced. Then replace the old
      interfaces with the new ones in raid6 and raid6test.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • md/raid6: let syndrome computor support different page offset · d69454bc
      Yufen Yu authored
      Currently, the syndrome compute functions require a common offset for
      the pages array. However, we expect them to support different offsets
      once shared pages are used in the following patches. Simply convert them
      by adding the page offset wherever a page address is referenced.

      Since the only callers of async_gen_syndrome() and async_syndrome_val()
      are in raid6, we do not preserve the old interface but modify it
      directly. After that, replace the old interfaces with the new ones in
      raid6 and raid6test.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
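What "adding the page offset wherever a page address is referenced" means can be shown with a plain XOR parity routine in which every source buffer carries its own offset. This is only an illustration of the per-page-offset idea: the function name and signature are hypothetical, it is not the interface of async_gen_syndrome(), and it computes only simple XOR parity, not the full RAID-6 syndrome.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * XOR len bytes from each source, starting at that source's own offset,
 * and store the result at dest + dest_off. Sources may point into the
 * same underlying page at different offsets.
 */
void xor_blocks_with_offsets(uint8_t *dest, size_t dest_off,
                             const uint8_t *const *srcs,
                             const size_t *src_offs,
                             size_t nr_srcs, size_t len)
{
    for (size_t b = 0; b < len; b++) {
        uint8_t acc = 0;

        for (size_t s = 0; s < nr_srcs; s++)
            acc ^= srcs[s][src_offs[s] + b];
        dest[dest_off + b] = acc;
    }
}
```

With a single common offset, all entries of src_offs would be equal. Allowing them to differ is what lets several buffers live in one shared page.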