1. 04 Jun, 2021 10 commits
    • dm zoned: check zone capacity · bab68499
      Damien Le Moal authored
      The dm-zoned target cannot support zoned block devices with zones that
      have a capacity smaller than the zone size (e.g. NVMe zoned namespaces),
      because the current chunk-to-zone mapping implementation assumes that
      zones and chunks have the same size, with all blocks usable.
      If a zoned drive is found to have zones with a capacity different from
      the zone size, fail the target initialization.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Cc: stable@vger.kernel.org # v5.9+
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
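The validation described above can be sketched as a toy Python model. The function and field names here are hypothetical illustrations, not the kernel's actual code:

```python
def check_zone_capacities(zones, zone_size):
    """Return True only if every zone is fully usable (capacity == zone size).

    dm-zoned's chunk mapping assumes all blocks in a zone are usable;
    this is a hypothetical helper, not the kernel's implementation.
    """
    return all(z["capacity"] == zone_size for z in zones)

# A conventional SMR drive: all blocks in each zone are usable.
smr = [{"capacity": 256}, {"capacity": 256}]
# An NVMe ZNS namespace may expose zones whose usable capacity is
# below the zone size (e.g. 200 usable blocks in a 256-block zone).
zns = [{"capacity": 200}, {"capacity": 256}]

assert check_zone_capacities(smr, 256) is True
assert check_zone_capacities(zns, 256) is False   # target init would fail
```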
    • dm table: Constify static struct blk_ksm_ll_ops · ccde2cbf
      Rikard Falkeborn authored
      The only usage of dm_ksm_ll_ops is to make a copy of it to the ksm_ll_ops
      field in the blk_keyslot_manager struct. Make it const to allow the
      compiler to put it in read-only memory.
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm writecache: interrupt writeback if suspended · af4f6cab
      Mikulas Patocka authored
      If the DM device is suspended, interrupt the writeback sequence so
      that there is no excessive suspend delay.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
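The pattern here is a writeback loop that checks for suspension on each iteration and bails out early. A minimal sketch, with hypothetical names (not dm-writecache's code):

```python
def writeback(entries, is_suspended):
    """Write back dirty entries, stopping early if the device suspends.

    Hypothetical sketch of the pattern: checking the suspend state
    inside the loop keeps the suspend delay short.
    """
    written = []
    for e in entries:
        if is_suspended():
            break              # interrupt the writeback sequence
        written.append(e)
    return written

# Simulate a suspend arriving after two writes.
calls = {"n": 0}
def suspended():
    calls["n"] += 1
    return calls["n"] > 2

assert writeback([1, 2, 3, 4, 5], suspended) == [1, 2]
```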
    • dm writecache: don't split bios when overwriting contiguous cache content · ee50cc19
      Mikulas Patocka authored
      If dm-writecache overwrites existing cached data, it splits the
      incoming bio into many block-sized bios. The I/O scheduler does merge
      these bios back into one large request, but this needless splitting
      and merging causes performance degradation.
      
      Fix this by avoiding bio splitting if the cache target area that is
      being overwritten is contiguous.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
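The idea can be illustrated with a toy planner that groups a bio's per-block cache targets into contiguous runs; a contiguous run needs only one write. This is an illustration of the concept, not the driver's actual logic:

```python
def plan_cache_writes(cache_blocks):
    """Group per-block cache targets into (start, length) runs.

    A contiguous run can be written with a single bio instead of one
    block-sized bio per block. Hypothetical model of the idea.
    """
    runs = []
    for b in cache_blocks:
        if runs and runs[-1][1] == b:   # extends the previous run
            runs[-1][1] = b + 1
        else:
            runs.append([b, b + 1])
    return [(start, end - start) for start, end in runs]

# Fully contiguous cache area: one write instead of five.
assert plan_cache_writes([10, 11, 12, 13, 14]) == [(10, 5)]
# Fragmented cache area: splitting is still required.
assert plan_cache_writes([10, 11, 40, 41]) == [(10, 2), (40, 2)]
```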
    • dm kcopyd: avoid spin_lock_irqsave from process context · 6bcd658f
      Mikulas Patocka authored
      The functions "pop", "push_head" and "do_work" are only called from
      process context. Therefore, replace spin_lock_irqsave/spin_unlock_irqrestore
      with spin_lock_irq/spin_unlock_irq.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm kcopyd: avoid useless atomic operations · db2351eb
      Mikulas Patocka authored
      The functions set_bit and clear_bit are atomic, but atomicity is not
      needed when manipulating the dm-kcopyd job flags. So, change them to
      direct (non-atomic) manipulation of the flags.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm space map disk: cache a small number of index entries · 6b06dd5a
      Joe Thornber authored
      The disk space map stores its index entries in a btree; these are
      accessed very frequently, so having a few of them cached makes a big
      difference to performance.
      
      With this change, provisioning a new block takes roughly 20% less CPU.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
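A small front-side cache over an expensive lookup is the general shape of this change. The sketch below is a toy model with made-up sizes and names, not the kernel's dm-space-map code:

```python
class IndexEntryCache:
    """Tiny cache in front of an expensive (e.g. btree) lookup.

    Hypothetical illustration: hot index entries are served from the
    cache; only misses pay for the underlying lookup.
    """
    def __init__(self, lookup, size=16):
        self.lookup = lookup
        self.size = size
        self.cache = {}
        self.misses = 0

    def get(self, key):
        if key not in self.cache:
            self.misses += 1
            if len(self.cache) >= self.size:
                # crude eviction of the oldest entry; real caches vary
                self.cache.pop(next(iter(self.cache)))
            self.cache[key] = self.lookup(key)
        return self.cache[key]

btree = {i: i * 100 for i in range(1000)}
c = IndexEntryCache(btree.__getitem__, size=4)
for _ in range(10):            # hot keys hit the cache after one pass
    for k in (1, 2, 3):
        assert c.get(k) == k * 100
assert c.misses == 3           # 30 lookups, only 3 reached the btree
```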
    • dm space maps: improve performance with inc/dec on ranges of blocks · be500ed7
      Joe Thornber authored
      When we break sharing on btree nodes we typically need to increment
      the reference counts to every value held in the node.  This can
      cause a lot of repeated calls to the space maps.  Fix this by changing
      the interface to the space map inc/dec methods to take ranges of
      adjacent blocks to be operated on.
      
      For installations that use a lot of snapshots, this will reduce the
      CPU overhead of fundamental operations, such as provisioning a new
      block or deleting a snapshot, by as much as 10 times.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
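The interface change can be sketched by contrasting per-block calls with a single range call. This toy reference-count map is illustrative only; the real dm space maps store counts on disk:

```python
class SpaceMap:
    """Toy reference-count map with per-block and range interfaces.

    Hypothetical model: the point is the number of space-map calls,
    which is where the CPU overhead comes from.
    """
    def __init__(self):
        self.counts = {}
        self.calls = 0

    def inc_block(self, b):
        self.calls += 1
        self.counts[b] = self.counts.get(b, 0) + 1

    def inc_range(self, begin, end):
        self.calls += 1            # one call covers the whole run
        for b in range(begin, end):
            self.counts[b] = self.counts.get(b, 0) + 1

# Old interface: breaking sharing on a node holding 100 adjacent
# blocks means 100 repeated calls.
sm = SpaceMap()
for b in range(100, 200):
    sm.inc_block(b)
assert sm.calls == 100

# New interface: the same work in a single range call.
sm2 = SpaceMap()
sm2.inc_range(100, 200)
assert sm2.calls == 1
assert sm.counts == sm2.counts     # identical resulting counts
```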
    • dm space maps: don't reset space map allocation cursor when committing · 5faafc77
      Joe Thornber authored
      Current commit code resets the place where the search for free blocks
      will begin back to the start of the metadata device.  There are a couple
      of repercussions to this:
      
      - The first allocation after the commit is likely to take longer than
        normal as it searches for a free block in an area that is likely to
        have very few free blocks (if any).
      
      - Any free blocks it finds will have been recently freed.  Reusing them
        means we have fewer old copies of the metadata to aid recovery from
        hardware error.
      
      Fix these issues by leaving the cursor alone, only resetting when the
      search hits the end of the metadata device.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
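The cursor behaviour can be modelled with a simple bitmap allocator: the search for a free block starts at the saved cursor and wraps once at the end of the device, instead of restarting from block 0 after every commit. A hypothetical sketch, not the on-disk space-map code:

```python
def alloc(bitmap, cursor):
    """Find a free block starting at cursor, wrapping once at the end.

    Toy model of the cursor policy: the cursor persists across
    commits, so searches avoid rescanning the mostly-full front
    of the device.
    """
    n = len(bitmap)
    for i in range(n):
        b = (cursor + i) % n
        if not bitmap[b]:
            bitmap[b] = True
            return b, b + 1        # (allocated block, new cursor)
    return None, cursor            # device full

bitmap = [True] * 8 + [False] * 8  # blocks 0-7 used, 8-15 free
b, cur = alloc(bitmap, 8)
assert b == 8
# Keeping the cursor means the next search starts right after the
# previous allocation rather than back at block 0.
b, cur = alloc(bitmap, cur)
assert b == 9
```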
    • dm btree: improve btree residency · 4eafdb15
      Joe Thornber authored
      This commit improves the residency of btrees built in the metadata for
      dm-thin and dm-cache.
      
      When inserting a new entry into a full btree node the current code
      splits the node into two.  This can result in very many half full nodes,
      particularly if the insertions are occurring in an ascending order (as
      happens in dm-thin with large writes).
      
      With this commit, when inserting into a full node we first try to move
      some entries to a neighbouring node that has space; failing that, we
      split two neighbouring nodes into three.
      
      Results are given below.  'Residency' is how full nodes are on average,
      as a percentage.  Average instruction counts for the operations are
      given to show that the extra processing has little overhead.
      
                               +--------------------------+--------------------------+
                               |         Before           |         After            |
      +------------+-----------+-----------+--------------+-----------+--------------+
      |    Test    |   Phase   | Residency | Instructions | Residency | Instructions |
      +------------+-----------+-----------+--------------+-----------+--------------+
      | Ascending  | insert    |        50 |         1876 |        96 |         1930 |
      |            | overwrite |        50 |         1789 |        96 |         1746 |
      |            | lookup    |        50 |          778 |        96 |          778 |
      | Descending | insert    |        50 |         3024 |        96 |         3181 |
      |            | overwrite |        50 |         1789 |        96 |         1746 |
      |            | lookup    |        50 |          778 |        96 |          778 |
      | Random     | insert    |        68 |         3800 |        84 |         3736 |
      |            | overwrite |        68 |         4254 |        84 |         3911 |
      |            | lookup    |        68 |          779 |        84 |          779 |
      | Runs       | insert    |        63 |         2546 |        82 |         2815 |
      |            | overwrite |        63 |         2013 |        82 |         1986 |
      |            | lookup    |        63 |          778 |        82 |          779 |
      +------------+-----------+-----------+--------------+-----------+--------------+
      
         Ascending - keys are inserted in ascending order.
         Descending - keys are inserted in descending order.
         Random - keys are inserted in random order.
         Runs - keys are split into ascending runs of ~20 length.  Then
                the runs are shuffled.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Colin Ian King <colin.king@canonical.com> # contains_key() fix
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
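The rebalance-before-split policy above can be modelled with sorted lists standing in for btree nodes: on insert into a full node, first push overflow into a sibling with space; only when both are full, split the two nodes' keys into three. This is a toy model of the policy, not the kernel's btree code:

```python
NODE_SIZE = 4

def insert(nodes, idx, key):
    """Insert key into nodes[idx], preferring rebalance over split.

    Hypothetical sketch: sorted lists stand in for btree leaf nodes,
    and only the right-hand sibling is considered.
    """
    node = nodes[idx]
    if len(node) < NODE_SIZE:
        node.append(key); node.sort(); return
    right = nodes[idx + 1] if idx + 1 < len(nodes) else None
    if right is not None and len(right) < NODE_SIZE:
        # Sibling has space: keep the smallest keys here, push the
        # overflow into the sibling. No new node is created.
        keys = sorted(node + [key])
        node[:] = keys[:NODE_SIZE]
        right.insert(0, keys[NODE_SIZE])
        return
    # Both full: redistribute the two nodes' keys plus the new key
    # across three nodes, each about two-thirds full.
    keys = sorted(node + (right or []) + [key])
    third = (len(keys) + 2) // 3
    nodes[idx:idx + 2] = [keys[:third], keys[third:2 * third], keys[2 * third:]]

nodes = [[1, 2, 3, 4], [6, 7]]
insert(nodes, 0, 5)                 # sibling has room: no split
assert nodes == [[1, 2, 3, 4], [5, 6, 7]]

nodes2 = [[1, 2, 3, 4], [6, 7, 8, 9]]
insert(nodes2, 0, 5)                # both full: two nodes become three
assert nodes2 == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Plain split-in-two leaves two half-full nodes on every overflow; redistributing first is what pushes average residency from ~50% toward the ~96% shown in the table for ordered insertions.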
  2. 25 May, 2021 3 commits
  3. 23 May, 2021 18 commits
  4. 22 May, 2021 4 commits
    • Merge tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block · 4ff2473b
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       "- Fix BLKRRPART and deletion race (Gulam, Christoph)
      
       - NVMe pull request (Christoph):
            - nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith
              Busch)
            - nvme-fc teardown fix (James Smart)
            - nvmet/nvme-loop memory leak fixes (Wu Bo)"
      
      * tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
        block: fix a race between del_gendisk and BLKRRPART
        block: prevent block device lookups at the beginning of del_gendisk
        nvme-fc: clear q_live at beginning of association teardown
        nvme-tcp: rerun io_work if req_list is not empty
        nvme-tcp: fix possible use-after-completion
        nvme-loop: fix memory leak in nvme_loop_create_ctrl()
        nvmet: fix memory leak in nvmet_alloc_ctrl()
    • Merge tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block · b9231dfb
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "One fix for a regression with poll in this merge window, and another
        just hardens the io-wq exit path a bit"
      
      * tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
        io_uring: fortify tctx/io_wq cleanup
        io_uring: don't modify req->poll for rw
    • Merge tag 'for-linus-5.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 23d72926
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - a fix for a boot regression when running as PV guest on hardware
         without NX support
      
       - a small series fixing a bug in the Xen pciback driver when
         configuring a PCI card with multiple virtual functions
      
      * tag 'for-linus-5.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen-pciback: reconfigure also from backend watch handler
        xen-pciback: redo VF placement in the virtual topology
        x86/Xen: swap NX determination and GDT setup on BSP
    • Merge tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · a3969ef4
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
      
       - Fix some math errors in the realtime allocator when extent size hints
         are applied.
      
       - Fix unnecessary short writes to realtime files when free space is
         fragmented.
      
       - Fix a crash when using scrub tracepoints.
      
       - Restore ioctl uapi definitions that were accidentally removed in
         5.13-rc1.
      
      * tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: restore old ioctl definitions
        xfs: fix deadlock retry tracepoint arguments
        xfs: retry allocations when locality-based search fails
        xfs: adjust rt allocation minlen when extszhint > rtextsize
  5. 21 May, 2021 5 commits