Commits · 9614e2ba9161c7f5419f4212fa6057d2a65f6ae6 · Kirill Smelkov / linux

30 Jan, 2018 1 commit

dm cache: Documentation: update default migration_throttling value · 9614e2ba

John Pittman authored Jan 30, 2018

In commit f8350daf ("dm cache: tune migration throttling") the
value for DEFAULT_MIGRATION_THRESHOLD was decreased from 204800 to
2048.  Edit device-mapper/cache.txt to reflect the correct default
value for migration_threshold.
Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
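
For a sense of scale, the two defaults can be converted from sectors to bytes. This is a minimal sketch, assuming the threshold is counted in 512-byte sectors; the helper name `sectors_to_mib` is illustrative and not part of the kernel.

```python
SECTOR_SIZE = 512  # bytes per sector (assumption: 512-byte sectors)

OLD_DEFAULT = 204800  # default before commit f8350daf, per the message above
NEW_DEFAULT = 2048    # default after commit f8350daf, per the message above


def sectors_to_mib(sectors: int) -> float:
    """Convert a sector count to MiB (hypothetical helper for illustration)."""
    return sectors * SECTOR_SIZE / (1024 * 1024)


if __name__ == "__main__":
    print(sectors_to_mib(OLD_DEFAULT))  # 100.0
    print(sectors_to_mib(NEW_DEFAULT))  # 1.0
```

Under that assumption the default dropped from 100 MiB to 1 MiB.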

29 Jan, 2018 6 commits

dm mpath selector: more evenly distribute ties · f2042605

Khazhismel Kumykov authored Jan 19, 2018

Move the last used path to the end of the list (least preferred) so that
ties are more evenly distributed.

For example, in the case of three paths where one is slower than the
others, the remaining two would be unevenly used if they tie. This is
because the rotation is not a truly fair distribution.

Illustrated: paths a, b, c; 'c' has 1 outstanding IO, 'a' and 'b' are tied.
Three possible rotations:
(a, b, c) -> best path 'a'
(b, c, a) -> best path 'b'
(c, a, b) -> best path 'a'
(a, b, c) -> best path 'a'
(b, c, a) -> best path 'b'
(c, a, b) -> best path 'a'
...

So 'a' is used 2x more than 'b', although they should be used evenly.

With this change, the most recently used path is always the least
preferred, which removes this bias and results in an even distribution.
(a, b, c) -> best path 'a'
(b, c, a) -> best path 'b'
(c, a, b) -> best path 'a'
(c, b, a) -> best path 'b'
...
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
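
The two orderings illustrated above can be reproduced with a small simulation. This is a sketch of the idea only, not the kernel's path selector: the function names are made up for illustration, and the outstanding IO counts are assumed to stay fixed between selections.

```python
from collections import Counter


def pick_best(paths, outstanding):
    """Return the first path with the fewest outstanding IOs (ties go to the earliest)."""
    return min(paths, key=lambda p: outstanding[p])


def simulate_rotation(paths, outstanding, rounds):
    """Old behavior as illustrated: the list rotates by one position every round."""
    paths = list(paths)
    used = Counter()
    for _ in range(rounds):
        used[pick_best(paths, outstanding)] += 1
        paths.append(paths.pop(0))  # rotate left by one
    return used


def simulate_move_used_last(paths, outstanding, rounds):
    """New behavior: the path that was just used moves to the end of the list."""
    paths = list(paths)
    used = Counter()
    for _ in range(rounds):
        best = pick_best(paths, outstanding)
        used[best] += 1
        paths.remove(best)
        paths.append(best)
    return used


if __name__ == "__main__":
    load = {"a": 0, "b": 0, "c": 1}  # 'c' has 1 outstanding IO, 'a' and 'b' tie
    print(dict(simulate_rotation("abc", load, 6)))        # {'a': 4, 'b': 2}
    print(dict(simulate_move_used_last("abc", load, 6)))  # {'a': 3, 'b': 3}
```

Over six selections the rotating list picks 'a' twice as often as 'b', while moving the used path to the end splits the selections evenly.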

dm unstripe: fix target length versus number of stripes size check · cc656619

Scott Bauer authored Jan 23, 2018

Since the unstripe target takes a target length which is the
size of *one* striped member we're trying to expose, not the
total size of *all* the striped members, the check does not
make sense and fails for some striped setups.

For example, say we have a 4TB striped device, which is 3907018496
sectors per underlying device:

if (sector_div(width, uc->stripes)) :
   3907018496 / 2(num stripes)  == 1953509248

tmp_len = width;
if (sector_div(tmp_len, uc->chunk_size)) :
   1953509248 / 256(chunk size) == 7630895.5
   (fails)

Fix this by removing the first check which isn't valid for unstriping.
Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
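
The arithmetic in the example can be checked directly. The sketch below models `sector_div()` as a function returning quotient and remainder (in the kernel it divides its first argument in place and returns the remainder); the constants are the ones quoted above, and the variable names are illustrative.

```python
def sector_div(n: int, d: int):
    """Model of the kernel helper, returning (quotient, remainder) as a tuple."""
    return divmod(n, d)


TARGET_LEN = 3907018496  # sectors in one striped member, from the example above
STRIPES = 2
CHUNK_SIZE = 256  # sectors

# Removed check: divide the target length by the number of stripes first ...
width, rem_stripes = sector_div(TARGET_LEN, STRIPES)
# ... then require the result to be a whole number of chunks.
_, rem_chunk = sector_div(width, CHUNK_SIZE)

# For comparison: the member length itself is a whole number of chunks.
_, rem_direct = sector_div(TARGET_LEN, CHUNK_SIZE)

if __name__ == "__main__":
    print(width, rem_stripes, rem_chunk, rem_direct)  # 1953509248 0 128 0
```

The non-zero remainder of 128 sectors (half a chunk) is what made the old check reject this setup, even though the length of one member is chunk-aligned.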

dm thin: fix trailing semicolon in __remap_and_issue_shared_cell · bd6d1e0a

Luis de Bethencourt authored Jan 17, 2018

The trailing semicolon is an empty statement that does nothing, so
remove it.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm table: fix NVMe bio-based dm_table_determine_type() validation · eaa160ed

Mike Snitzer authored Jan 13, 2018

The 'verify_rq_based:' code in dm_table_determine_type() was checking
all devices in the DM table rather than only checking the data devices.
Fix this by using the immutable target's iterate_devices method.

Also, tweak the block of dm_table_determine_type() code that decides
whether to upgrade from DM_TYPE_BIO_BASED to DM_TYPE_NVME_BIO_BASED so
that it makes sure the immutable_target doesn't require splitting IOs.

These changes have been verified to allow a "thin-pool" target whose
data device is an NVMe device to be upgraded to DM_TYPE_NVME_BIO_BASED.
Using the thin-pool in NVMe bio-based mode was verified to pass all the
device-mapper-test-suite's "thin-provisioning" tests.

Also verified that request-based DM multipath (with queue_mode "rq" and
"mq") works as expected using the 'mptest' harness.

Fixes: 22c11858 ("dm: introduce DM_TYPE_NVME_BIO_BASED")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: various cleanups to md->queue initialization code · c12c9a3c

Mike Snitzer authored Jan 12, 2018

Also, add dm_sysfs_init() error handling to dm_create().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm mpath: delay the retry of a request if the target responded as busy · ac514ffc

Mike Snitzer authored Jan 12, 2018

Add DM_ENDIO_DELAY_REQUEUE to allow request-based multipath's
multipath_end_io() to instruct dm-rq.c:dm_done() to delay a requeue.
This is beneficial to do if BLK_STS_RESOURCE is returned from the target
(because target is busy).

Relative to blk-mq: kick the hw queues via blk_mq_requeue_work(),
indirectly from dm-rq.c:__dm_mq_kick_requeue_list(), after a delay.

For old .request_fn: use blk_delay_queue().

bio-based multipath doesn't have feature parity with request-based for
retryable error requeues; that is something that'll need fixing in the
future.
Suggested-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Bart Van Assche <bart.vanassche@wdc.com>
[as interpreted from Bart's "... patch looks fine to me."]

17 Jan, 2018 21 commits

dm mpath: return DM_MAPIO_DELAY_REQUEUE if QUEUE_IO or PG_INIT_REQUIRED · 459b5401

Ming Lei authored Jan 11, 2018

Avoid using DM_MAPIO_REQUEUE unless absolutely necessary because it
results in dm-rq.c:dm_mq_queue_rq() returning BLK_STS_RESOURCE to
blk-mq -- doing so should only ever be done if the underlying queue is
out of resources.  So switch to returning DM_MAPIO_DELAY_REQUEUE from
multipath_clone_and_map() if either MPATHF_QUEUE_IO or
MPATHF_PG_INIT_REQUIRED are set.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm mpath: return DM_MAPIO_REQUEUE on blk-mq rq allocation failure · 050af08f

Ming Lei authored Jan 11, 2018

blk-mq will rerun the queue via RESTART or dispatch wake after one
request completes, so it is not necessary to wait a random time before
requeuing; we should trust blk-mq to do it.

More importantly, we need to return BLK_STS_RESOURCE to blk-mq so that
dequeuing from the I/O scheduler can be stopped; this results in
improved I/O merging.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm log writes: fix max length used for kstrndup · 4b259fc4

Ma Shimiao authored Dec 12, 2017

If the source string is longer than max, kstrndup() will allocate max+1
bytes.  So make sure the result will not exceed max.
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: backfill missing calls to mutex_destroy() · d5ffebdd

Mike Snitzer authored Jan 05, 2018

Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm snapshot: use mutex instead of rw_semaphore · ae1093be

Mikulas Patocka authored Nov 23, 2017

The rw_semaphore is acquired for read in only two places, neither of
which is performance-critical.  So replace it with a mutex -- which is
more efficient.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm flakey: check for null arg_name in parse_features() · 7690e253

Goldwyn Rodrigues authored Dec 03, 2017

One can crash dm-flakey by specifying more feature arguments than the
number of features supplied.  Checking for null in arg_name avoids
this.

dmsetup create flakey-test --table "0 66076080 flakey /dev/sdb9 0 0 180 2 drop_writes"
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
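
The failure mode can be modeled in a few lines. This is a hedged sketch of the parsing pattern, not the dm-flakey source: `shift()` stands in for an argument-shift helper that returns NULL once the arguments run out, and the error text is made up for illustration.

```python
def shift(args):
    """Pop the next argument, or return None when the list is exhausted."""
    return args.pop(0) if args else None


def parse_features(args):
    """Parse '<count> <feature>...' and reject a count larger than what follows."""
    count = int(shift(args))
    features = []
    for _ in range(count):
        arg_name = shift(args)
        if arg_name is None:
            # Analogue of the added null check: fail cleanly instead of
            # using a missing argument.
            raise ValueError("insufficient feature arguments")
        features.append(arg_name)
    return features


if __name__ == "__main__":
    print(parse_features(["1", "drop_writes"]))  # ['drop_writes']
    try:
        # Mirrors the table above: 2 features declared, only 1 supplied.
        parse_features(["2", "drop_writes"])
    except ValueError as err:
        print("rejected:", err)
```

In the table line above, the feature count is 2 but only `drop_writes` follows, which is exactly the mismatch this check guards against.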

dm thin: extend thinpool status format string with omitted fields · 7efd5fed

mulhern authored Nov 27, 2017

Signed-off-by: mulhern <amulhern@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm thin: fixes in thin-provisioning.txt · cc3ff0af

mulhern authored Nov 27, 2017

Make the format string for thinpool status more correct.

Swap the order of two items to correspond with reality.
Signed-off-by: mulhern <amulhern@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm thin: document representation of <highest mapped sector> when there is none · 2bc8a61c

mulhern authored Nov 27, 2017

Signed-off-by: mulhern <amulhern@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm thin: fix documentation relative to low water mark threshold · 9b28a110

mulhern authored Nov 27, 2017

Fixes:
1. The use of "exceeds" when the opposite of exceeds, falls below,
was meant.
2. Properly speaking, a table can not exceed a threshold.

It emphasizes the important point, which is that it is the userspace
daemon's responsibility to check for low free space when a device
is resumed, since it won't get a special event indicating low free
space in that situation.
Signed-off-by: mulhern <amulhern@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: be consistent in specifying sectors and SI units in cache.txt · 1346638e

mulhern authored Nov 27, 2017

Signed-off-by: mulhern <amulhern@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: delete obsoleted paragraph in cache.txt · 3716e20a

mulhern authored Nov 27, 2017

The 'mq' policy is no longer the default policy, and the default policy,
'smq', does not store hit counts.
Signed-off-by: mulhern <amulhern@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: fix grammar in cache-policies.txt · 67721046

mulhern authored Nov 27, 2017

Use possessive pronoun where appropriate, instead of contraction.
Signed-off-by: mulhern <amulhern@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm snapshot: improve documentation relative to origin suspend requirements · 424da29c

Mikulas Patocka authored Dec 02, 2015

Add a note to snapshot.txt that the origin target must be suspended when
loading or unloading the snapshot target.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: move dm_table_destroy() to same header as dm_table_create() · f6e7baad

Brian Norris authored Mar 28, 2017

If anyone is going to use dm_table_create(), they probably should be
able to use dm_table_destroy() too. Move the dm_table_destroy()
definition outside the private header, near dm_table_create().
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm raid: make raid_sets symbol static · 67ac901c

Wei Yongjun authored Jan 02, 2018

Fixes the following sparse warning:

drivers/md/dm-raid.c:33:1: warning:
 symbol 'raid_sets' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm bufio: eliminate unnecessary labels in dm_bufio_client_create() · 0e696d38

Mike Snitzer authored Jan 04, 2018

Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm bufio: check result of register_shrinker() · 46898e9a

Aliaksei Karaliou authored Dec 23, 2017

dm_bufio_client_create() does not check the result of register_shrinker(),
which was recently tagged as __must_check, as reported by sparse.
Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm bufio: add missed destroys of client mutex · bde14184

Aliaksei Karaliou authored Dec 23, 2017

The client's mutex needs to be destroyed in dm_bufio_client_destroy() as
well as the dm_bufio_client_create() error path.
Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm bufio: use REQ_OP_READ and REQ_OP_WRITE · 905be0a1

Mikulas Patocka authored Dec 02, 2017

Use REQ_OP_READ and REQ_OP_WRITE macros instead of READ and WRITE.  They
have the same value, but the block layer uses REQ_OP so bufio should
too.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: add unstriped target · 18a5bf27

Scott Bauer authored Dec 18, 2017

This device mapper "unstriped" target remaps and unstripes I/O so it
is issued solely on a single drive in a HW RAID0 or dm-striped target.

In a 4 drive HW RAID0 the unstriped target exposes 1/4th of the LBA range
as a virtual drive.  Each I/O to that virtual drive will only be issued
to the 1 drive that was selected of the 4 drives in the HW RAID0.

This unstriped target is most useful for Intel NVMe drives that have
multiple cores but that do not have firmware control to pin separate LBA
ranges to each discrete cpu core.
Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
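
The remapping described above can be sketched as plain chunk arithmetic. This is an illustrative model that assumes standard RAID0 chunk interleaving; the function and parameter names are hypothetical and not taken from the kernel source.

```python
def unstripe_sector(sector: int, stripe: int, stripes: int, chunk_size: int) -> int:
    """Map a sector on the unstriped virtual device to the striped device.

    sector:     sector offset on the virtual (single-drive) device
    stripe:     index of the drive being exposed, 0 <= stripe < stripes
    stripes:    number of drives in the stripe set
    chunk_size: chunk size in sectors
    """
    chunk, offset = divmod(sector, chunk_size)
    # Chunk N of the virtual device is chunk (N * stripes + stripe) of the
    # striped device; the offset inside the chunk is unchanged.
    return (chunk * stripes + stripe) * chunk_size + offset


if __name__ == "__main__":
    # 4 drives, 256-sector chunks, exposing drive 1
    print(unstripe_sector(0, 1, 4, 256))    # 256
    print(unstripe_sector(255, 1, 4, 256))  # 511
    print(unstripe_sector(256, 1, 4, 256))  # 1280
```

Every mapped sector falls in a chunk owned by the selected drive, which is why each I/O to the virtual device touches only that one drive.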

06 Jan, 2018 2 commits

dm mpath: factor out SCSI vs NVMe path selection · 0001ec56

Mike Snitzer authored Dec 11, 2017

Trying to do both SCSI and NVMe bio-based handling with branching in the
same common code has proven too tedious on a code maintenance level.  In
addition it slightly hurts IO performance.

Fix this by factoring out __map_bio() and __map_bio_nvme().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm mpath: optimize NVMe bio-based support · 848b8aef

Mike Snitzer authored Dec 10, 2017

Code that deals with pg_init is not used in bio-based NVMe mode, so skip
it.  This includes skipping initialization of pg_init related variables.

Also, pg_init related members on 'struct multipath' have been grouped
together.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

05 Jan, 2018 1 commit

dm mpath: implement NVMe bio-based support · cd025384

Mike Snitzer authored Dec 05, 2017

This DM multipath NVMe bio-based support requires CONFIG_NVME_MULTIPATH
to not be set. In the future hopefully NVMe multipath and DM multipath
can co-exist more seamlessly. But as is, if CONFIG_NVME_MULTIPATH=Y
then all the individual NVMe paths will remain hidden to upper layers and
as such DM multipath will not be able to manage them.

Though NVMe's native multipathing doesn't multipath namespaces across
subsystems; so technically a user _could_ use CONFIG_NVME_MULTIPATH=Y
and also use DM multipath to multipath across subsystems.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

03 Jan, 2018 1 commit

dm mpath: move dm_bio_restore out of endio method · 1836df08

Mike Snitzer authored Dec 06, 2017

Moving the dm_bio_restore() to process_queued_bios() avoids doing that
work in multipath_end_io_bio().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

20 Dec, 2017 5 commits

dm mpath: optimize retrieval of bio_details from per-bio-data · d07a241d

Mike Snitzer authored Dec 11, 2017

Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm mpath: remove unnecessary memset() calls for per-io-data · d0442f80

Mike Snitzer authored Dec 11, 2017

All underlying members are initialized directly so the memset() calls
are not needed.  Also, initialize mpio->nr_bytes from the start since it
never changes.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm mpath: remove unused param from multipath_init_per_bio_data() · 63f6e6fd

Mike Snitzer authored Dec 05, 2017

'struct dm_bio_details *' isn't ever needed.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: optimize bio-based NVMe IO submission · 978e51ba

Mike Snitzer authored Dec 09, 2017

Upper level bio-based drivers that stack immediately on top of NVMe can
leverage direct_make_request().  In addition, DM's NVMe bio-based mode
will initially only ever have one NVMe device that it submits IO to at a
time.  There is no splitting needed.  Enhance DM core so that
DM_TYPE_NVME_BIO_BASED's IO submission takes advantage of both of these
characteristics.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: introduce DM_TYPE_NVME_BIO_BASED · 22c11858

Mike Snitzer authored Dec 04, 2017

If dm_table_determine_type() establishes DM_TYPE_NVME_BIO_BASED then
none of the devices in the DM table support partial completions.  Also,
the table has a single immutable target that doesn't require DM core to
split bios.

This will enable adding NVMe optimizations to bio-based DM.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

17 Dec, 2017 3 commits

dm: simplify start of block stats accounting for bio-based · f3986374

Mike Snitzer authored Dec 17, 2017

There is no apparent need to call generic_start_io_acct() until just
before the IO is ready for submission.  start_io_acct() is the proper
place to do this
accounting -- it is also where DM accounts for pending IO and, if
enabled, starts dm-stats accounting.

Replace start_io_acct()'s part_round_stats() with generic_start_io_acct().
This eliminates needing to take part_stat_lock() multiple times when
starting an IO on bio-based devices.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: remove redundant mapped_device member from clone_info structure · bc02cdbe

Mike Snitzer authored Dec 14, 2017

'struct dm_io' already has the same pointer.  So update all accesses
from ci->md to ci->io->md.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: remove now unused bio-based io_pool and _io_cache · dde1e1ec

Mike Snitzer authored Dec 11, 2017

Signed-off-by: Mike Snitzer <snitzer@redhat.com>