Commits · e9ea15963f3b9d979e1d6d758b8e407775ae6588 · Kirill Smelkov / linux

18 Oct, 2021 40 commits

blk-mq: inline hot part of __blk_mq_sched_restart · e9ea1596

Pavel Begunkov authored Oct 09, 2021

Extract a fast check out of __blk_mq_sched_restart() and inline it for
performance reasons.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/894abaa0998e5999f2fe18f271e5efdfc2c32bd2.1633781740.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
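
The split between an inlined check and an out-of-line slow path can be sketched in user-space C as below. This is only an illustration of the pattern, not the kernel code: `struct hw_ctx`, `SCHED_RESTART_BIT`, `sched_restart()` and `sched_restart_slow()` are hypothetical stand-ins, and plain bit operations replace the kernel's atomic bitops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hardware queue context. */
struct hw_ctx {
	unsigned long state;    /* bit flags */
	unsigned int restarts;  /* how many times the slow path ran */
};

#define SCHED_RESTART_BIT (1UL << 0)

/* Slow path: in a real tree this would stay out of line in a .c file. */
void sched_restart_slow(struct hw_ctx *hctx)
{
	hctx->state &= ~SCHED_RESTART_BIT;  /* clear the restart marker */
	hctx->restarts++;                   /* stands in for re-running the queue */
}

/* Fast check: small enough to inline at every call site. */
static inline bool sched_restart(struct hw_ctx *hctx)
{
	assert(hctx != NULL);
	if (!(hctx->state & SCHED_RESTART_BIT))
		return false;               /* common case: no function call */
	sched_restart_slow(hctx);
	return true;
}
```

When the flag is clear, which is assumed here to be the common case, the caller pays only for a load and a branch.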

block: inline hot paths of blk_account_io_*() · be6bfe36

Pavel Begunkov authored Oct 09, 2021

Extract hot paths of __blk_account_io_start() and
__blk_account_io_done() into inline functions, so we don't always pay
for function calls.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b0662a636bd4cc7b4f84c9d0a41efa46a688ef13.1633781740.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: merge block_ioctl into blkdev_ioctl · 8a709512

Christoph Hellwig authored Oct 12, 2021

Simplify the ioctl path and match the code structure on the compat side.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104450.659013-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move the *blkdev_ioctl declarations out of blkdev.h · 84b8514b

Christoph Hellwig authored Oct 12, 2021

These are only used inside of block/.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104450.659013-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: unexport blkdev_ioctl · fea349b0

Christoph Hellwig authored Oct 12, 2021

With the raw driver gone, there is no modular user left.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104450.659013-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: don't dereference request after flush insertion · 4a60f360

Jens Axboe authored Oct 16, 2021

We could have a race here, where the request gets freed before we call
into blk_mq_run_hw_queue(). If this happens, we cannot rely on the state
of the request.

Grab the hardware context before inserting the flush.

Fixes: 0f38d766 ("blk-mq: cleanup blk_mq_submit_bio")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
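
The ordering fix can be illustrated with a small user-space C sketch. It is not the kernel code: `insert_flush()`, `run_hw_queue()` and `submit_flush()` are hypothetical stand-ins, and `insert_flush()` always frees the request to model the worst case of the race.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hardware context; counts how often it was run. */
struct hw_ctx {
	unsigned int runs;
};

struct request {
	struct hw_ctx *mq_hctx;
};

/*
 * Hypothetical: hands the request to the flush machinery, which may
 * complete and free it before returning. Here it always frees it.
 */
void insert_flush(struct request *rq)
{
	free(rq);
}

void run_hw_queue(struct hw_ctx *hctx)
{
	hctx->runs++;
}

void submit_flush(struct request *rq)
{
	assert(rq != NULL);

	/* Read what we need while we still own the request. */
	struct hw_ctx *hctx = rq->mq_hctx;

	insert_flush(rq);    /* rq must not be dereferenced after this */
	run_hw_queue(hctx);  /* safe: uses the pointer saved above */
}
```

Calling `run_hw_queue(rq->mq_hctx)` after `insert_flush(rq)` would be a use-after-free in this model, which is the situation the commit message describes.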

blk-mq: cleanup blk_mq_submit_bio · 0f38d766

Christoph Hellwig authored Oct 12, 2021

Move the blk_mq_alloc_data stack allocation only into the branch
that actually needs it, and use rq->mq_hctx instead of data.hctx
to refer to the hctx.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104045.658051-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-mq: cleanup and rename __blk_mq_alloc_request · b90cfaed

Christoph Hellwig authored Oct 12, 2021

The newly added loop for the cached requests in __blk_mq_alloc_request
is a little too convoluted for my taste, so unwind it a bit. Also
rename the function to __blk_mq_alloc_requests now that it can allocate
more than a single request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104045.658051-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: pre-allocate requests if plug is started and is a batch · 47c122e3

Jens Axboe authored Oct 06, 2021

The caller typically has a good (or even exact) idea of how many requests
it needs to submit. We can make the request/tag allocation a lot more
efficient if we just allocate N requests/tags upfront when we queue the
first bio from the batch.

Provide a new plug start helper that allows the caller to specify how many
IOs are expected. This sets plug->nr_ios, and we can use that for smarter
request allocation. The plug provides a holding spot for requests, and
request allocation will check it before calling into the normal request
allocation path.

When blk_finish_plug() is called, check if there are unused requests and
free them. This should not happen in normal operations. The exception is
if we get merging, then we may be left with requests that need freeing
when done.

This raises the per-core performance on my setup from ~5.8M to ~6.1M
IOPS.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
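
A rough user-space model of the scheme described above is sketched here. Only `nr_ios` comes from the commit message; `struct plug`, `start_plug_nr_ios()`, `alloc_request()` and `finish_plug()` are hypothetical names, `malloc()` stands in for request/tag allocation, and resetting `nr_ios` to 1 after the first batch is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct request {
	struct request *next;
};

struct plug {
	unsigned short nr_ios;   /* expected number of IOs in the batch */
	struct request *cached;  /* pre-allocated, not yet handed out */
};

/* Start a plug and record how many IOs the caller expects to submit. */
void start_plug_nr_ios(struct plug *plug, unsigned short nr_ios)
{
	plug->nr_ios = nr_ios ? nr_ios : 1;
	plug->cached = NULL;
}

/* Check the plug's holding spot first, then fall back to allocating. */
struct request *alloc_request(struct plug *plug)
{
	assert(plug != NULL);

	if (!plug->cached) {
		/* Cache is empty: allocate nr_ios requests in one go. */
		for (unsigned short i = 0; i < plug->nr_ios; i++) {
			struct request *rq = malloc(sizeof(*rq));
			if (!rq)
				break;
			rq->next = plug->cached;
			plug->cached = rq;
		}
		plug->nr_ios = 1;  /* assumption: later refills are single */
	}

	struct request *rq = plug->cached;
	if (rq)
		plug->cached = rq->next;
	return rq;
}

/* Free whatever was pre-allocated but never used; returns the count. */
unsigned int finish_plug(struct plug *plug)
{
	unsigned int freed = 0;

	while (plug->cached) {
		struct request *rq = plug->cached;
		plug->cached = rq->next;
		free(rq);
		freed++;
	}
	return freed;
}
```

If the caller announces four IOs but only three requests end up being needed, for example because of merging, `finish_plug()` frees the one left over.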

block: bump max plugged deferred size from 16 to 32 · ba0ffdd8

Jens Axboe authored Oct 06, 2021

Particularly for NVMe with efficient deferred submission for many
requests, there are nice benefits to be seen by bumping the default max
plug count from 16 to 32. This is especially true for virtualized setups,
where the submit part is more expensive, but it can be noticed even on
native hardware.

Reduce the multiple queue factor from 4 to 2, since we're changing the
default size.

While changing it, move the defines into the block layer private header.
These aren't values that anyone outside of the block layer uses, or
should use.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
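
The arithmetic behind the two changes can be worked through in a few lines of C. The macro and function names are hypothetical, and the sketch assumes the factor is applied by multiplying the base count when a plug spans multiple queues; the commit message gives only the numbers.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PLUG_COUNT_OLD 16  /* previous default */
#define MAX_PLUG_COUNT_NEW 32  /* default after this change */
#define MQ_FACTOR_OLD 4        /* previous multiple queue factor */
#define MQ_FACTOR_NEW 2        /* factor after this change */

/* Under the assumption above, the multi-queue limit is unchanged. */
static_assert(MAX_PLUG_COUNT_OLD * MQ_FACTOR_OLD ==
	      MAX_PLUG_COUNT_NEW * MQ_FACTOR_NEW,
	      "multi-queue plug limit should stay the same");

/* Effective plug limit for a given base count and factor. */
static inline unsigned int plug_limit(unsigned int base,
				      unsigned int mq_factor,
				      bool multiple_queues)
{
	return multiple_queues ? base * mq_factor : base;
}
```

Under that assumption the single-queue limit doubles from 16 to 32, while the multiple-queue limit stays at 64 (16 * 4 before, 32 * 2 after).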

block: inherit request start time from bio for BLK_CGROUP · 00067077

Jens Axboe authored Oct 05, 2021

Doing high IOPS testing with blk-cgroups enabled spends ~15-20% of the
time just doing ktime_get_ns() -> readtsc. We essentially read and
set the start time twice, once for the bio and then again when that bio
is mapped to a request.

Given that the time between the two is very short, inherit the bio
start time instead of reading it again. This cuts 1/3rd of the overhead
of the time keeping.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
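
The idea of inheriting the timestamp can be sketched in user-space C as follows. The structures, `fake_clock_ns()` and the two setter functions are hypothetical; the fake clock counts its own reads so the saving is visible, and a zero `start_time_ns` is assumed to mean "not set".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

unsigned int clock_reads;  /* how often the (fake) clock was read */

/* Fake clock standing in for an expensive timestamp read. */
uint64_t fake_clock_ns(void)
{
	clock_reads++;
	return 1000u * clock_reads;
}

struct bio {
	uint64_t start_time_ns;
};

struct request {
	uint64_t start_time_ns;
};

void bio_set_start_time(struct bio *bio)
{
	bio->start_time_ns = fake_clock_ns();
}

/* Reuse the bio's timestamp when there is one; otherwise read the clock. */
void request_set_start_time(struct request *rq, const struct bio *bio)
{
	assert(rq != NULL);
	if (bio && bio->start_time_ns)
		rq->start_time_ns = bio->start_time_ns;
	else
		rq->start_time_ns = fake_clock_ns();
}
```

In this model a bio that is mapped to a request costs one clock read instead of two, and a request without a bio still gets its own timestamp.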

block: move blk-throtl fast path inline · a7b36ee6

Jens Axboe authored Oct 05, 2021

Even if no policies are defined, we spend ~2% of the total IO time
checking. Move the fast path inline.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
