- 18 Oct, 2021 40 commits
-
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. The out_mem2 error label already does what we need so re-use that. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. The read_capacity_error error label already does what we need, so just re-use that. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
No errors were being captured wehen cdrom_register() fails, capture the error and return the error. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We first register cdrom and then we add_disk() and so we we should likewise unregister the cdrom first and then del_gendisk(). Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Refactor the pf initialization to have a dedicated helper to initialize a single disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Refactor the pf initialization to have a dedicated helper to initialize a single disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Refactor the pcd initialization to have a dedicated helper to initialize a single disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
No need to pass it through a bunch of functions. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Luis Chamberlain authored
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
There's currently no way to experiment with polled IO with null_blk, which seems like an oversight. This patch adds support for polled IO. We keep a list of issued IOs on submit, and then process that list when mq_ops->poll() is invoked. A new parameter is added, poll_queues. It defaults to 1 like the submit queues, meaning we'll have 1 poll queue available. Fixes-by: Bart Van Assche <bvanassche@acm.org> Fixes-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/baca710d-0f2a-16e2-60bd-b105b854e0ae@kernel.dkSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Trivial to do now, just need our own io_comp_batch on the stack and pass that in to the usual command completion handling. I pondered making this dependent on how many entries we had to process, but even for a single entry there's no discernable difference in performance or latency. Running a sync workload over io_uring: t/io_uring -b512 -d1 -s1 -c1 -p0 -F1 -B1 -n2 /dev/nvme1n1 /dev/nvme2n1 yields the below performance before the patch: IOPS=254820, BW=124MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=251174, BW=122MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=250806, BW=122MiB/s, IOS/call=1/1, inflight=(1 1) and the following after: IOPS=255972, BW=124MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=251920, BW=123MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=251794, BW=122MiB/s, IOS/call=1/1, inflight=(1 1) which definitely isn't slower, about the same if you factor in a bit of variance. For peak performance workloads, benchmarking shows a 2% improvement. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Wire up using an io_comp_batch for f_op->iopoll(). If the lower stack supports it, we can handle high rates of polled IO more efficiently. This raises the single core efficiency on my system from ~6.1M IOPS to ~6.6M IOPS running a random read workload at depth 128 on two gen2 Optane drives. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Take advantage of struct io_comp_batch, if passed in to the nvme poll handler. If it's set, rather than complete each request individually inline, store them in the io_comp_batch list. We only do so for requests that will complete successfully, anything else will be completed inline as before. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Instead of calling blk_mq_end_request() on a single request, add a helper that takes the new struct io_comp_batch and completes any request stored in there. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
sbitmap currently only supports clearing tags one-by-one, add a helper that allows the caller to pass in an array of tags to clear. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
struct io_comp_batch contains a list head and a completion handler, which will allow completions to more effciently completed batches of IO. For now, no functional changes in this patch, we just define the io_comp_batch structure and add the argument to the file_operations iopoll handler. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Instead of open-coding the list additions, traversal, and removal, provide a basic set of helpers. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Just like the blk_mq_ctx counterparts, we've got a bunch of counters in here that are only for debugfs and are of questionnable value. They are: - dispatched, index of how many requests were dispatched in one go - poll_{considered,invoked,success}, which track poll sucess rates. We're confident in the iopoll implementation at this point, don't bother tracking these. As a bonus, this shrinks each hardware queue from 576 bytes to 512 bytes, dropping a whole cacheline. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
These were added as part of early days debugging for blk-mq, and they are not really useful anymore. Rather than spend cycles updating them, just get rid of them. As a bonus, this shrinks the per-cpu software queue size from 256b to 192b. That's a whole cacheline less. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Add a local variable for rq_flags, it helps to compile out some of rq_flags reloads. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
We should have enough of registers in blk_mq_rq_ctx_init(), store them in local vars, so we don't keep reloading them. note: keeping q->elevator may look unnecessary, but it's also used inside inlined blk_mq_tags_from_data(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Don't init rq->hash and rq->rb_node in blk_mq_rq_ctx_init() if there is no elevator. Also, move some other initialisers that imply barriers to the end, so the compiler is free to rearrange and optimise other the rest of them. note: fold in a change from Jens leaving queue_list unconditional, as it might lead to problems otherwise. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Add an rq private RQF_ELV flag, which tells the block layer that this request was initialized on a queue that has an IO scheduler attached. This allows for faster checking in the fast path, rather than having to deference rq->q later on. Elevator switching does full quiesce of the queue before detaching an IO scheduler, so it's safe to cache this in the request itself. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
We set BIO_TRACKED unconditionally when rq_qos_throttle() is called, even though we may not even have an rq_qos handler. Only mark it as TRACKED if it really is potentially tracked. This saves considerable time for the case where the bio isn't tracked: 2.64% -1.65% [kernel.vmlinux] [k] bio_endio Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
It's been a while since this was analyzed, move some members around to better flow with the use case. Initial state up top, and queued state after that. This improves my peak case by about 1.5%, from 7750K to 7900K IOPS. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
For some reason we still have them in blk-core, with the rest of the request completion being in blk-mq. That causes and out-of-line call for each completion. Move them into blk-mq.c instead, where they belong. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
We have exactly one caller of this, just get rid of adding the useless function name to the output. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
If we're completing nbytes and nbytes is the size of the bio, don't bother with calling into the iterator increment helpers. Just clear the bio size and we're done. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/addf6ea988c04213697ba3684c853e4ed7642a39.1634219547.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/efc41f880262517c8dc32f932f1b23112f21b255.1634219547.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/85c36ea784d285a5075baa10049e6b59e15fb484.1634219547.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a352936ce5d9ac719645b1e29b173d931ebcdc02.1634219547.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
There are tons of places where we need to get a request_queue only having bdev, which turns into bdev->bd_disk->queue. There are probably a hundred of such places considering inline helpers, and enough of them are in hot paths. Cache queue pointer in struct block_device and make use of it in bdev_get_queue(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a3bfaecdd28956f03629d0ca5c63ebc096e1c809.1634219547.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-