Commits · 04aad37be1a88de6a1919996a615437ac74de479 · Kirill Smelkov / linux

03 Feb, 2023 8 commits

blk-wbt: pass a gendisk to wbt_{enable,disable}_default · 04aad37b

Christoph Hellwig authored Feb 03, 2023

Pass a gendisk to wbt_enable_default and wbt_disable_default to
prepare for phasing out usage of the request_queue in the blk-cgroup
code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-9-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

04aad37b

blk-cgroup: store a gendisk to throttle in struct task_struct · f05837ed

Christoph Hellwig authored Feb 03, 2023

Switch from a request_queue pointer and reference to a gendisk once
for the throttle information in struct task_struct.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Link: https://lore.kernel.org/r/20230203150400.3199230-8-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

f05837ed

blk-cgroup: pin the gendisk in struct blkcg_gq · 84d7d462

Christoph Hellwig authored Feb 03, 2023

Currently each blkcg_gq holds a request_queue reference, which is what
is used in the policies.  But a lot of these interfaces will move over to
use a gendisk, so store a disk in struct blkcg_gq and hold a reference to
it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-7-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

84d7d462

blk-cgroup: remove the !bdi->dev check in blkg_dev_name · 180b04d4

Christoph Hellwig authored Feb 03, 2023

bdi_dev_name already performs the same check.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-6-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

180b04d4

blk-cgroup: simplify blkg freeing from initialization failure paths · 27b642b0

Christoph Hellwig authored Feb 03, 2023

There is no need to delay freeing a blkg to a workqueue when freeing it
after an initialization failure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-5-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

27b642b0

blk-cgroup: improve error unwinding in blkg_alloc · 0b6f93bd

Christoph Hellwig authored Feb 03, 2023

Unwind only the previous initialization steps that happened in blkg_alloc
using goto based unwinding. This avoids the need for the !queue special
case in blkg_free and thus ensures that any blkg seens outside of
blkg_alloc is always fully constructed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-4-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

0b6f93bd

blk-cgroup: delay blk-cgroup initialization until add_disk · 178fa7d4

Christoph Hellwig authored Feb 03, 2023

There is no need to initialize the cgroup code before the disk is marked
live. Moving the cgroup initialization earlier will help to have a
fully initialized struct device in the gendisk for the cgroup code to
use in the future. Similarly tear the cgroup information down in
del_gendisk to be symmetric and because none of the cgroup tracking is
needed once non-passthrough I/O stops.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-3-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

178fa7d4

block: don't call blk_throtl_stat_add for non-READ/WRITE commands · a886001c

Christoph Hellwig authored Feb 03, 2023

blk_throtl_stat_add is called from blk_stat_add explicitly, unlike the
other stats that go through q->stats->callbacks. To prepare for cgroup
data moving to the gendisk, ensure blk_throtl_stat_add is only called
for the plain READ and WRITE commands that it actually handles internally,
as blk_stat_add can also be called for passthrough commands on queues that
do not have a gendisk associated with them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-2-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

a886001c

02 Feb, 2023 1 commit

Merge branch 'md-next' of... · 839c717b

Jens Axboe authored Feb 02, 2023

Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.3/block

Pull MD updates from Song:

"Non-urgent fixes:
   md: don't update recovery_cp when curr_resync is ACTIVE
   md: Free writes_pending in md_stop

 Performance optimization:
   md: Change active_io to percpu"

* 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md: use MD_RESYNC_* whenever possible
  md: Free writes_pending in md_stop
  md: Change active_io to percpu
  md: Factor out is_md_suspended helper
  md: don't update recovery_cp when curr_resync is ACTIVE

839c717b

01 Feb, 2023 6 commits

md: use MD_RESYNC_* whenever possible · ed821cf8

Hou Tao authored Feb 01, 2023

Just replace magic numbers by MD_RESYNC_* enumerations.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>

ed821cf8

md: Free writes_pending in md_stop · 07dbb135

Xiao Ni authored Jan 21, 2023

dm raid calls md_stop to stop the raid device. It needs to
free the writes_pending here.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>

07dbb135

md: Change active_io to percpu · 72adae23

Xiao Ni authored Jan 31, 2023

Now the type of active_io is atomic. It's used to count how many ios are
in the submitting process and it's added and decreased very time. But it
only needs to check if it's zero when suspending the raid. So we can
switch atomic to percpu to improve the performance.

After switching active_io to percpu type, we use the state of active_io
to judge if the raid device is suspended. And we don't need to wake up
->sb_wait in md_handle_request anymore. It's done in the callback function
which is registered when initing active_io. The argument mddev->suspended
is only used to count how many users are trying to set raid to suspend
state.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>

72adae23

md: Factor out is_md_suspended helper · d1932913

Xiao Ni authored Jan 31, 2023

This helper function will be used in next patch. It's easy for
understanding.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>

d1932913

md: don't update recovery_cp when curr_resync is ACTIVE · 1d1f25bf

Hou Tao authored Jan 31, 2023

Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
md may skip the resync of the first 3 sectors if the resync procedure is
interrupted before the first calling of ->sync_request() as shown below:

md_do_sync thread          control thread
  // setup resync
  mddev->recovery_cp = 0
  j = 0
  mddev->curr_resync = MD_RESYNC_ACTIVE

                             // e.g., set array as idle
                             set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
  // resync loop
  // check INTR before calling sync_request
  !test_bit(MD_RECOVERY_INTR, &mddev->recovery

  // resync interrupted
  // update recovery_cp from 0 to 3
  // the resync of three 3 sectors will be skipped
  mddev->recovery_cp = 3

Fixes: eac58d08 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
Cc: stable@vger.kernel.org # 6.0+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>

1d1f25bf

loop: Improve the hw_queue_depth kernel module parameter implementation · e152a05f

Bart Van Assche authored Jan 30, 2023

Make the following minor changes which were reported by colleagues
while reviewing this code:
- Remove the parentheses from around the LOOP_DEFAULT_HW_Q_DEPTH
  definition since these are superfluous.
- Accept other number formats than decimal, e.g. hexadecimal.
- Do not set hw_queue_depth to an out-of-range value, even if that value
  won't be used.
- Use the LOOP_DEFAULT_HW_Q_DEPTH macro in the kernel module parameter
  description to prevent that the description gets out of sync.

This patch has been tested as follows:

 # modprobe -r loop
 # modprobe loop hw_queue_depth=-1
 modprobe: ERROR: could not insert 'loop': Invalid argument
 # modprobe loop hw_queue_depth=0
 modprobe: ERROR: could not insert 'loop': Invalid argument
 # modprobe loop hw_queue_depth=1; cat /sys/module/loop/parameters/hw_queue_depth
 1
 # modprobe -r loop; modprobe loop; cat /sys/module/loop/parameters/hw_queue_depth hw_queue_depth=0x10
 16
 # modprobe -r loop; modprobe loop; cat /sys/module/loop/parameters/hw_queue_depth hw_queue_depth=128
 128
 # modprobe -r loop; modprobe loop hw_queue_depth=129; cat /sys/module/loop/parameters/hw_queue_depth
 129
 # modprobe -r loop; modprobe loop hw_queue_depth=$((1<<32))
 modprobe: ERROR: could not insert 'loop': Numerical result out of range

See also commit ef44c508 ("loop: allow user to set the queue
depth").

Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20230130211347.832110-1-bvanassche@acm.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

e152a05f

31 Jan, 2023 2 commits

block: Remove mm.h from bvec.h · 2d97930d

Matthew Wilcox authored Jan 31, 2023

This was originally added for the definition of nth_page(), but we no
longer use nth_page() in this header, so we can drop the heavyweight
mm.h now.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20230131050132.2627124-1-willy@infradead.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

2d97930d

ublk_drv: only allow owner to open unprivileged disk · 48a90519

Ming Lei authored Jan 31, 2023

Owner of one unprivileged ublk device could be one evil user, which
can grant this disk's privilege to other users deliberately, and
this way could be like making one trap and waiting for other users
to be caught.

So only owner to open unprivileged disk even though the owner
grants disk privilege to other user. This way is reasonable too
given anyone can create ublk disk, and no need other's grant.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Fixes: 4093cb5a ("ublk_drv: add mechanism for supporting unprivileged ublk device")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230131040446.214583-1-ming.lei@redhat.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

48a90519

30 Jan, 2023 14 commits

block: Default to use cgroup support for BFQ · 4a6a7bc2

Ulf Hansson authored Jan 30, 2023

Assuming that both Kconfig options, BLK_CGROUP and IOSCHED_BFQ are set, we
most likely want cgroup support for BFQ too (BFQ_GROUP_IOSCHED), so let's
make it default y.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230130121240.159456-1-ulf.hansson@linaro.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>