Commits · 7764656b108cd308c39e9a8554353b8f9ca232a3 · Kirill Smelkov / linux

21 Jul, 2021 1 commit

nvme-pci: don't WARN_ON in nvme_reset_work if ctrl.state is not RESETTING · 7764656b

Zhihao Cheng authored Jul 05, 2021

Followling process:
nvme_probe
  nvme_reset_ctrl
    nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING)
    queue_work(nvme_reset_wq, &ctrl->reset_work)

-------------->	nvme_remove
		  nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING)
worker_thread
  process_one_work
    nvme_reset_work
    WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)

, which will trigger WARN_ON in nvme_reset_work():
[  127.534298] WARNING: CPU: 0 PID: 139 at drivers/nvme/host/pci.c:2594
[  127.536161] CPU: 0 PID: 139 Comm: kworker/u8:7 Not tainted 5.13.0
[  127.552518] Call Trace:
[  127.552840]  ? kvm_sched_clock_read+0x25/0x40
[  127.553936]  ? native_send_call_func_single_ipi+0x1c/0x30
[  127.555117]  ? send_call_function_single_ipi+0x9b/0x130
[  127.556263]  ? __smp_call_single_queue+0x48/0x60
[  127.557278]  ? ttwu_queue_wakelist+0xfa/0x1c0
[  127.558231]  ? try_to_wake_up+0x265/0x9d0
[  127.559120]  ? ext4_end_io_rsv_work+0x160/0x290
[  127.560118]  process_one_work+0x28c/0x640
[  127.561002]  worker_thread+0x39a/0x700
[  127.561833]  ? rescuer_thread+0x580/0x580
[  127.562714]  kthread+0x18c/0x1e0
[  127.563444]  ? set_kthread_struct+0x70/0x70
[  127.564347]  ret_from_fork+0x1f/0x30

The preceding problem can be easily reproduced by executing following
script (based on blktests suite):
test() {
  pdev="$(_get_pci_dev_from_blkdev)"
  sysfs="/sys/bus/pci/devices/${pdev}"
  for ((i = 0; i < 10; i++)); do
    echo 1 > "$sysfs/remove"
    echo 1 > /sys/bus/pci/rescan
  done
}

Since the device ctrl could be updated as an non-RESETTING state by
repeating probe/remove in userspace (which is a normal situation), we
can replace stack dumping WARN_ON with a warnning message.

Fixes: 82b057ca ("nvme-pci: fix multiple ctrl removal schedulin")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>

7764656b

17 Jul, 2021 1 commit

block: increase BLKCG_MAX_POLS · ec645dc9

Oleksandr Natalenko authored Jul 17, 2021

After mq-deadline learned to deal with cgroups, the BLKCG_MAX_POLS value
became too small for all the elevators to be registered properly. The
following issue is seen:

```
calling  bfq_init+0x0/0x8b @ 1
blkcg_policy_register: BLKCG_MAX_POLS too small
initcall bfq_init+0x0/0x8b returned -28 after 507 usecs
```

which renders BFQ non-functional.

Increase BLKCG_MAX_POLS to allow enough space for everyone.

Fixes: 08a9ad8b ("block/mq-deadline: Add cgroup support")
Link: https://lore.kernel.org/lkml/8988303.mDXGIdCtx8@natalenko.name/Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Link: https://lore.kernel.org/r/20210717123328.945810-1-oleksandr@natalenko.nameSigned-off-by: Jens Axboe <axboe@kernel.dk>

ec645dc9

15 Jul, 2021 4 commits

xen-blkfront: sanitize the removal state machine · 05d69d95

Christoph Hellwig authored Jul 15, 2021

xen-blkfront has a weird protocol where close message from the remote
side can be delayed, and where hot removals are treated somewhat
differently from regular removals, all leading to potential NULL
pointer removals, and a del_gendisk from the block device release
method, which will deadlock. Fix this by just performing normal hot
removals even when the device is opened like all other Linux block
drivers.

Fixes: c76f48eb ("block: take bd_mutex around delete_partitions in del_gendisk")
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20210715141711.1257293-1-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

05d69d95

Merge tag 'nvme-5.14-2021-07-15' of git://git.infradead.org/nvme into block-5.14 · a347c153

Jens Axboe authored Jul 15, 2021

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.14

 - fix various races in nvme-pci when shutting down just after probing
   (Casey Chen)
 - fix a net_device leak in nvme-tcp (Prabhakar Kushwaha)"

* tag 'nvme-5.14-2021-07-15' of git://git.infradead.org/nvme:
  nvme-pci: do not call nvme_dev_remove_admin from nvme_remove
  nvme-pci: fix multiple races in nvme_setup_io_queues
  nvme-tcp: use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE

a347c153

nbd: fix order of cleaning up the queue and freeing the tagset · 16ad3db3

Wang Qing authored Jul 06, 2021

We must release the queue before freeing the tagset.

Fixes: 4af5f2e0 ("nbd: use blk_mq_alloc_disk and blk_cleanup_disk")
Reported-and-tested-by: syzbot+9ca43ff47167c0ee3466@syzkaller.appspotmail.com
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210706040016.1360412-1-guoqing.jiang@linux.devSigned-off-by: Jens Axboe <axboe@kernel.dk>

16ad3db3

pd: fix order of cleaning up the queue and freeing the tagset · 58b63e0f

Guoqing Jiang authored Jul 06, 2021

We must release the queue before freeing the tagset.

Fixes: 262d431f ("pd: use blk_mq_alloc_disk and blk_cleanup_disk")
Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210706010734.1356066-1-guoqing.jiang@linux.devSigned-off-by: Jens Axboe <axboe@kernel.dk>

58b63e0f

13 Jul, 2021 3 commits

nvme-pci: do not call nvme_dev_remove_admin from nvme_remove · 251ef6f7

Casey Chen authored Jul 07, 2021

nvme_dev_remove_admin could free dev->admin_q and the admin_tagset
while they are being accessed by nvme_dev_disable(), which can be called
by nvme_reset_work via nvme_remove_dead_ctrl.

Commit cb4bfda6 ("nvme-pci: fix hot removal during error handling")
intended to avoid requests being stuck on a removed controller by killing
the admin queue. But the later fix c8e9e9b7 ("nvme-pci: unquiesce
admin queue on shutdown"), together with nvme_dev_disable(dev, true)
right before nvme_dev_remove_admin() could help dispatch requests and
fail them early, so we don't need nvme_dev_remove_admin() any more.

Fixes: cb4bfda6 ("nvme-pci: fix hot removal during error handling")
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

251ef6f7

nvme-pci: fix multiple races in nvme_setup_io_queues · e4b9852a

Casey Chen authored Jul 07, 2021

Below two paths could overlap each other if we power off a drive quickly
after powering it on. There are multiple races in nvme_setup_io_queues()
because of shutdown_lock missing and improper use of NVMEQ_ENABLED bit.

nvme_reset_work()                                nvme_remove()
  nvme_setup_io_queues()                           nvme_dev_disable()
  ...                                              ...
A1  clear NVMEQ_ENABLED bit for admin queue          lock
    retry:                                       B1  nvme_suspend_io_queues()
A2    pci_free_irq() admin queue                 B2  nvme_suspend_queue() admin queue
A3    pci_free_irq_vectors()                         nvme_pci_disable()
A4    nvme_setup_irqs();                         B3    pci_free_irq_vectors()
      ...                                            unlock
A5    queue_request_irq() for admin queue
      set NVMEQ_ENABLED bit
      ...
      nvme_create_io_queues()
A6      result = queue_request_irq();
        set NVMEQ_ENABLED bit
      ...
      fail to allocate enough IO queues:
A7      nvme_suspend_io_queues()
        goto retry

If B3 runs in between A1 and A2, it will crash if irqaction haven't
been freed by A2. B2 is supposed to free admin queue IRQ but it simply
can't fulfill the job as A1 has cleared NVMEQ_ENABLED bit.

Fix: combine A1 A2 so IRQ get freed as soon as the NVMEQ_ENABLED bit
gets cleared.

After solved #1, A2 could race with B3 if A2 is freeing IRQ while B3
is checking irqaction. A3 also could race with B2 if B2 is freeing
IRQ while A3 is checking irqaction.

Fix: A2 and A3 take lock for mutual exclusion.

A3 could race with B3 since they could run free_msi_irqs() in parallel.

Fix: A3 takes lock for mutual exclusion.

A4 could fail to allocate all needed IRQ vectors if A3 and A4 are
interrupted by B3.

Fix: A4 takes lock for mutual exclusion.

If A5/A6 happened after B2/B1, B3 will crash since irqaction is not NULL.
They are just allocated by A5/A6.

Fix: Lock queue_request_irq() and setting of NVMEQ_ENABLED bit.

A7 could get chance to pci_free_irq() for certain IO queue while B3 is
checking irqaction.

Fix: A7 takes lock.

nvme_dev->online_queues need to be protected by shutdown_lock. Since it
is not atomic, both paths could modify it using its own copy.
Co-developed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e4b9852a

nvme-tcp: use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE · 8b43ced6

Prabhakar Kushwaha authored Jul 13, 2021

dev_get_by_name() finds network device by name but it also increases the
reference count.

If a nvme-tcp queue is present and the network device driver is removed
before nvme_tcp, we will face the following continuous log:

  "kernel:unregister_netdevice: waiting for <eth> to become free. Usage count = 2"

And rmmod further halts. Similar case arises during reboot/shutdown
with nvme-tcp queue present and both never completes.

To fix this, use __dev_get_by_name() which finds network device by
name without increasing any reference counter.

Fixes: 3ede8f72 ("nvme-tcp: allow selecting the network interface for connections")
Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: remove the ->ndev member entirely]
Signed-off-by: Christoph Hellwig <hch@lst.de>

8b43ced6

07 Jul, 2021 3 commits

blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs · a731763f

Yu Kuai authored Jul 07, 2021

We run a test that create millions of cgroups and blkgs, and then trigger
blkg_destroy_all(). blkg_destroy_all() will hold spin lock for a long
time in such situation. Thus release the lock when a batch of blkgs are
destroyed.

blkcg_activate_policy() and blkcg_deactivate_policy() might have the
same problem, however, as they are basically only called from module
init/exit paths, let's leave them alone for now.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210707015649.1929797-1-yukuai3@huawei.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

a731763f

block: fix the problem of io_ticks becoming smaller · d80c228d

Chunguang Xu authored Jul 06, 2021

On the IO submission path, blk_account_io_start() may interrupt
the system interruption. When the interruption returns, the value
of part->stamp may have been updated by other cores, so the time
value collected before the interruption may be less than part->
stamp. So when this happens, we should do nothing to make io_ticks
more accurate? For kernels less than 5.0, this may cause io_ticks
to become smaller, which in turn may cause abnormal ioutil values.
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/1625521646-1069-1-git-send-email-brookxu.cn@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

d80c228d

Merge branch 'nvme-5.14' of git://git.infradead.org/nvme into block-5.14 · c6af8db9

Jens Axboe authored Jul 07, 2021

Pull single NVMe fix from Christoph.

* 'nvme-5.14' of git://git.infradead.org/nvme:
  nvme-tcp: can't set sk_user_data without write_lock

c6af8db9

05 Jul, 2021 1 commit

nvme-tcp: can't set sk_user_data without write_lock · 0755d3be

Maurizio Lombardi authored Jul 02, 2021

The sk_user_data pointer is supposed to be modified only while
holding the write_lock "sk_callback_lock", otherwise
we could race with other threads and crash the kernel.

we can't take the write_lock in nvmet_tcp_state_change()
because it would cause a deadlock, but the release_work queue
will set the pointer to NULL later so we can simply remove
the assignment.

Fixes: b5332a9f ("nvmet-tcp: fix incorrect locking in state_change sk callback")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

0755d3be

02 Jul, 2021 1 commit

loop: remove unused variable in loop_set_status() · 585af8ed

Tetsuo Handa authored Jul 03, 2021

Commit 0384264e ("block: pass a gendisk to bdev_disk_changed")
changed to pass lo->lo_disk instead of lo->lo_device.

Fixes: 0384264e ("block: pass a gendisk to bdev_disk_changed")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/r/20210702152714.7978-1-penguin-kernel@I-love.SAKURA.ne.jpSigned-off-by: Jens Axboe <axboe@kernel.dk>

585af8ed

01 Jul, 2021 5 commits

block: remove the bdgrab in blk_drop_partitions · 63c38d85

Christoph Hellwig authored Jul 01, 2021

There is no need to hold a bdev reference when removing the partition.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210701081638.246552-3-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

63c38d85

block: grab a device refcount in disk_uevent · 498dcc13

Christoph Hellwig authored Jul 01, 2021

Sending uevents requires the struct device to be alive.  To
ensure that grab the device refcount instead of just an inode
reference.

Fixes: bc359d03 ("block: add a disk_uevent helper")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210701081638.246552-2-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

498dcc13

s390/dasd: Avoid field over-reading memcpy() · 2b7a8dc0

Kees Cook authored Jul 01, 2021

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field array bounds checking for memcpy(), memmove(), and memset(),
avoid intentionally reading across neighboring array fields.

Add a wrapping structure to serve as the memcpy() source, so the compiler
can do appropriate bounds checking, avoiding this future warning:

In function '__fortify_memcpy',
    inlined from 'create_uid' at drivers/s390/block/dasd_eckd.c:749:2:
./include/linux/fortify-string.h:246:4: error: call to '__read_overflow2_field' declared with attribute error: detected read beyond size of field (2nd parameter)
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Link: https://lore.kernel.org/r/20210701142221.3408680-3-sth@linux.ibm.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

2b7a8dc0

dasd: unexport dasd_set_target_state · 299f2b5f

Christoph Hellwig authored Jul 01, 2021

dasd_set_target_state is only used inside of dasd_mod.ko, so don't
export it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Link: https://lore.kernel.org/r/20210701142221.3408680-2-sth@linux.ibm.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

299f2b5f

block: check disk exist before trying to add partition · b5cfbd35

Yufen Yu authored Jun 10, 2021

If disk have been deleted, we should return fail for ioctl
BLKPG_DEL_PARTITION. Otherwise, the directory /sys/class/block
may remain invalid symlinks file. The race as following:

blkdev_open
				del_gendisk
				    disk->flags &= ~GENHD_FL_UP;
				    blk_drop_partitions
blkpg_ioctl
    bdev_add_partition
    add_partition
        device_add
	    device_add_class_symlinks

ioctl may add_partition after del_gendisk() have tried to delete
partitions. Then, symlinks file will be created.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Link: https://lore.kernel.org/r/20210610023241.3646241-1-yuyufen@huawei.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

b5cfbd35

30 Jun, 2021 21 commits

ubd: remove dead code in ubd_setup_common · efee99e6

Christoph Hellwig authored Jun 28, 2021

Remove some leftovers of the fake major number parsing that cause
complains from some compilers.

Fixes: 2933a1b2c6f3 ("ubd: remove the code to register as the legacy IDE driver")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210628093937.1325608-1-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

efee99e6

nvme: use return value from blk_execute_rq() · ae5e6886

Keith Busch authored Jun 10, 2021

We don't have an nvme status to report if the driver's .queue_rq()
returns an error without dispatching the requested nvme command. Check
the return value from blk_execute_rq() for all passthrough commands so
the caller may know their command was not successful.

If the command is from the target passthrough interface and fails to
dispatch, synthesize the response back to the host as a internal target
error.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210610214437.641245-5-kbusch@kernel.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

ae5e6886

block: return errors from blk_execute_rq() · fb9b16e1

Keith Busch authored Jun 10, 2021

The synchronous blk_execute_rq() had not provided a way for its callers
to know if its request was successful or not. Return the blk_status_t
result of the request.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210610214437.641245-4-kbusch@kernel.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

fb9b16e1

nvme: use blk_execute_rq() for passthrough commands · be42a33b

Keith Busch authored Jun 10, 2021

The generic blk_execute_rq() knows how to handle polled completions. Use
that instead of implementing an nvme specific handler.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210610214437.641245-3-kbusch@kernel.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

be42a33b

block: support polling through blk_execute_rq · c01b5a81

Keith Busch authored Jun 10, 2021

Poll for completions if the request's hctx is a polling type.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210610214437.641245-2-kbusch@kernel.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

c01b5a81

block: remove REQ_OP_SCSI_{IN,OUT} · da6269da

Christoph Hellwig authored Jun 24, 2021

With the legacy IDE driver gone drivers now use either REQ_OP_DRV_*
or REQ_OP_SCSI_*, so unify the two concepts of passthrough requests
into a single one.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

da6269da

block: mark blk_mq_init_queue_data static · 5ec780a6

Christoph Hellwig authored Jun 24, 2021

All driver uses are gone now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210624081012.256464-1-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

5ec780a6

loop: rewrite loop_exit using idr_for_each_entry · 8e60947d

Christoph Hellwig authored Jun 23, 2021

Use idr_for_each_entry to simplify removing all devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210623145908.92973-10-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

8e60947d

loop: split loop_lookup · b9848081

Christoph Hellwig authored Jun 23, 2021

loop_lookup has two callers - one wants to do the a find by index and the
other wants any unbound loop device. Open code the respective
functionality in each caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210623145908.92973-9-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

b9848081

loop: don't allow deleting an unspecified loop device · e5d66a10

Christoph Hellwig authored Jun 23, 2021

Passing a negative index to loop_lookup while return any unbound device.
Doing that for a delete does not make much sense, so add check to
explicitly reject that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210623145908.92973-8-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

e5d66a10

loop: move loop_ctl_mutex locking into loop_add · 18d1f200

Christoph Hellwig authored Jun 23, 2021

Move acquiring and releasing loop_ctl_mutex from the callers into
loop_add.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210623145908.92973-7-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

18d1f200

loop: split loop_control_ioctl · f9d10764

Christoph Hellwig authored Jun 23, 2021

Split loop_control_ioctl into a helper for each command. This keeps the
code nicely separated for the upcoming locking changes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210623145908.92973-6-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

f9d10764

loop: don't call loop_lookup before adding a loop device · 4157fe0b

Christoph Hellwig authored Jun 23, 2021

loop_add returns the right error if the slot wasn't available.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210623145908.92973-5-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

4157fe0b

loop: remove the l argument to loop_add · d6da83d0

Christoph Hellwig authored Jun 23, 2021

None of the callers cares about the allocated struct loop_device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210623145908.92973-4-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

d6da83d0

loop: reduce loop_ctl_mutex coverage in loop_exit · bd5c39ed

Christoph Hellwig authored Jun 23, 2021

loop_ctl_mutex is only needed to iterate the IDR for removing the loop
devices, so reduce the coverage.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210623145908.92973-3-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

bd5c39ed

loop: reorder loop_exit · 8b52d8be

Christoph Hellwig authored Jun 23, 2021

Unregister the misc and blockdevice first to prevent further access,
and only then iterate to remove the devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210623145908.92973-2-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

8b52d8be

mmc: initialized disk->minors · 1033d103

Christoph Hellwig authored Jun 21, 2021

Fix a let hunk from the blk_mq_alloc_disk conversion.

Fixes: 281ea6a5bfdc ("mmc: switch to blk_mq_alloc_disk")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210621080144.3655131-1-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

1033d103

mmc: switch to blk_mq_alloc_disk · 607d968a

Christoph Hellwig authored Jun 16, 2021

Use the blk_mq_alloc_disk to allocate the request_queue and gendisk
together.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210616053934.880951-3-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

607d968a

mmc: remove an extra blk_{get,put}_queue pair · 249cda33

Christoph Hellwig authored Jun 16, 2021

The gendisk already acquires a reference to the queue when add_disk
is called, which dropped on put_disk.  So remove the superflous
extra refcounting.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210616053934.880951-2-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

249cda33

nbd: provide a way for userspace processes to identify device backends · 6497ef8d

Prasanna Kumar Kalever authored Apr 29, 2021

Problem:
On reconfigure of device, there is no way to defend if the backend
storage is matching with the initial backend storage.

Say, if an initial connect request for backend "pool1/image1" got
mapped to /dev/nbd0 and the userspace process is terminated. A next
reconfigure request within NBD_ATTR_DEAD_CONN_TIMEOUT is allowed to
use /dev/nbd0 for a different backend "pool1/image2"

For example, an operation like below could be dangerous:

$ sudo rbd-nbd map --try-netlink rbd-pool/ext4-image
/dev/nbd0
$ sudo blkid /dev/nbd0
/dev/nbd0: UUID="bfc444b4-64b1-418f-8b36-6e0d170cfc04" TYPE="ext4"
$ sudo pkill -9 rbd-nbd
$ sudo rbd-nbd attach --try-netlink --device /dev/nbd0 rbd-pool/xfs-image
/dev/nbd0
$ sudo blkid /dev/nbd0
/dev/nbd0: UUID="d29bf343-6570-4069-a9ea-2fa156ced908" TYPE="xfs"

Solution:
Provide a way for userspace processes to keep some metadata to identify
between the device and the backend, so that when a reconfigure request is
made, we can compare and avoid such dangerous operations.

With this solution, as part of the initial connect request, backend
path can be stored in the sysfs per device config, so that on a reconfigure
request it's easy to check if the backend path matches with the initial
connect backend path.

Please note, ioctl interface to nbd will not have these changes, as there
won't be any reconfigure.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210429102828.31248-1-prasanna.kalever@redhat.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

6497ef8d

ubd: use blk_mq_alloc_disk and blk_cleanup_disk · 35efb594

Christoph Hellwig authored Jun 14, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210614060759.3965724-3-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

35efb594