Commit 67a24222 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:
 "Nothing major in this series, just fixes and improvements all over the
  map. This contains:

   - Series of fixes for sed-opal (David, Jonas)

   - Fixes and performance tweaks for BFQ (via Paolo)

   - Set of fixes for bcache (via Coly)

   - Set of fixes for md (via Song)

   - Enabling multi-page for passthrough requests (Ming)

   - Queue release fix series (Ming)

   - Device notification improvements (Martin)

   - Propagate underlying device rotational status in loop (Holger)

   - Removal of mtip32xx trim support, which has been disabled for years
     (Christoph)

   - Improvement and cleanup of nvme command handling (Christoph)

   - Add block SPDX tags (Christoph)

   - Cleanup/hardening of bio/bvec iteration (Christoph)

   - A few NVMe pull requests (Christoph)

   - Removal of CONFIG_LBDAF (Christoph)

   - Various little fixes here and there"

* tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block: (164 commits)
  block: fix mismerge in bvec_advance
  block: don't drain in-progress dispatch in blk_cleanup_queue()
  blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release
  blk-mq: always free hctx after request queue is freed
  blk-mq: split blk_mq_alloc_and_init_hctx into two parts
  blk-mq: free hw queue's resource in hctx's release handler
  blk-mq: move cancel of requeue_work into blk_mq_release
  blk-mq: grab .q_usage_counter when queuing request from plug code path
  block: fix function name in comment
  nvmet: protect discovery change log event list iteration
  nvme: mark nvme_core_init and nvme_core_exit static
  nvme: move command size checks to the core
  nvme-fabrics: check more command sizes
  nvme-pci: check more command sizes
  nvme-pci: remove an unneeded variable initialization
  nvme-pci: unquiesce admin queue on shutdown
  nvme-pci: shutdown on timeout during deletion
  nvme-pci: fix psdt field for single segment sgls
  nvme-multipath: don't print ANA group state by default
  nvme-multipath: split bios with the ns_head bio_set before submitting
  ...
parents 8b35ad62 b8753433
......@@ -20,13 +20,26 @@ for that device, by setting low_latency to 0. See Section 3 for
details on how to configure BFQ for the desired tradeoff between
latency and throughput, or on how to maximize throughput.
BFQ has a non-null overhead, which limits the maximum IOPS that a CPU
can process for a device scheduled with BFQ. To give an idea of the
limits on slow or average CPUs, here are, first, the limits of BFQ for
three different CPUs, on, respectively, an average laptop, an old
desktop, and a cheap embedded system, in case full hierarchical
support is enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set), but
CONFIG_DEBUG_BLK_CGROUP is not set (Section 4-2):
As every I/O scheduler, BFQ adds some overhead to per-I/O-request
processing. To give an idea of this overhead, the total,
single-lock-protected, per-request processing time of BFQ---i.e., the
sum of the execution times of the request insertion, dispatch and
completion hooks---is, e.g., 1.9 us on an Intel Core i7-2760QM@2.40GHz
(dated CPU for notebooks; time measured with simple code
instrumentation, and using the throughput-sync.sh script of the S
suite [1], in performance-profiling mode). To put this result into
context, the total, single-lock-protected, per-request execution time
of the lightest I/O scheduler available in blk-mq, mq-deadline, is 0.7
us (mq-deadline is ~800 LOC, against ~10500 LOC for BFQ).
Scheduling overhead further limits the maximum IOPS that a CPU can
process (already limited by the execution of the rest of the I/O
stack). To give an idea of the limits with BFQ, on slow or average
CPUs, here are, first, the limits of BFQ for three different CPUs, on,
respectively, an average laptop, an old desktop, and a cheap embedded
system, in case full hierarchical support is enabled (i.e.,
CONFIG_BFQ_GROUP_IOSCHED is set), but CONFIG_DEBUG_BLK_CGROUP is not
set (Section 4-2):
- Intel i7-4850HQ: 400 KIOPS
- AMD A8-3850: 250 KIOPS
- ARM CortexTM-A53 Octa-core: 80 KIOPS
......@@ -566,3 +579,5 @@ applications. Unset this tunable if you need/want to control weights.
Slightly extended version:
http://algogroup.unimore.it/people/paolo/disk_sched/bfq-v1-suite-
results.pdf
[3] https://github.com/Algodev-github/S
......@@ -93,3 +93,7 @@ zoned=[0/1]: Default: 0
zone_size=[MB]: Default: 256
Per zone size when exposed as a zoned block device. Must be a power of two.
zone_nr_conv=[nr_conv]: Default: 0
The number of conventional zones to create when block device is zoned. If
zone_nr_conv >= nr_zones, it will be reduced to nr_zones - 1.
......@@ -72,47 +72,44 @@ and elsewhere regarding submitting Linux kernel patches.
13) Has been build- and runtime tested with and without ``CONFIG_SMP`` and
``CONFIG_PREEMPT.``
14) If the patch affects IO/Disk, etc: has been tested with and without
``CONFIG_LBDAF.``
16) All codepaths have been exercised with all lockdep features enabled.
15) All codepaths have been exercised with all lockdep features enabled.
17) All new ``/proc`` entries are documented under ``Documentation/``
16) All new ``/proc`` entries are documented under ``Documentation/``
17) All new kernel boot parameters are documented in
18) All new kernel boot parameters are documented in
``Documentation/admin-guide/kernel-parameters.rst``.
18) All new module parameters are documented with ``MODULE_PARM_DESC()``
19) All new module parameters are documented with ``MODULE_PARM_DESC()``
19) All new userspace interfaces are documented in ``Documentation/ABI/``.
20) All new userspace interfaces are documented in ``Documentation/ABI/``.
See ``Documentation/ABI/README`` for more information.
Patches that change userspace interfaces should be CCed to
linux-api@vger.kernel.org.
20) Check that it all passes ``make headers_check``.
21) Check that it all passes ``make headers_check``.
21) Has been checked with injection of at least slab and page-allocation
22) Has been checked with injection of at least slab and page-allocation
failures. See ``Documentation/fault-injection/``.
If the new code is substantial, addition of subsystem-specific fault
injection might be appropriate.
22) Newly-added code has been compiled with ``gcc -W`` (use
23) Newly-added code has been compiled with ``gcc -W`` (use
``make EXTRA_CFLAGS=-W``). This will generate lots of noise, but is good
for finding bugs like "warning: comparison between signed and unsigned".
23) Tested after it has been merged into the -mm patchset to make sure
24) Tested after it has been merged into the -mm patchset to make sure
that it still works with all of the other queued patches and various
changes in the VM, VFS, and other subsystems.
24) All memory barriers {e.g., ``barrier()``, ``rmb()``, ``wmb()``} need a
25) All memory barriers {e.g., ``barrier()``, ``rmb()``, ``wmb()``} need a
comment in the source code that explains the logic of what they are doing
and why.
25) If any ioctl's are added by the patch, then also update
26) If any ioctl's are added by the patch, then also update
``Documentation/ioctl/ioctl-number.txt``.
26) If your modified source code depends on or uses any of the kernel
27) If your modified source code depends on or uses any of the kernel
APIs or features that are related to the following ``Kconfig`` symbols,
then test multiple builds with the related ``Kconfig`` symbols disabled
and/or ``=m`` (if that option is available) [not all of these at the
......
......@@ -74,38 +74,34 @@ Linux カーネルパッチ投稿者向けチェックリスト
13: CONFIG_SMP, CONFIG_PREEMPT を有効にした場合と無効にした場合の両方で
ビルドした上、動作確認を行ってください。
14: もしパッチがディスクのI/O性能などに影響を与えるようであれば、
'CONFIG_LBDAF'オプションを有効にした場合と無効にした場合の両方で
テストを実施してみてください。
14: lockdepの機能を全て有効にした上で、全てのコードパスを評価してください。
15: lockdepの機能を全て有効にした上で、全てのコードパスを評価してください。
16: /proc に新しいエントリを追加した場合には、Documentation/ 配下に
15: /proc に新しいエントリを追加した場合には、Documentation/ 配下に
必ずドキュメントを追加してください。
17: 新しいブートパラメータを追加した場合には、
16: 新しいブートパラメータを追加した場合には、
必ずDocumentation/admin-guide/kernel-parameters.rst に説明を追加してください。
18: 新しくmoduleにパラメータを追加した場合には、MODULE_PARM_DESC()を
17: 新しくmoduleにパラメータを追加した場合には、MODULE_PARM_DESC()を
利用して必ずその説明を記述してください。
19: 新しいuserspaceインタフェースを作成した場合には、Documentation/ABI/ に
18: 新しいuserspaceインタフェースを作成した場合には、Documentation/ABI/ に
Documentation/ABI/README を参考にして必ずドキュメントを追加してください。
20: 'make headers_check'を実行して全く問題がないことを確認してください。
19: 'make headers_check'を実行して全く問題がないことを確認してください。
21: 少なくともslabアロケーションとpageアロケーションに失敗した場合の
20: 少なくともslabアロケーションとpageアロケーションに失敗した場合の
挙動について、fault-injectionを利用して確認してください。
Documentation/fault-injection/ を参照してください。
追加したコードがかなりの量であったならば、サブシステム特有の
fault-injectionを追加したほうが良いかもしれません。
22: 新たに追加したコードは、`gcc -W'でコンパイルしてください。
21: 新たに追加したコードは、`gcc -W'でコンパイルしてください。
このオプションは大量の不要なメッセージを出力しますが、
"warning: comparison between signed and unsigned" のようなメッセージは、
バグを見つけるのに役に立ちます。
23: 投稿したパッチが -mm パッチセットにマージされた後、全ての既存のパッチや
22: 投稿したパッチが -mm パッチセットにマージされた後、全ての既存のパッチや
VM, VFS およびその他のサブシステムに関する様々な変更と、現時点でも共存
できることを確認するテストを行ってください。
......@@ -15,7 +15,6 @@ CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_SLAB=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -17,7 +17,6 @@ CONFIG_PERF_EVENTS=y
CONFIG_SLAB=y
CONFIG_KPROBES=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -18,7 +18,6 @@ CONFIG_PERF_EVENTS=y
CONFIG_ISA_ARCOMPACT=y
CONFIG_KPROBES=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -20,7 +20,6 @@ CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -18,7 +18,6 @@ CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -18,7 +18,6 @@ CONFIG_PERF_EVENTS=y
CONFIG_ISA_ARCOMPACT=y
CONFIG_KPROBES=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -17,7 +17,6 @@ CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_KPROBES=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -12,7 +12,6 @@ CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_KPROBES=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -23,7 +23,6 @@ CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_JUMP_LABEL=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_GCC_PLUGINS=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_BLK_DEBUG_FS is not set
# CONFIG_IOSCHED_DEADLINE is not set
......
......@@ -23,7 +23,6 @@ CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_JUMP_LABEL=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_GCC_PLUGINS=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_BLK_DEBUG_FS is not set
# CONFIG_IOSCHED_DEADLINE is not set
......
......@@ -9,7 +9,6 @@ CONFIG_EMBEDDED=y
CONFIG_SLAB=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -6,7 +6,6 @@ CONFIG_RD_LZMA=y
CONFIG_EMBEDDED=y
CONFIG_SLOB=y
CONFIG_JUMP_LABEL=y
# CONFIG_LBDAF is not set
CONFIG_PARTITION_ADVANCED=y
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARCH_CLPS711X=y
......
......@@ -11,7 +11,6 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_EMBEDDED=y
# CONFIG_VM_EVENT_COUNTERS is not set
# CONFIG_SLUB_DEBUG is not set
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -13,7 +13,6 @@ CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARCH_PXA=y
......
......@@ -4,7 +4,6 @@ CONFIG_HIGH_RES_TIMERS=y
CONFIG_LOG_BUF_SHIFT=14
CONFIG_BLK_DEV_INITRD=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -12,7 +12,6 @@ CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARCH_PXA=y
......
......@@ -15,7 +15,6 @@ CONFIG_EMBEDDED=y
# CONFIG_VM_EVENT_COUNTERS is not set
# CONFIG_SLUB_DEBUG is not set
# CONFIG_COMPAT_BRK is not set
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_ARCH_MULTI_V4=y
......
......@@ -5,7 +5,6 @@ CONFIG_BLK_DEV_INITRD=y
CONFIG_EMBEDDED=y
CONFIG_SLOB=y
CONFIG_JUMP_LABEL=y
# CONFIG_LBDAF is not set
CONFIG_PARTITION_ADVANCED=y
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARCH_MULTI_V4T=y
......
......@@ -17,7 +17,6 @@ CONFIG_OPROFILE=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -13,7 +13,6 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_EMBEDDED=y
# CONFIG_VM_EVENT_COUNTERS is not set
# CONFIG_SLUB_DEBUG is not set
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -9,7 +9,6 @@ CONFIG_EXPERT=y
# CONFIG_VM_EVENT_COUNTERS is not set
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_PARTITION_ADVANCED=y
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -14,7 +14,6 @@ CONFIG_PROFILING=y
CONFIG_OPROFILE=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -12,7 +12,6 @@ CONFIG_EMBEDDED=y
# CONFIG_VM_EVENT_COUNTERS is not set
# CONFIG_SLUB_DEBUG is not set
# CONFIG_COMPAT_BRK is not set
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_CFQ is not set
# CONFIG_MMU is not set
......
......@@ -11,7 +11,6 @@ CONFIG_SYSCTL_SYSCALL=y
# CONFIG_AIO is not set
CONFIG_EMBEDDED=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -17,7 +17,6 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_EMBEDDED=y
# CONFIG_VM_EVENT_COUNTERS is not set
# CONFIG_COMPAT_BRK is not set
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_BLK_CMDLINE_PARSER=y
# CONFIG_MMU is not set
......
......@@ -18,7 +18,6 @@ CONFIG_KEXEC=y
# CONFIG_SECCOMP is not set
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_PARTITION_ADVANCED=y
CONFIG_BSD_DISKLABEL=y
......
......@@ -17,7 +17,6 @@ CONFIG_TC=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_LBDAF is not set
CONFIG_PARTITION_ADVANCED=y
CONFIG_OSF_PARTITION=y
# CONFIG_EFI_PARTITION is not set
......
......@@ -16,7 +16,6 @@ CONFIG_TC=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_LBDAF is not set
CONFIG_PARTITION_ADVANCED=y
CONFIG_OSF_PARTITION=y
# CONFIG_EFI_PARTITION is not set
......
......@@ -19,7 +19,6 @@ CONFIG_MACH_LOONGSON32=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_NET=y
......
......@@ -20,7 +20,6 @@ CONFIG_LOONGSON1_LS1C=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_NET=y
......
......@@ -19,7 +19,6 @@ CONFIG_PCI=y
# CONFIG_PCI_QUIRKS is not set
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_PARTITION_ADVANCED=y
CONFIG_MAC_PARTITION=y
......
......@@ -17,7 +17,6 @@ CONFIG_TOSHIBA_RBTX4938_MPLEX_KEEP=y
CONFIG_PCI=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_NET=y
CONFIG_PACKET=y
......
......@@ -14,7 +14,6 @@ CONFIG_SLAB=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_PA7100LC=y
CONFIG_SMP=y
......
......@@ -19,7 +19,6 @@ CONFIG_SLAB=y
CONFIG_PROFILING=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_CFQ_GROUP_IOSCHED=y
CONFIG_CPU_SUBTYPE_SH7786=y
......
......@@ -7,7 +7,6 @@ CONFIG_LOG_BUF_SHIFT=14
CONFIG_BLK_DEV_INITRD=y
# CONFIG_KALLSYMS is not set
CONFIG_SLAB=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_CPU_SUBTYPE_SH7724=y
CONFIG_MEMORY_SIZE=0x10000000
......
......@@ -16,7 +16,6 @@ CONFIG_PERF_COUNTERS=y
CONFIG_SLAB=y
CONFIG_MMAP_ALLOW_UNINITIALIZED=y
CONFIG_PROFILING=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_PARTITION_ADVANCED=y
# CONFIG_IOSCHED_DEADLINE is not set
......
......@@ -3,7 +3,6 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_EMBEDDED=y
# CONFIG_VM_EVENT_COUNTERS is not set
CONFIG_SLAB=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
......
......@@ -11,7 +11,6 @@ CONFIG_PROFILING=y
CONFIG_GCOV_KERNEL=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_CPU_SUBTYPE_SH7785=y
CONFIG_MEMORY_START=0x40000000
......
......@@ -26,30 +26,6 @@ menuconfig BLOCK
if BLOCK
config LBDAF
bool "Support for large (2TB+) block devices and files"
depends on !64BIT
default y
help
Enable block devices or files of size 2TB and larger.
This option is required to support the full capacity of large
(2TB+) block devices, including RAID, disk, Network Block Device,
Logical Volume Manager (LVM) and loopback.
This option also enables support for single files larger than
2TB.
The ext4 filesystem requires that this feature be enabled in
order to support filesystems that have the huge_file feature
enabled. Otherwise, it will refuse to mount in the read-write
mode any filesystems that use the huge_file feature, which is
enabled by default by mke2fs.ext4.
The GFS2 filesystem also requires this feature.
If unsure, say Y.
config BLK_SCSI_REQUEST
bool
......
// SPDX-License-Identifier: GPL-2.0
/*
* Bad block management
*
* - Heavily based on MD badblocks code from Neil Brown
*
* Copyright (c) 2015, Intel Corporation.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*/
#include <linux/badblocks.h>
......
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* cgroups support for the BFQ I/O scheduler.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*/
#include <linux/module.h>
#include <linux/slab.h>
......@@ -578,7 +569,8 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
bfqg_and_blkg_get(bfqg);
if (bfq_bfqq_busy(bfqq)) {
bfq_pos_tree_add_move(bfqd, bfqq);
if (unlikely(!bfqd->nonrot_with_queueing))
bfq_pos_tree_add_move(bfqd, bfqq);
bfq_activate_bfqq(bfqd, bfqq);
}
......@@ -1102,7 +1094,7 @@ struct cftype bfq_blkcg_legacy_files[] = {
},
#endif /* CONFIG_DEBUG_BLK_CGROUP */
/* the same statictics which cover the bfqg and its descendants */
/* the same statistics which cover the bfqg and its descendants */
{
.name = "bfq.io_service_bytes_recursive",
.private = (unsigned long)&blkcg_policy_bfq,
......
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* Header file for the BFQ I/O scheduler: data structures and
* prototypes of interface functions among BFQ components.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*/
#ifndef _BFQ_H
#define _BFQ_H
......@@ -32,6 +23,8 @@
#define BFQ_DEFAULT_GRP_IOPRIO 0
#define BFQ_DEFAULT_GRP_CLASS IOPRIO_CLASS_BE
#define MAX_PID_STR_LENGTH 12
/*
* Soft real-time applications are extremely more latency sensitive
* than interactive ones. Over-raise the weight of the former to
......@@ -89,7 +82,7 @@ struct bfq_service_tree {
* expiration. This peculiar definition allows for the following
* optimization, not yet exploited: while a given entity is still in
* service, we already know which is the best candidate for next
* service among the other active entitities in the same parent
* service among the other active entities in the same parent
* entity. We can then quickly compare the timestamps of the
* in-service entity with those of such best candidate.
*
......@@ -140,7 +133,7 @@ struct bfq_weight_counter {
*
* Unless cgroups are used, the weight value is calculated from the
* ioprio to export the same interface as CFQ. When dealing with
* ``well-behaved'' queues (i.e., queues that do not spend too much
* "well-behaved" queues (i.e., queues that do not spend too much
* time to consume their budget and have true sequential behavior, and
* when there are no external factors breaking anticipation) the
* relative weights at each level of the cgroups hierarchy should be
......@@ -240,6 +233,13 @@ struct bfq_queue {
/* next ioprio and ioprio class if a change is in progress */
unsigned short new_ioprio, new_ioprio_class;
/* last total-service-time sample, see bfq_update_inject_limit() */
u64 last_serv_time_ns;
/* limit for request injection */
unsigned int inject_limit;
/* last time the inject limit has been decreased, in jiffies */
unsigned long decrease_time_jif;
/*
* Shared bfq_queue if queue is cooperating with one or more
* other queues.
......@@ -357,29 +357,6 @@ struct bfq_queue {
/* max service rate measured so far */
u32 max_service_rate;
/*
* Ratio between the service received by bfqq while it is in
* service, and the cumulative service (of requests of other
* queues) that may be injected while bfqq is empty but still
* in service. To increase precision, the coefficient is
* measured in tenths of unit. Here are some example of (1)
* ratios, (2) resulting percentages of service injected
* w.r.t. to the total service dispatched while bfqq is in
* service, and (3) corresponding values of the coefficient:
* 1 (50%) -> 10
* 2 (33%) -> 20
* 10 (9%) -> 100
* 9.9 (9%) -> 99
* 1.5 (40%) -> 15
* 0.5 (66%) -> 5
* 0.1 (90%) -> 1
*
* So, if the coefficient is lower than 10, then
* injected service is more than bfqq service.
*/
unsigned int inject_coeff;
/* amount of service injected in current service slot */
unsigned int injected_service;
};
/**
......@@ -418,6 +395,15 @@ struct bfq_io_cq {
*/
bool was_in_burst_list;
/*
* Save the weight when a merge occurs, to be able
* to restore it in case of split. If the weight is not
* correctly resumed when the queue is recycled,
* then the weight of the recycled queue could differ
* from the weight of the original queue.
*/
unsigned int saved_weight;
/*
* Similar to previous fields: save wr information.
*/
......@@ -450,7 +436,7 @@ struct bfq_data {
* weight-raised @bfq_queue (see the comments to the functions
* bfq_weights_tree_[add|remove] for further details).
*/
struct rb_root queue_weights_tree;
struct rb_root_cached queue_weights_tree;
/*
* Number of groups with at least one descendant process that
......@@ -513,6 +499,9 @@ struct bfq_data {
/* number of requests dispatched and waiting for completion */
int rq_in_driver;
/* true if the device is non rotational and performs queueing */
bool nonrot_with_queueing;
/*
* Maximum number of requests in driver in the last
* @hw_tag_samples completed requests.
......@@ -544,6 +533,26 @@ struct bfq_data {
/* time of last request completion (ns) */
u64 last_completion;
/* time of last transition from empty to non-empty (ns) */
u64 last_empty_occupied_ns;
/*
* Flag set to activate the sampling of the total service time
* of a just-arrived first I/O request (see
* bfq_update_inject_limit()). This will cause the setting of
* waited_rq when the request is finally dispatched.
*/
bool wait_dispatch;
/*
* If set, then bfq_update_inject_limit() is invoked when
* waited_rq is eventually completed.
*/
struct request *waited_rq;
/*
* True if some request has been injected during the last service hole.
*/
bool rqs_injected;
/* time of first rq dispatch in current observation interval (ns) */
u64 first_dispatch;
/* time of last rq dispatch in current observation interval (ns) */
......@@ -553,6 +562,7 @@ struct bfq_data {
ktime_t last_budget_start;
/* beginning of the last idle slice */
ktime_t last_idling_start;
unsigned long last_idling_start_jiffies;
/* number of samples in current observation interval */
int peak_rate_samples;
......@@ -898,10 +908,10 @@ void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync);
struct bfq_data *bic_to_bfqd(struct bfq_io_cq *bic);
void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq);
void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_queue *bfqq,
struct rb_root *root);
struct rb_root_cached *root);
void __bfq_weights_tree_remove(struct bfq_data *bfqd,
struct bfq_queue *bfqq,
struct rb_root *root);
struct rb_root_cached *root);
void bfq_weights_tree_remove(struct bfq_data *bfqd,
struct bfq_queue *bfqq);
void bfq_bfqq_expire(struct bfq_data *bfqd, struct bfq_queue *bfqq,
......@@ -1008,13 +1018,23 @@ void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq);
/* --------------- end of interface of B-WF2Q+ ---------------- */
/* Logging facilities. */
static inline void bfq_pid_to_str(int pid, char *str, int len)
{
if (pid != -1)
snprintf(str, len, "%d", pid);
else
snprintf(str, len, "SHARED-");
}
#ifdef CONFIG_BFQ_GROUP_IOSCHED
struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
#define bfq_log_bfqq(bfqd, bfqq, fmt, args...) do { \
char pid_str[MAX_PID_STR_LENGTH]; \
bfq_pid_to_str((bfqq)->pid, pid_str, MAX_PID_STR_LENGTH); \
blk_add_cgroup_trace_msg((bfqd)->queue, \
bfqg_to_blkg(bfqq_group(bfqq))->blkcg, \
"bfq%d%c " fmt, (bfqq)->pid, \
"bfq%s%c " fmt, pid_str, \
bfq_bfqq_sync((bfqq)) ? 'S' : 'A', ##args); \
} while (0)
......@@ -1025,10 +1045,13 @@ struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
#else /* CONFIG_BFQ_GROUP_IOSCHED */
#define bfq_log_bfqq(bfqd, bfqq, fmt, args...) \
blk_add_trace_msg((bfqd)->queue, "bfq%d%c " fmt, (bfqq)->pid, \
#define bfq_log_bfqq(bfqd, bfqq, fmt, args...) do { \
char pid_str[MAX_PID_STR_LENGTH]; \
bfq_pid_to_str((bfqq)->pid, pid_str, MAX_PID_STR_LENGTH); \
blk_add_trace_msg((bfqd)->queue, "bfq%s%c " fmt, pid_str, \
bfq_bfqq_sync((bfqq)) ? 'S' : 'A', \
##args)
##args); \
} while (0)
#define bfq_log_bfqg(bfqd, bfqg, fmt, args...) do {} while (0)
#endif /* CONFIG_BFQ_GROUP_IOSCHED */
......
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Hierarchical Budget Worst-case Fair Weighted Fair Queueing
* (B-WF2Q+): hierarchical scheduling algorithm by which the BFQ I/O
* scheduler schedules generic entities. The latter can represent
* either single bfq queues (associated with processes) or groups of
* bfq queues (associated with cgroups).
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*/
#include "bfq-iosched.h"
......@@ -59,7 +50,7 @@ static bool bfq_update_parent_budget(struct bfq_entity *next_in_service);
* bfq_update_next_in_service - update sd->next_in_service
* @sd: sched_data for which to perform the update.
* @new_entity: if not NULL, pointer to the entity whose activation,
* requeueing or repositionig triggered the invocation of
* requeueing or repositioning triggered the invocation of
* this function.
* @expiration: id true, this function is being invoked after the
* expiration of the in-service entity
......@@ -90,7 +81,7 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd,
/*
* If this update is triggered by the activation, requeueing
* or repositiong of an entity that does not coincide with
* or repositioning of an entity that does not coincide with
* sd->next_in_service, then a full lookup in the active tree
* can be avoided. In fact, it is enough to check whether the
* just-modified entity has the same priority as
......@@ -737,7 +728,7 @@ __bfq_entity_update_weight_prio(struct bfq_service_tree *old_st,
struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
unsigned int prev_weight, new_weight;
struct bfq_data *bfqd = NULL;
struct rb_root *root;
struct rb_root_cached *root;
#ifdef CONFIG_BFQ_GROUP_IOSCHED
struct bfq_sched_data *sd;
struct bfq_group *bfqg;
......@@ -1396,7 +1387,7 @@ static struct bfq_entity *bfq_first_active_entity(struct bfq_service_tree *st,
* In this first case, update the virtual time in @st too (see the
* comments on this update inside the function).
*
* In constrast, if there is an in-service entity, then return the
* In contrast, if there is an in-service entity, then return the
* entity that would be set in service if not only the above
* conditions, but also the next one held true: the currently
* in-service entity, on expiration,
......@@ -1479,12 +1470,12 @@ static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd,
* is being invoked as a part of the expiration path
* of the in-service queue. In this case, even if
* sd->in_service_entity is not NULL,
* sd->in_service_entiy at this point is actually not
* sd->in_service_entity at this point is actually not
* in service any more, and, if needed, has already
* been properly queued or requeued into the right
* tree. The reason why sd->in_service_entity is still
* not NULL here, even if expiration is true, is that
* sd->in_service_entiy is reset as a last step in the
* sd->in_service_entity is reset as a last step in the
* expiration path. So, if expiration is true, tell
* __bfq_lookup_next_entity that there is no
* sd->in_service_entity.
......
// SPDX-License-Identifier: GPL-2.0
/*
* bio-integrity.c - bio data integrity extensions
*
* Copyright (C) 2007, 2008, 2009 Oracle Corporation
* Written by: Martin K. Petersen <martin.petersen@oracle.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License version
* 2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; see the file COPYING. If not, write to
* the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
* USA.
*
*/
#include <linux/blkdev.h>
......
This diff is collapsed.
// SPDX-License-Identifier: GPL-2.0
/*
* Common Block IO controller cgroup interface
*
......
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (C) 1991, 1992 Linus Torvalds
* Copyright (C) 1994, Karl Keyte: Added support for disk statistics
......@@ -232,15 +233,6 @@ void blk_sync_queue(struct request_queue *q)
{
del_timer_sync(&q->timeout);
cancel_work_sync(&q->timeout_work);
if (queue_is_mq(q)) {
struct blk_mq_hw_ctx *hctx;
int i;
cancel_delayed_work_sync(&q->requeue_work);
queue_for_each_hw_ctx(q, hctx, i)
cancel_delayed_work_sync(&hctx->run_work);
}
}
EXPORT_SYMBOL(blk_sync_queue);
......@@ -347,18 +339,6 @@ void blk_cleanup_queue(struct request_queue *q)
blk_queue_flag_set(QUEUE_FLAG_DEAD, q);
/*
* make sure all in-progress dispatch are completed because
* blk_freeze_queue() can only complete all requests, and
* dispatch may still be in-progress since we dispatch requests
* from more than one contexts.
*
* We rely on driver to deal with the race in case that queue
* initialization isn't done.
*/
if (queue_is_mq(q) && blk_queue_init_done(q))
blk_mq_quiesce_queue(q);
/* for synchronous bio-based driver finish in-flight integrity i/o */
blk_flush_integrity();
......@@ -375,7 +355,7 @@ void blk_cleanup_queue(struct request_queue *q)
blk_exit_queue(q);
if (queue_is_mq(q))
blk_mq_free_queue(q);
blk_mq_exit_queue(q);
percpu_ref_exit(&q->q_usage_counter);
......
// SPDX-License-Identifier: GPL-2.0
/*
* Functions related to setting various queue properties from drivers
*/
......
// SPDX-License-Identifier: GPL-2.0
/*
* Functions to sequence PREFLUSH and FUA writes.
*
* Copyright (C) 2011 Max Planck Institute for Gravitational Physics
* Copyright (C) 2011 Tejun Heo <tj@kernel.org>
*
* This file is released under the GPLv2.
*
* REQ_{PREFLUSH|FUA} requests are decomposed to sequences consisted of three
* optional steps - PREFLUSH, DATA and POSTFLUSH - according to the request
* properties and hardware capability.
......
// SPDX-License-Identifier: GPL-2.0
/*
* blk-integrity.c - Block layer data integrity extensions
*
* Copyright (C) 2007, 2008 Oracle Corporation
* Written by: Martin K. Petersen <martin.petersen@oracle.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License version
* 2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; see the file COPYING. If not, write to
* the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
* USA.
*
*/
#include <linux/blkdev.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* Block rq-qos base io controller
*
......
......@@ -267,23 +267,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
goto split;
}
if (bvprvp) {
if (seg_size + bv.bv_len > queue_max_segment_size(q))
goto new_segment;
if (!biovec_phys_mergeable(q, bvprvp, &bv))
goto new_segment;
seg_size += bv.bv_len;
bvprv = bv;
bvprvp = &bvprv;
sectors += bv.bv_len >> 9;
if (nsegs == 1 && seg_size > front_seg_size)
front_seg_size = seg_size;
continue;
}
new_segment:
if (nsegs == max_segs)
goto split;
......@@ -370,12 +353,12 @@ EXPORT_SYMBOL(blk_queue_split);
static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
struct bio *bio)
{
struct bio_vec bv, bvprv = { NULL };
int prev = 0;
struct bio_vec uninitialized_var(bv), bvprv = { NULL };
unsigned int seg_size, nr_phys_segs;
unsigned front_seg_size;
struct bio *fbio, *bbio;
struct bvec_iter iter;
bool new_bio = false;
if (!bio)
return 0;
......@@ -396,7 +379,7 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
nr_phys_segs = 0;
for_each_bio(bio) {
bio_for_each_bvec(bv, bio, iter) {
if (prev) {
if (new_bio) {
if (seg_size + bv.bv_len
> queue_max_segment_size(q))
goto new_segment;
......@@ -404,7 +387,6 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
goto new_segment;
seg_size += bv.bv_len;
bvprv = bv;
if (nr_phys_segs == 1 && seg_size >
front_seg_size)
......@@ -413,12 +395,15 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
continue;
}
new_segment:
bvprv = bv;
prev = 1;
bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
&front_seg_size, NULL, UINT_MAX);
new_bio = false;
}
bbio = bio;
if (likely(bio->bi_iter.bi_size)) {
bvprv = bv;
new_bio = true;
}
}
fbio->bi_seg_front_size = front_seg_size;
......@@ -484,79 +469,97 @@ static unsigned blk_bvec_map_sg(struct request_queue *q,
struct scatterlist **sg)
{
unsigned nbytes = bvec->bv_len;
unsigned nsegs = 0, total = 0, offset = 0;
unsigned nsegs = 0, total = 0;
while (nbytes > 0) {
unsigned seg_size;
struct page *pg;
unsigned idx;
*sg = blk_next_sg(sg, sglist);
unsigned offset = bvec->bv_offset + total;
unsigned len = min(get_max_segment_size(q, offset), nbytes);
struct page *page = bvec->bv_page;
seg_size = get_max_segment_size(q, bvec->bv_offset + total);
seg_size = min(nbytes, seg_size);
offset = (total + bvec->bv_offset) % PAGE_SIZE;
idx = (total + bvec->bv_offset) / PAGE_SIZE;
pg = bvec_nth_page(bvec->bv_page, idx);
/*
* Unfortunately a fair number of drivers barf on scatterlists
* that have an offset larger than PAGE_SIZE, despite other
* subsystems dealing with that invariant just fine. For now
* stick to the legacy format where we never present those from
* the block layer, but the code below should be removed once
* these offenders (mostly MMC/SD drivers) are fixed.
*/
page += (offset >> PAGE_SHIFT);
offset &= ~PAGE_MASK;
sg_set_page(*sg, pg, seg_size, offset);
*sg = blk_next_sg(sg, sglist);
sg_set_page(*sg, page, len, offset);
total += seg_size;
nbytes -= seg_size;
total += len;
nbytes -= len;
nsegs++;
}
return nsegs;
}
static inline void
__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
struct scatterlist *sglist, struct bio_vec *bvprv,
struct scatterlist **sg, int *nsegs)
static inline int __blk_bvec_map_sg(struct bio_vec bv,
struct scatterlist *sglist, struct scatterlist **sg)
{
*sg = blk_next_sg(sg, sglist);
sg_set_page(*sg, bv.bv_page, bv.bv_len, bv.bv_offset);
return 1;
}
/* only try to merge bvecs into one sg if they are from two bios */
static inline bool
__blk_segment_map_sg_merge(struct request_queue *q, struct bio_vec *bvec,
struct bio_vec *bvprv, struct scatterlist **sg)
{
int nbytes = bvec->bv_len;
if (*sg) {
if ((*sg)->length + nbytes > queue_max_segment_size(q))
goto new_segment;
if (!biovec_phys_mergeable(q, bvprv, bvec))
goto new_segment;
if (!*sg)
return false;
(*sg)->length += nbytes;
} else {
new_segment:
if (bvec->bv_offset + bvec->bv_len <= PAGE_SIZE) {
*sg = blk_next_sg(sg, sglist);
sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
(*nsegs) += 1;
} else
(*nsegs) += blk_bvec_map_sg(q, bvec, sglist, sg);
}
*bvprv = *bvec;
}
if ((*sg)->length + nbytes > queue_max_segment_size(q))
return false;
static inline int __blk_bvec_map_sg(struct request_queue *q, struct bio_vec bv,
struct scatterlist *sglist, struct scatterlist **sg)
{
*sg = sglist;
sg_set_page(*sg, bv.bv_page, bv.bv_len, bv.bv_offset);
return 1;
if (!biovec_phys_mergeable(q, bvprv, bvec))
return false;
(*sg)->length += nbytes;
return true;
}
static int __blk_bios_map_sg(struct request_queue *q, struct bio *bio,
struct scatterlist *sglist,
struct scatterlist **sg)
{
struct bio_vec bvec, bvprv = { NULL };
struct bio_vec uninitialized_var(bvec), bvprv = { NULL };
struct bvec_iter iter;
int nsegs = 0;
bool new_bio = false;
for_each_bio(bio)
bio_for_each_bvec(bvec, bio, iter)
__blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
&nsegs);
for_each_bio(bio) {
bio_for_each_bvec(bvec, bio, iter) {
/*
* Only try to merge bvecs from two bios given we
* have done bio internal merge when adding pages
* to bio
*/
if (new_bio &&
__blk_segment_map_sg_merge(q, &bvec, &bvprv, sg))
goto next_bvec;
if (bvec.bv_offset + bvec.bv_len <= PAGE_SIZE)
nsegs += __blk_bvec_map_sg(bvec, sglist, sg);
else
nsegs += blk_bvec_map_sg(q, &bvec, sglist, sg);
next_bvec:
new_bio = false;
}
if (likely(bio->bi_iter.bi_size)) {
bvprv = bvec;
new_bio = true;
}
}
return nsegs;
}
......@@ -572,9 +575,9 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
int nsegs = 0;
if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
nsegs = __blk_bvec_map_sg(q, rq->special_vec, sglist, &sg);
nsegs = __blk_bvec_map_sg(rq->special_vec, sglist, &sg);
else if (rq->bio && bio_op(rq->bio) == REQ_OP_WRITE_SAME)
nsegs = __blk_bvec_map_sg(q, bio_iovec(rq->bio), sglist, &sg);
nsegs = __blk_bvec_map_sg(bio_iovec(rq->bio), sglist, &sg);
else if (rq->bio)
nsegs = __blk_bios_map_sg(q, rq->bio, sglist, &sg);
......
// SPDX-License-Identifier: GPL-2.0
/*
* CPU <-> hardware queue mapping helpers
*
......
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (C) 2017 Facebook
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*/
#include <linux/kernel.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2016 Christoph Hellwig.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*/
#include <linux/kobject.h>
#include <linux/blkdev.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2017 Sagi Grimberg.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*/
#include <linux/blk-mq.h>
#include <linux/blk-mq-rdma.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* blk-mq scheduling framework
*
......@@ -413,6 +414,14 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
struct list_head *list, bool run_queue_async)
{
struct elevator_queue *e;
struct request_queue *q = hctx->queue;
/*
* blk_mq_sched_insert_requests() is called from flush plug
* context only, and hold one usage counter to prevent queue
* from being released.
*/
percpu_ref_get(&q->q_usage_counter);
e = hctx->queue->elevator;
if (e && e->type->ops.insert_requests)
......@@ -426,12 +435,14 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
if (!hctx->dispatch_busy && !e && !run_queue_async) {
blk_mq_try_issue_list_directly(hctx, list);
if (list_empty(list))
return;
goto out;
}
blk_mq_insert_requests(hctx, ctx, list);
}
blk_mq_run_hw_queue(hctx, run_queue_async);
out:
percpu_ref_put(&q->q_usage_counter);
}
static void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
......
// SPDX-License-Identifier: GPL-2.0
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/backing-dev.h>
......@@ -10,6 +11,7 @@
#include <linux/smp.h>
#include <linux/blk-mq.h>
#include "blk.h"
#include "blk-mq.h"
#include "blk-mq-tag.h"
......@@ -33,6 +35,13 @@ static void blk_mq_hw_sysfs_release(struct kobject *kobj)
{
struct blk_mq_hw_ctx *hctx = container_of(kobj, struct blk_mq_hw_ctx,
kobj);
cancel_delayed_work_sync(&hctx->run_work);
if (hctx->flags & BLK_MQ_F_BLOCKING)
cleanup_srcu_struct(hctx->srcu);
blk_free_flush_queue(hctx->fq);
sbitmap_free(&hctx->ctx_map);
free_cpumask_var(hctx->cpumask);
kfree(hctx->ctxs);
kfree(hctx);
......
// SPDX-License-Identifier: GPL-2.0
/*
* Tag allocation using scalable bitmaps. Uses active queue tracking to support
* fairer distribution of tags between multiple submitters when a shared tag map
......
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2016 Christoph Hellwig.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*/
#include <linux/device.h>
#include <linux/blk-mq.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* Block multiqueue core code
*
......@@ -2062,7 +2063,7 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
list_del_init(&page->lru);
/*
* Remove kmemleak object previously allocated in
* blk_mq_init_rq_map().
* blk_mq_alloc_rqs().
*/
kmemleak_free(page_address(page));
__free_pages(page, page->private);
......@@ -2267,12 +2268,11 @@ static void blk_mq_exit_hctx(struct request_queue *q,
if (set->ops->exit_hctx)
set->ops->exit_hctx(hctx, hctx_idx);
if (hctx->flags & BLK_MQ_F_BLOCKING)
cleanup_srcu_struct(hctx->srcu);
blk_mq_remove_cpuhp(hctx);
blk_free_flush_queue(hctx->fq);
sbitmap_free(&hctx->ctx_map);
spin_lock(&q->unused_hctx_lock);
list_add(&hctx->hctx_list, &q->unused_hctx_list);
spin_unlock(&q->unused_hctx_lock);
}
static void blk_mq_exit_hw_queues(struct request_queue *q,
......@@ -2289,15 +2289,65 @@ static void blk_mq_exit_hw_queues(struct request_queue *q,
}
}
static int blk_mq_hw_ctx_size(struct blk_mq_tag_set *tag_set)
{
int hw_ctx_size = sizeof(struct blk_mq_hw_ctx);
BUILD_BUG_ON(ALIGN(offsetof(struct blk_mq_hw_ctx, srcu),
__alignof__(struct blk_mq_hw_ctx)) !=
sizeof(struct blk_mq_hw_ctx));
if (tag_set->flags & BLK_MQ_F_BLOCKING)
hw_ctx_size += sizeof(struct srcu_struct);
return hw_ctx_size;
}
static int blk_mq_init_hctx(struct request_queue *q,
struct blk_mq_tag_set *set,
struct blk_mq_hw_ctx *hctx, unsigned hctx_idx)
{
int node;
hctx->queue_num = hctx_idx;
cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead);
hctx->tags = set->tags[hctx_idx];
if (set->ops->init_hctx &&
set->ops->init_hctx(hctx, set->driver_data, hctx_idx))
goto unregister_cpu_notifier;
if (blk_mq_init_request(set, hctx->fq->flush_rq, hctx_idx,
hctx->numa_node))
goto exit_hctx;
return 0;
exit_hctx:
if (set->ops->exit_hctx)
set->ops->exit_hctx(hctx, hctx_idx);
unregister_cpu_notifier:
blk_mq_remove_cpuhp(hctx);
return -1;
}
static struct blk_mq_hw_ctx *
blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
int node)
{
struct blk_mq_hw_ctx *hctx;
gfp_t gfp = GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY;
hctx = kzalloc_node(blk_mq_hw_ctx_size(set), gfp, node);
if (!hctx)
goto fail_alloc_hctx;
node = hctx->numa_node;
if (!zalloc_cpumask_var_node(&hctx->cpumask, gfp, node))
goto free_hctx;
atomic_set(&hctx->nr_active, 0);
if (node == NUMA_NO_NODE)
node = hctx->numa_node = set->numa_node;
node = set->numa_node;
hctx->numa_node = node;
INIT_DELAYED_WORK(&hctx->run_work, blk_mq_run_work_fn);
spin_lock_init(&hctx->lock);
......@@ -2305,58 +2355,47 @@ static int blk_mq_init_hctx(struct request_queue *q,
hctx->queue = q;
hctx->flags = set->flags & ~BLK_MQ_F_TAG_SHARED;
cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead);
hctx->tags = set->tags[hctx_idx];
INIT_LIST_HEAD(&hctx->hctx_list);
/*
* Allocate space for all possible cpus to avoid allocation at
* runtime
*/
hctx->ctxs = kmalloc_array_node(nr_cpu_ids, sizeof(void *),
GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, node);
gfp, node);
if (!hctx->ctxs)
goto unregister_cpu_notifier;
goto free_cpumask;
if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8),
GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, node))
gfp, node))
goto free_ctxs;
hctx->nr_ctx = 0;
spin_lock_init(&hctx->dispatch_wait_lock);
init_waitqueue_func_entry(&hctx->dispatch_wait, blk_mq_dispatch_wake);
INIT_LIST_HEAD(&hctx->dispatch_wait.entry);
if (set->ops->init_hctx &&
set->ops->init_hctx(hctx, set->driver_data, hctx_idx))
goto free_bitmap;
hctx->fq = blk_alloc_flush_queue(q, hctx->numa_node, set->cmd_size,
GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
gfp);
if (!hctx->fq)
goto exit_hctx;
if (blk_mq_init_request(set, hctx->fq->flush_rq, hctx_idx, node))
goto free_fq;
goto free_bitmap;
if (hctx->flags & BLK_MQ_F_BLOCKING)
init_srcu_struct(hctx->srcu);
blk_mq_hctx_kobj_init(hctx);
return 0;
return hctx;
free_fq:
blk_free_flush_queue(hctx->fq);
exit_hctx:
if (set->ops->exit_hctx)
set->ops->exit_hctx(hctx, hctx_idx);
free_bitmap:
sbitmap_free(&hctx->ctx_map);
free_ctxs:
kfree(hctx->ctxs);
unregister_cpu_notifier:
blk_mq_remove_cpuhp(hctx);
return -1;
free_cpumask:
free_cpumask_var(hctx->cpumask);
free_hctx:
kfree(hctx);
fail_alloc_hctx:
return NULL;
}
static void blk_mq_init_cpu_queues(struct request_queue *q,
......@@ -2631,13 +2670,17 @@ static int blk_mq_alloc_ctxs(struct request_queue *q)
*/
void blk_mq_release(struct request_queue *q)
{
struct blk_mq_hw_ctx *hctx;
unsigned int i;
struct blk_mq_hw_ctx *hctx, *next;
int i;
/* hctx kobj stays in hctx */
queue_for_each_hw_ctx(q, hctx, i) {
if (!hctx)
continue;
cancel_delayed_work_sync(&q->requeue_work);
queue_for_each_hw_ctx(q, hctx, i)
WARN_ON_ONCE(hctx && list_empty(&hctx->hctx_list));
/* all hctx are in .unused_hctx_list now */
list_for_each_entry_safe(hctx, next, &q->unused_hctx_list, hctx_list) {
list_del_init(&hctx->hctx_list);
kobject_put(&hctx->kobj);
}
......@@ -2700,51 +2743,38 @@ struct request_queue *blk_mq_init_sq_queue(struct blk_mq_tag_set *set,
}
EXPORT_SYMBOL(blk_mq_init_sq_queue);
static int blk_mq_hw_ctx_size(struct blk_mq_tag_set *tag_set)
{
int hw_ctx_size = sizeof(struct blk_mq_hw_ctx);
BUILD_BUG_ON(ALIGN(offsetof(struct blk_mq_hw_ctx, srcu),
__alignof__(struct blk_mq_hw_ctx)) !=
sizeof(struct blk_mq_hw_ctx));
if (tag_set->flags & BLK_MQ_F_BLOCKING)
hw_ctx_size += sizeof(struct srcu_struct);
return hw_ctx_size;
}
static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx(
struct blk_mq_tag_set *set, struct request_queue *q,
int hctx_idx, int node)
{
struct blk_mq_hw_ctx *hctx;
hctx = kzalloc_node(blk_mq_hw_ctx_size(set),
GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
node);
if (!hctx)
return NULL;
struct blk_mq_hw_ctx *hctx = NULL, *tmp;
if (!zalloc_cpumask_var_node(&hctx->cpumask,
GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
node)) {
kfree(hctx);
return NULL;
/* reuse dead hctx first */
spin_lock(&q->unused_hctx_lock);
list_for_each_entry(tmp, &q->unused_hctx_list, hctx_list) {
if (tmp->numa_node == node) {
hctx = tmp;
break;
}
}
if (hctx)
list_del_init(&hctx->hctx_list);
spin_unlock(&q->unused_hctx_lock);
atomic_set(&hctx->nr_active, 0);
hctx->numa_node = node;
hctx->queue_num = hctx_idx;
if (!hctx)
hctx = blk_mq_alloc_hctx(q, set, node);
if (!hctx)
goto fail;
if (blk_mq_init_hctx(q, set, hctx, hctx_idx)) {
free_cpumask_var(hctx->cpumask);
kfree(hctx);
return NULL;
}
blk_mq_hctx_kobj_init(hctx);
if (blk_mq_init_hctx(q, set, hctx, hctx_idx))
goto free_hctx;
return hctx;
free_hctx:
kobject_put(&hctx->kobj);
fail:
return NULL;
}
static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
......@@ -2770,10 +2800,8 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
hctx = blk_mq_alloc_and_init_hctx(set, q, i, node);
if (hctx) {
if (hctxs[i]) {
if (hctxs[i])
blk_mq_exit_hctx(q, set, hctxs[i], i);
kobject_put(&hctxs[i]->kobj);
}
hctxs[i] = hctx;
} else {
if (hctxs[i])
......@@ -2804,9 +2832,7 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
if (hctx->tags)
blk_mq_free_map_and_requests(set, j);
blk_mq_exit_hctx(q, set, hctx, j);
kobject_put(&hctx->kobj);
hctxs[j] = NULL;
}
}
mutex_unlock(&q->sysfs_lock);
......@@ -2849,6 +2875,9 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
if (!q->queue_hw_ctx)
goto err_sys_init;
INIT_LIST_HEAD(&q->unused_hctx_list);
spin_lock_init(&q->unused_hctx_lock);
blk_mq_realloc_hw_ctxs(set, q);
if (!q->nr_hw_queues)
goto err_hctxs;
......@@ -2905,7 +2934,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
}
EXPORT_SYMBOL(blk_mq_init_allocated_queue);
void blk_mq_free_queue(struct request_queue *q)
/* tags can _not_ be used after returning from blk_mq_exit_queue */
void blk_mq_exit_queue(struct request_queue *q)
{
struct blk_mq_tag_set *set = q->tag_set;
......
......@@ -37,7 +37,7 @@ struct blk_mq_ctx {
struct kobject kobj;
} ____cacheline_aligned_in_smp;
void blk_mq_free_queue(struct request_queue *q);
void blk_mq_exit_queue(struct request_queue *q);
int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
void blk_mq_wake_waiters(struct request_queue *q);
bool blk_mq_dispatch_rq_list(struct request_queue *, struct list_head *, bool);
......
// SPDX-License-Identifier: GPL-2.0
#include "blk-rq-qos.h"
/*
......
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef RQ_QOS_H
#define RQ_QOS_H
......
// SPDX-License-Identifier: GPL-2.0
/*
* Functions related to setting various queue properties from drivers
*/
......@@ -662,22 +663,6 @@ void disk_stack_limits(struct gendisk *disk, struct block_device *bdev,
}
EXPORT_SYMBOL(disk_stack_limits);
/**
* blk_queue_dma_pad - set pad mask
* @q: the request queue for the device
* @mask: pad mask
*
* Set dma pad mask.
*
* Appending pad buffer to a request modifies the last entry of a
* scatter list such that it includes the pad buffer.
**/
void blk_queue_dma_pad(struct request_queue *q, unsigned int mask)
{
q->dma_pad_mask = mask;
}
EXPORT_SYMBOL(blk_queue_dma_pad);
/**
* blk_queue_update_dma_pad - update pad mask
* @q: the request queue for the device
......
// SPDX-License-Identifier: GPL-2.0
/*
* Block stat tracking code
*
......
......@@ -728,7 +728,7 @@ static struct queue_sysfs_entry throtl_sample_time_entry = {
};
#endif
static struct attribute *default_attrs[] = {
static struct attribute *queue_attrs[] = {
&queue_requests_entry.attr,
&queue_ra_entry.attr,
&queue_max_hw_sectors_entry.attr,
......@@ -769,7 +769,25 @@ static struct attribute *default_attrs[] = {
#endif
NULL,
};
ATTRIBUTE_GROUPS(default);
static umode_t queue_attr_visible(struct kobject *kobj, struct attribute *attr,
int n)
{
struct request_queue *q =
container_of(kobj, struct request_queue, kobj);
if (attr == &queue_io_timeout_entry.attr &&
(!q->mq_ops || !q->mq_ops->timeout))
return 0;
return attr->mode;
}
static struct attribute_group queue_attr_group = {
.attrs = queue_attrs,
.is_visible = queue_attr_visible,
};
#define to_queue(atr) container_of((atr), struct queue_sysfs_entry, attr)
......@@ -891,7 +909,6 @@ static const struct sysfs_ops queue_sysfs_ops = {
struct kobj_type blk_queue_ktype = {
.sysfs_ops = &queue_sysfs_ops,
.default_groups = default_groups,
.release = blk_release_queue,
};
......@@ -940,6 +957,14 @@ int blk_register_queue(struct gendisk *disk)
goto unlock;
}
ret = sysfs_create_group(&q->kobj, &queue_attr_group);
if (ret) {
blk_trace_remove_sysfs(dev);
kobject_del(&q->kobj);
kobject_put(&dev->kobj);
goto unlock;
}
if (queue_is_mq(q)) {
__blk_mq_register_dev(dev, q);
blk_mq_debugfs_register(q);
......
// SPDX-License-Identifier: GPL-2.0
/*
* Functions related to generic timeout handling of requests.
*/
......
// SPDX-License-Identifier: GPL-2.0
/*
* buffered writeback throttling. loosely based on CoDel. We can't drop
* packets for IO scheduling, so the logic is something like this:
......
// SPDX-License-Identifier: GPL-2.0
/*
* Zoned block device handling
*
......
......@@ -75,7 +75,7 @@ static inline bool biovec_phys_mergeable(struct request_queue *q,
if (addr1 + vec1->bv_len != addr2)
return false;
if (xen_domain() && !xen_biovec_phys_mergeable(vec1, vec2))
if (xen_domain() && !xen_biovec_phys_mergeable(vec1, vec2->bv_page))
return false;
if ((addr1 | mask) != ((addr2 + vec2->bv_len - 1) | mask))
return false;
......
......@@ -163,14 +163,13 @@ static void bounce_end_io(struct bio *bio, mempool_t *pool)
{
struct bio *bio_orig = bio->bi_private;
struct bio_vec *bvec, orig_vec;
int i;
struct bvec_iter orig_iter = bio_orig->bi_iter;
struct bvec_iter_all iter_all;
/*
* free up bounce indirect pages used
*/
bio_for_each_segment_all(bvec, bio, i, iter_all) {
bio_for_each_segment_all(bvec, bio, iter_all) {
orig_vec = bio_iter_iovec(bio_orig, orig_iter);
if (bvec->bv_page != orig_vec.bv_page) {
dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
......
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* BSG helper library
*
* Copyright (C) 2008 James Smart, Emulex Corporation
* Copyright (C) 2011 Red Hat, Inc. All rights reserved.
* Copyright (C) 2011 Mike Christie
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
*/
#include <linux/slab.h>
#include <linux/blk-mq.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* bsg.c - block layer implementation of the sg v4 interface
*
* Copyright (C) 2004 Jens Axboe <axboe@suse.de> SUSE Labs
* Copyright (C) 2004 Peter M. Jones <pjones@redhat.com>
*
* This file is subject to the terms and conditions of the GNU General Public
* License version 2. See the file "COPYING" in the main directory of this
* archive for more details.
*
*/
#include <linux/module.h>
#include <linux/init.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* Block device elevator/IO-scheduler.
*
......@@ -509,8 +510,6 @@ void elv_unregister_queue(struct request_queue *q)
int elv_register(struct elevator_type *e)
{
char *def = "";
/* create icq_cache if requested */
if (e->icq_size) {
if (WARN_ON(e->icq_size < sizeof(struct io_cq)) ||
......@@ -535,8 +534,8 @@ int elv_register(struct elevator_type *e)
list_add_tail(&e->list, &elv_list);
spin_unlock(&elv_list_lock);
printk(KERN_INFO "io scheduler %s registered%s\n", e->elevator_name,
def);
printk(KERN_INFO "io scheduler %s registered\n", e->elevator_name);
return 0;
}
EXPORT_SYMBOL_GPL(elv_register);
......
// SPDX-License-Identifier: GPL-2.0
/*
* gendisk handling
*/
......@@ -531,6 +532,18 @@ void blk_free_devt(dev_t devt)
}
}
/**
* We invalidate devt by assigning NULL pointer for devt in idr.
*/
void blk_invalidate_devt(dev_t devt)
{
if (MAJOR(devt) == BLOCK_EXT_MAJOR) {
spin_lock_bh(&ext_devt_lock);
idr_replace(&ext_devt_idr, NULL, blk_mangle_minor(MINOR(devt)));
spin_unlock_bh(&ext_devt_lock);
}
}
static char *bdevt_str(dev_t devt, char *buf)
{
if (MAJOR(devt) <= 0xff && MINOR(devt) <= 0xff) {
......@@ -793,6 +806,13 @@ void del_gendisk(struct gendisk *disk)
if (!(disk->flags & GENHD_FL_HIDDEN))
blk_unregister_region(disk_devt(disk), disk->minors);
/*
* Remove gendisk pointer from idr so that it cannot be looked up
* while RCU period before freeing gendisk is running to prevent
* use-after-free issues. Note that the device number stays
* "in-use" until we really free the gendisk.
*/
blk_invalidate_devt(disk_devt(disk));
kobject_put(disk->part0.holder_dir);
kobject_put(disk->slave_dir);
......@@ -1628,12 +1648,11 @@ static unsigned long disk_events_poll_jiffies(struct gendisk *disk)
/*
* If device-specific poll interval is set, always use it. If
* the default is being used, poll iff there are events which
* can't be monitored asynchronously.
* the default is being used, poll if the POLL flag is set.
*/
if (ev->poll_msecs >= 0)
intv_msecs = ev->poll_msecs;
else if (disk->events & ~disk->async_events)
else if (disk->event_flags & DISK_EVENT_FLAG_POLL)
intv_msecs = disk_events_dfl_poll_msecs;
return msecs_to_jiffies(intv_msecs);
......@@ -1843,11 +1862,13 @@ static void disk_check_events(struct disk_events *ev,
/*
* Tell userland about new events. Only the events listed in
* @disk->events are reported. Unlisted events are processed the
* same internally but never get reported to userland.
* @disk->events are reported, and only if DISK_EVENT_FLAG_UEVENT
* is set. Otherwise, events are processed internally but never
* get reported to userland.
*/
for (i = 0; i < ARRAY_SIZE(disk_uevents); i++)
if (events & disk->events & (1 << i))
if ((events & disk->events & (1 << i)) &&
(disk->event_flags & DISK_EVENT_FLAG_UEVENT))
envp[nr_events++] = disk_uevents[i];
if (nr_events)
......@@ -1860,6 +1881,7 @@ static void disk_check_events(struct disk_events *ev,
*
* events : list of all supported events
* events_async : list of events which can be detected w/o polling
* (always empty, only for backwards compatibility)
* events_poll_msecs : polling interval, 0: disable, -1: system default
*/
static ssize_t __disk_events_show(unsigned int events, char *buf)
......@@ -1884,15 +1906,16 @@ static ssize_t disk_events_show(struct device *dev,
{
struct gendisk *disk = dev_to_disk(dev);
if (!(disk->event_flags & DISK_EVENT_FLAG_UEVENT))
return 0;
return __disk_events_show(disk->events, buf);
}
static ssize_t disk_events_async_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct gendisk *disk = dev_to_disk(dev);
return __disk_events_show(disk->async_events, buf);
return 0;
}
static ssize_t disk_events_poll_msecs_show(struct device *dev,
......@@ -1901,6 +1924,9 @@ static ssize_t disk_events_poll_msecs_show(struct device *dev,
{
struct gendisk *disk = dev_to_disk(dev);
if (!disk->ev)
return sprintf(buf, "-1\n");
return sprintf(buf, "%ld\n", disk->ev->poll_msecs);
}
......@@ -1917,6 +1943,9 @@ static ssize_t disk_events_poll_msecs_store(struct device *dev,
if (intv < 0 && intv != -1)
return -EINVAL;
if (!disk->ev)
return -ENODEV;
disk_block_events(disk);
disk->ev->poll_msecs = intv;
__disk_unblock_events(disk, true);
......@@ -1981,7 +2010,7 @@ static void disk_alloc_events(struct gendisk *disk)
{
struct disk_events *ev;
if (!disk->fops->check_events)
if (!disk->fops->check_events || !disk->events)
return;
ev = kzalloc(sizeof(*ev), GFP_KERNEL);
......@@ -2003,14 +2032,14 @@ static void disk_alloc_events(struct gendisk *disk)
static void disk_add_events(struct gendisk *disk)
{
if (!disk->ev)
return;
/* FIXME: error handling */
if (sysfs_create_files(&disk_to_dev(disk)->kobj, disk_events_attrs) < 0)
pr_warn("%s: failed to create sysfs files for events\n",
disk->disk_name);
if (!disk->ev)
return;
mutex_lock(&disk_events_mutex);
list_add_tail(&disk->ev->node, &disk_events);
mutex_unlock(&disk_events_mutex);
......@@ -2024,14 +2053,13 @@ static void disk_add_events(struct gendisk *disk)
static void disk_del_events(struct gendisk *disk)
{
if (!disk->ev)
return;
disk_block_events(disk);
if (disk->ev) {
disk_block_events(disk);
mutex_lock(&disk_events_mutex);
list_del_init(&disk->ev->node);
mutex_unlock(&disk_events_mutex);
mutex_lock(&disk_events_mutex);
list_del_init(&disk->ev->node);
mutex_unlock(&disk_events_mutex);
}
sysfs_remove_files(&disk_to_dev(disk)->kobj, disk_events_attrs);
}
......
// SPDX-License-Identifier: GPL-2.0
#include <linux/capability.h>
#include <linux/blkdev.h>
#include <linux/export.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* fs/ioprio.c
*
......
// SPDX-License-Identifier: GPL-2.0
/*
* The Kyber I/O scheduler. Controls latency by throttling queue depths using
* scalable techniques.
*
* Copyright (C) 2017 Facebook
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*/
#include <linux/kernel.h>
......
// SPDX-License-Identifier: GPL-2.0
/*
* MQ Deadline i/o scheduler - adaptation of the legacy deadline scheduler,
* for the blk-mq scheduling framework
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright © 2016 Intel Corporation
*
* Authors:
* Rafael Antognolli <rafael.antognolli@intel.com>
* Scott Bauer <scott.bauer@intel.com>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*/
#include <linux/types.h>
......@@ -170,6 +162,8 @@ enum opal_token {
OPAL_READLOCKED = 0x07,
OPAL_WRITELOCKED = 0x08,
OPAL_ACTIVEKEY = 0x0A,
/* lockingsp table */
OPAL_LIFECYCLE = 0x06,
/* locking info table */
OPAL_MAXRANGES = 0x04,
/* mbr control */
......
......@@ -285,6 +285,13 @@ void delete_partition(struct gendisk *disk, int partno)
kobject_put(part->holder_dir);
device_del(part_to_dev(part));
/*
* Remove gendisk pointer from idr so that it cannot be looked up
* while RCU period before freeing gendisk is running to prevent
* use-after-free issues. Note that the device number stays
* "in-use" until we really free the gendisk.
*/
blk_invalidate_devt(part_devt(part));
hd_struct_kill(part);
}
......
// SPDX-License-Identifier: GPL-2.0
/*
* linux/fs/partitions/acorn.c
*
* Copyright (c) 1996-2000 Russell King.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* Scan ADFS partitions on hard disk drives. Unfortunately, there
* isn't a standard for partitioning drives on Acorn machines, so
* every single manufacturer of SCSI and IDE cards created their own
......
/* SPDX-License-Identifier: GPL-2.0 */
extern int aix_partition(struct parsed_partitions *state);
/* SPDX-License-Identifier: GPL-2.0 */
/*
* fs/partitions/amiga.h
*/
......
// SPDX-License-Identifier: GPL-2.0-or-later
/************************************************************
* EFI GUID Partition Table handling
*
......@@ -7,21 +8,6 @@
* efi.[ch] by Matt Domsch <Matt_Domsch@dell.com>
* Copyright 2000,2001,2002,2004 Dell Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
*
* TODO:
*
* Changelog:
......
/* SPDX-License-Identifier: GPL-2.0-or-later */
/************************************************************
* EFI GUID Partition Table
* Per Intel EFI Specification v1.02
......@@ -5,21 +6,6 @@
*
* By Matt Domsch <Matt_Domsch@dell.com> Fri Sep 22 22:15:56 CDT 2000
* Copyright 2000,2001 Dell Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
************************************************************/
#ifndef FS_PART_EFI_H_INCLUDED
......
/* SPDX-License-Identifier: GPL-2.0 */
int ibm_partition(struct parsed_partitions *);
/* SPDX-License-Identifier: GPL-2.0 */
/*
* fs/partitions/karma.h
*/
......
// SPDX-License-Identifier: GPL-2.0-or-later
/**
* ldm - Support for Windows Logical Disk Manager (Dynamic Disks)
*
......@@ -6,21 +7,6 @@
* Copyright (C) 2001,2002 Jakob Kemi <jakob.kemi@telia.com>
*
* Documentation is available at http://www.linux-ntfs.org/doku.php?id=downloads
*
* This program is free software; you can redistribute it and/or modify it under
* the terms of the GNU General Public License as published by the Free Software
* Foundation; either version 2 of the License, or (at your option) any later
* version.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
* FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
* details.
*
* You should have received a copy of the GNU General Public License along with
* this program (in the main directory of the source in the file COPYING); if
* not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330,
* Boston, MA 02111-1307 USA
*/
#include <linux/slab.h>
......
// SPDX-License-Identifier: GPL-2.0-or-later
/**
* ldm - Part of the Linux-NTFS project.
*
......@@ -6,21 +7,6 @@
* Copyright (C) 2001,2002 Jakob Kemi <jakob.kemi@telia.com>
*
* Documentation is available at http://www.linux-ntfs.org/doku.php?id=downloads
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program (in the main directory of the Linux-NTFS source
* in the file COPYING); if not, write to the Free Software Foundation,
* Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef _FS_PT_LDM_H_
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* fs/partitions/msdos.h
*/
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* fs/partitions/osf.h
*/
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* fs/partitions/sgi.h
*/
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* fs/partitions/sun.h
*/
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment