Commit 772c8f6f authored by Linus Torvalds

Merge tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block

Pull block layer updates from Jens Axboe:

 - blk-mq scheduling framework from me and Omar, with a port of the
   deadline scheduler for this framework. A port of BFQ from Paolo is in
   the works, and should be ready for 4.12.

 - Various fixups and improvements to the above scheduling framework
   from Omar, Paolo, Bart, me, others.

 - Cleanup of the exported sysfs blk-mq data into debugfs, from Omar.
   This allows us to export more information that helps debug hangs or
   performance issues, without cluttering or abusing the sysfs API.

 - Fixes for the sbitmap code, the scalable bitmap code that was
   migrated from blk-mq, from Omar.

 - Removal of the BLOCK_PC support in struct request, and refactoring of
   carrying SCSI payloads in the block layer. This cleans up the code
   nicely, and enables us to kill the SCSI specific parts of struct
   request, shrinking it down nicely. From Christoph mainly, with help
   from Hannes.

 - Support for ranged discard requests and discard merging, also from
   Christoph.

 - Support for OPAL in the block layer, and for NVMe as well. Mainly
   from Scott Bauer, with fixes/updates from various other folks.

 - Error code fixup for gdrom from Christophe.

 - cciss pci irq allocation cleanup from Christoph.

 - Making the cdrom device operations read only, from Kees Cook.

 - Fixes for duplicate bdi registrations and bdi/queue lifetime
   problems, from Jan and Dan.

 - Set of fixes and updates for lightnvm, from Matias and Javier.

 - A few fixes for nbd from Josef, using idr to name devices and a
   workqueue deadlock fix on receive. Also marks Josef as the current
   maintainer of nbd.

 - Fix from Josef for queue settings being overwritten when the number
   of hardware queues is updated for a blk-mq device.

 - NVMe fix from Keith, ensuring that we don't repeatedly mark an IO
   aborted if we didn't end up aborting it.

 - SG gap merging fix from Ming Lei for block.

 - Loop fix also from Ming, fixing a race and crash between setting loop
   status and IO.

 - Two block race fixes from Tahsin, fixing request list iteration and
   fixing a race between device registration and udev device add
   notifications.

 - Double free fix in cgroup writeback, from Tejun.

 - Another double free fix in blkcg, from Hou Tao.

 - Partition overflow fix for EFI from Alden Tondettar.

* tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block: (156 commits)
  nvme: Check for Security send/recv support before issuing commands.
  block/sed-opal: allocate struct opal_dev dynamically
  block/sed-opal: tone down not supported warnings
  block: don't defer flushes on blk-mq + scheduling
  blk-mq-sched: ask scheduler for work, if we failed dispatching leftovers
  blk-mq: don't special case flush inserts for blk-mq-sched
  blk-mq-sched: don't add flushes to the head of requeue queue
  blk-mq: have blk_mq_dispatch_rq_list() return if we queued IO or not
  block: do not allow updates through sysfs until registration completes
  lightnvm: set default lun range when no luns are specified
  lightnvm: fix off-by-one error on target initialization
  Maintainers: Modify SED list from nvme to block
  Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN
  uapi: sed-opal fix IOW for activate lsp to use correct struct
  cdrom: Make device operations read-only
  elevator: fix loading wrong elevator type for blk-mq devices
  cciss: switch to pci_irq_alloc_vectors
  block/loop: fix race between I/O and set_status
  blk-mq-sched: don't hold queue_lock when calling exit_icq
  block: set make_request_fn manually in blk_mq_update_nr_hw_queues
  ...
parents fd4a61e0 818551e2
...@@ -249,7 +249,6 @@ struct& cdrom_device_ops\ \{ \hidewidth\cr ...@@ -249,7 +249,6 @@ struct& cdrom_device_ops\ \{ \hidewidth\cr
unsigned\ long);\cr unsigned\ long);\cr
\noalign{\medskip} \noalign{\medskip}
&const\ int& capability;& capability flags \cr &const\ int& capability;& capability flags \cr
&int& n_minors;& number of active minor devices \cr
\};\cr \};\cr
} }
$$ $$
...@@ -258,13 +257,7 @@ it should add a function pointer to this $struct$. When a particular ...@@ -258,13 +257,7 @@ it should add a function pointer to this $struct$. When a particular
function is not implemented, however, this $struct$ should contain a function is not implemented, however, this $struct$ should contain a
NULL instead. The $capability$ flags specify the capabilities of the NULL instead. The $capability$ flags specify the capabilities of the
\cdrom\ hardware and/or low-level \cdrom\ driver when a \cdrom\ drive \cdrom\ hardware and/or low-level \cdrom\ driver when a \cdrom\ drive
is registered with the \UCD. The value $n_minors$ should be a positive is registered with the \UCD.
value indicating the number of minor devices that are supported by
the low-level device driver, normally~1. Although these two variables
are `informative' rather than `operational,' they are included in
$cdrom_device_ops$ because they describe the capability of the {\em
driver\/} rather than the {\em drive}. Nomenclature has always been
difficult in computer programming.
Note that most functions have fewer parameters than their Note that most functions have fewer parameters than their
$blkdev_fops$ counterparts. This is because very little of the $blkdev_fops$ counterparts. This is because very little of the
......
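
The hunk above drops the n_minors member from the documented cdrom_device_ops and, together with the "cdrom: Make device operations read-only" change in this series, leaves a structure that drivers declare const, with NULL for any hook they do not implement. A minimal sketch of what a low-level driver's declaration might look like after this change (the mycd_* handlers and the capability set are hypothetical; only open/release/drive_status/capability are shown):

#include <linux/cdrom.h>

/* Hypothetical handlers for an imaginary drive; real drivers do real work. */
static int mycd_open(struct cdrom_device_info *cdi, int purpose)
{
	return 0;
}

static void mycd_release(struct cdrom_device_info *cdi)
{
}

static int mycd_drive_status(struct cdrom_device_info *cdi, int slot)
{
	return CDS_DISC_OK;
}

/* const now that the ops table is read-only; unimplemented hooks stay NULL. */
static const struct cdrom_device_ops mycd_dops = {
	.open		= mycd_open,
	.release	= mycd_release,
	.drive_status	= mycd_drive_status,
	.capability	= CDC_DRIVE_STATUS | CDC_OPEN_TRAY,
};
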
...@@ -8620,10 +8620,10 @@ S: Maintained ...@@ -8620,10 +8620,10 @@ S: Maintained
F: drivers/net/ethernet/netronome/ F: drivers/net/ethernet/netronome/
NETWORK BLOCK DEVICE (NBD) NETWORK BLOCK DEVICE (NBD)
M: Markus Pargmann <mpa@pengutronix.de> M: Josef Bacik <jbacik@fb.com>
S: Maintained S: Maintained
L: linux-block@vger.kernel.org
L: nbd-general@lists.sourceforge.net L: nbd-general@lists.sourceforge.net
T: git git://git.pengutronix.de/git/mpa/linux-nbd.git
F: Documentation/blockdev/nbd.txt F: Documentation/blockdev/nbd.txt
F: drivers/block/nbd.c F: drivers/block/nbd.c
F: include/uapi/linux/nbd.h F: include/uapi/linux/nbd.h
...@@ -11097,6 +11097,17 @@ L: linux-mmc@vger.kernel.org ...@@ -11097,6 +11097,17 @@ L: linux-mmc@vger.kernel.org
S: Maintained S: Maintained
F: drivers/mmc/host/sdhci-spear.c F: drivers/mmc/host/sdhci-spear.c
SECURE ENCRYPTING DEVICE (SED) OPAL DRIVER
M: Scott Bauer <scott.bauer@intel.com>
M: Jonathan Derrick <jonathan.derrick@intel.com>
M: Rafael Antognolli <rafael.antognolli@intel.com>
L: linux-block@vger.kernel.org
S: Supported
F: block/sed*
F: block/opal_proto.h
F: include/linux/sed*
F: include/uapi/linux/sed*
SECURITY SUBSYSTEM SECURITY SUBSYSTEM
M: James Morris <james.l.morris@oracle.com> M: James Morris <james.l.morris@oracle.com>
M: "Serge E. Hallyn" <serge@hallyn.com> M: "Serge E. Hallyn" <serge@hallyn.com>
......
...@@ -49,9 +49,13 @@ config LBDAF ...@@ -49,9 +49,13 @@ config LBDAF
If unsure, say Y. If unsure, say Y.
config BLK_SCSI_REQUEST
bool
config BLK_DEV_BSG config BLK_DEV_BSG
bool "Block layer SG support v4" bool "Block layer SG support v4"
default y default y
select BLK_SCSI_REQUEST
help help
Saying Y here will enable generic SG (SCSI generic) v4 support Saying Y here will enable generic SG (SCSI generic) v4 support
for any block device. for any block device.
...@@ -71,6 +75,7 @@ config BLK_DEV_BSGLIB ...@@ -71,6 +75,7 @@ config BLK_DEV_BSGLIB
bool "Block layer SG support v4 helper lib" bool "Block layer SG support v4 helper lib"
default n default n
select BLK_DEV_BSG select BLK_DEV_BSG
select BLK_SCSI_REQUEST
help help
Subsystems will normally enable this if needed. Users will not Subsystems will normally enable this if needed. Users will not
normally need to manually enable this. normally need to manually enable this.
...@@ -147,6 +152,25 @@ config BLK_WBT_MQ ...@@ -147,6 +152,25 @@ config BLK_WBT_MQ
Multiqueue currently doesn't have support for IO scheduling, Multiqueue currently doesn't have support for IO scheduling,
enabling this option is recommended. enabling this option is recommended.
config BLK_DEBUG_FS
bool "Block layer debugging information in debugfs"
default y
depends on DEBUG_FS
---help---
Include block layer debugging information in debugfs. This information
is mostly useful for kernel developers, but it doesn't incur any cost
at runtime.
Unless you are building a kernel for a tiny system, you should
say Y here.
config BLK_SED_OPAL
bool "Logic for interfacing with Opal enabled SEDs"
---help---
Builds Logic for interfacing with Opal enabled controllers.
Enabling this option enables users to setup/unlock/lock
Locking ranges for SED devices using the Opal protocol.
menu "Partition Types" menu "Partition Types"
source "block/partitions/Kconfig" source "block/partitions/Kconfig"
......
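
The two new symbols above (BLK_DEBUG_FS, BLK_SED_OPAL) gate whole objects in the Makefile further down. Callers of the gated code typically compile against inline no-op stubs when the option is off; a hedged sketch of that pattern for CONFIG_BLK_DEBUG_FS, using only functions that appear elsewhere in this series (the exact header layout may differ):

#ifdef CONFIG_BLK_DEBUG_FS
int blk_mq_debugfs_register(struct request_queue *q, const char *name);
void blk_mq_debugfs_unregister_hctxs(struct request_queue *q);
#else
static inline int blk_mq_debugfs_register(struct request_queue *q,
					   const char *name)
{
	return 0;
}
static inline void blk_mq_debugfs_unregister_hctxs(struct request_queue *q)
{
}
#endif
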
...@@ -63,6 +63,56 @@ config DEFAULT_IOSCHED ...@@ -63,6 +63,56 @@ config DEFAULT_IOSCHED
default "cfq" if DEFAULT_CFQ default "cfq" if DEFAULT_CFQ
default "noop" if DEFAULT_NOOP default "noop" if DEFAULT_NOOP
config MQ_IOSCHED_DEADLINE
tristate "MQ deadline I/O scheduler"
default y
---help---
MQ version of the deadline IO scheduler.
config MQ_IOSCHED_NONE
bool
default y
choice
prompt "Default single-queue blk-mq I/O scheduler"
default DEFAULT_SQ_NONE
help
Select the I/O scheduler which will be used by default for blk-mq
managed block devices with a single queue.
config DEFAULT_SQ_DEADLINE
bool "MQ Deadline" if MQ_IOSCHED_DEADLINE=y
config DEFAULT_SQ_NONE
bool "None"
endchoice
config DEFAULT_SQ_IOSCHED
string
default "mq-deadline" if DEFAULT_SQ_DEADLINE
default "none" if DEFAULT_SQ_NONE
choice
prompt "Default multi-queue blk-mq I/O scheduler"
default DEFAULT_MQ_NONE
help
Select the I/O scheduler which will be used by default for blk-mq
managed block devices with multiple queues.
config DEFAULT_MQ_DEADLINE
bool "MQ Deadline" if MQ_IOSCHED_DEADLINE=y
config DEFAULT_MQ_NONE
bool "None"
endchoice
config DEFAULT_MQ_IOSCHED
string
default "mq-deadline" if DEFAULT_MQ_DEADLINE
default "none" if DEFAULT_MQ_NONE
endmenu endmenu
endif endif
...@@ -6,11 +6,12 @@ obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o \ ...@@ -6,11 +6,12 @@ obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o \
blk-flush.o blk-settings.o blk-ioc.o blk-map.o \ blk-flush.o blk-settings.o blk-ioc.o blk-map.o \
blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \ blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \
blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \ blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \
blk-mq-sysfs.o blk-mq-cpumap.o ioctl.o \ blk-mq-sysfs.o blk-mq-cpumap.o blk-mq-sched.o ioctl.o \
genhd.o scsi_ioctl.o partition-generic.o ioprio.o \ genhd.o partition-generic.o ioprio.o \
badblocks.o partitions/ badblocks.o partitions/
obj-$(CONFIG_BOUNCE) += bounce.o obj-$(CONFIG_BOUNCE) += bounce.o
obj-$(CONFIG_BLK_SCSI_REQUEST) += scsi_ioctl.o
obj-$(CONFIG_BLK_DEV_BSG) += bsg.o obj-$(CONFIG_BLK_DEV_BSG) += bsg.o
obj-$(CONFIG_BLK_DEV_BSGLIB) += bsg-lib.o obj-$(CONFIG_BLK_DEV_BSGLIB) += bsg-lib.o
obj-$(CONFIG_BLK_CGROUP) += blk-cgroup.o obj-$(CONFIG_BLK_CGROUP) += blk-cgroup.o
...@@ -18,6 +19,7 @@ obj-$(CONFIG_BLK_DEV_THROTTLING) += blk-throttle.o ...@@ -18,6 +19,7 @@ obj-$(CONFIG_BLK_DEV_THROTTLING) += blk-throttle.o
obj-$(CONFIG_IOSCHED_NOOP) += noop-iosched.o obj-$(CONFIG_IOSCHED_NOOP) += noop-iosched.o
obj-$(CONFIG_IOSCHED_DEADLINE) += deadline-iosched.o obj-$(CONFIG_IOSCHED_DEADLINE) += deadline-iosched.o
obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o
obj-$(CONFIG_MQ_IOSCHED_DEADLINE) += mq-deadline.o
obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o
obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
...@@ -25,3 +27,5 @@ obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o ...@@ -25,3 +27,5 @@ obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o
obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o
obj-$(CONFIG_BLK_WBT) += blk-wbt.o obj-$(CONFIG_BLK_WBT) += blk-wbt.o
obj-$(CONFIG_BLK_DEBUG_FS) += blk-mq-debugfs.o
obj-$(CONFIG_BLK_SED_OPAL) += sed-opal.o
...@@ -1227,9 +1227,6 @@ struct bio *bio_copy_user_iov(struct request_queue *q, ...@@ -1227,9 +1227,6 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
if (!bio) if (!bio)
goto out_bmd; goto out_bmd;
if (iter->type & WRITE)
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
ret = 0; ret = 0;
if (map_data) { if (map_data) {
...@@ -1394,16 +1391,10 @@ struct bio *bio_map_user_iov(struct request_queue *q, ...@@ -1394,16 +1391,10 @@ struct bio *bio_map_user_iov(struct request_queue *q,
kfree(pages); kfree(pages);
/*
* set data direction, and check if mapped pages need bouncing
*/
if (iter->type & WRITE)
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
bio_set_flag(bio, BIO_USER_MAPPED); bio_set_flag(bio, BIO_USER_MAPPED);
/* /*
* subtle -- if __bio_map_user() ended up bouncing a bio, * subtle -- if bio_map_user_iov() ended up bouncing a bio,
* it would normally disappear when its bi_end_io is run. * it would normally disappear when its bi_end_io is run.
* however, we need it for the unmap, so grab an extra * however, we need it for the unmap, so grab an extra
* reference to it * reference to it
...@@ -1445,8 +1436,8 @@ static void __bio_unmap_user(struct bio *bio) ...@@ -1445,8 +1436,8 @@ static void __bio_unmap_user(struct bio *bio)
* bio_unmap_user - unmap a bio * bio_unmap_user - unmap a bio
* @bio: the bio being unmapped * @bio: the bio being unmapped
* *
* Unmap a bio previously mapped by bio_map_user(). Must be called with * Unmap a bio previously mapped by bio_map_user_iov(). Must be called from
* a process context. * process context.
* *
* bio_unmap_user() may sleep. * bio_unmap_user() may sleep.
*/ */
...@@ -1590,7 +1581,6 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len, ...@@ -1590,7 +1581,6 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len,
bio->bi_private = data; bio->bi_private = data;
} else { } else {
bio->bi_end_io = bio_copy_kern_endio; bio->bi_end_io = bio_copy_kern_endio;
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
} }
return bio; return bio;
......
...@@ -184,7 +184,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, ...@@ -184,7 +184,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
goto err_free_blkg; goto err_free_blkg;
} }
wb_congested = wb_congested_get_create(&q->backing_dev_info, wb_congested = wb_congested_get_create(q->backing_dev_info,
blkcg->css.id, blkcg->css.id,
GFP_NOWAIT | __GFP_NOWARN); GFP_NOWAIT | __GFP_NOWARN);
if (!wb_congested) { if (!wb_congested) {
...@@ -469,8 +469,8 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css, ...@@ -469,8 +469,8 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
const char *blkg_dev_name(struct blkcg_gq *blkg) const char *blkg_dev_name(struct blkcg_gq *blkg)
{ {
/* some drivers (floppy) instantiate a queue w/o disk registered */ /* some drivers (floppy) instantiate a queue w/o disk registered */
if (blkg->q->backing_dev_info.dev) if (blkg->q->backing_dev_info->dev)
return dev_name(blkg->q->backing_dev_info.dev); return dev_name(blkg->q->backing_dev_info->dev);
return NULL; return NULL;
} }
EXPORT_SYMBOL_GPL(blkg_dev_name); EXPORT_SYMBOL_GPL(blkg_dev_name);
...@@ -1079,10 +1079,8 @@ int blkcg_init_queue(struct request_queue *q) ...@@ -1079,10 +1079,8 @@ int blkcg_init_queue(struct request_queue *q)
if (preloaded) if (preloaded)
radix_tree_preload_end(); radix_tree_preload_end();
if (IS_ERR(blkg)) { if (IS_ERR(blkg))
blkg_free(new_blkg);
return PTR_ERR(blkg); return PTR_ERR(blkg);
}
q->root_blkg = blkg; q->root_blkg = blkg;
q->root_rl.blkg = blkg; q->root_rl.blkg = blkg;
...@@ -1223,6 +1221,9 @@ int blkcg_activate_policy(struct request_queue *q, ...@@ -1223,6 +1221,9 @@ int blkcg_activate_policy(struct request_queue *q,
if (blkcg_policy_enabled(q, pol)) if (blkcg_policy_enabled(q, pol))
return 0; return 0;
if (q->mq_ops)
blk_mq_freeze_queue(q);
else
blk_queue_bypass_start(q); blk_queue_bypass_start(q);
pd_prealloc: pd_prealloc:
if (!pd_prealloc) { if (!pd_prealloc) {
...@@ -1261,6 +1262,9 @@ int blkcg_activate_policy(struct request_queue *q, ...@@ -1261,6 +1262,9 @@ int blkcg_activate_policy(struct request_queue *q,
spin_unlock_irq(q->queue_lock); spin_unlock_irq(q->queue_lock);
out_bypass_end: out_bypass_end:
if (q->mq_ops)
blk_mq_unfreeze_queue(q);
else
blk_queue_bypass_end(q); blk_queue_bypass_end(q);
if (pd_prealloc) if (pd_prealloc)
pol->pd_free_fn(pd_prealloc); pol->pd_free_fn(pd_prealloc);
...@@ -1284,7 +1288,11 @@ void blkcg_deactivate_policy(struct request_queue *q, ...@@ -1284,7 +1288,11 @@ void blkcg_deactivate_policy(struct request_queue *q,
if (!blkcg_policy_enabled(q, pol)) if (!blkcg_policy_enabled(q, pol))
return; return;
if (q->mq_ops)
blk_mq_freeze_queue(q);
else
blk_queue_bypass_start(q); blk_queue_bypass_start(q);
spin_lock_irq(q->queue_lock); spin_lock_irq(q->queue_lock);
__clear_bit(pol->plid, q->blkcg_pols); __clear_bit(pol->plid, q->blkcg_pols);
...@@ -1304,6 +1312,10 @@ void blkcg_deactivate_policy(struct request_queue *q, ...@@ -1304,6 +1312,10 @@ void blkcg_deactivate_policy(struct request_queue *q,
} }
spin_unlock_irq(q->queue_lock); spin_unlock_irq(q->queue_lock);
if (q->mq_ops)
blk_mq_unfreeze_queue(q);
else
blk_queue_bypass_end(q); blk_queue_bypass_end(q);
} }
EXPORT_SYMBOL_GPL(blkcg_deactivate_policy); EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
......
...@@ -9,11 +9,7 @@ ...@@ -9,11 +9,7 @@
#include <linux/sched/sysctl.h> #include <linux/sched/sysctl.h>
#include "blk.h" #include "blk.h"
#include "blk-mq-sched.h"
/*
* for max sense size
*/
#include <scsi/scsi_cmnd.h>
/** /**
* blk_end_sync_rq - executes a completion event on a request * blk_end_sync_rq - executes a completion event on a request
...@@ -55,7 +51,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk, ...@@ -55,7 +51,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK; int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK;
WARN_ON(irqs_disabled()); WARN_ON(irqs_disabled());
WARN_ON(rq->cmd_type == REQ_TYPE_FS); WARN_ON(!blk_rq_is_passthrough(rq));
rq->rq_disk = bd_disk; rq->rq_disk = bd_disk;
rq->end_io = done; rq->end_io = done;
...@@ -65,7 +61,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk, ...@@ -65,7 +61,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
* be reused after dying flag is set * be reused after dying flag is set
*/ */
if (q->mq_ops) { if (q->mq_ops) {
blk_mq_insert_request(rq, at_head, true, false); blk_mq_sched_insert_request(rq, at_head, true, false, false);
return; return;
} }
...@@ -100,16 +96,9 @@ int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk, ...@@ -100,16 +96,9 @@ int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk,
struct request *rq, int at_head) struct request *rq, int at_head)
{ {
DECLARE_COMPLETION_ONSTACK(wait); DECLARE_COMPLETION_ONSTACK(wait);
char sense[SCSI_SENSE_BUFFERSIZE];
int err = 0; int err = 0;
unsigned long hang_check; unsigned long hang_check;
if (!rq->sense) {
memset(sense, 0, sizeof(sense));
rq->sense = sense;
rq->sense_len = 0;
}
rq->end_io_data = &wait; rq->end_io_data = &wait;
blk_execute_rq_nowait(q, bd_disk, rq, at_head, blk_end_sync_rq); blk_execute_rq_nowait(q, bd_disk, rq, at_head, blk_end_sync_rq);
...@@ -123,11 +112,6 @@ int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk, ...@@ -123,11 +112,6 @@ int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk,
if (rq->errors) if (rq->errors)
err = -EIO; err = -EIO;
if (rq->sense == sense) {
rq->sense = NULL;
rq->sense_len = 0;
}
return err; return err;
} }
EXPORT_SYMBOL(blk_execute_rq); EXPORT_SYMBOL(blk_execute_rq);
...@@ -74,6 +74,7 @@ ...@@ -74,6 +74,7 @@
#include "blk.h" #include "blk.h"
#include "blk-mq.h" #include "blk-mq.h"
#include "blk-mq-tag.h" #include "blk-mq-tag.h"
#include "blk-mq-sched.h"
/* FLUSH/FUA sequences */ /* FLUSH/FUA sequences */
enum { enum {
...@@ -296,8 +297,14 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq) ...@@ -296,8 +297,14 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq)
if (fq->flush_pending_idx != fq->flush_running_idx || list_empty(pending)) if (fq->flush_pending_idx != fq->flush_running_idx || list_empty(pending))
return false; return false;
/* C2 and C3 */ /* C2 and C3
*
* For blk-mq + scheduling, we can risk having all driver tags
* assigned to empty flushes, and we deadlock if we are expecting
* other requests to make progress. Don't defer for that case.
*/
if (!list_empty(&fq->flush_data_in_flight) && if (!list_empty(&fq->flush_data_in_flight) &&
!(q->mq_ops && q->elevator) &&
time_before(jiffies, time_before(jiffies,
fq->flush_pending_since + FLUSH_PENDING_TIMEOUT)) fq->flush_pending_since + FLUSH_PENDING_TIMEOUT))
return false; return false;
...@@ -326,7 +333,6 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq) ...@@ -326,7 +333,6 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq)
blk_mq_tag_set_rq(hctx, first_rq->tag, flush_rq); blk_mq_tag_set_rq(hctx, first_rq->tag, flush_rq);
} }
flush_rq->cmd_type = REQ_TYPE_FS;
flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH; flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
flush_rq->rq_flags |= RQF_FLUSH_SEQ; flush_rq->rq_flags |= RQF_FLUSH_SEQ;
flush_rq->rq_disk = first_rq->rq_disk; flush_rq->rq_disk = first_rq->rq_disk;
...@@ -391,9 +397,10 @@ static void mq_flush_data_end_io(struct request *rq, int error) ...@@ -391,9 +397,10 @@ static void mq_flush_data_end_io(struct request *rq, int error)
* the comment in flush_end_io(). * the comment in flush_end_io().
*/ */
spin_lock_irqsave(&fq->mq_flush_lock, flags); spin_lock_irqsave(&fq->mq_flush_lock, flags);
if (blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error)) blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error);
blk_mq_run_hw_queue(hctx, true);
spin_unlock_irqrestore(&fq->mq_flush_lock, flags); spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
blk_mq_run_hw_queue(hctx, true);
} }
/** /**
...@@ -453,9 +460,9 @@ void blk_insert_flush(struct request *rq) ...@@ -453,9 +460,9 @@ void blk_insert_flush(struct request *rq)
*/ */
if ((policy & REQ_FSEQ_DATA) && if ((policy & REQ_FSEQ_DATA) &&
!(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) { !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
if (q->mq_ops) { if (q->mq_ops)
blk_mq_insert_request(rq, false, true, false); blk_mq_sched_insert_request(rq, false, true, false, false);
} else else
list_add_tail(&rq->queuelist, &q->queue_head); list_add_tail(&rq->queuelist, &q->queue_head);
return; return;
} }
...@@ -545,11 +552,10 @@ struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q, ...@@ -545,11 +552,10 @@ struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q,
if (!fq) if (!fq)
goto fail; goto fail;
if (q->mq_ops) { if (q->mq_ops)
spin_lock_init(&fq->mq_flush_lock); spin_lock_init(&fq->mq_flush_lock);
rq_sz = round_up(rq_sz + cmd_size, cache_line_size());
}
rq_sz = round_up(rq_sz + cmd_size, cache_line_size());
fq->flush_rq = kzalloc_node(rq_sz, GFP_KERNEL, node); fq->flush_rq = kzalloc_node(rq_sz, GFP_KERNEL, node);
if (!fq->flush_rq) if (!fq->flush_rq)
goto fail_rq; goto fail_rq;
......
...@@ -443,10 +443,10 @@ void blk_integrity_revalidate(struct gendisk *disk) ...@@ -443,10 +443,10 @@ void blk_integrity_revalidate(struct gendisk *disk)
return; return;
if (bi->profile) if (bi->profile)
disk->queue->backing_dev_info.capabilities |= disk->queue->backing_dev_info->capabilities |=
BDI_CAP_STABLE_WRITES; BDI_CAP_STABLE_WRITES;
else else
disk->queue->backing_dev_info.capabilities &= disk->queue->backing_dev_info->capabilities &=
~BDI_CAP_STABLE_WRITES; ~BDI_CAP_STABLE_WRITES;
} }
......
...@@ -35,7 +35,10 @@ static void icq_free_icq_rcu(struct rcu_head *head) ...@@ -35,7 +35,10 @@ static void icq_free_icq_rcu(struct rcu_head *head)
kmem_cache_free(icq->__rcu_icq_cache, icq); kmem_cache_free(icq->__rcu_icq_cache, icq);
} }
/* Exit an icq. Called with both ioc and q locked. */ /*
* Exit an icq. Called with both ioc and q locked for sq, only ioc locked for
* mq.
*/
static void ioc_exit_icq(struct io_cq *icq) static void ioc_exit_icq(struct io_cq *icq)
{ {
struct elevator_type *et = icq->q->elevator->type; struct elevator_type *et = icq->q->elevator->type;
...@@ -43,8 +46,10 @@ static void ioc_exit_icq(struct io_cq *icq) ...@@ -43,8 +46,10 @@ static void ioc_exit_icq(struct io_cq *icq)
if (icq->flags & ICQ_EXITED) if (icq->flags & ICQ_EXITED)
return; return;
if (et->ops.elevator_exit_icq_fn) if (et->uses_mq && et->ops.mq.exit_icq)
et->ops.elevator_exit_icq_fn(icq); et->ops.mq.exit_icq(icq);
else if (!et->uses_mq && et->ops.sq.elevator_exit_icq_fn)
et->ops.sq.elevator_exit_icq_fn(icq);
icq->flags |= ICQ_EXITED; icq->flags |= ICQ_EXITED;
} }
...@@ -164,6 +169,7 @@ EXPORT_SYMBOL(put_io_context); ...@@ -164,6 +169,7 @@ EXPORT_SYMBOL(put_io_context);
*/ */
void put_io_context_active(struct io_context *ioc) void put_io_context_active(struct io_context *ioc)
{ {
struct elevator_type *et;
unsigned long flags; unsigned long flags;
struct io_cq *icq; struct io_cq *icq;
...@@ -182,6 +188,11 @@ void put_io_context_active(struct io_context *ioc) ...@@ -182,6 +188,11 @@ void put_io_context_active(struct io_context *ioc)
hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) { hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
if (icq->flags & ICQ_EXITED) if (icq->flags & ICQ_EXITED)
continue; continue;
et = icq->q->elevator->type;
if (et->uses_mq) {
ioc_exit_icq(icq);
} else {
if (spin_trylock(icq->q->queue_lock)) { if (spin_trylock(icq->q->queue_lock)) {
ioc_exit_icq(icq); ioc_exit_icq(icq);
spin_unlock(icq->q->queue_lock); spin_unlock(icq->q->queue_lock);
...@@ -191,6 +202,7 @@ void put_io_context_active(struct io_context *ioc) ...@@ -191,6 +202,7 @@ void put_io_context_active(struct io_context *ioc)
goto retry; goto retry;
} }
} }
}
spin_unlock_irqrestore(&ioc->lock, flags); spin_unlock_irqrestore(&ioc->lock, flags);
put_io_context(ioc); put_io_context(ioc);
...@@ -383,8 +395,10 @@ struct io_cq *ioc_create_icq(struct io_context *ioc, struct request_queue *q, ...@@ -383,8 +395,10 @@ struct io_cq *ioc_create_icq(struct io_context *ioc, struct request_queue *q,
if (likely(!radix_tree_insert(&ioc->icq_tree, q->id, icq))) { if (likely(!radix_tree_insert(&ioc->icq_tree, q->id, icq))) {
hlist_add_head(&icq->ioc_node, &ioc->icq_list); hlist_add_head(&icq->ioc_node, &ioc->icq_list);
list_add(&icq->q_node, &q->icq_list); list_add(&icq->q_node, &q->icq_list);
if (et->ops.elevator_init_icq_fn) if (et->uses_mq && et->ops.mq.init_icq)
et->ops.elevator_init_icq_fn(icq); et->ops.mq.init_icq(icq);
else if (!et->uses_mq && et->ops.sq.elevator_init_icq_fn)
et->ops.sq.elevator_init_icq_fn(icq);
} else { } else {
kmem_cache_free(et->icq_cache, icq); kmem_cache_free(et->icq_cache, icq);
icq = ioc_lookup_icq(ioc, q); icq = ioc_lookup_icq(ioc, q);
......
...@@ -16,8 +16,6 @@ ...@@ -16,8 +16,6 @@
int blk_rq_append_bio(struct request *rq, struct bio *bio) int blk_rq_append_bio(struct request *rq, struct bio *bio)
{ {
if (!rq->bio) { if (!rq->bio) {
rq->cmd_flags &= REQ_OP_MASK;
rq->cmd_flags |= (bio->bi_opf & REQ_OP_MASK);
blk_rq_bio_prep(rq->q, rq, bio); blk_rq_bio_prep(rq->q, rq, bio);
} else { } else {
if (!ll_back_merge_fn(rq->q, rq, bio)) if (!ll_back_merge_fn(rq->q, rq, bio))
...@@ -62,6 +60,9 @@ static int __blk_rq_map_user_iov(struct request *rq, ...@@ -62,6 +60,9 @@ static int __blk_rq_map_user_iov(struct request *rq,
if (IS_ERR(bio)) if (IS_ERR(bio))
return PTR_ERR(bio); return PTR_ERR(bio);
bio->bi_opf &= ~REQ_OP_MASK;
bio->bi_opf |= req_op(rq);
if (map_data && map_data->null_mapped) if (map_data && map_data->null_mapped)
bio_set_flag(bio, BIO_NULL_MAPPED); bio_set_flag(bio, BIO_NULL_MAPPED);
...@@ -90,7 +91,7 @@ static int __blk_rq_map_user_iov(struct request *rq, ...@@ -90,7 +91,7 @@ static int __blk_rq_map_user_iov(struct request *rq,
} }
/** /**
* blk_rq_map_user_iov - map user data to a request, for REQ_TYPE_BLOCK_PC usage * blk_rq_map_user_iov - map user data to a request, for passthrough requests
* @q: request queue where request should be inserted * @q: request queue where request should be inserted
* @rq: request to map data to * @rq: request to map data to
* @map_data: pointer to the rq_map_data holding pages (if necessary) * @map_data: pointer to the rq_map_data holding pages (if necessary)
...@@ -199,7 +200,7 @@ int blk_rq_unmap_user(struct bio *bio) ...@@ -199,7 +200,7 @@ int blk_rq_unmap_user(struct bio *bio)
EXPORT_SYMBOL(blk_rq_unmap_user); EXPORT_SYMBOL(blk_rq_unmap_user);
/** /**
* blk_rq_map_kern - map kernel data to a request, for REQ_TYPE_BLOCK_PC usage * blk_rq_map_kern - map kernel data to a request, for passthrough requests
* @q: request queue where request should be inserted * @q: request queue where request should be inserted
* @rq: request to fill * @rq: request to fill
* @kbuf: the kernel buffer * @kbuf: the kernel buffer
...@@ -234,8 +235,8 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf, ...@@ -234,8 +235,8 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
if (IS_ERR(bio)) if (IS_ERR(bio))
return PTR_ERR(bio); return PTR_ERR(bio);
if (!reading) bio->bi_opf &= ~REQ_OP_MASK;
bio_set_op_attrs(bio, REQ_OP_WRITE, 0); bio->bi_opf |= req_op(rq);
if (do_copy) if (do_copy)
rq->rq_flags |= RQF_COPY_USER; rq->rq_flags |= RQF_COPY_USER;
......
...@@ -482,13 +482,6 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq, ...@@ -482,13 +482,6 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
} }
EXPORT_SYMBOL(blk_rq_map_sg); EXPORT_SYMBOL(blk_rq_map_sg);
static void req_set_nomerge(struct request_queue *q, struct request *req)
{
req->cmd_flags |= REQ_NOMERGE;
if (req == q->last_merge)
q->last_merge = NULL;
}
static inline int ll_new_hw_segment(struct request_queue *q, static inline int ll_new_hw_segment(struct request_queue *q,
struct request *req, struct request *req,
struct bio *bio) struct bio *bio)
...@@ -659,31 +652,32 @@ static void blk_account_io_merge(struct request *req) ...@@ -659,31 +652,32 @@ static void blk_account_io_merge(struct request *req)
} }
/* /*
* Has to be called with the request spinlock acquired * For non-mq, this has to be called with the request spinlock acquired.
* For mq with scheduling, the appropriate queue wide lock should be held.
*/ */
static int attempt_merge(struct request_queue *q, struct request *req, static struct request *attempt_merge(struct request_queue *q,
struct request *next) struct request *req, struct request *next)
{ {
if (!rq_mergeable(req) || !rq_mergeable(next)) if (!rq_mergeable(req) || !rq_mergeable(next))
return 0; return NULL;
if (req_op(req) != req_op(next)) if (req_op(req) != req_op(next))
return 0; return NULL;
/* /*
* not contiguous * not contiguous
*/ */
if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))
return 0; return NULL;
if (rq_data_dir(req) != rq_data_dir(next) if (rq_data_dir(req) != rq_data_dir(next)
|| req->rq_disk != next->rq_disk || req->rq_disk != next->rq_disk
|| req_no_special_merge(next)) || req_no_special_merge(next))
return 0; return NULL;
if (req_op(req) == REQ_OP_WRITE_SAME && if (req_op(req) == REQ_OP_WRITE_SAME &&
!blk_write_same_mergeable(req->bio, next->bio)) !blk_write_same_mergeable(req->bio, next->bio))
return 0; return NULL;
/* /*
* If we are allowed to merge, then append bio list * If we are allowed to merge, then append bio list
...@@ -692,7 +686,7 @@ static int attempt_merge(struct request_queue *q, struct request *req, ...@@ -692,7 +686,7 @@ static int attempt_merge(struct request_queue *q, struct request *req,
* counts here. * counts here.
*/ */
if (!ll_merge_requests_fn(q, req, next)) if (!ll_merge_requests_fn(q, req, next))
return 0; return NULL;
/* /*
* If failfast settings disagree or any of the two is already * If failfast settings disagree or any of the two is already
...@@ -732,42 +726,51 @@ static int attempt_merge(struct request_queue *q, struct request *req, ...@@ -732,42 +726,51 @@ static int attempt_merge(struct request_queue *q, struct request *req,
if (blk_rq_cpu_valid(next)) if (blk_rq_cpu_valid(next))
req->cpu = next->cpu; req->cpu = next->cpu;
/* owner-ship of bio passed from next to req */ /*
* ownership of bio passed from next to req, return 'next' for
* the caller to free
*/
next->bio = NULL; next->bio = NULL;
__blk_put_request(q, next); return next;
return 1;
} }
int attempt_back_merge(struct request_queue *q, struct request *rq) struct request *attempt_back_merge(struct request_queue *q, struct request *rq)
{ {
struct request *next = elv_latter_request(q, rq); struct request *next = elv_latter_request(q, rq);
if (next) if (next)
return attempt_merge(q, rq, next); return attempt_merge(q, rq, next);
return 0; return NULL;
} }
int attempt_front_merge(struct request_queue *q, struct request *rq) struct request *attempt_front_merge(struct request_queue *q, struct request *rq)
{ {
struct request *prev = elv_former_request(q, rq); struct request *prev = elv_former_request(q, rq);
if (prev) if (prev)
return attempt_merge(q, prev, rq); return attempt_merge(q, prev, rq);
return 0; return NULL;
} }
int blk_attempt_req_merge(struct request_queue *q, struct request *rq, int blk_attempt_req_merge(struct request_queue *q, struct request *rq,
struct request *next) struct request *next)
{ {
struct elevator_queue *e = q->elevator; struct elevator_queue *e = q->elevator;
struct request *free;
if (e->type->ops.elevator_allow_rq_merge_fn) if (!e->uses_mq && e->type->ops.sq.elevator_allow_rq_merge_fn)
if (!e->type->ops.elevator_allow_rq_merge_fn(q, rq, next)) if (!e->type->ops.sq.elevator_allow_rq_merge_fn(q, rq, next))
return 0; return 0;
return attempt_merge(q, rq, next); free = attempt_merge(q, rq, next);
if (free) {
__blk_put_request(q, free);
return 1;
}
return 0;
} }
bool blk_rq_merge_ok(struct request *rq, struct bio *bio) bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
...@@ -798,9 +801,12 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio) ...@@ -798,9 +801,12 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
return true; return true;
} }
int blk_try_merge(struct request *rq, struct bio *bio) enum elv_merge blk_try_merge(struct request *rq, struct bio *bio)
{ {
if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector) if (req_op(rq) == REQ_OP_DISCARD &&
queue_max_discard_segments(rq->q) > 1)
return ELEVATOR_DISCARD_MERGE;
else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector)
return ELEVATOR_BACK_MERGE; return ELEVATOR_BACK_MERGE;
else if (blk_rq_pos(rq) - bio_sectors(bio) == bio->bi_iter.bi_sector) else if (blk_rq_pos(rq) - bio_sectors(bio) == bio->bi_iter.bi_sector)
return ELEVATOR_FRONT_MERGE; return ELEVATOR_FRONT_MERGE;
......
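
As the blk_attempt_req_merge() hunk above shows, attempt_merge() and its front/back wrappers now return the request whose bios were absorbed instead of freeing it, so the caller decides how and under which locking to put it. A short sketch of the resulting calling pattern on the legacy path:

	struct request *next;

	next = attempt_back_merge(q, rq);
	if (next)
		__blk_put_request(q, next);	/* drop the now-empty request */
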
#ifndef BLK_MQ_SCHED_H
#define BLK_MQ_SCHED_H
#include "blk-mq.h"
#include "blk-mq-tag.h"
int blk_mq_sched_init_hctx_data(struct request_queue *q, size_t size,
int (*init)(struct blk_mq_hw_ctx *),
void (*exit)(struct blk_mq_hw_ctx *));
void blk_mq_sched_free_hctx_data(struct request_queue *q,
void (*exit)(struct blk_mq_hw_ctx *));
struct request *blk_mq_sched_get_request(struct request_queue *q, struct bio *bio, unsigned int op, struct blk_mq_alloc_data *data);
void blk_mq_sched_put_request(struct request *rq);
void blk_mq_sched_request_inserted(struct request *rq);
bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
struct request **merged_request);
bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio);
bool blk_mq_sched_try_insert_merge(struct request_queue *q, struct request *rq);
void blk_mq_sched_restart_queues(struct blk_mq_hw_ctx *hctx);
void blk_mq_sched_insert_request(struct request *rq, bool at_head,
bool run_queue, bool async, bool can_block);
void blk_mq_sched_insert_requests(struct request_queue *q,
struct blk_mq_ctx *ctx,
struct list_head *list, bool run_queue_async);
void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx);
void blk_mq_sched_move_to_dispatch(struct blk_mq_hw_ctx *hctx,
struct list_head *rq_list,
struct request *(*get_rq)(struct blk_mq_hw_ctx *));
int blk_mq_sched_setup(struct request_queue *q);
void blk_mq_sched_teardown(struct request_queue *q);
int blk_mq_sched_init(struct request_queue *q);
static inline bool
blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
{
struct elevator_queue *e = q->elevator;
if (!e || blk_queue_nomerges(q) || !bio_mergeable(bio))
return false;
return __blk_mq_sched_bio_merge(q, bio);
}
static inline int blk_mq_sched_get_rq_priv(struct request_queue *q,
struct request *rq,
struct bio *bio)
{
struct elevator_queue *e = q->elevator;
if (e && e->type->ops.mq.get_rq_priv)
return e->type->ops.mq.get_rq_priv(q, rq, bio);
return 0;
}
static inline void blk_mq_sched_put_rq_priv(struct request_queue *q,
struct request *rq)
{
struct elevator_queue *e = q->elevator;
if (e && e->type->ops.mq.put_rq_priv)
e->type->ops.mq.put_rq_priv(q, rq);
}
static inline bool
blk_mq_sched_allow_merge(struct request_queue *q, struct request *rq,
struct bio *bio)
{
struct elevator_queue *e = q->elevator;
if (e && e->type->ops.mq.allow_merge)
return e->type->ops.mq.allow_merge(q, rq, bio);
return true;
}
static inline void
blk_mq_sched_completed_request(struct blk_mq_hw_ctx *hctx, struct request *rq)
{
struct elevator_queue *e = hctx->queue->elevator;
if (e && e->type->ops.mq.completed_request)
e->type->ops.mq.completed_request(hctx, rq);
BUG_ON(rq->internal_tag == -1);
blk_mq_put_tag(hctx, hctx->sched_tags, rq->mq_ctx, rq->internal_tag);
}
static inline void blk_mq_sched_started_request(struct request *rq)
{
struct request_queue *q = rq->q;
struct elevator_queue *e = q->elevator;
if (e && e->type->ops.mq.started_request)
e->type->ops.mq.started_request(rq);
}
static inline void blk_mq_sched_requeue_request(struct request *rq)
{
struct request_queue *q = rq->q;
struct elevator_queue *e = q->elevator;
if (e && e->type->ops.mq.requeue_request)
e->type->ops.mq.requeue_request(rq);
}
static inline bool blk_mq_sched_has_work(struct blk_mq_hw_ctx *hctx)
{
struct elevator_queue *e = hctx->queue->elevator;
if (e && e->type->ops.mq.has_work)
return e->type->ops.mq.has_work(hctx);
return false;
}
static inline void blk_mq_sched_mark_restart(struct blk_mq_hw_ctx *hctx)
{
if (!test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state)) {
set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
if (hctx->flags & BLK_MQ_F_TAG_SHARED) {
struct request_queue *q = hctx->queue;
if (!test_bit(QUEUE_FLAG_RESTART, &q->queue_flags))
set_bit(QUEUE_FLAG_RESTART, &q->queue_flags);
}
}
}
static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx)
{
return test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
}
#endif
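
The header above is the glue between blk-mq and an I/O scheduler: every hook is reached through the new ops.mq member of struct elevator_type, with uses_mq distinguishing it from the legacy ops.sq path. As a rough, hedged sketch (handler names are placeholders, the hook list is abbreviated, and the authoritative set is the elevator_mq_ops definition in elevator.h), registration of an mq-only scheduler looks roughly like this:

/* ex_* handler definitions omitted; see the mq-deadline port for the real thing. */
static struct elevator_type mq_sched_example = {
	.ops.mq = {
		.insert_requests	= ex_insert_requests,
		.dispatch_request	= ex_dispatch_request,
		.has_work		= ex_has_work,
		.init_sched		= ex_init_queue,
		.exit_sched		= ex_exit_queue,
	},
	.uses_mq	= true,
	.elevator_name	= "mq-example",
	.elevator_owner	= THIS_MODULE,
};

static int __init mq_sched_example_init(void)
{
	return elv_register(&mq_sched_example);
}
module_init(mq_sched_example_init);
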
...@@ -122,123 +122,16 @@ static ssize_t blk_mq_hw_sysfs_store(struct kobject *kobj, ...@@ -122,123 +122,16 @@ static ssize_t blk_mq_hw_sysfs_store(struct kobject *kobj,
return res; return res;
} }
static ssize_t blk_mq_sysfs_dispatched_show(struct blk_mq_ctx *ctx, char *page) static ssize_t blk_mq_hw_sysfs_nr_tags_show(struct blk_mq_hw_ctx *hctx,
{
return sprintf(page, "%lu %lu\n", ctx->rq_dispatched[1],
ctx->rq_dispatched[0]);
}
static ssize_t blk_mq_sysfs_merged_show(struct blk_mq_ctx *ctx, char *page)
{
return sprintf(page, "%lu\n", ctx->rq_merged);
}
static ssize_t blk_mq_sysfs_completed_show(struct blk_mq_ctx *ctx, char *page)
{
return sprintf(page, "%lu %lu\n", ctx->rq_completed[1],
ctx->rq_completed[0]);
}
static ssize_t sysfs_list_show(char *page, struct list_head *list, char *msg)
{
struct request *rq;
int len = snprintf(page, PAGE_SIZE - 1, "%s:\n", msg);
list_for_each_entry(rq, list, queuelist) {
const int rq_len = 2 * sizeof(rq) + 2;
/* if the output will be truncated */
if (PAGE_SIZE - 1 < len + rq_len) {
/* backspacing if it can't hold '\t...\n' */
if (PAGE_SIZE - 1 < len + 5)
len -= rq_len;
len += snprintf(page + len, PAGE_SIZE - 1 - len,
"\t...\n");
break;
}
len += snprintf(page + len, PAGE_SIZE - 1 - len,
"\t%p\n", rq);
}
return len;
}
static ssize_t blk_mq_sysfs_rq_list_show(struct blk_mq_ctx *ctx, char *page)
{
ssize_t ret;
spin_lock(&ctx->lock);
ret = sysfs_list_show(page, &ctx->rq_list, "CTX pending");
spin_unlock(&ctx->lock);
return ret;
}
static ssize_t blk_mq_hw_sysfs_poll_show(struct blk_mq_hw_ctx *hctx, char *page)
{
return sprintf(page, "considered=%lu, invoked=%lu, success=%lu\n",
hctx->poll_considered, hctx->poll_invoked,
hctx->poll_success);
}
static ssize_t blk_mq_hw_sysfs_poll_store(struct blk_mq_hw_ctx *hctx,
const char *page, size_t size)
{
hctx->poll_considered = hctx->poll_invoked = hctx->poll_success = 0;
return size;
}
static ssize_t blk_mq_hw_sysfs_queued_show(struct blk_mq_hw_ctx *hctx,
char *page) char *page)
{ {
return sprintf(page, "%lu\n", hctx->queued); return sprintf(page, "%u\n", hctx->tags->nr_tags);
} }
static ssize_t blk_mq_hw_sysfs_run_show(struct blk_mq_hw_ctx *hctx, char *page) static ssize_t blk_mq_hw_sysfs_nr_reserved_tags_show(struct blk_mq_hw_ctx *hctx,
{
return sprintf(page, "%lu\n", hctx->run);
}
static ssize_t blk_mq_hw_sysfs_dispatched_show(struct blk_mq_hw_ctx *hctx,
char *page) char *page)
{ {
char *start_page = page; return sprintf(page, "%u\n", hctx->tags->nr_reserved_tags);
int i;
page += sprintf(page, "%8u\t%lu\n", 0U, hctx->dispatched[0]);
for (i = 1; i < BLK_MQ_MAX_DISPATCH_ORDER - 1; i++) {
unsigned int d = 1U << (i - 1);
page += sprintf(page, "%8u\t%lu\n", d, hctx->dispatched[i]);
}
page += sprintf(page, "%8u+\t%lu\n", 1U << (i - 1),
hctx->dispatched[i]);
return page - start_page;
}
static ssize_t blk_mq_hw_sysfs_rq_list_show(struct blk_mq_hw_ctx *hctx,
char *page)
{
ssize_t ret;
spin_lock(&hctx->lock);
ret = sysfs_list_show(page, &hctx->dispatch, "HCTX pending");
spin_unlock(&hctx->lock);
return ret;
}
static ssize_t blk_mq_hw_sysfs_tags_show(struct blk_mq_hw_ctx *hctx, char *page)
{
return blk_mq_tag_sysfs_show(hctx->tags, page);
}
static ssize_t blk_mq_hw_sysfs_active_show(struct blk_mq_hw_ctx *hctx, char *page)
{
return sprintf(page, "%u\n", atomic_read(&hctx->nr_active));
} }
static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page) static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page)
...@@ -259,121 +152,27 @@ static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page) ...@@ -259,121 +152,27 @@ static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page)
return ret; return ret;
} }
static void blk_mq_stat_clear(struct blk_mq_hw_ctx *hctx)
{
struct blk_mq_ctx *ctx;
unsigned int i;
hctx_for_each_ctx(hctx, ctx, i) {
blk_stat_init(&ctx->stat[BLK_STAT_READ]);
blk_stat_init(&ctx->stat[BLK_STAT_WRITE]);
}
}
static ssize_t blk_mq_hw_sysfs_stat_store(struct blk_mq_hw_ctx *hctx,
const char *page, size_t count)
{
blk_mq_stat_clear(hctx);
return count;
}
static ssize_t print_stat(char *page, struct blk_rq_stat *stat, const char *pre)
{
return sprintf(page, "%s samples=%llu, mean=%lld, min=%lld, max=%lld\n",
pre, (long long) stat->nr_samples,
(long long) stat->mean, (long long) stat->min,
(long long) stat->max);
}
static ssize_t blk_mq_hw_sysfs_stat_show(struct blk_mq_hw_ctx *hctx, char *page)
{
struct blk_rq_stat stat[2];
ssize_t ret;
blk_stat_init(&stat[BLK_STAT_READ]);
blk_stat_init(&stat[BLK_STAT_WRITE]);
blk_hctx_stat_get(hctx, stat);
ret = print_stat(page, &stat[BLK_STAT_READ], "read :");
ret += print_stat(page + ret, &stat[BLK_STAT_WRITE], "write:");
return ret;
}
static struct blk_mq_ctx_sysfs_entry blk_mq_sysfs_dispatched = {
.attr = {.name = "dispatched", .mode = S_IRUGO },
.show = blk_mq_sysfs_dispatched_show,
};
static struct blk_mq_ctx_sysfs_entry blk_mq_sysfs_merged = {
.attr = {.name = "merged", .mode = S_IRUGO },
.show = blk_mq_sysfs_merged_show,
};
static struct blk_mq_ctx_sysfs_entry blk_mq_sysfs_completed = {
.attr = {.name = "completed", .mode = S_IRUGO },
.show = blk_mq_sysfs_completed_show,
};
static struct blk_mq_ctx_sysfs_entry blk_mq_sysfs_rq_list = {
.attr = {.name = "rq_list", .mode = S_IRUGO },
.show = blk_mq_sysfs_rq_list_show,
};
static struct attribute *default_ctx_attrs[] = { static struct attribute *default_ctx_attrs[] = {
&blk_mq_sysfs_dispatched.attr,
&blk_mq_sysfs_merged.attr,
&blk_mq_sysfs_completed.attr,
&blk_mq_sysfs_rq_list.attr,
NULL, NULL,
}; };
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_queued = { static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_nr_tags = {
.attr = {.name = "queued", .mode = S_IRUGO }, .attr = {.name = "nr_tags", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_queued_show, .show = blk_mq_hw_sysfs_nr_tags_show,
}; };
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_run = { static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_nr_reserved_tags = {
.attr = {.name = "run", .mode = S_IRUGO }, .attr = {.name = "nr_reserved_tags", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_run_show, .show = blk_mq_hw_sysfs_nr_reserved_tags_show,
};
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_dispatched = {
.attr = {.name = "dispatched", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_dispatched_show,
};
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_active = {
.attr = {.name = "active", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_active_show,
};
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_pending = {
.attr = {.name = "pending", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_rq_list_show,
};
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_tags = {
.attr = {.name = "tags", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_tags_show,
}; };
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_cpus = { static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_cpus = {
.attr = {.name = "cpu_list", .mode = S_IRUGO }, .attr = {.name = "cpu_list", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_cpus_show, .show = blk_mq_hw_sysfs_cpus_show,
}; };
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_poll = {
.attr = {.name = "io_poll", .mode = S_IWUSR | S_IRUGO },
.show = blk_mq_hw_sysfs_poll_show,
.store = blk_mq_hw_sysfs_poll_store,
};
static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_stat = {
.attr = {.name = "stats", .mode = S_IRUGO | S_IWUSR },
.show = blk_mq_hw_sysfs_stat_show,
.store = blk_mq_hw_sysfs_stat_store,
};
static struct attribute *default_hw_ctx_attrs[] = { static struct attribute *default_hw_ctx_attrs[] = {
&blk_mq_hw_sysfs_queued.attr, &blk_mq_hw_sysfs_nr_tags.attr,
&blk_mq_hw_sysfs_run.attr, &blk_mq_hw_sysfs_nr_reserved_tags.attr,
&blk_mq_hw_sysfs_dispatched.attr,
&blk_mq_hw_sysfs_pending.attr,
&blk_mq_hw_sysfs_tags.attr,
&blk_mq_hw_sysfs_cpus.attr, &blk_mq_hw_sysfs_cpus.attr,
&blk_mq_hw_sysfs_active.attr,
&blk_mq_hw_sysfs_poll.attr,
&blk_mq_hw_sysfs_stat.attr,
NULL, NULL,
}; };
...@@ -455,6 +254,8 @@ static void __blk_mq_unregister_dev(struct device *dev, struct request_queue *q) ...@@ -455,6 +254,8 @@ static void __blk_mq_unregister_dev(struct device *dev, struct request_queue *q)
kobject_put(&hctx->kobj); kobject_put(&hctx->kobj);
} }
blk_mq_debugfs_unregister_hctxs(q);
kobject_uevent(&q->mq_kobj, KOBJ_REMOVE); kobject_uevent(&q->mq_kobj, KOBJ_REMOVE);
kobject_del(&q->mq_kobj); kobject_del(&q->mq_kobj);
kobject_put(&q->mq_kobj); kobject_put(&q->mq_kobj);
...@@ -504,6 +305,8 @@ int blk_mq_register_dev(struct device *dev, struct request_queue *q) ...@@ -504,6 +305,8 @@ int blk_mq_register_dev(struct device *dev, struct request_queue *q)
kobject_uevent(&q->mq_kobj, KOBJ_ADD); kobject_uevent(&q->mq_kobj, KOBJ_ADD);
blk_mq_debugfs_register(q, kobject_name(&dev->kobj));
queue_for_each_hw_ctx(q, hctx, i) { queue_for_each_hw_ctx(q, hctx, i) {
ret = blk_mq_register_hctx(hctx); ret = blk_mq_register_hctx(hctx);
if (ret) if (ret)
...@@ -529,6 +332,8 @@ void blk_mq_sysfs_unregister(struct request_queue *q) ...@@ -529,6 +332,8 @@ void blk_mq_sysfs_unregister(struct request_queue *q)
if (!q->mq_sysfs_init_done) if (!q->mq_sysfs_init_done)
return; return;
blk_mq_debugfs_unregister_hctxs(q);
queue_for_each_hw_ctx(q, hctx, i) queue_for_each_hw_ctx(q, hctx, i)
blk_mq_unregister_hctx(hctx); blk_mq_unregister_hctx(hctx);
} }
...@@ -541,6 +346,8 @@ int blk_mq_sysfs_register(struct request_queue *q) ...@@ -541,6 +346,8 @@ int blk_mq_sysfs_register(struct request_queue *q)
if (!q->mq_sysfs_init_done) if (!q->mq_sysfs_init_done)
return ret; return ret;
blk_mq_debugfs_register_hctxs(q);
queue_for_each_hw_ctx(q, hctx, i) { queue_for_each_hw_ctx(q, hctx, i) {
ret = blk_mq_register_hctx(hctx); ret = blk_mq_register_hctx(hctx);
if (ret) if (ret)
......
...@@ -90,113 +90,97 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx, ...@@ -90,113 +90,97 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
return atomic_read(&hctx->nr_active) < depth; return atomic_read(&hctx->nr_active) < depth;
} }
static int __bt_get(struct blk_mq_hw_ctx *hctx, struct sbitmap_queue *bt) static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
struct sbitmap_queue *bt)
{ {
if (!hctx_may_queue(hctx, bt)) if (!(data->flags & BLK_MQ_REQ_INTERNAL) &&
!hctx_may_queue(data->hctx, bt))
return -1; return -1;
return __sbitmap_queue_get(bt); return __sbitmap_queue_get(bt);
} }
static int bt_get(struct blk_mq_alloc_data *data, struct sbitmap_queue *bt, unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags)
{ {
struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
struct sbitmap_queue *bt;
struct sbq_wait_state *ws; struct sbq_wait_state *ws;
DEFINE_WAIT(wait); DEFINE_WAIT(wait);
unsigned int tag_offset;
bool drop_ctx;
int tag; int tag;
tag = __bt_get(hctx, bt); if (data->flags & BLK_MQ_REQ_RESERVED) {
if (unlikely(!tags->nr_reserved_tags)) {
WARN_ON_ONCE(1);
return BLK_MQ_TAG_FAIL;
}
bt = &tags->breserved_tags;
tag_offset = 0;
} else {
bt = &tags->bitmap_tags;
tag_offset = tags->nr_reserved_tags;
}
tag = __blk_mq_get_tag(data, bt);
if (tag != -1) if (tag != -1)
return tag; goto found_tag;
if (data->flags & BLK_MQ_REQ_NOWAIT) if (data->flags & BLK_MQ_REQ_NOWAIT)
return -1; return BLK_MQ_TAG_FAIL;
ws = bt_wait_ptr(bt, hctx); ws = bt_wait_ptr(bt, data->hctx);
drop_ctx = data->ctx == NULL;
do { do {
prepare_to_wait(&ws->wait, &wait, TASK_UNINTERRUPTIBLE); prepare_to_wait(&ws->wait, &wait, TASK_UNINTERRUPTIBLE);
tag = __bt_get(hctx, bt); tag = __blk_mq_get_tag(data, bt);
if (tag != -1) if (tag != -1)
break; break;
/* /*
* We're out of tags on this hardware queue, kick any * We're out of tags on this hardware queue, kick any
* pending IO submits before going to sleep waiting for * pending IO submits before going to sleep waiting for
* some to complete. Note that hctx can be NULL here for * some to complete.
* reserved tag allocation.
*/ */
if (hctx) blk_mq_run_hw_queue(data->hctx, false);
blk_mq_run_hw_queue(hctx, false);
/* /*
* Retry tag allocation after running the hardware queue, * Retry tag allocation after running the hardware queue,
* as running the queue may also have found completions. * as running the queue may also have found completions.
*/ */
tag = __bt_get(hctx, bt); tag = __blk_mq_get_tag(data, bt);
if (tag != -1) if (tag != -1)
break; break;
if (data->ctx)
blk_mq_put_ctx(data->ctx); blk_mq_put_ctx(data->ctx);
io_schedule(); io_schedule();
data->ctx = blk_mq_get_ctx(data->q); data->ctx = blk_mq_get_ctx(data->q);
data->hctx = blk_mq_map_queue(data->q, data->ctx->cpu); data->hctx = blk_mq_map_queue(data->q, data->ctx->cpu);
if (data->flags & BLK_MQ_REQ_RESERVED) { tags = blk_mq_tags_from_data(data);
bt = &data->hctx->tags->breserved_tags; if (data->flags & BLK_MQ_REQ_RESERVED)
} else { bt = &tags->breserved_tags;
hctx = data->hctx; else
bt = &hctx->tags->bitmap_tags; bt = &tags->bitmap_tags;
}
finish_wait(&ws->wait, &wait);
ws = bt_wait_ptr(bt, hctx);
} while (1);
finish_wait(&ws->wait, &wait); finish_wait(&ws->wait, &wait);
return tag; ws = bt_wait_ptr(bt, data->hctx);
} } while (1);
static unsigned int __blk_mq_get_tag(struct blk_mq_alloc_data *data)
{
int tag;
tag = bt_get(data, &data->hctx->tags->bitmap_tags, data->hctx,
data->hctx->tags);
if (tag >= 0)
return tag + data->hctx->tags->nr_reserved_tags;
return BLK_MQ_TAG_FAIL;
}
static unsigned int __blk_mq_get_reserved_tag(struct blk_mq_alloc_data *data)
{
int tag;
if (unlikely(!data->hctx->tags->nr_reserved_tags)) {
WARN_ON_ONCE(1);
return BLK_MQ_TAG_FAIL;
}
tag = bt_get(data, &data->hctx->tags->breserved_tags, NULL, if (drop_ctx && data->ctx)
data->hctx->tags); blk_mq_put_ctx(data->ctx);
if (tag < 0)
return BLK_MQ_TAG_FAIL;
return tag; finish_wait(&ws->wait, &wait);
}
unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data) found_tag:
{ return tag + tag_offset;
if (data->flags & BLK_MQ_REQ_RESERVED)
return __blk_mq_get_reserved_tag(data);
return __blk_mq_get_tag(data);
} }
void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags,
unsigned int tag) struct blk_mq_ctx *ctx, unsigned int tag)
{ {
struct blk_mq_tags *tags = hctx->tags;
if (tag >= tags->nr_reserved_tags) { if (tag >= tags->nr_reserved_tags) {
const int real_tag = tag - tags->nr_reserved_tags; const int real_tag = tag - tags->nr_reserved_tags;
...@@ -312,11 +296,11 @@ int blk_mq_reinit_tagset(struct blk_mq_tag_set *set) ...@@ -312,11 +296,11 @@ int blk_mq_reinit_tagset(struct blk_mq_tag_set *set)
struct blk_mq_tags *tags = set->tags[i]; struct blk_mq_tags *tags = set->tags[i];
for (j = 0; j < tags->nr_tags; j++) { for (j = 0; j < tags->nr_tags; j++) {
if (!tags->rqs[j]) if (!tags->static_rqs[j])
continue; continue;
ret = set->ops->reinit_request(set->driver_data, ret = set->ops->reinit_request(set->driver_data,
tags->rqs[j]); tags->static_rqs[j]);
if (ret) if (ret)
goto out; goto out;
} }
...@@ -351,11 +335,6 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn, ...@@ -351,11 +335,6 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
} }
static unsigned int bt_unused_tags(const struct sbitmap_queue *bt)
{
return bt->sb.depth - sbitmap_weight(&bt->sb);
}
static int bt_alloc(struct sbitmap_queue *bt, unsigned int depth, static int bt_alloc(struct sbitmap_queue *bt, unsigned int depth,
bool round_robin, int node) bool round_robin, int node)
{ {
...@@ -411,19 +390,56 @@ void blk_mq_free_tags(struct blk_mq_tags *tags) ...@@ -411,19 +390,56 @@ void blk_mq_free_tags(struct blk_mq_tags *tags)
kfree(tags); kfree(tags);
} }
int blk_mq_tag_update_depth(struct blk_mq_tags *tags, unsigned int tdepth) int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
struct blk_mq_tags **tagsptr, unsigned int tdepth,
bool can_grow)
{ {
struct blk_mq_tags *tags = *tagsptr;
if (tdepth <= tags->nr_reserved_tags)
return -EINVAL;
tdepth -= tags->nr_reserved_tags; tdepth -= tags->nr_reserved_tags;
if (tdepth > tags->nr_tags)
/*
* If we are allowed to grow beyond the original size, allocate
* a new set of tags before freeing the old one.
*/
if (tdepth > tags->nr_tags) {
struct blk_mq_tag_set *set = hctx->queue->tag_set;
struct blk_mq_tags *new;
bool ret;
if (!can_grow)
return -EINVAL; return -EINVAL;
/* /*
* Don't need (or can't) update reserved tags here, they remain * We need some sort of upper limit, set it high enough that
* static and should never need resizing. * no valid use cases should require more.
*/
if (tdepth > 16 * BLKDEV_MAX_RQ)
return -EINVAL;
new = blk_mq_alloc_rq_map(set, hctx->queue_num, tdepth, 0);
if (!new)
return -ENOMEM;
ret = blk_mq_alloc_rqs(set, new, hctx->queue_num, tdepth);
if (ret) {
blk_mq_free_rq_map(new);
return -ENOMEM;
}
blk_mq_free_rqs(set, *tagsptr, hctx->queue_num);
blk_mq_free_rq_map(*tagsptr);
*tagsptr = new;
} else {
/*
* Don't need (or can't) update reserved tags here, they
* remain static and should never need resizing.
*/ */
sbitmap_queue_resize(&tags->bitmap_tags, tdepth); sbitmap_queue_resize(&tags->bitmap_tags, tdepth);
}
blk_mq_tag_wakeup_all(tags, false);
return 0; return 0;
} }
@@ -454,25 +470,3 @@ u32 blk_mq_unique_tag(struct request *rq)
         (rq->tag & BLK_MQ_UNIQUE_TAG_MASK);
 }
 EXPORT_SYMBOL(blk_mq_unique_tag);
-
-ssize_t blk_mq_tag_sysfs_show(struct blk_mq_tags *tags, char *page)
-{
-    char *orig_page = page;
-    unsigned int free, res;
-
-    if (!tags)
-        return 0;
-
-    page += sprintf(page, "nr_tags=%u, reserved_tags=%u, "
-            "bits_per_word=%u\n",
-            tags->nr_tags, tags->nr_reserved_tags,
-            1U << tags->bitmap_tags.sb.shift);
-
-    free = bt_unused_tags(&tags->bitmap_tags);
-    res = bt_unused_tags(&tags->breserved_tags);
-
-    page += sprintf(page, "nr_free=%u, nr_reserved=%u\n", free, res);
-    page += sprintf(page, "active_queues=%u\n", atomic_read(&tags->active_queues));
-
-    return page - orig_page;
-}

block/blk-mq-tag.h:

@@ -16,6 +16,7 @@ struct blk_mq_tags {
     struct sbitmap_queue breserved_tags;

     struct request **rqs;
+    struct request **static_rqs;
     struct list_head page_list;
 };

@@ -24,11 +25,12 @@ extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, unsigned int r
 extern void blk_mq_free_tags(struct blk_mq_tags *tags);

 extern unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data);
-extern void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
-                           unsigned int tag);
+extern void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags,
+                           struct blk_mq_ctx *ctx, unsigned int tag);
 extern bool blk_mq_has_free_tags(struct blk_mq_tags *tags);
-extern ssize_t blk_mq_tag_sysfs_show(struct blk_mq_tags *tags, char *page);
-extern int blk_mq_tag_update_depth(struct blk_mq_tags *tags, unsigned int depth);
+extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
+                                   struct blk_mq_tags **tags,
+                                   unsigned int depth, bool can_grow);
 extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
 void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
         void *priv);
...
This diff is collapsed.
block/blk-mq.h:

@@ -32,7 +32,31 @@ void blk_mq_free_queue(struct request_queue *q);
 int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
 void blk_mq_wake_waiters(struct request_queue *q);
 bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *, struct list_head *);
+void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list);
+bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx);
+bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
+                           bool wait);
+
+/*
+ * Internal helpers for allocating/freeing the request map
+ */
+void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
+                     unsigned int hctx_idx);
+void blk_mq_free_rq_map(struct blk_mq_tags *tags);
+struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
+                                        unsigned int hctx_idx,
+                                        unsigned int nr_tags,
+                                        unsigned int reserved_tags);
+int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
+                     unsigned int hctx_idx, unsigned int depth);
+
+/*
+ * Internal helpers for request insertion into sw queues
+ */
+void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
+                             bool at_head);
+void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
+                            struct list_head *list);
 /*
  * CPU hotplug helpers
  */

@@ -57,6 +81,35 @@ extern int blk_mq_sysfs_register(struct request_queue *q);
 extern void blk_mq_sysfs_unregister(struct request_queue *q);
 extern void blk_mq_hctx_kobj_init(struct blk_mq_hw_ctx *hctx);

+/*
+ * debugfs helpers
+ */
+#ifdef CONFIG_BLK_DEBUG_FS
+int blk_mq_debugfs_register(struct request_queue *q, const char *name);
+void blk_mq_debugfs_unregister(struct request_queue *q);
+int blk_mq_debugfs_register_hctxs(struct request_queue *q);
+void blk_mq_debugfs_unregister_hctxs(struct request_queue *q);
+#else
+static inline int blk_mq_debugfs_register(struct request_queue *q,
+                                          const char *name)
+{
+    return 0;
+}
+
+static inline void blk_mq_debugfs_unregister(struct request_queue *q)
+{
+}
+
+static inline int blk_mq_debugfs_register_hctxs(struct request_queue *q)
+{
+    return 0;
+}
+
+static inline void blk_mq_debugfs_unregister_hctxs(struct request_queue *q)
+{
+}
+#endif
+
 extern void blk_mq_rq_timed_out(struct request *req, bool reserved);

 void blk_mq_release(struct request_queue *q);

@@ -103,6 +156,25 @@ static inline void blk_mq_set_alloc_data(struct blk_mq_alloc_data *data,
     data->hctx = hctx;
 }

+static inline struct blk_mq_tags *blk_mq_tags_from_data(struct blk_mq_alloc_data *data)
+{
+    if (data->flags & BLK_MQ_REQ_INTERNAL)
+        return data->hctx->sched_tags;
+
+    return data->hctx->tags;
+}
+
+/*
+ * Internal helpers for request allocation/init/free
+ */
+void blk_mq_rq_ctx_init(struct request_queue *q, struct blk_mq_ctx *ctx,
+                        struct request *rq, unsigned int op);
+void __blk_mq_finish_request(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
+                             struct request *rq);
+void blk_mq_finish_request(struct request *rq);
+struct request *__blk_mq_alloc_request(struct blk_mq_alloc_data *data,
+                                       unsigned int op);
+
 static inline bool blk_mq_hctx_stopped(struct blk_mq_hw_ctx *hctx)
 {
     return test_bit(BLK_MQ_S_STOPPED, &hctx->state);
...
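To make the split between driver tags and scheduler tags concrete, here is a rough, illustration-only sketch of how an allocation carrying BLK_MQ_REQ_INTERNAL ends up with a scheduler-owned request. The real path is __blk_mq_alloc_request() in the collapsed blk-mq.c diff; the function below is invented for this example and simplified (no init, no failure bookkeeping).

/* Simplified sketch, not the actual blk-mq.c code. */
static struct request *example_alloc_rq(struct blk_mq_alloc_data *data)
{
    struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
    unsigned int tag = blk_mq_get_tag(data);

    if (tag == BLK_MQ_TAG_FAIL)
        return NULL;

    /* static_rqs[] holds the preallocated requests for this tag space */
    return tags->static_rqs[tag];
}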
block/blk-settings.c:

@@ -88,6 +88,7 @@ EXPORT_SYMBOL_GPL(blk_queue_lld_busy);
 void blk_set_default_limits(struct queue_limits *lim)
 {
     lim->max_segments = BLK_MAX_SEGMENTS;
+    lim->max_discard_segments = 1;
     lim->max_integrity_segments = 0;
     lim->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK;
     lim->virt_boundary_mask = 0;

@@ -128,6 +129,7 @@ void blk_set_stacking_limits(struct queue_limits *lim)
     /* Inherit limits from component devices */
     lim->discard_zeroes_data = 1;
     lim->max_segments = USHRT_MAX;
+    lim->max_discard_segments = 1;
     lim->max_hw_sectors = UINT_MAX;
     lim->max_segment_size = UINT_MAX;
     lim->max_sectors = UINT_MAX;

@@ -253,7 +255,7 @@ void blk_queue_max_hw_sectors(struct request_queue *q, unsigned int max_hw_secto
     max_sectors = min_not_zero(max_hw_sectors, limits->max_dev_sectors);
     max_sectors = min_t(unsigned int, max_sectors, BLK_DEF_MAX_SECTORS);
     limits->max_sectors = max_sectors;
-    q->backing_dev_info.io_pages = max_sectors >> (PAGE_SHIFT - 9);
+    q->backing_dev_info->io_pages = max_sectors >> (PAGE_SHIFT - 9);
 }
 EXPORT_SYMBOL(blk_queue_max_hw_sectors);

@@ -336,6 +338,22 @@ void blk_queue_max_segments(struct request_queue *q, unsigned short max_segments
 }
 EXPORT_SYMBOL(blk_queue_max_segments);

+/**
+ * blk_queue_max_discard_segments - set max segments for discard requests
+ * @q:  the request queue for the device
+ * @max_segments:  max number of segments
+ *
+ * Description:
+ *    Enables a low level driver to set an upper limit on the number of
+ *    segments in a discard request.
+ **/
+void blk_queue_max_discard_segments(struct request_queue *q,
+                                    unsigned short max_segments)
+{
+    q->limits.max_discard_segments = max_segments;
+}
+EXPORT_SYMBOL_GPL(blk_queue_max_discard_segments);
+
 /**
  * blk_queue_max_segment_size - set max segment size for blk_rq_map_sg
  * @q:  the request queue for the device

@@ -553,6 +571,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
                         b->virt_boundary_mask);

     t->max_segments = min_not_zero(t->max_segments, b->max_segments);
+    t->max_discard_segments = min_not_zero(t->max_discard_segments,
+                                           b->max_discard_segments);
     t->max_integrity_segments = min_not_zero(t->max_integrity_segments,
                                              b->max_integrity_segments);
...
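The new limit is meant to be set by drivers whose devices accept several ranges in a single discard command. A hypothetical driver (name and value invented for illustration) would advertise it during queue setup roughly like this; requests that stay within the limit can then be combined into a single ranged discard by the new discard-merge path.

/* Hypothetical driver setup, for illustration only. */
static void exampledrv_config_discard(struct request_queue *q)
{
    /* assume the device accepts up to 16 ranges per discard command */
    blk_queue_max_discard_segments(q, 16);
    blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
}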
block/blk-sysfs.c:

@@ -89,7 +89,7 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)

 static ssize_t queue_ra_show(struct request_queue *q, char *page)
 {
-    unsigned long ra_kb = q->backing_dev_info.ra_pages <<
+    unsigned long ra_kb = q->backing_dev_info->ra_pages <<
                     (PAGE_SHIFT - 10);

     return queue_var_show(ra_kb, (page));

@@ -104,7 +104,7 @@ queue_ra_store(struct request_queue *q, const char *page, size_t count)
     if (ret < 0)
         return ret;

-    q->backing_dev_info.ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+    q->backing_dev_info->ra_pages = ra_kb >> (PAGE_SHIFT - 10);

     return ret;
 }

@@ -121,6 +121,12 @@ static ssize_t queue_max_segments_show(struct request_queue *q, char *page)
     return queue_var_show(queue_max_segments(q), (page));
 }

+static ssize_t queue_max_discard_segments_show(struct request_queue *q,
+                                               char *page)
+{
+    return queue_var_show(queue_max_discard_segments(q), (page));
+}
+
 static ssize_t queue_max_integrity_segments_show(struct request_queue *q, char *page)
 {
     return queue_var_show(q->limits.max_integrity_segments, (page));

@@ -236,7 +242,7 @@ queue_max_sectors_store(struct request_queue *q, const char *page, size_t count)
     spin_lock_irq(q->queue_lock);
     q->limits.max_sectors = max_sectors_kb << 1;
-    q->backing_dev_info.io_pages = max_sectors_kb >> (PAGE_SHIFT - 10);
+    q->backing_dev_info->io_pages = max_sectors_kb >> (PAGE_SHIFT - 10);
     spin_unlock_irq(q->queue_lock);

     return ret;

@@ -545,6 +551,11 @@ static struct queue_sysfs_entry queue_max_segments_entry = {
     .show = queue_max_segments_show,
 };

+static struct queue_sysfs_entry queue_max_discard_segments_entry = {
+    .attr = {.name = "max_discard_segments", .mode = S_IRUGO },
+    .show = queue_max_discard_segments_show,
+};
+
 static struct queue_sysfs_entry queue_max_integrity_segments_entry = {
     .attr = {.name = "max_integrity_segments", .mode = S_IRUGO },
     .show = queue_max_integrity_segments_show,

@@ -697,6 +708,7 @@ static struct attribute *default_attrs[] = {
     &queue_max_hw_sectors_entry.attr,
     &queue_max_sectors_entry.attr,
     &queue_max_segments_entry.attr,
+    &queue_max_discard_segments_entry.attr,
     &queue_max_integrity_segments_entry.attr,
     &queue_max_segment_size_entry.attr,
     &queue_iosched_entry.attr,

@@ -799,7 +811,7 @@ static void blk_release_queue(struct kobject *kobj)
         container_of(kobj, struct request_queue, kobj);

     wbt_exit(q);
-    bdi_exit(&q->backing_dev_info);
+    bdi_put(q->backing_dev_info);
     blkcg_exit_queue(q);

     if (q->elevator) {

@@ -814,13 +826,19 @@ static void blk_release_queue(struct kobject *kobj)
     if (q->queue_tags)
         __blk_queue_free_tags(q);

-    if (!q->mq_ops)
+    if (!q->mq_ops) {
+        if (q->exit_rq_fn)
+            q->exit_rq_fn(q, q->fq->flush_rq);
         blk_free_flush_queue(q->fq);
-    else
+    } else {
         blk_mq_release(q);
+    }

     blk_trace_shutdown(q);

+    if (q->mq_ops)
+        blk_mq_debugfs_unregister(q);
+
     if (q->bio_split)
         bioset_free(q->bio_split);

@@ -884,32 +902,36 @@ int blk_register_queue(struct gendisk *disk)
     if (ret)
         return ret;

+    if (q->mq_ops)
+        blk_mq_register_dev(dev, q);
+
+    /* Prevent changes through sysfs until registration is completed. */
+    mutex_lock(&q->sysfs_lock);
+
     ret = kobject_add(&q->kobj, kobject_get(&dev->kobj), "%s", "queue");
     if (ret < 0) {
         blk_trace_remove_sysfs(dev);
-        return ret;
+        goto unlock;
     }

     kobject_uevent(&q->kobj, KOBJ_ADD);

-    if (q->mq_ops)
-        blk_mq_register_dev(dev, q);
-
     blk_wb_init(q);

-    if (!q->request_fn)
-        return 0;
-
-    ret = elv_register_queue(q);
-    if (ret) {
-        kobject_uevent(&q->kobj, KOBJ_REMOVE);
-        kobject_del(&q->kobj);
-        blk_trace_remove_sysfs(dev);
-        kobject_put(&dev->kobj);
-        return ret;
+    if (q->request_fn || (q->mq_ops && q->elevator)) {
+        ret = elv_register_queue(q);
+        if (ret) {
+            kobject_uevent(&q->kobj, KOBJ_REMOVE);
+            kobject_del(&q->kobj);
+            blk_trace_remove_sysfs(dev);
+            kobject_put(&dev->kobj);
+            goto unlock;
+        }
     }
-
-    return 0;
+    ret = 0;
+unlock:
+    mutex_unlock(&q->sysfs_lock);
+    return ret;
 }

 void blk_unregister_queue(struct gendisk *disk)

@@ -922,7 +944,7 @@ void blk_unregister_queue(struct gendisk *disk)
     if (q->mq_ops)
         blk_mq_unregister_dev(disk_to_dev(disk), q);

-    if (q->request_fn)
+    if (q->request_fn || (q->mq_ops && q->elevator))
         elv_unregister_queue(q);

     kobject_uevent(&q->kobj, KOBJ_REMOVE);
...
block/blk-tag.c:

@@ -272,6 +272,7 @@ void blk_queue_end_tag(struct request_queue *q, struct request *rq)
     list_del_init(&rq->queuelist);
     rq->rq_flags &= ~RQF_QUEUED;
     rq->tag = -1;
+    rq->internal_tag = -1;

     if (unlikely(bqt->tag_index[tag] == NULL))
         printk(KERN_ERR "%s: tag %d is missing\n",
...
block/blk-throttle.c:

@@ -866,10 +866,12 @@ static void tg_update_disptime(struct throtl_grp *tg)
     unsigned long read_wait = -1, write_wait = -1, min_wait = -1, disptime;
     struct bio *bio;

-    if ((bio = throtl_peek_queued(&sq->queued[READ])))
+    bio = throtl_peek_queued(&sq->queued[READ]);
+    if (bio)
         tg_may_dispatch(tg, bio, &read_wait);

-    if ((bio = throtl_peek_queued(&sq->queued[WRITE])))
+    bio = throtl_peek_queued(&sq->queued[WRITE]);
+    if (bio)
         tg_may_dispatch(tg, bio, &write_wait);

     min_wait = min(read_wait, write_wait);
...
block/blk-wbt.c:

@@ -96,7 +96,7 @@ static void wb_timestamp(struct rq_wb *rwb, unsigned long *var)
  */
 static bool wb_recent_wait(struct rq_wb *rwb)
 {
-    struct bdi_writeback *wb = &rwb->queue->backing_dev_info.wb;
+    struct bdi_writeback *wb = &rwb->queue->backing_dev_info->wb;

     return time_before(jiffies, wb->dirty_sleep + HZ);
 }

@@ -279,7 +279,7 @@ enum {
 static int __latency_exceeded(struct rq_wb *rwb, struct blk_rq_stat *stat)
 {
-    struct backing_dev_info *bdi = &rwb->queue->backing_dev_info;
+    struct backing_dev_info *bdi = rwb->queue->backing_dev_info;
     u64 thislat;

     /*

@@ -339,7 +339,7 @@ static int latency_exceeded(struct rq_wb *rwb)
 static void rwb_trace_step(struct rq_wb *rwb, const char *msg)
 {
-    struct backing_dev_info *bdi = &rwb->queue->backing_dev_info;
+    struct backing_dev_info *bdi = rwb->queue->backing_dev_info;

     trace_wbt_step(bdi, msg, rwb->scale_step, rwb->cur_win_nsec,
             rwb->wb_background, rwb->wb_normal, rwb->wb_max);

@@ -423,7 +423,7 @@ static void wb_timer_fn(unsigned long data)
     status = latency_exceeded(rwb);

-    trace_wbt_timer(&rwb->queue->backing_dev_info, status, rwb->scale_step,
+    trace_wbt_timer(rwb->queue->backing_dev_info, status, rwb->scale_step,
             inflight);

     /*
...
block/blk.h:

@@ -14,6 +14,10 @@
 /* Max future timer expiry for timeouts */
 #define BLK_MAX_TIMEOUT     (5 * HZ)

+#ifdef CONFIG_DEBUG_FS
+extern struct dentry *blk_debugfs_root;
+#endif
+
 struct blk_flush_queue {
     unsigned int        flush_queue_delayed:1;
     unsigned int        flush_pending_idx:1;

@@ -96,6 +100,8 @@ bool bio_attempt_front_merge(struct request_queue *q, struct request *req,
                              struct bio *bio);
 bool bio_attempt_back_merge(struct request_queue *q, struct request *req,
                             struct bio *bio);
+bool bio_attempt_discard_merge(struct request_queue *q, struct request *req,
+                               struct bio *bio);
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
                             unsigned int *request_count,
                             struct request **same_queue_rq);

@@ -167,7 +173,7 @@ static inline struct request *__elv_next_request(struct request_queue *q)
             return NULL;
         }
         if (unlikely(blk_queue_bypass(q)) ||
-            !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
+            !q->elevator->type->ops.sq.elevator_dispatch_fn(q, 0))
             return NULL;
     }
 }

@@ -176,16 +182,16 @@ static inline void elv_activate_rq(struct request_queue *q, struct request *rq)
 {
     struct elevator_queue *e = q->elevator;

-    if (e->type->ops.elevator_activate_req_fn)
-        e->type->ops.elevator_activate_req_fn(q, rq);
+    if (e->type->ops.sq.elevator_activate_req_fn)
+        e->type->ops.sq.elevator_activate_req_fn(q, rq);
 }

 static inline void elv_deactivate_rq(struct request_queue *q, struct request *rq)
 {
     struct elevator_queue *e = q->elevator;

-    if (e->type->ops.elevator_deactivate_req_fn)
-        e->type->ops.elevator_deactivate_req_fn(q, rq);
+    if (e->type->ops.sq.elevator_deactivate_req_fn)
+        e->type->ops.sq.elevator_deactivate_req_fn(q, rq);
 }

 #ifdef CONFIG_FAIL_IO_TIMEOUT

@@ -204,14 +210,14 @@ int ll_back_merge_fn(struct request_queue *q, struct request *req,
                      struct bio *bio);
 int ll_front_merge_fn(struct request_queue *q, struct request *req,
                       struct bio *bio);
-int attempt_back_merge(struct request_queue *q, struct request *rq);
-int attempt_front_merge(struct request_queue *q, struct request *rq);
+struct request *attempt_back_merge(struct request_queue *q, struct request *rq);
+struct request *attempt_front_merge(struct request_queue *q, struct request *rq);
 int blk_attempt_req_merge(struct request_queue *q, struct request *rq,
                           struct request *next);
 void blk_recalc_rq_segments(struct request *rq);
 void blk_rq_set_mixed_merge(struct request *rq);
 bool blk_rq_merge_ok(struct request *rq, struct bio *bio);
-int blk_try_merge(struct request *rq, struct bio *bio);
+enum elv_merge blk_try_merge(struct request *rq, struct bio *bio);

 void blk_queue_congestion_threshold(struct request_queue *q);

@@ -249,7 +255,14 @@ static inline int blk_do_io_stat(struct request *rq)
 {
     return rq->rq_disk &&
            (rq->rq_flags & RQF_IO_STAT) &&
-           (rq->cmd_type == REQ_TYPE_FS);
+           !blk_rq_is_passthrough(rq);
+}
+
+static inline void req_set_nomerge(struct request_queue *q, struct request *req)
+{
+    req->cmd_flags |= REQ_NOMERGE;
+    if (req == q->last_merge)
+        q->last_merge = NULL;
 }

 /*

@@ -263,6 +276,22 @@ void ioc_clear_queue(struct request_queue *q);
 int create_task_io_context(struct task_struct *task, gfp_t gfp_mask, int node);

+/**
+ * rq_ioc - determine io_context for request allocation
+ * @bio: request being allocated is for this bio (can be %NULL)
+ *
+ * Determine io_context to use for request allocation for @bio. May return
+ * %NULL if %current->io_context doesn't exist.
+ */
+static inline struct io_context *rq_ioc(struct bio *bio)
+{
+#ifdef CONFIG_BLK_CGROUP
+    if (bio && bio->bi_ioc)
+        return bio->bi_ioc;
+#endif
+    return current->io_context;
+}
+
 /**
  * create_io_context - try to create task->io_context
  * @gfp_mask: allocation mask
...
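The new req_set_nomerge() helper centralizes a pattern the merge paths repeat: once a request cannot take more data, stop offering it as a merge candidate. Below is a rough, illustration-only sketch of a back-merge check using it; the function name and the simplified size check against queue_max_sectors() are invented here, and the real users live in the collapsed blk-core.c/blk-merge.c diffs.

/* Illustrative only; not the kernel's actual merge code. */
static int example_back_merge_ok(struct request_queue *q, struct request *req,
                                 struct bio *bio)
{
    /* simplified limit check for the sketch */
    if (blk_rq_sectors(req) + bio_sectors(bio) > queue_max_sectors(q)) {
        req_set_nomerge(q, req);    /* too big: stop trying to merge into it */
        return 0;
    }
    return 1;
}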
block/bsg-lib.c:

@@ -71,22 +71,24 @@ void bsg_job_done(struct bsg_job *job, int result,
 {
     struct request *req = job->req;
     struct request *rsp = req->next_rq;
+    struct scsi_request *rq = scsi_req(req);
     int err;

     err = job->req->errors = result;
     if (err < 0)
         /* we're only returning the result field in the reply */
-        job->req->sense_len = sizeof(u32);
+        rq->sense_len = sizeof(u32);
     else
-        job->req->sense_len = job->reply_len;
+        rq->sense_len = job->reply_len;
     /* we assume all request payload was transferred, residual == 0 */
-    req->resid_len = 0;
+    rq->resid_len = 0;

     if (rsp) {
-        WARN_ON(reply_payload_rcv_len > rsp->resid_len);
+        WARN_ON(reply_payload_rcv_len > scsi_req(rsp)->resid_len);

         /* set reply (bidi) residual */
-        rsp->resid_len -= min(reply_payload_rcv_len, rsp->resid_len);
+        scsi_req(rsp)->resid_len -=
+            min(reply_payload_rcv_len, scsi_req(rsp)->resid_len);
     }
     blk_complete_request(req);
 }

@@ -113,6 +115,7 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
     if (!buf->sg_list)
         return -ENOMEM;
     sg_init_table(buf->sg_list, req->nr_phys_segments);
+    scsi_req(req)->resid_len = blk_rq_bytes(req);
     buf->sg_cnt = blk_rq_map_sg(req->q, req, buf->sg_list);
     buf->payload_len = blk_rq_bytes(req);
     return 0;

@@ -127,6 +130,7 @@ static int bsg_create_job(struct device *dev, struct request *req)
 {
     struct request *rsp = req->next_rq;
     struct request_queue *q = req->q;
+    struct scsi_request *rq = scsi_req(req);
     struct bsg_job *job;
     int ret;

@@ -140,9 +144,9 @@ static int bsg_create_job(struct device *dev, struct request *req)
     job->req = req;
     if (q->bsg_job_size)
         job->dd_data = (void *)&job[1];
-    job->request = req->cmd;
-    job->request_len = req->cmd_len;
-    job->reply = req->sense;
+    job->request = rq->cmd;
+    job->request_len = rq->cmd_len;
+    job->reply = rq->sense;
     job->reply_len = SCSI_SENSE_BUFFERSIZE; /* Size of sense buffer
                                              * allocated */
     if (req->bio) {

@@ -177,7 +181,7 @@ static int bsg_create_job(struct device *dev, struct request *req)
  *
  * Drivers/subsys should pass this to the queue init function.
  */
-void bsg_request_fn(struct request_queue *q)
+static void bsg_request_fn(struct request_queue *q)
     __releases(q->queue_lock)
     __acquires(q->queue_lock)
 {

@@ -214,24 +218,30 @@ void bsg_request_fn(struct request_queue *q)
         put_device(dev);
     spin_lock_irq(q->queue_lock);
 }
-EXPORT_SYMBOL_GPL(bsg_request_fn);

 /**
  * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
  * @dev: device to attach bsg device to
- * @q: request queue setup by caller
  * @name: device to give bsg device
  * @job_fn: bsg job handler
  * @dd_job_size: size of LLD data needed for each job
- *
- * The caller should have setup the reuqest queue with bsg_request_fn
- * as the request_fn.
  */
-int bsg_setup_queue(struct device *dev, struct request_queue *q,
-                    char *name, bsg_job_fn *job_fn, int dd_job_size)
+struct request_queue *bsg_setup_queue(struct device *dev, char *name,
+                                      bsg_job_fn *job_fn, int dd_job_size)
 {
+    struct request_queue *q;
     int ret;

+    q = blk_alloc_queue(GFP_KERNEL);
+    if (!q)
+        return ERR_PTR(-ENOMEM);
+    q->cmd_size = sizeof(struct scsi_request);
+    q->request_fn = bsg_request_fn;
+
+    ret = blk_init_allocated_queue(q);
+    if (ret)
+        goto out_cleanup_queue;
+
     q->queuedata = dev;
     q->bsg_job_size = dd_job_size;
     q->bsg_job_fn = job_fn;

@@ -243,9 +253,12 @@ int bsg_setup_queue(struct device *dev, struct request_queue *q,
     if (ret) {
         printk(KERN_ERR "%s: bsg interface failed to "
                "initialize - register queue\n", dev->kobj.name);
-        return ret;
+        goto out_cleanup_queue;
     }

-    return 0;
+    return q;
+out_cleanup_queue:
+    blk_cleanup_queue(q);
+    return ERR_PTR(ret);
 }
 EXPORT_SYMBOL_GPL(bsg_setup_queue);
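With this change bsg-lib owns the request_queue: callers no longer allocate a queue and wire up bsg_request_fn themselves, they simply receive one (or an ERR_PTR) from bsg_setup_queue(). A hypothetical LLD conversion would look roughly like the sketch below; the driver names are invented for illustration.

/* Hypothetical LLD, for illustration only. */
static int exampledrv_bsg_job(struct bsg_job *job)
{
    /* handle the job, then complete it with a zero result */
    bsg_job_done(job, 0, 0);
    return 0;
}

static int exampledrv_attach_bsg(struct device *dev)
{
    struct request_queue *q;

    q = bsg_setup_queue(dev, "exampledrv", exampledrv_bsg_job, 0);
    if (IS_ERR(q))
        return PTR_ERR(q);

    /* q->request_fn and q->cmd_size are already set up by bsg-lib */
    return 0;
}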
This diff is collapsed.
block/cfq-iosched.c:

@@ -2528,7 +2528,7 @@ static void cfq_remove_request(struct request *rq)
     }
 }

-static int cfq_merge(struct request_queue *q, struct request **req,
+static enum elv_merge cfq_merge(struct request_queue *q, struct request **req,
                      struct bio *bio)
 {
     struct cfq_data *cfqd = q->elevator->elevator_data;

@@ -2544,7 +2544,7 @@ static int cfq_merge(struct request_queue *q, struct request **req,
 }

 static void cfq_merged_request(struct request_queue *q, struct request *req,
-                               int type)
+                               enum elv_merge type)
 {
     if (type == ELEVATOR_FRONT_MERGE) {
         struct cfq_queue *cfqq = RQ_CFQQ(req);

@@ -2749,9 +2749,11 @@ static struct cfq_queue *cfq_get_next_queue_forced(struct cfq_data *cfqd)
     if (!cfqg)
         return NULL;

-    for_each_cfqg_st(cfqg, i, j, st)
-        if ((cfqq = cfq_rb_first(st)) != NULL)
+    for_each_cfqg_st(cfqg, i, j, st) {
+        cfqq = cfq_rb_first(st);
+        if (cfqq)
             return cfqq;
+    }
     return NULL;
 }

@@ -3860,6 +3862,8 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
         goto out;
     }

+    /* cfq_init_cfqq() assumes cfqq->ioprio_class is initialized. */
+    cfqq->ioprio_class = IOPRIO_CLASS_NONE;
     cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
     cfq_init_prio_data(cfqq, cic);
     cfq_link_cfqq_cfqg(cfqq, cfqg);

@@ -4838,7 +4842,7 @@ static struct elv_fs_entry cfq_attrs[] = {
 };

 static struct elevator_type iosched_cfq = {
-    .ops = {
+    .ops.sq = {
         .elevator_merge_fn =        cfq_merge,
         .elevator_merged_fn =       cfq_merged_request,
         .elevator_merge_req_fn =    cfq_merged_requests,
...
block/compat_ioctl.c:

@@ -661,7 +661,6 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
     struct block_device *bdev = inode->i_bdev;
     struct gendisk *disk = bdev->bd_disk;
     fmode_t mode = file->f_mode;
-    struct backing_dev_info *bdi;
     loff_t size;
     unsigned int max_sectors;

@@ -708,9 +707,8 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
     case BLKFRAGET:
         if (!arg)
             return -EINVAL;
-        bdi = blk_get_backing_dev_info(bdev);
         return compat_put_long(arg,
-                   (bdi->ra_pages * PAGE_SIZE) / 512);
+                   (bdev->bd_bdi->ra_pages * PAGE_SIZE) / 512);
     case BLKROGET: /* compatible */
         return compat_put_int(arg, bdev_read_only(bdev) != 0);
     case BLKBSZGET_32: /* get the logical block size (cf. BLKSSZGET) */

@@ -728,8 +726,7 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
     case BLKFRASET:
         if (!capable(CAP_SYS_ADMIN))
             return -EACCES;
-        bdi = blk_get_backing_dev_info(bdev);
-        bdi->ra_pages = (arg * 512) / PAGE_SIZE;
+        bdev->bd_bdi->ra_pages = (arg * 512) / PAGE_SIZE;
         return 0;
     case BLKGETSIZE:
         size = i_size_read(bdev->bd_inode);
...
The diffs for the remaining files in this merge are collapsed.