Commit 430a67f9 authored by Paolo Valente, committed by Jens Axboe

block, bfq: merge bursts of newly-created queues

Many throughput-sensitive workloads are made of several parallel I/O
flows, with all flows generated by the same application, or more
generically by the same task (e.g., system boot). The most
counterproductive action with these workloads is plugging I/O dispatch
when one of the bfq_queues associated with these flows remains
temporarily empty.

To avoid this plugging, BFQ has been using a burst-handling mechanism
for years now. The mechanism has proven effective for throughput, and
not detrimental to service guarantees. This commit pushes the
mechanism a little further, based on the following two facts.

First, all the I/O flows of the same application or task contribute
to the execution/completion of that common application or task. So the
performance figures that matter are total throughput of the flows and
task-wide I/O latency.  In particular, these flows do not need to be
protected from each other, in terms of individual bandwidth or
latency.

Second, the above fact holds regardless of the number of flows.

Putting these two facts together, this commit stably merges the
bfq_queues associated with these I/O flows, i.e., with the processes
that generate these I/O flows, regardless of how many processes are
involved.

To decide whether a set of bfq_queues is actually associated with the
I/O flows of a common application or task, and to merge these queues
stably, this commit operates as follows: given a bfq_queue, say Q2,
currently being created, and the last bfq_queue, say Q1, created
before Q2, Q2 is merged stably with Q1 if
- very little time has elapsed since when Q1 was created
- Q2 has the same ioprio as Q1
- Q2 belongs to the same group as Q1

Merging bfq_queues also reduces scheduling overhead. A fio test with
ten random readers on /dev/nullb shows a throughput boost of 40% on a
quad-core machine. Since BFQ's execution time amounts to ~50% of the
total per-request processing time, this throughput boost implies that
BFQ's overhead is reduced by more than 50%.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Link: https://lore.kernel.org/r/20210304174627.161-7-paolo.valente@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
parent 85686d0d
@@ -547,6 +547,8 @@ static void bfq_pd_init(struct blkg_policy_data *pd)
 	entity->orig_weight = entity->weight = entity->new_weight = d->weight;
 	entity->my_sched_data = &bfqg->sched_data;
+	entity->last_bfqq_created = NULL;
+
 	bfqg->my_entity = entity; /*
				   * the root_group's will be set to NULL
				   * in bfq_init_queue()
...
@@ -197,6 +197,9 @@ struct bfq_entity {
 	/* flag, set if the entity is counted in groups_with_pending_reqs */
 	bool in_groups_with_pending_reqs;
+
+	/* last child queue of entity created (for non-leaf entities) */
+	struct bfq_queue *last_bfqq_created;
 };

 struct bfq_group;
@@ -230,6 +233,8 @@ struct bfq_ttime {
 struct bfq_queue {
 	/* reference counter */
 	int ref;
+	/* counter of references from other queues for delayed stable merge */
+	int stable_ref;
 	/* parent bfq_data */
 	struct bfq_data *bfqd;
@@ -365,6 +370,8 @@ struct bfq_queue {
 	unsigned long first_IO_time; /* time of first I/O for this queue */

+	unsigned long creation_time; /* when this queue is created */
+
 	/* max service rate measured so far */
 	u32 max_service_rate;
@@ -454,6 +461,11 @@ struct bfq_io_cq {
 	u64 saved_last_serv_time_ns;
 	unsigned int saved_inject_limit;
 	unsigned long saved_decrease_time_jif;
+
+	/* candidate queue for a stable merge (due to close creation time) */
+	struct bfq_queue *stable_merge_bfqq;
+
+	bool stably_merged;	/* non splittable if true */
 };

 /**
/** /**
@@ -578,6 +590,9 @@ struct bfq_data {
 	/* bfqq owning the last completed rq */
 	struct bfq_queue *last_completed_rq_bfqq;
+
+	/* last bfqq created, among those in the root group */
+	struct bfq_queue *last_bfqq_created;
 	/* time of last transition from empty to non-empty (ns) */
 	u64 last_empty_occupied_ns;
...