Commits · fe1f9e6659ca6124f500a0f829202c7c902fab0c · Kirill Smelkov / linux

16 May, 2018 7 commits

nbd: fix how we set bd_invalidated · fe1f9e66

Josef Bacik authored May 16, 2018

bd_invalidated is kind of a pain wrt partitions as it really only
triggers the partition rescan if it is set after bd_ops->open() runs, so
setting it when we reset the device isn't useful.  We also sporadically
would still have partitions left over in some disconnect cases, so fix
this by always setting bd_invalidated on open if there's no
configuration or if we've had a disconnect action happen, that way the
partition table gets invalidated and rescanned properly.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nbd: clear_sock on netlink disconnect · 96d97e17

Josef Bacik authored May 16, 2018

This is what the ioctl based nbd disconnect does as well.  Without this
the device will just sit there and wait for the connection to go away
(or IO to occur) before the device gets torn down.  Instead clear
everything up on our end so the configuration goes away as quickly as
possible.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nbd: use bd_set_size when updating disk size · 9e2b1967

Josef Bacik authored May 16, 2018

When we stopped relying on the bdev everywhere I broke updating the
block device size on the fly, which ceph relies on.  We can't just do
set_capacity, we also have to do bd_set_size so things like parted will
notice the device size change.

Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nbd: update size when connected · c3f7c939

Josef Bacik authored May 16, 2018

I messed up changing the size of an NBD device while it was connected by
not actually updating the device or doing the uevent.  Fix this by
updating everything if we're connected and we change the size.

cc: stable@vger.kernel.org
Fixes: 639812a1 ("nbd: don't set the device size until we're connected")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nbd: fix nbd device deletion · 8364da47

Josef Bacik authored May 16, 2018

This fixes a use-after-free bug: we shouldn't dereference disk->queue
right after we do del_gendisk(disk).  Save the queue pointer first and
do the queue cleanup after the del_gendisk.

Fixes: c6a4759e ("nbd: add device refcounting")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
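
The ordering described above can be illustrated with a small userspace
sketch. Everything here is hypothetical: toy_disk, toy_queue and the
toy_* helpers are stand-ins invented for illustration, not the kernel's
gendisk or request_queue API, and the sketch assumes that the disk must
not be dereferenced once the teardown call returns.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a request queue and a disk that points at it. */
struct toy_queue { int cleaned_up; };
struct toy_disk  { struct toy_queue *queue; };

/* Toy teardown: assume the disk must not be dereferenced after this returns. */
static void toy_del_disk(struct toy_disk *disk)
{
	free(disk);
}

/* Toy queue cleanup: only records that it ran. */
static void toy_cleanup_queue(struct toy_queue *q)
{
	q->cleaned_up = 1;
}

/* Save the queue pointer first, tear down the disk, then clean up the queue. */
static void toy_dev_remove(struct toy_disk *disk)
{
	struct toy_queue *q = disk->queue;	/* read while disk is still valid */

	toy_del_disk(disk);
	toy_cleanup_queue(q);			/* uses the saved pointer, not disk->queue */
}
```

Reading disk->queue after toy_del_disk() in this model would touch freed
memory, which is the class of bug the commit message describes.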

block: fix MAINTAINERS email for nbd · 3de9beee

Josef Bacik authored May 16, 2018

I've been missing stuff because it's been going into my work email which
is a black hole.  Update to the email I actually use so I stop missing
patches and bug reports.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-mq: remove redundant insert case in blk_mq_make_request() · 8fa9f556

huhai authored May 16, 2018

We can use blk_mq_sched_insert_request() even if we don't have
an IO scheduler attached, since that case will end up being
exactly the same as what blk_mq_queue_io() is doing now.
Signed-off-by: huhai <huhai@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

15 May, 2018 1 commit

Remove jsflash driver · da3c6efe

Jens Axboe authored May 15, 2018

Nobody is using it anymore, and it's been abandoned. Since David
is fine with removing it, kill it.
Suggested-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

14 May, 2018 18 commits

block: Add sysfs entry for fua support · 6fcefbe5

Kent Overstreet authored May 08, 2018

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: Export bio check/set pages_dirty · 1900fcc4

Kent Overstreet authored May 08, 2018

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: Add warning for bi_next not NULL in bio_endio() · 0ba99ca4

Kent Overstreet authored May 08, 2018

Recently found a bug where a driver left bi_next not NULL and then
called bio_endio(), and then the submitter of the bio used
bio_copy_data() which was treating src and dst as lists of bios.

Fixed that bug by splitting out bio_list_copy_data(), but in case other
things are depending on bi_next in weird ways, add a warning to help
avoid more bugs like that in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: Add missing flush_dcache_page() call · 6e6e811d

Kent Overstreet authored May 08, 2018

Since a bio can point to userspace pages (e.g. direct IO), this is
generally necessary.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: Split out bio_list_copy_data() · 45db54d5

Kent Overstreet authored May 08, 2018

Found a bug (with ASAN) where we were passing a bio to bio_copy_data()
with bi_next not NULL, when it should have been - a driver had left
bi_next set to something after calling bio_endio().

Since the normal case is only copying single bios, split out
bio_list_copy_data() to avoid more bugs like this in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
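
To make the single-bio versus bio-list distinction concrete, here is a
simplified userspace sketch. The toy_bio type and both helpers are
hypothetical and only mirror the idea: the single-item copy never looks
at the next pointer, so a stale pointer left behind by a driver cannot
change what gets copied. Unlike the real helpers, this toy pairs list
entries one to one and does not handle segments of unequal size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a bio. */
struct toy_bio {
	char *data;
	size_t len;
	struct toy_bio *next;	/* plays the role of bi_next */
};

/* Copy one bio's payload; the next pointer is deliberately ignored. */
static size_t toy_copy_data(struct toy_bio *dst, const struct toy_bio *src)
{
	size_t n = dst->len < src->len ? dst->len : src->len;

	memcpy(dst->data, src->data, n);
	return n;
}

/* Copy across two chains, pairing entries one to one (a simplification). */
static size_t toy_list_copy_data(struct toy_bio *dst, const struct toy_bio *src)
{
	size_t total = 0;

	while (dst && src) {
		total += toy_copy_data(dst, src);
		dst = dst->next;
		src = src->next;
	}
	return total;
}
```

Callers that really do hold chains opt in by calling the list variant;
everyone else gets a copy that cannot wander off through a leftover
next pointer.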

block: Add bio_copy_data_iter(), zero_fill_bio_iter() · 38a72dac

Kent Overstreet authored May 08, 2018

Add versions that take bvec_iter args instead of using bio->bi_iter - to
be used by bcachefs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: Use bioset_init() for fs_bio_set · f4f8154a

Kent Overstreet authored May 08, 2018

Minor optimization - remove a pointer indirection when using fs_bio_set.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: Add bioset_init()/bioset_exit() · 917a38c7

Kent Overstreet authored May 08, 2018

Similarly to mempool_init()/mempool_exit(), take a pointer indirection
out of allocation/freeing by allowing biosets to be embedded in other
structs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: Convert bio_set to mempool_init() · 8aa6ba2f

Kent Overstreet authored May 08, 2018

Minor performance improvement by getting rid of pointer indirections
from allocation/freeing fastpaths.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

mempool: Add mempool_init()/mempool_exit() · c1a67fef

Kent Overstreet authored May 04, 2015

Allows mempools to be embedded in other structs, getting rid of a
pointer indirection from allocation fastpaths.

mempool_exit() is safe to call on an uninitialized but zeroed mempool.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
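
The embedding pattern can be sketched in userspace as follows. This is
not the kernel mempool API: toy_pool, toy_pool_init() and
toy_pool_exit() are hypothetical names, and the reserve is a plain
malloc-backed array. The sketch only shows the two properties the
message names: the pool lives inside its owner (no separate allocation
or pointer hop to reach it), and the exit routine is safe on a struct
that was zeroed but never initialized.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool that keeps a small reserve of preallocated elements. */
struct toy_pool {
	void **elements;
	int min_nr;
	int curr_nr;
	size_t elem_size;
};

/* Safe on a zeroed, never-initialized pool: the loop is skipped and free(NULL) is a no-op. */
static void toy_pool_exit(struct toy_pool *pool)
{
	while (pool->curr_nr > 0)
		free(pool->elements[--pool->curr_nr]);
	free(pool->elements);
	pool->elements = NULL;
}

/* Initialize a pool in place; min_nr is assumed to be greater than zero. */
static int toy_pool_init(struct toy_pool *pool, int min_nr, size_t elem_size)
{
	memset(pool, 0, sizeof(*pool));
	pool->elements = calloc((size_t)min_nr, sizeof(void *));
	if (!pool->elements)
		return -1;
	pool->min_nr = min_nr;
	pool->elem_size = elem_size;
	while (pool->curr_nr < min_nr) {
		void *e = malloc(elem_size);

		if (!e) {
			toy_pool_exit(pool);
			return -1;
		}
		pool->elements[pool->curr_nr++] = e;
	}
	return 0;
}

/* The owner embeds the pool by value instead of holding a pointer to it. */
struct toy_driver {
	int id;
	struct toy_pool pool;
};
```

Because toy_pool_exit() tolerates a zeroed pool, an error path in the
owner can call it unconditionally without tracking whether init ran.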

sbitmap: fix race in wait batch accounting · c854ab57

Jens Axboe authored May 14, 2018

If we have multiple callers of sbq_wake_up(), we can end up in a
situation where the wait_cnt will continually go more and more
negative. Consider the case where our wake batch is 1, hence
wait_cnt will start out as 1.

wait_cnt == 1

CPU0				CPU1
atomic_dec_return(), cnt == 0
				atomic_dec_return(), cnt == -1
				cmpxchg(-1, 0) (succeeds)
				[wait_cnt now 0]
cmpxchg(0, 1) (fails)

This ends up with wait_cnt being 0, so we'll wake up immediately
next time. Going through the same loop as above again, we'll
end up with wait_cnt at -1.

For the case where we have a larger wake batch, the only
difference is that the starting point will be higher. We'll
still end up with continually smaller batch wakeups, which
defeats the purpose of the rolling wakeups.

Always reset the wait_cnt to the batch value. Then it doesn't
matter who wins the race. But ensure that whoever does win
the race is the one that increments the ws index and wakes up
our batch count; the loser gets to call __sbq_wake_up() again to
account its wakeups towards the next active wait state index.

Fixes: 6c0ca7ae ("sbitmap: fix wakeup hang after sbq resize")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
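
A minimal userspace sketch of the scheme described above, using C11
atomics. The toy_sbq type and function names are hypothetical, there is
only one wait state, and wake_index and woken are plain ints, so the
sketch is meant to be stepped through single-threaded to follow the
interleaving rather than run under real concurrency.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-wait-state model of the batch accounting. */
struct toy_sbq {
	atomic_int wait_cnt;
	int wake_batch;
	int wake_index;	/* advanced only by the cmpxchg winner */
	int woken;	/* total wakeups issued, for inspection */
};

/* Returns true if the caller lost the race and should call again. */
static bool toy_wake_up_once(struct toy_sbq *sbq)
{
	/* Equivalent of atomic_dec_return(): the value after the decrement. */
	int cnt = atomic_fetch_sub(&sbq->wait_cnt, 1) - 1;
	int expected = cnt;

	if (cnt > 0)
		return false;

	/* Always reset to the full batch value, whatever cnt we observed. */
	if (atomic_compare_exchange_strong(&sbq->wait_cnt, &expected,
					   sbq->wake_batch)) {
		sbq->wake_index++;		/* winner moves to the next wait state */
		sbq->woken += sbq->wake_batch;	/* and wakes a full batch */
		return false;
	}
	return true;	/* loser retries against the next wait state */
}

static void toy_wake_up(struct toy_sbq *sbq)
{
	while (toy_wake_up_once(sbq))
		;
}
```

With a wake batch of 1, replaying the CPU0/CPU1 interleaving from the
message leaves wait_cnt at 1 regardless of which side wins the compare
and exchange, instead of drifting towards negative values.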

block: consistently use GFP_NOIO instead of __GFP_NORECLAIM · 0eb0b63c

Christoph Hellwig authored May 09, 2018

Same numerical value (for now at least), but a much better documentation
of intent.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: use GFP_NOIO instead of __GFP_DIRECT_RECLAIM · c3036021

Christoph Hellwig authored May 09, 2018

We just can't do I/O when doing block layer requests allocations,
so use GFP_NOIO instead of the even more limited __GFP_DIRECT_RECLAIM.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: pass an explicit gfp_t to get_request · 4accf5fc

Christoph Hellwig authored May 09, 2018

blk_old_get_request already has it at hand, and in blk_queue_bio, which
is the fast path, it is constant.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: sanitize blk_get_request calling conventions · ff005a06

Christoph Hellwig authored May 09, 2018

Switch everyone to blk_get_request_flags, and then rename
blk_get_request_flags to blk_get_request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: fix __get_request documentation · a9a14d36

Christoph Hellwig authored May 09, 2018

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

scsi/osd: remove the gfp argument to osd_start_request · ac613e45

Christoph Hellwig authored May 09, 2018

Always GFP_KERNEL, and keeping it would cause serious complications for
the next change.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

memstick: remove unused variables · 058147bc

Christoph Hellwig authored May 14, 2018

Fixes: 7c2d748e ("memstick: don't call blk_queue_bounce_limit")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

11 May, 2018 7 commits

ps3disk: handle highmem pages · e4f0e0cb

Christoph Hellwig authored May 09, 2018

The ps3disk driver already kmaps all pages when copying from/to the
internal bounce buffer, so it can accept highmem pages just fine.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

jsflash: handle highmem pages · 37a5b5c6

Christoph Hellwig authored May 09, 2018

Just kmap the bio single page payload before processing it.

(and yes, no highmem on sparc32 anyway, but k(un)map_atomic are nops,
so this gives the right example)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

aoe: handle highmem pages · ad180f6f

Christoph Hellwig authored May 09, 2018

Use kmap_atomic when copying out of a bio_vec.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

mtd_blkdevs: handle highmem pages · 34ab96e6

Christoph Hellwig authored May 09, 2018

Just kmap the single payload page before passing it on to the FTL.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

memstick: don't call blk_queue_bounce_limit · 7c2d748e

Christoph Hellwig authored May 09, 2018

All in-tree host drivers set up a proper dma mask and use the dma-mapping
helpers.  This means they will be able to deal with any address that we
are throwing at them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

DAC960: don't use block layer bounce buffers · 00f0a51f

Christoph Hellwig authored May 09, 2018

DAC960 just sets the block bounce limit to the dma mask, which means
that the iommu or swiotlb already take care of the bounce buffering,
and the block bouncing can be removed.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

mtip32xx: don't use block layer bounce buffers · 5c26e050

Christoph Hellwig authored May 09, 2018

mtip32xx just sets the block bounce limit to the dma mask, which means
that the iommu or swiotlb already take care of the bounce buffering,
and the block bouncing can be removed.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

10 May, 2018 7 commits

sbitmap: warn if using smaller shallow depth than was setup · 61445b56

Omar Sandoval authored May 09, 2018

Make sure the user passed the right value to
sbitmap_queue_min_shallow_depth().
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

kyber-iosched: update shallow depth when setting up hardware queue · 28820640

Jens Axboe authored May 09, 2018

We don't expect the async depth to be smaller than the wake batch
count for sbitmap, but just in case, inform sbitmap of what shallow
depth kyber may use.
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bfq-iosched: update shallow depth to smallest one used · 483b7bf2

Jens Axboe authored May 09, 2018

If our shallow depth is smaller than the wake batching of sbitmap,
we can introduce hangs. Ensure that sbitmap knows how low we'll go.
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

sbitmap: fix missed wakeups caused by sbitmap_queue_get_shallow() · a3275539

Omar Sandoval authored May 09, 2018

The sbitmap queue wake batch is calculated such that once allocations
start blocking, all of the bits which are already allocated must be
enough to fulfill the batch counters of all of the waitqueues. However,
the shallow allocation depth can break this invariant, since we block
before our full depth is being utilized. Add
sbitmap_queue_min_shallow_depth(), which saves the minimum shallow depth
the sbq will use, and update sbq_calc_wake_batch() to take it into
account.
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
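
One way to picture how a minimum shallow depth could feed into the wake
batch is the sketch below. It is an assumed model, not a copy of the
kernel function: the constants (8 wait queues, a maximum batch of 8),
the parameter names and the exact formula are assumptions made for
illustration. The idea is to count only the bits that shallow
allocations can actually occupy in each bitmap word, then divide those
among the wait queues.

```c
/* Assumed constants for the sketch; not taken from the commit message. */
#define TOY_WAIT_QUEUES	8U
#define TOY_WAKE_BATCH	8U

static unsigned int toy_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * depth:             total number of bits in the bitmap
 * shift:             log2 of the bits per bitmap word
 * min_shallow_depth: smallest per-word depth any caller will allocate with
 */
static unsigned int toy_calc_wake_batch(unsigned int depth, unsigned int shift,
					unsigned int min_shallow_depth)
{
	unsigned int bits_per_word = 1U << shift;
	unsigned int shallow = toy_min(bits_per_word, min_shallow_depth);
	/* Bits usable under the shallow limit: full words plus the partial tail. */
	unsigned int usable = (depth >> shift) * shallow +
			      toy_min(depth & (bits_per_word - 1), shallow);
	unsigned int batch = usable / TOY_WAIT_QUEUES;

	/* Clamp to the range [1, TOY_WAKE_BATCH]. */
	if (batch < 1U)
		batch = 1U;
	if (batch > TOY_WAKE_BATCH)
		batch = TOY_WAKE_BATCH;
	return batch;
}
```

Under these assumptions, a 256-bit map with 64-bit words gets the
maximum batch of 8 when no shallow limit applies, but only 2 when
callers may restrict themselves to 4 bits per word, so the batch
counters can still be satisfied once allocations start blocking.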

bfq-iosched: remove unused variable · bd7d4ef6

Jens Axboe authored May 09, 2018

bfqd->sb_shift was an attempt at caching the sbitmap queue shift,
but we don't need it, as the shift never changes. Kill it with fire.
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bfq: calculate shallow depths at init time · f0635b8a

Jens Axboe authored May 09, 2018

It doesn't change, so don't put it in the per-IO hot path.
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bfq-iosched: don't worry about reserved tags in limit_depth · 55141366

Jens Axboe authored May 09, 2018

Reserved tags are used for error handling, we don't need to
care about them for regular IO. The core won't call us for these
anyway.
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
