Commits · f270667a7fc020f1711953ad3b0d6e6b38eba834 · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Slightly reduce btree split threshold · f270667a

Kent Overstreet authored Apr 11, 2020

2/3rds performs a lot better than 3/4ths on the tested workload, leading
to significantly fewer btree node compactions.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
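
To make the two fractions concrete, here is a minimal sketch of the arithmetic. It assumes the threshold is a fraction of a node's capacity above which the node is split rather than compacted; the helper name and the capacity value are hypothetical and not taken from the bcachefs source.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a bcachefs function: the fill level, in u64s,
 * above which a node is split rather than compacted, as num/den of capacity.
 */
static size_t split_threshold(size_t capacity_u64s, size_t num, size_t den)
{
	assert(den != 0 && num <= den);
	return capacity_u64s * num / den;
}
```

For an assumed capacity of 1200 u64s, `split_threshold(1200, 3, 4)` gives 900 and `split_threshold(1200, 2, 3)` gives 800, so under this model nodes are split somewhat earlier.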

bcachefs: Improve lockdep annotation in journalling code · 15a07f2e

Kent Overstreet authored Apr 11, 2020

bch2_journal_res_get() in nonblocking mode is equivalent to a trylock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
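
The trylock analogy can be illustrated with a small userspace sketch. `struct journal_sketch` and `res_get_nonblock()` are hypothetical stand-ins, not the real `bch2_journal_res_get()` interface; the point is only that a nonblocking caller fails immediately instead of waiting, which is the property a trylock has.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of available journal space, in u64s. */
struct journal_sketch {
	unsigned free_u64s;
};

/*
 * Nonblocking reservation: either succeeds right away or returns -EAGAIN.
 * It never sleeps, so the caller cannot end up waiting on this resource.
 */
static int res_get_nonblock(struct journal_sketch *j, unsigned u64s)
{
	if (j->free_u64s < u64s)
		return -EAGAIN;	/* would have blocked: fail like a trylock */

	j->free_u64s -= u64s;
	return 0;
}
```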

bcachefs: Fix a locking bug in bch2_journal_pin_copy() · 94035eed

Kent Overstreet authored Apr 11, 2020

There was a race where the src pin would be flushed - releasing the last
pin on that sequence number - before adding the new journal pin. Oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix another deadlock in the btree interior update path · 58fb3e51

Kent Overstreet authored Apr 07, 2020

Can't take read locks on btree nodes while holding
btree_interior_update_lock. Also, fix a bug where we were leaking
journal prereservations.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a locking bug in bch2_btree_ptr_debugcheck() · 1eba942d

Kent Overstreet authored Apr 07, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Account for ioclock slop when throttling rebalance thread · e77e4efc

Kent Overstreet authored Apr 07, 2020

This should fix an issue where the rebalance thread was spinning.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a deadlock on starting an interior btree update · 0f9dda47

Kent Overstreet authored Apr 05, 2020

Not legal to block on a journal prereservation with btree locks held.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a debug mode assertion · 1e3b1f9a

Kent Overstreet authored Apr 04, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a debug assertion · 2aec5955

Kent Overstreet authored Apr 04, 2020

This assertion was passing the wrong btree node type when inserting into
interior nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix another error path locking bug · 8707ab0d

Kent Overstreet authored Apr 04, 2020

btree_update_nodes_written() was leaking a btree node lock on failure to
get a journal reservation.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
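
The usual shape of this kind of fix is a single unlock label that every exit path goes through. The sketch below is a userspace illustration under that assumption; all names in it are hypothetical and it is not the actual `btree_update_nodes_written()` code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock that just counts how many times it is held. */
struct node_lock_sketch {
	int held;
};

static void lock_node(struct node_lock_sketch *l)
{
	l->held++;
}

static void unlock_node(struct node_lock_sketch *l)
{
	assert(l->held > 0);
	l->held--;
}

/* Hypothetical stand-in for a journal reservation that may fail. */
static int journal_res_get_sketch(bool fail)
{
	return fail ? -EROFS : 0;
}

static int update_nodes_written_sketch(struct node_lock_sketch *l, bool fail)
{
	int ret;

	lock_node(l);

	ret = journal_res_get_sketch(fail);
	if (ret)
		goto out_unlock;	/* the error path must also drop the lock */

	/* ... apply the update while the node is locked ... */

out_unlock:
	unlock_node(l);
	return ret;
}
```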

bcachefs: Fix a null ptr deref during journal replay · 75923ba7

Kent Overstreet authored Apr 04, 2020

We were calling bch2_extent_can_insert() incorrectly; it should only be
called when the extents-to-keys pass is running because that's when we
could be splitting a compressed extent. Calling bch2_extent_can_insert()
without passing in a disk reservation was causing a null ptr deref.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add another missing bch2_trans_iter_put() call · 47c46c95

Kent Overstreet authored Apr 01, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Trace where btree iterators are allocated · 0329b150

Kent Overstreet authored Apr 01, 2020

This will help with iterator overflow bugs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix fallocate FL_INSERT_RANGE · 283eda57

Kent Overstreet authored Apr 01, 2020

This was another bug because of bch2_btree_iter_set_pos() invalidating
iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add print method for bch2_btree_ptr_v2 · 59a38a38

Kent Overstreet authored Mar 31, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix journalling of interior node updates · 501e1bda

Kent Overstreet authored Mar 31, 2020

We weren't journalling updates done while splitting/compacting nodes -
oops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix iterating of journal keys within a btree node · b58a181d

Kent Overstreet authored Mar 30, 2020

Extent btrees no longer have weird special behaviour for min_key.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a locking bug · 11f6ed36

Kent Overstreet authored Mar 30, 2020

Dropping the wrong kind of lock can't lead to anything good...
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix inodes pass in fsck · 1d60b999

Kent Overstreet authored Mar 30, 2020

It wasn't updated for the patch that switched inodes to using the offset
field of struct bkey.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix ec_stripe_update_ptrs() · e5e6aaa7

Kent Overstreet authored Mar 30, 2020

bch2_btree_iter_set_pos() invalidates the key returned by peek().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
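
The hazard can be modelled in a few lines of userspace C. This is only a sketch: the iterator below is hypothetical and much simpler than a real btree iterator, but it shows why a key obtained from a peek has to be copied out before the iterator is repositioned.

```c
#include <assert.h>
#include <stdint.h>

struct key_sketch {
	uint64_t offset;
	unsigned size;
};

/* Hypothetical iterator: peek returns a pointer into the iterator itself. */
struct iter_sketch {
	uint64_t pos;
	struct key_sketch cur;
};

static const struct key_sketch *iter_peek(struct iter_sketch *it)
{
	it->cur.offset = it->pos;
	it->cur.size = 8;
	return &it->cur;
}

/* Repositioning clobbers whatever the last peek returned. */
static void iter_set_pos(struct iter_sketch *it, uint64_t pos)
{
	it->pos = pos;
	it->cur = (struct key_sketch) { 0 };
}

/* Safe pattern: copy the key by value first, then move the iterator. */
static struct key_sketch peek_then_advance(struct iter_sketch *it, uint64_t next)
{
	struct key_sketch copy = *iter_peek(it);

	iter_set_pos(it, next);
	return copy;
}
```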

bcachefs: Check btree topology at startup · d06c1a0c

Kent Overstreet authored Mar 29, 2020

When initial btree gc was changed to overlay journal keys as it walks
the btree, it also stopped checking btree topology.

Previously, checking btree topology was a fairly complicated affair -
but it's much easier now that btree_ptr_v2 has min_key in the pointer.

This rewrites the old range_checks code and uses it in both runtime and
initial gc.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
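
With min_key stored alongside each child pointer, a range check reduces to comparing neighbouring children. The sketch below assumes plain 64-bit keys instead of real btree positions, and `struct child_range` and `topology_ok()` are hypothetical names; it is an illustration of the idea, not the rewritten range_checks code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one child pointer: the key range it covers. */
struct child_range {
	uint64_t min_key;
	uint64_t max_key;
};

/*
 * The children must tile the parent's range exactly: the first starts at
 * parent_min, each later one starts right after its predecessor ends, and
 * the last ends at parent_max.
 */
static bool topology_ok(const struct child_range *c, size_t nr,
			uint64_t parent_min, uint64_t parent_max)
{
	uint64_t expected_min = parent_min;
	size_t i;

	if (!nr)
		return false;

	for (i = 0; i < nr; i++) {
		if (c[i].min_key != expected_min)
			return false;	/* gap or overlap with the previous child */
		if (c[i].min_key > c[i].max_key)
			return false;	/* empty or inverted range */

		if (i + 1 < nr) {
			if (c[i].max_key >= parent_max)
				return false;	/* non-final child reaches the end */
			expected_min = c[i].max_key + 1;	/* cannot overflow here */
		}
	}

	return c[nr - 1].max_key == parent_max;
}
```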

bcachefs: Don't allocate memory while holding journal reservation · a0e491c0

Kent Overstreet authored Mar 30, 2020

This fixes a lockdep splat - allocating memory can call
bch2_clear_page_bits() which takes mark_lock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Reduce max nr of btree iters when lockdep is on · 2c31e657

Kent Overstreet authored Mar 29, 2020

This is so we don't overflow MAX_LOCK_DEPTH.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
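
A compile-time cap of this kind might look like the sketch below. The macro names and all three numeric values are assumptions made for illustration (48 is used as a stand-in for the kernel's lock depth limit); they are not quoted from the patch.

```c
#include <assert.h>

/* Assumed stand-in for the kernel's limit on locks held by one task. */
#define MAX_LOCK_DEPTH_SKETCH	48

/* Assumed iterator limits: a smaller one when lockdep is compiled in. */
#ifdef CONFIG_LOCKDEP
#define BTREE_ITER_MAX_SKETCH	32
_Static_assert(BTREE_ITER_MAX_SKETCH <= MAX_LOCK_DEPTH_SKETCH,
	       "iterator limit must fit within the lock depth limit");
#else
#define BTREE_ITER_MAX_SKETCH	64
#endif

static unsigned btree_iter_max_sketch(void)
{
	return BTREE_ITER_MAX_SKETCH;
}
```

Built without `CONFIG_LOCKDEP` defined, `btree_iter_max_sketch()` returns the larger limit.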

bcachefs: Kill bkey_type_successor · 39fb2983

Kent Overstreet authored Jan 07, 2020

Previously, BTREE_ID_INODES was special - inodes were indexed by the
inode field, which meant the offset field of struct bpos wasn't used,
which led to special cases in e.g. the btree iterator code.

Now, inodes in the inodes btree are indexed by the offset field.

Also: previously min_key was special for extents btrees: min_key for
extents would equal max_key for the previous node. Now, min_key =
bkey_successor() of the previous node, same as non extent btrees.

This means we can completely get rid of
btree_type_successor/predecessor.

Also make some improvements to the metadata IO validate/compat code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
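
The successor rule can be sketched as an increment with carry. The struct below is a simplified, hypothetical position with only inode and offset fields, and `pos_successor()` is an illustrative name rather than the real `bkey_successor()`.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical position: ordered by inode, then by offset. */
struct pos_sketch {
	uint64_t inode;
	uint64_t offset;
};

/* Next position in key order: bump offset, carrying into inode on wrap. */
static struct pos_sketch pos_successor(struct pos_sketch p)
{
	struct pos_sketch ret = p;

	if (!++ret.offset) {
		/* offset wrapped to 0; the maximum position has no successor */
		assert(ret.inode != UINT64_MAX);
		ret.inode++;
	}

	return ret;
}
```

Under this model, a node whose max_key is `{1, 5}` is followed by a node whose min_key is `{1, 6}`, for extents and non-extents btrees alike.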

bcachefs: Switch a BUG_ON() to a warning · b72633ae

Kent Overstreet authored Mar 29, 2020

This has popped and thus needs to be debugged, but the assertion firing
isn't necessarily fatal so switch it to a warning.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Use kvpmalloc mempools for compression bounce · 22f77698

Kent Overstreet authored Mar 29, 2020

This fixes an issue where mounting would fail because of memory
fragmentation - previously the compression bounce buffers were using
get_free_pages().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Read journal when keep_journal on · 5a655f06

Kent Overstreet authored Mar 28, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Various fixes for interior update path · 56a40fbc

Kent Overstreet authored Mar 28, 2020

The locking was wrong, and we could get a use after free in the error
path where we weren't taking the entry being freed off the unwritten
list.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Use memalloc_nofs_save() · 4e4758c6

Kent Overstreet authored Mar 27, 2020

vmalloc allocations don't always obey GFP_NOFS - memalloc_nofs_save() is
the preferred approach for the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve error message in fsck · f7005e01

Kent Overstreet authored Mar 25, 2020

Seeing the extents that were overlapping is highly useful for figuring
out what went wrong.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add an option for keeping journal entries after startup · f1d786a0

Kent Overstreet authored Mar 25, 2020

This will be used by the userspace debug tools.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix an assertion when nothing to replay · 2f194e16

Kent Overstreet authored Mar 25, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Journal updates to interior nodes · 6357d607

Kent Overstreet authored Feb 08, 2020

Previously, the btree has always been self contained and internally
consistent on disk without anything from the journal - the journal just
contained pointers to the btree roots.

However, this meant that btree node split or compact operations - i.e.
anything that changes btree node topology and involves updates to
interior nodes - would require that interior btree node to be written
immediately, which means emitting a btree node write that's mostly empty
(using 4k of space on disk if the filesystem blocksize is 4k to only
write perhaps ~100 bytes of new keys).

More importantly, this meant most btree node writes had to be FUA, and
consumer drives have a history of slow and/or buggy FUA support - other
filesystems have been bitten by this.

This patch changes the interior btree update path to journal updates to
interior nodes, after the writes for the new btree nodes have completed.
Best of all, it turns out to simplify the interior node update path
somewhat.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
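
The space overhead mentioned above is simple rounding arithmetic, sketched here with hypothetical helper names. It only models rounding a small payload up to whole blocks and says nothing about the actual on-disk format.

```c
#include <assert.h>

/* Number of whole blocks needed to hold payload_bytes. */
static unsigned blocks_needed(unsigned payload_bytes, unsigned block_size)
{
	assert(block_size != 0);
	return (payload_bytes + block_size - 1) / block_size;
}

/* Bytes written to disk that carry no new keys. */
static unsigned wasted_bytes(unsigned payload_bytes, unsigned block_size)
{
	return blocks_needed(payload_bytes, block_size) * block_size -
		payload_bytes;
}
```

With a 4096-byte block and about 100 bytes of new keys, `wasted_bytes(100, 4096)` is 3996, so roughly 2.4% of the write is useful data.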

bcachefs: Replay interior node keys · f44a6a71

Kent Overstreet authored Mar 15, 2020

This slightly modifies the journal replay code so that it can replay
updates to interior nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: trans_commit() path can now insert to interior nodes · e62d65f2

Kent Overstreet authored Mar 15, 2020

This will be needed for the upcoming patches to journal updates to
interior btree nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Disable extent merging · 47143a75

Kent Overstreet authored Mar 24, 2020

Extent merging is currently broken, and will be reimplemented
differently soon - right now it only happens when btree nodes are being
compacted, which makes it difficult to test.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a locking bug in fsck · 0728eed7

Kent Overstreet authored Mar 21, 2020

This works around a btree locking issue - we can't be holding read locks
while taking write locks, which currently means we can't have live
iterators holding read locks at commit time.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix count_iters_for_insert() · fa4dc398

Kent Overstreet authored Mar 21, 2020

This fixes a transaction iterator overflow.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix an iterator bug · 8666a9ad

Kent Overstreet authored Mar 18, 2020

We were incorrectly not restarting the transaction when re-traversing
iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Shut down quicker · 6d61724b

Kent Overstreet authored Mar 18, 2020

Internal writes (i.e. copygc/rebalance operations) shouldn't be blocking
on the allocator when we're going RO.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
