Commits · 0423fb7185e3c0178b3a09f24afc3777c2ef9522 · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Keep a sorted list of btree iterators · 0423fb71

Kent Overstreet authored Jun 12, 2021

This will be used to make other operations on btree iterators within a
transaction more efficient, and enable some other improvements to how we
manage btree iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0423fb71
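
As a rough illustration of what keeping iterators sorted within a transaction can look like, here is a minimal sketch. All names (`demo_iter`, `demo_trans`, `demo_trans_add_iter`) and the ordering key (btree ID, then position) are assumptions made for this example and are not taken from the bcachefs source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a btree iterator: only the ordering fields. */
struct demo_iter {
	unsigned		btree_id;	/* which btree the iterator points into */
	unsigned long long	pos;		/* position within that btree */
};

#define DEMO_MAX_ITERS 8

/* Hypothetical stand-in for a transaction owning a small set of iterators. */
struct demo_trans {
	struct demo_iter	iters[DEMO_MAX_ITERS];
	unsigned char		sorted[DEMO_MAX_ITERS];	/* indices into iters[], in order */
	unsigned		nr;
};

/* Assumed ordering: by btree first, then by position. */
static int demo_iter_cmp(const struct demo_iter *l, const struct demo_iter *r)
{
	if (l->btree_id != r->btree_id)
		return l->btree_id < r->btree_id ? -1 : 1;
	if (l->pos != r->pos)
		return l->pos < r->pos ? -1 : 1;
	return 0;
}

/*
 * Add an iterator and splice its index into sorted[] so that walking
 * sorted[] always visits iterators in (btree_id, pos) order.
 * Returns the new iterator's index, or -1 if the transaction is full.
 */
int demo_trans_add_iter(struct demo_trans *trans, unsigned btree_id,
			unsigned long long pos)
{
	unsigned idx, i;

	if (trans->nr == DEMO_MAX_ITERS)
		return -1;

	idx = trans->nr;
	trans->iters[idx].btree_id = btree_id;
	trans->iters[idx].pos = pos;

	/* Find the first sorted slot whose iterator compares greater. */
	for (i = 0; i < trans->nr; i++)
		if (demo_iter_cmp(&trans->iters[idx],
				  &trans->iters[trans->sorted[i]]) < 0)
			break;

	/* Shift the tail up by one slot and insert the new index. */
	memmove(&trans->sorted[i + 1], &trans->sorted[i],
		(trans->nr - i) * sizeof(trans->sorted[0]));
	trans->sorted[i] = (unsigned char) idx;
	trans->nr++;
	return (int) idx;
}
```

Storing indices rather than moving the iterators themselves keeps existing iterator references stable while the order changes.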

bcachefs: Zero out mem_ptr field in btree ptr keys from journal replay · 877da05f

Kent Overstreet authored Jul 30, 2021

This fixes a bad ptr deref on recovery from unclean shutdown in
bch2_btree_node_get_noiter().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

877da05f

bcachefs: Don't drop read locks at transaction commit time · 9cba7bf7

Kent Overstreet authored Jul 27, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

9cba7bf7

bcachefs: traverse_all() shouldn't be restarting the transaction · 0d32711e

Kent Overstreet authored Jul 27, 2021

We're only called by bch2_trans_begin() now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

0d32711e

bcachefs: Kill BTREE_INSERT_NOUNLOCK · 1a488e73

Kent Overstreet authored Jul 27, 2021

With the recent transaction restart changes, it's no longer needed - all
transaction commits have BTREE_INSERT_NOUNLOCK semantics.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

1a488e73

bcachefs: Btree splits no longer automatically cause a transaction restart · b253a90d

Kent Overstreet authored Jul 24, 2021

With the new and improved handling of transaction restarts, this should
finally be safe.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

b253a90d

bcachefs: __bch2_trans_commit() no longer calls bch2_trans_reset() · 955af634

Kent Overstreet authored Jul 24, 2021

It's now the caller's responsibility to call bch2_trans_begin.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

955af634

bcachefs: Ensure btree_iter_traverse() obeys iter->should_be_locked · e829b717

Kent Overstreet authored Jul 22, 2021

iter->should_be_locked means that if bch2_btree_iter_relock() fails, we
need to restart the transaction.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

e829b717

bcachefs: bch2_btree_iter_traverse() shouldn't normally call traverse_all() · b4e09b35

Kent Overstreet authored Jul 27, 2021

If there's more than one iterator in the btree_trans, it's required to
call bch2_trans_begin() to handle transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

b4e09b35

bcachefs: trans->restarted · e5af273f

Kent Overstreet authored Jul 25, 2021

Start tracking when btree transactions have been restarted - and assert
that we're always calling bch2_trans_begin() immediately after
transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

e5af273f

bcachefs: Change lockrestart_do() to always call bch2_trans_begin() · 3cc5288a

Kent Overstreet authored Jul 28, 2021

More consistent behaviour means less likely to trip over ourselves in
silly ways.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

3cc5288a
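
The pattern described in this entry, beginning the transaction before every attempt and retrying on restart, can be sketched roughly as follows. This is an illustration under stated assumptions, not the kernel macro: `retry_trans`, `retry_trans_begin()`, `retry_lockrestart_do()` and `retry_demo_op()` are hypothetical stand-ins, and `-EINTR` is assumed here as the "transaction restarted" return code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a btree transaction. */
struct retry_trans {
	bool		restarted;	/* set when an operation asks for a restart */
	unsigned	begin_calls;	/* how often the transaction was (re)begun */
	unsigned	restarts_left;	/* demo knob: restarts to simulate */
};

/* Stand-in for beginning a transaction: clears the restarted state. */
void retry_trans_begin(struct retry_trans *trans)
{
	trans->restarted = false;
	trans->begin_calls++;
}

/* Demo operation: must never run on a transaction that was not re-begun. */
int retry_demo_op(struct retry_trans *trans)
{
	assert(!trans->restarted);

	if (trans->restarts_left) {
		trans->restarts_left--;
		trans->restarted = true;
		return -EINTR;	/* assumed restart code for this sketch */
	}
	return 0;
}

/* Begin before every attempt, including the first; retry on restart. */
int retry_lockrestart_do(struct retry_trans *trans,
			 int (*fn)(struct retry_trans *))
{
	int ret;

	do {
		retry_trans_begin(trans);
		ret = fn(trans);
	} while (ret == -EINTR);

	return ret;
}
```

Because the helper always begins the transaction itself, callers do not need to remember whether a previous attempt left it in a restarted state.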

bcachefs: Clean up interior update paths · a88171c9

Kent Overstreet authored Jul 24, 2021

Btree node merging now happens prior to transaction commit, not after,
so we don't need to pay attention to BTREE_INSERT_NOUNLOCK.

Also, foreground_maybe_merge shouldn't be calling
bch2_btree_iter_traverse_all() - this is becoming private to the btree
iterator code and should only be called by bch2_trans_begin().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

a88171c9

bcachefs: Use bch2_trans_begin() more consistently · 700c25b3

Kent Overstreet authored Jul 24, 2021

Upcoming patch will require that a transaction restart is always
immediately followed by bch2_trans_begin().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

700c25b3

bcachefs: Always check for transaction restarts · 8b3e9bd6

Kent Overstreet authored Jul 24, 2021

On transaction restart iterators won't be locked anymore - make sure
we're always checking for errors.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

8b3e9bd6

bcachefs: traverse_all() is responsible for clearing should_be_locked · 67b07638

Kent Overstreet authored Jul 24, 2021

bch2_btree_iter_traverse_all() may loop, and it needs to clear
iter->should_be_locked on every iteration.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

67b07638

bcachefs: bch2_trans_relock() only relocks iters that should be locked · fe523397

Kent Overstreet authored Jul 27, 2021

This avoids unexpected lock restarts in bch2_btree_iter_traverse_all().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

fe523397

bcachefs: Don't traverse iterators in __bch2_trans_commit() · 6918bb55

Kent Overstreet authored Jul 25, 2021

They should already be traversed, and we're asserting that since the
introduction of iter->should_be_locked.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

6918bb55

bcachefs: Add an option for btree node mem ptr optimization · a32b9573

Kent Overstreet authored Jul 26, 2021

bch2_btree_node_ptr_v2 has a field for stashing a pointer to the in
memory btree node; this is safe because we clear this field when reading
in nodes from disk and we never free in memory btree nodes - but, we
have bug reports that indicate something might be faulty with this
optimization, so let's add an option for it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

a32b9573

bcachefs: Minor tracepoint improvements · 2b4e4b8c

Kent Overstreet authored Jul 24, 2021

Btree iterator tracepoints should print whether they're for the key
cache.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

2b4e4b8c

bcachefs: bch2_btree_iter_relock_intent() · 6e075b54

Kent Overstreet authored Jul 24, 2021

This adds a new helper for btree_cache.c that does what we want where
the iterator is still being traversed - and also eliminates some
unnecessary transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

6e075b54

bcachefs: Use bch2_trans_do() in bch2_btree_key_cache_journal_flush() · a6eba44b

Kent Overstreet authored Jul 23, 2021

We're working to standardize handling of transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

a6eba44b

bcachefs: Fix a btree iterator leak · ed5580b4

Kent Overstreet authored Jul 24, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

ed5580b4

bcachefs: Pretty-ify bch2_bkey_val_to_text() · d7b21954

Kent Overstreet authored Jul 21, 2021

Don't print out the ": " when there isn't a value to print.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

d7b21954

bcachefs: Don't squash return code in check_dirents() · 38200544

Kent Overstreet authored Jul 21, 2021

We were squashing BCH_FSCK_ERRORS_NOT_FIXED.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

38200544

bcachefs: Use bch2_inode_find_by_inum() in truncate · b97bbd4e

Kent Overstreet authored Jul 20, 2021

This is needed for snapshots because we need to start handling lock
restarts even when just calling bch2_inode_peek().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

b97bbd4e

bcachefs: Handle lock restarts in bch2_xattr_get() · 4909fe50

Kent Overstreet authored Jul 20, 2021

Snapshots add another btree lookup, thus we need to handle lock
restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

4909fe50

bcachefs: Don't downgrade in traverse() · 5f87f3c1

Kent Overstreet authored Jul 20, 2021

Downgrading of btree iterators is something that should only happen
explicitly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

5f87f3c1

bcachefs: BSET_OFFSET() · e719fc34

Kent Overstreet authored Jul 16, 2021

Add a field to struct bset for the sector offset within the btree node
where it was written.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

e719fc34

Revert "bcachefs: statfs bfree and bavail should be the same" · 47924527

Kent Overstreet authored Sep 10, 2023

This reverts commit 664f9847bec525d396d62d2db094ca9020289ae0.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

47924527

bcachefs: Update btree ptrs after every write · 9f1833ca

Kent Overstreet authored Jul 10, 2021

This closes a significant hole (and last known hole) in our ability to
verify metadata. Previously, since btree nodes are log structured, we
couldn't detect lost btree writes that weren't the first write to a
given node. Additionally, this seems to have led to some significant
metadata corruption on multi device filesystems with metadata
replication: since a write may have made it to one device and not
another, if we read that btree node back from the replica that did have
that write and started appending after that point, the other replica
would have a gap in the bset entries and reading from that replica
wouldn't find the rest of the bsets.

But, since updates to interior btree nodes are now journalled, we can
close this hole by updating pointers to btree nodes after every write
with the currently written number of sectors, without negatively
affecting performance. This means we will always detect lost or corrupt
metadata - it also means that our btree is now a curious hybrid of COW
and non COW btrees, with all the benefits of both (excluding
complexity).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

9f1833ca
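
The detection idea in this entry can be sketched as a comparison between the sector count recorded in the pointer and what a given replica actually yields on read. This is a simplified illustration: `demo_btree_ptr`, `demo_check_replica()` and the status names are hypothetical, and the sketch deliberately says nothing about how the real code treats sectors found beyond the recorded count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pointer to a btree node, updated after every node write. */
struct demo_btree_ptr {
	uint16_t sectors_written;	/* sectors the node is known to contain */
};

enum demo_replica_status {
	DEMO_REPLICA_OK,
	DEMO_REPLICA_LOST_WRITE,
};

/*
 * Compare what one replica yielded against the pointer. If fewer valid
 * sectors were found than the pointer records, a later write to the
 * node never reached this replica.
 */
enum demo_replica_status
demo_check_replica(const struct demo_btree_ptr *ptr, uint16_t sectors_found)
{
	if (sectors_found < ptr->sectors_written)
		return DEMO_REPLICA_LOST_WRITE;

	return DEMO_REPLICA_OK;
}
```

Without the recorded count, a replica missing only its most recent writes would look like a shorter but otherwise valid node.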

bcachefs: Improve btree_bad_header() error message · f8f86c6a

Kent Overstreet authored Jul 15, 2021

We should always print out the full btree node ptr.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

f8f86c6a

bcachefs: Fixes for unit tests · eb7f44db

Kent Overstreet authored Jul 14, 2021

The unit tests hadn't been updated for various recent btree changes -
this patch makes them work again.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

eb7f44db

bcachefs: Fix bch2_btree_iter_rewind() · 71f892a4

Kent Overstreet authored Jul 14, 2021

We'd hit a BUG() when rewinding at the start of the btree on btrees with
snapshots.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

71f892a4

bcachefs: Improvements to fsck check_dirents() · 914f2786

Kent Overstreet authored Jul 14, 2021

The fsck code handles transaction restarts in a very ad hoc way, and not
always correctly. This patch makes some improvements to check_dirents(),
but more work needs to be done to figure out how this kind of code
should be structured.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

914f2786

bcachefs: Tighten up btree_iter locking assertions · 5aab6635

Kent Overstreet authored Jul 14, 2021

We weren't correctly verifying that we had interior node intent locks -
this patch also fixes bugs uncovered by the new assertions.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

5aab6635

bcachefs: Fix a memory leak in the dio write path · 5468f119

Kent Overstreet authored Jul 14, 2021

There were some error paths where we were leaking page refs - oops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

5468f119

bcachefs: Add an option for whether inodes use the key cache · 996fb577

Kent Overstreet authored Jun 13, 2021

We probably don't ever want to flip this off in production, but it may
be useful for certain kinds of testing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

996fb577

bcachefs: Fix an allocator shutdown deadlock · 9f6e1f7b

Kent Overstreet authored Jul 13, 2021

On fstest generic/388, we were seeing sporadic deadlocks in the
emergency shutdown, where we'd get stuck shutting down the allocator
because bch2_btree_update_start() -> bch2_btree_reserve_get() allocated
and then deallocated some btree nodes, putting them back on the
btree_reserve_cache, after the allocator shutdown code had already
cleared out that cache.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

9f6e1f7b

bcachefs: Add safe versions of varint encode/decode · 8d344587

Kent Overstreet authored Jul 13, 2021

This adds safe versions of bch2_varint_(encode|decode) that don't read
or write past the end of the buffer, or varint being encoded.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

8d344587
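
To illustrate what "safe" means for a varint codec, here is a minimal bounds-checked sketch. It uses the common LEB128-style layout (7 data bits per byte, high bit as continuation) purely as an example; this is an assumption for illustration and is not the encoding bcachefs uses, and the `demo_varint_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode v into [out, end). Returns the number of bytes written,
 * or -1 if the buffer is too small. Never writes at or past end.
 */
int demo_varint_encode_safe(uint8_t *out, const uint8_t *end, uint64_t v)
{
	uint8_t *p = out;

	do {
		uint8_t byte;

		if (p == end)
			return -1;

		byte = v & 0x7f;
		v >>= 7;
		if (v)
			byte |= 0x80;	/* more bytes follow */
		*p++ = byte;
	} while (v);

	return (int) (p - out);
}

/*
 * Decode one value from [in, end). Returns the number of bytes consumed,
 * or -1 if the input is truncated or longer than a 64-bit value allows.
 * Never reads at or past end.
 */
int demo_varint_decode_safe(const uint8_t *in, const uint8_t *end,
			    uint64_t *out)
{
	const uint8_t *p = in;
	uint64_t v = 0;
	unsigned shift = 0;

	while (p != end && shift < 64) {
		uint8_t byte = *p++;

		v |= (uint64_t) (byte & 0x7f) << shift;
		if (!(byte & 0x80)) {
			*out = v;
			return (int) (p - in);
		}
		shift += 7;
	}

	return -1;
}
```

Both functions take an explicit `end` pointer, so a caller working on a partially filled or truncated buffer gets an error instead of an out-of-bounds access.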

bcachefs: Add open_buckets to sysfs · 2e655e6d

Kent Overstreet authored Jul 12, 2021

This is to help debug a rare shutdown deadlock in the allocator code -
the btree code is leaking open_buckets.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

2e655e6d