Commits · c929f2306e61500bf68a39cb2a16006bfe844d52 · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Stale ptr cleanup is now done by gc_gens · c929f230

Kent Overstreet authored Feb 13, 2022

Before we had dedicated gc code for bucket->oldest_gen this was
btree_gc's responsibility, but now that we have that we can rip it out,
simplifying the already overcomplicated btree_gc.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

c929f230

bcachefs: Improve journal_entry_btree_keys_to_text() · e7bc7cdf

Kent Overstreet authored Feb 16, 2022

This improves the formatting of journal_entry_btree_keys_to_text() by
putting each key on its own line.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

e7bc7cdf

bcachefs: Fix __btree_path_traverse_all · 33aa419d

Kent Overstreet authored Feb 16, 2022

The loop that traverses paths in traverse_all() needs to be a little bit
tricky, because traversing a path can cause other paths to be added (or
perhaps removed) at about the same position.

The old logic was buggy; replace it with simpler logic.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

33aa419d

bcachefs: Fix slow tracepoints · 4b59a319

Kent Overstreet authored Feb 16, 2022

Some of our tracepoints were calling snprintf("%pS") - which does symbol
table lookups - in TP_fast_assign(), which turns out to be a really bad
idea.

This was done because perf trace wasn't correctly printing tracepoints
that use %pS anymore - but it turns out trace-cmd does handle it
correctly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

4b59a319

bcachefs: Check for stale dirty pointer before reads · eb331fe5

Kent Overstreet authored Feb 15, 2022

Since we retry reads when we discover we read from a pointer that went
stale, if a dirty pointer is erroneously stale it would cause us to loop
retrying that read forever - unless we check before issuing the read,
while the btree is still locked, when we know that a dirty pointer
should never be stale.

This patch adds that check, along with printing some helpful debug info.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
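
The check described in this commit might be sketched roughly as below. This is a minimal illustration, not the bcachefs code: the types, the `read_ptr_check()` function, and the result enum are hypothetical, and it assumes 8-bit bucket generation numbers that are compared with wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified types for illustration only. */
struct bucket_sketch {
	uint8_t gen;		/* current generation of the bucket */
};

struct ptr_sketch {
	uint8_t gen;		/* bucket generation when the pointer was created */
	bool	cached;		/* cached pointers may legitimately go stale */
};

/* Assumption: a pointer is stale once the bucket's generation has moved past it. */
static bool ptr_is_stale(const struct bucket_sketch *b, const struct ptr_sketch *p)
{
	return (int8_t)(uint8_t)(b->gen - p->gen) > 0;
}

enum read_check {
	READ_OK,
	READ_SKIP_STALE_CACHED,	/* expected: pick another pointer or retry */
	READ_ERR_STALE_DIRTY,	/* should never happen: report, do not retry forever */
};

/* Meant to run before issuing the read, while the btree is still locked. */
static enum read_check read_ptr_check(const struct bucket_sketch *b,
				      const struct ptr_sketch *p)
{
	if (!ptr_is_stale(b, p))
		return READ_OK;

	return p->cached ? READ_SKIP_STALE_CACHED : READ_ERR_STALE_DIRTY;
}
```

Returning a distinct error for the stale dirty case is what breaks the retry loop: the caller can log debug info and fail the read instead of retrying it.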

eb331fe5

bcachefs: Kill verify_not_stale() · fcf01959

Kent Overstreet authored Feb 14, 2022

This is ancient code that's more effectively checked in other places
now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

fcf01959

bcachefs: Fix __bch2_btree_node_lock · 7abda8c1

Kent Overstreet authored Feb 15, 2022

__bch2_btree_node_lock() was implementing the wrong lock ordering for
cached vs. non-cached paths - this fixes it to match the btree path sort
order as defined by __btree_path_cmp(), and also simplifies the code
some.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

7abda8c1

bcachefs: Also show when blocked on write locks · c7ce2732

Kent Overstreet authored Feb 15, 2022

This consolidates some of the btree node lock path, so that when we're
blocked taking a write lock on a node it shows up in
bch2_btree_trans_to_text(), along with intent and read locks.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

c7ce2732

bcachefs: Delete redundant tracepoint · 8be1aff0

Kent Overstreet authored Feb 15, 2022

We were emitting two trace events on transaction restart in this code
path - delete the redundant one.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

8be1aff0

bcachefs: Fix locking in data move path · 52eef42c

Kent Overstreet authored Feb 15, 2022

We need to ensure we don't have any btree locks held when calling
do_pending_writes() - besides issuing IOs, upcoming allocator changes
will have allocations doing btree lookups directly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

52eef42c

bcachefs: Kill bch2_bkey_debugcheck · 2ce8fbd9

Kent Overstreet authored Feb 13, 2022

The old .debugcheck methods are no more and this just calls the .invalid
method, which doesn't add much since we already check that when doing
btree updates and when reading metadata in.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

2ce8fbd9

bcachefs: Print a better message for mark and sweep pass · 0f78264a

Kent Overstreet authored Feb 13, 2022

Btree gc, aka mark and sweep, checks allocations - so let's just print
that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

0f78264a

bcachefs: Small fsck fix · 9e343161

Kent Overstreet authored Feb 13, 2022

The check_dirents pass handles transaction restarts at the top level -
check_subdir_count() was incorrectly handling transaction restarts
without returning -EINTR, meaning that the iterator pointing to the
dirent being checked was left invalid.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
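
The control flow this fix aims for can be sketched as follows. All names here are hypothetical stand-ins (only the `-EINTR` convention comes from the commit message), and the "transaction" is reduced to a counter so the example stays self-contained.

```c
#include <errno.h>

/* Hypothetical stand-in for a transaction that restarts a few times. */
struct trans_sketch {
	int restarts_left;	/* how many more times the helper will restart */
	int restarts_seen;	/* restarts observed by the top-level loop */
};

/* Helper: on restart, return -EINTR rather than retrying locally. */
static int check_subdir_count_sketch(struct trans_sketch *t)
{
	if (t->restarts_left > 0) {
		t->restarts_left--;
		return -EINTR;
	}
	return 0;
}

static int check_one_dirent_sketch(struct trans_sketch *t)
{
	int ret = check_subdir_count_sketch(t);

	if (ret)
		return ret;	/* includes -EINTR: our iterator may be invalid now */

	/* ... remaining per-dirent checks would go here ... */
	return 0;
}

/* Top level: the only place that handles restarts and revalidates iterators. */
static int check_dirents_sketch(struct trans_sketch *t)
{
	int ret;

	do {
		ret = check_one_dirent_sketch(t);
		if (ret == -EINTR)
			t->restarts_seen++;
	} while (ret == -EINTR);

	return ret;
}
```

Because the helper never swallows the restart, the caller that owns the dirent iterator always gets a chance to re-establish it before continuing.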

9e343161

bcachefs: Fix reflink repair code · aa8982c3

Kent Overstreet authored Feb 10, 2022

The reflink repair code was incorrectly inserting a nonzero deleted key
via journal replay - this is due to bch2_journal_key_insert() being
somewhat hacky, and so this fix is also hacky for now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

aa8982c3

bcachefs: bch2_gc_gens() no longer uses bucket array · c45c8667

Kent Overstreet authored Dec 24, 2021

Like the previous patches, this converts bch2_gc_gens() to use the alloc
btree directly, and private arrays of generation numbers for its own
recalculation of oldest_gen.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

c45c8667

bcachefs: Copygc no longer uses bucket array · d73e0d2c

Kent Overstreet authored Dec 25, 2021

This converts the copygc code to use the alloc btree directly to find
buckets that need to be evacuated instead of the in-memory bucket array,
which is finally going away soon.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

d73e0d2c

bcachefs: btree_gc no longer uses main in-memory bucket array · ec061b21

Kent Overstreet authored Dec 25, 2021

This changes the btree_gc code to only use the second bucket array, the
one dedicated to GC. On completion, it compares what's in its in-memory
bucket array to the allocation information in the btree and writes it
directly, instead of updating the main in-memory bucket array and
writing that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

ec061b21

bcachefs: Inode create no longer needs to probe key cache · 63a2edce

Kent Overstreet authored Jan 09, 2023

Now that we have full key cache coherency, we can simplify
bch2_inode_create().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

63a2edce

bcachefs: Btree key cache coherency · 12ce5b7d

Kent Overstreet authored Jan 12, 2022

 - Updates to non key cache iterators will now be transparently
   redirected to the key cache for cached btrees.

 - Except when creating new keys: then the update goes to the underlying
   btree.

For iterating over a cached btree to work, we need to ensure that if
a key exists in the key cache, it also exists in the btree - otherwise
the iterator code will skip past it and not check the key cache.

Otherwise, for consistency, all updates should go to the same place -
the key cache.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
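
The routing rule in the two bullet points above might be summarized by a sketch like this one. The enum and function names are hypothetical, and the real update path works with iterators and paths rather than two booleans.

```c
#include <stdbool.h>

/* Hypothetical: where an update issued through a non key cache iterator lands. */
enum update_target_sketch {
	UPDATE_TARGET_BTREE,
	UPDATE_TARGET_KEY_CACHE,
};

static enum update_target_sketch
pick_update_target(bool btree_is_cached, bool key_exists_in_btree)
{
	/* Btrees that do not use the key cache are updated directly. */
	if (!btree_is_cached)
		return UPDATE_TARGET_BTREE;

	/*
	 * New keys are created in the underlying btree, so that iteration
	 * over the btree finds them and then checks the key cache.
	 */
	if (!key_exists_in_btree)
		return UPDATE_TARGET_BTREE;

	/* Everything else goes to one place: the key cache. */
	return UPDATE_TARGET_KEY_CACHE;
}
```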

12ce5b7d

bcachefs: BTREE_ITER_WITH_KEY_CACHE · f7b6ca23

Kent Overstreet authored Feb 06, 2022

This is the start of cache coherency with the btree key cache - this
adds a btree iterator flag that causes lookups to also check the key
cache when we're iterating over the btree (not iterating over the key
cache).

Note that we could still race with another thread creating an item in
the key cache and updating it, since we aren't holding the key cache
locked if it wasn't found. The next patch for the update path will
address this by causing the transaction to restart if the key cache is
found to be dirty.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f7b6ca23

bcachefs: run_one_trigger() now checks journal keys · 45e4cd9e

Kent Overstreet authored Feb 24, 2022

Previously, when doing updates and running triggers before journal
replay completes, triggers would see the incorrect key for the old key
being overwritten - this patch updates the trigger code to check the
journal keys when necessary, needed for the upcoming allocator rewrite.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

45e4cd9e

bcachefs: Stash a copy of key being overwritten in btree_insert_entry · 2e63e180

Kent Overstreet authored Feb 24, 2022

We currently need to call bch2_btree_path_peek_slot() multiple times in
the transaction commit path - and some of those need to be updated to
also check the keys from journal replay, too. Let's consolidate this and
stash the key being overwritten in btree_insert_entry.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

2e63e180

bcachefs: bch2_btree_path_set_pos() · ce91abd6

Kent Overstreet authored Feb 06, 2022

bch2_btree_path_set_pos() is now available outside of btree_iter.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ce91abd6

bcachefs: btree_id_cached() · 7c8f6f98

Kent Overstreet authored Jan 12, 2022

Add a new helper that returns true if the given btree ID uses the btree
key cache. This enables some new cleanups, since the helper can check
the options for whether caching is enabled on a given btree.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
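
A helper of this shape could look like the sketch below. The btree IDs, the options struct, and the bitmask representation are all assumptions made for illustration; the commit message only says that the helper can consult the options.

```c
#include <stdbool.h>

/* Hypothetical btree IDs; the real list differs. */
enum btree_id_sketch {
	BTREE_ID_SKETCH_EXTENTS,
	BTREE_ID_SKETCH_INODES,
	BTREE_ID_SKETCH_ALLOC,
	BTREE_ID_SKETCH_NR,
};

/* Hypothetical options: one bit per btree that uses the key cache. */
struct fs_opts_sketch {
	unsigned cached_btrees;
};

static bool btree_id_cached_sketch(const struct fs_opts_sketch *opts,
				   enum btree_id_sketch id)
{
	if ((unsigned)id >= BTREE_ID_SKETCH_NR)
		return false;

	return (opts->cached_btrees & (1U << id)) != 0;
}
```

Centralizing the decision in one predicate means callers no longer need to hard-code which btrees are cached.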

7c8f6f98

bcachefs: Improve btree_key_cache_flush_pos() · a9c0b125

Kent Overstreet authored Jan 12, 2022

btree_key_cache_flush_pos() uses BTREE_ITER_CACHED_NOFILL - but it
wasn't checking for !ck->valid. It does check for the entry being dirty,
so it shouldn't matter, but this refactors it a bit and adds an
assertion.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

a9c0b125

bcachefs: Fix freeing in bch2_dev_buckets_resize() · 80bf2f34

Kent Overstreet authored Feb 06, 2022

We were double-freeing old_buckets and not freeing old_buckets_gens:
also, the code was supposed to free buckets, not old_buckets;
old_buckets is only needed because we have to use rcu_assign_pointer()
instead of swap(), and won't be set if we hit the error path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

80bf2f34

bcachefs: Don't keep nodes in btree_reserve locked · 35228ecb

Kent Overstreet authored Feb 07, 2022

These nodes aren't reachable by other threads, so there's no need to
keep them locked - and this fixes a bug with the assertion in
bch2_trans_unlock() firing on transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

35228ecb

bcachefs: Log message improvements · b74b147d

Kent Overstreet authored Jan 11, 2022

Change the error messages in bch2_inconsistent_error() and
bch2_fatal_error() so we can distinguish them.

Also, prefer bch2_fs_fatal_error() (which also logs an error message) to
bch2_fatal_error(), and change a call to bch2_inconsistent_error() to
bch2_fatal_error() when we can't continue.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

b74b147d

bcachefs: Delete some dead code · 54460a62

Kent Overstreet authored Jan 11, 2022

__bch2_mark_replicas() is now only used in one place, so inline it into
the caller.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

54460a62

bcachefs: Ignore cached data when calculating fragmentation · 0678cbe2

Kent Overstreet authored Jan 10, 2022

Previously, bucket fragmentation was considered to be bucket size -
total amount of live data, both dirty and cached.

This meant that if a bucket was full but only a small amount of data in
it was dirty, with the rest cached, we'd get stuck: copygc wouldn't move the
dirty data out of the bucket and the allocator wouldn't be able to
invalidate and drop the cached data.

This changes fragmentation to exclude cached data, so that copygc will
evacuate these buckets and copygc/the allocator will always be able to
make forward progress.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
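
The difference between the two definitions can be shown with a small sketch. The struct and function names are hypothetical, and sizes are in arbitrary sector units.

```c
#include <stdint.h>

/* Hypothetical per-bucket usage counters. */
struct bucket_usage_sketch {
	uint32_t dirty_sectors;
	uint32_t cached_sectors;
};

/* Old definition: bucket size minus all live data, dirty and cached. */
static uint32_t fragmentation_old(uint32_t bucket_size,
				  const struct bucket_usage_sketch *u)
{
	uint32_t live = u->dirty_sectors + u->cached_sectors;

	return live < bucket_size ? bucket_size - live : 0;
}

/* New definition: cached data is ignored, only dirty data counts as live. */
static uint32_t fragmentation_new(uint32_t bucket_size,
				  const struct bucket_usage_sketch *u)
{
	return u->dirty_sectors < bucket_size
		? bucket_size - u->dirty_sectors
		: 0;
}
```

For a 1024-sector bucket holding 8 dirty and 1016 cached sectors, the old definition reports no fragmentation, so copygc would never pick it; the new one reports 1016 sectors, making the bucket a candidate for evacuation.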

0678cbe2

bcachefs: Don't use in-memory bucket array for alloc updates · 3763cb95

Kent Overstreet authored Dec 25, 2021

More prep work for getting rid of the in-memory bucket array: now that
we have BTREE_ITER_WITH_JOURNAL, the allocator code can do btree lookups
before journal replay is finished, and there's no longer any need for it
to get allocation information from the in-memory bucket array.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

3763cb95

bcachefs: Kill allocator short-circuit invalidate · 1f5f52bd

Kent Overstreet authored Dec 24, 2021

The allocator thread invalidates buckets (increments their generation
number) prior to discarding them and putting them on freelists. We've
had a short circuit path for some time to only update the in-memory
bucket mark when doing the invalidate if we're not invalidating cached
data, but that short-circuit path hasn't really been needed for quite
some time (likely since the btree key cache code was added).

We're deleting it now as part of deleting/converting code that uses the
in-memory bucket array.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

1f5f52bd

bcachefs: BTREE_INSERT_LAZY_RW is only for recovery path · 6214485b

Kent Overstreet authored Jan 09, 2022

BTREE_INSERT_LAZY_RW shouldn't do anything after the filesystem has
finished starting up - otherwise, it might interfere with going
read-only as part of shutting down.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

6214485b

bcachefs: Handle transaction restarts in __bch2_move_data() · 8ede9910

Kent Overstreet authored Jan 09, 2022

We weren't checking for -EINTR in the main loop in __bch2_move_data -
this code predates modern transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

8ede9910

bcachefs: Simplify bch2_inode_delete_keys() · d5030164

Kent Overstreet authored Dec 27, 2021

Had a bug report that implies bch2_inode_delete_keys() returned -EINTR
before it completed, so this patch simplifies it and makes the flow
control a little more conventional.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

d5030164

bcachefs: iter->update_path · 1f2d9192

Kent Overstreet authored Jan 08, 2022

With BTREE_ITER_FILTER_SNAPSHOTS, we have to distinguish between the
path where the key was found, and the path for inserting into the
current snapshot. This adds a new field to struct btree_iter for saving
a path for the current snapshot, and plumbs it through
bch2_trans_update().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1f2d9192

bcachefs: Refactor bch2_btree_iter() · a1e82d35

Kent Overstreet authored Jan 09, 2022

This splits bch2_btree_iter() up into two functions: an inner function
that handles BTREE_ITER_WITH_JOURNAL, BTREE_ITER_WITH_UPDATES, and
iterating across leaf nodes, and an outer one that implements
BTREE_ITER_FILTER_SNAPSHOTS.

This is prep work for remembering a btree_path at our update position in
BTREE_ITER_FILTER_SNAPSHOTS mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a1e82d35

bcachefs: Tracepoint improvements · bc82d08b

Kent Overstreet authored Jan 08, 2022

This improves the transaction restart tracepoints - adding distinct
tracepoints for all the locations and reasons a transaction might have
been restarted, and ensures that there's a tracepoint for every
transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

bc82d08b

bcachefs: New snapshot unit test · 7f6ff935

Kent Overstreet authored Dec 29, 2021

This still needs to be expanded more, but this adds a basic test for
BTREE_ITER_FILTER_SNAPSHOTS.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

7f6ff935

bcachefs: Fix an error path in bch2_snapshot_node_create() · c4ecf802

Kent Overstreet authored Jan 08, 2022

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

c4ecf802