Commits · aa8982c3f2cbfca89fb73daad9d6e65f7be022c2 · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Fix reflink repair code · aa8982c3

Kent Overstreet authored Feb 10, 2022

The reflink repair code was incorrectly inserting a nonzero deleted key
via journal replay - this is due to bch2_journal_key_insert() being
somewhat hacky, and so this fix is also hacky for now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: bch2_gc_gens() no longer uses bucket array · c45c8667

Kent Overstreet authored Dec 24, 2021

Like the previous patches, this converts bch2_gc_gens() to use the alloc
btree directly, and private arrays of generation numbers for its own
recalculation of oldest_gen.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Copygc no longer uses bucket array · d73e0d2c

Kent Overstreet authored Dec 25, 2021

This converts the copygc code to use the alloc btree directly to find
buckets that need to be evacuated instead of the in-memory bucket array,
which is finally going away soon.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: btree_gc no longer uses main in-memory bucket array · ec061b21

Kent Overstreet authored Dec 25, 2021

This changes the btree_gc code to only use the second bucket array, the
one dedicated to GC. On completion, it compares what's in its in-memory
bucket array to the allocation information in the btree and writes it
directly, instead of updating the main in-memory bucket array and
writing that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Inode create no longer needs to probe key cache · 63a2edce

Kent Overstreet authored Jan 09, 2023

Now that we have full key cache coherency, we can simplify
bch2_inode_create().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


bcachefs: Btree key cache coherency · 12ce5b7d

Kent Overstreet authored Jan 12, 2022

 - Updates to non key cache iterators will now be transparently
   redirected to the key cache for cached btrees.

 - Except when creating new keys: then the update goes to the underlying
   btree

For iterating over a cached btree to work, we need to ensure that if
a key exists in the key cache, it also exists in the btree - otherwise
the iterator code will skip past it and not check the key cache.

Otherwise, for consistency, all updates should go to the same place -
the key cache.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


bcachefs: BTREE_ITER_WITH_KEY_CACHE · f7b6ca23

Kent Overstreet authored Feb 06, 2022

This is the start of cache coherency with the btree key cache - this
adds a btree iterator flag that causes lookups to also check the key
cache when we're iterating over the btree (not iterating over the key
cache).

Note that we could still race with another thread creating an item in
the key cache and updating it, since we aren't holding the key cache
locked if it wasn't found. The next patch for the update path will
address this by causing the transaction to restart if the key cache is
found to be dirty.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


bcachefs: run_one_trigger() now checks journal keys · 45e4cd9e

Kent Overstreet authored Feb 24, 2022

Previously, when doing updates and running triggers before journal
replay completes, triggers would see the incorrect key for the old key
being overwritten - this patch updates the trigger code to check the
journal keys when necessary, needed for the upcoming allocator rewrite.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Stash a copy of key being overwritten in btree_insert_entry · 2e63e180

Kent Overstreet authored Feb 24, 2022

We currently need to call bch2_btree_path_peek_slot() multiple times in
the transaction commit path - and some of those need to be updated to
also check the keys from journal replay. Let's consolidate this and
stash the key being overwritten in btree_insert_entry.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: bch2_btree_path_set_pos() · ce91abd6

Kent Overstreet authored Feb 06, 2022

bch2_btree_path_set_pos() is now available outside of btree_iter.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


bcachefs: btree_id_cached() · 7c8f6f98

Kent Overstreet authored Jan 12, 2022

Add a new helper that returns true if the given btree ID uses the btree
key cache. This enables some new cleanups, since the helper can check
the options for whether caching is enabled on a given btree.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
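
As a rough illustration of such a helper, the sketch below keeps a bitmask with one bit per btree ID and derives it from an option. The enum values, the struct fs_state type, the inodes_use_key_cache field and the choice of which btrees are always cached are all assumptions made for this example, not the definitions used by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of btree IDs, for illustration only. */
enum btree_id {
	BTREE_ID_extents,
	BTREE_ID_inodes,
	BTREE_ID_alloc,
	BTREE_ID_NR,
};

/* Hypothetical filesystem state: one option plus the derived bitmask. */
struct fs_state {
	bool		inodes_use_key_cache;
	unsigned	key_cache_btrees;	/* one bit per btree ID */
};

/* Assumption: alloc is always cached, inodes only when the option is set. */
static void fs_init_key_cache_mask(struct fs_state *fs)
{
	fs->key_cache_btrees = 1U << BTREE_ID_alloc;
	if (fs->inodes_use_key_cache)
		fs->key_cache_btrees |= 1U << BTREE_ID_inodes;
}

/* Returns true if the given btree ID goes through the key cache. */
static bool btree_id_cached(const struct fs_state *fs, enum btree_id id)
{
	assert(id < BTREE_ID_NR);
	return (fs->key_cache_btrees & (1U << id)) != 0;
}
```

Callers can then ask one question about a btree ID instead of testing individual options at each call site.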


bcachefs: Improve btree_key_cache_flush_pos() · a9c0b125

Kent Overstreet authored Jan 12, 2022

btree_key_cache_flush_pos() uses BTREE_ITER_CACHED_NOFILL - but it
wasn't checking for !ck->valid. It does check for the entry being dirty,
so it shouldn't matter, but this refactors it a bit and adds an
assertion.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Fix freeing in bch2_dev_buckets_resize() · 80bf2f34

Kent Overstreet authored Feb 06, 2022

We were double-freeing old_buckets and not freeing old_buckets_gens:
also, the code was supposed to free buckets, not old_buckets;
old_buckets is only needed because we have to use rcu_assign_pointer()
instead of swap(), and won't be set if we hit the error path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Don't keep nodes in btree_reserve locked · 35228ecb

Kent Overstreet authored Feb 07, 2022

These nodes aren't reachable by other threads, so there's no need to
keep them locked - and this fixes a bug with the assertion in
bch2_trans_unlock() firing on transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Log message improvements · b74b147d

Kent Overstreet authored Jan 11, 2022

Change the error messages in bch2_inconsistent_error() and
bch2_fatal_error() so we can distinguish them.

Also, prefer bch2_fs_fatal_error() (which also logs an error message) to
bch2_fatal_error(), and change a call to bch2_inconsistent_error() to
bch2_fatal_error() when we can't continue.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Delete some dead code · 54460a62

Kent Overstreet authored Jan 11, 2022

__bch2_mark_replicas() is now only used in one place, so inline it into
the caller.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Ignore cached data when calculating fragmentation · 0678cbe2

Kent Overstreet authored Jan 10, 2022

Previously, bucket fragmentation was considered to be bucket size -
total amount of live data, both dirty and cached.

This meant that if a bucket was full but only a small amount of data in
it was dirty - the rest cached, we'd get stuck: copygc wouldn't move the
dirty data out of the bucket and the allocator wouldn't be able to
invalidate and drop the cached data.

This changes fragmentation to exclude cached data, so that copygc will
evacuate these buckets and copygc/the allocator will always be able to
make forward progress.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
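
The arithmetic can be sketched as follows. The struct bucket_usage type and its field names are hypothetical, chosen only to show the old and new rules side by side; they are not the bcachefs definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-bucket accounting, in sectors. */
struct bucket_usage {
	uint32_t bucket_size;	 /* total sectors in the bucket */
	uint32_t dirty_sectors;	 /* live data that must be preserved */
	uint32_t cached_sectors; /* live data that may simply be dropped */
};

/* Old rule: cached data counts as live, so a full bucket looks unfragmented. */
static uint32_t fragmentation_old(const struct bucket_usage *b)
{
	assert(b->dirty_sectors + b->cached_sectors <= b->bucket_size);
	return b->bucket_size - (b->dirty_sectors + b->cached_sectors);
}

/* New rule: only dirty data counts as live. */
static uint32_t fragmentation_new(const struct bucket_usage *b)
{
	assert(b->dirty_sectors <= b->bucket_size);
	return b->bucket_size - b->dirty_sectors;
}
```

For a 1024-sector bucket holding 8 dirty and 1016 cached sectors, the old rule reports no fragmentation, so copygc would skip the bucket, while the new rule reports 1016 reclaimable sectors.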


bcachefs: Don't use in-memory bucket array for alloc updates · 3763cb95

Kent Overstreet authored Dec 25, 2021

More prep work for getting rid of the in-memory bucket array: now that
we have BTREE_ITER_WITH_JOURNAL, the allocator code can do btree lookups
before journal replay is finished, and there's no longer any need for it
to get allocation information from the in-memory bucket array.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Kill allocator short-circuit invalidate · 1f5f52bd

Kent Overstreet authored Dec 24, 2021

The allocator thread invalidates buckets (increments their generation
number) prior to discarding them and putting them on freelists. We've
had a short circuit path for some time to only update the in-memory
bucket mark when doing the invalidate if we're not invalidating cached
data, but that short-circuit path hasn't really been needed for quite
some time (likely since the btree key cache code was added).

We're deleting it now as part of deleting/converting code that uses the
in-memory bucket array.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: BTREE_INSERT_LAZY_RW is only for recovery path · 6214485b

Kent Overstreet authored Jan 09, 2022

BTREE_INSERT_LAZY_RW shouldn't do anything after the filesystem has
finished starting up - otherwise, it might interfere with going
read-only as part of shutting down.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Handle transaction restarts in __bch2_move_data() · 8ede9910

Kent Overstreet authored Jan 09, 2022

We weren't checking for -EINTR in the main loop in __bch2_move_data() -
this code predates modern transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
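
The general shape of a loop that handles restarts can be sketched as below, assuming (as the message implies) that -EINTR signals a transaction restart. The names step_fn, run_with_restarts and flaky_step are hypothetical and stand in for the real move-data code, which is not shown here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for one unit of work done inside a transaction. */
typedef int (*step_fn)(void *ctx);

/*
 * Process nr items. -EINTR is treated as "transaction restarted": the same
 * item is retried instead of being skipped or aborting the whole loop.
 * Any other nonzero return is a real error and is passed to the caller.
 */
static int run_with_restarts(step_fn step, void *ctx, unsigned nr,
			     unsigned *restarts)
{
	unsigned done = 0;

	assert(step);
	*restarts = 0;

	while (done < nr) {
		int ret = step(ctx);

		if (ret == -EINTR) {
			(*restarts)++;
			continue;
		}
		if (ret)
			return ret;
		done++;
	}
	return 0;
}

/* Example step: restarts on every odd-numbered call, succeeds otherwise. */
static int flaky_step(void *ctx)
{
	unsigned *calls = ctx;

	return (++*calls & 1) ? -EINTR : 0;
}
```

The point of the sketch is the continue on -EINTR: without it, a restart would either be reported as a failure or silently drop the item being processed.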


bcachefs: Simplify bch2_inode_delete_keys() · d5030164

Kent Overstreet authored Dec 27, 2021

Had a bug report that implies bch2_inode_delete_keys() returned -EINTR
before it completed, so this patch simplifies it and makes the flow
control a little more conventional.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: iter->update_path · 1f2d9192

Kent Overstreet authored Jan 08, 2022

With BTREE_ITER_FILTER_SNAPSHOTS, we have to distinguish between the
path where the key was found, and the path for inserting into the
current snapshot. This adds a new field to struct btree_iter for saving
a path for the current snapshot, and plumbs it through
bch2_trans_update().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


bcachefs: Refactor bch2_btree_iter() · a1e82d35

Kent Overstreet authored Jan 09, 2022

This splits bch2_btree_iter() up into two functions: an inner function
that handles BTREE_ITER_WITH_JOURNAL, BTREE_ITER_WITH_UPDATES, and
iterating across leaf nodes, and an outer one that implements
BTREE_ITER_FILTER_SNAPSHOTS.

This is prep work for remembering a btree_path at our update position in
BTREE_ITER_FILTER_SNAPSHOTS mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


bcachefs: Tracepoint improvements · bc82d08b

Kent Overstreet authored Jan 08, 2022

This improves the transaction restart tracepoints - adding distinct
tracepoints for all the locations and reasons a transaction might have
been restarted, and ensures that there's a tracepoint for every
transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: New snapshot unit test · 7f6ff935

Kent Overstreet authored Dec 29, 2021

This still needs to be expanded more, but this adds a basic test for
BTREE_ITER_FILTER_SNAPSHOTS.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Fix an error path in bch2_snapshot_node_create() · c4ecf802

Kent Overstreet authored Jan 08, 2022

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use BTREE_INSERT_USE_RESERVE in btree_update_key() · b674bfad

Kent Overstreet authored Jan 08, 2022

bch2_btree_update_key() is used in the btree node write path - before
delivering the completion we have to update the parent pointer with the
number of sectors written.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Refactor trigger code · 7d782ae4

Kent Overstreet authored Jan 06, 2022

This breaks bch2_trans_commit_run_triggers() up into multiple functions,
and deletes a bit of duplication - prep work for triggers on alloc keys,
which will need to run last.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Rename data_op_data_progress -> data_jobs

Kent Overstreet authored Jan 06, 2022

Mild refactoring.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix check_pos_snapshot_overwritten for !snapshots · a7431348

Kent Overstreet authored Jan 06, 2022

It shouldn't run if the btree being checked doesn't have snapshots.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: New data structure for buckets waiting on journal commit · 21aec962

Kent Overstreet authored Jan 04, 2022

Implement a hash table, using cuckoo hashing, for empty buckets that are
waiting on a journal commit before they can be reused.

This replaces the journal_seq field of bucket_mark, and is part of
eventually getting rid of the in-memory bucket array.

We may need to make bch2_bucket_needs_journal_commit() lockless, pending
profiling and testing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
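
For readers unfamiliar with cuckoo hashing, here is a minimal, fixed-size sketch of the technique: every key has two candidate slots, a lookup probes at most those two, and an insert that finds both occupied evicts an occupant to its alternate slot. The types, names, hash constants and table size are assumptions for illustration; the real table, its resizing and its locking are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS	6
#define TABLE_SLOTS	(1U << TABLE_BITS)
#define MAX_KICKS	32

/* Hypothetical entry: a bucket id and the journal sequence it waits on. */
struct wait_entry {
	uint64_t bucket;
	uint64_t journal_seq;	/* 0 marks an empty slot */
};

struct wait_table {
	struct wait_entry slots[TABLE_SLOTS];
};

/* Two multiplicative hashes; the top TABLE_BITS bits select the slot. */
static size_t hash_slot(uint64_t bucket, int which)
{
	static const uint64_t mult[2] = {
		0x9e3779b97f4a7c15ULL, 0xc2b2ae3d27d4eb4fULL,
	};

	return (size_t)((bucket * mult[which]) >> (64 - TABLE_BITS));
}

static struct wait_entry *find_entry(struct wait_table *t, uint64_t bucket)
{
	for (int i = 0; i < 2; i++) {
		struct wait_entry *e = &t->slots[hash_slot(bucket, i)];

		if (e->journal_seq && e->bucket == bucket)
			return e;
	}
	return NULL;
}

/* A lookup probes at most two slots. */
static bool wait_table_lookup(struct wait_table *t, uint64_t bucket,
			      uint64_t *seq)
{
	struct wait_entry *e = find_entry(t, bucket);

	if (e)
		*seq = e->journal_seq;
	return e != NULL;
}

/* Returns false, leaving the table unchanged, if MAX_KICKS is exhausted. */
static bool wait_table_insert(struct wait_table *t, uint64_t bucket,
			      uint64_t journal_seq)
{
	struct wait_entry cur = { bucket, journal_seq };
	struct wait_entry *e = find_entry(t, bucket);
	struct wait_table saved;
	size_t pos;

	assert(journal_seq != 0);
	if (e) {
		e->journal_seq = journal_seq;	/* already present: update */
		return true;
	}

	saved = *t;	/* snapshot so a failed insert loses nothing */
	pos = hash_slot(cur.bucket, 0);
	for (int kick = 0; kick < MAX_KICKS; kick++) {
		struct wait_entry victim = t->slots[pos];
		size_t h0;

		t->slots[pos] = cur;
		if (!victim.journal_seq)
			return true;	/* slot was empty: done */

		/* Move the evicted entry to its other candidate slot. */
		cur = victim;
		h0 = hash_slot(cur.bucket, 0);
		pos = pos == h0 ? hash_slot(cur.bucket, 1) : h0;
	}

	*t = saved;
	return false;	/* a real table would grow and rehash here */
}
```

The appeal for this use case is the bounded lookup cost: checking whether a bucket is still waiting on a journal commit touches two slots at most, regardless of how full the table is.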


bcachefs: Also print out in-memory gen on stale dirty pointer · f443fa66

Kent Overstreet authored Feb 13, 2022

We're trying to track down a bug that shows itself as newly-created
extents having stale dirty pointers - possibly due to the in-memory gen
and the btree gen being inconsistent. This patch changes the error
message to also print out the in-memory bucket gen when this happens.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Improve path for when btree_gc needs another pass · 8f11548e

Kent Overstreet authored Jan 01, 2022

btree_gc sometimes needs another pass when it corrects bucket generation
numbers or data types - when it finds multiple pointers of different
data types to the same bucket, it may want to keep the second one it
found.

When this happens, we now clear out bucket sector counts _without_
resetting the bucket generation/data types that we already found,
instead of resetting them to what we have in the alloc btree.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Fix bch2_check_fix_ptrs() · 4e08446d

Kent Overstreet authored Jan 04, 2022

The repair for btree_ptrs was saying one thing and doing another -
fortunately, that code can just be deleted.

Also, when we update a btree node pointer, we have to update the node
in memory as well, if it exists in the btree node cache - this fixes
bch2_check_fix_ptrs() to do that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Fix an uninitialized variable · 9714baaa

Kent Overstreet authored Jan 04, 2022

Only userspace builds were complaining about it, oddly enough.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


Revert "bcachefs: Delete some obsolete journal_seq_blacklist code" · 9b6e2f1e

Kent Overstreet authored Jan 04, 2022

This reverts commit f95b61228efd04c9c158123da5827c96e9773b29.

It turns out, we're seeing filesystems in the wild end up with
blacklisted btree node bsets - this should not be happening, and until
we understand why and fix it we need to keep this code around.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Log & error message improvements · 03ea3962

Kent Overstreet authored Jan 04, 2022

 - Add a shim uuid_unparse_lower() in the kernel, since %pU doesn't work
   in userspace

 - We don't need to print the bcachefs: or the filesystem name prefix in
   userspace

 - Improve a few error messages
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: BTREE_ITER_FILTER_SNAPSHOTS is selected automatically · 57cfdd8b

Kent Overstreet authored Jan 04, 2022

It doesn't have to be specified - this patch deletes the two instances
where it was.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


bcachefs: Switch to __func__ for recording where btree_trans was initialized · 669f87a5

Kent Overstreet authored Jan 04, 2022

Symbol decoding, via %ps, isn't supported in userspace - this will also
be faster when we're using trans->fn in the fast path, as with the new
BCH_JSET_ENTRY_log journal messages.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
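
The idea can be sketched with a small macro that captures __func__ at the call site and stores it as a plain string, so no symbol decoding is needed when it is printed later. struct trans, trans_init_named(), trans_init() and do_lookup() are hypothetical names for this example; only the fn field mirrors the trans->fn mentioned above.

```c
#include <assert.h>

/* Hypothetical transaction object that remembers where it was set up. */
struct trans {
	const char *fn;		/* name of the initializing function */
};

static void trans_init_named(struct trans *t, const char *fn)
{
	assert(fn);
	t->fn = fn;
}

/*
 * __func__ expands to the name of the enclosing function, so the macro
 * records the caller's name as an ordinary C string.
 */
#define trans_init(t)	trans_init_named((t), __func__)

/* Example caller: after this runs, t->fn points to "do_lookup". */
static void do_lookup(struct trans *t)
{
	trans_init(t);
}
```

Because the stored value is already a string, it can be printed with %s in both kernel and userspace builds, or copied into a log entry without a symbol lookup.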
