- 22 Oct, 2023 40 commits
-
Kent Overstreet authored
When subvolumes & snapshots were rolled out, hash_redo_key() was disabled due to some new complications - namely, bch2_hash_set() works at the subvolume level, and fsck does not run in a defined subvolume, instead working at the snapshot ID level.

This patch splits out bch2_hash_set_snapshot() from bch2_hash_set(), and makes one small tweak for fsck:

- Normally, bch2_hash_set() (and other dirent code) needs to know what subvolume we're in, because dirents that point to other subvolumes should only be visible in the subvolume they were created in, not other snapshots. We can't check that in fsck, so we just assume that all dirents are visible.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
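A minimal sketch of the layering this describes, using hypothetical names (resolve_snapshot_of_subvol(), hash_set(), hash_set_snapshot()) rather than the actual bcachefs signatures: the subvolume-level entry point resolves its subvolume to a snapshot ID and calls a snapshot-level worker, which fsck - having no subvolume context - can call directly.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t subvol_id, snapshot_id;

/* Hypothetical stand-in for the subvolume -> snapshot lookup: */
static snapshot_id resolve_snapshot_of_subvol(subvol_id subvol)
{
	return subvol * 2 + 1;	/* fake mapping, for the sketch */
}

/* Snapshot-level worker - the role bch2_hash_set_snapshot() plays: */
static int hash_set_snapshot(snapshot_id snapshot, const char *key)
{
	printf("inserting %s at snapshot %u\n", key, snapshot);
	return 0;
}

/* Subvolume-level wrapper - the role bch2_hash_set() plays: */
static int hash_set(subvol_id subvol, const char *key)
{
	return hash_set_snapshot(resolve_snapshot_of_subvol(subvol), key);
}

int main(void)
{
	hash_set(1, "dirent");		/* normal path: subvolume known */
	hash_set_snapshot(7, "dirent");	/* fsck path: snapshot ID only  */
	return 0;
}
```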
-
Kent Overstreet authored
This removes an optimization that didn't actually save us any memory, due to alignment, but did make the code more complicated than it needed to be.

We were also seeing a bug where journal_seq_base wasn't getting correctly initialized, so hopefully this fixes that too.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
A little bit of tidying up: this makes the counters a bit clearer as to what's happening. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Locks must be correctly marked for the cycle detector to work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
With the upcoming cycle detector, we have to be careful about using btree_node_lock_nopath - in particular, using it to take write locks can cause deadlocks.

All held locks need to be tracked in a btree_path, so that the cycle detector knows about them - unless we know that we cannot cause deadlocks for other reasons: e.g. we are only taking read locks, or we're in very early fsck (topology repair).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Similar to "bcachefs: Fix usage of six lock's percpu mode", six locks have a percpu mode, but we can't switch between percpu and non percpu modes while a lock is in use: threads attempting to take a read lock may race, and we'll end up with the read count permanently off. Fixing this the "correct" way, in six_lock_pcpu_(alloc|free) would require an RCU barrier, and we don't want to do that - instead, we have to permanently segragate percpu and non percpu objects, including when on freelists. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Clean up the arguments passed and make them more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Ideally, all the code in btree_locking.c should be converted, but then we'd want to convert btree_path to point to btree_bkey_cached_common too, and then we'd be in for a much bigger cleanup - but a bit of incremental cleanup will still be helpful for the next patches. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Add a type descriptor to btree_bkey_cached_common - there's no reason not to since we've got padding that was otherwise unused, and this is a nice cleanup (and helpful in later patches). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Have to be careful with bit fields - when subtracting, this was overflowing into the write_locking bit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
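A standalone illustration of this bug class (not the bcachefs code, and bit field layout is implementation-defined; a typical x86-64 layout is assumed): subtracting on the word that contains an unsigned bit field underflows, and the borrow corrupts the bit packed next to it.

```c
#include <stdio.h>
#include <stdint.h>

struct lock_state {
	uint64_t read_count	: 6;
	uint64_t write_locking	: 1;
};

int main(void)
{
	union {
		struct lock_state s;
		uint64_t v;
	} u = { .s = { .read_count = 0 } };

	/* "decrement read_count" performed on the whole word: */
	u.v -= 1;

	printf("read_count=%u write_locking=%u\n",
	       (unsigned) u.s.read_count, (unsigned) u.s.write_locking);
	/* prints read_count=63 write_locking=1: the borrow flipped
	 * the adjacent bit */
	return 0;
}
```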
-
Kent Overstreet authored
With the new cycle detector, taking a write lock will be able to fail - unless we pass it nofail, which is possible but not preferred. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
In the future, with the new deadlock cycle detector, we won't be using bare six_lock_* anymore: lock wait entries will all be embedded in btree_trans, and we will need a btree_trans context whenever locking a btree node.

This patch plumbs a btree_trans to the few places that need it, and adds two new locking functions:

- btree_node_lock_nopath, which may fail, returning a transaction restart
- btree_node_lock_nopath_nofail, to be used in places where we know we cannot deadlock (i.e. because we're holding no other locks)

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
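A sketch of the relationship between the two helpers, with simplified hypothetical signatures (the real functions take more context than this): the _nopath variant can fail with a restart error, and the _nofail variant is the wrapper for callers that know a deadlock is impossible, so failure there would indicate a bug.

```c
#include <assert.h>

struct btree_trans;
struct six_lock;
enum six_lock_type { SIX_LOCK_read, SIX_LOCK_intent, SIX_LOCK_write };

/* May fail, returning a transaction restart error: */
static int btree_node_lock_nopath(struct btree_trans *trans,
				  struct six_lock *l,
				  enum six_lock_type type)
{
	(void) trans; (void) l; (void) type;
	/* real version: block on the lock, or return a restart error
	 * if waiting here could participate in a deadlock cycle */
	return 0;
}

/* For callers holding no other locks, where deadlock is impossible: */
static inline void
btree_node_lock_nopath_nofail(struct btree_trans *trans,
			      struct six_lock *l,
			      enum six_lock_type type)
{
	int ret = btree_node_lock_nopath(trans, l, type);

	assert(!ret);	/* a restart here would indicate a locking bug */
}
```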
-
Kent Overstreet authored
six locks are unfair: while a thread is blocked trying to take a write lock, new read locks will fail. The new deadlock cycle detector makes use of our existing lock tracing, so for it to work correctly we need to tell it we're holding a write lock before we take the lock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Since we've now got time_stats for lock hold times (per btree transaction), we don't need this anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
This fixes a small memory leak. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Didn't have any users, and wasn't a good idea to begin with - delete it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Also, do some reorganizing/renaming, convert atomic counters in bch_fs to persistent counters, and add a few missing counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
On failure to get a journal pre-reservation because we're called from journal reclaim, we're not supposed to return a transaction restart error - this fixes a livelock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
This moves the IS_ERR_OR_NULL() check to the inline part, since that's in the fast path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
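The general shape of that split, as a sketch with hypothetical names (node_drop()/__node_drop() are not the actual bcachefs functions): the cheap pointer check lives in an inline wrapper, so the common case never pays for an out-of-line call.

```c
#include <stdint.h>

struct btree;

/* Userspace stand-in for the kernel's IS_ERR_OR_NULL(): */
static inline int is_err_or_null(const void *p)
{
	return !p || (uintptr_t) p >= (uintptr_t) -4095;
}

/* Out of line: the slow path that does the real work. */
static void __node_drop(struct btree *b)
{
	(void) b;
}

/* Inline: the fast-path check happens without a function call. */
static inline void node_drop(struct btree *b)
{
	if (is_err_or_null(b))
		return;
	__node_drop(b);
}
```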
-
Kent Overstreet authored
It now includes journal_flags. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
It now prints the error name when the btree node is an error pointer; also, don't trace failures when the btree node is BCH_ERR_no_btree_node_up. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
- Don't decrease BTREE_ITER_MAX when building with CONFIG_LOCKDEP anymore. The lockdep table sizes are configurable now, we don't need this anymore.
- btree_trans_too_many_iters() is less conservative now. Previously it was causing a transaction restart if we had used more than BTREE_ITER_MAX / 2 paths; change this to BTREE_ITER_MAX - 8.

This helps with excessive transaction restarts/livelocks in the bucket allocator path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
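A simplified sketch of the new threshold (illustrative values and types, not the exact bcachefs source): rather than restarting once half the paths are used, restart only when within a small fixed headroom of the limit.

```c
#include <stdbool.h>

#define BTREE_ITER_MAX 64	/* illustrative value */

struct btree_trans {
	unsigned nr_used_paths;
};

static inline bool btree_trans_too_many_iters(struct btree_trans *trans)
{
	/* was: trans->nr_used_paths > BTREE_ITER_MAX / 2 */
	return trans->nr_used_paths > BTREE_ITER_MAX - 8;
}
```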
-
Kent Overstreet authored
We need to use the right class for some assertions to work correctly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
The upcoming lock cycle detection code will need to know precisely which locks every btree_trans is holding, including write locks - this patch updates btree_node_locked_type to include write locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Improve our debugfs output, to help in debugging deadlocks: this shows, for every btree node we print, the current number of readers/intent locks/write locks held. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
This is just some type safety cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This patch:

- tracks maximum bch2_trans_kmalloc() memory used in btree_transaction_stats
- makes it available in debugfs
- switches bch2_trans_init() to using that for the amount of memory to preallocate, instead of the parameter passed in

This drastically reduces transaction restarts, and means we no longer need to track this in the source code.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
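A sketch of the mechanism with hypothetical field and function names (trans_kmalloc(), max_mem, etc. are simplified stand-ins): record the high-water mark of transaction memory per call site, then size the next preallocation from it.

```c
#include <stdlib.h>
#include <string.h>

struct btree_transaction_stats {
	size_t max_mem;		/* high-water mark, exposed in debugfs */
};

struct trans {
	struct btree_transaction_stats *stats;
	char	*mem;
	size_t	mem_bytes;
	size_t	mem_top;
};

static void *trans_kmalloc(struct trans *trans, size_t size)
{
	void *p;

	if (trans->mem_top + size > trans->mem_bytes)
		return NULL;	/* real code reallocates and restarts */

	p = trans->mem + trans->mem_top;
	trans->mem_top += size;

	/* track the most memory this call site has ever needed: */
	if (trans->mem_top > trans->stats->max_mem)
		trans->stats->max_mem = trans->mem_top;
	return p;
}

static void trans_init(struct trans *trans,
		       struct btree_transaction_stats *stats)
{
	memset(trans, 0, sizeof(*trans));
	trans->stats	 = stats;
	/* preallocate what this call site historically needed: */
	trans->mem_bytes = stats->max_mem;
	trans->mem	 = malloc(trans->mem_bytes);
}
```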
-
Kent Overstreet authored
six_lock_count now counts up whether a write lock is held, and this patch now also correctly counts six_lock->intent_lock_recurse. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Previously, we used two different bit arrays for tracking held btree node locks. This patch switches to an array of two bit integers, which will let us track, in a future patch, when we hold a write lock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
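A standalone illustration of the representation change (not the bcachefs code): packing a two-bit lock state per btree level into a single word leaves a spare encoding for a future "write locked" state, which two parallel one-bit masks could not express.

```c
#include <stdio.h>
#include <stdint.h>

enum btree_node_locked_type {
	BTREE_NODE_UNLOCKED		= 0,
	BTREE_NODE_READ_LOCKED		= 1,
	BTREE_NODE_INTENT_LOCKED	= 2,
	/* the fourth encoding is now free for a write-locked state */
};

static inline unsigned nodes_locked_get(uint32_t locks, unsigned level)
{
	return (locks >> (level * 2)) & 3;
}

static inline uint32_t nodes_locked_set(uint32_t locks, unsigned level,
					enum btree_node_locked_type t)
{
	locks &= ~(3U << (level * 2));
	return locks | ((uint32_t) t << (level * 2));
}

int main(void)
{
	uint32_t locks = 0;

	locks = nodes_locked_set(locks, 0, BTREE_NODE_INTENT_LOCKED);
	locks = nodes_locked_set(locks, 1, BTREE_NODE_READ_LOCKED);

	printf("level 0: %u, level 1: %u\n",
	       nodes_locked_get(locks, 0),
	       nodes_locked_get(locks, 1));
	return 0;
}
```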
-
Kent Overstreet authored
Held btree locks are tracked in btree_path->nodes_locked and btree_path->nodes_intent_locked. Upcoming patches are going to change the representation in struct btree_path, so this patch switches to proper helpers instead of direct access to these fields. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Tidy things up a bit before doing more work in this file. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Start to centralize some of the locking code in a new file; more locking code will be moving here in the future. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Device labels are represented as pointers in the member info section: we need to get and then set the label for it to be kept correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
The new convention is that functions that handle transaction restarts within an existing transaction context should return -BCH_ERR_transaction_restart_nested when they did so, since they invalidated the outer transaction context.

This also means bch2_btree_delete_range_trans() is changed to only call bch2_trans_begin() after a transaction restart, not on every loop iteration.

This fixes a bug in fsck, in check_inode(), when we truncate an inode with BCH_INODE_I_SIZE_DIRTY set.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
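The shape of the retry loop this convention implies, as pseudocode-level C (do_one_deletion(), have_more_work() and err_matches() are hypothetical helpers, not bcachefs functions): bch2_trans_begin() runs only after an actual restart, and a restart handled internally is reported to the caller as _nested so the outer context knows it was invalidated.

```c
#include <stdbool.h>

#define BCH_ERR_transaction_restart		1
#define BCH_ERR_transaction_restart_nested	2

struct btree_trans;
int  do_one_deletion(struct btree_trans *);	/* hypothetical */
bool have_more_work(struct btree_trans *);	/* hypothetical */
bool err_matches(int err, int err_class);	/* hypothetical */
void bch2_trans_begin(struct btree_trans *);

static int delete_range_example(struct btree_trans *trans)
{
	bool restarted = false;
	int ret;

	do {
		ret = do_one_deletion(trans);

		if (err_matches(ret, BCH_ERR_transaction_restart)) {
			bch2_trans_begin(trans); /* only after a restart */
			restarted = true;
			ret = 0;
			continue;		 /* retry this iteration */
		}
	} while (!ret && have_more_work(trans));

	/* we invalidated the outer transaction context; say so: */
	if (!ret && restarted)
		ret = -BCH_ERR_transaction_restart_nested;
	return ret;
}
```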
-
Kent Overstreet authored
- fsck_inode_rm() wasn't returning BCH_ERR_transaction_restart_nested
- change bch2_trans_verify_not_restarted() to call panic() - we don't want these errors to be missed

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
iter->k needs to be consistent with iter->pos - required for bch2_btree_iter_(rewind|advance) to work correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
When returning a key from the key cache, in BTREE_ITER_WITH_KEY_CACHE mode, we don't want to set should_be_locked on iter->path: we're not returning a key from that path, so we don't need to. Also, since we traversed the key cache iterator before setting should_be_locked on that path, it might be unlocked (if we unlocked, bch2_trans_relock() won't have relocked it). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
For debugging the eytzinger search tree code, and low level bkey packing code, it can be helpful to see things in binary: this patch improves our helpers for doing so. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
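A minimal standalone version of the idea (not the actual bcachefs helper): print a value in binary, most significant bit first, so packed bkey layouts and eytzinger indices are easier to eyeball.

```c
#include <stdio.h>
#include <stdint.h>

static void prt_binary(uint64_t v, unsigned nr_bits)
{
	for (unsigned i = nr_bits; i--;)
		putchar((v & ((uint64_t) 1 << i)) ? '1' : '0');
	putchar('\n');
}

int main(void)
{
	prt_binary(0xB5, 8);	/* prints 10110101 */
	return 0;
}
```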
-
Kent Overstreet authored
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-