Commits · e751c01a8ee1ca934cc0953e2e77ad4ea3e64d5e · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Start using bpos.snapshot field · e751c01a

Kent Overstreet authored Mar 24, 2021

This patch starts treating the bpos.snapshot field like part of the key
in the btree code:

* bpos_successor() and bpos_predecessor() now include the snapshot field
* Keys in btrees that will be using snapshots (extents, inodes, dirents
  and xattrs) now always have their snapshot field set to U32_MAX

The btree iterator code gets a new flag, BTREE_ITER_ALL_SNAPSHOTS, that
determines whether we're iterating over keys in all snapshots or not -
internally, this controls whether bkey_(successor|predecessor)
increment/decrement the snapshot field, or only the higher bits of the
key.

We add a new member to struct btree_iter, iter->snapshot: when
BTREE_ITER_ALL_SNAPSHOTS is not set, iter->pos.snapshot should always
equal iter->snapshot, which will be 0 for btrees that don't use
snapshots, and always U32_MAX for btrees that will use snapshots
(until we enable snapshot creation).

This patch also introduces a new metadata version number, and compat
code for reading from/writing to older versions - this isn't a forced
upgrade (yet).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
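
The successor behaviour described in this message can be sketched in user-space C. This is a minimal model, not the kernel code: struct pos, struct iter, ITER_ALL_SNAPSHOTS and the helper names are hypothetical stand-ins for struct bpos, struct btree_iter and BTREE_ITER_ALL_SNAPSHOTS, and it assumes the snapshot field is the least significant part of the position.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpos: inode is most significant,
 * snapshot least significant (an assumption of this sketch). */
struct pos {
	uint64_t inode;
	uint64_t offset;
	uint32_t snapshot;
};

/* Successor over the whole position: the snapshot field is incremented
 * first and carries into offset, then into inode. */
static inline struct pos pos_successor(struct pos p)
{
	assert(!(p.inode == UINT64_MAX && p.offset == UINT64_MAX &&
		 p.snapshot == UINT32_MAX));
	if (++p.snapshot == 0 && ++p.offset == 0)
		++p.inode;
	return p;
}

/* Successor over the higher bits only: the snapshot field is untouched. */
static inline struct pos pos_nosnap_successor(struct pos p)
{
	assert(!(p.inode == UINT64_MAX && p.offset == UINT64_MAX));
	if (++p.offset == 0)
		++p.inode;
	return p;
}

/* Hypothetical flag mirroring BTREE_ITER_ALL_SNAPSHOTS. */
#define ITER_ALL_SNAPSHOTS (1u << 0)

struct iter {
	unsigned   flags;
	uint32_t   snapshot;	/* models iter->snapshot */
	struct pos pos;		/* models iter->pos */
};

/* Step the iterator position by one, honouring the flag. */
static inline void iter_advance_pos(struct iter *it)
{
	if (it->flags & ITER_ALL_SNAPSHOTS) {
		it->pos = pos_successor(it->pos);
	} else {
		it->pos = pos_nosnap_successor(it->pos);
		/* Invariant from the message: pos.snapshot == iter->snapshot. */
		it->pos.snapshot = it->snapshot;
	}
}
```

With the flag clear and the snapshot pinned at U32_MAX, stepping from offset 5 lands on offset 6 with the snapshot unchanged; with the flag set, the same step only bumps the snapshot field.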

bcachefs: Split out bpos_cmp() and bkey_cmp() · 4cf91b02

Kent Overstreet authored Mar 04, 2021

With snapshots, we're going to need to differentiate between comparisons
that should and shouldn't include the snapshot field. bpos_cmp is now
the comparison function that does include the snapshot field, used by
core btree code.

Upper level filesystem code generally does _not_ want to compare against
the snapshot field - that code wants keys to compare as equal even when
one of them is in an ancestor snapshot.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
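
The difference between the two comparisons can be illustrated with a small user-space sketch. The type and function names below (struct snap_pos, pos_cmp_full, pos_cmp_nosnap) are hypothetical; they only model the roles this message assigns to bpos_cmp() and bkey_cmp(), and are not the kernel implementations.

```c
#include <stdint.h>

/* Hypothetical model of a btree position with a snapshot field. */
struct snap_pos {
	uint64_t inode;
	uint64_t offset;
	uint32_t snapshot;
};

/* Three-way compare returning -1, 0 or 1. */
static inline int cmp_u64(uint64_t l, uint64_t r)
{
	return (l > r) - (l < r);
}

/* Includes the snapshot field: the role given to bpos_cmp(). */
static inline int pos_cmp_full(struct snap_pos l, struct snap_pos r)
{
	int c = cmp_u64(l.inode, r.inode);

	if (!c)
		c = cmp_u64(l.offset, r.offset);
	if (!c)
		c = cmp_u64(l.snapshot, r.snapshot);
	return c;
}

/* Ignores the snapshot field: the role given to bkey_cmp(). */
static inline int pos_cmp_nosnap(struct snap_pos l, struct snap_pos r)
{
	int c = cmp_u64(l.inode, r.inode);

	return c ? c : cmp_u64(l.offset, r.offset);
}
```

Two positions that differ only in their snapshot field compare as unequal under pos_cmp_full() and as equal under pos_cmp_nosnap(), which is the behaviour upper-level filesystem code is said to want.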

bcachefs: Add a mechanism for running callbacks at trans commit time · 43d00243

Kent Overstreet authored Feb 03, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: btree key cache locking improvements · 331194a2

Kent Overstreet authored Mar 24, 2021

The btree key cache mutex was becoming a significant bottleneck - it was
mainly used to protect the lists of dirty, clean and freed cached keys.

This patch eliminates the dirty and clean lists - instead, when we need
to scan for keys to drop from the cache we iterate over the rhashtable,
and thus we're able to remove most uses of that lock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Simplify btree_node_iter_init_pack_failed() · 2649b514

Kent Overstreet authored Mar 27, 2021

Since we now make sure to always generate packed bkey formats that can
pack the min_key of a btree node, this path should actually never
happen.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix for bch2_trans_commit() unlocking when it's not supposed to · f793fd85

Kent Overstreet authored Mar 27, 2021

When we pass BTREE_INSERT_NOUNLOCK, bch2_trans_commit() isn't supposed to
unlock after a successful commit, but it was calling
bch2_trans_cond_resched() - oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix packed bkey format calculation for new btree roots · 3bf57160

Kent Overstreet authored Mar 26, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix building of aux search trees · c7e04e22

Kent Overstreet authored Mar 26, 2021

We weren't packing the min/max keys, which was a major oversight and
completely disabled generating bkey_floats for adjacent nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Generate better bkey formats when splitting nodes · 2da5d000

Kent Overstreet authored Mar 26, 2021

On btree node split, we weren't ensuring the min_key of the new larger
node packs in the new format for this node. This triggers some painful
slowpaths in the bset.c aux search tree code - this patch fixes that by
calculating a new format for the new node with the new min_key.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Drop bkey noops · 0390ea8a

Kent Overstreet authored Mar 24, 2021

Bkey noops were introduced to deal with trimming inline data extents in
place in the btree: if the u64s field of a bkey was 0, that u64 was a
noop and we'd start looking for the next bkey immediately after it.

But extent handling has been lifted above the btree - we no longer
modify existing extents in place in the btree, and the compatibility code
for old style extent btree nodes is gone, so we can completely drop this
code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
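
For readers unfamiliar with the mechanism being removed, here is a toy sketch of noop skipping. The encoding is an assumption made for illustration only: each key starts with a 64-bit word whose low byte holds the key's length in u64s, and a word whose length byte is 0 is a noop. The function names are hypothetical and this is not the removed kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy encoding (assumed): the low byte of a key's first word is its
 * length in u64s, header included; a length of 0 marks a noop word. */
static inline unsigned key_u64s(const uint64_t *k)
{
	return (unsigned)(*k & 0xff);
}

/* Step past the current key, then skip any noop words that follow.
 * Assumes the buffer is well formed (keys do not run past end). */
static inline const uint64_t *key_next_skip_noops(const uint64_t *k,
						   const uint64_t *end)
{
	assert(k < end && key_u64s(k) != 0);
	k += key_u64s(k);
	while (k < end && key_u64s(k) == 0)
		k++;
	return k;
}

/* Count the real keys in [start, end), ignoring noop words. */
static inline size_t count_keys(const uint64_t *start, const uint64_t *end)
{
	const uint64_t *k = start;
	size_t n = 0;

	/* Skip noops that precede the first key. */
	while (k < end && key_u64s(k) == 0)
		k++;
	for (; k < end; k = key_next_skip_noops(k, end))
		n++;
	return n;
}
```

Once nothing trims keys in place, no zero-length words are ever written, so the inner skip loop has nothing to do and a plain step by key length is sufficient.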

bcachefs: Increase default journal size · 7c8b166e

Kent Overstreet authored Mar 24, 2021

The default was 1/256th of the device and capped at 512MB, which is
fairly tiny these days.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
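
To make the old default concrete, the rule stated above can be written as a one-line calculation. This is a sketch under assumptions: old_default_journal_bytes is a hypothetical helper, "512MB" is taken to mean 512 MiB, and any rounding or minimum size the real code applies is not modelled. The new default is not stated in this message, so it is not shown.

```c
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Old default as described in the message: 1/256th of the device,
 * capped at 512 MiB (assuming binary megabytes). */
static inline uint64_t old_default_journal_bytes(uint64_t dev_bytes)
{
	uint64_t size = dev_bytes / 256;
	uint64_t cap  = 512 * MIB;

	return size < cap ? size : cap;
}
```

Under this rule a 64 GiB device gets a 256 MiB journal, and every device of 128 GiB or more hits the 512 MiB cap, which is why the cap dominates on current hardware.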

bcachefs: Use pcpu mode of six locks for interior nodes · a9d79c6e

Kent Overstreet authored Mar 23, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Split btree_iter_traverse and bch2_btree_iter_traverse() · 08070cba

Kent Overstreet authored Mar 23, 2021

External (to the btree iterator code) users of bch2_btree_iter_traverse
expect that on success the iterator will be pointed at iter->pos and
have that position locked - but since we split iter->pos and
iter->real_pos, that means it has to update iter->real_pos if necessary.

Internal users don't expect it to modify iter->real_pos, so we need two
separate functions.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve inode deletion code · d3e6b9a1

Kent Overstreet authored Mar 21, 2021

It had some silly redundancies.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add an .invalid method for bch2_btree_ptr_v2 · fad7cfed

Kent Overstreet authored Mar 22, 2021

It was using the method for btree_ptr_v1, but that wasn't checking all
the fields.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Include snapshot field in bch2_bpos_to_text · 1fe9b1d3

Kent Overstreet authored Mar 22, 2021

More prep work for snapshots.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Update iter->real_pos lazily · bcad5622

Kent Overstreet authored Mar 21, 2021

peek() has to update iter->real_pos - there's no need for
bch2_btree_iter_set_pos() to update it as well.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Consolidate bch2_btree_iter_peek() and peek_with_updates() · 818664f5

Kent Overstreet authored Mar 21, 2021

Ideally we'll be getting rid of peek_with_updates(), but the callers
will need to be checked.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve iter->real_pos handling · ca58cbd4

Kent Overstreet authored Mar 21, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Internal btree iterator renaming · 3b0baf6f

Kent Overstreet authored Mar 21, 2021

This just gives some internal helpers some better names.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill btree_iter_peek_uptodate() · 07fc72e1

Kent Overstreet authored Mar 21, 2021

Since we're no longer doing next() immediately followed by peek(), this
optimization isn't doing anything anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Iterators are now always consistent with iter->real_pos · 5cde51cd

Kent Overstreet authored Mar 21, 2021

This means bch2_btree_iter_traverse_one() can be made more efficient.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Have btree_iter_next_node() use btree_iter_set_search_pos() · 345ca825

Kent Overstreet authored Mar 21, 2021

btree node iterators need to obey the regular btree node invariants
w.r.t. iter->real_pos; once they do, bch2_btree_iter_traverse will have
less that it needs to check.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Replace bch2_btree_iter_next() calls with bch2_btree_iter_advance · e0ba3b64

Kent Overstreet authored Mar 21, 2021

The way btree iterators work internally has been changing, particularly
with the iter->real_pos changes, and bch2_btree_iter_next() is no longer
hyper optimized - it's just advance followed by peek, so it's more
efficient to just call advance where we're not using the return value of
bch2_btree_iter_next().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
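
The relationship between next, advance and peek described above can be modelled with a toy array iterator. Everything here (struct toy_iter and the toy_* functions) is hypothetical and only illustrates why calling advance alone avoids a lookup whose result would be discarded; it is not the bcachefs iterator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy iterator over a sorted array; peeks counts lookups performed. */
struct toy_iter {
	const int *keys;
	size_t     nr;
	size_t     pos;
	unsigned   peeks;
};

/* Move the position forward by one; false once we are at the end. */
static inline bool toy_advance(struct toy_iter *it)
{
	assert(it->pos <= it->nr);
	if (it->pos >= it->nr)
		return false;
	it->pos++;
	return true;
}

/* Look up the key at the current position (the costly part). */
static inline const int *toy_peek(struct toy_iter *it)
{
	it->peeks++;
	return it->pos < it->nr ? &it->keys[it->pos] : NULL;
}

/* next is simply advance followed by peek. */
static inline const int *toy_next(struct toy_iter *it)
{
	return toy_advance(it) ? toy_peek(it) : NULL;
}
```

A caller that ignores the return value of toy_next() pays for a lookup it never uses, whereas toy_advance() moves the position and leaves the lookup to the next peek.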

bcachefs: Get disk reservation when overwriting data in old snapshot · cb16bfaa

Kent Overstreet authored Mar 21, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Switch extent_handle_overwrites() to one key at a time · 4cfb722c

Kent Overstreet authored Mar 20, 2021

Prep work for snapshots.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Optimize bch2_btree_iter_verify_level() · 4ce41957

Kent Overstreet authored Mar 20, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix iterator picking · 5c1ec980

Kent Overstreet authored Mar 20, 2021

The comparison was wrong.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't unconditionally version_upgrade in initialize · 73590619

Kent Overstreet authored Mar 21, 2021

This is mkfs's job. Also, clean up the handling of feature bits some.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Validate bset version field against sb version fields · 84cc758d

Kent Overstreet authored Mar 21, 2021

The superblock version fields need to be accurate to know whether a
filesystem is supported, thus we should be verifying them.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't overwrite snapshot field in bch2_cut_back() · d361a26d

Kent Overstreet authored Mar 19, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill bkey ops->debugcheck method · 7e6dbac9

Kent Overstreet authored Mar 19, 2021

This code used to be used for running some assertions on alloc info at
runtime, but it long predates fsck and hasn't been good for much in
ages - we can delete it now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Assert that iterators aren't being double freed · e9895f0a

Kent Overstreet authored Mar 19, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Require all btree iterators to be freed · 50dc0f69

Kent Overstreet authored Mar 19, 2021

We keep running into occasional bugs with btree transaction iterators
overflowing - this will make those bugs more visible.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: btree_iter_set_dontneed() · 8d956c2f

Kent Overstreet authored Mar 19, 2021

This is a bit clearer than using bch2_btree_iter_free().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fsck code refactoring · abcecb49

Kent Overstreet authored Mar 19, 2021

Change fsck code to always put btree iterators - also, make some flow
control improvements to deal with lock restarts better, and refactor
check_extents() to not walk extents twice for counting/checking
i_sectors.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix btree iterator leak in extent_handle_overwrites() · dbb93db9

Kent Overstreet authored Mar 19, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't list non journal devs in journal_debug_to_text() · ba401eaa

Kent Overstreet authored Mar 19, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add a print statement for when we go read-write · 2c944fa1

Kent Overstreet authored Mar 19, 2021

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill btree_iter_pos_changed() · f2eaea2f

Kent Overstreet authored Mar 16, 2021

This is used in only one place now, so just inline it into the caller.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
