- 22 Oct, 2023 40 commits
-
-
Kent Overstreet authored
Back when we relied on the journal sequence number blacklist machinery for consistency between btree and the journal, we needed to ensure a new journal entry was written before any btree writes were done. But, this had the side effect of consuming some space in the journal prior to doing journal replay - which could lead to a very wedged filesystem, since we don't yet have a way to grow the journal prior to going RW. Fortunately, the journal sequence number blacklist machinery isn't needed anymore, as btree node pointers now record the numer of sectors currently written to that node - that code should all be ripped out. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Snapshot deletion needs to become a multi step process, where we unlink, then tear down the page cache, then delete the subvolume - the deleting flag is equivalent to an inode with i_nlink = 0. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
We should always print out the key we were marking. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
It seems some users have reflink pointers which span many indirect extents, from a short window in time when merging of reflink pointers was allowed. Now, we're seeing transaction path overflows in fix_reflink_p(), the code path to clear out the reflink_p fields now used for front/back pad - but, we don't actually need to be running triggers in that path, which is an easy partial fix. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
for_each_btree_key() now calls bch2_trans_begin() as needed; that means, we can also call it when we're in danger of overflowing transaction paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
The way __bch2_mark_reflink_p returns errors was clashing with returning the number of sectors processed - we weren't returning FSCK_ERR_EXIT correctly. Fix this by only using the return code for errors, which actually ends up simplifying the overall logic. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This fixes a possible race where we fail to remove a device because of btree nodes still on it, that are being deleted by in flight btree updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
We have been getting away from handling transaction restarts locally - convert bch2_btree_node_rewrite() to the newer style. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
We were modifying state, then return -EINTR, causing us to skip nodes - ouch. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
But we don't need to call it from outside the btree iterator code anymore, since it's called by bch2_trans_begin() and bch2_btree_path_traverse(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This is a hacky but effective fix to device usage stats for superblock and journal being wrong on a newly added device (following the comment that already told us how it needed to be done!) Reported-by: Chris Webb <chris@arachsys.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
readdir() in a directory with many subvolumes could overflow transaction paths - this is a simple hack around the issue. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
The back merging case wasn't returning errors correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This changes the on disk format for dirents that point to subvols so that they also record the subvolid of the parent subvol, so that we can filter them out in other subvolumes. This also updates the dirent code to do that filtering, and in particular tweaks the rename code - we need to ensure that there's only ever one dirent (counting multiplicities in different snapshots) that point to a subvolume. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Code that uses for_each_btree_key often wants transaction restarts to be handled locally and not returned. Originally, we wouldn't return transaction restarts if there was a single iterator in the transaction - the reasoning being if there weren't other iterators being invalidated, and the current iterator was being advanced/retraversed, there weren't any locks or iterators we were required to preserve. But with the btree_path conversion that approach doesn't work anymore - even when we're using for_each_btree_key() with a single iterator there will still be two paths in the transaction, since we now always preserve the path at the pos the iterator was initialized at - the reason being that on restart we often restart from the same place. And it turns out there's now a lot of for_each_btree_key() uses that _do not_ want transaction restarts handled locally, and should be returning them. This patch splits out for_each_btree_key_norestart() and for_each_btree_key_continue_norestart(), and converts existing users as appropriate. for_each_btree_key(), for_each_btree_key_continue(), and for_each_btree_node() now handle transaction restarts themselves by calling bch2_trans_begin() when necessary - and the old hack to not return transaction restarts when there's a single path in the transaction has been deleted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
It's not an error if we don't have cached data - skip BCH_DATA_cached in bch2_have_enough_devs(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This fixes a bug where subsequently doing creates with the same name fails. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
check_path() wasn't checking the snapshot ID when checking for directory structure loops - so, snapshots would cause us to detect a loop that wasn't actually a loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
When a reflink pointer points to only part of an indirect extent, and then that indirect extent is fragmented (e.g. by copygc), if the reflink pointer only points to one of the fragments we leak a reference. Fix this by storing front/back pad values in reflink pointers - when inserting reflink pointesr, we initialize them to cover the full range of the indirect extents we reference. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
We had a bug where reflink_p pointers weren't being initialized to 0, and when we started using the second word, things broke badly. This patch revs the on disk format version and adds cleanup code to zero out the second word of reflink_p pointers before we start using it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
It shouldn't be necessary when we're only using a single iterator and not doing updates, but that's harder to debug at the moment. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Now that peek_node()/next_node() are converted to return errors directly, we don't need bch2_trans_exit() to return errors - it's cleaner this way and wasn't used much anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This changes for_each_btree_node() to work like for_each_btree_key(), and to that end bch2_btree_iter_peek_node() and next_node() also return error ptrs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
When a reflink pointer points to an indirect extent that doesn't exist, we need to replace it with a KEY_TYPE_error key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Checking of directory structure across subvolumes was broken - we need to look up the snapshot ID of the parent subvolume when crossing subvol boundaries. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Subvolume deletion doesn't flush & evict the btree key cache - ideally it would, but that's tricky, so instead bch2_subvolume_create() needs to make sure the slot doesn't exist in the key cache to avoid triggering assertions. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Brett Holman authored
Type size_t is architecture-specific. Fix warnings for some non-amd64 arches. Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
This bug was only discovered when we started using the 2nd word in the val, which should have been zeroed out as those fields had never been used before - ouch. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
We were shadowing our exist status, oops Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Also print the key beyng overwritten for each update. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This fixes a null ptr deref in bio_alloc_bioset() -> biovec_slab() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
When force-removing a device, we were silently dropping extents that we no longer had pointers for - we should have been switching them to KEY_TYPE_error, so that reads for data that was lost return errors. This patch adds the logic for switching a key to KEY_TYPE_error to bch2_bkey_drop_ptr(), and improves the logic somewhat. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
With snapshots, __bch2_dev_usr_data_drop() now uses an ALL_SNAPSHOTS iterator, which isn't an extent iterator - meaning we shouldn't be inserting whiteouts with nonzero size to delete. This fixes a bug where we go RO because we tried to insert an invalid key in the device remove path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Brett Holman authored
Prevent false positives in bch2_varint_decode_fast() Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
It was switching off of the key type incorrectly - this code must've been quite old, and not rereplicating anything that wasn't a btree_ptr_v1 or a plain old extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
When we delete a snapshot, we unlink the inode but we don't want to run the inode_rm path - the unlink path deletes the subvolume directly, which does everything we need. Also allowing the inode_rm path to run was getting us "missing subvolume" errors. There's still another bug with snapshot deletion: we need to make snapshot deletion a multi stage process, where we unlink the root dentry, then tear down the page cache, then delete the snapshot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
this_cpu_ptr() emits a warning when used without preemption disabled - harmless in this case, as we have other locking where bch2_acc_percpu_u64s() is used. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
bch2_trans_begin() is now required for transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
These paths weren't updated for btree_path and snapshots - a couple of minor fixes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
- check for getting to the end of the btree in bch2_path_verify_locks and __btree_path_traverse_all(), this fixes an infinite loop in __btree_path_traverse_all(). - relax requirement in bch2_btree_node_upgrade() that we must want an intent lock, this fixes bugs with paths that point to interior nodes (nonzero level). - bch2_btree_node_update_key(): fix it to upgrade the path to an intent lock, if necessary Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-