Commits · 6f47c706d9d4055ecdd125023577a20449c12b24 · nexedi / linux

30 Mar, 2018 35 commits

btrfs: Document parameters of btrfs_reserve_extent · 6f47c706

Nikolay Borisov authored Mar 13, 2018

This function is the entry to the extent allocator and as such has
quite a number of parameters. Some of those have subtle effects on the
allocation algorithm. Document the parameters.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

6f47c706

btrfs: Handle error from btrfs_uuid_tree_rem call in _btrfs_ioctl_set_received_subvol · d87ff758

Nikolay Borisov authored Mar 12, 2018

As with every function that modifies the btree,
btrfs_uuid_tree_rem can fail for any number of reasons (e.g. EIO/ENOMEM).
Handle the error returned by this function gracefully by aborting the
transaction.

Fixes: dd5f9615 ("Btrfs: maintain subvolume items in the UUID tree")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d87ff758
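The error-handling pattern this commit describes can be sketched in plain userspace C. This is only an illustration under stated assumptions: `sketch_trans`, `sketch_uuid_tree_rem`, `sketch_abort_transaction`, and `sketch_set_received_subvol` are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transaction handle; not the kernel type. */
struct sketch_trans {
	bool aborted;
	int abort_err;
};

/* Simulated tree modification that may fail, e.g. with -EIO or -ENOMEM. */
int sketch_uuid_tree_rem(int simulated_result)
{
	return simulated_result;
}

/* Record that the transaction was aborted and why. */
void sketch_abort_transaction(struct sketch_trans *trans, int err)
{
	trans->aborted = true;
	trans->abort_err = err;
}

/* Check the return value instead of ignoring it, and abort on failure. */
int sketch_set_received_subvol(struct sketch_trans *trans, int simulated_result)
{
	int ret = sketch_uuid_tree_rem(simulated_result);

	if (ret) {
		sketch_abort_transaction(trans, ret);
		return ret;
	}
	return 0;
}
```

Calling `sketch_set_received_subvol(&t, -EIO)` marks the hypothetical transaction as aborted and propagates `-EIO` to the caller, while a zero result leaves it untouched.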

btrfs: Use sizeof directly instead of a constant variable · 776c4a7c

Nikolay Borisov authored Mar 13, 2018

The kernel would like to have all stack VLA usage removed[1].
Unfortunately using an integer constant variable as the size of an
array is still considered a VLA. Instead, let's use sizeof(var) directly,
which removes the VLA usage. Use the occasion to remove csum_size
altogether and use sizeof() also for the size passed to memcmp.

[1]: https://lkml.org/lkml/2018/3/7/621
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

776c4a7c
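A minimal sketch of the VLA point made in this commit message, written as standalone userspace C rather than the actual btrfs code. The helper `csum_matches` and its parameters `stored` and `computed` are hypothetical names chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the btrfs function. In C, writing
 *     const uint16_t csum_size = sizeof(uint32_t);
 *     uint8_t result[csum_size];
 * still declares a VLA, because a const-qualified variable is not an
 * integer constant expression. Sizing the array with sizeof avoids that.
 */
int csum_matches(const uint8_t *stored, uint32_t computed)
{
	uint8_t result[sizeof(computed)];	/* fixed-size array, not a VLA */

	memcpy(result, &computed, sizeof(result));
	/* sizeof(result) also serves as the length passed to memcmp */
	return memcmp(stored, result, sizeof(result)) == 0;
}
```

Because the array size and the `memcmp` length both come from `sizeof`, no separate size variable is needed.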

btrfs: rename submit callbacks and drop double underscores · d0ee3934

David Sterba authored Mar 08, 2018

Signed-off-by: David Sterba <dsterba@suse.com>

d0ee3934

btrfs: remove unused parameters from extent_submit_bio_done_t · 6c553435

David Sterba authored Mar 08, 2018

Remove parameters not used by any of the callbacks.
Signed-off-by: David Sterba <dsterba@suse.com>

6c553435

btrfs: remove unused parameters from extent_submit_bio_start_t · d0779291

David Sterba authored Mar 08, 2018

Remove parameters not used by any of the callbacks.
Signed-off-by: David Sterba <dsterba@suse.com>

d0779291

btrfs: separate types for submit_bio_start and submit_bio_done · a758781d

David Sterba authored Jun 23, 2017

The callbacks make use of different parameters that are passed to the
other type unnecessarily. This patch adds separate types for each and
the unused parameters will be removed.

The type extent_submit_bio_hook_t keeps all parameters and can be used
where the start/done types are not appropriate.
Signed-off-by: David Sterba <dsterba@suse.com>

a758781d

btrfs: kill tree_mod_log_set_root_pointer helper · d9d19a01

David Sterba authored Mar 05, 2018

A useless wrapper around tree_mod_log_insert_root that hides missing
error handling. Move it to the callers.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d9d19a01

btrfs: kill tree_mod_log_set_node_key helper · 0e82bcfe

David Sterba authored Mar 05, 2018

A trivial wrapper that can be simply opencoded and makes the GFP
allocation request more visible. The error handling is now moved to the
callers.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

0e82bcfe

btrfs: kill trivial wrapper tree_mod_log_eb_move · bf1d3425

David Sterba authored Mar 05, 2018

The wrapper is effectively an alias for tree_mod_log_insert_move but
also hides the missing error handling. To make that more visible, lift
the BUG_ON to the callers.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

bf1d3425

btrfs: remove trivial locking wrappers of tree mod log · b1a09f1e

David Sterba authored Mar 05, 2018

The wrappers are trivial and do not bring any extra value on top of the
plain locking primitives.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

b1a09f1e

btrfs: drop fs_info parameter from __tree_mod_log_oldest_root · bcd24dab

David Sterba authored Mar 05, 2018

It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>

bcd24dab
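The refactoring applied here, and in the other "drop fs_info parameter" commits in this list, can be sketched as follows. The structures are heavily reduced, hypothetical stand-ins (`sketch_fs_info`, `sketch_extent_buffer`) and the function names are invented for illustration; the only thing taken from the commit message is that the extent buffer already provides the fs_info.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct sketch_fs_info {
	int tree_mod_seq;
};

struct sketch_extent_buffer {
	struct sketch_fs_info *fs_info;	/* back-pointer to the filesystem */
};

/* Before: fs_info is passed alongside an eb that already points to it. */
int sketch_log_old(struct sketch_fs_info *fs_info,
		   struct sketch_extent_buffer *eb)
{
	(void)eb;
	return fs_info->tree_mod_seq;
}

/* After: fs_info is derived from the eb, so one parameter disappears. */
int sketch_log_new(struct sketch_extent_buffer *eb)
{
	return eb->fs_info->tree_mod_seq;
}
```

Both variants return the same value for the same buffer; the second simply cannot be called with a mismatched fs_info.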

btrfs: embed tree_mod_move structure to tree_mod_elem · b6dfa35b

David Sterba authored Mar 05, 2018

The tree_mod_move is not used anywhere and can be embedded as an
anonymous structure.
Signed-off-by: David Sterba <dsterba@suse.com>

b6dfa35b

btrfs: drop unused fs_info parameter from tree_mod_log_eb_move · a446a979

David Sterba authored Mar 05, 2018

Signed-off-by: David Sterba <dsterba@suse.com>

a446a979

btrfs: drop fs_info parameter from tree_mod_log_free_eb · 95b757c1

David Sterba authored Mar 05, 2018

It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>

95b757c1

btrfs: drop fs_info parameter from tree_mod_log_free_eb · db7279a2

David Sterba authored Mar 05, 2018

It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>

db7279a2

btrfs: drop fs_info parameter from tree_mod_log_insert_key · e09c2efe

David Sterba authored Mar 05, 2018

It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>

e09c2efe

btrfs: drop fs_info parameter from tree_mod_log_insert_move · 6074d45f

David Sterba authored Mar 05, 2018

It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>

6074d45f

btrfs: drop fs_info parameter from tree_mod_log_set_node_key · 3ac6de1a

David Sterba authored Mar 05, 2018

It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>

3ac6de1a

btrfs: document more parameters of submit_extent_page · b8b3d625

David Sterba authored Jun 12, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

b8b3d625

btrfs: cleanup merging conditions in submit_extent_page · 0c8508a6

David Sterba authored Jun 12, 2017

The merge call was factored out to a separate helper but it's a trivial
one and arguably we can opencode it and cache the value.
Signed-off-by: David Sterba <dsterba@suse.com>

0c8508a6

btrfs: remove redundant variable in __do_readpage · 8eec8296

David Sterba authored Jun 06, 2017

The value of page_end is only stored to end, no other use.
Signed-off-by: David Sterba <dsterba@suse.com>

8eec8296

btrfs: assume that bio_ret is always valid in submit_extent_page · 5c2b1fd7

David Sterba authored Jun 06, 2017

All callers pass a valid pointer so we can drop the redundant checks.
The call to submit_one_bio never happened and can be removed.
Signed-off-by: David Sterba <dsterba@suse.com>

5c2b1fd7

Btrfs: scrub: batch rebuild for raid56 · 6ca1765b

Liu Bo authored Mar 07, 2018

In case of raid56, writes and rebuilds always take BTRFS_STRIPE_LEN (64K)
as the unit; however, scrub_extent() uses blocksize as the unit, so the
rebuild process may be triggered for every block in the same stripe.

A typical example is replacing a disappeared disk: all reads on the disk
get -EIO, and every block (4K if blocksize is 4K) goes through the
following:

scrub_handle_errored_block
  scrub_recheck_block # re-read pages one by one
  scrub_recheck_block # rebuild by calling raid56_parity_recover()
                        page by page

Although with the raid56 stripe cache most of the reads during rebuild
can be avoided, the parity recovery calculation (xor or raid6 algorithms)
needs to be done $(BTRFS_STRIPE_LEN / blocksize) times.

This makes it smarter by doing raid56 scrub/replace in units of stripe
length.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

6ca1765b
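The cost difference described in this commit is simple arithmetic and can be sketched as below. The 64K stripe length comes from the commit message; the macro `SKETCH_STRIPE_LEN` and both function names are hypothetical, and the code only counts recoveries rather than performing any.

```c
#include <stdint.h>

/* BTRFS_STRIPE_LEN is 64K according to the commit message. */
#define SKETCH_STRIPE_LEN (64u * 1024u)

/* Parity recoveries needed when one stripe is rebuilt block by block. */
uint32_t recoveries_per_block_unit(uint32_t blocksize)
{
	return SKETCH_STRIPE_LEN / blocksize;
}

/* Parity recoveries needed when one stripe is rebuilt as a single unit. */
uint32_t recoveries_per_stripe_unit(void)
{
	return 1;
}
```

With a 4K blocksize, `recoveries_per_block_unit(4096)` evaluates to 16, so batching by stripe length replaces 16 recovery calculations per stripe with one.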

btrfs: sort and group mount option definitions · 416a7202

David Sterba authored Mar 09, 2018

Sort mount options by the primary name, followed by the 'no-'
counterpart if it exists. Group the deprecated and debugging options.
Enum and token definitions are synced.
Signed-off-by: David Sterba <dsterba@suse.com>

416a7202

btrfs: Add nossd_spread mount option · 62b8e077

Howard McLauchlan authored Mar 08, 2018

Btrfs has two mount options for SSD optimizations: ssd and ssd_spread.
Presently there is an option to disable all SSD optimizations, but there
isn't an option to disable just ssd_spread.

This patch adds a mount option nossd_spread that disables ssd_spread
only.
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

62b8e077
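The difference between the existing "disable everything" option and the new one can be sketched with two bit flags. This is an illustrative model only: the flag names and helper functions are hypothetical and do not reproduce the btrfs option parser.

```c
/* Hypothetical flag bits standing in for the ssd and ssd_spread options. */
#define SKETCH_OPT_SSD        (1u << 0)
#define SKETCH_OPT_SSD_SPREAD (1u << 1)

/* Existing behaviour: disable all SSD optimizations at once. */
unsigned int sketch_apply_nossd(unsigned int opts)
{
	return opts & ~(SKETCH_OPT_SSD | SKETCH_OPT_SSD_SPREAD);
}

/* New option: disable only ssd_spread and leave ssd untouched. */
unsigned int sketch_apply_nossd_spread(unsigned int opts)
{
	return opts & ~SKETCH_OPT_SSD_SPREAD;
}
```

Starting from both flags set, `sketch_apply_nossd_spread` leaves `SKETCH_OPT_SSD` in place, whereas `sketch_apply_nossd` clears both.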

btrfs: Remove btrfs_fs_info::open_ioctl_trans · 92e2f7e3

Nikolay Borisov authored Feb 05, 2018

Since userspace transactions have been removed, we no longer have any
use for this field, so delete it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

92e2f7e3

btrfs: Remove code referencing unused TRANS_USERSPACE · bcf3a3e7

Nikolay Borisov authored Feb 05, 2018

Now that the userspace transaction ioctls have been removed,
TRANS_USERSPACE is no longer used hence we can remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

bcf3a3e7

btrfs: Remove btrfs_file_private::trans · 859e682d

Nikolay Borisov authored Feb 05, 2018

Now that the userspace transaction ioctls have been removed, this member
is no longer used, so just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

859e682d

btrfs: Remove userspace transaction ioctls · 7a5a07a8

Nikolay Borisov authored Feb 05, 2018

Commit 3558d4f8 ("btrfs: Deprecate userspace transaction ioctls")
marked the beginning of the end of userspace transactions. This commit
finishes the job! There are no known users and ceph does not use the
ioctl anymore.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Acked-by: Sage Weil <sage@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

7a5a07a8

btrfs: qgroup: Fix root item corruption when multiple same source snapshots... · 4d31778a

Qu Wenruo authored Dec 19, 2017

btrfs: qgroup: Fix root item corruption when multiple same source snapshots are created with quota enabled

When multiple pending snapshots referring to the same source subvolume
are executed, enabled quota will cause root item corruption, where root
items are using old bytenr (no backref in extent tree).

This can be triggered by fstests btrfs/152.

The cause is that when the source subvolume is still dirty, the extra
commit (a simplified transaction commit) done by
qgroup_account_snapshot() can skip dirty roots not recorded in the
current transaction, leaving the root item of the source subvolume not
updated.

Fix it by forcing the source subvolume to be recorded in the current
transaction before the qgroup sub-transaction commit.
Reported-by: Justin Maggard <jmaggard@netgear.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

4d31778a

btrfs: Relax memory barrier in btrfs_tree_unlock · 2e32ef87

Nikolay Borisov authored Feb 14, 2018

When performing an unlock on an extent buffer we'd like to order the
decrement of extent_buffer::blocking_writers with waking up any
waiters. In such situations it's sufficient to use smp_mb__after_atomic
rather than the heavy smp_mb. On architectures where atomic operations
are fully ordered (such as x86 or s390), unconditionally executing
a heavyweight smp_mb instruction causes a severe hit to performance
while bringing no improvements in terms of correctness.

The better thing is to use the appropriate smp_mb__after_atomic routine,
which will do the correct thing (invoke a full smp_mb or, in the case
of ordered atomics, insert a compiler barrier). Put another way,
an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
semantics, to a full smp_mb. This ensures that none of the problems
described in the accompanying comment of waitqueue_active occur.
No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

2e32ef87
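The ordering requirement in this commit (decrement the writer count, then check for waiters) can be approximated in userspace with C11 atomics. This is a loose analogy under stated assumptions, not the kernel code: `sketch_eb`, `has_waiters`, and `wakeups` are hypothetical, and C11 has no direct counterpart of smp_mb__after_atomic, so a sequentially consistent fence is used in its place.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the extent buffer lock state. */
struct sketch_eb {
	atomic_int blocking_writers;
	atomic_bool has_waiters;	/* stands in for waitqueue_active() */
	int wakeups;			/* counts simulated wake_up() calls */
};

void sketch_tree_unlock(struct sketch_eb *eb)
{
	/* Atomic read-modify-write: drop the blocking writer count. */
	atomic_fetch_sub_explicit(&eb->blocking_writers, 1,
				  memory_order_relaxed);

	/*
	 * Full fence between the decrement and the waiter check. It plays
	 * the role the commit assigns to smp_mb__after_atomic(); in the
	 * kernel that call can reduce to a compiler barrier where atomics
	 * are already fully ordered, which C11 does not model.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&eb->has_waiters, memory_order_relaxed))
		eb->wakeups++;	/* stands in for wake_up() */
}
```

The point of the sketch is only the placement of the fence: it sits after the atomic decrement and before the waiter check, so the check cannot be reordered ahead of the decrement.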

btrfs: add define for oldest generation · 7c829b72

Anand Jain authored Mar 07, 2018

Some functions can filter metadata by the generation. Add a define that
will annotate such arguments.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

7c829b72

btrfs: open code trivial helper btrfs_page_exists_in_range · 051c98eb

David Sterba authored Mar 07, 2018

The called function name is self explanatory.
Signed-off-by: David Sterba <dsterba@suse.com>

051c98eb

btrfs: Use filemap_range_has_page() · 965aab1c

Matthew Wilcox authored Mar 06, 2018

The current implementation of btrfs_page_exists_in_range() gives the
wrong answer if the workingset code has stored a shadow entry in the
page cache.  The filemap_range_has_page() function does not have this
problem, and it's shared code, so use it instead.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

965aab1c

26 Mar, 2018 5 commits

Btrfs: dev-replace: make sure target is identical to source when raid56 rebuild fails · 4759700a

Liu Bo authored Mar 02, 2018

In the last step of scrub_handle_errored_block, we try to combine good
copies on all possible mirrors. This works fine for raid1 and raid10,
but not for raid56, as it's doing parity rebuild.

If parity rebuild doesn't come back with correct data that matches its
checksum, in case of replace we'd rather write what is stored in the
source device than the data calculated from parity.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

4759700a

Btrfs: raid56: remove redundant async_missing_raid56 · d6a69135

Liu Bo authored Mar 02, 2018

async_missing_raid56() is identical to async_read_rebuild().
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d6a69135

btrfs: adjust return values of btrfs_inode_by_name · 005d6712

Su Yue authored Mar 05, 2018

Previously, btrfs_inode_by_name() returned 0, which left the caller to
check the objectid of location even if the type of location was invalid.

Let btrfs_inode_by_name() return -EUCLEAN if a corrupted location of a
dir entry is found.  Removal of label out_err also simplifies the
function.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ drop unlikely ]
Signed-off-by: David Sterba <dsterba@suse.com>

005d6712
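The change in return value can be sketched as a lookup that rejects a corrupted location up front. This is a hypothetical userspace model: `sketch_location`, the `sketch_key_type` values, and the choice of which types count as valid are assumptions for illustration, not the btrfs definitions.

```c
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; fallback for other platforms */
#endif

/* Hypothetical key types for illustration; not the btrfs constants. */
enum sketch_key_type { SKETCH_INODE_ITEM = 1, SKETCH_ROOT_ITEM = 2 };

struct sketch_location {
	unsigned long long objectid;
	int type;
};

/* Return -EUCLEAN for a corrupted location instead of returning 0. */
int sketch_inode_by_name(const struct sketch_location *found,
			 struct sketch_location *out)
{
	if (found->type != SKETCH_INODE_ITEM &&
	    found->type != SKETCH_ROOT_ITEM)
		return -EUCLEAN;	/* caller never sees an invalid location */

	*out = *found;
	return 0;
}
```

With this shape the caller only needs to test the return value; it no longer has to inspect the location's fields to discover that the entry was invalid.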

btrfs: rename btrfs_close_extra_device to btrfs_free_extra_devids · 9b99b115

Anand Jain authored Feb 27, 2018

The function btrfs_close_extra_devices() is about freeing
extra devids which may once have belonged to this filesystem.
So rename it and add a comment. The _devids suffix is
appropriate as this function won't handle devices which are
outside of the filesystem being mounted.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

9b99b115

btrfs: Remove root argument from cow_file_range_inline · d02c0e20

Nikolay Borisov authored Mar 02, 2018

This argument is always set to the root of the inode, which is also
passed. So let's get a reference inside the function and simplify
the arg list.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d02c0e20