Commits · 6bf9e4bd6a277840d3fe8c5d5d530a1fbd3db592 · nexedi / linux

29 Apr, 2019 40 commits

btrfs: inode: Verify inode mode to avoid NULL pointer dereference · 6bf9e4bd

Qu Wenruo authored Mar 13, 2019

[BUG]
When accessing a file on a crafted image, btrfs can crash in block layer:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  PGD 136501067 P4D 136501067 PUD 124519067 PMD 0
  CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc8-default #252
  RIP: 0010:end_bio_extent_readpage+0x144/0x700
  Call Trace:
   <IRQ>
   blk_update_request+0x8f/0x350
   blk_mq_end_request+0x1a/0x120
   blk_done_softirq+0x99/0xc0
   __do_softirq+0xc7/0x467
   irq_exit+0xd1/0xe0
   call_function_single_interrupt+0xf/0x20
   </IRQ>
  RIP: 0010:default_idle+0x1e/0x170

[CAUSE]
The crafted image has a tricky corruption, the INODE_ITEM has a
different type against its parent dir:

        item 20 key (268 INODE_ITEM 0) itemoff 2808 itemsize 160
                generation 13 transid 13 size 1048576 nbytes 1048576
                block group 0 mode 121644 links 1 uid 0 gid 0 rdev 0
                sequence 9 flags 0x0(none)

This mode number 0120000 means it's a symlink.

But the dir item think it's still a regular file:

        item 8 key (264 DIR_INDEX 5) itemoff 3707 itemsize 32
                location key (268 INODE_ITEM 0) type FILE
                transid 13 data_len 0 name_len 2
                name: f4
        item 40 key (264 DIR_ITEM 51821248) itemoff 1573 itemsize 32
                location key (268 INODE_ITEM 0) type FILE
                transid 13 data_len 0 name_len 2
                name: f4

For symlink, we don't set BTRFS_I(inode)->io_tree.ops and leave it
empty, as symlink is only designed to have inlined extent, all handled
by tree block read.  Thus no need to trigger btrfs_submit_bio_hook() for
inline file extent.

However end_bio_extent_readpage() expects tree->ops populated, as it's
reading regular data extent.  This causes NULL pointer dereference.

[FIX]
This patch fixes the problem in two ways:

- Verify inode mode against its dir item when looking up inode
  So in btrfs_lookup_dentry() if we find inode mode mismatch with dir
  item, we error out so that corrupted inode will not be accessed.

- Verify inode mode when getting extent mapping
  Only regular file should have regular or preallocated extent.
  If we found regular/preallocated file extent for symlink or
  the rest, we error out before submitting the read bio.

With this fix that crafted image can be rejected gracefully:

  BTRFS critical (device loop0): inode mode mismatch with dir: inode mode=0121644 btrfs type=7 dir type=1
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202763Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

6bf9e4bd

btrfs: tree-checker: Verify inode item · 496245ca

Qu Wenruo authored Mar 13, 2019

There is a report in kernel bugzilla about mismatch file type in dir
item and inode item.

This inspires us to check inode mode in inode item.

This patch will check the following members:

- inode key objectid
  Should be ROOT_DIR_DIR or [256, (u64)-256] or FREE_INO.

- inode key offset
  Should be 0

- inode item generation
- inode item transid
  No newer than sb generation + 1.
  The +1 is for log tree.

- inode item mode
  No unknown bits.
  No invalid S_IF* bit.
  NOTE: S_IFMT check is not enough, need to check every know type.

- inode item nlink
  Dir should have no more link than 1.

- inode item flags
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

496245ca

btrfs: tree-checker: Enhance chunk checker to validate chunk profile · 80e46cf2

Qu Wenruo authored Mar 13, 2019

Btrfs-progs already have a comprehensive type checker, to ensure there
is only 0 (SINGLE profile) or 1 (DUP/RAID0/1/5/6/10) bit set for chunk
profile bits.

Do the same work for kernel.
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202765Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

80e46cf2

btrfs: tree-checker: Verify dev item · ab4ba2e1

Qu Wenruo authored Mar 08, 2019

[BUG]
For fuzzed image whose DEV_ITEM has invalid total_bytes as 0, then
kernel will just panic:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
  #PF error: [normal kernel read fault]
  PGD 800000022b2bd067 P4D 800000022b2bd067 PUD 22b2bc067 PMD 0
  Oops: 0000 [#1] SMP PTI
  CPU: 0 PID: 1106 Comm: mount Not tainted 5.0.0-rc8+ #9
  RIP: 0010:btrfs_verify_dev_extents+0x2a5/0x5a0
  Call Trace:
   open_ctree+0x160d/0x2149
   btrfs_mount_root+0x5b2/0x680

[CAUSE]
If device extent verification finds a deivce with 0 total_bytes, then it
assumes it's a seed dummy, then search for seed devices.

But in this case, there is no seed device at all, causing NULL pointer.

[FIX]
Since this is caused by fuzzed image, let's go the tree-check way, just
add a new verification for device item.
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202691Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>

ab4ba2e1

btrfs: tree-checker: Check chunk item at tree block read time · 075cb3c7

Qu Wenruo authored Mar 20, 2019

Since we have btrfs_check_chunk_valid() in tree-checker, let's do
chunk item verification in tree-checker too.

Since the tree-checker is run at endio time, if one chunk leaf fails
chunk verification, we can still retry the other copy, making btrfs more
robust to fuzzed image as we may still get a good chunk item.

Also since we have done chunk verification in tree block read time, skip
the btrfs_check_chunk_valid() call in read_one_chunk() if we're reading
chunk items from leaf.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

075cb3c7

btrfs: tree-checker: Make btrfs_check_chunk_valid() return EUCLEAN instead of EIO · bf871c3b

Qu Wenruo authored Mar 20, 2019

To follow the standard behavior of tree-checker.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

bf871c3b

btrfs: tree-checker: Make chunk item checker messages more readable · f1140243

Qu Wenruo authored Mar 20, 2019

Old error message would be something like:
  BTRFS error (device dm-3): invalid chunk num_stipres: 0

New error message would be:
  Btrfs critical (device dm-3): corrupt superblock syschunk array: chunk_start=2097152, invalid chunk num_stripes: 0
Or
  Btrfs critical (device dm-3): corrupt leaf: root=3 block=8388608 slot=3 chunk_start=2097152, invalid chunk num_stripes: 0

And for certain error message, also output expected value.

The error message levels are changed from error to critical.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

f1140243

btrfs: Move btrfs_check_chunk_valid() to tree-check.[ch] and export it · 82fc28fb

Qu Wenruo authored Mar 20, 2019

By function, chunk item verification is more suitable to be done inside
tree-checker.

So move btrfs_check_chunk_valid() to tree-checker.c and export it.

And since it's now moved to tree-checker, also add a better comment for
what this function is doing.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

82fc28fb

btrfs: qgroup: remove obsolete fs_info members · 90b1377d

David Sterba authored Mar 27, 2019

The commit fcebe456 ("Btrfs: rework qgroup accounting") reworked
qgroups and added some new structures. Another rework of qgroup
mechanics e69bcee3 ("btrfs: qgroup: Cleanup the old
ref_node-oriented mechanism.") stopped using them and left uncleaned.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

90b1377d

btrfs: get fs_info from eb in btrfs_verify_level_key · e064d5e9

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

e064d5e9

btrfs: get fs_info from eb in btree_read_extent_buffer_pages · 5ab12d1f

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

5ab12d1f

btrfs: get fs_info from eb in read_node_slot · d0d20b0f

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

d0d20b0f

btrfs: get fs_info from eb in btrfs_leaf_free_space · e902baac

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

e902baac

btrfs: get fs_info from eb in clean_tree_block · 6a884d7d

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

6a884d7d

btrfs: get fs_info from eb in tree_mod_log_eb_copy · ed874f0d

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

ed874f0d

btrfs: get fs_info from eb in check_tree_block_fsid · b0c9b3b0

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

b0c9b3b0

btrfs: get fs_info from eb in btrfs_exclude_logged_extents · bcdc428c

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

bcdc428c

btrfs: get fs_info from eb in leaf_data_end · 8f881e8c

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

8f881e8c

btrfs: get fs_info from eb in write_one_eb · 0ab02063

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

0ab02063

btrfs: get fs_info from eb in repair_eb_io_failure · 20a1fbf9

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters. As all callsites are updated, add the btrfs_ prefix as the
function is exported.
Signed-off-by: David Sterba <dsterba@suse.com>

20a1fbf9

btrfs: get fs_info from eb in lock_extent_buffer_for_io · 9df76fb5

David Sterba authored Mar 20, 2019

We can read fs_info from extent buffer and can drop it from the
parameters.
Signed-off-by: David Sterba <dsterba@suse.com>

9df76fb5

btrfs: use common file type conversion · 7d157c3d

Phillip Potter authored Mar 26, 2019

Deduplicate the btrfs file type conversion implementation - file systems
that use the same file types as defined by POSIX do not need to define
their own versions and can use the common helper functions decared in
fs_types.h and implemented in fs_types.c

Common implementation can be found via commit:
bbe7449e "fs: common implementation of file type"
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

7d157c3d

btrfs: Perform locking/unlocking in btrfs_remap_file_range() · 7984ae52

Goldwyn Rodrigues authored Feb 25, 2019

Move code to make it more readable, so as locking and unlocking is
done in the same function. The generic checks that are now performed in
the locked section are unaffected.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

7984ae52

btrfs: use BUG() instead of BUG_ON(1) · 290342f6

Arnd Bergmann authored Mar 25, 2019

BUG_ON(1) leads to bogus warnings from clang when
CONFIG_PROFILE_ANNOTATED_BRANCHES is set:

fs/btrfs/volumes.c:5041:3: error: variable 'max_chunk_size' is used uninitialized whenever 'if' condition is false
      [-Werror,-Wsometimes-uninitialized]
                BUG_ON(1);
                ^~~~~~~~~
include/asm-generic/bug.h:61:36: note: expanded from macro 'BUG_ON'
 #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                                   ^~~~~~~~~~~~~~~~~~~
include/linux/compiler.h:48:23: note: expanded from macro 'unlikely'
 #  define unlikely(x)   (__branch_check__(x, 0, __builtin_constant_p(x)))
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/btrfs/volumes.c:5046:9: note: uninitialized use occurs here
                             max_chunk_size);
                             ^~~~~~~~~~~~~~
include/linux/kernel.h:860:36: note: expanded from macro 'min'
 #define min(x, y)       __careful_cmp(x, y, <)
                                         ^
include/linux/kernel.h:853:17: note: expanded from macro '__careful_cmp'
                __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op))
                              ^
include/linux/kernel.h:847:25: note: expanded from macro '__cmp_once'
                typeof(y) unique_y = (y);               \
                                      ^
fs/btrfs/volumes.c:5041:3: note: remove the 'if' if its condition is always true
                BUG_ON(1);
                ^
include/asm-generic/bug.h:61:32: note: expanded from macro 'BUG_ON'
 #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                               ^
fs/btrfs/volumes.c:4993:20: note: initialize the variable 'max_chunk_size' to silence this warning
        u64 max_chunk_size;
                          ^
                           = 0

Change it to BUG() so clang can see that this code path can never
continue.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Sterba <dsterba@suse.com>

290342f6

btrfs: move tree block wait and write helpers to tree-log · 247462a5

David Sterba authored Mar 21, 2019

The wrapper names better describe what's happening so they're not
deleted though they're trivial, but at least moved closer to their place
of use.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

247462a5

btrfs: remove stale definition of BUFFER_LRU_MAX · d4eb671a

David Sterba authored Mar 21, 2019

Long time ago (2008), the extent buffers were organized in a LRU list
and switched to rb-tree in 6af118ce ("Btrfs: Index extent
buffers in an rbtree"). There was one stale macro definition left.
Signed-off-by: David Sterba <dsterba@suse.com>

d4eb671a

btrfs: tests: unify messages when tests start · e4fa7469

David Sterba authored Mar 18, 2019

- make the messages more visually consistent and use same format
  "running ... test", any error or other warning can be easily spotted

- move some message to the test entry function

- add message to the inode tests

Example output:

[    8.187391] Btrfs loaded, crc32c=crc32c-generic, assert=on, integrity-checker=on, ref-verify=on
[    8.189476] BTRFS: selftest: sectorsize: 4096  nodesize: 4096
[    8.190761] BTRFS: selftest: running btrfs free space cache tests
[    8.192245] BTRFS: selftest: running extent only tests
[    8.193573] BTRFS: selftest: running bitmap only tests
[    8.194876] BTRFS: selftest: running bitmap and extent tests
[    8.196166] BTRFS: selftest: running space stealing from bitmap to extent tests
[    8.198026] BTRFS: selftest: running extent buffer operation tests
[    8.199328] BTRFS: selftest: running btrfs_split_item tests
[    8.200653] BTRFS: selftest: running extent I/O tests
[    8.201808] BTRFS: selftest: running find delalloc tests
[    8.320733] BTRFS: selftest: running extent buffer bitmap tests
[    8.340795] BTRFS: selftest: running inode tests
[    8.341766] BTRFS: selftest: running btrfs_get_extent tests
[    8.342981] BTRFS: selftest: running hole first btrfs_get_extent test
[    8.344342] BTRFS: selftest: running outstanding_extents tests
[    8.345575] BTRFS: selftest: running qgroup tests
[    8.346537] BTRFS: selftest: running qgroup add/remove tests
[    8.347725] BTRFS: selftest: running qgroup multiple refs test
[    8.354982] BTRFS: selftest: running free space tree tests
[    8.372175] BTRFS: selftest: sectorsize: 4096  nodesize: 8192
[    8.373539] BTRFS: selftest: running btrfs free space cache tests
[    8.374989] BTRFS: selftest: running extent only tests
[    8.376236] BTRFS: selftest: running bitmap only tests
[    8.377483] BTRFS: selftest: running bitmap and extent tests
[    8.378854] BTRFS: selftest: running space stealing from bitmap to extent tests
...
Signed-off-by: David Sterba <dsterba@suse.com>

e4fa7469

btrfs: tests: drop messages when some tests finish · 752dbe48

David Sterba authored Mar 18, 2019

The messages like 'extent I/O tests finished' are redundant, if the test
fails it's quite obvious in the log and hang is also noticeable. No
other then extent_io and free space tree tests print that so make it
consistent.
Signed-off-by: David Sterba <dsterba@suse.com>

752dbe48

btrfs: tests: fix comments about tested extent map ranges · 3173fd92

David Sterba authored Mar 18, 2019

Comments about ranges did not match the code, the correct calculation is
to use start and start+len as the interval boundaries.
Signed-off-by: David Sterba <dsterba@suse.com>

3173fd92

btrfs: tests: use SZ_ constants everywhere · 43f7cddc

David Sterba authored Mar 18, 2019

There are a few unconverted constants that are not powers of two and
haven't been converted.
Signed-off-by: David Sterba <dsterba@suse.com>

43f7cddc

btrfs: tests: use standard error message after extent map allocation failure · 6c304746
David Sterba authored Mar 15, 2019
```
Signed-off-by: David Sterba <dsterba@suse.com>
```
6c304746

btrfs: tests: return error from all extent map test cases · ccfada1f

David Sterba authored Mar 18, 2019

The way the extent map tests handle errors does not conform to the rest
of the suite, where the first failure is reported and then it stops.
Do the same now that we have the errors returned from all the functions.
Signed-off-by: David Sterba <dsterba@suse.com>

ccfada1f

btrfs: tests: return errors from extent map test case 4 · 7c6f6700
David Sterba authored Mar 15, 2019
```
Replace asserts with error messages and return errors.
Signed-off-by: David Sterba <dsterba@suse.com>
```
7c6f6700
btrfs: tests: return errors from extent map test case 3 · 992dce74
David Sterba authored Mar 15, 2019
```
Replace asserts with error messages and return errors.
Signed-off-by: David Sterba <dsterba@suse.com>
```
992dce74
btrfs: tests: return errors from extent map test case 2 · e71f2e17
David Sterba authored Mar 15, 2019
```
Replace asserts with error messages and return errors.
Signed-off-by: David Sterba <dsterba@suse.com>
```
e71f2e17
btrfs: tests: return errors from extent map test case 1 · d7de4b08
David Sterba authored Mar 15, 2019
```
Replace asserts with error messages and return errors.
Signed-off-by: David Sterba <dsterba@suse.com>
```
d7de4b08

btrfs: tests: return errors from extent map tests · 488f6730

David Sterba authored Mar 15, 2019

The individual testcases for extent maps do not return an error on
allocation failures. This is not a big problem as the allocation don't
fail in general but there are functional tests handled with ASSERTS.
This makes tests dependent on them and it's not reliable.

This patch adds the allocation failure handling and allows for the
conversion of the asserts to proper error handling and reporting.
Signed-off-by: David Sterba <dsterba@suse.com>

488f6730

btrfs: tests: properly initialize fs_info of extent buffer · 7b9586bc

David Sterba authored Mar 15, 2019

The fs_info is supposed to be valid, even though it's not used right
now and the test does not crash.
Signed-off-by: David Sterba <dsterba@suse.com>

7b9586bc

btrfs: tests: use standard error message after block group allocation failure · 3199366d
David Sterba authored Mar 15, 2019
```
Signed-off-by: David Sterba <dsterba@suse.com>
```
3199366d
btrfs: tests: use standard error message after inode allocation failure · 6a060db8
David Sterba authored Mar 15, 2019
```
Signed-off-by: David Sterba <dsterba@suse.com>
```
6a060db8