- 28 Jul, 2014 1 commit
-
-
Dmitry Monakhov authored
If we have to copy data we must drop i_data_sem because of get_blocks() will be called inside mext_page_mkuptodate(), but later we must reacquire it again because we are about to change extent's tree Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Jan Kara <jack@suse.cz>
-
- 13 May, 2014 1 commit
-
-
liang xie authored
Make them more consistently Signed-off-by:
xieliang <xieliang@xiaomi.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 21 Apr, 2014 1 commit
-
-
Lukas Czerner authored
Currently in ext4 there is quite a mess when it comes to naming unwritten extents. Sometimes we call it uninitialized and sometimes we refer to it as unwritten. The right name for the extent which has been allocated but does not contain any written data is _unwritten_. Other file systems are using this name consistently, even the buffer head state refers to it as unwritten. We need to fix this confusion in ext4. This commit changes every reference to an uninitialized extent (meaning allocated but unwritten) to unwritten extent. This includes comments, function names and variable names. It even covers abbreviation of the word uninitialized (such as uninit) and some misspellings. This commit does not change any of the code paths at all. This has been confirmed by comparing md5sums of the assembly code of each object file after all the function names were stripped from it. Signed-off-by:
Lukas Czerner <lczerner@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 23 Feb, 2014 1 commit
-
-
Namjae Jeon authored
This patch implements fallocate's FALLOC_FL_COLLAPSE_RANGE for Ext4. The semantics of this flag are following: 1) It collapses the range lying between offset and length by removing any data blocks which are present in this range and than updates all the logical offsets of extents beyond "offset + len" to nullify the hole created by removing blocks. In short, it does not leave a hole. 2) It should be used exclusively. No other fallocate flag in combination. 3) Offset and length supplied to fallocate should be fs block size aligned in case of xfs and ext4. 4) Collaspe range does not work beyond i_size. Signed-off-by:
Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by:
Ashish Sangwan <a.sangwan@samsung.com> Tested-by:
Dongsu Park <dongsu.park@profitbricks.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 18 Feb, 2014 1 commit
-
-
Dan Carpenter authored
"err" is zero here, there is no need to check again. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 09 Nov, 2013 1 commit
-
-
J. Bruce Fields authored
We want to do this elsewhere as well. Also catch any attempts to use it for directories (where this ordering would conflict with ancestor-first directory ordering in lock_rename). Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Acked-by:
Jeff Layton <jlayton@redhat.com> Acked-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 17 Aug, 2013 1 commit
-
-
Theodore Ts'o authored
When we read in an extent tree leaf block from disk, arrange to have all of its entries cached. In nearly all cases the in-memory representation will be more compact than the on-disk representation in the buffer cache, and it allows us to get the information without having to traverse the extent tree for successive extents. Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Reviewed-by:
Zheng Liu <wenqing.lz@taobao.com>
-
- 17 Jun, 2013 1 commit
-
-
Jon Ernst authored
This patch removed several unused variables. Signed-off-by:
Jon Ernst <jonernst07@gmx.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 19 Apr, 2013 1 commit
-
-
Darrick J. Wong authored
Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 12 Apr, 2013 1 commit
-
-
Dmitry Monakhov authored
- grab_cache_page_write_begin() may not wait on page's writeback since (1d1d1a76). But it is still reasonable to wait on page's writeback here in order to be on the safe side. - Fix miss typo: pass 'length' instead of 'end' to __block_write_begin() https://bugzilla.kernel.org/show_bug.cgi?id=56241 TESTCASE: git://oss.sgi.com/xfs/cmds/xfstests.git MKFS_OPTIONS="-b1024" ; ./check ext4/304 Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Akira Fujita <a-fujita.rs.jp.nec.com>
-
- 10 Apr, 2013 1 commit
-
-
Dmitri Monakho authored
This patch should fix sparse complains about shadow declatations. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 08 Apr, 2013 1 commit
-
-
Dr. Tilmann Bubeck authored
Add a new ioctl, EXT4_IOC_SWAP_BOOT which swaps i_blocks and associated attributes (like i_blocks, i_size, i_flags, ...) from the specified inode with inode EXT4_BOOT_LOADER_INO (#5). This is typically used to store a boot loader in a secure part of the filesystem, where it can't be changed by a normal user by accident. The data blocks of the previous boot loader will be associated with the given inode. This usercode program is a simple example of the usage: int main(int argc, char *argv[]) { int fd; int err; if ( argc != 2 ) { printf("usage: ext4-swap-boot-inode FILE-TO-SWAP\n"); exit(1); } fd = open(argv[1], O_WRONLY); if ( fd < 0 ) { perror("open"); exit(1); } err = ioctl(fd, EXT4_IOC_SWAP_BOOT); if ( err < 0 ) { perror("ioctl"); exit(1); } close(fd); exit(0); } [ Modified by Theodore Ts'o to fix a number of bugs in the original code.] Signed-off-by:
Dr. Tilmann Bubeck <t.bubeck@reinform.de> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 18 Mar, 2013 1 commit
-
-
Dmitry Monakhov authored
Regression was introduced by following commit 8c854473 TESTCASE (git://oss.sgi.com/xfs/cmds/xfstests.git ): #while true;do ./check 301 || break ;done Also fix potential memory leakage in get_ext_path() once ext4_ext_find_extent() have failed. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 04 Mar, 2013 1 commit
-
-
Dmitry Monakhov authored
mext_replace_branches() will change inode's extents layout so we have to drop corresponding cache. TESTCASE: 301'th xfstest was not yet accepted to official xfstest's branch and can be found here: https://github.com/dmonakhov/xfstests/commit/7b7efeee30a41109201e2040034e71db9b66ddc0 Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Reviewed-by:
Jan Kara <jack@suse.cz>
-
- 23 Feb, 2013 1 commit
-
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 18 Feb, 2013 1 commit
-
-
Zheng Liu authored
Single extent cache could be removed because we have extent status tree as a extent cache, and it would be better. Signed-off-by:
Zheng Liu <wenqing.lz@taobao.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: Jan kara <jack@suse.cz>
-
- 09 Feb, 2013 1 commit
-
-
Theodore Ts'o authored
So we can better understand what bits of ext4 are responsible for long-running jbd2 handles, use jbd2__journal_start() so we can pass context information for logging purposes. The recommended way for finding the longer-running handles is: T=/sys/kernel/debug/tracing EVENT=$T/events/jbd2/jbd2_handle_stats echo "interval > 5" > $EVENT/filter echo 1 > $EVENT/enable ./run-my-fs-benchmark cat $T/trace > /tmp/problem-handles This will list handles that were active for longer than 20ms. Having longer-running handles is bad, because a commit started at the wrong time could stall for those 20+ milliseconds, which could delay an fsync() or an O_SYNC operation. Here is an example line from the trace file describing a handle which lived on for 311 jiffies, or over 1.2 seconds: postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32 tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1 dirtied_blocks 0 Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 02 Feb, 2013 1 commit
-
-
Akria Fujita authored
Commit 2147b1a6 resulted in a new smatch warning: > fs/ext4/move_extent.c:693 mext_replace_branches() > warn: variable dereferenced before check 'dext' (see line 683) Fix this by adding a check to make sure dext is non-NULL before we derefrence it. Signed-off-by:
Akria Fujita <a-fujita@rs.jp.nec.com> [ modified by tytso to make sure an ext4_error is called ] Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 28 Nov, 2012 1 commit
-
-
Theodore Ts'o authored
Previously, ext4_extents.h was being included at the end of ext4.h, which was bad for a number of reasons: (a) it was not being included in the expected place, and (b) it caused the header to be included multiple times. There were #ifdef's to prevent this from causing any problems, but it still was unnecessary. By moving the function declarations that were in ext4_extents.h to ext4.h, which is standard practice for where the function declarations for the rest of ext4.h can be found, we can remove ext4_extents.h from being included in ext4.h at all, and then we can only include ext4_extents.h where it is needed in ext4's source files. It should be possible to move a few more things into ext4.h, and further reduce the number of source files that need to #include ext4_extents.h, but that's a cleanup for another day. Reported-by:
Sachin Kamat <sachin.kamat@linaro.org> Reported-by:
Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 29 Sep, 2012 1 commit
-
-
Dmitry Monakhov authored
Inode's block defrag and ext4_change_inode_journal_flag() may affect nonlocked DIO reads result, so proper synchronization required. - Add missed inode_dio_wait() calls where appropriate - Check inode state under extra i_dio_count reference. Reviewed-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 27 Sep, 2012 2 commits
-
-
Wei Yongjun authored
Convert cpu_to_leXX(leXX_to_cpu(E1) + E2) to use leXX_add_cpu(). dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch ) Signed-off-by:
Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
Wang Sheng-Hui authored
In the check code above, if orig_start != donor_start, we would return -EINVAL. So here, orig_start should be equal with donor_start. Remove the redundant check here. Signed-off-by:
Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 26 Sep, 2012 4 commits
-
-
Dmitry Monakhov authored
Uninitialized extent may became initialized(parallel writeback task) at any moment after we drop i_data_sem, so we have to recheck extent's state after we hold page's lock and i_data_sem. If we about to change page's mapping we must hold page's lock in order to serialize other users. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
Dmitry Monakhov authored
Non-full list of bugs: 1) uninitialized extent optimization does not hold page's lock, and simply replace brunches after that writeback code goes crazy because block mapping changed under it's feets kernel BUG at fs/ext4/inode.c:1434! ( 288'th xfstress) 2) uninitialized extent may became initialized right after we drop i_data_sem, so extent state must be rechecked 3) Locked pages goes uptodate via following sequence: ->readpage(page); lock_page(page); use_that_page(page) But after readpage() one may invalidate it because it is uptodate and unlocked (reclaimer does that) As result kernel bug at include/linux/buffer_head.c:133! 4) We call write_begin() with already opened stansaction which result in following deadlock: ->move_extent_per_page() ->ext4_journal_start()-> hold journal transaction ->write_begin() ->ext4_da_write_begin() ->ext4_nonda_switch() ->writeback_inodes_sb_if_idle() --> will wait for journal_stop() 5) try_to_release_page() may fail and it does fail if one of page's bh was pinned by journal 6) If we about to change page's mapping we MUST hold it's lock during entire remapping procedure, this is true for both pages(original and donor one) Fixes: - Avoid (1) and (2) simply by temproraly drop uninitialized extent handling optimization, this will be reimplemented later. - Fix (3) by manually forcing page to uptodate state w/o dropping it's lock - Fix (4) by rearranging existing locking: from: journal_start(); ->write_begin to: write_begin(); journal_extend() - Fix (5) simply by checking retvalue - Fix (6) by locking both (original and donor one) pages during extent swap with help of mext_page_double_lock() Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
Dmitry Monakhov authored
Proper block swap for inodes with full journaling enabled is truly non obvious task. In order to be on a safe side let's explicitly disable it for now. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
-
Dmitry Monakhov authored
- Remove usless checks, because it is too late to check that inode != NULL at the moment it was referenced several times. - Double lock routines looks very ugly and locking ordering relays on order of i_ino, but other kernel code rely on order of pointers. Let's make them simple and clean. - check that inodes belongs to the same SB as soon as possible. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 09 Sep, 2011 1 commit
-
-
Aditya Kali authored
This patch adds some tracepoints in ext4/extents.c and updates a tracepoint in ext4/inode.c. Tested: Built and ran the kernel and verified that these tracepoints work. Also ran xfstests. Signed-off-by:
Aditya Kali <adityakali@google.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 06 Jun, 2011 1 commit
-
-
Lukas Czerner authored
Kazuya Mio reported that he was able to hit BUG_ON(next == lblock) in ext4_ext_put_gap_in_cache() while creating a sparse file in extent format and fill the tail of file up to its end. We will hit the BUG_ON when we write the last block (2^32-1) into the sparse file. The root cause of the problem lies in the fact that we specifically set s_maxbytes so that block at s_maxbytes fit into on-disk extent format, which is 32 bit long. However, we are not storing start and end block number, but rather start block number and length in blocks. It means that in order to cover extent from 0 to EXT_MAX_BLOCK we need EXT_MAX_BLOCK+1 to fit into len (because we counting block 0 as well) - and it does not. The only way to fix it without changing the meaning of the struct ext4_extent members is, as Kazuya Mio suggested, to lower s_maxbytes by one fs block so we can cover the whole extent we can get by the on-disk extent format. Also in many places EXT_MAX_BLOCK is used as length instead of maximum logical block number as the name suggests, it is all a bit messy. So this commit renames it to EXT_MAX_BLOCKS and change its usage in some places to actually be maximum number of blocks in the extent. The bug which this commit fixes can be reproduced as follows: dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-2)) sync dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-1)) Reported-by:
Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by:
Lukas Czerner <lczerner@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 18 May, 2011 1 commit
-
-
Darrick J. Wong authored
wait_on_page_writeback already checks the writeback bit, so callers of it needn't do that test. Signed-off-by:
Darrick J. Wong <djwong@us.ibm.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 28 Oct, 2010 1 commit
-
-
Theodore Ts'o authored
Cleanup namespace leaks from fs/ext4 and the inline trivial functions ext4_{ext,idx}_pblock() and ext4_{ext,idx}_store_pblock() since the code size actually shrinks when we make these functions inline, they're so trivial. Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 27 Jul, 2010 1 commit
-
-
Theodore Ts'o authored
Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 03 Jun, 2010 1 commit
-
-
Theodore Ts'o authored
Dan Roseberg has reported a problem with the MOVE_EXT ioctl. If the donor file is an append-only file, we should not allow the operation to proceed, lest we end up overwriting the contents of an append-only file. Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: Dan Rosenberg <dan.j.rosenberg@gmail.com>
-
- 17 May, 2010 2 commits
-
-
Dmitry Monakhov authored
At several places we modify EXT4_I(inode)->i_flags without holding i_mutex (ext4_do_update_inode, ...). These modifications are racy and we can lose updates to i_flags. So convert handling of i_flags to use bitops which are atomic. https://bugzilla.kernel.org/show_bug.cgi?id=15792 Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
EXT4_ERROR_INODE() tends to provide better error information and in a more consistent format. Some errors were not even identifying the inode or directory which was corrupted, which made them not very useful. Addresses-Google-Bug: #2507977 Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 11 May, 2010 1 commit
-
-
Steven Liu authored
Making sure ee_block is initialized to zero to prevent gcc from kvetching. It's harmless (although it's not obvious that it's harmless) from code inspection: fs/ext4/move_extent.c:478: warning: 'start_ext.ee_block' may be used uninitialized in this function Thanks to Stefan Richter for first bringing this to the attention of linux-ext4@vger.kernel.org. Signed-off-by:
LiuQi <lingjiujianke@gmail.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
- 30 Mar, 2010 1 commit
-
-
Tejun Heo authored
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include bloc...
-
- 04 Mar, 2010 3 commits
-
-
Akira Fujita authored
a) Fix sparse warning in ext4_ioctl() b) Remove unneeded variable in mext_leaf_block() c) Fix spelling typo in mext_check_arguments() Signed-off-by:
Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
Akira Fujita authored
If EXT4_IOC_MOVE_EXT ioctl is called with NULL donor_fd, fget() in ext4_ioctl() gets inappropriate file structure for donor; so we need to do this check earlier, before calling double_down_write_data_sem(). Signed-off-by:
Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
Akira Fujita authored
If the leaf node has 2 extent space or fewer and EXT4_IOC_MOVE_EXT ioctl is called with the file offset where after the 2nd extent covers, mext_insert_across_blocks() always tries to insert extent into the first extent. As a result, the file gets corrupted because of wrong extent order. The patch fixes this problem. Signed-off-by:
Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 15 Feb, 2010 1 commit
-
-
Eric Sandeen authored
Just a pet peeve of mine; we had a mishash of calls with either __func__ or "function_name" and the latter tends to get out of sync. I think it's easier to just hide the __func__ in a macro, and it'll be consistent from then on. Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-