- 04 Nov, 2014 13 commits
-
-
Gu Zheng authored
Use clear_inode_flag to replace the redundant cond_clear_inode_flag. Signed-off-by:
Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch introduces do_make_empty_dir to mitigate code redundancy for inline_dentry. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch introduces f2fs_dentry_ptr structure for the use of a function parameter in inline_dentry operations. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch introduces a core function, f2fs_fill_dentries, to remove redundant code in f2fs_readdir and f2fs_read_inline_dir. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch adds status information for inline_dentry inodes. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
Previously, init_inode_metadata did not hold the parent directory's inode page, so f2fs_init_acl could grab that page without any problem. With inline_dentry, however, that page is already grabbed during f2fs_add_link, so we can fall into a deadlock condition like the one below.

INFO: task mknod:11006 blocked for more than 120 seconds.
Tainted: G OE 3.17.0-rc1+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mknod D ffff88003fc94580 0 11006 11004 0x00000000
ffff880007717b10 0000000000000002 ffff88003c323220 ffff880007717fd8
0000000000014580 0000000000014580 ffff88003daecb30 ffff88003c323220
ffff88003fc94e80 ffff88003ffbb4e8 ffff880007717ba0 0000000000000002
Call Trace:
[<ffffffff8173dc40>] ? bit_wait+0x50/0x50
[<ffffffff8173d4cd>] io_schedule+0x9d/0x130
[<ffffffff8173dc6c>] bit_wait_io+0x2c/0x50
[<ffffffff8173da3b>] __wait_on_bit_lock+0x4b/0xb0
[<ffffffff811640a7>] __lock_page+0x67/0x70
[<ffffffff810acf50>] ? autoremove_wake_function+0x40/0x40
[<ffffffff811652cc>] pagecache_get_page+0x14c/0x1e0
[<ffffffffa029afa9>] get_node_page+0x59/0x130 [f2fs]
[<ffffffffa02a63ad>] read_all_xattrs+0x24d/0x430 [f2fs]
[<ffffffffa02a6ca2>] f2fs_getxattr+0x52/0xe0 [f2fs]
[<ffffffffa02a7481>] f2fs_get_acl+0x41/0x2d0 [f2fs]
[<ffffffff8122d847>] get_acl+0x47/0x70
[<ffffffff8122db5a>] posix_acl_create+0x5a/0x150
[<ffffffffa02a7759>] f2fs_init_acl+0x29/0xcb [f2fs]
[<ffffffffa0286a8d>] init_inode_metadata+0x5d/0x340 [f2fs]
[<ffffffffa029253a>] f2fs_add_inline_entry+0x12a/0x2e0 [f2fs]
[<ffffffffa0286ea5>] __f2fs_add_link+0x45/0x4a0 [f2fs]
[<ffffffffa028b5b6>] ? f2fs_new_inode+0x146/0x220 [f2fs]
[<ffffffffa028b816>] f2fs_mknod+0x86/0xf0 [f2fs]
[<ffffffff811e3ec1>] vfs_mknod+0xe1/0x160
[<ffffffff811e4b26>] SyS_mknod+0x1f6/0x200
[<ffffffff81741d7f>] tracesys+0xe1/0xe6

Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch removes redundant copied code in find_in_inline_dir. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch reuses the existing room_for_filename for inline dentry operations. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
Add functions to implement inline dir init/lookup/insert/delete/convert ops. Signed-off-by:
Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: remove needless reserved area copy, pointed out by Dan Carpenter] Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
This patch exports some dir operations for inline dir, and additionally factors f2fs_drop_nlink out of f2fs_delete_entry so that the inline dir functions can reuse it. Signed-off-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
This patch defines macro/inline dentry structure, and adds some helpers for inline dir infrastructure. Signed-off-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
If the user truncates a file's data, we should truncate its in-memory pages too. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch lets in-memory pages be clean all the time. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 07 Oct, 2014 2 commits
-
-
Jaegeuk Kim authored
This patch adds support for volatile writes, which keep data pages in memory until f2fs_evict_inode is called by iput.

For instance, we can use this feature for the sqlite database as follows. While supporting atomic writes for the main database file, we can keep its journal data temporarily in the page cache by the following sequence.

1. open -> ioctl(F2FS_IOC_START_VOLATILE_WRITE);
2. writes : keep all the data in the page cache.
3. flush to the database file with atomic writes
   a. ioctl(F2FS_IOC_START_ATOMIC_WRITE);
   b. writes
   c. ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
4. close -> drop the cached data

Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
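
The four steps in the volatile-write message can be sketched from user space roughly as follows. This is a minimal Python sketch, not code from the patch: the function name, the `reqs` dictionary and its keys are hypothetical, and the numeric ioctl request codes are deliberately not filled in; they have to be taken from the kernel's f2fs header. It also assumes the volatile ioctl is issued on the journal file and the atomic ioctls on the database file, which is how the message reads.

```python
import fcntl
import os


def commit_with_volatile_journal(db_fd, journal_fd, journal_data, db_chunks,
                                 reqs, ioctl=fcntl.ioctl):
    """Follow steps 1-4 of the volatile-write commit message.

    reqs is a hypothetical dict holding the numeric request codes under the
    keys "start_volatile", "start_atomic" and "commit_atomic".
    db_chunks is a list of (offset, bytes) pairs for the database file.
    """
    ioctl(journal_fd, reqs["start_volatile"])   # 1. mark the journal volatile
    os.write(journal_fd, journal_data)          # 2. data stays in the page cache
    ioctl(db_fd, reqs["start_atomic"])          # 3a. open the atomic section
    for offset, data in db_chunks:              # 3b. database writes
        os.pwrite(db_fd, data, offset)
    ioctl(db_fd, reqs["commit_atomic"])         # 3c. commit all or nothing
    os.close(journal_fd)                        # 4. close; cached journal is dropped
```

The `ioctl` parameter is only there so the sequence can be exercised with a stand-in on a filesystem other than f2fs.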
-
Jaegeuk Kim authored
This patch introduces a very limited functionality for atomic write support. In order to support atomic writes, this patch adds two ioctls:
 o F2FS_IOC_START_ATOMIC_WRITE
 o F2FS_IOC_COMMIT_ATOMIC_WRITE

The database engine should be aware of the following sequence.
1. open -> ioctl(F2FS_IOC_START_ATOMIC_WRITE);
2. writes : all the written data will be treated as atomic pages.
3. commit -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
   : this flushes all the data blocks to the disk, which the f2fs recovery procedure will expose as all or nothing.
4. repeat from #2.

The IO patterns should be:

    ,- START_ATOMIC_WRITE               ,- COMMIT_ATOMIC_WRITE
 CP | D D D D D D | FSYNC | D D D D | FSYNC ...
                      `- COMMIT_ATOMIC_WRITE

Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
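
As an illustration of the sequence a database engine would follow, here is a small Python sketch. The helper name `atomic_update` and its parameters are hypothetical; the two request codes stand for F2FS_IOC_START_ATOMIC_WRITE and F2FS_IOC_COMMIT_ATOMIC_WRITE and must be supplied from the kernel's f2fs header, since their numeric values are not given in the commit message.

```python
import fcntl
import os


def atomic_update(fd, chunks, start_req, commit_req, ioctl=fcntl.ioctl):
    """Write `chunks`, a list of (offset, bytes) pairs, as one atomic unit.

    start_req and commit_req are the numeric ioctl request codes; they are
    intentionally not hard-coded in this sketch.
    """
    ioctl(fd, start_req)             # step 1: start the atomic section
    for offset, data in chunks:      # step 2: writes are treated as atomic pages
        os.pwrite(fd, data, offset)
    ioctl(fd, commit_req)            # step 3: flush the blocks, all or nothing
```

Step 4 of the message corresponds to calling the helper again for the next transaction on the same descriptor.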
-
- 30 Sep, 2014 4 commits
-
-
Jaegeuk Kim authored
This patch relocates f2fs_unlock_op in every directory operation so that it is called after any error has been processed. Otherwise, a checkpoint can be entered with valid node ids that have no dentry when -ENOSPC occurs. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
Previously, f2fs tried to reorganize the dirty nat entries into multiple sets according to their nid ranges. This can improve the flushing of nat pages; however, if there are a lot of cached nat entries, it becomes a bottleneck. This patch introduces a new set management flow by removing the dirty nat list and adding a series of set operations that run when a nat entry becomes dirty. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch introduces FITRIM in f2fs_ioctl. In this case, f2fs will issue as many small discards and prefree discards as possible for the given area. Reviewed-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
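
For readers who want to trigger FITRIM by hand, the following Python sketch shows the generic Linux call that the fstrim utility uses; nothing in it is specific to f2fs. The request value assumes the common ioctl encoding used on x86 and arm (some architectures encode ioctl numbers differently), and the helper name `fitrim` is hypothetical. The call normally needs root and a descriptor opened on the mounted filesystem.

```python
import fcntl
import struct

# FITRIM is _IOWR('X', 121, struct fstrim_range) in <linux/fs.h>.
# With the generic ioctl encoding this evaluates to 0xC0185879.
FITRIM = (3 << 30) | (24 << 16) | (ord("X") << 8) | 121


def fitrim(fd, start=0, length=2**64 - 1, minlen=0, ioctl=fcntl.ioctl):
    """Ask the filesystem behind `fd` to discard free space in a byte range.

    Returns the number of bytes the kernel reports as trimmed.
    """
    # struct fstrim_range { __u64 start; __u64 len; __u64 minlen; };
    buf = bytearray(struct.pack("=QQQ", start, length, minlen))
    ioctl(fd, FITRIM, buf)           # the kernel writes the result back into len
    return struct.unpack("=QQQ", buf)[1]
```

Passing the whole address range with minlen 0 asks for every discard the filesystem is willing to issue, which matches the "as many as possible" wording above.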
-
Jaegeuk Kim authored
This patch adds a new data structure to control checkpoint parameters. Currently, it represents the reason for the checkpoint, such as is_umount or normal sync. Reviewed-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 23 Sep, 2014 3 commits
-
-
Jaegeuk Kim authored
If the same data is updated multiple times, we don't need to redo all of the operations. Let's just update the latest one. Reviewed-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch revisits all of the recovery information used during f2fs_sync_file.

In this patch, there are three pieces of information used to make a decision.

a) IS_CHECKPOINTED,   /* is it checkpointed before? */
b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */
c) HAS_LAST_FSYNC,    /* has the latest node fsync mark? */

And, the scenarios for our rule are based on:

[Term] F: fsync_mark, D: dentry_mark

1. inode(x) | CP | inode(x) | dnode(F)
2. inode(x) | CP | inode(F) | dnode(F)
3. inode(x) | CP | dnode(F) | inode(x) | inode(F)
4. inode(x) | CP | dnode(F) | inode(F)
5. CP | inode(x) | dnode(F) | inode(DF)
6. CP | inode(DF) | dnode(F)
7. CP | dnode(F) | inode(DF)
8. CP | dnode(F) | inode(x) | inode(DF)

For example, #3, the three conditions should be changed as follows.

   inode(x) | CP | dnode(F) | inode(x) | inode(F)
a)    x       o      o          o          o
b)    x       x      x          x          o
c)    x       o      o          x          o

If f2fs_sync_file stops ------------^,
 it should write inode(F) ---------------------^

So, the need_inode_block_update should return true, since
 c) get_nat_flag(e, HAS_LAST_FSYNC), is false.

For example, #8,
CP | alloc | dnode(F) | inode(x) | inode(DF)
a) o x x x x
b) x x x o
c) o o x o

If f2fs_sync_file stops -------^,
 it should write inode(DF) --------------^

Note that, the roll-forward policy should follow this rule, which means, if there are any missing blocks, we don't need to recover that inode.

Signed-off-by:
Huang Ying <ying.huang@intel.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
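
To make the three-flag rule easier to follow, here is a small Python model of the decision. It is a sketch under a stated assumption, not the kernel function: the commit message only says that need_inode_block_update returns true when c) is false, so the way a) and b) are combined below is this sketch's guess. The `NatFlags` class is a hypothetical stand-in for the flags kept on a cached NAT entry.

```python
from dataclasses import dataclass


@dataclass
class NatFlags:
    is_checkpointed: bool      # a) IS_CHECKPOINTED
    has_fsynced_inode: bool    # b) HAS_FSYNCED_INODE
    has_last_fsync: bool       # c) HAS_LAST_FSYNC


def need_inode_block_update(e):
    """Return True when fsync should also write the inode block.

    ASSUMPTION: the inode write can be skipped only if the latest node has
    the fsync mark (c) and the inode is checkpointed (a) or was already
    fsynced (b). Only the "c is false" case comes from the commit message.
    """
    if e is None:              # no cached entry: stay on the safe side
        return True
    return not (e.has_last_fsync and
                (e.is_checkpointed or e.has_fsynced_inode))
```

For scenario #3 at the point where f2fs_sync_file stops, the flags are a) set, b) clear, c) clear, so the model returns True and inode(F) has to be written; once inode(F) is on disk all three are set and the model returns False.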
-
Jaegeuk Kim authored
Previously, all the dnode pages had to be read during the roll-forward recovery. Even worse, the whole chain was traversed twice. This patch removes those redundant and costly read operations by using the page cache of meta_inode together with the readahead function. Reviewed-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 16 Sep, 2014 2 commits
-
-
Jaegeuk Kim authored
If the user wrote F2FS_IPU_FSYNC:4 in /sys/fs/f2fs/ipu_policy, only f2fs_sync_file starts to try in-place-updates. If the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it keeps the out-of-place manner. Otherwise, it triggers in-place-updates.

This may be used by storage showing very high random write performance. For example, it can be used when

  Seq. writes (Data) + wait + Seq. writes (Node)

is much slower than

  Rand. writes (Data)

Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
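
The threshold rule in this message can be written down as a tiny Python predicate. This is only a model of the description: the function and parameter names are hypothetical, and how the policy value is stored in sysfs is left out on purpose.

```python
def fsync_uses_in_place_update(ipu_fsync_enabled, dirty_pages, min_fsync_blocks):
    """Model the fsync-time choice between in-place and out-of-place writes.

    ipu_fsync_enabled: whether F2FS_IPU_FSYNC was selected in ipu_policy.
    dirty_pages:       number of dirty pages of the file being fsynced.
    min_fsync_blocks:  threshold read from the min_fsync_blocks sysfs entry.
    """
    if not ipu_fsync_enabled:
        return False                          # policy not selected
    # Over the threshold: keep the normal out-of-place path.
    # At or below it: rewrite the data blocks in place.
    return dirty_pages <= min_fsync_blocks
```

With a threshold of 8, for example, an fsync covering 3 dirty pages would be updated in place, while one covering 20 dirty pages would not.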
-
Jaegeuk Kim authored
Previously f2fs only counted dirty dentry pages, but there is no reason not to expand the scope. This patch changes the names used for managing dirty pages and also counts dirty pages in each inode info. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 09 Sep, 2014 4 commits
-
-
Gu Zheng authored
We use flush cmd control to collect many flush cmds and flush them together. In this case, we use two lists to manage the flush cmds (collect and dispatch), and one spin lock is used to protect them. In fact, the lock-less list (llist) is very suitable for this case, and we use it to simplify this routine.
-
v2:
-use llist_for_each_entry_safe to fix possible use-after-free issue.
-remove the unused field from struct flush_cmd.
Thanks for Yu's suggestion.
-
Signed-off-by:
Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
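
The collect-and-dispatch pattern described in this message can be modelled in a few lines of Python. The model is single-threaded, so it says nothing about lock-freedom; it only mirrors the shape of the kernel llist operations named in the comments (llist_add, llist_del_all, llist_reverse_order). The class name and the `issue_flush` callback are hypothetical.

```python
class FlushQueue:
    """Single-threaded model of batching flush commands."""

    def __init__(self, issue_flush):
        self._head = []                   # newest command first
        self._issue_flush = issue_flush   # hypothetical: performs one device flush

    def submit(self, cmd):
        self._head.insert(0, cmd)         # like llist_add(): push at the head

    def dispatch(self):
        # like llist_del_all(): detach the whole list in one step
        batch, self._head = self._head, []
        batch.reverse()                   # like llist_reverse_order(): submit order
        if not batch:
            return []
        ret = self._issue_flush()         # a single flush serves every waiter
        return [(cmd, ret) for cmd in batch]
```

Because the dispatcher takes the entire list at once, there is no second list to maintain and no lock to hold while commands are being completed.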
-
Chao Yu authored
In commit aec71382 ("f2fs: refactor flush_nat_entries codes for reducing NAT writes"), we described the issue as below:

"Although building NAT journal in cursum reduce the read/write work for NAT block, but previous design leave us lower performance when write checkpoint frequently for these cases:
1. if journal in cursum has already full, it's a bit of waste that we flush all nat entries to page for persistence, but not to cache any entries.
2. if journal in cursum is not full, we fill nat entries to journal util journal is full, then flush the left dirty entries to disk without merge journaled entries, so these journaled entries may be flushed to disk at next checkpoint but lost chance to flushed last time."

Actually, we have the same problem in using the SIT journal area.

In this patch, first we update the sit journal with as many dirty entries as possible. Second, if there is no space in the sit journal, we remove all entries from the journal and walk through the whole dirty entry bitmap of sit, accounting dirty sit entries located in the same SIT block to a sit entry set. All entry sets are linked to the list sit_entry_set in sm_info, sorted in ascending order by the count of entries in the set. Later we flush the entries of the sets which have the fewest entries into the journal, as many as we can, and then flush the dense sets with merged entries to disk.

In this way we can use the sit journal area more effectively, and we also reduce SIT updates, resulting in better performance and a longer lifetime for the flash device.

In my testing environment, this patch clearly reduces SIT block updates.

virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
            sit page num    cp count    sit pages/cp
based       2006.50         1349.75     1.486
patched     1566.25         1463.25     1.070

Our latency of merging op is small when handling a great number of dirty SIT entries in flush_sit_entries:
latency(ns)    dirty sit count
36038          2151
49168          2123
37174          2232

Signed-off-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
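
The set-based flush order described in this message can be sketched as a small Python planning function. This is a model for illustration, not the kernel code: the function name and parameters are hypothetical, and it assumes that a set is journaled only as a whole and that once one set no longer fits, every later (denser) set goes to disk.

```python
from collections import defaultdict


def plan_sit_flush(dirty_segnos, entries_per_block, journal_slots):
    """Split dirty SIT entries between the journal and on-disk SIT blocks.

    Returns (journal_segnos, disk_blocks); disk_blocks maps a SIT block
    index to the segment numbers rewritten in that block.
    """
    # Group dirty entries that live in the same SIT block into one set.
    sets = defaultdict(list)
    for segno in sorted(dirty_segnos):
        sets[segno // entries_per_block].append(segno)

    # Sparse sets first: ascending by the number of entries in the set.
    ordered = sorted(sets.items(), key=lambda kv: (len(kv[1]), kv[0]))

    journal, disk = [], {}
    to_journal = True
    for block, segnos in ordered:
        # ASSUMPTION: whole sets only; stop journaling at the first misfit.
        if to_journal and len(journal) + len(segnos) > journal_slots:
            to_journal = False
        if to_journal:
            journal.extend(segnos)
        else:
            disk[block] = segnos
    return journal, disk
```

With 4 entries per block, 3 journal slots and dirty segments 0, 1, 2, 3, 5, 9 and 10, the two sparse sets (segment 5, and segments 9 and 10) fit in the journal and only the dense block 0 is rewritten, so seven dirty entries cost one SIT block write instead of three.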
-
Jaegeuk Kim authored
If any f2fs_bug_on is triggered, fsck.f2fs is needed. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch adds sbi->need_fsck to conduct fsck.f2fs later. This flag can only be removed by fsck.f2fs. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 04 Sep, 2014 1 commit
-
-
Jaegeuk Kim authored
This patch adds three inline functions to clean up dirty casting codes. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 21 Aug, 2014 5 commits
-
-
Jaegeuk Kim authored
I think we need to let the dirty node pages remain in the page cache instead of rewriting them in their places. So, after a successful recovery, write_checkpoint will flush all of them through the normal write path. Through this, we can avoid potential error cases in terms of block allocation. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
The init_inode_metadata calls truncate_blocks when an error occurs. The callers hold f2fs_lock_op, so we should not take it again in truncate_blocks. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch adds WARN_ON when f2fs_bug_on is disabled, so that kernel messages can still be seen. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch adds f2fs_cp_error for readability. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
The generic_shutdown_super calls sync_filesystem, evict_inode, and then f2fs_put_super. In f2fs_evict_inode, we leave some dirty inode information behind, so we should release it in f2fs_put_super. Reviewed-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 19 Aug, 2014 4 commits
-
-
Jaegeuk Kim authored
This patch fixes the code so that xattr recovery is not skipped, and corrects the inline xattr/data recovery order. Reviewed-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This patch adds parentheses to make the condition check clear. It also changes the return type to better convey its meaning. Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
If mkwrite is called on an inode having inline_data, it can overwrite the data index space as NEW_ADDR. (e.g., the first 4 bytes are coincidentally zero) Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
arter97 authored
Fix typo and some grammatical errors. The words "filesystem" and "readahead" are being used without the space treewide. Signed-off-by:
Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 02 Aug, 2014 1 commit
-
-
Chao Yu authored
When we recover the data of an inode in the roll-forward procedure and the inode has both inline data and inline xattr, we may skip recovering the inline xattr if we recover the inline data from the node page first. This patch fixes the problem that we lose inline xattr data in the above scenario. Signed-off-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 31 Jul, 2014 1 commit
-
-
Chao Yu authored
We do not need to block on ->node_write among different node page writers, e.g. fsync/flush, unless we have a node page writer from write_checkpoint. So it's better to use rw_semaphore instead of mutex type for ->node_write to improve performance. Signed-off-by:
Chao Yu <chao2.yu@samsung.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
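
The difference between a mutex and a reader/writer semaphore for ->node_write can be illustrated with a tiny Python lock. This is a user-space sketch with hypothetical names, not the kernel rw_semaphore: fsync/flush writers would take the shared side and write_checkpoint the exclusive side. It makes no fairness guarantees, so an exclusive waiter could starve under a steady stream of shared holders.

```python
import threading


class NodeWriteLock:
    """Many shared holders at once, or exactly one exclusive holder."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self, timeout=None):      # role of down_read()
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_shared(self):                    # role of up_read()
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_exclusive(self, timeout=None):   # role of down_write()
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_exclusive(self):                 # role of up_write()
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

Two fsync-style callers can hold the shared side at the same time, which a plain mutex would serialize; a checkpoint-style caller waits until both have released it.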
-