- 23 Aug, 2023 2 commits
-
-
Chao Yu authored
In error path of f2fs_submit_page_read(), it missed to call iostat_update_and_unbind_ctx() and free bio_post_read_ctx, fix it. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
In sanity_check_{compress_,}inode(), it doesn't need to set SBI_NEED_FSCK in each error case, instead, we can set the flag in do_read_inode() only once when sanity_check_inode() fails. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 21 Aug, 2023 1 commit
-
-
Jaegeuk Kim authored
====================================================== WARNING: possible circular locking dependency detected 6.5.0-rc5-syzkaller-00353-gae545c32 #0 Not tainted ------------------------------------------------------ syz-executor273/5027 is trying to acquire lock: ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2133 [inline] ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644 but task is already holding lock: ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_down_read fs/f2fs/f2fs.h:2108 [inline] ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_add_dentry+0x92/0x230 fs/f2fs/dir.c:783 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&fi->i_xattr_sem){.+.+}-{3:3}: down_read+0x9c/0x470 kernel/locking/rwsem.c:1520 f2fs_down_read fs/f2fs/f2fs.h:2108 [inline] f2fs_getxattr+0xb1e/0x12c0 fs/f2fs/xattr.c:532 __f2fs_get_acl+0x5a/0x900 fs/f2fs/acl.c:179 f2fs_acl_create fs/f2fs/acl.c:377 [inline] f2fs_init_acl+0x15c/0xb30 fs/f2fs/acl.c:420 f2fs_init_inode_metadata+0x159/0x1290 fs/f2fs/dir.c:558 f2fs_add_regular_entry+0x79e/0xb90 fs/f2fs/dir.c:740 f2fs_add_dentry+0x1de/0x230 fs/f2fs/dir.c:788 f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827 f2fs_add_link fs/f2fs/f2fs.h:3554 [inline] f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781 vfs_mkdir+0x532/0x7e0 fs/namei.c:4117 do_mkdirat+0x2a9/0x330 fs/namei.c:4140 __do_sys_mkdir fs/namei.c:4160 [inline] __se_sys_mkdir fs/namei.c:4158 [inline] __x64_sys_mkdir+0xf2/0x140 fs/namei.c:4158 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #0 (&fi->i_sem){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144 lock_acquire kernel/locking/lockdep.c:5761 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726 down_write+0x93/0x200 kernel/locking/rwsem.c:1573 f2fs_down_write fs/f2fs/f2fs.h:2133 [inline] f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644 f2fs_add_dentry+0xa6/0x230 fs/f2fs/dir.c:784 f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827 f2fs_add_link fs/f2fs/f2fs.h:3554 [inline] f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781 vfs_mkdir+0x532/0x7e0 fs/namei.c:4117 ovl_do_mkdir fs/overlayfs/overlayfs.h:196 [inline] ovl_mkdir_real+0xb5/0x370 fs/overlayfs/dir.c:146 ovl_workdir_create+0x3de/0x820 fs/overlayfs/super.c:309 ovl_make_workdir fs/overlayfs/super.c:711 [inline] ovl_get_workdir fs/overlayfs/super.c:864 [inline] ovl_fill_super+0xdab/0x6180 fs/overlayfs/super.c:1400 vfs_get_super+0xf9/0x290 fs/super.c:1152 vfs_get_tree+0x88/0x350 fs/super.c:1519 do_new_mount fs/namespace.c:3335 [inline] path_mount+0x1492/0x1ed0 fs/namespace.c:3662 do_mount fs/namespace.c:3675 [inline] __do_sys_mount fs/namespace.c:3884 [inline] __se_sys_mount fs/namespace.c:3861 [inline] __x64_sys_mount+0x293/0x310 fs/namespace.c:3861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(&fi->i_xattr_sem); lock(&fi->i_sem); lock(&fi->i_xattr_sem); lock(&fi->i_sem); Cc: <stable@vger.kernel.org> Reported-and-tested-by: syzbot+e5600587fa9cbf8e3826@syzkaller.appspotmail.com Fixes: 5eda1ad1 "f2fs: fix deadlock in i_xattr_sem and inode page lock" Tested-by:
Guenter Roeck <linux@roeck-us.net> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 18 Aug, 2023 2 commits
-
-
Chao Yu authored
Previously, we have two mechanisms to cache & submit small discards: a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint will cache small discard candidates w/ configured maximum number. b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM interface is asynchronized, so it won't submit discard directly. Finally, discard thread will submit them in background periodically. However, after commit 9ac00e7c ("f2fs: do not issue small discard commands during checkpoint"), the mechanism a) is broken, since no matter how we configure the sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small discard candidates any more. echo 0 > /sys/fs/f2fs/vdb/max_small_discards xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" xfs_io /mnt/f2fs/file -c "fpunch 0 4k" sync cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 echo 100 > /sys/fs/f2fs/vdb/max_small_discards rm /mnt/f2fs/file xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" xfs_io /mnt/f2fs/file -c "fpunch 0 4k" sync cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 Before the patch: Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 . . . . . . . . Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 3 1 . . . . . . After the patch: Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 . . . . . . . . Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 . . . . . . . . This patch reverts commit 9ac00e7c ("f2fs: do not issue small discard commands during checkpoint") in order to fix this issue. Fixes: 9ac00e7c ("f2fs: do not issue small discard commands during checkpoint") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
The description of max_small_discards is out-of-update in below two aspects, fix it. - it is disabled by default - small discards will be issued during checkpoint Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 14 Aug, 2023 18 commits
-
-
Zhiguo Niu authored
The sending interval of discard and GC should also consider direct write requests; filesystem is not idle if there is direct write. Signed-off-by:
Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
cp_foreground_calls sysfs entry shows total CP call count rather than foreground CP call count, fix it. Fixes: fc7100ea ("f2fs: Add f2fs stats to sysfs") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
As reported, status debugfs entry shows inconsistent GC stats as below: GC calls: 6008 (BG: 6161) - data segments : 3053 (BG: 3053) - node segments : 2955 (BG: 2955) Total GC calls is larger than BGGC calls, the reason is: - f2fs_stat_info.call_count accounts total migrated section count by f2fs_gc() - f2fs_stat_info.bg_gc accounts total call times of f2fs_gc() from background gc_thread Another issue is gc_foreground_calls sysfs entry shows total GC call count rather than FGGC call count. This patch changes as below for fix: - account GC calls and migrated segment count separately - support to account migrated section count if it enables large section mode - fix to show correct value in gc_foreground_calls sysfs entry Fixes: fc7100ea ("f2fs: Add f2fs stats to sysfs") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
It has checked return value of write_all_xattrs(), remove unneeded following check condition. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
generic/728 - output mismatch (see /media/fstests/results//generic/728.out.bad) --- tests/generic/728.out 2023-07-19 07:10:48.362711407 +0000 +++ /media/fstests/results//generic/728.out.bad 2023-07-19 08:39:57.000000000 +0000 QA output created by 728 +Expected ctime to change after setxattr. +Expected ctime to change after removexattr. Silence is golden ... (Run 'diff -u /media/fstests/tests/generic/728.out /media/fstests/results//generic/728.out.bad' to see the entire diff) generic/729 1s It needs to update i_ctime after {set,remove}xattr, fix it. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
syzbot reports a f2fs bug as below: UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3275:19 index 1409 is out of range for type '__le32[923]' (aka 'unsigned int[923]') Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 ubsan_epilogue lib/ubsan.c:217 [inline] __ubsan_handle_out_of_bounds+0x11c/0x150 lib/ubsan.c:348 inline_data_addr fs/f2fs/f2fs.h:3275 [inline] __recover_inline_status fs/f2fs/inode.c:113 [inline] do_read_inode fs/f2fs/inode.c:480 [inline] f2fs_iget+0x4730/0x48b0 fs/f2fs/inode.c:604 f2fs_fill_super+0x640e/0x80c0 fs/f2fs/super.c:4601 mount_bdev+0x276/0x3b0 fs/super.c:1391 legacy_get_tree+0xef/0x190 fs/fs_context.c:611 vfs_get_tree+0x8c/0x270 fs/super.c:1519 do_new_mount+0x28f/0xae0 fs/namespace.c:3335 do_mount fs/namespace.c:3675 [inline] __do_sys_mount fs/namespace.c:3884 [inline] __se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The issue was bisected to: commit d48a7b3a Author: Chao Yu <chao@kernel.org> Date: Mon Jan 9 03:49:20 2023 +0000 f2fs: fix to do sanity check on extent cache correctly The root cause is we applied both v1 and v2 of the patch, v2 is the right fix, so it needs to revert v1 in order to fix reported issue. v1: commit d48a7b3a ("f2fs: fix to do sanity check on extent cache correctly") https://lore.kernel.org/lkml/20230109034920.492914-1-chao@kernel.org/ v2: commit 269d1194 ("f2fs: fix to do sanity check on extent cache correctly") https://lore.kernel.org/lkml/20230207134808.1827869-1-chao@kernel.org/ Reported-by: syzbot+601018296973a481f302@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000fcf0690600e4d04d@google.com/ Fixes: d48a7b3a ("f2fs: fix to do sanity check on extent cache correctly") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Minjie Du authored
Simplify code pattern of 'folio->index + folio_nr_pages(folio)' by using the existing helper folio_next_index(). Signed-off-by:
Minjie Du <duminjie@vivo.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chunhai Guo authored
Now f2fs support four block allocation modes: lfs, adaptive, fragment:segment, fragment:block. Only lfs mode is allowed with zoned block device feature. Fixes: 6691d940 ("f2fs: introduce fragment allocation mode mount option") Signed-off-by:
Chunhai Guo <guochunhai@vivo.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Shin'ichiro Kawasaki authored
The commit 25f90805 ("f2fs: add async reset zone command support") introduced "async reset zone commands" by calling __submit_zone_reset_cmd() in async discard operations. However, __submit_zone_reset_cmd() is called regardless of zone type of discard target zone. When devices have conventional zones, zone reset commands are sent to the conventional zones and cause I/O errors. Avoid the I/O errors by checking that the discard target zone type is sequential write required. If not, handle the discard operation in same manner as non-zoned, regular block devices. For that purpose, add a new helper function f2fs_bdev_index() which gets index of the zone reset target device. Fixes: 25f90805 ("f2fs: add async reset zone command support") Signed-off-by:
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
f2fs won't compress non-full cluster in tail of file, let's skip dirtying and rewrite such cluster during f2fs_ioc_{,de}compress_file. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
This patch allows f2fs_ioc_{,de}compress_file() to be interrupted, so that, userspace won't be blocked when manual {,de}compression on large file is interrupted by signal. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Christoph Hellwig authored
f2fs_scan_devices reopens the main device since the very beginning, which has always been useless, and also means that we don't pass the right holder for the reopen, which now leads to a warning as the core super.c holder ops aren't passed in for the reopen. Fixes: 3c62be17 ("f2fs: support multiple devices") Fixes: 0718afd4 ("block: introduce holder ops") Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
Compression option in inode should not be changed after they have been used, however, it may happen in below race case: Thread A Thread B - f2fs_ioc_set_compress_option - check f2fs_is_mmap_file() - check get_dirty_pages() - check F2FS_HAS_BLOCKS() - f2fs_file_mmap - set_inode_flag(FI_MMAP_FILE) - fault - do_page_mkwrite - f2fs_vm_page_mkwrite - f2fs_get_block_locked - fault_dirty_shared_page - set_page_dirty - update i_compress_algorithm - update i_log_cluster_size - update i_cluster_size Avoid such race condition by covering f2fs_file_mmap() w/ i_sem lock, meanwhile add mmap file check condition in f2fs_may_compress() as well. Fixes: e1e8debe ("f2fs: add F2FS_IOC_SET_COMPRESS_OPTION ioctl") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Randy Dunlap authored
Correct spelling problems as identified by codespell. Fixes: 9e615dbb ("f2fs: add missing description for ipu_policy node") Fixes: b2e4a2b3 ("f2fs: expose discard related parameters in sysfs") Fixes: 846ae671 ("f2fs: expose extension_list sysfs entry") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: linux-f2fs-devel@lists.sourceforge.net Cc: Yangtao Li <frank.li@vivo.com> Cc: Konstantin Vyshetsky <vkon@google.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
https://bugzilla.kernel.org/show_bug.cgi?id=216050 Somehow we're getting a page which has a different mapping. Let's avoid the infinite loop. Cc: <stable@vger.kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
Let's flush the inode being aborted atomic operation to avoid stale dirty inode during eviction in this call stack: f2fs_mark_inode_dirty_sync+0x22/0x40 [f2fs] f2fs_abort_atomic_write+0xc4/0xf0 [f2fs] f2fs_evict_inode+0x3f/0x690 [f2fs] ? sugov_start+0x140/0x140 evict+0xc3/0x1c0 evict_inodes+0x17b/0x210 generic_shutdown_super+0x32/0x120 kill_block_super+0x21/0x50 deactivate_locked_super+0x31/0x90 cleanup_mnt+0x100/0x160 task_work_run+0x59/0x90 do_exit+0x33b/0xa50 do_group_exit+0x2d/0x80 __x64_sys_exit_group+0x14/0x20 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd This triggers f2fs_bug_on() in f2fs_evict_inode: f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE)); This fixes the syzbot report: loop0: detected capacity change from 0 to 131072 F2FS-fs (loop0): invalid crc value F2FS-fs (loop0): Found nat_bits in checkpoint F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4 ------------[ cut here ]------------ kernel BUG at fs/f2fs/inode.c:869! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 5014 Comm: syz-executor220 Not tainted 6.4.0-syzkaller-11479-g6cd06ab1 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869 Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007 RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000 R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0 Call Trace: <TASK> evict+0x2ed/0x6b0 fs/inode.c:665 dispose_list+0x117/0x1e0 fs/inode.c:698 evict_inodes+0x345/0x440 fs/inode.c:748 generic_shutdown_super+0xaf/0x480 fs/super.c:478 kill_block_super+0x64/0xb0 fs/super.c:1417 kill_f2fs_super+0x2af/0x3c0 fs/f2fs/super.c:4704 deactivate_locked_super+0x98/0x160 fs/super.c:330 deactivate_super+0xb1/0xd0 fs/super.c:361 cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1254 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xa9a/0x29a0 kernel/exit.c:874 do_group_exit+0xd4/0x2a0 kernel/exit.c:1024 __do_sys_exit_group kernel/exit.c:1035 [inline] __se_sys_exit_group kernel/exit.c:1033 [inline] __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f309be71a09 Code: Unable to access opcode bytes at 0x7f309be719df. RSP: 002b:00007fff171df518 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 00007f309bef7330 RCX: 00007f309be71a09 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001 RBP: 0000000000000001 R08: ffffffffffffffc0 R09: 00007f309bef1e40 R10: 0000000000010600 R11: 0000000000000246 R12: 00007f309bef7330 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869 Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007 RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000 R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0 Cc: <stable@vger.kernel.org> Reported-and-tested-by: syzbot+e1246909d526a9d470fa@syzkaller.appspotmail.com Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
f2fs_compress_alloc_page() uses mempool to allocate memory, it never fail, don't handle error case in its callers. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This reverts commit bfd47662. Shinichiro Kawasaki reported: When I ran workloads on f2fs using v6.5-rcX with fixes [1][2] and a zoned block devices with 4kb logical block size, I observe mount failure as follows. When I revert this commit, the failure goes away. [ 167.781975][ T1555] F2FS-fs (dm-0): IO Block Size: 4 KB [ 167.890728][ T1555] F2FS-fs (dm-0): Found nat_bits in checkpoint [ 171.482588][ T1555] F2FS-fs (dm-0): Zone without valid block has non-zero write pointer. Reset the write pointer: wp[0x1300,0x8] [ 171.496000][ T1555] F2FS-fs (dm-0): (0) : Unaligned zone reset attempted (block 280000 + 80000) [ 171.505037][ T1555] F2FS-fs (dm-0): Discard zone failed: (errno=-5) The patch replaced "sbi->log_blocksize - SECTOR_SHIFT" with "sbi->log_sectors_per_block". However, I think these two are not equal when the device has 4k logical block size. The former uses Linux kernel sector size 512 byte. The latter use 512b sector size or 4kb sector size depending on the device. mkfs.f2fs obtains logical block size via BLKSSZGET ioctl from the device and reflects it to the value sbi->log_sector_size_per_block. This causes unexpected write pointer calculations in check_zone_write_pointer(). This resulted in unexpected zone reset and the mount failure. [1] https://lkml.kernel.org/linux-f2fs-devel/20230711050101.GA19128@lst.de/ [2] https://lore.kernel.org/linux-f2fs-devel/20230804091556.2372567-1-shinichiro.kawasaki@wdc.com/ Cc: stable@vger.kernel.org Reported-by:
Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: bfd47662 ("f2fs: clean up w/ sbi->log_sectors_per_block") Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 09 Jul, 2023 10 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
We just sorted the entries and fields last release, so just out of a perverse sense of curiosity, I decided to see if we can keep things ordered for even just one release. The answer is "No. No we cannot". I suggest that all kernel developers will need weekly training sessions, involving a lot of Big Bird and Sesame Street. And at the yearly maintainer summit, we will all sing the alphabet song together. I doubt I will keep doing this. At some point "perverse sense of curiosity" turns into just a cold dark place filled with sadness and despair. Repeats: 80e62bc8 ("MAINTAINERS: re-sort all entries and fields") Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fpu fix from Borislav Petkov: - Do FPU AP initialization on Xen PV too which got missed by the recent boot reordering work * tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/xen: Fix secondary processors' FPU initialization
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fix from Thomas Gleixner: "A single fix for the mechanism to park CPUs with an INIT IPI. On shutdown or kexec, the kernel tries to park the non-boot CPUs with an INIT IPI. But the same code path is also used by the crash utility. If the CPU which panics is not the boot CPU then it sends an INIT IPI to the boot CPU which resets the machine. Prevent this by validating that the CPU which runs the stop mechanism is the boot CPU. If not, leave the other CPUs in HLT" * tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Don't send INIT to boot CPU
-
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds authored
Pull MIPS fixes from Thomas Bogendoerfer: - fixes for KVM - fix for loongson build and cpu probing - DT fixes * tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled MIPS: dts: add missing space before { MIPS: Loongson: Fix build error when make modules_install MIPS: KVM: Fix NULL pointer dereference MIPS: Loongson: Fix cpu_probe_loongson() again
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds authored
Pull xfs fix from Darrick Wong: "Nothing exciting here, just getting rid of a gcc warning that I got tired of seeing when I turn on gcov" * tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix uninit warning in xfs_growfs_data
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull more smb client updates from Steve French: - fix potential use after free in unmount - minor cleanup - add worker to cleanup stale directory leases * tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add a laundromat thread for cached directories smb: client: remove redundant pointer 'server' cifs: fix session state transition to avoid use-after-free issue
-
https://github.com/jonmason/ntbLinus Torvalds authored
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
-
- 08 Jul, 2023 7 commits
-
-
Hugh Dickins authored
Lockdep is certainly right to complain about (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f but task is already holding lock: (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db Invert those to the usual ordering. Fixes: 33313a74 ("mm: lock newly mapped VMA which can be modified after it becomes visible") Cc: stable@vger.kernel.org Signed-off-by:
Hugh Dickins <hughd@google.com> Tested-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues" The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since it was all hopefully fixed in mainline. * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib: dhry: fix sleeping allocations inside non-preemptable section kasan, slub: fix HW_TAGS zeroing with slub_debug kasan: fix type cast in memory_is_poisoned_n mailmap: add entries for Heiko Stuebner mailmap: update manpage link bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page MAINTAINERS: add linux-next info mailmap: add Markus Schneider-Pargmann writeback: account the number of pages written back mm: call arch_swap_restore() from do_swap_page() squashfs: fix cache race with migration mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison docs: update ocfs2-devel mailing list address MAINTAINERS: update ocfs2-devel mailing list address mm: disable CONFIG_PER_VMA_LOCK until its fixed fork: lock VMAs of the parent process when forking
-
Suren Baghdasaryan authored
When forking a child process, the parent write-protects anonymous pages and COW-shares them with the child being forked using copy_present_pte(). We must not take any concurrent page faults on the source vma's as they are being processed, as we expect both the vma and the pte's behind it to be stable. For example, the anon_vma_fork() expects the parents vma->anon_vma to not change during the vma copy. A concurrent page fault on a page newly marked read-only by the page copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the source vma, defeating the anon_vma_clone() that wasn't done because the parent vma originally didn't have an anon_vma, but we now might end up copying a pte entry for a page that has one. Before the per-vma lock based changes, the mmap_lock guaranteed exclusion with concurrent page faults. But now we need to do a vma_start_write() to make sure no concurrent faults happen on this vma while it is being processed. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Suggested-by:
David Hildenbrand <david@redhat.com> Reported-by:
Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/Reported-by:
Holger Hoffstätte <holger@applied-asynchrony.com> Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/Reported-by:
Jacob Young <jacobly.alt@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first") Cc: stable@vger.kernel.org Signed-off-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Suren Baghdasaryan authored
mmap_region adds a newly created VMA into VMA tree and might modify it afterwards before dropping the mmap_lock. This poses a problem for page faults handled under per-VMA locks because they don't take the mmap_lock and can stumble on this VMA while it's still being modified. Currently this does not pose a problem since post-addition modifications are done only for file-backed VMAs, which are not handled under per-VMA lock. However, once support for handling file-backed page faults with per-VMA locks is added, this will become a race. Fix this by write-locking the VMA before inserting it into the VMA tree. Other places where a new VMA is added into VMA tree do not modify it after the insertion, so do not need the same locking. Cc: stable@vger.kernel.org Signed-off-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Suren Baghdasaryan authored
With recent changes necessitating mmap_lock to be held for write while expanding a stack, per-VMA locks should follow the same rules and be write-locked to prevent page faults into the VMA being expanded. Add the necessary locking. Cc: stable@vger.kernel.org Signed-off-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull more SCSI updates from James Bottomley: "A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Remove unused function declaration scsi: target: docs: Remove tcm_mod_builder.py scsi: target: iblock: Quiet bool conversion warning with pr_preempt use scsi: dt-bindings: ufs: qcom: Fix ICE phandle scsi: core: Simplify scsi_cdl_check_cmd() scsi: isci: Fix comment typo scsi: smartpqi: Replace one-element arrays with flexible-array members scsi: target: tcmu: Replace strlcpy() with strscpy() scsi: ncr53c8xx: Replace strlcpy() with strscpy() scsi: lpfc: Fix lpfc_name struct packing
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull more i2c updates from Wolfram Sang: - xiic patch should have been in the original pull but slipped through - mpc patch fixes a build regression - nomadik cleanup * tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mpc: Drop unused variable i2c: nomadik: Remove a useless call in the remove function i2c: xiic: Don't try to handle more interrupt events after error
-