1. 11 Sep, 2024 2 commits
    • Chao Yu's avatar
      f2fs: get rid of online repaire on corrupted directory · 884ee6dc
      Chao Yu authored
      syzbot reports a f2fs bug as below:
      
      kernel BUG at fs/f2fs/inode.c:896!
      RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896
      Call Trace:
       evict+0x532/0x950 fs/inode.c:704
       dispose_list fs/inode.c:747 [inline]
       evict_inodes+0x5f9/0x690 fs/inode.c:797
       generic_shutdown_super+0x9d/0x2d0 fs/super.c:627
       kill_block_super+0x44/0x90 fs/super.c:1696
       kill_f2fs_super+0x344/0x690 fs/f2fs/super.c:4898
       deactivate_locked_super+0xc4/0x130 fs/super.c:473
       cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373
       task_work_run+0x24f/0x310 kernel/task_work.c:228
       ptrace_notify+0x2d2/0x380 kernel/signal.c:2402
       ptrace_report_syscall include/linux/ptrace.h:415 [inline]
       ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
       syscall_exit_work+0xc6/0x190 kernel/entry/common.c:173
       syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline]
       __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline]
       syscall_exit_to_user_mode+0x279/0x370 kernel/entry/common.c:218
       do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896
      
      Online repaire on corrupted directory in f2fs_lookup() can generate
      dirty data/meta while racing w/ readonly remount, it may leave dirty
      inode after filesystem becomes readonly, however, checkpoint() will
      skips flushing dirty inode in a state of readonly mode, result in
      above panic.
      
      Let's get rid of online repaire in f2fs_lookup(), and leave the work
      to fsck.f2fs.
      
      Fixes: 510022a8 ("f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries")
      Reported-by: syzbot+ebea2790904673d7c618@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/all/000000000000a7b20f061ff2d56a@google.comSigned-off-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      884ee6dc
    • Daeho Jeong's avatar
      f2fs: prevent atomic file from being dirtied before commit · fccaa81d
      Daeho Jeong authored
      Keep atomic file clean while updating and make it dirtied during commit
      in order to avoid unnecessary and excessive inode updates in the previous
      fix.
      
      Fixes: 4bf78322 ("f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag")
      Signed-off-by: default avatarDaeho Jeong <daehojeong@google.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      fccaa81d
  2. 06 Sep, 2024 18 commits
  3. 21 Aug, 2024 10 commits
    • Christophe JAILLET's avatar
      f2fs: Use sysfs_emit_at() to simplify code · f7a678bb
      Christophe JAILLET authored
      This file already uses sysfs_emit(). So be consistent and also use
      sysfs_emit_at().
      
      This slightly simplifies the code and makes it more readable.
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      f7a678bb
    • Chao Yu's avatar
      f2fs: atomic: fix to forbid dio in atomic_file · b2c160f4
      Chao Yu authored
      atomic write can only be used via buffered IO, let's fail direct IO on
      atomic_file and return -EOPNOTSUPP.
      Signed-off-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      b2c160f4
    • Yeongjin Gil's avatar
      f2fs: compress: don't redirty sparse cluster during {,de}compress · f785cec2
      Yeongjin Gil authored
      In f2fs_do_write_data_page, when the data block is NULL_ADDR, it skips
      writepage considering that it has been already truncated.
      This results in an infinite loop as the PAGECACHE_TAG_TOWRITE tag is not
      cleared during the writeback process for a compressed file including
      NULL_ADDR in compress_mode=user.
      
      This is the reproduction process:
      
      1. dd if=/dev/zero bs=4096 count=1024 seek=1024 of=testfile
      2. f2fs_io compress testfile
      3. dd if=/dev/zero bs=4096 count=1 conv=notrunc of=testfile
      4. f2fs_io decompress testfile
      
      To prevent the problem, let's check whether the cluster is fully
      allocated before redirty its pages.
      
      Fixes: 5fdb322f ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE")
      Reviewed-by: default avatarSungjong Seo <sj1557.seo@samsung.com>
      Reviewed-by: default avatarSunmin Jeong <s_min.jeong@samsung.com>
      Tested-by: default avatarJaewook Kim <jw5454.kim@samsung.com>
      Signed-off-by: default avatarYeongjin Gil <youngjin.gil@samsung.com>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      f785cec2
    • Shin'ichiro Kawasaki's avatar
      f2fs: check discard support for conventional zones · 43aec4d0
      Shin'ichiro Kawasaki authored
      As the helper function f2fs_bdev_support_discard() shows, f2fs checks if
      the target block devices support discard by calling
      bdev_max_discard_sectors() and bdev_is_zoned(). This check works well
      for most cases, but it does not work for conventional zones on zoned
      block devices. F2fs assumes that zoned block devices support discard,
      and calls __submit_discard_cmd(). When __submit_discard_cmd() is called
      for sequential write required zones, it works fine since
      __submit_discard_cmd() issues zone reset commands instead of discard
      commands. However, when __submit_discard_cmd() is called for
      conventional zones, __blkdev_issue_discard() is called even when the
      devices do not support discard.
      
      The inappropriate __blkdev_issue_discard() call was not a problem before
      the commit 30f1e724 ("block: move discard checks into the ioctl
      handler") because __blkdev_issue_discard() checked if the target devices
      support discard or not. If not, it returned EOPNOTSUPP. After the
      commit, __blkdev_issue_discard() no longer checks it. It always returns
      zero and sets NULL to the given bio pointer. This NULL pointer triggers
      f2fs_bug_on() in __submit_discard_cmd(). The BUG is recreated with the
      commands below at the umount step, where /dev/nullb0 is a zoned null_blk
      with 5GB total size, 128MB zone size and 10 conventional zones.
      
      $ mkfs.f2fs -f -m /dev/nullb0
      $ mount /dev/nullb0 /mnt
      $ for ((i=0;i<5;i++)); do dd if=/dev/zero of=/mnt/test bs=65536 count=1600 conv=fsync; done
      $ umount /mnt
      
      To fix the BUG, avoid the inappropriate __blkdev_issue_discard() call.
      When discard is requested for conventional zones, check if the device
      supports discard or not. If not, return EOPNOTSUPP.
      
      Fixes: 30f1e724 ("block: move discard checks into the ioctl handler")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: default avatarDamien Le Moal <dlemoal@kernel.org>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      43aec4d0
    • Chao Yu's avatar
      f2fs: fix to avoid use-after-free in f2fs_stop_gc_thread() · c7f114d8
      Chao Yu authored
      syzbot reports a f2fs bug as below:
      
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
       print_report+0xe8/0x550 mm/kasan/report.c:491
       kasan_report+0x143/0x180 mm/kasan/report.c:601
       kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
       instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
       atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:252 [inline]
       __refcount_add include/linux/refcount.h:184 [inline]
       __refcount_inc include/linux/refcount.h:241 [inline]
       refcount_inc include/linux/refcount.h:258 [inline]
       get_task_struct include/linux/sched/task.h:118 [inline]
       kthread_stop+0xca/0x630 kernel/kthread.c:704
       f2fs_stop_gc_thread+0x65/0xb0 fs/f2fs/gc.c:210
       f2fs_do_shutdown+0x192/0x540 fs/f2fs/file.c:2283
       f2fs_ioc_shutdown fs/f2fs/file.c:2325 [inline]
       __f2fs_ioctl+0x443a/0xbe60 fs/f2fs/file.c:4325
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:907 [inline]
       __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      The root cause is below race condition, it may cause use-after-free
      issue in sbi->gc_th pointer.
      
      - remount
       - f2fs_remount
        - f2fs_stop_gc_thread
         - kfree(gc_th)
      				- f2fs_ioc_shutdown
      				 - f2fs_do_shutdown
      				  - f2fs_stop_gc_thread
      				   - kthread_stop(gc_th->f2fs_gc_task)
         : sbi->gc_thread = NULL;
      
      We will call f2fs_do_shutdown() in two paths:
      - for f2fs_ioc_shutdown() path, we should grab sb->s_umount semaphore
      for fixing.
      - for f2fs_shutdown() path, it's safe since caller has already grabbed
      sb->s_umount semaphore.
      
      Reported-by: syzbot+1a8e2b31f2ac9bd3d148@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-f2fs-devel/0000000000005c7ccb061e032b9b@google.com
      Fixes: 7950e9ac ("f2fs: stop gc/discard thread after fs shutdown")
      Signed-off-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      c7f114d8
    • Chao Yu's avatar
      f2fs: atomic: fix to truncate pagecache before on-disk metadata truncation · ebd3309a
      Chao Yu authored
      We should always truncate pagecache while truncating on-disk data.
      
      Fixes: a46bebd5 ("f2fs: synchronize atomic write aborts")
      Signed-off-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      ebd3309a
    • Chao Yu's avatar
      f2fs: fix to wait page writeback before setting gcing flag · a4d7f2b3
      Chao Yu authored
      Soft IRQ				Thread
      - f2fs_write_end_io
      					- f2fs_defragment_range
      					 - set_page_private_gcing
       - type = WB_DATA_TYPE(page, false);
       : assign type w/ F2FS_WB_CP_DATA
       due to page_private_gcing() is true
        - dec_page_count() w/ wrong type
        - end_page_writeback()
      
      Value of F2FS_WB_CP_DATA reference count may become negative under above
      race condition, the root cause is we missed to wait page writeback before
      setting gcing page private flag, let's fix it.
      
      Fixes: 2d1fe8a8 ("f2fs: fix to tag gcing flag on page during file defragment")
      Fixes: 4961acdd ("f2fs: fix to tag gcing flag on page during block migration")
      Signed-off-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      a4d7f2b3
    • Yeongjin Gil's avatar
      f2fs: Create COW inode from parent dentry for atomic write · 8c1b7879
      Yeongjin Gil authored
      The i_pino in f2fs_inode_info has the previous parent's i_ino when inode
      was renamed, which may cause f2fs_ioc_start_atomic_write to fail.
      If file_wrong_pino is true and i_nlink is 1, then to find a valid pino,
      we should refer to the dentry from inode.
      
      To resolve this issue, let's get parent inode using parent dentry
      directly.
      
      Fixes: 3db1de0e ("f2fs: change the current atomic write way")
      Reviewed-by: default avatarSungjong Seo <sj1557.seo@samsung.com>
      Reviewed-by: default avatarSunmin Jeong <s_min.jeong@samsung.com>
      Signed-off-by: default avatarYeongjin Gil <youngjin.gil@samsung.com>
      Reviewed-by: default avatarDaeho Jeong <daehojeong@google.com>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      8c1b7879
    • Jann Horn's avatar
      f2fs: Require FMODE_WRITE for atomic write ioctls · 4f5a100f
      Jann Horn authored
      The F2FS ioctls for starting and committing atomic writes check for
      inode_owner_or_capable(), but this does not give LSMs like SELinux or
      Landlock an opportunity to deny the write access - if the caller's FSUID
      matches the inode's UID, inode_owner_or_capable() immediately returns true.
      
      There are scenarios where LSMs want to deny a process the ability to write
      particular files, even files that the FSUID of the process owns; but this
      can currently partially be bypassed using atomic write ioctls in two ways:
      
       - F2FS_IOC_START_ATOMIC_REPLACE + F2FS_IOC_COMMIT_ATOMIC_WRITE can
         truncate an inode to size 0
       - F2FS_IOC_START_ATOMIC_WRITE + F2FS_IOC_ABORT_ATOMIC_WRITE can revert
         changes another process concurrently made to a file
      
      Fix it by requiring FMODE_WRITE for these operations, just like for
      F2FS_IOC_MOVE_RANGE. Since any legitimate caller should only be using these
      ioctls when intending to write into the file, that seems unlikely to break
      anything.
      
      Fixes: 88b88a66 ("f2fs: support atomic writes")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Reviewed-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      4f5a100f
    • Zhiguo Niu's avatar
      f2fs: clean up val{>>,<<}F2FS_BLKSIZE_BITS · 8fb9f319
      Zhiguo Niu authored
      Use F2FS_BYTES_TO_BLK(bytes) and F2FS_BLK_TO_BYTES(blk) for cleanup
      Signed-off-by: default avatarZhiguo Niu <zhiguo.niu@unisoc.com>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      8fb9f319
  4. 15 Aug, 2024 8 commits
  5. 05 Aug, 2024 2 commits
    • Chao Yu's avatar
      f2fs: atomic: fix to avoid racing w/ GC · 1a0bd289
      Chao Yu authored
      Case #1:
      SQLite App		GC Thread		Kworker		Shrinker
      - f2fs_ioc_start_atomic_write
      
      - f2fs_ioc_commit_atomic_write
       - f2fs_commit_atomic_write
        - filemap_write_and_wait_range
        : write atomic_file's data to cow_inode
      								echo 3 > drop_caches
      								to drop atomic_file's
      								cache.
      			- f2fs_gc
      			 - gc_data_segment
      			  - move_data_page
      			   - set_page_dirty
      
      						- writepages
      						 - f2fs_do_write_data_page
      						 : overwrite atomic_file's data
      						   to cow_inode
        - f2fs_down_write(&fi->i_gc_rwsem[WRITE])
        - __f2fs_commit_atomic_write
        - f2fs_up_write(&fi->i_gc_rwsem[WRITE])
      
      Case #2:
      SQLite App		GC Thread		Kworker
      - f2fs_ioc_start_atomic_write
      
      						- __writeback_single_inode
      						 - do_writepages
      						  - f2fs_write_cache_pages
      						   - f2fs_write_single_data_page
      						    - f2fs_do_write_data_page
      						    : write atomic_file's data to cow_inode
      			- f2fs_gc
      			 - gc_data_segment
      			  - move_data_page
      			   - set_page_dirty
      
      						- writepages
      						 - f2fs_do_write_data_page
      						 : overwrite atomic_file's data to cow_inode
      - f2fs_ioc_commit_atomic_write
      
      In above cases racing in between atomic_write and GC, previous
      data in atomic_file may be overwrited to cow_file, result in
      data corruption.
      
      This patch introduces PAGE_PRIVATE_ATOMIC_WRITE bit flag in page.private,
      and use it to indicate that there is last dirty data in atomic file,
      and the data should be writebacked into cow_file, if the flag is not
      tagged in page, we should never write data across files.
      
      Fixes: 3db1de0e ("f2fs: change the current atomic write way")
      Cc: Daeho Jeong <daehojeong@google.com>
      Signed-off-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      1a0bd289
    • Julian Sun's avatar
      f2fs: fix macro definition stat_inc_cp_count · d72750e4
      Julian Sun authored
      The macro stat_inc_cp_count accepts a parameter si,
      but it was not used, rather the variable sbi was directly used,
      which may be a local variable inside a function that calls the macros.
      Signed-off-by: default avatarJulian Sun <sunjunchao2870@gmail.com>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      d72750e4