1. 11 Sep, 2024 5 commits
    • f2fs: make BG GC more aggressive for zoned devices · 5062b5be
      Daeho Jeong authored
      Zoned devices have no device-side GC, so they need more aggressive
      BG GC. Tune the parameters accordingly.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      5062b5be
    • f2fs: avoid unused block when dio write in LFS mode · 0638a319
      Daejun Park authored
      This patch addresses the problem that when using LFS mode, unused blocks
      may occur in f2fs_map_blocks() during block allocation for dio writes.
      
      If a new section is allocated during block allocation, it will not be
      included in the map struct by map_is_mergeable() if the LBA of the
      allocated block is not contiguous. However, the block already allocated
      in this process will remain unused due to the LFS mode.
      
      This patch avoids the possibility of unused blocks by escaping
      f2fs_map_blocks() when allocating the last block in a section.
      Signed-off-by: Daejun Park <daejun7.park@samsung.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      0638a319
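      The early exit described in 0638a319 can be pictured with a small
      userspace model. This is only a sketch: BLKS_PER_SEC,
      is_last_block_in_sec() and map_blocks_once() are hypothetical names, and
      the section size is an assumed value rather than anything taken from the
      patch.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Assumed geometry for illustration only: blocks per section. */
      #define BLKS_PER_SEC 512u

      /* True when blkaddr is the final block of its section. */
      static int is_last_block_in_sec(uint32_t blkaddr)
      {
          return (blkaddr % BLKS_PER_SEC) == BLKS_PER_SEC - 1;
      }

      /*
       * Model of one mapping call: hand out blocks sequentially starting at
       * next_free and stop right after the last block of the current section,
       * so no block of the following section is allocated by this call.
       * Returns the number of blocks mapped.
       */
      static uint32_t map_blocks_once(uint32_t next_free, uint32_t want)
      {
          uint32_t mapped = 0;

          assert(want > 0);
          while (mapped < want) {
              uint32_t blk = next_free + mapped;

              mapped++;
              if (is_last_block_in_sec(blk))
                  break; /* leave the loop at the section boundary */
          }
          return mapped;
      }
      ```

      With these assumed values, a request for 10 blocks starting at block 510
      maps only blocks 510 and 511; the caller would issue another call for
      the remainder, which then starts in a fresh section.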
    • f2fs: fix to check atomic_file in f2fs ioctl interfaces · bfe5c026
      Chao Yu authored
      Some f2fs ioctl interfaces, such as f2fs_ioc_set_pin_file(),
      f2fs_move_file_range(), and f2fs_defragment_range(), do not check
      atomic_write status, which may cause a race. Fix it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      bfe5c026
    • f2fs: get rid of online repaire on corrupted directory · 884ee6dc
      Chao Yu authored
      syzbot reports an f2fs bug as below:
      
      kernel BUG at fs/f2fs/inode.c:896!
      RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896
      Call Trace:
       evict+0x532/0x950 fs/inode.c:704
       dispose_list fs/inode.c:747 [inline]
       evict_inodes+0x5f9/0x690 fs/inode.c:797
       generic_shutdown_super+0x9d/0x2d0 fs/super.c:627
       kill_block_super+0x44/0x90 fs/super.c:1696
       kill_f2fs_super+0x344/0x690 fs/f2fs/super.c:4898
       deactivate_locked_super+0xc4/0x130 fs/super.c:473
       cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373
       task_work_run+0x24f/0x310 kernel/task_work.c:228
       ptrace_notify+0x2d2/0x380 kernel/signal.c:2402
       ptrace_report_syscall include/linux/ptrace.h:415 [inline]
       ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
       syscall_exit_work+0xc6/0x190 kernel/entry/common.c:173
       syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline]
       __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline]
       syscall_exit_to_user_mode+0x279/0x370 kernel/entry/common.c:218
       do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896
      
      Online repair of a corrupted directory in f2fs_lookup() can generate
      dirty data/meta while racing with a readonly remount. This may leave a
      dirty inode after the filesystem becomes readonly; checkpoint() then
      skips flushing the dirty inode in readonly mode, resulting in the
      panic above.
      
      Let's get rid of online repair in f2fs_lookup() and leave the work
      to fsck.f2fs.
      
      Fixes: 510022a8 ("f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries")
      Reported-by: syzbot+ebea2790904673d7c618@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/all/000000000000a7b20f061ff2d56a@google.com
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      884ee6dc
    • f2fs: prevent atomic file from being dirtied before commit · fccaa81d
      Daeho Jeong authored
      Keep atomic file clean while updating and make it dirtied during commit
      in order to avoid unnecessary and excessive inode updates in the previous
      fix.
      
      Fixes: 4bf78322 ("f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag")
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      fccaa81d
  2. 06 Sep, 2024 18 commits
  3. 21 Aug, 2024 10 commits
    • f2fs: Use sysfs_emit_at() to simplify code · f7a678bb
      Christophe JAILLET authored
      This file already uses sysfs_emit(). So be consistent and also use
      sysfs_emit_at().
      
      This slightly simplifies the code and makes it more readable.
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f7a678bb
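      For readers unfamiliar with the offset-based helper named in f7a678bb,
      the following userspace stand-in shows the general pattern of
      accumulating output at a running offset. It is a sketch, not kernel
      code: emit_at(), show_features() and BUF_SIZE are hypothetical, with
      BUF_SIZE standing in for the page-sized buffer sysfs hands to a show
      callback.

      ```c
      #include <assert.h>
      #include <stdarg.h>
      #include <stdio.h>

      #define BUF_SIZE 4096 /* assumed stand-in for the sysfs page buffer */

      /*
       * Userspace stand-in for an "emit at offset" helper: format into buf
       * starting at offset at, never write past BUF_SIZE, and return the
       * number of characters actually added (excluding the trailing NUL).
       */
      static int emit_at(char *buf, int at, const char *fmt, ...)
      {
          va_list args;
          int room, len;

          assert(buf != NULL);
          if (at < 0 || at >= BUF_SIZE)
              return 0;

          room = BUF_SIZE - at;
          va_start(args, fmt);
          len = vsnprintf(buf + at, (size_t)room, fmt, args);
          va_end(args);

          if (len < 0)
              return 0;
          if (len >= room)
              len = room - 1; /* output was truncated */
          return len;
      }

      /* Build a multi-line attribute by advancing a running length. */
      static int show_features(char *buf)
      {
          int len = 0;

          len += emit_at(buf, len, "%s\n", "encryption");
          len += emit_at(buf, len, "%s\n", "compression");
          return len;
      }
      ```

      The caller only tracks one integer, which is what makes the
      offset-based form easier to read than manual pointer arithmetic on the
      buffer.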
    • f2fs: atomic: fix to forbid dio in atomic_file · b2c160f4
      Chao Yu authored
      Atomic write can only be used via buffered IO, so fail direct IO on an
      atomic_file and return -EOPNOTSUPP.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      b2c160f4
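      The rule stated in b2c160f4 reduces to a single guard. The model below
      is a sketch under stated assumptions: struct file_state and
      check_direct_io() are hypothetical names, and only the error code comes
      from the commit message.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stdbool.h>

      /* Hypothetical per-file state; only the atomic flag matters here. */
      struct file_state {
          bool is_atomic_file;
      };

      /*
       * Reject direct IO on a file that is in atomic-write mode.
       * Buffered IO, and direct IO on ordinary files, pass through.
       */
      static int check_direct_io(const struct file_state *f, bool is_direct)
      {
          assert(f != NULL);
          if (is_direct && f->is_atomic_file)
              return -EOPNOTSUPP;
          return 0;
      }
      ```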
    • f2fs: compress: don't redirty sparse cluster during {,de}compress · f785cec2
      Yeongjin Gil authored
      In f2fs_do_write_data_page, when the data block is NULL_ADDR, it skips
      writepage considering that it has been already truncated.
      This results in an infinite loop as the PAGECACHE_TAG_TOWRITE tag is not
      cleared during the writeback process for a compressed file including
      NULL_ADDR in compress_mode=user.
      
      This is the reproduction process:
      
      1. dd if=/dev/zero bs=4096 count=1024 seek=1024 of=testfile
      2. f2fs_io compress testfile
      3. dd if=/dev/zero bs=4096 count=1 conv=notrunc of=testfile
      4. f2fs_io decompress testfile
      
      To prevent the problem, let's check whether the cluster is fully
      allocated before redirtying its pages.
      
      Fixes: 5fdb322f ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE")
      Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
      Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com>
      Tested-by: Jaewook Kim <jw5454.kim@samsung.com>
      Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f785cec2
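      The "fully allocated" check mentioned in f785cec2 can be modelled in
      userspace as a scan over the block addresses of one cluster. This is an
      illustrative sketch: cluster_fully_allocated() and pages_to_redirty()
      are hypothetical helpers, and representing NULL_ADDR as 0 is an
      assumption made for the model.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      #define NULL_ADDR 0u /* assumed encoding of "no block allocated" */

      /* A cluster is fully allocated when none of its slots is NULL_ADDR. */
      static bool cluster_fully_allocated(const uint32_t *blkaddr,
                                          size_t cluster_size)
      {
          assert(blkaddr != NULL && cluster_size > 0);
          for (size_t i = 0; i < cluster_size; i++) {
              if (blkaddr[i] == NULL_ADDR)
                  return false;
          }
          return true;
      }

      /*
       * Number of pages that would be redirtied for this cluster: all of
       * them for a fully allocated cluster, none for a sparse one, so a
       * sparse cluster is never queued for a writeback that would skip it.
       */
      static size_t pages_to_redirty(const uint32_t *blkaddr,
                                     size_t cluster_size)
      {
          return cluster_fully_allocated(blkaddr, cluster_size)
                     ? cluster_size : 0;
      }
      ```

      In the reproduction steps, writing a single block into an otherwise
      empty region produces exactly the sparse case: one valid address
      followed by NULL_ADDR entries.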
    • f2fs: check discard support for conventional zones · 43aec4d0
      Shin'ichiro Kawasaki authored
      As the helper function f2fs_bdev_support_discard() shows, f2fs checks if
      the target block devices support discard by calling
      bdev_max_discard_sectors() and bdev_is_zoned(). This check works well
      for most cases, but it does not work for conventional zones on zoned
      block devices. F2fs assumes that zoned block devices support discard,
      and calls __submit_discard_cmd(). When __submit_discard_cmd() is called
      for sequential write required zones, it works fine since
      __submit_discard_cmd() issues zone reset commands instead of discard
      commands. However, when __submit_discard_cmd() is called for
      conventional zones, __blkdev_issue_discard() is called even when the
      devices do not support discard.
      
      The inappropriate __blkdev_issue_discard() call was not a problem before
      the commit 30f1e724 ("block: move discard checks into the ioctl
      handler") because __blkdev_issue_discard() checked if the target devices
      support discard or not. If not, it returned EOPNOTSUPP. After the
      commit, __blkdev_issue_discard() no longer checks it. It always returns
      zero and sets NULL to the given bio pointer. This NULL pointer triggers
      f2fs_bug_on() in __submit_discard_cmd(). The BUG is recreated with the
      commands below at the umount step, where /dev/nullb0 is a zoned null_blk
      with 5GB total size, 128MB zone size and 10 conventional zones.
      
      $ mkfs.f2fs -f -m /dev/nullb0
      $ mount /dev/nullb0 /mnt
      $ for ((i=0;i<5;i++)); do dd if=/dev/zero of=/mnt/test bs=65536 count=1600 conv=fsync; done
      $ umount /mnt
      
      To fix the BUG, avoid the inappropriate __blkdev_issue_discard() call.
      When discard is requested for conventional zones, check if the device
      supports discard or not. If not, return EOPNOTSUPP.
      
      Fixes: 30f1e724 ("block: move discard checks into the ioctl handler")
      Cc: stable@vger.kernel.org
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      43aec4d0
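      The decision described in 43aec4d0 can be summarised as a three-way
      choice per zone. The following userspace model is a sketch only: the
      enum and struct names and pick_discard_action() are hypothetical, and
      max_discard_sectors == 0 is used here as the assumed signal for "no
      discard support".

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stdbool.h>

      enum zone_type { ZONE_CONVENTIONAL, ZONE_SEQ_WRITE_REQUIRED };
      enum discard_action { ACT_ZONE_RESET = 1, ACT_DISCARD = 2 };

      /* Hypothetical view of the block device properties that matter. */
      struct bdev_model {
          bool is_zoned;
          unsigned int max_discard_sectors; /* 0: discard not supported */
      };

      /*
       * Pick what to issue for a discard request:
       *  - sequential-write-required zones get a zone reset,
       *  - everything else gets a real discard only if the device supports
       *    one, otherwise the request fails with -EOPNOTSUPP.
       */
      static int pick_discard_action(const struct bdev_model *bdev,
                                     enum zone_type zone)
      {
          assert(bdev != NULL);
          if (bdev->is_zoned && zone == ZONE_SEQ_WRITE_REQUIRED)
              return ACT_ZONE_RESET;
          if (bdev->max_discard_sectors == 0)
              return -EOPNOTSUPP;
          return ACT_DISCARD;
      }
      ```

      The middle branch is the case the commit adds: a conventional zone on a
      zoned device without discard support.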
    • f2fs: fix to avoid use-after-free in f2fs_stop_gc_thread() · c7f114d8
      Chao Yu authored
      syzbot reports an f2fs bug as below:
      
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
       print_report+0xe8/0x550 mm/kasan/report.c:491
       kasan_report+0x143/0x180 mm/kasan/report.c:601
       kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
       instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
       atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:252 [inline]
       __refcount_add include/linux/refcount.h:184 [inline]
       __refcount_inc include/linux/refcount.h:241 [inline]
       refcount_inc include/linux/refcount.h:258 [inline]
       get_task_struct include/linux/sched/task.h:118 [inline]
       kthread_stop+0xca/0x630 kernel/kthread.c:704
       f2fs_stop_gc_thread+0x65/0xb0 fs/f2fs/gc.c:210
       f2fs_do_shutdown+0x192/0x540 fs/f2fs/file.c:2283
       f2fs_ioc_shutdown fs/f2fs/file.c:2325 [inline]
       __f2fs_ioctl+0x443a/0xbe60 fs/f2fs/file.c:4325
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:907 [inline]
       __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      The root cause is the race condition below, which may cause a
      use-after-free on the sbi->gc_th pointer.
      
      - remount
       - f2fs_remount
        - f2fs_stop_gc_thread
         - kfree(gc_th)
      				- f2fs_ioc_shutdown
      				 - f2fs_do_shutdown
      				  - f2fs_stop_gc_thread
      				   - kthread_stop(gc_th->f2fs_gc_task)
         : sbi->gc_thread = NULL;
      
      We will call f2fs_do_shutdown() in two paths:
      - for f2fs_ioc_shutdown() path, we should grab sb->s_umount semaphore
      for fixing.
      - for f2fs_shutdown() path, it's safe since caller has already grabbed
      sb->s_umount semaphore.
      
      Reported-by: syzbot+1a8e2b31f2ac9bd3d148@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-f2fs-devel/0000000000005c7ccb061e032b9b@google.com
      Fixes: 7950e9ac ("f2fs: stop gc/discard thread after fs shutdown")
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c7f114d8
    • f2fs: atomic: fix to truncate pagecache before on-disk metadata truncation · ebd3309a
      Chao Yu authored
      We should always truncate pagecache while truncating on-disk data.
      
      Fixes: a46bebd5 ("f2fs: synchronize atomic write aborts")
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      ebd3309a
    • f2fs: fix to wait page writeback before setting gcing flag · a4d7f2b3
      Chao Yu authored
      Soft IRQ				Thread
      - f2fs_write_end_io
      					- f2fs_defragment_range
      					 - set_page_private_gcing
       - type = WB_DATA_TYPE(page, false);
       : assign type w/ F2FS_WB_CP_DATA
       due to page_private_gcing() is true
        - dec_page_count() w/ wrong type
        - end_page_writeback()
      
      The F2FS_WB_CP_DATA reference count may become negative under the race
      condition above. The root cause is that we did not wait for page
      writeback before setting the gcing page-private flag; let's fix it.
      
      Fixes: 2d1fe8a8 ("f2fs: fix to tag gcing flag on page during file defragment")
      Fixes: 4961acdd ("f2fs: fix to tag gcing flag on page during block migration")
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      a4d7f2b3
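      The counter mismatch in a4d7f2b3 can be replayed deterministically in
      userspace. The model below is a simplified sketch: struct page_model,
      wb_data_type(), submit_write(), write_end_io() and replay() are
      hypothetical, and the type selection is reduced to the gcing flag
      alone, which is an assumption about the real macro.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      enum count_type { F2FS_WB_CP_DATA, F2FS_WB_DATA, NR_COUNT_TYPE };

      struct page_model {
          bool gcing;     /* models the gcing page-private flag */
          bool writeback; /* page is currently under writeback */
      };

      static int counters[NR_COUNT_TYPE];

      /* Simplified stand-in for choosing the writeback counter of a page. */
      static enum count_type wb_data_type(const struct page_model *p)
      {
          return p->gcing ? F2FS_WB_CP_DATA : F2FS_WB_DATA;
      }

      /* Submission increments the counter chosen at submit time. */
      static void submit_write(struct page_model *p)
      {
          assert(!p->writeback);
          counters[wb_data_type(p)]++;
          p->writeback = true;
      }

      /* Completion decrements the counter chosen at completion time. */
      static void write_end_io(struct page_model *p)
      {
          counters[wb_data_type(p)]--;
          p->writeback = false;
      }

      /*
       * Replay the interleaving from the commit message and return the
       * final F2FS_WB_CP_DATA count. With wait_for_writeback set, the
       * completion runs before the flag is set, as the fix requires.
       */
      static int replay(bool wait_for_writeback)
      {
          struct page_model page = { false, false };

          counters[F2FS_WB_CP_DATA] = 0;
          counters[F2FS_WB_DATA] = 0;

          submit_write(&page);      /* counted as F2FS_WB_DATA */
          if (wait_for_writeback)
              write_end_io(&page);  /* writeback finishes first */
          page.gcing = true;        /* defragment tags the page */
          if (page.writeback)
              write_end_io(&page);  /* racy completion, wrong type */

          return counters[F2FS_WB_CP_DATA];
      }
      ```

      Without the wait, the increment lands on F2FS_WB_DATA and the
      decrement on F2FS_WB_CP_DATA, which is how the latter can drop below
      zero.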
    • f2fs: Create COW inode from parent dentry for atomic write · 8c1b7879
      Yeongjin Gil authored
      The i_pino in f2fs_inode_info still holds the previous parent's i_ino
      after the inode is renamed, which may cause f2fs_ioc_start_atomic_write
      to fail. If file_wrong_pino is true and i_nlink is 1, then to find a
      valid pino we should refer to the dentry from the inode.
      
      To resolve this issue, let's get parent inode using parent dentry
      directly.
      
      Fixes: 3db1de0e ("f2fs: change the current atomic write way")
      Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
      Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com>
      Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com>
      Reviewed-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      8c1b7879
    • f2fs: Require FMODE_WRITE for atomic write ioctls · 4f5a100f
      Jann Horn authored
      The F2FS ioctls for starting and committing atomic writes check for
      inode_owner_or_capable(), but this does not give LSMs like SELinux or
      Landlock an opportunity to deny the write access - if the caller's FSUID
      matches the inode's UID, inode_owner_or_capable() immediately returns true.
      
      There are scenarios where LSMs want to deny a process the ability to write
      particular files, even files that the FSUID of the process owns; but this
      can currently partially be bypassed using atomic write ioctls in two ways:
      
       - F2FS_IOC_START_ATOMIC_REPLACE + F2FS_IOC_COMMIT_ATOMIC_WRITE can
         truncate an inode to size 0
       - F2FS_IOC_START_ATOMIC_WRITE + F2FS_IOC_ABORT_ATOMIC_WRITE can revert
         changes another process concurrently made to a file
      
      Fix it by requiring FMODE_WRITE for these operations, just like for
      F2FS_IOC_MOVE_RANGE. Since any legitimate caller should only be using these
      ioctls when intending to write into the file, that seems unlikely to break
      anything.
      
      Fixes: 88b88a66 ("f2fs: support atomic writes")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jann Horn <jannh@google.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      4f5a100f
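      The requirement added by 4f5a100f amounts to a mode check on the open
      file before the ioctl does any work. The sketch below models that in
      userspace: struct file_model and atomic_ioctl_precheck() are
      hypothetical, and returning -EBADF for a file not opened for writing is
      an assumption, not something stated in the commit message.

      ```c
      #include <assert.h>
      #include <errno.h>

      /* Mode bits modelled after the kernel's open-file mode flags. */
      #define FMODE_READ  0x1u
      #define FMODE_WRITE 0x2u

      /* Hypothetical stand-in for the open file passed to the ioctl. */
      struct file_model {
          unsigned int f_mode;
      };

      /*
       * Refuse atomic-write ioctls on descriptors that were not opened for
       * writing. Because the mode is fixed at open time, any LSM decision
       * made on the open for write has already been applied.
       */
      static int atomic_ioctl_precheck(const struct file_model *filp)
      {
          assert(filp != NULL);
          if (!(filp->f_mode & FMODE_WRITE))
              return -EBADF; /* assumed error code */
          return 0;
      }
      ```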
    • f2fs: clean up val{>>,<<}F2FS_BLKSIZE_BITS · 8fb9f319
      Zhiguo Niu authored
      Use F2FS_BYTES_TO_BLK(bytes) and F2FS_BLK_TO_BYTES(blk) in place of
      open-coded shifts by F2FS_BLKSIZE_BITS.
      Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      8fb9f319
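      The cleanup in 8fb9f319 replaces raw shifts with named conversions. A
      minimal sketch of such macros follows; the definitions are written here
      for illustration, the value 12 for F2FS_BLKSIZE_BITS assumes 4 KiB
      blocks, and round_down_to_block() is a hypothetical helper.

      ```c
      #include <assert.h>
      #include <stdint.h>

      #define F2FS_BLKSIZE_BITS 12 /* assumed: 4 KiB blocks */

      /* Named conversions instead of open-coded val >> / val << shifts. */
      #define F2FS_BYTES_TO_BLK(bytes) ((bytes) >> F2FS_BLKSIZE_BITS)
      #define F2FS_BLK_TO_BYTES(blk)   ((blk) << F2FS_BLKSIZE_BITS)

      static_assert(F2FS_BLKSIZE_BITS < 64, "shift must fit in 64 bits");

      /* Round a byte offset down to the start of its block. */
      static uint64_t round_down_to_block(uint64_t bytes)
      {
          return F2FS_BLK_TO_BYTES(F2FS_BYTES_TO_BLK(bytes));
      }
      ```

      Callers should pass a 64-bit operand to F2FS_BLK_TO_BYTES() when the
      byte value may exceed 32 bits, since the macro shifts in the type of
      its argument.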
  4. 15 Aug, 2024 7 commits