1. 06 May, 2022 11 commits
    • f2fs: fix deadloop in foreground GC · cfd66bb7
      Chao Yu authored
      As Yanming reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=215914
      
      The root cause is: in a very small image, it is very easy to exceed
      the threshold of foreground GC. If free space and dirty data are
      calculated at section granularity, then in a corner case
      has_not_enough_free_secs() always returns true, which results in a
      dead loop in f2fs_gc().
      
      So this patch refactors has_not_enough_free_secs() as follows to fix
      this issue:
      1. calculate the needed space at block granularity, and split all
      blocks into two parts, a section part and a block part; compare the
      section part against free sections, and compare the block part
      against the free space in the opened log;
      2. account F2FS_DIRTY_NODES, F2FS_DIRTY_IMETA and F2FS_DIRTY_DENTS
      as node block consumers;
      3. account F2FS_DIRTY_DENTS as a data block consumer.
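
      The three steps above can be illustrated with a small userspace
      sketch. It is not the kernel code: struct space_state, its fields and
      not_enough_free_space() are hypothetical names, and the way the
      leftover block part is compared against the open logs is an assumption
      based only on the description in this message.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Hypothetical snapshot of the counters the check needs. */
      struct space_state {
          unsigned int blocks_per_sec;        /* blocks in one section */
          unsigned int free_secs;             /* completely free sections */
          unsigned int reserved_secs;         /* sections that must stay free */
          unsigned int node_log_free_blocks;  /* room left in the open node log */
          unsigned int data_log_free_blocks;  /* room left in the open data log */
          unsigned int dirty_nodes;           /* stands in for F2FS_DIRTY_NODES */
          unsigned int dirty_imeta;           /* stands in for F2FS_DIRTY_IMETA */
          unsigned int dirty_dents;           /* stands in for F2FS_DIRTY_DENTS */
      };

      /* Returns true when foreground GC would be needed. */
      static bool not_enough_free_space(const struct space_state *s)
      {
          assert(s->blocks_per_sec > 0);

          /* Steps 2 and 3: dirty dentries consume node blocks and data blocks. */
          unsigned int node_total = s->dirty_nodes + s->dirty_imeta + s->dirty_dents;
          unsigned int data_total = s->dirty_dents;

          /* Step 1: split each total into whole sections plus leftover blocks. */
          unsigned int need_lower = node_total / s->blocks_per_sec +
                                    data_total / s->blocks_per_sec +
                                    s->reserved_secs;
          unsigned int node_rem = node_total % s->blocks_per_sec;
          unsigned int data_rem = data_total % s->blocks_per_sec;
          unsigned int need_upper = need_lower + (node_rem ? 1 : 0) +
                                    (data_rem ? 1 : 0);

          if (s->free_secs > need_upper)
              return false;   /* enough even if every leftover takes a section */
          if (s->free_secs <= need_lower)
              return true;    /* the section part alone does not fit */

          /* In between: the leftover blocks must fit into the open logs. */
          return node_rem > s->node_log_free_blocks ||
                 data_rem > s->data_log_free_blocks;
      }
      ```

      With 512 blocks per section, 17 dirty node blocks no longer count as a
      whole section on their own; they only trigger GC when the open node log
      cannot hold them, which is what lets a tiny image leave the loop.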
      
      Cc: stable@vger.kernel.org
      Reported-by: Ming Yan <yanming@tju.edu.cn>
      Signed-off-by: Chao Yu <chao.yu@oppo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to do sanity check on block address in f2fs_do_zero_range() · 25f82362
      Chao Yu authored
      As Yanming reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=215894
      
      I have encountered a bug in F2FS file system in kernel v5.17.
      
      I have uploaded the system call sequence as case.c, and a fuzzed image can
      be found in google net disk
      
      The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can
      reproduce the bug by running the following commands:
      
      kernel BUG at fs/f2fs/segment.c:2291!
      Call Trace:
       f2fs_invalidate_blocks+0x193/0x2d0
       f2fs_fallocate+0x2593/0x4a70
       vfs_fallocate+0x2a5/0xac0
       ksys_fallocate+0x35/0x70
       __x64_sys_fallocate+0x8e/0xf0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The root cause is that, after the image was fuzzed, the block mapping
      info in the inode is inconsistent with the SIT table, so
      f2fs_fallocate() panics when it updates the SIT with an invalid blkaddr.
      
      Let's fix the issue by adding a sanity check on the block address before
      updating the SIT table with it.
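
      As a rough illustration of "check first, then update", here is a
      userspace sketch. struct fs_layout, ERR_FS_CORRUPTED and both function
      names are hypothetical; the range test is a simplification and not the
      validity helper the kernel uses.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical error code standing in for a "filesystem corrupted" errno. */
      enum { ERR_FS_CORRUPTED = -1 };

      /* Hypothetical description of the valid data block address range. */
      struct fs_layout {
          uint32_t main_blkaddr;  /* first valid data block address */
          uint32_t max_blkaddr;   /* one past the last valid block address */
      };

      static bool blkaddr_in_range(const struct fs_layout *l, uint32_t blkaddr)
      {
          assert(l->main_blkaddr <= l->max_blkaddr);
          return blkaddr >= l->main_blkaddr && blkaddr < l->max_blkaddr;
      }

      /*
       * Validate the address taken from the (possibly corrupted) inode mapping
       * before touching any per-segment accounting. On a bad address, report
       * corruption to the caller instead of crashing.
       */
      static int invalidate_block_checked(const struct fs_layout *l,
                                          uint32_t blkaddr,
                                          unsigned int *invalidated)
      {
          if (!blkaddr_in_range(l, blkaddr))
              return ERR_FS_CORRUPTED;

          (*invalidated)++;   /* placeholder for the real SIT update */
          return 0;
      }
      ```

      The point of the ordering is that the accounting update is never reached
      with an address that lies outside the area it describes.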
      
      Cc: stable@vger.kernel.org
      Reported-by: Ming Yan <yanming@tju.edu.cn>
      Signed-off-by: Chao Yu <chao.yu@oppo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to avoid f2fs_bug_on() in dec_valid_node_count() · 4d17e6fe
      Chao Yu authored
      As Yanming reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=215897
      
      I have encountered a bug in F2FS file system in kernel v5.17.
      
      The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can
      reproduce the bug by running the following commands:
      
      The kernel message is shown below:
      
      kernel BUG at fs/f2fs/f2fs.h:2511!
      Call Trace:
       f2fs_remove_inode_page+0x2a2/0x830
       f2fs_evict_inode+0x9b7/0x1510
       evict+0x282/0x4e0
       do_unlinkat+0x33a/0x540
       __x64_sys_unlinkat+0x8e/0xd0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The root cause is: .total_valid_block_count or .total_valid_node_count
      could be fuzzed to zero; once dec_valid_node_count() is called, it
      then triggers BUG_ON(). This patch prints warning info and sets
      SBI_NEED_FSCK into CP instead of panicking.
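
      The "warn and flag instead of BUG" pattern can be sketched in userspace
      as follows. struct fs_counters, its need_fsck field and
      dec_valid_node_count_checked() are hypothetical stand-ins; the real
      function checks kernel structures that are not modelled here.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Hypothetical mirror of the two counters named in the message. */
      struct fs_counters {
          unsigned int total_valid_block_count;
          unsigned int total_valid_node_count;
          bool need_fsck;     /* stands in for the SBI_NEED_FSCK flag */
      };

      /*
       * Decrement both counters for one freed node block. If either counter
       * is already zero (for example on a fuzzed image), log a warning and
       * request fsck rather than underflowing or aborting.
       */
      static void dec_valid_node_count_checked(struct fs_counters *c)
      {
          assert(c != NULL);

          if (c->total_valid_block_count == 0 ||
              c->total_valid_node_count == 0) {
              fprintf(stderr,
                      "inconsistent counters: blocks=%u nodes=%u, run fsck\n",
                      c->total_valid_block_count, c->total_valid_node_count);
              c->need_fsck = true;
              return;
          }

          c->total_valid_block_count--;
          c->total_valid_node_count--;
      }
      ```

      The counters stay untouched on the error path, so later code sees the
      same inconsistent values and the fsck request, not a wrapped-around count.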
      
      Cc: stable@vger.kernel.org
      Reported-by: Ming Yan <yanming@tju.edu.cn>
      Signed-off-by: Chao Yu <chao.yu@oppo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: write checkpoint during FG_GC · a9163b94
      Byungki Lee authored
      If there are not enough free sections, each of which consists of large
      segments, we can run out of free sections for an upcoming section
      allocation. Let's reclaim some prefree segments by writing checkpoints.
      Signed-off-by: Byungki Lee <dominicus79@gmail.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to clear dirty inode in f2fs_evict_inode() · f2db7105
      Chao Yu authored
      As Yanming reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=215904
      
      The kernel message is shown below:
      
      kernel BUG at fs/f2fs/inode.c:825!
      Call Trace:
       evict+0x282/0x4e0
       __dentry_kill+0x2b2/0x4d0
       shrink_dentry_list+0x17c/0x4f0
       shrink_dcache_parent+0x143/0x1e0
       do_one_tree+0x9/0x30
       shrink_dcache_for_umount+0x51/0x120
       generic_shutdown_super+0x5c/0x3a0
       kill_block_super+0x90/0xd0
       kill_f2fs_super+0x225/0x310
       deactivate_locked_super+0x78/0xc0
       cleanup_mnt+0x2b7/0x480
       task_work_run+0xc8/0x150
       exit_to_user_mode_prepare+0x14a/0x150
       syscall_exit_to_user_mode+0x1d/0x40
       do_syscall_64+0x48/0x90
      
      The root cause is: the inode node and the dnode node share the same nid,
      so during f2fs_evict_inode(), truncating the dnode node invalidates
      its NAT entry. Truncating the inode node then fails due to the
      invalid NAT entry, which leaves the inode marked as dirty. Fix
      this issue by clearing the dirty state of the inode and setting the
      SBI_NEED_FSCK flag in the filesystem.
      
      output from dump.f2fs:
      [print_node_info: 354] Node ID [0xf:15] is inode
      i_nid[0]                      		[0x       f : 15]
      
      Cc: stable@vger.kernel.org
      Reported-by: Ming Yan <yanming@tju.edu.cn>
      Signed-off-by: Chao Yu <chao.yu@oppo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: ensure only power of 2 zone sizes are allowed · 7f262f73
      Luis Chamberlain authored
      F2FS zoned support assumes a power of 2 zone size in many places,
      such as in __f2fs_issue_discard_zone and init_blkz_info. As the power
      of 2 requirement has been removed from the block layer, explicitly add
      a condition in f2fs to allow only power of 2 zone size devices.
      
      This condition will be relaxed once the calculations based on power of
      2 are made generic.
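
      For illustration, a power of 2 test and the kind of shift-based
      calculation that depends on it might look like this in userspace.
      zone_size_supported() and zone_index() are hypothetical names and are
      not taken from the patch.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* A nonzero value is a power of 2 when exactly one bit is set. */
      static bool zone_size_supported(uint64_t zone_sectors)
      {
          return zone_sectors != 0 && (zone_sectors & (zone_sectors - 1)) == 0;
      }

      /*
       * Example of arithmetic that silently relies on the assumption: the
       * zone number is found by shifting instead of dividing, which is only
       * correct when zone_sectors is a power of 2.
       */
      static uint64_t zone_index(uint64_t sector, uint64_t zone_sectors)
      {
          assert(zone_size_supported(zone_sectors));

          unsigned int shift = 0;
          while ((UINT64_C(1) << shift) < zone_sectors)
              shift++;

          return sector >> shift;
      }
      ```

      Rejecting other zone sizes up front keeps calculations of this shape
      from producing wrong zone numbers.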
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: call bdev_zone_sectors() only once on init_blkz_info() · d46db459
      Luis Chamberlain authored
      Instead of calling bdev_zone_sectors() multiple times, call
      it once and cache the value locally. This will make the
      subsequent change easier to read.
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: extend stat_lock to avoid potential race in statfs · 4de85145
      Niels Dossche authored
      There are multiple calculations and reads of fields of sbi that should
      be protected by stat_lock. As stat_lock is not used to read these
      values in statfs, this can lead to inconsistent results.
      Extend the locking to prevent this issue.
      Commit c9c8ed50 ("f2fs: fix to avoid potential race on
      sbi->unusable_block_count access/update")
      already added the use of sbi->stat_lock in statfs in
      order to make the calculation of multiple, different fields atomic so
      that results are consistent. This is similar to that patch regarding the
      change in statfs.
      Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid infinite loop to flush node pages · a7b8618a
      Jaegeuk Kim authored
      xfstests/generic/475 can return EIO all the time, which causes an
      infinite loop when flushing node pages, as shown below. Let's avoid it.
      
      [16418.518551] Call Trace:
      [16418.518553]  ? dm_submit_bio+0x48/0x400
      [16418.518574]  ? submit_bio_checks+0x1ac/0x5a0
      [16418.525207]  __submit_bio+0x1a9/0x230
      [16418.525210]  ? kmem_cache_alloc+0x29e/0x3c0
      [16418.525223]  submit_bio_noacct+0xa8/0x2b0
      [16418.525226]  submit_bio+0x4d/0x130
      [16418.525238]  __submit_bio+0x49/0x310 [f2fs]
      [16418.525339]  ? bio_add_page+0x6a/0x90
      [16418.525344]  f2fs_submit_page_bio+0x134/0x1f0 [f2fs]
      [16418.525365]  read_node_page+0x125/0x1b0 [f2fs]
      [16418.525388]  __get_node_page.part.0+0x58/0x3f0 [f2fs]
      [16418.525409]  __get_node_page+0x2f/0x60 [f2fs]
      [16418.525431]  f2fs_get_dnode_of_data+0x423/0x860 [f2fs]
      [16418.525452]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
      [16418.525458]  ? __mod_memcg_state.part.0+0x2a/0x30
      [16418.525465]  ? __mod_memcg_lruvec_state+0x27/0x40
      [16418.525467]  ? __xa_set_mark+0x57/0x70
      [16418.525472]  f2fs_do_write_data_page+0x10e/0x7b0 [f2fs]
      [16418.525493]  f2fs_write_single_data_page+0x555/0x830 [f2fs]
      [16418.525514]  ? sysvec_apic_timer_interrupt+0x4e/0x90
      [16418.525518]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
      [16418.525523]  f2fs_write_cache_pages+0x303/0x880 [f2fs]
      [16418.525545]  ? blk_flush_plug_list+0x47/0x100
      [16418.525548]  f2fs_write_data_pages+0xfd/0x320 [f2fs]
      [16418.525569]  do_writepages+0xd5/0x210
      [16418.525648]  filemap_fdatawrite_wbc+0x7d/0xc0
      [16418.525655]  filemap_fdatawrite+0x50/0x70
      [16418.525658]  f2fs_sync_dirty_inodes+0xa4/0x230 [f2fs]
      [16418.525679]  f2fs_write_checkpoint+0x16d/0x1720 [f2fs]
      [16418.525699]  ? ttwu_do_wakeup+0x1c/0x160
      [16418.525709]  ? ttwu_do_activate+0x6d/0xd0
      [16418.525711]  ? __wait_for_common+0x11d/0x150
      [16418.525715]  kill_f2fs_super+0xca/0x100 [f2fs]
      [16418.525733]  deactivate_locked_super+0x3b/0xb0
      [16418.525739]  deactivate_super+0x40/0x50
      [16418.525741]  cleanup_mnt+0x139/0x190
      [16418.525747]  __cleanup_mnt+0x12/0x20
      [16418.525749]  task_work_run+0x6d/0xa0
      [16418.525765]  exit_to_user_mode_prepare+0x1ad/0x1b0
      [16418.525771]  syscall_exit_to_user_mode+0x27/0x50
      [16418.525774]  do_syscall_64+0x48/0xc0
      [16418.525776]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: use flush command instead of FUA for zoned device · c550e25b
      Jaegeuk Kim authored
      The block layer for zoned disks can reorder FUA'ed IOs. Let's use the
      flush command to keep the write order.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove WARN_ON in f2fs_is_valid_blkaddr · dc2f78e2
      Dongliang Mu authored
      Syzbot triggers two WARNs in f2fs_is_valid_blkaddr and
      __is_bitmap_valid. For example, in f2fs_is_valid_blkaddr,
      if type is DATA_GENERIC_ENHANCE or DATA_GENERIC_ENHANCE_READ,
      it invokes WARN_ON if blkaddr is not in the right range.
      The call trace is as follows:
      
       f2fs_get_node_info+0x45f/0x1070
       read_node_page+0x577/0x1190
       __get_node_page.part.0+0x9e/0x10e0
       __get_node_page
       f2fs_get_node_page+0x109/0x180
       do_read_inode
       f2fs_iget+0x2a5/0x58b0
       f2fs_fill_super+0x3b39/0x7ca0
      
      Fix these two WARNs by replacing WARN_ON with dump_stack.
      
      Reported-by: syzbot+763ae12a2ede1d99d4dc@syzkaller.appspotmail.com
      Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 25 Apr, 2022 10 commits
  3. 24 Apr, 2022 8 commits
  4. 23 Apr, 2022 11 commits