1. 14 Aug, 2023 11 commits
      f2fs: Only lfs mode is allowed with zoned block device feature · 2bd4df8f
      Chunhai Guo authored
      f2fs now supports four block allocation modes: lfs, adaptive,
      fragment:segment, and fragment:block. Only lfs mode is allowed with the
      zoned block device feature.
      
      Fixes: 6691d940 ("f2fs: introduce fragment allocation mode mount option")
      Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
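The rule stated in this commit can be sketched as a small predicate. This is only an illustration, not the code from the patch: the enum, the struct, and the function name are hypothetical stand-ins, and the real check operates on f2fs mount options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the four allocation modes named in the commit. */
enum alloc_mode {
	MODE_ADAPTIVE,
	MODE_LFS,
	MODE_FRAGMENT_SEG,
	MODE_FRAGMENT_BLK,
};

/* Hypothetical summary of the options that matter for this check. */
struct mount_opts {
	enum alloc_mode mode;
	bool zoned;		/* zoned block device feature enabled */
};

/* Returns true when the requested mode may be used with these options. */
static bool alloc_mode_allowed(const struct mount_opts *opts)
{
	assert(opts != NULL);

	if (!opts->zoned)
		return true;		/* no restriction without the feature */
	return opts->mode == MODE_LFS;	/* only lfs mode with the feature */
}
```

With the zoned feature enabled, a caller would reject the mount for every mode except `MODE_LFS`.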
      f2fs: check zone type before sending async reset zone command · 3cb88bc1
      Shin'ichiro Kawasaki authored
      Commit 25f90805 ("f2fs: add async reset zone command support")
      introduced "async reset zone commands" by calling
      __submit_zone_reset_cmd() in async discard operations. However,
      __submit_zone_reset_cmd() is called regardless of the zone type of the
      discard target zone. When devices have conventional zones, zone reset
      commands are sent to the conventional zones and cause I/O errors.
      
      Avoid the I/O errors by checking that the discard target zone type is
      sequential write required. If not, handle the discard operation in the
      same manner as for non-zoned, regular block devices. For that purpose,
      add a new helper function f2fs_bdev_index(), which gets the index of the
      zone reset target device.
      
      Fixes: 25f90805 ("f2fs: add async reset zone command support")
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
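The decision described above, resetting only sequential-write-required zones and falling back to a regular discard otherwise, might look like the sketch below. All types and names here are hypothetical; the kernel's own zone-type constants and the body of f2fs_bdev_index() are not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zone types for illustration. */
enum zone_type {
	ZONE_CONVENTIONAL,
	ZONE_SEQ_WRITE_REQUIRED,
};

/* What the discard path should issue for a given target. */
enum discard_action {
	ACTION_REGULAR_DISCARD,
	ACTION_ZONE_RESET,
};

/* Hypothetical description of the discard target. */
struct discard_target {
	bool dev_is_zoned;
	enum zone_type type;
};

/* Pick a zone reset only for sequential-write-required zones. */
static enum discard_action pick_discard_action(const struct discard_target *t)
{
	assert(t != NULL);

	if (t->dev_is_zoned && t->type == ZONE_SEQ_WRITE_REQUIRED)
		return ACTION_ZONE_RESET;
	/* Conventional zones and non-zoned devices take the regular path. */
	return ACTION_REGULAR_DISCARD;
}
```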
      f2fs: compress: don't {,de}compress non-full cluster · 025b3602
      Chao Yu authored
      f2fs won't compress a non-full cluster at the tail of a file, so skip
      dirtying and rewriting such a cluster during f2fs_ioc_{,de}compress_file().
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
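One way to picture "skip the non-full tail cluster" is to round the page count down to a cluster boundary. The helper below is a hypothetical sketch that assumes the cluster size is a power of two expressed as a log2 value (in the spirit of i_log_cluster_size); the actual patch may bound the loop differently.

```c
#include <assert.h>

/*
 * Number of leading pages that belong to full clusters.
 * Pages past this count form a partial tail cluster and would be skipped.
 */
static unsigned long full_cluster_pages(unsigned long nr_pages,
					unsigned int log_cluster_size)
{
	unsigned long cluster_size;

	/* The shift below is only defined for counts below the word width. */
	assert(log_cluster_size < sizeof(unsigned long) * 8);

	cluster_size = 1UL << log_cluster_size;
	return nr_pages & ~(cluster_size - 1);	/* round down to a boundary */
}
```

For a file of 10 pages and 4-page clusters, only the first 8 pages would be touched.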
      f2fs: allow f2fs_ioc_{,de}compress_file to be interrupted · 3a2c0e55
      Chao Yu authored
      This patch allows f2fs_ioc_{,de}compress_file() to be interrupted, so that
      userspace won't be blocked when manual {,de}compression of a large file is
      interrupted by a signal.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
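The general pattern, checking for a pending signal once per unit of work and bailing out with -EINTR, can be sketched in userspace C as follows. This is an analogy under stated assumptions, not the kernel code: the flag, the handler, and process_clusters() are hypothetical, and the kernel uses its own signal-pending helpers rather than a flag set by a handler.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Set asynchronously by the signal handler, polled by the work loop. */
static volatile sig_atomic_t interrupted;

static void on_signal(int signo)
{
	(void)signo;
	interrupted = 1;
}

/*
 * Walk nr_clusters units of work, counting completed ones in *done.
 * Returns 0 on completion or -EINTR once an interruption was observed.
 */
static long process_clusters(unsigned long nr_clusters, unsigned long *done)
{
	assert(done != NULL);

	*done = 0;
	for (unsigned long i = 0; i < nr_clusters; i++) {
		if (interrupted)
			return -EINTR;	/* stop early, let the caller return */
		/* ... {,de}compress cluster i here ... */
		(*done)++;
	}
	return 0;
}
```

A caller would install on_signal() with sigaction() and treat -EINTR as "stopped early", so a long-running request does not keep the process blocked.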
      f2fs: don't reopen the main block device in f2fs_scan_devices · 51bf8d3c
      Christoph Hellwig authored
      f2fs_scan_devices() has reopened the main device since the very beginning.
      This has always been useless, and it also means that we don't pass the
      right holder for the reopen, which now leads to a warning because the core
      super.c holder ops aren't passed in for the reopen.
      
      Fixes: 3c62be17 ("f2fs: support multiple devices")
      Fixes: 0718afd4 ("block: introduce holder ops")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f2fs: fix to avoid mmap vs set_compress_option case · b5ab3276
      Chao Yu authored
      The compression options in an inode should not be changed after they have
      been used; however, this may happen in the race below:
      
      Thread A				Thread B
      - f2fs_ioc_set_compress_option
       - check f2fs_is_mmap_file()
       - check get_dirty_pages()
       - check F2FS_HAS_BLOCKS()
      					- f2fs_file_mmap
      					 - set_inode_flag(FI_MMAP_FILE)
      					- fault
      					 - do_page_mkwrite
      					  - f2fs_vm_page_mkwrite
      					  - f2fs_get_block_locked
      					 - fault_dirty_shared_page
      					  - set_page_dirty
       - update i_compress_algorithm
       - update i_log_cluster_size
       - update i_cluster_size
      
      Avoid this race condition by covering f2fs_file_mmap() with the i_sem lock,
      and add an mmap file check in f2fs_may_compress() as well.
      
      Fixes: e1e8debe ("f2fs: add F2FS_IOC_SET_COMPRESS_OPTION ioctl")
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f2fs: fix spelling in ABI documentation · c709d099
      Randy Dunlap authored
      Correct spelling problems as identified by codespell.
      
      Fixes: 9e615dbb ("f2fs: add missing description for ipu_policy node")
      Fixes: b2e4a2b3 ("f2fs: expose discard related parameters in sysfs")
      Fixes: 846ae671 ("f2fs: expose extension_list sysfs entry")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Chao Yu <chao@kernel.org>
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: Yangtao Li <frank.li@vivo.com>
      Cc: Konstantin Vyshetsky <vkon@google.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f2fs: get out of a repeat loop when getting a locked data page · d2d9bb3b
      Jaegeuk Kim authored
      https://bugzilla.kernel.org/show_bug.cgi?id=216050
      
      Somehow we're getting a page which has a different mapping.
      Let's avoid the infinite loop.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f2fs: flush inode if atomic file is aborted · a3ab5574
      Jaegeuk Kim authored
      Let's flush the inode whose atomic operation is being aborted, to avoid a
      stale dirty inode during eviction in this call stack:
      
        f2fs_mark_inode_dirty_sync+0x22/0x40 [f2fs]
        f2fs_abort_atomic_write+0xc4/0xf0 [f2fs]
        f2fs_evict_inode+0x3f/0x690 [f2fs]
        ? sugov_start+0x140/0x140
        evict+0xc3/0x1c0
        evict_inodes+0x17b/0x210
        generic_shutdown_super+0x32/0x120
        kill_block_super+0x21/0x50
        deactivate_locked_super+0x31/0x90
        cleanup_mnt+0x100/0x160
        task_work_run+0x59/0x90
        do_exit+0x33b/0xa50
        do_group_exit+0x2d/0x80
        __x64_sys_exit_group+0x14/0x20
        do_syscall_64+0x3b/0x90
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      This triggers f2fs_bug_on() in f2fs_evict_inode:
       f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));
      
      This fixes the syzbot report:
      
      loop0: detected capacity change from 0 to 131072
      F2FS-fs (loop0): invalid crc value
      F2FS-fs (loop0): Found nat_bits in checkpoint
      F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
      ------------[ cut here ]------------
      kernel BUG at fs/f2fs/inode.c:869!
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 5014 Comm: syz-executor220 Not tainted 6.4.0-syzkaller-11479-g6cd06ab1 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
      RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
      Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
      RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
      RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
      RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
      R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
      FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0
      Call Trace:
       <TASK>
       evict+0x2ed/0x6b0 fs/inode.c:665
       dispose_list+0x117/0x1e0 fs/inode.c:698
       evict_inodes+0x345/0x440 fs/inode.c:748
       generic_shutdown_super+0xaf/0x480 fs/super.c:478
       kill_block_super+0x64/0xb0 fs/super.c:1417
       kill_f2fs_super+0x2af/0x3c0 fs/f2fs/super.c:4704
       deactivate_locked_super+0x98/0x160 fs/super.c:330
       deactivate_super+0xb1/0xd0 fs/super.c:361
       cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1254
       task_work_run+0x16f/0x270 kernel/task_work.c:179
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0xa9a/0x29a0 kernel/exit.c:874
       do_group_exit+0xd4/0x2a0 kernel/exit.c:1024
       __do_sys_exit_group kernel/exit.c:1035 [inline]
       __se_sys_exit_group kernel/exit.c:1033 [inline]
       __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f309be71a09
      Code: Unable to access opcode bytes at 0x7f309be719df.
      RSP: 002b:00007fff171df518 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
      RAX: ffffffffffffffda RBX: 00007f309bef7330 RCX: 00007f309be71a09
      RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
      RBP: 0000000000000001 R08: ffffffffffffffc0 R09: 00007f309bef1e40
      R10: 0000000000010600 R11: 0000000000000246 R12: 00007f309bef7330
      R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
      Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
      RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
      RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
      RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
      R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
      FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0
      
      Cc: <stable@vger.kernel.org>
      Reported-and-tested-by: syzbot+e1246909d526a9d470fa@syzkaller.appspotmail.com
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f2fs: don't handle error case of f2fs_compress_alloc_page() · 863907a4
      Chao Yu authored
      f2fs_compress_alloc_page() uses a mempool to allocate memory, so it never
      fails; don't handle the error case in its callers.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Revert "f2fs: clean up w/ sbi->log_sectors_per_block" · 579c7e41
      Jaegeuk Kim authored
      This reverts commit bfd47662.
      
      Shinichiro Kawasaki reported:
      
      When I ran workloads on f2fs using v6.5-rcX with fixes [1][2] and a zoned
      block device with 4KB logical block size, I observed a mount failure as
      follows. When I revert this commit, the failure goes away.
      
      [  167.781975][ T1555] F2FS-fs (dm-0): IO Block Size:        4 KB
      [  167.890728][ T1555] F2FS-fs (dm-0): Found nat_bits in checkpoint
      [  171.482588][ T1555] F2FS-fs (dm-0): Zone without valid block has non-zero write pointer. Reset the write pointer: wp[0x1300,0x8]
      [  171.496000][ T1555] F2FS-fs (dm-0): (0) : Unaligned zone reset attempted (block 280000 + 80000)
      [  171.505037][ T1555] F2FS-fs (dm-0): Discard zone failed:  (errno=-5)
      
      The patch replaced "sbi->log_blocksize - SECTOR_SHIFT" with
      "sbi->log_sectors_per_block". However, I think these two are not equal when
      the device has a 4KB logical block size. The former uses the Linux kernel
      sector size of 512 bytes. The latter uses a 512-byte or 4KB sector size,
      depending on the device. mkfs.f2fs obtains the logical block size from the
      device via the BLKSSZGET ioctl and reflects it in the value
      sbi->log_sector_size_per_block. This causes unexpected write pointer
      calculations in check_zone_write_pointer(), which resulted in the
      unexpected zone reset and the mount failure.
      
      [1] https://lkml.kernel.org/linux-f2fs-devel/20230711050101.GA19128@lst.de/
      [2] https://lore.kernel.org/linux-f2fs-devel/20230804091556.2372567-1-shinichiro.kawasaki@wdc.com/
      
      Cc: stable@vger.kernel.org
      Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Fixes: bfd47662 ("f2fs: clean up w/ sbi->log_sectors_per_block")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
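The arithmetic behind the report can be worked through with a short sketch. SECTOR_SHIFT is 9 in the Linux kernel (512-byte sectors); everything else below, including the helper names and the assumption that the per-device value is simply "log2 block size minus log2 device sector size", is a hypothetical illustration of the reporter's reasoning rather than f2fs code.

```c
#include <assert.h>

#define SECTOR_SHIFT 9	/* Linux kernel sector size: 512 bytes */

/* Shift from filesystem blocks to 512-byte kernel sectors. */
static unsigned int log_kernel_sectors_per_block(unsigned int log_blocksize)
{
	assert(log_blocksize >= SECTOR_SHIFT);
	return log_blocksize - SECTOR_SHIFT;
}

/* Shift from filesystem blocks to device logical sectors (assumed). */
static unsigned int log_device_sectors_per_block(unsigned int log_blocksize,
						 unsigned int log_sectorsize)
{
	assert(log_blocksize >= log_sectorsize);
	return log_blocksize - log_sectorsize;
}

/* Convert a block number to a sector number with the given shift. */
static unsigned long long block_to_sector(unsigned long long blk,
					  unsigned int shift)
{
	return blk << shift;
}
```

With 4KB blocks (log2 = 12), the kernel-sector shift is 3 in both cases, but the device-sector shift is 3 on a 512-byte device and 0 on a 4KB device. Using the second value where the first is expected makes a block number map to a sector eight times too small, which is consistent with the unexpected write pointer calculation described in the report.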
  2. 09 Jul, 2023 10 commits
  3. 08 Jul, 2023 19 commits