1. 14 Sep, 2020 1 commit
  2. 11 Sep, 2020 13 commits
    • f2fs: change virtual mapping way for compression pages · 6fcaebac
      Daeho Jeong authored
      Profiling f2fs compression showed that vmap() calls cause unexpected
      spikes in execution time in our test environment and are the
      bottleneck of the f2fs decompression path. Replacing them with
      vm_map_ram() improves f2fs decompression speed considerably.
      
      [Verification]
      Android Pixel 3(ARM64, 6GB RAM, 128GB UFS)
      Turned on only 0-3 little cores(at 1.785GHz)
      
      dd if=/dev/zero of=dummy bs=1m count=1000
      echo 3 > /proc/sys/vm/drop_caches
      dd if=dummy of=/dev/zero bs=512k
      
      - w/o compression -
      1048576000 bytes (0.9 G) copied, 2.082554 s, 480 M/s
      1048576000 bytes (0.9 G) copied, 2.081634 s, 480 M/s
      1048576000 bytes (0.9 G) copied, 2.090861 s, 478 M/s
      
      - before patch -
      1048576000 bytes (0.9 G) copied, 7.407527 s, 135 M/s
      1048576000 bytes (0.9 G) copied, 7.283734 s, 137 M/s
      1048576000 bytes (0.9 G) copied, 7.291508 s, 137 M/s
      
      - after patch -
      1048576000 bytes (0.9 G) copied, 1.998959 s, 500 M/s
      1048576000 bytes (0.9 G) copied, 1.987554 s, 503 M/s
      1048576000 bytes (0.9 G) copied, 1.986380 s, 503 M/s
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change return value of f2fs_disable_compressed_file to bool · 78134d03
      Daeho Jeong authored
      The returned integer is not used anywhere, so change the return
      type to bool.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change i_compr_blocks of inode to atomic value · c2759eba
      Daeho Jeong authored
      writepages() can be invoked concurrently for the same file by different
      threads, such as a thread fsyncing the file and a kworker kernel thread.
      Changing i_compr_blocks without protection is therefore racy, and we
      need to protect it by making it an atomic value. Also, we don't need
      a 64-bit value for i_compr_blocks, so we will just use an atomic value,
      not atomic64.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
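As a rough user-space analogue of the i_compr_blocks change, the sketch below keeps a per-file block counter in a C11 atomic_int so that concurrent writers cannot lose updates. The struct and the demo_* helpers are hypothetical names for illustration only; the kernel patch uses the kernel's own atomic type, not <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-inode compressed block counter. */
struct demo_inode {
    atomic_int compr_blocks;
};

void demo_inode_init(struct demo_inode *inode)
{
    atomic_init(&inode->compr_blocks, 0);
}

/* Atomic read-modify-write: safe even if several threads call it at once. */
void demo_add_compr_blocks(struct demo_inode *inode, int blocks)
{
    assert(blocks >= 0);
    atomic_fetch_add(&inode->compr_blocks, blocks);
}

void demo_sub_compr_blocks(struct demo_inode *inode, int blocks)
{
    assert(blocks >= 0);
    atomic_fetch_sub(&inode->compr_blocks, blocks);
}

int demo_read_compr_blocks(struct demo_inode *inode)
{
    return atomic_load(&inode->compr_blocks);
}
```

With a plain int, two writers doing `count += n` at the same time could each read the old value and one update would be lost; the atomic fetch-and-add avoids that without taking a lock.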
    • f2fs: trace: fix typo · 32c0fec1
      Chao Yu authored
      Fixes a typo from 'compreesed' to 'compressed'.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: ignore compress mount option on image w/o compression feature · 69c0dd29
      Chao Yu authored
      Ignore the compress mount option on an image without the compression
      feature. This keeps the behavior consistent with passing the compress
      mount option to a kernel without the compression feature, so that
      mount does not fail in that case.
      Reported-by: Kyungmin Park <kyungmin.park@samsung.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Documentation edits/fixes · ca313c82
      Randy Dunlap authored
      Correct grammar and spelling.
      
      Drop duplicate section for resize.f2fs.
      
      Change one occurrence of F2fs to F2FS for consistency.
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Chao Yu <yuchao0@huawei.com>
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: allocate proper size memory for zstd decompress · 0e2b7385
      Chao Yu authored
      As 5kft <5kft@5kft.org> reported:
      
       kworker/u9:3: page allocation failure: order:9, mode:0x40c40(GFP_NOFS|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
       CPU: 3 PID: 8168 Comm: kworker/u9:3 Tainted: G         C        5.8.3-sunxi #trunk
       Hardware name: Allwinner sun8i Family
       Workqueue: f2fs_post_read_wq f2fs_post_read_work
       [<c010d6d5>] (unwind_backtrace) from [<c0109a55>] (show_stack+0x11/0x14)
       [<c0109a55>] (show_stack) from [<c056d489>] (dump_stack+0x75/0x84)
       [<c056d489>] (dump_stack) from [<c0243b53>] (warn_alloc+0xa3/0x104)
       [<c0243b53>] (warn_alloc) from [<c024473b>] (__alloc_pages_nodemask+0xb87/0xc40)
       [<c024473b>] (__alloc_pages_nodemask) from [<c02267c5>] (kmalloc_order+0x19/0x38)
       [<c02267c5>] (kmalloc_order) from [<c02267fd>] (kmalloc_order_trace+0x19/0x90)
       [<c02267fd>] (kmalloc_order_trace) from [<c047c665>] (zstd_init_decompress_ctx+0x21/0x88)
       [<c047c665>] (zstd_init_decompress_ctx) from [<c047e9cf>] (f2fs_decompress_pages+0x97/0x228)
       [<c047e9cf>] (f2fs_decompress_pages) from [<c045d0ab>] (__read_end_io+0xfb/0x130)
       [<c045d0ab>] (__read_end_io) from [<c045d141>] (f2fs_post_read_work+0x61/0x84)
       [<c045d141>] (f2fs_post_read_work) from [<c0130b2f>] (process_one_work+0x15f/0x3b0)
       [<c0130b2f>] (process_one_work) from [<c0130e7b>] (worker_thread+0xfb/0x3e0)
       [<c0130e7b>] (worker_thread) from [<c0135c3b>] (kthread+0xeb/0x10c)
       [<c0135c3b>] (kthread) from [<c0100159>]
      
      zstd may allocate a large amount of memory for {,de}compression, which
      may cause file copy failures on low-end devices with very little memory.
      
      For decompression, let's allocate memory sized for the current file's
      cluster size instead of the maximum cluster size.
      Reported-by: 5kft <5kft@5kft.org>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change compr_blocks of superblock info to 64bit · ae999bb9
      Daeho Jeong authored
      The current compr_blocks of the superblock info is not a 64-bit value.
      We accumulate the i_compr_blocks count of each inode into this value,
      and those counts are 64-bit values, so change it to a 64-bit value.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: add block address limit check to compressed file · 4eda1682
      Daeho Jeong authored
      Add a block address range check for the compressed file case and
      avoid calling get_data_block_bmap() for compressed files.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: check position in move range ioctl · aad1383c
      Dan Robertson authored
      When the move range ioctl is used, check the input and output positions
      and ensure that they are non-negative. Without this check,
      f2fs_get_dnode_of_data may hit a memory bug.
      Signed-off-by: Dan Robertson <dan@dlrobertson.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: correct statistic of APP_DIRECT_IO/APP_DIRECT_READ_IO · 335cac8b
      Jack Qiu authored
      APP_DIRECT_IO/APP_DIRECT_READ_IO are not updated when receiving async DIO.
      For example: fio -filename=/data/test.0 -bs=1m -ioengine=libaio -direct=1
      		-name=fill -size=10m -numjobs=1 -iodepth=32 -rw=write
      Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Simplify SEEK_DATA implementation · 4cb03fec
      Matthew Wilcox (Oracle) authored
      Instead of finding the first dirty page and then seeing if it matches
      the index of a block that is NEW_ADDR, delay the lookup of the dirty
      bit until we've actually found a block that's NEW_ADDR.
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support age threshold based garbage collection · 093749e2
      Chao Yu authored
      There are several issues in the current background GC algorithm:
      - The number of valid blocks is one of the key factors in the cost
      calculation, so if a segment has few valid blocks, the CB algorithm
      will still choose it as the victim even if its age is young or it
      is a hot segment, which is not appropriate.
      - GCed data/node blocks go to existing logs regardless of whether the
      data already there has the same update frequency, so hot and cold
      data may be mixed again.
      - The GC allocator mainly uses LFS type segments, which consumes free
      segments more quickly.
      
      This patch introduces a new algorithm named age threshold based
      garbage collection to solve the above issues. There are three main
      steps:
      
      1. select a source victim:
      - set an age threshold, and select candidates based on the threshold:
      e.g.
       0 means youngest, 100 means oldest, if we set age threshold to 80
       then select dirty segments which have an age in the range [80, 100]
       as candidates;
      - set candidate_ratio threshold, and select candidates based on the
      ratio, so that we can shrink candidates to the oldest segments;
      - select target segment with fewest valid blocks in order to
      migrate blocks with minimum cost;
      
      2. select a target victim:
      - select candidates based on the age threshold;
      - set candidate_radius threshold, search candidates whose age is
      around the source victim's, searching radius should be less than the
      radius threshold.
      - select target segment with most valid blocks in order to avoid
      migrating current target segment.
      
      3. merge valid blocks from source victim into target victim with
      SSR allocator.
      
      Test steps:
      - create 160 dirty segments:
       * half of them have 128 valid blocks per segment
       * the rest have 384 valid blocks per segment
      - run background GC
      
      Benefit: GC count and block movement count both decrease noticeably:
      
      - Before:
        - Valid: 86
        - Dirty: 1
        - Prefree: 11
        - Free: 6001 (6001)
      
      GC calls: 162 (BG: 220)
        - data segments : 160 (160)
        - node segments : 2 (2)
      Try to move 41454 blocks (BG: 41454)
        - data blocks : 40960 (40960)
        - node blocks : 494 (494)
      
      IPU: 0 blocks
      SSR: 0 blocks in 0 segments
      LFS: 41364 blocks in 81 segments
      
      - After:
      
        - Valid: 87
        - Dirty: 0
        - Prefree: 4
        - Free: 6008 (6008)
      
      GC calls: 75 (BG: 76)
        - data segments : 74 (74)
        - node segments : 1 (1)
      Try to move 12813 blocks (BG: 12813)
        - data blocks : 12544 (12544)
        - node blocks : 269 (269)
      
      IPU: 0 blocks
      SSR: 12032 blocks in 77 segments
      LFS: 855 blocks in 2 segments
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
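The victim selection steps in the age threshold based GC commit can be sketched in plain C as follows. This is a simplified illustration under stated assumptions, not the kernel implementation: the demo_segment struct and both functions are hypothetical, ages are taken to be on the 0 to 100 scale from the commit message, and the candidate_ratio step is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-segment summary: age in [0, 100] and valid block count. */
struct demo_segment {
    unsigned age;
    unsigned valid_blocks;
};

/* Step 1: among segments at least as old as the threshold, pick the one
 * with the fewest valid blocks (cheapest to migrate). Returns -1 if none. */
int demo_pick_source(const struct demo_segment *segs, size_t n,
                     unsigned age_threshold)
{
    int best = -1;

    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (segs[i].age < age_threshold)
            continue;
        if (best < 0 || segs[i].valid_blocks < segs[best].valid_blocks)
            best = (int)i;
    }
    return best;
}

/* Step 2: among old-enough segments whose age is within `radius` of the
 * source victim's age, pick the one with the most valid blocks. */
int demo_pick_target(const struct demo_segment *segs, size_t n,
                     unsigned age_threshold, unsigned radius, int src)
{
    int best = -1;

    assert(src >= 0 && (size_t)src < n);
    for (size_t i = 0; i < n; i++) {
        unsigned a = segs[i].age, b = segs[src].age;
        unsigned diff = a > b ? a - b : b - a;

        if ((int)i == src || a < age_threshold || diff >= radius)
            continue;
        if (best < 0 || segs[i].valid_blocks > segs[best].valid_blocks)
            best = (int)i;
    }
    return best;
}
```

Step 3, merging the source victim's valid blocks into the target victim, is not modeled here. The strict `diff >= radius` rejection follows the wording that the search radius should be less than the threshold.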
  3. 10 Sep, 2020 15 commits
  4. 09 Sep, 2020 4 commits
    • Merge tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · ab29a807
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Fix an NFS/RDMA resource leak
      
       - Fix the error handling during delegation recall
      
       - NFSv4.0 needs to return the delegation on a zero-stateid SETATTR
      
       - Stop printk reading past end of string
      
      * tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: stop printk reading past end of string
        NFS: Zero-stateid SETATTR should first return delegation
        NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall
        xprtrdma: Release in-flight MRs on disconnect
    • f2fs: Return EOF on unaligned end of file DIO read · 20d0a107
      Gabriel Krisman Bertazi authored
      Reading past end of file returns EOF for aligned reads but -EINVAL for
      unaligned reads on f2fs.  While documentation is not strict about this
      corner case, most filesystems return EOF in this case, as iomap
      filesystems do.  This patch consolidates the behavior for f2fs by
      making it return EOF(0).
      
      It can be verified by a read loop on a file that does a partial read
      before EOF (a file that doesn't end at an aligned address).  The
      following code fails on an unaligned file on f2fs, but not on
      btrfs, ext4, and xfs.
      
        while (done < total) {
          ssize_t delta = pread(fd, buf + done, total - done, off + done);
          if (!delta)
            break;
          ...
        }
      
      It is arguable whether filesystems should actually return EOF or
      -EINVAL, but since iomap filesystems support it, and so does the
      original DIO code, it seems reasonable to consolidate on that.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
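A self-contained version of the read loop from the EOF commit message might look like the sketch below. It is an illustration only: the function names and the temporary-file demo are hypothetical, and the demo uses ordinary buffered I/O rather than O_DIRECT, so it shows the loop's expected EOF handling and does not reproduce the f2fs DIO case.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `total` bytes starting at `off`, stopping early at EOF.
 * Returns the number of bytes read, or -1 on error (errno set by pread). */
ssize_t demo_read_until_eof(int fd, char *buf, size_t total, off_t off)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < total) {
        ssize_t delta = pread(fd, buf + done, total - done,
                              off + (off_t)done);
        if (delta < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry */
            return -1;          /* e.g. -EINVAL before the f2fs fix */
        }
        if (delta == 0)         /* EOF: the file was shorter than requested */
            break;
        done += (size_t)delta;
    }
    return (ssize_t)done;
}

/* Hypothetical demo: write a 10-byte file, then request 16 bytes. */
ssize_t demo_short_file(void)
{
    char path[] = "/tmp/eof-demo-XXXXXX";
    char buf[16];
    ssize_t n;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears once fd is closed */
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        return -1;
    }
    n = demo_read_until_eof(fd, buf, sizeof(buf), 0);
    close(fd);
    return n;
}
```

A caller that treats a return of -1 as a hard failure is the kind of code that broke on f2fs when the unaligned tail read returned -EINVAL instead of 0.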
    • f2fs: fix indefinite loop scanning for free nid · e2cab031
      Sahitya Tummala authored
      If the sbi->ckpt->next_free_nid is not NAT block aligned and there
      are free nids in that NAT block between the start of the block and
      next_free_nid, then those free nids will not be scanned in scan_nat_page().
      This results in a mismatch between nm_i->available_nids and the sum of
      nm_i->free_nid_count of all NAT blocks scanned, and nm_i->available_nids
      will always be greater than the sum of free nids in all the blocks.
      Under this condition, if we use all the currently scanned free nids,
      then f2fs_alloc_nid() will loop forever, as nm_i->available_nids
      is still not zero but nm_i->free_nid_count of that partially scanned
      NAT block is zero.
      
      Fix this by aligning nm_i->next_scan_nid to the first nid of the
      corresponding NAT block.
      Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
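The alignment described in the free nid fix amounts to rounding a nid down to the first nid of its NAT block. A minimal sketch, assuming a hypothetical helper name and taking the number of nids per NAT block as a parameter rather than using the kernel's own constant:

```c
#include <assert.h>
#include <stdint.h>

/* Round `nid` down to the first nid of the NAT block that contains it.
 * `nids_per_block` is a parameter here; the kernel uses its own constant. */
uint32_t demo_align_nid(uint32_t nid, uint32_t nids_per_block)
{
    assert(nids_per_block != 0);
    return nid - (nid % nids_per_block);
}
```

For example, with an assumed 455 nids per block, a next_free_nid of 460 would be aligned to 455, so the free nids 455 through 459 at the start of that block are scanned instead of skipped.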
    • f2fs: Fix type of section block count variables · 123aaf77
      Shin'ichiro Kawasaki authored
      Commit da52f8ad ("f2fs: get the right gc victim section when section
      has several segments") added code to count blocks of each section using
      variables of type 'unsigned short', which is 2 bytes in size on many
      systems. However, the counts can exceed the 2-byte range, and the
      type conversion then results in wrong values. In particular, when f2fs
      sections have as many as USHRT_MAX + 1 blocks, the count is handled as 0.
      This triggers an infinite loop in init_dirty_segmap() during the mount
      system call. Fix this by changing the type of the variables to block_t.
      
      Fixes: da52f8ad ("f2fs: get the right gc victim section when section has several segments")
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
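The truncation behind the section block count bug can be reproduced in a few lines of standard C. The helper names below are hypothetical, and uint32_t stands in for the wider type the fix switches to; this is an illustration of the conversion, not the kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Store a block count in an 'unsigned short', as the buggy code did.
 * Conversion to an unsigned type wraps modulo USHRT_MAX + 1. */
unsigned short demo_count_as_ushort(uint32_t blocks)
{
    /* The demonstration assumes unsigned short is narrower than 32 bits. */
    assert(sizeof(unsigned short) < sizeof(uint32_t));
    return (unsigned short)blocks;
}

/* Store the same count in a 32-bit type, which keeps the full value. */
uint32_t demo_count_as_u32(uint32_t blocks)
{
    return blocks;
}
```

A section with exactly USHRT_MAX + 1 blocks therefore appears to hold 0 blocks in the narrow variable, while the 32-bit variable keeps the real count.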
  5. 08 Sep, 2020 7 commits