    f2fs: fix summary info corruption · 2b60311d
    Chao Yu authored
    Sometimes, after running generic/270 of fstests, fsck reports that the
    summary info and the actual position of a block address in a direct
    node have become inconsistent.
    
    The root cause is a race between __f2fs_replace_block and change_curseg,
    as shown below:
    
    Thread A				Thread B
    - __clone_blkaddrs
     - f2fs_replace_block
      - __f2fs_replace_block
       - segnoA = GET_SEGNO(sbi, blkaddrA);
       - type = se->type:=CURSEG_HOT_DATA
       - if (!IS_CURSEG(sbi, segnoA))
             type = CURSEG_WARM_DATA
    					- allocate_data_block
    					 - allocate_segment
    					  - get_ssr_segment
    					  - change_curseg(segnoA, CURSEG_HOT_DATA)
       - change_curseg(segnoA, CURSEG_WARM_DATA)
        - reset_curseg
         - __set_sit_entry_type
          - change se->type from CURSEG_HOT_DATA to CURSEG_WARM_DATA
    
    In the end, the hot curseg is located in segnoA, but the type of segnoA
    has become CURSEG_WARM_DATA.
    
    Then, if we invoke __f2fs_replace_block(blkaddrB, blkaddrA, true, false),
    blkaddrA is located in segnoA, so we move the warm curseg to segnoA,
    update its summary cache, and write it back to the summary block.
    
    But segnoA is also used by the hot curseg. Once the hot curseg moves or
    is persisted, it overwrites the summary block content with its own stale
    summary cache, resulting in an inconsistent state.
    
    This patch fixes the issue by introducing a global curseg lock to avoid
    the race between __f2fs_replace_block and change_curseg.
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>