• Long Li's avatar
    xfs: abort intent items when recovery intents fail · f8f9d952
    Long Li authored
    When recovering intents, we capture newly created intent items as part of
    committing recovered intent items.  If intent recovery fails at a later
    point, we forget to remove those newly created intent items from the AIL
    and hang:
    
        [root@localhost ~]# cat /proc/539/stack
        [<0>] xfs_ail_push_all_sync+0x174/0x230
        [<0>] xfs_unmount_flush_inodes+0x8d/0xd0
        [<0>] xfs_mountfs+0x15f7/0x1e70
        [<0>] xfs_fs_fill_super+0x10ec/0x1b20
        [<0>] get_tree_bdev+0x3c8/0x730
        [<0>] vfs_get_tree+0x89/0x2c0
        [<0>] path_mount+0xecf/0x1800
        [<0>] do_mount+0xf3/0x110
        [<0>] __x64_sys_mount+0x154/0x1f0
        [<0>] do_syscall_64+0x39/0x80
        [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    When newly created intent items fail to commit via transaction, intent
    recovery hasn't created done items for these newly created intent items,
    so the capture structure is the sole owner of the captured intent items.
    We must release them explicitly or else they leak:
    
    unreferenced object 0xffff888016719108 (size 432):
      comm "mount", pid 529, jiffies 4294706839 (age 144.463s)
      hex dump (first 32 bytes):
        08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff  ..q.......q.....
        18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff  ..q.......q.....
      backtrace:
        [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0
        [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150
        [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340
        [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0
        [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980
        [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70
        [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0
        [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0
        [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70
        [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20
        [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0
        [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0
        [<ffffffff81a9f35f>] path_mount+0xecf/0x1800
        [<ffffffff81a9fd83>] do_mount+0xf3/0x110
        [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0
        [<ffffffff83968739>] do_syscall_64+0x39/0x80
    
    Fix the problem above by abort intent items that don't have a done item
    when recovery intents fail.
    
    Fixes: e6fff81e ("xfs: proper replay of deferred ops queued during log recovery")
    Signed-off-by: default avatarLong Li <leo.lilong@huawei.com>
    Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
    Signed-off-by: default avatarChandan Babu R <chandanbabu@kernel.org>
    f8f9d952
xfs_log_recover.c 97.3 KB