    btrfs: send: fix sending link commands for existing file paths · 3aa5bd36
    BingJing Chang authored
    There is a bug sending link commands for existing file paths. When we're
    processing an inode, we go over all references. All the new file paths are
    added to the "new_refs" list. And all the deleted file paths are added to
    the "deleted_refs" list. In the end, when we finish processing the inode,
    we iterate over all the items in the "new_refs" list and send link commands
    for those file paths. After that, we go over all the items in the
    "deleted_refs" list and send unlink commands for them. If a file path
    appears in both lists, we try to create it before we remove it, so the
    receiver gets an -EEXIST error when trying the link operation.
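
    The ordering problem can be sketched in user space. The code below is
    only an illustration, not the send/receive code: struct dir_state,
    do_link(), do_unlink() and apply_links_then_unlinks() are hypothetical
    names, and file names are modeled as the integers 1..2000.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_NAME 2001	/* names are modeled as the integers 1..2000 */

/* Hypothetical stand-in for the receiver's directory: present[n] is true
 * when a hard link named "n" currently exists. */
struct dir_state {
	bool present[MAX_NAME];
};

/* Create link "name"; fails like link(2) if the name already exists. */
static int do_link(struct dir_state *d, int name)
{
	assert(name > 0 && name < MAX_NAME);
	if (d->present[name])
		return -EEXIST;
	d->present[name] = true;
	return 0;
}

/* Remove link "name"; fails if the name does not exist. */
static int do_unlink(struct dir_state *d, int name)
{
	assert(name > 0 && name < MAX_NAME);
	if (!d->present[name])
		return -ENOENT;
	d->present[name] = false;
	return 0;
}

/* Apply every new ref first, then every deleted ref, stopping at the
 * first error. This is the order described in the text above. */
static int apply_links_then_unlinks(struct dir_state *d,
				    const int *new_refs, int n_new,
				    const int *deleted_refs, int n_deleted)
{
	for (int i = 0; i < n_new; i++) {
		int ret = do_link(d, new_refs[i]);

		if (ret)
			return ret;
	}
	for (int i = 0; i < n_deleted; i++) {
		int ret = do_unlink(d, deleted_refs[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

    With a directory that already contains name 1238, passing 1238 in both
    arrays makes the first do_link() return -EEXIST, so the unlink that
    would have cleared the name is never reached.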
    
    Example of duplicated file paths in both lists:
    
      $ btrfs subvolume create vol
    
      # create a file and 2000 hard links to the same inode
      $ touch vol/foo
      $ for i in {1..2000}; do link vol/foo vol/$i ; done
    
      # take a snapshot for a parent snapshot
      $ btrfs subvolume snapshot -r vol snap1
    
      # remove 2000 hard links and re-create the last 1000 links
      $ for i in {1..2000}; do rm vol/$i; done;
      $ for i in {1001..2000}; do link vol/foo vol/$i; done
    
      # take another one for a send snapshot
      $ btrfs subvolume snapshot -r vol snap2
    
      $ mkdir receive_dir
      $ btrfs send snap2 -p snap1 | btrfs receive receive_dir/
      At subvol snap2
      link 1238 -> foo
      ERROR: link 1238 -> foo failed: File exists
    
    In this case, we will have the same file paths added to both lists. In the
    parent snapshot, reference paths {1..1237} are stored in inode references,
    but reference paths {1238..2000} are stored in inode extended references.
    In the send snapshot, all reference paths {1001..2000} are stored in inode
    references. During the incremental send, we process the inode references
    first. In record_changed_ref(), we iterate over all inode references in
    the send and parent snapshots. For every inode reference, we also use
    find_iref() to check whether the same file path appears in the other
    snapshot. Inode references {1238..2000}, which appear in the send
    snapshot but not in the parent snapshot, are added to the "new_refs"
    list. Inode references {1..1000}, which appear in the parent snapshot
    but not in the send snapshot, are added to the "deleted_refs" list.
    Next, when we process the inode extended references, reference paths
    {1238..2000} are added to the "deleted_refs" list because all of them
    appear only in the parent snapshot. The two lists now contain:
    "new_refs" list: {1238..2000}
    "deleted_refs" list: {1..1000}, {1238..2000}
    
    Reference paths {1238..2000} appear in both lists. Given the processing
    order described above, the receiver gets an -EEXIST error when trying
    the link operations.
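
    The bookkeeping in this example can be replayed with a small model. It
    assumes, as described above, that each lookup only compares items of
    the same type (inode references against inode references, extended
    references against extended references). struct range,
    compare_one_type() and count_in_both_lists() are hypothetical names
    used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NAME 2001	/* names are modeled as the integers 1..2000 */

/* Inclusive range of names; first > last means the range is empty. */
struct range {
	int first;
	int last;
};

static bool in_range(struct range r, int name)
{
	return name >= r.first && name <= r.last;
}

/* Compare one item type between the two snapshots. A name found only on
 * the send side is marked new, a name found only on the parent side is
 * marked deleted. The other item type is never consulted. */
static void compare_one_type(struct range parent, struct range send,
			     bool *is_new, bool *is_deleted)
{
	assert(parent.first >= 1 && parent.last < MAX_NAME);
	assert(send.first >= 1 && send.last < MAX_NAME);

	for (int n = send.first; n <= send.last; n++)
		if (!in_range(parent, n))
			is_new[n] = true;
	for (int n = parent.first; n <= parent.last; n++)
		if (!in_range(send, n))
			is_deleted[n] = true;
}

/* Replay the example and count the names that end up in both lists. */
static int count_in_both_lists(void)
{
	bool is_new[MAX_NAME] = { false };
	bool is_deleted[MAX_NAME] = { false };
	struct range parent_irefs = { 1, 1237 };
	struct range parent_extrefs = { 1238, 2000 };
	struct range send_irefs = { 1001, 2000 };
	struct range send_extrefs = { 1, 0 };	/* empty */
	int both = 0;

	compare_one_type(parent_irefs, send_irefs, is_new, is_deleted);
	compare_one_type(parent_extrefs, send_extrefs, is_new, is_deleted);

	for (int n = 1; n < MAX_NAME; n++)
		if (is_new[n] && is_deleted[n])
			both++;
	return both;
}
```

    In this model count_in_both_lists() returns 763, which is the size of
    {1238..2000}.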
    
    One way to fix the bug would be to process the "deleted_refs" list
    before the "new_refs" list. However, reshuffling the processing order is
    not easy. If we did so, we might unlink all the existing paths first,
    leaving no valid path to link from. It would also be inefficient,
    because we would send a bunch of unlinks followed by links for the same
    paths. Moreover, it makes little sense to have duplicates in both
    lists: a reference path cannot be both new and already seen in the
    past, or we would not call it a new path. It is also not a good idea to
    make find_iref() check a reference against all inode references and all
    inode extended references, because that may result in large disk reads.
    
    So we introduce two rbtrees to make the references easier to look up.
    We also introduce record_new_ref_if_needed() and
    record_deleted_ref_if_needed() for changed_ref() to check for and remove
    duplicated references early.
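
    The deduplication idea can be sketched as follows. This is a simplified
    model and not the send.c code: plain boolean arrays stand in for the
    two rbtrees, names are integers, and sketch_record_new_ref(),
    sketch_record_deleted_ref() and count_set() are hypothetical helpers.
    The assumption is that recording a reference first checks the opposite
    set and, on a match, cancels the pair instead of adding a duplicate.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NAME 2001	/* names are modeled as the integers 1..2000 */

/* Boolean arrays used here in place of the two rbtrees. */
struct ref_sets {
	bool is_new[MAX_NAME];
	bool is_deleted[MAX_NAME];
};

/* Record "name" as new unless it is already recorded as deleted; in that
 * case the path exists in both snapshots, so drop the deleted entry. */
static void sketch_record_new_ref(struct ref_sets *s, int name)
{
	assert(name > 0 && name < MAX_NAME);
	if (s->is_deleted[name])
		s->is_deleted[name] = false;
	else
		s->is_new[name] = true;
}

/* Mirror image of the helper above for deleted references. */
static void sketch_record_deleted_ref(struct ref_sets *s, int name)
{
	assert(name > 0 && name < MAX_NAME);
	if (s->is_new[name])
		s->is_new[name] = false;
	else
		s->is_deleted[name] = true;
}

/* Count how many names are marked in one set. */
static int count_set(const bool *set)
{
	int count = 0;

	for (int n = 1; n < MAX_NAME; n++)
		if (set[n])
			count++;
	return count;
}
```

    Replaying the example through these helpers (new {1238..2000}, then
    deleted {1..1000}, then deleted {1238..2000}) leaves the new set empty
    and the deleted set holding only {1..1000}, so no link command is sent
    for a path that already exists.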

    Reviewed-by: Robbie Ko <robbieko@synology.com>
    Reviewed-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: BingJing Chang <bingjingc@synology.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
send.c 206 KB