1. 30 Mar, 2018 25 commits
  2. 26 Mar, 2018 15 commits
    • Btrfs: dev-replace: make sure target is identical to source when raid56 rebuild fails · 4759700a
      Liu Bo authored
      In the last step of scrub_handle_error_block, we try to combine good
      copies on all possible mirrors. This works fine for raid1 and raid10,
      but not for raid56, as it is doing a parity rebuild.
      
      If the parity rebuild does not come back with data that matches its
      checksum, then in case of replace we would rather write what is stored
      in the source device than the data calculated from parity.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: raid56: remove redundant async_missing_raid56 · d6a69135
      Liu Bo authored
      async_missing_raid56() is identical to async_read_rebuild().
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: adjust return values of btrfs_inode_by_name · 005d6712
      Su Yue authored
      Previously, btrfs_inode_by_name() returned 0, which left the caller to
      check the objectid of location even if the type of location was invalid.
      
      Let btrfs_inode_by_name() return -EUCLEAN if a corrupted location of a
      dir entry is found.  Removal of label out_err also simplifies the
      function.
      Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ drop unlikely ]
      Signed-off-by: David Sterba <dsterba@suse.com>
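The return-value change described in this commit can be sketched roughly as follows. This is a simplified, stand-alone illustration and not the kernel code: `demo_key`, `demo_inode_by_name`, and the `DEMO_*` type constants are hypothetical names, and which types count as valid is an assumption made only for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* fallback for platforms whose errno.h lacks it */
#endif

/* Hypothetical key types standing in for the real on-disk key types. */
enum demo_key_type { DEMO_INODE_ITEM = 1, DEMO_ROOT_ITEM = 2, DEMO_GARBAGE = 99 };

/* Hypothetical, reduced stand-in for a directory entry's location key. */
struct demo_key {
	uint64_t objectid;
	enum demo_key_type type;
};

/*
 * Sketch of the new behaviour: a location with an unexpected type is
 * reported as corruption, instead of returning 0 and leaving the caller
 * to inspect the objectid.
 */
static int demo_inode_by_name(const struct demo_key *found,
			      struct demo_key *location)
{
	if (found->type != DEMO_INODE_ITEM && found->type != DEMO_ROOT_ITEM)
		return -EUCLEAN;

	*location = *found;
	return 0;
}
```

With this shape, a caller only has to test the return value; it no longer needs a separate validity check on the location it gets back.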
    • btrfs: rename btrfs_close_extra_device to btrfs_free_extra_devids · 9b99b115
      Anand Jain authored
      The function btrfs_close_extra_devices() is about freeing
      extra devids which may once have belonged to this filesystem.
      So rename it and add a comment. The _devid suffix is
      appropriate as this function won't handle devices which are
      outside of the filesystem being mounted.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: Remove root argument from cow_file_range_inline · d02c0e20
      Nikolay Borisov authored
      This argument is always set to the root of the inode, which is also
      passed. So let's get a reference inside the function and simplify
      the arg list.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: send: fix typo in TLV_PUT · 895a72be
      Liu Bo authored
      According to tlv_put()'s prototype, data and attrlen need to be
      exchanged in the macro, but it seems all callers are already aware of
      this misorder and are therefore not affected.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
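The sketch below shows one way such a parameter misorder in a macro can be harmless. It is an illustration under assumptions, not the btrfs source: `demo_put`, `DEMO_PUT_MISNAMED`, and `DEMO_PUT_FIXED` are hypothetical names, and it assumes the typo affected only the macro's parameter names while arguments were still forwarded by position.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for tlv_put(): data pointer first, length second. */
static int demo_put(char *buf, const void *data, int len)
{
	memcpy(buf, data, (size_t)len);
	return len;
}

/*
 * Misnamed variant: the parameter called 'attrlen' sits in the data slot
 * and the one called 'data' sits in the length slot. The names are wrong,
 * but arguments are forwarded by position, so the expansion is correct as
 * long as callers pass (data, length) in that order.
 */
#define DEMO_PUT_MISNAMED(buf, attrlen, data) demo_put(buf, attrlen, data)

/* Corrected variant: names match the prototype; the expansion is unchanged. */
#define DEMO_PUT_FIXED(buf, data, attrlen) demo_put(buf, data, attrlen)
```

Because the preprocessor substitutes arguments purely by position, both macros expand to the same call for the same argument list, which is consistent with the commit's note that existing callers are not affected.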
    • btrfs: Remove root argument from btrfs_log_dentry_safe · e5b84f7a
      Nikolay Borisov authored
      Now that nothing uses the root arg of btrfs_log_dentry_safe it can be
      safely removed. No functional changes.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: Remove root arg from btrfs_log_inode_parent · f882274b
      Nikolay Borisov authored
      btrfs_log_inode_parent is called from 2 places (btrfs_log_dentry_safe
      and btrfs_log_new_name), both of which pass inode->root as the root
      argument along with the inode itself. Remove the redundant root
      argument and get a reference to the root directly from the inode. Also
      remove the redundant root != inode->root check from the same function.
      No functional change.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: Remove redundant comment from btrfs_search_forward · 448f3a17
      Nikolay Borisov authored
      This function always sets keep_locks to 1 and saves the old value of
      keep_locks which is restored at the end. So there is no way it can be
      called without keep_locks being set. Remove comment imposing redundant
      requirement on callers.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
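The save/force/restore pattern that this commit message describes can be sketched as below. This is a minimal illustration, not the kernel function: `demo_path`, `demo_search_forward`, and `demo_seen_keep_locks` are hypothetical names, and the search itself is elided.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for the path structure. */
struct demo_path {
	int keep_locks;
};

/* Records the value the search body observed, for illustration only. */
static int demo_seen_keep_locks;

static int demo_search_forward(struct demo_path *path)
{
	int saved = path->keep_locks;	/* remember the caller's setting */

	path->keep_locks = 1;		/* always forced on for the search */
	demo_seen_keep_locks = path->keep_locks; /* search body would run here */
	path->keep_locks = saved;	/* restore before returning */
	return 0;
}
```

Since the function forces the flag on and restores it afterwards, the caller's initial value does not matter, which is why a comment requiring callers to set it beforehand would be redundant.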
    • btrfs: move btrfs_listxattr prototype to xattr.h · 738c93d4
      David Sterba authored
      There's a proper header for xattr handlers.
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: adjust return type of btrfs_getxattr · bcadd705
      David Sterba authored
      The xattr_handler::get prototype returns int, use it. The only ssize_t
      exception is the per-inode listxattr handler.
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: drop extern from function declarations · ab0d0936
      David Sterba authored
      Extern for functions does not make any difference, there are only a few
      so let's remove them before it's too late.
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: send, do not issue unnecessary truncate operations · ffa7c429
      Filipe Manana authored
      When send finishes processing an inode representing a regular file, it
      always issues a truncate operation for that file, even if its size did
      not change or the last write sets the file size correctly. In the most
      common cases, the issued write operations set the file to the correct
      size (for either full or incremental sends) or the file size did not
      change (for incremental sends), so the only case where a truncate
      operation is needed is when a file's size becomes smaller in the send
      snapshot compared to the parent snapshot.
      
      By not issuing unnecessary truncate operations we reduce the stream size
      and save time in the receiver. Currently truncating a file to the same
      size triggers writeback of its last page (if it's dirty) and waits for it
      to complete (only if the file size is not aligned with the filesystem's
      sector size). This is being fixed by another patch and is independent of
      this change (that patch's title is "Btrfs: skip writeback of last page
      when truncating file to same size").
      
      The following script was used to measure time spent by a receiver without
      this change applied, with this change applied, and without this change and
      with the truncate fix applied (the fix to not make it start and wait for
      writeback to complete).
      
        $ cat test_send.sh
        #!/bin/bash
      
        SRC_DEV=/dev/sdc
        DST_DEV=/dev/sdd
        SRC_MNT=/mnt/sdc
        DST_MNT=/mnt/sdd
      
        mkfs.btrfs -f $SRC_DEV >/dev/null
        mkfs.btrfs -f $DST_DEV >/dev/null
        mount $SRC_DEV $SRC_MNT
        mount $DST_DEV $DST_MNT
      
        echo "Creating source filesystem"
        for ((t = 0; t < 10; t++)); do
            (
                for ((i = 1; i <= 20000; i++)); do
                    xfs_io -f -c "pwrite -S 0xab 0 5000" \
                        $SRC_MNT/file_$i > /dev/null
                done
            ) &
            worker_pids[$t]=$!
        done
        wait ${worker_pids[@]}
      
        echo "Creating and sending snapshot"
        btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null
        /usr/bin/time -f "send took %e seconds"    \
               btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1
        /usr/bin/time -f "receive took %e seconds" \
               btrfs receive -f $SRC_MNT/send_file $DST_MNT
      
        umount $SRC_MNT
        umount $DST_MNT
      
      The results, which are averages for 5 runs for each case, were the
      following:
      
      * Without this change
      
      average receive time was 26.49 seconds
      standard deviation of 2.53 seconds
      
      * Without this change and with the truncate fix
      
      average receive time was 12.51 seconds
      standard deviation of 0.32 seconds
      
      * With this change and without the truncate fix
      
      average receive time was 10.02 seconds
      standard deviation of 1.11 seconds
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
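The rule stated in this commit message (a truncate is only needed when the file shrank relative to the parent snapshot) can be sketched as a small predicate. This is a simplified illustration of the common-case rule as worded in the message, not the actual send code, which may track more state; `demo_need_truncate` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a truncate must follow the writes
 * issued for a regular file. 'has_parent' is false for a full send.
 */
static bool demo_need_truncate(bool has_parent, uint64_t parent_size,
			       uint64_t send_size)
{
	if (!has_parent)
		return false;	/* full send: writes are assumed to set the size */

	return send_size < parent_size;	/* file shrank since the parent snapshot */
}
```

For the test script in the message, every file is 5000 bytes and the send is a full send, so under this rule no truncate would be issued for any of the 200000 files.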
    • Btrfs: skip writeback of last page when truncating file to same size · 213e8c55
      Filipe Manana authored
      When we truncate a file to the same size and that size is not aligned
      with the sector size, we end up triggering writeback (and waiting for it
      to complete) of the last page. This is unnecessary, as we cannot have
      delayed allocation beyond the inode's i_size and the goal of truncating
      a file to its own size is to discard prealloc extents (allocated via the
      fallocate(2) system call). Besides the unnecessary IO start and wait, it
      also breaks the opportunity for larger contiguous extents on disk, as
      before the last dirty page there might be other dirty pages.
      
      This scenario is probably not very common in general, however it is
      common for btrfs receive implementations because currently the send
      stream always issues a truncate operation for each processed inode as
      the last operation for that inode (this truncate operation is not
      always needed, and the send implementation will be changed to avoid
      issuing it when it is unnecessary).
      
      So improve this by not starting and waiting for writeback of the inode's
      last page when we are truncating to exactly the same size.
      
      The following script was used to quickly measure the time a receive
      operation takes:
      
       $ cat test_send.sh
       #!/bin/bash
      
       SRC_DEV=/dev/sdc
       DST_DEV=/dev/sdd
       SRC_MNT=/mnt/sdc
       DST_MNT=/mnt/sdd
      
       mkfs.btrfs -f $SRC_DEV >/dev/null
       mkfs.btrfs -f $DST_DEV >/dev/null
       mount $SRC_DEV $SRC_MNT
       mount $DST_DEV $DST_MNT
      
       echo "Creating source filesystem"
       for ((t = 0; t < 10; t++)); do
           (
               for ((i = 1; i <= 20000; i++)); do
                   xfs_io -f -c "pwrite -S 0xab 0 5000" \
                      $SRC_MNT/file_$i > /dev/null
               done
           ) &
           worker_pids[$t]=$!
       done
       wait ${worker_pids[@]}
      
       echo "Creating and sending snapshot"
       btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null
       /usr/bin/time -f "send took %e seconds"    \
           btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1
       /usr/bin/time -f "receive took %e seconds" \
           btrfs receive -f $SRC_MNT/send_file $DST_MNT
      
       umount $SRC_MNT
       umount $DST_MNT
      
      The results for 5 runs were the following:
      
      * Without this change
      
      average receive time was 26.49 seconds
      standard deviation of 2.53 seconds
      
      * With this change
      
      average receive time was 12.51 seconds
      standard deviation of 0.32 seconds
      Reported-by: Robbie Ko <robbieko@synology.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
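The before/after behaviour described in this commit message can be sketched as two predicates. This is an illustration under assumptions, not the kernel implementation: the function names are hypothetical, the "before" rule for sizes that do change is assumed to follow the same alignment test, and the 4096-byte sector size used in the note below is an assumed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old behaviour as described: an unaligned size triggers writeback. */
static bool demo_writeback_before(uint64_t new_size, uint32_t sectorsize)
{
	return (new_size % sectorsize) != 0;
}

/* New behaviour: truncating to exactly the current size skips writeback. */
static bool demo_writeback_after(uint64_t old_size, uint64_t new_size,
				 uint32_t sectorsize)
{
	if (new_size == old_size)
		return false;

	return demo_writeback_before(new_size, sectorsize);
}
```

For the 5000-byte files created by the test script, with an assumed 4096-byte sector size, the old predicate reports that writeback is needed on a same-size truncate, while the new one does not.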