1. 17 Dec, 2012 31 commits
  2. 12 Dec, 2012 9 commits
      Btrfs: allow repair code to include target disk when searching mirrors · ad6d620e
      Stefan Behrens authored
      Make the target disk of a running device replace operation
      available for reading. This is used only as a last resort by
      the defect repair procedure, and whether it is possible depends
      on the location of the data block to read, because during an
      ongoing device replace operation the target drive is only
      partially filled with the filesystem data.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
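The location dependence described in this commit message can be illustrated with a minimal sketch. It assumes the copy proceeds from low to high physical offsets and that a single progress marker records how far it has come; `replace_progress`, `copied_up_to`, and `target_can_serve_read` are hypothetical names for illustration, not the kernel's data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical progress marker: every byte below copied_up_to is
 * assumed to be already present on the replace target. */
struct replace_progress {
    uint64_t copied_up_to;
};

/* The target may serve a read only if the whole requested range lies
 * inside the already copied part. Written to avoid integer overflow. */
static bool target_can_serve_read(const struct replace_progress *p,
                                  uint64_t physical, uint64_t len)
{
    return len <= p->copied_up_to && physical <= p->copied_up_to - len;
}
```

A repair path would consult such a check before treating the target as a usable mirror for a given block.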
      Btrfs: increase BTRFS_MAX_MIRRORS by one for dev replace · 72d7aefc
      Stefan Behrens authored
      This change of the define is effective in all modes, although
      it is required and used only while a device replace procedure
      is running. The reason is that during an active device replace
      procedure, the target device of the copy operation is also a
      mirror for the filesystem data, and it can be read in order to
      repair read errors on other disks.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      Btrfs: optionally avoid reads from device replace source drive · 30d9861f
      Stefan Behrens authored
      It is desirable to be able to configure the device replace
      procedure to avoid reading the source drive (the one to be
      copied) whenever possible. This is useful when the number of
      read errors on this disk is high, because such errors would
      delay the copy procedure a lot. Therefore there is an option to
      avoid reading from the source disk unless the repair procedure
      really needs to access it. A regular read request asks for the
      block to be mapped with mirror_num == 0; in this case the
      source disk is avoided whenever possible. The repair code
      selects the mirror explicitly (mirror_num != 0); this case is
      not changed by this commit.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
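The selection rule in this commit message might be sketched roughly as follows. This is a simplified model, not the actual btrfs_map_block() logic: `struct mirror`, `is_replace_source`, `pick_read_mirror`, and the `avoid_source` flag are hypothetical names, and mirror numbers are assumed to be 1-based with 0 meaning "any mirror".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one copy of a block. */
struct mirror {
    int  devid;
    bool is_replace_source; /* device currently being copied away from */
};

/* Returns the index into mirrors[] that a read should use.
 * mirror_num == 0: regular read, skip the source drive if possible.
 * mirror_num != 0: explicit choice by the repair code, left unchanged. */
static size_t pick_read_mirror(const struct mirror *mirrors, size_t n,
                               int mirror_num, bool avoid_source)
{
    assert(n > 0 && mirror_num >= 0 && (size_t)mirror_num <= n);

    if (mirror_num != 0)
        return (size_t)mirror_num - 1;

    if (avoid_source) {
        for (size_t i = 0; i < n; i++)
            if (!mirrors[i].is_replace_source)
                return i;
    }
    /* No alternative exists (or avoidance is off): use the first mirror. */
    return 0;
}
```

Note that the fallback still reads the source drive when it holds the only copy, which matches the "whenever possible" wording above.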
      Btrfs: changes to live filesystem are also written to replacement disk · 472262f3
      Stefan Behrens authored
      During a running dev replace operation, all write requests to
      the live filesystem are duplicated to also write to the target
      drive. Therefore btrfs_map_block() is changed to duplicate
      stripes that are written to the source disk of a device replace
      procedure to be written to the target disk as well.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
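The stripe duplication described here can be pictured with a small sketch. It assumes that the target is written at the same physical offset as the source (the target being a one-to-one copy of the source device); `struct stripe` and `add_replace_target_stripes` are hypothetical names, not the kernel's types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical write location: a device and an offset on it. */
struct stripe {
    int      devid;
    uint64_t physical;
};

/* For every stripe that writes to the replace source, append a twin
 * stripe that writes the same data to the replace target. Returns the
 * new number of stripes; the array must have room for the additions. */
static size_t add_replace_target_stripes(struct stripe *stripes, size_t n,
                                         size_t capacity,
                                         int source_devid, int target_devid)
{
    size_t total = n;

    for (size_t i = 0; i < n; i++) {
        if (stripes[i].devid != source_devid)
            continue;
        assert(total < capacity);
        stripes[total].devid = target_devid;
        stripes[total].physical = stripes[i].physical; /* assumed 1:1 layout */
        total++;
    }
    return total;
}
```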
      Btrfs: introduce GET_READ_MIRRORS functionality for btrfs_map_block() · 29a8d9a0
      Stefan Behrens authored
      Before this commit, btrfs_map_block() was called with REQ_WRITE
      in order to retrieve the list of mirrors for a disk block.
      This needs to be changed for the device replace procedure since
      it makes a difference whether you are asking for read mirrors
      or for locations to write to.
      GET_READ_MIRRORS is introduced as a new interface to call
      btrfs_map_block().
      In the current commit, the functionality is not yet changed,
      only the interface for GET_READ_MIRRORS is introduced and all
      the places that should use this new interface are adapted.
      
      The reason that REQ_WRITE cannot be abused anymore to retrieve
      a list of read mirrors is that during a running dev replace
      operation all write requests to the live filesystem are
      duplicated to also write to the target drive.
      Keep in mind that the target disk is only partially a valid
      copy of the source disk while the operation is ongoing. All
      writes go to the target disk, but not all reads would return
      valid data on the target disk. Therefore it is not possible
      anymore to abuse a REQ_WRITE interface to find valid mirrors
      for a REQ_READ.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
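Why a write mapping can no longer stand in for a read-mirror query can be seen in a toy model. The sketch below is an illustration only: `enum map_op`, `struct location`, `holds_valid_data`, and `map_locations` are hypothetical and do not reflect the real btrfs_map_block() signature.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request kinds: where to write, or which copies to read. */
enum map_op { MAP_WRITE, MAP_GET_READ_MIRRORS };

/* Hypothetical location of one copy of a block. */
struct location {
    int  devid;
    bool holds_valid_data; /* false for a not yet copied target block */
};

/* Fills out[] with the locations relevant to the request and returns
 * their count. Writes go everywhere; read mirrors must hold valid data. */
static size_t map_locations(enum map_op op, const struct location *all,
                            size_t n, struct location *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (op == MAP_GET_READ_MIRRORS && !all[i].holds_valid_data)
            continue;
        out[count++] = all[i];
    }
    return count;
}
```

With two regular mirrors and a not yet copied target block, the write request yields three locations while the read-mirror request yields only two.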
      Btrfs: change core code of btrfs to support the device replace operations · 8dabb742
      Stefan Behrens authored
      This commit contains all the essential changes to the core code
      of Btrfs for support of the device replace procedure.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      Btrfs: add new sources for device replace code · e93c89c1
      Stefan Behrens authored
      This adds a new file to the sources together with the header file
      and the changes to ioctl.h and ctree.h that are required by the
      new C source file. Additionally, 4 new functions are added to
      volumes.c that deal with device creation and destruction.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      Btrfs: add code to scrub to copy read data to another disk · ff023aac
      Stefan Behrens authored
      The device replace procedure makes use of the scrub code. The scrub
      code is the most efficient code to read the allocated data of a disk,
      i.e. it reads sequentially in order to avoid disk head movements, it
      skips unallocated blocks, it uses read ahead mechanisms, and it
      contains all the code to detect and repair defects.
      This commit adds code to scrub to allow the scrub code to copy read
      data to another disk.
      One goal is to perform the copy as fast as possible. Therefore
      the write requests are collected until huge bios are built, and
      the write process is decoupled from the read process, with flow
      control in order to limit the allocated memory.
      The best performance on spinning disks can be reached when
      head movements are avoided as much as possible. Therefore a
      single worker is used to interface the read process with the
      write process.
      The regular scrub operation works as fast as before; it is not
      negatively influenced and is in fact more or less unchanged.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
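The batching and flow control mentioned in this commit message could look roughly like the single-threaded model below. It is a sketch under stated assumptions, not the scrub implementation: the two limits, `struct copy_writer`, and all function names are hypothetical, and real code would also have to check that batched pages are physically contiguous.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_WRITE_BIO 32 /* hypothetical size of one large write */
#define MAX_BIOS_IN_FLIGHT   2 /* hypothetical bound on pinned memory */

struct copy_writer {
    size_t pages_queued;   /* pages collected for the next write */
    size_t bios_in_flight; /* writes submitted but not yet completed */
    size_t bios_submitted; /* total writes issued so far */
};

/* Flow control: the reader has to pause while too many writes are pending. */
static bool reader_may_proceed(const struct copy_writer *w)
{
    return w->bios_in_flight < MAX_BIOS_IN_FLIGHT;
}

/* Submit whatever has been collected as one write request. */
static void writer_flush(struct copy_writer *w)
{
    if (w->pages_queued == 0)
        return;
    w->pages_queued = 0;
    w->bios_in_flight++;
    w->bios_submitted++;
}

/* Queue one freshly read page; returns false if the reader must wait. */
static bool writer_add_page(struct copy_writer *w)
{
    if (!reader_may_proceed(w))
        return false;
    if (++w->pages_queued == PAGES_PER_WRITE_BIO)
        writer_flush(w);
    return true;
}

/* Completion callback: one pending write has finished. */
static void writer_bio_done(struct copy_writer *w)
{
    if (w->bios_in_flight > 0)
        w->bios_in_flight--;
}
```

Collecting pages into large requests keeps writes sequential on the target, while the in-flight limit caps how much read data is held in memory at once.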
      Btrfs: handle errors from btrfs_map_bio() everywhere · 61891923
      Stefan Behrens authored
      With the addition of the device replace procedure, it is possible
      for btrfs_map_bio(READ) to report an error. This happens when a
      specific mirror is requested that is located on the target disk
      and the copy operation has not yet copied this block. Hence the
      block cannot be read, and this error state is indicated by
      returning EIO.
      Some background: a new mirror is added while the device replace
      procedure is running. btrfs_get_num_copies() returns one more,
      and btrfs_map_bio(GET_READ_MIRRORS) adds one more mirror if a
      disk location is involved that was already handled by the
      device replace copy operation. The assigned mirror num is the
      highest mirror number, e.g. the value 3 in case of RAID1.
      If btrfs_map_bio() is invoked with mirror_num == 0 (i.e., select
      any mirror), the copy on the target drive is never selected,
      because that disk should be able to perform the write requests
      as quickly as possible; read requests executing in parallel
      would only slow down the disk copy procedure. The second case
      is that btrfs_map_bio() is called with mirror_num > 0. This is
      done from the repair code only. In this case, the highest
      mirror num is assigned to the target disk, since it is used
      last. When this mirror is not available because the copy
      procedure has not yet handled this area, an error is returned.
      This commit adds handling of such errors everywhere in the
      code.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
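The mirror numbering and the new error case can be summarized in a short sketch. It is a simplified model of the behaviour described in this commit message, not the kernel code: `num_copies` and `select_mirror` are hypothetical helpers, and the rejection of out-of-range mirror numbers with EINVAL is an assumption of this example.

```c
#include <errno.h>
#include <stdbool.h>

/* While a replace is running, the target counts as one extra mirror. */
static int num_copies(int base_copies, bool replace_running)
{
    return base_copies + (replace_running ? 1 : 0);
}

/* Returns the 1-based mirror to read from, or a negative errno value.
 * mirror_num == 0 means "any mirror" and never picks the target.
 * The highest mirror number stands for the replace target. */
static int select_mirror(int mirror_num, int base_copies,
                         bool replace_running, bool block_already_copied)
{
    int total = num_copies(base_copies, replace_running);

    if (mirror_num == 0)
        return 1; /* simplification: always the first regular mirror */
    if (mirror_num < 0 || mirror_num > total)
        return -EINVAL;
    if (replace_running && mirror_num == total && !block_already_copied)
        return -EIO; /* target does not hold this block yet */
    return mirror_num;
}
```

For RAID1 with a running replace, mirror 3 denotes the target: requesting it yields -EIO until the copy has passed the block, and callers have to be prepared for that result.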