1. 21 Mar, 2013 1 commit
    • Linus Torvalds's avatar
      Merge tag 'dm-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm · 85ab3c46
      Linus Torvalds authored
      Pull device-mapper fixes from Alasdair G Kergon:
       "Fix reported data loss with discards and thin snapshots; avoid a
        deadlock observed in dm verity; fix a race in the new dm cache code
        along with some other minor bugs; store the cache policy version on
        disk to make the stored hints format future-proof."
      
      * tag 'dm-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
        dm cache: policy ignore hints if generated by different version
        dm cache: policy change version from string to integer set
        dm cache: fix race in writethrough implementation
        dm cache: metadata clear dirty bits on clean shutdown
        dm cache: avoid calling policy destructor twice on error
        dm cache: detect cache_create failure
        dm cache: avoid 64 bit division on 32 bit
        dm verity: avoid deadlock
        dm thin: fix non power of two discard granularity calc
        dm thin: fix discard corruption
      85ab3c46
  2. 20 Mar, 2013 12 commits
    • Mike Snitzer's avatar
      dm cache: policy ignore hints if generated by different version · ea2dd8c1
      Mike Snitzer authored
      When reading the dm cache metadata from disk, ignore the policy hints
      unless they were generated by the same major version number of the same
      policy module.
      
      The hints are considered to be private data belonging to the specific
      module that generated them and there is no requirement for them to make
      sense to different versions of the policy that generated them.
      Policy modules are all required to work fine if no previous hints are
      supplied (or if existing hints are lost).
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      ea2dd8c1
    • Mike Snitzer's avatar
      dm cache: policy change version from string to integer set · 4e7f506f
      Mike Snitzer authored
      Separate dm cache policy version string into 3 unsigned numbers
      corresponding to major, minor and patchlevel and store them at the end
      of the on-disk metadata so we know which version of the policy generated
      the hints in case a future version wants to use them differently.
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      4e7f506f
    • Joe Thornber's avatar
      dm cache: fix race in writethrough implementation · e2e74d61
      Joe Thornber authored
      We have found a race in the optimisation used in the dm cache
      writethrough implementation.  Currently, dm core sends the cache target
      two bios, one for the origin device and one for the cache device and
      these are processed in parallel.  This patch avoids the race by
      changing the code back to a simpler (slower) implementation which
      processes the two writes in series, one after the other, until we can
      develop a complete fix for the problem.
      
      When the cache is in writethrough mode it needs to send WRITE bios to
      both the origin and cache devices.
      
      Previously we've been implementing this by having dm core query the
      cache target on every write to find out how many copies of the bio it
      wants.  The cache will ask for two bios if the block is in the cache,
      and one otherwise.
      
      Then main problem with this is it's racey.  At the time this check is
      made the bio hasn't yet been submitted and so isn't being taken into
      account when quiescing a block for migration (promotion or demotion).
      This means a single bio may be submitted when two were needed because
      the block has since been promoted to the cache (catastrophic), or two
      bios where only one is needed (harmless).
      
      I really don't want to start entering bios into the quiescing system
      (deferred_set) in the get_num_write_bios callback.  Instead this patch
      simplifies things; only one bio is submitted by the core, this is
      first written to the origin and then the cache device in series.
      Obviously this will have a latency impact.
      
      deferred_writethrough_bios is introduced to record bios that must be
      later issued to the cache device from the worker thread.  This deferred
      submission, after the origin bio completes, is required given that we're
      in interrupt context (writethrough_endio).
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      e2e74d61
    • Joe Thornber's avatar
      dm cache: metadata clear dirty bits on clean shutdown · 79ed9caf
      Joe Thornber authored
      When writing the dirty bitset to the metadata device on a clean
      shutdown, clear the dirty bits.  Previously they were left indicating
      the cache was dirty. This led to confusion about whether there really
      was dirty data in the cache or not.  (This was a harmless bug.)
      Reported-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      79ed9caf
    • Heinz Mauelshagen's avatar
      dm cache: avoid calling policy destructor twice on error · b978440b
      Heinz Mauelshagen authored
      If the cache policy's config values are not able to be set we must
      set the policy to NULL after destroying it in create_cache_policy()
      so we don't attempt to destroy it a second time later.
      Signed-off-by: default avatarHeinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      b978440b
    • Heinz Mauelshagen's avatar
      dm cache: detect cache_create failure · 617a0b89
      Heinz Mauelshagen authored
      Return error if cache_create() fails.
      
      A missing return check made cache_ctr continue even after an error in
      cache_create() resulting in the cache object being destroyed.  So a
      simple failure like an odd number of cache policy config value arguments
      would result in an oops.
      Signed-off-by: default avatarHeinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      617a0b89
    • Joe Thornber's avatar
      dm cache: avoid 64 bit division on 32 bit · 414dd67d
      Joe Thornber authored
      Squash various 32bit link errors.
      
        >> on i386:
        >> drivers/built-in.o: In function `is_discarded_oblock':
        >> dm-cache-target.c:(.text+0x1ea28e): undefined reference to `__udivdi3'
        ...
      Reported-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      414dd67d
    • Mikulas Patocka's avatar
      dm verity: avoid deadlock · 3b6b7813
      Mikulas Patocka authored
      A deadlock was found in the prefetch code in the dm verity map
      function.  This patch fixes this by transferring the prefetch
      to a worker thread and skipping it completely if kmalloc fails.
      
      If generic_make_request is called recursively, it queues the I/O
      request on the current->bio_list without making the I/O request
      and returns. The routine making the recursive call cannot wait
      for the I/O to complete.
      
      The deadlock occurs when one thread grabs the bufio_client
      mutex and waits for an I/O to complete but the I/O is queued
      on another thread's current->bio_list and is waiting to get
      the mutex held by the first thread.
      
      The fix recognises that prefetching is not essential.  If memory
      can be allocated, it queues the prefetch request to the worker thread,
      but if not, it does nothing.
      Signed-off-by: default avatarPaul Taysom <taysom@chromium.org>
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      Cc: stable@kernel.org
      3b6b7813
    • Joe Thornber's avatar
      dm thin: fix non power of two discard granularity calc · 58051b94
      Joe Thornber authored
      Fix a discard granularity calculation to work for non power of 2 block sizes.
      
      In order for thinp to passdown discard bios to the underlying data
      device, the data device must have a discard granularity that is a
      factor of the thinp block size.  Originally this check was done by
      using bitops since the block_size was known to be a power of two.
      
      Introduced by commit f13945d7
      ("dm thin: support a non power of 2 discard_granularity").
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      58051b94
    • Joe Thornber's avatar
      dm thin: fix discard corruption · f046f89a
      Joe Thornber authored
      Fix a bug in dm_btree_remove that could leave leaf values with incorrect
      reference counts.  The effect of this was that removal of a shared block
      could result in the space maps thinking the block was no longer used.
      More concretely, if you have a thin device and a snapshot of it, sending
      a discard to a shared region of the thin could corrupt the snapshot.
      
      Thinp uses a 2-level nested btree to store it's mappings.  This first
      level is indexed by thin device, and the second level by logical
      block.
      
      Often when we're removing an entry in this mapping tree we need to
      rebalance nodes, which can involve shadowing them, possibly creating a
      copy if the block is shared.  If we do create a copy then children of
      that node need to have their reference counts incremented.  In this
      way reference counts percolate down the tree as shared trees diverge.
      
      The rebalance functions were incrementing the children at the
      appropriate time, but they were always assuming the children were
      internal nodes.  This meant the leaf values (in our case packed
      block/flags entries) were not being incremented.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      f046f89a
    • Linus Torvalds's avatar
      Merge tag 'vfio-v3.9-rc4' of git://github.com/awilliam/linux-vfio · 2ffdd7e2
      Linus Torvalds authored
      Pull vfio fix from Alex Williamson.
      
      * tag 'vfio-v3.9-rc4' of git://github.com/awilliam/linux-vfio:
        vfio: include <linux/slab.h> for kmalloc
      2ffdd7e2
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · ea4a0ce1
      Linus Torvalds authored
      Pull kvm fixes from Marcelo Tosatti.
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)
        KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797)
        KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796)
        KVM: x86: fix deadlock in clock-in-progress request handling
        KVM: allow host header to be included even for !CONFIG_KVM
      ea4a0ce1
  3. 19 Mar, 2013 19 commits
  4. 18 Mar, 2013 8 commits