1. 20 Mar, 2013 8 commits
    • dm cache: fix race in writethrough implementation · e2e74d61
      Joe Thornber authored
      We have found a race in the optimisation used in the dm cache
      writethrough implementation.  Currently, dm core sends the cache target
      two bios, one for the origin device and one for the cache device and
      these are processed in parallel.  This patch avoids the race by
      changing the code back to a simpler (slower) implementation which
      processes the two writes in series, one after the other, until we can
      develop a complete fix for the problem.
      
      When the cache is in writethrough mode it needs to send WRITE bios to
      both the origin and cache devices.
      
      Previously we've been implementing this by having dm core query the
      cache target on every write to find out how many copies of the bio it
      wants.  The cache will ask for two bios if the block is in the cache,
      and one otherwise.
      
      The main problem with this is that it's racy.  At the time this check is
      made the bio hasn't yet been submitted and so isn't being taken into
      account when quiescing a block for migration (promotion or demotion).
      This means a single bio may be submitted when two were needed because
      the block has since been promoted to the cache (catastrophic), or two
      bios where only one is needed (harmless).
      
      I really don't want to start entering bios into the quiescing system
      (deferred_set) in the get_num_write_bios callback.  Instead this patch
      simplifies things; only one bio is submitted by the core, this is
      first written to the origin and then the cache device in series.
      Obviously this will have a latency impact.
      
      deferred_writethrough_bios is introduced to record bios that must be
      later issued to the cache device from the worker thread.  This deferred
      submission, after the origin bio completes, is required given that we're
      in interrupt context (writethrough_endio).
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
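
The serial writethrough scheme described in this commit can be sketched in plain userspace C. This is only an illustration of the idea, not the kernel code: the types and the `_sketch` functions are hypothetical, and there is no locking, which a real implementation would need because the completion handler and the worker run concurrently. The completion handler for the origin write only records the bio on a deferred list; the worker later issues it to the cache device.

```c
#include <stddef.h>

/* Hypothetical stand-in for a write bio in writethrough mode. */
struct bio_sketch {
    int origin_done;            /* set once the origin write has completed */
    int cache_done;             /* set once the cache write has been issued */
    struct bio_sketch *next;
};

/* Hypothetical stand-in for the cache; holds the deferred writethrough list. */
struct cache_sketch {
    struct bio_sketch *head;
    struct bio_sketch **tail;
};

static void cache_init_sketch(struct cache_sketch *c)
{
    c->head = NULL;
    c->tail = &c->head;
}

/*
 * Models the completion of the origin write.  It must not submit I/O
 * itself (interrupt context in the commit's description), so it only
 * appends the bio to the deferred list.
 */
static void writethrough_endio_sketch(struct cache_sketch *c,
                                      struct bio_sketch *bio)
{
    bio->origin_done = 1;
    bio->next = NULL;
    *c->tail = bio;
    c->tail = &bio->next;
}

/*
 * Models the worker thread: detach the whole list, then issue each bio
 * to the cache device.  Returns the number of bios issued.
 */
static int process_deferred_sketch(struct cache_sketch *c)
{
    struct bio_sketch *bio = c->head;
    int issued = 0;

    cache_init_sketch(c);       /* list is now empty again */
    while (bio) {
        struct bio_sketch *next = bio->next;

        bio->next = NULL;
        bio->cache_done = 1;    /* stands in for the cache-device write */
        issued++;
        bio = next;
    }
    return issued;
}
```

The cache write never starts before the origin write has completed, which is where the latency cost mentioned above comes from.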
    • dm cache: metadata clear dirty bits on clean shutdown · 79ed9caf
      Joe Thornber authored
      When writing the dirty bitset to the metadata device on a clean
      shutdown, clear the dirty bits.  Previously they were left indicating
      the cache was dirty. This led to confusion about whether there really
      was dirty data in the cache or not.  (This was a harmless bug.)
      Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm cache: avoid calling policy destructor twice on error · b978440b
      Heinz Mauelshagen authored
      If the cache policy's config values are not able to be set we must
      set the policy to NULL after destroying it in create_cache_policy()
      so we don't attempt to destroy it a second time later.
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
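
The pattern behind this fix, clearing a pointer right after destroying what it points to so that a later cleanup path skips it, can be sketched as follows. All names here are hypothetical userspace stand-ins rather than the dm-cache code, and `destroy_calls_sketch` exists only to make the behaviour observable.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the policy and cache objects. */
struct policy_sketch { int configured; };
struct policy_owner_sketch { struct policy_sketch *policy; };

static int destroy_calls_sketch;    /* counts destructor invocations */

static void policy_destroy_sketch(struct policy_sketch *p)
{
    destroy_calls_sketch++;
    free(p);
}

/* Creates the policy; config_ok == 0 models config values that cannot be set. */
static int create_policy_sketch(struct policy_owner_sketch *cache, int config_ok)
{
    cache->policy = malloc(sizeof(*cache->policy));
    if (!cache->policy)
        return -ENOMEM;

    if (!config_ok) {
        policy_destroy_sketch(cache->policy);
        cache->policy = NULL;   /* so the teardown path does not destroy it again */
        return -EINVAL;
    }
    cache->policy->configured = 1;
    return 0;
}

/* General teardown path, also run after a failed create. */
static void owner_destroy_sketch(struct policy_owner_sketch *cache)
{
    if (cache->policy) {
        policy_destroy_sketch(cache->policy);
        cache->policy = NULL;
    }
}
```

Without the `cache->policy = NULL` assignment in the error branch, the teardown path would free the same object a second time.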
    • dm cache: detect cache_create failure · 617a0b89
      Heinz Mauelshagen authored
      Return error if cache_create() fails.
      
      A missing return check made cache_ctr continue even after an error in
      cache_create() resulting in the cache object being destroyed.  So a
      simple failure like an odd number of cache policy config value arguments
      would result in an oops.
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
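
A minimal sketch of the missing return check, using hypothetical userspace stand-ins (`cache_create_sketch`, `cache_ctr_sketch`) rather than the real constructor. It assumes, as in the example in the commit message, that an odd number of policy config arguments is the failure case.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the cache object. */
struct cache_obj_sketch { int ready; };

static struct cache_obj_sketch the_cache_sketch;

/* Fails when the key/value config arguments do not pair up. */
static int cache_create_sketch(int nr_policy_args,
                               struct cache_obj_sketch **result)
{
    if (nr_policy_args % 2) {
        *result = NULL;         /* nothing usable is handed back on error */
        return -EINVAL;
    }
    the_cache_sketch.ready = 1;
    *result = &the_cache_sketch;
    return 0;
}

/* Constructor: must stop as soon as creation fails. */
static int cache_ctr_sketch(int nr_policy_args)
{
    struct cache_obj_sketch *cache = NULL;
    int r = cache_create_sketch(nr_policy_args, &cache);

    if (r)
        return r;               /* the check that was missing */

    /* Only reached with a valid object; without the check this would
     * dereference a pointer to an object that no longer exists. */
    return cache->ready ? 0 : -EINVAL;
}
```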
    • dm cache: avoid 64 bit division on 32 bit · 414dd67d
      Joe Thornber authored
      Squash various 32bit link errors.
      
        >> on i386:
        >> drivers/built-in.o: In function `is_discarded_oblock':
        >> dm-cache-target.c:(.text+0x1ea28e): undefined reference to `__udivdi3'
        ...
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
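
For background: `__udivdi3` is the libgcc routine that GCC calls for an unsigned 64-bit `/` on 32-bit targets, and the kernel does not link against libgcc, so such divisions have to go through helpers such as `do_div()`. The sketch below is not what this commit does; it is a hypothetical userspace helper showing how a 64-bit by 32-bit division can be written without applying `/` or `%` to a 64-bit operand, using bitwise long division.

```c
#include <stdint.h>

/*
 * Hypothetical helper: divide a 64-bit value by a non-zero 32-bit
 * divisor using only shifts, compares and subtractions.  Returns the
 * quotient and stores the remainder in *rem.  The caller must ensure
 * base != 0.
 */
static uint64_t div64_by_32_sketch(uint64_t n, uint32_t base, uint32_t *rem)
{
    uint64_t q = 0;
    uint64_t r = 0;     /* running remainder, always < base after each step */
    int i;

    for (i = 63; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1);  /* bring down the next bit of n */
        if (r >= base) {
            r -= base;
            q |= (uint64_t)1 << i;
        }
    }
    *rem = (uint32_t)r;
    return q;
}
```

When the divisor is known to be a power of two, a shift is the cheaper alternative; the loop above covers the general case.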
    • dm verity: avoid deadlock · 3b6b7813
      Mikulas Patocka authored
      A deadlock was found in the prefetch code in the dm verity map
      function.  This patch fixes this by transferring the prefetch
      to a worker thread and skipping it completely if kmalloc fails.
      
      If generic_make_request is called recursively, it queues the I/O
      request on the current->bio_list without making the I/O request
      and returns. The routine making the recursive call cannot wait
      for the I/O to complete.
      
      The deadlock occurs when one thread grabs the bufio_client
      mutex and waits for an I/O to complete but the I/O is queued
      on another thread's current->bio_list and is waiting to get
      the mutex held by the first thread.
      
      The fix recognises that prefetching is not essential.  If memory
      can be allocated, it queues the prefetch request to the worker thread,
      but if not, it does nothing.
      Signed-off-by: Paul Taysom <taysom@chromium.org>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Cc: stable@kernel.org
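
The "best effort" shape of the fix, hand the prefetch to a worker if memory is available and otherwise silently drop it, can be sketched in userspace C. Everything here is hypothetical: the allocator is passed in as a function pointer only so that allocation failure can be simulated, and a simple list stands in for the worker's queue.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical work item describing one prefetch request. */
struct prefetch_work_sketch {
    unsigned long block;
    unsigned int n_blocks;
    struct prefetch_work_sketch *next;
};

static struct prefetch_work_sketch *pending_sketch;    /* worker's queue */

/*
 * Called from the map path.  It never waits for I/O: it either queues
 * the request for the worker or, if allocation fails, does nothing.
 */
static bool queue_prefetch_sketch(unsigned long block, unsigned int n_blocks,
                                  void *(*alloc)(size_t))
{
    struct prefetch_work_sketch *w = alloc(sizeof(*w));

    if (!w)
        return false;   /* prefetching is optional, so just skip it */

    w->block = block;
    w->n_blocks = n_blocks;
    w->next = pending_sketch;
    pending_sketch = w;
    return true;
}

/* Allocator that always fails, to model memory pressure. */
static void *failing_alloc_sketch(size_t size)
{
    (void)size;
    return NULL;
}

/* Models the worker thread draining the queue; returns items handled. */
static unsigned int run_prefetch_worker_sketch(void)
{
    unsigned int done = 0;

    while (pending_sketch) {
        struct prefetch_work_sketch *w = pending_sketch;

        pending_sketch = w->next;
        /* the actual read-ahead for w->block would be issued here */
        free(w);
        done++;
    }
    return done;
}
```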
    • dm thin: fix non power of two discard granularity calc · 58051b94
      Joe Thornber authored
      Fix a discard granularity calculation to work for non power of 2 block sizes.
      
      In order for thinp to pass down discard bios to the underlying data
      device, the data device must have a discard granularity that is a
      factor of the thinp block size.  Originally this check was done by
      using bitops since the block_size was known to be a power of two.
      
      This bug was introduced by commit f13945d7
      ("dm thin: support a non power of 2 discard_granularity").
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
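
To illustrate why a bitops-style check stops working once sizes are not powers of two, here is a small sketch that compares a mask-based test with a remainder-based one. Both function names are hypothetical and the mask test is only an example of the general technique, not a quote of the previous kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mask-based test: reports whether the low bits of block_size selected
 * by (n - 1) are all zero.  This equals "n divides block_size" only
 * when n is a power of two.
 */
static bool is_factor_pow2_only_sketch(uint32_t block_size, uint32_t n)
{
    return (block_size & (n - 1)) == 0;
}

/* Remainder-based test: correct for any non-zero n. */
static bool is_factor_sketch(uint32_t block_size, uint32_t n)
{
    return block_size % n == 0;
}
```

With a block size of 384 sectors and a granularity of 192, the remainder test accepts (384 = 2 × 192) while the mask test rejects; with 8 and 3 the mask test accepts even though 3 is not a factor of 8.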
    • dm thin: fix discard corruption · f046f89a
      Joe Thornber authored
      Fix a bug in dm_btree_remove that could leave leaf values with incorrect
      reference counts.  The effect of this was that removal of a shared block
      could result in the space maps thinking the block was no longer used.
      More concretely, if you have a thin device and a snapshot of it, sending
      a discard to a shared region of the thin could corrupt the snapshot.
      
      Thinp uses a 2-level nested btree to store its mappings.  The first
      level is indexed by thin device, and the second level by logical
      block.
      
      Often when we're removing an entry in this mapping tree we need to
      rebalance nodes, which can involve shadowing them, possibly creating a
      copy if the block is shared.  If we do create a copy then children of
      that node need to have their reference counts incremented.  In this
      way reference counts percolate down the tree as shared trees diverge.
      
      The rebalance functions were incrementing the children at the
      appropriate time, but they were always assuming the children were
      internal nodes.  This meant the leaf values (in our case packed
      block/flags entries) were not being incremented.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
  2. 17 Mar, 2013 4 commits
    • Linux 3.9-rc3 · a937536b
      Linus Torvalds authored
    • perf,x86: fix link failure for non-Intel configs · 6c4d3bc9
      David Rientjes authored
      Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
      suspend/resume") introduces a link failure since
      perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:
      
      	arch/x86/power/built-in.o: In function `restore_processor_state':
      	(.text+0x45c): undefined reference to `perf_restore_debug_store'
      
      Fix it by defining the dummy function appropriately.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
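
The usual way to avoid this kind of link failure is to provide an empty inline stub when the feature is configured out, so callers compile and link either way. The sketch below shows the pattern with hypothetical names (`SKETCH_CPU_SUP_INTEL` and the `_sketch` functions); it is not a copy of the kernel header.

```c
/*
 * When the feature is configured in, the real function is defined in
 * another object file.  When it is not, an empty inline stub keeps
 * every caller linkable without #ifdefs at the call site.
 */
#ifdef SKETCH_CPU_SUP_INTEL
void perf_restore_debug_store_sketch(void);
#else
static inline void perf_restore_debug_store_sketch(void)
{
    /* nothing to restore on this configuration */
}
#endif

/* Hypothetical caller, built the same way on every configuration. */
static int restore_processor_state_sketch(void)
{
    perf_restore_debug_store_sketch();
    return 0;
}
```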
    • perf,x86: fix wrmsr_on_cpu() warning on suspend/resume · 2a6e06b2
      Linus Torvalds authored
      Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
      suspend/resume") fixed a crash when doing PEBS performance profiling
      after resuming, but in using init_debug_store_on_cpu() to restore the
      DS_AREA MSR it also resulted in a new WARN_ON() triggering.
      
      init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
      cross-calls to do the MSR update.  Which is not really valid at the
      early resume stage, and the warning is quite reasonable.  Now, it all
      happens to _work_, for the simple reason that smp_call_function_single()
      ends up just doing the call directly on the CPU when the CPU number
      matches, but we really should just do the wrmsr() directly instead.
      
      This duplicates the wrmsr() logic, but hopefully we can just remove the
      wrmsr_on_cpu() version eventually.
      Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 08637024
      Linus Torvalds authored
      Pull btrfs fixes from Chris Mason:
       "Eric's rcu barrier patch fixes a long standing problem with our
        unmount code hanging on to devices in workqueue helpers.  Liu Bo
        nailed down a difficult assertion for in-memory extent mappings."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        Btrfs: fix warning of free_extent_map
        Btrfs: fix warning when creating snapshots
        Btrfs: return as soon as possible when edquot happens
        Btrfs: return EIO if we have extent tree corruption
        btrfs: use rcu_barrier() to wait for bdev puts at unmount
        Btrfs: remove btrfs_try_spin_lock
        Btrfs: get better concurrency for snapshot-aware defrag work
  3. 16 Mar, 2013 8 commits
    • Btrfs: fix warning of free_extent_map · 3b277594
      Liu Bo authored
      Users report that an extent map's list is still linked when it's actually
      going to be freed from cache.
      
      The story is that
      
      a) when we're going to drop an extent map, we may split this large one into
      smaller ems.  If the large one is flagged as EXTENT_FLAG_LOGGING, which means
      that it's on the list to be logged, then the smaller ems split from it will also
      be flagged as EXTENT_FLAG_LOGGING, and this is _not_ expected.

      b) ems flagged with EXTENT_FLAG_LOGGING are not unlinked from the list or
      freed, because the log code holds one reference.
      
      The end result is the warning, but the truth is that we set the flag
      EXTENT_FLAG_LOGGING only during fsync.
      
      So clear flag EXTENT_FLAG_LOGGING for extent maps split from a large one.
      Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
      Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
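
The idea of the fix, copying a parent's flags to a split-off piece but masking out the one flag that must not be inherited, can be sketched as below. The struct, the flag values and the function are hypothetical; in particular this sketch uses plain bit masks and is not the btrfs extent map code.

```c
#include <stdint.h>

/* Hypothetical flag masks; the real flags are defined differently. */
#define EM_FLAG_PINNED_SKETCH   (1UL << 0)
#define EM_FLAG_LOGGING_SKETCH  (1UL << 1)

/* Hypothetical, simplified extent map. */
struct extent_map_sketch {
    uint64_t start;
    uint64_t len;
    unsigned long flags;
};

/*
 * Builds the front piece [em->start, split_at) of a larger extent map.
 * The piece inherits the parent's flags except the logging flag, since
 * only the parent is on the list of maps to be logged.
 */
static struct extent_map_sketch
split_front_sketch(const struct extent_map_sketch *em, uint64_t split_at)
{
    struct extent_map_sketch front;

    front.start = em->start;
    front.len = split_at - em->start;
    front.flags = em->flags & ~EM_FLAG_LOGGING_SKETCH;
    return front;
}
```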
    • Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · e2043785
      Linus Torvalds authored
      Pull kbuild fix from Michal Marek:
       "One fix for make headers_install/headers_check to not require make
        3.81.  The requirement has been accidentally introduced in 3.7."
      
      * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        kbuild: fix make headers_check with make 3.80
    • Merge tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux · 23659587
      Linus Torvalds authored
      Pull OpenRISC bug fixes from Jonas Bonn:
      
       - The GPIO descriptor work has exposed how broken the non-GPIOLIB bits
         for OpenRISC were.  We now require GPIOLIB as this is the preferred
         way forward.
      
       - The system.h split introduced a bug in llist.h for arches using
         asm-generic/cmpxchg.h directly, which is currently only OpenRISC.
         The patch here moves two defines from asm-generic/atomic.h to
         asm-generic/cmpxchg.h to make things work as they should.
      
       - The VIRT_TO_BUS selector was added for OpenRISC, but OpenRISC does
         not have the virt_to_bus methods, so there's a patch to remove it
         again.
      
      * tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux:
        openrisc: remove HAVE_VIRT_TO_BUS
        asm-generic: move cmpxchg*_local defs to cmpxchg.h
        openrisc: require gpiolib
    • Merge tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 9e1a0aab
      Linus Torvalds authored
      Pull char/misc fixes from Greg Kroah-Hartman:
       "Here are some tiny fixes for the w1 drivers and the final removal
        patch for getting rid of CONFIG_EXPERIMENTAL (all users of it are now
        gone from your tree, this just drops the Kconfig item itself.)
      
        All have been in the linux-next tree for a while"
      
      * tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        final removal of CONFIG_EXPERIMENTAL
        w1: fix oops when w1_search is called from netlink connector
        w1-gpio: fix unused variable warning
        w1-gpio: remove erroneous __exit and __exit_p()
        ARM: w1-gpio: fix erroneous gpio requests
    • Merge tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 5cd8846c
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small fixes, as expected for the middle rc:
         - A couple of fixes for potential NULL dereferences and out-of-range
           array accesses revealed by static code parsers
         - A fix for the wrong error handling detected by trinity
         - A regression fix for missing audio on some MacBooks
         - CA0132 DSP loader fixes
         - Fix for EAPD control of IDT codecs on machines w/o speaker
         - Fix a regression in the HD-audio widget list parser code
         - Workaround for the NuForce UDH-100 USB audio"
      
      * tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs
        sound: sequencer: cap array index in seq_chn_common_event()
        ALSA: hda/ca0132 - Remove extra setting of dsp_state.
        ALSA: hda/ca0132 - Check download state of DSP.
        ALSA: hda/ca0132 - Check if dspload_image succeeded.
        ALSA: hda - Disable IDT eapd_switch if there are no internal speakers
        ALSA: hda - Fix snd_hda_get_num_raw_conns() to return a correct value
        ALSA: usb-audio: add a workaround for the NuForce UDH-100
        ALSA: asihpi - fix potential NULL pointer dereference
        ALSA: seq: Fix missing error handling in snd_seq_timer_open()
    • Merge branch 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping · c7f17deb
      Linus Torvalds authored
      Pull DMA-mapping fix from Marek Szyprowski:
       "An important fix for all ARM architectures which use ZONE_DMA.
        Without it dma_alloc_* calls with GFP_ATOMIC flag might have allocated
        buffers outside the DMA zone."
      
      * 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
        ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation
    • Merge tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes · de1893f6
      Linus Torvalds authored
      Pull MFD fixes from Samuel Ortiz:
       "This is the first batch of MFD fixes for 3.9.
      
        With this one we have:
      
         - An ab8500 build failure fix.
         - An ab8500 device tree parsing fix.
         - A fix for twl4030_madc remove routine to work properly (when
           built-in).
         - A fix for properly registering palmas interrupt handler.
         - A fix for omap-usb init routine to actually write into the
           hostconfig register.
         - A couple of warning fixes for ab8500-gpadc and tps65912"
      
      * tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
        mfd: twl4030-madc: Remove __exit_p annotation
        mfd: ab8500: Kill "reg" property from binding
        mfd: ab8500-gpadc: Complain if we fail to enable vtvout LDO
        mfd: wm831x: Don't forward declare enum wm831x_auxadc
        mfd: twl4030-audio: Fix argument type for twl4030_audio_disable_resource()
        mfd: tps65912: Declare and use tps65912_irq_exit()
        mfd: palmas: Provide irq flags through DT/platform data
        mfd: Make AB8500_CORE select POWER_SUPPLY to fix build error
        mfd: omap-usb-host: Actually update hostconfig
    • Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 92fbb1c9
      Linus Torvalds authored
      Pull hwmon fixes from Guenter Roeck:
       "Bug fixes for pmbus, ltc2978, and lineage-pem drivers
      
        Added specific maintainer for some hwmon drivers"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (pmbus/ltc2978) Fix temperature reporting
        hwmon: (pmbus) Fix krealloc() misuse in pmbus_add_attribute()
        hwmon: (lineage-pem) Add missing terminating entry for pem_[input|fan]_attributes
        MAINTAINERS: Add maintainer for MAX6697, INA209, and INA2XX drivers
  4. 15 Mar, 2013 8 commits
  5. 14 Mar, 2013 12 commits