1. 30 May, 2014 7 commits
    • Merge tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 24e19d27
      Linus Torvalds authored
      Pull device-mapper fixes from Mike Snitzer:
       "A dm-cache stable fix to split discards on cache block boundaries
        because dm-cache cannot yet handle discards that span cache blocks.
      
        Really fix a dm-mpath LOCKDEP warning that was introduced in -rc1.
      
        Add a 'no_space_timeout' control to dm-thinp to restore the ability to
        queue IO indefinitely when no data space is available.  This fixes a
        change in behavior that was introduced in -rc6 where the timeout
        couldn't be disabled"
      
      * tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm mpath: really fix lockdep warning
        dm cache: always split discards on cache block boundaries
        dm thin: add 'no_space_timeout' dm-thin-pool module param
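
The dm-cache fix above splits discards on cache block boundaries. As a rough illustration only (not the dm-cache code), splitting a range so that no piece spans two blocks can be sketched as follows. The function name, the sector units and the block size of 128 are assumptions made for this example.

```python
def split_on_block_boundaries(start, length, block_size):
    """Split [start, start + length) so that no piece crosses a multiple of block_size.

    Hypothetical helper for illustration; units are arbitrary (think sectors).
    """
    pieces = []
    end = start + length
    while start < end:
        # First block boundary strictly after the current start.
        next_boundary = (start // block_size + 1) * block_size
        piece_end = min(end, next_boundary)
        pieces.append((start, piece_end - start))
        start = piece_end
    return pieces


if __name__ == "__main__":
    # A request covering 100..399 with 128-unit blocks is cut at 128, 256 and 384.
    for piece_start, piece_len in split_on_block_boundaries(100, 300, 128):
        print(piece_start, piece_len)
```

Each resulting piece lies inside a single block, which is the property the fix relies on.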
    • x86_64: expand kernel stack to 16K · 6538b8ea
      Minchan Kim authored
      While I was testing in-house patches under heavy memory pressure on
      qemu-kvm, the 3.14 kernel crashed randomly. The cause was a kernel
      stack overflow.
      
      When I investigated the problem, the call stack was a little deeper
      because reclaim functions were involved, though not the direct reclaim path.
      
      I tried to trim the stack usage of some functions related to alloc/reclaim
      and saved about a hundred bytes, but the overflow didn't disappear: I then
      hit an overflow through another, deeper call stack on the reclaim/allocator
      path.
      
      Of course, we could sweep every site we have found to reduce
      stack usage, but I'm not sure how long that would save the world (surely,
      lots of developers will add nice features that use the stack
      again), and if we consider yet more complex features in the I/O layer
      and/or reclaim path, it might be better to increase the stack size.
      (Meanwhile, stack usage on 64-bit machines has doubled compared to 32-bit
      while the stack has stuck at 8K. Hmm, that doesn't seem fair to me, and
      arm64 has already expanded to 16K.)
      
      So, my stupid idea is simply to expand the stack size and keep an eye
      on the stack consumption of each kernel function via ftrace's stack tracer.
      For example, we can set a bar such that each function shouldn't exceed 200K
      and emit a warning when some function consumes more at runtime.
      Of course, it could produce false positives, but at least it would give
      us a chance to think it over.
      
      I guess this topic has been discussed several times, so there might be a
      strong reason not to increase the kernel stack size on x86_64 that I don't
      know of, so I'm Ccing the x86_64 maintainers, other MM guys and virtio
      maintainers.
      
      Here's an example call trace using up the kernel stack:
      
               Depth    Size   Location    (51 entries)
               -----    ----   --------
         0)     7696      16   lookup_address
         1)     7680      16   _lookup_address_cpa.isra.3
         2)     7664      24   __change_page_attr_set_clr
         3)     7640     392   kernel_map_pages
         4)     7248     256   get_page_from_freelist
         5)     6992     352   __alloc_pages_nodemask
         6)     6640       8   alloc_pages_current
         7)     6632     168   new_slab
         8)     6464       8   __slab_alloc
         9)     6456      80   __kmalloc
        10)     6376     376   vring_add_indirect
        11)     6000     144   virtqueue_add_sgs
        12)     5856     288   __virtblk_add_req
        13)     5568      96   virtio_queue_rq
        14)     5472     128   __blk_mq_run_hw_queue
        15)     5344      16   blk_mq_run_hw_queue
        16)     5328      96   blk_mq_insert_requests
        17)     5232     112   blk_mq_flush_plug_list
        18)     5120     112   blk_flush_plug_list
        19)     5008      64   io_schedule_timeout
        20)     4944     128   mempool_alloc
        21)     4816      96   bio_alloc_bioset
        22)     4720      48   get_swap_bio
        23)     4672     160   __swap_writepage
        24)     4512      32   swap_writepage
        25)     4480     320   shrink_page_list
        26)     4160     208   shrink_inactive_list
        27)     3952     304   shrink_lruvec
        28)     3648      80   shrink_zone
        29)     3568     128   do_try_to_free_pages
        30)     3440     208   try_to_free_pages
        31)     3232     352   __alloc_pages_nodemask
        32)     2880       8   alloc_pages_current
        33)     2872     200   __page_cache_alloc
        34)     2672      80   find_or_create_page
        35)     2592      80   ext4_mb_load_buddy
        36)     2512     176   ext4_mb_regular_allocator
        37)     2336     128   ext4_mb_new_blocks
        38)     2208     256   ext4_ext_map_blocks
        39)     1952     160   ext4_map_blocks
        40)     1792     384   ext4_writepages
        41)     1408      16   do_writepages
        42)     1392      96   __writeback_single_inode
        43)     1296     176   writeback_sb_inodes
        44)     1120      80   __writeback_inodes_wb
        45)     1040     160   wb_writeback
        46)      880     208   bdi_writeback_workfn
        47)      672     144   process_one_work
        48)      528     112   worker_thread
        49)      416     240   kthread
        50)      176     176   ret_from_fork
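
The idea of setting a per-function bar can be tried offline against a trace like the one above. This is a minimal sketch, not an ftrace feature: it parses a handful of rows copied from the trace and lists the frames larger than a limit. The 350-byte limit and the helper names are hypothetical and chosen only for the example.

```python
# A few rows copied from the stack trace above: index, depth, frame size, function.
TRACE_ROWS = """\
  3)     7640     392   kernel_map_pages
  5)     6992     352   __alloc_pages_nodemask
 10)     6376     376   vring_add_indirect
 25)     4480     320   shrink_page_list
 40)     1792     384   ext4_writepages
 41)     1408      16   do_writepages
"""

FRAME_LIMIT = 350  # hypothetical per-function bar, in bytes


def parse_rows(text):
    """Turn 'N)  depth  size  location' lines into (index, depth, size, location) tuples."""
    rows = []
    for line in text.splitlines():
        index, depth, size, location = line.split()
        rows.append((int(index.rstrip(")")), int(depth), int(size), location))
    return rows


def frames_over(rows, limit_bytes):
    """Return (location, size) for every frame larger than limit_bytes."""
    return [(location, size) for _, _, size, location in rows if size > limit_bytes]


if __name__ == "__main__":
    for location, size in frames_over(parse_rows(TRACE_ROWS), FRAME_LIMIT):
        print(f"{location}: {size} bytes exceeds {FRAME_LIMIT}")
```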
      
      [ Note: the problem is exacerbated by certain gcc versions that seem to
        generate much bigger stack frames due to apparently bad coalescing of
        temporaries and generating too many spills.  Rusty saw gcc-4.6.4 using
        35% more stack on the virtio path than 4.8.2 does, for example.
      
        Minchan not only uses such a bad gcc version (4.6.3 in his case), but
        some of the stack use is due to debugging (CONFIG_DEBUG_PAGEALLOC is
        what causes that kernel_map_pages() frame, for example). But we're
        clearly getting too close.
      
        The VM code also seems to have excessive stack frames partly for the
        same compiler reason, triggered by excessive inlining and lots of
        function arguments.
      
        We need to improve on our stack use, but in the meantime let's do this
        simple stack increase too.  Unlike most earlier reports, there is
        nothing simple that stands out as being really horribly wrong here,
        apart from the fact that the stack frames are just bigger than they
        should need to be.        - Linus ]
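
To put the numbers in perspective, here is a small arithmetic sketch. It assumes 4 KiB pages and that the stack size is the page size shifted by an order value, as with THREAD_SIZE_ORDER in the x86 headers. The order values 1 and 2 are my reading of the 8K to 16K change, not something quoted from the patch.

```python
PAGE_SIZE = 4096  # assumed x86_64 base page size, in bytes


def thread_size(order):
    """Stack size in bytes for a given order (PAGE_SIZE << order)."""
    return PAGE_SIZE << order


OLD_STACK = thread_size(1)  # 8 KiB
NEW_STACK = thread_size(2)  # 16 KiB
DEEPEST = 7696              # depth of entry 0 in the trace above

if __name__ == "__main__":
    print("headroom with 8K stack: ", OLD_STACK - DEEPEST)
    print("headroom with 16K stack:", NEW_STACK - DEEPEST)
```

Under these assumptions the traced call chain leaves under 500 bytes of headroom on an 8K stack, which is consistent with the remark that "we're clearly getting too close".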
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S Tsirkin <mst@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: PJ Waskiewicz <pjwaskiewicz@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 6f6111e4
      Linus Torvalds authored
      Pull vfs dcache livelock fix from Al Viro:
       "Fixes for livelocks in shrink_dentry_list() introduced by fixes to
        shrink list corruption; the root cause was that trylock of parent's
        ->d_lock could be disrupted by d_walk() happening on other CPUs,
        resulting in shrink_dentry_list() making no progress *and* the same
        d_walk() being called again and again for as long as
        shrink_dentry_list() doesn't get past that mess.
      
        The solution is to have shrink_dentry_list() treat that trylock
        failure not as 'try to do the same thing again', but 'lock them in the
        right order'"
      
      * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        dentry_kill() doesn't need the second argument now
        dealing with the rest of shrink_dentry_list() livelock
        shrink_dentry_list(): take parent's ->d_lock earlier
        expand dentry_kill(dentry, 0) in shrink_dentry_list()
        split dentry_kill()
        lift the "already marked killed" case into shrink_dentry_list()
    • dentry_kill() doesn't need the second argument now · 8cbf74da
      Al Viro authored
      it's 1 in the only remaining caller.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • dealing with the rest of shrink_dentry_list() livelock · b2b80195
      Al Viro authored
      We have the same problem with ->d_lock order in the inner loop, where
      we are dropping references to ancestors.  Same solution, basically -
      instead of using dentry_kill() we use lock_parent() (introduced in the
      previous commit) to get that lock in a safe way, recheck ->d_count
      (in case lock_parent() has ended up dropping and retaking ->d_lock
      and somebody managed to grab a reference during that window), trylock
      the inode->i_lock and use __dentry_kill() to do the rest.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • shrink_dentry_list(): take parent's ->d_lock earlier · 046b961b
      Al Viro authored
      The cause of livelocks there is that we are taking ->d_lock on
      dentry and its parent in the wrong order, forcing us to use
      trylock on the parent's one.  d_walk() takes them in the right
      order, and unfortunately it's not hard to create a situation
      when shrink_dentry_list() can't make progress since trylock
      keeps failing, and shrink_dcache_parent() or check_submounts_and_drop()
      keeps calling d_walk() disrupting the very shrink_dentry_list() it's
      waiting for.
      
      Solution is straightforward - if that trylock fails, let's unlock
      the dentry itself and take locks in the right order.  We need to
      stabilize ->d_parent without holding ->d_lock, but that's doable
      using RCU.  And we'd better do that in the very beginning of the
      loop in shrink_dentry_list(), since the checks on refcount, etc.
      would need to be redone anyway.
      
      That deals with one half of the problem - killing dentries on the
      shrink list itself.  The other half (dropping their parents) is
      in the next commit.
      
      Locking the parent is interesting - it would be easy to do rcu_read_lock(),
      lock whatever we think is a parent, lock dentry itself and check
      if the parent is still the right one.  Except that we need to check
      that *before* locking the dentry, or we are risking taking ->d_lock
      out of order.  Fortunately, once D1 is locked, we can check if
      D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
      can start or stop pointing to D1 only under D1->d_lock, so taking
      D1->d_lock is enough.  In other words, the right solution is
      rcu_read_lock/lock what looks like parent right now/check if it's
      still our parent/rcu_read_unlock/lock the child.
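
The locking sequence described above can be modelled with ordinary mutexes. This is a toy sketch under stated assumptions, not the kernel's lock_parent(): the Dentry class is hypothetical, threading.Lock stands in for ->d_lock, and RCU is not modelled at all (Python's garbage collector keeps the candidate parent alive instead).

```python
import threading


class Dentry:
    """Hypothetical stand-in for a dentry: a parent pointer plus a lock."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent if parent is not None else self  # a root is its own parent
        self.lock = threading.Lock()


def lock_parent(dentry):
    """Call with dentry.lock held. Returns the locked parent, or None for a root.

    On return dentry.lock is held again, but it may have been dropped and
    retaken in between, so the caller has to recheck its earlier conclusions.
    """
    parent = dentry.parent
    if parent is dentry:
        return None
    # Fast path: the wrong-order trylock happens to succeed.
    if parent.lock.acquire(blocking=False):
        return parent
    # Slow path: drop the child and take the locks parent first, then child.
    dentry.lock.release()
    while True:
        parent = dentry.parent       # the kernel reads this under rcu_read_lock()
        parent.lock.acquire()
        if parent is dentry.parent:  # still our parent while its lock is held?
            break
        parent.lock.release()        # moved in the meantime, try again
    if parent is dentry:
        return None                  # became a root; its own lock is the one we hold
    dentry.lock.acquire()
    return parent


if __name__ == "__main__":
    root = Dentry("/")
    child = Dentry("etc", parent=root)
    child.lock.acquire()
    locked = lock_parent(child)
    print("locked parent:", locked.name)
    locked.lock.release()
    child.lock.release()
```

The point of the slow path is that a failed trylock no longer means "retry the same thing": the child lock is released so both locks can be taken in the order d_walk() uses.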
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · fe45736f
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "The usual random collection of relatively small ARM fixes"
      
      * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs
        ARM: 8064/1: fix v7-M signal return
        ARM: 8057/1: amba: Add Qualcomm vendor ID.
        ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode
        ARM: 8051/1: put_user: fix possible data corruption in put_user
        ARM: 8048/1: fix v7-M setup stack location
  2. 29 May, 2014 6 commits
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · a991639c
      Linus Torvalds authored
      Pull arm64 fix from Will Deacon:
       "Fix CoW regression for transparent hugepages by routing set_pmd_at to
        set_pte_at, which correctly handles PTE_WRITE and will mark the
        resulting table entry as read-only where appropriate"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: mm: fix pmd_write CoW brokenness
    • Merge tag 'pm+acpi-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · f035b3d3
      Linus Torvalds authored
      Pull ACPI and power management fixes from Rafael Wysocki:
       "These are three stable-candidate fixes, one for the ACPI thermal
        driver and two for cpufreq drivers.
      
        Specifics:
      
         - A workqueue is destroyed too early during the ACPI thermal driver
           module unload which leads to a NULL pointer dereference in the
           driver's remove callback.  Fix from Aaron Lu.
      
         - A wrong argument is passed to devm_regulator_get_optional() in the
           probe routine of the cpu0 cpufreq driver which leads to resource
           leaks if the driver is unbound from the cpufreq platform device.
           Fix from Lucas Stach.
      
         - A lock is missing in cpufreq_governor_dbs() which leads to memory
           corruption and NULL pointer dereferences during system
           suspend/resume, for example.  Fix from Bibek Basu"
      
      * tag 'pm+acpi-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / thermal: fix workqueue destroy order
        cpufreq: cpu0: drop wrong devm usage
        cpufreq: remove race while accessing cur_policy
    • Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux · 15a7b60e
      Linus Torvalds authored
      Pull clock fixes from Mike Turquette:
       "Small number of user-visible regression fixes for clock drivers.
      
        There is a memory leak fix for an ST platform, an infinite Loop Of
        Doom fix for the recent changes to the basic clock divider (hopefully
        the last fix for those recent changes) and some Tegra PLL changes
        which keep PCI from being hosed on that platform"
      
      * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
        clk: st: Fix memory leak
        clk: divider: Fix table round up function
        clk: tegra: Fix enabling of PLLE
        clk: tegra: Introduce divider mask and shift helpers
        clk: tegra: Fix PLLE programming
    • expand dentry_kill(dentry, 0) in shrink_dentry_list() · ff2fde99
      Al Viro authored
      Result will be massaged to saner shape in the next commits.  It is
      ugly, no questions - the point of that one is to be a provably
      equivalent transformation (and it might be worth splitting a bit
      more).
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • split dentry_kill() · e55fd011
      Al Viro authored
      ... into trylocks and everything else.  The latter (actual killing)
      is __dentry_kill().
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • arm64: mm: fix pmd_write CoW brokenness · ceb21835
      Will Deacon authored
      Commit 9c7e535f ("arm64: mm: Route pmd thp functions through pte
      equivalents") changed the pmd manipulator and accessor functions to
      convert the target pmd to a pte, process it with the pte functions, then
      convert it back. Along the way, we gained support for PTE_WRITE, however
      this is completely ignored by set_pmd_at, and so we fail to set the
      PMD_SECT_RDONLY for PMDs, resulting in all sorts of lovely failures (like
      CoW not working).
      
      Partially reverting the offending commit (by making use of
      PMD_SECT_RDONLY explicitly for pmd_{write,wrprotect,mkwrite} functions)
      leads to further issues because pmd_write can then return potentially
      incorrect values for page table entries marked as RDONLY, leading to
      BUG_ON(pmd_write(entry)) tripping under some THP workloads.
      
      This patch fixes the issue by routing set_pmd_at through set_pte_at,
      which correctly takes the PTE_WRITE flag into account. Given that
      THP mappings are always anonymous, the additional cache-flushing code
      in __sync_icache_dcache won't impose any significant overhead as the
      flush will be skipped.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Steve Capper <steve.capper@arm.com>
      Tested-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  3. 28 May, 2014 10 commits
    • Merge tag 'sound-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · f2159d1e
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Just two small stable fixes: an HD-audio fix for the new Intel
        chipsets and a PM handling fix in PCM dmaengine core"
      
      * tag 'sound-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Fix onboard audio on Intel H97/Z97 chipsets
        ALSA: pcm_dmaengine: Add check during device suspend
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 28269580
      Linus Torvalds authored
      Pull vfs fix from Al Viro:
       "Oh, well...  Still nothing useful on that livelock (I had something
        that looked kinda-sorta like a non-invasive solution, but it
        deadlocks), so it's just Miklos' vmsplice fix for now"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        vfs: fix vmplice_to_user()
    • ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs · 3f8517e7
      Nicolas Pitre authored
      The content of /sys/devices/system/cpu/cpu*/online  is still 1 for those
      CPUs that the switcher has removed even though the global state in
      /sys/devices/system/cpu/online is updated correctly.
      
      It turns out that commit 0902a904 ("Driver core: Use generic
      offline/online for CPU offline/online") has changed the way those files
      retrieve their content by relying on the generic attribute handling
      code.  The switcher, by calling cpu_down() directly, bypasses this
      handling and the attribute value doesn't get updated.
      
      Fix this by calling device_offline()/device_online() instead.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 4efdedca
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Small fixes for x86, slightly larger fixes for PPC, and a forgotten
        s390 patch.  The PPC fixes are important because they fix breakage
        that is new in 3.15"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: announce irqfd capability
        KVM: x86: disable master clock if TSC is reset during suspend
        KVM: vmx: disable APIC virtualization in nested guests
        KVM guest: Make pv trampoline code executable
        KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit
        KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit
        KVM: PPC: Book3S: HV: make _PAGE_NUMA take effect
    • Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 9e3d6331
      Linus Torvalds authored
      Pull two powerpc fixes from Ben Herrenschmidt:
       "Here's a pair of powerpc fixes for 3.15 which are also going to
        stable.
      
        One's a fix for building with newer binutils (the problem currently
        only affects the BookE kernels but the affected macro might come back
        into use on BookS platforms at any time).  Unfortunately, the binutils
        maintainer did a backward incompatible change to a construct that we
        use so we have to add a Makefile check.
      
        The other one is a fix for CPUs getting stuck in kexec when running
        single threaded.  Since we routinely use kexec on power (including in
        our newer bootloaders), I deemed that important enough"
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
        powerpc: Fix 64 bit builds with binutils 2.24
    • lift the "already marked killed" case into shrink_dentry_list() · 64fd72e0
      Al Viro authored
      It can happen only when dentry_kill() is called with unlock_on_failure
      equal to 0 - other callers had dentry pinned until the moment they've
      got ->d_lock and DCACHE_DENTRY_KILLED is set only after lockref_mark_dead().
      
      IOW, only one of three call sites of dentry_kill() might end up reaching
      that code.  Just move it there.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: fix vmplice_to_user() · b6dd6f47
      Miklos Szeredi authored
      Commit 6130f531 "switch vmsplice_to_user() to copy_page_to_iter()" in
      v3.15-rc1 broke vmsplice(2).
      
      This patch fixes two bugs:
      
       - count is not initialized to a proper value, which resulted in no data
         being copied
      
       - if rw_copy_check_uvector() returns negative then the iov might be leaked.
      
      Tested OK.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge tag 'clk-tegra-fixes-3.15' of... · 51784380
      Mike Turquette authored
      Merge tag 'clk-tegra-fixes-3.15' of git://nv-tegra.nvidia.com/user/pdeschrijver/linux into clk-fixes
      
      PLLE fixes for 3.15
    • powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode · 011e4b02
      Srivatsa S. Bhat authored
      If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
      (ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
      get the following messages during boot:
      
      [    0.089866] POWER8 performance monitor hardware support registered
      [    0.089985] power8-pmu: PMAO restore workaround active.
      [    5.095419] Processor 1 is stuck.
      [   10.097933] Processor 2 is stuck.
      [   15.100480] Processor 3 is stuck.
      [   20.102982] Processor 4 is stuck.
      [   25.105489] Processor 5 is stuck.
      [   30.108005] Processor 6 is stuck.
      [   35.110518] Processor 7 is stuck.
      [   40.113369] Processor 9 is stuck.
      [   45.115879] Processor 10 is stuck.
      [   50.118389] Processor 11 is stuck.
      [   55.120904] Processor 12 is stuck.
      [   60.123425] Processor 13 is stuck.
      [   65.125970] Processor 14 is stuck.
      [   70.128495] Processor 15 is stuck.
      [   75.131316] Processor 17 is stuck.
      
      Note that only the sibling threads are stuck, while the primary threads (0, 8,
      16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
      that kexec tries to wakeup (bring online) the sibling threads of all the cores,
      before performing kexec:
      
      [ 9464.131231] Starting new kernel
      [ 9464.148507] kexec: Waking offline cpu 1.
      [ 9464.148552] kexec: Waking offline cpu 2.
      [ 9464.148600] kexec: Waking offline cpu 3.
      [ 9464.148636] kexec: Waking offline cpu 4.
      [ 9464.148671] kexec: Waking offline cpu 5.
      [ 9464.148708] kexec: Waking offline cpu 6.
      [ 9464.148743] kexec: Waking offline cpu 7.
      [ 9464.148779] kexec: Waking offline cpu 9.
      [ 9464.148815] kexec: Waking offline cpu 10.
      [ 9464.148851] kexec: Waking offline cpu 11.
      [ 9464.148887] kexec: Waking offline cpu 12.
      [ 9464.148922] kexec: Waking offline cpu 13.
      [ 9464.148958] kexec: Waking offline cpu 14.
      [ 9464.148994] kexec: Waking offline cpu 15.
      [ 9464.149030] kexec: Waking offline cpu 17.
      
      Instrumenting this piece of code revealed that the cpu_up() operation actually
      fails with -EBUSY. Thus, only the primary threads of all the cores are online
      during kexec, and hence this is a sure-shot recipe for disaster, as explained
      in commit e8e5c215 (powerpc/kexec: Fix orphaned offline CPUs across kexec),
      as well as in the comment above wake_offline_cpus().
      
      It turns out that cpu_up() was returning -EBUSY because the variable
      'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
      by migrate_to_reboot_cpu() inside kernel_kexec().
      
      Now, migrate_to_reboot_cpu() was originally written with the assumption that
      any further code will not need to perform CPU hotplug, since we are anyway in
      the reboot path. However, kexec is clearly not such a case, since we depend on
      onlining CPUs, at least on powerpc.
      
      So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
      kexec path, to fix this regression in kexec on powerpc.
      
      Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
      can catch such issues more easily in the future.
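
The failure mode can be reproduced in a toy model. The class below is a hypothetical simplification, not kernel code: cpu_up() refuses with -EBUSY while hotplug is disabled, migrate_to_reboot_cpu() disables hotplug, and re-enabling it afterwards lets the sibling threads come online. The method name cpu_hotplug_enable() is an assumption made for the example.

```python
import errno


class HotplugState:
    """Hypothetical model of the CPU hotplug state relevant to kexec."""

    def __init__(self, online, present):
        self.online = set(online)
        self.present = set(present)
        self.hotplug_disabled = False

    def cpu_up(self, cpu):
        # Mirrors the observed behaviour: -EBUSY while hotplug is disabled.
        if self.hotplug_disabled:
            return -errno.EBUSY
        self.online.add(cpu)
        return 0

    def migrate_to_reboot_cpu(self):
        # Written for the reboot path, where no further hotplug was expected.
        self.hotplug_disabled = True

    def cpu_hotplug_enable(self):
        self.hotplug_disabled = False


def kexec_wake_offline_cpus(state, reenable_hotplug):
    """Return the list of CPUs that could not be brought online before kexec."""
    state.migrate_to_reboot_cpu()
    if reenable_hotplug:
        state.cpu_hotplug_enable()
    failures = []
    for cpu in sorted(state.present - state.online):
        if state.cpu_up(cpu) != 0:
            failures.append(cpu)
    return failures


if __name__ == "__main__":
    # ST mode: only the primary threads (0 and 8) are online.
    broken = HotplugState(online=[0, 8], present=range(16))
    fixed = HotplugState(online=[0, 8], present=range(16))
    print("stuck without re-enable:", kexec_wake_offline_cpus(broken, False))
    print("stuck with re-enable:   ", kexec_wake_offline_cpus(fixed, True))
```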
      
      Fixes: c97102ba (kexec: migrate to reboot cpu)
      Cc: stable@vger.kernel.org
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Fix 64 bit builds with binutils 2.24 · 7998eb3d
      Guenter Roeck authored
      With binutils 2.24, various 64 bit builds fail with relocation errors
      such as
      
      arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
      	(.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI
      	against symbol `interrupt_base_book3e' defined in .text section
      	in arch/powerpc/kernel/built-in.o
      arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
      	(.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI
      	against symbol `interrupt_end_book3e' defined in .text section
      	in arch/powerpc/kernel/built-in.o
      
      The assembler maintainer says:
      
       I changed the ABI, something that had to be done but unfortunately
       happens to break the booke kernel code.  When building up a 64-bit
       value with lis, ori, shl, oris, ori or similar sequences, you now
       should use @high and @higha in place of @h and @ha.  @h and @ha
       (and their associated relocs R_PPC64_ADDR16_HI and R_PPC64_ADDR16_HA)
       now report overflow if the value is out of 32-bit signed range.
       ie. @h and @ha assume you're building a 32-bit value. This is needed
       to report out-of-range -mcmodel=medium toc pointer offsets in @toc@h
       and @toc@ha expressions, and for consistency I did the same for all
       other @h and @ha relocs.
      
      Replacing @h with @high in one strategic location fixes the relocation
      errors. This has to be done conditionally since the assembler either
      supports @h or @high but not both.
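
The difference described by the assembler maintainer can be shown with a simplified model. It assumes only what the quote states: both operators select bits 16..31 of the value, and the new @h additionally reports overflow when the value is outside the 32-bit signed range. The function names and the sample address are hypothetical, and real relocation processing involves more than this.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def high(value):
    """Model of @high: bits 16..31 of the value, with no range check."""
    return (value >> 16) & 0xFFFF


def h(value):
    """Model of @h after the ABI change: same bits, but values outside the
    32-bit signed range are reported as an overflow."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError("relocation truncated to fit")
    return (value >> 16) & 0xFFFF


if __name__ == "__main__":
    small = 0x12345678                # fits in 32 bits: both operators agree
    kernel_addr = 0xC000000000ABCDEF  # hypothetical 64-bit address
    print(hex(h(small)), hex(high(small)))
    print(hex(high(kernel_addr)))
    try:
        h(kernel_addr)
    except OverflowError as err:
        print("@h fails:", err)
```

This is why a sequence that builds a 64-bit value piece by piece needs @high in place of @h with the newer assembler.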
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  4. 27 May, 2014 9 commits
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · cd79bde2
      Linus Torvalds authored
      Pull virtio_blk fix from Jens Axboe:
       "There's a start/stop queue race in virtio_blk, which causes stalls and
        erratic behaviour for some.  I've had this queued up for 3.16 for a
        while, but I think we should push it into the current series as well.
      
        So I cherry picked the commit and added a stable marker as well, so it
        can propagate down"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        virtio_blk: fix race between start and stop queue
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · aa699a1d
      Linus Torvalds authored
      Pull two timer fixes from Thomas Gleixner:
       "Two small fixlets for ARM SoC clocksource drivers:
      
         - avoid calling functions which might sleep from interrupt [disabled]
           context in tcb_clksrc used on Atmel SoCs
      
         - use irq_force_affinity() to pin the per cpu timer interrupt on a
           not yet online cpu in the SiRFprimaII driver"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource: tcb_clksrc: Make tc_mode interrupt safe
        clocksource: marco: Fix the affinity set for local timer of CPU1
    • Merge tag 'fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 758b6712
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A slightly larger set of fixes than we'd like at this point in the
        release.  Hopefully our very last batch before 3.15:
      
        OMAP:
         - Fix boot regression with CPU_IDLE enabled
         - Fixes for audio playback on OMAP5
         - Clock rate setting fix for OMAP3
         - Misc idle/PM fixes
        Exynos:
         - Removal of a couple of power domains to work around issues with
           access when they are powered down
         - Enabling missing highspeed-i2c driver to make MMC regulators work
         - Secondary CPU spin-up fix for 4212
         - Remove MDMA1 engine to avoid conflicts on secure mode platforms
         - A few other DT fixes
        Marvell:
         - PCI-e fixes for clocks and resource allocation
      
        plus a few other smaller fixes, add a MAINTAINERS entry for reset
        drivers, etc"
      
      * tag 'fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (21 commits)
        MAINTAINERS: Add reset controller framework entry
        ARM: trusted_foundations: fix compile error on non-SMP
        ARM: at91: sam9260: fix compilation issues
        ARM: mvebu: fix definitions of PCIe interfaces on Armada 38x
        ARM: imx: fix error handling in ipu device registration
        ARM: OMAP4: Fix the boot regression with CPU_IDLE enabled
        ARM: dts: Keep LDO4 always ON for exynos5250-arndale board
        ARM: dts: Fix SPI interrupt numbers for exynos5420
        ARM: dts: fix incorrect ak8975 compatible for exynos4412-trats2 board
        ARM: OMAP2+: Fix DMA hang after off-idle
        ARM: OMAP2+: nand: Fix NAND on OMAP2 and OMAP3 boards
        ARM: dts: Remove g2d_pd node for exynos5420
        ARM: dts: Remove mau_pd node for exynos5420
        ARM: exynos_defconfig: enable HS-I2C to fix for mmc partition mount
        ARM: dts: disable MDMA1 node for exynos5420
        ARM: EXYNOS: fix the secondary CPU boot of exynos4212
        ARM: omap5: hwmod_data: Correct IDLEMODE for McPDM
        ARM: mvebu: mvebu-soc-id: keep clock enabled if PCIe unit is enabled
        ARM: mvebu: mvebu-soc-id: add missing clk_put() call
        ARM: at91/dt: sam9260: correct external trigger value
        ...
      758b6712
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 51d56652
      Linus Torvalds authored
      Pull pinctrl fix from Linus Walleij:
       "A single last pinctrl fix for the v3.15 series: the vt8500 driver was
        failing to update the output value when the combined set direction
        output and set value was executed"
      
      * tag 'pinctrl-v3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: vt8500: Ensure value reg is updated when setting direction
      51d56652
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma · c949ddf9
      Linus Torvalds authored
      Pull slave-dmaengine fixes from Vinod Koul:
       "We have three small fixes.
      
        First one from Andy reverts the devm_request irq as we need to ensure
        the tasklet is killed after irq is freed, so we need to do free irq in
        our code.  Other two from Arnd are fixing the compilation issue in
        omap and sa11x0 drivers with ARM randconfigs"
      
      * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: sa11x0: remove broken #ifdef
        dmaengine: omap: hide filter_fn for built-in drivers
        dmaengine: dw: went back to plain {request,free}_irq() calls
      c949ddf9
    • Philipp Zabel's avatar
      1b0fe6be
    • Hannes Reinecke's avatar
      dm mpath: really fix lockdep warning · 63d832c3
      Hannes Reinecke authored
      lockdep complains about a circular locking.  And indeed, we need to
      release the lock before calling dm_table_run_md_queue_async().
      
      As such, commit 4cdd2ad7 ("dm mpath: fix lock order inconsistency in
      multipath_ioctl") must also be reverted in addition to fixing the
      lock order in the other dm_table_run_md_queue_async() callers.
      Reported-by: Bart van Assche <bvanassche@acm.org>
      Tested-by: Bart van Assche <bvanassche@acm.org>
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      63d832c3
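The rule stated in this commit message (drop the lock before calling out to dm_table_run_md_queue_async()) follows a common pattern: record the decision while holding the lock, release it, and only then make the call. The following is a minimal userspace sketch of that pattern in Python, not the kernel code. The class `Multipath`, the field `queue_io`, and the callback `run_queue` are hypothetical names chosen for illustration.

```python
import threading


class Multipath:
    """Toy model: state protected by a lock, plus an external callback."""

    def __init__(self, run_queue):
        self.lock = threading.Lock()
        self.queue_io = True
        # Hypothetical stand-in for the call that must not run under the lock.
        self.run_queue = run_queue

    def resume_io(self):
        # Decide under the lock whether the queue needs to be kicked.
        with self.lock:
            need_run = self.queue_io
            self.queue_io = False
        # The lock is released here, so the callback can take whatever
        # locks it needs without creating a circular dependency on ours.
        if need_run:
            self.run_queue()
```

Making the call outside the `with` block is the whole point of the sketch: the callee never observes the caller's lock as held.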
    • Ming Lei's avatar
      virtio_blk: fix race between start and stop queue · aa0818c6
      Ming Lei authored
      When there isn't enough vring descriptor for adding to vq,
      blk-mq will be put as stopped state until some of pending
      descriptors are completed & freed.
      
      Unfortunately, the vq's interrupt may come just before
      blk-mq's BLK_MQ_S_STOPPED flag is set, so the blk-mq will
      still be kept as stopped even though lots of descriptors
      are completed and freed in the interrupt handler. The worst
      case is that all pending descriptors are freed in the
      interrupt handler, and the queue is kept as stopped forever.
      
      This patch fixes the problem by starting/stopping blk-mq
      with holding vq_lock.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      Conflicts:
      	drivers/block/virtio_blk.c
      aa0818c6
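The race described in this commit message can be replayed deterministically in a small userspace model. The sketch below is Python, not the virtio_blk driver; the class `Queue`, its fields, and the `between` hook (which stands in for an interrupt arriving at the worst possible moment) are assumptions made for illustration.

```python
import threading


class Queue:
    """Toy model of a request queue backed by a fixed pool of descriptors."""

    def __init__(self, free):
        self.free = free          # free descriptors
        self.stopped = False      # models the "stopped" flag
        self.lock = threading.Lock()

    def interrupt(self, completed):
        # Completion handler: free descriptors, restart the queue if stopped.
        with self.lock:
            self.free += completed
            if self.stopped:
                self.stopped = False

    def submit_racy(self, between=None):
        with self.lock:
            has_room = self.free > 0
            if has_room:
                self.free -= 1
        if not has_room:
            if between:
                between()         # interrupt lands before the flag is set
            self.stopped = True   # set too late: nobody will clear it
        return has_room

    def submit_fixed(self, between=None):
        with self.lock:
            if self.free > 0:
                self.free -= 1
                return True
            self.stopped = True   # check and stop are one atomic step
        if between:
            between()             # earliest point the interrupt can run
        return False
```

With `q = Queue(0)`, calling `q.submit_racy(between=lambda: q.interrupt(4))` leaves the queue stopped even though four descriptors are free, which is the stuck state the message describes. The same call through `submit_fixed` leaves the queue running, because the handler can only take the lock after the flag is already visible.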
    • Heinz Mauelshagen's avatar
      dm cache: always split discards on cache block boundaries · f1daa838
      Heinz Mauelshagen authored
      The DM cache target cannot cope with discards that span multiple cache
      blocks, so each discard bio that spans more than one cache block must
      get split by the DM core.
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org # v3.9+
      f1daa838
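To make "split on cache block boundaries" concrete, the arithmetic can be sketched as below. This is a Python illustration of the splitting rule only, not the DM core implementation; the function name `split_on_block_boundaries` and the choice of sectors as the unit are assumptions.

```python
def split_on_block_boundaries(start, length, block_size):
    """Split [start, start + length) so no piece crosses a block boundary.

    All values are in the same unit (sectors are assumed here).
    Returns a list of (piece_start, piece_length) tuples.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    pieces = []
    end = start + length
    while start < end:
        # First unit of the cache block that follows the one holding `start`.
        next_boundary = (start // block_size + 1) * block_size
        piece_end = min(end, next_boundary)
        pieces.append((start, piece_end - start))
        start = piece_end
    return pieces
```

For example, a discard of 300 sectors starting at sector 100 with 128-sector cache blocks yields `[(100, 28), (128, 128), (256, 128), (384, 16)]`: each piece lies inside exactly one cache block, and the lengths still sum to 300.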
  5. 26 May, 2014 2 commits
  6. 25 May, 2014 6 commits