Commits · 24e19d279f9e289e965b4bc4710fbccab824c4c4 · nexedi / linux

30 May, 2014 7 commits

Merge tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 24e19d27

Linus Torvalds authored May 30, 2014

Pull device-mapper fixes from Mike Snitzer:
 "A dm-cache stable fix to split discards on cache block boundaries
  because dm-cache cannot yet handle discards that span cache blocks.

  Really fix a dm-mpath LOCKDEP warning that was introduced in -rc1.

  Add a 'no_space_timeout' control to dm-thinp to restore the ability to
  queue IO indefinitely when no data space is available.  This fixes a
  change in behavior that was introduced in -rc6 where the timeout
  couldn't be disabled"

* tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm mpath: really fix lockdep warning
  dm cache: always split discards on cache block boundaries
  dm thin: add 'no_space_timeout' dm-thin-pool module param

24e19d27

x86_64: expand kernel stack to 16K · 6538b8ea

Minchan Kim authored May 28, 2014

While I play inhouse patches with much memory pressure on qemu-kvm,
3.14 kernel was randomly crashed. The reason was kernel stack overflow.

When I investigated the problem, the callstack was a little bit deeper
by involve with reclaim functions but not direct reclaim path.

I tried to diet stack size of some functions related with alloc/reclaim
so did a hundred of byte but overflow was't disappeard so that I encounter
overflow by another deeper callstack on reclaim/allocator path.

Of course, we might sweep every sites we have found for reducing
stack usage but I'm not sure how long it saves the world(surely,
lots of developer start to add nice features which will use stack
agains) and if we consider another more complex feature in I/O layer
and/or reclaim path, it might be better to increase stack size(
meanwhile, stack usage on 64bit machine was doubled compared to 32bit
while it have sticked to 8K. Hmm, it's not a fair to me and arm64
already expaned to 16K. )

So, my stupid idea is just let's expand stack size and keep an eye
toward stack consumption on each kernel functions via stacktrace of ftrace.
For example, we can have a bar like that each funcion shouldn't exceed 200K
and emit the warning when some function consumes more in runtime.
Of course, it could make false positive but at least, it could make a
chance to think over it.

I guess this topic was discussed several time so there might be
strong reason not to increase kernel stack size on x86_64, for me not
knowing so Ccing x86_64 maintainers, other MM guys and virtio
maintainers.

Here's an example call trace using up the kernel stack:

         Depth    Size   Location    (51 entries)
         -----    ----   --------
   0)     7696      16   lookup_address
   1)     7680      16   _lookup_address_cpa.isra.3
   2)     7664      24   __change_page_attr_set_clr
   3)     7640     392   kernel_map_pages
   4)     7248     256   get_page_from_freelist
   5)     6992     352   __alloc_pages_nodemask
   6)     6640       8   alloc_pages_current
   7)     6632     168   new_slab
   8)     6464       8   __slab_alloc
   9)     6456      80   __kmalloc
  10)     6376     376   vring_add_indirect
  11)     6000     144   virtqueue_add_sgs
  12)     5856     288   __virtblk_add_req
  13)     5568      96   virtio_queue_rq
  14)     5472     128   __blk_mq_run_hw_queue
  15)     5344      16   blk_mq_run_hw_queue
  16)     5328      96   blk_mq_insert_requests
  17)     5232     112   blk_mq_flush_plug_list
  18)     5120     112   blk_flush_plug_list
  19)     5008      64   io_schedule_timeout
  20)     4944     128   mempool_alloc
  21)     4816      96   bio_alloc_bioset
  22)     4720      48   get_swap_bio
  23)     4672     160   __swap_writepage
  24)     4512      32   swap_writepage
  25)     4480     320   shrink_page_list
  26)     4160     208   shrink_inactive_list
  27)     3952     304   shrink_lruvec
  28)     3648      80   shrink_zone
  29)     3568     128   do_try_to_free_pages
  30)     3440     208   try_to_free_pages
  31)     3232     352   __alloc_pages_nodemask
  32)     2880       8   alloc_pages_current
  33)     2872     200   __page_cache_alloc
  34)     2672      80   find_or_create_page
  35)     2592      80   ext4_mb_load_buddy
  36)     2512     176   ext4_mb_regular_allocator
  37)     2336     128   ext4_mb_new_blocks
  38)     2208     256   ext4_ext_map_blocks
  39)     1952     160   ext4_map_blocks
  40)     1792     384   ext4_writepages
  41)     1408      16   do_writepages
  42)     1392      96   __writeback_single_inode
  43)     1296     176   writeback_sb_inodes
  44)     1120      80   __writeback_inodes_wb
  45)     1040     160   wb_writeback
  46)      880     208   bdi_writeback_workfn
  47)      672     144   process_one_work
  48)      528     112   worker_thread
  49)      416     240   kthread
  50)      176     176   ret_from_fork

[ Note: the problem is exacerbated by certain gcc versions that seem to
  generate much bigger stack frames due to apparently bad coalescing of
  temporaries and generating too many spills.  Rusty saw gcc-4.6.4 using
  35% more stack on the virtio path than 4.8.2 does, for example.

  Minchan not only uses such a bad gcc version (4.6.3 in his case), but
  some of the stack use is due to debugging (CONFIG_DEBUG_PAGEALLOC is
  what causes that kernel_map_pages() frame, for example). But we're
  clearly getting too close.

  The VM code also seems to have excessive stack frames partly for the
  same compiler reason, triggered by excessive inlining and lots of
  function arguments.

  We need to improve on our stack use, but in the meantime let's do this
  simple stack increase too.  Unlike most earlier reports, there is
  nothing simple that stands out as being really horribly wrong here,
  apart from the fact that the stack frames are just bigger than they
  should need to be.        - Linus ]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S Tsirkin <mst@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: PJ Waskiewicz <pjwaskiewicz@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6538b8ea

Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 6f6111e4

Linus Torvalds authored May 30, 2014

Pull vfs dcache livelock fix from Al Viro:
 "Fixes for livelocks in shrink_dentry_list() introduced by fixes to
  shrink list corruption; the root cause was that trylock of parent's
  ->d_lock could be disrupted by d_walk() happening on other CPUs,
  resulting in shrink_dentry_list() making no progress *and* the same
  d_walk() being called again and again for as long as
  shrink_dentry_list() doesn't get past that mess.

  The solution is to have shrink_dentry_list() treat that trylock
  failure not as 'try to do the same thing again', but 'lock them in the
  right order'"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  dentry_kill() doesn't need the second argument now
  dealing with the rest of shrink_dentry_list() livelock
  shrink_dentry_list(): take parent's ->d_lock earlier
  expand dentry_kill(dentry, 0) in shrink_dentry_list()
  split dentry_kill()
  lift the "already marked killed" case into shrink_dentry_list()

6f6111e4

dentry_kill() doesn't need the second argument now · 8cbf74da
Al Viro authored May 29, 2014
```
it's 1 in the only remaining caller.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
8cbf74da

dealing with the rest of shrink_dentry_list() livelock · b2b80195

Al Viro authored May 29, 2014

We have the same problem with ->d_lock order in the inner loop, where
we are dropping references to ancestors.  Same solution, basically -
instead of using dentry_kill() we use lock_parent() (introduced in the
previous commit) to get that lock in a safe way, recheck ->d_count
(in case if lock_parent() has ended up dropping and retaking ->d_lock
and somebody managed to grab a reference during that window), trylock
the inode->i_lock and use __dentry_kill() to do the rest.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b2b80195

shrink_dentry_list(): take parent's ->d_lock earlier · 046b961b

Al Viro authored May 29, 2014

The cause of livelocks there is that we are taking ->d_lock on
dentry and its parent in the wrong order, forcing us to use
trylock on the parent's one.  d_walk() takes them in the right
order, and unfortunately it's not hard to create a situation
when shrink_dentry_list() can't make progress since trylock
keeps failing, and shrink_dcache_parent() or check_submounts_and_drop()
keeps calling d_walk() disrupting the very shrink_dentry_list() it's
waiting for.

Solution is straightforward - if that trylock fails, let's unlock
the dentry itself and take locks in the right order.  We need to
stabilize ->d_parent without holding ->d_lock, but that's doable
using RCU.  And we'd better do that in the very beginning of the
loop in shrink_dentry_list(), since the checks on refcount, etc.
would need to be redone anyway.

That deals with a half of the problem - killing dentries on the
shrink list itself.  Another one (dropping their parents) is
in the next commit.

locking parent is interesting - it would be easy to do rcu_read_lock(),
lock whatever we think is a parent, lock dentry itself and check
if the parent is still the right one.  Except that we need to check
that *before* locking the dentry, or we are risking taking ->d_lock
out of order.  Fortunately, once the D1 is locked, we can check if
D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
can start or stop pointing to D1 only under D1->d_lock, so taking
D1->d_lock is enough.  In other words, the right solution is
rcu_read_lock/lock what looks like parent right now/check if it's
still our parent/rcu_read_unlock/lock the child.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

046b961b

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · fe45736f

Linus Torvalds authored May 29, 2014

Pull ARM fixes from Russell King:
 "The usual random collection of relatively small ARM fixes"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs
  ARM: 8064/1: fix v7-M signal return
  ARM: 8057/1: amba: Add Qualcomm vendor ID.
  ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode
  ARM: 8051/1: put_user: fix possible data corruption in put_user
  ARM: 8048/1: fix v7-M setup stack location

fe45736f

29 May, 2014 6 commits

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · a991639c

Linus Torvalds authored May 29, 2014

Pull arm64 fix from Will Deacon:
 "Fix CoW regression for transparent hugepages by routing set_pmd_at to
  set_pte_at, which correctly handles PTE_WRITE and will mark the
  resulting table entry as read-only where appropriate"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: fix pmd_write CoW brokenness

a991639c

Merge tag 'pm+acpi-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · f035b3d3

Linus Torvalds authored May 29, 2014

Pull ACPI and power management fixes from Rafael Wysocki:
 "These are three stable-candidate fixes, one for the ACPI thermal
  driver and two for cpufreq drivers.

  Specifics:

   - A workqueue is destroyed too early during the ACPI thermal driver
     module unload which leads to a NULL pointer dereference in the
     driver's remove callback.  Fix from Aaron Lu.

   - A wrong argument is passed to devm_regulator_get_optional() in the
     probe routine of the cpu0 cpufreq driver which leads to resource
     leaks if the driver is unbound from the cpufreq platform device.
     Fix from Lucas Stach.

   - A lock is missing in cpufreq_governor_dbs() which leads to memory
     corruption and NULL pointer dereferences during system
     suspend/resume, for example.  Fix from Bibek Basu"

* tag 'pm+acpi-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / thermal: fix workqueue destroy order
  cpufreq: cpu0: drop wrong devm usage
  cpufreq: remove race while accessing cur_policy

f035b3d3

Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux · 15a7b60e

Linus Torvalds authored May 29, 2014

Pull clock fixes from Mike Turquette:
 "Small number of user-visible regression fixes for clock drivers.

  There is a memory leak fix for an ST platform, an infinite Loop Of
  Doom fix for the recent changes to the basic clock divider (hopefully
  the last fix for those recent changes) and some Tegra PLL changes
  which keep PCI from being hosed on that platform"

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
  clk: st: Fix memory leak
  clk: divider: Fix table round up function
  clk: tegra: Fix enabling of PLLE
  clk: tegra: Introduce divider mask and shift helpers
  clk: tegra: Fix PLLE programming

15a7b60e

expand dentry_kill(dentry, 0) in shrink_dentry_list() · ff2fde99

Al Viro authored May 28, 2014

Result will be massaged to saner shape in the next commits.  It is
ugly, no questions - the point of that one is to be a provably
equivalent transformation (and it might be worth splitting a bit
more).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ff2fde99

split dentry_kill() · e55fd011

Al Viro authored May 28, 2014

... into trylocks and everything else.  The latter (actual killing)
is __dentry_kill().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e55fd011

arm64: mm: fix pmd_write CoW brokenness · ceb21835

Will Deacon authored May 27, 2014

Commit 9c7e535f ("arm64: mm: Route pmd thp functions through pte
equivalents") changed the pmd manipulator and accessor functions to
convert the target pmd to a pte, process it with the pte functions, then
convert it back. Along the way, we gained support for PTE_WRITE, however
this is completely ignored by set_pmd_at, and so we fail to set the
PMD_SECT_RDONLY for PMDs, resulting in all sorts of lovely failures (like
CoW not working).

Partially reverting the offending commit (by making use of
PMD_SECT_RDONLY explicitly for pmd_{write,wrprotect,mkwrite} functions)
leads to further issues because pmd_write can then return potentially
incorrect values for page table entries marked as RDONLY, leading to
BUG_ON(pmd_write(entry)) tripping under some THP workloads.

This patch fixes the issue by routing set_pmd_at through set_pte_at,
which correctly takes the PTE_WRITE flag into account. Given that
THP mappings are always anonymous, the additional cache-flushing code
in __sync_icache_dcache won't impose any significant overhead as the
flush will be skipped.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

ceb21835

28 May, 2014 10 commits

Merge tag 'sound-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · f2159d1e

Linus Torvalds authored May 28, 2014

Pull sound fixes from Takashi Iwai:
 "Just two small stable fixes: an HD-audio fix for the new Intel
  chipsets and a PM handling fix in PCM dmaengine core"

* tag 'sound-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix onboard audio on Intel H97/Z97 chipsets
  ALSA: pcm_dmaengine: Add check during device suspend

f2159d1e

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 28269580

Linus Torvalds authored May 28, 2014

Pull vfs fix from Al Viro:
 "Oh, well...  Still nothing useful on that livelock (I had something
  that looked kinda-sorta like a non-invasive solution, but it
  deadlocks), so it's just Miklos' vmsplice fix for now"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: fix vmplice_to_user()

28269580

ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs · 3f8517e7

Nicolas Pitre authored May 23, 2014

The content of /sys/devices/system/cpu/cpu*/online  is still 1 for those
CPUs that the switcher has removed even though the global state in
/sys/devices/system/cpu/online is updated correctly.

It turns out that commit 0902a904 ("Driver core: Use generic
offline/online for CPU offline/online") has changed the way those files
retrieve their content by relying on on the generic attribute handling
code.  The switcher, by calling cpu_down() directly, bypasses this
handling and the attribute value doesn't get updated.

Fix this by calling device_offline()/device_online() instead.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

3f8517e7

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 4efdedca

Linus Torvalds authored May 28, 2014

Pull kvm fixes from Paolo Bonzini:
 "Small fixes for x86, slightly larger fixes for PPC, and a forgotten
  s390 patch.  The PPC fixes are important because they fix breakage
  that is new in 3.15"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: announce irqfd capability
  KVM: x86: disable master clock if TSC is reset during suspend
  KVM: vmx: disable APIC virtualization in nested guests
  KVM guest: Make pv trampoline code executable
  KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit
  KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit
  KVM: PPC: Book3S: HV: make _PAGE_NUMA take effect

4efdedca

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 9e3d6331

Linus Torvalds authored May 28, 2014

Pull two powerpc fixes from Ben Herrenschmidt:
 "Here's a pair of powerpc fixes for 3.15 which are also going to
  stable.

  One's a fix for building with newer binutils (the problem currently
  only affects the BookE kernels but the affected macro might come back
  into use on BookS platforms at any time).  Unfortunately, the binutils
  maintainer did a backward incompatible change to a construct that we
  use so we have to add Makefile check.

  The other one is a fix for CPUs getting stuck in kexec when running
  single threaded.  Since we routinely use kexec on power (including in
  our newer bootloaders), I deemed that important enough"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
  powerpc: Fix 64 bit builds with binutils 2.24

9e3d6331

lift the "already marked killed" case into shrink_dentry_list() · 64fd72e0

Al Viro authored May 28, 2014

It can happen only when dentry_kill() is called with unlock_on_failure
equal to 0 - other callers had dentry pinned until the moment they've
got ->d_lock and DCACHE_DENTRY_KILLED is set only after lockref_mark_dead().

IOW, only one of three call sites of dentry_kill() might end up reaching
that code. Just move it there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

64fd72e0

vfs: fix vmplice_to_user() · b6dd6f47

Miklos Szeredi authored May 27, 2014

Commit 6130f531 "switch vmsplice_to_user() to copy_page_to_iter()" in
v3.15-rc1 broke vmsplice(2).

This patch fixes two bugs:

 - count is not initialized to a proper value, which resulted in no data
   being copied

 - if rw_copy_check_uvector() returns negative then the iov might be leaked.

Tested OK.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b6dd6f47

Merge tag 'clk-tegra-fixes-3.15' of... · 51784380

Mike Turquette authored May 27, 2014

Merge tag 'clk-tegra-fixes-3.15' of git://nv-tegra.nvidia.com/user/pdeschrijver/linux into clk-fixes

PLLE fixes for 3.15

51784380

powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode · 011e4b02

Srivatsa S. Bhat authored May 27, 2014

If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
(ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
get the following messages during boot:

[    0.089866] POWER8 performance monitor hardware support registered
[    0.089985] power8-pmu: PMAO restore workaround active.
[    5.095419] Processor 1 is stuck.
[   10.097933] Processor 2 is stuck.
[   15.100480] Processor 3 is stuck.
[   20.102982] Processor 4 is stuck.
[   25.105489] Processor 5 is stuck.
[   30.108005] Processor 6 is stuck.
[   35.110518] Processor 7 is stuck.
[   40.113369] Processor 9 is stuck.
[   45.115879] Processor 10 is stuck.
[   50.118389] Processor 11 is stuck.
[   55.120904] Processor 12 is stuck.
[   60.123425] Processor 13 is stuck.
[   65.125970] Processor 14 is stuck.
[   70.128495] Processor 15 is stuck.
[   75.131316] Processor 17 is stuck.

Note that only the sibling threads are stuck, while the primary threads (0, 8,
16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
that kexec tries to wakeup (bring online) the sibling threads of all the cores,
before performing kexec:

[ 9464.131231] Starting new kernel
[ 9464.148507] kexec: Waking offline cpu 1.
[ 9464.148552] kexec: Waking offline cpu 2.
[ 9464.148600] kexec: Waking offline cpu 3.
[ 9464.148636] kexec: Waking offline cpu 4.
[ 9464.148671] kexec: Waking offline cpu 5.
[ 9464.148708] kexec: Waking offline cpu 6.
[ 9464.148743] kexec: Waking offline cpu 7.
[ 9464.148779] kexec: Waking offline cpu 9.
[ 9464.148815] kexec: Waking offline cpu 10.
[ 9464.148851] kexec: Waking offline cpu 11.
[ 9464.148887] kexec: Waking offline cpu 12.
[ 9464.148922] kexec: Waking offline cpu 13.
[ 9464.148958] kexec: Waking offline cpu 14.
[ 9464.148994] kexec: Waking offline cpu 15.
[ 9464.149030] kexec: Waking offline cpu 17.

Instrumenting this piece of code revealed that the cpu_up() operation actually
fails with -EBUSY. Thus, only the primary threads of all the cores are online
during kexec, and hence this is a sure-shot receipe for disaster, as explained
in commit e8e5c215 (powerpc/kexec: Fix orphaned offline CPUs across kexec),
as well as in the comment above wake_offline_cpus().

It turns out that cpu_up() was returning -EBUSY because the variable
'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
by migrate_to_reboot_cpu() inside kernel_kexec().

Now, migrate_to_reboot_cpu() was originally written with the assumption that
any further code will not need to perform CPU hotplug, since we are anyway in
the reboot path. However, kexec is clearly not such a case, since we depend on
onlining CPUs, atleast on powerpc.

So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
kexec path, to fix this regression in kexec on powerpc.

Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
can catch such issues more easily in the future.

Fixes: c97102ba (kexec: migrate to reboot cpu)
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

011e4b02

powerpc: Fix 64 bit builds with binutils 2.24 · 7998eb3d

Guenter Roeck authored May 15, 2014

With binutils 2.24, various 64 bit builds fail with relocation errors
such as

arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
	(.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI
	against symbol `interrupt_base_book3e' defined in .text section
	in arch/powerpc/kernel/built-in.o
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
	(.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI
	against symbol `interrupt_end_book3e' defined in .text section
	in arch/powerpc/kernel/built-in.o

The assembler maintainer says:

 I changed the ABI, something that had to be done but unfortunately
 happens to break the booke kernel code.  When building up a 64-bit
 value with lis, ori, shl, oris, ori or similar sequences, you now
 should use @high and @higha in place of @h and @ha.  @h and @ha
 (and their associated relocs R_PPC64_ADDR16_HI and R_PPC64_ADDR16_HA)
 now report overflow if the value is out of 32-bit signed range.
 ie. @h and @ha assume you're building a 32-bit value. This is needed
 to report out-of-range -mcmodel=medium toc pointer offsets in @toc@h
 and @toc@ha expressions, and for consistency I did the same for all
 other @h and @ha relocs.

Replacing @h with @high in one strategic location fixes the relocation
errors. This has to be done conditionally since the assembler either
supports @h or @high but not both.

Cc: <stable@vger.kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

7998eb3d

27 May, 2014 9 commits

Merge branch 'for-linus' of git://git.kernel.dk/linux-block · cd79bde2

Linus Torvalds authored May 27, 2014

Pull virtio_blk fix from Jens Axboe:
 "There's a start/stop queue race in virtio_blk, which causes stalls and
  erratic behaviour for some.  I've had this queued up for 3.16 for a
  while, but I think we should push it into the current series as well.

  So I cherry picked the commit and added a stable marker as well, so it
  can propagate down"

* 'for-linus' of git://git.kernel.dk/linux-block:
  virtio_blk: fix race between start and stop queue

cd79bde2

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · aa699a1d

Linus Torvalds authored May 27, 2014

Pull two timer fixes from Thomas Gleixner:
 "Two small fixlets for ARM SoC clocksource drivers:

   - avoid calling functions which might sleep from interrupt [disabled]
     context in tcb_clksrc used on Atmel SoCs

   - use irq_force_affinity() to pin the per cpu timer interrupt on a
     not yet online cpu in the SiRFprimaII driver"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: tcb_clksrc: Make tc_mode interrupt safe
  clocksource: marco: Fix the affinity set for local timer of CPU1

aa699a1d

Merge tag 'fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 758b6712

Linus Torvalds authored May 27, 2014

Pull ARM SoC fixes from Olof Johansson:
 "A slightly larger set of fixes than we'd like at this point in the
  release.  Hopefully our very last batch before 3.15:

  OMAP:
   - Fix boot regression with CPU_IDLE enabled
   - Fixes for audio playback on OMAP5
   - Clock rate setting fix for OMAP3
   - Misc idle/PM fixes
  Exynos:
   - Removal of a couple of power domains to work around issues with
     access when they are powered down
   - Enabling missing highspeed-i2c driver to make MMC regulators work
   - Secondary CPU spin-up fix for 4212
   - Remove MDMA1 engine to avoid conflicts on secure mode platforms
   - A few other DT fixes
  Marvell:
   - PCI-e fixes for clocks and resource allocation

  plus a few other smaller fixes, add a MAINTAINERS entry for reset
  drivers, etc"

* tag 'fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (21 commits)
  MAINTAINERS: Add reset controller framework entry
  ARM: trusted_foundations: fix compile error on non-SMP
  ARM: at91: sam9260: fix compilation issues
  ARM: mvebu: fix definitions of PCIe interfaces on Armada 38x
  ARM: imx: fix error handling in ipu device registration
  ARM: OMAP4: Fix the boot regression with CPU_IDLE enabled
  ARM: dts: Keep LDO4 always ON for exynos5250-arndale board
  ARM: dts: Fix SPI interrupt numbers for exynos5420
  ARM: dts: fix incorrect ak8975 compatible for exynos4412-trats2 board
  ARM: OMAP2+: Fix DMA hang after off-idle
  ARM: OMAP2+: nand: Fix NAND on OMAP2 and OMAP3 boards
  ARM: dts: Remove g2d_pd node for exynos5420
  ARM: dts: Remove mau_pd node for exynos5420
  ARM: exynos_defconfig: enable HS-I2C to fix for mmc partition mount
  ARM: dts: disable MDMA1 node for exynos5420
  ARM: EXYNOS: fix the secondary CPU boot of exynos4212
  ARM: omap5: hwmod_data: Correct IDLEMODE for McPDM
  ARM: mvebu: mvebu-soc-id: keep clock enabled if PCIe unit is enabled
  ARM: mvebu: mvebu-soc-id: add missing clk_put() call
  ARM: at91/dt: sam9260: correct external trigger value
  ...

758b6712

Merge tag 'pinctrl-v3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 51d56652

Linus Torvalds authored May 27, 2014

Pull pinctrl fix from Linus Walleij:
 "A single last pinctrl fix for the v3.15 series: the vt8500 driver was
  failing to update the output value when the combined set direction
  output and set value was executed"

* tag 'pinctrl-v3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: vt8500: Ensure value reg is updated when setting direction

51d56652

Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma · c949ddf9

Linus Torvalds authored May 27, 2014

Pull slave-dmaengine fixes from Vinod Koul:
 "We have three small fixes.

  First one from Andy reverts the devm_request irq as we need to ensure
  the tasklet is killed after irq is freed, so we need to do free irq in
  our code.  Other two from Arnd are fixing the compilation issue in
  omap and sa11x0 drivers with ARM randconfigs"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: sa11x0: remove broken #ifdef
  dmaengine: omap: hide filter_fn for built-in drivers
  dmaengine: dw: went back to plain {request,free}_irq() calls

c949ddf9

MAINTAINERS: Add reset controller framework entry · 1b0fe6be

Philipp Zabel authored May 27, 2014

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Olof Johansson <olof@lixom.net>

1b0fe6be

dm mpath: really fix lockdep warning · 63d832c3

Hannes Reinecke authored May 26, 2014

lockdep complains about a circular locking.  And indeed, we need to
release the lock before calling dm_table_run_md_queue_async().

As such, commit 4cdd2ad7 ("dm mpath: fix lock order inconsistency in
multipath_ioctl") must also be reverted in addition to fixing the
lock order in the other dm_table_run_md_queue_async() callers.
Reported-by: Bart van Assche <bvanassche@acm.org>
Tested-by: Bart van Assche <bvanassche@acm.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

63d832c3

virtio_blk: fix race between start and stop queue · aa0818c6

Ming Lei authored May 16, 2014

When there isn't enough vring descriptor for adding to vq,
blk-mq will be put as stopped state until some of pending
descriptors are completed & freed.

Unfortunately, the vq's interrupt may come just before
blk-mq's BLK_MQ_S_STOPPED flag is set, so the blk-mq will
still be kept as stopped even though lots of descriptors
are completed and freed in the interrupt handler. The worst
case is that all pending descriptors are freed in the
interrupt handler, and the queue is kept as stopped forever.

This patch fixes the problem by starting/stopping blk-mq
with holding vq_lock.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

Conflicts:
	drivers/block/virtio_blk.c

aa0818c6

dm cache: always split discards on cache block boundaries · f1daa838

Heinz Mauelshagen authored May 23, 2014

The DM cache target cannot cope with discards that span multiple cache
blocks, so each discard bio that spans more than one cache block must
get split by the DM core.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # v3.9+

f1daa838

26 May, 2014 2 commits

Merge branches 'pm-cpufreq' and 'acpi-thermal' · 9b961aa9

Rafael J. Wysocki authored May 26, 2014

* pm-cpufreq:
  cpufreq: cpu0: drop wrong devm usage
  cpufreq: remove race while accessing cur_policy

* acpi-thermal:
  ACPI / thermal: fix workqueue destroy order

9b961aa9

ACPI / thermal: fix workqueue destroy order · 2807bd18

Aaron Lu authored May 26, 2014

When the thermal module is to be removed, we should destroy the wq
acpi_thermal_pm_queue after the ACPI driver's remove callback is
executed as we will need to flush the workqueue there, or a NULL pointer
access will be hit.
Reported-and-tested-by: Kui Zhang <kuizhang@gmail.com>
References: http://www.spinics.net/lists/kernel/msg1747251.html
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

2807bd18

25 May, 2014 6 commits

Linux 3.15-rc7 · c7208164
Linus Torvalds authored May 25, 2014

c7208164

ARM: 8064/1: fix v7-M signal return · 483a6c9d

Rabin Vincent authored May 24, 2014

According to the ARM ARM, the behaviour is UNPREDICTABLE if the PC read
from the exception return stack is not half word aligned.  See the
pseudo code for ExceptionReturn() and PopStack().

The signal handler's address has the bit 0 set, and setup_return()
directly writes this to regs->ARM_pc.  Current hardware happens to
discard this bit, but QEMU's emulation doesn't and this makes processes
crash.  Mask out bit 0 before the exception return in order to get
predictable behaviour.

Fixes: 19c4d593 ("ARM: ARMv7-M: Add support for exception handling")

Cc: stable@kernel.org
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

483a6c9d

ARM: 8057/1: amba: Add Qualcomm vendor ID. · fbebf597

srinik authored May 15, 2014

This patch adds Qualcomm amba vendor Id to the list. This ID is used in mmci driver. The ID selected in same lines like 0x41 is "A" for ARM, 0x51 is "Q" for Qualcomm.

As there are no physical register on Qcom SOC for amba vendor id, this is a fake ID assigned based on "Q" prefix from Qualcomm.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

fbebf597

ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode · 8203d5b6

Nikolay Borisov authored May 08, 2014

The arm EABI states that unwind opcode 10100nnn means pop register r4-4[4+nnn],aditionally there is a similar unwind opcode: 10101nnn which means the same thing plus popping r14. Those two cases are handled by the unwind_exec_pop_r4_to_rN function which checks whether the 4th bit is set and does r14 popping.

However, up until now it has been checking whether the 8th bit was set (mask & 0x80) instead of the 4th (mask & 0x8), a simple to make typo but this meant that we were always popping r14 even if we had the former opcode.

This patch changes the mask so that the 2 unwind opcodes are being handled correctly.
Signed-off-by: Nikolay Borisov <Nikolay.Borisov@arm.com>
Reviewed-by: Anurag Aggarwal <anurag19aggarwal@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

8203d5b6

ARM: 8051/1: put_user: fix possible data corruption in put_user · 537094b6

Andrey Ryabinin authored May 07, 2014

According to arm procedure call standart r2 register is call-cloberred.
So after the result of x expression was put into r2 any following
function call in p may overwrite r2. To fix this, the result of p
expression must be saved to the temporary variable before the
assigment x expression to __r2.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

537094b6

ARM: 8048/1: fix v7-M setup stack location · f9ff907b

Rabin Vincent authored May 03, 2014

__v7m_setup_stack currently sits in the .proc.info.init section, and
thus creates a bogus proc info entry (which by the way matches any
unknown CPU IDs, due to the entry's mask being 0).  Move it out of
there.
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

f9ff907b