1. 29 Jul, 2014 17 commits
  2. 28 Jul, 2014 8 commits
    • mm: kmemleak: avoid false negatives on vmalloc'ed objects · abd263fb
      Catalin Marinas authored
      commit 7f88f88f upstream.
      
      Commit 248ac0e1 ("mm/vmalloc: remove guard page from between vmap
      blocks") had the side effect of making vmap_area.va_end member point to
      the next vmap_area.va_start.  This was creating an artificial reference
      to vmalloc'ed objects and kmemleak was rarely reporting vmalloc() leaks.
      
      This patch marks the vmap_area containing pointers explicitly and
      reduces the min ref_count to 2 as vm_struct still contains a reference
      to the vmalloc'ed object.  The kmemleak add_scan_area() function has
      been improved to allow a SIZE_MAX argument covering the rest of the
      object (for simpler calling sites).
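      
      As an illustration of the improved interface (a minimal sketch of a
      hypothetical caller, not one of the call sites touched by this patch):
      kmemleak_scan_area() can now be passed SIZE_MAX to mean "scan from this
      pointer to the end of the enclosing object", so callers no longer have
      to compute the remaining length themselves.
      
      /* Sketch: only the pointer-bearing tail of the object is scanned. */
      #include <linux/kmemleak.h>
      #include <linux/slab.h>
      
      struct foo {
              char blob[128];         /* data kmemleak need not scan */
              void *refs[16];         /* pointers kmemleak should follow */
      };
      
      static struct foo *foo_alloc(void)
      {
              struct foo *f = kmalloc(sizeof(*f), GFP_KERNEL);
      
              if (f)
                      kmemleak_scan_area(&f->refs, SIZE_MAX, GFP_KERNEL);
              return f;
      }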
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • shmem: fix splicing from a hole while it's punched · a428dc00
      Hugh Dickins authored
      commit b1a36650 upstream.
      
      shmem_fault() is the actual culprit in trinity's hole-punch starvation,
      and the most significant cause of such problems: since a page faulted is
      one that then appears page_mapped(), needing unmap_mapping_range() and
      i_mmap_mutex to be unmapped again.
      
      But it is not the only way in which a page can be brought into a hole in
      the radix_tree while that hole is being punched; and Vlastimil's testing
      implies that if enough other processors are busy filling in the hole,
      then shmem_undo_range() can be kept from completing indefinitely.
      
      shmem_file_splice_read() is the main other user of SGP_CACHE, which can
      instantiate shmem pagecache pages in the read-only case (without holding
      i_mutex, so perhaps concurrently with a hole-punch).  Probably it's
      silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
      ought to be safe, but might bring surprises - not a change to be rushed.
      
      shmem_read_mapping_page_gfp() is an internal interface used by
      drivers/gpu/drm GEM (and next by uprobes): it should be okay.  And
      shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
      called internally by the kernel (perhaps for a stacking filesystem,
      which might rely on holes to be reserved): it's unclear whether it could
      be provoked to keep hole-punch busy or not.
      
      We could apply the same umbrella as now used in shmem_fault() to
      shmem_file_splice_read() and the others; but it looks ugly, and use over
      a range raises questions - should it actually be per page? can these get
      starved themselves?
      
      The origin of this part of the problem is my v3.1 commit d0823576
      ("mm: pincer in truncate_inode_pages_range"), once it was duplicated
      into shmem.c.  It seemed like a nice idea at the time, to ensure
      (barring RCU lookup fuzziness) that there's an instant when the entire
      hole is empty; but the indefinitely repeated scans to ensure that make
      it vulnerable.
      
      Revert that "enhancement" to hole-punch from shmem_undo_range(), but
      retain the unproblematic rescanning when it's truncating; add a couple
      of comments there.
      
      Remove the "indices[0] >= end" test: that is now handled satisfactorily
      by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
      to be worth avoiding here.
      
      But if we do not always loop indefinitely, we do need to handle the case
      of swap swizzled back to page before shmem_free_swap() gets it: add a
      retry for that case, as suggested by Konstantin Khlebnikov; and for the
      case of page swizzled back to swap, as suggested by Johannes Weiner.
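      
      A sketch of that retry idea only, with deliberately hypothetical helper
      names (this is not the actual mm/shmem.c code): when the entry at an
      index changes type between lookup and removal, redo that single index
      instead of rescanning the whole hole.
      
      static void punch_one_index_sketch(struct address_space *mapping,
                                         pgoff_t index)
      {
              for (;;) {
                      void *entry = sketch_lookup_entry(mapping, index);
      
                      if (!entry)
                              return;         /* already a hole */
                      if (sketch_is_swap_entry(entry)) {
                              if (sketch_free_swap(mapping, index, entry))
                                      return; /* swap entry freed */
                              continue;       /* swizzled back to a page: retry */
                      }
                      if (sketch_truncate_page(mapping, index, entry))
                              return;         /* page removed */
                      /* page swizzled back to swap under us: retry */
              }
      }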
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Suggested-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: <stable@vger.kernel.org>	[3.1+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • shmem: fix faulting into a hole, not taking i_mutex · 38d05809
      Hugh Dickins authored
      commit 8e205f77 upstream.
      
      Commit f00cdc6d ("shmem: fix faulting into a hole while it's
      punched") was buggy: Sasha sent a lockdep report to remind us that
      grabbing i_mutex in the fault path is a no-no (write syscall may already
      hold i_mutex while faulting user buffer).
      
      We tried a completely different approach (see following patch) but that
      proved inadequate: good enough for a rational workload, but not good
      enough against trinity - which forks off so many mappings of the object
      that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
      into serious starvation when concurrent faults force the puncher to fall
      back to single-page unmap_mapping_range() searches of the i_mmap tree.
      
      So return to the original umbrella approach, but keep away from i_mutex
      this time.  We really don't want to bloat every shmem inode with a new
      mutex or completion, just to protect this unlikely case from trinity.
      So extend the original with wait_queue_head on stack at the hole-punch
      end, and wait_queue item on the stack at the fault end.
      
      This involves further use of i_lock to guard against the races: lockdep
      has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
      i_lock around wake_up_bit(), which is comparable to what we do here.
      i_lock is more convenient, but we could switch to shmem's info->lock.
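      
      A simplified sketch of that on-stack pattern (illustrative only, not the
      exact mm/shmem.c code): the hole-punch side publishes a wait_queue_head
      living on its own stack, faulting tasks park on it with an on-stack wait
      entry, and i_lock serialises publication, lookup and wake-up.
      
      static void punch_side_sketch(struct inode *inode)
      {
              DECLARE_WAIT_QUEUE_HEAD_ONSTACK(punch_waitq);
      
              spin_lock(&inode->i_lock);
              inode->i_private = &punch_waitq;        /* advertise the punch */
              spin_unlock(&inode->i_lock);
      
              /* ... unmap and truncate the hole ... */
      
              spin_lock(&inode->i_lock);
              inode->i_private = NULL;
              wake_up_all(&punch_waitq);      /* i_lock held, as in unlock_new_inode() */
              spin_unlock(&inode->i_lock);
      }
      
      static void fault_side_sketch(struct inode *inode)
      {
              DEFINE_WAIT(fault_wait);
              wait_queue_head_t *waitq;
      
              spin_lock(&inode->i_lock);
              waitq = inode->i_private;
              if (waitq) {
                      prepare_to_wait(waitq, &fault_wait, TASK_UNINTERRUPTIBLE);
                      spin_unlock(&inode->i_lock);
                      schedule();             /* let the punch make progress */
                      spin_lock(&inode->i_lock);
                      finish_wait(waitq, &fault_wait);
              }
              spin_unlock(&inode->i_lock);
      }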
      
      This issue has been tagged with CVE-2014-4171, which will require commit
      f00cdc6d and this and the following patch to be backported: we
      suggest to 3.1+, though in fact the trinity forkbomb effect might go
      back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
      not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
      Anyone running trinity on 3.0 and earlier? I don't think we need care.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Tested-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: <stable@vger.kernel.org>	[3.1+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • shmem: fix faulting into a hole while it's punched · 8685789b
      Hugh Dickins authored
      commit f00cdc6d upstream.
      
      Trinity finds that mmap access to a hole while it's punched from shmem
      can prevent the madvise(MADV_REMOVE) or fallocate(FALLOC_FL_PUNCH_HOLE)
      from completing, until the reader chooses to stop; with the puncher's
      hold on i_mutex locking out all other writers until it can complete.
      
      It appears that the tmpfs fault path is too light in comparison with its
      hole-punching path, lacking an i_data_sem to obstruct it; but we don't
      want to slow down the common case.
      
      Extend shmem_fallocate()'s existing range notification mechanism, so
      shmem_fault() can refrain from faulting pages into the hole while it's
      punched, waiting instead on i_mutex (when safe to sleep; or repeatedly
      faulting when not).
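      
      A simplified sketch of that range notification, the approach this commit
      takes (later reworked, as the commits above describe); field and function
      names here are illustrative, not the exact mm/shmem.c code:
      
      struct falloc_range_sketch {
              pgoff_t start;          /* first page of the hole being punched */
              pgoff_t next;           /* one past the last page */
      };
      
      /* Returns 0 to proceed with the normal fault. */
      static int fault_vs_punch_sketch(struct inode *inode, pgoff_t pgoff,
                                       bool may_sleep)
      {
              struct falloc_range_sketch *falloc;
      
              spin_lock(&inode->i_lock);
              falloc = inode->i_private;      /* published by shmem_fallocate() */
              if (falloc && pgoff >= falloc->start && pgoff < falloc->next) {
                      spin_unlock(&inode->i_lock);
                      if (may_sleep) {
                              mutex_lock(&inode->i_mutex);    /* wait for the punch */
                              mutex_unlock(&inode->i_mutex);
                              return VM_FAULT_RETRY;
                      }
                      return VM_FAULT_NOPAGE; /* refault until the punch is done */
              }
              spin_unlock(&inode->i_lock);
              return 0;
      }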
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Tested-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • slip: Fix deadlock in write_wakeup · 7b10a526
      Tyler Hall authored
      commit 661f7fda upstream.
      
      Use schedule_work() to avoid potentially taking the spinlock in
      interrupt context.
      
      Commit cc9fa74e ("slip/slcan: added locking in wakeup function") added
      necessary locking to the wakeup function and 367525c8/ddcde142 ("can:
      slcan: Fix spinlock variant") converted it to spin_lock_bh() because the lock
      is also taken in timers.
      
      Disabling softirqs is not sufficient, however, as tty drivers may call
      write_wakeup from interrupt context. This driver calls tty->ops->write() with
      its spinlock held, which may immediately cause an interrupt on the same CPU and
      subsequent spin_bug().
      
      Simply converting to spin_lock_irq/irqsave() prevents this deadlock, but
      causes lockdep to point out a possible circular locking dependency
      between these locks:
      
      (&(&sl->lock)->rlock){-.....}, at: slip_write_wakeup
      (&port_lock_key){-.....}, at: serial8250_handle_irq.part.13
      
      The slip transmit is holding the slip spinlock when calling the tty write.
      This grabs the port lock. On an interrupt, the handler grabs the port
      lock and calls write_wakeup which grabs the slip lock. This could be a
      problem if a serial interrupt occurs on another CPU during the slip
      transmit.
      
      To deal with these issues, don't grab the lock in the wakeup function by
      deferring the writeout to a workqueue. Also hold the lock during close
      when de-assigning the tty pointer to safely disarm the worker and
      timers.
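      
      A simplified sketch of that deferral (not the exact drivers/net/slip
      code): the wakeup callback only schedules work, and the work item takes
      the driver lock outside interrupt context.
      
      static void slip_transmit_sketch(struct work_struct *work)
      {
              struct slip *sl = container_of(work, struct slip, tx_work);
      
              spin_lock_bh(&sl->lock);
              if (!sl->tty) {                 /* closed while the work was queued */
                      spin_unlock_bh(&sl->lock);
                      return;
              }
              /* ... push the next chunk via sl->tty->ops->write() ... */
              spin_unlock_bh(&sl->lock);
      }
      
      static void slip_write_wakeup_sketch(struct tty_struct *tty)
      {
              struct slip *sl = tty->disc_data;
      
              schedule_work(&sl->tx_work);    /* no spinlock taken here */
      }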
      
      This bug is easily reproducible on the first transmit when slip is
      used with the standard 8250 serial driver.
      
      [<c0410b7c>] (spin_bug+0x0/0x38) from [<c006109c>] (do_raw_spin_lock+0x60/0x1d0)
       r5:eab27000 r4:ec02754c
      [<c006103c>] (do_raw_spin_lock+0x0/0x1d0) from [<c04185c0>] (_raw_spin_lock+0x28/0x2c)
       r10:0000001f r9:eabb814c r8:eabb8140 r7:40070193 r6:ec02754c r5:eab27000
       r4:ec02754c r3:00000000
      [<c0418598>] (_raw_spin_lock+0x0/0x2c) from [<bf3a0220>] (slip_write_wakeup+0x50/0xe0 [slip])
       r4:ec027540 r3:00000003
      [<bf3a01d0>] (slip_write_wakeup+0x0/0xe0 [slip]) from [<c026e420>] (tty_wakeup+0x48/0x68)
       r6:00000000 r5:ea80c480 r4:eab27000 r3:bf3a01d0
      [<c026e3d8>] (tty_wakeup+0x0/0x68) from [<c028a8ec>] (uart_write_wakeup+0x2c/0x30)
       r5:ed68ea90 r4:c06790d8
      [<c028a8c0>] (uart_write_wakeup+0x0/0x30) from [<c028dc44>] (serial8250_tx_chars+0x114/0x170)
      [<c028db30>] (serial8250_tx_chars+0x0/0x170) from [<c028dffc>] (serial8250_handle_irq+0xa0/0xbc)
       r6:000000c2 r5:00000060 r4:c06790d8 r3:00000000
      [<c028df5c>] (serial8250_handle_irq+0x0/0xbc) from [<c02933a4>] (dw8250_handle_irq+0x38/0x64)
       r7:00000000 r6:edd2f390 r5:000000c2 r4:c06790d8
      [<c029336c>] (dw8250_handle_irq+0x0/0x64) from [<c028d2f4>] (serial8250_interrupt+0x44/0xc4)
       r6:00000000 r5:00000000 r4:c06791c4 r3:c029336c
      [<c028d2b0>] (serial8250_interrupt+0x0/0xc4) from [<c0067fe4>] (handle_irq_event_percpu+0xb4/0x2b0)
       r10:c06790d8 r9:eab27000 r8:00000000 r7:00000000 r6:0000001f r5:edd52980
       r4:ec53b6c0 r3:c028d2b0
      [<c0067f30>] (handle_irq_event_percpu+0x0/0x2b0) from [<c006822c>] (handle_irq_event+0x4c/0x6c)
       r10:c06790d8 r9:eab27000 r8:c0673ae0 r7:c05c2020 r6:ec53b6c0 r5:edd529d4
       r4:edd52980
      [<c00681e0>] (handle_irq_event+0x0/0x6c) from [<c006b140>] (handle_level_irq+0xe8/0x100)
       r6:00000000 r5:edd529d4 r4:edd52980 r3:00022000
      [<c006b058>] (handle_level_irq+0x0/0x100) from [<c00676f8>] (generic_handle_irq+0x30/0x40)
       r5:0000001f r4:0000001f
      [<c00676c8>] (generic_handle_irq+0x0/0x40) from [<c000f57c>] (handle_IRQ+0xd0/0x13c)
       r4:ea997b18 r3:000000e0
      [<c000f4ac>] (handle_IRQ+0x0/0x13c) from [<c00086c4>] (armada_370_xp_handle_irq+0x4c/0x118)
       r8:000003ff r7:ea997b18 r6:ffffffff r5:60070013 r4:c0674dc0
      [<c0008678>] (armada_370_xp_handle_irq+0x0/0x118) from [<c0013840>] (__irq_svc+0x40/0x70)
      Exception stack(0xea997b18 to 0xea997b60)
      7b00:                                                       00000001 20070013
      7b20: 00000000 0000000b 20070013 eab27000 20070013 00000000 ed10103e eab27000
      7b40: c06790d8 ea997b74 ea997b60 ea997b60 c04186c0 c04186c8 60070013 ffffffff
       r9:eab27000 r8:ed10103e r7:ea997b4c r6:ffffffff r5:60070013 r4:c04186c8
      [<c04186a4>] (_raw_spin_unlock_irqrestore+0x0/0x54) from [<c0288fc0>] (uart_start+0x40/0x44)
       r4:c06790d8 r3:c028ddd8
      [<c0288f80>] (uart_start+0x0/0x44) from [<c028982c>] (uart_write+0xe4/0xf4)
       r6:0000003e r5:00000000 r4:ed68ea90 r3:0000003e
      [<c0289748>] (uart_write+0x0/0xf4) from [<bf3a0d20>] (sl_xmit+0x1c4/0x228 [slip])
       r10:ed388e60 r9:0000003c r8:ffffffdd r7:0000003e r6:ec02754c r5:ea717eb8
       r4:ec027000
      [<bf3a0b5c>] (sl_xmit+0x0/0x228 [slip]) from [<c0368d74>] (dev_hard_start_xmit+0x39c/0x6d0)
       r8:eaf163c0 r7:ec027000 r6:ea717eb8 r5:00000000 r4:00000000
      Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
      Cc: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: Andre Naujoks <nautsch2@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ACPI / resources: only reject zero length resources based at address zero · cff58058
      Andy Whitcroft authored
      commit 867f9d46 upstream.
      
      The recently merged change (in v3.14-rc6) to ACPI resource detection
      (below) causes all zero length ACPI resources to be elided from the
      table:
      
        commit b355cee8
        Author: Zhang Rui <rui.zhang@intel.com>
        Date:   Thu Feb 27 11:37:15 2014 +0800
      
          ACPI / resources: ignore invalid ACPI device resources
      
      This change has caused a regression in (at least) serial port detection
      for a number of machines (see LP#1313981 [1]).  These seem to represent
      their IO regions (presumably incorrectly) as a zero length region.
      Reverting the above commit restores these serial devices.
      
      Only elide zero length resources which lie at address 0.
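      
      A minimal sketch of the intended check (illustrative helper, not the
      exact drivers/acpi/resource.c code): a resource is rejected only when it
      is both zero length and based at address zero.
      
      static bool acpi_resource_is_bogus_sketch(u64 start, u64 len)
      {
              return len == 0 && start == 0;
      }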
      
      Fixes: b355cee8 (ACPI / resources: ignore invalid ACPI device resources)
      Signed-off-by: Andy Whitcroft <apw@canonical.com>
      Acked-by: Zhang Rui <rui.zhang@intel.com>
      Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • serial: sirf: Fix compilation failure · a2c27add
      Daniel Thompson authored
      commit 58eb97c9 upstream.
      
      After commit 07d410e0 ("serial: sirf: fix spinlock deadlock issue") it is
      no longer possible to compile this driver.  The rename of one of the
      spinlocks is faulty.  After looking at the original patch I believe this
      is the correct fix.
      
      Compile tested using ARM's multi_v7_defconfig
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Qipan Li <Qipan.Li@csr.com>
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Acked-by: Barry Song <baohua@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • serial: sirf: fix spinlock deadlock issue · 67aedbd9
      Qipan Li authored
      commit 07d410e0 upstream.
      
      Commit fb78b811 provided a workaround for a kernel panic but introduced
      a potential deadlock: in sirfsoc_rx_tmo_process_tl(), the CPU holds
      uart_port->lock while entering sirfsoc_uart_pio_rx_chars(); if a UART
      interrupt arrives at that point, the CPU enters sirfsoc_uart_isr() and
      deadlocks trying to take uart_port->lock again.
      
      This patch replaces the plain spin_lock calls with the spin_lock_irq*
      variants to avoid that deadlock, and moves tty_flip_buffer_push() in the
      tasklet out of the spin_lock_irq* protected region so that no extra
      spin_lock/spin_unlock pair is needed around it.  It also drops the
      driver's own, now-unused tx_lock/rx_lock spinlocks.
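      
      A simplified sketch of the reworked tasklet (illustrative, not the exact
      drivers/tty/serial/sirfsoc_uart.c code): drain the FIFO with interrupts
      disabled, then push to the tty layer after dropping the lock so the ISR
      can never spin on a lock its own CPU already holds.
      
      static void rx_tmo_process_tl_sketch(unsigned long param)
      {
              struct uart_port *port = (struct uart_port *)param;
              unsigned long flags;
      
              spin_lock_irqsave(&port->lock, flags);
              /* ... read characters out of the RX FIFO into the tty buffer ... */
              spin_unlock_irqrestore(&port->lock, flags);
      
              /* safe without the port lock; avoids an extra lock/unlock pair */
              tty_flip_buffer_push(&port->state->port);
      }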
      
      [56274.220464] BUG: spinlock lockup suspected on CPU#0, swapper/0/0
      [56274.223648]  lock: 0xc05d9db0, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0
      [56274.231278] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.10.35 #1
      [56274.238241] [<c0015530>] (unwind_backtrace+0x0/0xf4) from [<c00120d8>] (show_stack+0x10/0x14)
      [56274.246742] [<c00120d8>] (show_stack+0x10/0x14) from [<c01b11b0>] (do_raw_spin_lock+0x110/0x184)
      [56274.255501] [<c01b11b0>] (do_raw_spin_lock+0x110/0x184) from [<c02124c8>] (sirfsoc_uart_isr+0x20/0x42c)
      [56274.264874] [<c02124c8>] (sirfsoc_uart_isr+0x20/0x42c) from [<c0075790>] (handle_irq_event_percpu+0x54/0x17c)
      [56274.274758] [<c0075790>] (handle_irq_event_percpu+0x54/0x17c) from [<c00758f4>] (handle_irq_event+0x3c/0x5c)
      [56274.284561] [<c00758f4>] (handle_irq_event+0x3c/0x5c) from [<c0077fa0>] (handle_level_irq+0x98/0xfc)
      [56274.293670] [<c0077fa0>] (handle_level_irq+0x98/0xfc) from [<c0074f44>] (generic_handle_irq+0x2c/0x3c)
      [56274.302952] [<c0074f44>] (generic_handle_irq+0x2c/0x3c) from [<c000ef80>] (handle_IRQ+0x40/0x90)
      [56274.311706] [<c000ef80>] (handle_IRQ+0x40/0x90) from [<c000dc80>] (__irq_svc+0x40/0x70)
      [56274.319697] [<c000dc80>] (__irq_svc+0x40/0x70) from [<c038113c>] (_raw_spin_unlock_irqrestore+0x10/0x48)
      [56274.329158] [<c038113c>] (_raw_spin_unlock_irqrestore+0x10/0x48) from [<c0200034>] (tty_port_tty_get+0x58/0x90)
      [56274.339213] [<c0200034>] (tty_port_tty_get+0x58/0x90) from [<c0212008>] (sirfsoc_uart_pio_rx_chars+0x1c/0xc8)
      [56274.349097] [<c0212008>] (sirfsoc_uart_pio_rx_chars+0x1c/0xc8) from [<c0212ef8>] (sirfsoc_rx_tmo_process_tl+0xe4/0x1fc)
      [56274.359853] [<c0212ef8>] (sirfsoc_rx_tmo_process_tl+0xe4/0x1fc) from [<c0027c04>] (tasklet_action+0x84/0x114)
      [56274.369739] [<c0027c04>] (tasklet_action+0x84/0x114) from [<c0027db4>] (__do_softirq+0x120/0x200)
      [56274.378585] [<c0027db4>] (__do_softirq+0x120/0x200) from [<c0027f44>] (do_softirq+0x54/0x5c)
      [56274.386998] [<c0027f44>] (do_softirq+0x54/0x5c) from [<c00281ec>] (irq_exit+0x9c/0xd0)
      [56274.394899] [<c00281ec>] (irq_exit+0x9c/0xd0) from [<c000ef84>] (handle_IRQ+0x44/0x90)
      [56274.402790] [<c000ef84>] (handle_IRQ+0x44/0x90) from [<c000dc80>] (__irq_svc+0x40/0x70)
      [56274.410774] [<c000dc80>] (__irq_svc+0x40/0x70) from [<c0288af4>] (cpuidle_enter_state+0x50/0xe0)
      [56274.419532] [<c0288af4>] (cpuidle_enter_state+0x50/0xe0) from [<c0288c34>] (cpuidle_idle_call+0xb0/0x148)
      [56274.429080] [<c0288c34>] (cpuidle_idle_call+0xb0/0x148) from [<c000f3ac>] (arch_cpu_idle+0x8/0x38)
      [56274.438016] [<c000f3ac>] (arch_cpu_idle+0x8/0x38) from [<c0059344>] (cpu_startup_entry+0xfc/0x140)
      [56274.446956] [<c0059344>] (cpu_startup_entry+0xfc/0x140) from [<c04a3a54>] (start_kernel+0x2d8/0x2e4)
      Signed-off-by: Qipan Li <Qipan.Li@csr.com>
      Signed-off-by: Barry Song <Baohua.Song@csr.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
  3. 22 Jul, 2014 2 commits
    • can: c_can: Remove EOB exit · 9b448624
      Thomas Gleixner authored
      commit 710c5610 upstream.
      
      The rx_poll code has the following gem:
      
      	if (msg_ctrl_save & IF_MCONT_EOB)
      		return num_rx_pkts;
      
      The EOB bit is the indicator for the hardware that this is the last
      configured FIFO object. But this object can contain valid data, if we
      manage to free up objects before the overrun case hits.
      
      Now if the code exits due to the EOB bit set, then this buffer is
      stale and the interrupt bit and NewDat bit of the buffer are still
      set. Results in a nice interrupt storm unless we come into an overrun
      situation where the MSGLST bit gets set.
      
           ksoftirqd/0-3     [000] ..s.    79.124101: c_can_poll: rx_poll: val: 00008001 pend 00008001
           ksoftirqd/0-3     [000] ..s.    79.124176: c_can_poll: rx_poll: val: 00008000 pend 00008000
           ksoftirqd/0-3     [000] ..s.    79.124187: c_can_poll: rx_poll: val: 00008002 pend 00008002
           ksoftirqd/0-3     [000] ..s.    79.124256: c_can_poll: rx_poll: val: 00008000 pend 00008000
           ksoftirqd/0-3     [000] ..s.    79.124267: c_can_poll: rx_poll: val: 00008000 pend 00008000
      
      The amazing thing is that the check of the MSGLST (aka overrun bit)
      used to be after the check of the EOB bit. That was "fixed" in commit
      5d0f801a ("can: c_can: Fix RX message handling, handle lost message
      before EOB"). But the author of this "fix" did not even understand that
      the EOB check is broken as well.
      
      Again a simple solution: remove the EOB exit.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      [mkl: adjusted subject and commit message]
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • Don't trigger congestion wait on dirty-but-not-writeout pages · e43bbc2c
      Linus Torvalds authored
      commit b738d764 upstream.
      
      shrink_inactive_list() used to wait 0.1s to avoid congestion when all
      the pages that were isolated from the inactive list were dirty but not
      under active writeback.  That makes no real sense, and apparently causes
      major interactivity issues under some loads since 3.11.
      
      The ostensible reason for it was to wait for kswapd to start writing
      pages, but that seems questionable as well, since the congestion wait
      code seems to trigger for kswapd itself as well.  Also, the logic behind
      delaying anything when we haven't actually started writeback is not
      clear - it only delays actually starting that writeback.
      
      We'll still trigger the congestion waiting if
      
       (a) the process is kswapd, and we hit pages flagged for immediate
           reclaim
      
       (b) the process is not kswapd, and the zone backing dev writeback is
           actually congested.
      
      This probably needs to be revisited, but as it is this fixes a reported
      regression.
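      
      A simplified sketch of the remaining throttling conditions (illustrative,
      not the exact mm/vmscan.c code):
      
      if (current_is_kswapd()) {
              /* (a) kswapd hit pages flagged for immediate reclaim */
              if (nr_immediate)
                      congestion_wait(BLK_RW_ASYNC, HZ/10);
      } else {
              /* (b) direct reclaim waits only if the zone's backing device
               * is actually congested
               */
              wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);
      }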
      
      [mhocko@suse.cz: backport to 3.12 stable tree]
      Fixes: e2be15f6 ('mm: vmscan: stall page reclaim and writeback pages based on dirty/writepage pages encountered')
      Reported-by: Felipe Contreras <felipe.contreras@gmail.com>
      Pinpointed-by: Hillf Danton <dhillf@gmail.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
  4. 18 Jul, 2014 13 commits