- 30 Nov, 2013 4 commits
-
-
Vladimir Davydov authored
On e1000_down(), we should ensure every asynchronous work is canceled before proceeding. Since the watchdog_task can schedule other works apart from itself, it should be stopped first, but currently it is stopped after the reset_task. This can result in the following race leading to the reset_task running after the module unload: e1000_down_and_stop(): e1000_watchdog(): ---------------------- ----------------- cancel_work_sync(reset_task) schedule_work(reset_task) cancel_delayed_work_sync(watchdog_task) The patch moves cancel_delayed_work_sync(watchdog_task) at the beginning of e1000_down_and_stop() thus ensuring the race is impossible. Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Vladimir Davydov authored
The patch fixes the following lockdep warning, which is 100% reproducible on network restart: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0+ #47 Tainted: GF ------------------------------------------------------- kworker/1:1/27 is trying to acquire lock: ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70 but task is already holding lock: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&adapter->mutex){+.+...}: [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390 [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 -> #0 ((&(&adapter->watchdog_task)->work)){+.+...}: [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); *** DEADLOCK *** 3 locks held by kworker/1:1/27: #0: (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #1: ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #2: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] stack backtrace: CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF 3.12.0+ #47 Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501 05/31/2007 Workqueue: events e1000_reset_task [e1000] ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002 ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8 ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00 Call Trace: [<ffffffff816b54a2>] dump_stack+0x49/0x5f [<ffffffff810ba936>] print_circular_bug+0x216/0x310 [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108b906>] ? process_one_work+0x166/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 == The issue background == The problem occurs, because e1000_down(), which is called under adapter->mutex by e1000_reset_task(), tries to synchronously cancel e1000 auxiliary works (reset_task, watchdog_task, phy_info_task, fifo_stall_task), which take adapter->mutex in their handlers. So the question is what does adapter->mutex protect there? The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to private mutex from rtnl") as a replacement for rtnl_lock() taken in the asynchronous handlers. It targeted on fixing a similar lockdep warning issued when e1000_down() was called under rtnl_lock(), and it fixed it, but unfortunately it introduced the lockdep warning described above. Anyway, that said the source of this bug is that the asynchronous works were made to take rtnl_lock() some time ago, so let's look deeper and find why it was added there. The rtnl_lock() was added to asynchronous handlers by commit 338c15 ("e1000: fix occasional panic on unload") in order to prevent asynchronous handlers from execution after the module is unloaded (e1000_down() is called) as it follows from the comment to the commit: > Net drivers in general have an issue where timers fired > by mod_timer or work threads with schedule_work are running > outside of the rtnl_lock. > > With no other lock protection these routines are vulnerable > to races with driver unload or reset paths. > > The longer term solution to this might be a redesign with > safer locks being taken in the driver to guarantee no > reentrance, but for now a safe and effective fix is > to take the rtnl_lock in these routines. I'm not sure if this locking scheme fixed the problem or just made it unlikely, although I incline to the latter. Anyway, this was long time ago when e1000 auxiliary works were implemented as timers scheduling real work handlers in their routines. The e1000_down() function only canceled the timers, but left the real handlers running if they were running, which could result in work execution after module unload. Today, the e1000 driver uses sane delayed works instead of the pair timer+work to implement its delayed asynchronous handlers, and the e1000_down() synchronously cancels all the works so that the problem that commit 338c15 tried to cope with disappeared, and we don't need any locks in the handlers any more. Moreover, any locking there can potentially result in a deadlock. So, this patch reverts commits 0ef4ee and 338c15. Fixes: 0ef4eedc ("e1000: convert to private mutex from rtnl") Fixes: 338c15e4 ("e1000: fix occasional panic on unload") Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
yzhu1 authored
This change is based on a similar change made to e1000e support in commit bb9e44d0 ("e1000e: prevent oops when adapter is being closed and reset simultaneously"). The same issue has also been observed on the older e1000 cards. Here, we have increased the RESET_COUNT value to 50 because there are too many accesses to e1000 nic on stress tests to e1000 nic, it is not enough to set RESET_COUT 25. Experimentation has shown that it is enough to set RESET_COUNT 50. Signed-off-by: yzhu1 <yanjun.zhu@windriver.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Akeem G Abodunrin authored
This patch fixes Wake on LAN being reported as supported on some Ethernet ports, in contrary to Hardware capability. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 29 Nov, 2013 12 commits
-
-
Eric Dumazet authored
In commit c9e90429 ("ipv4: fix possible seqlock deadlock") I left another places where IP_INC_STATS_BH() were improperly used. udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from process context, not from softirq context. This was detected by lockdep seqlock support. Reported-by: jongman heo <jongman.heo@samsung.com> Fixes: 584bdf8c ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP") Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
When user linkup is enabled and user sets linkup of individual port, we need to recompute linkup (carrier) of master interface so the change is reflected. Fix this by calling __team_carrier_check() which does the needed work. Please apply to all stable kernels as well. Thanks. Reported-by: Jan Tluka <jtluka@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shawn Landden authored
Commit 35f9c09f (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag MSG_SENDPAGE_NOTLAST, similar to MSG_MORE. algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages() and need to see the new flag as identical to MSG_MORE. This fixes sendfile() on AF_ALG. v3: also fix udp Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: <stable@vger.kernel.org> # 3.4.x + 3.2.x Reported-and-tested-by: Shawn Landden <shawnlandden@gmail.com> Original-patch: Richard Weinberger <richard@nod.at> Signed-off-by: Shawn Landden <shawn@churchofgit.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guenter Roeck authored
Simplify the code. Avoid race conditions caused by attributes being created after hwmon device registration. Implicitly (through hwmon API) add mandatory 'name' sysfs attribute. Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Walleij authored
After commit e9e4ea74 "net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data" The Versatile SMSC LAN91C111 is crashing like this: ------------[ cut here ]------------ kernel BUG at /home/linus/linux/drivers/net/ethernet/smsc/smc91x.c:599! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 PID: 43 Comm: udhcpc Not tainted 3.13.0-rc1+ #24 task: c6ccfaa0 ti: c6cd0000 task.ti: c6cd0000 PC is at smc_hardware_send_pkt+0x198/0x22c LR is at smc_hardware_send_pkt+0x24/0x22c pc : [<c01be324>] lr : [<c01be1b0>] psr: 20000013 sp : c6cd1d08 ip : 00000001 fp : 00000000 r10: c02adb08 r9 : 00000000 r8 : c6ced802 r7 : c786fba0 r6 : 00000146 r5 : c8800000 r4 : c78d6000 r3 : 0000000f r2 : 00000146 r1 : 00000000 r0 : 00000031 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005317f Table: 06cf4000 DAC: 00000015 Process udhcpc (pid: 43, stack limit = 0xc6cd01c0) Stack: (0xc6cd1d08 to 0xc6cd2000) 1d00: 00000010 c8800000 c78d6000 c786fba0 c78d6000 c01be868 1d20: c01be7a4 00004000 00000000 c786fba0 c6c12b80 c0208554 000004d0 c780fc60 1d40: 00000220 c01fb734 00000000 00000000 00000000 c6c9a440 c6c12b80 c78d6000 1d60: c786fba0 c6c9a440 00000000 c021d1d8 00000000 00000000 c6c12b80 c78d6000 1d80: c786fba0 00000001 c6c9a440 c02087f8 c6c9a4a0 00080008 00000000 00000000 1da0: c78d6000 c786fba0 c78d6000 00000138 00000000 00000000 00000000 00000000 1dc0: 00000000 c027ba74 00000138 00000138 00000001 00000010 c6cedc00 00000000 1de0: 00000008 c7404400 c6cd1eec c6cd1f14 c067a73c c065c0b8 00000000 c067a740 1e00: 01ffffff 002040d0 00000000 00000000 00000000 00000000 00000000 ffffffff 1e20: 43004400 00110022 c6cdef20 c027ae8c c6ccfaa0 be82d65c 00000014 be82d3cc 1e40: 00000000 00000000 00000000 c01f2870 00000000 00000000 00000000 c6cd1e88 1e60: c6ccfaa0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1e80: 00000000 00000000 00000031 c7802310 c7802300 00000138 c7404400 c0771da0 1ea0: 00000000 c6cd1eec c7800340 00000138 be82d65c 00000014 be82d3cc c6cd1f08 1ec0: 00000014 00000000 c7404400 c7404400 00000138 c01f4628 c78d6000 00000000 1ee0: 00000000 be82d3cc 00000138 c6cd1f08 00000014 c6cd1ee4 00000001 00000000 1f00: 00000000 00000000 00080011 00000002 06000000 ffffffff 0000ffff 00000002 1f20: 06000000 ffffffff 0000ffff c00928c8 c065c520 c6cd1f58 00000003 c009299c 1f40: 00000003 c065c520 c7404400 00000000 c7404400 c01f2218 c78106b0 c7441cb0 1f60: 00000000 00000006 c06799fc 00000000 00000000 00000006 00000000 c01f3ee0 1f80: 00000000 00000000 be82d678 be82d65c 00000014 00000001 00000122 c00139c8 1fa0: c6cd0000 c0013840 be82d65c 00000014 00000006 be82d3cc 00000138 00000000 1fc0: be82d65c 00000014 00000001 00000122 00000000 00000000 00018cb1 00000000 1fe0: 00003801 be82d3a8 0003a0c7 b6e9af08 60000010 00000006 00000000 00000000 [<c01be324>] (smc_hardware_send_pkt+0x198/0x22c) from [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) from [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) from [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) from [<c02087f8>] (dev_queue_xmit+0x238/0x42c) [<c02087f8>] (dev_queue_xmit+0x238/0x42c) from [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) from [<c01f2870>] (sock_sendmsg+0x84/0xa8) [<c01f2870>] (sock_sendmsg+0x84/0xa8) from [<c01f4628>] (SyS_sendto+0xb8/0xdc) [<c01f4628>] (SyS_sendto+0xb8/0xdc) from [<c0013840>] (ret_fast_syscall+0x0/0x2c) Code: e3130002 1a000001 e3130001 0affffcd (e7f001f2) ---[ end trace 81104fe70e8da7fe ]--- Kernel panic - not syncing: Fatal exception in interrupt This is because the macro operations in smc91x.h defined for Versatile are missing SMC_outsw() as used in this commit. The Versatile needs and uses the same accessors as the other platforms in the first if(...) clause, just switch it to using that and we have one problem less to worry about. This includes a hunk of a patch from Will Deacon fixin the other 32bit platforms as well: Innokom, Ramses, PXA, PCM027. Checkpatch complains about spacing, but I have opted to follow the style of this .h-file. Cc: Russell King <linux@arm.linux.org.uk> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Jonathan Cameron <jic23@cam.ac.uk> Cc: stable@vger.kernel.org Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
This reverts commit b268daff. I applied the wrong version of this patch, the proper version is coming up next. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Walleij authored
After commit e9e4ea74 "net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data" The Versatile SMSC LAN91C111 is crashing like this: ------------[ cut here ]------------ kernel BUG at /home/linus/linux/drivers/net/ethernet/smsc/smc91x.c:599! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 PID: 43 Comm: udhcpc Not tainted 3.13.0-rc1+ #24 task: c6ccfaa0 ti: c6cd0000 task.ti: c6cd0000 PC is at smc_hardware_send_pkt+0x198/0x22c LR is at smc_hardware_send_pkt+0x24/0x22c pc : [<c01be324>] lr : [<c01be1b0>] psr: 20000013 sp : c6cd1d08 ip : 00000001 fp : 00000000 r10: c02adb08 r9 : 00000000 r8 : c6ced802 r7 : c786fba0 r6 : 00000146 r5 : c8800000 r4 : c78d6000 r3 : 0000000f r2 : 00000146 r1 : 00000000 r0 : 00000031 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005317f Table: 06cf4000 DAC: 00000015 Process udhcpc (pid: 43, stack limit = 0xc6cd01c0) Stack: (0xc6cd1d08 to 0xc6cd2000) 1d00: 00000010 c8800000 c78d6000 c786fba0 c78d6000 c01be868 1d20: c01be7a4 00004000 00000000 c786fba0 c6c12b80 c0208554 000004d0 c780fc60 1d40: 00000220 c01fb734 00000000 00000000 00000000 c6c9a440 c6c12b80 c78d6000 1d60: c786fba0 c6c9a440 00000000 c021d1d8 00000000 00000000 c6c12b80 c78d6000 1d80: c786fba0 00000001 c6c9a440 c02087f8 c6c9a4a0 00080008 00000000 00000000 1da0: c78d6000 c786fba0 c78d6000 00000138 00000000 00000000 00000000 00000000 1dc0: 00000000 c027ba74 00000138 00000138 00000001 00000010 c6cedc00 00000000 1de0: 00000008 c7404400 c6cd1eec c6cd1f14 c067a73c c065c0b8 00000000 c067a740 1e00: 01ffffff 002040d0 00000000 00000000 00000000 00000000 00000000 ffffffff 1e20: 43004400 00110022 c6cdef20 c027ae8c c6ccfaa0 be82d65c 00000014 be82d3cc 1e40: 00000000 00000000 00000000 c01f2870 00000000 00000000 00000000 c6cd1e88 1e60: c6ccfaa0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1e80: 00000000 00000000 00000031 c7802310 c7802300 00000138 c7404400 c0771da0 1ea0: 00000000 c6cd1eec c7800340 00000138 be82d65c 00000014 be82d3cc c6cd1f08 1ec0: 00000014 00000000 c7404400 c7404400 00000138 c01f4628 c78d6000 00000000 1ee0: 00000000 be82d3cc 00000138 c6cd1f08 00000014 c6cd1ee4 00000001 00000000 1f00: 00000000 00000000 00080011 00000002 06000000 ffffffff 0000ffff 00000002 1f20: 06000000 ffffffff 0000ffff c00928c8 c065c520 c6cd1f58 00000003 c009299c 1f40: 00000003 c065c520 c7404400 00000000 c7404400 c01f2218 c78106b0 c7441cb0 1f60: 00000000 00000006 c06799fc 00000000 00000000 00000006 00000000 c01f3ee0 1f80: 00000000 00000000 be82d678 be82d65c 00000014 00000001 00000122 c00139c8 1fa0: c6cd0000 c0013840 be82d65c 00000014 00000006 be82d3cc 00000138 00000000 1fc0: be82d65c 00000014 00000001 00000122 00000000 00000000 00018cb1 00000000 1fe0: 00003801 be82d3a8 0003a0c7 b6e9af08 60000010 00000006 00000000 00000000 [<c01be324>] (smc_hardware_send_pkt+0x198/0x22c) from [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) from [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) from [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) from [<c02087f8>] (dev_queue_xmit+0x238/0x42c) [<c02087f8>] (dev_queue_xmit+0x238/0x42c) from [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) from [<c01f2870>] (sock_sendmsg+0x84/0xa8) [<c01f2870>] (sock_sendmsg+0x84/0xa8) from [<c01f4628>] (SyS_sendto+0xb8/0xdc) [<c01f4628>] (SyS_sendto+0xb8/0xdc) from [<c0013840>] (ret_fast_syscall+0x0/0x2c) Code: e3130002 1a000001 e3130001 0affffcd (e7f001f2) ---[ end trace 81104fe70e8da7fe ]--- Kernel panic - not syncing: Fatal exception in interrupt This is because the macro operations in smc91x.h defined for Versatile are missing SMC_outsw() as used in this commit. The Versatile needs and uses the same accessors as the other platforms in the first if(...) clause, just switch it to using that and we have one problem less to worry about. Checkpatch complains about spacing, but I have opted to follow the style of this .h-file. Cc: Russell King <linux@arm.linux.org.uk> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Jonathan Cameron <jic23@cam.ac.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yang Yingliang authored
Using iperf to send packets(GSO mode is on), a bug is triggered: [ 212.672781] kernel BUG at lib/dynamic_queue_limits.c:26! [ 212.673396] invalid opcode: 0000 [#1] SMP [ 212.673882] Modules linked in: 8139cp(O) nls_utf8 edd fuse loop dm_mod ipv6 i2c_piix4 8139too i2c_core intel_agp joydev pcspkr hid_generic intel_gtt floppy sr_mod mii button sg cdrom ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata scsi_mod [last unloaded: 8139cp] [ 212.676084] CPU: 0 PID: 4124 Comm: iperf Tainted: G O 3.12.0-0.7-default+ #16 [ 212.676084] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 212.676084] task: ffff8800d83966c0 ti: ffff8800db4c8000 task.ti: ffff8800db4c8000 [ 212.676084] RIP: 0010:[<ffffffff8122e23f>] [<ffffffff8122e23f>] dql_completed+0x17f/0x190 [ 212.676084] RSP: 0018:ffff880116e03e30 EFLAGS: 00010083 [ 212.676084] RAX: 00000000000005ea RBX: 0000000000000f7c RCX: 0000000000000002 [ 212.676084] RDX: ffff880111dd0dc0 RSI: 0000000000000bd4 RDI: ffff8800db6ffcc0 [ 212.676084] RBP: ffff880116e03e48 R08: 0000000000000992 R09: 0000000000000000 [ 212.676084] R10: ffffffff8181e400 R11: 0000000000000004 R12: 000000000000000f [ 212.676084] R13: ffff8800d94ec840 R14: ffff8800db440c80 R15: 000000000000000e [ 212.676084] FS: 00007f6685a3c700(0000) GS:ffff880116e00000(0000) knlGS:0000000000000000 [ 212.676084] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 212.676084] CR2: 00007f6685ad6460 CR3: 00000000db714000 CR4: 00000000000006f0 [ 212.676084] Stack: [ 212.676084] ffff8800db6ffc00 000000000000000f ffff8800d94ec840 ffff880116e03eb8 [ 212.676084] ffffffffa041509f ffff880116e03e88 0000000f16e03e88 ffff8800d94ec000 [ 212.676084] 00000bd400059858 000000050000000f ffffffff81094c36 ffff880116e03eb8 [ 212.676084] Call Trace: [ 212.676084] <IRQ> [ 212.676084] [<ffffffffa041509f>] cp_interrupt+0x4ef/0x590 [8139cp] [ 212.676084] [<ffffffff81094c36>] ? ktime_get+0x56/0xd0 [ 212.676084] [<ffffffff8108cf73>] handle_irq_event_percpu+0x53/0x170 [ 212.676084] [<ffffffff8108d0cc>] handle_irq_event+0x3c/0x60 [ 212.676084] [<ffffffff8108fdb5>] handle_fasteoi_irq+0x55/0xf0 [ 212.676084] [<ffffffff810045df>] handle_irq+0x1f/0x30 [ 212.676084] [<ffffffff81003c8b>] do_IRQ+0x5b/0xe0 [ 212.676084] [<ffffffff8142beaa>] common_interrupt+0x6a/0x6a [ 212.676084] <EOI> [ 212.676084] [<ffffffffa0416a21>] ? cp_start_xmit+0x621/0x97c [8139cp] [ 212.676084] [<ffffffffa0416a09>] ? cp_start_xmit+0x609/0x97c [8139cp] [ 212.676084] [<ffffffff81378ed9>] dev_hard_start_xmit+0x2c9/0x550 [ 212.676084] [<ffffffff813960a9>] sch_direct_xmit+0x179/0x1d0 [ 212.676084] [<ffffffff813793f3>] dev_queue_xmit+0x293/0x440 [ 212.676084] [<ffffffff813b0e46>] ip_finish_output+0x236/0x450 [ 212.676084] [<ffffffff810e59e7>] ? __alloc_pages_nodemask+0x187/0xb10 [ 212.676084] [<ffffffff813b10e8>] ip_output+0x88/0x90 [ 212.676084] [<ffffffff813afa64>] ip_local_out+0x24/0x30 [ 212.676084] [<ffffffff813aff0d>] ip_queue_xmit+0x14d/0x3e0 [ 212.676084] [<ffffffff813c6fd1>] tcp_transmit_skb+0x501/0x840 [ 212.676084] [<ffffffff813c8323>] tcp_write_xmit+0x1e3/0xb20 [ 212.676084] [<ffffffff81363237>] ? skb_page_frag_refill+0x87/0xd0 [ 212.676084] [<ffffffff813c8c8b>] tcp_push_one+0x2b/0x40 [ 212.676084] [<ffffffff813bb7e6>] tcp_sendmsg+0x926/0xc90 [ 212.676084] [<ffffffff813e1d21>] inet_sendmsg+0x61/0xc0 [ 212.676084] [<ffffffff8135e861>] sock_aio_write+0x101/0x120 [ 212.676084] [<ffffffff81107cf1>] ? vma_adjust+0x2e1/0x5d0 [ 212.676084] [<ffffffff812163e0>] ? timerqueue_add+0x60/0xb0 [ 212.676084] [<ffffffff81130b60>] do_sync_write+0x60/0x90 [ 212.676084] [<ffffffff81130d44>] ? rw_verify_area+0x54/0xf0 [ 212.676084] [<ffffffff81130f66>] vfs_write+0x186/0x190 [ 212.676084] [<ffffffff811317fd>] SyS_write+0x5d/0xa0 [ 212.676084] [<ffffffff814321e2>] system_call_fastpath+0x16/0x1b [ 212.676084] Code: ca 41 89 dc 41 29 cc 45 31 db 29 c2 41 89 c5 89 d0 45 29 c5 f7 d0 c1 e8 1f e9 43 ff ff ff 66 0f 1f 44 00 00 31 c0 e9 7b ff ff ff <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 c7 47 40 00 [ 212.676084] RIP [<ffffffff8122e23f>] dql_completed+0x17f/0x190 ------------[ cut here ]------------ When a skb has frags, bytes_compl plus skb->len nr_frags times in cp_tx(). It's not the correct value(actually, it should plus skb->len once) and it will trigger the BUG_ON(bytes_compl > num_queued - dql->num_completed). So only increase bytes_compl when finish sending all frags. pkts_compl also has a wrong value, fix it too. It's introduced by commit 871f0d4c ("8139cp: enable bql"). Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Chang authored
Windows driver will enable ALDPS function, but linux driver and firmware do not have any configuration related to ALDPS function for 8168g. So restart system to linux and remove the NIC cable, LAN enter ALDPS, then LAN RX will be disabled. This issue can be easily reproduced on dual boot windows and linux system with RTL_GIGA_MAC_VER_40 chip. Realtek said, ALDPS function can be disabled by configuring to PHY, switch to page 0x0A43, reg0x10 bit2=0. Signed-off-by: David Chang <dchang@suse.com> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the original code that would lead to memory corruption in the kernel if you had audit configured. If you didn't have audit configured it was harmless. There are some programs such as beta versions of Ruby which use too large of a buffer and returning an error code breaks them. We should clamp the ->msg_namelen value instead. Fixes: 1661bf36 ("net: heap overflow in __audit_sockaddr()") Reported-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Eric Wong <normalperson@yhbt.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Veaceslav Falico authored
Currently we're using plain spin_lock() in prb_shutdown_retire_blk_timer(), however the timer might fire right in the middle and thus try to re-aquire the same spinlock, leaving us in a endless loop. To fix that, use the spin_lock_bh() to block it. Fixes: f6fb8f10 ("af-packet: TPACKET_V3 flexible buffer implementation.") CC: "David S. Miller" <davem@davemloft.net> CC: Daniel Borkmann <dborkman@redhat.com> CC: Willem de Bruijn <willemb@google.com> CC: Phil Sutter <phil@nwl.cc> CC: Eric Dumazet <edumazet@google.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Yasevich authored
Currently macvlan will count received packets after calling each vlans receive handler. Macvtap attempts to count the packet yet again when the user reads the packet from the tap socket. This code doesn't do this consistently either. Remove the counting from macvtap and let only macvlan count received packets. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 28 Nov, 2013 16 commits
-
-
Ivan Vecera authored
The recent be2net commit 6384a4d0 (adds a support for busy polling) introduces a regression that results in kernel crash. It incorrectly modified be_close() so napi_disable() is called only for the first queue. This breaks a correct pairing of napi_enable/_disable for the rest of event queues and causes a crash in subsequent be_open() call. v2: Applied suggestions from Sathya Fixes: 6384a4d0 ("be2net: add support for ndo_busy_poll") Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
This reverts commit 55485e7b. I applied the wrong version of this patch, the right one is coming up next. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Vecera authored
The recent be2net commit 6384a4d0 (adds a support for busy polling) introduces a regression that results in kernel crash. It incorrectly modified be_close() so napi_disable() is called only for the first queue. This breaks a correct pairing of napi_enable/_disable for the rest of event queues and causes a crash in subsequent be_open() call. Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Baker Zhang authored
since f9242b6b inet: Sanitize inet{,6} protocol demux. there are not pretended hash tables for ipv4 or ipv6 protocol handler. Signed-off-by: Baker Zhang <Baker.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
françois romieu authored
2fdac010 ("via-velocity.c: update napi implementation") overlooked an irq disabling spinlock when the Rx part of the NAPI poll handler was converted from netif_rx to netif_receive_skb. NAPI Rx processing can be taken out of the locked section with a pair of napi_{disable / enable} since it only races with the MTU change function. An heavier rework of the NAPI locking would be able to perform NAPI Tx before Rx where I simply removed one of velocity_tx_srv calls. References: https://bugzilla.redhat.com/show_bug.cgi?id=1022733 Fixes: 2fdac010 (via-velocity.c: update napi implementation) Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Alex A. Schmidt <aaschmidt1@gmail.com> Cc: Jamie Heilman <jamie@audible.transient.net> Cc: Michele Baldessari <michele@acksyn.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://gitorious.org/linux-can/linux-canDavid S. Miller authored
Marc Kleine-Budde says: ==================== here's a pull request for v3.13, i.e. net/master. It consists of a patch by Oliver Hartkopp which fixes some corner cases in the interrupt handler of the sja1000 driver. Then there are two patches for the c_can dirver. One by me, which fixes a runtime pm related "scheduling while atomic" error and patch by Holger Bechtold that fixes the calculation of the transmitted bytes. The fourth patch is by me, it corrects the clock usage in the flexcan driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Whitcroft authored
We are now using csum_ipv6_magic, include the appropriate header. Avoids the following error: drivers/net/xen-netback/netback.c:1313:4: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration] tcph->check = ~csum_ipv6_magic(&ipv6h->saddr, Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gao feng authored
In failure case, we should use kfree_skb not dev_kfree_skb to free skbuff, dev_kfree_skb is defined as consume_skb. Trace takes advantage of this point. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Wang authored
After commit 8ffab51b (macvlan: lockless tx path), tx stat counter were converted to percpu stat structure. So we need use to this also for tx_dropped in macvtap. Otherwise, the management won't notice the dropping packet in macvtap tx path. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shaohui Xie authored
Phy is compatible with Vitesse 82xx Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xufeng Zhang authored
Currently retransmitted DATA chunks could also be used for RTT measurements since there are no flag to identify whether the transmitted DATA chunk is a new one or a retransmitted one. This problem is introduced by commit ae19c548 ("sctp: remove 'resent' bit from the chunk") which inappropriately removed the 'resent' bit completely, instead of doing this, we should set the resent bit only for the retransmitted DATA chunks. Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
The pmcraid driver is abusing the genetlink API and is using its family ID as the multicast group ID, which is invalid and may belong to somebody else (and likely will.) Make it use the correct API, but since this may already be used as-is by userspace, reserve a family ID for this code and also reserve that group ID to not break userspace assumptions. My previous patch broke event delivery in the driver as I missed that it wasn't using the right API and forgot to update it later in my series. While changing this, I noticed that the genetlink code could use the static group ID instead of a strcmp(), so also do that for the VFS_DQUOT family. Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geert Uytterhoeven authored
net/netlink/genetlink.c: In function ‘genl_validate_assign_mc_groups’: net/netlink/genetlink.c:217: warning: ‘err’ may be used uninitialized in this function Commit 2a94fe48 ("genetlink: make multicast groups const, prevent abuse") split genl_register_mc_group() in multiple functions, but dropped the initialization of err. Initialize err to zero to fix this. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guenter Roeck authored
Use new hwmon API to simplify code, provide missing mandatory 'name' sysfs attribute, and attach hwmon attributes to hwmon device instead of pci device. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Jean Delvare <khali@linux-fr.org> Acked-by: Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
dingtianhong authored
Because the ARP monitoring is not support for 802.3ad, but I still could change the mode to 802.3ad from ab mode while ARP monitoring is running, it is incorrect. So add a check for 802.3ad in bonding_store_mode to fix the problem, and make a new macro BOND_NO_USES_ARP() to simplify the code. v2: according to the Dan Williams's suggestion, bond mode is the most important bond option, it should override any of the other sub-options. So when the mode is changed, the conficting values should be cleared or reset, otherwise the user has to duplicate more operations to modify the logic. I disable the arp and enable mii monitoring when the bond mode is changed to AB, TB and 8023AD if the arp interval is true. v3: according to the Nik's suggestion, the default value of miimon should need a name, there is several place to use it, and the bond_store_arp_interval() could use micro BOND_NO_USES_ARP to make the code more simpify. Suggested-by: Dan Williams <dcbw@redhat.com> Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
The first netlink attribute (value 0) must always be defined as none/unspec. This is correctly done in inet_diag.h, but other diag interfaces are wrong. Because we cannot change an existing API, I add a comment to point the mistake and avoid to propagate it in a new diag API in the future. CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 26 Nov, 2013 1 commit
-
-
Marc Kleine-Budde authored
The flexcan IP core uses the peripheral clock ("per") as basic clock for the bit timing calculation. However the driver uses the the wrong clock ("ipg"). This leads to wrong bit rates if the rates on both clock are different. This patch fixes the problem by using the correct clock for the bit rate calculation. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
- 25 Nov, 2013 3 commits
-
-
Holger Bechtold authored
The number of bytes transmitted was not updated correctly, if several CAN messages (with different length) were transmitted in one 'bunch'. Thus programs like 'ifconfig' showed wrong transmit byte counts. Reason was, that the message object whose DLC is to be read was not necessarily the active one at the time when priv->read_reg(priv, C_CAN_IFACE(MSGCTRL_REG, 0)) & IF_MCONT_DLC_MASK; was executed. Signed-off-by: Holger Bechtold <Holger.Bechtold@gmx.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Marc Kleine-Budde authored
The c_can driver contians a callpath (c_can_poll -> c_can_state_change -> c_can_get_berr_counter) which may call pm_runtime_get_sync() from the IRQ handler, which is not allowed and results in "BUG: scheduling while atomic". This problem is fixed by introducing __c_can_get_berr_counter, which will not call pm_runtime_get_sync(). Reported-by: Andrew Glen <AGlen@bepmarine.com> Tested-by: Andrew Glen <AGlen@bepmarine.com> Signed-off-by: Andrew Glen <AGlen@bepmarine.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Oliver Hartkopp authored
This patch fixes the issue that the sja1000_interrupt() function may have returned IRQ_NONE without processing the optional pre_irq() and post_irq() function before. Further the irq processing counter 'n' is moved to the end of the while statement to return correct IRQ_[NONE|HANDLED] values at error conditions. Reported-by: Wolfgang Grandegger <wg@grandegger.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
- 23 Nov, 2013 4 commits
-
-
Ajit Khaparde authored
On BE3-R, the PF programs the initial MAC address for its VFs. Doing it again in VF probe, causes a FW error which although harmless generates an unnecessary error log message. Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ajit Khaparde authored
It is not being set currently. (This field is not applicable for Lancer) Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ajit Khaparde authored
Interrupts need to be enabled in be_resume, when adapter boots back up from D3cold. disabling interrupts in be_suspend() just to be symmetric to be_resume(). Signed-off-by: Ravikumar Nelavelli <ravikumar.nelavelli@emulex.com> Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
If a too small burst is inadvertently set on TBF, we might trigger a bug in tbf_segment(), as 'skb' instead of 'segs' was used in a qdisc_reshape_fail() call. tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit Fix the bug, and add a warning, as such configuration is not going to work anyway for non GSO packets. (For some reason, one has to use a burst >= 1520 to get a working configuration, even with old kernels. This is a probable iproute2/tc bug) Based on a report and initial patch from Yang Yingliang Fixes: e43ac79a ("sch_tbf: segment too big GSO packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-