- 29 Aug, 2013 40 commits
-
-
Kent Overstreet authored
commit e49c7c37 upstream. Journal writes need to be marked FUA, not just REQ_FLUSH. And btree node writes have... weird ordering requirements. Signed-off-by:
Kent Overstreet <koverstreet@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kumar Amit Mehta authored
commit 5c694129 upstream. bio_alloc_bioset returns NULL on failure. This fix adds a missing check for potential NULL pointer dereferencing. Signed-off-by:
Kumar Amit Mehta <gmate.amit@gmail.com> Signed-off-by:
Kent Overstreet <koverstreet@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tomas Winkler authored
commit dab9bf41 upstream. 1. MEI_INTEROP_TIMEOUT is in seconds not in jiffies so we use mei_secs_to_jiffies macro While cold boot is fast this is relevant in resume 2. wait_event_interruptible_timeout can return with -ERESTARTSYS so do not override it with -ETIMEDOUT 3.Adjust error message Tested-by:
Shuah Khan <shuah.kh@samsung.com> Signed-off-by:
Tomas Winkler <tomas.winkler@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tomas Winkler authored
commit 99f22c4e upstream. When powering up, we don't have to clean up the device state nothing is connected. Tested-by:
Shuah Khan <shuah.kh@samsung.com> Signed-off-by:
Tomas Winkler <tomas.winkler@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tomas Winkler authored
commit 315a383a upstream. ME HW ready bit is down after hw reset was asserted or on error. Only on error we need to enter the reset flow, additional reset need to be prevented when reset was triggered during initialization , power up/down or a reset is already in progress Tested-by:
Shuah Khan <shuah.kh@samsung.com> Signed-off-by:
Tomas Winkler <tomas.winkler@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Vrabel authored
commit 3bc38cbc upstream. If there are UNUSABLE regions in the machine memory map, dom0 will attempt to map them 1:1 which is not permitted by Xen and the kernel will crash. There isn't anything interesting in the UNUSABLE region that the dom0 kernel needs access to so we can avoid making the 1:1 mapping and treat it as RAM. We only do this for dom0, as that is where tboot case shows up. A PV domU could have an UNUSABLE region in its pseudo-physical map and would need to be handled in another patch. This fixes a boot failure on hosts with tboot. tboot marks a region in the e820 map as unusable and the dom0 kernel would attempt to map this region and Xen does not permit unusable regions to be mapped by guests. (XEN) 0000000000000000 - 0000000000060000 (usable) (XEN) 0000000000060000 - 0000000000068000 (reserved) (XEN) 0000000000068000 - 000000000009e000 (usable) (XEN) 0000000000100000 - 0000000000800000 (usable) (XEN) 0000000000800000 - 0000000000972000 (unusable) tboot marked this region as unusable. (XEN) 0000000000972000 - 00000000cf200000 (usable) (XEN) 00000000cf200000 - 00000000cf38f000 (reserved) (XEN) 00000000cf38f000 - 00000000cf3ce000 (ACPI data) (XEN) 00000000cf3ce000 - 00000000d0000000 (reserved) (XEN) 00000000e0000000 - 00000000f0000000 (reserved) (XEN) 00000000fe000000 - 0000000100000000 (reserved) (XEN) 0000000100000000 - 0000000630000000 (usable) Signed-off-by:
David Vrabel <david.vrabel@citrix.com> [v1: Altered the patch and description with domU's with UNUSABLE regions] Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Radu Caragea authored
commit 41aacc1e upstream. This is the updated version of df54d6fa ("x86 get_unmapped_area(): use proper mmap base for bottom-up direction") that only randomizes the mmap base address once. Signed-off-by:
Radu Caragea <sinaelgl@gmail.com> Reported-and-tested-by:
Jeff Shorey <shoreyjeff@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michel Lespinasse <walken@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Adrian Sendroiu <molecula2788@gmail.com> Cc: Kamal Mostafa <kamal@canonical.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit 5ea80f76 upstream. This reverts commit df54d6fa. The commit isn't necessarily wrong, but because it recalculates the random mmap_base every time, it seems to confuse user memory allocators that expect contiguous mmap allocations even when the mmap address isn't specified. In particular, the MATLAB Java runtime seems to be unhappy. See https://bugzilla.kernel.org/show_bug.cgi?id=60774 So we'll want to apply the random offset only once, and Radu has a patch for that. Revert this older commit in order to apply the other one. Reported-by:
Jeff Shorey <shoreyjeff@gmail.com> Cc: Radu Caragea <sinaelgl@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Roland Dreier authored
commit 35dc2483 upstream. There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances leads to one process writing data into the address space of some other random unrelated process if the ioctl is interrupted by a signal. What happens is the following: - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the underlying SCSI command will transfer data from the SCSI device to the buffer provided in the ioctl) - Before the command finishes, a signal is sent to the process waiting in the ioctl. This will end up waking up the sg_ioctl() code: result = wait_event_interruptible(sfp->read_wait, (srp_done(sfp, srp) || sdp->detached)); but neither srp_done() nor sdp->detached is true, so we end up just setting srp->orphan and returning to userspace: srp->orphan = 1; write_unlock_irq(&sfp->rq_list_lock); return result; /* -ERESTARTSYS because signal hit process */ At this point the original process is done with the ioctl and blithely goes ahead handling the signal, reissuing the ioctl, etc. - Eventually, the SCSI command issued by the first ioctl finishes and ends up in sg_rq_end_io(). At the end of that function, we run through: write_lock_irqsave(&sfp->rq_list_lock, iflags); if (unlikely(srp->orphan)) { if (sfp->keep_orphan) srp->sg_io_owned = 0; else done = 0; } srp->done = done; write_unlock_irqrestore(&sfp->rq_list_lock, iflags); if (likely(done)) { /* Now wake up any sg_read() that is waiting for this * packet. */ wake_up_interruptible(&sfp->read_wait); kill_fasync(&sfp->async_qp, SIGPOLL, POLL_IN); kref_put(&sfp->f_ref, sg_remove_sfp); } else { INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext); schedule_work(&srp->ew.work); } Since srp->orphan *is* set, we set done to 0 (assuming the userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext() to run in a workqueue. - In workqueue context we go through sg_rq_end_io_usercontext() -> sg_finish_rem_req() -> blk_rq_unmap_user() -> ... -> bio_uncopy_user() -> __bio_copy_iov() -> copy_to_user(). The key point here is that we are doing copy_to_user() on a workqueue -- that is, we're on a kernel thread with current->mm equal to whatever random previous user process was scheduled before this kernel thread. So we end up copying whatever data the SCSI command returned to the virtual address of the buffer passed into the original ioctl, but it's quite likely we do this copying into a different address space! As suggested by James Bottomley <James.Bottomley@hansenpartnership.com>, add a check for current->mm (which is NULL if we're on a kernel thread without a real userspace address space) in bio_uncopy_user(), and skip the copy if we're on a kernel thread. There's no reason that I can think of for any caller of bio_uncopy_user() to want to do copying on a kernel thread with a random active userspace address space. Huge thanks to Costa Sapuntzakis <costa@purestorage.com> for the original pointer to this bug in the sg code. Signed-off-by:
Roland Dreier <roland@purestorage.com> Tested-by:
David Milburn <dmilburn@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by:
James Bottomley <JBottomley@Parallels.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anton Blanchard authored
commit f5944daa upstream. We want ppc64 to be able to select between optimised assembly checksum routines in big endian and the generic lib/checksum.c routines in little endian. The lpfc driver is forcing CONFIG_GENERIC_CSUM on which means we are unable to make the decision to enable it in the arch Kconfig. If the option exists it is always forced on. This got introduced in 3.10 via commit 6a7252fd ([SCSI] lpfc: fix up Kconfig dependencies). I spoke to Randy about it and the original issue was with CRC_T10DIF not being defined. As such, remove the select of CONFIG_GENERIC_CSUM. Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
James Bottomley <JBottomley@Parallels.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Martin Peschke authored
commit 924dd584 upstream. BUG: sleeping function called from invalid context at kernel/workqueue.c:2752 in_atomic(): 1, irqs_disabled(): 1, pid: 360, name: zfcperp0.0.1700 CPU: 1 Not tainted 3.9.3+ #69 Process zfcperp0.0.1700 (pid: 360, task: 0000000075b7e080, ksp: 000000007476bc30) <snip> Call Trace: ([<00000000001165de>] show_trace+0x106/0x154) [<00000000001166a0>] show_stack+0x74/0xf4 [<00000000006ff646>] dump_stack+0xc6/0xd4 [<000000000017f3a0>] __might_sleep+0x128/0x148 [<000000000015ece8>] flush_work+0x54/0x1f8 [<00000000001630de>] __cancel_work_timer+0xc6/0x128 [<00000000005067ac>] scsi_device_dev_release_usercontext+0x164/0x23c [<0000000000161816>] execute_in_process_context+0x96/0xa8 [<00000000004d33d8>] device_release+0x60/0xc0 [<000000000048af48>] kobject_release+0xa8/0x1c4 [<00000000004f4bf2>] __scsi_iterate_devices+0xfa/0x130 [<000003ff801b307a>] zfcp_erp_strategy+0x4da/0x1014 [zfcp] [<000003ff801b3caa>] zfcp_erp_thread+0xf6/0x2b0 [zfcp] [<000000000016b75a>] kthread+0xf2/0xfc [<000000000070c9de>] kernel_thread_starter+0x6/0xc [<000000000070c9d8>] kernel_thread_starter+0x0/0xc Apparently, the ref_count for some scsi_device drops down to zero, triggering device removal through execute_in_process_context(), while the lldd error recovery thread iterates through a scsi device list. Unfortunately, execute_in_process_context() decides to immediately execute that device removal function, instead of scheduling asynchronous execution, since it detects process context and thinks it is safe to do so. But almost all calls to shost_for_each_device() in our lldd are inside spin_lock_irq, even in thread context. Obviously, schedule() inside spin_lock_irq sections is a bad idea. Change the lldd to use the proper iterator function, __shost_for_each_device(), in combination with required locking. Occurences that need to be changed include all calls in zfcp_erp.c, since those might be executed in zfcp error recovery thread context with a lock held. Other occurences of shost_for_each_device() in zfcp_fsf.c do not need to be changed (no process context, no surrounding locking). The problem was introduced in Linux 2.6.37 by commit b62a8d9b "[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit". Reported-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by:
Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by:
James Bottomley <JBottomley@Parallels.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Martin Peschke authored
commit d79ff142 upstream. This patch adds wait_event_interruptible_lock_irq_timeout(), which is a straight-forward descendant of wait_event_interruptible_timeout() and wait_event_interruptible_lock_irq(). The zfcp driver used to call wait_event_interruptible_timeout() in combination with some intricate and error-prone locking. Using wait_event_interruptible_lock_irq_timeout() as a replacement nicely cleans up that locking. This rework removes a situation that resulted in a locking imbalance in zfcp_qdio_sbal_get(): BUG: workqueue leaked lock or atomic: events/1/0xffffff00/10 last function: zfcp_fc_wka_port_offline+0x0/0xa0 [zfcp] It was introduced by commit c2af7545 "[SCSI] zfcp: Do not wait for SBALs on stopped queue", which had a new code path related to ZFCP_STATUS_ADAPTER_QDIOUP that took an early exit without a required lock being held. The problem occured when a special, non-SCSI I/O request was being submitted in process context, when the adapter's queues had been torn down. In this case the bug surfaced when the Fibre Channel port connection for a well-known address was closed during a concurrent adapter shut-down procedure, which is a rare constellation. This patch also fixes these warnings from the sparse tool (make C=1): drivers/s390/scsi/zfcp_qdio.c:224:12: warning: context imbalance in 'zfcp_qdio_sbal_check' - wrong count at exit drivers/s390/scsi/zfcp_qdio.c:244:5: warning: context imbalance in 'zfcp_qdio_sbal_get' - unexpected unlock Last but not least, we get rid of that crappy lock-unlock-lock sequence at the beginning of the critical section. It is okay to call zfcp_erp_adapter_reopen() with req_q_lock held. Reported-by:
Mikulas Patocka <mpatocka@redhat.com> Reported-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by:
Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by:
James Bottomley <JBottomley@Parallels.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Emmanuel Grumbach authored
commit eabc4ac5 upstream. As Arjan pointed out, we mustn't do anything related to PCI configuration until the device is properly enabled with pci_enable_device(). Reported-by:
Arjan van de Ven <arjan@linux.intel.com> Signed-off-by:
Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stanislaw Gruszka authored
commit 9186a1fd upstream. If channel switch is pending and we remove interface we can crash like showed below due to passing NULL vif to mac80211: BUG: unable to handle kernel paging request at fffffffffffff8cc IP: [<ffffffff8130924d>] strnlen+0xd/0x40 Call Trace: [<ffffffff8130ad2e>] string.isra.3+0x3e/0xd0 [<ffffffff8130bf99>] vsnprintf+0x219/0x640 [<ffffffff8130c481>] vscnprintf+0x11/0x30 [<ffffffff81061585>] vprintk_emit+0x115/0x4f0 [<ffffffff81657bd5>] printk+0x61/0x63 [<ffffffffa048987f>] ieee80211_chswitch_done+0xaf/0xd0 [mac80211] [<ffffffffa04e7b34>] iwl_chswitch_done+0x34/0x40 [iwldvm] [<ffffffffa04f83c3>] iwlagn_commit_rxon+0x2a3/0xdc0 [iwldvm] [<ffffffffa04ebc50>] ? iwlagn_set_rxon_chain+0x180/0x2c0 [iwldvm] [<ffffffffa04e5e76>] iwl_set_mode+0x36/0x40 [iwldvm] [<ffffffffa04e5f0d>] iwlagn_mac_remove_interface+0x8d/0x1b0 [iwldvm] [<ffffffffa0459b3d>] ieee80211_do_stop+0x29d/0x7f0 [mac80211] This is because we nulify ctx->vif in iwlagn_mac_remove_interface() before calling some other functions that teardown interface. To fix just check ctx->vif on iwl_chswitch_done(). We should not call ieee80211_chswitch_done() as channel switch works were already canceled by mac80211 in ieee80211_do_stop() -> ieee80211_mgd_stop(). Resolve: https://bugzilla.redhat.com/show_bug.cgi?id=979581Reported-by:
Lukasz Jagiello <jagiello.lukasz@gmail.com> Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Reviewed-by:
Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Terry Suereth authored
commit 8ffff94d upstream. Fixing support for the Silicon Image 3826 port multiplier, by applying to it the same quirks applied to the Silicon Image 3726. Specifically fixes the repeated timeout/reset process which previously afflicted the 3726, as described from line 290. Slightly based on notes from: https://bugzilla.redhat.com/show_bug.cgi?id=890237Signed-off-by:
Terry Suereth <terry.suereth@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 909bd592 upstream. We want the data stored in "addr" and "qual", but the extra ampersands mean we are copying stack data instead. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anthony Foiani authored
commit 99bbdfa6 upstream. Before this patch, I was seeing the following lockdep splat on my MPC8315 (PPC32) target: [ 9.086051] ================================= [ 9.090393] [ INFO: inconsistent lock state ] [ 9.094744] 3.9.7-ajf-gc39503d #1 Not tainted [ 9.099087] --------------------------------- [ 9.103432] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. [ 9.109431] scsi_eh_1/39 [HC1[1]:SC0[0]:HE0:SE1] takes: [ 9.114642] (&(&host->lock)->rlock){?.+...}, at: [<c02f4168>] sata_fsl_interrupt+0x50/0x250 [ 9.123137] {HARDIRQ-ON-W} state was registered at: [ 9.128004] [<c006cdb8>] lock_acquire+0x90/0xf4 [ 9.132737] [<c043ef04>] _raw_spin_lock+0x34/0x4c [ 9.137645] [<c02f3560>] fsl_sata_set_irq_coalescing+0x68/0x100 [ 9.143750] [<c02f36a0>] sata_fsl_init_controller+0xa8/0xc0 [ 9.149505] [<c02f3f10>] sata_fsl_probe+0x17c/0x2e8 [ 9.154568] [<c02acc90>] driver_probe_device+0x90/0x248 [ 9.159987] [<c02acf0c>] __driver_attach+0xc4/0xc8 [ 9.164964] [<c02aae74>] bus_for_each_dev+0x5c/0xa8 [ 9.170028] [<c02ac218>] bus_add_driver+0x100/0x26c [ 9.175091] [<c02ad638>] driver_register+0x88/0x198 [ 9.180155] [<c0003a24>] do_one_initcall+0x58/0x1b4 [ 9.185226] [<c05aeeac>] kernel_init_freeable+0x118/0x1c0 [ 9.190823] [<c0004110>] kernel_init+0x18/0x108 [ 9.195542] [<c000f6b8>] ret_from_kernel_thread+0x64/0x6c [ 9.201142] irq event stamp: 160 [ 9.204366] hardirqs last enabled at (159): [<c043f778>] _raw_spin_unlock_irq+0x30/0x50 [ 9.212469] hardirqs last disabled at (160): [<c000f414>] reenable_mmu+0x30/0x88 [ 9.219867] softirqs last enabled at (144): [<c002ae5c>] __do_softirq+0x168/0x218 [ 9.227435] softirqs last disabled at (137): [<c002b0d4>] irq_exit+0xa8/0xb4 [ 9.234481] [ 9.234481] other info that might help us debug this: [ 9.240995] Possible unsafe locking scenario: [ 9.240995] [ 9.246898] CPU0 [ 9.249337] ---- [ 9.251776] lock(&(&host->lock)->rlock); [ 9.255878] <Interrupt> [ 9.258492] lock(&(&host->lock)->rlock); [ 9.262765] [ 9.262765] *** DEADLOCK *** [ 9.262765] [ 9.268684] no locks held by scsi_eh_1/39. [ 9.272767] [ 9.272767] stack backtrace: [ 9.277117] Call Trace: [ 9.279589] [cfff9da0] [c0008504] show_stack+0x48/0x150 (unreliable) [ 9.285972] [cfff9de0] [c0447d5c] print_usage_bug.part.35+0x268/0x27c [ 9.292425] [cfff9e10] [c006ace4] mark_lock+0x2ac/0x658 [ 9.297660] [cfff9e40] [c006b7e4] __lock_acquire+0x754/0x1840 [ 9.303414] [cfff9ee0] [c006cdb8] lock_acquire+0x90/0xf4 [ 9.308745] [cfff9f20] [c043ef04] _raw_spin_lock+0x34/0x4c [ 9.314250] [cfff9f30] [c02f4168] sata_fsl_interrupt+0x50/0x250 [ 9.320187] [cfff9f70] [c0079ff0] handle_irq_event_percpu+0x90/0x254 [ 9.326547] [cfff9fc0] [c007a1fc] handle_irq_event+0x48/0x78 [ 9.332220] [cfff9fe0] [c007c95c] handle_level_irq+0x9c/0x104 [ 9.337981] [cfff9ff0] [c000d978] call_handle_irq+0x18/0x28 [ 9.343568] [cc7139f0] [c000608c] do_IRQ+0xf0/0x1a8 [ 9.348464] [cc713a20] [c000fc8c] ret_from_except+0x0/0x14 [ 9.353983] --- Exception: 501 at _raw_spin_unlock_irq+0x40/0x50 [ 9.353983] LR = _raw_spin_unlock_irq+0x30/0x50 [ 9.364839] [cc713af0] [c043db10] wait_for_common+0xac/0x188 [ 9.370513] [cc713b30] [c02ddee4] ata_exec_internal_sg+0x2b0/0x4f0 [ 9.376699] [cc713be0] [c02de18c] ata_exec_internal+0x68/0xa8 [ 9.382454] [cc713c20] [c02de4b8] ata_dev_read_id+0x158/0x594 [ 9.388205] [cc713ca0] [c02ec244] ata_eh_recover+0xd88/0x13d0 [ 9.393962] [cc713d20] [c02f2520] sata_pmp_error_handler+0xc0/0x8ac [ 9.400234] [cc713dd0] [c02ecdc8] ata_scsi_port_error_handler+0x464/0x5e8 [ 9.407023] [cc713e10] [c02ecfd0] ata_scsi_error+0x84/0xb8 [ 9.412528] [cc713e40] [c02c4974] scsi_error_handler+0xd8/0x47c [ 9.418457] [cc713eb0] [c004737c] kthread+0xa8/0xac [ 9.423355] [cc713f40] [c000f6b8] ret_from_kernel_thread+0x64/0x6c This fix was suggested by Bhushan Bharat <R65777@freescale.com>, and was discussed in email at: http://linuxppc.10917.n7.nabble.com/MPC8315-reboot-failure-lockdep-splat-possibly-related-tp75162.html Same patch successfully tested with 3.9.7. linux-next compiled but not tested on hardware. This patch is based off linux-next tag next-20130819 (which is commit 66a01bae29d11916c09f9f5a937cafe7d402e4a5 ) Signed-off-by:
Anthony Foiani <anthony.foiani@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anatolij Gustschin authored
commit 52d5b9ab upstream. Commit 94ae9843 (usb: phy: rename all phy drivers to phy-$name-usb.c) renamed drivers/usb/phy/otg_fsm.h to drivers/usb/phy/phy-fsm-usb.h but changed drivers/usb/phy/phy-fsm-usb.c to include not existing "phy-otg-fsm.h" instead of new "phy-fsm-usb.h". This breaks building: ... drivers/usb/phy/phy-fsm-usb.c:32:25: fatal error: phy-otg-fsm.h: No such file or directory compilation terminated. make[3]: *** [drivers/usb/phy/phy-fsm-usb.o] Error 1 This commit also missed to modify drivers/usb/phy/phy-fsl-usb.h to include new "phy-fsm-usb.h" instead of "otg_fsm.h" resulting in another build breakage: ... In file included from drivers/usb/phy/phy-fsl-usb.c:46:0: drivers/usb/phy/phy-fsl-usb.h:18:21: fatal error: otg_fsm.h: No such file or directory compilation terminated. make[3]: *** [drivers/usb/phy/phy-fsl-usb.o] Error 1 Fix both issues. Signed-off-by:
Anatolij Gustschin <agust@denx.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Drake authored
commit 93dbc1b3 upstream. Being a low-level component, various drivers (e.g. olpc-battery) assume that it is ok to communicate with the OLPC Embedded Controller during probe. Therefore the OLPC EC driver must be initialised before other drivers try to use it. This was the case until it was recently moved out of arch/x86 and restructured around commits ac250415 ("Platform: OLPC: turn EC driver into a platform_driver") and 85f90cf6 ("x86: OLPC: switch over to using new EC driver on x86"). Use arch_initcall so that olpc-ec is readied earlier, matching the previous behaviour. Fixes a regression introduced in Linux-3.6 where various drivers such as olpc-battery and olpc-xo1-sci failed to load due to an inability to communicate with the EC. The user-visible effect was a lack of battery monitoring, missing ebook/lid switch input devices, etc. Signed-off-by:
Daniel Drake <dsd@laptop.org> Cc: Andres Salomon <dilinger@queued.net> Cc: Paul Fox <pgf@laptop.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vyacheslav Dubeyko authored
commit 4bf93b50 upstream. Fix the issue with improper counting number of flying bio requests for BIO_EOPNOTSUPP error detection case. The sb_nbio must be incremented exactly the same number of times as complete() function was called (or will be called) because nilfs_segbuf_wait() will call wail_for_completion() for the number of times set to sb_nbio: do { wait_for_completion(&segbuf->sb_bio_event); } while (--segbuf->sb_nbio > 0); Two functions complete() and wait_for_completion() must be called the same number of times for the same sb_bio_event. Otherwise, wait_for_completion() will hang or leak. Signed-off-by:
Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by:
Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vyacheslav Dubeyko authored
commit 2df37a19 upstream. Remove double call of bio_put() in nilfs_end_bio_write() for the case of BIO_EOPNOTSUPP error detection. The issue was found by Dan Carpenter and he suggests first version of the fix too. Signed-off-by:
Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by:
Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wladislav Wiebe authored
commit 9e401275 upstream. Already existing property flags are filled wrong for properties created from initial FDT. This could cause problems if this DYNAMIC device-tree functions are used later, i.e. properties are attached/detached/replaced. Simply dumping flags from the running system show, that some initial static (not allocated via kzmalloc()) nodes are marked as dynamic. I putted some debug extensions to property_proc_show(..) : .. + if (OF_IS_DYNAMIC(pp)) + pr_err("DEBUG: xxx : OF_IS_DYNAMIC\n"); + if (OF_IS_DETACHED(pp)) + pr_err("DEBUG: xxx : OF_IS_DETACHED\n"); when you operate on the nodes (e.g.: ~$ cat /proc/device-tree/*some_node*) you will see that those flags are filled wrong, basically in most cases it will dump a DYNAMIC or DETACHED status, which is in not true. (BTW. this OF_IS_DETACHED is a own define for debug purposes which which just make a test_bit(OF_DETACHED, &x->_flags) If nodes are dynamic kernel is allowed to kfree() them. But it will crash attempting to do so on the nodes from FDT -- they are not allocated via kzmalloc(). Signed-off-by:
Wladislav Wiebe <wladislav.kw@gmail.com> Acked-by:
Alexander Sverdlin <alexander.sverdlin@nsn.com> Signed-off-by:
Rob Herring <rob.herring@calxeda.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chris Wilson authored
commit 884020bf upstream. After any "soft gfx reset" we must manually invalidate the TLBs associated with each ring. Empirically, it seems that a suspend/resume or D3-D0 cycle count as a "soft reset". The symptom is that the hardware would fail to note the new address for its status page, and so it would continue to write the shadow registers and breadcrumbs into the old physical address (now used by something completely different, scary). Whereas the driver would read the new status page and never see any progress, it would appear that the GPU hung immediately upon resume. Based on a patch by naresh kumar kachhi <naresh.kumar.kacchi@intel.com> Reported-by:
Thiago Macieira <thiago@kde.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64725Signed-off-by:
Chris Wilson <chris@chris-wilson.co.uk> Tested-by:
Thiago Macieira <thiago@kde.org> Signed-off-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafał Miłecki authored
commit d43a93c8 upstream. This bug (introduced in 3.10) in WREG32_OR made commit d3418eac "drm/radeon/evergreen: setup HDMI before enabling it" cause a regression. Sometimes audio over HDMI wasn't working, sometimes display was corrupted. This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60687 https://bugzilla.kernel.org/show_bug.cgi?id=60709 https://bugs.freedesktop.org/show_bug.cgi?id=67767Signed-off-by:
Rafał Miłecki <zajec5@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christian König authored
commit 112a6d0c upstream. When the message buffer is currently moving block until it is idle again. Signed-off-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Deucher authored
commit 022374c0 upstream. Uses the wrong array size for some asics which can lead to garbage getting written to registers. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60674Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ian Abbott authored
commit 3955dfa8 upstream. Commit dcd7b8bd ("staging: comedi: put module _after_ detach" by myself) reversed a couple of calls in `comedi_device_attach()` when recovering from an error returned by the low-level driver's 'attach' handler. Unfortunately, that introduced a NULL pointer dereference bug as `dev->driver` is NULL after the call to `comedi_device_detach()`. We still have a pointer to the low-level comedi driver structure in the `driv` variable, so use that instead. Signed-off-by:
Ian Abbott <abbotti@mev.co.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicolas Pitre authored
commit ac124504 upstream. Commit f6f91b0d ("ARM: allow kuser helpers to be removed from the vector page") introduced some help text for the CONFIG_KUSER_HELPERS option which is rather contradictory. Let's fix that, and improve it a little. Signed-off-by:
Nicolas Pitre <nico@linaro.org> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Will Deacon authored
commit ee7538a0 upstream. This is a port of c95eb318 ("ARM: 7809/1: perf: fix event validation for software group leaders") to arm64, which fixes a panic in the arm64 perf backend found as a result of Vince's fuzzing tool. Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Will Deacon authored
commit 868f6fea upstream. This is a port of d9f96635 ("ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event()") to arm64, which fixes an oops in the arm64 perf backend found as a result of Vince's fuzzing tool. Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicolas Ferre authored
commit a57603ca upstream. Signed-off-by:
Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sekhar Nori authored
commit acd36357 upstream. Starting with kernel v3.5, it is mandatory to specify ECC strength when using hardware ECC. Without this, kernel panics with a warning of the sort: Driver must set ecc.strength when using hardware ECC ------------[ cut here ]------------ kernel BUG at drivers/mtd/nand/nand_base.c:3519! Fix this by specifying ECC strength for the boards which were missing this. Reported-by:
Holger Freyther <holger@freyther.de> Signed-off-by:
Sekhar Nori <nsekhar@ti.com> Signed-off-by:
Kevin Hilman <khilman@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Vrabel authored
commit 4704fe4f upstream. When a event is being bound to a VCPU there is a window between the EVTCHNOP_bind_vpcu call and the adjustment of the local per-cpu masks where an event may be lost. The hypervisor upcalls the new VCPU but the kernel thinks that event is still bound to the old VCPU and ignores it. There is even a problem when the event is being bound to the same VCPU as there is a small window beween the clear_bit() and set_bit() calls in bind_evtchn_to_cpu(). When scanning for pending events, the kernel may read the bit when it is momentarily clear and ignore the event. Avoid this by masking the event during the whole bind operation. Signed-off-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by:
Jan Beulich <jbeulich@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Vrabel authored
commit 84ca7a8e upstream. The sizeof() argument in init_evtchn_cpu_bindings() is incorrect resulting in only the first 64 (or 32 in 32-bit guests) ports having their bindings being initialized to VCPU 0. In most cases this does not cause a problem as request_irq() will set the irq affinity which will set the correct local per-cpu mask. However, if the request_irq() is called on a VCPU other than 0, there is a window between the unmasking of the event and the affinity being set were an event may be lost because it is not locally unmasked on any VCPU. If request_irq() is called on VCPU 0 then local irqs are disabled during the window and the race does not occur. Fix this by initializing all NR_EVENT_CHANNEL bits in the local per-cpu masks. Signed-off-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Drake authored
commit d55e37bb upstream. OpenFirmware wasn't quite following the protocol described in boot.txt and the kernel has detected this through use of the sentinel value in boot_params. OFW does zero out almost all of the stuff that it should do, but not the sentinel. This causes the kernel to clear olpc_ofw_header, which breaks x86 OLPC support. OpenFirmware has now been fixed. However, it would be nice if we could maintain Linux compatibility with old firmware versions. To do that, we just have to avoid zeroing out olpc_ofw_header. OFW does not write to any other parts of the header that are being zapped by the sentinel-detection code, and all users of olpc_ofw_header are somewhat protected through checking for the OLPC_OFW_SIG magic value before using it. So this should not cause any problems for anyone. Signed-off-by:
Daniel Drake <dsd@laptop.org> Link: http://lkml.kernel.org/r/20130809221420.618E6FAB03@dev.laptop.orgAcked-by:
Yinghai Lu <yinghai@kernel.org> Signed-off-by:
H. Peter Anvin <hpa@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 52e220d3 upstream. This should actually be returning an ERR_PTR on error instead of NULL. That was how it was designed and all the callers expect it. [AV: actually, that's what "VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors" missed - originally collect_mounts() was expected to return NULL on failure] Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jussi Kivilinna authored
commit 1206ff4f upstream. Patch fixes zd1201 not to use stack as URB transfer_buffer. URB buffers need to be DMA-able, which stack is not. Patch is only compile tested. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joern Rennecke authored
commit b0f55f2a upstream. For a search buffer, 2 byte aligned, strchr() was returning pointer outside of buffer (buf - 1) ------------->8---------------- // Input buffer (default 4 byte aigned) char *buffer = "1AA_"; // actual search start (to mimick 2 byte alignment) char *current_line = &(buffer[2]); // Character to search for char c = 'A'; char *c_pos = strchr(current_line, c); printf("%s\n", c_pos) --> 'AA_' as oppose to 'A_' ------------->8---------------- Reported-by:
Anton Kolesov <Anton.Kolesov@synopsys.com> Debugged-by:
Anton Kolesov <Anton.Kolesov@synopsys.com> Cc: Noam Camus <noamc@ezchip.com> Signed-off-by:
Joern Rennecke <joern.rennecke@embecosm.com> Signed-off-by:
Vineet Gupta <vgupta@synopsys.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chuck Anderson authored
commit fc78d343 upstream. An older PVHVM guest (v3.0 based) crashed during vCPU hot-plug with: kernel BUG at drivers/xen/events.c:1328! RCU has detected that a CPU has not entered a quiescent state within the grace period. It needs to send the CPU a reschedule IPI if it is not offline. rcu_implicit_offline_qs() does this check: /* * If the CPU is offline, it is in a quiescent state. We can * trust its state not to change because interrupts are disabled. */ if (cpu_is_offline(rdp->cpu)) { rdp->offline_fqs++; return 1; } Else the CPU is online. Send it a reschedule IPI. The CPU is in the middle of being hot-plugged and has been marked online (!cpu_is_offline()). See start_secondary(): set_cpu_online(smp_processor_id(), true); ... per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; start_secondary() then waits for the CPU bringing up the hot-plugged CPU to mark it as active: /* * Wait until the cpu which brought this one up marked it * online before enabling interrupts. If we don't do that then * we can end up waking up the softirq thread before this cpu * reached the active state, which makes the scheduler unhappy * and schedule the softirq thread on the wrong cpu. This is * only observable with forced threaded interrupts, but in * theory it could also happen w/o them. It's just way harder * to achieve. */ while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask)) cpu_relax(); /* enable local interrupts */ local_irq_enable(); The CPU being hot-plugged will be marked active after it has been fully initialized by the CPU managing the hot-plug. In the Xen PVHVM case xen_smp_intr_init() is called to set up the hot-plugged vCPU's XEN_RESCHEDULE_VECTOR. The hot-plugging CPU is marked online, not marked active and does not have its IPI vectors set up. rcu_implicit_offline_qs() sees the hot-plugging cpu is !cpu_is_offline() and tries to send it a reschedule IPI: This will lead to: kernel BUG at drivers/xen/events.c:1328! xen_send_IPI_one() xen_smp_send_reschedule() rcu_implicit_offline_qs() rcu_implicit_dynticks_qs() force_qs_rnp() force_quiescent_state() __rcu_process_callbacks() rcu_process_callbacks() __do_softirq() call_softirq() do_softirq() irq_exit() xen_evtchn_do_upcall() because xen_send_IPI_one() will attempt to use an uninitialized IRQ for the XEN_RESCHEDULE_VECTOR. There is at least one other place that has caused the same crash: xen_smp_send_reschedule() wake_up_idle_cpu() add_timer_on() clocksource_watchdog() call_timer_fn() run_timer_softirq() __do_softirq() call_softirq() do_softirq() irq_exit() xen_evtchn_do_upcall() xen_hvm_callback_vector() clocksource_watchdog() uses cpu_online_mask to pick the next CPU to handle a watchdog timer: /* * Cycle through CPUs to check if the CPUs stay synchronized * to each other. */ next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask); if (next_cpu >= nr_cpu_ids) next_cpu = cpumask_first(cpu_online_mask); watchdog_timer.expires += WATCHDOG_INTERVAL; add_timer_on(&watchdog_timer, next_cpu); This resulted in an attempt to send an IPI to a hot-plugging CPU that had not initialized its reschedule vector. One option would be to make the RCU code check to not check for CPU offline but for CPU active. As becoming active is done after a CPU is online (in older kernels). But Srivatsa pointed out that "the cpu_active vs cpu_online ordering has been completely reworked - in the online path, cpu_active is set *before* cpu_online, and also, in the cpu offline path, the cpu_active bit is reset in the CPU_DYING notification instead of CPU_DOWN_PREPARE." Drilling in this the bring-up path: "[brought up CPU].. send out a CPU_STARTING notification, and in response to that, the scheduler sets the CPU in the cpu_active_mask. Again, this mask is better left to the scheduler alone, since it has the intelligence to use it judiciously." The conclusion was that: " 1. At the IPI sender side: It is incorrect to send an IPI to an offline CPU (cpu not present in the cpu_online_mask). There are numerous places where we check this and warn/complain. 2. At the IPI receiver side: It is incorrect to let the world know of our presence (by setting ourselves in global bitmasks) until our initialization steps are complete to such an extent that we can handle the consequences (such as receiving interrupts without crashing the sender etc.) " (from Srivatsa) As the native code enables the interrupts at some point we need to be able to service them. In other words a CPU must have valid IPI vectors if it has been marked online. It doesn't need to handle the IPI (interrupts may be disabled) but needs to have valid IPI vectors because another CPU may find it in cpu_online_mask and attempt to send it an IPI. This patch will change the order of the Xen vCPU bring-up functions so that Xen vectors have been set up before start_secondary() is called. It also will not continue to bring up a Xen vCPU if xen_smp_intr_init() fails to initialize it. Orabug 13823853 Signed-off-by Chuck Anderson <chuck.anderson@oracle.com> Acked-by:
Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
Jonghwan Choi <jhbird.choi@samsung.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (Red Hat) authored
commit 8c4f3c3f upstream. There's been a nasty bug that would show up and not give much info. The bug displayed the following warning: WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230() Pid: 20903, comm: bash Tainted: G O 3.6.11+ #38405.trunk Call Trace: [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20 [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230 [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0 [<ffffffff811401cc>] ? kfree+0x2c/0x110 [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150 [<ffffffff81149f1e>] __fput+0xae/0x220 [<ffffffff8114a09e>] ____fput+0xe/0x10 [<ffffffff8105fa22>] task_work_run+0x72/0x90 [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0 [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c [<ffffffff815c0f88>] int_signal+0x12/0x17 ---[ end trace 793179526ee09b2c ]--- It was finally narrowed down to unloading a module that was being traced. It was actually more than that. When functions are being traced, there's a table of all functions that have a ref count of the number of active tracers attached to that function. When a function trace callback is registered to a function, the function's record ref count is incremented. When it is unregistered, the function's record ref count is decremented. If an inconsistency is detected (ref count goes below zero) the above warning is shown and the function tracing is permanently disabled until reboot. The ftrace callback ops holds a hash of functions that it filters on (and/or filters off). If the hash is empty, the default means to filter all functions (for the filter_hash) or to disable no functions (for the notrace_hash). When a module is unloaded, it frees the function records that represent the module functions. These records exist on their own pages, that is function records for one module will not exist on the same page as function records for other modules or even the core kernel. Now when a module unloads, the records that represents its functions are freed. When the module is loaded again, the records are recreated with a default ref count of zero (unless there's a callback that traces all functions, then they will also be traced, and the ref count will be incremented). The problem is that if an ftrace callback hash includes functions of the module being unloaded, those hash entries will not be removed. If the module is reloaded in the same location, the hash entries still point to the functions of the module but the module's ref counts do not reflect that. With the help of Steve and Joern, we found a reproducer: Using uinput module and uinput_release function. cd /sys/kernel/debug/tracing modprobe uinput echo uinput_release > set_ftrace_filter echo function > current_tracer rmmod uinput modprobe uinput # check /proc/modules to see if loaded in same addr, otherwise try again echo nop > current_tracer [BOOM] The above loads the uinput module, which creates a table of functions that can be traced within the module. We add uinput_release to the filter_hash to trace just that function. Enable function tracincg, which increments the ref count of the record associated to uinput_release. Remove uinput, which frees the records including the one that represents uinput_release. Load the uinput module again (and make sure it's at the same address). This recreates the function records all with a ref count of zero, including uinput_release. Disable function tracing, which will decrement the ref count for uinput_release which is now zero because of the module removal and reload, and we have a mismatch (below zero ref count). The solution is to check all currently tracing ftrace callbacks to see if any are tracing any of the module's functions when a module is loaded (it already does that with callbacks that trace all functions). If a callback happens to have a module function being traced, it increments that records ref count and starts tracing that function. There may be a strange side effect with this, where tracing module functions on unload and then reloading a new module may have that new module's functions being traced. This may be something that confuses the user, but it's not a big deal. Another approach is to disable all callback hashes on module unload, but this leaves some ftrace callbacks that may not be registered, but can still have hashes tracing the module's function where ftrace doesn't know about it. That situation can cause the same bug. This solution solves that case too. Another benefit of this solution, is it is possible to trace a module's function on unload and load. Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.comReported-by:
Jörn Engel <joern@logfs.org> Reported-by:
Dave Jones <davej@redhat.com> Reported-by:
Steve Hodgson <steve@purestorage.com> Tested-by:
Steve Hodgson <steve@purestorage.com> Signed-off-by:
Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-