- 23 Mar, 2019 40 commits
-
-
Kunihiko Hayashi authored
commit 52128223 upstream. Need to set the update bit in UNIPHIER_CLK_CPUGEAR_UPD to update the CPU-gear value. Fixes: d08f1f0d ("clk: uniphier: add CPU-gear change (cpufreq) support") Cc: linux-stable@vger.kernel.org Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Kara authored
commit 1c2d1421 upstream. When ext2 filesystem is created with 64k block size, ext2_max_size() will return value less than 0. Also, we cannot write any file in this fs since the sb->maxbytes is less than 0. The core of the problem is that the size of block index tree for such large block size is more than i_blocks can carry. So fix the computation to count with this possibility. File size limits computed with the new function for the full range of possible block sizes look like: bits file_size 10 17247252480 11 275415851008 12 2196873666560 13 2197948973056 14 2198486220800 15 2198754754560 16 2198888906752 CC: stable@vger.kernel.org Reported-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vaibhav Jain authored
commit edeb304f upstream. Within cxl module, iteration over array 'adapter->afu' may be racy at few points as it might be simultaneously read during an EEH and its contents being set to NULL while driver is being unloaded or unbound from the adapter. This might result in a NULL pointer to 'struct afu' being de-referenced during an EEH thereby causing a kernel oops. This patch fixes this by making sure that all access to the array 'adapter->afu' is wrapped within the context of spin-lock 'adapter->afu_list_lock'. Fixes: 9e8df8a2 ("cxl: EEH support") Cc: stable@vger.kernel.org # v4.3+ Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael J. Ruhl authored
commit bc5add09 upstream. When disabling and removing a receive context, it is possible for an asynchronous event (i.e IRQ) to occur. Because of this, there is a race between cleaning up the context, and the context being used by the asynchronous event. cpu 0 (context cleanup) rc->ref_count-- (ref_count == 0) hfi1_rcd_free() cpu 1 (IRQ (with rcd index)) rcd_get_by_index() lock ref_count+++ <-- reference count race (WARNING) return rcd unlock cpu 0 hfi1_free_ctxtdata() <-- incorrect free location lock remove rcd from array unlock free rcd This race will cause the following WARNING trace: WARNING: CPU: 0 PID: 175027 at include/linux/kref.h:52 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1] CPU: 0 PID: 175027 Comm: IMB-MPI1 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015 Call Trace: dump_stack+0x19/0x1b __warn+0xd8/0x100 warn_slowpath_null+0x1d/0x20 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1] is_rcv_urgent_int+0x24/0x90 [hfi1] general_interrupt+0x1b6/0x210 [hfi1] __handle_irq_event_percpu+0x44/0x1c0 handle_irq_event_percpu+0x32/0x80 handle_irq_event+0x3c/0x60 handle_edge_irq+0x7f/0x150 handle_irq+0xe4/0x1a0 do_IRQ+0x4d/0xf0 common_interrupt+0x162/0x162 The race can also lead to a use after free which could be similar to: general protection fault: 0000 1 SMP CPU: 71 PID: 177147 Comm: IMB-MPI1 Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015 task: ffff9962a8098000 ti: ffff99717a508000 task.ti: ffff99717a508000 __kmalloc+0x94/0x230 Call Trace: ? hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1] hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1] hfi1_aio_write+0xba/0x110 [hfi1] do_sync_readv_writev+0x7b/0xd0 do_readv_writev+0xce/0x260 ? handle_mm_fault+0x39d/0x9b0 ? pick_next_task_fair+0x5f/0x1b0 ? sched_clock_cpu+0x85/0xc0 ? __schedule+0x13a/0x890 vfs_writev+0x35/0x60 SyS_writev+0x7f/0x110 system_call_fastpath+0x22/0x27 Use the appropriate kref API to verify access. Reorder context cleanup to ensure context removal before cleanup occurs correctly. Cc: stable@vger.kernel.org # v4.14.0+ Fixes: f683c80c ("IB/hfi1: Resolve kernel panics by reference counting receive contexts") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lucas Stach authored
commit 3afc8299 upstream. Since 7c5925af (PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API) the MSI init claims one of the controller IRQs as a chained IRQ line for the MSI controller. On some designs, like the i.MX6, this line is shared with a PCIe legacy IRQ. When the line is claimed for the MSI domain, any device trying to use this legacy IRQs will fail to request this IRQ line. As MSI and legacy IRQs are already mutually exclusive on the DWC core, as the core won't forward any legacy IRQs once any MSI has been enabled, users wishing to use legacy IRQs already need to explictly disable MSI support (usually via the pci=nomsi kernel commandline option). To avoid any issues with MSI conflicting with legacy IRQs, just skip all of the DWC MSI initalization, including the IRQ line claim, when MSI is disabled. Fixes: 7c5925af ("PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API") Tested-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dongdong Liu authored
commit 9f08a5d8 upstream. Previously dpc_handler() called aer_get_device_error_info() without initializing info->severity, so aer_get_device_error_info() relied on uninitialized data. Add dpc_get_aer_uncorrect_severity() to read the port's AER status, mask, and severity registers and set info->severity. Also, clear the port's AER fatal error status bits. Fixes: 8aefa9b0 ("PCI/DPC: Print AER status in DPC event handling") Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bjorn Helgaas authored
commit 10ecc818 upstream. RussianNeuroMancer reported that the Intel 7265 wifi on a Dell Venue 11 Pro 7140 table stopped working after wakeup from suspend and bisected the problem to 9ab105de ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR"). David Ward reported the same problem on a Dell Latitude 7350. After af8bb9f8 ("PCI/ACPI: Request LTR control from platform before using it"), we don't enable LTR unless the platform has granted LTR control to us. In addition, we don't notice if the platform had already enabled LTR itself. After 9ab105de ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR"), we avoid using LTR if we don't think the path to the device has LTR enabled. The combination means that if the platform itself enables LTR but declines to give the OS control over LTR, we unnecessarily avoided using ASPM L1.2. Link: https://bugzilla.kernel.org/show_bug.cgi?id=201469 Fixes: 9ab105de ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR") Fixes: af8bb9f8 ("PCI/ACPI: Request LTR control from platform before using it") Reported-by: RussianNeuroMancer <russianneuromancer@ya.ru> Reported-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v4.18+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Kara authored
commit f96c3ac8 upstream. When computing maximum size of filesystem possible with given number of group descriptor blocks, we forget to include s_first_data_block into the number of blocks. Thus for filesystems with non-zero s_first_data_block it can happen that computed maximum filesystem size is actually lower than current filesystem size which confuses the code and eventually leads to a BUG_ON in ext4_alloc_group_tables() hitting on flex_gd->count == 0. The problem can be reproduced like: truncate -s 100g /tmp/image mkfs.ext4 -b 1024 -E resize=262144 /tmp/image 32768 mount -t ext4 -o loop /tmp/image /mnt resize2fs /dev/loop0 262145 resize2fs /dev/loop0 300000 Fix the problem by properly including s_first_data_block into the computed number of filesystem blocks. Fixes: 1c6bd717 "ext4: convert file system to meta_bg if needed..." Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
yangerkun authored
commit abdc644e upstream. The reason is that while swapping two inode, we swap the flags too. Some flags such as EXT4_JOURNAL_DATA_FL can really confuse the things since we're not resetting the address operations structure. The simplest way to keep things sane is to restrict the flags that can be swapped. Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
yangerkun authored
commit aa507b5f upstream. While do swap between two inode, they swap i_data without update quota information. Also, swap_inode_boot_loader can do "revert" somtimes, so update the quota while all operations has been finished. Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
yangerkun authored
commit a46c68a3 upstream. While do swap, we should make sure there has no new dirty page since we should swap i_data between two inode: 1.We should lock i_mmap_sem with write to avoid new pagecache from mmap read/write; 2.Change filemap_flush to filemap_write_and_wait and move them to the space protected by inode lock to avoid new pagecache from buffer read/write. Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
yangerkun authored
commit 67a11611 upstream. Before really do swap between inode and boot inode, something need to check to avoid invalid or not permitted operation, like does this inode has inline data. But the condition check should be protected by inode lock to avoid change while swapping. Also some other condition will not change between swapping, but there has no problem to do this under inode lock. Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit 9505b98c upstream. pxa_cpufreq_init_voltages() is marked __init but usually inlined into the non-__init pxa_cpufreq_init() function. When building with clang, it can stay as a standalone function in a discarded section, and produce this warning: WARNING: vmlinux.o(.text+0x616a00): Section mismatch in reference from the function pxa_cpufreq_init() to the function .init.text:pxa_cpufreq_init_voltages() The function pxa_cpufreq_init() references the function __init pxa_cpufreq_init_voltages(). This is often because pxa_cpufreq_init lacks a __init annotation or the annotation of pxa_cpufreq_init_voltages is wrong. Fixes: 50e77fcd ("ARM: pxa: remove __init from cpufreq_driver->init()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yangtao Li authored
commit 446fae2b upstream. of_cpu_device_node_get() will increase the refcount of device_node, it is necessary to call of_node_put() at the end to release the refcount. Fixes: 9eb15dbb ("cpufreq: Add cpufreq driver for Tegra124") Cc: <stable@vger.kernel.org> # 4.4+ Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Viresh Kumar authored
commit 0334906c upstream. Commit 5ad7346b ("cpufreq: kryo: Add module remove and exit") made it possible to build the kryo cpufreq driver as a module, but it failed to release all the resources, i.e. OPP tables, when the module is unloaded. This patch fixes it by releasing the OPP tables, by calling dev_pm_opp_put_supported_hw() for them, from the qcom_cpufreq_kryo_remove() routine. The array of pointers to the OPP tables is also allocated dynamically now in qcom_cpufreq_kryo_probe(), as the pointers will be required while releasing the resources. Compile tested only. Cc: 4.18+ <stable@vger.kernel.org> # v4.18+ Fixes: 5ad7346b ("cpufreq: kryo: Add module remove and exit") Reviewed-by: Georgi Djakov <georgi.djakov@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
commit 0192e653 upstream. Prohibit probing on optprobe template code, since it is not a code but a template instruction sequence. If we modify this template, copied template must be broken. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 9326638c ("kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation") Link: http://lkml.kernel.org/r/154998787911.31052.15274376330136234452.stgit@devboxSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Doug Berger authored
commit 33517881 upstream. Using the irq_gc_lock/irq_gc_unlock functions in the suspend and resume functions creates the opportunity for a deadlock during suspend, resume, and shutdown. Using the irq_gc_lock_irqsave/ irq_gc_unlock_irqrestore variants prevents this possible deadlock. Cc: stable@vger.kernel.org Fixes: 7f646e92 ("irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller") Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> [maz: tidied up $SUBJECT] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Zenghui Yu authored
commit 8d565748 upstream. In current logic, its_parse_indirect_baser() will be invoked twice when allocating Device tables. Add a *break* to omit the unnecessary and annoying (might be ...) invoking. Fixes: 32bd44dc ("irqchip/gic-v3-its: Fix the incorrect parsing of VCPU table size") Cc: stable@vger.kernel.org Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lubomir Rintel authored
commit 607076a9 upstream. It doesn't make sense and the USB core warns on each submit of such URB, easily flooding the message buffer with tracebacks. Analogous issue was fixed in regular libertas driver in commit 6528d880 ("libertas: don't set URB_ZERO_PACKET on IN USB transfer"). Cc: stable@vger.kernel.org Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Reviewed-by: Steve deRosier <derosier@cal-sierra.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stephen Boyd authored
commit baef1c90 upstream. Using the batch API from the interconnect driver sometimes leads to a KASAN error due to an access to freed memory. This is easier to trigger with threadirqs on the kernel commandline. BUG: KASAN: use-after-free in rpmh_tx_done+0x114/0x12c Read of size 1 at addr fffffff51414ad84 by task irq/110-apps_rs/57 CPU: 0 PID: 57 Comm: irq/110-apps_rs Tainted: G W 4.19.10 #72 Call trace: dump_backtrace+0x0/0x2f8 show_stack+0x20/0x2c __dump_stack+0x20/0x28 dump_stack+0xcc/0x10c print_address_description+0x74/0x240 kasan_report+0x250/0x26c __asan_report_load1_noabort+0x20/0x2c rpmh_tx_done+0x114/0x12c tcs_tx_done+0x450/0x768 irq_forced_thread_fn+0x58/0x9c irq_thread+0x120/0x1dc kthread+0x248/0x260 ret_from_fork+0x10/0x18 Allocated by task 385: kasan_kmalloc+0xac/0x148 __kmalloc+0x170/0x1e4 rpmh_write_batch+0x174/0x540 qcom_icc_set+0x8dc/0x9ac icc_set+0x288/0x2e8 a6xx_gmu_stop+0x320/0x3c0 a6xx_pm_suspend+0x108/0x124 adreno_suspend+0x50/0x60 pm_generic_runtime_suspend+0x60/0x78 __rpm_callback+0x214/0x32c rpm_callback+0x54/0x184 rpm_suspend+0x3f8/0xa90 pm_runtime_work+0xb4/0x178 process_one_work+0x544/0xbc0 worker_thread+0x514/0x7d0 kthread+0x248/0x260 ret_from_fork+0x10/0x18 Freed by task 385: __kasan_slab_free+0x12c/0x1e0 kasan_slab_free+0x10/0x1c kfree+0x134/0x588 rpmh_write_batch+0x49c/0x540 qcom_icc_set+0x8dc/0x9ac icc_set+0x288/0x2e8 a6xx_gmu_stop+0x320/0x3c0 a6xx_pm_suspend+0x108/0x124 adreno_suspend+0x50/0x60 cr50_spi spi5.0: SPI transfer timed out pm_generic_runtime_suspend+0x60/0x78 __rpm_callback+0x214/0x32c rpm_callback+0x54/0x184 rpm_suspend+0x3f8/0xa90 pm_runtime_work+0xb4/0x178 process_one_work+0x544/0xbc0 worker_thread+0x514/0x7d0 kthread+0x248/0x260 ret_from_fork+0x10/0x18 The buggy address belongs to the object at fffffff51414ac80 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 260 bytes inside of 512-byte region [fffffff51414ac80, fffffff51414ae80) The buggy address belongs to the page: page:ffffffbfd4505200 count:1 mapcount:0 mapping:fffffff51e00c680 index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 ffffffbfd4529008 ffffffbfd44f9208 fffffff51e00c680 raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: fffffff51414ac80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fffffff51414ad00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >fffffff51414ad80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ fffffff51414ae00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fffffff51414ae80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc The batch API sets the same completion for each rpmh message that's sent and then loops through all the messages and waits for that single completion declared on the stack to be completed before returning from the function and freeing the message structures. Unfortunately, some messages may still be in process and 'stuck' in the TCS. At some later point, the tcs_tx_done() interrupt will run and try to process messages that have already been freed at the end of rpmh_write_batch(). This will in turn access the 'needs_free' member of the rpmh_request structure and cause KASAN to complain. Furthermore, if there's a message that's completed in rpmh_tx_done() and freed immediately after the complete() call is made we'll be racing with potentially freed memory when accessing the 'needs_free' member: CPU0 CPU1 ---- ---- rpmh_tx_done() complete(&compl) wait_for_completion(&compl) kfree(rpm_msg) if (rpm_msg->needs_free) <KASAN warning splat> Let's fix this by allocating a chunk of completions for each message and waiting for all of them to be completed before returning from the batch API. Alternatively, we could wait for the last message in the batch, but that may be a more complicated change because it looks like tcs_tx_done() just iterates through the indices of the queue and completes each message instead of tracking the last inserted message and completing that first. Fixes: c8790cb6 ("drivers: qcom: rpmh: add support for batch RPMH request") Cc: Lina Iyer <ilina@codeaurora.org> Cc: "Raju P.L.S.S.S.N" <rplsssn@codeaurora.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Evan Green <evgreen@chromium.org> Cc: stable@vger.kernel.org Reviewed-by: Lina Iyer <ilina@codeaurora.org> Reviewed-by: Evan Green <evgreen@chromium.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Andy Gross <andy.gross@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Filipe Manana authored
commit 8e928218 upstream. In the past we had data corruption when reading compressed extents that are shared within the same file and they are consecutive, this got fixed by commit 005efedf ("Btrfs: fix read corruption of compressed and shared extents") and by commit 808f80b4 ("Btrfs: update fix for read corruption of compressed and shared extents"). However there was a case that was missing in those fixes, which is when the shared and compressed extents are referenced with a non-zero offset. The following shell script creates a reproducer for this issue: #!/bin/bash mkfs.btrfs -f /dev/sdc &> /dev/null mount -o compress /dev/sdc /mnt/sdc # Create a file with 3 consecutive compressed extents, each has an # uncompressed size of 128Kb and a compressed size of 4Kb. for ((i = 1; i <= 3; i++)); do head -c 4096 /dev/zero for ((j = 1; j <= 31; j++)); do head -c 4096 /dev/zero | tr '\0' "\377" done done > /mnt/sdc/foobar sync echo "Digest after file creation: $(md5sum /mnt/sdc/foobar)" # Clone the first extent into offsets 128K and 256K. xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar sync echo "Digest after cloning: $(md5sum /mnt/sdc/foobar)" # Punch holes into the regions that are already full of zeroes. xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar sync echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)" echo "Dropping page cache..." sysctl -q vm.drop_caches=1 echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)" umount /dev/sdc When running the script we get the following output: Digest after file creation: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar linked 131072/131072 bytes at offset 131072 128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec) linked 131072/131072 bytes at offset 262144 128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec) Digest after cloning: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar Digest after hole punching: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar Dropping page cache... Digest after hole punching: fba694ae8664ed0c2e9ff8937e7f1484 /mnt/sdc/foobar This happens because after reading all the pages of the extent in the range from 128K to 256K for example, we read the hole at offset 256K and then when reading the page at offset 260K we don't submit the existing bio, which is responsible for filling all the page in the range 128K to 256K only, therefore adding the pages from range 260K to 384K to the existing bio and submitting it after iterating over the entire range. Once the bio completes, the uncompressed data fills only the pages in the range 128K to 256K because there's no more data read from disk, leaving the pages in the range 260K to 384K unfilled. It is just a slightly different variant of what was solved by commit 005efedf ("Btrfs: fix read corruption of compressed and shared extents"). Fix this by forcing a bio submit, during readpages(), whenever we find a compressed extent map for a page that is different from the extent map for the previous page or has a different starting offset (in case it's the same compressed extent), instead of the extent map's original start offset. A test case for fstests follows soon. Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Fixes: 808f80b4 ("Btrfs: update fix for read corruption of compressed and shared extents") Fixes: 005efedf ("Btrfs: fix read corruption of compressed and shared extents") Cc: stable@vger.kernel.org # 4.3+ Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johannes Thumshirn authored
commit 349ae63f upstream. We recently had a customer issue with a corrupted filesystem. When trying to mount this image btrfs panicked with a division by zero in calc_stripe_length(). The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length() takes this value and divides it by the number of copies the RAID profile is expected to have to calculate the amount of data stripes. As a DUP profile is expected to have 2 copies this division resulted in 1/2 = 0. Later then the 'data_stripes' variable is used as a divisor in the stripe length calculation which results in a division by 0 and thus a kernel panic. When encountering a filesystem with a DUP block group and a 'num_stripes' value unequal to 2, refuse mounting as the image is corrupted and will lead to unexpected behaviour. Code inspection showed a RAID1 block group has the same issues. Fixes: e06cd3dd ("Btrfs: add validadtion checks for chunk loading") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Filipe Manana authored
commit a0873490 upstream. We are holding a transaction handle when setting an acl, therefore we can not allocate the xattr value buffer using GFP_KERNEL, as we could deadlock if reclaim is triggered by the allocation, therefore setup a nofs context. Fixes: 39a27ec1 ("btrfs: use GFP_KERNEL for xattr and acl allocations") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Filipe Manana authored
commit b89f6d1f upstream. We are holding a transaction handle when creating a tree, therefore we can not allocate the root using GFP_KERNEL, as we could deadlock if reclaim is triggered by the allocation, therefore setup a nofs context. Fixes: 74e4d827 ("btrfs: let callers of btrfs_alloc_root pass gfp flags") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Finn Thain authored
commit 28713169 upstream. This patch fixes a build failure when using GCC 8.1: /usr/bin/ld: block/partitions/ldm.o: in function `ldm_parse_tocblock': block/partitions/ldm.c:153: undefined reference to `strcmp' This is caused by a new optimization which effectively replaces a strncmp() call with a strcmp() call. This affects a number of strncmp() call sites in the kernel. The entire class of optimizations is avoided with -fno-builtin, which gets enabled by -ffreestanding. This may avoid possible future build failures in case new optimizations appear in future compilers. I haven't done any performance measurements with this patch but I did count the function calls in a defconfig build. For example, there are now 23 more sprintf() calls and 39 fewer strcpy() calls. The effect on the other libc functions is smaller. If this harms performance we can tackle that regression by optimizing the call sites, ideally using semantic patches. That way, clang and ICC builds might benfit too. Cc: stable@vger.kernel.org Reference: https://marc.info/?l=linux-m68k&m=154514816222244&w=2Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vivek Goyal authored
commit 993a0b2a upstream. If a file has been copied up metadata only, and later data is copied up, upper loses any security.capability xattr it has (underlying filesystem clears it as upon file write). From a user's point of view, this is just a file copy-up and that should not result in losing security.capability xattr. Hence, before data copy up, save security.capability xattr (if any) and restore it on upper after data copy up is complete. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Fixes: 0c288874 ("ovl: A new xattr OVL_XATTR_METACOPY for file on upper") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vivek Goyal authored
commit 5f32879e upstream. If a file with capability set (and hence security.capability xattr) is written kernel clears security.capability xattr. For overlay, during file copy up if xattrs are copied up first and then data is, copied up. This means data copy up will result in clearing of security.capability xattr file on lower has. And this can result into surprises. If a lower file has CAP_SETUID, then it should not be cleared over copy up (if nothing was actually written to file). This also creates problems with chown logic where it first copies up file and then tries to clear setuid bit. But by that time security.capability xattr is already gone (due to data copy up), and caller gets -ENODATA. This has been reported by Giuseppe here. https://github.com/containers/libpod/issues/2015#issuecomment-447824842 Fix this by copying up data first and then metadta. This is a regression which has been introduced by my commit as part of metadata only copy up patches. TODO: There will be some corner cases where a file is copied up metadata only and later data copy up happens and that will clear security.capability xattr. Something needs to be done about that too. Fixes: bd64e575 ("ovl: During copy up, first copy up metadata and then data") Cc: <stable@vger.kernel.org> # v4.19+ Reported-by: Giuseppe Scrivano <gscrivan@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jann Horn authored
commit a0ce2f0a upstream. Before this patch, it was possible for two pipes to affect each other after data had been transferred between them with tee(): ============ $ cat tee_test.c int main(void) { int pipe_a[2]; if (pipe(pipe_a)) err(1, "pipe"); int pipe_b[2]; if (pipe(pipe_b)) err(1, "pipe"); if (write(pipe_a[1], "abcd", 4) != 4) err(1, "write"); if (tee(pipe_a[0], pipe_b[1], 2, 0) != 2) err(1, "tee"); if (write(pipe_b[1], "xx", 2) != 2) err(1, "write"); char buf[5]; if (read(pipe_a[0], buf, 4) != 4) err(1, "read"); buf[4] = 0; printf("got back: '%s'\n", buf); } $ gcc -o tee_test tee_test.c $ ./tee_test got back: 'abxx' $ ============ As suggested by Al Viro, fix it by creating a separate type for non-mergeable pipe buffers, then changing the types of buffers in splice_pipe_to_pipe() and link_pipe(). Cc: <stable@vger.kernel.org> Fixes: 7c77f0b3 ("splice: implement pipe to pipe splicing") Fixes: 70524490 ("[PATCH] splice: add support for sys_tee()") Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Varad Gautam authored
commit 73052b0d upstream. d_delete only unhashes an entry if it is reached with dentry->d_lockref.count != 1. Prior to commit 8ead9dd5 ("devpts: more pty driver interface cleanups"), d_delete was called on a dentry from devpts_pty_kill with two references held, which would trigger the unhashing, and the subsequent dputs would release it. Commit 8ead9dd5 reworked devpts_pty_kill to stop acquiring the second reference from d_find_alias, and the d_delete call left the dentries still on the hashed list without actually ever being dropped from dcache before explicit cleanup. This causes the number of negative dentries for devpts to pile up, and an `ls /dev/pts` invocation can take seconds to return. Provide always_delete_dentry() from simple_dentry_operations as .d_delete for devpts, to make the dentry be dropped from dcache. Without this cleanup, the number of dentries in /dev/pts/ can be grown arbitrarily as: `python -c 'import pty; pty.spawn(["ls", "/dev/pts"])'` A systemtap probe on dcache_readdir to count d_subdirs shows this count to increase with each pty spawn invocation above: probe kernel.function("dcache_readdir") { subdirs = &@cast($file->f_path->dentry, "dentry")->d_subdirs; p = subdirs; p = @cast(p, "list_head")->next; i = 0 while (p != subdirs) { p = @cast(p, "list_head")->next; i = i+1; } printf("number of dentries: %d\n", i); } Fixes: 8ead9dd5 ("devpts: more pty driver interface cleanups") Signed-off-by: Varad Gautam <vrd@amazon.de> Reported-by: Zheng Wang <wanz@amazon.de> Reported-by: Brandon Schwartz <bsschwar@amazon.de> Root-caused-by: Maximilian Heyne <mheyne@amazon.de> Root-caused-by: Nicolas Pernas Maradei <npernas@amazon.de> CC: David Woodhouse <dwmw@amazon.co.uk> CC: Maximilian Heyne <mheyne@amazon.de> CC: Stefan Nuernberger <snu@amazon.de> CC: Amit Shah <aams@amazon.de> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Al Viro <viro@ZenIV.linux.org.uk> CC: Christian Brauner <christian.brauner@ubuntu.com> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Matthew Wilcox <willy@infradead.org> CC: Eric Biggers <ebiggers@google.com> CC: <stable@vger.kernel.org> # 4.9+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Himanshu Madhani authored
commit ec322937 upstream. This patch fixes LUN discovery when loop ID is not yet assigned by the firmware during driver load/sg_reset operations. Driver will now search for new loop id before retrying login. Fixes: 48acad09 ("scsi: qla2xxx: Fix N2N link re-connect") Cc: stable@vger.kernel.org #4.19 Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bart Van Assche authored
commit 32e36bfb upstream. When using SCSI passthrough in combination with the iSCSI target driver then cmd->t_state_lock may be obtained from interrupt context. Hence, all code that obtains cmd->t_state_lock from thread context must disable interrupts first. This patch avoids that lockdep reports the following: WARNING: inconsistent lock state 4.18.0-dbg+ #1 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. iscsi_ttx/1800 [HC1[1]:SC0[2]:HE0:SE0] takes: 000000006e7b0ceb (&(&cmd->t_state_lock)->rlock){?...}, at: target_complete_cmd+0x47/0x2c0 [target_core_mod] {HARDIRQ-ON-W} state was registered at: lock_acquire+0xd2/0x260 _raw_spin_lock+0x32/0x50 iscsit_close_connection+0x97e/0x1020 [iscsi_target_mod] iscsit_take_action_for_connection_exit+0x108/0x200 [iscsi_target_mod] iscsi_target_rx_thread+0x180/0x190 [iscsi_target_mod] kthread+0x1cf/0x1f0 ret_from_fork+0x24/0x30 irq event stamp: 1281 hardirqs last enabled at (1279): [<ffffffff970ade79>] __local_bh_enable_ip+0xa9/0x160 hardirqs last disabled at (1281): [<ffffffff97a008a5>] interrupt_entry+0xb5/0xd0 softirqs last enabled at (1278): [<ffffffff977cd9a1>] lock_sock_nested+0x51/0xc0 softirqs last disabled at (1280): [<ffffffffc07a6e04>] ip6_finish_output2+0x124/0xe40 [ipv6] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&cmd->t_state_lock)->rlock); <Interrupt> lock(&(&cmd->t_state_lock)->rlock);
-
Martin K. Petersen authored
commit a83da8a4 upstream. It was reported that some devices report an OPTIMAL TRANSFER LENGTH of 0xFFFF blocks. That looks bogus, especially for a device with a 4096-byte physical block size. Ignore OPTIMAL TRANSFER LENGTH if it is not a multiple of the device's reported physical block size. To make the sanity checking conditionals more readable--and to facilitate printing warnings--relocate the checking to a helper function. No functional change aside from the printks. Cc: <stable@vger.kernel.org> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199759Reported-by: Christoph Anton Mitterer <calestyo@scientia.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sagar Biradar authored
commit 0015437c upstream. Fix performance issue where the queue depth for SmartIOC logical volumes is set to 1, and allow the usual logical volume code to be executed Fixes: a052865f (aacraid: Set correct Queue Depth for HBA1000 RAW disks) Cc: stable@vger.kernel.org Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Felipe Franciosi authored
commit 3722e6a5 upstream. The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of device-readable records and a single device-writable response entry: struct virtio_scsi_ctrl_tmf { // Device-readable part le32 type; le32 subtype; u8 lun[8]; le64 id; // Device-writable part u8 response; } The above should be organised as two descriptor entries (or potentially more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64 id" or after "u8 response". The Linux driver doesn't respect that, with virtscsi_abort() and virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf(). It results in the original scsi command payload (or writable buffers) added to the tmf. This fixes the problem by leaving cmd->sc zeroed out, which makes virtscsi_kick_cmd() add the tmf to the control vq without any payload. Cc: stable@vger.kernel.org Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Halil Pasic authored
commit 3438b2c0 upstream. A queue with a capacity of zero is clearly not a valid virtio queue. Some emulators report zero queue size if queried with an invalid queue index. Instead of crashing in this case let us just return -ENOENT. To make that work properly, let us fix the notifier cleanup logic as well. Cc: stable@vger.kernel.org Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Martin Schwidefsky authored
commit 87276384 upstream. The setup_lowcore() function creates a new prefix page for the boot CPU. The PSW mask for the system_call, external interrupt, i/o interrupt and the program check handler have the DAT bit set in this new prefix page. At the time setup_lowcore is called the system still runs without virtual address translation, the paging_init() function creates the kernel page table and loads the CR13 with the kernel ASCE. Any code between setup_lowcore() and the end of paging_init() that has a BUG or WARN statement will create a program check that can not be handled correctly as there is no kernel page table yet. To allow early WARN statements initially setup the lowcore with DAT off and set the DAT bit only after paging_init() has completed. Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Samuel Holland authored
commit c950ca8c upstream. The Allwinner A64 SoC is known[1] to have an unstable architectural timer, which manifests itself most obviously in the time jumping forward a multiple of 95 years[2][3]. This coincides with 2^56 cycles at a timer frequency of 24 MHz, implying that the time went slightly backward (and this was interpreted by the kernel as it jumping forward and wrapping around past the epoch). Investigation revealed instability in the low bits of CNTVCT at the point a high bit rolls over. This leads to power-of-two cycle forward and backward jumps. (Testing shows that forward jumps are about twice as likely as backward jumps.) Since the counter value returns to normal after an indeterminate read, each "jump" really consists of both a forward and backward jump from the software perspective. Unless the kernel is trapping CNTVCT reads, a userspace program is able to read the register in a loop faster than it changes. A test program running on all 4 CPU cores that reported jumps larger than 100 ms was run for 13.6 hours and reported the following: Count | Event -------+--------------------------- 9940 | jumped backward 699ms 268 | jumped backward 1398ms 1 | jumped backward 2097ms 16020 | jumped forward 175ms 6443 | jumped forward 699ms 2976 | jumped forward 1398ms 9 | jumped forward 356516ms 9 | jumped forward 357215ms 4 | jumped forward 714430ms 1 | jumped forward 3578440ms This works out to a jump larger than 100 ms about every 5.5 seconds on each CPU core. The largest jump (almost an hour!) was the following sequence of reads: 0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000 Note that the middle bits don't necessarily all read as all zeroes or all ones during the anomalous behavior; however the low 10 bits checked by the function in this patch have never been observed with any other value. Also note that smaller jumps are much more common, with backward jumps of 2048 (2^11) cycles observed over 400 times per second on each core. (Of course, this is partially explained by lower bits rolling over more frequently.) Any one of these could have caused the 95 year time skip. Similar anomalies were observed while reading CNTPCT (after patching the kernel to allow reads from userspace). However, the CNTPCT jumps are much less frequent, and only small jumps were observed. The same program as before (except now reading CNTPCT) observed after 72 hours: Count | Event -------+--------------------------- 17 | jumped backward 699ms 52 | jumped forward 175ms 2831 | jumped forward 699ms 5 | jumped forward 1398ms Further investigation showed that the instability in CNTPCT/CNTVCT also affected the respective timer's TVAL register. The following values were observed immediately after writing CNVT_TVAL to 0x10000000: CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error --------------------+------------+--------------------+----------------- 0x000000d4a2d8bfff | 0x10003fff | 0x000000d4b2d8bfff | +0x00004000 0x000000d4a2d94000 | 0x0fffffff | 0x000000d4b2d97fff | -0x00004000 0x000000d4a2d97fff | 0x10003fff | 0x000000d4b2d97fff | +0x00004000 0x000000d4a2d9c000 | 0x0fffffff | 0x000000d4b2d9ffff | -0x00004000 The pattern of errors in CNTV_TVAL seemed to depend on exactly which value was written to it. For example, after writing 0x10101010: CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error --------------------+------------+--------------------+----------------- 0x000001ac3effffff | 0x1110100f | 0x000001ac4f10100f | +0x1000000 0x000001ac40000000 | 0x1010100f | 0x000001ac5110100f | -0x1000000 0x000001ac58ffffff | 0x1110100f | 0x000001ac6910100f | +0x1000000 0x000001ac66000000 | 0x1010100f | 0x000001ac7710100f | -0x1000000 0x000001ac6affffff | 0x1110100f | 0x000001ac7b10100f | +0x1000000 0x000001ac6e000000 | 0x1010100f | 0x000001ac7f10100f | -0x1000000 I was also twice able to reproduce the issue covered by Allwinner's workaround[4], that writing to TVAL sometimes fails, and both CVAL and TVAL are left with entirely bogus values. One was the following values: CNTVCT | CNTV_TVAL | CNTV_CVAL --------------------+------------+-------------------------------------- 0x000000d4a2d6014c | 0x8fbd5721 | 0x000000d132935fff (615s in the past) Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> ======================================================================== Because the CPU can read the CNTPCT/CNTVCT registers faster than they change, performing two reads of the register and comparing the high bits (like other workarounds) is not a workable solution. And because the timer can jump both forward and backward, no pair of reads can distinguish a good value from a bad one. The only way to guarantee a good value from consecutive reads would be to read _three_ times, and take the middle value only if the three values are 1) each unique and 2) increasing. This takes at minimum 3 counter cycles (125 ns), or more if an anomaly is detected. However, since there is a distinct pattern to the bad values, we can optimize the common case (1022/1024 of the time) to a single read by simply ignoring values that match the error pattern. This still takes no more than 3 cycles in the worst case, and requires much less code. As an additional safety check, we still limit the loop iteration to the number of max-frequency (1.2 GHz) CPU cycles in three 24 MHz counter periods. For the TVAL registers, the simple solution is to not use them. Instead, read or write the CVAL and calculate the TVAL value in software. Although the manufacturer is aware of at least part of the erratum[4], there is no official name for it. For now, use the kernel-internal name "UNKNOWN1". [1]: https://github.com/armbian/build/commit/a08cd6fe7ae9 [2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/ [3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26 [4]: https://github.com/Allwinner-Homlet/H6-BSP4.9-linux/blob/master/drivers/clocksource/arm_arch_timer.c#L272Acked-by: Maxime Ripard <maxime.ripard@bootlin.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Samuel Holland <samuel@sholland.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stuart Menefy authored
commit d2f276c8 upstream. When shutting down the timer, ensure that after we have stopped the timer any pending interrupts are cleared. This fixes a problem when suspending, as interrupts are disabled before the timer is stopped, so the timer interrupt may still be asserted, preventing the system entering a low power state when the wfi is executed. Signed-off-by: Stuart Menefy <stuart.menefy@mathembedded.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stuart Menefy authored
commit a5719a40 upstream. When a timer tick occurs and the clock is in one-shot mode, the timer needs to be stopped to prevent it triggering subsequent interrupts. Currently this code is in exynos4_mct_tick_clear(), but as it is only needed when an ISR occurs move it into exynos4_mct_tick_isr(), leaving exynos4_mct_tick_clear() just doing what its name suggests it should. Signed-off-by: Stuart Menefy <stuart.menefy@mathembedded.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stuart Menefy authored
commit 28c4f730 upstream. The step values for some of the LDOs appears to be incorrect, resulting in incorrect voltages (or at least, ones which are different from the Samsung 3.4 vendor kernel). Signed-off-by: Stuart Menefy <stuart.menefy@mathembedded.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-