- 30 Nov, 2017 38 commits
-
-
Mikulas Patocka authored
commit 0440d5c0 upstream. When slub_debug is enabled kmalloc returns unaligned memory. XFS uses this unaligned memory for its buffers (if an unaligned buffer crosses a page, XFS frees it and allocates a full page instead - see the function xfs_buf_allocate_memory). dm-crypt checks if bv_offset is aligned on page size and these checks fail with slub_debug and XFS. Fix this bug by removing the bv_offset checks. Switch to checking if bv_len is aligned instead of bv_offset (this check should be sufficient to prevent overruns if a bio with too small bv_len is received). Fixes: 8f0009a2 ("dm crypt: optionally support larger encryption sector size") Reported-by:
Bruno Prémont <bonbons@sysophe.eu> Tested-by:
Bruno Prémont <bonbons@sysophe.eu> Signed-off-by:
Mikulas Patocka <mpatocka@redhat.com> Reviewed-by:
Milan Broz <gmazyland@gmail.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joe Thornber authored
commit d1260e2a upstream. When a DM cache in writeback mode moves data between the slow and fast device it can often avoid a copy if the triggering bio either: i) covers the whole block (no point copying if we're about to overwrite it) ii) the migration is a promotion and the origin block is currently discarded Prior to this fix there was a race with case (ii). The discard status was checked with a shared lock held (rather than exclusive). This meant another bio could run in parallel and write data to the origin, removing the discard state. After the promotion the parallel write would have been lost. With this fix the discard status is re-checked once the exclusive lock has been aquired. If the block is no longer discarded it falls back to the slower full copy path. Fixes: b29d4986 ("dm cache: significant rework to leverage dm-bio-prison-v2") Signed-off-by:
Joe Thornber <ejt@redhat.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mikulas Patocka authored
commit 95b1369a upstream. When slub_debug is enabled kmalloc returns unaligned memory. XFS uses this unaligned memory for its buffers (if an unaligned buffer crosses a page, XFS frees it and allocates a full page instead - see the function xfs_buf_allocate_memory). dm-integrity checks if bv_offset is aligned on page size and this check fail with slub_debug and XFS. Fix this bug by removing the bv_offset check, leaving only the check for bv_len. Fixes: 7eada909 ("dm: add integrity target") Reported-by:
Bruno Prémont <bonbons@sysophe.eu> Reviewed-by:
Milan Broz <gmazyland@gmail.com> Signed-off-by:
Mikulas Patocka <mpatocka@redhat.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vijendar Mukunda authored
commit 9ceace3c upstream. This commit adds PCI ID for Raven platform Signed-off-by:
Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vadim Lomovtsev authored
commit f2ddaf8d upstream. Extend the Cavium ThunderX ACS quirk to cover more device IDs and restrict it to only Root Ports. Signed-off-by:
Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> [bhelgaas: changelog, stable tag] Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vadim Lomovtsev authored
commit 7f342678 upstream. The Cavium ThunderX (CN8XXX) family of PCIe Root Ports does not advertise an ACS capability. However, the RTL internally implements similar protection as if ACS had Request Redirection, Completion Redirection, Source Validation, and Upstream Forwarding features enabled. Change Cavium ACS capabilities quirk flags accordingly. Fixes: b404bcfb ("PCI: Add ACS quirk for all Cavium devices") Signed-off-by:
Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> [bhelgaas: tidy changelog, comment, stable tag] Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dexuan Cui authored
commit 79aa801e upstream. The effective_affinity_mask is always set when an interrupt is assigned in __assign_irq_vector() -> apic->cpu_mask_to_apicid(), e.g. for struct apic apic_physflat: -> default_cpu_mask_to_apicid() -> irq_data_update_effective_affinity(), but it looks d->common->affinity remains all-1's before the user space or the kernel changes it later. In the early allocation/initialization phase of an IRQ, we should use the effective_affinity_mask, otherwise Hyper-V may not deliver the interrupt to the expected CPU. Without the patch, if we assign 7 Mellanox ConnectX-3 VFs to a 32-vCPU VM, one of the VFs may fail to receive interrupts. Tested-by:
Adrian Suhov <v-adsuho@microsoft.com> Signed-off-by:
Dexuan Cui <decui@microsoft.com> Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Reviewed-by:
Jake Oshins <jakeo@microsoft.com> Cc: Jork Loeser <jloeser@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bjorn Helgaas authored
commit c00054f5 upstream. Previously we programmed the LTR_L1.2_THRESHOLD in the parent (upstream) device using the capability pointer of the *child* (downstream) device, which corrupted some random word of the parent's config space. Use the parent's L1 SS capability pointer to program its LTR_L1.2_THRESHOLD. Fixes: aeda9ade ("PCI/ASPM: Configure L1 substate settings") Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Reviewed-by:
Vidya Sagar <vidyas@nvidia.com> CC: Rajat Jain <rajatja@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bjorn Helgaas authored
commit 94ac327e upstream. Every Port that supports the L1.2 substate advertises its Port Common_Mode_Restore_Time, i.e., the time the Port requires to re-establish common mode when exiting L1.2 (see PCIe r3.1, sec 7.33.2). Per sec 5.5.3.3.1, when exiting L1.2, the Downstream Port (the device at the upstream end of the link) must send TS1 training sequences for at least T(COMMONMODE) after it detects electrical idle exit on the Link. We want this to be long enough for both ends of the Link, so we should set it to the maximum of the Port Common_Mode_Restore_Time for the upstream and downstream components on the Link. Previously we only looked at the Port Common_Mode_Restore_Time of the upstream device, so if the downstream device required more time, we didn't program the upstream device's T(COMMONMODE) correctly. Fixes: f1f0366d ("PCI/ASPM: Calculate and save the L1.2 timing parameters") Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Reviewed-by:
Vidya Sagar <vidyas@nvidia.com> Acked-by:
Rajat Jain <rajatja@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tobias Jordan authored
commit 7978db34 upstream. The for_each_available_child_of_node() loop in _of_add_opp_table_v2() doesn't drop the reference to "np" on errors. Fix that. Fixes: 27465902 (PM / OPP: Add support to parse "operating-points-v2" bindings) Signed-off-by:
Tobias Jordan <Tobias.Jordan@elektrobit.com> [ VK: Improved commit log. ] Signed-off-by:
Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by:
Stephen Boyd <sboyd@codeaurora.org> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josef Bacik authored
commit 6a468d59 upstream. We can end up sleeping for a while waiting for the dead timeout, which means we could get the per request timer to fire. We did handle this case, but if the dead timeout happened right after we submitted we'd either tear down the connection or possibly requeue as we're handling an error and race with the endio which can lead to panics and other hilarity. Fixes: 560bc4b3 ("nbd: handle dead connections") Signed-off-by:
Josef Bacik <jbacik@fb.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josef Bacik authored
commit ff57dc94 upstream. If we have a pending signal or the user kills their application then it'll bring down the whole device, which is less than awesome. Instead wait uninterruptible for the dead timeout so we're sure we gave it our best shot. Fixes: 560bc4b3 ("nbd: handle dead connections") Signed-off-by:
Josef Bacik <jbacik@fb.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Simon Guinot authored
commit 0d63785c upstream. The mvneta controller provides a 8-bit register to update the pending Tx descriptor counter. Then, a maximum of 255 Tx descriptors can be added at once. In the current code the mvneta_txq_pend_desc_add function assumes the caller takes care of this limit. But it is not the case. In some situations (xmit_more flag), more than 255 descriptors are added. When this happens, the Tx descriptor counter register is updated with a wrong value, which breaks the whole Tx queue management. This patch fixes the issue by allowing the mvneta_txq_pend_desc_add function to process more than 255 Tx descriptors. Fixes: 2a90f7e1 ("net: mvneta: add xmit_more support") Signed-off-by:
Simon Guinot <simon.guinot@sequanux.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mathias Kresin authored
commit 05a67cc2 upstream. There is a typo inside the pinmux setup code. The function is called refclk and not reclk. Fixes: 53263a1c ("MIPS: ralink: add mt7628an support") Signed-off-by:
Mathias Kresin <dev@kresin.me> Acked-by:
John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/16047/Signed-off-by:
James Hogan <jhogan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mathias Kresin authored
commit 8ef4b43c upstream. According to the datasheet the REFCLK pin is shared with GPIO#37 and the PERST pin is shared with GPIO#36. Fixes: 53263a1c ("MIPS: ralink: add mt7628an support") Signed-off-by:
Mathias Kresin <dev@kresin.me> Acked-by:
John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/16046/Signed-off-by:
James Hogan <jhogan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ben Hutchings authored
commit a3f14310 upstream. __cmpxchg64_local_generic() is atomic only w.r.t tasks and interrupts on the same CPU (that's what the 'local' means). We can't use it to implement cmpxchg64() in SMP configurations. So, for 32-bit SMP configurations: - Don't define cmpxchg64() - Don't enable HAVE_VIRT_CPU_ACCOUNTING_GEN, which requires it Fixes: e2093c7b ("MIPS: Fall back to generic implementation of ...") Fixes: bb877e96 ("MIPS: Add support for full dynticks CPU time accounting") Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Deng-Cheng Zhu <dengcheng.zhu@mips.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17413/Signed-off-by:
James Hogan <jhogan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dmitry V. Levin authored
commit 0eef304b upstream. Consistently use types provided by <linux/types.h> to fix the following linux/rxrpc.h userspace compilation errors: /usr/include/linux/rxrpc.h:24:2: error: unknown type name 'u16' u16 srx_service; /* service desired */ /usr/include/linux/rxrpc.h:25:2: error: unknown type name 'u16' u16 transport_type; /* type of transport socket (SOCK_DGRAM) */ /usr/include/linux/rxrpc.h:26:2: error: unknown type name 'u16' u16 transport_len; /* length of transport address */ Use __kernel_sa_family_t instead of sa_family_t the same way as uapi/linux/in.h does, to fix the following linux/rxrpc.h userspace compilation errors: /usr/include/linux/rxrpc.h:23:2: error: unknown type name 'sa_family_t' sa_family_t srx_family; /* address family */ /usr/include/linux/rxrpc.h:28:3: error: unknown type name 'sa_family_t' sa_family_t family; /* transport address family */ Fixes: 727f8914 ("rxrpc: Expose UAPI definitions to userspace") Signed-off-by:
Dmitry V. Levin <ldv@altlinux.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dmitry V. Levin authored
commit b9f3eb49 upstream. Move inclusion of a private kernel header <net/tcp.h> from uapi/linux/tls.h to its only user - net/tls.h, to fix the following linux/tls.h userspace compilation error: /usr/include/linux/tls.h:41:21: fatal error: net/tcp.h: No such file or directory As to this point uapi/linux/tls.h was totaly unusuable for userspace, cleanup this header file further by moving other redundant includes to net/tls.h. Fixes: 3c4d7559 ("tls: kernel TLS support") Signed-off-by:
Dmitry V. Levin <ldv@altlinux.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Philip Derrin authored
commit 3b0c0c92 upstream. When CONFIG_ARM_LPAE is set, the PMD dump relies on the software read-only bit to determine whether a page is writable. This concealed a bug which left the kernel text section writable (AP2=0) while marked read-only in the software bit. In a kernel with the AP2 bug, the dump looks like this: ---[ Kernel Mapping ]--- 0xc0000000-0xc0200000 2M RW NX SHD 0xc0200000-0xc0600000 4M ro x SHD 0xc0600000-0xc0800000 2M ro NX SHD 0xc0800000-0xc4800000 64M RW NX SHD The fix is to check that the software and hardware bits are both set before displaying "ro". The dump then shows the true perms: ---[ Kernel Mapping ]--- 0xc0000000-0xc0200000 2M RW NX SHD 0xc0200000-0xc0600000 4M RW x SHD 0xc0600000-0xc0800000 2M RW NX SHD 0xc0800000-0xc4800000 64M RW NX SHD Fixes: ded94779 ("ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE") Signed-off-by:
Philip Derrin <philip@cog.systems> Tested-by:
Neil Dick <neil@cog.systems> Reviewed-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Philip Derrin authored
commit 400eeffa upstream. Currently, for ARM kernels with CONFIG_ARM_LPAE and CONFIG_STRICT_KERNEL_RWX enabled, the 2MiB pages mapping the kernel code and rodata are writable. They are marked read-only in a software bit (L_PMD_SECT_RDONLY) but the hardware read-only bit is not set (PMD_SECT_AP2). For user mappings, the logic that propagates the software bit to the hardware bit is in set_pmd_at(); but for the kernel, section_update() writes the PMDs directly, skipping this logic. The fix is to set PMD_SECT_AP2 for read-only sections in section_update(), at the same time as L_PMD_SECT_RDONLY. Fixes: 1e347922 ("ARM: 8275/1: mm: fix PMD_SECT_RDONLY undeclared compile error") Signed-off-by:
Philip Derrin <philip@cog.systems> Reported-by:
Neil Dick <neil@cog.systems> Tested-by:
Neil Dick <neil@cog.systems> Tested-by:
Laura Abbott <labbott@redhat.com> Reviewed-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Catalin Marinas authored
commit 6218f96c upstream. The generic pte_access_permitted() implementation only checks for pte_present() (together with the write permission where applicable). However, for both kernel ptes and PROT_NONE mappings pte_present() also returns true on arm64 even though such mappings are not user accessible. Additionally, arm64 now supports execute-only user permission (PROT_EXEC) which is implemented by clearing the PTE_USER bit. With this patch the arm64 implementation of pte_access_permitted() checks for the PTE_VALID and PTE_USER bits together with writable access if applicable. Reported-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andi Kleen authored
commit 58ba4d5a upstream. 0day testing reported a perf test regression on Haswell systems without RTM. Commit a5df70c3 hides the in_tx/in_tx_cp attributes when RTM is not available, but the TSX events are still available in sysfs. Due to the missing attributes the event parser fails on those files. Don't show the TSX events in sysfs when RTM is not available on Haswell/Broadwell/Skylake. Fixes: a5df70c3 (perf/x86: Only show format attributes when supported) Reported-by:
kernel test robot <xiaolong.ye@intel.com> Tested-by:
Jin Yao <yao.jin@linux.intel.com> Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20171109000718.14137-1-andi@firstfloor.orgSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andy Lutomirski authored
commit ca37e57b upstream. Running this code with IRQs enabled (where dummy_lock is a spinlock): static void check_load_gs_index(void) { /* This will fail. */ load_gs_index(0xffff); spin_lock(&dummy_lock); spin_unlock(&dummy_lock); } Will generate a lockdep warning. The issue is that the actual write to %gs would cause an exception with IRQs disabled, and the exception handler would, as an inadvertent side effect, update irqflag tracing to reflect the IRQs-off status. native_load_gs_index() would then turn IRQs back on and return with irqflag tracing still thinking that IRQs were off. The dummy lock-and-unlock causes lockdep to notice the error and warn. Fix it by adding the missing tracing. Apparently nothing did this in a context where it mattered. I haven't tried to find a code path that would actually exhibit the warning if appropriately nasty user code were running. I suspect that the security impact of this bug is very, very low -- production systems don't run with lockdep enabled, and the warning is mostly harmless anyway. Found during a quick audit of the entry code to try to track down an unrelated bug that Ingo found in some still-in-development code. Signed-off-by:
Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andy Lutomirski authored
commit 548c3050 upstream. When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF before it. This means that users of entry_SYSCALL_64_after_hwframe() were responsible for invoking TRACE_IRQS_OFF, and the one and only user (Xen, added in the same commit) got it wrong. I think this would manifest as a warning if a Xen PV guest with CONFIG_DEBUG_LOCKDEP=y were used with context tracking. (The context tracking bit is to cause lockdep to get invoked before we turn IRQs back on.) I haven't tested that for real yet because I can't get a kernel configured like that to boot at all on Xen PV. Move TRACE_IRQS_OFF below the label. Signed-off-by:
Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 8a9949bc ("x86/xen/64: Rearrange the SYSCALL entries") Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
commit 12a78d43 upstream. The kbuild test robot reported this build warning: Warning: arch/x86/tools/test_get_len found difference at <jump_table>:ffffffff8103dd2c Warning: ffffffff8103dd82: f6 09 d8 testb $0xd8,(%rcx) Warning: objdump says 3 bytes, but insn_get_length() says 2 Warning: decoded and checked 1569014 instructions with 1 warnings This sequence seems to be a new instruction not in the opcode map in the Intel SDM. The instruction sequence is "F6 09 d8", means Group3(F6), MOD(00)REG(001)RM(001), and 0xd8. Intel SDM vol2 A.4 Table A-6 said the table index in the group is "Encoding of Bits 5,4,3 of the ModR/M Byte (bits 2,1,0 in parenthesis)" In that table, opcodes listed by the index REG bits as: 000 001 010 011 100 101 110 111 TEST Ib/Iz,(undefined),NOT,NEG,MUL AL/rAX,IMUL AL/rAX,DIV AL/rAX,IDIV AL/rAX So, it seems TEST Ib is assigned to 001. Add the new pattern. Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tom Lendacky authored
commit ac5292e9 upstream. When crosvm is used to boot a kernel as a VM, the SMP MP-table is found at physical address 0x0. This causes mpf_base to be set to 0 and a subsequent "if (!mpf_base)" check in default_get_smp_config() results in the MP-table not being parsed. Further into the boot this results in an oops when attempting a read_apic_id(). Add a boolean variable that is set to true when the MP-table is found. Use this variable for testing if the MP-table was found so that even a value of 0 for mpf_base will result in continued parsing of the MP-table. Fixes: 5997efb9 ("x86/boot: Use memremap() to map the MPF and MPC data") Reported-by:
Tomeu Vizoso <tomeu@tomeuvizoso.net> Signed-off-by:
Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: regression@leemhuis.info Link: https://lkml.kernel.org/r/20171106201753.23059.86674.stgit@tlendack-t1.amdoffice.netSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit 1d9ddde1 upstream. On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the largest permitted inputs (16384 bits), the kernel spends 10+ seconds doing modular exponentiation in mpi_powm() without rescheduling. If all threads do it, it locks up the system. Moreover, it can cause rcu_sched-stall warnings. Notwithstanding the insanity of doing this calculation in kernel mode rather than in userspace, fix it by calling cond_resched() as each bit from the exponent is processed. It's still noninterruptible, but at least it's preemptible now. Do the cond_resched() once per bit rather than once per MPI limb because each limb might still easily take 100+ milliseconds on slow CPUs. Signed-off-by:
Eric Biggers <ebiggers@google.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul E. McKenney authored
commit 7c2102e5 upstream. The current implementation of synchronize_sched_expedited() incorrectly assumes that resched_cpu() is unconditional, which it is not. This means that synchronize_sched_expedited() can hang when resched_cpu()'s trylock fails as follows (analysis by Neeraj Upadhyay): o CPU1 is waiting for expedited wait to complete: sync_rcu_exp_select_cpus rdp->exp_dynticks_snap & 0x1 // returns 1 for CPU5 IPI sent to CPU5 synchronize_sched_expedited_wait ret = swait_event_timeout(rsp->expedited_wq, sync_rcu_preempt_exp_done(rnp_root), jiffies_stall); expmask = 0x20, CPU 5 in idle path (in cpuidle_enter()) o CPU5 handles IPI and fails to acquire rq lock. Handles IPI sync_sched_exp_handler resched_cpu returns while failing to try lock acquire rq->lock need_resched is not set o CPU5 calls rcu_idle_enter() and as need_resched is not set, goes to idle (schedule() is not called). o CPU 1 reports RCU stall. Given that resched_cpu() is now used only by RCU, this commit fixes the assumption by making resched_cpu() unconditional. Reported-by:
Neeraj Upadhyay <neeraju@codeaurora.org> Suggested-by:
Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johan Hovold authored
commit 08fcee28 upstream. Serdev currently only supports a single slave device, but the required sanity checks to prevent further registration attempts were missing. If a serial-port node has two child nodes with compatible properties, the OF code would try to register two slave devices using the same id and name. Driver core will not allow this (and there will be loud complaints), but the controller's slave pointer would already have been set to address of the soon to be deallocated second struct serdev_device. As the first slave device remains registered, this can lead to later use-after-free issues when the slave callbacks are accessed. Note that while the serdev registration helpers are exported, they are typically only called by serdev core. Any other (out-of-tree) callers must serialise registration and deregistration themselves. Fixes: cd6484e1 ("serdev: Introduce new bus for serial attached devices") Cc: Rob Herring <robh@kernel.org> Signed-off-by:
Johan Hovold <johan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Viresh Kumar authored
commit 07458f6a upstream. 'cached_raw_freq' is used to get the next frequency quickly but should always be in sync with sg_policy->next_freq. There is a case where it is not and in such cases it should be reset to avoid switching to incorrect frequencies. Consider this case for example: - policy->cur is 1.2 GHz (Max) - New request comes for 780 MHz and we store that in cached_raw_freq. - Based on 780 MHz, we calculate the effective frequency as 800 MHz. - We then see the CPU wasn't idle recently and choose to keep the next freq as 1.2 GHz. - Now we have cached_raw_freq is 780 MHz and sg_policy->next_freq is 1.2 GHz. - Now if the utilization doesn't change in then next request, then the next target frequency will still be 780 MHz and it will match with cached_raw_freq. But we will choose 1.2 GHz instead of 800 MHz here. Fixes: b7eaf1aa (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely) Signed-off-by:
Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lv Zheng authored
commit 53c5eaab upstream. Originally the Samsung quirks removed by commit 4c237371 can be covered by commit e923e8e7 and ec_freeze_events=Y mode. But commit 9c40f956 changed ec_freeze_events=Y back to N, making this problem re-surface. Actually, if commit e923e8e7 is robust enough, we can freely change ec_freeze_events mode, so this patch fixes the issue by improving commit e923e8e7. Related commits listed in the merged order: Commit: e923e8e7 Subject: ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled Commit: 4c237371 Subject: ACPI / EC: Remove old CLEAR_ON_RESUME quirk Commit: 9c40f956 Subject: Revert "ACPI / EC: Enable event freeze mode..." to fix a regression This patch not only fixes the reported post-resume EC event triggering source issue, but also fixes an unreported similar issue related to the driver bind by adding EC event triggering source in ec_install_handlers(). Fixes: e923e8e7 (ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled) Fixes: 4c237371 (ACPI / EC: Remove old CLEAR_ON_RESUME quirk) Fixes: 9c40f956 (Revert "ACPI / EC: Enable event freeze mode..." to fix a regression) Link: https://bugzilla.kernel.org/show_bug.cgi?id=196833Signed-off-by:
Lv Zheng <lv.zheng@intel.com> Reported-by:
Alistair Hamilton <ahpatent@gmail.com> Tested-by:
Alistair Hamilton <ahpatent@gmail.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ville Syrjälä authored
commit ff165679 upstream. acpi_remove_pm_notifier() ends up calling flush_workqueue() while holding acpi_pm_notifier_lock, and that same lock is taken by by the work via acpi_pm_notify_handler(). This can deadlock. To fix the problem let's split the single lock into two: one to protect the dev->wakeup between the work vs. add/remove, and another one to handle notifier installation vs. removal. After commit a1d14934 "workqueue/lockdep: 'Fix' flush_work() annotation" I was able to kill the machine (Intel Braswell) very easily with 'powertop --auto-tune', runtime suspending i915, and trying to wake it up via the USB keyboard. The cases when it didn't die are presumably explained by lockdep getting disabled by something else (cpu hotplug locking issues usually). Fortunately I still got a lockdep report over netconsole (trickling in very slowly), even though the machine was otherwise practically dead: [ 112.179806] ====================================================== [ 114.670858] WARNING: possible circular locking dependency detected [ 117.155663] 4.13.0-rc6-bsw-bisect-00169-ga1d14934 #119 Not tainted [ 119.658101] ------------------------------------------------------ [ 121.310242] xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command. [ 121.313294] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead [ 121.313346] xhci_hcd 0000:00:14.0: HC died; cleaning up [ 121.313485] usb 1-6: USB disconnect, device number 3 [ 121.313501] usb 1-6.2: USB disconnect, device number 4 [ 134.747383] kworker/0:2/47 is trying to acquire lock: [ 137.220790] (acpi_pm_notifier_lock){+.+.}, at: [<ffffffff813cafdf>] acpi_pm_notify_handler+0x2f/0x80 [ 139.721524] [ 139.721524] but task is already holding lock: [ 144.672922] ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720 [ 147.184450] [ 147.184450] which lock already depends on the new lock. [ 147.184450] [ 154.604711] [ 154.604711] the existing dependency chain (in reverse order) is: [ 159.447888] [ 159.447888] -> #2 ((&dpc->work)){+.+.}: [ 164.183486] __lock_acquire+0x1255/0x13f0 [ 166.504313] lock_acquire+0xb5/0x210 [ 168.778973] process_one_work+0x1b9/0x720 [ 171.030316] worker_thread+0x4c/0x440 [ 173.257184] kthread+0x154/0x190 [ 175.456143] ret_from_fork+0x27/0x40 [ 177.624348] [ 177.624348] -> #1 ("kacpi_notify"){+.+.}: [ 181.850351] __lock_acquire+0x1255/0x13f0 [ 183.941695] lock_acquire+0xb5/0x210 [ 186.046115] flush_workqueue+0xdd/0x510 [ 190.408153] acpi_os_wait_events_complete+0x31/0x40 [ 192.625303] acpi_remove_notify_handler+0x133/0x188 [ 194.820829] acpi_remove_pm_notifier+0x56/0x90 [ 196.989068] acpi_dev_pm_detach+0x5f/0xa0 [ 199.145866] dev_pm_domain_detach+0x27/0x30 [ 201.285614] i2c_device_probe+0x100/0x210 [ 203.411118] driver_probe_device+0x23e/0x310 [ 205.522425] __driver_attach+0xa3/0xb0 [ 207.634268] bus_for_each_dev+0x69/0xa0 [ 209.714797] driver_attach+0x1e/0x20 [ 211.778258] bus_add_driver+0x1bc/0x230 [ 213.837162] driver_register+0x60/0xe0 [ 215.868162] i2c_register_driver+0x42/0x70 [ 217.869551] 0xffffffffa0172017 [ 219.863009] do_one_initcall+0x45/0x170 [ 221.843863] do_init_module+0x5f/0x204 [ 223.817915] load_module+0x225b/0x29b0 [ 225.757234] SyS_finit_module+0xc6/0xd0 [ 227.661851] do_syscall_64+0x5c/0x120 [ 229.536819] return_from_SYSCALL_64+0x0/0x7a [ 231.392444] [ 231.392444] -> #0 (acpi_pm_notifier_lock){+.+.}: [ 235.124914] check_prev_add+0x44e/0x8a0 [ 237.024795] __lock_acquire+0x1255/0x13f0 [ 238.937351] lock_acquire+0xb5/0x210 [ 240.840799] __mutex_lock+0x75/0x940 [ 242.709517] mutex_lock_nested+0x1c/0x20 [ 244.551478] acpi_pm_notify_handler+0x2f/0x80 [ 246.382052] acpi_ev_notify_dispatch+0x44/0x5c [ 248.194412] acpi_os_execute_deferred+0x14/0x30 [ 250.003925] process_one_work+0x1ec/0x720 [ 251.803191] worker_thread+0x4c/0x440 [ 253.605307] kthread+0x154/0x190 [ 255.387498] ret_from_fork+0x27/0x40 [ 257.153175] [ 257.153175] other info that might help us debug this: [ 257.153175] [ 262.324392] Chain exists of: [ 262.324392] acpi_pm_notifier_lock --> "kacpi_notify" --> (&dpc->work) [ 262.324392] [ 267.391997] Possible unsafe locking scenario: [ 267.391997] [ 270.758262] CPU0 CPU1 [ 272.431713] ---- ---- [ 274.060756] lock((&dpc->work)); [ 275.646532] lock("kacpi_notify"); [ 277.260772] lock((&dpc->work)); [ 278.839146] lock(acpi_pm_notifier_lock); [ 280.391902] [ 280.391902] *** DEADLOCK *** [ 280.391902] [ 284.986385] 2 locks held by kworker/0:2/47: [ 286.524895] #0: ("kacpi_notify"){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720 [ 288.112927] #1: ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720 [ 289.727725] Fixes: c072530f (ACPI / PM: Revork the handling of ACPI device wakeup notifications) Signed-off-by:
Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Gorbik authored
commit b192571d upstream. Current buffer size of 64 is too small. objdump shows that there are instructions which would require up to 75 bytes buffer (with current formating). 128 bytes "ought to be enough for anybody". Also replaces 8 spaces with a single tab to reduce the memory footprint. Fixes the following KASAN finding: BUG: KASAN: stack-out-of-bounds in number+0x3fe/0x538 Write of size 1 at addr 000000005a4a75a0 by task bash/1282 CPU: 1 PID: 1282 Comm: bash Not tainted 4.14.0+ #215 Hardware name: IBM 2964 N96 702 (z/VM 6.4.0) Call Trace: ([<000000000011eeb6>] show_stack+0x56/0x88) [<0000000000e1ce1a>] dump_stack+0x15a/0x1b0 [<00000000004e2994>] print_address_description+0xf4/0x288 [<00000000004e2cf2>] kasan_report+0x13a/0x230 [<0000000000e38ae6>] number+0x3fe/0x538 [<0000000000e3dfe4>] vsnprintf+0x194/0x948 [<0000000000e3ea42>] sprintf+0xa2/0xb8 [<00000000001198dc>] print_insn+0x374/0x500 [<0000000000119346>] show_code+0x4ee/0x538 [<000000000011f234>] show_registers+0x34c/0x388 [<000000000011f2ae>] show_regs+0x3e/0xa8 [<000000000011f502>] die+0x1ea/0x2e8 [<0000000000138f0e>] do_no_context+0x106/0x168 [<0000000000139a1a>] do_protection_exception+0x4da/0x7d0 [<0000000000e55914>] pgm_check_handler+0x16c/0x1c0 [<000000000090639e>] sysrq_handle_crash+0x46/0x58 ([<0000000000000007>] 0x7) [<00000000009073fa>] __handle_sysrq+0x102/0x218 [<0000000000907c06>] write_sysrq_trigger+0xd6/0x100 [<000000000061d67a>] proc_reg_write+0xb2/0x128 [<0000000000520be6>] __vfs_write+0xee/0x368 [<0000000000521222>] vfs_write+0x21a/0x278 [<000000000052156a>] SyS_write+0xda/0x178 [<0000000000e555cc>] system_call+0xc4/0x270 The buggy address belongs to the page: page:000003d1016929c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x0() raw: 0000000000000000 0000000000000000 0000000000000000 ffffffff00000000 raw: 0000000000000100 0000000000000200 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: 000000005a4a7480: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 000000005a4a7500: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00 >000000005a4a7580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 ^ 000000005a4a7600: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f8 000000005a4a7680: f2 f2 f2 f2 f2 f2 f8 f8 f2 f2 f3 f3 f3 f3 00 00 ================================================================== Signed-off-by:
Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Heiko Carstens authored
commit 5c505387 upstream. The e7 opcode table does not have an end marker. Hence when trying to find an unknown e7 instruction the code will access memory behind the table until it finds something that matches the opcode, or the kernel crashes, whatever comes first. This affects not only the in-kernel disassembler but also uprobes and kprobes which refuse to set a probe on unknown instructions, and therefore search the opcode tables to figure out if instructions are known or not. Fixes: 3585cb02 ("s390/disassembler: add vector instructions") Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Heiko Carstens authored
commit fa1edf3f upstream. For PREEMPT enabled kernels the guarded storage (GS) code contains a possible use-after-free bug. If a task that makes use of GS exits, it will execute do_exit() while still enabled for preemption. That function will call exit_thread_runtime_instr() via exit_thread(). If exit_thread_gs() gets preempted after the GS control block of the task has been freed but before the pointer to it is set to NULL, then save_gs_cb(), called from switch_to(), will write to already freed memory. Avoid this and simply disable preemption while freeing the control block and setting the pointer to NULL. Fixes: 916cda1a ("s390: add a system call for guarded storage") Reviewed-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Heiko Carstens authored
commit d6e646ad upstream. For PREEMPT enabled kernels the runtime instrumentation (RI) code contains a possible use-after-free bug. If a task that makes use of RI exits, it will execute do_exit() while still enabled for preemption. That function will call exit_thread_runtime_instr() via exit_thread(). If exit_thread_runtime_instr() gets preempted after the RI control block of the task has been freed but before the pointer to it is set to NULL, then save_ri_cb(), called from switch_to(), will write to already freed memory. Avoid this and simply disable preemption while freeing the control block and setting the pointer to NULL. Fixes: e4b8b3f3 ("s390: add support for runtime instrumentation") Reviewed-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Heiko Carstens authored
commit d0e810ee upstream. Rebooting into a new kernel with kexec fails (system dies) if tried on a machine that has no-execute support. Reason for this is that the so called datamover code gets executed with DAT on (MMU is active) and the page that contains the datamover is marked as non-executable. Therefore when branching into the datamover an unexpected program check happens and afterwards the machine is dead. This can be simply avoided by disabling DAT, which also disables any no-execute checks, just before the datamover gets executed. In fact the first thing done by the datamover is to disable DAT. The code in the datamover that disables DAT can be removed as well. Thanks to Michael Holzheu and Gerald Schaefer for tracking this down. Reviewed-by:
Michael Holzheu <holzheu@linux.vnet.ibm.com> Reviewed-by:
Philipp Rudo <prudo@linux.vnet.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Fixes: 57d7f939 ("s390: add no-execute support") Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Heiko Carstens authored
commit a1c5befc upstream. Dan Horák reported the following crash related to transactional execution: User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000] CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1 Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) task: 00000000fafc8000 task.stack: 00000000fafc4000 User PSW : 0705200180000000 000003ff93c14e70 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3 User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e 0000000000000000 0000000000000002 0000000000000000 000003ff00000000 0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770 000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0 User Code: 000003ff93c14e62: 60e0b030 std %f14,48(%r11) 000003ff93c14e66: 60f0b038 std %f15,56(%r11) #000003ff93c14e6a: e5600000ff0e tbegin 0,65294 >000003ff93c14e70: a7740006 brc 7,3ff93c14e7c 000003ff93c14e74: a7080000 lhi %r0,0 000003ff93c14e78: a7f40023 brc 15,3ff93c14ebe 000003ff93c14e7c: b2220000 ipm %r0 000003ff93c14e80: 8800001c srl %r0,28 There are several bugs with control register handling with respect to transactional execution: - on task switch update_per_regs() is only called if the next task has an mm (is not a kernel thread). This however is incorrect. This breaks e.g. for user mode helper handling, where the kernel creates a kernel thread and then execve's a user space program. Control register contents related to transactional execution won't be updated on execve. If the previous task ran with transactional execution disabled then the new task will also run with transactional execution disabled, which is incorrect. Therefore call update_per_regs() unconditionally within switch_to(). - on startup the transactional execution facility is not enabled for the idle thread. This is not really a bug, but an inconsistency to other facilities. Therefore enable the facility if it is available. - on fork the new thread's per_flags field is not cleared. This means that a child process inherits the PER_FLAG_NO_TE flag. This flag can be set with a ptrace request to disable transactional execution for the current process. It should not be inherited by new child processes in order to be consistent with the handling of all other PER related debugging options. Therefore clear the per_flags field in copy_thread_tls(). Reported-and-tested-by:
Dan Horák <dan@danny.cz> Fixes: d35339a4 ("s390: add support for transactional memory") Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by:
Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 Nov, 2017 2 commits
-
-
Greg Kroah-Hartman authored
-
Corey Minyard authored
commit 7e030d6d upstream. The recent changes to add SMBIOS (DMI) IPMI interfaces as platform devices caused DMI to be selected before ACPI, causing ACPI type of operations to not work. Signed-off-by:
Corey Minyard <cminyard@mvista.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-