- 01 Oct, 2019 15 commits
-
-
Bandan Das authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit bae3a8d3 upstream. Legacy apic init uses bigsmp for smp systems with 8 and more CPUs. The bigsmp APIC implementation uses physical destination mode, but it nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with multiple bit being set. This does not cause a functional problem because LDR and DFR are ignored when physical destination mode is active, but it triggered a problem on a 32-bit KVM guest which jumps into a kdump kernel. The multiple bits set unearthed a bug in the KVM APIC implementation. The code which creates the logical destination map for VCPUs ignores the disabled state of the APIC and ends up overwriting an existing valid entry and as a result, APIC calibration hangs in the guest during kdump initialization. Remove the bogus LDR/DFR initialization. This is not intended to work around the KVM APIC bug. The LDR/DFR ininitalization is wrong on its own. The issue goes back into the pre git history. The fixes tag is the commit in the bitkeeper import which introduced bigsmp support in 2003. git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: db7b9e9f ("[PATCH] Clustered APIC setup for >8 CPU systems") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190826101513.5080-2-bsd@redhat.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Sean Christopherson authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 75ee23b3 upstream. Don't advance RIP or inject a single-step #DB if emulation signals a fault. This logic applies to all state updates that are conditional on clean retirement of the emulation instruction, e.g. updating RFLAGS was previously handled by commit 38827dbd ("KVM: x86: Do not update EFLAGS on faulting emulation"). Not advancing RIP is likely a nop, i.e. ctxt->eip isn't updated with ctxt->_eip until emulation "retires" anyways. Skipping #DB injection fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to invalid state with EFLAGS.TF=1 would loop indefinitely due to emulation overwriting the #UD with #DB and thus restarting the bad SYSCALL over and over. Cc: Nadav Amit <nadav.amit@gmail.com> Cc: stable@vger.kernel.org Reported-by: Andy Lutomirski <luto@kernel.org> Fixes: 663f4c61 ("KVM: x86: handle singlestep during emulation") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Takashi Iwai authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 75545304 upstream. The input pool of a client might be deleted via the resize ioctl, the the access to it should be covered by the proper locks. Currently the only missing place is the call in snd_seq_ioctl_get_client_pool(), and this patch papers over it. Reported-by: syzbot+4a75454b9ca2777f35c7@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Eric Dumazet authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit ef8d8ccd ] As Jason Baron explained in commit 790ba456 ("tcp: set SOCK_NOSPACE under memory pressure"), it is crucial we properly set SOCK_NOSPACE when needed. However, Jason patch had a bug, because the 'nonblocking' status as far as sk_stream_wait_memory() is concerned is governed by MSG_DONTWAIT flag passed at sendmsg() time : long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(), and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE cleared, if sk->sk_sndtimeo has been set to a small (but not zero) value. This patch removes the 'noblock' variable since we must always set SOCK_NOSPACE if -EAGAIN is returned. It also renames the do_nonblock label since we might reach this code path even if we were in blocking mode. Fixes: 790ba456 ("tcp: set SOCK_NOSPACE under memory pressure") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Reported-by: Vladimir Rutsky <rutsky@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Jason Baron <jbaron@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Hui Peng authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit daac0715 upstream. The `uac_mixer_unit_descriptor` shown as below is read from the device side. In `parse_audio_mixer_unit`, `baSourceID` field is accessed from index 0 to `bNrInPins` - 1, the current implementation assumes that descriptor is always valid (the length of descriptor is no shorter than 5 + `bNrInPins`). If a descriptor read from the device side is invalid, it may trigger out-of-bound memory access. ``` struct uac_mixer_unit_descriptor { __u8 bLength; __u8 bDescriptorType; __u8 bDescriptorSubtype; __u8 bUnitID; __u8 bNrInPins; __u8 baSourceID[]; } ``` This patch fixes the bug by add a sanity check on the length of the descriptor. Reported-by: Hui Peng <benquike@gmail.com> Reported-by: Mathias Payer <mathias.payer@nebelwelt.net> Cc: <stable@vger.kernel.org> Signed-off-by: Hui Peng <benquike@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Hui Peng authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 19bce474 upstream. `check_input_term` recursively calls itself with input from device side (e.g., uac_input_terminal_descriptor.bCSourceID) as argument (id). In `check_input_term`, if `check_input_term` is called with the same `id` argument as the caller, it triggers endless recursive call, resulting kernel space stack overflow. This patch fixes the bug by adding a bitmap to `struct mixer_build` to keep track of the checked ids and stop the execution if some id has been checked (similar to how parse_audio_unit handles unitid argument). Reported-by: Hui Peng <benquike@gmail.com> Reported-by: Mathias Payer <mathias.payer@nebelwelt.net> Signed-off-by: Hui Peng <benquike@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Tim Froidcoeur authored
BugLink: https://bugs.launchpad.net/bugs/1845036 Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()") triggers following stack trace: [25244.848046] kernel BUG at ./include/linux/skbuff.h:1406! [25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc [25244.888167] Call Trace: [25244.889182] <IRQ> [25244.890001] tcp_fragment+0x9c/0x2cf [25244.891295] tcp_write_xmit+0x68f/0x988 [25244.892732] __tcp_push_pending_frames+0x3b/0xa0 [25244.894347] tcp_data_snd_check+0x2a/0xc8 [25244.895775] tcp_rcv_established+0x2a8/0x30d [25244.897282] tcp_v4_do_rcv+0xb2/0x158 [25244.898666] tcp_v4_rcv+0x692/0x956 [25244.899959] ip_local_deliver_finish+0xeb/0x169 [25244.901547] __netif_receive_skb_core+0x51c/0x582 [25244.903193] ? inet_gro_receive+0x239/0x247 [25244.904756] netif_receive_skb_internal+0xab/0xc6 [25244.906395] napi_gro_receive+0x8a/0xc0 [25244.907760] receive_buf+0x9a1/0x9cd [25244.909160] ? load_balance+0x17a/0x7b7 [25244.910536] ? vring_unmap_one+0x18/0x61 [25244.911932] ? detach_buf+0x60/0xfa [25244.913234] virtnet_poll+0x128/0x1e1 [25244.914607] net_rx_action+0x12a/0x2b1 [25244.915953] __do_softirq+0x11c/0x26b [25244.917269] ? handle_irq_event+0x44/0x56 [25244.918695] irq_exit+0x61/0xa0 [25244.919947] do_IRQ+0x9d/0xbb [25244.921065] common_interrupt+0x85/0x85 [25244.922479] </IRQ> tcp_rtx_queue_tail() (called by tcp_fragment()) can call tcp_write_queue_prev() on the first packet in the queue, which will trigger the BUG in tcp_write_queue_prev(), because there is no previous packet. This happens when the retransmit queue is empty, for example in case of a zero window. Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()") was not a simple cherry-pick of the original one from master (b617158d) because there is a specific TCP rtx queue only since v4.15. For more details, please see the commit message of b617158d ("tcp: be more careful in tcp_fragment()"). The BUG() is hit due to the specific code added to versions older than v4.15. The comment in skb_queue_prev() (include/linux/skbuff.h:1406), just before the BUG_ON() somehow suggests to add a check before using it, what Tim did. In master, this code path causing the issue will not be taken because the implementation of tcp_rtx_queue_tail() is different: tcp_fragment() → tcp_rtx_queue_tail() → tcp_write_queue_prev() → skb_queue_prev() → BUG_ON() Fixes: 8c3088f895a0 ("tcp: be more careful in tcp_fragment()") Signed-off-by: Tim Froidcoeur <tim.froidcoeur@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Stefan Wahren authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 215e06f0 ] The commit 5e6acc3e ("bcm2835-pm: Move bcm2835-watchdog's DT probe to an MFD.") broke module autoloading on Raspberry Pi. So add a module alias this fix this. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Adrian Vladu authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit b0995156 ] HyperV KVP and VSS daemons should exit with 0 when the '--help' or '-h' flags are used. Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Cc: Alessandro Pilotti <apilotti@cloudbasesolutions.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Hans Ulli Kroll authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 77775888 ] On the Gemini SoC the FOTG2 stalls after port reset so restart the HCD after each port reset. Signed-off-by: Hans Ulli Kroll <ulli.kroll@googlemail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20190810150458.817-1-linus.walleij@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Benjamin Herrenschmidt authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 602fda17 ] In some cases, one can get out of suspend with a reset or a disconnect followed by a reconnect. Previously we would leave a stale suspended flag set. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Arnd Bergmann authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 5d6fb560 ] clang-9 points out that there are two variables that depending on the configuration may only be used in an ARRAY_SIZE() expression but not referenced: drivers/dma/ste_dma40.c:145:12: error: variable 'd40_backup_regs' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration] static u32 d40_backup_regs[] = { ^ drivers/dma/ste_dma40.c:214:12: error: variable 'd40_backup_regs_chan' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration] static u32 d40_backup_regs_chan[] = { Mark these __maybe_unused to shut up the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20190712091357.744515-1-arnd@arndb.deSigned-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Adrian Hunter authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 7c7cfdcf ] Fix the following BUG: [ 187.065689] BUG: kernel NULL pointer dereference, address: 000000000000001c [ 187.065790] RIP: 0010:ufshcd_vreg_set_hpm+0x3c/0x110 [ufshcd_core] [ 187.065938] Call Trace: [ 187.065959] ufshcd_resume+0x72/0x290 [ufshcd_core] [ 187.065980] ufshcd_system_resume+0x54/0x140 [ufshcd_core] [ 187.065993] ? pci_pm_restore+0xb0/0xb0 [ 187.066005] ufshcd_pci_resume+0x15/0x20 [ufshcd_pci] [ 187.066017] pci_pm_thaw+0x4c/0x90 [ 187.066030] dpm_run_callback+0x5b/0x150 [ 187.066043] device_resume+0x11b/0x220 Voltage regulators are optional, so functions must check they exist before dereferencing. Note this issue is hidden if CONFIG_REGULATORS is not set, because the offending code is optimised away. Notes for stable: The issue first appears in commit 57d104c1 ("ufs: add UFS power management support") but is inadvertently fixed in commit 60f01870 ("scsi: ufs: disable vccq if it's not needed by UFS device") which in turn was reverted by commit 73067981 ("Revert "scsi: ufs: disable vccq if it's not needed by UFS device""). So fix applies v3.18 to v4.5 and v5.1+ Fixes: 57d104c1 ("ufs: add UFS power management support") Fixes: 73067981 ("Revert "scsi: ufs: disable vccq if it's not needed by UFS device"") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Tom Lendacky authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit c49a0a80 ] There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly. RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND support using CPUID, including the kernel, will believe that RDRAND is not supported. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue or that the system will not be placed in suspend, the "rdrand=force" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit. Additionally, update the suspend and resume path to save and restore the MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in place after resuming from suspend. Note, that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chen Yu <yu.c.chen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org> Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "x86@kernel.org" <x86@kernel.org> Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com [sl: adjust context in docs] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Chen Yu authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 7a9c2dd0 ] A bug was reported that on certain Broadwell platforms, after resuming from S3, the CPU is running at an anomalously low speed. It turns out that the BIOS has modified the value of the THERM_CONTROL register during S3, and changed it from 0 to 0x10, thus enabled clock modulation(bit4), but with undefined CPU Duty Cycle(bit1:3) - which causes the problem. Here is a simple scenario to reproduce the issue: 1. Boot up the system 2. Get MSR 0x19a, it should be 0 3. Put the system into sleep, then wake it up 4. Get MSR 0x19a, it shows 0x10, while it should be 0 Although some BIOSen want to change the CPU Duty Cycle during S3, in our case we don't want the BIOS to do any modification. Fix this issue by introducing a more generic x86 framework to save/restore specified MSR registers(THERM_CONTROL in this case) for suspend/resume. This allows us to fix similar bugs in a much simpler way in the future. When the kernel wants to protect certain MSRs during suspending, we simply add a quirk entry in msr_save_dmi_table, and customize the MSR registers inside the quirk callback, for example: u32 msr_id_need_to_save[] = {MSR_ID0, MSR_ID1, MSR_ID2...}; and the quirk mechanism ensures that, once resumed from suspend, the MSRs indicated by these IDs will be restored to their original, pre-suspend values. Since both 64-bit and 32-bit kernels are affected, this patch covers the common 64/32-bit suspend/resume code path. And because the MSRs specified by the user might not be available or readable in any situation, we use rdmsrl_safe() to safely save these MSRs. Reported-and-tested-by: Marcin Kaszewski <marcin.kaszewski@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@suse.de Cc: len.brown@intel.com Cc: linux@horizon.com Cc: luto@kernel.org Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/c9abdcbc173dd2f57e8990e304376f19287e92ba.1448382971.git.yu.c.chen@intel.com [ More edits to the naming of data structures. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
- 26 Sep, 2019 25 commits
-
-
Dirk Morris authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 656c8e9c upstream. Change ct id hash calculation to only use invariants. Currently the ct id hash calculation is based on some fields that can change in the lifetime on a conntrack entry in some corner cases. The current hash uses the whole tuple which contains an hlist pointer which will change when the conntrack is placed on the dying list resulting in a ct id change. This patch also removes the reply-side tuple and extension pointer from the hash calculation so that the ct id will will not change from initialization until confirmation. Fixes: 3c791076 ("netfilter: ctnetlink: don't use conntrack/expect object addresses as id") Signed-off-by: Dirk Morris <dmorris@metaloft.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Florian Westphal authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 3c791076 upstream. else, we leak the addresses to userspace via ctnetlink events and dumps. Compute an ID on demand based on the immutable parts of nf_conn struct. Another advantage compared to using an address is that there is no immediate re-use of the same ID in case the conntrack entry is freed and reallocated again immediately. Fixes: 35832402 ("[NETFILTER]: nf_conntrack_expect: kill unique ID") Fixes: 7f85f914 ("[NETFILTER]: nf_conntrack: kill unique ID") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jason A. Donenfeld authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 1ae2324f upstream. HalfSipHash, or hsiphash, is a shortened version of SipHash, which generates 32-bit outputs using a weaker 64-bit key. It has *much* lower security margins, and shouldn't be used for anything too sensitive, but it could be used as a hashtable key function replacement, if the output is never exposed, and if the security requirement is not too high. The goal is to make this something that performance-critical jhash users would be willing to use. On 64-bit machines, HalfSipHash1-3 is slower than SipHash1-3, so we alias SipHash1-3 to HalfSipHash1-3 on those systems. 64-bit x86_64: [ 0.509409] test_siphash: SipHash2-4 cycles: 4049181 [ 0.510650] test_siphash: SipHash1-3 cycles: 2512884 [ 0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920 [ 0.512904] test_siphash: JenkinsHash cycles: 978267 So, we map hsiphash() -> SipHash1-3 32-bit x86: [ 0.509868] test_siphash: SipHash2-4 cycles: 14812892 [ 0.513601] test_siphash: SipHash1-3 cycles: 9510710 [ 0.515263] test_siphash: HalfSipHash1-3 cycles: 3856157 [ 0.515952] test_siphash: JenkinsHash cycles: 1148567 So, we map hsiphash() -> HalfSipHash1-3 hsiphash() is roughly 3 times slower than jhash(), but comes with a considerable security improvement. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.4 to avoid regression for WireGuard with only half the siphash API present] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Alexander Kochetkov authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit c278c253 upstream. There is a race between arc_emac_tx() and arc_emac_tx_clean(). sk_buff got freed by arc_emac_tx_clean() while arc_emac_tx() submitting sk_buff. In order to free sk_buff arc_emac_tx_clean() checks: if ((info & FOR_EMAC) || !txbd->data) break; ... dev_kfree_skb_irq(skb); If condition false, arc_emac_tx_clean() free sk_buff. In order to submit txbd, arc_emac_tx() do: priv->tx_buff[*txbd_curr].skb = skb; ... priv->txbd[*txbd_curr].data = cpu_to_le32(addr); ... ... <== arc_emac_tx_clean() check condition here ... <== (info & FOR_EMAC) is false ... <== !txbd->data is false ... *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len); In order to reproduce the situation, run device: # iperf -s run on host: # iperf -t 600 -c <device-ip-addr> [ 28.396284] ------------[ cut here ]------------ [ 28.400912] kernel BUG at .../net/core/skbuff.c:1355! [ 28.414019] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 28.419150] Modules linked in: [ 28.422219] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B 4.4.0+ #120 [ 28.429516] Hardware name: Rockchip (Device Tree) [ 28.434216] task: c0665070 ti: c0660000 task.ti: c0660000 [ 28.439622] PC is at skb_put+0x10/0x54 [ 28.443381] LR is at arc_emac_poll+0x260/0x474 [ 28.447821] pc : [<c03af580>] lr : [<c028fec4>] psr: a0070113 [ 28.447821] sp : c0661e58 ip : eea68502 fp : ef377000 [ 28.459280] r10: 0000012c r9 : f08b2000 r8 : eeb57100 [ 28.464498] r7 : 00000000 r6 : ef376594 r5 : 00000077 r4 : ef376000 [ 28.471015] r3 : 0030488b r2 : ef13e880 r1 : 000005ee r0 : eeb57100 [ 28.477534] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 28.484658] Control: 10c5387d Table: 8eaf004a DAC: 00000051 [ 28.490396] Process swapper/0 (pid: 0, stack limit = 0xc0660210) [ 28.496393] Stack: (0xc0661e58 to 0xc0662000) [ 28.500745] 1e40: 00000002 00000000 [ 28.508913] 1e60: 00000000 ef376520 00000028 f08b23b8 00000000 ef376520 ef7b6900 c028fc64 [ 28.517082] 1e80: 2f158000 c0661ea8 c0661eb0 0000012c c065e900 c03bdeac ffff95e9 c0662100 [ 28.525250] 1ea0: c0663924 00000028 c0661ea8 c0661ea8 c0661eb0 c0661eb0 0000001e c0660000 [ 28.533417] 1ec0: 40000003 00000008 c0695a00 0000000a c066208c 00000100 c0661ee0 c0027410 [ 28.541584] 1ee0: ef0fb700 2f158000 00200000 ffff95e8 00000004 c0662100 c0662080 00000003 [ 28.549751] 1f00: 00000000 00000000 00000000 c065b45c 0000001e ef005000 c0647a30 00000000 [ 28.557919] 1f20: 00000000 c0027798 00000000 c005cf40 f0802100 c0662ffc c0661f60 f0803100 [ 28.566088] 1f40: c0661fb8 c00093bc c000ffb4 60070013 ffffffff c0661f94 c0661fb8 c00137d4 [ 28.574267] 1f60: 00000001 00000000 00000000 c001ffa0 00000000 c0660000 00000000 c065a364 [ 28.582441] 1f80: c0661fb8 c0647a30 00000000 00000000 00000000 c0661fb0 c000ffb0 c000ffb4 [ 28.590608] 1fa0: 60070013 ffffffff 00000051 00000000 00000000 c005496c c0662400 c061bc40 [ 28.598776] 1fc0: ffffffff ffffffff 00000000 c061b680 00000000 c0647a30 00000000 c0695294 [ 28.606943] 1fe0: c0662488 c0647a2c c066619c 6000406a 413fc090 6000807c 00000000 00000000 [ 28.615127] [<c03af580>] (skb_put) from [<ef376520>] (0xef376520) [ 28.621218] Code: e5902054 e590c090 e3520000 0a000000 (e7f001f2) [ 28.627307] ---[ end trace 4824734e2243fdb6 ]--- [ 34.377068] Internal error: Oops: 17 [#1] SMP ARM [ 34.382854] Modules linked in: [ 34.385947] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.4.0+ #120 [ 34.392219] Hardware name: Rockchip (Device Tree) [ 34.396937] task: ef02d040 ti: ef05c000 task.ti: ef05c000 [ 34.402376] PC is at __dev_kfree_skb_irq+0x4/0x80 [ 34.407121] LR is at arc_emac_poll+0x130/0x474 [ 34.411583] pc : [<c03bb640>] lr : [<c028fd94>] psr: 60030013 [ 34.411583] sp : ef05de68 ip : 0008e83c fp : ef377000 [ 34.423062] r10: c001bec4 r9 : 00000000 r8 : f08b24c8 [ 34.428296] r7 : f08b2400 r6 : 00000075 r5 : 00000019 r4 : ef376000 [ 34.434827] r3 : 00060000 r2 : 00000042 r1 : 00000001 r0 : 00000000 [ 34.441365] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 34.448507] Control: 10c5387d Table: 8f25c04a DAC: 00000051 [ 34.454262] Process ksoftirqd/0 (pid: 3, stack limit = 0xef05c210) [ 34.460449] Stack: (0xef05de68 to 0xef05e000) [ 34.464827] de60: ef376000 c028fd94 00000000 c0669480 c0669480 ef376520 [ 34.473022] de80: 00000028 00000001 00002ae4 ef376520 ef7b6900 c028fc64 2f158000 ef05dec0 [ 34.481215] dea0: ef05dec8 0000012c c065e900 c03bdeac ffff983f c0662100 c0663924 00000028 [ 34.489409] dec0: ef05dec0 ef05dec0 ef05dec8 ef05dec8 ef7b6000 ef05c000 40000003 00000008 [ 34.497600] dee0: c0695a00 0000000a c066208c 00000100 ef05def8 c0027410 ef7b6000 40000000 [ 34.505795] df00: 04208040 ffff983e 00000004 c0662100 c0662080 00000003 ef05c000 ef027340 [ 34.513985] df20: ef05c000 c0666c2c 00000000 00000001 00000002 00000000 00000000 c0027568 [ 34.522176] df40: ef027340 c003ef48 ef027300 00000000 ef027340 c003edd4 00000000 00000000 [ 34.530367] df60: 00000000 c003c37c ffffff7f 00000001 00000000 ef027340 00000000 00030003 [ 34.538559] df80: ef05df80 ef05df80 00000000 00000000 ef05df90 ef05df90 ef05dfac ef027300 [ 34.546750] dfa0: c003c2a4 00000000 00000000 c000f578 00000000 00000000 00000000 00000000 [ 34.554939] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 34.563129] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff dfff7fff [ 34.571360] [<c03bb640>] (__dev_kfree_skb_irq) from [<c028fd94>] (arc_emac_poll+0x130/0x474) [ 34.579840] [<c028fd94>] (arc_emac_poll) from [<c03bdeac>] (net_rx_action+0xdc/0x28c) [ 34.587712] [<c03bdeac>] (net_rx_action) from [<c0027410>] (__do_softirq+0xcc/0x1f8) [ 34.595482] [<c0027410>] (__do_softirq) from [<c0027568>] (run_ksoftirqd+0x2c/0x50) [ 34.603168] [<c0027568>] (run_ksoftirqd) from [<c003ef48>] (smpboot_thread_fn+0x174/0x18c) [ 34.611466] [<c003ef48>] (smpboot_thread_fn) from [<c003c37c>] (kthread+0xd8/0xec) [ 34.619075] [<c003c37c>] (kthread) from [<c000f578>] (ret_from_fork+0x14/0x3c) [ 34.626317] Code: e8bd8010 e3a00000 e12fff1e e92d4010 (e59030a4) [ 34.632572] ---[ end trace cca5a3d86a82249a ]--- Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Daniel Bristot de Oliveira authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 82d6489d upstream. While testing the deadline scheduler + cgroup setup I hit this warning. [ 132.612935] ------------[ cut here ]------------ [ 132.612951] WARNING: CPU: 5 PID: 0 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x80 [ 132.612952] Modules linked in: (a ton of modules...) [ 132.612981] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc2 #2 [ 132.612981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014 [ 132.612982] 0000000000000086 45c8bb5effdd088b ffff88013fd43da0 ffffffff813d229e [ 132.612984] 0000000000000000 0000000000000000 ffff88013fd43de0 ffffffff810a652b [ 132.612985] 00000096811387b5 0000000000000200 ffff8800bab29d80 ffff880034c54c00 [ 132.612986] Call Trace: [ 132.612987] <IRQ> [<ffffffff813d229e>] dump_stack+0x63/0x85 [ 132.612994] [<ffffffff810a652b>] __warn+0xcb/0xf0 [ 132.612997] [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170 [ 132.612999] [<ffffffff810a665d>] warn_slowpath_null+0x1d/0x20 [ 132.613000] [<ffffffff810aba5b>] __local_bh_enable_ip+0x6b/0x80 [ 132.613008] [<ffffffff817d6c8a>] _raw_write_unlock_bh+0x1a/0x20 [ 132.613010] [<ffffffff817d6c9e>] _raw_spin_unlock_bh+0xe/0x10 [ 132.613015] [<ffffffff811388ac>] put_css_set+0x5c/0x60 [ 132.613016] [<ffffffff8113dc7f>] cgroup_free+0x7f/0xa0 [ 132.613017] [<ffffffff810a3912>] __put_task_struct+0x42/0x140 [ 132.613018] [<ffffffff810e776a>] dl_task_timer+0xca/0x250 [ 132.613027] [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170 [ 132.613030] [<ffffffff8111371e>] __hrtimer_run_queues+0xee/0x270 [ 132.613031] [<ffffffff81113ec8>] hrtimer_interrupt+0xa8/0x190 [ 132.613034] [<ffffffff81051a58>] local_apic_timer_interrupt+0x38/0x60 [ 132.613035] [<ffffffff817d9b0d>] smp_apic_timer_interrupt+0x3d/0x50 [ 132.613037] [<ffffffff817d7c5c>] apic_timer_interrupt+0x8c/0xa0 [ 132.613038] <EOI> [<ffffffff81063466>] ? native_safe_halt+0x6/0x10 [ 132.613043] [<ffffffff81037a4e>] default_idle+0x1e/0xd0 [ 132.613044] [<ffffffff810381cf>] arch_cpu_idle+0xf/0x20 [ 132.613046] [<ffffffff810e8fda>] default_idle_call+0x2a/0x40 [ 132.613047] [<ffffffff810e92d7>] cpu_startup_entry+0x2e7/0x340 [ 132.613048] [<ffffffff81050235>] start_secondary+0x155/0x190 [ 132.613049] ---[ end trace f91934d162ce9977 ]--- The warn is the spin_(lock|unlock)_bh(&css_set_lock) in the interrupt context. Converting the spin_lock_bh to spin_lock_irq(save) to avoid this problem - and other problems of sharing a spinlock with an interrupt. Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: cgroups@vger.kernel.org Cc: stable@vger.kernel.org # 4.5+ Cc: linux-kernel@vger.kernel.org Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Acked-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Zubin Mithra <zsm@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Mikulas Patocka authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 1cfd5d33 upstream. If the sector number is too high, dm_table_find_target() should return a pointer to a zeroed dm_target structure (the caller should test it with dm_target_is_valid). However, for some table sizes, the code in dm_table_find_target() that performs btree lookup will access out of bound memory structures. Fix this bug by testing the sector number at the beginning of dm_table_find_target(). Also, add an "inline" keyword to the function dm_table_get_size() because this is a hot path. Fixes: 512875bd ("dm: table detect io beyond device") Cc: stable@vger.kernel.org Reported-by: Zhang Tao <kontais@zoho.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
ZhangXiaoxu authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit ae148243 upstream. In commit 6096d91a ("dm space map metadata: fix occasional leak of a metadata block on resize"), we refactor the commit logic to a new function 'apply_bops'. But when that logic was replaced in out() the return value was not stored. This may lead out() returning a wrong value to the caller. Fixes: 6096d91a ("dm space map metadata: fix occasional leak of a metadata block on resize") Cc: stable@vger.kernel.org Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
ZhangXiaoxu authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit e4f9d601 upstream. When btree_split_beneath() splits a node to two new children, it will allocate two blocks: left and right. If right block's allocation failed, the left block will be unlocked and marked dirty. If this happened, the left block'ss content is zero, because it wasn't initialized with the btree struct before the attempot to allocate the right block. Upon return, when flushing the left block to disk, the validator will fail when check this block. Then a BUG_ON is raised. Fix this by completely initializing the left block before allocating and initializing the right block. Fixes: 4dcb8b57 ("dm btree: fix leak of bufio-backed block in btree_split_beneath error path") Cc: stable@vger.kernel.org Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
John Hubbard authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 7846f58f upstream. commit a90118c4 ("x86/boot: Save fields explicitly, zero out everything else") had two errors: * It preserved boot_params.acpi_rsdp_addr, and * It failed to preserve boot_params.hdr Therefore, zero out acpi_rsdp_addr, and preserve hdr. Fixes: a90118c4 ("x86/boot: Save fields explicitly, zero out everything else") Reported-by: Neil MacLeod <neil@nmacleod.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Neil MacLeod <neil@nmacleod.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190821192513.20126-1-jhubbard@nvidia.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
John Hubbard authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit a90118c4 upstream. Recent gcc compilers (gcc 9.1) generate warnings about an out of bounds memset, if the memset goes accross several fields of a struct. This generated a couple of warnings on x86_64 builds in sanitize_boot_params(). Fix this by explicitly saving the fields in struct boot_params that are intended to be preserved, and zeroing all the rest. [ tglx: Tagged for stable as it breaks the warning free build there as well ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190731054627.5627-2-jhubbard@nvidia.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Thomas Gleixner authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit f897e60a upstream. Some newer machines do not advertise legacy timers. The kernel can handle that situation if the TSC and the CPU frequency are enumerated by CPUID or MSRs and the CPU supports TSC deadline timer. If the CPU does not support TSC deadline timer the local APIC timer frequency has to be known as well. Some Ryzens machines do not advertize legacy timers, but there is no reliable way to determine the bus frequency which feeds the local APIC timer when the machine allows overclocking of that frequency. As there is no legacy timer the local APIC timer calibration crashes due to a NULL pointer dereference when accessing the not installed global clock event device. Switch the calibration loop to a non interrupt based one, which polls either TSC (if frequency is known) or jiffies. The latter requires a global clockevent. As the machines which do not have a global clockevent installed have a known TSC frequency this is a non issue. For older machines where TSC frequency is not known, there is no known case where the legacy timers do not exist as that would have been reported long ago. Reported-by: Daniel Drake <drake@endlessm.com> Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Daniel Drake <drake@endlessm.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908091443030.21433@nanos.tec.linutronix.de Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Sean Christopherson authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit b63f20a7 upstream. Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to avoid clobbering flags. KVM's emulator makes indirect calls into a jump table of sorts, where the destination of the CALL_NOSPEC is a small blob of code that performs fast emulation by executing the target instruction with fixed operands. adcb_al_dl: 0x000339f8 <+0>: adc %dl,%al 0x000339fa <+2>: ret A major motiviation for doing fast emulation is to leverage the CPU to handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is both an input and output to the target of CALL_NOSPEC. Clobbering flags results in all sorts of incorrect emulation, e.g. Jcc instructions often take the wrong path. Sans the nops... asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n" 0x0003595a <+58>: mov 0xc0(%ebx),%eax 0x00035960 <+64>: mov 0x60(%ebx),%edx 0x00035963 <+67>: mov 0x90(%ebx),%ecx 0x00035969 <+73>: push %edi 0x0003596a <+74>: popf 0x0003596b <+75>: call *%esi 0x000359a0 <+128>: pushf 0x000359a1 <+129>: pop %edi 0x000359a2 <+130>: mov %eax,0xc0(%ebx) 0x000359b1 <+145>: mov %edx,0x60(%ebx) ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); 0x000359a8 <+136>: mov -0x10(%ebp),%eax 0x000359ab <+139>: and $0x8d5,%edi 0x000359b4 <+148>: and $0xfffff72a,%eax 0x000359b9 <+153>: or %eax,%edi 0x000359bd <+157>: mov %edi,0x4(%ebx) For the most part this has gone unnoticed as emulation of guest code that can trigger fast emulation is effectively limited to MMIO when running on modern hardware, and MMIO is rarely, if ever, accessed by instructions that affect or consume flags. Breakage is almost instantaneous when running with unrestricted guest disabled, in which case KVM must emulate all instructions when the guest has invalid state, e.g. when the guest is in Big Real Mode during early BIOS. Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support") Fixes: 1a29b5b7 ("KVM: x86: Make indirect calls in emulator speculation safe") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Oleg Nesterov authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit 46d0b24c upstream. userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if mm->core_state != NULL. Otherwise a page fault can see userfaultfd_missing() == T and use an already freed userfaultfd_ctx. Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com Fixes: 04f5866e ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Peter Xu <peterx@redhat.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Mikulas Patocka authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit cf3591ef upstream. Revert the commit bd293d07. The proper fix has been made available with commit d0a255e7 ("loop: set PF_MEMALLOC_NOIO for the worker thread"). Note that the fix offered by commit bd293d07 doesn't really prevent the deadlock from occuring - if we look at the stacktrace reported by Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex - i.e. it has already successfully taken the mutex. Changing the mutex from mutex_lock to mutex_trylock won't help with deadlocks that happen afterwards. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 #1 [ffff8813dedfb990] schedule at ffffffff8173fa27 #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f #13 [ffff8813dedfbec0] kthread at ffffffff810a8428 #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: bd293d07 ("dm bufio: fix deadlock with loop device") Depends-on: d0a255e7 ("loop: set PF_MEMALLOC_NOIO for the worker thread") Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Aaron Armstrong Skomra authored
BugLink: https://bugs.launchpad.net/bugs/1845036 commit fcf887e7 upstream. The EKR ring claims a range of 0 to 71 but actually reports values 1 to 72. The ring is used in relative mode so this change should not affect users. Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com> Fixes: 72b236d6 ("HID: wacom: Add support for Express Key Remote.") Cc: <stable@vger.kernel.org> # v4.3+ Reviewed-by: Ping Cheng <ping.cheng@wacom.com> Reviewed-by: Jason Gerecke <jason.gerecke@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Naresh Kamboju authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit c096397c ] selftests kvm test cases need pre-required kernel configs for the test to get pass. Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jens Axboe authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 752ead44 ] Abort processing of a command if we run out of mapped data in the SG list. This should never happen, but a previous bug caused it to be possible. Play it safe and attempt to abort nicely if we don't have more SG segments left. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jiangfeng Xiao authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 96a50c0d ] On the arm64 platform, executing "ifconfig eth0 up" will fail, returning "ifconfig: SIOCSIFFLAGS: Input/output error." ndev->dev is not initialized, dma_map_single->get_dma_ops-> dummy_dma_ops->__dummy_map_page will return DMA_ERROR_CODE directly, so when we use dma_map_single, the first parameter is to use the device of platform_device. Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jiangfeng Xiao authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit f2243b82 ] TX_DESC_NUM is 256, in tx_count, the maximum value of mod(TX_DESC_NUM - 1) is 254, the variable "count" in the hip04_mac_start_xmit function is never equal to (TX_DESC_NUM - 1), so hip04_mac_start_xmit never return NETDEV_TX_BUSY. tx_count is modified to mod(TX_DESC_NUM) so that the maximum value of tx_count can reach (TX_DESC_NUM - 1), then hip04_mac_start_xmit can reurn NETDEV_TX_BUSY. Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jiangfeng Xiao authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 1a2c070a ] If hip04_tx_reclaim is interrupted while it is running and then __napi_schedule continues to execute hip04_rx_poll->hip04_tx_reclaim, reentrancy occurs and oops is generated. So you need to mask the interrupt during the hip04_tx_reclaim run. The kernel oops exception stack is as follows: Unable to handle kernel NULL pointer dereference at virtual address 00000050 pgd = c0003000 [00000050] *pgd=80000000a04003, *pmd=00000000 Internal error: Oops: 206 [#1] SMP ARM Modules linked in: hip04_eth mtdblock mtd_blkdevs mtd ohci_platform ehci_platform ohci_hcd ehci_hcd vfat fat sd_mod usb_storage scsi_mod usbcore usb_common CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.4.185 #1 Hardware name: Hisilicon A15 task: c0a250e0 task.stack: c0a00000 PC is at hip04_tx_reclaim+0xe0/0x17c [hip04_eth] LR is at hip04_tx_reclaim+0x30/0x17c [hip04_eth] pc : [<bf30c3a4>] lr : [<bf30c2f4>] psr: 600e0313 sp : c0a01d88 ip : 00000000 fp : c0601f9c r10: 00000000 r9 : c3482380 r8 : 00000001 r7 : 00000000 r6 : 000000e1 r5 : c3482000 r4 : 0000000c r3 : f2209800 r2 : 00000000 r1 : 00000000 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 32c5387d Table: 03d28c80 DAC: 55555555 Process swapper/0 (pid: 0, stack limit = 0xc0a00190) Stack: (0xc0a01d88 to 0xc0a02000) [<bf30c3a4>] (hip04_tx_reclaim [hip04_eth]) from [<bf30d2e0>] (hip04_rx_poll+0x88/0x368 [hip04_eth]) [<bf30d2e0>] (hip04_rx_poll [hip04_eth]) from [<c04c2d9c>] (net_rx_action+0x114/0x34c) [<c04c2d9c>] (net_rx_action) from [<c021eed8>] (__do_softirq+0x218/0x318) [<c021eed8>] (__do_softirq) from [<c021f284>] (irq_exit+0x88/0xac) [<c021f284>] (irq_exit) from [<c0240090>] (msa_irq_exit+0x11c/0x1d4) [<c0240090>] (msa_irq_exit) from [<c02677e0>] (__handle_domain_irq+0x110/0x148) [<c02677e0>] (__handle_domain_irq) from [<c0201588>] (gic_handle_irq+0xd4/0x118) [<c0201588>] (gic_handle_irq) from [<c0551700>] (__irq_svc+0x40/0x58) Exception stack(0xc0a01f30 to 0xc0a01f78) 1f20: c0ae8b40 00000000 00000000 00000000 1f40: 00000002 ffffe000 c0601f9c 00000000 ffffffff c0a2257c c0a22440 c0831a38 1f60: c0a01ec4 c0a01f80 c0203714 c0203718 600e0213 ffffffff [<c0551700>] (__irq_svc) from [<c0203718>] (arch_cpu_idle+0x20/0x3c) [<c0203718>] (arch_cpu_idle) from [<c025bfd8>] (cpu_startup_entry+0x244/0x29c) [<c025bfd8>] (cpu_startup_entry) from [<c054b0d8>] (rest_init+0xc8/0x10c) [<c054b0d8>] (rest_init) from [<c0800c58>] (start_kernel+0x468/0x514) Code: a40599e5 016086e2 018088e2 7660efe6 (503090e5) ---[ end trace 1db21d6d09c49d74 ]--- Kernel panic - not syncing: Fatal exception in interrupt CPU3: stopping CPU: 3 PID: 0 Comm: swapper/3 Tainted: G D O 4.4.185 #1 Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Christophe JAILLET authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit debea2cd ] A call to 'kfree_skb()' is missing in the error handling path of 'init_one()'. This is already present in 'remove_one()' but is missing here. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Trond Myklebust authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit c77e2283 ] John Hubbard reports seeing the following stack trace: nfs4_do_reclaim rcu_read_lock /* we are now in_atomic() and must not sleep */ nfs4_purge_state_owners nfs4_free_state_owner nfs4_destroy_seqid_counter rpc_destroy_wait_queue cancel_delayed_work_sync __cancel_work_timer __flush_work start_flush_work might_sleep: (kernel/workqueue.c:2975: BUG) The solution is to separate out the freeing of the state owners from nfs4_purge_state_owners(), and perform that outside the atomic context. Reported-by: John Hubbard <jhubbard@nvidia.com> Fixes: 0aaaf5c4 ("NFS: Cache state owners after files are closed") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Wang Xiayang authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit e787f193 ] strncpy() does not ensure NULL-termination when the input string size equals to the destination buffer size IFNAMSIZ. The output string is passed to dev_info() which relies on the NULL-termination. Use strlcpy() instead. This issue is identified by a Coccinelle script. Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Wang Xiayang authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit cd28aa2e ] strncpy() does not ensure NULL-termination when the input string size equals to the destination buffer size IFNAMSIZ. The output string 'name' is passed to dev_info which relies on NULL-termination. Use strlcpy() instead. This issue is identified by a Coccinelle script. Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jiri Olsa authored
BugLink: https://bugs.launchpad.net/bugs/1845036 [ Upstream commit 6bbfe4e6 ] Michael reported an issue with perf bench numa failing with binding to cpu0 with '-0' option. # perf bench numa mem -p 3 -t 1 -P 512 -s 100 -zZcm0 --thp 1 -M 1 -ddd # Running 'numa/mem' benchmark: # Running main, "perf bench numa numa-mem -p 3 -t 1 -P 512 -s 100 -zZcm0 --thp 1 -M 1 -ddd" binding to node 0, mask: 0000000000000001 => -1 perf: bench/numa.c:356: bind_to_memnode: Assertion `!(ret)' failed. Aborted (core dumped) This happens when the cpu0 is not part of node0, which is the benchmark assumption and we can see that's not the case for some powerpc servers. Using correct node for cpu0 binding. Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190801142642.28004-1-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-