1. 06 Feb, 2017 40 commits
    • Russell King's avatar
      ARM: sa1100: clear reset status prior to reboot · 5b4918cc
      Russell King authored
      commit da60626e upstream.
      
      Clear the current reset status prior to rebooting the platform.  This
      adds the bit missing from 04fef228 ("[ARM] pxa: introduce
      reset_status and clear_reset_status for driver's usage").
      
      Fixes: 04fef228 ("[ARM] pxa: introduce reset_status and clear_reset_status for driver's usage")
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5b4918cc
    • Srinivas Ramana's avatar
      ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7 · 1774ca81
      Srinivas Ramana authored
      commit 117e5e9c upstream.
      
      If the bootloader uses the long descriptor format and jumps to
      kernel decompressor code, TTBCR may not be in a right state.
      Before enabling the MMU, it is required to clear the TTBCR.PD0
      field to use TTBR0 for translation table walks.
      
      The commit dbece458 ("ARM: 7501/1: decompressor:
      reset ttbcr for VMSA ARMv7 cores") does the reset of TTBCR.N, but
      doesn't consider all the bits for the size of TTBCR.N.
      
      Clear TTBCR.PD0 field and reset all the three bits of TTBCR.N to
      indicate the use of TTBR0 and the correct base address width.
      
      Fixes: dbece458 ("ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores")
      Acked-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarSrinivas Ramana <sramana@codeaurora.org>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1774ca81
    • Robin Murphy's avatar
      ARM: 8616/1: dt: Respect property size when parsing CPUs · 88654a15
      Robin Murphy authored
      commit ba6dea4f upstream.
      
      Whilst MPIDR values themselves are less than 32 bits, it is still
      perfectly valid for a DT to have #address-cells > 1 in the CPUs node,
      resulting in the "reg" property having leading zero cell(s). In that
      situation, the big-endian nature of the data conspires with the current
      behaviour of only reading the first cell to cause the kernel to think
      all CPUs have ID 0, and become resoundingly unhappy as a consequence.
      
      Take the full property length into account when parsing CPUs so as to
      be correct under any circumstances.
      
      Cc: Russell King <linux@armlinux.org.uk>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      88654a15
    • Baoquan He's avatar
      iommu/amd: Free domain id when free a domain of struct dma_ops_domain · 7a6111b8
      Baoquan He authored
      commit c3db901c upstream.
      
      The current code missed freeing domain id when free a domain of
      struct dma_ops_domain.
      Signed-off-by: default avatarBaoquan He <bhe@redhat.com>
      Fixes: ec487d1a ('x86, AMD IOMMU: add domain allocation and deallocation functions')
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      7a6111b8
    • Joerg Roedel's avatar
      iommu/amd: Update Alias-DTE in update_device_table() · 56eb0df4
      Joerg Roedel authored
      commit 3254de6b upstream.
      
      Not doing so might cause IO-Page-Faults when a device uses
      an alias request-id and the alias-dte is left in a lower
      page-mode which does not cover the address allocated from
      the iova-allocator.
      
      Fixes: 492667da ('x86/amd-iommu: Remove amd_iommu_pd_table')
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      56eb0df4
    • Michael S. Tsirkin's avatar
      x86/um: reuse asm-generic/barrier.h · 69c373d8
      Michael S. Tsirkin authored
      commit 577f183a upstream.
      
      On x86/um CONFIG_SMP is never defined.  As a result, several macros
      match the asm-generic variant exactly. Drop the local definitions and
      pull in asm-generic/barrier.h instead.
      
      This is in preparation to refactoring this code area.
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarRichard Weinberger <richard@nod.at>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      69c373d8
    • H.J. Lu's avatar
      x86/build: Build compressed x86 kernels as PIE · 186c5f34
      H.J. Lu authored
      commit 6d92bc9d upstream.
      
      The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
      relocation to get the symbol address in PIC.  When the compressed x86
      kernel isn't built as PIC, the linker optimizes R_386_GOT32X relocations
      to their fixed symbol addresses.  However, when the compressed x86
      kernel is loaded at a different address, it leads to the following
      load failure:
      
        Failed to allocate space for phdrs
      
      during the decompression stage.
      
      If the compressed x86 kernel is relocatable at run-time, it should be
      compiled with -fPIE, instead of -fPIC, if possible and should be built as
      Position Independent Executable (PIE) so that linker won't optimize
      R_386_GOT32X relocation to its fixed symbol address.
      
      Older linkers generate R_386_32 relocations against locally defined
      symbols, _bss, _ebss, _got and _egot, in PIE.  It isn't wrong, just less
      optimal than R_386_RELATIVE.  But the x86 kernel fails to properly handle
      R_386_32 relocations when relocating the kernel.  To generate
      R_386_RELATIVE relocations, we mark _bss, _ebss, _got and _egot as
      hidden in both 32-bit and 64-bit x86 kernels.
      
      To build a 64-bit compressed x86 kernel as PIE, we need to disable the
      relocation overflow check to avoid relocation overflow errors. We do
      this with a new linker command-line option, -z noreloc-overflow, which
      got added recently:
      
       commit 4c10bbaa0912742322f10d9d5bb630ba4e15dfa7
       Author: H.J. Lu <hjl.tools@gmail.com>
       Date:   Tue Mar 15 11:07:06 2016 -0700
      
          Add -z noreloc-overflow option to x86-64 ld
      
          Add -z noreloc-overflow command-line option to the x86-64 ELF linker to
          disable relocation overflow check.  This can be used to avoid relocation
          overflow check if there will be no dynamic relocation overflow at
          run-time.
      
      The 64-bit compressed x86 kernel is built as PIE only if the linker supports
      -z noreloc-overflow.  So far 64-bit relocatable compressed x86 kernel
      boots fine even when it is built as a normal executable.
      Signed-off-by: default avatarH.J. Lu <hjl.tools@gmail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      [ Edited the changelog and comments. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      186c5f34
    • Steven Rostedt's avatar
      x86/paravirt: Do not trace _paravirt_ident_*() functions · 6523fa8c
      Steven Rostedt authored
      commit 15301a57 upstream.
      
      Łukasz Daniluk reported that on a RHEL kernel that his machine would lock up
      after enabling function tracer. I asked him to bisect the functions within
      available_filter_functions, which he did and it came down to three:
      
        _paravirt_nop(), _paravirt_ident_32() and _paravirt_ident_64()
      
      It was found that this is only an issue when noreplace-paravirt is added
      to the kernel command line.
      
      This means that those functions are most likely called within critical
      sections of the funtion tracer, and must not be traced.
      
      In newer kenels _paravirt_nop() is defined within gcc asm(), and is no
      longer an issue.  But both _paravirt_ident_{32,64}() causes the
      following splat when they are traced:
      
       mm/pgtable-generic.c:33: bad pmd ffff8800d2435150(0000000001d00054)
       mm/pgtable-generic.c:33: bad pmd ffff8800d3624190(0000000001d00070)
       mm/pgtable-generic.c:33: bad pmd ffff8800d36a5110(0000000001d00054)
       mm/pgtable-generic.c:33: bad pmd ffff880118eb1450(0000000001d00054)
       NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [systemd-journal:469]
       Modules linked in: e1000e
       CPU: 2 PID: 469 Comm: systemd-journal Not tainted 4.6.0-rc4-test+ #513
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       task: ffff880118f740c0 ti: ffff8800d4aec000 task.ti: ffff8800d4aec000
       RIP: 0010:[<ffffffff81134148>]  [<ffffffff81134148>] queued_spin_lock_slowpath+0x118/0x1a0
       RSP: 0018:ffff8800d4aefb90  EFLAGS: 00000246
       RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011eb16d40
       RDX: ffffffff82485760 RSI: 000000001f288820 RDI: ffffea0000008030
       RBP: ffff8800d4aefb90 R08: 00000000000c0000 R09: 0000000000000000
       R10: ffffffff821c8e0e R11: 0000000000000000 R12: ffff880000200fb8
       R13: 00007f7a4e3f7000 R14: ffffea000303f600 R15: ffff8800d4b562e0
       FS:  00007f7a4e3d7840(0000) GS:ffff88011eb00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007f7a4e3f7000 CR3: 00000000d3e71000 CR4: 00000000001406e0
       Call Trace:
         _raw_spin_lock+0x27/0x30
         handle_pte_fault+0x13db/0x16b0
         handle_mm_fault+0x312/0x670
         __do_page_fault+0x1b1/0x4e0
         do_page_fault+0x22/0x30
         page_fault+0x28/0x30
         __vfs_read+0x28/0xe0
         vfs_read+0x86/0x130
         SyS_read+0x46/0xa0
         entry_SYSCALL_64_fastpath+0x1e/0xa8
       Code: 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 40 6d 01 00 48 03 14 c5 80 6a 5d 82 48 89 0a 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 8b
      Reported-by: default avatarŁukasz Daniluk <lukasz.daniluk@intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6523fa8c
    • Jiri Kosina's avatar
      x86/mm/pat, /dev/mem: Remove superfluous error message · 1eae225e
      Jiri Kosina authored
      commit 39380b80 upstream.
      
      Currently it's possible for broken (or malicious) userspace to flood a
      kernel log indefinitely with messages a-la
      
      	Program dmidecode tried to access /dev/mem between f0000->100000
      
      because range_is_allowed() is case of CONFIG_STRICT_DEVMEM being turned on
      dumps this information each and every time devmem_is_allowed() fails.
      
      Reportedly userspace that is able to trigger contignuous flow of these
      messages exists.
      
      It would be possible to rate limit this message, but that'd have a
      questionable value; the administrator wouldn't get information about all
      the failing accessess, so then the information would be both superfluous
      and incomplete at the same time :)
      
      Returning EPERM (which is what is actually happening) is enough indication
      for userspace what has happened; no need to log this particular error as
      some sort of special condition.
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1607081137020.24757@cbobk.fhfr.pmSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1eae225e
    • Wanpeng Li's avatar
      x86/apic: Do not init irq remapping if ioapic is disabled · 928a2775
      Wanpeng Li authored
      commit 2e63ad4b upstream.
      
      native_smp_prepare_cpus
        -> default_setup_apic_routing
          -> enable_IR_x2apic
            -> irq_remapping_prepare
              -> intel_prepare_irq_remapping
                -> intel_setup_irq_remapping
      
      So IR table is setup even if "noapic" boot parameter is added. As a result we
      crash later when the interrupt affinity is set due to a half initialized
      remapping infrastructure.
      
      Prevent remap initialization when IOAPIC is disabled.
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Link: http://lkml.kernel.org/r/1471954039-3942-1-git-send-email-wanpeng.li@hotmail.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      928a2775
    • Sebastian Andrzej Siewior's avatar
      x86/mm: Disable preemption during CR3 read+write · b591901e
      Sebastian Andrzej Siewior authored
      commit 5cf0791d upstream.
      
      There's a subtle preemption race on UP kernels:
      
      Usually current->mm (and therefore mm->pgd) stays the same during the
      lifetime of a task so it does not matter if a task gets preempted during
      the read and write of the CR3.
      
      But then, there is this scenario on x86-UP:
      
      TaskA is in do_exit() and exit_mm() sets current->mm = NULL followed by:
      
       -> mmput()
       -> exit_mmap()
       -> tlb_finish_mmu()
       -> tlb_flush_mmu()
       -> tlb_flush_mmu_tlbonly()
       -> tlb_flush()
       -> flush_tlb_mm_range()
       -> __flush_tlb_up()
       -> __flush_tlb()
       ->  __native_flush_tlb()
      
      At this point current->mm is NULL but current->active_mm still points to
      the "old" mm.
      
      Let's preempt taskA _after_ native_read_cr3() by taskB. TaskB has its
      own mm so CR3 has changed.
      
      Now preempt back to taskA. TaskA has no ->mm set so it borrows taskB's
      mm and so CR3 remains unchanged. Once taskA gets active it continues
      where it was interrupted and that means it writes its old CR3 value
      back. Everything is fine because userland won't need its memory
      anymore.
      
      Now the fun part:
      
      Let's preempt taskA one more time and get back to taskB. This
      time switch_mm() won't do a thing because oldmm (->active_mm)
      is the same as mm (as per context_switch()). So we remain
      with a bad CR3 / PGD and return to userland.
      
      The next thing that happens is handle_mm_fault() with an address for
      the execution of its code in userland. handle_mm_fault() realizes that
      it has a PTE with proper rights so it returns doing nothing. But the
      CPU looks at the wrong PGD and insists that something is wrong and
      faults again. And again. And one more time…
      
      This pagefault circle continues until the scheduler gets tired of it and
      puts another task on the CPU. It gets little difficult if the task is a
      RT task with a high priority. The system will either freeze or it gets
      fixed by the software watchdog thread which usually runs at RT-max prio.
      But waiting for the watchdog will increase the latency of the RT task
      which is no good.
      
      Fix this by disabling preemption across the critical code section.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1470404259-26290-1-git-send-email-bigeasy@linutronix.de
      [ Prettified the changelog. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b591901e
    • Andy Lutomirski's avatar
      x86/traps: Ignore high word of regs->cs in early_idt_handler_common · 213060d9
      Andy Lutomirski authored
      This is a backport of:
      commit fc0e81b2 upstream
      
      On the 80486 DX, it seems that some exceptions may leave garbage in
      the high bits of CS.  This causes sporadic failures in which
      early_fixup_exception() refuses to fix up an exception.
      
      As far as I can tell, this has been buggy for a long time, but the
      problem seems to have been exacerbated by commits:
      
        1e02ce4c ("x86: Store a per-cpu shadow copy of CR4")
        e1bfc11c ("x86/init: Fix cr4_init_shadow() on CR4-less machines")
      
      This appears to have broken for as long as we've had early
      exception handling.
      
      [ This backport should apply to kernels from 3.4 - 4.5. ]
      
      Fixes: 4c5023a3 ("x86-32: Handle exception table entries during early boot")
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: stable@vger.kernel.org
      Reported-by: default avatarMatthew Whitehead <tedheadster@gmail.com>
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      213060d9
    • Juergen Gross's avatar
      x86/xen: fix upper bound of pmd loop in xen_cleanhighmap() · 506750b7
      Juergen Gross authored
      commit 1cf38741 upstream.
      
      xen_cleanhighmap() is operating on level2_kernel_pgt only. The upper
      bound of the loop setting non-kernel-image entries to zero should not
      exceed the size of level2_kernel_pgt.
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      506750b7
    • Ben Hutchings's avatar
      xen-pciback: Add name prefix to global 'permissive' variable · f014225a
      Ben Hutchings authored
      commit 8014bcc8 upstream.
      
      The variable for the 'permissive' module parameter used to be static
      but was recently changed to be extern.  This puts it in the kernel
      global namespace if the driver is built-in, so its name should begin
      with a prefix identifying the driver.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Fixes: af6fc858 ("xen-pciback: limit guest control of command register")
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      f014225a
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set. · 1d9d642c
      Konrad Rzeszutek Wilk authored
      commit 408fb0e5 upstream.
      
      commit f598282f ("PCI: Fix the NIU MSI-X problem in a better way")
      teaches us that dealing with MSI-X can be troublesome.
      
      Further checks in the MSI-X architecture shows that if the
      PCI_COMMAND_MEMORY bit is turned of in the PCI_COMMAND we
      may not be able to access the BAR (since they are memory regions).
      
      Since the MSI-X tables are located in there.. that can lead
      to us causing PCIe errors. Inhibit us performing any
      operation on the MSI-X unless the MEMORY bit is set.
      
      Note that Xen hypervisor with:
      "x86/MSI-X: access MSI-X table only after having enabled MSI-X"
      will return:
      xen_pciback: 0000:0a:00.1: error -6 enabling MSI-X for guest 3!
      
      When the generic MSI code tries to setup the PIRQ without
      MEMORY bit set. Which means with later versions of Xen
      (4.6) this patch is not neccessary.
      
      This is part of XSA-157
      Reviewed-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1d9d642c
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled. · 4c2e2fd2
      Konrad Rzeszutek Wilk authored
      commit 7cfb905b upstream.
      
      Otherwise just continue on, returning the same values as
      previously (return of 0, and op->result has the PIRQ value).
      
      This does not change the behavior of XEN_PCI_OP_disable_msi[|x].
      
      The pci_disable_msi or pci_disable_msix have the checks for
      msi_enabled or msix_enabled so they will error out immediately.
      
      However the guest can still call these operations and cause
      us to disable the 'ack_intr'. That means the backend IRQ handler
      for the legacy interrupt will not respond to interrupts anymore.
      
      This will lead to (if the device is causing an interrupt storm)
      for the Linux generic code to disable the interrupt line.
      
      Naturally this will only happen if the device in question
      is plugged in on the motherboard on shared level interrupt GSI.
      
      This is part of XSA-157
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      4c2e2fd2
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: Do not install an IRQ handler for MSI interrupts. · 5701f8a1
      Konrad Rzeszutek Wilk authored
      commit a396f3a2 upstream.
      
      Otherwise an guest can subvert the generic MSI code to trigger
      an BUG_ON condition during MSI interrupt freeing:
      
       for (i = 0; i < entry->nvec_used; i++)
              BUG_ON(irq_has_action(entry->irq + i));
      
      Xen PCI backed installs an IRQ handler (request_irq) for
      the dev->irq whenever the guest writes PCI_COMMAND_MEMORY
      (or PCI_COMMAND_IO) to the PCI_COMMAND register. This is
      done in case the device has legacy interrupts the GSI line
      is shared by the backend devices.
      
      To subvert the backend the guest needs to make the backend
      to change the dev->irq from the GSI to the MSI interrupt line,
      make the backend allocate an interrupt handler, and then command
      the backend to free the MSI interrupt and hit the BUG_ON.
      
      Since the backend only calls 'request_irq' when the guest
      writes to the PCI_COMMAND register the guest needs to call
      XEN_PCI_OP_enable_msi before any other operation. This will
      cause the generic MSI code to setup an MSI entry and
      populate dev->irq with the new PIRQ value.
      
      Then the guest can write to PCI_COMMAND PCI_COMMAND_MEMORY
      and cause the backend to setup an IRQ handler for dev->irq
      (which instead of the GSI value has the MSI pirq). See
      'xen_pcibk_control_isr'.
      
      Then the guest disables the MSI: XEN_PCI_OP_disable_msi
      which ends up triggering the BUG_ON condition in 'free_msi_irqs'
      as there is an IRQ handler for the entry->irq (dev->irq).
      
      Note that this cannot be done using MSI-X as the generic
      code does not over-write dev->irq with the MSI-X PIRQ values.
      
      The patch inhibits setting up the IRQ handler if MSI or
      MSI-X (for symmetry reasons) code had been called successfully.
      
      P.S.
      Xen PCIBack when it sets up the device for the guest consumption
      ends up writting 0 to the PCI_COMMAND (see xen_pcibk_reset_device).
      XSA-120 addendum patch removed that - however when upstreaming said
      addendum we found that it caused issues with qemu upstream. That
      has now been fixed in qemu upstream.
      
      This is part of XSA-157
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5701f8a1
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled · d2dfdbab
      Konrad Rzeszutek Wilk authored
      commit 5e0ce145 upstream.
      
      The guest sequence of:
      
        a) XEN_PCI_OP_enable_msix
        b) XEN_PCI_OP_enable_msix
      
      results in hitting an NULL pointer due to using freed pointers.
      
      The device passed in the guest MUST have MSI-X capability.
      
      The a) constructs and SysFS representation of MSI and MSI groups.
      The b) adds a second set of them but adding in to SysFS fails (duplicate entry).
      'populate_msi_sysfs' frees the newly allocated msi_irq_groups (note that
      in a) pdev->msi_irq_groups is still set) and also free's ALL of the
      MSI-X entries of the device (the ones allocated in step a) and b)).
      
      The unwind code: 'free_msi_irqs' deletes all the entries and tries to
      delete the pdev->msi_irq_groups (which hasn't been set to NULL).
      However the pointers in the SysFS are already freed and we hit an
      NULL pointer further on when 'strlen' is attempted on a freed pointer.
      
      The patch adds a simple check in the XEN_PCI_OP_enable_msix to guard
      against that. The check for msi_enabled is not stricly neccessary.
      
      This is part of XSA-157
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d2dfdbab
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled · 871e37be
      Konrad Rzeszutek Wilk authored
      commit 56441f3c upstream.
      
      The guest sequence of:
      
       a) XEN_PCI_OP_enable_msi
       b) XEN_PCI_OP_enable_msi
       c) XEN_PCI_OP_disable_msi
      
      results in hitting an BUG_ON condition in the msi.c code.
      
      The MSI code uses an dev->msi_list to which it adds MSI entries.
      Under the above conditions an BUG_ON() can be hit. The device
      passed in the guest MUST have MSI capability.
      
      The a) adds the entry to the dev->msi_list and sets msi_enabled.
      The b) adds a second entry but adding in to SysFS fails (duplicate entry)
      and deletes all of the entries from msi_list and returns (with msi_enabled
      is still set).  c) pci_disable_msi passes the msi_enabled checks and hits:
      
      BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
      
      and blows up.
      
      The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard
      against that. The check for msix_enabled is not stricly neccessary.
      
      This is part of XSA-157.
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      871e37be
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: Save the number of MSI-X entries to be copied later. · 28018142
      Konrad Rzeszutek Wilk authored
      commit d159457b upstream.
      
      Commit 8135cf8b (xen/pciback: Save
      xen_pci_op commands before processing it) broke enabling MSI-X because
      it would never copy the resulting vectors into the response.  The
      number of vectors requested was being overwritten by the return value
      (typically zero for success).
      
      Save the number of vectors before processing the op, so the correct
      number of vectors are copied afterwards.
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      28018142
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: Save xen_pci_op commands before processing it · ffb6bca6
      Konrad Rzeszutek Wilk authored
      commit 8135cf8b upstream.
      
      Double fetch vulnerabilities that happen when a variable is
      fetched twice from shared memory but a security check is only
      performed the first time.
      
      The xen_pcibk_do_op function performs a switch statements on the op->cmd
      value which is stored in shared memory. Interestingly this can result
      in a double fetch vulnerability depending on the performed compiler
      optimization.
      
      This patch fixes it by saving the xen_pci_op command before
      processing it. We also use 'barrier' to make sure that the
      compiler does not perform any optimization.
      
      This is part of XSA155.
      Reviewed-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarJan Beulich <JBeulich@suse.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: "Jan Beulich" <JBeulich@suse.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ffb6bca6
    • Roger Pau Monné's avatar
      xen-blkback: only read request operation from shared ring once · d6837064
      Roger Pau Monné authored
      commit 1f13d75c upstream.
      
      A compiler may load a switch statement value multiple times, which could
      be bad when the value is in memory shared with the frontend.
      
      When converting a non-native request to a native one, ensure that
      src->operation is only loaded once by using READ_ONCE().
      
      This is part of XSA155.
      Signed-off-by: default avatarRoger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: "Jan Beulich" <JBeulich@suse.com>
      [wt: s/READ_ONCE/ACCESS_ONCE for 3.10]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d6837064
    • David Vrabel's avatar
      xen-netback: use RING_COPY_REQUEST() throughout · 67434d33
      David Vrabel authored
      commit 68a33bfd upstream.
      
      Instead of open-coding memcpy()s and directly accessing Tx and Rx
      requests, use the new RING_COPY_REQUEST() that ensures the local copy
      is correct.
      
      This is more than is strictly necessary for guest Rx requests since
      only the id and gref fields are used and it is harmless if the
      frontend modifies these.
      
      This is part of XSA155.
      Reviewed-by: default avatarWei Liu <wei.liu2@citrix.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      [wt: adjustments for 3.10 : netbk_rx_meta instead of struct xenvif_rx_meta]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      67434d33
    • David Vrabel's avatar
      xen-netback: don't use last request to determine minimum Tx credit · 85fed7d2
      David Vrabel authored
      commit 0f589967 upstream.
      
      The last from guest transmitted request gives no indication about the
      minimum amount of credit that the guest might need to send a packet
      since the last packet might have been a small one.
      
      Instead allow for the worst case 128 KiB packet.
      
      This is part of XSA155.
      Reviewed-by: default avatarWei Liu <wei.liu2@citrix.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      85fed7d2
    • David Vrabel's avatar
      xen: Add RING_COPY_REQUEST() · 0a5b4e4c
      David Vrabel authored
      commit 454d5d88 upstream.
      
      Using RING_GET_REQUEST() on a shared ring is easy to use incorrectly
      (i.e., by not considering that the other end may alter the data in the
      shared ring while it is being inspected).  Safe usage of a request
      generally requires taking a local copy.
      
      Provide a RING_COPY_REQUEST() macro to use instead of
      RING_GET_REQUEST() and an open-coded memcpy().  This takes care of
      ensuring that the copy is done correctly regardless of any possible
      compiler optimizations.
      
      Use a volatile source to prevent the compiler from reordering or
      omitting the copy.
      
      This is part of XSA155.
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0a5b4e4c
    • Jan Beulich's avatar
      x86/mm/xen: Suppress hugetlbfs in PV guests · 0679df0f
      Jan Beulich authored
      commit 103f6112 upstream.
      
      Huge pages are not normally available to PV guests. Not suppressing
      hugetlbfs use results in an endless loop of page faults when user mode
      code tries to access a hugetlbfs mapped area (since the hypervisor
      denies such PTEs to be created, but error indications can't be
      propagated out of xen_set_pte_at(), just like for various of its
      siblings), and - once killed in an oops like this:
      
        kernel BUG at .../fs/hugetlbfs/inode.c:428!
        invalid opcode: 0000 [#1] SMP
        ...
        RIP: e030:[<ffffffff811c333b>]  [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
        ...
        Call Trace:
         [<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
         [<ffffffff81167b3d>] evict+0xbd/0x1b0
         [<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
         [<ffffffff81165b0e>] dput+0x1fe/0x220
         [<ffffffff81150535>] __fput+0x155/0x200
         [<ffffffff81079fc0>] task_work_run+0x60/0xa0
         [<ffffffff81063510>] do_exit+0x160/0x400
         [<ffffffff810637eb>] do_group_exit+0x3b/0xa0
         [<ffffffff8106e8bd>] get_signal+0x1ed/0x470
         [<ffffffff8100f854>] do_signal+0x14/0x110
         [<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
         [<ffffffff814178a5>] retint_user+0x8/0x13
      
      This is CVE-2016-3961 / XSA-174.
      Reported-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <JGross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: xen-devel <xen-devel@lists.xenproject.org>
      Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0679df0f
    • WANG Cong's avatar
      ppp: defer netns reference release for ppp channel · 068b35e9
      WANG Cong authored
      commit 205e1e25 upstream
      
      Matt reported that we have a NULL pointer dereference
      in ppp_pernet() from ppp_connect_channel(),
      i.e. pch->chan_net is NULL.
      
      This is due to that a parallel ppp_unregister_channel()
      could happen while we are in ppp_connect_channel(), during
      which pch->chan_net set to NULL. Since we need a reference
      to net per channel, it makes sense to sync the refcnt
      with the life time of the channel, therefore we should
      release this reference when we destroy it.
      
      Fixes: 1f461dcd ("ppp: take reference on channels netns")
      Reported-by: default avatarMatt Bennett <Matt.Bennett@alliedtelesis.co.nz>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linux-ppp@vger.kernel.org
      Cc: Guillaume Nault <g.nault@alphalink.fr>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      068b35e9
    • Xiaolong Ye's avatar
      PM / devfreq: Fix incorrect type issue. · 6e5ae963
      Xiaolong Ye authored
      commit 5f25f066 upstream
      
      time_in_state in struct devfreq is defined as unsigned long, so
      devm_kzalloc should use sizeof(unsigned long) as argument instead
      of sizeof(unsigned int), otherwise it will cause unexpected result
      in 64bit system.
      Signed-off-by: default avatarXiaolong Ye <yexl@marvell.com>
      Signed-off-by: default avatarKevin Liu <kliu5@marvell.com>
      Signed-off-by: default avatarMyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6e5ae963
    • Ignacio Alvarado's avatar
      KVM: Disable irq while unregistering user notifier · 01c22882
      Ignacio Alvarado authored
      commit 1650b4eb upstream.
      
      Function user_notifier_unregister should be called only once for each
      registered user notifier.
      
      Function kvm_arch_hardware_disable can be executed from an IPI context
      which could cause a race condition with a VCPU returning to user mode
      and attempting to unregister the notifier.
      Signed-off-by: default avatarIgnacio Alvarado <ikalvarado@google.com>
      Fixes: 18863bdd ("KVM: x86 shared msr infrastructure")
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      01c22882
    • Paolo Bonzini's avatar
      KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr · 014db7f6
      Paolo Bonzini authored
      commit 7301d6ab upstream.
      
      Reported by syzkaller:
      
          [ INFO: suspicious RCU usage. ]
          4.9.0-rc4+ #47 Not tainted
          -------------------------------
          ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
      
          stack backtrace:
          CPU: 1 PID: 6679 Comm: syz-executor Not tainted 4.9.0-rc4+ #47
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
           ffff880039e2f6d0 ffffffff81c2e46b ffff88003e3a5b40 0000000000000000
           0000000000000001 ffffffff83215600 ffff880039e2f700 ffffffff81334ea9
           ffffc9000730b000 0000000000000004 ffff88003c4f8420 ffff88003d3f8000
          Call Trace:
           [<     inline     >] __dump_stack lib/dump_stack.c:15
           [<ffffffff81c2e46b>] dump_stack+0xb3/0x118 lib/dump_stack.c:51
           [<ffffffff81334ea9>] lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4445
           [<     inline     >] __kvm_memslots include/linux/kvm_host.h:534
           [<     inline     >] kvm_memslots include/linux/kvm_host.h:541
           [<ffffffff8105d6ae>] kvm_gfn_to_hva_cache_init+0xa1e/0xce0 virt/kvm/kvm_main.c:1941
           [<ffffffff8112685d>] kvm_lapic_set_vapic_addr+0xed/0x140 arch/x86/kvm/lapic.c:2217
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Fixes: fda4e2e8
      Cc: Andrew Honig <ahonig@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      014db7f6
    • Ido Yariv's avatar
      KVM: x86: fix wbinvd_dirty_mask use-after-free · a3e49e60
      Ido Yariv authored
      commit bd768e14 upstream.
      
      vcpu->arch.wbinvd_dirty_mask may still be used after freeing it,
      corrupting memory. For example, the following call trace may set a bit
      in an already freed cpu mask:
          kvm_arch_vcpu_load
          vcpu_load
          vmx_free_vcpu_nested
          vmx_free_vcpu
          kvm_arch_vcpu_free
      
      Fix this by deferring freeing of wbinvd_dirty_mask.
      Signed-off-by: default avatarIdo Yariv <ido@wizery.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a3e49e60
    • James Hogan's avatar
      KVM: MIPS: Make ERET handle ERL before EXL · 010a6cc4
      James Hogan authored
      commit ede5f3e7 upstream.
      
      The ERET instruction to return from exception is used for returning from
      exception level (Status.EXL) and error level (Status.ERL). If both bits
      are set however we should be returning from ERL first, as ERL can
      interrupt EXL, for example when an NMI is taken. KVM however checks EXL
      first.
      
      Fix the order of the checks to match the pseudocode in the instruction
      set manual.
      
      Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      010a6cc4
    • Radim Krčmář's avatar
      KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write · a0753a8c
      Radim Kr�má� authored
      commit dccbfcf5 upstream.
      
      If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the
      write with vmcs02 as the current VMCS.
      This will incorrectly apply modifications intended for vmcs01 to vmcs02
      and L2 can use it to gain access to L0's x2APIC registers by disabling
      virtualized x2APIC while using msr bitmap that assumes enabled.
      
      Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the
      current VMCS.  An alternative solution would temporarily make vmcs01 the
      current VMCS, but it requires more care.
      
      Fixes: 8d14695f ("x86, apicv: add virtual x2apic support")
      Reported-by: default avatarJim Mattson <jmattson@google.com>
      Reviewed-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a0753a8c
    • James Hogan's avatar
      KVM: MIPS: Drop other CPU ASIDs on guest MMU changes · 4c7055f8
      James Hogan authored
      commit 91e4f1b6 upstream.
      
      When a guest TLB entry is replaced by TLBWI or TLBWR, we only invalidate
      TLB entries on the local CPU. This doesn't work correctly on an SMP host
      when the guest is migrated to a different physical CPU, as it could pick
      up stale TLB mappings from the last time the vCPU ran on that physical
      CPU.
      
      Therefore invalidate both user and kernel host ASIDs on other CPUs,
      which will cause new ASIDs to be generated when it next runs on those
      CPUs.
      
      We're careful only to do this if the TLB entry was already valid, and
      only for the kernel ASID where the virtual address it mapped is outside
      of the guest user address range.
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Cc: Jiri Slaby <jslaby@suse.cz>
      [james.hogan@imgtec.com: Backport to 3.10..3.16]
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      4c7055f8
    • James Hogan's avatar
      KVM: MIPS: Precalculate MMIO load resume PC · 1d3b3d31
      James Hogan authored
      commit e1e575f6 upstream.
      
      The advancing of the PC when completing an MMIO load is done before
      re-entering the guest, i.e. before restoring the guest ASID. However if
      the load is in a branch delay slot it may need to access guest code to
      read the prior branch instruction. This isn't safe in TLB mapped code at
      the moment, nor in the future when we'll access unmapped guest segments
      using direct user accessors too, as it could read the branch from host
      user memory instead.
      
      Therefore calculate the resume PC in advance while we're still in the
      right context and save it in the new vcpu->arch.io_pc (replacing the no
      longer needed vcpu->arch.pending_load_cause), and restore it on MMIO
      completion.
      
      Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-3.16.x: 5f508c43: MIPS: KVM: Fix unused variable build warning
      Cc: <stable@vger.kernel.org> # 3.10.x-3.16.x
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [james.hogan@imgtec.com: Backport to 3.10..3.16]
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1d3b3d31
    • Nicholas Mc Guire's avatar
      MIPS: KVM: Fix unused variable build warning · 63b0fa17
      Nicholas Mc Guire authored
      commit 5f508c43 upstream.
      
      As kvm_mips_complete_mmio_load() did not yet modify PC at this point
      as James Hogans <james.hogan@imgtec.com> explained the curr_pc variable
      and the comments along with it can be dropped.
      Signed-off-by: default avatarNicholas Mc Guire <hofrat@osadl.org>
      Link: http://lkml.org/lkml/2015/5/8/422
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/9993/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      [james.hogan@imgtec.com: Backport to 3.10..3.16]
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      63b0fa17
    • Ondrej Mosnáček's avatar
      crypto: gcm - Fix IV buffer size in crypto_gcm_setkey · 4d57910e
      Ondrej Mosná�ek authored
      commit 50d2e6dc upstream.
      
      The cipher block size for GCM is 16 bytes, and thus the CTR transform
      used in crypto_gcm_setkey() will also expect a 16-byte IV. However,
      the code currently reserves only 8 bytes for the IV, causing
      an out-of-bounds access in the CTR transform. This patch fixes
      the issue by setting the size of the IV buffer to 16 bytes.
      
      Fixes: 84c91152 ("[CRYPTO] gcm: Add support for async ciphers")
      Signed-off-by: default avatarOndrej Mosnacek <omosnacek@gmail.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      4d57910e
    • Herbert Xu's avatar
      crypto: skcipher - Fix blkcipher walk OOM crash · 0b3918fc
      Herbert Xu authored
      commit acdb04d0 upstream.
      
      When we need to allocate a temporary blkcipher_walk_next and it
      fails, the code is supposed to take the slow path of processing
      the data block by block.  However, due to an unrelated change
      we instead end up dereferencing the NULL pointer.
      
      This patch fixes it by moving the unrelated bsize setting out
      of the way so that we enter the slow path as inteded.
      
      Fixes: 7607bd8f ("[CRYPTO] blkcipher: Added blkcipher_walk_virt_block")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarxiakaixu <xiakaixu@huawei.com>
      Reported-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Tested-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0b3918fc
    • Ard Biesheuvel's avatar
      crypto: cryptd - initialize child shash_desc on import · 6ff69ad9
      Ard Biesheuvel authored
      commit 0bd22235 upstream.
      
      When calling .import() on a cryptd ahash_request, the structure members
      that describe the child transform in the shash_desc need to be initialized
      like they are when calling .init()
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6ff69ad9
    • Herbert Xu's avatar
      crypto: algif_skcipher - Load TX SG list after waiting · ca46046c
      Herbert Xu authored
      commit 4f0414e5 upstream.
      
      We need to load the TX SG list in sendmsg(2) after waiting for
      incoming data, not before.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ca46046c