1. 06 Feb, 2017 40 commits
    • frv: fix clear_user() · 233d4b17
      Al Viro authored
      commit 3b8767a8 upstream.
      
      It should check access_ok().  Otherwise a bunch of places turn into
      trivially exploitable rootholes.
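
A minimal user-space sketch of the idea, not the frv code itself: the clearing helper refuses to touch memory unless a range check passes. `fake_user_mem`, `fake_access_ok()` and `fake_clear_user()` are hypothetical stand-ins for the user address space, access_ok() and clear_user().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the user address space: one flat buffer. */
static unsigned char fake_user_mem[64];

/* Hypothetical range check playing the role of access_ok(). */
static int fake_access_ok(const void *ptr, size_t len)
{
    uintptr_t start = (uintptr_t)fake_user_mem;
    uintptr_t p = (uintptr_t)ptr;

    return p >= start && len <= sizeof(fake_user_mem) &&
           p - start <= sizeof(fake_user_mem) - len;
}

/* Returns the number of bytes that could NOT be cleared. */
static size_t fake_clear_user(void *to, size_t n)
{
    if (!fake_access_ok(to, n))
        return n;           /* refuse to write outside the allowed range */
    memset(to, 0, n);
    return 0;
}
```

Without the check, a caller-supplied pointer outside the allowed range would be zeroed unconditionally, which is the hole the commit describes.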
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • asm-generic: make get_user() clear the destination on errors · 44ccf7f1
      Al Viro authored
      commit 9ad18b75 upstream.
      
      both for access_ok() failures and for faults halfway through
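
The intended behaviour can be sketched in user space as follows. This is an illustration only: `fake_user_word` and `fake_get_user_u32()` are hypothetical names, and the "fault" is simulated by rejecting every address except one.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* The only address this sketch treats as a valid "user" location. */
static uint32_t fake_user_word = 0x12345678;

/* Hypothetical get_user()-style helper: 0 on success, -EFAULT on failure. */
static int fake_get_user_u32(uint32_t *dst, const uint32_t *uptr)
{
    if (uptr != &fake_user_word) {
        *dst = 0;           /* never leave the caller's stale value in place */
        return -EFAULT;
    }
    memcpy(dst, uptr, sizeof(*dst));
    return 0;
}
```

A caller that ignores the return value then sees zero rather than whatever happened to be in the destination before the call.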
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • ARC: uaccess: get_user to zero out dest in cause of fault · df375880
      Vineet Gupta authored
      commit 05d9d0b9 upstream.
      
      Al reported a potential issue with ARC get_user(): it wasn't clearing
      the destination in case of a fault due to a bad address etc.
      
      Verified using the following:
      
      | {
      |	u32 bogus1 = 0xdeadbeef;
      |	u64 bogus2 = 0xdead;
      |	int rc1, rc2;
      |
      |	pr_info("Orig values %x %llx\n", bogus1, bogus2);
      |	rc1 = get_user(bogus1, (u32 __user *)0x40000000);
      |	rc2 = get_user(bogus2, (u64 __user *)0x50000000);
      |	pr_info("access %d %d, new values %x %llx\n",
      |		rc1, rc2, bogus1, bogus2);
      | }
      
      | [ARCLinux]# insmod /mnt/kernel-module/qtn.ko
      | Orig values deadbeef dead
      | access -14 -14, new values 0 0
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • s390: get_user() should zero on failure · ed98892e
      Al Viro authored
      commit fd2d2b19 upstream.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • score: fix __get_user/get_user · 60f0190e
      Al Viro authored
      commit c2f18fa4 upstream.
      
      * should zero on any failure
      * __get_user() should use __copy_from_user(), not copy_from_user()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • sh64: failing __get_user() should zero · 643d0a2d
      Al Viro authored
      commit c6852389 upstream.
      
      It could be done in exception-handling bits in __get_user_b() et al.,
      but the surgery involved would take more knowledge of sh64 details
      than I have or _want_ to have.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • m32r: fix __get_user() · 322dab0d
      Al Viro authored
      commit c90a3bc5 upstream.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • mn10300: failing __get_user() and get_user() should zero · 7d84a5d5
      Al Viro authored
      commit 43403eab upstream.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • microblaze: fix copy_from_user() · 22e232d6
      Al Viro authored
      commit d0cf3851 upstream.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      [wt: s/might_fault/might_sleep]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • microblaze: fix __get_user() · 90f2278f
      Al Viro authored
      commit e98b9e37 upstream.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • parisc: Ensure consistent state when switching to kernel stack at syscall entry · 61f2d84a
      John David Anglin authored
      commit 6ed51832 upstream.
      
      We have one critical section in the syscall entry path in which we switch from
      the userspace stack to kernel stack. In the event of an external interrupt, the
      interrupt code distinguishes between those two states by analyzing the value of
      sr7. If sr7 is zero, it uses the kernel stack. It is therefore important that
      the value of sr7 is in sync with the currently enabled stack.
      
      This patch now disables interrupts while executing the critical section.  This
      prevents the interrupt handler from seeing an inconsistent state, which in
      the worst case can lead to crashes.
      
      Interestingly, in the syscall exit path interrupts were already disabled in the
      critical section which switches back to the userspace stack.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • s390/dasd: fix hanging device after clear subchannel · 2fe6f38f
      Stefan Haberland authored
      commit 9ba333dc upstream.
      
      When a device is in a state where CIO has killed all I/O by itself, the
      interrupt for a clear request may not contain an irb to determine the
      clear function. Instead it contains an error pointer -EIO.
      This was ignored by the DASD int_handler, leading to a hanging device
      waiting for a clear interrupt.
      
      Handle -EIO error pointer correctly for requests that are clear pending and
      treat the clear as successful.
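
As a rough user-space illustration of the error-pointer convention involved: the three helpers below mirror the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() idiom, while `struct fake_request` and `fake_int_handler()` are hypothetical and do not reflect the real DASD data structures.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space copies of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical request state, for illustration only. */
struct fake_request {
    int clear_pending;
    int done;
};

/* Returns 1 if the clear was completed, 0 for a normal irb, -1 if ignored. */
static int fake_int_handler(struct fake_request *req, const void *irb)
{
    if (IS_ERR(irb)) {
        if (PTR_ERR(irb) == -EIO && req->clear_pending) {
            req->clear_pending = 0;
            req->done = 1;      /* treat the clear as successful */
            return 1;
        }
        return -1;              /* other error pointers are not handled here */
    }
    return 0;                   /* a real irb: normal processing would follow */
}
```

The point of the sketch is only that the -EIO case is recognised instead of being dropped, so a request waiting for the clear does not wait forever.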
      Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
      Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • avr32: off by one in at32_init_pio() · cba31cb9
      Dan Carpenter authored
      commit 55f1cf83 upstream.
      
      The pio_dev[] array has MAX_NR_PIO_DEVICES elements so the > should be
      >=.
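
A small sketch of the bounds check in question. The array size of 8 and the helper name `fake_lookup_pio()` are assumptions made for illustration; only `pio_dev[]` and `MAX_NR_PIO_DEVICES` come from the commit message.

```c
#include <stddef.h>

#define MAX_NR_PIO_DEVICES 8    /* assumed value, for illustration only */

static int pio_dev[MAX_NR_PIO_DEVICES];

/* Valid indices are 0 .. MAX_NR_PIO_DEVICES - 1. */
static int *fake_lookup_pio(unsigned int id)
{
    if (id >= MAX_NR_PIO_DEVICES)   /* '>' would let id == 8 through */
        return NULL;
    return &pio_dev[id];
}
```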
      
      Fixes: 5f97f7f9 ('[PATCH] avr32 architecture')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • avr32: fix 'undefined reference to `___copy_from_user' · da189d04
      Guenter Roeck authored
      commit 65c0044c upstream.
      
      avr32 builds fail with:
      
      arch/avr32/kernel/built-in.o: In function `arch_ptrace':
      (.text+0x650): undefined reference to `___copy_from_user'
      arch/avr32/kernel/built-in.o:(___ksymtab+___copy_from_user+0x0): undefined
      reference to `___copy_from_user'
      kernel/built-in.o: In function `proc_doulongvec_ms_jiffies_minmax':
      (.text+0x5dd8): undefined reference to `___copy_from_user'
      kernel/built-in.o: In function `proc_dointvec_minmax_sysadmin':
      sysctl.c:(.text+0x6174): undefined reference to `___copy_from_user'
      kernel/built-in.o: In function `ptrace_has_cap':
      ptrace.c:(.text+0x69c0): undefined reference to `___copy_from_user'
      kernel/built-in.o:ptrace.c:(.text+0x6b90): more undefined references to
      `___copy_from_user' follow
      
      Fixes: 8630c322 ("avr32: fix copy_from_user()")
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: Havard Skinnemoen <hskinnemoen@gmail.com>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • avr32: fix copy_from_user() · 91be3ab4
      Al Viro authored
      commit 8630c322 upstream.
      
      Really ugly, but apparently avr32 compilers turn access_ok() into
      something so bad that they want it in assembler.  Left that way,
      zeroing added in inline wrapper.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • powerpc/nvram: Fix an incorrect partition merge · ac78e19e
      Pan Xinhui authored
      commit 11b7e154 upstream.
      
      When we merge two contiguous partitions whose signatures are marked
      NVRAM_SIG_FREE, we need to update prev's length and checksum, then write
      it to nvram, not cur's. So let's fix this mistake now.
      
      Also use memset instead of strncpy to set the partition's name. It's
      more readable if we want to fill up with duplicate chars.
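
A hedged sketch of both points, using a made-up partition header. `struct fake_part`, `FAKE_SIG_FREE`, the fill character and in particular `fake_checksum()` are assumptions for illustration and are not the kernel's nvram layout or checksum algorithm.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_SIG_FREE 0x7f      /* placeholder signature for a free partition */

struct fake_part {
    uint8_t  signature;
    uint8_t  checksum;
    uint16_t length;
    char     name[12];
};

/* Hypothetical checksum: byte sum over signature, length and name. */
static uint8_t fake_checksum(const struct fake_part *p)
{
    unsigned int sum = p->signature + (p->length & 0xff) + (p->length >> 8);

    for (size_t i = 0; i < sizeof(p->name); i++)
        sum += (unsigned char)p->name[i];
    return (uint8_t)sum;
}

/* Mark a partition free; memset states "fill with one char" directly. */
static void fake_mark_free(struct fake_part *p)
{
    p->signature = FAKE_SIG_FREE;
    memset(p->name, 'w', sizeof(p->name));
    p->checksum = fake_checksum(p);
}

/* Merge free partition cur into its free predecessor. Returns 1 if merged. */
static int fake_merge_free(struct fake_part *prev, const struct fake_part *cur)
{
    if (prev->signature != FAKE_SIG_FREE || cur->signature != FAKE_SIG_FREE)
        return 0;
    prev->length += cur->length;            /* update prev, not cur */
    prev->checksum = fake_checksum(prev);   /* prev is what gets written back */
    return 1;
}
```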
      
      Fixes: fa2b4e54 ("powerpc/nvram: Improve partition removal")
      Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • powerpc/64: Fix incorrect return value from __copy_tofrom_user · 5daad2cc
      Paul Mackerras authored
      commit 1a34439e upstream.
      
      Debugging a data corruption issue with virtio-net/vhost-net led to
      the observation that __copy_tofrom_user was occasionally returning
      a value 16 larger than it should.  Since the return value from
      __copy_tofrom_user is the number of bytes not copied, this means
      that __copy_tofrom_user can occasionally return a value larger
      than the number of bytes it was asked to copy.  In turn this can
      cause higher-level copy functions such as copy_page_to_iter_iovec
      to corrupt memory by copying data into the wrong memory locations.
      
      It turns out that the failing case involves a fault on the store
      at label 79, and at that point the first unmodified byte of the
      destination is at R3 + 16.  Consequently the exception handler
      for that store needs to add 16 to R3 before using it to work out
      how many bytes were not copied, but in this one case it was not
      adding the offset to R3.  To fix it, this moves the label 179 to
      the point where we add 16 to R3.  I have checked manually all the
      exception handlers for the loads and stores in this code and the
      rest of them are correct (it would be excellent to have an
      automated test of all the exception cases).
      
      This bug has been present since this code was initially
      committed in May 2002 to Linux version 2.5.20.
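
The arithmetic behind the "16 larger" return value can be sketched as below. This is plain C, not the powerpc assembly; `r3` here simply models the register value seen by the exception handler, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes not copied, given the first destination byte left unmodified. */
static size_t bytes_not_copied(uintptr_t dst_start, size_t n,
                               uintptr_t first_unmodified)
{
    return n - (size_t)(first_unmodified - dst_start);
}

/* Handler that accounts for the 16 bytes already stored past r3. */
static size_t fixed_handler(uintptr_t dst_start, size_t n, uintptr_t r3)
{
    return bytes_not_copied(dst_start, n, r3 + 16);
}

/* Handler that forgets the offset, as in the one faulty case. */
static size_t buggy_handler(uintptr_t dst_start, size_t n, uintptr_t r3)
{
    return bytes_not_copied(dst_start, n, r3);
}
```

For a 40-byte copy starting at 0x1000 that faults with r3 at 0x1010, the fixed handler reports 8 bytes not copied while the buggy one reports 24, which is exactly 16 too many.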
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data() · 6d13a7b0
      Gavin Shan authored
      commit 5adaf862 upstream.
      
      This fixes the warnings reported from sparse:
      
        pci.c:312:33: warning: restricted __be64 degrades to integer
        pci.c:313:33: warning: restricted __be64 degrades to integer
      
      Fixes: cee72d5b ("powerpc/powernv: Display diag data on p7ioc EEH errors")
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • powerpc/vdso64: Use double word compare on pointers · 546731aa
      Anton Blanchard authored
      commit 5045ea37 upstream.
      
      __kernel_get_syscall_map() and __kernel_clock_getres() use cmpli to
      check if the passed in pointer is non zero. cmpli maps to a 32 bit
      compare on binutils, so we ignore the top 32 bits.
      
      A simple test case can be created by passing in a bogus pointer with
      the bottom 32 bits clear. Using a clk_id that is handled by the VDSO,
      then one that is handled by the kernel shows the problem:
      
        printf("%d\n", clock_getres(CLOCK_REALTIME, (void *)0x100000000));
        printf("%d\n", clock_getres(CLOCK_BOOTTIME, (void *)0x100000000));
      
      And we get:
      
        0
        -1
      
      The bigger issue is if we pass a valid pointer with the bottom 32 bits
      clear: in this case we will return success but won't write any data
      to the pointer.
      
      I stumbled across this issue because the LLVM integrated assembler
      doesn't accept cmpli with 3 arguments. Fix this by converting them to
      cmpldi.
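
The effect of comparing only the low word can be modelled in C as follows. The two function names are hypothetical; they stand for the 32-bit and 64-bit compares respectively, not for the actual instructions.

```c
#include <stdint.h>

/* Models a NULL check that only looks at the low 32 bits. */
static int is_null_32bit_compare(uint64_t ptr)
{
    return (uint32_t)ptr == 0;
}

/* Models a NULL check on the full 64-bit value. */
static int is_null_64bit_compare(uint64_t ptr)
{
    return ptr == 0;
}
```

With the pointer value 0x100000000 from the test case above, the 32-bit check wrongly treats the pointer as NULL and the 64-bit check does not.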
      
      Fixes: a7f290da ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel")
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET · b0a4c167
      Paul Mackerras authored
      commit f077aaf0 upstream.
      
      In commit c60ac569 ("powerpc: Update kernel VSID range", 2013-03-13)
      we lost a check on the region number (the top four bits of the effective
      address) for addresses below PAGE_OFFSET.  That commit replaced a check
      that the top 18 bits were all zero with a check that bits 46 - 59 were
      zero (performed for all addresses, not just user addresses).
      
      This means that userspace can access an address like 0x1000_0xxx_xxxx_xxxx
      and we will insert a valid SLB entry for it.  The VSID used will be the
      same as if the top 4 bits were 0, but the page size will be some random
      value obtained by indexing beyond the end of the mm_ctx_high_slices_psize
      array in the paca.  If that page size is the same as would be used for
      region 0, then userspace just has an alias of the region 0 space.  If the
      page size is different, then no HPTE will be found for the access, and
      the process will get a SIGSEGV (since hash_page_mm() will refuse to create
      a HPTE for the bogus address).
      
      The access beyond the end of the mm_ctx_high_slices_psize can be at most
      5.5MB past the array, and so will be in RAM somewhere.  Since the access
      is a load performed in real mode, it won't fault or crash the kernel.
      At most this bug could perhaps leak a little bit of information about
      blocks of 32 bytes of memory located at offsets of i * 512kB past the
      paca->mm_ctx_high_slices_psize array, for 1 <= i <= 11.
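
A sketch of the missing region check, written as ordinary C rather than the SLB miss handler. `FAKE_PAGE_OFFSET` and `fake_user_ea_ok()` are illustrative assumptions; only the "top four bits are the region number" rule is taken from the text above.

```c
#include <stdint.h>

/* Assumed kernel base address, for illustration only. */
#define FAKE_PAGE_OFFSET 0xc000000000000000ULL

/* The region number is the top four bits of the effective address. */
static unsigned int ea_region(uint64_t ea)
{
    return (unsigned int)(ea >> 60);
}

/* Addresses below the kernel base are only valid user addresses in region 0. */
static int fake_user_ea_ok(uint64_t ea)
{
    if (ea >= FAKE_PAGE_OFFSET)
        return 0;               /* not a user address at all */
    return ea_region(ea) == 0;
}
```

An address such as 0x1000_0000_1234_5678 has region 1, so it is rejected instead of being treated as an alias of region 0.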
      
      Fixes: c60ac569 ("powerpc: Update kernel VSID range")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • MIPS: ptrace: Fix regs_return_value for kernel context · 4787d839
      Marcin Nowakowski authored
      commit 74f1077b upstream.
      
      Currently regs_return_value always negates reg[2] if it determines
      the syscall has failed, but when called in kernel context this check is
      invalid and may result in returning a wrong value.
      
      This fixes errors reported by CONFIG_KPROBES_SANITY_TEST
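
One way to picture the fix, as a user-space sketch. `struct fake_pt_regs` and its `from_user` flag are hypothetical, and the use of reg[7] as the syscall error flag is an assumption about the MIPS syscall convention rather than something stated in this commit message.

```c
/* Hypothetical register snapshot, for illustration only. */
struct fake_pt_regs {
    unsigned long regs[32];
    int from_user;      /* stands in for a user-mode check on the saved state */
};

static long fake_regs_return_value(const struct fake_pt_regs *regs)
{
    /*
     * Assumption: a non-zero regs[7] flags a failed syscall, in which
     * case regs[2] holds a positive error number. That convention only
     * applies to syscalls entered from user space.
     */
    if (regs->regs[7] == 0 || !regs->from_user)
        return (long)regs->regs[2];
    return -(long)regs->regs[2];
}
```

In kernel context the value in reg[2] is returned unchanged, whatever reg[7] happens to contain.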
      
      Fixes: d7e7528b ("Audit: push audit success and retcode into arch ptrace.h")
      Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14381/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • MIPS: Malta: Fix IOCU disable switch read for MIPS64 · bc9f83ea
      Paul Burton authored
      commit 305723ab upstream.
      
      Malta boards used with CPU emulators feature a switch to disable use of
      an IOCU. Software has to check this switch & ignore any present IOCU if
      the switch is closed. The read used to do this was unsafe for 64 bit
      kernels, as it simply cast the address 0xbf403000 to a pointer &
      dereferenced it. Whilst in a 32 bit kernel this would access kseg1, in a
      64 bit kernel this attempts to access xuseg & results in an address
      error exception.
      
      Fix by accessing a correctly formed ckseg1 address generated using the
      CKSEG1ADDR macro.
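
The difference between the two address forms can be shown with plain integer arithmetic. The constants and helper names below are user-space approximations written for this illustration and are assumptions, not copies of the kernel's macros.

```c
#include <stdint.h>

/* Assumed 64-bit base of the uncached compatibility segment. */
#define FAKE_CKSEG1 0xffffffffa0000000ULL

/* Keep only the low 29 bits, i.e. the physical address within the segment. */
static uint64_t fake_cphysaddr(uint64_t a)
{
    return a & 0x1fffffffULL;
}

/* Build a correctly formed 64-bit ckseg1 address. */
static uint64_t fake_ckseg1addr(uint64_t a)
{
    return fake_cphysaddr(a) | FAKE_CKSEG1;
}

/* What a plain cast of a 32-bit constant does on a 64-bit kernel. */
static uint64_t naive_cast(uint32_t a)
{
    return (uint64_t)a;         /* zero-extends: upper 32 bits are 0 */
}
```

The naive cast yields 0x00000000bf403000, which falls in the user segment, while the helper yields 0xffffffffbf403000.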
      
      Whilst modifying this code, define the name of the register and the bit
      we care about within it, which indicates whether PCI DMA is routed to
      the IOCU or straight to DRAM. The code previously checked that bit 0 was
      also set, but the least significant 7 bits of the CONFIG_GEN0 register
      contain the value of the MReqInfo signal provided to the IOCU OCP bus,
      so singling out bit 0 makes little sense & that part of the check is
      dropped.
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Fixes: b6d92b4a ("MIPS: Add option to disable software I/O coherency.")
      Cc: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/14187/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • arm64: debug: avoid resetting stepping state machine when TIF_SINGLESTEP · 83099f39
      Will Deacon authored
      commit 3a402a70 upstream.
      
      When TIF_SINGLESTEP is set for a task, the single-step state machine is
      enabled and we must take care not to reset it to the active-not-pending
      state if it is already in the active-pending state.
      
      Unfortunately, that's exactly what user_enable_single_step does, by
      unconditionally setting the SS bit in the SPSR for the current task.
      This causes failures in the GDB testsuite, where GDB ends up missing
      expected step traps if the instruction being stepped generates another
      trap, e.g. PTRACE_EVENT_FORK from an SVC instruction.
      
      This patch fixes the problem by preserving the current state of the
      stepping state machine when TIF_SINGLESTEP is set on the current thread.
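
A toy model of the intended behaviour, assuming the state machine is armed by setting SS and becomes active-pending once SS is cleared. `struct fake_task` and `fake_user_enable_single_step()` are hypothetical and only mimic the flag handling.

```c
/* Hypothetical per-task state, for illustration only. */
struct fake_task {
    int tif_singlestep;     /* models the TIF_SINGLESTEP thread flag */
    int spsr_ss;            /* models the SS bit in the saved SPSR */
};

static void fake_user_enable_single_step(struct fake_task *t)
{
    if (!t->tif_singlestep) {
        t->tif_singlestep = 1;
        t->spsr_ss = 1;     /* first enable: arm the state machine */
    }
    /* Already stepping: leave SS alone so a pending step is not lost. */
}
```

Calling the helper a second time while a step is pending leaves `spsr_ss` untouched, whereas an unconditional store would re-arm the state machine and swallow the step trap.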
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Yao Qi <yao.qi@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb() · d65df517
      Will Deacon authored
      commit 872c63fb upstream.
      
      smp_mb__before_spinlock() is intended to upgrade a spin_lock() operation
      to a full barrier, such that prior stores are ordered with respect to
      loads and stores occurring inside the critical section.
      
      Unfortunately, the core code defines the barrier as smp_wmb(), which
      is insufficient to provide the required ordering guarantees when used in
      conjunction with our load-acquire-based spinlock implementation.
      
      This patch overrides the arm64 definition of smp_mb__before_spinlock()
      to map to a full smp_mb().
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reported-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • arm64: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO · 1fd5c7b6
      James Hogan authored
      commit 3146bc64 upstream.
      
      AT_VECTOR_SIZE_ARCH should be defined with the maximum number of
      NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined
      for arm64 at all even though ARCH_DLINFO will contain one NEW_AUX_ENT
      for the VDSO address.
      
      This shouldn't be a problem as AT_VECTOR_SIZE_BASE includes space for
      AT_BASE_PLATFORM which arm64 doesn't use, but let's define it now and add
      the comment above ARCH_DLINFO as found in several other architectures to
      remind future modifiers of ARCH_DLINFO to keep AT_VECTOR_SIZE_ARCH up to
      date.
      
      Fixes: f668cd16 ("arm64: ELF definitions")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • arm64: avoid returning from bad_mode · e5471def
      Mark Rutland authored
      commit 7d9e8f71 upstream.
      
      Generally, taking an unexpected exception should be a fatal event, and
      bad_mode is intended to cater for this. However, it should be possible
      to contain unexpected synchronous exceptions from EL0 without bringing
      the kernel down, by sending a SIGILL to the task.
      
      We tried to apply this approach in commit 9955ac47 ("arm64:
      don't kill the kernel on a bad esr from el0"), by sending a signal for
      any bad_mode call resulting from an EL0 exception.
      
      However, this also applies to other unexpected exceptions, such as
      SError and FIQ. The entry paths for these exceptions branch to bad_mode
      without configuring the link register, and have no kernel_exit. Thus, if
      we take one of these exceptions from EL0, bad_mode will eventually
      return to the original user link register value.
      
      This patch fixes this by introducing a new bad_el0_sync handler to cater
      for the recoverable case, and restoring bad_mode to its original state,
      whereby it calls panic() and never returns. The recoverable case
      branches to bad_el0_sync with a bl, and returns to userspace via the
      usual ret_to_user mechanism.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Fixes: 9955ac47 ("arm64: don't kill the kernel on a bad esr from el0")
      Reported-by: Mark Salter <msalter@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • ARM: sa1111: fix pcmcia suspend/resume · eee1bdb5
      Russell King authored
      commit 06dfe5cc upstream.
      
      SA1111 PCMCIA was broken when PCMCIA switched to using dev_pm_ops for
      the PCMCIA socket class.  PCMCIA used to handle suspend/resume via the
      socket hosting device, which happened at normal device suspend/resume
      time.
      
      However, the referenced commit changed this: much of the resume now
      happens much earlier, in the noirq resume handler of dev_pm_ops.
      
      However, on SA1111, the PCMCIA device is not accessible as the SA1111
      has not been resumed at _noirq time.  It's slightly worse than that,
      because the SA1111 has already been put to sleep at _noirq time, so
      suspend doesn't work properly.
      
      Fix this by converting the core SA1111 code to use dev_pm_ops as well,
      and performing its own suspend/resume at noirq time.
      
      This fixes these errors in the kernel log:
      
      pcmcia_socket pcmcia_socket0: time out after reset
      pcmcia_socket pcmcia_socket1: time out after reset
      
      and the resulting lack of PCMCIA cards after a S2RAM cycle.
      
      Fixes: d7646f76 ("pcmcia: use dev_pm_ops for class pcmcia_socket_class")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • ARM: sa1100: clear reset status prior to reboot · 5b4918cc
      Russell King authored
      commit da60626e upstream.
      
      Clear the current reset status prior to rebooting the platform.  This
      adds the bit missing from 04fef228 ("[ARM] pxa: introduce
      reset_status and clear_reset_status for driver's usage").
      
      Fixes: 04fef228 ("[ARM] pxa: introduce reset_status and clear_reset_status for driver's usage")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7 · 1774ca81
      Srinivas Ramana authored
      commit 117e5e9c upstream.
      
      If the bootloader uses the long descriptor format and jumps to the
      kernel decompressor code, TTBCR may not be in the right state.
      Before enabling the MMU, it is required to clear the TTBCR.PD0
      field to use TTBR0 for translation table walks.
      
      The commit dbece458 ("ARM: 7501/1: decompressor:
      reset ttbcr for VMSA ARMv7 cores") does the reset of TTBCR.N, but
      doesn't consider all the bits for the size of TTBCR.N.
      
      Clear TTBCR.PD0 field and reset all the three bits of TTBCR.N to
      indicate the use of TTBR0 and the correct base address width.
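
The bit manipulation can be sketched in C as below. The bit positions (N in bits 2:0, PD0 in bit 4) are my reading of the short-descriptor TTBCR layout, and the two-bit mask in `ttbcr_old_reset()` is an assumption about what the earlier commit cleared; treat both as illustrative.

```c
#include <stdint.h>

#define TTBCR_N_MASK (7u << 0)  /* assumed: TTBCR.N occupies bits [2:0] */
#define TTBCR_PD0    (1u << 4)  /* assumed: TTBCR.PD0 is bit 4 */

/* Clear PD0 and all three N bits so that only TTBR0 is used. */
static uint32_t ttbcr_use_ttbr0(uint32_t ttbcr)
{
    return ttbcr & ~(TTBCR_N_MASK | TTBCR_PD0);
}

/* Assumed shape of the earlier reset: only two of the N bits cleared. */
static uint32_t ttbcr_old_reset(uint32_t ttbcr)
{
    return ttbcr & ~(3u << 0);
}
```

Starting from a value with N = 7 and PD0 set, the old reset leaves N bit 2 and PD0 in place, while the new one clears both fields and leaves unrelated bits alone.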
      
      Fixes: dbece458 ("ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores")
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • ARM: 8616/1: dt: Respect property size when parsing CPUs · 88654a15
      Robin Murphy authored
      commit ba6dea4f upstream.
      
      Whilst MPIDR values themselves are less than 32 bits, it is still
      perfectly valid for a DT to have #address-cells > 1 in the CPUs node,
      resulting in the "reg" property having leading zero cell(s). In that
      situation, the big-endian nature of the data conspires with the current
      behaviour of only reading the first cell to cause the kernel to think
      all CPUs have ID 0, and become resoundingly unhappy as a consequence.
      
      Take the full property length into account when parsing CPUs so as to
      be correct under any circumstances.
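
To see why reading only the first cell goes wrong, consider a "reg" property made of two big-endian 32-bit cells. The helpers below are a user-space sketch in the spirit of the kernel's of_read_number(); the names `be32_cell()` and `read_cells()` are hypothetical.

```c
#include <stdint.h>

/* Read one 32-bit big-endian cell from raw property bytes. */
static uint32_t be32_cell(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine `cells` consecutive big-endian cells into one number. */
static uint64_t read_cells(const unsigned char *prop, int cells)
{
    uint64_t v = 0;

    for (int i = 0; i < cells; i++)
        v = (v << 32) | be32_cell(prop + 4 * i);
    return v;
}
```

With #address-cells = 2 and a CPU ID of 0x101, the property bytes are a zero cell followed by 0x00000101: the first cell alone reads as 0, while the full two-cell read gives 0x101.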
      
      Cc: Russell King <linux@armlinux.org.uk>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • iommu/amd: Free domain id when free a domain of struct dma_ops_domain · 7a6111b8
      Baoquan He authored
      commit c3db901c upstream.
      
      The current code fails to free the domain id when freeing a domain of
      struct dma_ops_domain.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Fixes: ec487d1a ('x86, AMD IOMMU: add domain allocation and deallocation functions')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • iommu/amd: Update Alias-DTE in update_device_table() · 56eb0df4
      Joerg Roedel authored
      commit 3254de6b upstream.
      
      Not doing so might cause IO-Page-Faults when a device uses
      an alias request-id and the alias-dte is left in a lower
      page-mode which does not cover the address allocated from
      the iova-allocator.
      
      Fixes: 492667da ('x86/amd-iommu: Remove amd_iommu_pd_table')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • x86/um: reuse asm-generic/barrier.h · 69c373d8
      Michael S. Tsirkin authored
      commit 577f183a upstream.
      
      On x86/um CONFIG_SMP is never defined.  As a result, several macros
      match the asm-generic variant exactly. Drop the local definitions and
      pull in asm-generic/barrier.h instead.
      
      This is in preparation for refactoring this code area.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Richard Weinberger <richard@nod.at>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • x86/build: Build compressed x86 kernels as PIE · 186c5f34
      H.J. Lu authored
      commit 6d92bc9d upstream.
      
      The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
      relocation to get the symbol address in PIC.  When the compressed x86
      kernel isn't built as PIC, the linker optimizes R_386_GOT32X relocations
      to their fixed symbol addresses.  However, when the compressed x86
      kernel is loaded at a different address, it leads to the following
      load failure:
      
        Failed to allocate space for phdrs
      
      during the decompression stage.
      
      If the compressed x86 kernel is relocatable at run-time, it should be
      compiled with -fPIE instead of -fPIC, if possible, and should be built as
      a Position Independent Executable (PIE) so that the linker won't optimize
      R_386_GOT32X relocations to their fixed symbol addresses.
      
      Older linkers generate R_386_32 relocations against locally defined
      symbols, _bss, _ebss, _got and _egot, in PIE.  It isn't wrong, just less
      optimal than R_386_RELATIVE.  But the x86 kernel fails to properly handle
      R_386_32 relocations when relocating the kernel.  To generate
      R_386_RELATIVE relocations, we mark _bss, _ebss, _got and _egot as
      hidden in both 32-bit and 64-bit x86 kernels.
      
      To build a 64-bit compressed x86 kernel as PIE, we need to disable the
      relocation overflow check to avoid relocation overflow errors. We do
      this with a new linker command-line option, -z noreloc-overflow, which
      got added recently:
      
       commit 4c10bbaa0912742322f10d9d5bb630ba4e15dfa7
       Author: H.J. Lu <hjl.tools@gmail.com>
       Date:   Tue Mar 15 11:07:06 2016 -0700
      
          Add -z noreloc-overflow option to x86-64 ld
      
          Add -z noreloc-overflow command-line option to the x86-64 ELF linker to
          disable relocation overflow check.  This can be used to avoid relocation
          overflow check if there will be no dynamic relocation overflow at
          run-time.
      
      The 64-bit compressed x86 kernel is built as PIE only if the linker supports
      -z noreloc-overflow.  So far, the 64-bit relocatable compressed x86 kernel
      boots fine even when it is built as a normal executable.
      Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      [ Edited the changelog and comments. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      186c5f34
    • Steven Rostedt's avatar
      x86/paravirt: Do not trace _paravirt_ident_*() functions · 6523fa8c
      Steven Rostedt authored
      commit 15301a57 upstream.
      
      Łukasz Daniluk reported that on a RHEL kernel his machine would lock up
      after enabling the function tracer. I asked him to bisect the functions within
      available_filter_functions, which he did and it came down to three:
      
        _paravirt_nop(), _paravirt_ident_32() and _paravirt_ident_64()
      
      It was found that this is only an issue when noreplace-paravirt is added
      to the kernel command line.
      
      This means that those functions are most likely called within critical
      sections of the function tracer, and must not be traced.
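
      As a rough illustration of what "must not be traced" means at the source
      level, the sketch below marks an identity helper with the GCC attribute
      no_instrument_function, which is what the kernel's notrace annotation
      expanded to in kernels of this era.  The local notrace define and
      demo_ident_64() are stand-ins for this example, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's notrace annotation (assumption: plain
 * GCC/Clang attribute; the kernel provides its own macro). */
#define notrace __attribute__((no_instrument_function))

/* Hypothetical identity helper, modeled on _paravirt_ident_64().
 * With the attribute, a -pg build emits no profiling call on entry. */
notrace uint64_t demo_ident_64(uint64_t x)
{
	return x;
}
```

      The attribute changes only the instrumentation, not the behaviour: the
      helper still returns its argument unchanged.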
      
      In newer kernels _paravirt_nop() is defined within gcc asm(), and is no
      longer an issue.  But both _paravirt_ident_{32,64}() cause the
      following splat when they are traced:
      
       mm/pgtable-generic.c:33: bad pmd ffff8800d2435150(0000000001d00054)
       mm/pgtable-generic.c:33: bad pmd ffff8800d3624190(0000000001d00070)
       mm/pgtable-generic.c:33: bad pmd ffff8800d36a5110(0000000001d00054)
       mm/pgtable-generic.c:33: bad pmd ffff880118eb1450(0000000001d00054)
       NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [systemd-journal:469]
       Modules linked in: e1000e
       CPU: 2 PID: 469 Comm: systemd-journal Not tainted 4.6.0-rc4-test+ #513
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       task: ffff880118f740c0 ti: ffff8800d4aec000 task.ti: ffff8800d4aec000
       RIP: 0010:[<ffffffff81134148>]  [<ffffffff81134148>] queued_spin_lock_slowpath+0x118/0x1a0
       RSP: 0018:ffff8800d4aefb90  EFLAGS: 00000246
       RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011eb16d40
       RDX: ffffffff82485760 RSI: 000000001f288820 RDI: ffffea0000008030
       RBP: ffff8800d4aefb90 R08: 00000000000c0000 R09: 0000000000000000
       R10: ffffffff821c8e0e R11: 0000000000000000 R12: ffff880000200fb8
       R13: 00007f7a4e3f7000 R14: ffffea000303f600 R15: ffff8800d4b562e0
       FS:  00007f7a4e3d7840(0000) GS:ffff88011eb00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007f7a4e3f7000 CR3: 00000000d3e71000 CR4: 00000000001406e0
       Call Trace:
         _raw_spin_lock+0x27/0x30
         handle_pte_fault+0x13db/0x16b0
         handle_mm_fault+0x312/0x670
         __do_page_fault+0x1b1/0x4e0
         do_page_fault+0x22/0x30
         page_fault+0x28/0x30
         __vfs_read+0x28/0xe0
         vfs_read+0x86/0x130
         SyS_read+0x46/0xa0
         entry_SYSCALL_64_fastpath+0x1e/0xa8
       Code: 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 40 6d 01 00 48 03 14 c5 80 6a 5d 82 48 89 0a 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 8b
      Reported-by: Łukasz Daniluk <lukasz.daniluk@intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      6523fa8c
    • Jiri Kosina's avatar
      x86/mm/pat, /dev/mem: Remove superfluous error message · 1eae225e
      Jiri Kosina authored
      commit 39380b80 upstream.
      
      Currently it's possible for broken (or malicious) userspace to flood the
      kernel log indefinitely with messages like
      
      	Program dmidecode tried to access /dev/mem between f0000->100000
      
      because range_is_allowed(), in case of CONFIG_STRICT_DEVMEM being turned on,
      dumps this information each and every time devmem_is_allowed() fails.
      
      Reportedly, userspace that is able to trigger a continuous flow of these
      messages exists.
      
      It would be possible to rate limit this message, but that would be of
      questionable value; the administrator wouldn't get information about all
      the failing accesses, so then the information would be both superfluous
      and incomplete at the same time :)
      
      Returning EPERM (which is what is actually happening) is enough indication
      for userspace what has happened; no need to log this particular error as
      some sort of special condition.
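
      A minimal user-space sketch of the resulting check, assuming a made-up
      policy stub: the range is walked page by page and the first refused page
      ends the walk silently, leaving the caller to return -EPERM.  All demo_
      names and the 1 MiB policy are hypothetical and only model the shape of
      range_is_allowed() described above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Hypothetical policy stub: pretend only the first 1 MiB may be accessed. */
static int demo_devmem_is_allowed(unsigned long pfn)
{
	return pfn < (0x100000UL >> DEMO_PAGE_SHIFT);
}

/* Walk the range page by page; return 0 on the first refused page,
 * without logging anything.  The caller turns 0 into -EPERM. */
static int demo_range_is_allowed(unsigned long pfn, unsigned long size)
{
	uint64_t from = (uint64_t)pfn << DEMO_PAGE_SHIFT;
	uint64_t to = from + size;
	uint64_t cursor = from;

	while (cursor < to) {
		if (!demo_devmem_is_allowed(pfn))
			return 0;
		cursor += DEMO_PAGE_SIZE;
		pfn++;
	}
	return 1;
}
```
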
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1607081137020.24757@cbobk.fhfr.pm
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      1eae225e
    • Wanpeng Li's avatar
      x86/apic: Do not init irq remapping if ioapic is disabled · 928a2775
      Wanpeng Li authored
      commit 2e63ad4b upstream.
      
      native_smp_prepare_cpus
        -> default_setup_apic_routing
          -> enable_IR_x2apic
            -> irq_remapping_prepare
              -> intel_prepare_irq_remapping
                -> intel_setup_irq_remapping
      
      So the IR table is set up even if the "noapic" boot parameter is added. As a
      result we crash later when the interrupt affinity is set, due to a
      half-initialized remapping infrastructure.
      
      Prevent remap initialization when IOAPIC is disabled.
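
      The guard can be pictured with the toy model below.  It is only a sketch:
      the demo_ variables and functions are hypothetical stand-ins for the
      kernel state behind "noapic" and for the call chain listed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; names are illustrative only. */
static bool demo_skip_ioapic_setup;   /* set when "noapic" is given */
static bool demo_ir_table_ready;      /* set once remapping is prepared */

static void demo_irq_remapping_prepare(void)
{
	demo_ir_table_ready = true;
}

/* Bail out before touching the remapping tables if the IOAPIC is off. */
static void demo_enable_ir(void)
{
	if (demo_skip_ioapic_setup)
		return;
	demo_irq_remapping_prepare();
}
```
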
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Link: http://lkml.kernel.org/r/1471954039-3942-1-git-send-email-wanpeng.li@hotmail.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      928a2775
    • Sebastian Andrzej Siewior's avatar
      x86/mm: Disable preemption during CR3 read+write · b591901e
      Sebastian Andrzej Siewior authored
      commit 5cf0791d upstream.
      
      There's a subtle preemption race on UP kernels:
      
      Usually current->mm (and therefore mm->pgd) stays the same during the
      lifetime of a task so it does not matter if a task gets preempted during
      the read and write of the CR3.
      
      But then, there is this scenario on x86-UP:
      
      TaskA is in do_exit() and exit_mm() sets current->mm = NULL followed by:
      
       -> mmput()
       -> exit_mmap()
       -> tlb_finish_mmu()
       -> tlb_flush_mmu()
       -> tlb_flush_mmu_tlbonly()
       -> tlb_flush()
       -> flush_tlb_mm_range()
       -> __flush_tlb_up()
       -> __flush_tlb()
       ->  __native_flush_tlb()
      
      At this point current->mm is NULL but current->active_mm still points to
      the "old" mm.
      
      Let's preempt taskA _after_ native_read_cr3() by taskB. TaskB has its
      own mm so CR3 has changed.
      
      Now preempt back to taskA. TaskA has no ->mm set so it borrows taskB's
      mm and so CR3 remains unchanged. Once taskA gets active it continues
      where it was interrupted and that means it writes its old CR3 value
      back. Everything is fine because userland won't need its memory
      anymore.
      
      Now the fun part:
      
      Let's preempt taskA one more time and get back to taskB. This
      time switch_mm() won't do a thing because oldmm (->active_mm)
      is the same as mm (as per context_switch()). So we remain
      with a bad CR3 / PGD and return to userland.
      
      The next thing that happens is handle_mm_fault() with an address for
      the execution of its code in userland. handle_mm_fault() realizes that
      it has a PTE with proper rights so it returns doing nothing. But the
      CPU looks at the wrong PGD and insists that something is wrong and
      faults again. And again. And one more time…
      
      This page fault cycle continues until the scheduler gets tired of it and
      puts another task on the CPU. It gets a little difficult if the task is a
      RT task with a high priority. The system will either freeze or it gets
      fixed by the software watchdog thread which usually runs at RT-max prio.
      But waiting for the watchdog will increase the latency of the RT task
      which is no good.
      
      Fix this by disabling preemption across the critical code section.
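
      The race and the fix can be replayed in a toy single-CPU model.  This is a
      sketch under loud assumptions: demo_cr3, the preemption counter and the
      deferred task switch are all hypothetical and touch no real hardware.
      They only mimic a switch to taskB landing between the CR3 read and the
      CR3 write.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: one "CPU" with a CR3 value and a preemption counter.
 * All names are hypothetical. */
static unsigned long demo_cr3;
static int demo_preempt_count;
static void (*demo_pending)(void);     /* deferred "preemption" */

static void demo_preempt_disable(void) { demo_preempt_count++; }

static void demo_preempt_enable(void)
{
	if (--demo_preempt_count == 0 && demo_pending) {
		void (*fn)(void) = demo_pending;
		demo_pending = NULL;
		fn();                  /* run the deferred task switch */
	}
}

/* A task switch fires now if allowed, otherwise it is deferred. */
static void demo_preempt_point(void (*task_switch)(void))
{
	if (demo_preempt_count == 0)
		task_switch();
	else
		demo_pending = task_switch;
}

static void demo_switch_to_task_b(void) { demo_cr3 = 0xB000; }

/* Unfixed flush: a switch between read and write leaves a stale CR3. */
static void demo_flush_tlb_racy(void)
{
	unsigned long old = demo_cr3;
	demo_preempt_point(demo_switch_to_task_b);
	demo_cr3 = old;
}

/* Fixed flush: the read+write pair cannot be interleaved. */
static void demo_flush_tlb_fixed(void)
{
	demo_preempt_disable();
	unsigned long old = demo_cr3;
	demo_preempt_point(demo_switch_to_task_b);
	demo_cr3 = old;
	demo_preempt_enable();
}
```

      Starting from taskA's value, the racy variant ends with taskA's stale CR3
      even though taskB was switched in, while the fixed variant ends with
      taskB's value because the switch is held back until after the write.
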
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Rik van Riel <riel@redhat.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1470404259-26290-1-git-send-email-bigeasy@linutronix.de
      [ Prettified the changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      b591901e
    • Andy Lutomirski's avatar
      x86/traps: Ignore high word of regs->cs in early_idt_handler_common · 213060d9
      Andy Lutomirski authored
      This is a backport of:
      commit fc0e81b2 upstream
      
      On the 80486 DX, it seems that some exceptions may leave garbage in
      the high bits of CS.  This causes sporadic failures in which
      early_fixup_exception() refuses to fix up an exception.
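
      In C terms the idea is to compare only the low 16 bits of the saved CS
      slot.  The sketch below is illustrative: DEMO_KERNEL_CS and both helpers
      are hypothetical, and the selector value is an assumption made for the
      example rather than something taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative selector value; treat it as an assumption of this sketch. */
#define DEMO_KERNEL_CS 0x60u

/* Compares all 32 bits of the saved slot: garbage in the top half fails. */
static bool demo_cs_is_kernel_naive(uint32_t saved_cs)
{
	return saved_cs == DEMO_KERNEL_CS;
}

/* Compares only the low 16 bits, which hold the actual selector. */
static bool demo_cs_is_kernel_masked(uint32_t saved_cs)
{
	return (saved_cs & 0xffffu) == DEMO_KERNEL_CS;
}
```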
      
      As far as I can tell, this has been buggy for a long time, but the
      problem seems to have been exacerbated by commits:
      
        1e02ce4c ("x86: Store a per-cpu shadow copy of CR4")
        e1bfc11c ("x86/init: Fix cr4_init_shadow() on CR4-less machines")
      
      This appears to have been broken for as long as we've had early
      exception handling.
      
      [ This backport should apply to kernels from 3.4 - 4.5. ]
      
      Fixes: 4c5023a3 ("x86-32: Handle exception table entries during early boot")
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: stable@vger.kernel.org
      Reported-by: Matthew Whitehead <tedheadster@gmail.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      213060d9
    • Juergen Gross's avatar
      x86/xen: fix upper bound of pmd loop in xen_cleanhighmap() · 506750b7
      Juergen Gross authored
      commit 1cf38741 upstream.
      
      xen_cleanhighmap() is operating on level2_kernel_pgt only. The upper
      bound of the loop setting non-kernel-image entries to zero should not
      exceed the size of level2_kernel_pgt.
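
      A simplified model of the bounded loop, assuming one page-table page of
      512 entries and 2 MiB per entry.  demo_level2_pgt and demo_cleanhighmap()
      are hypothetical, and the real function's check for which entries belong
      to the kernel image is left out; only the upper bound is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PTRS_PER_PMD 512
#define DEMO_PMD_SHIFT    21
#define DEMO_PMD_SIZE     (1ULL << DEMO_PMD_SHIFT)

/* Hypothetical single page-table page, standing in for level2_kernel_pgt. */
static uint64_t demo_level2_pgt[DEMO_PTRS_PER_PMD];

/* Zero entries covering [vaddr, vaddr_end], but never step past the
 * end of the table, whatever range the caller asks for.
 * Returns the number of entries cleared. */
static size_t demo_cleanhighmap(uint64_t vaddr, uint64_t vaddr_end)
{
	uint64_t *pmd = demo_level2_pgt +
		((vaddr >> DEMO_PMD_SHIFT) & (DEMO_PTRS_PER_PMD - 1));
	uint64_t *end = demo_level2_pgt + DEMO_PTRS_PER_PMD;
	size_t cleared = 0;

	for (; vaddr <= vaddr_end && pmd < end; pmd++, vaddr += DEMO_PMD_SIZE) {
		*pmd = 0;
		cleared++;
	}
	return cleared;
}
```
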
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      506750b7