1. 29 Aug, 2017 2 commits
    • ARM: 8692/1: mm: abort uaccess retries upon fatal signal · 746a272e
      Mark Rutland authored
      When there's a fatal signal pending, arm's do_page_fault()
      implementation returns 0. The intent is that we'll return to the
      faulting userspace instruction, delivering the signal on the way.
      
      However, if we take a fatal signal during fixing up a uaccess, this
      results in a return to the faulting kernel instruction, which will be
      instantly retried, resulting in the same fault being taken forever. As
      the task never reaches userspace, the signal is not delivered, and the
      task is left unkillable. While the task is stuck in this state, it can
      inhibit the forward progress of the system.
      
      To avoid this, we must ensure that when a fatal signal is pending, we
      apply any necessary fixup for a faulting kernel instruction. Thus we
      will return to an error path, and it is up to that code to make forward
      progress towards delivering the fatal signal.
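
      The difference between the old and fixed behaviour can be sketched
      with a toy model. This is not kernel code: run_uaccess() and its
      parameters are hypothetical names chosen for illustration, and the
      retry cap only exists so the "forever" case terminates.

```python
EFAULT = 14  # conventional errno value, used here only for illustration


def run_uaccess(fatal_signal_pending, apply_fixup, max_retries=5):
    """Toy model of a faulting kernel uaccess instruction (hypothetical).

    Returns ("ok", 0) if the fault is handled normally, ("error", -EFAULT)
    if control reaches the fixup error path, or ("stuck", max_retries) if
    the same instruction would simply be retried again and again.
    """
    for _ in range(max_retries):
        if not fatal_signal_pending:
            return ("ok", 0)  # fault resolved, the access completes
        if apply_fixup:
            # Fixed behaviour: take the fixup, so the caller sees an error
            # and can unwind towards delivering the fatal signal.
            return ("error", -EFAULT)
        # Old behaviour: the handler returns 0 and the same kernel
        # instruction faults again on the next iteration.
    return ("stuck", max_retries)


print(run_uaccess(fatal_signal_pending=True, apply_fixup=False))
print(run_uaccess(fatal_signal_pending=True, apply_fixup=True))
```

      In the model, only the variant that applies the fixup ever leaves
      the loop while a fatal signal is pending.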
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Steve Capper <steve.capper@arm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
    • ARM: 8690/1: lpae: build TTB control register value from scratch in v7_ttb_setup · f26fee5f
      Hoeun Ryu authored
      Reading TTBCR in the early boot stage might return the value left by
      the previous kernel's configuration, especially in the case of kexec.
      For example, the normal (first) kernel may have run with
      PHYS_OFFSET <= PAGE_OFFSET while the crash (second) kernel runs with
      PHYS_OFFSET > PAGE_OFFSET, which can happen because it depends on the
      area reserved for the crash kernel. In that case, reading TTBCR and
      ORing other bit fields into the value is risky, because TTBCR does not
      hold its reset value.
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
  2. 14 Aug, 2017 1 commit
    • ARM: align .data section · 1abd3502
      Russell King authored
      Robert Jarzmik reports that his PXA25x system fails to boot with 4.12,
      failing at __flush_whole_cache in arch/arm/mm/proc-xscale.S:215:
      
         0xc0019e20 <+0>:     ldr     r1, [pc, #788]
         0xc0019e24 <+4>:     ldr     r0, [r1]	<== here
      
      with r1 containing 0xc06f82cd, which is the address of "clean_addr".
      Examination of the System.map shows:
      
      c06f22c8 D user_pmd_table
      c06f22cc d __warned.19178
      c06f22cd d clean_addr
      
      indicating that a .data.unlikely section has appeared just before the
      .data section from proc-xscale.S.  According to objdump -h, it appears
      that our assembly files default their .data alignment to 2**0, which
      is bad news if the preceding .data section size is not power-of-2
      aligned at link time.
      
      Add the appropriate .align directives to all assembly files in arch/arm
      that are missing them where we require an appropriate alignment.
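
      The effect of the missing alignment can be reproduced with a little
      address arithmetic. This is a sketch, assuming that __warned.19178
      occupies a single byte (as the System.map listing suggests) and that
      a 4-byte boundary is what a word load needs; place_section() is a
      made-up helper, not a linker interface.

```python
def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def place_section(prev_end, align_log2):
    """Hypothetical helper: start address of a section placed after prev_end."""
    return align_up(prev_end, 1 << align_log2)


# End of the preceding .data.unlikely contents, taken from the System.map
# listing above (0xc06f22cc plus one byte).
prev_end = 0xC06F22CD

unaligned = place_section(prev_end, 0)  # 2**0: section starts immediately
aligned = place_section(prev_end, 2)    # 2**2: pushed to a word boundary

print(hex(unaligned), "offset within word:", unaligned % 4)
print(hex(aligned), "offset within word:", aligned % 4)
```

      With 2**0 alignment the section inherits whatever odd address the
      previous section ended on, which is the situation the report
      describes for clean_addr.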
      Reported-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
  3. 24 Jul, 2017 2 commits
    • ARM: 8687/1: signal: Fix unparseable iwmmxt_sigframe in uc_regspace[] · ce184a0d
      Dave Martin authored
      In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
      signal frame can be left partially uninitialised in such a way
      that userspace cannot parse uc_regspace[] safely.  In particular,
      this means that the VFP registers cannot be located reliably in the
      signal frame when a multi_v7_defconfig kernel is run on the
      majority of platforms.
      
      The cause is that the uc_regspace[] is laid out statically based on
      the kernel config, but the decision of whether to save/restore the
      iWMMXt registers must be a runtime decision.
      
      To minimise breakage of software that may assume a fixed layout,
      this patch emits a dummy block of the same size as iwmmxt_sigframe,
      for non-iWMMXt threads.  However, the magic and size of this block
      are now filled in to help parsers skip over it.  A new DUMMY_MAGIC
      is defined for this purpose.
      
      It is probably legitimate (if non-portable) for userspace to
      manufacture its own sigframe for sigreturn, and there is no obvious
      reason why userspace should be required to insert a DUMMY_MAGIC
      block when running on non-iWMMXt hardware, when omitting it has
      worked just fine forever in other configurations.  So in this case,
      sigreturn does not require this block to be present.
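
      A parser that relies on the magic and size fields, rather than on a
      fixed layout, might look roughly like the sketch below. The record
      format used here (little-endian 32-bit magic, then a 32-bit total
      size, with a zero magic ending the list) and the numeric magic
      values are assumptions for illustration; find_block() and record()
      are hypothetical helpers.

```python
import struct

# Illustrative magic numbers; treat the exact values as assumptions.
VFP_MAGIC = 0x56465001
DUMMY_MAGIC = 0xB0D9ED01


def find_block(regspace, wanted_magic):
    """Walk (magic, size) records and return the payload for wanted_magic.

    Returns None if the list ends (zero magic or end of buffer) first.
    """
    offset = 0
    while offset + 8 <= len(regspace):
        magic, size = struct.unpack_from("<II", regspace, offset)
        if magic == 0:
            return None
        if size < 8 or offset + size > len(regspace):
            raise ValueError("malformed record at offset %d" % offset)
        if magic == wanted_magic:
            return regspace[offset + 8:offset + size]
        offset += size  # unknown or dummy block: skip it using its size
    return None


def record(magic, payload):
    """Build one record; size counts the 8-byte header plus the payload."""
    return struct.pack("<II", magic, 8 + len(payload)) + payload


# A frame for a non-iWMMXt thread: dummy block first, then the VFP block.
frame = (record(DUMMY_MAGIC, b"\0" * 16)
         + record(VFP_MAGIC, b"vfp-regs")
         + struct.pack("<II", 0, 0))

print(find_block(frame, VFP_MAGIC))
```

      Because the dummy block carries a valid size, the walk steps over it
      without needing to know what it contains.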
      Reported-by: Edmund Grimley-Evans <Edmund.Grimley-Evans@arm.com>
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
    • ARM: 8686/1: iwmmxt: Add missing __user annotations to sigframe accessors · 26958355
      Dave Martin authored
      preserve_iwmmxt_context() and restore_iwmmxt_context() lack __user
      annotations on their arguments pointing to the user signal frame.
      
      There does not appear to be a bug here, but this omission is
      inconsistent with the crunch and vfp sigframe access functions.
      
      This patch adds the annotations, for consistency.
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
  4. 20 Jul, 2017 2 commits
    • ARM: kexec: fix failure to boot crash kernel · 0d70262a
      Russell King authored
      When kexec was converted to DTB, the dtb address was passed between
      machine_kexec_prepare() and machine_kexec() using a static variable.
      This is bad news if you load a crash kernel followed by a normal
      kernel or vice versa - the last loaded kernel overwrites the dtb
      address.
      
      This can result in kexec failures, as (eg) we try to boot the crash
      kernel with the last loaded dtb.  For example, with:
      
      the crash kernel fails to find the dtb.
      
      Avoid this by defining a kimage architecture structure, and store
      the address to be passed in r2 there, which will either be the ATAGs
      or the dtb blob.
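
      The overwrite problem and the per-image fix can be modelled in a few
      lines. This is a toy sketch: the Image class, the prepare()
      function, the shared_r2 variable and the addresses are all
      hypothetical stand-ins, not the kernel's data structures.

```python
class Image:
    """Hypothetical stand-in for a loaded kexec image."""

    def __init__(self, name, dtb_addr):
        self.name = name
        self.dtb_addr = dtb_addr
        self.arch_r2 = None  # per-image slot, in the spirit of the fix


shared_r2 = None  # models the old single static variable


def prepare(image):
    """Record the address to pass in r2, in both the old and the new way."""
    global shared_r2
    shared_r2 = image.dtb_addr      # old: overwritten by every load
    image.arch_r2 = image.dtb_addr  # new: kept with the image itself


crash = Image("crash", 0x88000000)    # made-up addresses
normal = Image("normal", 0x90000000)
prepare(crash)
prepare(normal)

print(hex(shared_r2), hex(crash.arch_r2))
```

      After both loads, the shared variable holds the normal kernel's
      address, while the crash image still carries its own.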
      
      Fixes: 4cabd1d9 ("ARM: 7539/1: kexec: scan for dtb magic in segments")
      Fixes: 42d720d1 ("ARM: kexec: Make .text R/W in machine_kexec")
      Reported-by: Keerthy <j-keerthy@ti.com>
      Tested-by: Keerthy <j-keerthy@ti.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
    • ARM: kexec: avoid allocating crashkernel region outside lowmem · 67556d7a
      Russell King authored
      Allocating the crashkernel region outside lowmem causes the kernel to
      oops while trying to kexec into the new kernel:
      
      Loading crashdump kernel...
      Unable to handle kernel NULL pointer dereference at virtual address 00000000
      pgd = edd70000
      [00000000] *pgd=de19e835
      Internal error: Oops: 817 [#2] SMP ARM
      Modules linked in: ...
      CPU: 0 PID: 689 Comm: sh Not tainted 4.12.0-rc3-next-20170601-04015-gc3a5a20
      Hardware name: Generic DRA74X (Flattened Device Tree)
      task: edb32f00 task.stack: edf18000
      PC is at memcpy+0x50/0x330
      LR is at 0xe3c34001
      pc : [<c04baf30>]    lr : [<e3c34001>]    psr: 800c0193
      sp : edf19c2c  ip : 0a000001  fp : c0553170
      r10: c055316e  r9 : 00000001  r8 : e3130001
      r7 : e4903004  r6 : 0a000014  r5 : e3500000  r4 : e59f106c
      r3 : e59f0074  r2 : ffffffe8  r1 : c010fb88  r0 : 00000000
      Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 10c5387d  Table: add7006a  DAC: 00000051
      Process sh (pid: 689, stack limit = 0xedf18218)
      Stack: (0xedf19c2c to 0xedf1a000)
      ...
      [<c04baf30>] (memcpy) from [<c010fae0>] (machine_kexec+0xa8/0x12c)
      [<c010fae0>] (machine_kexec) from [<c01e4104>] (__crash_kexec+0x5c/0x98)
      [<c01e4104>] (__crash_kexec) from [<c01e419c>] (crash_kexec+0x5c/0x68)
      [<c01e419c>] (crash_kexec) from [<c010c5c0>] (die+0x228/0x490)
      [<c010c5c0>] (die) from [<c011e520>] (__do_kernel_fault.part.0+0x54/0x1e4)
      [<c011e520>] (__do_kernel_fault.part.0) from [<c082412c>] (do_page_fault+0x1e8/0x400)
      [<c082412c>] (do_page_fault) from [<c010135c>] (do_DataAbort+0x38/0xb8)
      [<c010135c>] (do_DataAbort) from [<c0823584>] (__dabt_svc+0x64/0xa0)
      
      This is caused by image->control_code_page being a highmem page, so
      page_address(image->control_code_page) returns NULL.  In any case, we
      don't want the control page to be a highmem page.
      
      We already limit the crash kernel region to the top of 32-bit physical
      memory space.  Also limit it to the top of lowmem in physical space.
      Reported-by: Keerthy <j-keerthy@ti.com>
      Tested-by: Keerthy <j-keerthy@ti.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
  5. 02 Jul, 2017 4 commits
  6. 01 Jul, 2017 5 commits
  7. 30 Jun, 2017 14 commits
    • uapi/linux/a.out.h: don't use deprecated system-specific predefines. · fbd57629
      Zack Weinberg authored
      uapi/linux/a.out.h uses a number of predefined macros that are
      deprecated because they're in the application namespace
      (e.g. '#ifdef linux' instead of '#ifdef __linux__').
      This patch either corrects or just removes them if they are not
      applicable to Linux.
      
      The primary reason this is worth bothering to fix, considering how
      obsolete a.out binary support is, is that the GCC build process
      considers this such a severe error that it will copy the header into a
      private directory and change the macro names, which causes future
      updates to the header to be masked.  This header probably doesn't get
      updated very often anymore, but it is the _only_ uapi header that gets
      this treatment, so IMHO it is worth patching just to drive that number
      all the way to zero.
      Signed-off-by: Zack Weinberg <zackw@panix.com>
      [hch: removed dead conditionals]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hashtable: remove repeated phrase from a comment · dbd18777
      Jakub Kicinski authored
      "in a rcu enabled hashtable" is repeated twice in a comment.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86/intel_rdt: Fix memory leak on mount failure · 79298acc
      Vikas Shivappa authored
      If mount fails, the kn_info directory is not freed, causing a memory leak.
      
      Add the missing error handling path.
      
      Fixes: 4e978d06 ("x86/intel_rdt: Add "info" files to resctrl file system")
      Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: ravi.v.shankar@intel.com
      Cc: tony.luck@intel.com
      Cc: fenghua.yu@intel.com
      Cc: peterz@infradead.org
      Cc: vikas.shivappa@intel.com
      Cc: andi.kleen@intel.com
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1498503368-20173-3-git-send-email-vikas.shivappa@linux.intel.com
    • Merge tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · b4df2e35
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Hopefully the last two powerpc fixes for 4.12.
      
        The CXL one is larger than I'd usually send at rc7, but it fixes new
        code this cycle, so better to have it working for the release. It was
        actually sent a few weeks back but got blocked in testing behind
        another fix that was causing issues.
      
        We are still tracking one crash in v4.12-rc7, but only one person has
        reproduced it and the commit identified by bisect doesn't touch any of
        the relevant code, so I think it's 50/50 whether that commit is
        actually the problem or it's some code layout / toolchain issue.
      
        Two fixes for code we merged this cycle:
      
         - cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
      
         - Avoid miscompilation w/GCC 4.6.3 on 32-bit - don't inline
           copy_to/from_user()
      
        Thanks to Al Viro, Larry Finger, Christophe Lombard"
      
      * tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user()
        cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
    • Merge tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 27ab862a
      Linus Torvalds authored
      Pull IOMMU fixes from Joerg Roedel:
       "Two fixes:
      
         - A fix for AMD IOMMU interrupt remapping code when IRQs are
           forwarded directly to KVM guests
      
         - Fixed check in the recently merged code to allow tboot with
           Intel VT-d disabled"
      
      * tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/amd: Fix interrupt remapping when disable guest_mode
        iommu/vt-d: Correctly disable Intel IOMMU force on
    • Merge tag 'sound-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 4adc6b93
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Two last-minute HD-audio fixes"
      
      * tag 'sound-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Fix endless loop of codec configure
        ALSA: hda - set input_path bitmap to zero after moving it to new place
    • Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 86c3e00a
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
       "Fix two bugs in copy-up code. One introduced in 4.11 and one in
        4.12-rc"
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: don't set origin on broken lower hardlink
        ovl: copy-up: don't unlock between lookup and link
    • x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug · 8eabf42a
      Baoquan He authored
      Kernel text KASLR is separated into physical address and virtual
      address randomization. For virtual address randomization, we
      only randomize to get an offset between 16M and KERNEL_IMAGE_SIZE.
      So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR,
      not the original kernel loading address 'output'.
      
      The bug causes a kernel boot failure if the kernel is loaded at a
      position other than the 16M address that is decided at compile time.
      Kexec/kdump is one such practical case.
      
      To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial
      value.
      Tested-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 8391c73c ("x86/KASLR: Randomize virtual address separately")
      Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/boot/KASLR: Add checking for the offset of kernel virtual address randomization · b892cb87
      Baoquan He authored
      For kernel text KASLR, the virtual address is confined to an area of 1G,
      [0xffffffff80000000, 0xffffffffc0000000). For the implementation of
      virtual address randomization, we only randomize to get an offset
      between 16M and 1G, then add this offset to the starting address,
      0xffffffff80000000. Here 16M is the offset which is decided at linking
      stage. So the value of the local variable 'virt_addr', which represents
      the offset, plus the kernel output size cannot exceed KERNEL_IMAGE_SIZE.
      
      Add a debug check for the offset. If out of bounds, print error
      message and hang there.
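
      The bound being checked can be written out as plain arithmetic. This
      is a sketch under stated assumptions: WINDOW_BASE and
      check_virt_offset() are illustrative names, and the sketch raises an
      exception where the real code is described as printing an error and
      hanging.

```python
MiB = 1 << 20
GiB = 1 << 30

KERNEL_IMAGE_SIZE = 1 * GiB        # the 1G window described above
LOAD_PHYSICAL_ADDR = 16 * MiB      # the 16M link-time offset
WINDOW_BASE = 0xFFFFFFFF80000000   # starting address of the window


def check_virt_offset(virt_addr, output_size):
    """Return the final virtual address, or raise if the image cannot fit."""
    if virt_addr + output_size > KERNEL_IMAGE_SIZE:
        raise ValueError("virt_addr plus image size exceeds KERNEL_IMAGE_SIZE")
    return WINDOW_BASE + virt_addr


# Unrandomized case: the offset is just the 16M link-time value.
print(hex(check_virt_offset(LOAD_PHYSICAL_ADDR, 32 * MiB)))
```

      An offset of 1008M with a 32M image, for example, would end past the
      1G window and is rejected by the sketch.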
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1498567146-11990-2-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • tracing/kprobes: Allow to create probe with a module name starting with a digit · 9e52b325
      Sabrina Dubroca authored
      Always try to parse an address, since kstrtoul() will safely fail when
      given a symbol as input. If that fails (which will be the case for a
      symbol), try to parse a symbol instead.
      
      This allows creating a probe such as:
      
          p:probe/vlan_gro_receive 8021q:vlan_gro_receive+0
      
      Which is necessary for this command to work:
      
          perf probe -m 8021q -a vlan_gro_receive
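
      The "try an address first, then fall back to a symbol" order can be
      sketched as follows. Python's int() stands in for kstrtoul() here
      only because it likewise rejects input with trailing non-numeric
      characters; the two do not agree on every edge case, and
      parse_probe_target() is a hypothetical name.

```python
def parse_probe_target(text):
    """Return ("address", value) or ("symbol", text) for a probe target."""
    try:
        # Base 0 accepts decimal and 0x-prefixed hex, and raises ValueError
        # for anything that is not entirely a number.
        return ("address", int(text, 0))
    except ValueError:
        # Not a plain number, so treat the whole string as a symbol, even
        # if it happens to begin with a digit.
        return ("symbol", text)


print(parse_probe_target("0xc0100000"))
print(parse_probe_target("8021q:vlan_gro_receive+0"))
```

      The second call shows the case from the commit message: the string
      begins with a digit but is not a number, so it ends up on the symbol
      path.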
      
      Link: http://lkml.kernel.org/r/fd72d666f45b114e2c5b9cf7e27b91de1ec966f1.1498122881.git.sd@queasysnail.net
      
      Cc: stable@vger.kernel.org
      Fixes: 413d37d1 ("tracing: Add kprobe-based event tracer")
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • MIPS: Avoid accidental raw backtrace · 85423636
      James Hogan authored
      Since commit 81a76d71 ("MIPS: Avoid using unwind_stack() with
      usermode") show_backtrace() invokes the raw backtracer when
      cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels
      where user and kernel address spaces overlap.
      
      However this is used by show_stack() which creates its own pt_regs on
      the stack and leaves cp0_status uninitialised in most of the code paths.
      This results in non-deterministic use of the raw backtracer,
      depending on the previous stack content.
      
      show_stack() deals exclusively with kernel mode stacks anyway, so
      explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure
      we get a useful backtrace.
      
      Fixes: 81a76d71 ("MIPS: Avoid using unwind_stack() with usermode")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.15+
      Patchwork: https://patchwork.linux-mips.org/patch/16656/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: Perform post-DMA cache flushes on systems with MAARs · cad482c1
      Paul Burton authored
      Recent CPUs from Imagination Technologies such as the I6400 or P6600 are
      able to speculatively fetch data from memory into caches. This means
      that if used in a system with non-coherent DMA they require that caches
      be invalidated after a device performs DMA, and before the CPU reads the
      DMA'd data, in order to ensure that stale values weren't speculatively
      prefetched.
      
      Such CPUs also introduced Memory Accessibility Attribute Registers
      (MAARs) in order to control the regions in which they are allowed to
      speculate. Thus we can use the presence of MAARs as a good indication
      that the CPU requires the above cache maintenance. Use the presence of
      MAARs to determine the result of cpu_needs_post_dma_flush() in the
      default case, in order to handle these recent CPUs correctly.
      
      Note that the return type of cpu_needs_post_dma_flush() is changed to
      bool, such that it's clearer what's happening when cpu_has_maar is cast
      to bool for the return value. If this patch were backported to a
      pre-v4.7 kernel then MIPS_CPU_MAAR was 1ull<<34, so when cast to an int
      we would incorrectly return 0. It so happens that MIPS_CPU_MAAR is
      currently 1ull<<30, so when truncated to an int gives a non-zero value
      anyway, but even so the implicit conversion from long long int to bool
      makes it clearer to understand what will happen than the implicit
      conversion from long long int to int would. The bool return type also
      fits this usage better semantically, so seems like an all-round win.
      
      Thanks to Ed for spotting the issue for pre-v4.7 kernels & suggesting
      the return type change.
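
      The truncation argument above can be checked numerically. The sketch
      below models a C conversion to a 32-bit int on a two's-complement
      target; to_int32() is an illustrative helper, and the two flag
      values are the ones quoted in the message.

```python
def to_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed 32-bit int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


OLD_MIPS_CPU_MAAR = 1 << 34  # pre-v4.7 value quoted above
NEW_MIPS_CPU_MAAR = 1 << 30  # current value quoted above

# Returning the flag as an int keeps only the low 32 bits.
print(to_int32(OLD_MIPS_CPU_MAAR), to_int32(NEW_MIPS_CPU_MAAR))

# Converting to bool only asks whether any bit is set at all.
print(bool(OLD_MIPS_CPU_MAAR), bool(NEW_MIPS_CPU_MAAR))
```

      Bit 34 does not survive the narrowing to 32 bits, whereas the
      conversion to bool is true for either value.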
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Reviewed-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Cc: Ed Blake <ed.blake@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/16363/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: Fix IRQ tracing & lockdep when rescheduling · d8550860
      Paul Burton authored
      When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler
      from arch/mips/kernel/entry.S we disable interrupts. This is true
      regardless of whether we reach work_resched from syscall_exit_work,
      resume_userspace or by looping after calling schedule(). Although we
      disable interrupts in these paths we don't call trace_hardirqs_off()
      before calling into C code which may acquire locks, and we therefore
      leave lockdep with an inconsistent view of whether interrupts are
      disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are
      both enabled.
      
      Without tracing this interrupt state lockdep will print warnings such
      as the following once a task returns from a syscall via
      syscall_exit_partial with TIF_NEED_RESCHED set:
      
      [   49.927678] ------------[ cut here ]------------
      [   49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8
      [   49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
      [   49.946355] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00439-gc9fd5d362289-dirty #197
      [   49.963505] Stack : 0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4
      [   49.974431]         0000000000000000 0000000000000000 0000000000000000 000000000000004a
      [   49.985300]         ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8
      [   49.996194]         0000000000000001 0000000000000000 0000000000000000 0000000077c8030c
      [   50.007063]         000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88
      [   50.017945]         0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498
      [   50.028827]         0000000000000000 0000000000000001 0000000000000000 0000000000000000
      [   50.039688]         0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc
      [   50.050575]         00000000140084e0 0000000000000000 0000000000000000 0000000000040a00
      [   50.061448]         0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc
      [   50.072327]         ...
      [   50.076087] Call Trace:
      [   50.079869] [<ffffffff8010e1b0>] show_stack+0x80/0xa8
      [   50.086577] [<ffffffff805509bc>] dump_stack+0x10c/0x190
      [   50.093498] [<ffffffff8015dde0>] __warn+0xf0/0x108
      [   50.099889] [<ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48
      [   50.107241] [<ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8
      [   50.114961] [<ffffffff801c239c>] lock_is_held_type+0x8c/0xb0
      [   50.122291] [<ffffffff809461b8>] __schedule+0x8c0/0x10f8
      [   50.129221] [<ffffffff80946a60>] schedule+0x30/0x98
      [   50.135659] [<ffffffff80106278>] work_resched+0x8/0x34
      [   50.142397] ---[ end trace 0cb4f6ef5b99fe21 ]---
      [   50.148405] possible reason: unannotated irqs-off.
      [   50.154600] irq event stamp: 400463
      [   50.159566] hardirqs last  enabled at (400463): [<ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8
      [   50.171981] hardirqs last disabled at (400462): [<ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0
      [   50.183897] softirqs last  enabled at (400450): [<ffffffff8016580c>] __do_softirq+0x4ac/0x6a8
      [   50.195015] softirqs last disabled at (400425): [<ffffffff80165e78>] irq_exit+0x110/0x128
      
      Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off()
      when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking
      schedule() following the work_resched label because:
      
       1) Interrupts are disabled regardless of the path we take to reach
          work_resched() & schedule().
      
       2) Performing the tracing here avoids the need to do it in paths which
          disable interrupts but don't call out to C code before hitting a
          path which uses the RESTORE_SOME macro that will call
          trace_hardirqs_on() or trace_hardirqs_off() as appropriate.
      
      We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling
      syscall_trace_leave() for similar reasons, ensuring that lockdep has a
      consistent view of state after we re-enable interrupts.
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: linux-mips@linux-mips.org
      Cc: stable <stable@vger.kernel.org>
      Patchwork: https://patchwork.linux-mips.org/patch/15385/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: pm-cps: Drop manual cache-line alignment of ready_count · 161c51cc
      Paul Burton authored
      We allocate memory for a ready_count variable per-CPU, which is accessed
      via a cached non-coherent TLB mapping to perform synchronisation between
      threads within the core using LL/SC instructions. In order to ensure
      that the variable is contained within its own data cache line we
      allocate 2 lines worth of memory & align the resulting pointer to a line
      boundary. This is however unnecessary, since kmalloc is guaranteed to
      return memory which is at least cache-line aligned (see
      ARCH_DMA_MINALIGN). Stop the redundant manual alignment.
      
      Besides cleaning up the code & avoiding needless work, this has the side
      effect of avoiding an arithmetic error found by Bryan on 64 bit systems
      due to the 32 bit size of the former dlinesz. This led the ready_count
      variable to have its upper 32b cleared erroneously for MIPS64 kernels,
      causing problems when ready_count was later used on MIPS64 via cpuidle.
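
      One plausible shape of such an arithmetic error is an alignment mask
      computed in 32 bits and then applied to a 64-bit pointer. The sketch
      below models that with explicit masks; the exact expression used by
      the driver is not shown in this message, so the function names, the
      64-byte line size and the sample address are assumptions.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def align_with_32bit_linesz(ptr, dlinesz):
    """Align a 64-bit pointer using a mask computed in 32-bit arithmetic."""
    mask = ~(dlinesz - 1) & MASK32  # upper 32 bits of the mask are zero
    return ((ptr + dlinesz - 1) & MASK64) & mask


def align_with_64bit_linesz(ptr, dlinesz):
    """Same alignment with the mask computed at full pointer width."""
    mask = ~(dlinesz - 1) & MASK64
    return ((ptr + dlinesz - 1) & MASK64) & mask


ptr = 0xA8000000FF127C88  # illustrative MIPS64-style kernel address

print(hex(align_with_32bit_linesz(ptr, 64)))
print(hex(align_with_64bit_linesz(ptr, 64)))
```

      In the 32-bit variant the upper half of the address is lost, which
      matches the symptom described above; dropping the manual alignment
      removes the computation altogether.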
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Fixes: 3179d37e ("MIPS: pm-cps: add PM state entry code for CPS systems")
      Reported-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
      Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
      Tested-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: stable <stable@vger.kernel.org> # v3.16+
      Patchwork: https://patchwork.linux-mips.org/patch/15383/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
  8. 29 Jun, 2017 10 commits
    • ARM: 8685/1: ensure memblock-limit is pmd-aligned · 9e25ebfe
      Doug Berger authored
      The pmd containing memblock_limit is cleared by prepare_page_table(),
      which creates the opportunity for early_alloc() to allocate unmapped
      memory if memblock_limit is not pmd-aligned, causing a boot-time hang.
      
      Commit 965278dc ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
      attempted to resolve this problem, but there is a path through the
      adjust_lowmem_bounds() routine where if all memory regions start and
      end on pmd-aligned addresses the memblock_limit will be set to
      arm_lowmem_limit.
      
      Since arm_lowmem_limit can be affected by the vmalloc early parameter,
      the value of arm_lowmem_limit may not be pmd-aligned. This commit
      corrects this oversight such that memblock_limit is always rounded
      down to pmd-alignment.
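
      Rounding a limit down to pmd alignment is a single mask operation.
      The sketch below assumes a 2 MiB pmd (PMD_SHIFT of 21) and uses
      made-up limit values; round_down() here is a local helper written
      for the example.

```python
PMD_SHIFT = 21             # assumption: each pmd entry covers 2 MiB
PMD_SIZE = 1 << PMD_SHIFT


def round_down(value, alignment):
    """Round value down to a multiple of alignment (a power of two)."""
    return value & ~(alignment - 1)


# Hypothetical lowmem limit that is not a multiple of PMD_SIZE, as could
# happen when the vmalloc= parameter moves arm_lowmem_limit.
arm_lowmem_limit = 0x2F900000

memblock_limit = round_down(arm_lowmem_limit, PMD_SIZE)
print(hex(memblock_limit))
```

      A limit that is already pmd-aligned passes through unchanged, so
      applying the rounding unconditionally is harmless.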
      
      Fixes: 965278dc ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
      Signed-off-by: Doug Berger <opendmb@gmail.com>
      Suggested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 4d8a991d
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Need to access netdev->num_rx_queues behind an accessor in netvsc
          driver otherwise the build breaks with some configs, from Arnd
          Bergmann.
      
       2) Add dummy xfrm_dev_event() so that build doesn't fail when
          CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu.
      
       3) Don't OOPS when pfkey_msg2xfrm_state() signals an error, from Dan
          Carpenter.
      
       4) Fix MCDI command size for filter operations in sfc driver, from
          Martin Habets.
      
       5) Fix UFO segmenting so that we don't calculate incorrect checksums,
          from Michal Kubecek.
      
       6) When ipv6 datagram connects fail, reset destination address and
          port. From Wei Wang.
      
       7) TCP disconnect must reset the cached receive DST, from WANG Cong.
      
       8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric
          Dumazet.
      
       9) fman driver has to depend on HAS_DMA, from Madalin Bucur.
      
      10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann.
      
      11) Fix negative page counts with GFO, from Michal Kubecek.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
        sfc: fix attempt to translate invalid filter ID
        net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()
        bpf: prevent leaking pointer via xadd on unpriviledged
        arcnet: com20020-pci: add missing pdev setup in netdev structure
        arcnet: com20020-pci: fix dev_id calculation
        arcnet: com20020: remove needless base_addr assignment
        Trivial fix to spelling mistake in arc_printk message
        arcnet: change irq handler to lock irqsave
        rocker: move dereference before free
        mlxsw: spectrum_router: Fix NULL pointer dereference
        net: sched: Fix one possible panic when no destroy callback
        virtio-net: serialize tx routine during reset
        net: usb: asix88179_178a: Add support for the Belkin B2B128
        fsl/fman: add dependency on HAS_DMA
        net: prevent sign extension in dev_get_stats()
        tcp: reset sk_rx_dst in tcp_disconnect()
        net: ipv6: reset daddr and dport in sk if connect() fails
        bnx2x: Don't log mc removal needlessly
        bnxt_en: Fix netpoll handling.
        bnxt_en: Add missing logic to handle TPA end error conditions.
        ...
      4d8a991d
    • Linus Torvalds's avatar
      Merge tag 'for-4.12/dm-fixes-5' of... · 27bc3440
      Linus Torvalds authored
      Merge tag 'for-4.12/dm-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - dm thinp fix for crash that will occur when metadata device failure
         races with discard passdown to the underlying data device.
      
       - dm raid fix to not access the superblock's >= 1.9.0 'sectors' member
         unconditionally.
      
      * tag 'for-4.12/dm-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm thin: do not queue freed thin mapping for next stage processing
        dm raid: fix oops on upgrading to extended superblock format
      27bc3440
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 374bf883
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Two fixes that should go into this release.
      
        One is an nvme regression fix from Keith, fixing a missing queue
        freeze if the controller is being reset. This causes the reset to
        hang.
      
        The other is a fix for a leak of the bio protection info, if smaller
        sized O_DIRECT is used. This fix should be more involved as we have
        other problematic paths in the kernel, but given that this isn't a
        regression in this series, we'll tackle those for 4.13"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: provide bio_uninit() free freeing integrity/task associations
        nvme/pci: Fix stuck nvme reset
      374bf883
    • Edward Cree's avatar
      sfc: fix attempt to translate invalid filter ID · d58299a4
      Edward Cree authored
      When filter insertion fails with no rollback, we were trying to convert
       EFX_EF10_FILTER_ID_INVALID to an id to store in 'ids' (which is either
       vlan->uc or vlan->mc).  This would WARN_ON_ONCE and then record a bogus
       filter ID of 0x1fff, neither of which is a good thing.
      
      Fixes: 0ccb998b ("sfc: fix filter_id misinterpretation in edge case")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d58299a4
    • Michal Kubeček's avatar
      net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() · e44699d2
      Michal Kubeček authored
      Recently I started seeing warnings about pages with refcount -1. The
      problem was traced to packets being reused after their head was merged into
      a GRO packet by skb_gro_receive(). Although bisection pointed to
      commit c21b48cc ("net: adjust skb->truesize in ___pskb_trim()") and
      I have never seen the issue on a kernel with that commit reverted, I
      believe the real problem appeared earlier, when the option to merge the
      head frag in GRO was implemented.
      
      Handling of the NAPI_GRO_FREE_STOLEN_HEAD state was only added to the
      GRO_MERGED_FREE branch of napi_skb_finish(), so if the driver uses
      napi_gro_frags() and the head is merged (which in my case happens after
      the skb_condense() call added by the commit mentioned above), the skb is
      reused including the head that has been merged. As a result, we release
      the page reference twice and eventually end up with a negative page
      refcount.
      
      To fix the problem, handle NAPI_GRO_FREE_STOLEN_HEAD in napi_frags_finish()
      the same way it's done in napi_skb_finish().
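
      The double release described above can be illustrated with a toy model.
      This is only a sketch under stated assumptions: every toy_* name below is
      hypothetical and none of it is the kernel's code. It models one page
      reference that passes to the merged GRO packet, and shows why a reused
      skb must forget a stolen head.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a page and an skb; not kernel structures. */
struct toy_page { int refcount; };
struct toy_skb {
    struct toy_page *head_page; /* page backing the skb head */
    bool head_stolen;           /* head was merged into another packet */
};

static void toy_put_page(struct toy_page *page)
{
    page->refcount--;
}

/* Releasing an skb drops the reference on whatever head page it still holds. */
static void toy_skb_release(struct toy_skb *skb)
{
    if (skb->head_page)
        toy_put_page(skb->head_page);
    skb->head_page = NULL;
}

/* Buggy variant: the skb is reused as is, still pointing at the stolen head. */
static void toy_frags_finish_buggy(struct toy_skb *skb)
{
    (void)skb;
}

/* Fixed variant: a stolen head now belongs to the merged packet, so drop it
 * from the skb before the skb is reused. */
static void toy_frags_finish_fixed(struct toy_skb *skb)
{
    if (skb->head_stolen) {
        skb->head_page = NULL;
        skb->head_stolen = false;
    }
}
```

      In this model the merged packet releases the page once. With the buggy
      variant the reused skb releases it a second time, which is how the
      refcount goes negative.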
      
      Fixes: d7e8883c ("net: make GRO aware of skb->head_frag")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e44699d2
    • Daniel Borkmann's avatar
      bpf: prevent leaking pointer via xadd on unpriviledged · 6bdf6abc
      Daniel Borkmann authored
      Leaking kernel addresses to unprivileged programs is generally
      disallowed; for example, the verifier rejects the following:
      
        0: (b7) r0 = 0
        1: (18) r2 = 0xffff897e82304400
        3: (7b) *(u64 *)(r1 +48) = r2
        R2 leaks addr into ctx
      
      Doing pointer arithmetic on them is also forbidden, so that they
      don't turn into an unknown value and then get leaked out. However,
      xadd is a special case where we don't check the src reg
      for being a pointer register, e.g. the following will pass:
      
        0: (b7) r0 = 0
        1: (7b) *(u64 *)(r1 +48) = r0
        2: (18) r2 = 0xffff897e82304400 ; map
        4: (db) lock *(u64 *)(r1 +48) += r2
        5: (95) exit
      
      We could store the pointer into skb->cb, lose the type context,
      and then read it back from there to eventually leak it out
      of a map value. Or more easily in a different variant, too:
      
         0: (bf) r6 = r1
         1: (7a) *(u64 *)(r10 -8) = 0
         2: (bf) r2 = r10
         3: (07) r2 += -8
         4: (18) r1 = 0x0
         6: (85) call bpf_map_lookup_elem#1
         7: (15) if r0 == 0x0 goto pc+3
         R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp
         8: (b7) r3 = 0
         9: (7b) *(u64 *)(r0 +0) = r3
        10: (db) lock *(u64 *)(r0 +0) += r6
        11: (b7) r0 = 0
        12: (95) exit
      
        from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp
        11: (b7) r0 = 0
        12: (95) exit
      
      Prevent this by checking xadd src reg for pointer types. Also
      add a couple of test cases related to this.
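
      As a rough illustration of the idea, the check can be modelled as
      follows. This is a simplified sketch and not the verifier's code: the
      toy_* types and functions and the allow_ptr_leaks parameter are
      hypothetical names chosen for this example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified register types; not the verifier's definitions. */
enum toy_reg_type {
    TOY_SCALAR,           /* plain number, safe to write to memory */
    TOY_PTR_TO_CTX,
    TOY_PTR_TO_MAP_VALUE,
    TOY_PTR_TO_STACK,
};

struct toy_reg {
    enum toy_reg_type type;
};

/* Anything that is not a plain scalar is treated as carrying an address. */
static bool toy_is_pointer(const struct toy_reg *reg)
{
    return reg->type != TOY_SCALAR;
}

/* Returns 0 if the xadd may proceed, or -EACCES if adding the source
 * register into memory would leak an address to an unprivileged program. */
static int toy_check_xadd(const struct toy_reg *src, bool allow_ptr_leaks)
{
    if (!allow_ptr_leaks && toy_is_pointer(src))
        return -EACCES;
    return 0;
}
```

      In this model, the "lock *(u64 *)(r0 +0) += r6" instruction from the
      second listing would be rejected, because r6 holds ctx.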
      
      Fixes: 1be7f75d ("bpf: enable non-root eBPF programs")
      Fixes: 17a52670 ("bpf: verifier (add verifier core)")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6bdf6abc
    • Kan Liang's avatar
      perf/x86/intel/uncore: Fix wrong box pointer check · 80c65fdb
      Kan Liang authored
      A NULL box must not be initialized; doing so would crash the system.
      The issue appears to have been caused by a typo.
      
      This was not noticed because there is no NULL box in practice. Also,
      most boxes are enabled by default, so the init code is not critical.
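
      The kind of inverted pointer check described here can be sketched as
      below. This is a hypothetical illustration only: the toy_* names are
      invented for the example, and the actual uncore code is not reproduced.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PMU box; not the driver's structure. */
struct toy_box {
    int refcnt;
    int initialized;
};

static void toy_box_init(struct toy_box *box)
{
    box->initialized = 1;
}

/* Takes a reference and initializes the box on first use.
 * Returns 1 if this call initialized the box, 0 otherwise.
 * The pointer is tested before it is used; writing "!box" here by
 * mistake would dereference NULL whenever a box is missing. */
static int toy_box_ref(struct toy_box *box)
{
    if (box && ++box->refcnt == 1) {
        toy_box_init(box);
        return 1;
    }
    return 0;
}
```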
      
      Fixes: fff4b87e ("perf/x86/intel/uncore: Make package handling more robust")
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170629190926.2456-1-kan.liang@intel.com
      80c65fdb
    • David S. Miller's avatar
      Merge branch 'arcnet-fixes' · 00778f7c
      David S. Miller authored
      Michael Grzeschik says:
      
      ====================
      arcnet: Collection of latest fixes
      
      Here we sum up the recent fixes I collected while using and
      stabilising the framework. Among them is a fix for a possible deadlock,
      as well as a fix for the calculation of the dev_id, which can be set up
      by a rotary encoder. Besides that, we add a trivial spelling patch and
      fix some wrong and missing assignments, which improves the code footprint.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00778f7c
    • Michael Grzeschik's avatar
      arcnet: com20020-pci: add missing pdev setup in netdev structure · 2a0ea04c
      Michael Grzeschik authored
      We add the pdev data to the PCI device's netdev structure. This way
      the interface gets consistent device names in userspace (udev).
      Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2a0ea04c