1. 18 Jul, 2017 1 commit
    • xen: do not re-use pirq number cached in pci device msi msg data · ddfc5326
      Dan Streetman authored
      commit c74fd80f upstream.
      
      Revert the main part of commit:
      af42b8d1 ("xen: fix MSI setup and teardown for PV on HVM guests")
      
      That commit made the kernel read the pci device's MSI message data to
      see whether a pirq had previously been configured for the device's
      MSI/MSIX and, if so, re-use that pirq.  At the time, that was the
      correct behavior.  However, a
      later change to Qemu caused it to call into the Xen hypervisor to unmap
      all pirqs for a pci device, when the pci device disables its MSI/MSIX
      vectors; specifically the Qemu commit:
      c976437c7dba9c7444fb41df45468968aaa326ad
      ("qemu-xen: free all the pirqs for msi/msix when driver unload")
      
      Once Qemu added this pirq unmapping, it was no longer correct for the
      kernel to re-use the pirq number cached in the pci device msi message
      data.  All Qemu releases since 2.1.0 contain the patch that unmaps the
      pirqs when the pci device disables its MSI/MSIX vectors.
      
      This bug is causing failures to initialize multiple NVMe controllers
      under Xen, because the NVMe driver sets up a single MSIX vector for
      each controller (concurrently), uses that vector to fetch some
      configuration data from the controller, then disables it and
      re-configures all the MSIX vectors it needs.  So the MSIX
      setup code tries to re-use the cached pirq from the first vector
      for each controller, but the hypervisor has already given away that
      pirq to another controller, and its initialization fails.
      
      This is discussed in more detail at:
      https://lists.xen.org/archives/html/xen-devel/2017-01/msg00447.html
      
      Fixes: af42b8d1 ("xen: fix MSI setup and teardown for PV on HVM guests")
      Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  2. 02 Jul, 2017 7 commits
    • Linux 3.2.90 · 9b733a81
      Ben Hutchings authored
    • mm: fix find_vma_prev · 0e49b7dd
      Mikulas Patocka authored
      commit 83cd904d upstream.
      
      Commit 6bd4837d ("mm: simplify find_vma_prev()") broke memory
      management on PA-RISC.
      
      After application of the patch, programs that allocate big arrays on
      the stack crash with a segfault; for example, the following will crash
      if compiled without optimization:
      
        int main()
        {
      	char array[200000];
      	array[199999] = 0;
      	return 0;
        }
      
      The reason is that PA-RISC has an upward-growing stack, and the stack
      is usually the last memory area.  In the above example, a page fault
      happens above the stack.
      
      Previously, if we passed an address higher than any VMA to
      find_vma_prev, it returned NULL and stored the last VMA in *pprev.
      After the "simplify find_vma_prev" change, it stores NULL in *pprev.
      Consequently, the stack area is not found and is not expanded, as it
      used to be before the change.
      
      This patch restores the old behavior and makes find_vma_prev return
      the last VMA in *pprev when the requested address is higher than the
      address of any VMA.
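      
      For reference, the restored logic is essentially the following (a
      sketch of the fixed find_vma_prev(); when find_vma() finds nothing,
      descend rightwards in the rbtree to reach the last VMA):
      
        struct vm_area_struct *
        find_vma_prev(struct mm_struct *mm, unsigned long addr,
        	      struct vm_area_struct **pprev)
        {
        	struct vm_area_struct *vma;
        
        	vma = find_vma(mm, addr);
        	if (vma) {
        		*pprev = vma->vm_prev;
        	} else {
        		/* addr is above every VMA: walk right in the
        		 * rbtree so *pprev ends up as the last VMA */
        		struct rb_node *rb_node = mm->mm_rb.rb_node;
        		*pprev = NULL;
        		while (rb_node) {
        			*pprev = rb_entry(rb_node,
        				struct vm_area_struct, vm_rb);
        			rb_node = rb_node->rb_right;
        		}
        	}
        	return vma;
        }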
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • mm: simplify find_vma_prev() · 2e8d6ed8
      KOSAKI Motohiro authored
      commit 6bd4837d upstream.
      
      commit 297c5eee ("mm: make the vma list be doubly linked") added the
      vm_prev member to vm_area_struct.  We can simplify find_vma_prev() by
      using it.  Also, this change helps to improve page fault performance
      because it has stronger locality of reference.
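      
      The simplification amounts to roughly this (a sketch; this is the
      version that later proved problematic on PA-RISC, as the preceding
      "mm: fix find_vma_prev" entry describes):
      
        struct vm_area_struct *
        find_vma_prev(struct mm_struct *mm, unsigned long addr,
        	      struct vm_area_struct **pprev)
        {
        	struct vm_area_struct *vma;
        
        	/* the doubly-linked vma list makes the predecessor free */
        	vma = find_vma(mm, addr);
        	*pprev = vma ? vma->vm_prev : NULL;
        	return vma;
        }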
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Shaohua Li <shaohua.li@intel.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • rxrpc: Fix several cases where a padded len isn't checked in ticket decode · 09c9faac
      David Howells authored
      commit 5f2f9765 upstream.
      
      This fixes CVE-2017-7482.
      
      When a kerberos 5 ticket is being decoded so that it can be loaded into an
      rxrpc-type key, there are several places in which the length of a
      variable-length field is checked to make sure that it's not going to
      overrun the available data - but the data is padded to the nearest
      four-byte boundary, and the code doesn't account for this padding.
      This could lead to the size-remaining variable wrapping around and the
      data pointer running off the end of the buffer.
      
      Fix this by making the various variable-length data checks use the padded
      length.
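      
      The pattern of the fix is roughly as follows (a hedged sketch, not the
      literal diff; len, paddedlen and toklen are representative names for
      the field length, its padded length, and the data remaining):
      
        unsigned int len, paddedlen;
        
        len = ntohl(xdr[0]);
        /* the field is padded to a 4-byte boundary on the wire */
        paddedlen = (len + 3) & ~3;
        if (paddedlen > toklen)	/* check the padded length, not len */
        	return -EINVAL;
        toklen -= paddedlen;	/* can no longer wrap below zero */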
      Reported-by: 石磊 <shilei-c@360.cn>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.c.dionne@auristor.com>
      Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust filename, context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • Allow stack to grow up to address space limit · a7d51947
      Helge Deller authored
      commit bd726c90 upstream.
      
      Fix expand_upwards() on architectures with an upward-growing stack (parisc,
      metag and partly IA-64) to allow the stack to reliably grow exactly up to
      the address space limit given by TASK_SIZE.
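      
      The core of the change is a boundary check in expand_upwards() along
      these lines (a sketch; the full patch also interacts with the stack
      guard gap introduced in the next entry):
      
        /* Guard against exceeding limits of the address space. */
        address &= PAGE_MASK;
        if (address >= (TASK_SIZE & PAGE_MASK))
        	return -ENOMEM;
        address += PAGE_SIZE;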
      Signed-off-by: Helge Deller <deller@gmx.de>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • mm: larger stack guard gap, between vmas · 640c7dfd
      Hugh Dickins authored
      commit 1be7107f upstream.
      
      The stack guard page is a useful feature that reduces the risk of the
      stack smashing into a different mapping. We have been using a
      single-page gap, which is sufficient to prevent the stack from
      becoming adjacent to a different mapping. But this seems to be
      insufficient in the light of the stack usage in userspace. E.g. glibc
      uses alloca() allocations as large as 64kB in many commonly used
      functions. Others use constructs like gid_t buffer[NGROUPS_MAX], which
      is 256kB, or stack strings sized by MAX_ARG_STRLEN.
      
      This is especially dangerous for suid binaries run with the default
      unlimited stack size limit, because those applications can be tricked
      into consuming a large portion of the stack, after which a single
      glibc call can jump over the guard page. These attacks are not
      theoretical, unfortunately.
      
      Make those attacks less probable by increasing the stack guard gap to
      1MB (on systems with 4k pages; the gap depends on the page size
      because systems with larger base pages might cap stack allocations in
      PAGE_SIZE units), which should cover larger alloca() and VLA stack
      allocations. It is obviously not a full fix, because the problem is
      somewhat inherent, but it should greatly reduce the attack surface.
      
      One could argue that the gap size should be configurable from
      userspace, but that can be done later, if somebody finds that the new
      1MB is wrong for some special-case applications.  For now, add a
      kernel command line option (stack_guard_gap) to specify the stack gap
      size (in page units).
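      
      For example, booting with the following kernel command line fragment
      would restore the old single-page gap, while the default corresponds
      to 256 pages (1MB with 4k pages):
      
        stack_guard_gap=1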
      
      Implementation wise, first delete all the old code for stack guard page:
      because although we could get away with accounting one extra page in a
      stack vma, accounting a larger gap can break userspace - case in point,
      a program run with "ulimit -S -v 20000" failed when the 1MB gap was
      counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
      and strict non-overcommit mode.
      
      Instead of keeping the gap inside the stack vma, maintain the stack
      guard gap as a gap between vmas: use vm_start_gap() in place of
      vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just
      those few places which need to respect the gap - mainly
      arch_get_unmapped_area(), and the vma tree's subtree_gap support for
      that.
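      
      The vm_start_gap() helper is essentially the following (a sketch,
      assuming stack_guard_gap holds the gap size in bytes):
      
        static inline unsigned long vm_start_gap(struct vm_area_struct *vma)
        {
        	unsigned long vm_start = vma->vm_start;
        
        	if (vma->vm_flags & VM_GROWSDOWN) {
        		vm_start -= stack_guard_gap;
        		if (vm_start > vma->vm_start)	/* underflow wrapped */
        			vm_start = 0;
        	}
        	return vm_start;
        }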
      Original-patch-by: Oleg Nesterov <oleg@redhat.com>
      Original-patch-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Tested-by: Helge Deller <deller@gmx.de> # parisc
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [Hugh Dickins: Backported to 3.2]
      [bwh: Fix more instances of vma->vm_start in sparc64 impl. of
       arch_get_unmapped_area_topdown() and generic impl. of
       hugetlb_get_unmapped_area()]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • mm: do not grow the stack vma just because of an overrun on preceding vma · 641fbf59
      Linus Torvalds authored
      commit 09884964 upstream.
      
      The stack vma is designed to grow automatically (marked with VM_GROWSUP
      or VM_GROWSDOWN depending on architecture) when an access is made beyond
      the existing boundary.  However, particularly if you have not limited
      your stack at all ("ulimit -s unlimited"), this can cause the stack to
      grow even if the access was really just one past *another* segment.
      
      And that's wrong, especially since we first grow the segment, but then
      immediately later enforce the stack guard page on the last page of the
      segment.  So _despite_ first growing the stack segment as a result of
      the access, the kernel will then make the access cause a SIGSEGV anyway!
      
      So apply the same logic as the guard page check: consider an access
      within one page of the next segment a bad access, rather than growing
      the stack to abut the next segment.
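      
      In expand_upwards() terms, the idea is roughly the following (an
      illustrative sketch of the described logic, not the literal diff):
      
        struct vm_area_struct *next = vma->vm_next;
        
        /* don't grow the stack to within one page of the next vma;
         * treat such an access as a fault instead */
        if (next && address > next->vm_start - PAGE_SIZE)
        	return -ENOMEM;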
      Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  3. 05 Jun, 2017 32 commits