1. 06 Feb, 2017 28 commits
  2. 21 Oct, 2016 1 commit
  3. 19 Oct, 2016 11 commits
    • Linus Torvalds's avatar
      mm: remove gup_flags FOLL_WRITE games from __get_user_pages() · 9691eac5
      Linus Torvalds authored
      commit 19be0eaf upstream.
      
      This is an ancient bug that was actually attempted to be fixed once
      (badly) by me eleven years ago in commit 4ceb5db9 ("Fix
      get_user_pages() race for write access") but that was then undone due to
      problems on s390 by commit f33ea7f4 ("fix get_user_pages bug").
      
      In the meantime, the s390 situation has long been fixed, and we can now
      fix it by checking the pte_dirty() bit properly (and do it better).  The
      s390 dirty bit was implemented in abf09bed ("s390/mm: implement
      software dirty bits") which made it into v3.9.  Earlier kernels will
      have to look at the page state itself.
      
      Also, the VM has become more scalable, and what used a purely
      theoretical race back then has become easier to trigger.
      
      To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
      we already did a COW" rather than play racy games with FOLL_WRITE that
      is very fundamental, and then use the pte dirty flag to validate that
      the FOLL_COW flag is still valid.
      Reported-and-tested-by: default avatarPhil "not Paul" Oester <kernel@linuxace.com>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Nick Piggin <npiggin@gmail.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [wt: s/gup.c/memory.c; s/follow_page_pte/follow_page_mask;
           s/faultin_page/__get_user_page]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      9691eac5
    • Wei Liu's avatar
      xen-netback: ref count shared rings · 4cebb475
      Wei Liu authored
      ... so that we can make sure the rings are not freed until all SKBs in
      internal queues are consumed.
      
      1. The VM is receiving packets through bonding + bridge + netback +
         netfront.
      2. For some unknown reason at least one packet remains in the rx queue
         and is not delivered to the domU immediately by netback.
      3. The VM finishes shutting down.
      4. The shared ring between dom0 and domU is freed.
      5. then xen-netback continues processing the pending requests and tries
         to put the packet into the now already released shared ring.
      
      > XXXlan0: port 9(vif26.0) entered disabled state
      > BUG: unable to handle kernel paging request at ffffc900108641d8
      > IP: [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
      > PGD 57e20067 PUD 57e21067 PMD 571a7067 PTE 0
      > Oops: 0000 [#1] SMP
      > ...
      > CPU: 0 PID: 12587 Comm: netback/0 Not tainted 3.10.0-ucs58-amd64 #1 Debian 3.10.11-1.58.201405060908
      > Hardware name: FUJITSU PRIMERGY BX620 S6/D3051, BIOS 080015 Rev.3C78.3051 07/22/2011
      > task: ffff880004b067c0 ti: ffff8800561ec000 task.ti: ffff8800561ec000
      > RIP: e030:[<ffffffffa04147dc>]  [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
      > RSP: e02b:ffff8800561edce8  EFLAGS: 00010202
      > RAX: ffffc900104adac0 RBX: ffff8800541e95c0 RCX: ffffc90010864000
      > RDX: 000000000000003b RSI: 0000000000000000 RDI: ffff880040014380
      > RBP: ffff8800570e6800 R08: 0000000000000000 R09: ffff880004799800
      > R10: ffffffff813ca115 R11: ffff88005e4fdb08 R12: ffff880054e6f800
      > R13: ffff8800561edd58 R14: ffffc900104a1000 R15: 0000000000000000
      > FS:  00007f19a54a8700(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
      > CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
      > CR2: ffffc900108641d8 CR3: 0000000054cb3000 CR4: 0000000000002660
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > Stack:
      >  ffff880004b06ba0 0000000000000000 ffff88005da13ec0 ffff88005da13ec0
      >  0000000004b067c0 ffffc900104a8ac0 ffffc900104a1020 000000005da13ec0
      >  0000000000000000 0000000000000001 ffffc900104a8ac0 ffffc900104adac0
      > Call Trace:
      >  [<ffffffff813ca32d>] ? _raw_spin_lock_irqsave+0x11/0x2f
      >  [<ffffffffa0416033>] ? xen_netbk_kthread+0x174/0x841 [xen_netback]
      >  [<ffffffff8105d373>] ? wake_up_bit+0x20/0x20
      >  [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
      >  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
      >  [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
      >  [<ffffffff8105ce1e>] ? kthread+0xab/0xb3
      >  [<ffffffff81003638>] ? xen_end_context_switch+0xe/0x1c
      >  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
      >  [<ffffffff813cfbfc>] ? ret_from_fork+0x7c/0xb0
      >  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
      > Code: 8b b3 d0 00 00 00 48 8b bb d8 00 00 00 0f b7 74 37 02 89 70 08 eb 07 c7 40 08 00 00 00 00 89 d2 c7 40 04 00 00 00 00 48 83 c2 08 <0f> b7 34 d1 89 30 c7 44 24 60 00 00 00 00 8b 44 d1 04 89 44 24
      > RIP  [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
      >  RSP <ffff8800561edce8>
      > CR2: ffffc900108641d8
      
      Track the shared ring buffer being unmapped and drop those packets.
      
      Ref-count the rings as followed:
        map         -> set to 1
         start_xmit -> inc when queueing SKB to internal queue
         rx_action  -> dec after finishing processing a SKB
        unmap       -> dec and wait to be 0
      
      Note that this is different from ref counting the vif structure itself.
      Currently only guest Rx path is taken care of because that's where the
      bug surfaced.
      
      This bug doesn't exist in kernel >=3.12 as multi-queue support was added
      there.
      
      Link: <https://lists.xenproject.org/archives/html/xen-devel/2014-06/msg00818.html>
      Signed-off-by: default avatarWei Liu <wei.liu2@citrix.com>
      Signed-off-by: default avatarPhilipp Hahn <hahn@univention.de>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Tested-by: default avatarPhilipp Hahn <hahn@univention.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      4cebb475
    • Jann Horn's avatar
      security: let security modules use PTRACE_MODE_* with bitmasks · 72f0c114
      Jann Horn authored
      commit 3dfb7d8c upstream.
      
      It looks like smack and yama weren't aware that the ptrace mode
      can have flags ORed into it - PTRACE_MODE_NOAUDIT until now, but
      only for /proc/$pid/stat, and with the PTRACE_MODE_*CREDS patch,
      all modes have flags ORed into them.
      Signed-off-by: default avatarJann Horn <jann@thejh.net>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [wt: no smk_ptrace_mode() in 3.10]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      72f0c114
    • James Hogan's avatar
      MIPS: KVM: Check for pfn noslot case · fea24c07
      James Hogan authored
      commit ba913e4f upstream.
      
      When mapping a page into the guest we error check using is_error_pfn(),
      however this doesn't detect a value of KVM_PFN_NOSLOT, indicating an
      error HVA for the page. This can only happen on MIPS right now due to
      unusual memslot management (e.g. being moved / removed / resized), or
      with an Enhanced Virtual Memory (EVA) configuration where the default
      KVM_HVA_ERR_* and kvm_is_error_hva() definitions are unsuitable (fixed
      in a later patch). This case will be treated as a pfn of zero, mapping
      the first page of physical memory into the guest.
      
      It would appear the MIPS KVM port wasn't updated prior to being merged
      (in v3.10) to take commit 81c52c56 ("KVM: do not treat noslot pfn as
      a error pfn") into account (merged v3.8), which converted a bunch of
      is_error_pfn() calls to is_error_noslot_pfn(). Switch to using
      is_error_noslot_pfn() instead to catch this case properly.
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [james.hogan@imgtec.com: Backport to v3.16.y]
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      fea24c07
    • Andrea Arcangeli's avatar
      mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED · afac3781
      Andrea Arcangeli authored
      commit ad33bb04 upstream.
      
      pmd_trans_unstable()/pmd_none_or_trans_huge_or_clear_bad() were
      introduced to locklessy (but atomically) detect when a pmd is a regular
      (stable) pmd or when the pmd is unstable and can infinitely transition
      from pmd_none() and pmd_trans_huge() from under us, while only holding
      the mmap_sem for reading (for writing not).
      
      While holding the mmap_sem only for reading, MADV_DONTNEED can run from
      under us and so before we can assume the pmd to be a regular stable pmd
      we need to compare it against pmd_none() and pmd_trans_huge() in an
      atomic way, with pmd_trans_unstable().  The old pmd_trans_huge() left a
      tiny window for a race.
      
      Useful applications are unlikely to notice the difference as doing
      MADV_DONTNEED concurrently with a page fault would lead to undefined
      behavior.
      
      [js] 3.12 backport: no pmd_devmap in 3.12 yet.
      
      [akpm@linux-foundation.org: tidy up comment grammar/layout]
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      afac3781
    • Dan Carpenter's avatar
      ACPI / sysfs: fix error code in get_status() · 40772621
      Dan Carpenter authored
      commit f18ebc21 upstream.
      
      The problem with ornamental, do-nothing gotos is that they lead to
      "forgot to set the error code" bugs.  We should be returning -EINVAL
      here but we don't.  It leads to an uninitalized variable in
      counter_show():
      
          drivers/acpi/sysfs.c:603 counter_show()
          error: uninitialized symbol 'status'.
      
      Fixes: 1c8fce27 (ACPI: introduce drivers/acpi/sysfs.c)
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      40772621
    • Ian Abbott's avatar
      staging: comedi: daqboard2000: bug fix board type matching code · 479c12a0
      Ian Abbott authored
      commit 80e162ee upstream.
      
      `daqboard2000_find_boardinfo()` is supposed to check if the
      DaqBoard/2000 series model is supported, based on the PCI subvendor and
      subdevice ID.  The current code is wrong as it is comparing the PCI
      device's subdevice ID to an expected, fixed value for the subvendor ID.
      It should be comparing the PCI device's subvendor ID to this fixed
      value.  Correct it.
      
      Fixes: 7e8401b2 ("staging: comedi: daqboard2000: add back
      subsystem_device check")
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Cc: <stable@vger.kernel.org> # 3.7+
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      479c12a0
    • Dan Carpenter's avatar
      crypto: nx - off by one bug in nx_of_update_msc() · c0a803c1
      Dan Carpenter authored
      commit e514cc0a upstream.
      
      The props->ap[] array is defined like this:
      
      	struct alg_props ap[NX_MAX_FC][NX_MAX_MODE][3];
      
      So we can see that if msc->fc and msc->mode are == to NX_MAX_FC or
      NX_MAX_MODE then we're off by one.
      
      Fixes: ae0222b7 ('powerpc/crypto: nx driver code supporting nx encryption')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c0a803c1
    • Yinghai Lu's avatar
      megaraid_sas: Fix probing cards without io port · 40605334
      Yinghai Lu authored
      commit e7f85168 upstream.
      
      Found one megaraid_sas HBA probe fails,
      
      [  187.235190] scsi host2: Avago SAS based MegaRAID driver
      [  191.112365] megaraid_sas 0000:89:00.0: BAR 0: can't reserve [io  0x0000-0x00ff]
      [  191.120548] megaraid_sas 0000:89:00.0: IO memory region busy!
      
      and the card has resource like,
      [  125.097714] pci 0000:89:00.0: [1000:005d] type 00 class 0x010400
      [  125.104446] pci 0000:89:00.0: reg 0x10: [io  0x0000-0x00ff]
      [  125.110686] pci 0000:89:00.0: reg 0x14: [mem 0xce400000-0xce40ffff 64bit]
      [  125.118286] pci 0000:89:00.0: reg 0x1c: [mem 0xce300000-0xce3fffff 64bit]
      [  125.125891] pci 0000:89:00.0: reg 0x30: [mem 0xce200000-0xce2fffff pref]
      
      that does not io port resource allocated from BIOS, and kernel can not
      assign one as io port shortage.
      
      The driver is only looking for MEM, and should not fail.
      
      It turns out megasas_init_fw() etc are using bar index as mask.  index 1
      is used as mask 1, so that pci_request_selected_regions() is trying to
      request BAR0 instead of BAR1.
      
      Fix all related reference.
      
      Fixes: b6d5d880 ("megaraid_sas: Use lowest memory bar for SR-IOV VF support")
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Acked-by: default avatarKashyap Desai <kashyap.desai@broadcom.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      40605334
    • Dave Carroll's avatar
      aacraid: Check size values after double-fetch from user · 2c0ffdb8
      Dave Carroll authored
      commit fa00c437 upstream.
      
      In aacraid's ioctl_send_fib() we do two fetches from userspace, one the
      get the fib header's size and one for the fib itself. Later we use the
      size field from the second fetch to further process the fib. If for some
      reason the size from the second fetch is different than from the first
      fix, we may encounter an out-of- bounds access in aac_fib_send(). We
      also check the sender size to insure it is not out of bounds. This was
      reported in https://bugzilla.kernel.org/show_bug.cgi?id=116751 and was
      assigned CVE-2016-6480.
      Reported-by: default avatarPengfei Wang <wpengfeinudt@gmail.com>
      Fixes: 7c00ffa3 '[SCSI] 2.6 aacraid: Variable FIB size (updated patch)'
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDave Carroll <david.carroll@microsemi.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      2c0ffdb8
    • Simon Horman's avatar
      PCI: Limit config space size for Netronome NFP4000 · af2e2346
      Simon Horman authored
      commit c2e771b0 upstream.
      
      Like the NFP6000, the NFP4000 as an erratum where reading/writing to PCI
      config space addresses above 0x600 can cause the NFP to generate PCIe
      completion timeouts.
      
      Limit the NFP4000's PF's config space size to 0x600 bytes as is already
      done for the NFP6000.
      
      The NFP4000's VF is 0x6004 (PCI_DEVICE_ID_NETRONOME_NFP6000_VF), the same
      device ID as the NFP6000's VF.  Thus, its config space is already limited
      by the existing use of quirk_nfp6000().
      Signed-off-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      af2e2346