1. 06 Feb, 2017 37 commits
  2. 21 Oct, 2016 1 commit
  3. 19 Oct, 2016 2 commits
    • mm: remove gup_flags FOLL_WRITE games from __get_user_pages() · 9691eac5
      Linus Torvalds authored
      commit 19be0eaf upstream.
      
      This is an ancient bug that was actually attempted to be fixed once
      (badly) by me eleven years ago in commit 4ceb5db9 ("Fix
      get_user_pages() race for write access") but that was then undone due to
      problems on s390 by commit f33ea7f4 ("fix get_user_pages bug").
      
      In the meantime, the s390 situation has long been fixed, and we can now
      fix it by checking the pte_dirty() bit properly (and do it better).  The
      s390 dirty bit was implemented in abf09bed ("s390/mm: implement
      software dirty bits") which made it into v3.9.  Earlier kernels will
      have to look at the page state itself.
      
      Also, the VM has become more scalable, and what used to be a purely
      theoretical race back then has become easier to trigger.
      
      To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
      we already did a COW" rather than play racy games with FOLL_WRITE that
      is very fundamental, and then use the pte dirty flag to validate that
      the FOLL_COW flag is still valid.

      Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Michal Hocko <mhocko@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Nick Piggin <npiggin@gmail.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [wt: s/gup.c/memory.c; s/follow_page_pte/follow_page_mask;
           s/faultin_page/__get_user_page]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
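
The FOLL_COW check described in the mm commit above can be modelled in user space. The following is a minimal sketch, not the kernel code: `struct model_pte`, `model_can_follow_write()` and the flag values are hypothetical names and numbers chosen for illustration, and the kernel's actual check may involve further conditions. It only shows the idea that a write lookup on a read-only entry is accepted when a COW has already been done (FOLL_COW) and the entry is still dirty.

```c
#include <stdbool.h>

/* Illustrative flag values only; not taken from the kernel headers. */
#define FOLL_WRITE 0x01u
#define FOLL_COW   0x02u

/* Hypothetical stand-in for the two PTE bits the commit message mentions. */
struct model_pte {
    bool writable;
    bool dirty;
};

/*
 * Decide whether a lookup may use this entry.
 * FOLL_WRITE is never cleared; instead, FOLL_COW records that a COW
 * already happened, and the dirty bit validates that this is still true.
 */
static bool model_can_follow_write(struct model_pte pte, unsigned int flags)
{
    if (!(flags & FOLL_WRITE))
        return true;            /* read access: nothing to validate */
    if (pte.writable)
        return true;            /* entry is writable: write is allowed */
    /* Read-only entry: accept only if we did the COW and it is still dirty. */
    return (flags & FOLL_COW) && pte.dirty;
}
```

In this model a caller that gets `false` would fault the page in again and retry, rather than dropping FOLL_WRITE and hoping the entry has not changed in the meantime.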
    • xen-netback: ref count shared rings · 4cebb475
      Wei Liu authored
      ... so that we can make sure the rings are not freed until all SKBs in
      internal queues are consumed.
      
      1. The VM is receiving packets through bonding + bridge + netback +
         netfront.
      2. For some unknown reason at least one packet remains in the rx queue
         and is not delivered to the domU immediately by netback.
      3. The VM finishes shutting down.
      4. The shared ring between dom0 and domU is freed.
      5. Then xen-netback continues processing the pending requests and tries
         to put the packet into the now already released shared ring.
      
      > XXXlan0: port 9(vif26.0) entered disabled state
      > BUG: unable to handle kernel paging request at ffffc900108641d8
      > IP: [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
      > PGD 57e20067 PUD 57e21067 PMD 571a7067 PTE 0
      > Oops: 0000 [#1] SMP
      > ...
      > CPU: 0 PID: 12587 Comm: netback/0 Not tainted 3.10.0-ucs58-amd64 #1 Debian 3.10.11-1.58.201405060908
      > Hardware name: FUJITSU PRIMERGY BX620 S6/D3051, BIOS 080015 Rev.3C78.3051 07/22/2011
      > task: ffff880004b067c0 ti: ffff8800561ec000 task.ti: ffff8800561ec000
      > RIP: e030:[<ffffffffa04147dc>]  [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
      > RSP: e02b:ffff8800561edce8  EFLAGS: 00010202
      > RAX: ffffc900104adac0 RBX: ffff8800541e95c0 RCX: ffffc90010864000
      > RDX: 000000000000003b RSI: 0000000000000000 RDI: ffff880040014380
      > RBP: ffff8800570e6800 R08: 0000000000000000 R09: ffff880004799800
      > R10: ffffffff813ca115 R11: ffff88005e4fdb08 R12: ffff880054e6f800
      > R13: ffff8800561edd58 R14: ffffc900104a1000 R15: 0000000000000000
      > FS:  00007f19a54a8700(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
      > CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
      > CR2: ffffc900108641d8 CR3: 0000000054cb3000 CR4: 0000000000002660
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > Stack:
      >  ffff880004b06ba0 0000000000000000 ffff88005da13ec0 ffff88005da13ec0
      >  0000000004b067c0 ffffc900104a8ac0 ffffc900104a1020 000000005da13ec0
      >  0000000000000000 0000000000000001 ffffc900104a8ac0 ffffc900104adac0
      > Call Trace:
      >  [<ffffffff813ca32d>] ? _raw_spin_lock_irqsave+0x11/0x2f
      >  [<ffffffffa0416033>] ? xen_netbk_kthread+0x174/0x841 [xen_netback]
      >  [<ffffffff8105d373>] ? wake_up_bit+0x20/0x20
      >  [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
      >  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
      >  [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
      >  [<ffffffff8105ce1e>] ? kthread+0xab/0xb3
      >  [<ffffffff81003638>] ? xen_end_context_switch+0xe/0x1c
      >  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
      >  [<ffffffff813cfbfc>] ? ret_from_fork+0x7c/0xb0
      >  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
      > Code: 8b b3 d0 00 00 00 48 8b bb d8 00 00 00 0f b7 74 37 02 89 70 08 eb 07 c7 40 08 00 00 00 00 89 d2 c7 40 04 00 00 00 00 48 83 c2 08 <0f> b7 34 d1 89 30 c7 44 24 60 00 00 00 00 8b 44 d1 04 89 44 24
      > RIP  [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
      >  RSP <ffff8800561edce8>
      > CR2: ffffc900108641d8
      
      Track the shared ring buffer being unmapped and drop those packets.
      
      Ref-count the rings as follows:
        map        -> set to 1
        start_xmit -> inc when queueing SKB to internal queue
        rx_action  -> dec after finishing processing a SKB
        unmap      -> dec and wait to be 0
      
      Note that this is different from ref counting the vif structure itself.
      Currently only guest Rx path is taken care of because that's where the
      bug surfaced.
      
      This bug doesn't exist in kernel >=3.12 as multi-queue support was added
      there.
      
      Link: <https://lists.xenproject.org/archives/html/xen-devel/2014-06/msg00818.html>
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: Philipp Hahn <hahn@univention.de>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Tested-by: Philipp Hahn <hahn@univention.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
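
The ring ref-counting scheme listed in the xen-netback commit above (map, start_xmit, rx_action, unmap) can be sketched as a small user-space model. This is an illustration under stated assumptions, not the driver code: `struct model_ring` and the `ring_*` helpers are hypothetical names, the model is exercised single-threaded, and the "wait to be 0" step is represented by a predicate the caller would wait on.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a shared ring; not the xen-netback structures. */
struct model_ring {
    atomic_int refs;   /* 1 for the mapping itself, +1 per queued SKB */
    bool mapped;       /* cleared when unmap starts */
};

/* map: the mapping holds the initial reference. */
static void ring_map(struct model_ring *r)
{
    atomic_init(&r->refs, 1);
    r->mapped = true;
}

/* start_xmit: take a reference when queueing an SKB; drop it if unmapped. */
static bool ring_queue_skb(struct model_ring *r)
{
    if (!r->mapped)
        return false;              /* ring is going away: drop the packet */
    atomic_fetch_add(&r->refs, 1);
    return true;
}

/* rx_action: release the reference after processing one SKB. */
static void ring_skb_done(struct model_ring *r)
{
    atomic_fetch_sub(&r->refs, 1);
}

/* unmap, step 1: stop new users and drop the mapping's own reference.
 * Returns true if no SKB still holds the ring. */
static bool ring_unmap_begin(struct model_ring *r)
{
    r->mapped = false;
    return atomic_fetch_sub(&r->refs, 1) == 1;
}

/* unmap, step 2: the ring may be freed only once the count reaches 0. */
static bool ring_can_free(struct model_ring *r)
{
    return atomic_load(&r->refs) == 0;
}
```

In a real concurrent setting, the check of `mapped` and the increment in `ring_queue_skb()` would need to be made safe against a simultaneous unmap; the sketch ignores that and only shows the counting order.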