1. 16 Jan, 2018 5 commits
    • KVM: x86: fix escape of guest dr6 to the host · efdab992
      Wanpeng Li authored
      syzkaller reported:
      
         WARNING: CPU: 0 PID: 12927 at arch/x86/kernel/traps.c:780 do_debug+0x222/0x250
         CPU: 0 PID: 12927 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #16
         RIP: 0010:do_debug+0x222/0x250
         Call Trace:
          <#DB>
          debug+0x3e/0x70
         RIP: 0010:copy_user_enhanced_fast_string+0x10/0x20
          </#DB>
          _copy_from_user+0x5b/0x90
          SyS_timer_create+0x33/0x80
          entry_SYSCALL_64_fastpath+0x23/0x9a
      
      The testcase sets a watchpoint (with perf_event_open) on a buffer that is
      passed to timer_create() as the struct sigevent argument.  In timer_create(),
      copy_from_user()'s rep movsb triggers the hardware breakpoint.  The testcase also sets
      the debug registers for the guest.
      
      However, KVM only restores host debug registers when the host has active
      watchpoints, which triggers a race condition when running the testcase with
      multiple threads.  The guest's DR6.BS bit can escape to the host before
      another thread invokes timer_create(), and do_debug() complains.
      
      The fix is to respect do_debug()'s dr6 invariant when leaving KVM.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
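The race described in this commit can be modelled in user space. The sketch below is a toy model, not the kernel patch: `hw_dr6`, `struct toy_vcpu`, and the `toy_*` helpers are hypothetical names, and the `clear_on_exit` flag merely stands in for "respect do_debug()'s dr6 invariant when leaving KVM".

```c
#include <stdint.h>

#define DR6_BS (1ULL << 14)	/* single-step status bit in DR6 */

/* Hypothetical stand-in for the physical CPU's DR6 register. */
static uint64_t hw_dr6;

struct toy_vcpu {
	uint64_t guest_dr6;	/* DR6 value owned by the guest */
};

/* Entering the guest loads the guest's DR6 into the "hardware". */
static void toy_enter_guest(const struct toy_vcpu *v)
{
	hw_dr6 = v->guest_dr6;
}

/*
 * Leaving the guest saves the guest value. With clear_on_exit set,
 * the register is zeroed so that no guest bit is left for the host.
 */
static void toy_leave_guest(struct toy_vcpu *v, int clear_on_exit)
{
	v->guest_dr6 = hw_dr6;
	if (clear_on_exit)
		hw_dr6 = 0;
}

/* What a host #DB handler would observe: is a stale BS bit present? */
static int toy_host_sees_stale_bs(void)
{
	return (hw_dr6 & DR6_BS) != 0;
}
```

With `clear_on_exit` set to 0 the guest's BS bit stays visible to the host after the exit; with 1 the host sees a clean DR6 while the guest's own value is preserved in `guest_dr6`.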
    • KVM: X86: support paravirtualized help for TLB shootdowns · f38a7b75
      Wanpeng Li authored
      When running on a virtual machine, IPIs are expensive when the target
      CPU is sleeping.  Thus, it is nice to be able to avoid them for TLB
      shootdowns.  KVM can just do the flush via INVVPID on the guest's behalf
      the next time the CPU is scheduled.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      [Use "&" to test the bit instead of "==". - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
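A minimal sketch of the host-side idea, assuming a flag bit next to `KVM_VCPU_PREEMPTED` in the shared `preempted` field. The name `TOY_VCPU_FLUSH_TLB`, the struct, and the flush counter are illustrative assumptions; the counter stands in for the INVVPID-based flush mentioned above.

```c
#include <stdint.h>

#define KVM_VCPU_PREEMPTED	(1 << 0)
#define TOY_VCPU_FLUSH_TLB	(1 << 1)	/* assumed name for the new bit */

struct toy_vcpu {
	uint8_t preempted;	/* stands in for the shared steal-time field */
	unsigned int flushes;	/* counts simulated guest TLB flushes */
};

/* Host marks the vCPU as preempted when it is scheduled out. */
static void toy_vcpu_sched_out(struct toy_vcpu *v)
{
	v->preempted = KVM_VCPU_PREEMPTED;
}

/* Host performs any deferred flush when the vCPU is scheduled in again. */
static void toy_vcpu_sched_in(struct toy_vcpu *v)
{
	/* Test the bit with "&": KVM_VCPU_PREEMPTED is usually set as well. */
	if (v->preempted & TOY_VCPU_FLUSH_TLB)
		v->flushes++;
	v->preempted = 0;
}
```

Comparing with `==` here would miss the request whenever the preempted bit is also set, which is the case the bracketed note in the trailers addresses.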
    • KVM: X86: introduce invalidate_gpa argument to tlb flush · c2ba05cc
      Wanpeng Li authored
      Introduce a new bool invalidate_gpa argument to kvm_x86_ops->tlb_flush;
      later patches will use it to flush only the guest TLB.
      
      For VMX, this will use INVVPID instead of INVEPT, which will invalidate
      combined mappings while keeping guest-physical mappings.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
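The choice between the two invalidation instructions can be sketched as a small decision function. This is an illustrative model only: the enum, the function name, and the `ept_enabled` / `vpid_enabled` parameters are assumptions, not the signature used in the patch.

```c
#include <stdbool.h>

enum toy_flush_kind {
	TOY_FLUSH_INVEPT,	/* drops guest-physical and combined mappings */
	TOY_FLUSH_INVVPID,	/* drops combined mappings, keeps guest-physical ones */
};

/*
 * Pick the flush instruction for a request.
 * invalidate_gpa == false means "only the guest's view needs flushing",
 * so the cheaper INVVPID suffices when VPIDs are available.
 */
static enum toy_flush_kind toy_pick_flush(bool ept_enabled, bool vpid_enabled,
					  bool invalidate_gpa)
{
	if (ept_enabled && (invalidate_gpa || !vpid_enabled))
		return TOY_FLUSH_INVEPT;
	return TOY_FLUSH_INVVPID;
}
```

Under these assumptions a guest-only flush with both features enabled resolves to INVVPID, while any request that must also drop guest-physical mappings falls back to INVEPT.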
    • KVM: X86: use paravirtualized TLB Shootdown · 858a43aa
      Wanpeng Li authored
      A remote TLB flush does a busy wait, which is fine in a bare-metal
      scenario. But within the guest, the vCPUs might have been preempted or
      blocked. In this scenario, the initiator vCPU would end up busy-waiting
      for a long time; it also consumes CPU unnecessarily to wake
      up the target of the shootdown.
      
      This patch set adds support for KVM's new paravirtualized TLB flush;
      a remote TLB flush no longer waits for vCPUs that are sleeping. Instead,
      KVM flushes the TLB as soon as the vCPU starts running again.
      
      The improvement is clearly visible when the host is overcommitted; in this
      case, the PV TLB flush (in addition to avoiding the wait on the main CPU)
      prevents preempted vCPUs from stealing precious execution time from the
      running ones.
      
      Tested on a two-socket Xeon Gold 6142 (2.6GHz) with 32 cores and 64 threads,
      so 64 pCPUs; each VM has 64 vCPUs.
      
      ebizzy -M
                    vanilla    optimized     boost
      1VM            46799       48670         4%
      2VM            23962       42691        78%
      3VM            16152       37539       132%
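The boost column can be reproduced from the other two columns. The helper below is a small sketch with a hypothetical name; it computes the relative gain in percent, rounded to the nearest integer using integer arithmetic only.

```c
/*
 * Relative improvement of `optimized` over `vanilla`, in percent,
 * rounded to the nearest integer. Assumes optimized >= vanilla > 0.
 */
static unsigned long toy_boost_percent(unsigned long vanilla,
				       unsigned long optimized)
{
	unsigned long gain = optimized - vanilla;

	return (100 * gain + vanilla / 2) / vanilla;
}
```

Applied to the three rows above, this yields 4, 78, and 132, matching the table.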
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
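On the guest side, the idea is to split the target set into CPUs that need an IPI right away and preempted vCPUs that only get a flag. The following user-space sketch assumes a per-CPU `preempted` byte and a bit named `TOY_VCPU_FLUSH_TLB`; both the names and the 8-CPU bitmask representation are illustrative, not the kernel's data structures.

```c
#include <stdatomic.h>
#include <stdint.h>

#define KVM_VCPU_PREEMPTED	(1 << 0)
#define TOY_VCPU_FLUSH_TLB	(1 << 1)	/* assumed name for the new bit */
#define TOY_NR_CPUS		8

/* One shared status byte per vCPU, written by host and guest. */
static _Atomic uint8_t toy_preempted[TOY_NR_CPUS];

/*
 * Returns the subset of `targets` that still needs a real IPI.
 * Preempted vCPUs get the flush flag instead and are dropped from the set.
 */
static uint32_t toy_flush_tlb_others(uint32_t targets)
{
	uint32_t need_ipi = targets;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		uint8_t state;

		if (!(targets & (1u << cpu)))
			continue;

		state = atomic_load(&toy_preempted[cpu]);
		if (!(state & KVM_VCPU_PREEMPTED))
			continue;

		/*
		 * Only skip the IPI if the flag was set while the vCPU was
		 * still marked preempted; otherwise it may already be running.
		 */
		if (atomic_compare_exchange_strong(&toy_preempted[cpu], &state,
				(uint8_t)(state | TOY_VCPU_FLUSH_TLB)))
			need_ipi &= ~(1u << cpu);
	}
	return need_ipi;
}
```

The compare-and-exchange matters in this model: if the host clears the preempted bit between the load and the update, the exchange fails and the CPU stays in the IPI set, so no flush is lost.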
    • KVM: X86: Add KVM_VCPU_PREEMPTED · fa55eedd
      Wanpeng Li authored
      The next patch will add another bit to the preempted field in
      kvm_steal_time.  Define a constant for bit 0 (the only one that is
      currently used).
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
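Why a named constant for bit 0 helps once a second bit exists can be shown with a short sketch. The helper name is hypothetical; only `KVM_VCPU_PREEMPTED` and its bit position come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define KVM_VCPU_PREEMPTED	(1 << 0)	/* bit 0 of the preempted field */

/*
 * Mask out bit 0 instead of testing the whole field for non-zero,
 * so that additional flag bits do not change the answer.
 */
static bool toy_vcpu_is_preempted(uint8_t preempted)
{
	return (preempted & KVM_VCPU_PREEMPTED) != 0;
}
```

A plain non-zero check would report a vCPU as preempted whenever any other bit in the field is set; the masked test stays correct as the field grows.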
  2. 14 Dec, 2017 35 commits