1. 13 Apr, 2022 21 commits
  2. 11 Apr, 2022 5 commits
  3. 09 Apr, 2022 4 commits
      RISC-V: KVM: include missing hwcap.h into vcpu_fp · 4054eee9
      Heiko Stuebner authored
      vcpu_fp uses the riscv_isa_extension mechanism, which is defined
      in hwcap.h, but does not include that header file.
      
      While it seems to work in most cases, in certain conditions
      this can lead to build failures like
      
      ../arch/riscv/kvm/vcpu_fp.c: In function ‘kvm_riscv_vcpu_fp_reset’:
      ../arch/riscv/kvm/vcpu_fp.c:22:13: error: implicit declaration of function ‘riscv_isa_extension_available’ [-Werror=implicit-function-declaration]
         22 |         if (riscv_isa_extension_available(&isa, f) ||
            |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ../arch/riscv/kvm/vcpu_fp.c:22:49: error: ‘f’ undeclared (first use in this function)
         22 |         if (riscv_isa_extension_available(&isa, f) ||
      
      Fix this by simply including the necessary header.
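      The fix, as described, is a one-line include. A sketch of what the commit presumably adds (the header path is inferred from the error text and the mechanism named above, not copied from the diff):

```c
/* ../arch/riscv/kvm/vcpu_fp.c -- sketch of the one-line fix */
#include <asm/hwcap.h>  /* declares riscv_isa_extension_available();
                         * previously reached only via transitive
                         * includes in some configurations */
```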
      
      Fixes: 0a86512d ("RISC-V: KVM: Factor-out FP virtualization into separate sources")
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Anup Patel <anup@brainfault.org>
      KVM: selftests: riscv: Fix alignment of the guest_hang() function · ebdef0de
      Anup Patel authored
      The guest_hang() function is used as the default exception handler
      for various KVM selftest applications by setting its address in
      the vstvec CSR. The vstvec CSR requires the exception handler base
      address to be at least 4-byte aligned, so this patch fixes the
      alignment of the guest_hang() function.
      
      Fixes: 3e06cdf1 ("KVM: selftests: Add initial support for RISC-V 64-bit")
      Signed-off-by: Anup Patel <apatel@ventanamicro.com>
      Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
      Signed-off-by: Anup Patel <anup@brainfault.org>
      KVM: selftests: riscv: Set PTE A and D bits in VS-stage page table · fac37253
      Anup Patel authored
      Supporting hardware updates of the PTE A and D bits is optional for
      a RISC-V implementation, so the current software strategy is to
      always set these bits in both G-stage (hypervisor) and VS-stage
      (guest kernel) page tables.
      
      If PTE A and D bits are not set by software (hypervisor or guest)
      then RISC-V implementations not supporting hardware updates of these
      bits will cause traps even for perfectly valid PTEs.
      
      Based on the above, the VS-stage page tables created by various
      KVM selftest applications are incorrect because the PTE A and D
      bits are not set. This patch fixes VS-stage page table programming
      of the PTE A and D bits for KVM selftests.
      
      Fixes: 3e06cdf1 ("KVM: selftests: Add initial support for RISC-V 64-bit")
      Signed-off-by: Anup Patel <apatel@ventanamicro.com>
      Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
      Signed-off-by: Anup Patel <anup@brainfault.org>
      RISC-V: KVM: Don't clear hgatp CSR in kvm_arch_vcpu_put() · 8c3ce496
      Anup Patel authored
      We might have RISC-V systems (such as QEMU) where the VMID is not
      part of the TLB entry tag, so these systems will have to flush all
      TLB entries upon any change in hgatp.VMID.
      
      Currently, we zero out the hgatp CSR in kvm_arch_vcpu_put() and
      re-program it in kvm_arch_vcpu_load(). On the systems described
      above, this flushes all TLB entries whenever the VCPU exits to
      user space, reducing performance.
      
      This patch fixes the performance issue described above by not
      clearing the hgatp CSR in kvm_arch_vcpu_put().
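      A sketch of the change reconstructed from the commit text (not the verbatim diff; the csr_write call shown as removed is inferred):

```c
/* sketch: RISC-V kvm_arch_vcpu_put(), per the description above */
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
	/* ...save vCPU CSR state as before... */

	/* Previously (inferred): csr_write(CSR_HGATP, 0);
	 * Dropped, because on cores whose TLB tags lack the VMID any
	 * write to hgatp, even zeroing it, implies a full TLB flush on
	 * every exit to user space. kvm_arch_vcpu_load() reprograms
	 * hgatp before the vCPU runs again, so clearing it here is
	 * unnecessary. */
}
```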
      
      Fixes: 34bde9d8 ("RISC-V: KVM: Implement VCPU world-switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: Anup Patel <apatel@ventanamicro.com>
      Signed-off-by: Anup Patel <anup@brainfault.org>
  4. 08 Apr, 2022 1 commit
  5. 07 Apr, 2022 4 commits
  6. 06 Apr, 2022 5 commits
      KVM: avoid NULL pointer dereference in kvm_dirty_ring_push · 5593473a
      Paolo Bonzini authored
      kvm_vcpu_release() will call kvm_dirty_ring_free(), freeing
      ring->dirty_gfns and setting it to NULL.  Afterwards, it calls
      kvm_arch_vcpu_destroy().
      
      However, if closing the file descriptor races with KVM_RUN in such
      a way that vcpu->arch.st.preempted == 0, the following call stack
      leads to a NULL pointer dereference in kvm_dirty_ring_push():
      
       mark_page_dirty_in_slot+0x192/0x270 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3171
       kvm_steal_time_set_preempted arch/x86/kvm/x86.c:4600 [inline]
       kvm_arch_vcpu_put+0x34e/0x5b0 arch/x86/kvm/x86.c:4618
       vcpu_put+0x1b/0x70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:211
       vmx_free_vcpu+0xcb/0x130 arch/x86/kvm/vmx/vmx.c:6985
       kvm_arch_vcpu_destroy+0x76/0x290 arch/x86/kvm/x86.c:11219
       kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline]
      
      The fix is to release the dirty page ring after
      kvm_arch_vcpu_destroy() has run.
      Reported-by: Qiuhao Li <qiuhao@sysec.org>
      Reported-by: Gaoning Pan <pgn@zju.edu.cn>
      Reported-by: Yongkang Jia <kangel@zju.edu.cn>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      KVM: arm64: selftests: Introduce vcpu_width_config · 2f5d27e6
      Reiji Watanabe authored
      Introduce a test for aarch64 that ensures that non-mixed-width
      vCPUs (all 64bit vCPUs or all 32bit vCPUs) can be configured, and
      that mixed-width vCPUs cannot.
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Reiji Watanabe <reijiw@google.com>
      Reviewed-by: Oliver Upton <oupton@google.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220329031924.619453-3-reijiw@google.com
      KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs · 26bf74bd
      Reiji Watanabe authored
      KVM allows userspace to configure either all EL1 32bit or 64bit vCPUs
      for a guest.  At vCPU reset, vcpu_allowed_register_width() checks
      if the vcpu's register width is consistent with all other vCPUs'.
      Since the check is done even against vCPUs that have not yet been
      initialized (KVM_ARM_VCPU_INIT has not been done), the
      uninitialized vCPUs are erroneously treated as 64bit vCPUs, which
      causes the function to incorrectly detect a mixed-width VM.
      
      Introduce KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED
      bits for kvm->arch.flags.  The EL1_32BIT bit indicates whether the
      guest is configured with all 32bit or all 64bit vCPUs, and the
      REG_WIDTH_CONFIGURED bit indicates whether the EL1_32BIT bit is
      valid (already set up).  Both bits are set at the guest's first
      KVM_ARM_VCPU_INIT, based on the KVM_ARM_VCPU_EL1_32BIT
      configuration for that vCPU.
      
      Check the vCPU's register width against those new bits at the
      vCPU's KVM_ARM_VCPU_INIT (instead of against the other vCPUs'
      register widths).
      
      Fixes: 66e94d5c ("KVM: arm64: Prevent mixed-width VM creation")
      Signed-off-by: Reiji Watanabe <reijiw@google.com>
      Reviewed-by: Oliver Upton <oupton@google.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220329031924.619453-2-reijiw@google.com
      KVM: arm64: Don't split hugepages outside of MMU write lock · f587661f
      Oliver Upton authored
      It is possible to take a stage-2 permission fault on a page larger than
      PAGE_SIZE. For example, when running a guest backed by 2M HugeTLB, KVM
      eagerly maps at the largest possible block size. When dirty logging is
      enabled on a memslot, KVM does *not* eagerly split these 2M stage-2
      mappings and instead clears the write bit on the pte.
      
      Since dirty logging is always performed at PAGE_SIZE granularity, KVM
      lazily splits these 2M block mappings down to PAGE_SIZE in the stage-2
      fault handler. This operation must be done under the write lock. Since
      commit f783ef1c ("KVM: arm64: Add fast path to handle permission
      relaxation during dirty logging"), the stage-2 fault handler
      conditionally takes the read lock on permission faults with dirty
      logging enabled. As a result, it is possible to split a 2M block
      mapping while holding only the read lock.
      
      The problem is demonstrated by running kvm_page_table_test with 2M
      anonymous HugeTLB, which splats like so:
      
        WARNING: CPU: 5 PID: 15276 at arch/arm64/kvm/hyp/pgtable.c:153 stage2_map_walk_leaf+0x124/0x158
      
        [...]
      
        Call trace:
        stage2_map_walk_leaf+0x124/0x158
        stage2_map_walker+0x5c/0xf0
        __kvm_pgtable_walk+0x100/0x1d4
        __kvm_pgtable_walk+0x140/0x1d4
        __kvm_pgtable_walk+0x140/0x1d4
        kvm_pgtable_walk+0xa0/0xf8
        kvm_pgtable_stage2_map+0x15c/0x198
        user_mem_abort+0x56c/0x838
        kvm_handle_guest_abort+0x1fc/0x2a4
        handle_exit+0xa4/0x120
        kvm_arch_vcpu_ioctl_run+0x200/0x448
        kvm_vcpu_ioctl+0x588/0x664
        __arm64_sys_ioctl+0x9c/0xd4
        invoke_syscall+0x4c/0x144
        el0_svc_common+0xc4/0x190
        do_el0_svc+0x30/0x8c
        el0_svc+0x28/0xcc
        el0t_64_sync_handler+0x84/0xe4
        el0t_64_sync+0x1a4/0x1a8
      
      Fix the issue by only acquiring the read lock if the guest faulted
      on a PAGE_SIZE granule with dirty logging enabled. Add a WARN to
      catch locking bugs in future changes.
      
      Fixes: f783ef1c ("KVM: arm64: Add fast path to handle permission relaxation during dirty logging")
      Cc: Jing Zhang <jingzhangos@google.com>
      Signed-off-by: Oliver Upton <oupton@google.com>
      Reviewed-by: Reiji Watanabe <reijiw@google.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220401194652.950240-1-oupton@google.com