  1. 22 Feb, 2024 2 commits
    • KVM: x86/xen: allow vcpu_info to be mapped by fixed HVA · 3991f358
      Paul Durrant authored
      If the guest does not explicitly set the GPA of the vcpu_info structure
      in memory then, for guests with 32 vCPUs or fewer, the vcpu_info embedded
      in the shared_info page may be used. As described in a previous commit,
      the shared_info page is an overlay at a fixed HVA within the VMM, so in
      this case it is also better to activate the vcpu_info cache with a
      fixed HVA to avoid unnecessary invalidation if the guest memory layout
      is modified.
      Signed-off-by: Paul Durrant <pdurrant@amazon.com>
      Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
      Link: https://lore.kernel.org/r/20240215152916.1158-14-paul@xen.org
      [sean: use kvm_gpc_is_{gpa,hva}_active()]
      Signed-off-by: Sean Christopherson <seanjc@google.com>
    • KVM: x86/xen: allow shared_info to be mapped by fixed HVA · b9220d32
      Paul Durrant authored
      The shared_info page is not guest memory as such. It is a dedicated page
      allocated by the VMM and overlaid onto guest memory in a GFN chosen by the
      guest and specified in the XENMEM_add_to_physmap hypercall. The guest may
      even request that shared_info be moved from one GFN to another by
      re-issuing that hypercall, but the HVA is never going to change.
      
      Because the shared_info page is an overlay the memory slots need to be
      updated in response to the hypercall. However, memory slot adjustment is
      not atomic and, whilst all vCPUs are paused, there is still the possibility
      that events may be delivered (which requires the shared_info page to be
      updated) whilst the shared_info GPA is absent. The HVA is never absent
      though, so it makes much more sense to use that as the basis for the
      kernel's mapping.
      
      Hence add a new KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA attribute type for this
      purpose and a KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag to advertize its
      availability. Don't actually advertize it yet though. That will be done in
      a subsequent patch, which will also add tests for the new attribute type.
      
      Also update the KVM API documentation with the new attribute and also fix
      it up to consistently refer to 'shared_info' (with the underscore).
      Signed-off-by: Paul Durrant <pdurrant@amazon.com>
      Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
      Link: https://lore.kernel.org/r/20240215152916.1158-13-paul@xen.org
      [sean: store "hva" as a user pointer, use kvm_gpc_is_{gpa,hva}_active()]
      Signed-off-by: Sean Christopherson <seanjc@google.com>
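The reasoning in the shared_info commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct slot`, `lookup_by_gfn()`, `deliver_event()` and `move_and_deliver()` are hypothetical names invented for illustration and are not KVM code or its API. The model treats the overlay as briefly absent while it is moved, and shows that a GFN-based lookup can fail in that window while the fixed HVA stays usable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory slot mapping one GFN to a host page. */
struct slot {
	uint64_t gfn;
	unsigned char *hva;
	bool present;	/* false while the slot is being moved */
};

/* GFN-based lookup: fails while the slot is absent. */
static unsigned char *lookup_by_gfn(const struct slot *s, uint64_t gfn)
{
	return (s->present && s->gfn == gfn) ? s->hva : NULL;
}

/* Set one pending-event byte in the page; returns false if unreachable. */
static bool deliver_event(unsigned char *page, size_t port)
{
	if (!page)
		return false;
	page[port] = 1;
	return true;
}

/* Move the overlay to new_gfn and try a delivery in the non-atomic window. */
static bool move_and_deliver(struct slot *s, uint64_t new_gfn, bool use_hva)
{
	unsigned char *fixed_hva = s->hva;	/* never changes */
	uint64_t old_gfn = s->gfn;
	bool ok;

	s->present = false;			/* old mapping removed */
	ok = deliver_event(use_hva ? fixed_hva : lookup_by_gfn(s, old_gfn), 3);
	s->gfn = new_gfn;			/* new mapping installed */
	s->present = true;
	return ok;
}
```

In this model, calling `move_and_deliver()` with `use_hva` set to false loses the event, while the HVA-based path delivers it regardless of where the guest currently places the page.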
  2. 20 Feb, 2024 11 commits
  3. 08 Feb, 2024 13 commits
  4. 07 Feb, 2024 4 commits
    • Merge tag 'loongarch-fixes-6.8-2' of... · 547ab8fc
      Linus Torvalds authored
      Merge tag 'loongarch-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch fixes from Huacai Chen:
       "Fix acpi_core_pic[] array overflow, fix earlycon parameter if KASAN
        enabled, disable UBSAN instrumentation for vDSO build, and two Kconfig
        cleanups"
      
      * tag 'loongarch-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: vDSO: Disable UBSAN instrumentation
        LoongArch: Fix earlycon parameter if KASAN enabled
        LoongArch: Change acpi_core_pic[NR_CPUS] to acpi_core_pic[MAX_CORE_PIC]
        LoongArch: Select HAVE_ARCH_SECCOMP to use the common SECCOMP menu
        LoongArch: Select ARCH_ENABLE_THP_MIGRATION instead of redefining it
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 5c24ba20
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "x86 guest:
      
         - Avoid false positive for check that only matters on AMD processors
      
        x86:
      
         - Give a hint when Win2016 might fail to boot due to XSAVES &&
           !XSAVEC configuration
      
         - Do not allow creating an in-kernel PIT unless an IOAPIC already
           exists
      
        RISC-V:
      
         - Allow ISA extensions that were enabled for bare metal in 6.8 (Zbc,
           scalar and vector crypto, Zfh[min], Zihintntl, Zvfh[min], Zfa)
      
        S390:
      
         - fix CC for successful PQAP instruction
      
         - fix a race when creating a shadow page"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        x86/coco: Define cc_vendor without CONFIG_ARCH_HAS_CC_PLATFORM
        x86/kvm: Fix SEV check in sev_map_percpu_data()
        KVM: x86: Give a hint when Win2016 might fail to boot due to XSAVES erratum
        KVM: x86: Check irqchip mode before create PIT
        KVM: riscv: selftests: Add Zfa extension to get-reg-list test
        RISC-V: KVM: Allow Zfa extension for Guest/VM
        KVM: riscv: selftests: Add Zvfh[min] extensions to get-reg-list test
        RISC-V: KVM: Allow Zvfh[min] extensions for Guest/VM
        KVM: riscv: selftests: Add Zihintntl extension to get-reg-list test
        RISC-V: KVM: Allow Zihintntl extension for Guest/VM
        KVM: riscv: selftests: Add Zfh[min] extensions to get-reg-list test
        RISC-V: KVM: Allow Zfh[min] extensions for Guest/VM
        KVM: riscv: selftests: Add vector crypto extensions to get-reg-list test
        RISC-V: KVM: Allow vector crypto extensions for Guest/VM
        KVM: riscv: selftests: Add scaler crypto extensions to get-reg-list test
        RISC-V: KVM: Allow scalar crypto extensions for Guest/VM
        KVM: riscv: selftests: Add Zbc extension to get-reg-list test
        RISC-V: KVM: Allow Zbc extension for Guest/VM
        KVM: s390: fix cc for successful PQAP
        KVM: s390: vsie: fix race during shadow creation
    • Merge tag 'nfsd-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · c8d80f83
      Linus Torvalds authored
      Pull nfsd fix from Chuck Lever:
      
       - Address a deadlock regression in RELEASE_LOCKOWNER
      
      * tag 'nfsd-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        nfsd: don't take fi_lock in nfsd_break_deleg_cb()
    • Merge tag 'for-6.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 6d280f4d
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - two fixes preventing deletion and manual creation of subvolume qgroup
      
       - unify error code returned for unknown send flags
      
       - fix assertion during subvolume creation when anonymous device could
         be allocated by other thread (e.g. due to backref walk)
      
      * tag 'for-6.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: do not ASSERT() if the newly created subvolume already got read
        btrfs: forbid deleting live subvol qgroup
        btrfs: forbid creating subvol qgroups
        btrfs: send: return EOPNOTSUPP on unknown flags
  5. 06 Feb, 2024 7 commits
    • x86/coco: Define cc_vendor without CONFIG_ARCH_HAS_CC_PLATFORM · e4596477
      Nathan Chancellor authored
      After commit a9ef2774 ("x86/kvm: Fix SEV check in
      sev_map_percpu_data()"), there is a build error when building
      x86_64_defconfig with GCOV using LLVM:
      
        ld.lld: error: undefined symbol: cc_vendor
        >>> referenced by kvm.c
        >>>               arch/x86/kernel/kvm.o:(kvm_smp_prepare_boot_cpu) in archive vmlinux.a
      
      which corresponds to
      
        if (cc_vendor != CC_VENDOR_AMD ||
            !cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
                  return;
      
      Without GCOV, clang is able to eliminate the use of cc_vendor because
      cc_platform_has() evaluates to false when CONFIG_ARCH_HAS_CC_PLATFORM is
      not set, meaning that if statement will be true no matter what value
      cc_vendor has.
      
      With GCOV, the instrumentation keeps the use of cc_vendor around for
      code coverage purposes but cc_vendor is only declared, not defined,
      without CONFIG_ARCH_HAS_CC_PLATFORM, leading to the build error above.
      
      Provide a macro definition of cc_vendor when CONFIG_ARCH_HAS_CC_PLATFORM
      is not set with a value of CC_VENDOR_NONE, so that the first condition
      can always be evaluated/eliminated at compile time, avoiding the build
      error altogether. This is very similar to the situation prior to
      commit da86eb96 ("x86/coco: Get rid of accessor functions").
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
      Message-Id: <20240202-provide-cc_vendor-without-arch_has_cc_platform-v1-1-09ad5f2a3099@kernel.org>
      Fixes: a9ef2774 ("x86/kvm: Fix SEV check in sev_map_percpu_data()", 2024-01-31)
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
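The macro approach described in the cc_vendor commit can be sketched as a standalone file. This is a minimal illustration, not the kernel header: the numeric value of `CC_ATTR_GUEST_MEM_ENCRYPT`, the stubbed `cc_platform_has()` and the helper `sev_setup_needed()` are assumptions made for this example. Without `CONFIG_ARCH_HAS_CC_PLATFORM`, `cc_vendor` expands to a constant, so the first condition never references an undefined symbol.

```c
#include <stdbool.h>

/* Vendor identifiers, using the names quoted in the commit message. */
enum cc_vendor { CC_VENDOR_NONE, CC_VENDOR_AMD, CC_VENDOR_INTEL };

#define CC_ATTR_GUEST_MEM_ENCRYPT 1	/* hypothetical value for this sketch */

#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
/* Platform support built in: the variable lives in another object file. */
extern enum cc_vendor cc_vendor;
bool cc_platform_has(int attr);
#else
/* No platform support: the vendor becomes a compile-time constant ... */
#define cc_vendor (CC_VENDOR_NONE)
/* ... and the attribute query is a stub that always reports false. */
static inline bool cc_platform_has(int attr)
{
	(void)attr;
	return false;
}
#endif

/* Hypothetical helper: true only when the SEV-specific path would run. */
static bool sev_setup_needed(void)
{
	if (cc_vendor != CC_VENDOR_AMD ||
	    !cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
		return false;
	return true;
}
```

Built without `-DCONFIG_ARCH_HAS_CC_PLATFORM`, this links with no definition of `cc_vendor`, even if instrumentation keeps the comparison.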
    • Merge tag 'bcachefs-2024-02-05' of https://evilpiepirate.org/git/bcachefs · 99bd3cb0
      Linus Torvalds authored
      Pull bcachefs fixes from Kent Overstreet:
       "Two serious ones here that we'll want to backport to stable: a fix for
        a race in the thread_with_file code, and another locking fixup in the
        subvolume deletion path"
      
      * tag 'bcachefs-2024-02-05' of https://evilpiepirate.org/git/bcachefs:
        bcachefs: time_stats: Check for last_event == 0 when updating freq stats
        bcachefs: install fd later to avoid race with close
        bcachefs: unlock parent dir if entry is not found in subvolume deletion
        bcachefs: Fix build on parisc by avoiding __multi3()
    • LoongArch: vDSO: Disable UBSAN instrumentation · cca5efe7
      Kees Cook authored
      The vDSO executes in userspace, so the kernel's UBSAN should not
      instrument it. This resolves build errors of the following kind:
      
        loongarch64-linux-ld: arch/loongarch/vdso/vgettimeofday.o: in function `vdso_shift_ns':
        lib/vdso/gettimeofday.c:23:(.text+0x3f8): undefined reference to `__ubsan_handle_shift_out_of_bounds'
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202401310530.lZHCj1Zl-lkp@intel.com/
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: WANG Xuerui <kernel@xen0n.name>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Fangrui Song <maskray@google.com>
      Cc: loongarch@lists.linux.dev
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Fix earlycon parameter if KASAN enabled · 639420e9
      Huacai Chen authored
      The earlycon parameter is based on fixmap, and fixmap addresses are not
      supposed to be shadowed by KASAN. So return the kasan_early_shadow_page
      in kasan_mem_to_shadow() if the input address is above FIXADDR_START.
      Otherwise earlycon cannot work after kasan_init().
      
      Cc: stable@vger.kernel.org
      Fixes: 5aa4ac64 ("LoongArch: Add KASAN (Kernel Address Sanitizer) support")
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
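The address check described in the earlycon commit might look roughly like the sketch below. It is a simplified userspace model, not the LoongArch implementation: the values of `FIXADDR_START` and `KASAN_SHADOW_OFFSET` are hypothetical, the fallback uses the generic shift-and-offset shadow formula rather than the architecture's real mapping, and treating the boundary as inclusive is an assumption.

```c
#include <stdint.h>

#define FIXADDR_START            0xffff800000000000ULL	/* hypothetical */
#define KASAN_SHADOW_OFFSET      0x0000100000000000ULL	/* hypothetical */
#define KASAN_SHADOW_SCALE_SHIFT 3	/* one shadow byte per 8 bytes */

/* Stand-in for the shared, read-only early shadow page. */
static unsigned char kasan_early_shadow_page[4096];

/* Map a memory address to the address of its shadow byte. */
static void *mem_to_shadow(uint64_t addr)
{
	/* Fixmap addresses are not shadowed: hand back the early page. */
	if (addr >= FIXADDR_START)
		return kasan_early_shadow_page;

	/* Generic formula, used here only as a placeholder. */
	return (void *)(uintptr_t)((addr >> KASAN_SHADOW_SCALE_SHIFT) +
				   KASAN_SHADOW_OFFSET);
}
```

With this shape, an access through a fixmap address (such as the earlycon mapping) resolves to a valid shadow page instead of an unmapped shadow address.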
    • LoongArch: Change acpi_core_pic[NR_CPUS] to acpi_core_pic[MAX_CORE_PIC] · 4551b305
      Huacai Chen authored
      With the default config, the value of NR_CPUS is 64. When the hardware
      platform has more than 64 CPUs, the system will crash. MAX_CORE_PIC
      is the maximum CPU number in the MADT table (max physical number), which
      can exceed the supported maximum CPU number (NR_CPUS, max logical number),
      but the kernel should not crash. The kernel should boot NR_CPUS CPUs and
      let the remaining CPUs stay in BIOS.
      
      The potential crash reason is that the array acpi_core_pic[NR_CPUS] can
      overflow when parsing the MADT table. CORE_PIC clearly corresponds to
      physical cores rather than logical cores, so it is better to define the
      array as acpi_core_pic[MAX_CORE_PIC].
      
      With the patch, the system can boot 64 vCPUs with the QEMU parameter
      -smp 128; otherwise the system crashes with the following message.
      
      [    0.000000] CPU 0 Unable to handle kernel paging request at virtual address 0000420000004259, era == 90000000037a5f0c, ra == 90000000037a46ec
      [    0.000000] Oops[#1]:
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc2+ #192
      [    0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
      [    0.000000] pc 90000000037a5f0c ra 90000000037a46ec tp 9000000003c90000 sp 9000000003c93d60
      [    0.000000] a0 0000000000000019 a1 9000000003d93bc0 a2 0000000000000000 a3 9000000003c93bd8
      [    0.000000] a4 9000000003c93a74 a5 9000000083c93a67 a6 9000000003c938f0 a7 0000000000000005
      [    0.000000] t0 0000420000004201 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001
      [    0.000000] t4 0000000000000003 t5 0000000000000000 t6 0000000000000030 t7 0000000000000063
      [    0.000000] t8 0000000000000014 u0 ffffffffffffffff s9 0000000000000000 s0 9000000003caee98
      [    0.000000] s1 90000000041b0480 s2 9000000003c93da0 s3 9000000003c93d98 s4 9000000003c93d90
      [    0.000000] s5 9000000003caa000 s6 000000000a7fd000 s7 000000000f556b60 s8 000000000e0a4330
      [    0.000000]    ra: 90000000037a46ec platform_init+0x214/0x250
      [    0.000000]   ERA: 90000000037a5f0c efi_runtime_init+0x30/0x94
      [    0.000000]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
      [    0.000000]  PRMD: 00000000 (PPLV0 -PIE -PWE)
      [    0.000000]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
      [    0.000000]  ECFG: 00070800 (LIE=11 VS=7)
      [    0.000000] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
      [    0.000000]  BADV: 0000420000004259
      [    0.000000]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
      [    0.000000] Modules linked in:
      [    0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____))
      [    0.000000] Stack : 9000000003c93a14 9000000003800898 90000000041844f8 90000000037a46ec
      [    0.000000]         000000000a7fd000 0000000008290000 0000000000000000 0000000000000000
      [    0.000000]         0000000000000000 0000000000000000 00000000019d8000 000000000f556b60
      [    0.000000]         000000000a7fd000 000000000f556b08 9000000003ca7700 9000000003800000
      [    0.000000]         9000000003c93e50 9000000003800898 9000000003800108 90000000037a484c
      [    0.000000]         000000000e0a4330 000000000f556b60 000000000a7fd000 000000000f556b08
      [    0.000000]         9000000003ca7700 9000000004184000 0000000000200000 000000000e02b018
      [    0.000000]         000000000a7fd000 90000000037a0790 9000000003800108 0000000000000000
      [    0.000000]         0000000000000000 000000000e0a4330 000000000f556b60 000000000a7fd000
      [    0.000000]         000000000f556b08 000000000eaae298 000000000eaa5040 0000000000200000
      [    0.000000]         ...
      [    0.000000] Call Trace:
      [    0.000000] [<90000000037a5f0c>] efi_runtime_init+0x30/0x94
      [    0.000000] [<90000000037a46ec>] platform_init+0x214/0x250
      [    0.000000] [<90000000037a484c>] setup_arch+0x124/0x45c
      [    0.000000] [<90000000037a0790>] start_kernel+0x90/0x670
      [    0.000000] [<900000000378b0d8>] kernel_entry+0xd8/0xdc
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
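The sizing argument in the acpi_core_pic commit can be shown with a small model. This is a sketch under assumptions, not the kernel's MADT parser: `struct core_pic`, `parse_madt()` and the value 256 for `MAX_CORE_PIC` are chosen for illustration. The array is sized by the physical limit, so recording every entry is safe, while only `NR_CPUS` logical CPUs are brought up.

```c
#include <stdbool.h>

#define NR_CPUS      64		/* max logical CPUs (default from the message) */
#define MAX_CORE_PIC 256	/* max physical cores in the MADT; assumed value */

/* Hypothetical, reduced view of one CORE_PIC entry. */
struct core_pic {
	int core_id;
	bool present;
};

/* Sized by the physical limit, so parsing cannot run past the end. */
static struct core_pic acpi_core_pic[MAX_CORE_PIC];

/* Record `n` MADT entries; return how many CPUs would be booted. */
static int parse_madt(int n)
{
	int booted = 0;

	for (int core = 0; core < n; core++) {
		if (core >= MAX_CORE_PIC)
			break;		/* ignore entries beyond the table limit */
		acpi_core_pic[core].core_id = core;
		acpi_core_pic[core].present = true;
		if (booted < NR_CPUS)
			booted++;	/* remaining cores stay with the firmware */
	}
	return booted;
}
```

In this model, 128 entries are all recorded but only 64 CPUs are booted, which matches the `-smp 128` behaviour described in the message.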
    • LoongArch: Select HAVE_ARCH_SECCOMP to use the common SECCOMP menu · 6b79ecd0
      Masahiro Yamada authored
      LoongArch missed the refactoring made by commit 282a181b ("seccomp:
      Move config option SECCOMP to arch/Kconfig") because LoongArch was not
      mainlined at that time.
      
      The 'depends on PROC_FS' statement is stale as described in that commit.
      Select HAVE_ARCH_SECCOMP, and remove the duplicated config entry.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Select ARCH_ENABLE_THP_MIGRATION instead of redefining it · b3ff2d9c
      Masahiro Yamada authored
      ARCH_ENABLE_THP_MIGRATION is supposed to be selected by arch Kconfig.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
  6. 05 Feb, 2024 3 commits
    • nfsd: don't take fi_lock in nfsd_break_deleg_cb() · 5ea9a7c5
      NeilBrown authored
      A recent change to check_for_locks() changed it to take ->flc_lock while
      holding ->fi_lock.  This creates a lock inversion (reported by lockdep)
      because there is a case where ->fi_lock is taken while holding
      ->flc_lock.
      
      ->flc_lock is held across ->fl_lmops callbacks, and
      nfsd_break_deleg_cb() is one of those and does take ->fi_lock.  However
      it doesn't need to.
      
      Prior to v4.17-rc1~110^2~22 ("nfsd: create a separate lease for each
      delegation") nfsd_break_deleg_cb() would walk the ->fi_delegations list
      and so needed the lock.  Since then it doesn't walk the list and doesn't
      need the lock.
      
      Two actions are performed under the lock.  One is to call
      nfsd_break_one_deleg which calls nfsd4_run_cb().  These don't act on
      the nfs4_file at all, so don't need the lock.
      
      The other is to set ->fi_had_conflict which is in the nfs4_file.
      This field is only ever set here (except when initialised to false)
      so there is no possible problem with multiple threads racing when
      setting it.
      
      The field is tested twice in nfs4_set_delegation().  The first test does
      not hold a lock and is documented as an opportunistic optimisation, so
      it doesn't impose any need to hold ->fi_lock while setting
      ->fi_had_conflict.
      
      The second test in nfs4_set_delegation() *is* made under ->fi_lock, so
      removing the locking when ->fi_had_conflict is set could make a change.
      The change could only be interesting if ->fi_had_conflict tested as
      false even though nfsd_break_one_deleg() ran before ->fi_lock was
      unlocked, i.e. while hash_delegation_locked() was running.
      As hash_delegation_locked() doesn't interact in any way with nfsd4_run_cb()
      there can be no importance to this interaction.
      
      So this patch removes the locking from nfsd_break_one_deleg() and moves
      the final test on ->fi_had_conflict out of the locked region to make it
      clear that locking isn't important to the test.  It is still tested
      *after* vfs_setlease() has succeeded.  This might be significant, and as
      vfs_setlease() takes ->flc_lock and nfsd_break_one_deleg() is called
      under ->flc_lock, this "after" is a true ordering provided by a spinlock.
      
      Fixes: edcf9725 ("nfsd: fix RELEASE_LOCKOWNER")
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • bcachefs: time_stats: Check for last_event == 0 when updating freq stats · 7b508b32
      Kent Overstreet authored
      This fixes spurious outliers in the frequency stats.
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • bcachefs: install fd later to avoid race with close · dd839f31
      Mathias Krause authored
      Calling fd_install() makes a file reachable for userland, including the
      possibility to close the file descriptor, which leads to calling its
      'release' hook. If that happens before the code had a chance to bump the
      reference of the newly created task struct, the release callback will
      call put_task_struct() too early, leading to the premature destruction
      of the kernel thread.
      
      Avoid that race by calling fd_install() later, after all the setup is
      done.
      
      Fixes: 1c6fdbd8 ("bcachefs: Initial commit")
      Signed-off-by: Mathias Krause <minipli@grsecurity.net>
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
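The ordering rule behind the fd_install() fix can be modelled in plain C. The sketch below is not bcachefs code and uses no kernel API: `struct task`, `struct file_model`, `publish()` and the other helpers are hypothetical stand-ins, with `publish()` playing the role of fd_install() and `release_hook()` the role of the file's 'release' callback. It shows the safe order: take the task reference first, and make the file reachable only as the last step.

```c
#include <stdbool.h>

/* Hypothetical reference-counted task. */
struct task {
	int refs;
};

/* Hypothetical file object that owns one task reference once set up. */
struct file_model {
	struct task *task;
	bool visible;	/* true once userspace can reach (and close) it */
};

static void get_task(struct task *t) { t->refs++; }
static void put_task(struct task *t) { t->refs--; }

/* Models the 'release' hook that runs when userspace closes the fd. */
static void release_hook(struct file_model *f)
{
	put_task(f->task);
}

/* Models fd_install(): after this, a close may happen at any time. */
static void publish(struct file_model *f)
{
	f->visible = true;
}

/* Safe ordering: finish all setup, then publish as the very last step. */
static void setup_then_publish(struct file_model *f, struct task *t)
{
	f->task = t;
	get_task(t);	/* reference owned by the file */
	publish(f);
}
```

Because the reference is taken before `publish()`, a close that arrives immediately afterwards drops only the file's own reference and the task survives.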