1. 17 Oct, 2024 4 commits
  2. 11 Oct, 2024 1 commit
    • KVM: arm64: Don't eagerly teardown the vgic on init error · df5fd75e
      Marc Zyngier authored
      As there is very little ordering in the KVM API, userspace can
      instantiate a half-baked GIC (missing its memory map, for example)
      at almost any time.
      
      This means that, with the right timing, a thread running vcpu-0
      can enter the kernel without a GIC configured and get a GIC created
      behind its back by another thread. Amusingly, it will pick up
      that GIC and start messing with the data structures without the
      GIC having been fully initialised.
      
      Similarly, a thread running vcpu-1 can enter the kernel, and try
      to init the GIC that was previously created. Since this GIC isn't
      properly configured (no memory map), it fails to correctly initialise.
      
      And that's the point where we decide to teardown the GIC, freeing all
      its resources. Behind vcpu-0's back. Things stop pretty abruptly,
      with a variety of symptoms.  Clearly, this isn't good, we should be
      a bit more careful about this.
      
      It is obvious that this guest is not viable, as it is missing some
      important part of its configuration. So instead of trying to tear
      bits of it down, let's just mark it as *dead*. It means that any
      further interaction from userspace will result in -EIO. The memory
      will be released on the "normal" path, when userspace gives up.
      
      Cc: stable@vger.kernel.org
      Reported-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20241009183603.3221824-1-maz@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
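The "mark it as dead instead of tearing it down" idea from this commit message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct demo_vm`, `demo_gic_init()`, `demo_vm_ioctl()` and `demo_vm_release()` are hypothetical names invented for illustration, and the `-ENXIO` returned by the failing init is an arbitrary choice. Only the `-EIO` result for later interactions and the deferred release come from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VM that owns an interrupt controller. */
struct demo_vm {
	bool dead;		/* set once the VM is known not to be viable */
	bool has_memory_map;	/* configuration the GIC needs before init */
	int *gic_state;		/* resources another thread may still be using */
};

/* On init failure, flag the VM as dead but leave its resources alone. */
static int demo_gic_init(struct demo_vm *vm)
{
	assert(vm != NULL);
	if (vm->dead)
		return -EIO;
	if (!vm->has_memory_map) {
		vm->dead = true;	/* no teardown here */
		return -ENXIO;		/* assumed error code, for illustration */
	}
	return 0;
}

/* Every later entry point refuses to touch a dead VM. */
static int demo_vm_ioctl(struct demo_vm *vm)
{
	assert(vm != NULL);
	return vm->dead ? -EIO : 0;
}

/* Resources are freed only on the normal release path. */
static void demo_vm_release(struct demo_vm *vm)
{
	assert(vm != NULL);
	free(vm->gic_state);
	vm->gic_state = NULL;
}
```

In this model a failed `demo_gic_init()` leaves `gic_state` allocated, so a concurrent user of the structure never sees it freed underneath it; the memory goes away only when `demo_vm_release()` runs.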
  3. 08 Oct, 2024 7 commits
    • KVM: arm64: Expose S1PIE to guests · d4a89e5a
      Mark Brown authored
      Prior to commit 70ed7238 ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1")
      we just exposed the sanitised view of ID_AA64MMFR3_EL1 to guests, meaning
      that they saw both TCRX and S1PIE if present on the host machine. That
      commit added VMM control over the contents of the register and exposed
      S1POE but removed S1PIE, meaning that the extension is no longer visible
      to guests. Reenable support for S1PIE with VMM control.
      
      Fixes: 70ed7238 ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1")
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Reviewed-by: Joey Gouly <joey.gouly@arm.com>
      Link: https://lore.kernel.org/r/20241005-kvm-arm64-fix-s1pie-v1-1-5901f02de749@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
    • KVM: arm64: nv: Clarify safety of allowing TLBI unmaps to reschedule · 79cc6cdb
      Oliver Upton authored
      There's been a decent amount of attention around unmaps of nested MMUs,
      and TLBI handling is no exception to this. Add a comment clarifying why
      it is safe to reschedule during a TLBI unmap, even without a reference
      on the MMU in progress.

      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20241007233028.2236133-5-oliver.upton@linux.dev
      Signed-off-by: Marc Zyngier <maz@kernel.org>
    • KVM: arm64: nv: Punt stage-2 recycling to a vCPU request · c268f204
      Oliver Upton authored
      Currently, when a nested MMU is repurposed for some other MMU context,
      KVM unmaps everything during vcpu_load() while holding the MMU lock for
      write. This is quite a performance bottleneck for large nested VMs, as
      all vCPU scheduling will spin until the unmap completes.
      
      Start punting the MMU cleanup to a vCPU request, where it is then
      possible to periodically release the MMU lock and CPU in the presence of
      contention.
      
      Ensure that no vCPU winds up using a stale MMU by tracking the pending
      unmap on the S2 MMU itself and requesting an unmap on every vCPU that
      finds it.

      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20241007233028.2236133-4-oliver.upton@linux.dev
      Signed-off-by: Marc Zyngier <maz@kernel.org>
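The deferral scheme in this commit message (track the pending unmap on the MMU, raise a request on every vCPU that finds it, then unmap in pieces) can be sketched as a toy model. Everything here is an assumption made for illustration: `struct demo_mmu`, `struct demo_vcpu`, `DEMO_REQ_UNMAP`, the chunk size, and the `lock_drops` counter that stands in for actually releasing the MMU lock and the CPU. It is not the KVM implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_REQ_UNMAP	(1u << 0)	/* hypothetical request bit */
#define DEMO_CHUNK	4		/* entries unmapped per lock hold */

struct demo_mmu {
	bool pending_unmap;		/* tracked on the MMU itself */
	size_t mapped;			/* entries still mapped */
	unsigned int lock_drops;	/* times the lock would have been released */
};

struct demo_vcpu {
	struct demo_mmu *mmu;
	unsigned int requests;
};

/* Load path: cheap, it only notes that cleanup work is pending. */
static void demo_vcpu_load(struct demo_vcpu *vcpu)
{
	assert(vcpu && vcpu->mmu);
	if (vcpu->mmu->pending_unmap)
		vcpu->requests |= DEMO_REQ_UNMAP;
}

/* Request handler: unmap in bounded chunks, yielding between them. */
static void demo_handle_requests(struct demo_vcpu *vcpu)
{
	struct demo_mmu *mmu;

	assert(vcpu && vcpu->mmu);
	mmu = vcpu->mmu;

	if (!(vcpu->requests & DEMO_REQ_UNMAP))
		return;
	vcpu->requests &= ~DEMO_REQ_UNMAP;

	while (mmu->mapped > 0) {
		size_t n = mmu->mapped < DEMO_CHUNK ? mmu->mapped : DEMO_CHUNK;

		mmu->mapped -= n;		/* work done while holding the lock */
		if (mmu->mapped > 0)
			mmu->lock_drops++;	/* lock and CPU released here */
	}
	mmu->pending_unmap = false;
}
```

The point of the split is that `demo_vcpu_load()` never does the expensive work itself, so other vCPUs being scheduled are not held up for the full duration of the unmap.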
    • KVM: arm64: nv: Do not block when unmapping stage-2 if disallowed · 3c164eb9
      Oliver Upton authored
      Right now the nested code allows unmap operations on a shadow stage-2 to
      block unconditionally. This is wrong in a couple places, such as a
      non-blocking MMU notifier or on the back of a sched_in() notifier as
      part of shadow MMU recycling.
      
      Carry through whether or not blocking is allowed to
      kvm_pgtable_stage2_unmap(). This 'fixes' an issue where stage-2 MMU
      reclaim would precipitate a stack overflow from a pile of kvm_sched_in()
      callbacks, all trying to recycle a stage-2 MMU.

      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20241007233028.2236133-3-oliver.upton@linux.dev
      Signed-off-by: Marc Zyngier <maz@kernel.org>
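Carrying a "may block" decision down to the unmap routine, as this commit message describes, amounts to threading one boolean through the call chain. The sketch below is a userspace illustration only: `demo_unmap_range()`, `demo_cond_resched()` and the `demo_in_atomic` flag are hypothetical names, and the real `kvm_pgtable_stage2_unmap()` signature is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True while the caller is in a context that must not sleep. */
static bool demo_in_atomic;

/* Counts how often the sketch would have rescheduled. */
static unsigned int demo_resched_count;

/* Stand-in for a reschedule point; reaching it in atomic context is a bug. */
static void demo_cond_resched(void)
{
	assert(!demo_in_atomic);
	demo_resched_count++;
}

/* Unmap `pages` pages, yielding between pages only if the caller allows it. */
static size_t demo_unmap_range(size_t pages, bool may_block)
{
	size_t done;

	for (done = 0; done < pages; done++) {
		/* ... one page would be unmapped here ... */
		if (may_block)
			demo_cond_resched();
	}
	return done;
}
```

A caller in a non-blocking notifier passes `may_block = false` and completes the whole range without ever reaching the reschedule point; a caller in ordinary process context passes `true`.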
    • KVM: arm64: nv: Keep reference on stage-2 MMU when scheduled out · 6ded46b5
      Oliver Upton authored
      If a vCPU is scheduling out and not in WFI emulation, it is highly
      likely it will get scheduled again soon and reuse the MMU it had before.
      Dropping the MMU at vcpu_put() can have some unfortunate consequences,
      as the MMU could get reclaimed and used in a different context, forcing
      another 'cold start' on an otherwise active MMU.
      
      Avoid that altogether by keeping a reference on the MMU if the vCPU is
      scheduling out, ensuring that another vCPU cannot reclaim it while the
      current vCPU is away. Since there are more MMUs than vCPUs, this does
      not affect the guarantee that an unused MMU is available at any time.
      
      Furthermore, this makes the vcpu->arch.hw_mmu ~stable in preemptible
      code, at least for where it matters in the stage-2 abort path. Yes, the
      MMU can change across WFI emulation, but there isn't even a use case
      where this would matter.

      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20241007233028.2236133-2-oliver.upton@linux.dev
      Signed-off-by: Marc Zyngier <maz@kernel.org>
    • KVM: arm64: Unregister redistributor for failed vCPU creation · ae8f8b37
      Oliver Upton authored
      Alex reports that syzkaller has managed to trigger a use-after-free when
      tearing down a VM:
      
        BUG: KASAN: slab-use-after-free in kvm_put_kvm+0x300/0xe68 virt/kvm/kvm_main.c:5769
        Read of size 8 at addr ffffff801c6890d0 by task syz.3.2219/10758
      
        CPU: 3 UID: 0 PID: 10758 Comm: syz.3.2219 Not tainted 6.11.0-rc6-dirty #64
        Hardware name: linux,dummy-virt (DT)
        Call trace:
         dump_backtrace+0x17c/0x1a8 arch/arm64/kernel/stacktrace.c:317
         show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
         __dump_stack lib/dump_stack.c:93 [inline]
         dump_stack_lvl+0x94/0xc0 lib/dump_stack.c:119
         print_report+0x144/0x7a4 mm/kasan/report.c:377
         kasan_report+0xcc/0x128 mm/kasan/report.c:601
         __asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381
         kvm_put_kvm+0x300/0xe68 virt/kvm/kvm_main.c:5769
         kvm_vm_release+0x4c/0x60 virt/kvm/kvm_main.c:1409
         __fput+0x198/0x71c fs/file_table.c:422
         ____fput+0x20/0x30 fs/file_table.c:450
         task_work_run+0x1cc/0x23c kernel/task_work.c:228
         do_notify_resume+0x144/0x1a0 include/linux/resume_user_mode.h:50
         el0_svc+0x64/0x68 arch/arm64/kernel/entry-common.c:169
         el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
         el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
      
      Upon closer inspection, it appears that we do not properly tear down the
      MMIO registration for a vCPU that fails creation late in the game, e.g.
      a vCPU w/ the same ID already exists in the VM.
      
      It is important to consider the context of the commit that introduced this bug
      by moving the unregistration out of __kvm_vgic_vcpu_destroy(). That
      change correctly sought to avoid an srcu v. config_lock inversion by
      breaking up the vCPU teardown into two parts, one guarded by the
      config_lock.
      
      Fix the use-after-free while avoiding lock inversion by adding a
      special-cased unregistration to __kvm_vgic_vcpu_destroy(). This is safe
      because failed vCPUs are torn down outside of the config_lock.
      
      Cc: stable@vger.kernel.org
      Fixes: f6165067 ("KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors")
      Reported-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20241007223909.2157336-1-oliver.upton@linux.dev
      Signed-off-by: Marc Zyngier <maz@kernel.org>
    • Merge branch kvm-arm64/idregs-6.12 into kvmarm/fixes · 9b7c3dd5
      Marc Zyngier authored
      * kvm-arm64/idregs-6.12:
        : .
        : Make some fields of ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1
        : writable from userspace, so that a VMM can influence the
        : set of guest-visible features.
        :
        : - for ID_AA64DFR0_EL1: DoubleLock, WRPs, PMUVer and DebugVer
        :   are writable (courtesy of Shameer Kolothum)
        :
        : - for ID_AA64PFR1_EL1: BT, SSBS, CSV2_frac are writable
        :   (courtesy of Shaoqin Huang)
        : .
        KVM: selftests: aarch64: Add writable test for ID_AA64PFR1_EL1
        KVM: arm64: Allow userspace to change ID_AA64PFR1_EL1
        KVM: arm64: Use kvm_has_feat() to check if FEAT_SSBS is advertised to the guest
        KVM: arm64: Disable fields that KVM doesn't know how to handle in ID_AA64PFR1_EL1
        KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace

      Signed-off-by: Marc Zyngier <maz@kernel.org>
  4. 03 Oct, 2024 1 commit
    • KVM: arm64: Fix kvm_has_feat*() handling of negative features · a1d402ab
      Marc Zyngier authored
      Oliver reports that the kvm_has_feat() helper is not behaving as
      expected for negative features. On investigation, the main issue
      seems to be caused by the following construct:
      
       #define get_idreg_field(kvm, id, fld)				\
       	(id##_##fld##_SIGNED ?					\
      	 get_idreg_field_signed(kvm, id, fld) :			\
      	 get_idreg_field_unsigned(kvm, id, fld))
      
      where one side of the expression evaluates as something signed,
      and the other as something unsigned. In retrospect, this is totally
      braindead, as the compiler converts this into an unsigned expression.
      When compared to something that is 0, the test is simply elided.
      
      Epic fail. Similar issue exists in the expand_field_sign() macro.
      
      The correct way to handle this is to choose between signed and unsigned
      comparisons, so that both sides of the ternary expression are of the
      same type (bool).
      
      In order to keep the code readable (sort of), we introduce new
      comparison primitives taking an operator as a parameter, and
      rewrite the kvm_has_feat*() helpers in terms of these primitives.
      
      Fixes: c62d7a23 ("KVM: arm64: Add feature checking helpers")
      Reported-by: Oliver Upton <oliver.upton@linux.dev>
      Tested-by: Oliver Upton <oliver.upton@linux.dev>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20241002204239.2051637-1-maz@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
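The pitfall described in this commit message follows from C's conversion rules: when one arm of `?:` is a signed 64-bit value and the other an unsigned 64-bit value, the whole expression has the unsigned type. The standalone sketch below reproduces that effect with 4-bit ID register fields. The `demo_*` helpers are hypothetical and are not the kernel macros; they only mirror the shape of the problem and of the fix (deciding signedness per comparison so that both arms yield a boolean).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract a 4-bit field and interpret it as signed (0xF means -1). */
static int64_t demo_field_signed(uint64_t reg, unsigned int shift)
{
	int64_t v;

	assert(shift <= 60);
	v = (int64_t)((reg >> shift) & 0xf);
	return v >= 8 ? v - 16 : v;
}

/* Extract the same 4-bit field as an unsigned value. */
static uint64_t demo_field_unsigned(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (reg >> shift) & 0xf;
}

/*
 * Broken: the arms have types int64_t and uint64_t, so the ternary is
 * uint64_t and `min` is converted to unsigned as well. With min == 0 the
 * comparison can never be false.
 */
static bool demo_has_feat_broken(uint64_t reg, unsigned int shift,
				 bool is_signed, int64_t min)
{
	return (is_signed ? demo_field_signed(reg, shift)
			  : demo_field_unsigned(reg, shift)) >= min;
}

/* Fixed: each arm performs its own comparison, so both arms are booleans. */
static bool demo_has_feat_fixed(uint64_t reg, unsigned int shift,
				bool is_signed, int64_t min)
{
	return is_signed ? demo_field_signed(reg, shift) >= min
			 : demo_field_unsigned(reg, shift) >= (uint64_t)min;
}
```

With a signed field holding 0xF (that is, -1) and a minimum of 0, the broken variant reports the feature as present while the fixed variant reports it as absent.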
  5. 01 Oct, 2024 3 commits
  6. 29 Sep, 2024 12 commits
    • Linux 6.12-rc1 · 9852d85e
      Linus Torvalds authored
    • x86: kvm: fix build error · 3f749bef
      Linus Torvalds authored
      The cpu_emergency_register_virt_callback() function is used
      unconditionally by the x86 kvm code, but it is declared (and defined)
      conditionally:
      
        #if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
        void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
        ...
      
      leading to a build error when neither KVM_INTEL nor KVM_AMD support is
      enabled:
      
        arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’:
        arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration]
        12517 |         cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
              |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’:
        arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration]
        12522 |         cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
              |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Fix the build by defining empty helper functions the same way the old
      cpu_emergency_disable_virtualization() function was dealt with for the
      same situation.
      
      Maybe we could instead have made the call sites conditional, since the
      callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak
      fallback.  I'll leave that to the kvm people to argue about, this at
      least gets the build going for that particular config.
      
      Fixes: 590b09b1 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Kai Huang <kai.huang@intel.com>
      Cc: Chao Gao <chao.gao@intel.com>
      Cc: Farrah Chen <farrah.chen@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
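The fix described in this commit message uses a common header pattern: declare the real functions when the feature is configured, and supply empty inline stubs otherwise so that unconditional callers still compile. The sketch below shows the shape of that pattern in isolation. `DEMO_HAVE_VIRT_VENDOR` and every `demo_*` name are hypothetical stand-ins chosen for this example; they are not the kernel's symbols.

```c
#include <assert.h>

/* Hypothetical switch standing in for "KVM_INTEL or KVM_AMD is enabled". */
#ifndef DEMO_HAVE_VIRT_VENDOR
#define DEMO_HAVE_VIRT_VENDOR 0
#endif

static_assert(DEMO_HAVE_VIRT_VENDOR == 0 || DEMO_HAVE_VIRT_VENDOR == 1,
	      "DEMO_HAVE_VIRT_VENDOR must be 0 or 1");

typedef void (demo_virt_cb)(void);

#if DEMO_HAVE_VIRT_VENDOR
/* Real implementations would be provided by another translation unit. */
void demo_register_virt_callback(demo_virt_cb *callback);
void demo_unregister_virt_callback(demo_virt_cb *callback);
#else
/* Empty stubs: callers compile and link even with the feature disabled. */
static inline void demo_register_virt_callback(demo_virt_cb *callback)
{
	(void)callback;
}

static inline void demo_unregister_virt_callback(demo_virt_cb *callback)
{
	(void)callback;
}
#endif

static void demo_emergency_disable(void)
{
}

/* Callers use the helpers unconditionally, with no #ifdef at the call site. */
static int demo_enable_virtualization(void)
{
	demo_register_virt_callback(demo_emergency_disable);
	return 0;
}

static void demo_disable_virtualization(void)
{
	demo_unregister_virt_callback(demo_emergency_disable);
}
```

The alternative mentioned in the message, making the call sites conditional instead, would move the `#if` into each caller; the stub approach keeps the conditional in one place.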
    • Merge tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox · e7ed3436
      Linus Torvalds authored
      Pull mailbox updates from Jassi Brar:
      
       - fix kconfig dependencies (mhu-v3, omap2+)
      
       - use device name instead of generic imx_mu_chan as interrupt name
         (imx)
      
       - enable sa8255p and qcs8300 ipc controllers (qcom)
      
       - Fix timeout during suspend mode (bcm2835)
      
       - convert to use of_property_match_string (mailbox)
      
       - enable mt8188 (mediatek)
      
       - use devm_clk_get_enabled helpers (spreadtrum)
      
       - fix device-id typo (rockchip)
      
      * tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
        mailbox, remoteproc: omap2+: fix compile testing
        dt-bindings: mailbox: qcom-ipcc: Document QCS8300 IPCC
        dt-bindings: mailbox: qcom-ipcc: document the support for SA8255p
        dt-bindings: mailbox: mtk,adsp-mbox: Add compatible for MT8188
        mailbox: Use of_property_match_string() instead of open-coding
        mailbox: bcm2835: Fix timeout during suspend mode
        mailbox: sprd: Use devm_clk_get_enabled() helpers
        mailbox: rockchip: fix a typo in module autoloading
        mailbox: imx: use device name in interrupt name
        mailbox: ARM_MHU_V3 should depend on ARM64
    • Merge tag 'i2c-for-6.12-rc1-additional_fixes' of... · 907537f5
      Linus Torvalds authored
      Merge tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
      
      Pull i2c fixes from Wolfram Sang:
      
       - fix DesignWare driver ENABLE-ABORT sequence, ensuring ABORT can
         always be sent when needed
      
       - check for PCLK in the SynQuacer controller as an optional clock,
         allowing ACPI to directly provide the clock rate
      
       - KEBA driver Kconfig dependency fix
      
       - fix XIIC driver power suspend sequence
      
      * tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
        i2c: keba: I2C_KEBA should depend on KEBA_CP500
        i2c: synquacer: Deal with optional PCLK correctly
        i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
    • Merge tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping · b81b78da
      Linus Torvalds authored
      Pull dma-mapping fix from Christoph Hellwig:
      
       - handle chained SGLs in the new tracing code (Christoph Hellwig)
      
      * tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: fix DMA API tracing for chained scatterlists
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 3ed7df08
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "These are mostly minor updates.
      
        There are two drivers (lpfc and mpi3mr) which missed the initial
        pull and a core change to retry a start/stop unit, which affects
        suspend/resume"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
        scsi: lpfc: Update lpfc version to 14.4.0.5
        scsi: lpfc: Support loopback tests with VMID enabled
        scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
        scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
        scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
        scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
        scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
        scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
        scsi: mpi3mr: Update driver version to 8.12.0.0.50
        scsi: mpi3mr: Improve wait logic while controller transitions to READY state
        scsi: mpi3mr: Update MPI Headers to revision 34
        scsi: mpi3mr: Use firmware-provided timestamp update interval
        scsi: mpi3mr: Enhance the Enable Controller retry logic
        scsi: sd: Fix off-by-one error in sd_read_block_characteristics()
        scsi: pm8001: Do not overwrite PCI queue mapping
        scsi: scsi_debug: Remove a useless memset()
        scsi: pmcraid: Convert comma to semicolon
        scsi: sd: Retry START STOP UNIT commands
        scsi: mpi3mr: A performance fix
        scsi: ufs: qcom: Update MODE_MAX cfg_bw value
        ...
    • Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs · 9f9a5347
      Linus Torvalds authored
      Pull more bcachefs updates from Kent Overstreet:
       "Assorted minor syzbot fixes, and for bigger stuff:
      
        Fix two disk accounting rewrite bugs:
      
         - Disk accounting keys use the version field of bkey so that journal
           replay can tell which updates have been applied to the btree.
      
           This is set in the transaction commit path, after we've gotten our
           journal reservation (and our time ordering), but the
           BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay
           uses was incorrectly skipping this for new updates generated prior
           to journal replay.
      
           This fixes the underlying cause of an assertion pop in
           disk_accounting_read.
      
         - A couple of fixes for disk accounting + device removal.
      
           Checking if accounting replicas entries were marked in the
           superblock was being done at the wrong point, when deltas in the
           journal could still zero them out, and then additionally we'd try
           to add a missing replicas entry to the superblock without checking
           if it referred to an invalid (removed) device.
      
        A whole slew of repair fixes:
      
         - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
           an infinite loop when repairing a filesystem with many snapshots
      
         - fix incorrect transaction restart handling leading to occasional
           "fsck counted ..." warnings
      
         - fix warning in __bch2_fsck_err() for bkey fsck errors
      
         - check_inode() in fsck now correctly checks if the filesystem was
           clean
      
         - there shouldn't be pending logged ops if the fs was clean, we now
           check for this
      
         - remove_backpointer() doesn't remove a dirent that doesn't actually
           point to the inode
      
         - many more fsck errors are AUTOFIX"
      
      * tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits)
        bcachefs: check_subvol_path() now prints subvol root inode
        bcachefs: remove_backpointer() now checks if dirent points to inode
        bcachefs: dirent_points_to_inode() now warns on mismatch
        bcachefs: Fix lost wake up
        bcachefs: Check for logged ops when clean
        bcachefs: BCH_FS_clean_recovery
        bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
        bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
        bcachefs: Check for accounting keys with bversion=0
        bcachefs: rename version -> bversion
        bcachefs: Don't delete unlinked inodes before logged op resume
        bcachefs: Fix BCH_SB_ERRS() so we can reorder
        bcachefs: Fix fsck warnings from bkey validation
        bcachefs: Move transaction commit path validation to as late as possible
        bcachefs: Fix disk accounting attempting to mark invalid replicas entry
        bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
        bcachefs: Fix accounting read + device removal
        bcachefs: bch_accounting_mode
        bcachefs: fix transaction restart handling in check_extents(), check_dirents()
        bcachefs: kill inode_walker_entry.seen_this_pos
        ...
    • Merge tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d37421e6
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Fix TDX MMIO #VE fault handling, and add two new Intel model numbers
        for 'Pantherlake' and 'Diamond Rapids'"
      
      * tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Add two Intel CPU model numbers
        x86/tdx: Fix "in-kernel MMIO" check
    • Merge tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ec03de73
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
       "lockdep:
          - Fix potential deadlock between lockdep and RCU (Zhiguo Niu)
          - Use str_plural() to address Coccinelle warning (Thorsten Blum)
          - Add debuggability enhancement (Luis Claudio R. Goncalves)
      
        static keys & calls:
          - Fix static_key_slow_dec() yet again (Peter Zijlstra)
          - Handle module init failure correctly in static_call_del_module()
            (Thomas Gleixner)
          - Replace pointless WARN_ON() in static_call_module_notify() (Thomas
            Gleixner)
      
        <linux/cleanup.h>:
          - Add usage and style documentation (Dan Williams)
      
        rwsems:
          - Move is_rwsem_reader_owned() and rwsem_owner() under
            CONFIG_DEBUG_RWSEMS (Waiman Long)
      
        atomic ops, x86:
          - Redeclare x86_32 arch_atomic64_{add,sub}() as void (Uros Bizjak)
          - Introduce the read64_nonatomic macro to x86_32 with cx8 (Uros
            Bizjak)"
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      
      * tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/rwsem: Move is_rwsem_reader_owned() and rwsem_owner() under CONFIG_DEBUG_RWSEMS
        jump_label: Fix static_key_slow_dec() yet again
        static_call: Replace pointless WARN_ON() in static_call_module_notify()
        static_call: Handle module init failure correctly in static_call_del_module()
        locking/lockdep: Simplify character output in seq_line()
        lockdep: fix deadlock issue between lockdep and rcu
        lockdep: Use str_plural() to fix Coccinelle warning
        cleanup: Add usage and style documentation
        lockdep: suggest the fix for "lockdep bfs error:-1" on print_bfs_bug
        locking/atomic/x86: Redeclare x86_32 arch_atomic64_{add,sub}() as void
        locking/atomic/x86: Introduce the read64_nonatomic macro to x86_32 with cx8
    • Merge tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux · 68e4b0e0
      Linus Torvalds authored
      Pull coccinelle updates from Julia Lawall:
       "Extend string_choices.cocci to use more available helpers
      
        Ten patches from Hongbo Li extending string_choices.cocci with the
        complete set of functions offered by include/linux/string_choices.h.
      
        One patch from myself reducing the number of redundant cases that are
        checked by Coccinelle, giving a small performance improvement"
      
      * tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
        Reduce Coccinelle choices in string_choices.cocci
        coccinelle: Remove unnecessary parentheses for only one possible change.
        coccinelle: Add rules to find str_yes_no() replacements
        coccinelle: Add rules to find str_on_off() replacements
        coccinelle: Add rules to find str_write_read() replacements
        coccinelle: Add rules to find str_read_write() replacements
        coccinelle: Add rules to find str_enable{d}_disable{d}() replacements
        coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements
        coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements
        coccinelle: Add rules to find str_false_true() replacements
        coccinelle: Add rules to find str_true_false() replacements
    • Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of... · e7ebdb51
      Linus Torvalds authored
      Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fix from Shuah Khan:
       "One urgent fix to vDSO as automated testing is failing due to this
        bug"
      
      * tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests: vDSO: align stack for O2-optimized memcpy
    • Merge branch 'locking/core' into locking/urgent, to pick up pending commits · ae39e0bd
      Ingo Molnar authored
      Merge all pending locking commits into a single branch.

      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 28 Sep, 2024 12 commits
    • Reduce Coccinelle choices in string_choices.cocci · 4003ba66
      Julia Lawall authored
      The isomorphism neg_if_exp negates the test of a ?: conditional,
      making it unnecessary to have an explicit case for a negated test
      with the branches inverted.
      
      At the same time, we can disable neg_if_exp in cases where a
      different API function may be more suitable for a negated test.
      
      Finally, in the non-patch cases, E matches an expression with
      parentheses around it, so there is no need to mention ()
      explicitly in the pattern.  The () are still needed in the patch
      cases, because we want to drop them, if they are present.

      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
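The reason a negated test with inverted branches needs no separate rule can be checked directly in C: `!c ? "no" : "yes"` always selects the same string as `c ? "yes" : "no"`. The sketch below uses `demo_str_yes_no()`, a userspace copy written for this example that mirrors the behaviour of the kernel's `str_yes_no()` helper; the other `demo_*` functions are likewise hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace copy of the helper's behaviour, for illustration only. */
static const char *demo_str_yes_no(bool v)
{
	return v ? "yes" : "no";
}

/* Open-coded conditional that a semantic patch would look for. */
static const char *demo_open_coded(bool v)
{
	return v ? "yes" : "no";
}

/* Negated test with the branches inverted: selects the same string. */
static const char *demo_open_coded_negated(bool v)
{
	return !v ? "no" : "yes";
}

/* Check that all three forms agree for both possible inputs. */
static void demo_check_equivalence(void)
{
	const bool inputs[] = { false, true };

	for (unsigned int i = 0; i < 2; i++) {
		bool v = inputs[i];

		assert(strcmp(demo_open_coded(v), demo_str_yes_no(v)) == 0);
		assert(strcmp(demo_open_coded_negated(v), demo_str_yes_no(v)) == 0);
	}
}
```

Because the two open-coded forms are interchangeable, a rule written for the plain form also covers the negated one once the isomorphism is applied.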
    • coccinelle: Remove unnecessary parentheses for only one possible change. · f584e375
      Hongbo Li authored
      The parentheses are only needed if there is a disjunction, i.e. a
      set of possible changes. If there is only one pattern, we can
      remove these parentheses. Just like the format:
      
        -  x
        +  y
      
      not:
      
        (
        -  x
        +  y
        )

      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_yes_no() replacements · 253244cd
      Hongbo Li authored
      As done for the other rules, add rules for str_yes_no()
      to find the relevant replacement opportunities.

      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_on_off() replacements · 9b5b4810
      Hongbo Li authored
      As done for the other rules, add rules for str_on_off()
      to find the relevant replacement opportunities.

      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_write_read() replacements · c81ca023
      Hongbo Li authored
      As done for the other rules, add rules for str_write_read()
      to find the relevant replacement opportunities.

      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_read_write() replacements · ba4b514a
      Hongbo Li authored
      As done for the other rules, add rules for str_read_write()
      to find the relevant replacement opportunities.

      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_enable{d}_disable{d}() replacements · dd2275d3
      Hongbo Li authored
      As done for the other rules, add rules for
      str_enable{d}_disable{d}() to find the relevant replacement
      opportunities.

      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements · 5b7ca450
      Hongbo Li authored
      As done for the other rules, add rules for str_lo{w}_hi{gh}()
      to find the relevant replacement opportunities.

      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements · d4c75440
      Hongbo Li authored
      As other rules done, we add rules for str_hi{gh}_lo{w}()
      to check the relative opportunities.
      Signed-off-by: default avatarHongbo Li <lihongbo22@huawei.com>
      Signed-off-by: default avatarJulia Lawall <Julia.Lawall@inria.fr>
      d4c75440
    • coccinelle: Add rules to find str_false_true() replacements · 8a0236ba
      Hongbo Li authored
      As done with str_true_false(), add checks for str_false_true()
      opportunities. A simple test finds more than 9 cases currently
      in the tree.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
      8a0236ba
    • coccinelle: Add rules to find str_true_false() replacements · 716bf84e
      Hongbo Li authored
      After str_true_false() has been introduced in the tree,
      we can add rules for finding places where str_true_false()
      can be used. A simple test can find over 10 locations.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
      716bf84e
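      For readers unfamiliar with these helpers, the sketch below shows the
      kind of open-coded ternary that such rules look for, next to the helper
      call that would replace it. The two helpers are redefined locally so the
      snippet builds outside the kernel (in-tree code would include
      <linux/string_choices.h> instead), and describe_open_coded() and
      describe_with_helper() are hypothetical callers invented for
      illustration. The Coccinelle rules themselves are not reproduced here.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Local stand-ins that mirror the kernel helpers so this builds in
 * userspace. In-tree code would use <linux/string_choices.h>.
 */
static inline const char *str_true_false(bool v)
{
	return v ? "true" : "false";
}

static inline const char *str_false_true(bool v)
{
	/* Same strings, opposite mapping. */
	return str_true_false(!v);
}

/* Hypothetical caller, before: the ternary is spelled out by hand. */
static const char *describe_open_coded(bool ready)
{
	return ready ? "true" : "false";
}

/* Hypothetical caller, after: the helper states the intent directly. */
static const char *describe_with_helper(bool ready)
{
	return str_true_false(ready);
}

/* Returns true when both spellings produce the same string for v. */
static bool spellings_agree(bool v)
{
	return strcmp(describe_open_coded(v), describe_with_helper(v)) == 0;
}
```

      Since both spellings return the same string for either input, the
      replacement is purely a readability change.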
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 3efc5736
      Linus Torvalds authored
      Pull x86 kvm updates from Paolo Bonzini:
       "x86:
      
         - KVM currently invalidates the entirety of the page tables, not just
           those for the memslot being touched, when a memslot is moved or
           deleted.
      
           This does not traditionally have particularly noticeable overhead,
           but Intel's TDX will require the guest to re-accept private pages
           if they are dropped from the secure EPT, which is a non-starter.
      
           Actually, the only reason why this is not already being done is a
           bug which was never fully investigated and caused VM instability
           with assigned GeForce GPUs, so allow userspace to opt into the new
           behavior.
      
         - Advertise AVX10.1 to userspace (effectively prep work for the
           "real" AVX10 functionality that is on the horizon)
      
         - Rework common MSR handling code to suppress errors on userspace
           accesses to unsupported-but-advertised MSRs
      
           This will allow removing (almost?) all of KVM's exemptions for
           userspace access to MSRs that shouldn't exist based on the vCPU
           model (the actual cleanup is non-trivial future work)
      
         - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC)
           splits the 64-bit value into the legacy ICR and ICR2 storage,
           whereas Intel (APICv) stores the entire 64-bit value at the ICR
           offset
      
         - Fix a bug where KVM would fail to exit to userspace if one was
           triggered by a fastpath exit handler
      
         - Add fastpath handling of HLT VM-Exit to expedite re-entering the
           guest when there's already a pending wake event at the time of the
           exit
      
         - Fix a WARN caused by RSM entering a nested guest from SMM with
           invalid guest state, by forcing the vCPU out of guest mode prior to
           signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the
           nested guest)
      
         - Overhaul the "unprotect and retry" logic to more precisely identify
           cases where retrying is actually helpful, and to harden all retry
           paths against putting the guest into an infinite retry loop
      
         - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping
           rmaps in the shadow MMU
      
         - Refactor pieces of the shadow MMU related to aging SPTEs in
           preparation for adding multi-generation LRU support in KVM
      
         - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is
           enabled, i.e. when the CPU has already flushed the RSB
      
         - Trace the per-CPU host save area as a VMCB pointer to improve
           readability and clean up the retrieval of the SEV-ES host save area
      
         - Remove unnecessary accounting of temporary nested VMCB related
           allocations
      
         - Set FINAL/PAGE in the page fault error code for EPT violations if
           and only if the GVA is valid. If the GVA is NOT valid, there is no
           guest-side page table walk and so stuffing paging related metadata
           is nonsensical
      
         - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit
           instead of emulating posted interrupt delivery to L2
      
         - Add a lockdep assertion to detect unsafe accesses of vmcs12
           structures
      
         - Harden eVMCS loading against an impossible NULL pointer deref
           (really truly should be impossible)
      
         - Minor SGX fix and a cleanup
      
         - Misc cleanups
      
        Generic:
      
         - Register KVM's cpuhp and syscore callbacks when enabling
           virtualization in hardware, as the sole purpose of said callbacks
           is to disable and re-enable virtualization as needed
      
         - Enable virtualization when KVM is loaded, not right before the
           first VM is created
      
           Together with the previous change, this greatly simplifies the
           logic of the callbacks, because their very existence implies
           virtualization is enabled
      
         - Fix a bug that results in KVM prematurely exiting to userspace for
           coalesced MMIO/PIO in many cases, clean up the related code, and
           add a testcase
      
         - Fix a bug in kvm_clear_guest() where it would trigger a buffer
           overflow _if_ the gpa+len crosses a page boundary, which thankfully
           is guaranteed to not happen in the current code base. Add WARNs in
           more helpers that read/write guest memory to detect similar bugs
      
        Selftests:
      
         - Fix a goof that caused some Hyper-V tests to be skipped when run on
           bare metal, i.e. NOT in a VM
      
         - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES
           guest
      
         - Explicitly include one-off assets in .gitignore. Past Sean was
           completely wrong about not being able to detect missing .gitignore
           entries
      
         - Verify userspace single-stepping works when KVM happens to handle a
           VM-Exit in its fastpath
      
         - Misc cleanups"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
        Documentation: KVM: fix warning in "make htmldocs"
        s390: Enable KVM_S390_UCONTROL config in debug_defconfig
        selftests: kvm: s390: Add VM run test case
        KVM: SVM: let alternatives handle the cases when RSB filling is required
        KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid
        KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent
        KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals
        KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()
        KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper
        KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed
        KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot
        KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps()
        KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range()
        KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn
        KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list
        KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version
        KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure()
        KVM: x86: Update retry protection fields when forcing retry on emulation failure
        KVM: x86: Apply retry protection to "unprotect on failure" path
        KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn
        ...
      3efc5736
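      The kvm_clear_guest() item in the merge message above concerns a write
      whose gpa+len range crosses a page boundary. As a rough illustration of
      that hazard, and not KVM's actual code, the sketch below models guest
      memory as an array of separate pages and splits a clear request into
      per-page segments so that no single memset() runs past the end of a
      page. All names (toy_guest, toy_clear_page(), toy_clear_guest(),
      TOY_PAGE_SIZE) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/* Hypothetical model: guest memory as an array of separate pages. */
struct toy_guest {
	uint8_t (*pages)[TOY_PAGE_SIZE];
	size_t npages;
};

/* Clear len bytes inside one page; reject anything that would spill over. */
static int toy_clear_page(struct toy_guest *g, uint64_t gfn,
			  size_t offset, size_t len)
{
	if (gfn >= g->npages || offset >= TOY_PAGE_SIZE ||
	    len > TOY_PAGE_SIZE - offset)
		return -1;
	memset(&g->pages[gfn][offset], 0, len);
	return 0;
}

/* Clear [gpa, gpa + len), one page-bounded segment at a time. */
static int toy_clear_guest(struct toy_guest *g, uint64_t gpa, size_t len)
{
	while (len) {
		uint64_t gfn = gpa / TOY_PAGE_SIZE;
		size_t offset = gpa % TOY_PAGE_SIZE;
		size_t seg = TOY_PAGE_SIZE - offset;	/* room left in this page */

		if (seg > len)
			seg = len;
		if (toy_clear_page(g, gfn, offset, seg))
			return -1;
		gpa += seg;
		len -= seg;
	}
	return 0;
}
```

      The bounds check in toy_clear_page() plays a role loosely comparable to
      the WARNs mentioned in the merge message: a caller that passes a range
      spilling past the page is caught rather than silently overflowing.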