1. 18 Nov, 2021 40 commits
    • Merge tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 8d0112ac
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bpf, mac80211.
      
        Current release - regressions:
      
         - devlink: don't throw an error if flash notification sent before
           devlink visible
      
         - page_pool: Revert "page_pool: disable dma mapping support...",
           turns out there are active arches who need it
      
        Current release - new code bugs:
      
         - amt: cancel delayed_work synchronously in amt_fini()
      
        Previous releases - regressions:
      
         - xsk: fix crash on double free in buffer pool
      
         - bpf: fix inner map state pruning regression causing program
           rejections
      
         - mac80211: drop check for DONT_REORDER in __ieee80211_select_queue,
           preventing mis-selecting the best effort queue
      
         - mac80211: do not access the IV when it was stripped
      
         - mac80211: fix radiotap header generation, off-by-one
      
         - nl80211: fix getting radio statistics in survey dump
      
         - e100: fix device suspend/resume
      
        Previous releases - always broken:
      
         - tcp: fix uninitialized access in skb frags array for Rx 0cp
      
         - bpf: fix toctou on read-only map's constant scalar tracking
      
         - bpf: forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing
           progs
      
         - tipc: only accept encrypted MSG_CRYPTO msgs
      
         - smc: transfer remaining wait queue entries during fallback, fix
           missing wake ups
      
         - udp: validate checksum in udp_read_sock() (when sockmap is used)
      
         - sched: act_mirred: drop dst for the direction from egress to
           ingress
      
         - virtio_net_hdr_to_skb: count transport header in UFO, prevent
           allowing bad skbs into the stack
      
         - nfc: reorder the logic in nfc_{un,}register_device, fix unregister
      
         - ipsec: check return value of ipv6_skip_exthdr
      
         - usb: r8152: add MAC passthrough support for more Lenovo Docks"
      
      * tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits)
        ptp: ocp: Fix a couple NULL vs IS_ERR() checks
        net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
        net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
        ipv6: check return value of ipv6_skip_exthdr
        e100: fix device suspend/resume
        devlink: Don't throw an error if flash notification sent before devlink visible
        page_pool: Revert "page_pool: disable dma mapping support..."
        ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()
        octeontx2-af: debugfs: don't corrupt user memory
        NFC: add NCI_UNREG flag to eliminate the race
        NFC: reorder the logic in nfc_{un,}register_device
        NFC: reorganize the functions in nci_request
        tipc: check for null after calling kmemdup
        i40e: Fix display error code in dmesg
        i40e: Fix creation of first queue by omitting it if is not power of two
        i40e: Fix warning message and call stack during rmmod i40e driver
        i40e: Fix ping is lost after configuring ADq on VF
        i40e: Fix changing previously set num_queue_pairs for PFs
        i40e: Fix NULL ptr dereference on VSI filter sync
        i40e: Fix correct max_pkt_size on VF RX queue
        ...
      8d0112ac
    • Merge tag 'for-5.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 6fdf8864
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Several fixes and one old ioctl deprecation. Namely there's a fix for
        crashes/warnings with lzo compression that was suspected to be caused
        by first pull merge resolution, but it was a different bug.
      
        Summary:
      
         - regression fix for a crash in lzo due to missing boundary checks of
           the page array
      
         - fix crashes on ARM64 due to missing barriers when synchronizing
           status bits between work queues
      
         - silence lockdep when reading chunk tree during mount
      
         - fix false positive warning in integrity checker on devices with
           disabled write caching
      
         - fix signedness of bitfields in scrub
      
         - start deprecation of balance v1 ioctl"
      
      * tag 'for-5.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: deprecate BTRFS_IOC_BALANCE ioctl
        btrfs: make 1-bit bit-fields of scrub_page unsigned int
        btrfs: check-integrity: fix a warning on write caching disabled disk
        btrfs: silence lockdep when reading chunk tree during mount
        btrfs: fix memory ordering between normal and ordered work functions
        btrfs: fix a out-of-bound access in copy_compressed_data_to_page()
      6fdf8864
    • Merge tag 'fs_for_v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · db850a9b
      Linus Torvalds authored
      Pull UDF fix from Jan Kara:
       "A fix for a long-standing UDF bug where we were not properly
        validating directory position inside readdir"
      
      * tag 'fs_for_v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        udf: Fix crash after seekdir
      db850a9b
    • Merge tag 'fs.idmapped.v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 7cf7eed1
      Linus Torvalds authored
      Pull setattr idmapping fix from Christian Brauner:
       "This contains a simple fix for setattr. When determining the validity
        of the attributes the ia_{g,u}id fields contain the value that will be
        written to inode->i_{g,u}id. When the {g,u}id attribute of the file
        isn't altered and the caller's fs{g,u}id matches the current {g,u}id
        attribute the attribute change is allowed.
      
        The value in ia_{g,u}id does already account for idmapped mounts and
        will have taken the relevant idmapping into account. So in order to
        verify that the {g,u}id attribute isn't changed we simply need to
        compare the ia_{g,u}id value against the inode's i_{g,u}id value.
      
        This only has any meaning for idmapped mounts as idmapping helpers are
        idempotent without them. And for idmapped mounts this really only has
        a meaning when circular idmappings are used, i.e. mappings where e.g.
        id 1000 is mapped to id 1001 and id 1001 is mapped to id 1000. Such
        circular mappings can e.g. be useful when sharing the same home
        directory between multiple users at the same time.
      
        Before this patch we could end up denying legitimate attribute changes
        and allowing invalid attribute changes when circular mappings are
        used. To even get into this situation the caller must've been
        privileged both to create that mapping and to create that idmapped
        mount.
      
        This hasn't been seen in the wild anywhere but came up when expanding
        the fstest suite during work on a series of hardening patches. All
        idmapped fstests pass without any regressions and we're adding new
        tests to verify the behavior of circular mappings.
      
        The new tests can be found at [1]"
      
      Link: https://lore.kernel.org/linux-fsdevel/20211109145713.1868404-2-brauner@kernel.org [1]
      
      * tag 'fs.idmapped.v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        fs: handle circular mappings correctly
      7cf7eed1
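The circular-mapping case described in this message can be illustrated with a small user-space sketch. It is not the kernel code: map_id(), owner_unchanged_old() and owner_unchanged_new() are hypothetical names, and the "old" variant only assumes that the earlier check compared the requested value against the owner as seen through the idmapped mount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical circular idmapping from the message: 1000 <-> 1001. */
static unsigned int map_id(unsigned int id)
{
	if (id == 1000)
		return 1001;
	if (id == 1001)
		return 1000;
	return id;
}

/*
 * Assumed shape of the earlier check: compare the requested value
 * (already mapped) against the owner as seen through the mount.
 */
static bool owner_unchanged_old(unsigned int ia_uid, unsigned int i_uid)
{
	return ia_uid == map_id(i_uid);
}

/* Check as described in the message: compare ia_uid with i_uid directly. */
static bool owner_unchanged_new(unsigned int ia_uid, unsigned int i_uid)
{
	return ia_uid == i_uid;
}

static void demo(void)
{
	unsigned int i_uid = 1000;	/* owner stored in the inode */

	/* Caller sees owner 1001 and asks to keep it: a legitimate no-op. */
	assert(!owner_unchanged_old(map_id(1001), i_uid));	/* wrongly "changed" */
	assert(owner_unchanged_new(map_id(1001), i_uid));

	/* Caller asks for 1000 in the mount, which would store 1001. */
	assert(owner_unchanged_old(map_id(1000), i_uid));	/* wrongly "unchanged" */
	assert(!owner_unchanged_new(map_id(1000), i_uid));
}
```

With a non-circular mapping both variants agree in this sketch, which matches the statement that the problem only shows up when circular idmappings are used.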
    • Merge tag 'for-5.16/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · a6a6d227
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
       "parisc bug and warning fixes and wire up futex_waitv.
      
        Fix some warnings which showed up with allmodconfig builds, a revert
        of a change to the sigreturn trampoline which broke signal handling,
        wire up futex_waitv and add CONFIG_PRINTK_TIME=y to 32bit defconfig"
      
      * tag 'for-5.16/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfig
        Revert "parisc: Reduce sigreturn trampoline to 3 instructions"
        parisc: Wrap assembler related defines inside __ASSEMBLY__
        parisc: Wire up futex_waitv
        parisc: Include stringify.h to avoid build error in crypto/api.c
        parisc/sticon: fix reverse colors
      a6a6d227
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · c46e8ece
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Selftest changes:
      
         - Cleanups for the perf test infrastructure and mapping hugepages
      
         - Avoid contention on mmap_sem when the guests start to run
      
         - Add event channel upcall support to xen_shinfo_test
      
        x86 changes:
      
         - Fixes for Xen emulation
      
         - Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache
      
         - Fixes for migration of 32-bit nested guests on 64-bit hypervisor
      
         - Compilation fixes
      
         - More SEV cleanups
      
        Generic:
      
         - Cap the return value of KVM_CAP_NR_VCPUS to both KVM_CAP_MAX_VCPUS
           and num_online_cpus(). Most architectures were only using one of
           the two"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
        KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
        KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
        KVM: x86: Assume a 64-bit hypercall for guests with protected state
        selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore
        riscv: kvm: fix non-kernel-doc comment block
        KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()
        KVM: SEV: Drop a redundant setting of sev->asid during initialization
        KVM: SEV: WARN if SEV-ES is marked active but SEV is not
        KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()
        KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs
        KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache
        KVM: nVMX: Use a gfn_to_hva_cache for vmptrld
        KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check
        KVM: x86/xen: Use sizeof_field() instead of open-coding it
        KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12
        KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO
        ...
      c46e8ece
    • Merge tag 'docs-5.16-2' of git://git.lwn.net/linux · 4ae275bc
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A handful of documentation fixes for 5.16"
      
      * tag 'docs-5.16-2' of git://git.lwn.net/linux:
        Documentation/process: fix a cross reference
        Documentation: update vcpu-requests.rst reference
        docs: accounting: update delay-accounting.rst reference
        libbpf: update index.rst reference
        docs: filesystems: Fix grammatical error "with" to "which"
        doc/zh_CN: fix a translation error in management-style
        docs: ftrace: fix the wrong path of tracefs
        Documentation: arm: marvell: Fix link to armada_1000_pb.pdf document
        Documentation: arm: marvell: Put Armada XP section between Armada 370 and 375
        Documentation: arm: marvell: Add some links to homepage / product infos
        docs: Update Sphinx requirements
      4ae275bc
    • Merge tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux · 7d5775d4
      Linus Torvalds authored
      Pull printk fixes from Petr Mladek:
      
       - Try to flush backtraces from other CPUs also on the local one. This
         was a regression caused by printk_safe buffers removal.
      
       - Remove header dependency warning.
      
      * tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
        printk: Remove printk.h inclusion in percpu.h
        printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
      7d5775d4
    • ptp: ocp: Fix a couple NULL vs IS_ERR() checks · c7521d3a
      Dan Carpenter authored
      The ptp_ocp_get_mem() function does not return NULL, it returns error
      pointers.
      
      Fixes: 773bda96 ("ptp: ocp: Expose various resources on the timecard.")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c7521d3a
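A minimal user-space sketch of why a NULL test misses an error pointer. The helpers err_ptr(), is_err() and ptr_err() are simplified stand-ins for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), and get_mem() is a hypothetical function that, like ptp_ocp_get_mem(), never returns NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int is_err(const void *ptr)
{
	/* Error pointers occupy the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical mapper: returns a valid pointer or an error pointer, never NULL. */
static void *get_mem(int fail)
{
	static int region;

	return fail ? err_ptr(-ENOMEM) : (void *)&region;
}

/* Buggy pattern: a NULL test never fires for an error pointer. */
static int buggy_check(const void *p)
{
	return p == NULL ? -1 : 0;
}

/* Fixed pattern: detect the encoded errno and propagate it. */
static int fixed_check(const void *p)
{
	return is_err(p) ? (int)ptr_err(p) : 0;
}

static void demo(void)
{
	assert(buggy_check(get_mem(1)) == 0);		/* failure goes unnoticed */
	assert(fixed_check(get_mem(1)) == -ENOMEM);	/* failure is reported */
	assert(fixed_check(get_mem(0)) == 0);
}
```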
    • net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock() · 0fa68da7
      Teng Qi authored
      The definition of macro MOTO_SROM_BUG is:
        #define MOTO_SROM_BUG    (lp->active == 8 && (get_unaligned_le32(dev->dev_addr) & 0x00ffffff) == 0x3e0008)
      
      and the if statement
        if (MOTO_SROM_BUG) lp->active = 0;
      
      using this macro indicates lp->active could be 8. If lp->active is 8 and
      the second comparison of this macro is false, lp->active will remain 8 in:
        lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1);
        lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1);
        lp->phy[lp->active].mc  = get_unaligned_le16(p); p += 2;
        lp->phy[lp->active].ana = get_unaligned_le16(p); p += 2;
        lp->phy[lp->active].fdx = get_unaligned_le16(p); p += 2;
        lp->phy[lp->active].ttm = get_unaligned_le16(p); p += 2;
        lp->phy[lp->active].mci = *p;
      
      However, the length of array lp->phy is 8, so array overflows can occur.
      To fix these possible array overflows, we first check lp->active and then
      return -EINVAL if it is greater than or equal to ARRAY_SIZE(lp->phy) (i.e. 8).
      Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
      Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0fa68da7
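The check described above can be sketched in user space as follows. The struct layout and the function store_phy() are hypothetical and reduced to one field; only the ARRAY_SIZE() comparison and the -EINVAL return come from the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, reduced versions of the driver's structures. */
struct phy_entry {
	int mci;
};

struct priv {
	int active;
	struct phy_entry phy[8];
};

/* Reject an out-of-range index before it is used to address lp->phy. */
static int store_phy(struct priv *lp, int mci)
{
	if (lp->active < 0 || (size_t)lp->active >= ARRAY_SIZE(lp->phy))
		return -EINVAL;

	lp->phy[lp->active].mci = mci;
	return 0;
}

static void demo(void)
{
	struct priv lp = { .active = 8 };

	assert(store_phy(&lp, 1) == -EINVAL);	/* index 8 is one past the end */

	lp.active = 7;
	assert(store_phy(&lp, 5) == 0);
	assert(lp.phy[7].mci == 5);
}
```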
    • net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound · 61217be8
      zhangyue authored
      In line 5001, if every id in the array 'lp->phy[8]' is nonzero, 'k' is 8
      when the 'for' loop ends.

      At that point, an access to 'lp->phy[k]' may be out of bounds.
      Signed-off-by: zhangyue <zhangyue1@kylinos.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      61217be8
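The loop-exit condition described above can be sketched as follows. This is not the driver code: find_free_slot() and MAX_PHY are hypothetical names, and the sketch only shows that the index must be checked after a search loop that can run off the end of the table.

```c
#include <assert.h>

#define MAX_PHY 8

/*
 * Return the index of the first free slot (id == 0), or -1 when every
 * slot is taken. Without the final check, k == MAX_PHY would be used
 * as an index one past the end of the array.
 */
static int find_free_slot(const int ids[MAX_PHY])
{
	int k;

	for (k = 0; k < MAX_PHY && ids[k] != 0; k++)
		;

	return k < MAX_PHY ? k : -1;
}

static void demo(void)
{
	const int full[MAX_PHY] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	const int partial[MAX_PHY] = { 1, 2, 0, 0, 0, 0, 0, 0 };

	assert(find_free_slot(full) == -1);
	assert(find_free_slot(partial) == 2);
}
```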
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 4e5d2124
      David S. Miller authored
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-11-17
      
      This series contains updates to i40e driver only.
      
      Eryk adds accounting for VLAN header in packet size when VF port VLAN is
      configured. He also fixes TC queue distribution when the user has changed
      queue counts as well as for configuration of VF ADQ which caused dropped
      packets.
      
      Michal adds tracking for when a VSI is being released to prevent null
      pointer dereference when managing filters.
      
      Karen ensures PF successfully initiates VF requested reset which could
      cause a call trace otherwise.
      
      Jedrzej moves validation of channel queue value earlier to prevent
      partial configuration when the value is invalid.
      
      Grzegorz corrects the reported error when adding filter fails.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e5d2124
    • ipv6: check return value of ipv6_skip_exthdr · 5f9c55c8
      Jordy Zomer authored
      The offset value is used in pointer math on skb->data.
      Since ipv6_skip_exthdr may return -1 the pointer to uh and th
      may not point to the actual udp and tcp headers and potentially
      overwrite other stuff. This is why I think this should be checked.
      
      EDIT:  added {}'s, thanks Kees
      Signed-off-by: Jordy Zomer <jordy@pwning.systems>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5f9c55c8
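A minimal sketch of the hazard: an offset that may be -1 must be validated before it is added to a data pointer. transport_hdr() is a hypothetical user-space helper, not the kernel function; the upper-bound test is an extra assumption added for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer to the transport header, or NULL when the offset
 * reported by the header walk is negative (error) or past the buffer.
 */
static const unsigned char *transport_hdr(const unsigned char *data,
					  size_t len, int offset)
{
	if (offset < 0 || (size_t)offset >= len)
		return NULL;

	return data + offset;
}

static void demo(void)
{
	unsigned char pkt[64] = { 0 };

	assert(transport_hdr(pkt, sizeof(pkt), -1) == NULL);
	assert(transport_hdr(pkt, sizeof(pkt), 40) == pkt + 40);
}
```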
    • e100: fix device suspend/resume · 5d2ca2e1
      Jesse Brandeburg authored
      As reported in [1], e100 was no longer working for suspend/resume
      cycles. The previous commit mentioned in the fixes appears to have
      broken things and this attempts to practice best known methods for
      device power management and keep wake-up working while allowing
      suspend/resume to work. To do this, I reorder a little bit of code
      and fix the resume path to make sure the device is enabled.
      
      [1] https://bugzilla.kernel.org/show_bug.cgi?id=214933
      
      Fixes: 69a74aef ("e100: use generic power management")
      Cc: Vaibhav Gupta <vaibhavgupta40@gmail.com>
      Reported-by: Alexey Kuznetsov <axet@me.com>
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Alexey Kuznetsov <axet@me.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d2ca2e1
    • devlink: Don't throw an error if flash notification sent before devlink visible · fec1faf2
      Leon Romanovsky authored
      The mlxsw driver calls various devlink flash routines even before
      users can get any access to the devlink instance itself. For example,
      mlxsw_core_fw_rev_validate() is one such function.
      
      __mlxsw_core_bus_device_register
       -> mlxsw_core_fw_rev_validate
        -> mlxsw_core_fw_flash
         -> mlxfw_firmware_flash
          -> mlxfw_status_notify
           -> devlink_flash_update_status_notify
            -> __devlink_flash_update_notify
             -> WARN_ON(...)
      
      This causes the WARN_ON to trigger a warning about devlink not being registered.
      
      Fixes: cf530217 ("devlink: Notify users when objects are accessible")
      Reported-by: Danielle Ratson <danieller@nvidia.com>
      Tested-by: Danielle Ratson <danieller@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fec1faf2
    • page_pool: Revert "page_pool: disable dma mapping support..." · f915b75b
      Yunsheng Lin authored
      This reverts commit d00e60ee.
      
      As reported by Guillaume in [1]:
      Enabling LPAE always enables CONFIG_ARCH_DMA_ADDR_T_64BIT
      in 32-bit systems, which breaks the bootup process when an
      ethernet driver is using page pool with PP_FLAG_DMA_MAP flag.
      We were hoping there were no active consumers on such systems
      when we removed the dma mapping support, but LPAE seems to be
      a common feature on 32-bit systems, so revert it.
      
      1. https://www.spinics.net/lists/netdev/msg779890.html
      
      Fixes: d00e60ee ("page_pool: disable dma mapping support for 32-bit arch with 64-bit DMA")
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Reported-by: "kernelci.org bot" <bot@kernelci.org>
      Tested-by: "kernelci.org bot" <bot@kernelci.org>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f915b75b
    • ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port() · a66998e0
      Teng Qi authored
      
      The if statement:
        if (port >= DSAF_GE_NUM)
              return;
      
      limits the value of port to less than DSAF_GE_NUM (i.e., 8).
      However, if the value of port is 6 or 7, an array overflow could occur:
        port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off;
      
      because the length of dsaf_dev->mac_cb is DSAF_MAX_PORT_NUM (i.e., 6).
      
      To fix this possible array overflow, we first check port and if it is
      greater than or equal to DSAF_MAX_PORT_NUM, the function returns.
      Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
      Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a66998e0
    • Petr Mladek's avatar
      bf6d0d1e
    • Helge Deller's avatar
      9412f5aa
    • Revert "parisc: Reduce sigreturn trampoline to 3 instructions" · 79df39d5
      Helge Deller authored
      This reverts commit e4f2006f.
      
      This patch shows problems with signal handling. Revert it for now.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v5.15
      79df39d5
    • parisc: Wrap assembler related defines inside __ASSEMBLY__ · 4017b230
      Helge Deller authored
      Building allmodconfig shows errors in the gpu/drm/msm snapdragon drivers,
      because a COND() define is used there which conflicts with the COND() for
      PA-RISC assembly.  Although the snapdragon driver isn't relevant for parisc, it
      is nevertheless compiled when CONFIG_COMPILE_TEST is defined.
      
      Move the COND() define and other PA-RISC mnemonics inside the #ifdef
      __ASSEMBLY__ part to avoid this conflict.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Reported-by: kernel test robot <lkp@intel.com>
      4017b230
    • parisc: Wire up futex_waitv · 8f663eb3
      Helge Deller authored
      Signed-off-by: Helge Deller <deller@gmx.de>
      8f663eb3
    • parisc: Include stringify.h to avoid build error in crypto/api.c · 4d7804d2
      Helge Deller authored
      Include stringify.h to avoid this build error:
       arch/parisc/include/asm/jump_label.h: error: expected ':' before '__stringify'
       arch/parisc/include/asm/jump_label.h: error: label 'l_yes' defined but not used [-Werror=unused-label]
      Signed-off-by: Helge Deller <deller@gmx.de>
      Reported-by: kernel test robot <lkp@intel.com>
      4d7804d2
    • KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS · 2845e735
      Vitaly Kuznetsov authored
      It doesn't make sense to return the recommended maximum number of
      vCPUs which exceeds the maximum possible number of vCPUs.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20211116163443.88707-7-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2845e735
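The capping rule can be expressed as a one-line minimum. This is a user-space sketch; recommended_vcpus() and its parameters are hypothetical names standing in for the advisory value (KVM_CAP_NR_VCPUS), the number of online CPUs, and the hard limit (KVM_CAP_MAX_VCPUS).

```c
#include <assert.h>

/* The advisory vCPU count never exceeds the maximum possible count. */
static unsigned int recommended_vcpus(unsigned int online_cpus,
				      unsigned int max_vcpus)
{
	return online_cpus < max_vcpus ? online_cpus : max_vcpus;
}

static void demo(void)
{
	/* More online CPUs than the hard limit: report the limit. */
	assert(recommended_vcpus(2048, 1024) == 1024);
	/* Fewer online CPUs than the limit: report the online count. */
	assert(recommended_vcpus(8, 1024) == 8);
}
```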
    • KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus() · 82cc27ef
      Vitaly Kuznetsov authored
      KVM_CAP_NR_VCPUS is a legacy advisory value which on other architectures
      returns num_online_cpus() capped by KVM_CAP_NR_VCPUS or something else
      (ppc and arm64 are special cases). On s390, KVM_CAP_NR_VCPUS returns
      the same as KVM_CAP_MAX_VCPUS and this may turn out to be a bad
      'advice'. Switch s390 to returning capped num_online_cpus() too.
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
      Message-Id: <20211116163443.88707-6-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      82cc27ef
    • KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS · 37fd3ce1
      Vitaly Kuznetsov authored
      It doesn't make sense to return the recommended maximum number of
      vCPUs which exceeds the maximum possible number of vCPUs.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: Anup Patel <anup.patel@wdc.com>
      Reviewed-by: Anup Patel <anup.patel@wdc.com>
      Message-Id: <20211116163443.88707-5-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      37fd3ce1
    • KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS · b7915d55
      Vitaly Kuznetsov authored
      It doesn't make sense to return the recommended maximum number of
      vCPUs which exceeds the maximum possible number of vCPUs.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20211116163443.88707-4-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b7915d55
    • KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS · 57a2e13e
      Vitaly Kuznetsov authored
      It doesn't make sense to return the recommended maximum number of
      vCPUs which exceeds the maximum possible number of vCPUs.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20211116163443.88707-3-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      57a2e13e
    • KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus() · f60a00d7
      Vitaly Kuznetsov authored
      Generally, it doesn't make sense to return the recommended maximum number
      of vCPUs which exceeds the maximum possible number of vCPUs.
      
      Note: ARM64 is special as the value returned by KVM_CAP_MAX_VCPUS differs
      depending on whether it is a system-wide ioctl or a per-VM one. Previously,
      KVM_CAP_NR_VCPUS didn't have this difference and it seems preferable to
      keep the status quo. Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
      which is what gets returned by system-wide KVM_CAP_MAX_VCPUS.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20211116163443.88707-2-vkuznets@redhat.com>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f60a00d7
    • KVM: x86: Assume a 64-bit hypercall for guests with protected state · b5aead00
      Tom Lendacky authored
      When processing a hypercall for a guest with protected state, currently
      SEV-ES guests, the guest CS segment register can't be checked to
      determine if the guest is in 64-bit mode. For an SEV-ES guest, it is
      expected that communication between the guest and the hypervisor is
      performed to shared memory using the GHCB. In order to use the GHCB, the
      guest must have been in long mode, otherwise writes by the guest to the
      GHCB would be encrypted and not be able to be comprehended by the
      hypervisor.
      
      Create a new helper function, is_64_bit_hypercall(), that assumes the
      guest is in 64-bit mode when the guest has protected state, and returns
      true, otherwise invoking is_64_bit_mode() to determine the mode. Update
      the hypercall related routines to use is_64_bit_hypercall() instead of
      is_64_bit_mode().
      
      Add a WARN_ON_ONCE() to is_64_bit_mode() to catch occurrences of calls to
      this helper function for a guest running with protected state.
      
      Fixes: f1c6366e ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
      Reported-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e0b20c770c9d0d1403f23d83e785385104211f74.1621878537.git.thomas.lendacky@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b5aead00
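The helper logic described in this message can be sketched in user space as below. The struct vcpu and its two fields are hypothetical simplifications; only the function names is_64_bit_mode() and is_64_bit_hypercall() and the "assume 64-bit when state is protected" rule come from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced vCPU state for the sketch. */
struct vcpu {
	bool guest_state_protected;	/* e.g. an SEV-ES guest */
	bool cs_long_mode;		/* what a CS register check would report */
};

/* Only meaningful when the guest's register state is readable. */
static bool is_64_bit_mode(const struct vcpu *v)
{
	return v->cs_long_mode;
}

/*
 * Guests with protected state are assumed to be in 64-bit mode;
 * everyone else falls back to the register check.
 */
static bool is_64_bit_hypercall(const struct vcpu *v)
{
	return v->guest_state_protected || is_64_bit_mode(v);
}

static void demo(void)
{
	struct vcpu protected_guest = { .guest_state_protected = true };
	struct vcpu legacy_guest = { .cs_long_mode = false };
	struct vcpu long_mode_guest = { .cs_long_mode = true };

	assert(is_64_bit_hypercall(&protected_guest));
	assert(!is_64_bit_hypercall(&legacy_guest));
	assert(is_64_bit_hypercall(&long_mode_guest));
}
```

The short-circuit in is_64_bit_hypercall() means the register check is never reached for a protected guest, which is the case the WARN_ON_ONCE() in the message is meant to catch.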
    • selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore · b768f60b
      Arnaldo Carvalho de Melo authored
        $ git status
        nothing to commit, working tree clean
        $
        $ make -C tools/testing/selftests/kvm/ > /dev/null 2>&1
        $ git status
      
        Untracked files:
          (use "git add <file>..." to include in what will be committed)
        	tools/testing/selftests/kvm/x86_64/sev_migrate_tests
      
        nothing added to commit but untracked files present (use "git add" to track)
        $
      
      Fixes: 6a581508 ("selftest: KVM: Add intra host migration tests")
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Orr <marcorr@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Gonda <pgonda@google.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Message-Id: <YZPIPfvYgRDCZi/w@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b768f60b
    • riscv: kvm: fix non-kernel-doc comment block · 0e2e6419
      Randy Dunlap authored
      Don't use "/**" to begin a comment block for a non-kernel-doc comment.
      
      Prevents this docs build warning:
      
      vcpu_sbi.c:3: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
       * Copyright (c) 2019 Western Digital Corporation or its affiliates.
      
      Fixes: dea8ee31 ("RISC-V: KVM: Add SBI v0.1 support")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: Atish Patra <atish.patra@wdc.com>
      Cc: Anup Patel <anup.patel@wdc.com>
      Cc: kvm@vger.kernel.org
      Cc: kvm-riscv@lists.infradead.org
      Cc: linux-riscv@lists.infradead.org
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Message-Id: <20211107034706.30672-1-rdunlap@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      0e2e6419
    • Merge branch 'kvm-5.16-fixes' into kvm-master · 817506df
      Paolo Bonzini authored
      * Fixes for Xen emulation
      
      * Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache
      
      * Fixes for migration of 32-bit nested guests on 64-bit hypervisor
      
      * Compilation fixes
      
      * More SEV cleanups
      817506df
    • Sean Christopherson's avatar
      KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror() · 8e38e96a
      Sean Christopherson authored
      Rename cmd_allowed_from_miror() to is_cmd_allowed_from_mirror(), fixing
      a typo and making it obvious that the result is a boolean where
      false means "not allowed".
      
      No functional change intended.

      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109215101.2211373-7-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8e38e96a
    • Sean Christopherson's avatar
      KVM: SEV: Drop a redundant setting of sev->asid during initialization · ea410ef4
      Sean Christopherson authored
      Remove a fully redundant write to sev->asid during SEV/SEV-ES guest
      initialization.  The ASID is set a few lines earlier prior to the call to
      sev_platform_init(), which doesn't take "sev" as a param, i.e. can't
      muck with the ASID barring some truly magical behind-the-scenes code.
      
      No functional change intended.

      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109215101.2211373-6-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ea410ef4
    • Sean Christopherson's avatar
      KVM: SEV: WARN if SEV-ES is marked active but SEV is not · 1bd00a42
      Sean Christopherson authored
      WARN if the VM is tagged as SEV-ES but not SEV.  KVM relies on SEV and
      SEV-ES being set atomically, and guards common flows with "is SEV", i.e.
      observing SEV-ES without SEV means KVM has a fatal bug.

      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109215101.2211373-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1bd00a42
    • Sean Christopherson's avatar
      KVM: SEV: Set sev_info.active after initial checks in sev_guest_init() · a41fb26e
      Sean Christopherson authored
      Set sev_info.active during SEV/SEV-ES activation before calling any code
      that can potentially consume sev_info.es_active, e.g. set "active" and
      "es_active" as a pair immediately after the initial sanity checks.  KVM
      generally expects that es_active can be true if and only if active is
      true, e.g. sev_asid_new() deliberately avoids sev_es_guest() so that it
      doesn't get a false negative.  This will allow WARNing in sev_es_guest()
      if the VM is tagged as SEV-ES but not SEV.
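
      The invariant described above (es_active implies active) can be
      sketched in user-space C as follows. This is a simplified model, not
      the KVM code: struct sev_info_sketch and the *_sketch() helpers are
      hypothetical names, and fprintf() stands in for the kernel's WARN.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-VM SEV state; not the kernel struct. */
struct sev_info_sketch {
	bool active;     /* VM is an SEV guest */
	bool es_active;  /* VM is an SEV-ES guest */
};

/* Set both flags together so no caller can observe es_active alone. */
static void sev_activate_sketch(struct sev_info_sketch *sev, bool es)
{
	sev->active = true;
	sev->es_active = es;
}

static bool sev_guest_sketch(const struct sev_info_sketch *sev)
{
	return sev->active;
}

static bool sev_es_guest_sketch(const struct sev_info_sketch *sev)
{
	/* User-space substitute for WARN: report the broken invariant. */
	if (sev->es_active && !sev->active)
		fprintf(stderr, "WARN: SEV-ES marked active without SEV\n");

	return sev->active && sev->es_active;
}
```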

      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109215101.2211373-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a41fb26e
    • Sean Christopherson's avatar
      KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs · 79b11142
      Sean Christopherson authored
      Reject COPY_ENC_CONTEXT_FROM if the destination VM has created vCPUs.
      KVM relies on SEV activation to occur before vCPUs are created, e.g. to
      set VMCB flags and intercepts correctly.
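
      A minimal sketch of the ordering rule, assuming a toy VM structure:
      struct vm_sketch, copy_enc_context_from_sketch() and the choice of
      -EINVAL as the error code are all assumptions made for illustration
      and are not taken from the KVM sources.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced model of a VM; not struct kvm. */
struct vm_sketch {
	int created_vcpus;  /* number of vCPUs already created */
	bool sev_active;    /* VM already has an SEV context */
};

/*
 * Share the source VM's encryption context with the destination.
 * The destination must be "fresh": no SEV context and no vCPUs yet,
 * because per-vCPU setup depends on SEV being active beforehand.
 */
static int copy_enc_context_from_sketch(struct vm_sketch *dst,
					const struct vm_sketch *src)
{
	if (!src->sev_active)
		return -EINVAL;

	if (dst->sev_active || dst->created_vcpus)
		return -EINVAL;

	dst->sev_active = true;
	return 0;
}
```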
      
      Fixes: 54526d1f ("KVM: x86: Support KVM VMs sharing SEV context")
      Cc: stable@vger.kernel.org
      Cc: Peter Gonda <pgonda@google.com>
      Cc: Marc Orr <marcorr@google.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Nathan Tempelman <natet@google.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109215101.2211373-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      79b11142
    • David Woodhouse's avatar
      KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache · 357a18ad
      David Woodhouse authored
      In commit 7e2175eb ("KVM: x86: Fix recording of guest steal time /
      preempted status") I removed the only user of these functions because
      it was basically impossible to use them safely.
      
      There are two stages to the GFN->PFN mapping; first through the KVM
      memslots to a userspace HVA and then through the page tables to
      translate that HVA to an underlying PFN. Invalidations of the former
      were being handled correctly, but no attempt was made to use the MMU
      notifiers to invalidate the cache when the HVA->PFN mapping changed.
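
      The failure mode can be modelled with a toy two-stage lookup. The
      sketch below is not the KVM implementation: struct toy_vm, struct
      toy_pfn_cache and the toy_*() functions are hypothetical, and arrays
      stand in for memslots and host page tables. The cache revalidates
      only when the first stage changes, so a change in the second stage
      (HVA to PFN) leaves a stale PFN behind.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NPAGES 4

struct toy_vm {
	uint64_t gfn_to_hva[TOY_NPAGES]; /* stage 1: memslots (page index) */
	uint64_t hva_to_pfn[TOY_NPAGES]; /* stage 2: host page tables */
	uint64_t memslot_gen;            /* bumped when stage 1 changes */
};

struct toy_pfn_cache {
	uint64_t gfn;
	uint64_t pfn;
	uint64_t memslot_gen;
	bool valid;
};

/* Uncached translation: walk both stages every time. */
static uint64_t toy_lookup(const struct toy_vm *vm, uint64_t gfn)
{
	return vm->hva_to_pfn[vm->gfn_to_hva[gfn]];
}

/* Cached translation that notices stage-1 changes only. */
static uint64_t toy_cached_lookup(const struct toy_vm *vm,
				  struct toy_pfn_cache *c, uint64_t gfn)
{
	if (!c->valid || c->gfn != gfn || c->memslot_gen != vm->memslot_gen) {
		c->gfn = gfn;
		c->pfn = toy_lookup(vm, gfn);
		c->memslot_gen = vm->memslot_gen;
		c->valid = true;
	}
	/* Nothing here detects a stage-2 change, so c->pfn may be stale. */
	return c->pfn;
}
```

      In this model, remapping a page in hva_to_pfn[] without touching
      memslot_gen makes toy_cached_lookup() keep returning the old PFN,
      while toy_lookup() returns the new one.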
      
      As a prelude to reinventing the gfn_to_pfn_cache with more usable
      semantics, rip it out entirely and untangle the implementation of
      the unsafe kvm_vcpu_map()/kvm_vcpu_unmap() functions from it.
      
      All current users of kvm_vcpu_map() also look broken right now, and
      will be dealt with separately. They broadly fall into two classes:
      
      * Those which map, access the data and immediately unmap. This is
        mostly gratuitous and could just as well use the existing user
        HVA, and could probably benefit from a gfn_to_hva_cache as they
        do so.
      
      * Those which keep the mapping around for a longer time, perhaps
        even using the PFN directly from the guest. These will need to
        be converted to the new gfn_to_pfn_cache and then kvm_vcpu_map()
        can be removed too.

      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211115165030.7422-8-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      357a18ad
    • David Woodhouse's avatar
      KVM: nVMX: Use a gfn_to_hva_cache for vmptrld · cee66664
      David Woodhouse authored
      And thus another call to kvm_vcpu_map() can die.

      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211115165030.7422-7-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cee66664