1. 26 Mar, 2022 12 commits
    • net: sparx5: switchdev: fix possible NULL pointer dereference · 0906f3a3
      Zheng Yongjun authored
      Because the allocation can fail, devm_kzalloc() may return a NULL
      pointer. Check 'db' after the allocation to prevent a NULL pointer
      dereference.
      
      Fixes: 10615907 ("net: sparx5: switchdev: adding frame DMA functionality")
      Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/x25: Fix null-ptr-deref caused by x25_disconnect · 77816079
      Duoming Zhou authored
      When the link layer is terminating, x25->neighbour will be set to NULL
      in x25_disconnect(). As a result, it could cause null-ptr-deref bugs in
      x25_sendmsg(), x25_recvmsg() and x25_connect(). One of the bugs is
      shown below.
      
          (Thread 1)                 |  (Thread 2)
      x25_link_terminated()          | x25_recvmsg()
       x25_kill_by_neigh()           |  ...
        x25_disconnect()             |  lock_sock(sk)
         ...                         |  ...
         x25->neighbour = NULL //(1) |
         ...                         |  x25->neighbour->extended //(2)
      
      The code sets x25->neighbour to NULL at position (1) and dereferences
      x25->neighbour at position (2), which can cause a null-ptr-deref bug.

      This patch adds lock_sock() in x25_kill_by_neigh() in order to synchronize
      with x25_sendmsg(), x25_recvmsg() and x25_connect(). What's more, the
      sock held by lock_sock() is not NULL, because it is extracted from x25_list
      and uses x25_list_lock to synchronize.
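
The locking idea can be illustrated with a small user-space analogy. This is only a sketch, not the kernel patch: a pthread mutex stands in for lock_sock()/release_sock(), and the names `sock_like`, `disconnect_locked` and `read_extended` are hypothetical. The NULL re-check in the reader is an assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct neigh {
    int extended;
};

struct sock_like {
    pthread_mutex_t lock;     /* stands in for lock_sock()/release_sock() */
    struct neigh *neighbour;  /* may be cleared by the teardown path */
};

/* Teardown path: clear the pointer only while holding the lock. */
static void disconnect_locked(struct sock_like *sk)
{
    pthread_mutex_lock(&sk->lock);
    sk->neighbour = NULL;
    pthread_mutex_unlock(&sk->lock);
}

/* Reader path: take the same lock, then check the pointer before use.
 * Returns 0 and stores the value in *out, or -1 if the neighbour is gone. */
static int read_extended(struct sock_like *sk, int *out)
{
    int ret = -1;

    assert(out != NULL);
    pthread_mutex_lock(&sk->lock);
    if (sk->neighbour) {
        *out = sk->neighbour->extended;
        ret = 0;
    }
    pthread_mutex_unlock(&sk->lock);
    return ret;
}
```

Because both paths hold the same lock, the reader either sees a valid pointer for the whole critical section or sees NULL and backs off. It can no longer observe the pointer changing between the check and the dereference.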
      
      Fixes: 4becb7ee ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Reviewed-by: Lin Ma <linma@zju.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qlcnic: dcb: default to returning -EOPNOTSUPP · 1521db37
      Tom Rix authored
      Clang static analysis reports this issue
      qlcnic_dcb.c:382:10: warning: Assigned value is
        garbage or undefined
        mbx_out = *val;
                ^ ~~~~
      
      val is set in the qlcnic_dcb_query_hw_capability() wrapper.
      If there is no query_hw_capability op in dcb, success is
      returned without setting val.
      
      For this and similar wrappers, return -EOPNOTSUPP.
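
The wrapper pattern described above might look roughly like the following user-space sketch. The types `dcb` and `dcb_ops` and the sample op `fill_capability` are hypothetical stand-ins, not the driver's actual definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dcb;

struct dcb_ops {
    /* Optional op: may be NULL when the hardware lacks the feature. */
    int (*query_hw_capability)(struct dcb *dcb, char *val);
};

struct dcb {
    const struct dcb_ops *ops;
};

/* Wrapper: forward to the op if present, otherwise report "not supported"
 * instead of returning success with *val left unset. */
static int dcb_query_hw_capability(struct dcb *dcb, char *val)
{
    assert(val != NULL);
    if (dcb && dcb->ops && dcb->ops->query_hw_capability)
        return dcb->ops->query_hw_capability(dcb, val);
    return -EOPNOTSUPP;
}

/* Hypothetical op used only to exercise the wrapper. */
static int fill_capability(struct dcb *dcb, char *val)
{
    (void)dcb;
    *val = 7;
    return 0;
}
```

With an error return as the default, a caller that checks the return code never goes on to read the uninitialized value.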
      
      Fixes: 14d385b9 ("qlcnic: dcb: Query adapter DCB capabilities.")
      Signed-off-by: Tom Rix <trix@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sparx5: depends on PTP_1588_CLOCK_OPTIONAL · 08be6b13
      Randy Dunlap authored
      Fix build errors when PTP_1588_CLOCK=m and SPARX5_SWITCH=y.
      
      arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ethtool.o: in function `sparx5_get_ts_info':
      sparx5_ethtool.c:(.text+0x146): undefined reference to `ptp_clock_index'
      arc-linux-ld: sparx5_ethtool.c:(.text+0x146): undefined reference to `ptp_clock_index'
      arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ptp.o: in function `sparx5_ptp_init':
      sparx5_ptp.c:(.text+0xd56): undefined reference to `ptp_clock_register'
      arc-linux-ld: sparx5_ptp.c:(.text+0xd56): undefined reference to `ptp_clock_register'
      arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ptp.o: in function `sparx5_ptp_deinit':
      sparx5_ptp.c:(.text+0xf30): undefined reference to `ptp_clock_unregister'
      arc-linux-ld: sparx5_ptp.c:(.text+0xf30): undefined reference to `ptp_clock_unregister'
      arc-linux-ld: sparx5_ptp.c:(.text+0xf38): undefined reference to `ptp_clock_unregister'
      arc-linux-ld: sparx5_ptp.c:(.text+0xf46): undefined reference to `ptp_clock_unregister'
      arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ptp.o:sparx5_ptp.c:(.text+0xf46): more undefined references to `ptp_clock_unregister' follow
      
      Fixes: 3cfa11ba ("net: sparx5: add the basic sparx5 driver")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
      Cc: UNGLinuxDriver@microchip.com
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Steen Hegelund <steen.hegelund@microchip.com>
      Cc: Bjarni Jonasson <bjarni.jonasson@microchip.com>
      Cc: Lars Povlsen <lars.povlsen@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hns3-fixes' · 2eca426d
      David S. Miller authored
      Guangbin Huang says:
      
      ====================
      net: hns3: add some fixes for -net
      
      This series adds some fixes for the HNS3 ethernet driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix phy can not link up when autoneg off and reset · ad0ecaef
      Guangbin Huang authored
      Currently, hclge_mdio_read() returns 0 during reset (when the
      cmd state is set to disabled).

      With the generic PHY driver, phy_state_machine() updates the PHY speed
      every second in genphy_read_status_fixed() when the PHY is set to
      autoneg off, regardless of whether the link is down or up.

      If the PHY driver happens to read the BMCR register during reset, the PHY
      speed is updated to 10Mbps because the BMCR register value reads as 0. As a
      result, the PHY may fail to link up if the previous speed was not 10Mbps.

      To fix this problem, hclge_mdio_read() should return -EBUSY if
      the cmd state is disabled. The same applies to hclge_mdio_write().
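
A minimal user-space sketch of why the error return matters, assuming a hypothetical `mdio_dev` structure with a `cmd_disabled` flag. The real driver's state handling is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 32

struct mdio_dev {
    bool cmd_disabled;               /* true while the device is in reset */
    uint16_t regs[SKETCH_NUM_REGS];  /* simulated PHY registers */
};

/* Returns the register value (0..65535) on success or a negative errno.
 * During reset it returns -EBUSY, so the caller cannot mistake the result
 * for a genuine register value of 0. */
static int mdio_read(const struct mdio_dev *dev, unsigned int regnum)
{
    assert(dev != NULL);
    if (dev->cmd_disabled)
        return -EBUSY;
    if (regnum >= SKETCH_NUM_REGS)
        return -EINVAL;
    return dev->regs[regnum];
}
```

A caller that sees a negative value can skip the speed update for that polling cycle rather than decoding an all-zero BMCR as 10Mbps.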
      
      Fixes: 1c124938 ("net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read")
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: add NULL pointer check for hns3_set/get_ringparam() · 4d07c593
      Hao Chen authored
      When PCI device initialization fails and the device has not been
      reinitialized, priv->ring is NULL, and hns3_set/get_ringparam()
      accesses priv->ring, which causes a call trace.

      So, add a NULL pointer check to hns3_set/get_ringparam() to
      avoid this situation.
      
      Fixes: 5668abda ("net: hns3: add support for set_ringparam")
      Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: add netdev reset check for hns3_set_tunable() · f5cd6016
      Hao Chen authored
      When a PCI device reset fails, the driver performs the uninit operation
      and priv->ring is NULL, so accessing it causes a NULL pointer error.

      Add a netdev reset check to hns3_set_tunable() to fix it.
      
      Fixes: 99f6b5fb ("net: hns3: use bounce buffer when rx page can not be reused")
      Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: clean residual vf config after disable sriov · 671cb8cb
      Peng Li authored
      After SR-IOV is disabled, the VF still has some configuration and info,
      set up by the PF, that needs to be cleaned. This patch cleans the HW
      config and the SW struct vport->vf_info.
      
      Fixes: fa8d82e8 ("net: hns3: Add support of .sriov_configure in HNS3 driver")
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: add max order judgement for tx spare buffer · a89cbb16
      Hao Chen authored
      Add a max order check for the tx spare buffer. When the user sets the
      tx spare buf size to a value large enough that the order exceeds 10,
      print failure information instead of triggering a call trace.
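
As a rough illustration of the order check, the sketch below computes the page order for a requested size and rejects anything above 10. The 4 KiB page size, the helper names and the -EINVAL return code are assumptions of this sketch. The commit message only says the driver prints failure information.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */
#define SKETCH_MAX_ORDER 10      /* limit named in the commit message */

/* Smallest order such that (PAGE_SIZE << order) >= size.
 * The loop stops at MAX_ORDER + 1 so the shift cannot overflow. */
static unsigned int size_to_order(size_t size)
{
    unsigned int order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    assert(size > 0);
    while (span < size && order <= SKETCH_MAX_ORDER) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Reject sizes whose order exceeds the limit before any allocation
 * is attempted. */
static int check_spare_buf_size(size_t size)
{
    if (size_to_order(size) > SKETCH_MAX_ORDER)
        return -EINVAL;
    return 0;
}
```

With these assumptions, the largest accepted size is 4096 << 10 bytes (4 MiB). Anything larger is refused up front rather than passed to the allocator.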
      
      Fixes: e445f08a ("net: hns3: add support to set/get tx copybreak buf size via ethtool for hns3 driver")
      Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix ethtool tx copybreak buf size indicating not aligned issue · 87783721
      Hao Chen authored
      When ethtool is used to set the tx copybreak buf size to a large value,
      so that the order exceeds 10 or there is not enough memory, allocating
      the tx copybreak buffer fails and the driver prints
      "the active tx spare buf is 0, not enabled tx spare buffer".
      However, querying the tx copybreak buf size with --get-tunable
      still reports the requested value rather than 0.

      So, it's necessary to change the reported value from the requested
      value to 0.

      Set kinfo.tx_spare_buf_size to 0 when setting the tx copybreak buf size fails.
      
      Fixes: e445f08a ("net: hns3: add support to set/get tx copybreak buf size via ethtool for hns3 driver")
      Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: test_vxlan_under_vrf: Fix broken test case · b50d3b46
      Ido Schimmel authored
      The purpose of the last test case is to test VXLAN encapsulation and
      decapsulation when the underlay lookup takes place in a non-default VRF.
      This is achieved by enslaving the physical device of the tunnel to a
      VRF.
      
      The binding of the VXLAN UDP socket to the VRF happens when the VXLAN
      device itself is opened, not when its physical device is opened. This
      was also mentioned in the cited commit ("tests that moving the underlay
      from a VRF to another works when down/up the VXLAN interface"), but the
      test did something else.
      
      Fix it by reopening the VXLAN device instead of its physical device.
      
      Before:
      
       # ./test_vxlan_under_vrf.sh
       Checking HV connectivity                                           [ OK ]
       Check VM connectivity through VXLAN (underlay in the default VRF)  [ OK ]
       Check VM connectivity through VXLAN (underlay in a VRF)            [FAIL]
      
      After:
      
       # ./test_vxlan_under_vrf.sh
       Checking HV connectivity                                           [ OK ]
       Check VM connectivity through VXLAN (underlay in the default VRF)  [ OK ]
       Check VM connectivity through VXLAN (underlay in a VRF)            [ OK ]
      
      Fixes: 03f1c26b ("test/net: Add script for VXLAN underlay in a VRF")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220324200514.1638326-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 25 Mar, 2022 18 commits
  3. 24 Mar, 2022 10 commits
    • Merge tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 169e7776
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "The sprinkling of SPI drivers is because we added a new one and Mark
        sent us a SPI driver interface conversion pull request.
      
        Core
        ----
      
         - Introduce XDP multi-buffer support, allowing the use of XDP with
           jumbo frame MTUs and combination with Rx coalescing offloads (LRO).
      
         - Speed up netns dismantling (5x) and lower the memory cost a little.
           Remove unnecessary per-netns sockets. Scope some lists to a netns.
           Cut down RCU syncing. Use batch methods. Allow netdev registration
           to complete out of order.
      
         - Support distinguishing timestamp types (ingress vs egress) and
           maintaining them across packet scrubbing points (e.g. redirect).
      
         - Continue the work of annotating packet drop reasons throughout the
           stack.
      
         - Switch netdev error counters from an atomic to dynamically
           allocated per-CPU counters.
      
         - Rework a few preempt_disable(), local_irq_save() and busy waiting
           sections problematic on PREEMPT_RT.
      
         - Extend the ref_tracker to allow catching use-after-free bugs.
      
        BPF
        ---
      
         - Introduce "packing allocator" for BPF JIT images. JITed code is
           marked read only, and used to be allocated at page granularity.
           Custom allocator allows for more efficient memory use, lower iTLB
           pressure and prevents identity mapping huge pages from getting
           split.
      
         - Make use of BTF type annotations (e.g. __user, __percpu) to enforce
           the correct probe read access method, add appropriate helpers.
      
         - Convert the BPF preload to use light skeleton and drop the
           user-mode-driver dependency.
      
         - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling
           its use as a packet generator.
      
         - Allow local storage memory to be allocated with GFP_KERNEL if
           called from a hook allowed to sleep.
      
         - Introduce fprobe (multi kprobe) to speed up mass attachment (arch
           bits to come later).
      
         - Add unstable conntrack lookup helpers for BPF by using the BPF
           kfunc infra.
      
         - Allow cgroup BPF progs to return custom errors to user space.
      
         - Add support for AF_UNIX iterator batching.
      
         - Allow iterator programs to use sleepable helpers.
      
         - Support JIT of add, and, or, xor and xchg atomic ops on arm64.
      
         - Add BTFGen support to bpftool which allows to use CO-RE in kernels
           without BTF info.
      
         - Large number of libbpf API improvements, cleanups and deprecations.
      
        Protocols
        ---------
      
         - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev.
      
         - Adjust TSO packet sizes based on min_rtt, allowing very low latency
           links (data centers) to always send full-sized TSO super-frames.
      
         - Make IPv6 flow label changes (AKA hash rethink) more configurable,
           via sysctl and setsockopt. Distinguish between server and client
           behavior.
      
         - VxLAN support to "collect metadata" devices to terminate only
           configured VNIs. This is similar to VLAN filtering in the bridge.
      
         - Support inserting IPv6 IOAM information to a fraction of frames.
      
         - Add protocol attribute to IP addresses to allow identifying where
           given address comes from (kernel-generated, DHCP etc.)
      
         - Support setting socket and IPv6 options via cmsg on ping6 sockets.
      
         - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS.
           Define dscp_t and stop taking ECN bits into account in fib-rules.
      
         - Add support for locked bridge ports (for 802.1X).
      
         - tun: support NAPI for packets received from batched XDP buffs,
           doubling the performance in some scenarios.
      
         - IPv6 extension header handling in Open vSwitch.
      
         - Support IPv6 control message load balancing in bonding, prevent
           neighbor solicitation and advertisement from using the wrong port.
           Support NS/NA monitor selection similar to existing ARP monitor.
      
         - SMC
            - improve performance with TCP_CORK and sendfile()
            - support auto-corking
            - support TCP_NODELAY
      
         - MCTP (Management Component Transport Protocol)
            - add user space tag control interface
            - I2C binding driver (as specified by DMTF DSP0237)
      
         - Multi-BSSID beacon handling in AP mode for WiFi.
      
         - Bluetooth:
            - handle MSFT Monitor Device Event
            - add MGMT Adv Monitor Device Found/Lost events
      
         - Multi-Path TCP:
            - add support for the SO_SNDTIMEO socket option
            - lots of selftest cleanups and improvements
      
         - Increase the max PDU size in CAN ISOTP to 64 kB.
      
        Driver API
        ----------
      
         - Add HW counters for SW netdevs, a mechanism for devices which
           offload packet forwarding to report packet statistics back to
           software interfaces such as tunnels.
      
         - Select the default NIC queue count as a fraction of number of
           physical CPU cores, instead of hard-coding to 8.
      
         - Expose devlink instance locks to drivers. Allow device layer of
           drivers to use that lock directly instead of creating their own
           which always runs into ordering issues in devlink callbacks.
      
         - Add header/data split indication to guide user space enabling of
           TCP zero-copy Rx.
      
         - Allow configuring completion queue event size.
      
         - Refactor page_pool to enable fragmenting after allocation.
      
         - Add allocation and page reuse statistics to page_pool.
      
         - Improve Multiple Spanning Trees support in the bridge to allow
           reuse of topologies across VLANs, saving HW resources in switches.
      
         - DSA (Distributed Switch Architecture):
            - replay and offload of host VLAN entries
            - offload of static and local FDB entries on LAG interfaces
            - FDB isolation and unicast filtering
      
        New hardware / drivers
        ----------------------
      
         - Ethernet:
            - LAN937x T1 PHYs
            - Davicom DM9051 SPI NIC driver
            - Realtek RTL8367S, RTL8367RB-VB switch and MDIO
            - Microchip ksz8563 switches
            - Netronome NFP3800 SmartNICs
            - Fungible SmartNICs
            - MediaTek MT8195 switches
      
         - WiFi:
            - mt76: MediaTek mt7916
            - mt76: MediaTek mt7921u USB adapters
            - brcmfmac: Broadcom BCM43454/6
      
         - Mobile:
            - iosm: Intel M.2 7360 WWAN card
      
        Drivers
        -------
      
         - Convert many drivers to the new phylink API built for split PCS
           designs but also simplifying other cases.
      
         - Intel Ethernet NICs:
            - add TTY for GNSS module for E810T device
            - improve AF_XDP performance
            - GTP-C and GTP-U filter offload
            - QinQ VLAN support
      
         - Mellanox Ethernet NICs (mlx5):
            - support xdp->data_meta
            - multi-buffer XDP
            - offload tc push_eth and pop_eth actions
      
         - Netronome Ethernet NICs (nfp):
            - flow-independent tc action hardware offload (police / meter)
            - AF_XDP
      
         - Other Ethernet NICs:
            - at803x: fiber and SFP support
            - xgmac: mdio: preamble suppression and custom MDC frequencies
            - r8169: enable ASPM L1.2 if system vendor flags it as safe
            - macb/gem: ZynqMP SGMII
            - hns3: add TX push mode
            - dpaa2-eth: software TSO
            - lan743x: multi-queue, mdio, SGMII, PTP
            - axienet: NAPI and GRO support
      
         - Mellanox Ethernet switches (mlxsw):
            - source and dest IP address rewrites
            - RJ45 ports
      
         - Marvell Ethernet switches (prestera):
            - basic routing offload
            - multi-chain TC ACL offload
      
         - NXP embedded Ethernet switches (ocelot & felix):
            - PTP over UDP with the ocelot-8021q DSA tagging protocol
            - basic QoS classification on Felix DSA switch using dcbnl
            - port mirroring for ocelot switches
      
         - Microchip high-speed industrial Ethernet (sparx5):
            - offloading of bridge port flooding flags
            - PTP Hardware Clock
      
         - Other embedded switches:
            - lan966x: PTP Hardware Clock
            - qca8k: mdio read/write operations via crafted Ethernet packets
      
         - Qualcomm 802.11ax WiFi (ath11k):
            - add LDPC FEC type and 802.11ax High Efficiency data in radiotap
            - enable RX PPDU stats in monitor co-exist mode
      
         - Intel WiFi (iwlwifi):
            - UHB TAS enablement via BIOS
            - band disablement via BIOS
            - channel switch offload
            - 32 Rx AMPDU sessions in newer devices
      
         - MediaTek WiFi (mt76):
            - background radar detection
            - thermal management improvements on mt7915
            - SAR support for more mt76 platforms
            - MBSSID and 6 GHz band on mt7915
      
         - RealTek WiFi:
            - rtw89: AP mode
            - rtw89: 160 MHz channels and 6 GHz band
            - rtw89: hardware scan
      
         - Bluetooth:
            - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS)
      
         - Microchip CAN (mcp251xfd):
            - multiple RX-FIFOs and runtime configurable RX/TX rings
            - internal PLL, runtime PM handling simplification
            - improve chip detection and error handling after wakeup"
      
      * tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits)
        llc: fix netdevice reference leaks in llc_ui_bind()
        drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool
        ice: don't allow to run ice_send_event_to_aux() in atomic ctx
        ice: fix 'scheduling while atomic' on aux critical err interrupt
        net/sched: fix incorrect vlan_push_eth dest field
        net: bridge: mst: Restrict info size queries to bridge ports
        net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init()
        drivers: net: xgene: Fix regression in CRC stripping
        net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT
        net: dsa: fix missing host-filtered multicast addresses
        net/mlx5e: Fix build warning, detected write beyond size of field
        iwlwifi: mvm: Don't fail if PPAG isn't supported
        selftests/bpf: Fix kprobe_multi test.
        Revert "rethook: x86: Add rethook x86 implementation"
        Revert "arm64: rethook: Add arm64 rethook implementation"
        Revert "powerpc: Add rethook support"
        Revert "ARM: rethook: Add rethook arm implementation"
        netdevice: add missing dm_private kdoc
        net: bridge: mst: prevent NULL deref in br_mst_info_size()
        selftests: forwarding: Use same VRF for port and VLAN upper
        ...
    • Merge tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfio · 7403e6d8
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Introduce new device migration uAPI and implement device specific
         mlx5 vfio-pci variant driver supporting new protocol (Jason
         Gunthorpe, Yishai Hadas, Leon Romanovsky)
      
       - New HiSilicon acc vfio-pci variant driver, also supporting migration
         interface (Shameer Kolothum, Longfang Liu)
      
       - D3hot fixes for vfio-pci-core (Abhishek Sahu)
      
       - Document new vfio-pci variant driver acceptance criteria
         (Alex Williamson)
      
       - Fix UML build unresolved ioport_{un}map() functions
         (Alex Williamson)
      
       - Fix MAINTAINERS due to header movement (Lukas Bulwahn)
      
      * tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfio: (31 commits)
        vfio-pci: Provide reviewers and acceptance criteria for variant drivers
        MAINTAINERS: adjust entry for header movement in hisilicon qm driver
        hisi_acc_vfio_pci: Use its own PCI reset_done error handler
        hisi_acc_vfio_pci: Add support for VFIO live migration
        crypto: hisilicon/qm: Set the VF QM state register
        hisi_acc_vfio_pci: Add helper to retrieve the struct pci_driver
        hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region
        hisi_acc_vfio_pci: add new vfio_pci driver for HiSilicon ACC devices
        hisi_acc_qm: Move VF PCI device IDs to common header
        crypto: hisilicon/qm: Move few definitions to common header
        crypto: hisilicon/qm: Move the QM header to include/linux
        vfio/mlx5: Fix to not use 0 as NULL pointer
        PCI/IOV: Fix wrong kernel-doc identifier
        vfio/mlx5: Use its own PCI reset_done error handler
        vfio/pci: Expose vfio_pci_core_aer_err_detected()
        vfio/mlx5: Implement vfio_pci driver for mlx5 devices
        vfio/mlx5: Expose migration commands over mlx5 device
        vfio: Remove migration protocol v1 documentation
        vfio: Extend the device migration protocol with RUNNING_P2P
        vfio: Define device migration protocol v2
        ...
    • Merge tag 'hyperv-next-signed-20220322' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux · 66711cfe
      Linus Torvalds authored
      Pull hyperv updates from Wei Liu:
       "Minor patches from various people"
      
      * tag 'hyperv-next-signed-20220322' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        x86/hyperv: Output host build info as normal Windows version number
        hv_balloon: rate-limit "Unhandled message" warning
        drivers: hv: log when enabling crash_kexec_post_notifiers
        hv_utils: Add comment about max VMbus packet size in VSS driver
        Drivers: hv: Compare cpumasks and not their weights in init_vp_index()
        Drivers: hv: Rename 'alloced' to 'allocated'
        Drivers: hv: vmbus: Use struct_size() helper in kmalloc()
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 1ebdbeb0
      Linus Torvalds authored
      Pull kvm updates from Paolo Bonzini:
       "ARM:
         - Proper emulation of the OSLock feature of the debug architecture
      
         - Scalability improvements for the MMU lock when dirty logging is on
      
         - New VMID allocator, which will eventually help with SVA in VMs
      
         - Better support for PMUs in heterogeneous systems
      
         - PSCI 1.1 support, enabling support for SYSTEM_RESET2
      
         - Implement CONFIG_DEBUG_LIST at EL2
      
         - Make CONFIG_ARM64_ERRATUM_2077057 default y
      
         - Reduce the overhead of VM exit when no interrupt is pending
      
         - Remove traces of 32bit ARM host support from the documentation
      
         - Updated vgic selftests
      
         - Various cleanups, doc updates and spelling fixes
      
        RISC-V:
         - Prevent KVM_COMPAT from being selected
      
         - Optimize __kvm_riscv_switch_to() implementation
      
         - RISC-V SBI v0.3 support
      
        s390:
         - memop selftest
      
         - fix SCK locking
      
         - adapter interruptions virtualization for secure guests
      
         - add Claudio Imbrenda as maintainer
      
         - first step to do proper storage key checking
      
        x86:
         - Continue switching kvm_x86_ops to static_call(); introduce
           static_call_cond() and __static_call_ret0 when applicable.
      
         - Cleanup unused arguments in several functions
      
         - Synthesize AMD 0x80000021 leaf
      
         - Fixes and optimization for Hyper-V sparse-bank hypercalls
      
         - Implement Hyper-V's enlightened MSR bitmap for nested SVM
      
         - Remove MMU auditing
      
         - Eager splitting of page tables (new aka "TDP" MMU only) when dirty
           page tracking is enabled
      
         - Cleanup the implementation of the guest PGD cache
      
         - Preparation for the implementation of Intel IPI virtualization
      
         - Fix some segment descriptor checks in the emulator
      
         - Allow AMD AVIC support on systems with physical APIC ID above 255
      
         - Better API to disable virtualization quirks
      
         - Fixes and optimizations for the zapping of page tables:
      
            - Zap roots in two passes, avoiding RCU read-side critical
              sections that last too long for very large guests backed by 4
              KiB SPTEs.
      
            - Zap invalid and defunct roots asynchronously via
              concurrency-managed work queue.
      
            - Allow yielding when zapping TDP MMU roots in response to the
              root's last reference being put.

            - Batch more TLB flushes with an RCU trick. Whoever frees the
              paging structure now holds RCU as a proxy for all vCPUs running
              in the guest, i.e. to prolong the grace period on their behalf.
              It then kicks the vCPUs out of guest mode before doing
              rcu_read_unlock().
      
        Generic:
         - Introduce __vcalloc and use it for very large allocations that need
           memcg accounting"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
        KVM: use kvcalloc for array allocations
        KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
        kvm: x86: Require const tsc for RT
        KVM: x86: synthesize CPUID leaf 0x80000021h if useful
        KVM: x86: add support for CPUID leaf 0x80000021
        KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask
        Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()"
        kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU
        KVM: arm64: fix typos in comments
        KVM: arm64: Generalise VM features into a set of flags
        KVM: s390: selftests: Add error memop tests
        KVM: s390: selftests: Add more copy memop tests
        KVM: s390: selftests: Add named stages for memop test
        KVM: s390: selftests: Add macro as abstraction for MEM_OP
        KVM: s390: selftests: Split memop tests
        KVM: s390x: fix SCK locking
        RISC-V: KVM: Implement SBI HSM suspend call
        RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function
        RISC-V: Add SBI HSM suspend related defines
        RISC-V: KVM: Implement SBI v0.3 SRST extension
        ...
    • Merge tag 'tomoyo-pr-20220322' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1 · efee6c79
      Linus Torvalds authored
      Pull tomoyo update from Tetsuo Handa:
       "Avoid unnecessarily leaking kernel command line arguments"
      
      * tag 'tomoyo-pr-20220322' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
        TOMOYO: fix __setup handlers return values
      efee6c79
    • Linus Torvalds's avatar
      Merge tag 'flexible-array-transformations-5.18-rc1' of... · 3ce62cf4
      Linus Torvalds authored
      Merge tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      
      Pull flexible-array transformations from Gustavo Silva:
       "Treewide patch that replaces zero-length arrays with flexible-array
        members.
      
        This has been baking in linux-next for a whole development cycle"
      
      * tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
        treewide: Replace zero-length arrays with flexible-array members
      3ce62cf4
    • Linus Torvalds's avatar
      Merge tag 'prlimit-tasklist_lock-for-v5.18' of... · cd4699c5
      Linus Torvalds authored
      Merge tag 'prlimit-tasklist_lock-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
      
      Pull tasklist_lock optimizations from Eric Biederman:
       "prlimit and getpriority tasklist_lock optimizations
      
        The tasklist_lock popped up as a scalability bottleneck on some
        testing workloads. The readlocks in do_prlimit and set/getpriority are
        not necessary in all cases.
      
        Based on a cycles profile, it looked like ~87% of the time was spent
        in the kernel, ~42% of which was just trying to get *some* spinlock
        (queued_spin_lock_slowpath, not necessarily the tasklist_lock).
      
        The big offenders (with rough percentages in cycles of the overall
        trace):
         - do_wait 11%
         - setpriority 8% (done previously in commit 7f8ca0ed)
         - kill 8%
         - do_exit 5%
         - clone 3%
         - prlimit64 2%   (this patchset)
         - getrlimit 1%   (this patchset)
      
        I can't easily test this patchset on the original workload for various
        reasons. Instead, I used the microbenchmark below to at least verify
        there was some improvement. This patchset had a 28% speedup (12% from
        baseline to set/getprio, then another 14% for prlimit).
      
        This series used to do the setpriority case, but an almost identical
        change was merged as commit 7f8ca0ed ("kernel/sys.c: only take
        tasklist_lock for get/setpriority(PRIO_PGRP)") so that has been
        dropped from here.
      
        One interesting thing is that my libc's getrlimit() was calling
        prlimit64, so hoisting the read_lock(tasklist_lock) into sys_prlimit64
        had no effect - it essentially optimized the older syscalls only. I
        didn't do that in this patchset, but figured I'd mention it since it
        was an option from the previous patch's discussion"
      
      microbenchmark.c:
      ----------------
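      	#include <signal.h>
      	#include <stdlib.h>
      	#include <sys/resource.h>
      	#include <sys/time.h>
      	#include <sys/types.h>
      	#include <sys/wait.h>
      	#include <unistd.h>

      	int main(int argc, char **argv)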
      	{
      		pid_t child;
      		struct rlimit rlim[1];
      
      		fork(); fork(); fork(); fork(); fork(); fork();
      
      		for (int i = 0; i < 5000; i++) {
      			child = fork();
      			if (child < 0)
      				exit(1);
      			if (child > 0) {
      				usleep(1000);
      				kill(child, SIGTERM);
      				waitpid(child, NULL, 0);
      			} else {
      				for (;;) {
      					setpriority(PRIO_PROCESS, 0,
      						    getpriority(PRIO_PROCESS, 0));
      					getrlimit(RLIMIT_CPU, rlim);
      				}
      			}
      		}
      
      		return 0;
      	}
      
      Link: https://lore.kernel.org/lkml/20211213220401.1039578-1-brho@google.com/ [v1]
      Link: https://lore.kernel.org/lkml/20220105212828.197013-1-brho@google.com/ [v2]
      Link: https://lore.kernel.org/lkml/20220106172041.522167-1-brho@google.com/ [v3]
      
      * tag 'prlimit-tasklist_lock-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        prlimit: do not grab the tasklist_lock
        prlimit: make do_prlimit() static
      cd4699c5
    • Linus Torvalds's avatar
      Merge tag 'fs.rt.v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 2e2d4650
      Linus Torvalds authored
      Pull mount attributes PREEMPT_RT update from Christian Brauner:
       "This contains Sebastian's fix to make changing mount
        attributes/getting write access compatible with CONFIG_PREEMPT_RT.
      
        The change only applies when users explicitly opt in to real-time via
        CONFIG_PREEMPT_RT; otherwise things are exactly as before. We've waited
        quite a long time with this to make sure folks could take a good look"
      
      * tag 'fs.rt.v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        fs/namespace: Boost the mount_lock.lock owner instead of spinning on PREEMPT_RT.
      2e2d4650
    • Linus Torvalds's avatar
      Merge tag 'fs.v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 15f2e3d6
      Linus Torvalds authored
      Pull mount_setattr updates from Christian Brauner:
       "This contains a few more patches to massage the mount_setattr()
        codepaths and one minor fix to reuse a helper we added some time back.
      
        The final two patches do similar cleanups in different ways. One patch
        is mine and the other is Al's who was nice enough to give me a branch
        for it.
      
        Since his came in later and my branch had been sitting in -next for
        quite some time we just put his on top instead of swapping them"
      
      * tag 'fs.v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        mount_setattr(): clean the control flow and calling conventions
        fs: clean up mount_setattr control flow
        fs: don't open-code mnt_hold_writers()
        fs: simplify check in mount_setattr_commit()
        fs: add mnt_allow_writers() and simplify mount_setattr_prepare()
      15f2e3d6
    • Linus Torvalds's avatar
      Merge tag 'arm-dt-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · ed464352
      Linus Torvalds authored
      Pull ARM devicetree updates from Arnd Bergmann:
       "After a somewhat quiet 5.17 release, the size of the DT changes is a
        bit larger again. There are nine new SoCs that get added, all of them
        related to existing platforms:
      
         - Airoha (formerly Mediatek/EcoNet) EN7523 networking SoC and EVB
      
         - Mediatek mt6582 tablet platform with the Prestigio PMT5008 3G
           tablet
      
         - Microchip Lan966 networking SoC and its evaluation board
      
         - Qualcomm Snapdragon 625/632 midrange phone SoCs, with the LG Nexus
           5X and Fairphone FP3 phones
      
         - Renesas RZ/G2LC and RZ/V2L general-purpose embedded SoCs, along
           with their evaluation boards
      
         - Samsung Exynos 850 phone SoC and reference board
      
         - Samsung Exynos7885 with the Samsung Galaxy A8 (2018) phone
      
         - Tesla FSD (Fully Self-Driving), an automotive SoC loosely derived
           from the Samsung Exynos family.
      
         - TI K3/AM62 SoC and reference board
      
        Support for additional functionality in existing dts files is added
        all over the place: Samsung, Renesas, Mstar, wpcm450, OMAP, AT91,
        Allwinner, i.MX, Tegra, Aspeed, Oxnas, Qualcomm, Mediatek, and
        Broadcom.
      
        Samsung has a rework for its pinctrl schema that is a bit tricky and
        requires driver changes to be included here.
      
        A few more platforms only have smaller cleanups and DT Schema fixes,
        this includes SoCFPGA, ux500, ixp4xx, STi, Xilinx Zynq, LG, and Juno.
      
        The new machines are really too many to list, but I'll do it anyway:
      
        Allwinner:
         - A20-Marsboard development board
      
        Amlogic:
         - Amediatek X96-AIR (Amlogic S905X3)
         - CYX A95XF3-AIR (Amlogic S905X3)
         - Haochuangy H96-Max (Amlogic S905X3)
         - Amlogic AQ222 (Amlogic S4)
         - OSMC Vero 4K+ (Amlogic S905D)
      
        Arm Juno:
         - Separate DT depending on SCMI firmware version
      
        Aspeed:
         - Quanta S6Q BMC (AST2600)
         - ASRock ROMED8HM3 (AST2500)
      
        Broadcom:
         - Raspberry Pi Zero 2 W
      
        Marvell MVEBU/Armada:
         - Ctera C200 V1 NAS (kirkwood)
         - Ctera C200 V2 NAS (armada-370)
      
        Mstar:
         - DongShanPiOne, a low-end embedded board
         - Miyoo Mini handheld game console
      
        NXP i.MX:
         - Numerous i.MX8M Mini based boards in even more variations, but
           none based on other SoCs this time:
           Protonic PRT8MM, emCON-MX8M Mini, Toradex Verdin, and
           Gateworks GW7903
      
        Qualcomm:
         - Google Herobrine R1 Chromebook platform (Snapdragon 7c Gen 3)
         - SHIFT6mq phone (Snapdragon 845)
         - Samsung Galaxy Book2 (Snapdragon 850)
         - Snapdragon 8 Gen 1 Hardware Development Kit
      
        TI OMAP:
         - SanCloud BeagleBone Enhanced WiFi
      
        Rockchip:
         - Pine64 PineNote ereader tablet (rk356x)
         - Bananapi-R2-Pro (rk356x)
      
        STM32:
         - emtrion emSBS-Argon embedded board (stm32mp157c)"
      
      * tag 'arm-dt-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (627 commits)
        arm64: dts: n5x: drop invalid property and fix edac node name
        arm64: dts: fsd: Add the MCT support
        arm64: dts: stingray: Fix spi clock name
        arm64: dts: ns2: Fix spi clock name
        ARM: dts: rockchip: Update regulator name for PX3
        ARM: dts: rockchip: Add #clock-cells value for rk805
        arm64: dts: rockchip: Add #clock-cells value for rk805
        arm64: dts: rockchip: Remove vcc13 and vcc14 for rk808
        arm64: dts: rockchip: Fix SDIO regulator supply properties on rk3399-firefly
        ARM: dts: at91: sama7g5: Add NAND support
        ARM: dts: at91: sama7g5: add eic node
        ARM: dts: at91: sama7g5: Remove unused properties in i2c nodes
        ARM: dts: at91: sam9x60ek: modify vdd_1v5 regulator to vdd_1v15
        arm64: dts: lg: align pl330 node name with dtschema
        arm64: dts: lg: add dma-cells to pl330 node
        arm64: dts: juno: align pl330 node name with dtschema
        arm64: dts: broadcom: Fix sata nodename
        arm64: dts: n5x: add sdr edac support
        arm64: dts: agilex/stratix10: add clock-names to USB DWC2 node
        dt-bindings: usb: dwc2: add disable-over-current
        ...
      ed464352