1. 22 Feb, 2023 14 commits
    • bpf: Only allocate one bpf_mem_cache for bpf_cpumask_ma · 5d5de3a4
      Hou Tao authored
      The size of bpf_cpumask is fixed, so there is no need to allocate many
      bpf_mem_caches for bpf_cpumask_ma, just one bpf_mem_cache is enough.
      Also add comments for bpf_mem_alloc_init() in bpf_mem_alloc.h to prevent
      future misuse.
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230216024821.2202916-1-houtao@huaweicloud.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Wrap register invalidation with a helper · dbd8d228
      Kumar Kartikeya Dwivedi authored
      Typically, the verifier should use env->allow_ptr_leaks when invalidating
      registers for users that don't have CAP_PERFMON or CAP_SYS_ADMIN to
      avoid leaking the pointer value. This is similar in spirit to
      c67cae55 ("bpf: Tighten ptr_to_btf_id checks."). In a lot of the
      existing checks, we know the capabilities are present, hence we don't do
      the check.
      
      Instead of being inconsistent in the application of the check, wrap the
      action of invalidating a register into a helper named 'mark_invalid_reg'
      and use it in a uniform fashion to replace open coded invalidation
      operations, so that the check is always made regardless of the call site
      and we don't have to remember whether it needs to be done or not for
      each case.
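
The pattern described above can be illustrated with a small user-space sketch. Only the helper name and the `allow_ptr_leaks` field come from the commit message; `struct env`, `struct reg`, the `REG_*` states, and the state chosen for each privilege level are hypothetical stand-ins, not the verifier's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical, simplified register states (not the kernel's enum). */
enum reg_state {
	REG_NOT_INIT,       /* unreadable: the program may not use the value */
	REG_SCALAR_UNKNOWN, /* readable, but treated as an arbitrary number */
	REG_POINTER,        /* a live pointer that is about to be invalidated */
};

/* Hypothetical stand-ins for the verifier environment and register state. */
struct env { bool allow_ptr_leaks; };
struct reg { enum reg_state state; };

/*
 * Single entry point for invalidation: the capability check lives here,
 * so individual call sites cannot forget to make it.
 */
static void mark_invalid_reg(const struct env *env, struct reg *reg)
{
	if (env->allow_ptr_leaks)
		reg->state = REG_SCALAR_UNKNOWN; /* assumed privileged outcome */
	else
		reg->state = REG_NOT_INIT;       /* assumed unprivileged outcome */
}
```

Every call site then calls `mark_invalid_reg(env, reg)` instead of open-coding the state change, which is the uniformity the commit is after.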
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-7-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Fix check_reg_type for PTR_TO_BTF_ID · da03e43a
      Kumar Kartikeya Dwivedi authored
      The current code does type matching for the case where reg->type is
      PTR_TO_BTF_ID or has the PTR_TRUSTED flag. However, this only needs to
      occur for non-MEM_ALLOC and non-MEM_PERCPU cases, but will include both
      as per the current code.
      
      The MEM_ALLOC case with or without PTR_TRUSTED needs to be handled
      specially by the code for type_is_alloc case, while MEM_PERCPU case must
      be ignored. Hence, to restore correct behavior and for clarity,
      explicitly list out the handled PTR_TO_BTF_ID types which should be
      handled for each case using a switch statement.
      
      Helpers currently only take:
      	PTR_TO_BTF_ID
      	PTR_TO_BTF_ID | PTR_TRUSTED
      	PTR_TO_BTF_ID | MEM_RCU
      	PTR_TO_BTF_ID | MEM_ALLOC
      	PTR_TO_BTF_ID | MEM_PERCPU
      	PTR_TO_BTF_ID | MEM_PERCPU | PTR_TRUSTED
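
A switch over the exact flag combinations listed above might look like the following sketch. The flag names are taken from the commit message, but the bit values, the `btf_id_action` enum, the function name, and the grouping of `MEM_RCU` with the type-matching cases are assumptions made for illustration. Any combination not in the list falls through to the default case.

```c
/* Hypothetical bit values; the kernel's real encoding differs. */
enum {
	PTR_TO_BTF_ID = 1 << 0,
	PTR_TRUSTED   = 1 << 1,
	MEM_RCU       = 1 << 2,
	MEM_ALLOC     = 1 << 3,
	MEM_PERCPU    = 1 << 4,
};

/* Hypothetical outcomes of the classification. */
enum btf_id_action {
	ACT_TYPE_MATCH,  /* generic BTF type matching */
	ACT_ALLOC_CHECK, /* left to the type_is_alloc handling */
	ACT_SKIP,        /* MEM_PERCPU: ignored at this point */
	ACT_REJECT,      /* combination not expected here */
};

/* Listing each combination explicitly keeps unlisted ones out by default. */
static enum btf_id_action classify_btf_id_reg(int type)
{
	switch (type) {
	case PTR_TO_BTF_ID:
	case PTR_TO_BTF_ID | PTR_TRUSTED:
	case PTR_TO_BTF_ID | MEM_RCU:
		return ACT_TYPE_MATCH;
	case PTR_TO_BTF_ID | MEM_ALLOC:
		return ACT_ALLOC_CHECK;
	case PTR_TO_BTF_ID | MEM_PERCPU:
	case PTR_TO_BTF_ID | MEM_PERCPU | PTR_TRUSTED:
		return ACT_SKIP;
	default:
		return ACT_REJECT;
	}
}
```

Compared with a test such as "is PTR_TO_BTF_ID or has PTR_TRUSTED", the explicit list cannot accidentally match `MEM_ALLOC` or `MEM_PERCPU` registers.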
      
      This fix was also described (for the MEM_ALLOC case) in [0].
      
        [0]: https://lore.kernel.org/bpf/20221121160657.h6z7xuvedybp5y7s@apollo

      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-6-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Remove unused MEM_ALLOC | PTR_TRUSTED checks · 521d3c0a
      Kumar Kartikeya Dwivedi authored
      The plan is to supposedly tag everything with PTR_TRUSTED eventually,
      however those changes should bring in their respective code, instead
      of leaving it around right now. It is arguable whether PTR_TRUSTED is
      required for all types, when its only use case is making PTR_TO_BTF_ID
      a bit stronger, while all other types are trusted by default.
      
      Hence, just drop the two instances which do not occur in the verifier
      for now to avoid reader confusion.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-5-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Annotate data races in bpf_local_storage · 0a09a2f9
      Kumar Kartikeya Dwivedi authored
      There are a few cases where hlist_node is checked to be unhashed without
      holding the lock protecting its modification. In this case, one must use
      hlist_unhashed_lockless to avoid load tearing and KCSAN reports. Fix
      this by using lockless variant in places not protected by the lock.
      
      Since this is not prompted by any actual KCSAN reports but only from
      code review, I have not included a fixes tag.
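
The idea behind a lockless "unhashed" check can be sketched in user space as follows. The struct mirrors the two-pointer layout of an hlist node, and the volatile read stands in for the kernel's READ_ONCE(); the function name `node_unhashed_lockless` is hypothetical, and this is an illustration rather than the kernel's implementation.

```c
#include <stddef.h>

/* Two-pointer node layout, redeclared here for a user-space illustration. */
struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;
};

/*
 * Read pprev exactly once through a volatile-qualified lvalue, so the
 * compiler may not split or repeat the load while another thread updates
 * the field concurrently. A node counts as unhashed when pprev is NULL.
 */
static int node_unhashed_lockless(const struct hlist_node *h)
{
	struct hlist_node **pprev =
		*(struct hlist_node **const volatile *)&h->pprev;

	return pprev == NULL;
}
```

A plain `!h->pprev` is sufficient while the protecting lock is held; the marked read only matters at call sites that inspect the node without that lock.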
      
      Cc: Martin KaFai Lau <martin.lau@kernel.org>
      Cc: KP Singh <kpsingh@kernel.org>
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-4-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Merge branch 'bpf: Allow reads from uninit stack' · bf9bec4c
      Alexei Starovoitov authored
      Eduard Zingerman says:
      
      ====================
      
      This patch-set modifies BPF verifier to accept programs that read from
      uninitialized stack locations, but only if executed in privileged mode.
      This provides significant verification performance gains: 30% to 70% fewer
      processed states for a large number of test programs.
      
      The performance gains come from treating STACK_MISC and STACK_INVALID as
      compatible when the cached state is compared to the current state in
      verifier.c:stacksafe().
      
      The change should not affect safety, because any value read from STACK_MISC
      location has full binary range (e.g. 0x00-0xff for byte-sized reads).
      
      Details and measurements are provided in the description for the patch #1.
      
      The change was suggested by Andrii Nakryiko; the initial patch was created
      by Alexei Starovoitov. The discussion can be found at [1].
      
      Changes v1 -> v2 (v1 available at [2]):
      - Calls to helper functions now convert STACK_INVALID to STACK_MISC
        (suggested by Andrii);
      - The test case progs/test_global_func10.c is updated to expect new
        error message. Before recent commit [3] exact content of error
        messages was not verified for this test.
      - Replaced incorrect '//'-style comments in test case asm blocks by
        '/*...*/'-style comments in order to fix compilation issues;
      - Changed the tag from "Suggested-By" to "Co-developed-by" for Alexei
        on patch #1, please let me know if this is appropriate use of the tag.
      
      [1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
      [2] https://lore.kernel.org/bpf/20230216183606.2483834-1-eddyz87@gmail.com/
      [3] 95ebb376 ("selftests/bpf: Convert test_global_funcs test to test_loader framework")
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Tests for uninitialized stack reads · 6338a94d
      Eduard Zingerman authored
      Three testcases to make sure that stack reads from uninitialized
      locations are accepted by verifier when executed in privileged mode:
      - read from a fixed offset;
      - read from a variable offset;
      - passing a pointer to stack to a helper converts
        STACK_INVALID to STACK_MISC.
      Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230219200427.606541-3-eddyz87@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Allow reads from uninit stack · 6715df8d
      Eduard Zingerman authored
      This commit updates the following functions to allow reads from
      uninitialized stack locations when env->allow_uninit_stack option is
      enabled:
      - check_stack_read_fixed_off()
      - check_stack_range_initialized(), called from:
        - check_stack_read_var_off()
        - check_helper_mem_access()
      
      This change allows the logic in stacksafe() to be relaxed so that
      STACK_MISC and STACK_INVALID are treated the same way, which makes the
      following stack slot configurations equivalent:
      
        |  Cached state    |  Current state   |
        |   stack slot     |   stack slot     |
        |------------------+------------------|
        | STACK_INVALID or | STACK_INVALID or |
        | STACK_MISC       | STACK_SPILL   or |
        |                  | STACK_MISC    or |
        |                  | STACK_ZERO    or |
        |                  | STACK_DYNPTR     |
      
      This leads to significant verification speed gains (see below).
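
The relaxed per-slot comparison in the table above could be sketched like this. The slot type names come from the commit message; the numeric values, the function name `slot_compatible`, and especially the fallback rule for all other cases are assumptions, since the real stacksafe() has additional rules that are not modelled here.

```c
#include <stdbool.h>

/* Slot type names follow the commit message; the values are arbitrary. */
enum slot_type {
	STACK_INVALID,
	STACK_SPILL,
	STACK_MISC,
	STACK_ZERO,
	STACK_DYNPTR,
};

/*
 * Decide whether a slot in the cached state may stand in for the slot in
 * the current state, so that the current path can be pruned.
 */
static bool slot_compatible(enum slot_type cached, enum slot_type cur,
			    bool allow_uninit_stack)
{
	/* Table row: cached INVALID or MISC matches any current slot type. */
	if (allow_uninit_stack &&
	    (cached == STACK_INVALID || cached == STACK_MISC))
		return true;

	/* Placeholder fallback for this sketch only: require identical types. */
	return cached == cur;
}
```

More cached states match under this rule, so more paths can be pruned, which is where the reduction in processed states would come from.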
      
      The idea was suggested by Andrii Nakryiko [1] and initial patch was
      created by Alexei Starovoitov [2].
      
      Currently, env->allow_uninit_stack is enabled for programs loaded
      by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
      
      A number of test cases from verifier/*.c were expecting uninitialized
      stack access to be an error. These test cases were updated to execute
      in unprivileged mode (thus preserving the tests).
      
      The test progs/test_global_func10.c expected "invalid indirect read
      from stack" error message because of the access to uninitialized
      memory region. This error is no longer possible in privileged mode.
      The test is updated to provoke an error "invalid indirect access to
      stack" because of access to invalid stack address (such error is not
      verified by progs/test_global_func*.c series of tests).
      
      The following tests had to be removed because these can't be made
      unprivileged:
      - verifier/sock.c:
        - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
        stack_value"
        BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
      - verifier/var_off.c:
        - "indirect variable-offset stack access, max_off+size > max_initialized"
        - "indirect variable-offset stack access, uninitialized"
        These tests verify that access to uninitialized stack values is
        detected when stack offset is not a constant. However, variable
        stack access is prohibited in unprivileged mode, thus these tests
        are no longer valid.
      
       * * *
      
      Here is veristat log comparing this patch with current master on a
      set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
      and cilium BPF binaries (see [3]):
      
      $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
      File                        Program                     States (A)  States (B)  States    (DIFF)
      --------------------------  --------------------------  ----------  ----------  ----------------
      bpf_host.o                  tail_handle_ipv6_from_host         349         244    -105 (-30.09%)
      bpf_host.o                  tail_handle_nat_fwd_ipv4          1320         895    -425 (-32.20%)
      bpf_lxc.o                   tail_handle_nat_fwd_ipv4          1320         895    -425 (-32.20%)
      bpf_sock.o                  cil_sock4_connect                   70          48     -22 (-31.43%)
      bpf_sock.o                  cil_sock4_sendmsg                   68          46     -22 (-32.35%)
      bpf_xdp.o                   tail_handle_nat_fwd_ipv4          1554         803    -751 (-48.33%)
      bpf_xdp.o                   tail_lb_ipv4                      6457        2473   -3984 (-61.70%)
      bpf_xdp.o                   tail_lb_ipv6                      7249        3908   -3341 (-46.09%)
      pyperf600_bpf_loop.bpf.o    on_event                           287         145    -142 (-49.48%)
      strobemeta.bpf.o            on_event                         15915        4772  -11143 (-70.02%)
      strobemeta_nounroll2.bpf.o  on_event                         17087        3820  -13267 (-77.64%)
      xdp_synproxy_kern.bpf.o     syncookie_tc                     21271        6635  -14636 (-68.81%)
      xdp_synproxy_kern.bpf.o     syncookie_xdp                    23122        6024  -17098 (-73.95%)
      --------------------------  --------------------------  ----------  ----------  ----------------
      
      Note: I limited selection by states_pct<-30%.
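
The DIFF percentages in the table above appear to be the relative change from States (A) to States (B). A minimal sketch of that arithmetic follows; the name `states_pct` is borrowed from the filter expression, and the formula is an assumption that happens to reproduce the listed values, not veristat's actual code.

```c
/* Relative change from a to b, in percent (negative means fewer states). */
static double states_pct(long a, long b)
{
	return 100.0 * (double)(b - a) / (double)a;
}
```

For the first row, `states_pct(349, 244)` gives roughly -30.09, and for strobemeta.bpf.o, `states_pct(15915, 4772)` gives roughly -70.02, matching the table.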
      
      Inspection of differences in pyperf600_bpf_loop behavior shows that
      the following patch for the test removes almost all differences:
      
          --- a/tools/testing/selftests/bpf/progs/pyperf.h
          +++ b/tools/testing/selftests/bpf/progs/pyperf.h
          @@ -266,8 +266,8 @@ int __on_event(struct bpf_raw_tracepoint_args *ctx)
                  }
      
                  if (event->pthread_match || !pidData->use_tls) {
          -               void* frame_ptr;
          -               FrameData frame;
          +               void* frame_ptr = 0;
          +               FrameData frame = {};
                          Symbol sym = {};
                          int cur_cpu = bpf_get_smp_processor_id();
      
      W/o this patch the difference comes from the following pattern
      (for different variables):
      
          static bool get_frame_data(... FrameData *frame ...)
          {
              ...
              bpf_probe_read_user(&frame->f_code, ...);
              if (!frame->f_code)
                  return false;
              ...
              bpf_probe_read_user(&frame->co_name, ...);
              if (frame->co_name)
                  ...;
          }
      
          int __on_event(struct bpf_raw_tracepoint_args *ctx)
          {
              FrameData frame;
              ...
              get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
              ...
          }
      
          SEC("raw_tracepoint/kfree_skb")
          int on_event(struct bpf_raw_tracepoint_args* ctx)
          {
              ...
              ret |= __on_event(ctx);
              ret |= __on_event(ctx);
              ...
          }
      
      With regard to the value `frame->co_name`, the following is important:
      - Because of the conditional `if (!frame->f_code)` each call to
        __on_event() produces two states, one with `frame->co_name` marked
        as STACK_MISC, another with it as is (and marked STACK_INVALID on a
        first call).
      - The call to bpf_probe_read_user() does not mark stack slots
        corresponding to `&frame->co_name` as REG_LIVE_WRITTEN, but it does
        mark these slots as STACK_MISC. This happens because of the following
        loop in check_helper_call():
      
      	for (i = 0; i < meta.access_size; i++) {
      		err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
      				       BPF_WRITE, -1, false);
      		if (err)
      			return err;
      	}
      
        Note the size of the write, it is a one byte write for each byte
        touched by a helper. The BPF_B write does not lead to write marks
        for the target stack slot.
      - This means that, without this patch, when the second __on_event() call
        is verified, `if (frame->co_name)` propagates read marks first to a
        stack slot marked STACK_MISC and then to a stack slot marked
        STACK_INVALID, and these states are considered different.
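
The interaction described in the list above, where byte-sized helper writes change the slot type without producing a write mark, can be modelled with a small sketch. The struct, the field names, and both functions are hypothetical; the rule that only a full-slot write sets the mark is assumed from the description above, not copied from the verifier.

```c
#include <stdbool.h>

#define SLOT_SIZE 8 /* bytes per stack slot in this model */

enum slot_type { STACK_INVALID, STACK_MISC };

/* Hypothetical model of one stack slot. */
struct stack_slot {
	enum slot_type type[SLOT_SIZE]; /* per-byte slot type */
	bool live_written;              /* stands in for REG_LIVE_WRITTEN */
};

/* Model a stack write of `size` bytes starting at byte `off` of the slot. */
static void stack_write(struct stack_slot *slot, int off, int size)
{
	for (int i = off; i < off + size && i < SLOT_SIZE; i++)
		slot->type[i] = STACK_MISC;

	/* Assumed rule: only a write covering the whole slot earns a mark. */
	if (off == 0 && size == SLOT_SIZE)
		slot->live_written = true;
}

/* Model the helper-call loop: one 1-byte write per byte touched. */
static void helper_write(struct stack_slot *slot, int access_size)
{
	for (int i = 0; i < access_size; i++)
		stack_write(slot, i, 1);
}
```

After `helper_write(&slot, 8)` every byte is STACK_MISC but `live_written` is still false, whereas a single `stack_write(&slot, 0, 8)` would set it. That gap is why read marks keep propagating to the parent states in the scenario above.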
      
      [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
      [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
      [3] git@github.com:anakryiko/cilium.git
      Suggested-by: Andrii Nakryiko <andrii@kernel.org>
      Co-developed-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230219200427.606541-2-eddyz87@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 5b7c4cab
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "Core:
      
         - Add dedicated kmem_cache for typical/small skb->head, avoid having
           to access struct page at kfree time, and improve memory use.
      
         - Introduce sysctl to set default RPS configuration for new netdevs.
      
         - Define Netlink protocol specification format which can be used to
           describe messages used by each family and auto-generate parsers.
           Add tools for generating kernel data structures and uAPI headers.
      
         - Expose all net/core sysctls inside netns.
      
         - Remove 4s sleep in netpoll if carrier is instantly detected on
           boot.
      
         - Add configurable limit of MDB entries per port, and port-vlan.
      
         - Continue populating drop reasons throughout the stack.
      
         - Retire a handful of legacy Qdiscs and classifiers.
      
        Protocols:
      
         - Support IPv4 big TCP (TSO frames larger than 64kB).
      
         - Add IP_LOCAL_PORT_RANGE socket option, to control local port range
           on socket by socket basis.
      
         - Track and report in procfs number of MPTCP sockets used.
      
         - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
           manager.
      
         - IPv6: don't check net.ipv6.route.max_size and rely on garbage
           collection to free memory (similarly to IPv4).
      
         - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
      
         - ICMP: add per-rate limit counters.
      
         - Add support for user scanning requests in ieee802154.
      
         - Remove static WEP support.
      
         - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
           reporting.
      
         - WiFi 7 EHT channel puncturing support (client & AP).
      
        BPF:
      
         - Add a rbtree data structure following the "next-gen data structure"
           precedent set by recently added linked list, that is, by using
           kfunc + kptr instead of adding a new BPF map type.
      
         - Expose XDP hints via kfuncs with initial support for RX hash and
           timestamp metadata.
      
         - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
           better support decap on GRE tunnel devices not operating in collect
           metadata.
      
         - Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
      
         - Remove the need for trace_printk_lock for bpf_trace_printk and
           bpf_trace_vprintk helpers.
      
         - Extend libbpf's bpf_tracing.h support for tracing arguments of
           kprobes/uprobes and syscall as a special case.
      
         - Significantly reduce the search time for module symbols by
           livepatch and BPF.
      
         - Enable cpumasks to be used as kptrs, which is useful for tracing
           programs tracking which tasks end up running on which CPUs in
           different time intervals.
      
         - Add support for BPF trampoline on s390x and riscv64.
      
         - Add capability to export the XDP features supported by the NIC.
      
         - Add __bpf_kfunc tag for marking kernel functions as kfuncs.
      
         - Add cgroup.memory=nobpf kernel parameter option to disable BPF
           memory accounting for container environments.
      
        Netfilter:
      
         - Remove the CLUSTERIP target. It has been marked as obsolete for
           years, and we still have WARN splats wrt races of the out-of-band
           /proc interface installed by this target.
      
         - Add 'destroy' commands to nf_tables. They are identical to the
           existing 'delete' commands, but do not return an error if the
           referenced object (set, chain, rule...) did not exist.
      
        Driver API:
      
         - Improve cpumask_local_spread() locality to help NICs set the right
           IRQ affinity on AMD platforms.
      
         - Separate C22 and C45 MDIO bus transactions more clearly.
      
         - Introduce new DCB table to control DSCP rewrite on egress.
      
         - Support configuration of Physical Layer Collision Avoidance (PLCA)
           Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
           shared medium Ethernet.
      
         - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
           preemption of low priority frames by high priority frames.
      
         - Add support for controlling MACSec offload using netlink SET.
      
         - Rework devlink instance refcounts to allow registration and
           de-registration under the instance lock. Split the code into
           multiple files, drop some of the unnecessarily granular locks and
           factor out common parts of netlink operation handling.
      
         - Add TX frame aggregation parameters (for USB drivers).
      
         - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
           messages with notifications for debug.
      
         - Allow offloading of UDP NEW connections via act_ct.
      
         - Add support for per action HW stats in TC.
      
         - Support hardware miss to TC action (continue processing in SW from
           a specific point in the action chain).
      
         - Warn if old Wireless Extension user space interface is used with
           modern cfg80211/mac80211 drivers. Do not support Wireless
           Extensions for Wi-Fi 7 devices at all. Everyone should switch to
           using nl80211 interface instead.
      
         - Improve the CAN bit timing configuration. Use extack to return
           error messages directly to user space, update the SJW handling,
           including the definition of a new default value that will benefit
           CAN-FD controllers, by increasing their oscillator tolerance.
      
        New hardware / drivers:
      
         - Ethernet:
            - nVidia BlueField-3 support (control traffic driver)
            - Ethernet support for imx93 SoCs
            - Motorcomm yt8531 gigabit Ethernet PHY
            - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
            - Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
            - Amlogic gxl MDIO mux
      
         - WiFi:
            - RealTek RTL8188EU (rtl8xxxu)
            - Qualcomm Wi-Fi 7 devices (ath12k)
      
         - CAN:
            - Renesas R-Car V4H
      
        Drivers:
      
         - Bluetooth:
            - Set Per Platform Antenna Gain (PPAG) for Intel controllers.
      
         - Ethernet NICs:
            - Intel (1G, igc):
               - support TSN / Qbv / packet scheduling features of i226 model
            - Intel (100G, ice):
               - use GNSS subsystem instead of TTY
               - multi-buffer XDP support
               - extend support for GPIO pins to E823 devices
            - nVidia/Mellanox:
               - update the shared buffer configuration on PFC commands
               - implement PTP adjphase function for HW offset control
               - TC support for Geneve and GRE with VF tunnel offload
               - more efficient crypto key management method
               - multi-port eswitch support
            - Netronome/Corigine:
               - add DCB IEEE support
               - support IPsec offloading for NFP3800
            - Freescale/NXP (enetc):
               - support XDP_REDIRECT for XDP non-linear buffers
               - improve reconfig, avoid link flap and waiting for idle
               - support MAC Merge layer
            - Other NICs:
               - sfc/ef100: add basic devlink support for ef100
               - ionic: rx_push mode operation (writing descriptors via MMIO)
               - bnxt: use the auxiliary bus abstraction for RDMA
               - r8169: disable ASPM and reset bus in case of tx timeout
               - cpsw: support QSGMII mode for J721e CPSW9G
               - cpts: support pulse-per-second output
               - ngbe: add an mdio bus driver
               - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
               - r8152: handle devices with FW with NCM support
               - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
               - virtio-net: support multi buffer XDP
               - virtio/vsock: replace virtio_vsock_pkt with sk_buff
               - tsnep: XDP support
      
         - Ethernet high-speed switches:
            - nVidia/Mellanox (mlxsw):
               - add support for latency TLV (in FW control messages)
            - Microchip (sparx5):
               - separate explicit and implicit traffic forwarding rules, make
                 the implicit rules always active
               - add support for egress DSCP rewrite
               - IS0 VCAP support (Ingress Classification)
               - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
                 etc.)
               - ES2 VCAP support (Egress Access Control)
               - support for Per-Stream Filtering and Policing (802.1Q,
                 8.6.5.1)
      
         - Ethernet embedded switches:
            - Marvell (mv88e6xxx):
               - add MAB (port auth) offload support
               - enable PTP receive for mv88e6390
            - NXP (ocelot):
               - support MAC Merge layer
         - support for the vsc7512 internal copper phys
            - Microchip:
               - lan9303: convert to PHYLINK
               - lan966x: support TC flower filter statistics
               - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
               - lan937x: support Credit Based Shaper configuration
               - ksz9477: support Energy Efficient Ethernet
            - other:
               - qca8k: convert to regmap read/write API, use bulk operations
               - rswitch: Improve TX timestamp accuracy
      
         - Intel WiFi (iwlwifi):
            - EHT (Wi-Fi 7) rate reporting
            - STEP equalizer support: transfer some STEP (connection to radio
              on platforms with integrated wifi) related parameters from the
              BIOS to the firmware.
      
         - Qualcomm 802.11ax WiFi (ath11k):
            - IPQ5018 support
            - Fine Timing Measurement (FTM) responder role support
            - channel 177 support
      
         - MediaTek WiFi (mt76):
            - per-PHY LED support
            - mt7996: EHT (Wi-Fi 7) support
            - Wireless Ethernet Dispatch (WED) reset support
            - switch to using page pool allocator
      
         - RealTek WiFi (rtw89):
      - support new version of Bluetooth coexistence
      
         - Mobile:
            - rmnet: support TX aggregation"
      
      * tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
        page_pool: add a comment explaining the fragment counter usage
        net: ethtool: fix __ethtool_dev_mm_supported() implementation
        ethtool: pse-pd: Fix double word in comments
        xsk: add linux/vmalloc.h to xsk.c
        sefltests: netdevsim: wait for devlink instance after netns removal
        selftest: fib_tests: Always cleanup before exit
        net/mlx5e: Align IPsec ASO result memory to be as required by hardware
        net/mlx5e: TC, Set CT miss to the specific ct action instance
        net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
        net/mlx5: Refactor tc miss handling to a single function
        net/mlx5: Kconfig: Make tc offload depend on tc skb extension
        net/sched: flower: Support hardware miss to tc action
        net/sched: flower: Move filter handle initialization earlier
        net/sched: cls_api: Support hardware miss to tc action
        net/sched: Rename user cookie and act cookie
        sfc: fix builds without CONFIG_RTC_LIB
        sfc: clean up some inconsistent indentings
        net/mlx4_en: Introduce flexible array to silence overflow warning
        net: lan966x: Fix possible deadlock inside PTP
        net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
        ...
    • Merge tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 36289a03
      Linus Torvalds authored
      Pull crypto update from Herbert Xu:
       "API:
         - Use kmap_local instead of kmap_atomic
         - Change request callback to take void pointer
         - Print FIPS status in /proc/crypto (when enabled)
      
        Algorithms:
         - Add rfc4106/gcm support on arm64
         - Add ARIA AVX2/512 support on x86
      
        Drivers:
         - Add TRNG driver for StarFive SoC
         - Delete ux500/hash driver (subsumed by stm32/hash)
         - Add zlib support in qat
         - Add RSA support in aspeed"
      
      * tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (156 commits)
        crypto: x86/aria-avx - Do not use avx2 instructions
        crypto: aspeed - Fix modular aspeed-acry
        crypto: hisilicon/qm - fix coding style issues
        crypto: hisilicon/qm - update comments to match function
        crypto: hisilicon/qm - change function names
        crypto: hisilicon/qm - use min() instead of min_t()
        crypto: hisilicon/qm - remove some unused defines
        crypto: proc - Print fips status
        crypto: crypto4xx - Call dma_unmap_page when done
        crypto: octeontx2 - Fix objects shared between several modules
        crypto: nx - Fix sparse warnings
        crypto: ecc - Silence sparse warning
        tls: Pass rec instead of aead_req into tls_encrypt_done
        crypto: api - Remove completion function scaffolding
        tls: Remove completion function scaffolding
        tipc: Remove completion function scaffolding
        net: ipv6: Remove completion function scaffolding
        net: ipv4: Remove completion function scaffolding
        net: macsec: Remove completion function scaffolding
        dm: Remove completion function scaffolding
        ...
    • Merge tag 'platform-drivers-x86-v6.3-1' of... · 69308402
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver updates from Hans de Goede:
      
       - AMD PMC: Improvements to aid s2idle debugging
      
       - Dell WMI-DDV: hwmon support
      
       - INT3472 camera sensor power-management: Improve privacy LED support
      
       - Intel VSEC: Base TPMI (Topology Aware Register and PM Capsule
         Interface) support
      
       - Mellanox: SN5600 and Nvidia L1 switch support
      
       - Microsoft Surface Support: Various cleanups + code improvements
      
       - tools/intel-speed-select: Various improvements
      
       - Miscellaneous other cleanups / fixes
      
      * tag 'platform-drivers-x86-v6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (80 commits)
        platform/x86: nvidia-wmi-ec-backlight: Add force module parameter
        platform/x86/amd/pmf: Add depends on CONFIG_POWER_SUPPLY
        platform/x86: dell-ddv: Prefer asynchronous probing
        platform/x86: dell-ddv: Add hwmon support
        Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
        platform: mellanox: mlx-platform: Move bus shift assignment out of the loop
        platform: mellanox: mlx-platform: Add mux selection register to regmap
        platform_data/mlxreg: Add field with mapped resource address
        platform/mellanox: mlxreg-hotplug: Allow more flexible hotplug events configuration
        platform: mellanox: Extend all systems with I2C notification callback
        platform: mellanox: Split logic in init and exit flow
        platform: mellanox: Split initialization procedure
        platform: mellanox: Introduce support of new Nvidia L1 switch
        platform: mellanox: Introduce support for next-generation 800GB/s switch
        platform: mellanox: Cosmetic changes - rename to more common name
        platform: mellanox: Change "reset_pwr_converter_fail" attribute
        platform: mellanox: Introduce support for rack manager switch
        MAINTAINERS: dell-wmi-sysman: drop Divya Bharathi
        x86/platform/uv: Make kobj_type structure constant
        platform/x86: think-lmi: Make kobj_type structure constant
        ...
      69308402
    • Linus Torvalds's avatar
      Merge tag 'tag-chrome-platform-for-v6.3' of... · 5f5ce6bc
      Linus Torvalds authored
      Merge tag 'tag-chrome-platform-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
      
      Pull chrome platform updates from Tzung-Bi Shih:
       "New drivers:
         - cros_ec_uart for ChromeOS EC protocol over UART
         - cros_typec_vdm for USB PD Vendor Defined Message
      
        Improvements:
         - Preserve logs as much as possible when EC panics
         - Shutdown to refrain from potential HW damages when EC panics
      
        Fixes:
         - Fix DP_PORT_VDO to include DP_CAP_RECEPTACLE
         - Fix a lockdep false positive
      
        Cleanups:
         - Use sysfs_emit*() instead of scnprintf()
         - Use asm instead of asm-generic for unaligned.h
      
        Misc:
         - Rename module name from cros_ec_typec to cros-ec-typec
         - Minor fixes"
      
      * tag 'tag-chrome-platform-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: (34 commits)
        platform/chrome: cros_ec_typec: Fix spelling mistake
        platform/chrome: cros_typec_vdm: Add Attention support
        platform/chrome: cros_ec: Add VDM attention headers
        platform/chrome: cros_typec_vdm: Fix VDO copy
        platform/chrome: cros_ec_typec: allow deferred probe of switch handles
        platform/chrome: cros_ec_proto: remove big stub objects from stack
        platform/chrome: cros_ec_uart: fix negative type promoted to high
        platform/chrome: cros_ec: Use per-device lockdep key
        platform/chrome: fix kernel-doc warnings for cros_ec_command
        platform/chrome: fix kernel-doc warning for last_resume_result
        platform/chrome: fix kernel-doc warning for suspend_timeout_ms
        platform/chrome: fix kernel-doc warnings for panic notifier
        platform/chrome: cros_ec_lpc: initialize the buf variable
        platform/chrome: cros_ec: Fix panic notifier registration
        platform/chrome: cros_typec_switch: Check for retimer flag
        platform/chrome: cros_typec_switch: Use fwnode* prop check
        platform/chrome: cros_typec_vdm: Add VDM send support
        platform/chrome: cros_typec_vdm: Add VDM reply support
        platform/chrome: cros_ec_typec: Add initial VDM support
        platform/chrome: cros_ec_typec: Alter module name with hyphens
        ...
      5f5ce6bc
    • Linus Torvalds's avatar
      Merge tag 'for-linus-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 239451e9
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
      
       - help deprecate the /proc/xen files by making the related information
         available via sysfs
      
       - mark the Xen variants of play_dead "noreturn"
      
       - support a shared Xen platform interrupt
      
       - several small cleanups and fixes
      
      * tag 'for-linus-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: sysfs: make kobj_type structure constant
        x86/Xen: drop leftover VM-assist uses
        xen: Replace one-element array with flexible-array member
        xen/grant-dma-iommu: Implement a dummy probe_device() callback
        xen/pvcalls-back: fix permanently masked event channel
        xen: Allow platform PCI interrupt to be shared
        x86/xen/time: prefer tsc as clocksource when it is invariant
        x86/xen: mark xen_pv_play_dead() as __noreturn
        x86/xen: don't let xen_pv_play_dead() return
        drivers/xen/hypervisor: Expose Xen SIF flags to userspace
      239451e9
    • Linus Torvalds's avatar
      Merge tag 'hyperv-next-signed-20230220' of... · b8878e5a
      Linus Torvalds authored
      Merge tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
      
      Pull hyperv updates from Wei Liu:
      
       - allow Linux to run as the nested root partition for Microsoft
         Hypervisor (Jinank Jain and Nuno Das Neves)
      
       - clean up the return type of callback functions (Dawei Li)
      
      * tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        x86/hyperv: Fix hv_get/set_register for nested bringup
        Drivers: hv: Make remove callback of hyperv driver void returned
        Drivers: hv: Enable vmbus driver for nested root partition
        x86/hyperv: Add an interface to do nested hypercalls
        Drivers: hv: Setup synic registers in case of nested root partition
        x86/hyperv: Add support for detecting nested hypervisor
      b8878e5a
  2. 21 Feb, 2023 26 commits
    • Linus Torvalds's avatar
      Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 8bf1a529
      Linus Torvalds authored
      Pull arm64 updates from Catalin Marinas:
      
       - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit
         architectural register (ZT0, for the look-up table feature) that
         Linux needs to save/restore
      
       - Include TPIDR2 in the signal context and add the corresponding
         kselftests
      
       - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI
         support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG
         (ARM CMN) at probe time
      
       - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64
      
       - Permit EFI boot with MMU and caches on. Instead of cleaning the
         entire loaded kernel image to the PoC and disabling the MMU and
         caches before branching to the kernel bare metal entry point, leave
         the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of
         all of system RAM to populate the initial page tables
      
       - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64
         kernel (the arm32 kernel only defines the values)
      
       - Harden the arm64 shadow call stack pointer handling: stash the shadow
         stack pointer in the task struct on interrupt, load it directly from
         this structure
      
       - Signal handling cleanups to remove redundant validation of size
         information and avoid reading the same data from userspace twice
      
       - Refactor the hwcap macros to make use of the automatically generated
         ID registers. It should make new hwcaps writing less error prone
      
       - Further arm64 sysreg conversion and some fixes
      
       - arm64 kselftest fixes and improvements
      
       - Pointer authentication cleanups: don't sign leaf functions, unify
         asm-arch manipulation
      
       - Pseudo-NMI code generation optimisations
      
       - Minor fixes for SME and TPIDR2 handling
      
       - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable,
          replace strtobool() with kstrtobool() in the cpufeature.c code, apply
         dynamic shadow call stack in two passes, intercept pfn changes in
         set_pte_at() without the required break-before-make sequence, attempt
         to dump all instructions on unhandled kernel faults
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits)
        arm64: fix .idmap.text assertion for large kernels
        kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests
        kselftest/arm64: Copy whole EXTRA context
        arm64: kprobes: Drop ID map text from kprobes blacklist
        perf: arm_spe: Print the version of SPE detected
        perf: arm_spe: Add support for SPEv1.2 inverted event filtering
        perf: Add perf_event_attr::config3
        arm64/sme: Fix __finalise_el2 SMEver check
        drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable
        arm64/signal: Only read new data when parsing the ZT context
        arm64/signal: Only read new data when parsing the ZA context
        arm64/signal: Only read new data when parsing the SVE context
        arm64/signal: Avoid rereading context frame sizes
        arm64/signal: Make interface for restore_fpsimd_context() consistent
        arm64/signal: Remove redundant size validation from parse_user_sigframe()
        arm64/signal: Don't redundantly verify FPSIMD magic
        arm64/cpufeature: Use helper macros to specify hwcaps
        arm64/cpufeature: Always use symbolic name for feature value in hwcaps
        arm64/sysreg: Initial unsigned annotations for ID registers
        arm64/sysreg: Initial annotation of signed ID registers
        ...
      8bf1a529
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · b327dfe0
      Linus Torvalds authored
      Pull ARM updates from Russell King:
      
       - Improve Kconfig help text for Cortex A8 and Cortex A9 errata
      
       - Kconfig spelling and grammar fixes
      
       - Allow kernel-mode VFP/Neon in softirq context
      
       - Use Neon in softirq context
      
       - Implement AES-CTR/GHASH version of GCM
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 9289/1: Allow pre-ARMv5 builds with ld.lld 16.0.0 and newer
        ARM: 9288/1: Kconfigs: fix spelling & grammar
        ARM: 9286/1: crypto: Implement fused AES-CTR/GHASH version of GCM
        ARM: 9285/1: remove meaningless arch/arm/mach-rda/Makefile
        ARM: 9283/1: permit non-nested kernel mode NEON in softirq context
        ARM: 9282/1: vfp: Manipulate task VFP state with softirqs disabled
        ARM: 9281/1: improve Cortex A8/A9 errata help text
      b327dfe0
    • Linus Torvalds's avatar
      Merge tag 'm68k-for-v6.3-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · eb6d5bbe
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
       - Add seccomp support
      
       - defconfig updates
      
       - Miscellaneous fixes and improvements
      
      * tag 'm68k-for-v6.3-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: /proc/hardware should depend on PROC_FS
        selftests/seccomp: Add m68k support
        m68k: Add kernel seccomp support
        m68k: Check syscall_trace_enter() return code
        m68k: defconfig: Update defconfigs for v6.2-rc3
        m68k: q40: Do not initialise statics to 0
      eb6d5bbe
    • Linus Torvalds's avatar
      Merge tag 's390-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · bcf5470e
      Linus Torvalds authored
      Pull s390 updates from Heiko Carstens:
      
       - Large cleanup of the con3270/tty3270 driver. Among others this fixes:
           - Background Color Support
           - ASCII Line Character Support
           - VT100 Support
           - Geometries other than 80x24
      
       - Cleanup and improve cmpxchg() code. Also add cmpxchg_user_key() to
         uaccess functions, which will be used by KVM to access KVM guest
         memory with a specific storage key
      
       - Add support for user space events counting to CPUMF
      
       - Clean up the vfio/ccw code, which now also allows 2K Format-2 IDALs
         to be properly supported
      
       - Move kernel page table allocation and initialization to decompressor,
         which finally allows to enter the kernel with dynamic address
         translation enabled. This in turn allows to get rid of code with
         special handling in the kernel, which has to distinguish if DAT is on
         or off
      
       - Replace kretprobe with rethook
      
       - Various improvements to vfio/ap queue resets:
           - Use TAPQ to verify completion of a reset in progress rather than
             multiple invocations of ZAPQ.
           - Check TAPQ response codes when verifying successful completion of
             ZAPQ.
           - Fix erroneous handling of some error response codes.
           - Increase the maximum amount of time to wait for successful
             completion of ZAPQ
      
       - Rework system call wrappers to get rid of alias functions, which were
         only left on s390
      
       - Cleanup diag288_wdt watchdog driver. It has been agreed on with
         Guenter Roeck that this goes upstream via the s390 tree
      
       - Add missing loadparm parameter handling for list-directed ECKD
         ipl/reipl
      
       - Various improvements to memory detection code
      
       - Remove arch_cpu_idle_time() since the current implementation is
         broken, and allows user space observable accounted idle times which
         can temporarily decrease
      
       - Add Reset DAT-Protection support: (only) allow to change PTEs from RO
         to RW with a new RDP instruction. Unlike the currently used IPTE
         instruction, this does not necessarily guarantee that TLBs of all
         CPUs are synchronously flushed; and that remote CPUs can see spurious
         protection faults. The overall improvement for not requiring an all
         CPU synchronization, like it is required with IPTE, should be
         beneficial
      
       - Fix KFENCE page fault reporting
      
       - Smaller cleanups and improvement all over the place
      
      * tag 's390-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (182 commits)
        s390/irq,idle: simplify idle check
        s390/processor: add test_and_set_cpu_flag() and test_and_clear_cpu_flag()
        s390/processor: let cpu helper functions return boolean values
        s390/kfence: fix page fault reporting
        s390/zcrypt: introduce ctfm field in struct CPRBX
        s390: remove confusing comment from uapi types header file
        vfio/ccw: remove WARN_ON during shutdown
        s390/entry: remove toolchain dependent micro-optimization
        s390/mem_detect: do not truncate online memory ranges info
        s390/vx: remove __uint128_t type from __vector128 struct again
        s390/mm: add support for RDP (Reset DAT-Protection)
        s390/mm: define private VM_FAULT_* reasons from top bits
        Documentation: s390: correct spelling
        s390/ap: fix status returned by ap_qact()
        s390/ap: fix status returned by ap_aqic()
        s390: vfio-ap: tighten the NIB validity check
        Revert "s390/mem_detect: do not update output parameters on failure"
        s390/idle: remove arch_cpu_idle_time() and corresponding code
        s390/vx: use simple assignments to access __vector128 members
        s390/vx: add 64 and 128 bit members to __vector128 struct
        ...
      bcf5470e
    • Linus Torvalds's avatar
      Merge tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 87793476
      Linus Torvalds authored
      Pull x86 cpuid updates from Borislav Petkov:
      
       - Cache the AMD debug registers in per-CPU variables to avoid MSR
         writes where possible, when supporting a debug registers swap feature
         for SEV-ES guests
      
       - Add support for AMD's version of eIBRS called Automatic IBRS which is
         a set-and-forget control of indirect branch restriction speculation
         resources on privilege change
      
       - Add support for a new x86 instruction - LKGS - Load kernel GS which
         is part of the FRED infrastructure
      
       - Reset SPEC_CTRL upon init to accommodate use cases like kexec which
         rediscover
      
       - Other smaller fixes and cleanups
      
      * tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/amd: Cache debug register values in percpu variables
        KVM: x86: Propagate the AMD Automatic IBRS feature to the guest
        x86/cpu: Support AMD Automatic IBRS
        x86/cpu, kvm: Add the SMM_CTL MSR not present feature
        x86/cpu, kvm: Add the Null Selector Clears Base feature
        x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leaf
        x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature
        KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation code
        x86/cpu, kvm: Add support for CPUID_80000021_EAX
        x86/gsseg: Add the new <asm/gsseg.h> header to <asm/asm-prototypes.h>
        x86/gsseg: Use the LKGS instruction if available for load_gs_index()
        x86/gsseg: Move load_gs_index() to its own new header file
        x86/gsseg: Make asm_load_gs_index() take an u16
        x86/opcode: Add the LKGS instruction to x86-opcode-map
        x86/cpufeature: Add the CPU feature bit for LKGS
        x86/bugs: Reset speculation control settings on init
        x86/cpu: Remove redundant extern x86_read_arch_cap_msr()
      87793476
    • Dave Hansen's avatar
      uaccess: Add speculation barrier to copy_from_user() · 74e19ef0
      Dave Hansen authored
      The results of "access_ok()" can be mis-speculated.  The result is that
      you can end up speculatively executing:
      
      	if (access_ok(from, size))
      		// Right here
      
      even for bad from/size combinations.  On first glance, it would be ideal
      to just add a speculation barrier to "access_ok()" so that its results
      can never be mis-speculated.
      
      But there are lots of system calls just doing access_ok() via
      "copy_to_user()" and friends (example: fstat() and friends).  Those are
      generally not problematic because they do not _consume_ data from
      userspace other than the pointer.  They are also very quick and common
      system calls that should not be needlessly slowed down.
      
      "copy_from_user()" on the other hand uses a user-controlled pointer and
      is frequently followed up with code that might affect caches.  Take
      something like this:
      
      	if (!copy_from_user(&kernelvar, uptr, size))
      		do_something_with(kernelvar);
      
      If userspace passes in an evil 'uptr' that *actually* points to a kernel
      address, and then do_something_with() has cache (or other)
      side-effects, it could allow userspace to infer kernel data values.
      
      Add a barrier to the common copy_from_user() code to prevent
      mis-speculated values which happen after the copy.
      
      Also add a stub for architectures that do not define barrier_nospec().
      This makes the macro usable in generic code.
      
      Since the barrier is now usable in generic code, the x86 #ifdef in the
      BPF code can also go away.
      Reported-by: Jordy Zomer <jordyzomer@google.com>
      Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>   # BPF bits
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      74e19ef0
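The barrier placement described in the commit above can be modelled in user space. This is only a sketch under stated assumptions: `model_user_mem`, `model_access_ok()` and `model_copy_from_user()` are hypothetical stand-ins invented for illustration, and the no-op `barrier_nospec()` plays the role of the stub for architectures that do not define a real one. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stub for "architectures" that do not provide a real speculation barrier. */
#ifndef barrier_nospec
#define barrier_nospec() do { } while (0)
#endif

/* Hypothetical "user" address space: a single static buffer. */
static unsigned char model_user_mem[64];

/* Model of access_ok(): is [from, from + size) inside model_user_mem? */
static int model_access_ok(const void *from, size_t size)
{
	uintptr_t start = (uintptr_t)model_user_mem;
	uintptr_t addr = (uintptr_t)from;

	if (addr < start || addr - start > sizeof(model_user_mem))
		return 0;
	return size <= sizeof(model_user_mem) - (addr - start);
}

/* Returns the number of bytes NOT copied, like copy_from_user(). */
static size_t model_copy_from_user(void *to, const void *from, size_t size)
{
	assert(to != NULL);

	if (model_access_ok(from, size)) {
		/*
		 * The range check may have been mis-speculated; the barrier
		 * sits between the check and any use of the copied data.
		 */
		barrier_nospec();
		memcpy(to, from, size);
		return 0;
	}

	/* Failed check: hand back zeroes rather than stale data. */
	memset(to, 0, size);
	return size;
}
```

A caller would then follow the pattern from the commit message, for example `if (!model_copy_from_user(&kernelvar, uptr, size)) do_something_with(kernelvar);`, and the barrier is paid only on the copy-from-user path rather than on every `access_ok()` user.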
    • Linus Torvalds's avatar
      Merge tag 'thermal-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 1b72607d
      Linus Torvalds authored
      Pull thermal control updates from Rafael Wysocki:
       "The majority of changes here are related to the general switch-over to
        using arrays of generic trip point structures registered along with a
        thermal zone instead of trip point callbacks (this has been done
        mostly by Daniel Lezcano with some help from yours truly on the Intel
        drivers front).
      
        Apart from that and the related reorganization of code, there are some
        enhancements of the existing driver and a new Mediatek Low Voltage
        Thermal Sensor (LVTS) driver. The Intel powerclamp undergoes a major
        rework so it will use the generic idle_inject facility for CPU idle
        time injection going forward and it will take additional module
        parameters for specifying the subset of CPUs to be affected by it
        (work done by Srinivas Pandruvada).
      
        Also included are assorted fixes and a whole bunch of cleanups.
      
        Specifics:
      
         - Rework a large bunch of drivers to use the generic thermal trip
           structure and use the opportunity to do more cleanups by removing
           unused functions from the OF code (Daniel Lezcano)
      
         - Remove core header inclusion from drivers (Daniel Lezcano)
      
         - Fix some locking issues related to the generic thermal trip rework
           (Johan Hovold)
      
         - Fix a crash when requesting the critical temperature on tegra,
           which is related to the generic trip point work (Jon Hunter)
      
         - Clean up thermal device unregistration code (Viresh Kumar)
      
         - Fix and clean up thermal control core initialization error code
           paths (Daniel Lezcano)
      
         - Relocate the trip points handling code into a separate file (Daniel
           Lezcano)
      
         - Make the thermal core fail registration of thermal zones and
           cooling devices if the thermal class has not been registered
           (Rafael Wysocki)
      
         - Add trip point initialization helper functions for ACPI-defined
           trip points and modify two thermal drivers to use them (Rafael
           Wysocki, Daniel Lezcano)
      
         - Make the core thermal control code use sysfs_emit_at() instead of
           scnprintf() where applicable (ye xingchen)
      
         - Consolidate code accessing the Intel TCC (Thermal Control
           Circuitry) MSRs by introducing library functions for that and
           making the TCC-related code in thermal drivers use them (Zhang Rui)
      
         - Enhance the x86_pkg_temp_thermal driver to support dynamic tjmax
           changes (Zhang Rui)
      
         - Address an "unsigned expression compared with zero" warning in the
           intel_soc_dts_iosf thermal driver (Yang Li)
      
         - Update comments regarding two functions in the Intel Menlow thermal
           driver (Deming Wang)
      
         - Use sysfs_emit_at() instead of scnprintf() in the int340x thermal
           driver (ye xingchen)
      
         - Make the intel_pch thermal driver support the Wellsburg PCH (Tim
           Zimmermann)
      
         - Modify the intel_pch and processor_thermal_device_pci thermal
           drivers to use generic trip point tables instead of thermal zone trip
           point callbacks (Daniel Lezcano)
      
         - Add production mode sysfs attribute to the int340x
           thermal driver (Srinivas Pandruvada)
      
         - Rework dynamic trip point updates handling and locking in the
           int340x thermal driver (Rafael Wysocki)
      
         - Make the int340x thermal driver use a generic trip points table
           instead of thermal zone trip point callbacks (Rafael Wysocki,
           Daniel Lezcano)
      
         - Clean up and improve the int340x thermal driver (Rafael Wysocki)
      
         - Simplify and clean up the intel_pch thermal driver (Rafael Wysocki)
      
         - Fix the Intel powerclamp thermal driver and make it use the common
           idle injection framework (Srinivas Pandruvada)
      
         - Add two module parameters, cpumask and max_idle, to the Intel
           powerclamp thermal driver to allow it to affect only a specific
           subset of CPUs instead of all of them (Srinivas Pandruvada)
      
         - Make the Intel quark_dts thermal driver use generic trip point
           objects instead of its own trip point representation (Daniel
           Lezcano)
      
         - Add toctree entry for thermal documents and fix two issues in the
           Intel powerclamp driver documentation (Bagas Sanjaya)
      
         - Use strscpy() instead of strncpy() in the thermal core (Xu
           Panda)
      
         - Fix thermal_sampling_exit() (Vincent Guittot)
      
         - Add Mediatek Low Voltage Thermal Sensor (LVTS) driver (Balsam
           Chihi)
      
         - Add r8a779g0 RCar support to the rcar_gen3 thermal driver (Geert
           Uytterhoeven)
      
         - Fix useless call to set_trips() when resuming in the rcar_gen3
           thermal control driver and add interrupt support detection at init
           time to it (Niklas Söderlund)
      
         - Fix memory corruption in the hi3660 thermal driver (Yongqin Liu)
      
         - Fix include path for libnl3 in pkg-config file for libthermal
           (Vibhav Pant)
      
         - Remove syscfg-based driver for st as the platform is not supported
           any more (Alain Volmat)"
      
      * tag 'thermal-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (135 commits)
        thermal/drivers/st: Remove syscfg based driver
        thermal: Remove core header inclusion from drivers
        tools/lib/thermal: Fix include path for libnl3 in pkg-config file.
        thermal/drivers/hisi: Drop second sensor hi3660
        thermal/drivers/rcar_gen3_thermal: Fix device initialization
        thermal/drivers/rcar_gen3_thermal: Create device local ops struct
        thermal/drivers/rcar_gen3_thermal: Do not call set_trips() when resuming
        thermal/drivers/rcar_gen3: Add support for R-Car V4H
        dt-bindings: thermal: rcar-gen3-thermal: Add r8a779g0 support
        thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver
        dt-bindings: thermal: mediatek: Add LVTS thermal controllers
        thermal/drivers/mediatek: Relocate driver to mediatek folder
        tools/lib/thermal: Fix thermal_sampling_exit()
        Documentation: powerclamp: Fix numbered lists formatting
        Documentation: powerclamp: Escape wildcard in cpumask description
        Documentation: admin-guide: Add toctree entry for thermal docs
        thermal: intel: powerclamp: Add two module parameters
        Documentation: admin-guide: Move intel_powerclamp documentation
        thermal: core: Use sysfs_emit_at() instead of scnprintf()
        thermal: intel: powerclamp: Fix duration module parameter
        ...
      1b72607d
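The thermal pull request above centres on registering an array of generic trip point structures with a thermal zone instead of having the core query per-driver trip callbacks. The following is a rough user-space sketch of that idea. Every name in it (`struct trip`, `struct zone`, `zone_get_trip()`, `zone_crit_temp()`, the example table) is hypothetical and chosen for illustration; none of it is the kernel's actual definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum trip_type { TRIP_PASSIVE, TRIP_HOT, TRIP_CRITICAL };

struct trip {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
	enum trip_type type;
};

struct zone {
	const struct trip *trips;	/* table handed over at registration */
	size_t num_trips;
};

/* Example table a driver might register along with its zone. */
static const struct trip example_trips[] = {
	{ .temperature = 75000,  .hysteresis = 2000, .type = TRIP_PASSIVE },
	{ .temperature = 95000,  .hysteresis = 2000, .type = TRIP_HOT },
	{ .temperature = 105000, .hysteresis = 0,    .type = TRIP_CRITICAL },
};

static const struct zone example_zone = {
	.trips = example_trips,
	.num_trips = sizeof(example_trips) / sizeof(example_trips[0]),
};

/* With a table, the core reads a trip directly; no driver callback needed. */
static int zone_get_trip(const struct zone *z, size_t id, struct trip *out)
{
	if (!z || !out || id >= z->num_trips)
		return -1;
	*out = z->trips[id];
	return 0;
}

/* Find the first critical trip; returns 0 on success, -1 if there is none. */
static int zone_crit_temp(const struct zone *z, int *temp)
{
	for (size_t i = 0; i < z->num_trips; i++) {
		if (z->trips[i].type == TRIP_CRITICAL) {
			*temp = z->trips[i].temperature;
			return 0;
		}
	}
	return -1;
}
```

Because the table is plain data, generic code such as the critical-temperature lookup can be written once in the core rather than repeated behind separate callbacks in each driver.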
    • Linus Torvalds's avatar
      Merge tag 'acpi-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 88af9b16
      Linus Torvalds authored
      Pull ACPI updates from Rafael Wysocki:
       "These fix a frequency limit issue in the ACPI processor performance
        library code, fix a few issues in the ACPICA code, improve Crystal
        Cove support in the ACPI PMIC driver, fix string handling in the ACPI
        battery driver, add IRQ override quirks for a few more machines, fix
        other assorted problems and clean up code and documentation.
      
        Specifics:
      
         - Drop port I/O validation for some regions to avoid AML failures due
           to rejections of legitimate port I/O writes (Mario Limonciello)
      
         - Constify acpi_get_handle() pathname argument to allow its callers
           to pass const pathnames to it (Sakari Ailus)
      
         - Prevent acpi_ns_simple_repair() from crashing in some cases when
           AE_AML_NO_RETURN_VALUE should be returned (Daniil Tatianin)
      
         - Fix typo in CDAT DSMAS struct definition (Lukas Wunner)
      
         - Drop an unnecessary (void *) conversion from the ACPI processor
           driver (Zhou jie)
      
         - Modify the ACPI processor performance library code to use the "no
           limit" frequency QoS as appropriate and adjust the intel_pstate
           driver accordingly (Rafael Wysocki)
      
         - Add support for NBFT to the ACPI table parser (Stuart Hayes)
      
         - Introduce list of known non-PNP devices to avoid enumerating some
           of them as PNP devices (Rafael Wysocki)
      
         - Add x86 ACPI paths to the ACPI entry in MAINTAINERS to allow
           scripts to report the actual maintainers information (Rafael
           Wysocki)
      
         - Add two more entries to the ACPI IRQ override quirk list (Adam
           Niederer, Werner Sembach)
      
         - Add a pmic_i2c_address entry for Intel Bay Trail Crystal Cove to
           allow intel_soc_pmic_exec_mipi_pmic_seq_element() to be used with
           the Bay Trail Crystal Cove PMIC OpRegion driver (Hans de Goede)
      
         - Add comments with DSDT power OpRegion field names to the ACPI PMIC
           driver (Hans de Goede)
      
         - Fix string termination handling in the ACPI battery driver (Armin
           Wolf)
      
         - Limit error type to 32-bit width in the ACPI APEI error injection
           code (Shuai Xue)
      
         - Fix Lenovo Ideapad Z570 DMI match in the ACPI backlight driver
           (Hans de Goede)
      
         - Silence missing prototype warnings in some places in the
           ACPI-related code (Ammar Faizi)
      
         - Make kobj_type structures used in the ACPI code constant (Thomas
           Weißschuh)
      
         - Correct spelling in firmware-guide/ACPI (Randy Dunlap)
      
         - Clarify the meaning of Explicit and Implicit in the _DSD GPIO
           properties documentation (Andy Shevchenko)
      
         - Fix some kernel-doc comments in the ACPI CPPC library code (Yang
           Li)"
      
      * tag 'acpi-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (25 commits)
        ACPI: make kobj_type structures constant
        Documentation: firmware-guide: gpio-properties: Clarify Explicit and Implicit
        ACPICA: Fix typo in CDAT DSMAS struct definition
        ACPI: resource: Do IRQ override on all TongFang GMxRGxx
        ACPI: resource: Add IRQ overrides for MAINGEAR Vector Pro 2 models
        ACPI: CPPC: Fix some kernel-doc comments
        ACPI: video: Fix Lenovo Ideapad Z570 DMI match
        Documentation: firmware-guide/ACPI: correct spelling
        ACPI: PMIC: Add comments with DSDT power opregion field names
        ACPI: battery: Increase maximum string length
        ACPI: battery: Fix buffer overread if not NUL-terminated
        ACPI: APEI: EINJ: Limit error type to 32-bit width
        MAINTAINERS: Add x86 ACPI paths to the ACPI entry
        ACPI: battery: Fix missing NUL-termination with large strings
        ACPI: PNP: Introduce list of known non-PNP devices
        ACPICA: nsrepair: handle cases without a return value correctly
        ACPI: Silence missing prototype warnings
        cpufreq: intel_pstate: Drop ACPI _PSS states table patching
        ACPI: processor: perflib: Avoid updating frequency QoS unnecessarily
        ACPI: processor: perflib: Use the "no limit" frequency QoS
        ...
      88af9b16
    • Linus Torvalds's avatar
      Merge tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2504ba8b
      Linus Torvalds authored
      Pull power management updates from Rafael Wysocki:
       "These add EPP support to the AMD P-state cpufreq driver, add support
        for new platforms to the Intel RAPL power capping driver, intel_idle
        and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194,
        drop the custom cpufreq driver for loongson1 that is not necessary any
        more (and the corresponding cpufreq platform device), fix assorted
        issues and clean up code.
      
        Specifics:
      
         - Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
           Karny, Arnd Bergmann, Bagas Sanjaya)
      
         - Drop the custom cpufreq driver for loongson1 that is not necessary
           any more and the corresponding cpufreq platform device (Keguang
           Zhang)
      
         - Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
           entries (Paul E. McKenney)
      
         - Enable thermal cooling for Tegra194 (Yi-Wei Wang)
      
         - Register module device table and add missing compatibles for
           cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss)
      
         - Various dt binding updates for qcom-cpufreq-nvmem and
           opp-v2-kryo-cpu (Christian Marangi)
      
         - Make kobj_type structure in the cpufreq core constant (Thomas
           Weißschuh)
      
         - Make cpufreq_unregister_driver() return void (Uwe Kleine-König)
      
         - Make the TEO cpuidle governor check CPU utilization in order to
           refine idle state selection (Kajetan Puchalski)
      
         - Make Kconfig select the haltpoll cpuidle governor when the haltpoll
           cpuidle driver is selected and replace a default_idle() call in
           that driver with arch_cpu_idle() to allow MWAIT to be used (Li
           RongQing)
      
         - Add Emerald Rapids Xeon support to the intel_idle driver (Artem
           Bityutskiy)
      
         - Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
           avoid randconfig build failures (Arnd Bergmann)
      
         - Make kobj_type structures used in the cpuidle sysfs interface
           constant (Thomas Weißschuh)
      
         - Make the cpuidle driver registration code update microsecond values
           of idle state parameters in accordance with their nanosecond values
           if they are provided (Rafael Wysocki)
      
         - Make the PSCI cpuidle driver prevent topology CPUs from being
           suspended on PREEMPT_RT (Krzysztof Kozlowski)
      
         - Document that pm_runtime_force_suspend() cannot be used with
           DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald)
      
         - Add EXPORT macros for exporting PM functions from drivers (Richard
           Fitzgerald)
      
         - Remove /** from non-kernel-doc comments in hibernation code (Randy
           Dunlap)
      
         - Fix possible name leak in powercap_register_zone() (Yang Yingliang)
      
         - Add Meteor Lake and Emerald Rapids support to the intel_rapl power
           capping driver (Zhang Rui)
      
         - Modify the idle_inject power capping facility to support 100% idle
           injection (Srinivas Pandruvada)
      
         - Fix large time windows handling in the intel_rapl power capping
           driver (Zhang Rui)
      
         - Fix memory leaks when using debugfs_lookup() in the generic PM
           domains and Energy Model code (Greg Kroah-Hartman)
      
         - Add missing 'cache-unified' property in the example for kryo OPP
           bindings (Rob Herring)
      
         - Fix error checking in opp_migrate_dentry() (Qi Zheng)
      
         - Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
           Dybcio)
      
         - Modify some power management utilities to use the canonical ftrace
           path (Ross Zwisler)
      
         - Correct spelling problems for Documentation/power/ as reported by
           codespell (Randy Dunlap)"
      
      * tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
        Documentation: amd-pstate: disambiguate user space sections
        cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
        dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum
        dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional
        dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables
        PM: Add EXPORT macros for exporting PM functions
        cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
        MIPS: loongson32: Drop obsolete cpufreq platform device
        powercap: intel_rapl: Fix handling for large time window
        cpuidle: driver: Update microsecond values of state parameters as needed
        cpuidle: sysfs: make kobj_type structures constant
        cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
        PM: EM: fix memory leak with using debugfs_lookup()
        PM: domains: fix memory leak with using debugfs_lookup()
        cpufreq: Make kobj_type structure constant
        cpufreq: davinci: Fix clk use after free
        cpufreq: amd-pstate: avoid uninitialized variable use
        cpufreq: Make cpufreq_unregister_driver() return void
        OPP: fix error checking in opp_migrate_dentry()
        dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible
        ...
      2504ba8b
    • Linus Torvalds's avatar
      Merge tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 4a7d37e8
      Linus Torvalds authored
      Pull hardening updates from Kees Cook:
       "Beyond some specific LoadPin, UBSAN, and fortify features, there are
        other fixes scattered around in various subsystems where maintainers
        were okay with me carrying them in my tree or were non-responsive but
        the patches were reviewed by others:
      
         - Replace 0-length and 1-element arrays with flexible arrays in
           various subsystems (Paulo Miguel Almeida, Stephen Rothwell, Kees
           Cook)
      
         - randstruct: Disable Clang 15 support (Eric Biggers)
      
         - GCC plugins: Drop -std=gnu++11 flag (Sam James)
      
         - strpbrk(): Refactor to use strchr() (Andy Shevchenko)
      
         - LoadPin LSM: Allow root filesystem switching when non-enforcing
      
         - fortify: Use dynamic object size hints when available
      
         - ext4: Fix CFI function prototype mismatch
      
         - Nouveau: Fix DP buffer size arguments
      
         - hisilicon: Wipe entire crypto DMA pool on error
      
         - coda: Fully allocate sig_inputArgs
      
         - UBSAN: Improve arm64 trap code reporting
      
         - copy_struct_from_user(): Add minimum bounds check on kernel buffer
           size"
      
      * tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        randstruct: disable Clang 15 support
        uaccess: Add minimum bounds check on kernel buffer size
        arm64: Support Clang UBSAN trap codes for better reporting
        coda: Avoid partial allocation of sig_inputArgs
        gcc-plugins: drop -std=gnu++11 to fix GCC 13 build
        lib/string: Use strchr() in strpbrk()
        crypto: hisilicon: Wipe entire pool on error
        net/i40e: Replace 0-length array with flexible array
        io_uring: Replace 0-length array with flexible array
        ext4: Fix function prototype mismatch for ext4_feat_ktype
        i915/gvt: Replace one-element array with flexible-array member
        drm/nouveau/disp: Fix nvif_outp_acquire_dp() argument size
        LoadPin: Allow filesystem switch when not enforcing
        LoadPin: Move pin reporting cleanly out of locking
        LoadPin: Refactor sysctl initialization
        LoadPin: Refactor read-only check into a helper
        ARM: ixp4xx: Replace 0-length arrays with flexible arrays
        fortify: Use __builtin_dynamic_object_size() when available
        rxrpc: replace zero-lenth array with DECLARE_FLEX_ARRAY() helper
      4a7d37e8
    • Linus Torvalds's avatar
      Merge tag 'seccomp-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 902d9fcd
      Linus Torvalds authored
      Pull seccomp update from Kees Cook:
      
       - Fix kernel-doc function name ordering to avoid warning (Randy Dunlap)
      
      * tag 'seccomp-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        seccomp: fix kernel-doc function name warning
      902d9fcd
    • Linus Torvalds's avatar
      Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 8cc01d43
      Linus Torvalds authored
      Pull RCU updates from Paul McKenney:
      
       - Documentation updates
      
       - Miscellaneous fixes, perhaps most notably:
      
            - Throttling callback invocation based on the number of callbacks
              that are now ready to invoke instead of on the total number of
              callbacks
      
            - Several patches that suppress false-positive boot-time
              diagnostics, for example, due to lockdep not yet being
              initialized
      
            - Make expedited RCU CPU stall warnings dump stacks of any tasks
              that are blocking the stalled grace period. (Normal RCU CPU
              stall warnings have done this for many years)
      
            - Lazy-callback fixes to avoid delays during boot, suspend, and
              resume. (Note that lazy callbacks must be explicitly enabled, so
              this should not (yet) affect production use cases)
      
       - Make kfree_rcu() and friends take advantage of polled grace periods,
         thus reducing memory footprint by almost two orders of magnitude,
         admittedly on a microbenchmark
      
         This also begins the transition from kfree_rcu(p) to
         kfree_rcu_mightsleep(p). This transition was motivated by bugs where
         kfree_rcu(p), which can block, was typed instead of the intended
         kfree_rcu(p, rh)
      
       - SRCU updates, perhaps most notably fixing a bug that causes SRCU to
         fail when booted on a system with a non-zero boot CPU. This
         surprising situation actually happens for kdump kernels on the
         powerpc architecture
      
         This also adds srcu_down_read() and srcu_up_read(), which act like
         srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side
         critical section to be handed off from one task to another
      
       - Clean up the now-useless SRCU Kconfig option
      
         There are a few more commits that are not yet acked or pulled into
         maintainer trees, and these will be in a pull request for a later
         merge window
      
       - RCU-tasks updates, perhaps most notably these fixes:
      
            - A strange interaction between PID-namespace unshare and the
              RCU-tasks grace period that results in a low-probability but
              very real hang
      
            - A race between an RCU tasks rude grace period on a single-CPU
              system and CPU-hotplug addition of the second CPU that can
              result in a too-short grace period
      
            - A race between shrinking RCU tasks down to a single callback
              list and queuing a new callback to some other CPU, but where
              that queuing is delayed for more than an RCU grace period. This
              can result in that callback being stranded on the non-boot CPU
      
       - Torture-test updates and fixes
      
       - Torture-test scripting updates and fixes
      
       - Provide additional RCU CPU stall-warning information in kernels built
         with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute
         timeout limit for expedited RCU CPU stall warnings
      
      * tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
        rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
        kernel/notifier: Remove CONFIG_SRCU
        init: Remove "select SRCU"
        fs/quota: Remove "select SRCU"
        fs/notify: Remove "select SRCU"
        fs/btrfs: Remove "select SRCU"
        fs: Remove CONFIG_SRCU
        drivers/pci/controller: Remove "select SRCU"
        drivers/net: Remove "select SRCU"
        drivers/md: Remove "select SRCU"
        drivers/hwtracing/stm: Remove "select SRCU"
        drivers/dax: Remove "select SRCU"
        drivers/base: Remove CONFIG_SRCU
        rcu: Disable laziness if lazy-tracking says so
        rcu: Track laziness during boot and suspend
        rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
        rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
        rcu: Align the output of RCU CPU stall warning messages
        rcu: Add RCU stall diagnosis information
        sched: Add helper nr_context_switches_cpu()
        ...
      8cc01d43
    • Linus Torvalds's avatar
      Merge tag 'cgroup-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 8ca8d89b
      Linus Torvalds authored
      Pull cgroup updates from Tejun Heo:
       "All the changes are trivial: documentation updates and a trivial code
        cleanup"
      
      * tag 'cgroup-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup/cpuset: fix a few kernel-doc warnings & coding style
        docs: cgroup-v1: use numbered lists for user interface setup
        docs: cgroup-v1: add internal cross-references
        docs: cgroup-v1: make swap extension subsections subsections
        docs: cgroup-v1: use bullet lists for list of stat file tables
        docs: cgroup-v1: move hierarchy of accounting caption
        docs: cgroup-v1: fix footnotes
        docs: cgroup-v1: use code block for locking order schema
        docs: cgroup-v1: wrap remaining admonitions in admonition blocks
        docs: cgroup-v1: replace custom note constructs with appropriate admonition blocks
        cgroup/cpuset: no need to explicitly init a global static variable
      8ca8d89b
    • Linus Torvalds's avatar
      Merge tag 'wq-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 3e82b41e
      Linus Torvalds authored
      Pull workqueue updates from Tejun Heo:
      
       - When per-cpu workqueue workers expire after sitting idle for too
         long, they used to wake up on the CPU that they're bound to in order
         to exit. This unfortunately could cause unwanted disturbances on CPUs
         isolated for e.g. RT applications.
      
         The worker exit path is restructured so that an existing worker is
         unbound from its CPU before being woken up for the last time,
         allowing it to migrate away from an isolated CPU for exiting.
      
       - A couple of debug improvements. The watchdog dump is made more
         compact and workqueue now warns on use-after-free during the RCU
         grace period after destroy_workqueue().
      
      * tag 'wq-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: Fold rebind_worker() within rebind_workers()
        workqueue: Unbind kworkers before sending them to exit()
        workqueue: Don't hold any lock while rcuwait'ing for !POOL_MANAGER_ACTIVE
        workqueue: Convert the idle_timer to a timer + work_struct
        workqueue: Factorize unbind/rebind_workers() logic
        workqueue: Protects wq_unbound_cpumask with wq_pool_attach_mutex
        workqueue: Make show_pwq() use run-length encoding
        workqueue: Add a new flag to spot the potential UAF error
      3e82b41e
    • Linus Torvalds's avatar
      Merge tag 'irq-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9e58df97
      Linus Torvalds authored
      Pull irq updates from Thomas Gleixner:
       "Updates for the interrupt subsystem:
      
        Core:
      
         - Move the interrupt affinity spreading mechanism into lib/group_cpus
           so it can be used for similar spreading requirements, e.g. in the
           block multi-queue code
      
           This also contains a first use case in the block multi-queue code
           which Jens asked to take along with the librarization
      
         - Improve irqdomain locking to close a number of race conditions which
           can be observed with massive parallel device driver probing
      
         - Enforce and document the semantics of disable_irq() which cannot be
           invoked safely from non-sleepable context
      
         - Move the IPI multiplexing code from the Apple AIC driver into the
           core, so it can be reused by RISCV
      
        Drivers:
      
         - Plug OF node refcounting leaks in various drivers
      
         - Correctly mark level triggered interrupts in the Broadcom L2
           drivers
      
         - The usual small fixes and improvements
      
         - No new drivers for the record!"
      
      * tag 'irq-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
        irqchip/irq-bcm7120-l2: Set IRQ_LEVEL for level triggered interrupts
        irqchip/irq-brcmstb-l2: Set IRQ_LEVEL for level triggered interrupts
        irqdomain: Switch to per-domain locking
        irqchip/mvebu-odmi: Use irq_domain_create_hierarchy()
        irqchip/loongson-pch-msi: Use irq_domain_create_hierarchy()
        irqchip/gic-v3-mbi: Use irq_domain_create_hierarchy()
        irqchip/gic-v3-its: Use irq_domain_create_hierarchy()
        irqchip/gic-v2m: Use irq_domain_create_hierarchy()
        irqchip/alpine-msi: Use irq_domain_add_hierarchy()
        x86/uv: Use irq_domain_create_hierarchy()
        x86/ioapic: Use irq_domain_create_hierarchy()
        irqdomain: Clean up irq_domain_push/pop_irq()
        irqdomain: Drop leftover brackets
        irqdomain: Drop dead domain-name assignment
        irqdomain: Drop revmap mutex
        irqdomain: Fix domain registration race
        irqdomain: Fix mapping-creation race
        irqdomain: Refactor __irq_domain_alloc_irqs()
        irqdomain: Look for existing mapping only once
        irqdomain: Drop bogus fwspec-mapping error handling
        ...
      9e58df97
    • Linus Torvalds's avatar
      Merge tag 'timers-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 560b8030
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "Updates for timekeeping, timers and clockevent/source drivers:
      
        Core:
      
         - Yet another round of improvements to make the clocksource watchdog
           more robust:
      
             - Relax the clocksource-watchdog skew criteria to match the NTP
               criteria.
      
             - Temporarily skip the watchdog when high memory latencies are
               detected which can lead to false-positives.
      
             - Provide an option to enable TSC skew detection even on systems
               where TSC is marked as reliable.
      
           Sigh!
      
         - Initialize the restart block in the nanosleep syscalls to be
           directed to the no restart function instead of doing a partial
           setup on entry.
      
           This prevents an erroneous restart_syscall() invocation from
           corrupting user space data. While such a situation is clearly a
           user space bug, preventing this is a correctness issue and caters
           to the least surprise principle.
      
         - Ignore the hrtimer slack for realtime tasks in schedule_hrtimeout()
           to align it with the nanosleep semantics.
      
        Drivers:
      
         - The obligatory new driver bindings for Mediatek, Rockchip and
           RISC-V variants.
      
         - Add support for the C3STOP misfeature to the RISC-V timer to handle
           the case where the timer stops in deeper idle state.
      
         - Set up a static key in the RISC-V timer correctly before first use.
      
         - The usual small improvements and fixes all over the place"
      
      * tag 'timers-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
        clocksource/drivers/timer-sun4i: Add CLOCK_EVT_FEAT_DYNIRQ
        clocksource/drivers/em_sti: Mark driver as non-removable
        clocksource/drivers/sh_tmu: Mark driver as non-removable
        clocksource/drivers/riscv: Patch riscv_clock_next_event() jump before first use
        clocksource/drivers/timer-microchip-pit64b: Add delay timer
        clocksource/drivers/timer-microchip-pit64b: Select driver only on ARM
        dt-bindings: timer: sifive,clint: add comaptibles for T-Head's C9xx
        dt-bindings: timer: mediatek,mtk-timer: add MT8365
        clocksource/drivers/riscv: Get rid of clocksource_arch_init() callback
        clocksource/drivers/sh_cmt: Mark driver as non-removable
        clocksource/drivers/timer-microchip-pit64b: Drop obsolete dependency on COMPILE_TEST
        clocksource/drivers/riscv: Increase the clock source rating
        clocksource/drivers/timer-riscv: Set CLOCK_EVT_FEAT_C3STOP based on DT
        dt-bindings: timer: Add bindings for the RISC-V timer device
        RISC-V: time: initialize hrtimer based broadcast clock event device
        dt-bindings: timer: rk-timer: Add rktimer for rv1126
        time/debug: Fix memory leak with using debugfs_lookup()
        clocksource: Enable TSC watchdog checking of HPET and PMTMR only when requested
        posix-timers: Use atomic64_try_cmpxchg() in __update_gt_cputime()
        clocksource: Verify HPET and PMTMR when TSC unverified
        ...
      560b8030
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · d1fabc68
      Jakub Kicinski authored
      Per-next-PR merge.
      
      net/smc/af_smc.c
        b5dd4d69 ("net/smc: llc_conf_mutex refactor, replace it with rw_semaphore")
        e40b801b ("net/smc: fix potential panic dues to unprotected smc_llc_srv_add_link()")
      https://lore.kernel.org/all/20230221124008.6303c330@canb.auug.org.au/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d1fabc68
    • Linus Torvalds's avatar
      Merge tag 'x86-cleanups-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 056612fd
      Linus Torvalds authored
      Pull miscellaneous x86 cleanups from Thomas Gleixner:
      
       - Correct the common copy and pasted mishandling of kstrtobool() in the
         strict_sas_size() setup function
      
       - Make recalibrate_cpu_khz() a GPL-only export
      
       - Check TSC feature before doing anything else which avoids pointless
         code execution if TSC is not available
      
       - Remove or fix up stale and misleading comments
      
       - Remove unused or pointlessly duplicated variables
      
       - Spelling and typo fixes
      
      * tag 'x86-cleanups-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/hotplug: Remove incorrect comment about mwait_play_dead()
        x86/tsc: Do feature check as the very first thing
        x86/tsc: Make recalibrate_cpu_khz() export GPL only
        x86/cacheinfo: Remove unused trace variable
        x86/Kconfig: Fix spellos & punctuation
        x86/signal: Fix the value returned by strict_sas_size()
        x86/cpu: Remove misleading comment
        x86/setup: Move duplicate boot_cpu_data definition out of the ifdeffery
        x86/boot/e820: Fix typo in e820.c comment
      056612fd
    • Ilias Apalodimas's avatar
      page_pool: add a comment explaining the fragment counter usage · 4d4266e3
      Ilias Apalodimas authored
      When reading the page_pool code the first impression is that keeping
      two separate counters, one being the page refcnt and the other being
      fragment pp_frag_count, is counter-intuitive.
      
      However, without that fragment counter we cannot reliably tell when to
      destroy or sync the outstanding DMA mappings.  So let's add a comment
      explaining this part.
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Link: https://lore.kernel.org/r/20230217222130.85205-1-ilias.apalodimas@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4d4266e3
    • Vladimir Oltean's avatar
      net: ethtool: fix __ethtool_dev_mm_supported() implementation · a00da30c
      Vladimir Oltean authored
      The MAC Merge layer is supported when ops->get_mm() returns 0.
      The implementation was changed during review, and in this process, a bug
      was introduced.
      
      Link: https://lore.kernel.org/netdev/20230111161706.1465242-5-vladimir.oltean@nxp.com/
      Fixes: 04692c90 ("net: ethtool: netlink: retrieve stats from multiple sources (eMAC, pMAC)")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu>
      Link: https://lore.kernel.org/all/20230220122343.1156614-2-vladimir.oltean@nxp.com/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a00da30c
    • Bo Liu's avatar
      ethtool: pse-pd: Fix double word in comments · 7ec07774
      Bo Liu authored
      Remove the repeated word "for" in comments.
      Signed-off-by: Bo Liu <liubo03@inspur.com>
      Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Link: https://lore.kernel.org/r/20230221083036.2414-1-liubo03@inspur.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7ec07774
    • Xuan Zhuo's avatar
      xsk: add linux/vmalloc.h to xsk.c · 951bce29
      Xuan Zhuo authored
      Fix the compilation failure on sh4.
      
      The earlier introduction of remap_vmalloc_range() broke compilation on
      the sh4 platform, so include the linux/vmalloc.h header in xsk.c.
      
      config: sh-allmodconfig (https://download.01.org/0day-ci/archive/20230221/202302210041.kpPQLlNQ-lkp@intel.com/config)
      compiler: sh4-linux-gcc (GCC) 12.1.0
      reproduce (this is a W=1 build):
              wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
              chmod +x ~/bin/make.cross
              # https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=9f78bf330a66cd400b3e00f370f597e9fa939207
              git remote add net-next https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
              git fetch --no-tags net-next master
              git checkout 9f78bf33
              # save the config file
              mkdir build_dir && cp config build_dir/.config
              COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh olddefconfig
              COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh SHELL=/bin/bash net/
      
      Fixes: 9f78bf33 ("xsk: support use vaddr as ring")
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Link: https://lore.kernel.org/oe-kbuild-all/202302210041.kpPQLlNQ-lkp@intel.com/
      Link: https://lore.kernel.org/r/20230221075140.46988-1-xuanzhuo@linux.alibaba.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      951bce29
    • Linus Torvalds's avatar
      Merge tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3f0b0903
      Linus Torvalds authored
      Pull x86 vdso updates from Borislav Petkov:
      
       - Add getcpu support for the 32-bit version of the vDSO
      
       - Some smaller fixes
      
      * tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/vdso: Fix -Wmissing-prototypes warnings
        x86/vdso: Fake 32bit VDSO build on 64bit compile for vgetcpu
        selftests: Emit a warning if getcpu() is missing on 32bit
        x86/vdso: Provide getcpu for x86-32.
        x86/cpu: Provide the full setup for getcpu() on x86-32
        x86/vdso: Move VDSO image init to vdso2c generated code
      3f0b0903
    • Linus Torvalds's avatar
      Merge tag 'x86_microcode_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · efebca0b
      Linus Torvalds authored
      Pull x86 microcode loader updates from Borislav Petkov:
      
       - Fix mixed steppings support on AMD which got broken somewhere along
         the way
      
       - Improve revision reporting
      
       - Properly check CPUID capabilities after late microcode upgrade to
         avoid false positives
      
       - A garden variety of other small fixes
      
      * tag 'x86_microcode_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/microcode/core: Return an error only when necessary
        x86/microcode/AMD: Fix mixed steppings support
        x86/microcode/AMD: Add a @cpu parameter to the reloading functions
        x86/microcode/amd: Remove load_microcode_amd()'s bsp parameter
        x86/microcode: Allow only "1" as a late reload trigger value
        x86/microcode/intel: Print old and new revision during early boot
        x86/microcode/intel: Pass the microcode revision to print_ucode_info() directly
        x86/microcode: Adjust late loading result reporting message
        x86/microcode: Check CPU capabilities after late microcode update correctly
        x86/microcode: Add a parameter to microcode_check() to store CPU capabilities
        x86/microcode: Use the DEVICE_ATTR_RO() macro
        x86/microcode/AMD: Handle multiple glued containers properly
        x86/microcode/AMD: Rename a couple of functions
      efebca0b
    • Linus Torvalds's avatar
      Merge tag 'x86_cache_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · aa8c3db4
      Linus Torvalds authored
      Pull x86 resource control updates from Borislav Petkov:
      
       - Add support for a new AMD feature called slow memory bandwidth
         allocation. Its goal is to control resource allocation in external
         slow memory which is connected to the machine, for example through
         CXL devices, accelerators, etc.
      
      * tag 'x86_cache_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/resctrl: Fix a silly -Wunused-but-set-variable warning
        Documentation/x86: Update resctrl.rst for new features
        x86/resctrl: Add interface to write mbm_local_bytes_config
        x86/resctrl: Add interface to write mbm_total_bytes_config
        x86/resctrl: Add interface to read mbm_local_bytes_config
        x86/resctrl: Add interface to read mbm_total_bytes_config
        x86/resctrl: Support monitor configuration
        x86/resctrl: Add __init attribute to rdt_get_mon_l3_config()
        x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation
        x86/resctrl: Include new features in command line options
        x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
        x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
        x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
        x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
      aa8c3db4
    • Linus Torvalds's avatar
      Merge tag 'x86_alternatives_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1adce1b9
      Linus Torvalds authored
      Pull x86 asm alternatives updates from Borislav Petkov:
      
       - Teach the static_call patching infrastructure to handle conditional
         tail calls properly which can be static calls too
      
       - Add proper struct alt_instr.flags which controls different aspects of
         insn patching behavior
      
      * tag 'x86_alternatives_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/static_call: Add support for Jcc tail-calls
        x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructions
        x86/alternatives: Introduce int3_emulate_jcc()
        x86/alternatives: Add alt_instr.flags
      1adce1b9