1. 16 Sep, 2024 35 commits
    • Merge tag 'lsm-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm · a430d95c
      Linus Torvalds authored
      Pull lsm updates from Paul Moore:
      
       - Move the LSM framework to static calls
      
         This transitions the vast majority of the LSM callbacks into static
         calls. Those callbacks which haven't been converted were left as-is
         due to the general ugliness of the changes required to support the
         static call conversion; we can revisit those callbacks at a future
         date.
      
       - Add the Integrity Policy Enforcement (IPE) LSM
      
         This adds a new LSM, Integrity Policy Enforcement (IPE). There is
         plenty of documentation about IPE in these patches, so I'll refrain
         from going into too much detail here, but the basic motivation behind
         IPE is to provide a mechanism such that administrators can restrict
         execution to only those binaries which come from integrity protected
         storage, e.g. a dm-verity protected filesystem. You will notice that
         IPE requires additional LSM hooks in the initramfs, dm-verity, and
         fs-verity code, with the associated patches carrying ACK/review tags
         from the associated maintainers. We couldn't find an obvious
         maintainer for the initramfs code, but the IPE patchset has been
         widely posted over several years.
      
         Both Deven Bowers and Fan Wu have contributed to IPE's development
         over the past several years, with Fan Wu agreeing to serve as the IPE
         maintainer moving forward. Once IPE is accepted into your tree, I'll
         start working with Fan to ensure he has the necessary accounts, keys,
         etc. so that he can start submitting IPE pull requests to you
         directly during the next merge window.
      
       - Move the lifecycle management of the LSM blobs to the LSM framework
      
         Management of the LSM blobs (the LSM state buffers attached to
         various kernel structs, typically via a void pointer named "security"
         or similar) has been mixed: some blobs were allocated/managed by
         individual LSMs, others were managed by the LSM framework itself.
      
         Starting with this pull we move management of all the LSM blobs,
         minus the XFRM blob, into the framework itself, improving consistency
         across LSMs, and reducing the amount of duplicated code across LSMs.
         Due to some additional work required to migrate the XFRM blob, it has
         been left as a todo item for a later date; from a practical
         standpoint this omission should have little impact as only SELinux
         provides an XFRM LSM implementation.
      
       - Fix problems with the LSM's handling of F_SETOWN
      
         The LSM hook for the fcntl(F_SETOWN) operation had a couple of
         problems: it was racy with itself, and it was disconnected from the
         associated DAC related logic in such a way that the LSM state could
         be updated in cases where the DAC state would not. We fix both of
         these problems by moving the security_file_set_fowner() hook into the
         same section of code where the DAC attributes are updated. Not only
         does this resolve the DAC/LSM synchronization issue, but as that code
         block is protected by a lock, it also resolves the race condition.
      
       - Fix potential problems with the security_inode_free() LSM hook
      
         Due to use of RCU to protect inodes and the placement of the LSM hook
         associated with freeing the inode, there is a bit of a challenge when
         it comes to managing any LSM state associated with an inode. The VFS
         folks are not open to relocating the LSM hook so we have to get
         creative when it comes to releasing an inode's LSM state.
         Traditionally we have used a single LSM callback within the hook that
         is triggered when the inode is "marked for death", but not actually
         released due to RCU.
      
         Unfortunately, this causes problems for LSMs which want to take an
         action when the inode's associated LSM state is actually released; so
         we add an additional LSM callback, inode_free_security_rcu(), that is
         called when the inode's LSM state is released in the RCU free
         callback.
      
       - Refactor two LSM hooks to better fit the LSM return value patterns
      
         The vast majority of the LSM hooks follow the "return 0 on success,
         negative values on failure" pattern, however, there are a small
         handful that have unique return value behaviors which has caused
         confusion in the past and makes it difficult for the BPF verifier to
         properly vet BPF LSM programs. This includes patches to convert two
         of these "special" LSM hooks to the common 0/-ERRNO pattern.
      
       - Various cleanups and improvements
      
         A handful of patches to remove redundant code, better leverage the
         IS_ERR_OR_NULL() helper, add missing "static" markings, and do some
         minor style fixups.
      
      * tag 'lsm-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (40 commits)
        security: Update file_set_fowner documentation
        fs: Fix file_set_fowner LSM hook inconsistencies
        lsm: Use IS_ERR_OR_NULL() helper function
        lsm: remove LSM_COUNT and LSM_CONFIG_COUNT
        ipe: Remove duplicated include in ipe.c
        lsm: replace indirect LSM hook calls with static calls
        lsm: count the LSMs enabled at compile time
        kernel: Add helper macros for loop unrolling
        init/main.c: Initialize early LSMs after arch code, static keys and calls.
        MAINTAINERS: add IPE entry with Fan Wu as maintainer
        documentation: add IPE documentation
        ipe: kunit test for parser
        scripts: add boot policy generation program
        ipe: enable support for fs-verity as a trust provider
        fsverity: expose verified fsverity built-in signatures to LSMs
        lsm: add security_inode_setintegrity() hook
        ipe: add support for dm-verity as a trust provider
        dm-verity: expose root hash digest and signature data to LSMs
        block,lsm: add LSM blob and new LSM hooks for block devices
        ipe: add permissive toggle
        ...
      a430d95c
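The "return 0 on success, negative values on failure" convention described above can be sketched in user space. This is only an illustration of the return-value pattern, not the kernel's implementation: the names `hook_fn`, `allow_all`, `deny_42` and `call_int_hooks` are hypothetical, and the hooks are reached through ordinary function pointers rather than the static calls introduced by this pull.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook signature: 0 on success, -ERRNO on failure. */
typedef int (*hook_fn)(int request);

/* Hypothetical policy that allows every request. */
static int allow_all(int request)
{
    (void)request;
    return 0;
}

/* Hypothetical policy that refuses request number 42 only. */
static int deny_42(int request)
{
    return request == 42 ? -EACCES : 0;
}

/* Walk the hooks in order; the first non-zero result is returned. */
static int call_int_hooks(const hook_fn *hooks, size_t n, int request)
{
    assert(hooks != NULL);
    for (size_t i = 0; i < n; i++) {
        int rc = hooks[i](request);

        if (rc != 0)
            return rc;
    }
    return 0;
}
```

Because every hook follows the same convention, a caller (or a verifier) only has to reason about one contract: zero means "carry on", a negative errno means "stop and report".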
    • Merge tag 'selinux-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · ad060dbb
      Linus Torvalds authored
      Pull selinux updates from Paul Moore:
      
       - Ensure that both IPv4 and IPv6 connections are properly initialized
      
         While we always properly initialized IPv4 connections early in their
         life, we missed the necessary IPv6 change when we were adding IPv6
         support.
      
       - Annotate the SELinux inode revalidation function to quiet KCSAN
      
         KCSAN correctly identifies a race in __inode_security_revalidate()
         when we check to see if an inode's SELinux state has been properly
         initialized. While KCSAN is correct, it is an intentional choice made
         for performance reasons; if necessary, we check the state a second
         time, this time with a lock held, before initializing the inode's
         state.
      
       - Code cleanups, simplification, etc.
      
         A handful of individual patches to simplify some SELinux kernel
         logic, improve return code granularity via ERR_PTR(), follow the
         guidance on using KMEM_CACHE(), and correct some minor style
         problems.
      
      * tag 'selinux-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: fix style problems in security/selinux/include/audit.h
        selinux: simplify avc_xperms_audit_required()
        selinux: mark both IPv4 and IPv6 accepted connection sockets as labeled
        selinux: replace kmem_cache_create() with KMEM_CACHE()
        selinux: annotate false positive data race to avoid KCSAN warnings
        selinux: refactor code to return ERR_PTR in selinux_netlbl_sock_genattr
        selinux: Streamline type determination in security_compute_sid
      ad060dbb
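The "check without the lock, then check again with the lock held" pattern described for the inode revalidation function can be sketched in user-space C. This is a minimal sketch under stated assumptions, not the SELinux code: `struct sec_state`, `sec_state_setup()` and `sec_state_revalidate()` are hypothetical names, and C11 atomics stand in for the kernel's annotated plain accesses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-inode security state. */
struct sec_state {
    atomic_bool initialized;
    pthread_mutex_t lock;
    int label;
    int init_count;    /* how many times the slow path initialised */
};

static void sec_state_setup(struct sec_state *s)
{
    atomic_init(&s->initialized, false);
    pthread_mutex_init(&s->lock, NULL);
    s->label = 0;
    s->init_count = 0;
}

static int sec_state_revalidate(struct sec_state *s)
{
    assert(s != NULL);

    /* Fast path: lockless check, cheap in the common case. */
    if (atomic_load_explicit(&s->initialized, memory_order_acquire))
        return s->label;

    pthread_mutex_lock(&s->lock);
    /* Second check, this time with the lock held. */
    if (!atomic_load_explicit(&s->initialized, memory_order_relaxed)) {
        s->label = 7;    /* placeholder for the real initialisation */
        s->init_count++;
        atomic_store_explicit(&s->initialized, true,
                              memory_order_release);
    }
    pthread_mutex_unlock(&s->lock);
    return s->label;
}
```

The lockless read may observe a stale "not initialised" value, but that only costs a trip through the locked slow path, where the second check prevents double initialisation.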
    • Merge tag 'audit-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · dc644fba
      Linus Torvalds authored
      Pull audit updates from Paul Moore:
      
       - Fix some remaining problems with PID/TGID reporting
      
         When most users think about PIDs, what they are really thinking about
         is the TGID. This commit shifts the audit PID logging and filtering
         to use the TGID value which should provide a more meaningful audit
         stream and filtering experience for users.
      
       - Migrate to the str_enabled_disabled() helper
      
         Evidently we have helper functions that help ensure if we mistype
         "enabled" or "disabled" it is now caught at compile time. I guess
         we're fancy now.
      
      * tag 'audit-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
        audit: Make use of str_enabled_disabled() helper
        audit: use task_tgid_nr() instead of task_pid_nr()
      dc644fba
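The PID/TGID distinction mentioned above is visible from user space on Linux: getpid() returns the thread group ID, which every thread in a process shares, while the gettid system call returns the per-thread ID. The sketch below is illustrative only; `struct ids`, `record_ids()` and `ids_from_new_thread()` are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

struct ids {
    pid_t tgid;    /* what users usually call "the PID" */
    pid_t tid;     /* per-thread ID, a "PID" in kernel terms */
};

static void *record_ids(void *arg)
{
    struct ids *out = arg;

    out->tgid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Spawn a thread and report the IDs it observes. */
static struct ids ids_from_new_thread(void)
{
    struct ids out = { 0, 0 };
    pthread_t t;
    int rc = pthread_create(&t, NULL, record_ids, &out);

    assert(rc == 0);
    (void)rc;
    pthread_join(t, NULL);
    return out;
}
```

In the spawned thread the two values differ, whereas in the initial thread of a process they are equal, which is why the distinction is easy to overlook.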
    • cifs: Remove redundant setting of NETFS_SREQ_HIT_EOF · 43a64bd0
      David Howells authored
      Fix an upstream merge resolution issue[1].  The NETFS_SREQ_HIT_EOF flag,
      and code to set it, got added via two different paths.  The original path
      saw it added in the netfslib read improvements[2], but it was also added,
      and slightly differently, in a fix that was committed before v6.11:
      
              1da29f2c
              netfs, cifs: Fix handling of short DIO read
      
      However, the code added to smb2_readv_callback() to set the flag didn't
      get removed when the netfs read improvements series was rebased to take
      account of the cifs fixes.  The proposed merge resolution[2] deleted it
      rather than rebase the patches.
      
      Fix this by removing the redundant lines.  Code to set the bit that derives
      from the fix patch is still there, a few lines above in the source.
      
      Fixes: 35219bc5 ("Merge tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Steve French <stfrench@microsoft.com>
      cc: Paulo Alcantara <pc@manguebit.com>
      cc: Christian Brauner <brauner@kernel.org>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cifs@vger.kernel.org
      cc: netfs@lists.linux.dev
      cc: linux-fsdevel@vger.kernel.org
      Link: https://lore.kernel.org/r/CAHk-=wjr8fxk20-wx=63mZruW1LTvBvAKya1GQ1EhyzXb-okMA@mail.gmail.com/ [1]
      Link: https://lore.kernel.org/linux-fsdevel/20240913-vfs-netfs-39ef6f974061@brauner/ [2]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      43a64bd0
    • cifs: Fix cifs readv callback merge resolution issue · dc1a456d
      David Howells authored
      Fix an upstream merge resolution issue[1].  Prior to the netfs read
      helpers, the SMB1 asynchronous read callback, cifs_readv_worker(),
      performed the cleanup for the operation in the network message processing
      loop, potentially slowing down the processing of incoming SMB messages.
      
      With commit a68c7486 ("cifs: Fix SMB1 readv/writev callback in the same
      way as SMB2/3"), this was moved to a worker thread (as is done in the
      SMB2/3 transport variant).  However, the "was_async" argument to
      netfs_subreq_terminated (which was originally, and incorrectly, "false")
      got flipped to "true", which was then incorrect because, being in a
      kernel thread, it's not in an async context.
      
      This got corrected in the sample merge[2], but Linus, not unreasonably,
      switched it back to its previous value.
      
      Note that this value tells netfslib whether or not it can run sleepable
      stuff or stuff that takes a long time, such as retries and cleanups, in the
      calling thread, or whether it should offload to a worker thread.
      
      Fix this so that it is "false".  The callback to netfslib in both SMB1 and
      SMB2/3 now gets offloaded from the network message thread to a separate
      worker thread and thus it's fine to do the slow work in this thread.
      
      Fixes: 35219bc5 ("Merge tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Steve French <stfrench@microsoft.com>
      cc: Paulo Alcantara <pc@manguebit.com>
      cc: Christian Brauner <brauner@kernel.org>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cifs@vger.kernel.org
      cc: netfs@lists.linux.dev
      cc: linux-fsdevel@vger.kernel.org
      Link: https://lore.kernel.org/r/CAHk-=wjr8fxk20-wx=63mZruW1LTvBvAKya1GQ1EhyzXb-okMA@mail.gmail.com/ [1]
      Link: https://lore.kernel.org/linux-fsdevel/20240913-vfs-netfs-39ef6f974061@brauner/ [2]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dc1a456d
    • Merge tag 'for-6.12/io_uring-discard-20240913' of git://git.kernel.dk/linux · adfc3ded
      Linus Torvalds authored
      Pull io_uring async discard support from Jens Axboe:
       "Sitting on top of both the 6.12 block and io_uring core branches,
        here's support for async discard through io_uring.
      
        This allows applications to issue async discards, rather than rely on
        the blocking sync ioctl discards we already have. The sync support is
        difficult to use outside of idle/cleanup periods.
      
        On a real (but slow) device, testing shows the following results when
        compared to sync discard:
      
      	qd64 sync discard: 21K IOPS, lat avg 3 msec (max 21 msec)
      	qd64 async discard: 76K IOPS, lat avg 845 usec (max 2.2 msec)
      
      	qd64 sync discard: 14K IOPS, lat avg 5 msec (max 25 msec)
      	qd64 async discard: 56K IOPS, lat avg 1153 usec (max 3.6 msec)
      
        and synthetic null_blk testing with the same queue depth and block
        size settings as above shows:
      
      	Type    Trim size       IOPS    Lat avg (usec)  Lat Max (usec)
      	==============================================================
      	sync    4k               144K       444            20314
      	async   4k              1353K        47              595
      	sync    1M                56K      1136            21031
      	async   1M                94K       680              760"
      
      * tag 'for-6.12/io_uring-discard-20240913' of git://git.kernel.dk/linux:
        block: implement async io_uring discard cmd
        block: introduce blk_validate_byte_range()
        filemap: introduce filemap_invalidate_pages
        io_uring/cmd: give inline space in request to cmds
        io_uring/cmd: expose iowq to cmds
      adfc3ded
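The IOPS and average latency figures quoted above can be cross-checked against each other with Little's law (throughput = requests in flight / average latency). This is a rough sanity check that assumes the queue stays full at depth 64; it is not part of the original message, and `expected_kiops()` is a hypothetical helper name.

```c
#include <assert.h>

/*
 * Little's law: with queue_depth requests always in flight and an
 * average latency of lat_avg_usec microseconds, the expected rate
 * is queue_depth / latency. The result is in thousands of IOPS.
 */
static double expected_kiops(unsigned int queue_depth, double lat_avg_usec)
{
    assert(lat_avg_usec > 0.0);
    return (double)queue_depth / lat_avg_usec * 1e6 / 1e3;
}
```

For example, `expected_kiops(64, 845)` gives roughly 75.7, close to the reported 76K IOPS for async discard, and `expected_kiops(64, 444)` gives roughly 144.1, matching the 144K reported for 4k sync trims on null_blk.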
    • Merge tag 'for-6.12/block-20240913' of git://git.kernel.dk/linux · 26bb0d3f
      Linus Torvalds authored
      Pull block updates from Jens Axboe:
      
       - MD changes via Song:
            - md-bitmap refactoring (Yu Kuai)
            - raid5 performance optimization (Artur Paszkiewicz)
            - Other small fixes (Yu Kuai, Chen Ni)
            - Add a sysfs entry 'new_level' (Xiao Ni)
            - Improve information reported in /proc/mdstat (Mateusz Kusiak)
      
       - NVMe changes via Keith:
            - Asynchronous namespace scanning (Stuart)
            - TCP TLS updates (Hannes)
            - RDMA queue controller validation (Niklas)
            - Align field names to the spec (Anuj)
            - Metadata support validation (Puranjay)
            - A syntax cleanup (Shen)
            - Fix a Kconfig linking error (Arnd)
            - New queue-depth quirk (Keith)
      
       - Add missing unplug trace event (Keith)
      
       - blk-iocost fixes (Colin, Konstantin)
      
       - t10-pi modular removal and fixes (Alexey)
      
       - Fix for potential BLKSECDISCARD overflow (Alexey)
      
       - bio splitting cleanups and fixes (Christoph)
      
       - Deal with folios rather than pages, speeding up how the
         block layer handles bigger IOs (Kundan)
      
       - Use spinlocks rather than bit spinlocks in zram (Sebastian, Mike)
      
       - Reduce zoned device overhead in ublk (Ming)
      
       - Add and use sendpages_ok() for drbd and nvme-tcp (Ofir)
      
       - Fix regression in partition error pointer checking (Riyan)
      
       - Add support for write zeroes and rotational status in nbd (Wouter)
      
       - Add Yu Kuai as new BFQ maintainer. The scheduler has been
         unmaintained for quite a while.
      
       - Various sets of fixes for BFQ (Yu Kuai)
      
       - Misc fixes and cleanups (Alvaro, Christophe, Li, Md Haris, Mikhail,
         Yang)
      
      * tag 'for-6.12/block-20240913' of git://git.kernel.dk/linux: (120 commits)
        nvme-pci: qdepth 1 quirk
        block: fix potential invalid pointer dereference in blk_add_partition
        blk_iocost: make read-only static array vrate_adj_pct const
        block: unpin user pages belonging to a folio at once
        mm: release number of pages of a folio
        block: introduce folio awareness and add a bigger size from folio
        block: Added folio-ized version of bio_add_hw_page()
        block, bfq: factor out a helper to split bfqq in bfq_init_rq()
        block, bfq: remove local variable 'bfqq_already_existing' in bfq_init_rq()
        block, bfq: remove local variable 'split' in bfq_init_rq()
        block, bfq: remove bfq_log_bfqg()
        block, bfq: merge bfq_release_process_ref() into bfq_put_cooperator()
        block, bfq: fix procress reference leakage for bfqq in merge chain
        block, bfq: fix uaf for accessing waker_bfqq after splitting
        blk-throttle: support prioritized processing of metadata
        blk-throttle: remove last_low_overflow_time
        drbd: Add NULL check for net_conf to prevent dereference in state validation
        nvme-tcp: fix link failure for TCP auth
        blk-mq: add missing unplug trace event
        mtip32xx: Remove redundant null pointer checks in mtip_hw_debugfs_init()
        ...
      26bb0d3f
    • Merge tag 'for-6.12/io_uring-20240913' of git://git.kernel.dk/linux · 3a4d319a
      Linus Torvalds authored
      Pull io_uring updates from Jens Axboe:
      
       - NAPI fixes and cleanups (Pavel, Olivier)
      
       - Add support for absolute timeouts (Pavel)
      
       - Fixes for io-wq/sqpoll affinities (Felix)
      
       - Efficiency improvements for dealing with huge pages (Chenliang)
      
       - Support for a minwait mode, where the application essentially has two
         timeouts - one smaller one that defines the batch timeout, and the
         overall large one similar to what we had before. This enables
         efficient use of batching based on count + timeout, while still
         working well with periods of less intensive workloads
      
       - Use ITER_UBUF for single segment sends
      
       - Add support for incremental buffer consumption. Right now each
         operation will always consume a full buffer. With incremental
         consumption, a recv/read operation only consumes the part of the
         buffer that it needs to satisfy the operation
      
       - Add support for GCOV for io_uring, to help maintain high test
         coverage of the code
      
       - Fix regression with ocfs2, where an odd -EOPNOTSUPP wasn't correctly
         converted to a blocking retry
      
       - Add support for cloning registered buffers from one ring to another
      
       - Misc cleanups (Anuj, me)
      
      * tag 'for-6.12/io_uring-20240913' of git://git.kernel.dk/linux: (35 commits)
        io_uring: add IORING_REGISTER_COPY_BUFFERS method
        io_uring/register: provide helper to get io_ring_ctx from 'fd'
        io_uring/rsrc: add reference count to struct io_mapped_ubuf
        io_uring/rsrc: clear 'slot' entry upfront
        io_uring/io-wq: inherit cpuset of cgroup in io worker
        io_uring/io-wq: do not allow pinning outside of cpuset
        io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()
        io_uring/rw: treat -EOPNOTSUPP for IOCB_NOWAIT like -EAGAIN
        io_uring/sqpoll: do not allow pinning outside of cpuset
        io_uring/eventfd: move refs to refcount_t
        io_uring: remove unused rsrc_put_fn
        io_uring: add new line after variable declaration
        io_uring: add GCOV_PROFILE_URING Kconfig option
        io_uring/kbuf: add support for incremental buffer consumption
        io_uring/kbuf: pass in 'len' argument for buffer commit
        Revert "io_uring: Require zeroed sqe->len on provided-buffers send"
        io_uring/kbuf: move io_ring_head_to_buf() to kbuf.h
        io_uring/kbuf: add io_kbuf_commit() helper
        io_uring/kbuf: shrink nr_iovs/mode in struct buf_sel_arg
        io_uring: wire up min batch wake timeout
        ...
      3a4d319a
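The difference between full and incremental buffer consumption can be sketched as follows. This is a simplified user-space model, not the io_uring code: `struct prov_buf`, `commit_full()` and `commit_incremental()` are hypothetical names, and each function returns 1 when the buffer is finished and 0 when it can still be reused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical provided buffer: next free byte and bytes remaining. */
struct prov_buf {
    char *addr;
    size_t len;
};

/* Full consumption: any transfer retires the whole buffer. */
static int commit_full(struct prov_buf *b, size_t used)
{
    assert(used <= b->len);
    (void)used;
    b->addr += b->len;
    b->len = 0;
    return 1;
}

/* Incremental consumption: only the bytes actually used are retired. */
static int commit_incremental(struct prov_buf *b, size_t used)
{
    assert(used <= b->len);
    b->addr += used;
    b->len -= used;
    return b->len == 0;
}
```

With the incremental variant, a 30-byte receive into a 100-byte buffer leaves 70 bytes available for the next operation instead of discarding them.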
    • Merge tag 'erofs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 69a3a0a4
      Linus Torvalds authored
      Pull erofs updates from Gao Xiang:
       "In this cycle, we add file-backed mount support, which has been a
        strong requirement for years. It is especially useful when there are
        thousands of images running on the same host for containers and other
        sandbox use cases, unlike OS image use cases.
      
        Without file-backed mounts, it's hard for container runtimes to manage
        and isolate so many unnecessary virtual block devices safely and
        efficiently, therefore file-backed mounts are highly preferred. For
        EROFS users, ComposeFS [1], containerd, and Android APEXes [2] will
        directly benefit from it, and I've seen no risk in implementing it as
        a completely immutable filesystem.
      
        The previous experimental feature "EROFS over fscache" is now marked
        as deprecated because:
      
         - Fscache is no longer an independent subsystem and has been merged
           into netfs, which was somewhat unexpected when it was proposed.
      
         - New HSM "fanotify pre-content hooks" [3] will be landed upstream.
           These hooks will replace "EROFS over fscache" in a simpler way, as
           EROFS won't be bothered with kernel caching anymore. Userspace
           programs can also manage their own caching hierarchy more flexibly.
      
        Once the HSM "fanotify pre-content hooks" are landed, I will remove the
        fscache backend entirely as an internal dependency cleanup. More
        background is given in the original patchset [4].
      
        In addition to that, there are bugfixes and cleanups as usual.
      
        Summary:
      
         - Support file-backed mounts for containers and sandboxes
      
         - Mark the experimental fscache backend as deprecated
      
         - Handle overlapped pclusters caused by crafted images properly
      
         - Fix a failure path which could cause infinite loops in
           z_erofs_init_decompressor()
      
         - Get rid of unnecessary NOFAILs
      
         - Harmless on-disk hardening & minor cleanups"
      
      * tag 'erofs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: reject inodes with negative i_size
        erofs: restrict pcluster size limitations
        erofs: allocate more short-lived pages from reserved pool first
        erofs: sunset unneeded NOFAILs
        erofs: simplify erofs_map_blocks_flatmode()
        erofs: refactor read_inode calling convention
        erofs: use kmemdup_nul in erofs_fill_symlink
        erofs: mark experimental fscache backend deprecated
        erofs: support compressed inodes for fileio
        erofs: support unencoded inodes for fileio
        erofs: add file-backed mount support
        erofs: handle overlapped pclusters out of crafted images properly
        erofs: fix error handling in z_erofs_init_decompressor
        erofs: clean up erofs_register_sysfs()
        erofs: fix incorrect symlink detection in fast symlink
      69a3a0a4
    • Merge tag 'for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 7a40974f
      Linus Torvalds authored
      Pull btrfs updates from David Sterba:
       "This brings mostly refactoring, cleanups, minor performance
        optimizations and usual fixes. The folio API conversions are most
        noticeable.
      
        There's one less visible change that could have a high impact. The
        extent lock scope for read is reduced, not held for the entire
        operation. In the buffered read case it's left to page or inode lock,
        some direct io read synchronization is still needed.
      
        This used to prevent deadlocks induced by page faults during direct
        io, so there was a 4K limitation on the requests, e.g. for io_uring.
        In the future this will allow smoother integration with iomap where
        the extent read lock was a major obstacle.
      
        User visible changes:
      
         - the FSTRIM ioctl updates the processed range even after an error or
           interruption
      
         - cleaner thread is woken up in SYNC ioctl instead of waking the
           transaction thread that can take some delay before waking up the
           cleaner, this can speed up cleaning of deleted subvolumes
      
         - print an error message when opening a device fails, e.g. when it's
           unexpectedly read-only
      
        Core changes:
      
         - improved extent map handling in various ways (locking, iteration, ...)
      
         - new assertions and locking annotations
      
         - raid-stripe-tree locking fixes
      
         - use xarray for tracking dirty qgroup extents, switched from rb-tree
      
         - turn the subpage test to compile-time condition if possible (e.g.
           on x86_64 with 4K pages), this allows skipping a lot of ifs and
           removing dead code
      
         - more preparatory work for compression in subpage mode
      
        Cleanups and refactoring
      
         - folio API conversions, many simple cases where page is passed so
           switch it to folios
      
         - more subpage code refactoring, update page state bitmap processing
      
         - introduce auto free for btrfs_path structure, use for the simple
           cases"
      
      * tag 'for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (110 commits)
        btrfs: only unlock the to-be-submitted ranges inside a folio
        btrfs: merge btrfs_folio_unlock_writer() into btrfs_folio_end_writer_lock()
        btrfs: BTRFS_PATH_AUTO_FREE in orphan.c
        btrfs: use btrfs_path auto free in zoned.c
        btrfs: DEFINE_FREE for struct btrfs_path
        btrfs: remove btrfs_folio_end_all_writers()
        btrfs: constify more pointer parameters
        btrfs: rework BTRFS_I as macro to preserve parameter const
        btrfs: add and use helper to verify the calling task has locked the inode
        btrfs: always update fstrim_range on failure in FITRIM ioctl
        btrfs: convert copy_inline_to_page() to use folio
        btrfs: convert btrfs_decompress() to take a folio
        btrfs: convert zstd_decompress() to take a folio
        btrfs: convert lzo_decompress() to take a folio
        btrfs: convert zlib_decompress() to take a folio
        btrfs: convert try_release_extent_mapping() to take a folio
        btrfs: convert try_release_extent_state() to take a folio
        btrfs: convert submit_eb_page() to take a folio
        btrfs: convert submit_eb_subpage() to take a folio
        btrfs: convert read_key_bytes() to take a folio
        ...
      7a40974f
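The "auto free" idea for path structures can be illustrated in user space with the GCC/Clang `cleanup` variable attribute, which runs a function when a variable goes out of scope. This is a sketch under stated assumptions rather than the btrfs code: `struct path`, `path_auto_free()`, `AUTO_FREE_PATH()`, `lookup()` and the `paths_freed` counter are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a path structure. */
struct path {
    int slots[8];
};

static int paths_freed;    /* instrumentation for this sketch only */

/* Called with the address of the variable when it leaves scope. */
static void path_auto_free(struct path **p)
{
    if (*p) {
        free(*p);
        paths_freed++;
    }
}

/* Declare a path pointer that is freed automatically at scope exit. */
#define AUTO_FREE_PATH(name) \
    struct path *name __attribute__((cleanup(path_auto_free))) = NULL

static int lookup(int key)
{
    AUTO_FREE_PATH(p);

    p = calloc(1, sizeof(*p));
    if (!p)
        return -1;
    if (key < 0)
        return -2;    /* early return: p is still freed */
    p->slots[0] = key;
    return p->slots[0];
}
```

The benefit shows up on the error paths: every early return releases the allocation without an explicit free or a goto-based unwind.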
    • Merge tag 'affs-for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · effdcd52
      Linus Torvalds authored
      Pull affs updates from David Sterba:
       "Cleanups removing unused code and updating the definition of a
        flexible struct array"
      
      * tag 'affs-for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        affs: Replace one-element array with flexible-array member
        affs: Remove unused macros GET_END_PTR, AFFS_GET_HASHENTRY
      effdcd52
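The difference between a one-element array and a C99 flexible-array member can be shown with a small example. The structures below are hypothetical and are not the affs definitions; `table_alloc()` is likewise an illustrative helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Old idiom: a one-element array used as a variable-length tail. */
struct table_old {
    uint32_t count;
    uint32_t entries[1];
};

/* Flexible-array member: the tail contributes nothing to sizeof. */
struct table_new {
    uint32_t count;
    uint32_t entries[];
};

static_assert(sizeof(struct table_new) ==
              offsetof(struct table_new, entries),
              "flexible array adds no storage to the struct itself");

/* Allocate a table with room for n entries after the header. */
static struct table_new *table_alloc(uint32_t n)
{
    struct table_new *t;

    t = calloc(1, sizeof(*t) + (size_t)n * sizeof(t->entries[0]));
    if (t)
        t->count = n;
    return t;
}
```

With the old idiom, sizeof() silently includes one element, so size calculations need an error-prone "minus one"; with the flexible-array member the header size and the element count are kept separate, and compilers can check array bounds more accurately.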
    • Merge tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 35219bc5
      Linus Torvalds authored
      Pull netfs updates from Christian Brauner:
       "This contains the work to improve read/write performance for the new
        netfs library.
      
        The main performance enhancing changes are:
      
         - Define a structure, struct folio_queue, and a new iterator type,
           ITER_FOLIOQ, to hold a buffer as a replacement for ITER_XARRAY. See
           that patch for questions about naming and form.
      
           ITER_FOLIOQ is provided as a replacement for ITER_XARRAY. The
           problem with an xarray is that accessing it requires the use of a
           lock (typically the RCU read lock) - and this means that we can't
           supply iterate_and_advance() with a step function that might sleep
           (crypto for example) without having to drop the lock between pages.
           ITER_FOLIOQ is the iterator for a chain of folio_queue structs,
           where each folio_queue holds a small list of folios. A folio_queue
           struct is a simpler structure than xarray and is not subject to
           concurrent manipulation by the VM. folio_queue is used rather than
           a bvec[] as it can form lists of indefinite size, adding to one end
           and removing from the other on the fly.
      
         - Provide a copy_folio_from_iter() wrapper.
      
         - Make cifs RDMA support ITER_FOLIOQ.
      
         - Use folio queues in the write-side helpers instead of xarrays.
      
         - Add a function to reset the iterator in a subrequest.
      
         - Simplify the write-side helpers to use sheaves to skip gaps rather
           than trying to work out where gaps are.
      
         - In afs, make the read subrequests asynchronous, putting them into
           work items to allow the next patch to do progressive
           unlocking/reading.
      
         - Overhaul the read-side helpers to improve performance.
      
         - Fix the caching of a partial block at the end of a file.
      
         - Allow a store to be cancelled.
      
        Then some changes for cifs to make it use folio queues instead of
        xarrays for crypto bufferage:
      
         - Use raw iteration functions rather than manually coding iteration
           when hashing data.
      
         - Switch to using folio_queue for crypto buffers.
      
         - Remove the xarray bits.
      
        Make some adjustments to the /proc/fs/netfs/stats file such that:
      
         - All the netfs stats lines begin 'Netfs:' but change this to
           something a bit more useful.
      
         - Add a couple of stats counters to track the numbers of skips and
           waits on the per-inode writeback serialisation lock to make it
           easier to check for this as a source of performance loss.
      
        Miscellaneous work:
      
         - Ensure that the sb_writers lock is taken around
           vfs_{set,remove}xattr() in the cachefiles code.
      
         - Reduce the number of conditional branches in netfs_perform_write().
      
         - Move the CIFS_INO_MODIFIED_ATTR flag to the netfs_inode struct and
           remove cifs_post_modify().
      
         - Move the max_len/max_nr_segs members from netfs_io_subrequest to
           netfs_io_stream as they're only needed for one subreq at a time.
      
         - Add an 'unknown' source value for tracing purposes.
      
         - Remove NETFS_COPY_TO_CACHE as it's no longer used.
      
         - Set the request work function up front at allocation time.
      
         - Use bh-disabling spinlocks for rreq->lock as cachefiles completion
           may be run from block-filesystem DIO completion in softirq context.
      
         - Remove fs/netfs/io.c"
      
      * tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
        docs: filesystems: corrected grammar of netfs page
        cifs: Don't support ITER_XARRAY
        cifs: Switch crypto buffer to use a folio_queue rather than an xarray
        cifs: Use iterate_and_advance*() routines directly for hashing
        netfs: Cancel dirty folios that have no storage destination
        cachefiles, netfs: Fix write to partial block at EOF
        netfs: Remove fs/netfs/io.c
        netfs: Speed up buffered reading
        afs: Make read subreqs async
        netfs: Simplify the writeback code
        netfs: Provide an iterator-reset function
        netfs: Use new folio_queue data type and iterator instead of xarray iter
        cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs
        iov_iter: Provide copy_folio_from_iter()
        mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
        netfs: Use bh-disabling spinlocks for rreq->lock
        netfs: Set the request work function upon allocation
        netfs: Remove NETFS_COPY_TO_CACHE
        netfs: Reserve netfs_sreq_source 0 as unset/unknown
        netfs: Move max_len/max_nr_segs from netfs_io_subrequest to netfs_io_stream
        ...
      35219bc5
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.12.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 9020d0d8
      Linus Torvalds authored
      Pull vfs mount updates from Christian Brauner:
       "Recently, we added the ability to list mounts in other mount
        namespaces and the ability to retrieve namespace file descriptors
        without having to go through procfs by deriving them from pidfds.
      
        This extends nsfs in two ways:
      
         (1) Add the ability to retrieve information about a mount namespace
             via NS_MNT_GET_INFO.
      
             This will return the mount namespace id and the number of mounts
             currently in the mount namespace. The number of mounts can be
             used to size the buffer that needs to be used for listmount() and
             is in general useful without having to actually iterate through
             all the mounts.
      
             The structure is extensible.
      
         (2) Add the ability to iterate through all mount namespaces over
             which the caller holds privilege returning the file descriptor
             for the next or previous mount namespace.
      
             To retrieve a mount namespace the caller must be privileged wrt
             its owning user namespace. This means that PID 1 on the host
             can list all mounts in all mount namespaces or that a container
             can list all mounts of its nested containers.
      
             Optionally pass a structure for NS_MNT_GET_INFO with
             NS_MNT_GET_{PREV,NEXT} to retrieve information about the mount
             namespace in one go.
      
        (1) and (2) can be implemented for other namespace types easily.
      
        Together with recent api additions this means one can iterate through
        all mounts in all mount namespaces without ever touching procfs.
      
        The commit message in 49224a34 ('Merge patch series "nsfs: iterate
        through mount namespaces"') contains example code showing how to do
        this"
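
        As a rough illustration of item (1) above, the following C sketch
        queries one mount namespace with NS_MNT_GET_INFO. It assumes the
        installed <linux/nsfs.h> is new enough to define NS_MNT_GET_INFO and
        struct mnt_ns_info (fields nr_mounts and mnt_ns_id); otherwise the
        helper just reports -ENOSYS. mnt_ns_query() is a hypothetical helper
        name, and the caller supplies the namespace path (for example
        /proc/self/ns/mnt) purely for brevity; an fd derived from a pidfd
        would work the same way.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nsfs.h>

/*
 * Query a mount namespace's id and mount count through an nsfs fd.
 * Returns 0 on success or a negative errno value.
 */
int mnt_ns_query(const char *ns_path, uint64_t *ns_id, uint32_t *nr_mounts)
{
	assert(ns_path && ns_id && nr_mounts);
#ifdef NS_MNT_GET_INFO
	struct mnt_ns_info info = { 0 };
	int fd, ret = 0;

	fd = open(ns_path, O_RDONLY);
	if (fd < 0)
		return -errno;
	/* Older kernels reject the unknown ioctl (typically ENOTTY). */
	if (ioctl(fd, NS_MNT_GET_INFO, &info) < 0)
		ret = -errno;
	close(fd);
	if (ret)
		return ret;

	*ns_id = info.mnt_ns_id;
	*nr_mounts = info.nr_mounts;	/* can be used to size a listmount() buffer */
	return 0;
#else
	/* Headers predate the interface; nothing to call. */
	return -ENOSYS;
#endif
}
```

        NS_MNT_GET_NEXT and NS_MNT_GET_PREV are described as taking the same
        optional structure, so a walk over all visible mount namespaces
        could reuse this pattern on each returned file descriptor.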
      
      * tag 'vfs-6.12.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        nsfs: iterate through mount namespaces
        file: add fput() cleanup helper
        fs: add put_mnt_ns() cleanup helper
        fs: allow mount namespace fd
      9020d0d8
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.12.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · e8fc317d
      Linus Torvalds authored
      Pull procfs updates from Christian Brauner:
       "This contains the following changes for procfs:
      
         - Add config options and parameters to block forcing memory writes.
      
           This adds a Kconfig option and boot param to allow removing the
           FOLL_FORCE flag from /proc/<pid>/mem write calls as this can be
           used in various attacks.
      
           The traditional forcing behavior is kept as default because it can
           break GDB and some other use cases.
      
           This is the simpler version that you had requested.
      
         - Restrict overmounting of ephemeral entities.
      
           It is currently possible to mount on top of various ephemeral
           entities in procfs. This specifically includes magic links. To
           recap, magic links are links of the form /proc/<pid>/fd/<nr>. They
           serve as references to a target file and during path lookup they
           cause a jump to the target path. Such magic links disappear if the
           corresponding file descriptor is closed.
      
           Currently it is possible to overmount such magic links. This is
           mostly interesting for an attacker that wants to somehow trick a
           process into e.g., reopening something that it didn't intend to
           reopen or to hide a malicious file descriptor.
      
           But also it risks leaking mounts for long-running processes. When
           overmounting a magic link like above, the mount will not be
           detached when the file descriptor is closed. Only the target
           mountpoint will disappear. Which has the consequence of making it
           impossible to unmount that mount afterwards. So the mount will
           stick around until the process exits and the /proc/<pid>/ directory
           is cleaned up during proc_flush_pid() when the dentries are pruned
           and invalidated.
      
           That in turn means it's possible for a program to accidentally leak
           mounts and it's also possible to make a task leak mounts without
           its knowledge if the attacker just keeps overmounting things under
           /proc/<pid>/fd/<nr>.
      
           Disallow overmounting of such ephemeral entities.
      
         - Clean up the readdir method naming in some procfs file operations.
      
         - Replace kmalloc() and strcpy() with a simple kmemdup() call"
      
      * tag 'vfs-6.12.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        proc: fold kmalloc() + strcpy() into kmemdup()
        proc: block mounting on top of /proc/<pid>/fdinfo/*
        proc: block mounting on top of /proc/<pid>/fd/*
        proc: block mounting on top of /proc/<pid>/map_files/*
        proc: add proc_splice_unmountable()
        proc: proc_readfdinfo() -> proc_fdinfo_iterate()
        proc: proc_readfd() -> proc_fd_iterate()
        proc: add config & param to block forcing mem writes
      e8fc317d
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.12.fallocate' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · ee25861f
      Linus Torvalds authored
      Pull vfs fallocate updates from Christian Brauner:
       "This contains work to try to clean up some of the fallocate mode
        handling. Currently, it confusingly mixes operation modes and an
        optional flag.
      
        The work here tries to better define operation modes and optional
        flags allowing the core and filesystem code to use switch statements
        to switch on the operation mode"
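
        The mode-versus-flag split described above can be illustrated with a
        small userspace sketch. It treats FALLOC_FL_KEEP_SIZE as the single
        optional flag and switches on what remains. enum falloc_op and
        falloc_classify() are hypothetical names, and the sketch ignores the
        extra per-mode rules that the kernel and filesystems enforce; it is
        not the kernel's implementation.

```c
#include <assert.h>
#include <linux/falloc.h>

/* Hypothetical operation names for illustration only. */
enum falloc_op {
	FALLOC_OP_INVALID = -1,
	FALLOC_OP_ALLOCATE,
	FALLOC_OP_PUNCH_HOLE,
	FALLOC_OP_COLLAPSE,
	FALLOC_OP_ZERO,
	FALLOC_OP_INSERT,
	FALLOC_OP_UNSHARE,
};

/* Strip the optional flag, then require exactly one operation mode. */
enum falloc_op falloc_classify(int mode)
{
	switch (mode & ~FALLOC_FL_KEEP_SIZE) {
	case 0:
		return FALLOC_OP_ALLOCATE;
	case FALLOC_FL_PUNCH_HOLE:
		return FALLOC_OP_PUNCH_HOLE;
	case FALLOC_FL_COLLAPSE_RANGE:
		return FALLOC_OP_COLLAPSE;
	case FALLOC_FL_ZERO_RANGE:
		return FALLOC_OP_ZERO;
	case FALLOC_FL_INSERT_RANGE:
		return FALLOC_OP_INSERT;
	case FALLOC_FL_UNSHARE_RANGE:
		return FALLOC_OP_UNSHARE;
	default:
		/* Unknown bits or two modes combined. */
		return FALLOC_OP_INVALID;
	}
}

/* Usage example: one mode plus the optional flag is fine, two modes are not. */
void falloc_classify_demo(void)
{
	assert(falloc_classify(0) == FALLOC_OP_ALLOCATE);
	assert(falloc_classify(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE) ==
	       FALLOC_OP_PUNCH_HOLE);
	assert(falloc_classify(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE) ==
	       FALLOC_OP_INVALID);
}
```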
      
      * tag 'vfs-6.12.fallocate' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        xfs: refactor xfs_file_fallocate
        xfs: move the xfs_is_always_cow_inode check into xfs_alloc_file_space
        xfs: call xfs_flush_unmap_range from xfs_free_file_space
        fs: sort out the fallocate mode vs flag mess
        ext4: remove tracing for FALLOC_FL_NO_HIDE_STALE
        block: remove checks for FALLOC_FL_NO_HIDE_STALE
      ee25861f
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 3352633c
      Linus Torvalds authored
      Pull vfs file updates from Christian Brauner:
       "This is the work to clean up and shrink struct file significantly.
      
        Right now (focusing on x86), struct file is 232 bytes. After this
        series struct file will be 184 bytes, i.e. 3 cachelines with a spare
        8 bytes for future extensions at the end of the struct.
      
        With struct file being as ubiquitous as it is this should make a
        difference for file heavy workloads and allow further optimizations in
        the future.
      
         - struct fown_struct was embedded into struct file letting it take up
           32 bytes in total when really it shouldn't even be embedded in
           struct file in the first place. Instead, actual users of struct
           fown_struct now allocate the struct on demand. This frees up 24
           bytes.
      
         - Move struct file_ra_state into the union containing the cleanup hooks
           and move f_iocb_flags out of the union. This closes a 4 byte hole
           we created earlier and brings struct file to 192 bytes. Which means
           struct file is 3 cachelines and we managed to shrink it by 40
           bytes.
      
         - Reorder struct file so that nothing crosses a cacheline.
      
           I suspect that in the future we will end up reordering some members
           to mitigate false sharing issues or just because someone does
           actually provide really good perf data.
      
         - Shrinking struct file to 192 bytes is only part of the work.
      
           Files use a slab that is SLAB_TYPESAFE_BY_RCU and when a kmem cache
           is created with SLAB_TYPESAFE_BY_RCU the free pointer must be
           located outside of the object because the cache doesn't know what
           part of the memory can safely be overwritten as it may be needed to
           prevent object recycling.
      
           That has the consequence that SLAB_TYPESAFE_BY_RCU may end up
           adding a new cacheline.
      
           So this also contains work to add a new kmem_cache_create_rcu()
           function that allows the caller to specify an offset where the
           freelist pointer is supposed to be placed. Thus avoiding the
           implicit addition of a fourth cacheline.
      
         - And finally this removes the f_version member in struct file.
      
           The f_version member isn't particularly well-defined. It is mainly
           used as a cookie to detect concurrent seeks when iterating
           directories. But it is also abused by some subsystems for
           completely unrelated things.
      
           It is mostly a directory and filesystem specific thing that doesn't
           really need to live in struct file and with its wonky semantics it
           really lacks a specific function.
      
           For pipes, f_version is (ab)used to defer poll notifications until
           a write has happened. And struct pipe_inode_info is used by
           multiple struct files in their ->private_data so there's no chance
           of pushing that down into file->private_data without introducing
           another pointer indirection.
      
           But pipes don't rely on f_pos_lock so this adds a union into struct
           file encompassing f_pos_lock and a pipe specific f_pipe member that
           pipes can use. This union of course can be extended to other file
           types and is similar to what we do in struct inode already"
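
        The size arithmetic quoted above can be checked with a few lines of
        C. This is a sketch assuming 64-byte cachelines and 8-byte pointers
        (typical for x86-64); cachelines_spanned() and cacheline_slack() are
        hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES 64	/* assumption: x86-64 cacheline size */

/* Number of cachelines an object of the given size occupies. */
size_t cachelines_spanned(size_t object_size)
{
	assert(object_size > 0);
	return (object_size + CACHELINE_BYTES - 1) / CACHELINE_BYTES;
}

/* Unused bytes left in the object's last cacheline. */
size_t cacheline_slack(size_t object_size)
{
	return cachelines_spanned(object_size) * CACHELINE_BYTES - object_size;
}

/*
 * With the figures from the text:
 *   232 bytes -> 4 cachelines
 *   192 bytes -> 3 cachelines, no slack
 *   184 bytes -> 3 cachelines, 8 bytes spare
 * A free pointer placed outside a 192-byte object (192 + 8 = 200 bytes,
 * assuming an 8-byte pointer) spills into a fourth cacheline, which is
 * the cost the text says kmem_cache_create_rcu() is meant to avoid.
 */
```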
      
      * tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (26 commits)
        fs: remove f_version
        pipe: use f_pipe
        fs: add f_pipe
        ubifs: store cookie in private data
        ufs: store cookie in private data
        udf: store cookie in private data
        proc: store cookie in private data
        ocfs2: store cookie in private data
        input: remove f_version abuse
        ext4: store cookie in private data
        ext2: store cookie in private data
        affs: store cookie in private data
        fs: add generic_llseek_cookie()
        fs: use must_set_pos()
        fs: add must_set_pos()
        fs: add vfs_setpos_cookie()
        s390: remove unused f_version
        ceph: remove unused f_version
        adi: remove unused f_version
        mm: Removed @freeptr_offset to prevent doc warning
        ...
      3352633c
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.12.folio' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs · 2775df6e
      Linus Torvalds authored
      Pull vfs folio updates from Christian Brauner:
       "This contains work to port write_begin and write_end to rely on folios
        for various filesystems.
      
        This converts ocfs2, vboxfs, orangefs, jffs2, hostfs, fuse, f2fs,
        ecryptfs, ntfs3, nilfs2, reiserfs, minixfs, qnx6, sysv, ufs, and
        squashfs.
      
        After this series lands a bunch of the filesystems in this list do not
        mention struct page anymore"
      
      * tag 'vfs-6.12.folio' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (61 commits)
        Squashfs: Ensure all readahead pages have been used
        Squashfs: Rewrite and update squashfs_readahead_fragment() to not use page->index
        Squashfs: Update squashfs_readpage_block() to not use page->index
        Squashfs: Update squashfs_readahead() to not use page->index
        Squashfs: Update page_actor to not use page->index
        jffs2: Use a folio in jffs2_garbage_collect_dnode()
        jffs2: Convert jffs2_do_readpage_nolock to take a folio
        buffer: Convert __block_write_begin() to take a folio
        ocfs2: Convert ocfs2_write_zero_page to use a folio
        fs: Convert aops->write_begin to take a folio
        fs: Convert aops->write_end to take a folio
        vboxsf: Use a folio in vboxsf_write_end()
        orangefs: Convert orangefs_write_begin() to use a folio
        orangefs: Convert orangefs_write_end() to use a folio
        jffs2: Convert jffs2_write_begin() to use a folio
        jffs2: Convert jffs2_write_end() to use a folio
        hostfs: Convert hostfs_write_end() to use a folio
        fuse: Convert fuse_write_begin() to use a folio
        fuse: Convert fuse_write_end() to use a folio
        f2fs: Convert f2fs_write_begin() to use a folio
        ...
      2775df6e
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.12.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs · 8f72c31f
      Linus Torvalds authored
      Pull misc vfs updates from Christian Brauner:
       "This contains the usual pile of misc updates:
      
        Features:
      
         - Add F_CREATED_QUERY fcntl() that allows userspace to query whether
           a file was actually created. Often userspace wants to know whether
           an O_CREAT request did actually create a file without using
           O_EXCL. The current logic is to first attempt to open the file
           without O_CREAT | O_EXCL and, if ENOENT is returned, to try again
           with both flags. If that succeeds all is well. If it now reports
           EEXIST, userspace retries.
      
           That works fairly well but some corner cases make this more
           involved. If this operates on a dangling symlink the first openat()
           without O_CREAT | O_EXCL will return ENOENT but the second openat()
           with O_CREAT | O_EXCL will fail with EEXIST.
      
           The reason is that openat() without O_CREAT | O_EXCL follows the
           symlink while O_CREAT | O_EXCL doesn't for security reasons. So
           it's not something we can really change unless we add an explicit
           opt-in via O_FOLLOW which seems really ugly.
      
           All available workarounds are really nasty (fanotify, bpf lsm etc)
           so add a simple fcntl().
      
         - Try an opportunistic lookup for O_CREAT. Today, when opening a file
           we'll typically do a fast lookup, but if O_CREAT is set, the kernel
           always takes the exclusive inode lock. This was likely done with
           the expectation that O_CREAT means that we always expect to do the
           create, but that's often not the case. Many programs set O_CREAT
           even in scenarios where the file already exists (see related
           F_CREATED_QUERY patch motivation above).
      
           The series contained in the pr rearranges the pathwalk-for-open
           code to also attempt a fast_lookup in certain O_CREAT cases. If a
           positive dentry is found, the inode_lock can be avoided altogether
           and it can stay in rcuwalk mode for the last step_into.
      
         - Expose the 64 bit mount id via name_to_handle_at()
      
           Now that we provide a unique 64-bit mount ID interface in statx(2),
           we can now provide a race-free way for name_to_handle_at(2) to
           provide a file handle and corresponding mount without needing to
           worry about racing with /proc/mountinfo parsing or having to open a
           file just to do statx(2).
      
           While this is not necessary if you are using AT_EMPTY_PATH and
           don't care about an extra statx(2) call, users that pass full paths
           into name_to_handle_at(2) need to know which mount the file handle
           comes from (to make sure they don't try to open_by_handle_at a file
           handle from a different filesystem) and switching to AT_EMPTY_PATH
           would require allocating a file for every name_to_handle_at(2) call.
      
         - Add a per dentry expire timeout to autofs
      
           There are two fairly well known automounter map formats, the autofs
           format and the amd format (more or less System V and Berkeley).
      
           Some time ago Linux autofs added an amd map format parser that
           implemented a fair amount of the amd functionality. This was done
           within the autofs infrastructure and some functionality wasn't
           implemented because it either didn't make sense or required extra
           kernel changes. The idea was to restrict changes to be within the
           existing autofs functionality as much as possible and leave changes
           with a wider scope to be considered later.
      
           One of these changes is implementing the amd options:
            1) "unmount", expire this mount according to a timeout (same as
               the current autofs default).
            2) "nounmount", don't expire this mount (same as setting the
               autofs timeout to 0 except only for this specific mount).
            3) "utimeout=<seconds>", expire this mount using the specified
               timeout (again same as setting the autofs timeout but only for
               this mount).
      
           To implement these options per-dentry expire timeouts need to be
           implemented for autofs indirect mounts. This is because all map
           keys (mounts) for autofs indirect mounts use an expire timeout
           stored in the autofs mount super block info structure and all
           indirect mounts use the same expire timeout.
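
        The open-then-retry logic described in the F_CREATED_QUERY item
        above can be sketched in C as follows. open_report_created() and
        MAX_TRIES are hypothetical names; the retry bound exists because,
        as the text notes, a dangling symlink makes the two opens disagree
        indefinitely. fd_was_created() shows how the new fcntl() might be
        used instead, assuming the installed headers define F_CREATED_QUERY
        and that it returns 1 when the open created the file; when the
        macro is absent the helper only reports ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_TRIES 8	/* hypothetical bound on the retry loop */

/*
 * Legacy pattern: returns an fd or -1 with errno set.
 * *created tells whether this call created the file.
 */
int open_report_created(const char *path, mode_t mode, bool *created)
{
	assert(path && created);
	for (int i = 0; i < MAX_TRIES; i++) {
		int fd = open(path, O_RDWR);

		if (fd >= 0) {
			*created = false;
			return fd;
		}
		if (errno != ENOENT)
			return -1;

		fd = open(path, O_RDWR | O_CREAT | O_EXCL, mode);
		if (fd >= 0) {
			*created = true;
			return fd;
		}
		if (errno != EEXIST)
			return -1;
		/* Raced with another creator, or hit a dangling symlink: retry. */
	}
	errno = ELOOP;
	return -1;
}

/* Ask the kernel directly; -1 with ENOSYS if the headers lack the fcntl. */
int fd_was_created(int fd)
{
#ifdef F_CREATED_QUERY
	return fcntl(fd, F_CREATED_QUERY, 0);
#else
	(void)fd;
	errno = ENOSYS;
	return -1;
#endif
}
```

        With the fcntl() available, a single open() with O_CREAT followed by
        fd_was_created() would replace the whole loop.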
      
        Fixes:
      
         - Fix missing fput for FSCONFIG_SET_FD in autofs
      
         - Use param->file for FSCONFIG_SET_FD in coda
      
         - Delete the 'fs/netfs' proc subtree when netfs module exits
      
         - Make sure that struct uid_gid_map fits into a single cacheline
      
         - Don't flush in-flight wb switches for superblocks without cgroup
           writeback
      
         - Correct the idmapping mount example in the idmapping
           documentation
      
         - Fix a race between evict_inodes() and find_inode() and iput()
      
         - Refine the show_inode_state() macro definition in writeback code
      
         - Prevent dump_mapping() from accessing invalid dentry.d_name.name
      
         - Show actual source for debugfs in /proc/mounts
      
         - Annotate data-race of busy_poll_usecs in eventpoll
      
         - Don't WARN for racy path_noexec check in exec code
      
         - Handle OOM on mnt_warn_timestamp_expiry()
      
         - Fix some spelling in the iomap design documentation
      
         - Fix typo in procfs comment
      
         - Fix typo in fs/namespace.c comment
      
        Cleanups:
      
         - Add the VFS git tree to the MAINTAINERS file
      
         - Move FMODE_UNSIGNED_OFFSET to fop_flags freeing up another f_mode
           bit in struct file bringing us to 5 free f_mode bits
      
         - Remove the __I_DIO_WAKEUP bit from i_state flags as we can simplify
           the wait mechanism
      
         - Remove the unused path_put_init() helper
      
         - Replace a __u32 with u32 for s_fsnotify_mask as __u32 is uapi
           specific
      
         - Replace the unsigned long i_state member with a u32 i_state member
           in struct inode freeing up 4 bytes in struct inode. Instead of
           using the bit based wait apis we're now using the var event apis
           and using the individual bytes of the i_state member to wait on
           state changes
      
         - Explain how per-syscall AT_* flags should be allocated
      
         - Use in_group_or_capable() helper to simplify the posix acl mode
           update code
      
         - Switch to LIST_HEAD() in fsync_buffers_list() to simplify the code
      
         - Remove comment about d_rcu_to_refcount() as that function doesn't
           exist anymore
      
         - Add kernel documentation for lookup_fast()
      
         - Don't re-zero eventpoll fields
      
         - Remove outdated comment after close_fd()
      
         - Fix imprecise wording in comment about the pipe filesystem
      
         - Drop GFP_NOFAIL mode from alloc_page_buffers
      
         - Fix missing blank line warnings and improve the struct declaration
           in file_table
      
         - Annotate struct poll_list with __counted_by()
      
         - Remove the unused read parameter in percpu-rwsem
      
         - Remove linux/prefetch.h include from direct-io code
      
         - Use kmemdup_array instead of kmemdup for multiple allocation in
           mnt_idmapping code
      
         - Remove unused mnt_cursor_del() declaration
      
        Performance tweaks:
      
         - Dodge smp_mb in break_lease and break_deleg in the common case
      
         - Only read fops once in fops_{get,put}()
      
         - Use RCU in ilookup()
      
         - Elide smp_mb in iversion handling in the common case
      
         - Drop one lock trip in evict()"
      
      * tag 'vfs-6.12.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (58 commits)
        uidgid: make sure we fit into one cacheline
        proc: Fix typo in the comment
        fs/pipe: Correct imprecise wording in comment
        fhandle: expose u64 mount id to name_to_handle_at(2)
        uapi: explain how per-syscall AT_* flags should be allocated
        fs: drop GFP_NOFAIL mode from alloc_page_buffers
        writeback: Refine the show_inode_state() macro definition
        fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name
        mnt_idmapping: Use kmemdup_array instead of kmemdup for multiple allocation
        netfs: Delete subtree of 'fs/netfs' when netfs module exits
        fs: use LIST_HEAD() to simplify code
        inode: make i_state a u32
        inode: port __I_LRU_ISOLATING to var event
        vfs: fix race between evice_inodes() and find_inode()&iput()
        inode: port __I_NEW to var event
        inode: port __I_SYNC to var event
        fs: reorder i_state bits
        fs: add i_state helpers
        MAINTAINERS: add the VFS git tree
        fs: s/__u32/u32/ for s_fsnotify_mask
        ...
      8f72c31f
    • Linus Torvalds's avatar
      Merge tag 'thermal-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · d2230051
      Linus Torvalds authored
      Pull thermal control updates from Rafael Wysocki:
       "These mostly continue to rework the thermal core and the thermal zone
        driver interface to make the code more straightforward and reduce
        bloat.
      
        The most significant piece of this work is a change of the code
        related to binding cooling devices to thermal zones which, among other
        things, replaces two previously existing thermal zone operations with
        one allowing driver implementations to be much simpler.
      
        There is also a new thermal core testing module allowing mock thermal
        zones to be created and controlled via debugfs in order to exercise
        the thermal core functionality. It is expected to be used for
        implementing thermal core self tests in the future.
      
        Apart from the above, there are assorted thermal driver updates.
      
        Specifics:
      
         - Update some thermal drivers to eliminate thermal_zone_get_trip()
           calls from them and get rid of that function (Rafael Wysocki)
      
         - Update the thermal sysfs code to store trip point attributes in
           trip descriptors and get to trip points via attribute pointers
           (Rafael Wysocki)
      
         - Move the computation of the low and high boundaries for
           thermal_zone_set_trips() to __thermal_zone_device_update() (Daniel
           Lezcano)
      
         - Introduce a debugfs-based facility for thermal core testing (Rafael
           Wysocki)
      
         - Replace the thermal zone .bind() and .unbind() callbacks for
           binding cooling devices to thermal zones with one .should_bind()
           callback used for deciding whether or not a given cooling device
           should be bound to a given trip point in a given thermal zone
           (Rafael Wysocki)
      
         - Eliminate code that has no more users after the other changes, drop
           some redundant checks from the thermal core and clean it up (Rafael
           Wysocki)
      
         - Fix rounding of delay jiffies in the thermal core (Rafael Wysocki)
      
         - Refuse to accept trip point temperature or hysteresis that would
           lead to an invalid threshold value when setting them via sysfs
           (Rafael Wysocki)
      
         - Adjust states of all uninitialized instances in the .manage()
           callback of the Bang-bang thermal governor (Rafael Wysocki)
      
         - Drop a couple of redundant checks along with the code depending on
           them from the thermal core (Rafael Wysocki)
      
         - Rearrange the thermal core to avoid redundant checks and simplify
           control flow in a couple of code paths (Rafael Wysocki)
      
         - Add power domain DT bindings for new Amlogic SoCs (Georges Stark)
      
         - Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr() in the ST
           driver and add a Kconfig dependency on THERMAL_OF subsystem for the
           STi driver (Raphael Gallais-Pou)
      
         - Simplify the error code path in the probe functions in the brcmstb
           driver with the help of dev_err_probe() (Yan Zhen)
      
         - Make imx_sc_thermal use dev_err_probe() (Alexander Stein)
      
         - Remove trailing space after \n newline in the Renesas driver (Colin
           Ian King)
      
         - Add DT binding compatible string for the SA8255p to the tsens
           thermal driver (Nikunj Kela)
      
         - Use the devm_clk_get_enabled() helpers to simplify the init routine
           in the sprd thermal driver (Huan Yang)
      
         - Remove __maybe_unused notations for the functions by using the new
           RUNTIME_PM_OPS() and SYSTEM_SLEEP_PM_OPS() macros on the IMx and
           Qoriq drivers (Fabio Estevam)
      
         - Remove unused declarations from the ti-soc-thermal driver's header
           file as the functions in question were removed previously (Zhang
           Zekun)"
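
        The .should_bind() change listed above can be pictured with a small
        userspace mock: instead of the driver performing the binding in
        .bind()/.unbind(), the core walks the trip points and asks the
        driver one yes/no question per trip. Everything here (the mock_*
        types, mock_bind_cdev(), the matching policy) is hypothetical and
        simplified; the real kernel callback has a different signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for trip points, cooling devices and zones. */
struct mock_trip { int id; };
struct mock_cdev { int id; };
struct mock_zone;

struct mock_zone_ops {
	/* Single predicate replacing separate bind/unbind callbacks. */
	bool (*should_bind)(const struct mock_zone *tz,
			    const struct mock_trip *trip,
			    const struct mock_cdev *cdev);
};

struct mock_zone {
	const struct mock_zone_ops *ops;
	const struct mock_trip *trips;
	size_t nr_trips;
};

/* "Core" side: decide per trip whether the cooling device gets bound. */
size_t mock_bind_cdev(const struct mock_zone *tz, const struct mock_cdev *cdev)
{
	size_t bound = 0;

	assert(tz && tz->ops && cdev);
	if (!tz->ops->should_bind)
		return 0;
	for (size_t i = 0; i < tz->nr_trips; i++) {
		if (tz->ops->should_bind(tz, &tz->trips[i], cdev))
			bound++;	/* a real core would record the binding here */
	}
	return bound;
}

/* "Driver" side: an invented policy that only answers the question. */
bool mock_driver_should_bind(const struct mock_zone *tz,
			     const struct mock_trip *trip,
			     const struct mock_cdev *cdev)
{
	(void)tz;
	return trip->id == cdev->id;
}
```

        The driver no longer manipulates bindings itself, which is the
        simplification the pull message attributes to this change.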
      
      * tag 'thermal-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (48 commits)
        thermal: core: Drop thermal_zone_device_is_enabled()
        thermal: core: Check passive delay in monitor_thermal_zone()
        thermal: core: Drop dead code from monitor_thermal_zone()
        thermal: core: Drop redundant lockdep_assert_held()
        thermal: gov_bang_bang: Adjust states of all uninitialized instances
        thermal: sysfs: Add sanity checks for trip temperature and hysteresis
        thermal/drivers/imx_sc_thermal: Use dev_err_probe
        thermal/drivers/ti-soc-thermal: Remove unused declarations
        thermal/drivers/imx: Remove __maybe_unused notations
        thermal/drivers/qoriq: Remove __maybe_unused notations
        thermal/drivers/sprd: Use devm_clk_get_enabled() helpers
        dt-bindings: thermal: tsens: document support on SA8255p
        thermal/drivers/renesas: Remove trailing space after \n newline
        thermal/drivers/brcmstb_thermal: Simplify with dev_err_probe()
        thermal/drivers/sti: Depend on THERMAL_OF subsystem
        thermal/drivers/st: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
        dt-bindings: thermal: amlogic,thermal: add optional power-domains
        thermal: core: Drop tz field from struct thermal_instance
        thermal: core: Drop redundant checks from thermal_bind_cdev_to_trip()
        thermal: core: Rename cdev-to-thermal-zone bind/unbind functions
        ...
      d2230051
    • Linus Torvalds's avatar
      Merge tag 'pm-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 02824a5f
      Linus Torvalds authored
      Pull power management updates from Rafael Wysocki:
       "By the number of new lines of code, the most visible change here is
        the addition of hybrid CPU capacity scaling support to the
        intel_pstate driver. Next are the amd-pstate driver changes related to
        the calculation of the AMD boost numerator and preferred core
        detection.
      
        As far as new hardware support is concerned, the intel_idle driver
        will now handle Granite Rapids Xeon processors natively, the
        intel_rapl power capping driver will recognize family 1Ah of AMD
        processors and Intel ArrowLake-U chips, and intel_pstate will handle
        Granite Rapids and Sierra Forest chips in the out-of-band (OOB) mode.
      
        Apart from the above, there is the usual collection of assorted fixes
        and code cleanups in many places and there are tooling updates.
      
        Specifics:
      
         - Remove LATENCY_MULTIPLIER from cpufreq (Qais Yousef)
      
         - Add support for Granite Rapids and Sierra Forest in OOB mode to the
           intel_pstate cpufreq driver (Srinivas Pandruvada)
      
         - Add basic support for CPU capacity scaling on x86 and make the
           intel_pstate driver set asymmetric CPU capacity on hybrid systems
           without SMT (Rafael Wysocki)
      
         - Add missing MODULE_DESCRIPTION() macros to the powerpc cpufreq
           driver (Jeff Johnson)
      
         - Several OF related cleanups in cpufreq drivers (Rob Herring)
      
         - Enable COMPILE_TEST for ARM drivers (Rob Herring)
      
         - Introduce quirks for syscon failures and use socinfo to get
           revision for TI cpufreq driver (Dhruva Gole, Nishanth Menon)
      
         - Minor cleanups in amd-pstate driver (Anastasia Belova, Dhananjay
           Ugwekar)
      
         - Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
           (Danila Tikhonov, Huacai Chen, and Liu Jing)
      
         - Make amd-pstate validate the return value of any attempt to
           update EPP limits, which fixes the masking of hardware problems
           (Mario Limonciello)
      
         - Move the calculation of the AMD boost numerator outside of
           amd-pstate, correcting acpi-cpufreq on systems with preferred cores
           (Mario Limonciello)
      
         - Harden preferred core detection in amd-pstate to avoid potential
           false positives (Mario Limonciello)
      
         - Add extra unit test coverage for mode state machine (Mario
           Limonciello)
      
         - Fix an "Uninitialized variables" issue in amd-pstate (Qianqiang
           Liu)
      
         - Add Granite Rapids Xeon support to intel_idle (Artem Bityutskiy)
      
         - Disable promotion to C1E on Jasper Lake and Elkhart Lake in
           intel_idle (Kai-Heng Feng)
      
         - Use scoped device node handling to fix missing of_node_put() and
           simplify walking OF children in the riscv-sbi cpuidle driver
           (Krzysztof Kozlowski)
      
         - Remove dead code from cpuidle_enter_state() (Dhruva Gole)
      
         - Change an error pointer to NULL to fix error handling in the
           intel_rapl power capping driver (Dan Carpenter)
      
         - Fix off by one in get_rpi() in the intel_rapl power capping driver
           (Dan Carpenter)
      
         - Add support for ArrowLake-U to the intel_rapl power capping driver
           (Sumeet Pawnikar)
      
         - Fix the energy-pkg event for AMD CPUs in the intel_rapl power
           capping driver (Dhananjay Ugwekar)
      
         - Add support for AMD family 1Ah processors to the intel_rapl power
           capping driver (Dhananjay Ugwekar)
      
         - Remove unused stub for saveable_highmem_page() and remove
           deprecated macros from power management documentation (Andy
           Shevchenko)
      
         - Use sysfs_emit() and sysfs_emit_at() in "show" functions in the PM
           sysfs interface (Xueqin Luo)
      
         - Update the maintainers information for the
           operating-points-v2-ti-cpu DT binding (Dhruva Gole)
      
         - Drop unnecessary of_match_ptr() from ti-opp-supply (Rob Herring)
      
         - Add missing MODULE_DESCRIPTION() macros to devfreq governors (Jeff
           Johnson)
      
         - Use devm_clk_get_enabled() in the exynos-bus devfreq driver (Anand
           Moon)
      
         - Use of_property_present() instead of of_get_property() in the
           imx-bus devfreq driver (Rob Herring)
      
         - Update directory handling and installation process in the pm-graph
           Makefile and add .gitignore to ignore sleepgraph.py artifacts to
           pm-graph (Amit Vadhavana, Yo-Jung Lin)
      
         - Make cpupower display residency value in idle-info (Aboorva
           Devarajan)
      
         - Add missing powercap_set_enabled() stub function to cpupower (John
           B. Wyatt IV)
      
         - Add SWIG support to cpupower (John B. Wyatt IV)"
      
      * tag 'pm-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (62 commits)
        cpufreq/amd-pstate-ut: Fix an "Uninitialized variables" issue
        cpufreq/amd-pstate-ut: Add test case for mode switches
        cpufreq/amd-pstate: Export symbols for changing modes
        amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`
        cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`
        cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
        cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
        x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
        x86/amd: Move amd_get_highest_perf() out of amd-pstate
        ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn
        ACPI: CPPC: Drop check for non zero perf ratio
        x86/amd: Rename amd_get_highest_perf() to amd_get_boost_ratio_numerator()
        ACPI: CPPC: Adjust return code for inline functions in !CONFIG_ACPI_CPPC_LIB
        x86/amd: Move amd_get_highest_perf() from amd.c to cppc.c
        PM: hibernate: Remove unused stub for saveable_highmem_page()
        pm:cpupower: Add error warning when SWIG is not installed
        MAINTAINERS: Add Maintainers for SWIG Python bindings
        pm:cpupower: Include test_raw_pylibcpupower.py
        pm:cpupower: Add SWIG bindings files for libcpupower
        pm:cpupower: Add missing powercap_set_enabled() stub function
        ...
      02824a5f
    • Linus Torvalds's avatar
      Merge tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 11b31250
      Linus Torvalds authored
      Pull ACPI updates from Rafael Wysocki:
       "These update the ACPICA code in the kernel to upstream version
        20240827, add support for ACPI-based enumeration of interrupt
        controllers on RISC-V along with some related irqchip updates, clean
        up the ACPI device object sysfs interface, add some quirks for
        backlight handling and IRQ overrides, fix assorted issues and clean up
        code.
      
        Specifics:
      
         - Check return value in acpi_db_convert_to_package() (Pei Xiao)
      
         - Detect FACS and allow setting the waking vector on reduced-hardware
           ACPI platforms (Jiaqing Zhao)
      
         - Allow ACPICA to represent semaphores as integers (Adrien Destugues)
      
         - Complete CXL 3.0 CXIMS structures support in ACPICA (Zhang Rui)
      
         - Make ACPICA support SPCR version 4 and add RISC-V SBI Subtype to
           DBG2 (Sia Jee Heng)
      
         - Implement the Dword_PCC Resource Descriptor Macro in ACPICA (Jose
           Marinho)
      
         - Correct the typo in struct acpi_mpam_msc_node member (Punit
           Agrawal)
      
         - Implement ACPI_WARNING_ONCE() and ACPI_ERROR_ONCE() and use them to
           prevent a Stall() violation warning from being printed every time
           this takes place (Vasily Khoruzhick)
      
         - Allow PCC Data Type in MCTP resource (Adam Young)
      
         - Fix memory leaks on acpi_ps_get_next_namepath() and
           acpi_ps_get_next_field() failures (Armin Wolf)
      
         - Add support for suppressing leading zeros in hex strings when
           converting them to integers and update integer-to-hex-string
           conversions in ACPICA (Armin Wolf)
      
         - Add support for Windows 11 22H2 _OSI string (Armin Wolf)
      
         - Avoid warning for Dump Functions in ACPICA (Adam Lackorzynski)
      
         - Add extended linear address mode to HMAT MSCIS in ACPICA (Dave
           Jiang)
      
         - Handle empty connection_node in iasl (Aleksandrs Vinarskis)
      
         - Allow for more flexibility in _DSM args (Saket Dumbre)
      
         - Setup for ACPICA release 20240827 (Saket Dumbre)
      
         - Add ACPI device enumeration support for interrupt controller
           probing including taking dependencies into account (Sunil V L)
      
         - Implement ACPI-based interrupt controller probing on RISC-V
           (Sunil V L)
      
         - Add ACPI support for AIA in riscv-intc and add ACPI support to
           riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L)
      
         - Do not release locks during operation region accesses in the ACPI
           EC driver (Rafael Wysocki)
      
         - Fix up the _STR handling in the ACPI device object sysfs interface,
           make it represent the device object attributes as an attribute
           group and make it rely on driver core functionality for sysfs
           attribute management (Thomas Weißschuh)
      
         - Extend error messages printed to the kernel log when
           acpi_evaluate_dsm() fails to include revision and function number
           (David Wang)
      
         - Add a new AMDI0015 platform device ID to the ACPI APD driver for
           AMD SoCs (Shyam Sundar S K)
      
         - Use the driver core for the async probing management in the ACPI
           battery driver (Thomas Weißschuh)
      
         - Remove redundant initializations of a local variable to NULL from
           the ACPI battery driver (Ilpo Järvinen)
      
         - Remove unneeded check in tps68470_pmic_opregion_probe() (Aleksandr
           Mishin)
      
         - Add support for setting the EPP register through the ACPI CPPC
           sysfs interface if it is in FFH (Mario Limonciello)
      
         - Fix MASK_VAL() usage in the ACPI CPPC library (Clément Léger)
      
         - Reduce the log level of a per-CPU message about idle states in the
           ACPI processor driver (Li RongQing)
      
         - Fix crash in exit_round_robin() in the ACPI processor aggregator
           device (PAD) driver (Seiji Nishikawa)
      
         - Add force_vendor quirk for Panasonic Toughbook CF-18 in the ACPI
           backlight driver (Hans de Goede)
      
         - Make the DMI checks related to backlight handling on Lenovo Yoga
           Tab 3 X90F less strict (Hans de Goede)
      
         - Enforce native backlight handling on Apple MacbookPro9,2 (Esther
           Shimanovich)
      
         - Add IRQ override quirks for Asus Vivobook Go E1404GAB and MECHREV
           GM7XG0M, and refine the TongFang GMxXGxx quirk (Li Chen, Tamim
           Khan, Werner Sembach)
      
         - Quirk ASUS ROG M16 to default to S3 sleep (Luke D. Jones)
      
         - Define and use symbols for device and class name lengths in the
           ACPI bus type code and make the code use strscpy() instead of
           strcpy() in several places (Muhammad Qasim Abdul Majeed)"
      
      * tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (70 commits)
        ACPI: resource: Add another DMI match for the TongFang GMxXGxx
        ACPI: CPPC: Add support for setting EPP register in FFH
        ACPI: PM: Quirk ASUS ROG M16 to default to S3 sleep
        ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
        ACPI: battery: use driver core managed async probing
        ACPI: button: Use strscpy() instead of strcpy()
        ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
        ACPI: CPPC: Fix MASK_VAL() usage
        irqchip/sifive-plic: Add ACPI support
        ACPICA: Setup for ACPICA release 20240827
        ACPICA: Allow for more flexibility in _DSM args
        ACPICA: iasl: handle empty connection_node
        ACPICA: HMAT: Add extended linear address mode to MSCIS
        ACPICA: Avoid warning for Dump Functions
        ACPICA: Add support for Windows 11 22H2 _OSI string
        ACPICA: Update integer-to-hex-string conversions
        ACPICA: Add support for supressing leading zeros in hex strings
        ACPICA: Allow for supressing leading zeros when using acpi_ex_convert_to_ascii()
        ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
        ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
        ...
      11b31250
    • Linus Torvalds's avatar
      Merge tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 64dd3b6a
      Linus Torvalds authored
      Pull kvm updates from Paolo Bonzini:
       "These are the non-x86 changes (mostly ARM, as is usually the case).
        The generic and x86 changes will come later.
      
        ARM:
      
         - New Stage-2 page table dumper, reusing the main ptdump
           infrastructure
      
         - FP8 support
      
         - Nested virtualization now supports the address translation
           (FEAT_ATS1A) family of instructions
      
         - Add selftest checks for a bunch of timer emulation corner cases
      
         - Fix multiple cases where KVM/arm64 doesn't correctly handle the
           guest trying to use a GICv3 that wasn't advertised
      
         - Remove REG_HIDDEN_USER from the sysreg infrastructure, making
           things a little simpler
      
         - Prevent MTE tags being restored by userspace if we are actively
           logging writes, as that's a recipe for disaster
      
         - Correct the refcount on a page that is not considered for MTE tag
           copying (such as a device)
      
         - When walking a page table to split block mappings, synchronize only
           at the end of the walk rather than on every store
      
         - Fix boundary check when transferring memory using FFA
      
         - Fix pKVM TLB invalidation, only affecting currently out of tree
           code but worth addressing for peace of mind
      
        LoongArch:
      
         - Revert qspinlock to test-and-set simple lock on VM.
      
         - Add Loongson Binary Translation extension support.
      
         - Add PMU support for guest.
      
         - Enable paravirt feature control from VMM.
      
         - Implement function kvm_para_has_feature().
      
        RISC-V:
      
         - Fix sbiret init before forwarding to userspace
      
         - Don't zero-out PMU snapshot area before freeing data
      
         - Allow legacy PMU access from guest
      
         - Fix to allow hpmcounter31 from the guest"
      
      * tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (64 commits)
        LoongArch: KVM: Implement function kvm_para_has_feature()
        LoongArch: KVM: Enable paravirt feature control from VMM
        LoongArch: KVM: Add PMU support for guest
        KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier
        KVM: arm64: Simplify visibility handling of AArch32 SPSR_*
        KVM: arm64: Simplify handling of CNTKCTL_EL12
        LoongArch: KVM: Add vm migration support for LBT registers
        LoongArch: KVM: Add Binary Translation extension support
        LoongArch: KVM: Add VM feature detection function
        LoongArch: Revert qspinlock to test-and-set simple lock on VM
        KVM: arm64: Register ptdump with debugfs on guest creation
        arm64: ptdump: Don't override the level when operating on the stage-2 tables
        arm64: ptdump: Use the ptdump description from a local context
        arm64: ptdump: Expose the attribute parsing functionality
        KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
        KVM: arm64: Move pagetable definitions to common header
        KVM: arm64: nv: Add support for FEAT_ATS1A
        KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
        KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3
        KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration
        ...
      64dd3b6a
    • Linus Torvalds's avatar
      Merge tag 'cmpxchg.2024.09.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 980bcd35
      Linus Torvalds authored
      Pull byte cmpxchg updates from Paul McKenney:
       "ARC/sh/xtensa: Provide one-byte cmpxchg emulation
      
        This series provides emulated one-byte cmpxchg() support for ARC, sh,
        and xtensa using the cmpxchg_emu_u8() function that uses a four-byte
        cmpxchg() to emulate the one-byte variant.
      
        This covers all architectures"
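
      The emulation idea described above can be sketched in userspace C. This
      is only an illustration under stated assumptions, not the kernel's
      cmpxchg_emu_u8(): the function name cmpxchg_u8_sketch() is hypothetical,
      the GCC/Clang __atomic builtins stand in for the kernel's four-byte
      cmpxchg(), and the byte is assumed to live inside a naturally aligned
      32-bit object.

```c
#include <stdint.h>

/* Overlay that lets one byte of a 32-bit word be edited without
 * depending on the machine's byte order. */
union u8_32 {
    uint8_t  b[4];
    uint32_t w;
};

/*
 * Userspace sketch (hypothetical name), not the kernel's cmpxchg_emu_u8().
 * Atomically: if *p == old, store new_val. Returns the byte value observed.
 * Assumes p points into a naturally aligned 32-bit object and that the
 * compiler provides the GCC/Clang __atomic builtins.
 */
static uint8_t cmpxchg_u8_sketch(uint8_t *p, uint8_t old, uint8_t new_val)
{
    uintptr_t addr = (uintptr_t)p;
    uint32_t *p32 = (uint32_t *)(addr & ~(uintptr_t)3); /* containing word */
    unsigned int i = addr & 3;                           /* byte index in word */
    union u8_32 cur, want;

    cur.w = __atomic_load_n(p32, __ATOMIC_RELAXED);
    for (;;) {
        if (cur.b[i] != old)
            return cur.b[i];      /* mismatch: report the value we saw */
        want.w = cur.w;
        want.b[i] = new_val;      /* change only the target byte */
        /* On failure the builtin refreshes cur.w with the current word,
         * so the loop retries against up-to-date neighbouring bytes. */
        if (__atomic_compare_exchange_n(p32, &cur.w, want.w, 0,
                                        __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
            return old;
    }
}
```

      The retry loop matters because a concurrent write to one of the three
      neighbouring bytes makes the four-byte compare fail even though the
      target byte still holds the expected value.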
      
      * tag 'cmpxchg.2024.09.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        xtensa: Emulate one-byte cmpxchg
        sh: Emulate one-byte cmpxchg
        ARC: Emulate one-byte cmpxchg
      980bcd35
    • Linus Torvalds's avatar
      Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 114143a5
      Linus Torvalds authored
      Pull arm64 updates from Will Deacon:
       "The highlights are support for Arm's "Permission Overlay Extension"
        using memory protection keys, support for running as a protected guest
        on Android as well as perf support for a bunch of new interconnect
        PMUs.
      
        Summary:
      
        ACPI:
         - Enable PMCG erratum workaround for HiSilicon HIP10 and 11
           platforms.
         - Ensure arm64-specific IORT header is covered by MAINTAINERS.
      
        CPU Errata:
         - Enable workaround for hardware access/dirty issue on Ampere-1A
           cores.
      
        Memory management:
         - Define PHYSMEM_END to fix a crash in the amdgpu driver.
         - Avoid tripping over invalid kernel mappings on the kexec() path.
         - Userspace support for the Permission Overlay Extension (POE) using
           protection keys.
      
        Perf and PMUs:
         - Add support for the "fixed instruction counter" extension in the
           CPU PMU architecture.
         - Extend and fix the event encodings for Apple's M1 CPU PMU.
         - Allow LSM hooks to decide on SPE permissions for physical
           profiling.
         - Add support for the CMN S3 and NI-700 PMUs.
      
        Confidential Computing:
         - Add support for booting an arm64 kernel as a protected guest under
           Android's "Protected KVM" (pKVM) hypervisor.
      
        Selftests:
         - Fix vector length issues in the SVE/SME sigreturn tests.
         - Fix build warning in the ptrace tests.
      
        Timers:
         - Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
           non-determinism arising from the architected counter.
      
        Miscellaneous:
         - Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
           don't succeed.
         - Minor fixes and cleanups"
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
        perf: arm-ni: Fix an NULL vs IS_ERR() bug
        arm64: hibernate: Fix warning for cast from restricted gfp_t
        arm64: esr: Define ESR_ELx_EC_* constants as UL
        arm64: pkeys: remove redundant WARN
        perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
        MAINTAINERS: List Arm interconnect PMUs as supported
        perf: Add driver for Arm NI-700 interconnect PMU
        dt-bindings/perf: Add Arm NI-700 PMU
        perf/arm-cmn: Improve format attr printing
        perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
        arm64/mm: use lm_alias() with addresses passed to memblock_free()
        mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
        arm64: Expose the end of the linear map in PHYSMEM_END
        arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
        arm64/mm: Delete __init region from memblock.reserved
        perf/arm-cmn: Support CMN S3
        dt-bindings: perf: arm-cmn: Add CMN S3
        perf/arm-cmn: Refactor DTC PMU register access
        perf/arm-cmn: Make cycle counts less surprising
        perf/arm-cmn: Improve build-time assertion
        ...
      114143a5
    • Linus Torvalds's avatar
      Merge tag 'mips_6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 8617d7d6
      Linus Torvalds authored
      Pull MIPS updates from Thomas Bogendoerfer:
      
       - use devm_clk_get_enabled() helper
      
       - prototype fixes
      
       - cleanup unused stuff
      
      * tag 'mips_6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        mips: Remove posix_types.h include from sigcontext.h
        bus: bt1-apb: change to use devm_clk_get_enabled() helper
        bus: bt1-axi: change to use devm_clk_get_enabled() helper
        MIPS: dec: prom: Remove unused unregister_prom_console() declaration
        MIPS: Remove unused mips_display/_scroll_message() declarations
        MIPS: Remove unused declarations in asm/cmp.h
        MIPS: MT: Remove unused function mips_mt_regdump()
        mips/jazz: remove unused jazz_handle_int() declaration
        MIPS: Remove unused function dump_au1000_dma_channel() in dma.c
        MIPS: ralink: Fix missing `get_c0_perfcount_int` prototype
        MIPS: ralink: Fix missing `plat_time_init` prototype
      8617d7d6
    • Linus Torvalds's avatar
      Merge tag 'x86_sgx_for_6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a4ebad65
      Linus Torvalds authored
      Pull x86 SGX updates from Dave Hansen:
       "These fix a deadlock in the SGX NUMA allocator.
      
        It's probably only triggerable today on servers with buggy BIOSes, but
        it's theoretically possible it can happen on less goofy systems"
      
      * tag 'x86_sgx_for_6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/sgx: Log information when a node lacks an EPC section
        x86/sgx: Fix deadlock in SGX NUMA node search
      a4ebad65
    • Linus Torvalds's avatar
      Merge tag 'x86_bugs_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 963d0d60
      Linus Torvalds authored
      Pull x86 hw mitigation updates from Borislav Petkov:
      
       - Add CONFIG_ option for every hw CPU mitigation. The intent is to
         support configurations and scenarios where the mitigations code is
         irrelevant
      
       - Other small fixlets and improvements
      
      * tag 'x86_bugs_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/bugs: Fix handling when SRSO mitigation is disabled
        x86/bugs: Add missing NO_SSB flag
        Documentation/srso: Document a method for checking safe RET operates properly
        x86/bugs: Add a separate config for GDS
        x86/bugs: Remove GDS Force Kconfig option
        x86/bugs: Add a separate config for SSB
        x86/bugs: Add a separate config for Spectre V2
        x86/bugs: Add a separate config for SRBDS
        x86/bugs: Add a separate config for Spectre v1
        x86/bugs: Add a separate config for RETBLEED
        x86/bugs: Add a separate config for L1TF
        x86/bugs: Add a separate config for MMIO Stable Data
        x86/bugs: Add a separate config for TAA
        x86/bugs: Add a separate config for MDS
      963d0d60
    • Linus Torvalds's avatar
      Merge tag 'x86_cpu_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d580d74e
      Linus Torvalds authored
      Pull x86 cpuid updates from Borislav Petkov:
      
       - Add the final conversions to the new Intel VFM CPU model matching
         macros which include the vendor and finally drop the old ones which
         hardcode family 6
      
      * tag 'x86_cpu_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu/vfm: Delete all the *_FAM6_ CPU #defines
        x86/cpu/vfm: Delete X86_MATCH_INTEL_FAM6_MODEL[_STEPPING]() macros
        extcon: axp288: Switch to new Intel CPU model defines
        x86/cpu/intel: Replace PAT erratum model/family magic numbers with symbolic IFM references
      d580d74e
    • Linus Torvalds's avatar
      Merge tag 'x86_sev_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b56dff26
      Linus Torvalds authored
      Pull x86 SEV updates from Borislav Petkov:
      
       - A bunch of cleanups to the sev-guest driver. All in preparation for
         future SEV work
      
      * tag 'x86_sev_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        virt: sev-guest: Ensure the SNP guest messages do not exceed a page
        virt: sev-guest: Fix user-visible strings
        virt: sev-guest: Rename local guest message variables
        virt: sev-guest: Replace dev_dbg() with pr_debug()
      b56dff26
    • Linus Torvalds's avatar
      Merge tag 'ras_core_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d0a63f0e
      Linus Torvalds authored
      Pull x86 RAS updates from Borislav Petkov:
      
       - Reorganize the struct mce populating functions so that MCA errors
         reported through BIOS' BERT method can report the correct CPU number
         the error has been detected on
      
      * tag 'ras_core_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Use mce_prep_record() helpers for apei_smca_report_x86_error()
        x86/mce: Define mce_prep_record() helpers for common and per-CPU fields
        x86/mce: Rename mce_setup() to mce_prep_record()
      d0a63f0e
    • Linus Torvalds's avatar
      Merge tag 'x86_microcode_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 79f1a6ad
      Linus Torvalds authored
      Pull x86 microcode loading updates from Borislav Petkov:
      
       - Simplify microcode patches loading on AMD Zen and newer by using the
         family, model and stepping encoded in the patch revision number
      
       - Fix a silly clang warning
      
      * tag 'x86_microcode_for_v6.12_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/microcode/AMD: Fix a -Wsometimes-uninitialized clang false positive
        x86/microcode/AMD: Use the family,model,stepping encoded in the patch ID
      79f1a6ad
    • Linus Torvalds's avatar
      Merge tag 'edac_updates_for_v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 7dfc15c4
      Linus Torvalds authored
      Pull EDAC updates from Borislav Petkov:
      
       - Drop a now obsolete ppc4xx_edac driver
      
       - Fix conversion to physical memory addresses on Intel's Elkhart Lake
         and Ice Lake hardware when the system address is above the
         (Top-Of-Memory) TOM address
      
       - Pay attention to the memory hole on Zynq UltraScale+ MPSoC DDR
         controllers when injecting errors for testing purposes
      
       - Add support for translating normalized error addresses reported by an
          AMD memory controller into system physical addresses using a UEFI
         mechanism called platform runtime mechanism (PRM).
      
       - The usual cleanups and fixes
      
      * tag 'edac_updates_for_v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC: Drop obsolete PPC4xx driver
        EDAC/sb_edac: Fix the compile warning of large frame size
        EDAC/{skx_common,i10nm}: Remove the AMAP register for determing DDR5
        EDAC/{skx_common,skx,i10nm}: Move the common debug code to skx_common
        EDAC/igen6: Fix conversion of system address to physical memory address
        EDAC/synopsys: Fix error injection on Zynq UltraScale+
        RAS/AMD/ATL: Translate normalized to system physical addresses using PRM
        ACPI: PRM: Add PRM handler direct call support
      7dfc15c4
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux · 1636f57c
      Linus Torvalds authored
      Pull ARM updates from Russell King:
      
       - clean up TTBCR magic numbers and use u32 for this register
      
       - fix clang issue in VFP code leading to kernel oops, caused by
         compiler instruction scheduling.
      
       - switch 32-bit Arm to use GENERIC_CPU_DEVICES and use the
         arch_cpu_is_hotpluggable() hook.
      
       - pass struct device to arm_iommu_create_mapping() and move over to use
         iommu_paging_domain_alloc() rather than iommu_domain_alloc()
      
       - make amba_bustype constant
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
        ARM: 9418/1: dma-mapping: Use iommu_paging_domain_alloc()
        ARM: 9417/1: dma-mapping: Pass device to arm_iommu_create_mapping()
        ARM: 9416/1: amba: make amba_bustype constant
        ARM: 9412/1: Convert to arch_cpu_is_hotpluggable()
        ARM: 9411/1: Switch over to GENERIC_CPU_DEVICES using arch_register_cpu()
        ARM: 9410/1: vfp: Use asm volatile in fmrx/fmxr macros
        ARM: 9409/1: mmu: Do not use magic number for TTBCR settings
      1636f57c
    • Linus Torvalds's avatar
      Merge tag 'v6.12-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 85ffc6e4
      Linus Torvalds authored
      Pull crypto update from Herbert Xu:
       "API:
         - Make self-test asynchronous
      
        Algorithms:
         - Remove MPI functions added for SM3
         - Add allocation error checks to remaining MPI functions (introduced
           for SM3)
         - Set default Jitter RNG OSR to 3
      
        Drivers:
         - Add hwrng driver for Rockchip RK3568 SoC
         - Allow disabling SR-IOV VFs through sysfs in qat
         - Fix device reset bugs in hisilicon
         - Fix authenc key parsing by using generic helper in octeontx*
      
        Others:
         - Fix xor benchmarking on parisc"
      
      * tag 'v6.12-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (96 commits)
        crypto: n2 - Set err to EINVAL if snprintf fails for hmac
        crypto: camm/qi - Use ERR_CAST() to return error-valued pointer
        crypto: mips/crc32 - Clean up useless assignment operations
        crypto: qcom-rng - rename *_of_data to *_match_data
        crypto: qcom-rng - fix support for ACPI-based systems
        dt-bindings: crypto: qcom,prng: document support for SA8255p
        crypto: aegis128 - Fix indentation issue in crypto_aegis128_process_crypt()
        crypto: octeontx* - Select CRYPTO_AUTHENC
        crypto: testmgr - Hide ENOENT errors
        crypto: qat - Remove trailing space after \n newline
        crypto: hisilicon/sec - Remove trailing space after \n newline
        crypto: algboss - Pass instance creation error up
        crypto: api - Fix generic algorithm self-test races
        crypto: hisilicon/qm - inject error before stopping queue
        crypto: hisilicon/hpre - mask cluster timeout error
        crypto: hisilicon/qm - reset device before enabling it
        crypto: hisilicon/trng - modifying the order of header files
        crypto: hisilicon - add a lock for the qp send operation
        crypto: hisilicon - fix missed error branch
        crypto: ccp - do not request interrupt on cmd completion when irqs disabled
        ...
      85ffc6e4
    • Linus Torvalds's avatar
      Merge tag 'net-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 94106455
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "The zero-copy changes are relatively significant, but regression risk
        should be contained. The feature needs to be used to cause trouble.
      
        Also it feels like we got an order of magnitude more semi-automated
        "refactoring" chaff than usual, I wonder if it's just us.
      
        Core & protocols:
      
         - Support Device Memory TCP, ability to zero-copy receive TCP
           payloads to a DMABUF region of memory while packet headers land
           separately in normal kernel buffers, and TCP processes them as
           usual.
      
         - The ability to read the PTP PHC (Physical Hardware Clock) alongside
           MONOTONIC_RAW timestamps with PTP_SYS_OFFSET_EXTENDED. Previously
           only CLOCK_REALTIME was supported.
      
         - Allow matching on all bits of IP DSCP for routing decisions.
           Previously we only supported matching on TOS bits in IPv4 which is
           a narrower interpretation of the same header field.
      
         - Increase the range of weights used for multi-path routing from
           8 bits to 16 bits.
      
         - Add support for IPv6 PIO p flag in the Prefix Information Option
           per draft-ietf-6man-pio-pflag.
      
         - IPv6 IOAM6 support for new tunsrc encap mode for better
           performance.
      
         - Detect destinations which blackhole MPTCP traffic and avoid
           initiating MPTCP connections to them for a certain period of time,
           1h by default.
      
         - Improve IPsec control path performance by removing the inexact
           policies list.
      
         - AF_VSOCK: add support for SIOCOUTQ ioctl.
      
         - Add enum for reasons TCP reset was sent for easier tracing.
      
         - Add SMC ringbufs usage statistics.
      
        Drivers:
      
         - Handle netconsole setup failures more gracefully, don't fail
           loading, retain the specified target as disabled.
      
         - Extend bonding's IPsec offload pass thru capabilities (ESN, stats).
      
        Filtering:
      
         - Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() to address the case
           when long-lived sockets miss a chance to set additional callbacks
           if a sockops program was not attached early in their lifetime.
      
         - Support using BPF skb helpers in tracepoints.
      
         - Conntrack Netlink: support CTA_FILTER for flush.
      
         - Improve SCTP support in nfnetlink_queue.
      
         - Improve performance of large nftables flush transactions.
      
        Things we sprinkled into general kernel code:
      
         - selftests: support setting an "interpreter" for script files,
           making it easy to run tests as separate cases when one
           "interpreter" is fed various test descriptions (in our case
           packet sequences).
      
        Driver API:
      
         - Extend core and ethtool APIs to support many PHYs connected to a
           single interface (PHY topologies).
      
         - Extend cable diagnostics to specify whether Time Domain
           Reflectometry (TDR) or Active Link Cable Diagnostic (ALCD) was
           used.
      
         - Add library for implementing MAC-PHY Ethernet drivers for SPI
           devices compatible with Open Alliance 10BASE-T1x MAC-PHY Serial
           Interface (TC6) standard.
      
         - Add helpers to the PHY framework, for PHYs following the Open
           Alliance standards:
             - 1000BaseT1 link settings
             - cable test and diagnostics
      
         - Support listing / dumping all allocated RSS contexts.
      
         - Add configuration for frequency Embedded SYNC in DPLL, which
           magically embeds sync pulses into Ethernet signaling.
      
        Device drivers:
      
         - Ethernet high-speed NICs:
            - Broadcom (bnxt):
               - use better FW APIs for queue reset
               - support QOS and TPID settings for the SR-IOV VLAN
               - support dynamic MSI-X allocation
            - Intel (100G, ice, idpf):
               - ice: support PCIe subfunctions
               - iavf: add support for TC U32 filters on VFs
               - ice: support Embedded SYNC in DPLL
            - nVidia/Mellanox (mlx5):
               - support HW managed steering tables
               - support PCIe PTM cross timestamping
            - AMD/Pensando:
               - ionic: use page_pool to increase Rx performance
            - Cisco (enic):
               - report per-queue statistics
      
         - Ethernet virtual:
            - Microsoft vNIC:
               - mana: support configuring ring length
               - netvsc: enable more channels on systems with many CPUs
            - IBM veth:
               - optimize polling to improve TCP_RR performance
               - optimize performance of Tx handling
            - VirtIO net:
               - synchronize the operstate with the admin state to allow a
                 lower virtio-net to propagate the link status to an upper
                 device like macvlan
      
         - Ethernet NICs, consumer and embedded:
            - Add driver for Realtek automotive PCIe devices (RTL9054,
              RTL9068, RTL9072, RTL9075, RTL9071)
            - Add driver for Microchip LAN8650/1 10BASE-T1S MAC-PHY.
            - Microchip:
               - lan743x: use phylink - support WOL, EEE, pause, link settings
               - add Wake-on-LAN support for KSZ87xx family
               - add KSZ8895/KSZ8864 switch support
               - factor out FDMA code and use it in sparx5 and lan966x
                 (including DCB support in both)
            - Synopsys (stmmac):
               - support frame preemption (configured using TC and ethtool)
               - support Loongson DWMAC (GMAC v3.73)
               - support Rockchip RK3576 DWMAC
            - TI:
               - am65-cpsw: add multi queue RX support
               - icssg-prueth: HSR offload support
            - Cadence (macb):
               - enable software (hrtimer based) IRQ coalescing by default
            - Xilinx (axienet):
               - expose HW statistics
               - improve multicast filtering
               - relax Rx checksum offload constraints
            - MediaTek:
               - mt7530: add EN7581 support
            - Aspeed (ftgmac100):
               - report link speed and duplex
            - Intel:
               - igc: add mqprio offload
               - igc: report EEE configuration
            - RealTek (r8169):
               - add support for RTL8126A rev.b
            - Vitesse (vsc73xx):
               - implement FDB add/del/dump operations
            - Freescale (fs_enet):
               - use phylink
      
         - Ethernet PHYs:
            - vitesse: implement downshift and MDI-X in vsc73xx PHYs
            - microchip: support LAN887x, supporting IEEE 802.3bw (100BASE-T1)
              and IEEE 802.3bp (1000BASE-T1) specifications
            - add Applied Micro QT2025 PHY driver (in Rust)
            - add Motorcomm yt8821 2.5G Ethernet PHY driver
      
         - CAN:
            - add driver for Rockchip RK3568 CAN-FD controller
            - flexcan: add wakeup support for imx95
            - kvaser_usb: set hardware timestamp on transmitted packets
      
         - WiFi:
            - mac80211/cfg80211:
               - EHT rate support in AQL airtime fairness
               - handle DFS (radar detection) per link in Multi-Link Operation
            - RealTek (rtw89):
               - support RTL8852BT and 8852BE-VT (WiFi 6)
               - support hardware rfkill
               - support HW encryption in unicast management frames
               - support Wake-on-WLAN with supported network detection
            - RealTek (rtw88):
               - improve Rx performance by using USB frame aggregation
               - support USB 3 with RTL8822CU/RTL8822BU
            - Intel (iwlwifi/mvm):
               - offload RLC/SMPS functionality to firmware
            - Marvell (mwifiex):
               - add host based MLME to enable WPA3
      
         - Bluetooth:
            - add support for Amlogic HCI UART protocol
            - add support for ISO data/packets to Intel and NXP drivers"
      
      * tag 'net-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1303 commits)
        net/mlx5: HWS, check the correct variable in hws_send_ring_alloc_sq()
        netfilter: nft_socket: Fix a NULL vs IS_ERR() bug in nft_socket_cgroup_subtree_level()
        ice: Fix a NULL vs IS_ERR() check in probe()
        ice: Fix a couple NULL vs IS_ERR() bugs
        net: ethernet: fs_enet: Make the per clock optional
        net: ti: icssg-prueth: Add multicast filtering support in HSR mode
        net: ti: icssg-prueth: Enable HSR Tx duplication, Tx Tag and Rx Tag offload
        net: ti: icssg-prueth: Add support for HSR frame forward offload
        net: ti: icssg-prueth: Stop hardcoding def_inc
        net: ti: icss-iep: Move icss_iep structure
        net: ibm: emac: get rid of wol_irq
        net: ibm: emac: remove all waiting code
        net: ibm: emac: replace of_get_property
        net: ibm: emac: use netdev's phydev directly
        net: ibm: emac: use devm for register_netdev
        net: ibm: emac: remove mii_bus with devm
        net: ibm: emac: use devm for of_iomap
        net: ibm: emac: manage emac_irq with devm
        net: ibm: emac: use devm for alloc_etherdev
        octeontx2-af: debugfs: Add Channel info to RPM map
        ...
      94106455
  2. 15 Sep, 2024 5 commits