1. 20 Aug, 2022 6 commits
    • mm/smaps: don't access young/dirty bit if pte unpresent · efd41493
      Peter Xu authored
      These bits are only valid when the ptes are present.  Introduce two
      booleans for them and set them to false when !pte_present(), for both
      pte and pmd accounting.
      
      The bug was found during code reading and no real-world issue has been
      reported, but logically such an error can cause incorrect readings in
      quite a few fields of either smaps or smaps_rollup output.
      
      For example, it could cause over-estimate on values like Shared_Dirty,
      Private_Dirty, Referenced.  Or it could also cause under-estimate on
      values like LazyFree, Shared_Clean, Private_Clean.
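
      The pattern described above can be sketched in userspace C.  This is
      a toy model, not the kernel code: struct toy_pte, struct toy_stats and
      toy_account() are hypothetical names, and the flags stand in for what
      pte_present(), pte_young() and pte_dirty() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a page-table entry; NOT the kernel's pte_t. */
struct toy_pte {
    bool present; /* entry maps a page */
    bool young;   /* accessed bit, only meaningful if present */
    bool dirty;   /* dirty bit, only meaningful if present */
};

/* Toy counterpart of a few smaps counters. */
struct toy_stats {
    size_t referenced;
    size_t dirty;
    size_t clean;
};

/*
 * Account one entry.  The young/dirty bits are sampled only when the
 * entry is present; for swap or migration entries they stay false, so
 * stale bits cannot inflate "referenced"/"dirty" or shrink "clean".
 */
static void toy_account(struct toy_stats *st, struct toy_pte pte, size_t size)
{
    bool young = false, dirty = false;

    if (pte.present) {
        young = pte.young;
        dirty = pte.dirty;
    }

    if (young)
        st->referenced += size;
    if (dirty)
        st->dirty += size;
    else
        st->clean += size;
}
```

      With a non-present entry whose stale bits happen to be set, the
      sketch counts the range as clean and unreferenced, which is the
      behaviour the commit message argues for.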
      
      Link: https://lkml.kernel.org/r/20220805160003.58929-1-peterx@redhat.com
      Fixes: b1d4d9e0 ("proc/smaps: carefully handle migration entries")
      Fixes: c94b6923 ("/proc/PID/smaps: Add PMD migration entry parsing")
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Yang Shi <shy828301@gmail.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: add DEVICE_ZONE to FOR_ALL_ZONES · a39c5d3c
      Hao Lee authored
      FOR_ALL_ZONES should be consistent with enum zone_type.  Otherwise,
      __count_zid_vm_events() can add the count to the wrong item when zid
      is ZONE_DEVICE.
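
      A minimal sketch of why the two lists have to agree, assuming that
      per-zone counters are indexed as "first item of the group plus zone
      id".  All TOY_* names are hypothetical and the zone list is
      shortened for illustration; this is not the kernel's enum.

```c
#include <assert.h>

/* Toy zone list; the per-zone event macro below must mirror it. */
enum toy_zone {
    TOY_ZONE_NORMAL,
    TOY_ZONE_MOVABLE,
    TOY_ZONE_DEVICE,
    TOY_MAX_ZONES
};

/* One event item per zone, in the same order as enum toy_zone. */
#define TOY_FOR_ALL_ZONES(xx) xx##_NORMAL, xx##_MOVABLE, xx##_DEVICE

enum toy_event {
    TOY_FOR_ALL_ZONES(TOY_PGALLOC),
    TOY_FOR_ALL_ZONES(TOY_PGSKIP),
    TOY_NR_EVENTS
};

/* Each group must span exactly TOY_MAX_ZONES slots. */
_Static_assert(TOY_PGSKIP_NORMAL - TOY_PGALLOC_NORMAL == TOY_MAX_ZONES,
               "per-zone event group out of sync with enum toy_zone");

static unsigned long toy_events[TOY_NR_EVENTS];

/* Index arithmetic: base item of the group plus the zone id. */
#define toy_count_zid_event(item, zid, delta) \
    (toy_events[item##_NORMAL - TOY_ZONE_NORMAL + (zid)] += (delta))
```

      If the _DEVICE entry were missing from TOY_FOR_ALL_ZONES, counting a
      TOY_PGALLOC event for TOY_ZONE_DEVICE would land in the
      TOY_PGSKIP_NORMAL slot, and the static assertion would fail at
      build time.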
      
      Link: https://lkml.kernel.org/r/20220807154442.GA18167@haolee.io
      Signed-off-by: Hao Lee <haolee.swjtu@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • kernel/sys_ni: add compat entry for fadvise64_64 · a8faed3a
      Randy Dunlap authored
      When CONFIG_ADVISE_SYSCALLS is not set/enabled and CONFIG_COMPAT is
      set/enabled, the riscv compat_syscall_table references
      'compat_sys_fadvise64_64', which is not defined:
      
      riscv64-linux-ld: arch/riscv/kernel/compat_syscall_table.o:(.rodata+0x6f8):
      undefined reference to `compat_sys_fadvise64_64'
      
      Add 'fadvise64_64' to kernel/sys_ni.c as a conditional COMPAT function so
      that when CONFIG_ADVISE_SYSCALLS is not set, there is a fallback function
      available.
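
      The idea of a fallback function can be illustrated with a weak
      symbol in userspace C.  This is a sketch under the assumption that
      the fallback is a weak definition that a "real" implementation
      overrides at link time; the toy_* names are hypothetical, and
      __attribute__((weak)) is a GCC/Clang extension.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a "not implemented" syscall handler. */
static long toy_ni_syscall(void)
{
    return -ENOSYS;
}

/*
 * Weak default.  If another object file provides a strong definition
 * of the same symbol, the linker picks that one instead; otherwise
 * this fallback resolves the reference and reports -ENOSYS.
 */
__attribute__((weak))
long toy_compat_sys_fadvise64_64(int fd, long long offset,
                                 long long len, int advice)
{
    (void)fd;
    (void)offset;
    (void)len;
    (void)advice;
    return toy_ni_syscall();
}

/* A table that references the symbol, like a syscall table would. */
static long (*const toy_compat_syscall_table[])(int, long long,
                                                long long, int) = {
    toy_compat_sys_fadvise64_64,
};
```

      Without any definition, the table entry would be an undefined
      reference at link time, which is the failure quoted above.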
      
      Link: https://lkml.kernel.org/r/20220807220934.5689-1-rdunlap@infradead.org
      Fixes: d3ac21ca ("mm: Support compiling out madvise and fadvise")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW · 5535be30
      David Hildenbrand authored
      Ever since the Dirty COW (CVE-2016-5195) security issue happened, we know
      that FOLL_FORCE can be possibly dangerous, especially if there are races
      that can be exploited by user space.
      
      Right now, it would be sufficient to have some code that sets a PTE of a
      R/O-mapped shared page dirty, in order for it to erroneously become
      writable by FOLL_FORCE.  The implications of setting a write-protected PTE
      dirty might not be immediately obvious to everyone.
      
      And in fact ever since commit 9ae0f87d ("mm/shmem: unconditionally set
      pte dirty in mfill_atomic_install_pte"), we can use UFFDIO_CONTINUE to map
      a shmem page R/O while marking the pte dirty.  This can be used by
      unprivileged user space to modify tmpfs/shmem file content even if the
      user does not have write permissions to the file, and to bypass memfd
      write sealing -- Dirty COW restricted to tmpfs/shmem (CVE-2022-2590).
      
      To fix such security issues for good, the insight is that we really only
      need that fancy retry logic (FOLL_COW) for COW mappings that are not
      writable (!VM_WRITE).  And in a COW mapping, we really only broke COW if
      we have an exclusive anonymous page mapped.  If we have something else
      mapped, or the mapped anonymous page might be shared (!PageAnonExclusive),
      we have to trigger a write fault to break COW.  If we don't find an
      exclusive anonymous page when we retry, we have to trigger COW breaking
      once again because something intervened.
      
      Let's move away from this mandatory-retry + dirty handling and rely on our
      PageAnonExclusive() flag for making a similar decision, to use the same
      COW logic as in other kernel parts here as well.  In case we stumble over
      a PTE in a COW mapping that does not map an exclusive anonymous page, COW
      was not properly broken and we have to trigger a fake write-fault to break
      COW.
      
      Just like we do in can_change_pte_writable() added via commit 64fe24a3
      ("mm/mprotect: try avoiding write faults for exclusive anonymous pages
      when changing protection") and commit 76aefad6 ("mm/mprotect: fix
      soft-dirty check in can_change_pte_writable()"), take care of softdirty
      and uffd-wp manually.
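
      The decision described in the paragraphs above can be modelled as a
      small predicate.  This is a toy sketch only: struct toy_mapping and
      toy_can_write_without_fault() are hypothetical, each boolean stands
      in for a kernel check named in the text, and the real code may
      perform additional checks that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy snapshot of the state consulted for one PTE; not kernel types. */
struct toy_mapping {
    bool pte_writable;          /* PTE already allows writes */
    bool vma_shared;            /* shared mapping, i.e. not a COW mapping */
    bool vma_write;             /* VM_WRITE is set on the mapping */
    bool anon_exclusive;        /* maps an exclusive anonymous page */
    bool needs_softdirty_fault; /* soft-dirty tracking still needs a fault */
    bool uffd_wp;               /* range is uffd-wp protected */
};

/*
 * Return true if the page may be written through this PTE without
 * triggering a (fake) write fault first.
 */
static bool toy_can_write_without_fault(const struct toy_mapping *m,
                                        bool foll_force)
{
    if (m->pte_writable)
        return true;
    if (!foll_force)
        return false;
    /* FOLL_FORCE only matters for COW mappings that are not writable. */
    if (m->vma_shared || m->vma_write)
        return false;
    /* COW counts as broken only for an exclusive anonymous page. */
    if (!m->anon_exclusive)
        return false;
    /* Soft-dirty and uffd-wp are taken care of manually. */
    if (m->needs_softdirty_fault)
        return false;
    return !m->uffd_wp;
}
```

      Note that the PTE dirty bit does not appear in the predicate at
      all: a dirty, write-protected PTE mapping a shared page yields
      false here, so a write fault is required.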
      
      For example, a write() via /proc/self/mem to a uffd-wp-protected range has
      to fail instead of silently granting write access and bypassing the
      userspace fault handler.  Note that FOLL_FORCE is not only used for debug
      access, but also triggered by applications without debug intentions, for
      example, when pinning pages via RDMA.
      
      This fixes CVE-2022-2590. Note that only x86_64 and aarch64 are
      affected, because only those support CONFIG_HAVE_ARCH_USERFAULTFD_MINOR.
      
      Fortunately, FOLL_COW is no longer required to handle FOLL_FORCE. So
      let's just get rid of it.
      
      Thanks to Nadav Amit for pointing out that the pte_dirty() check in
      FOLL_FORCE code is problematic and might be exploitable.
      
      Note 1: We don't check for the PTE being dirty because it doesn't matter
      	for making a "was COWed" decision anymore, and whoever modifies the
      	page has to set the page dirty either way.
      
      Note 2: Kernels before extended uffd-wp support and before
      	PageAnonExclusive (< 5.19) can simply revert the problematic
      	commit instead and be safe regarding UFFDIO_CONTINUE. A backport to
      	v5.19 requires minor adjustments due to lack of
      	vma_soft_dirty_enabled().
      
      Link: https://lkml.kernel.org/r/20220809205640.70916-1-david@redhat.com
      Fixes: 9ae0f87d ("mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte")
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: <stable@vger.kernel.org>	[5.16]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "zram: remove double compression logic" · 37887783
      Jiri Slaby authored
      This reverts commit e7be8d1d ("zram: remove double compression
      logic") as it causes zram failures.  It does not revert cleanly because
      PTR_ERR handling was introduced in the meantime; this is handled with
      appropriate IS_ERR checks.
      
      When under memory pressure, zs_malloc() can fail.  Before the above
      commit, the allocation was retried with direct reclaim enabled (GFP_NOIO).
      After the commit, it is not -- only __GFP_KSWAPD_RECLAIM is tried.
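
      The two-step allocation described above can be sketched as follows.
      This is a userspace toy: toy_alloc(), toy_alloc_with_retry() and
      the TOY_GFP_* modes are hypothetical, and the "under_pressure" flag
      merely simulates a fast-path failure; it is not how zs_malloc()
      behaves internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy allocation modes, loosely mirroring the two attempts. */
enum toy_gfp {
    TOY_GFP_NO_DIRECT_RECLAIM, /* fast attempt, may fail under pressure */
    TOY_GFP_DIRECT_RECLAIM     /* slow attempt, allowed to reclaim */
};

/* Simulated allocator: the fast mode fails when memory is tight. */
static void *toy_alloc(size_t size, enum toy_gfp mode, bool under_pressure)
{
    if (under_pressure && mode == TOY_GFP_NO_DIRECT_RECLAIM)
        return NULL;
    return malloc(size);
}

/* Try the cheap mode first, then retry with direct reclaim allowed. */
static void *toy_alloc_with_retry(size_t size, bool under_pressure)
{
    void *p = toy_alloc(size, TOY_GFP_NO_DIRECT_RECLAIM, under_pressure);

    if (!p)
        p = toy_alloc(size, TOY_GFP_DIRECT_RECLAIM, under_pressure);
    return p;
}
```

      Dropping the second attempt turns every fast-path failure into an
      error returned to the caller, which corresponds to the I/O errors
      reported below.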
      
      So when the failure occurs under memory pressure, the overlying
      filesystem such as ext2 (mounted by ext4 module in this case) can emit
      failures, making the (file)system unusable:
        EXT4-fs warning (device zram0): ext4_end_bio:343: I/O error 10 writing to inode 16386 starting block 159744)
        Buffer I/O error on device zram0, logical block 159744
      
      With direct reclaim, memory is really reclaimed and the allocation
      eventually succeeds.  In the worst case, the OOM killer is invoked, which
      is the proper outcome if the user sets up too large a zram device (in
      comparison to available RAM).
      
      This very diff doesn't apply to 5.19 (stable) cleanly (see PTR_ERR note
      above). Use revert of e7be8d1d directly.
      
      Link: https://bugzilla.suse.com/show_bug.cgi?id=1202203
      Link: https://lkml.kernel.org/r/20220810070609.14402-1-jslaby@suse.cz
      Fixes: e7be8d1d ("zram: remove double compression logic")
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Alexey Romanov <avromanov@sberdevices.ru>
      Cc: Dmitry Rokosov <ddrokosov@sberdevices.ru>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: <stable@vger.kernel.org>	[5.19]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • get_maintainer: add Alan to .get_maintainer.ignore · d10a72de
      Dan Carpenter authored
      Alan asked to be added to the .get_maintainer.ignore list.
      
      Link: https://lkml.kernel.org/r/YvN30KhO9aD5Sza9@kili
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  2. 14 Aug, 2022 10 commits
    • Linux 6.0-rc1 · 568035b0
      Linus Torvalds authored
    • radix-tree: replace gfp.h inclusion with gfp_types.h · 9f162193
      Yury Norov authored
      Radix tree header includes gfp.h for __GFP_BITS_SHIFT only. Now we
      have gfp_types.h for this.
      
      Fixes powerpc allmodconfig build:
      
         In file included from include/linux/nodemask.h:97,
                          from include/linux/mmzone.h:17,
                          from include/linux/gfp.h:7,
                          from include/linux/radix-tree.h:12,
                          from include/linux/idr.h:15,
                          from include/linux/kernfs.h:12,
                          from include/linux/sysfs.h:16,
                          from include/linux/kobject.h:20,
                          from include/linux/pci.h:35,
                          from arch/powerpc/kernel/prom_init.c:24:
         include/linux/random.h: In function 'add_latent_entropy':
      >> include/linux/random.h:25:46: error: 'latent_entropy' undeclared (first use in this function); did you mean 'add_latent_entropy'?
            25 |         add_device_randomness((const void *)&latent_entropy, sizeof(latent_entropy));
               |                                              ^~~~~~~~~~~~~~
               |                                              add_latent_entropy
         include/linux/random.h:25:46: note: each undeclared identifier is reported only once for each function it appears in
      Reported-by: kernel test robot <lkp@intel.com>
      CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 74cbb480
      Linus Torvalds authored
      Pull vfs lseek fix from Al Viro:
       "Fix proc_reg_llseek() breakage. Always had been possible if somebody
        left NULL ->proc_lseek, became a practical issue now"
      
      * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        take care to handle NULL ->proc_lseek()
    • take care to handle NULL ->proc_lseek() · 3f61631d
      Al Viro authored
      Easily done now, just by clearing FMODE_LSEEK in ->f_mode
      during proc_reg_open() for such entries.
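
      A toy model of the approach, assuming that seeking is refused with
      -ESPIPE whenever the seek capability flag is cleared.  The toy_*
      types and the TOY_FMODE_LSEEK value are hypothetical and only
      illustrate the flag handling; they are not the kernel structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_FMODE_LSEEK 0x4u /* hypothetical flag value */

struct toy_proc_ops {
    long (*proc_lseek)(long offset); /* may be NULL */
};

struct toy_file {
    unsigned int f_mode;
    const struct toy_proc_ops *ops;
};

/* Example seek handler: accepts any offset as the new position. */
static long toy_seek_identity(long offset)
{
    return offset;
}

/* Open: drop the seek capability for entries without ->proc_lseek. */
static void toy_proc_reg_open(struct toy_file *f,
                              const struct toy_proc_ops *ops)
{
    f->ops = ops;
    f->f_mode |= TOY_FMODE_LSEEK;
    if (!ops->proc_lseek)
        f->f_mode &= ~TOY_FMODE_LSEEK;
}

/* Seek: the NULL pointer is never reached once the flag is cleared. */
static long toy_llseek(struct toy_file *f, long offset)
{
    if (!(f->f_mode & TOY_FMODE_LSEEK))
        return -ESPIPE;
    return f->ops->proc_lseek(offset);
}
```

      Deciding once at open time keeps the seek path free of per-call
      NULL checks.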
      
      Fixes: 868941b1 "fs: remove no_llseek"
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 5d6a0f4d
      Linus Torvalds authored
      Pull more xen updates from Juergen Gross:
      
       - fix the handling of the "persistent grants" feature negotiation
         between Xen blkfront and Xen blkback drivers
      
       - a cleanup of xen.config and adding xen.config to Xen section in
         MAINTAINERS
      
       - support HVMOP_set_evtchn_upcall_vector, which is more compliant to
         "normal" interrupt handling than the global callback used up to now
      
       - further small cleanups
      
      * tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        MAINTAINERS: add xen config fragments to XEN HYPERVISOR sections
        xen: remove XEN_SCRUB_PAGES in xen.config
        xen/pciback: Fix comment typo
        xen/xenbus: fix return type in xenbus_file_read()
        xen-blkfront: Apply 'feature_persistent' parameter when connect
        xen-blkback: Apply 'feature_persistent' parameter when connect
        xen-blkback: fix persistent grants negotiation
        x86/xen: Add support for HVMOP_set_evtchn_upcall_vector
    • Merge tag 'perf-tools-fixes-for-v6.0-2022-08-13' of... · 96f86ff0
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.0-2022-08-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tool updates from Arnaldo Carvalho de Melo:
      
       - 'perf c2c' now supports ARM64, adjust its output to cope with
         differences with what is in x86_64. Now go find false sharing on
         ARM64 (at least Neoverse) as well!
      
       - Refactor the JSON processing, making the output more compact and thus
         reducing the size of the resulting perf binary
      
       - Improvements for 'perf offcpu' profiling, including tracking child
         processes
      
       - Update Intel JSON metrics and events files for broadwellde,
         broadwellx, cascadelakex, haswellx, icelakex, ivytown, jaketown,
         knightslanding, sapphirerapids, skylakex and snowridgex
      
       - Add 'perf stat' JSON output and a 'perf test' entry for it
      
       - Ignore memfd and anonymous mmap events if jitdump present
      
       - Refactor 'perf test' shell tests allowing subdirs
      
       - Fix an error handling path in 'parse_perf_probe_command()'
      
       - Fixes for the guest Intel PT tracing patchkit in the 1st batch of
         this merge window
      
       - Print debuginfod queries if -v option is used, to explain delays in
         processing when debuginfo servers are enabled to fetch DSOs with
         richer symbol tables
      
       - Improve error message for 'perf record -p not_existing_pid'
      
       - Fix openssl and libbpf feature detection
      
       - Add PMU pai_crypto event description for IBM z16 on 'perf list'
      
       - Fix typos and duplicated words on comments in various places
      
      * tag 'perf-tools-fixes-for-v6.0-2022-08-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (81 commits)
        perf test: Refactor shell tests allowing subdirs
        perf vendor events: Update events for snowridgex
        perf vendor events: Update events and metrics for skylakex
        perf vendor events: Update metrics for sapphirerapids
        perf vendor events: Update events for knightslanding
        perf vendor events: Update metrics for jaketown
        perf vendor events: Update metrics for ivytown
        perf vendor events: Update events and metrics for icelakex
        perf vendor events: Update events and metrics for haswellx
        perf vendor events: Update events and metrics for cascadelakex
        perf vendor events: Update events and metrics for broadwellx
        perf vendor events: Update metrics for broadwellde
        perf jevents: Fold strings optimization
        perf jevents: Compress the pmu_events_table
        perf metrics: Copy entire pmu_event in find metric
        perf pmu-events: Hide the pmu_events
        perf pmu-events: Don't assume pmu_event is an array
        perf pmu-events: Move test events/metrics to JSON
        perf test: Use full metric resolution
        perf pmu-events: Hide pmu_events_map
        ...
    • Merge tag 'powerpc-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · d785610f
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Ensure we never emit lwarx with EH=1 on 32-bit, because some 32-bit
         CPUs trap on it rather than ignoring it as they should.
      
       - Fix ftrace when building with clang, which was broken by some
         refactoring.
      
       - A couple of other minor fixes.
      
      Thanks to Christophe Leroy, Naveen N. Rao, Nick Desaulniers, Ondrej
      Mosnacek, Pali Rohár, Russell Currey, and Segher Boessenkool.
      
      * tag 'powerpc-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/kexec: Fix build failure from uninitialised variable
        powerpc/ppc-opcode: Fix PPC_RAW_TW()
        powerpc64/ftrace: Fix ftrace for clang builds
        powerpc: Make eh value more explicit when using lwarx
        powerpc: Don't hide eh field of lwarx behind a macro
        powerpc: Fix eh field when calling lwarx on PPC32
    • Merge tag 'pull-work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · aea23e7c
      Linus Torvalds authored
      Pull /proc/mounts fix from Al Viro:
       "Fix for /proc/mounts escaping - escape the '#' character too"
      
      * tag 'pull-work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        vfs: escape hash as well
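
      As an illustration of the escaping mentioned above, here is a small
      userspace sketch that writes selected characters as a backslash
      followed by three octal digits.  toy_escape() is a hypothetical
      helper, and the exact escape set used for /proc/mounts is an
      assumption here (space, tab, newline, backslash and '#').

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, replacing every character found in esc with
 * "\ooo" (three octal digits).  Output is truncated at a character
 * boundary if dst is too small.  Returns the number of bytes written,
 * not counting the terminating NUL.
 */
static size_t toy_escape(char *dst, size_t dst_len,
                         const char *src, const char *esc)
{
    size_t out = 0;

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;

        if (strchr(esc, c)) {
            if (out + 4 >= dst_len)
                break; /* no room for "\ooo" plus NUL */
            out += (size_t)snprintf(dst + out, dst_len - out,
                                    "\\%03o", (unsigned int)c);
        } else {
            if (out + 1 >= dst_len)
                break; /* no room for the character plus NUL */
            dst[out++] = (char)c;
        }
    }
    if (dst_len)
        dst[out] = '\0';
    return out;
}
```

      With the escape set " \t\n\\#", the input "a#b c" becomes
      "a\043b\040c", so a '#' in a mount point can no longer be mistaken
      for the start of a comment by parsers that treat it that way.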
    • Merge tag '5.20-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · 332019e2
      Linus Torvalds authored
      Pull more cifs updates from Steve French:
      
       - two fixes for stable, one for a lock length miscalculation, and
         another fixes a lease break timeout bug
      
       - improvement to handle leases, allows the close timeout to be
         configured more safely
      
       - five restructuring/cleanup patches
      
      * tag '5.20-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Do not access tcon->cfids->cfid directly from is_path_accessible
        cifs: Add constructor/destructors for tcon->cfid
        SMB3: fix lease break timeout when multiple deferred close handles for the same file.
        smb3: allow deferred close timeout to be configurable
        cifs: Do not use tcon->cfid directly, use the cfid we get from open_cached_dir
        cifs: Move cached-dir functions into a separate file
        cifs: Remove {cifs,nfs}_fscache_release_page()
        cifs: fix lock length calculation
    • afs: Enable multipage folio support · 8549a263
      David Howells authored
      Enable multipage folio support for the afs filesystem.
      
      Support has already been implemented in netfslib, fscache and cachefiles
      and in most of afs, but I've waited for Matthew Wilcox's latest folio
      changes.
      
      Note that it does require a change to afs_write_begin() to return the
      correct subpage.  This is a "temporary" change as we're working on
      getting rid of the need for ->write_begin() and ->write_end()
      completely, at least as far as network filesystems are concerned - but
      it doesn't prevent afs from making use of the capability.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Tested-by: kafs-testing@auristor.com
      Cc: Marc Dionne <marc.dionne@auristor.com>
      Cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/lkml/2274528.1645833226@warthog.procyon.org.uk/
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 13 Aug, 2022 24 commits