1. 15 Jul, 2021 12 commits
  2. 14 Jul, 2021 10 commits
    • James Clark's avatar
      perf cs-etm: Split Coresight decode by aux records · 83d1fc92
      James Clark authored
      Populate the auxtrace queues using AUX records rather than whole
      auxtrace buffers so that the decoder is reset between each aux record.
      
      This is similar to the auxtrace_queues__process_index() ->
      auxtrace_queues__add_indexed_event() flow where
      perf_session__peek_event() is used to read AUXTRACE events out of random
      positions in the file based on the auxtrace index.
      
      But now we loop over all PERF_RECORD_AUX events instead of AUXTRACE
      buffers. For each PERF_RECORD_AUX event, we find the corresponding
      AUXTRACE buffer using the index, and add a fragment of that buffer to
      the auxtrace queues.
      
      No other changes to decoding were made, apart from populating the
      auxtrace queues. The result of decoding is identical to before, except
      in cases where decoding failed completely, due to not resetting the
      decoder.
      
      The reason for this change is that AUX records are emitted any time
      tracing is disabled, for example when the process is scheduled out.
      Because ETM was disabled and enabled again, the decoder also needs to be
      reset to force the search for a sync packet. Otherwise there would be
      fatal decoding errors.
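The flow described above (loop over AUX records, find the AUXTRACE buffer that contains each one, queue only that fragment) can be sketched as follows. This is a simplified illustration, not the code from the patch: `struct aux_record`, `struct aux_buffer`, `struct fragment` and `split_by_aux_records()` are hypothetical names, and real perf data needs extra handling (snapshot mode, wrapped buffers) that is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are not perf's real structures. */
struct aux_record { uint64_t offset; uint64_t size; };	/* one PERF_RECORD_AUX event */
struct aux_buffer { uint64_t offset; uint64_t size; };	/* one AUXTRACE buffer from the index */
struct fragment {
	size_t buffer_idx;	/* which AUXTRACE buffer the fragment lives in */
	uint64_t start;		/* start of the fragment, relative to that buffer */
	uint64_t size;		/* number of bytes covered by the AUX record */
};

/* Return the buffer that fully contains the record, or NULL if there is none. */
static const struct aux_buffer *find_buffer(const struct aux_buffer *bufs,
					    size_t nbufs,
					    const struct aux_record *rec)
{
	for (size_t i = 0; i < nbufs; i++) {
		uint64_t buf_end = bufs[i].offset + bufs[i].size;

		if (rec->offset >= bufs[i].offset &&
		    rec->offset + rec->size <= buf_end)
			return &bufs[i];
	}
	return NULL;
}

/*
 * Produce one fragment per non-empty AUX record. Each fragment would then be
 * queued separately, so that the decoder can be reset between fragments.
 * Returns the number of fragments written to 'out'.
 */
size_t split_by_aux_records(const struct aux_buffer *bufs, size_t nbufs,
			    const struct aux_record *recs, size_t nrecs,
			    struct fragment *out, size_t max_out)
{
	size_t n = 0;

	for (size_t i = 0; i < nrecs && n < max_out; i++) {
		const struct aux_buffer *buf;

		if (recs[i].size == 0)	/* zero-size records carry no trace data */
			continue;

		buf = find_buffer(bufs, nbufs, &recs[i]);
		if (!buf)		/* no matching buffer: nothing to queue */
			continue;

		out[n].buffer_idx = (size_t)(buf - bufs);
		out[n].start = recs[i].offset - buf->offset;
		out[n].size = recs[i].size;
		n++;
	}
	return n;
}
```

With two buffers covering offsets 0..99 and 100..149, the records (0, 40), (40, 60) and (100, 50) would yield three fragments, two of them inside the first buffer.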
      
      Testing
      =======
      
      Testing was done with the following script, to diff the decoding results
      between the patched and un-patched versions of perf:
      
      	#!/bin/bash
      	set -ex
      
      	$1 script -i $3 $4 > split.script
      	$2 script -i $3 $4 > default.script
      
      	diff split.script default.script | head -n 20
      
      And it was run like this, with various itrace options depending on the
      quantity of synthesised events:
      
      	compare.sh ./perf-patched ./perf-default perf-per-cpu-2-threads.data --itrace=i100000ns
      
      No changes in output were observed in the following scenarios:
      
      * Simple per-cpu
      	perf record -e cs_etm/@tmc_etr0/u top
      
      * Per-thread, single thread
      	perf record -e cs_etm/@tmc_etr0/u --per-thread ./threads_C
      
      * Per-thread multiple threads (but only one thread collected data):
      	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597
      
      * Per-thread multiple threads (both threads collected data):
      	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597
      
      * Per-cpu explicit threads:
      	perf record -e cs_etm/@tmc_etr0/u --pid 853,854
      
      * System-wide (per-cpu):
      	perf record -e cs_etm/@tmc_etr0/u -a
      
      * No data collected (no aux buffers)
      	Can happen with any command when run for a short period
      
      * Containing truncated records
      	Can happen with any command
      
      * Containing aux records with 0 size
      	Can happen with any command
      
      * Snapshot mode (various files with and without buffer wrap)
      	perf record -e cs_etm/@tmc_etr0/u -a --snapshot
      
      Some differences were observed in the following scenario:
      
      * Snapshot mode (with duplicate buffers)
      	perf record -e cs_etm/@tmc_etr0/u -a --snapshot
      
      Fewer samples are generated in snapshot mode if duplicate buffers
      were gathered because buffers with the same offset are now only added
      once. This gives different, but more correct results and no duplicate
      data is decoded any more.
      Signed-off-by: James Clark <james.clark@arm.com>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Branislav Rankov <branislav.rankov@arm.com>
      Cc: Denis Nikitin <denik@chromium.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20210624164303.28632-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      tools headers: Remove broken definition of __LITTLE_ENDIAN · fa2c02e5
      Arnaldo Carvalho de Melo authored
      The linux/kconfig.h file was copied from the kernel, but the line that
      includes generated/autoconf.h, where the CONFIG_ entries would come
      from, was deleted, as the tools/ build system doesn't create that
      file. So we ended up always defining just __LITTLE_ENDIAN, as
      CONFIG_CPU_BIG_ENDIAN was nowhere to be found.
      
      This in turn ended up breaking the build on some systems where
      __LITTLE_ENDIAN was already defined, such as the Android NDK.
      
      So just ditch that block that depends on the CONFIG_CPU_BIG_ENDIAN
      define.
      
      The kconfig.h file was copied just to get IS_ENABLED(), and a
      'make -C tools/all' doesn't break with this removal.
      
      Fixes: 93281c4a ("x86/insn: Add an insn_decode() API")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/YO8hK7lqJcIWuBzx@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      perf sched: Cast PTHREAD_STACK_MIN to int as it may turn into sysconf(__SC_THREAD_STACK_MIN_VALUE) · d08c84e0
      Arnaldo Carvalho de Melo authored
      In fedora rawhide the PTHREAD_STACK_MIN define may end up expanded to a
      sysconf() call, and that will return 'long int', breaking the build:
      
          45 fedora:rawhide                : FAIL gcc version 11.1.1 20210623 (Red Hat 11.1.1-6) (GCC)
            builtin-sched.c: In function 'create_tasks':
            /git/perf-5.14.0-rc1/tools/include/linux/kernel.h:43:24: error: comparison of distinct pointer types lacks a cast [-Werror]
               43 |         (void) (&_max1 == &_max2);              \
                  |                        ^~
            builtin-sched.c:673:34: note: in expansion of macro 'max'
              673 |                         (size_t) max(16 * 1024, PTHREAD_STACK_MIN));
                  |                                  ^~~
            cc1: all warnings being treated as errors
      
        $ grep __sysconf /usr/include/*/*.h
        /usr/include/bits/pthread_stack_min-dynamic.h:extern long int __sysconf (int __name) __THROW;
        /usr/include/bits/pthread_stack_min-dynamic.h:#   define PTHREAD_STACK_MIN __sysconf (__SC_THREAD_STACK_MIN_VALUE)
        /usr/include/bits/time.h:extern long int __sysconf (int);
        /usr/include/bits/time.h:# define CLK_TCK ((__clock_t) __sysconf (2))	/* 2 is _SC_CLK_TCK */
        $
      
      So cast it to int to cope with that.
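The type mismatch can be reproduced in isolation. The sketch below models the type-checked max() macro on the fragment quoted in the error output above (it relies on GNU C statement expressions and `__typeof__`). `demo_sysconf_stack_min()`, `DEMO_STACK_MIN` and `demo_stack_size()` are hypothetical stand-ins for a PTHREAD_STACK_MIN that expands to a call returning 'long int'; they are not taken from builtin-sched.c.

```c
#include <stddef.h>

/* Type-checked max(): comparing the two pointers warns when the types differ. */
#define max(x, y) ({				\
	__typeof__(x) _max1 = (x);		\
	__typeof__(y) _max2 = (y);		\
	(void) (&_max1 == &_max2);		\
	_max1 > _max2 ? _max1 : _max2; })

/* Hypothetical stand-in for a sysconf()-style call: note the 'long' return type. */
static long demo_sysconf_stack_min(void)
{
	return 8192;
}

/* Hypothetical stand-in for a PTHREAD_STACK_MIN that is no longer a constant. */
#define DEMO_STACK_MIN demo_sysconf_stack_min()

size_t demo_stack_size(void)
{
	/*
	 * 16 * 1024 is an 'int'. Without the (int) cast the second argument
	 * would be a 'long', and the pointer comparison inside max() would
	 * raise "comparison of distinct pointer types lacks a cast".
	 */
	return (size_t) max(16 * 1024, (int)DEMO_STACK_MIN);
}
```

Casting down to int keeps both arguments the same type; it assumes the minimum stack size fits in an int, which holds for the values involved here.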
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Heiko Carstens's avatar
      libperf: Fix build error with LIBPFM4=1 · 50e98924
      Heiko Carstens authored
      Fix build error with LIBPFM4=1:
      
          CC      util/pfm.o
        util/pfm.c: In function ‘parse_libpfm_events_option’:
        util/pfm.c:102:30: error: ‘struct evsel’ has no member named ‘leader’
          102 |                         evsel->leader = grp_leader;
              |                              ^~
      
      Committer notes:
      
      There is this entry in 'make -C tools/perf build-test' to test the build
      with libpfm:
      
        $ grep libpfm tools/perf/tests/make
        make_with_libpfm4   := LIBPFM4=1
        run += make_with_libpfm4
        $
      
      But the test machine lacked libpfm-devel. Now it's installed, so further
      cases like this shouldn't happen.
      
      Committer testing:
      
      Before this patch the build with LIBPFM4=1 fails; after applying it:
      
        $ make -C tools/perf build-test
        make: Entering directory '/var/home/acme/git/perf/tools/perf'
        - tarpkg: ./tests/perf-targz-src-pkg .
                         make_static: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1 -j24  DESTDIR=/tmp/tmp.KzFSfvGRQa
        <SNIP>
                   make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                 make_with_libpfm4_O: make LIBPFM4=1
               make_install_prefix_O: make install prefix=/tmp/krava
                  make_no_auxtrace_O: make NO_AUXTRACE=1
        <SNIP>
        $ rpm -q libpfm-devel
        libpfm-devel-4.11.0-4.fc34.x86_64
        $
      
      FIXME:
      
      This shows a need for 'build-test' to bail out when a build option is
      specified whose required library devel files are not installed.
      
      Fixes: fba7c866 ("libperf: Move 'leader' from tools/perf to perf_evsel::leader")
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210713091907.1555560-1-hca@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync files changed by the memfd_secret new syscall · 376a9476
      Arnaldo Carvalho de Melo authored
      To pick the changes in this cset:
      
        7bb7f2ac ("arch, mm: wire up memfd_secret system call where relevant")
      
      That silences these perf build warnings and adds support for those new
      syscalls in tools such as 'perf trace'.
      
      For instance, this is now possible:
      
        # perf trace -v -e memfd_secret
        event qualifier tracepoint filter: (common_pid != 13375 && common_pid != 3713) && (id == 447)
        ^C#
      
      That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
      tracepoints.
      
        $ grep memfd_secret tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
        447    common  memfd_secret            sys_memfd_secret
        $
      
      This addresses these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/unistd.h' differs from latest version at 'arch/arm64/include/uapi/asm/unistd.h'
        diff -u tools/arch/arm64/include/uapi/asm/unistd.h arch/arm64/include/uapi/asm/unistd.h
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
        diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
        Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Rapoport <rppt@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf stat: Merge uncore events by default for hybrid platform · e0a7ef2a
      Jin Yao authored
      On a hybrid platform, by default 'perf stat' aggregates and reports the
      event counts per PMU. For example,
      
        # perf stat -e cycles -a true
      
         Performance counter stats for 'system wide':
      
                 1,400,445      cpu_core/cycles/
                   680,881      cpu_atom/cycles/
      
               0.001770773 seconds time elapsed
      
      But for uncore events that's not a suitable method. Uncore has nothing
      to do with hybrid. So for uncore events, we aggregate event counts from
      all PMUs and report the counts without PMUs.
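The aggregation described here amounts to summing the counts of all PMU instances that share the same event configuration. The sketch below illustrates that idea only; `struct pmu_count` and `merge_uncore_counts()` are hypothetical names and this is not how 'perf stat' implements the merge internally.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one count per PMU instance, keyed by the event string. */
struct pmu_count {
	const char *event;	/* event configuration, without the PMU instance name */
	uint64_t count;
};

/*
 * Merge entries that share the same event string, in place.
 * The first occurrence of each event keeps the running total.
 * Returns the number of entries left after merging.
 */
size_t merge_uncore_counts(struct pmu_count *c, size_t n)
{
	size_t merged = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		/* Look for an already merged entry with the same event. */
		for (j = 0; j < merged; j++) {
			if (strcmp(c[j].event, c[i].event) == 0) {
				c[j].count += c[i].count;
				break;
			}
		}
		/* Not seen before: keep it as a new merged entry. */
		if (j == merged)
			c[merged++] = c[i];
	}
	return merged;
}
```

Feeding it the four per-instance counts from a run like the "Before" output below (2,058 and 2,028 for event 0x81, two zeros for event 0x84) would leave two entries, with 4,086 for the first event.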
      
      Before:
      
        # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
      
         Performance counter stats for 'system wide':
      
                     2,058      uncore_arb_0/event=0x81,umask=0x1/
                     2,028      uncore_arb_1/event=0x81,umask=0x1/
                         0      uncore_arb_0/event=0x84,umask=0x1/
                         0      uncore_arb_1/event=0x84,umask=0x1/
      
               0.000614498 seconds time elapsed
      
      After:
      
        # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
      
         Performance counter stats for 'system wide':
      
                     3,996      arb/event=0x81,umask=0x1/
                         0      arb/event=0x84,umask=0x1/
      
               0.000630046 seconds time elapsed
      
      Of course, we also keep '--no-merge' working for uncore events.
      
        # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true
      
         Performance counter stats for 'system wide':
      
                     1,952      uncore_arb_0/event=0x81,umask=0x1/
                     1,921      uncore_arb_1/event=0x81,umask=0x1/
                         0      uncore_arb_0/event=0x84,umask=0x1/
                         0      uncore_arb_1/event=0x84,umask=0x1/
      
               0.000575536 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210707055652.962-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf tests: Fix 'Convert perf time to TSC' on core-only system · de3d5fd8
      Jin Yao authored
      If the atom CPUs are offlined, the 'cpu_atom' PMU is not valid, so
      the test case for 'cpu_atom' is not needed.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210708013701.20347-5-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf tests: Fix 'Roundtrip evsel->name' on core-only system · 212f3d97
      Jin Yao authored
      If the atom CPUs are offlined, the 'cpu_atom' PMU is not valid.
      Perf will then not create two events for one hw event, so
      evsel->idx doesn't need to be divided by 2 before comparing.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210708013701.20347-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf tests: Fix 'Parse event definition strings' on core-only system · 490e9a8f
      Jin Yao authored
      If the atom CPUs are offlined, the 'cpu_atom' PMU is not valid, so
      the test case for 'cpu_atom' is not needed.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210708013701.20347-3-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf pmu: Skip invalid hybrid pmu · 49afa7f6
      Jin Yao authored
      On a hybrid platform such as Alder Lake, if the atom CPUs are offlined,
      the kernel still exports the sysfs path '/sys/devices/cpu_atom/' for
      the 'cpu_atom' PMU, but the file '/sys/devices/cpu_atom/cpus' is empty,
      which indicates that this is an invalid PMU.

      So check for and skip such invalid hybrid PMUs.
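The validity test boils down to checking whether the PMU's 'cpus' file has any content. A minimal sketch of that check follows; `cpus_stream_has_cpus()` and `hybrid_pmu_is_valid()` are hypothetical helpers written for illustration and are not the functions perf uses to read the CPU map.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

/* True if the stream contains anything other than whitespace. */
bool cpus_stream_has_cpus(FILE *fp)
{
	int ch;

	while ((ch = fgetc(fp)) != EOF) {
		if (!isspace(ch))
			return true;
	}
	return false;
}

/*
 * Hypothetical helper: treat a hybrid PMU as valid only if its 'cpus'
 * file (for example '/sys/devices/cpu_atom/cpus') exists and lists at
 * least one CPU. A missing or empty file means the PMU should be skipped.
 */
bool hybrid_pmu_is_valid(const char *cpus_path)
{
	FILE *fp = fopen(cpus_path, "r");
	bool valid;

	if (!fp)
		return false;

	valid = cpus_stream_has_cpus(fp);
	fclose(fp);
	return valid;
}
```

A caller listing events would then skip any hybrid PMU for which `hybrid_pmu_is_valid()` returns false, which matches the "After" output below where only cpu_core events remain.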
      
      Before:
      
        # perf list
        ...
        branch-instructions OR cpu_atom/branch-instructions/ [Kernel PMU event]
        branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
        branch-misses OR cpu_atom/branch-misses/           [Kernel PMU event]
        branch-misses OR cpu_core/branch-misses/           [Kernel PMU event]
        bus-cycles OR cpu_atom/bus-cycles/                 [Kernel PMU event]
        bus-cycles OR cpu_core/bus-cycles/                 [Kernel PMU event]
        ...
      
      The cpu_atom events are still displayed even if atom CPUs are offlined.
      
      After:
      
        # perf list
        ...
        branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
        branch-misses OR cpu_core/branch-misses/           [Kernel PMU event]
        bus-cycles OR cpu_core/bus-cycles/                 [Kernel PMU event]
        ...
      
      Now only cpu_core events are displayed.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210708013701.20347-2-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 13 Jul, 2021 2 commits
    • Linus Torvalds's avatar
      Merge tag 'vboxsf-v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hansg/linux · 40226a3d
      Linus Torvalds authored
      Pull vboxsf fixes from Hans de Goede:
       "This adds support for the atomic_open directory-inode op to vboxsf.
      
        Note that this is not just an enhancement; it also fixes an actual
        issue which users are hitting, see the commit message of the "vboxsf:
        Add support for the atomic_open directory-inode op" patch"
      
      * tag 'vboxsf-v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hansg/linux:
        vboxsf: Add support for the atomic_open directory-inode op
        vboxsf: Add vboxsf_[create|release]_sf_handle() helpers
        vboxsf: Make vboxsf_dir_create() return the handle for the created file
        vboxsf: Honor excl flag to the dir-inode create op
    • Linus Torvalds's avatar
      Merge tag 'for-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · f02bf857
      Linus Torvalds authored
      Pull btrfs zoned mode fixes from David Sterba:
      
       - fix deadlock when allocating system chunk
      
       - fix wrong mutex unlock on an error path
      
       - fix extent map splitting for append operation
      
       - update and fix message reporting unusable chunk space
      
       - don't block when background zone reclaim runs with balance in
         parallel
      
      * tag 'for-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: zoned: fix wrong mutex unlock on failure to allocate log root tree
        btrfs: don't block if we can't acquire the reclaim lock
        btrfs: properly split extent_map for REQ_OP_ZONE_APPEND
        btrfs: rework chunk allocation to avoid exhaustion of the system chunk array
        btrfs: fix deadlock with concurrent chunk allocations involving system chunks
        btrfs: zoned: print unusable percentage when reclaiming block groups
        btrfs: zoned: fix types for u64 division in btrfs_reclaim_bgs_work
  4. 12 Jul, 2021 3 commits
  5. 11 Jul, 2021 11 commits
    • Linus Torvalds's avatar
      Linux 5.14-rc1 · e73f0f0e
      Linus Torvalds authored
    • Hugh Dickins's avatar
      mm/rmap: try_to_migrate() skip zone_device !device_private · 6c855fce
      Hugh Dickins authored
      I know nothing about zone_device pages and !device_private pages; but if
      try_to_migrate_one() will do nothing for them, then it's better that
      try_to_migrate() filter them first, than trawl through all their vmas.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Link: https://lore.kernel.org/lkml/1241d356-8ec9-f47b-a5ec-9b2bf66d242@google.com/
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Hugh Dickins's avatar
      mm/rmap: fix new bug: premature return from page_mlock_one() · 023e1a8d
      Hugh Dickins authored
      In the unlikely race case that page_mlock_one() finds VM_LOCKED has been
      cleared by the time it got page table lock, page_vma_mapped_walk_done()
      must be called before returning, either explicitly, or by a final call
      to page_vma_mapped_walk() - otherwise the page table remains locked.
      
      Fixes: cd62734c ("mm/rmap: split try_to_munlock from try_to_unmap")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Link: https://lore.kernel.org/lkml/20210711151446.GB4070@xsang-OptiPlex-9020/
      Link: https://lore.kernel.org/lkml/f71f8523-cba7-3342-40a7-114abc5d1f51@google.com/
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Hugh Dickins's avatar
      mm/rmap: fix old bug: munlocking THP missed other mlocks · d9770fcc
      Hugh Dickins authored
      The kernel recovers in due course from missing Mlocked pages: but there
      was no point in calling page_mlock() (formerly known as
      try_to_munlock()) on a THP, because nothing got done even when it was
      found to be mapped in another VM_LOCKED vma.
      
      It's true that we need to be careful: Mlocked accounting of pte-mapped
      THPs is too difficult (so consistently avoided); but Mlocked accounting
      of only-pmd-mapped THPs is supposed to work, even when multiple mappings
      are mlocked and munlocked or munmapped.  Refine the tests.
      
      There is already a VM_BUG_ON_PAGE(PageDoubleMap) in page_mlock(), so
      page_mlock_one() does not even have to worry about that complication.
      
      (I said the kernel recovers: but would page reclaim be likely to split
      THP before rediscovering that it's VM_LOCKED? I've not followed that up)
      
      Fixes: 9a73f61b ("thp, mlock: do not mlock PTE-mapped file huge pages")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Link: https://lore.kernel.org/lkml/cfa154c-d595-406-eb7d-eb9df730f944@google.com/
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Hugh Dickins's avatar
      mm/rmap: fix comments left over from recent changes · 64b586d1
      Hugh Dickins authored
      Parallel developments in mm/rmap.c have left behind some out-of-date
      comments: try_to_migrate_one() also accepts TTU_SYNC (already commented
      in try_to_migrate() itself), and try_to_migrate() returns nothing at
      all.
      
      TTU_SPLIT_FREEZE has just been deleted, so reword the comment about it
      in mm/huge_memory.c; and TTU_IGNORE_ACCESS was removed in 5.11, so
      delete the "recently referenced" comment from try_to_unmap_one() (once
      upon a time the comment was near the removed codeblock, but they drifted
      apart).
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Link: https://lore.kernel.org/lkml/563ce5b2-7a44-5b4d-1dfd-59a0e65932a9@google.com/
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      Merge tag 'irq-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 98f7fdce
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
       "Two fixes:
      
         - Fix a MIPS IRQ handling RCU bug
      
         - Remove a DocBook annotation for a parameter that doesn't exist
           anymore"
      
      * tag 'irq-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entry
        genirq/irqdesc: Drop excess kernel-doc entry @lookup
    • Linus Torvalds's avatar
      Merge tag 'sched-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 877029d9
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Three fixes:
      
         - Fix load tracking bug/inconsistency
      
         - Fix a sporadic CFS bandwidth constraints enforcement bug
      
         - Fix a uclamp utilization tracking bug for newly woken tasks"
      
      * tag 'sched-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/uclamp: Ignore max aggregation if rq is idle
        sched/fair: Fix CFS bandwidth hrtimer expiry type
        sched/fair: Sync load_sum with load_avg after dequeue
    • Linus Torvalds's avatar
      Merge tag 'perf-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 936b664f
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "A fix and a hardware-enablement addition:
      
         - Robustify uncore_snbep's skx_iio_set_mapping()'s error cleanup
      
         - Add cstate event support for Intel ICELAKE_X and ICELAKE_D"
      
      * tag 'perf-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Clean up error handling path of iio mapping
        perf/x86/cstate: Add ICELAKE_X and ICELAKE_D support
    • Linus Torvalds's avatar
      Merge tag 'locking-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 301c8b1d
      Linus Torvalds authored
      Pull locking fixes from Ingo Molnar:
      
       - Fix a Sparc crash
      
       - Fix a number of objtool warnings
      
       - Fix /proc/lockdep output on certain configs
      
       - Restore a kprobes fail-safe
      
      * tag 'locking-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/atomic: sparc: Fix arch_cmpxchg64_local()
        kprobe/static_call: Restore missing static_call_text_reserved()
        static_call: Fix static_call_text_reserved() vs __init
        jump_label: Fix jump_label_text_reserved() vs __init
        locking/lockdep: Fix meaningless /proc/lockdep output of lock classes on !CONFIG_PROVE_LOCKING
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 8b9cc17a
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "This is a set of minor fixes and clean ups in the core and various
        drivers.
      
        The only core change in behaviour is the I/O retry for spinup notify,
        but that shouldn't impact anything other than the failing case"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits)
        scsi: virtio_scsi: Add validation for residual bytes from response
        scsi: ipr: System crashes when seeing type 20 error
        scsi: core: Retry I/O for Notify (Enable Spinup) Required error
        scsi: mpi3mr: Fix warnings reported by smatch
        scsi: qedf: Add check to synchronize abort and flush
        scsi: MAINTAINERS: Add mpi3mr driver maintainers
        scsi: libfc: Fix array index out of bound exception
        scsi: mvsas: Use DEVICE_ATTR_RO()/RW() macro
        scsi: megaraid_mbox: Use DEVICE_ATTR_ADMIN_RO() macro
        scsi: qedf: Use DEVICE_ATTR_RO() macro
        scsi: qedi: Use DEVICE_ATTR_RO() macro
        scsi: message: mptfc: Switch from pci_ to dma_ API
        scsi: be2iscsi: Fix some missing space in some messages
        scsi: be2iscsi: Fix an error handling path in beiscsi_dev_probe()
        scsi: ufs: Fix build warning without CONFIG_PM
        scsi: bnx2fc: Remove meaningless bnx2fc_abts_cleanup() return value assignment
        scsi: qla2xxx: Add heartbeat check
        scsi: virtio_scsi: Do not overwrite SCSI status
        scsi: libsas: Add LUN number check in .slave_alloc callback
        scsi: core: Inline scsi_mq_alloc_queue()
        ...
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-for-v5.14-2021-07-10' of... · b1412bd7
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v5.14-2021-07-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tool updates from Arnaldo Carvalho de Melo:
       "New features:
      
         - Enable use of BPF counters with 'perf stat --for-each-cgroup',
           using per-CPU 'cgroup-switch' events with an attached BPF program
           that does aggregation per-cgroup in the kernel instead of using
           per-cgroup perf events.
      
         - Add Topdown metrics L2 events as default events in 'perf stat' for
           systems having those events.
      
        Hardware tracing:
      
         - Add a config for max loops without consuming a packet in the Intel
           PT packet decoder, set via 'perf config intel-pt.max-loops=N'
      
        Hardware enablement:
      
         - Disable misleading NMI watchdog message in 'perf stat' on hybrid
           systems such as Intel Alder Lake.
      
         - Add a dummy event on hybrid systems to collect metadata records.
      
         - Add 24x7 nest metric events for the Power10 platform.
      
        Fixes:
      
         - Fix event parsing for PMUs starting with the same prefix.
      
         - Fix the 'perf trace' 'trace' alias installation dir.
      
         - Fix buffer size to report iregs in perf script python scripts,
           supporting the extended registers in PowerPC.
      
         - Fix overflow in elf_sec__is_text().
      
         - Fix 's' on source line when disasm is empty in the annotation TUI,
           accessible via 'perf annotate', 'perf report' and 'perf top'.
      
         - Plug leaks in scandir() returned dirent entries in 'perf test' when
           sorting the shell tests.
      
         - Fix --task and --stat with pipe input in 'perf report'.
      
         - Fix 'perf probe' use of debuginfo files by build id.
      
         - If a DSO has both dynsym and symtab ELF sections, read from both
           when loading the symbol table, fixing a problem processing Fedora
           32 glibc DSOs.
      
        Libraries:
      
         - Add grouping of events to libperf, from code in tools/perf,
           allowing libperf users to use that mode.
      
        Misc:
      
         - Filter plt stubs from the 'perf probe --functions' output.
      
         - Update UAPI header copies for asound, DRM, mman-common.h and the
           ones affected by the quotactl_fd syscall"
      
      * tag 'perf-tools-for-v5.14-2021-07-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (29 commits)
        perf test: Add free() calls for scandir() returned dirent entries
        libperf: Add tests for perf_evlist__set_leader()
        libperf: Remove BUG_ON() from library code in get_group_fd()
        libperf: Add group support to perf_evsel__open()
        perf tools: Fix pattern matching for same substring in different PMU type
        perf record: Add a dummy event on hybrid systems to collect metadata records
        perf stat: Add Topdown metrics L2 events as default events
        libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader()
        libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups
        libperf: Move 'leader' from tools/perf to perf_evsel::leader
        libperf: Move 'idx' from tools/perf to perf_evsel::idx
        libperf: Change tests to single static and shared binaries
        perf intel-pt: Add a config for max loops without consuming a packet
        perf stat: Disable the NMI watchdog message on hybrid
        perf vendor events power10: Adds 24x7 nest metric events for power10 platform
        perf script python: Fix buffer size to report iregs in perf script
        perf trace: Fix the perf trace link location
        perf top: Fix overflow in elf_sec__is_text()
        perf annotate: Fix 's' on source line when disasm is empty
        perf probe: Do not show @plt function by default
        ...
  6. 10 Jul, 2021 2 commits
    • Merge tag 'rtc-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · de554096
      Linus Torvalds authored
      Pull RTC updates from Alexandre Belloni:
       "Mostly documentation/comment changes and non-urgent fixes.
      
         - add or fix SPDX identifiers
      
         - NXP pcf*: fix datasheet URLs
      
         - imxdi: add wakeup support
      
         - pcf2127: handle timestamp interrupts, which fixes a possible
           interrupt storm
      
         - bd70528: Drop BD70528 support"
      
      * tag 'rtc-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (33 commits)
        rtc: pcf8523: rename register and bit defines
        rtc: pcf2127: handle timestamp interrupts
        rtc: at91sam9: Remove unnecessary offset variable checks
        rtc: s5m: Check return value of s5m_check_peding_alarm_interrupt()
        rtc: spear: convert to SPDX identifier
        rtc: tps6586x: convert to SPDX identifier
        rtc: tps80031: convert to SPDX identifier
        rtc: rtd119x: Fix format of SPDX identifier
        rtc: sc27xx: Fix format of SPDX identifier
        rtc: palmas: convert to SPDX identifier
        rtc: max6900: convert to SPDX identifier
        rtc: ds1374: convert to SPDX identifier
        rtc: au1xxx: convert to SPDX identifier
        rtc: pcf85063: Update the PCF85063A datasheet revision
        dt-bindings: rtc: ti,bq32k: take maintainership
        rtc: pcf8563: Fix the datasheet URL
        rtc: pcf85063: Fix the datasheet URL
        rtc: pcf2127: Fix the datasheet URL
        dt-bindings: rtc: ti,bq32k: Convert to json-schema
        dt-bindings: rtc: rx8900: Convert to YAML schema
        ...
    • mm/page_alloc: Revert pahole zero-sized workaround · 6bce2443
      Mel Gorman authored
      Commit dbbee9d5 ("mm/page_alloc: convert per-cpu list protection to
      local_lock") folded in a workaround patch for pahole that was unable to
      deal with zero-sized percpu structures.
      
      A superior workaround is achieved with commit a0b8200d ("kbuild:
      skip per-CPU BTF generation for pahole v1.18-v1.21").
      
      This patch reverts the dummy field and the pahole version check.
      
      Fixes: dbbee9d5 ("mm/page_alloc: convert per-cpu list protection to local_lock")
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>