1. 20 Sep, 2019 18 commits
  2. 10 Sep, 2019 4 commits
  3. 05 Sep, 2019 1 commit
  4. 03 Sep, 2019 1 commit
    • perf/x86: Make more stuff static · d9f3b450
      Valdis Klētnieks authored
      When building with C=2, sparse makes note of a number of things:
      
        arch/x86/events/intel/rapl.c:637:30: warning: symbol 'rapl_attr_update' was not declared. Should it be static?
        arch/x86/events/intel/cstate.c:449:30: warning: symbol 'core_attr_update' was not declared. Should it be static?
        arch/x86/events/intel/cstate.c:457:30: warning: symbol 'pkg_attr_update' was not declared. Should it be static?
        arch/x86/events/msr.c:170:30: warning: symbol 'attr_update' was not declared. Should it be static?
        arch/x86/events/intel/lbr.c:276:1: warning: symbol 'lbr_from_quirk_key' was not declared. Should it be static?
      
      And they can all indeed be static.
      Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/128059.1565286242@turing-police
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 02 Sep, 2019 3 commits
  6. 01 Sep, 2019 13 commits
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9f159ae0
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A set of fixes for x86:
      
         - Fix the bogus detection of 32bit user mode for uretprobes which
           caused corruption of the user return address resulting in
           application crashes. In the uprobes handler in_ia32_syscall() is
           obviously always returning false on a 64bit kernel. Use
           user_64bit_mode() instead which works correctly.
      
         - Prevent large page splitting when ftrace flips RW/RO on the kernel
           text which caused iTLB performance issues. Ftrace wants to be
           converted to text_poke() which avoids the problem, but for now
           allow large page preservation in the static protections check when
           the change request spans a full large page.
      
         - Prevent arch_dynirq_lower_bound() from returning 0 when the IOAPIC
           is configured via device tree. In the device tree case the GSI 1:1
           mapping is meaningless therefore the lower bound which protects the
           GSI range on ACPI machines is irrelevant. Return the lower bound
           which the core hands to the function instead of blindly returning 0
           which causes the core to allocate the invalid virtual interrupt
           number 0 which in turn prevents all drivers from allocating and
           requesting an interrupt.
      
         - Remove the bogus initialization of LDR and DFR in the 32bit bigsmp
           APIC driver. That uses physical destination mode where LDR/DFR are
           ignored, but the initialization and the missing clear of LDR caused
           the APIC to be left in an inconsistent state on kexec/reboot.
      
         - Clear LDR when clearing the APIC registers so the APIC is in a well
           defined state.
      
         - Initialize variables properly in the find_trampoline_placement()
           code.
      
         - Silence a GCC9 build warning for the real mode part of the build"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text
        x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence GCC9 build warning
        x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement()
        x86/apic: Include the LDR when clearing out APIC registers
        x86/apic: Do not initialize LDR and DFR for bigsmp
        uprobes/x86: Fix detection of 32-bit user mode
        x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5fb181cb
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
       "Two fixes for perf x86 hardware implementations:
      
         - Restrict the period on Nehalem machines to prevent perf from
           hogging the CPU
      
         - Prevent the AMD IBS driver from overwriting the hardware controlled
           and pre-seeded reserved bits (0-6) in the count register which
           caused a sample bias for dispatched micro-ops"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
        perf/x86/intel: Restrict period on Nehalem
    • Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux · 5358e6e7
      Linus Torvalds authored
      Pull turbostat updates from Len Brown:
       "User-space turbostat (and x86_energy_perf_policy) patches.
      
        They are primarily bug fixes from users"
      
      * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
        tools/power turbostat: update version number
        tools/power turbostat: Add support for Hygon Fam 18h (Dhyana) RAPL
        tools/power turbostat: Fix caller parameter of get_tdp_amd()
        tools/power turbostat: Fix CPU%C1 display value
        tools/power turbostat: do not enforce 1ms
        tools/power turbostat: read from pipes too
        tools/power turbostat: Add Ice Lake NNPI support
        tools/power turbostat: rename has_hsw_msrs()
        tools/power turbostat: Fix Haswell Core systems
        tools/power turbostat: add Jacobsville support
        tools/power turbostat: fix buffer overrun
        tools/power turbostat: fix file descriptor leaks
        tools/power turbostat: fix leak of file descriptor on error return path
        tools/power turbostat: Make interval calculation per thread to reduce jitter
        tools/power turbostat: remove duplicate pc10 column
        tools/power x86_energy_perf_policy: Fix argument parsing
        tools/power: Fix typo in man page
        tools/power/x86: Enable compiler optimisations and Fortify by default
        tools/power x86_energy_perf_policy: Fix "uninitialized variable" warnings at -O2
    • objtool: Ignore intentional differences for the x86 insn decoder · ae31a514
      Arnaldo Carvalho de Melo authored
      Since we need to build this in !x86, we need to explicitly use the x86
      files, not things like asm/insn.h, so we intentionally differ from the
      master copy in the kernel sources, add -I diff directives to ignore just
      these differences when checking for drift.
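
      The drift check described above can be approximated as follows. This is
      a minimal Python sketch, not the logic of sync-check.sh or
      check-headers.sh: the helper name differs_ignoring, the sample file
      contents and the include pattern are all hypothetical, and dropping
      matching lines before comparing is only an approximation of what
      `diff -I` does.

```python
import re


def differs_ignoring(tools_text, kernel_text, ignore_patterns):
    """Return True if the two texts still differ once ignorable lines are dropped.

    Hypothetical helper: lines matching any regex in ignore_patterns are
    removed from both inputs before the comparison.
    """
    compiled = [re.compile(p) for p in ignore_patterns]

    def keep(line):
        return not any(rx.search(line) for rx in compiled)

    tools_lines = [line for line in tools_text.splitlines() if keep(line)]
    kernel_lines = [line for line in kernel_text.splitlines() if keep(line)]
    return tools_lines != kernel_lines


# Made-up file contents: only the include line differs between the copies.
kernel_copy = '#include <asm/insn.h>\nvoid decode(void);\n'
tools_copy = '#include "../include/asm/insn.h"\nvoid decode(void);\n'

drift_without_ignore = differs_ignoring(tools_copy, kernel_copy, [])
drift_with_ignore = differs_ignoring(tools_copy, kernel_copy, [r'^#include '])
print(drift_without_ignore, drift_with_ignore)
```

      With no ignore pattern the intentional include difference is flagged;
      with the pattern the two copies compare as in sync.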
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: http://lore.kernel.org/lkml/20190830193109.p7jagidsrahoa4pn@treble
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/n/tip-j965m9b7xtdc83em3twfkh9o@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • objtool: Update sync-check.sh from perf's check-headers.sh · 2ffd84ae
      Arnaldo Carvalho de Melo authored
      To allow using the -I trick that will be needed for checking the x86
      insn decoder files.
      
      Without the specific -I lines we still get the same warnings as before:
      
        $ make -C tools/objtool/ clean ; make -C tools/objtool/
        make: Entering directory '/home/acme/git/perf/tools/objtool'
          CLEAN    objtool
        find  -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
        rm -f arch/x86/inat-tables.c fixdep
        <SNIP>
          LD       objtool-in.o
        make[1]: Leaving directory '/home/acme/git/perf/tools/objtool'
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h'
        diff -u tools/arch/x86/include/asm/inat.h arch/x86/include/asm/inat.h
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/insn.h' differs from latest version at 'arch/x86/include/asm/insn.h'
        diff -u tools/arch/x86/include/asm/insn.h arch/x86/include/asm/insn.h
        Warning: Kernel ABI header at 'tools/arch/x86/lib/inat.c' differs from latest version at 'arch/x86/lib/inat.c'
        diff -u tools/arch/x86/lib/inat.c arch/x86/lib/inat.c
        Warning: Kernel ABI header at 'tools/arch/x86/lib/insn.c' differs from latest version at 'arch/x86/lib/insn.c'
        diff -u tools/arch/x86/lib/insn.c arch/x86/lib/insn.c
        /home/acme/git/perf/tools/objtool
          LINK     objtool
        make: Leaving directory '/home/acme/git/perf/tools/objtool'
        $
      
      The next patch will add the -I lines for those files.
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: http://lore.kernel.org/lkml/20190830193109.p7jagidsrahoa4pn@treble
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/n/tip-vu3p38mnxlwd80rlsnjkqcf2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Ignore intentional differences for the x86 insn decoder · 87a682a7
      Arnaldo Carvalho de Melo authored
      Since we need to build this in !x86, we need to explicitly use the x86
      files, not things like asm/insn.h, so we intentionally differ from the
      master copy in the kernel sources, add -I diff directives to ignore just
      these differences when checking for drift.
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/n/tip-9qziqjjt120mmz6kyepka9p7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Use shared x86 insn decoder · 00a26390
      Josh Poimboeuf authored
      Now that there's a common version of the decoder for all tools, use it
      instead of the local copy.
      
      Also use perf's check-headers.sh script to diff the decoder files to
      make sure they remain in sync with the kernel version.  Objtool has a
      similar check.
      
      Committer notes:
      
      Had to keep this all pointing explicitly to x86 headers/files, i.e.
      instead of asm/insn.h we had to use ../include/asm/insn.h when the files
      were in different dirs, or just replace "<asm/foo.h>" with "foo.h".
      
      This way we continue to be able to process perf.data files with Intel PT
      traces in distros other than x86.
      
      Also fixed up the awk script paths to use $(srcdir)/tools/arch instead
      of relative directories so that we keep detached tarballs (make help |
      grep perf) working.
      
      For now the include lines in these headers are being ignored so as not
      to flag false reports of kernel/tools out of sync.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/8a37e615d2880f039505d693d1e068a009358a2b.1567118001.git.jpoimboe@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Remove inat.c from build dependency list · f1da0a6c
      Josh Poimboeuf authored
      intel-pt-insn-decoder.c includes inat.c directly, so it already has an
      implicit dependency on inat.c.  The Build file dependency is redundant.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/53776d6d29bc9eceb571d52df8fa32250c58a0f3.1567118001.git.jpoimboe@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf: Update .gitignore file · 58993fb2
      Josh Poimboeuf authored
      After a "make tools/perf", git reports the following untracked files:
      
        tools/perf/feature/
        tools/perf/fixdep
        tools/perf/libtraceevent-dynamic-list
      
      Add these generated files to perf's .gitignore file.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/03acbc6c2fbc72054861f6c301875db75db33030.1567118001.git.jpoimboe@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • objtool: Move x86 insn decoder to a common location · d046b725
      Josh Poimboeuf authored
      The kernel tree has three identical copies of the x86 instruction
      decoder.  Two of them are in the tools subdir.
      
      The tools subdir is supposed to be completely standalone and separate
      from the kernel.  So having at least one copy of the kernel decoder in
      the tools subdir is unavoidable.  However, we don't need *two* of them.
      
      Move objtool's copy of the decoder to a shared location, so that perf
      will also be able to use it.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/55b486b88f6bcd0c9a2a04b34f964860c8390ca8.1567118001.git.jpoimboe@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf metricgroup: Support multiple events for metricgroup · f01642e4
      Jin Yao authored
      Some uncore metrics don't work as expected. For example, on
      cascadelakex:
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 1841092      unc_m_pmm_rpq_inserts
                 3680816      unc_m_pmm_wpq_inserts
      
             1.001775055 seconds time elapsed
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
               860649746      unc_m_pmm_rpq_occupancy.all
                 1840557      unc_m_pmm_rpq_inserts
             12790627455      unc_m_clockticks
      
             1.001773348 seconds time elapsed
      
      No metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' or 'UNC_M_PMM_READ_LATENCY' are
      reported.
      
      The issue is that an alias expanding to multiple events, which is
      typical for uncore events, is not supported (see the comments in
      find_evsel_group()).
      
      For UNC_M_PMM_BANDWIDTH.TOTAL in above example, the expanded event group
      is '{unc_m_pmm_rpq_inserts,unc_m_pmm_wpq_inserts}:W', but the actual
      events passed to find_evsel_group are:
      
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
      
      This multiple-events case is not handled well.
      
      This patch introduces a new field 'metric_leader' in struct evsel. The
      first event is considered the metric leader. The remaining events with
      the same name point to the first event via their metric_leader field in
      struct evsel.
      
      This design is for adding the counting results of all same events to the
      first event in group (the metric_leader).
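
      The metric_leader idea above can be illustrated with a small Python
      model. This is a sketch under stated assumptions, not the perf
      implementation: the Evsel class, both helper functions and the counts
      are hypothetical, and having the first event point at itself as its own
      leader is an assumption made to keep the model simple.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evsel:
    """Toy stand-in for struct evsel: a name, a count and a leader link."""
    name: str
    count: int
    metric_leader: Optional["Evsel"] = None


def assign_metric_leaders(evsels):
    """Point every event at the first event that carries the same name."""
    leaders = {}
    for ev in evsels:
        # setdefault keeps the first event seen for each name as the leader.
        ev.metric_leader = leaders.setdefault(ev.name, ev)
    return list(leaders.values())


def aggregate_counts(evsels):
    """Add each event's count to the running total of its metric leader."""
    totals = {}
    for ev in evsels:
        key = ev.metric_leader.name
        totals[key] = totals.get(key, 0) + ev.count
    return totals


# Six instances per event name, mirroring the expanded list in the commit
# message; the per-instance counts (10 and 20) are invented for illustration.
events = [Evsel("unc_m_pmm_rpq_inserts", 10) for _ in range(6)]
events += [Evsel("unc_m_pmm_wpq_inserts", 20) for _ in range(6)]

leaders = assign_metric_leaders(events)
totals = aggregate_counts(events)
print(len(leaders), totals)
```

      The twelve expanded events collapse to two leaders, and each leader
      carries the sum over its duplicates, which is the value a metric
      expression would then consume.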
      
      With this patch,
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 1842108      unc_m_pmm_rpq_inserts     #    337.2 MB/sec  UNC_M_PMM_BANDWIDTH.TOTAL
                 3682209      unc_m_pmm_wpq_inserts
      
             1.001819706 seconds time elapsed
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
               861970685      unc_m_pmm_rpq_occupancy.all #    219.4 ns  UNC_M_PMM_READ_LATENCY
                 1842772      unc_m_pmm_rpq_inserts
             12790196356      unc_m_clockticks
      
             1.001749103 seconds time elapsed
      
      Now we can see the correct metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' and
      'UNC_M_PMM_READ_LATENCY'.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20190828055932.8269-5-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf metricgroup: Scale the metric result · 287f2649
      Jin Yao authored
      Some metrics define the scale unit, such as
      
          {
              "BriefDescription": "Intel Optane DC persistent memory read latency (ns). Derived from unc_m_pmm_rpq_occupancy.all",
              "Counter": "0,1,2,3",
              "EventCode": "0xE0",
              "EventName": "UNC_M_PMM_READ_LATENCY",
              "MetricExpr": "UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS",
              "MetricName": "UNC_M_PMM_READ_LATENCY",
              "PerPkg": "1",
              "ScaleUnit": "6000000000ns",
              "UMask": "0x1",
              "Unit": "iMC"
          },
      
      For the above example, the ratio should be:
      
      ratio = (UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS) * 6000000000
      
      But in the current code, the ratio is not scaled (the * 6000000000 factor is missing).
      
      With this patch, the ratio is scaled and the unit (ns) is printed.
      
      For example,
        #    219.4 ns  UNC_M_PMM_READ_LATENCY
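
      As a worked check of the scaled ratio, the following Python sketch
      applies the MetricExpr and the ScaleUnit factor to the three counts
      printed in the UNC_M_PMM_READ_LATENCY sample output elsewhere in this
      log. The function name scaled_metric is hypothetical; only the formula
      and the numbers come from the commit messages.

```python
def scaled_metric(occupancy, inserts, clockticks, scale):
    """Evaluate OCCUPANCY / INSERTS / CLOCKTICKS and apply the scale factor."""
    return occupancy / inserts / clockticks * scale


# Counts taken from the "With this patch" perf stat output for
# UNC_M_PMM_READ_LATENCY; the scale is the numeric part of "6000000000ns".
value = scaled_metric(
    occupancy=861970685,      # unc_m_pmm_rpq_occupancy.all
    inserts=1842772,          # unc_m_pmm_rpq_inserts
    clockticks=12790196356,   # unc_m_clockticks
    scale=6000000000,
)
print(f"{value:.1f} ns")  # prints "219.4 ns"
```

      Without the scale factor the same expression yields a value on the
      order of 1e-8, which is why the unscaled result was not meaningful.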
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20190828055932.8269-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf pmu: Change convert_scale from static to global · a55ab7c4
      Jin Yao authored
      The function convert_scale() can be used to convert string to unit and
      scale. For example,
      
        s = "6000000000ns";
        convert_scale(s, &unit, &scale);
      
      unit = "ns", scale = 6000000000.
      
      Currently this function is static. This patch renames the function to
      perf_pmu__convert_scale and changes the function to global.  No
      functional change.
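
      The string-to-unit-and-scale conversion described above can be modelled
      in a few lines of Python. This is a hedged sketch of the behaviour
      stated in the commit message, not a port of the C function: the name
      split_scale and the regular expression are assumptions, and details of
      the real code such as error handling are not reproduced.

```python
import re

# Leading decimal number with an optional exponent; an assumption about
# which numeric forms a scale string may use.
_NUMBER = re.compile(r'\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?')


def split_scale(text):
    """Split a string such as "6000000000ns" into (unit, scale)."""
    match = _NUMBER.match(text)
    if match is None:
        raise ValueError(f"no leading number in {text!r}")
    scale = float(match.group(0))
    unit = text[match.end():]
    return unit, scale


unit, scale = split_scale("6000000000ns")
print(unit, scale)
```

      The unit is whatever follows the numeric prefix, so a string with no
      suffix yields an empty unit.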
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20190828055932.8269-2-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>