Commit 34122105 authored by Ian Rogers's avatar Ian Rogers Committed by Arnaldo Carvalho de Melo

perf vendor events: Update Intel sapphirerapids

Update to v1.04, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the sapphirerapids files into perf and update mapfile.csv.

Tested on a non-sapphirerapids with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
Signed-off-by: default avatarIan Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-23-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
parent 777e1312
......@@ -20,6 +20,7 @@ GenuineIntel-6-AA,v1.00,meteorlake,core
GenuineIntel-6-1[AEF],v3,nehalemep,core
GenuineIntel-6-2E,v3,nehalemex,core
GenuineIntel-6-2A,v17,sandybridge,core
GenuineIntel-6-8F,v1.04,sapphirerapids,core
GenuineIntel-6-[4589]E,v24,skylake,core
GenuineIntel-6-A[56],v24,skylake,core
GenuineIntel-6-37,v13,silvermont,core
......@@ -31,7 +32,6 @@ GenuineIntel-6-2F,v2,westmereex,core
GenuineIntel-6-55-[01234],v1,skylakex,core
GenuineIntel-6-8[CD],v1,tigerlake,core
GenuineIntel-6-86,v1,snowridgex,core
GenuineIntel-6-8F,v1,sapphirerapids,core
AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v2,amdzen1,core
AuthenticAMD-23-[[:xdigit:]]+,v1,amdzen2,core
AuthenticAMD-25-[[:xdigit:]]+,v1,amdzen3,core
......@@ -131,6 +131,18 @@
"Speculative": "1",
"UMask": "0x1"
},
{
"BriefDescription": "Cache lines that have been L2 hardware prefetched but not used by demand accesses",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3",
"EventCode": "0x26",
"EventName": "L2_LINES_OUT.USELESS_HWPF",
"PEBScounters": "0,1,2,3",
"PublicDescription": "Counts the number of cache lines that have been prefetched by the L2 hardware prefetcher but not used by demand access when evicted from the L2 cache",
"SampleAfterValue": "200003",
"Speculative": "1",
"UMask": "0x4"
},
{
"BriefDescription": "All accesses to L2 cache[This event is alias to L2_RQSTS.REFERENCES]",
"CollectPEBSRecord": "2",
......@@ -358,18 +370,31 @@
"UMask": "0x28"
},
{
"BriefDescription": "LONGEST_LAT_CACHE.MISS",
"BriefDescription": "Core-originated cacheable requests that missed L3 (Except hardware prefetches to the L3)",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x2e",
"EventName": "LONGEST_LAT_CACHE.MISS",
"PEBScounters": "0,1,2,3,4,5,6,7",
"PublicDescription": "Counts core-originated cacheable requests that miss the L3 cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches to the L1 and L2. It does not include hardware prefetches to the L3, and may not count other types of requests to the L3.",
"SampleAfterValue": "100003",
"Speculative": "1",
"UMask": "0x41"
},
{
"BriefDescription": "All retired load instructions.",
"BriefDescription": "Core-originated cacheable requests that refer to L3 (Except hardware prefetches to the L3)",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x2e",
"EventName": "LONGEST_LAT_CACHE.REFERENCE",
"PEBScounters": "0,1,2,3,4,5,6,7",
"PublicDescription": "Counts core-originated cacheable requests to the L3 cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches to the L1 and L2. It does not include hardware prefetches to the L3, and may not count other types of requests to the L3.",
"SampleAfterValue": "100003",
"Speculative": "1",
"UMask": "0x4f"
},
{
"BriefDescription": "Retired load instructions.",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3",
"Data_LA": "1",
......@@ -377,12 +402,12 @@
"EventName": "MEM_INST_RETIRED.ALL_LOADS",
"PEBS": "1",
"PEBScounters": "0,1,2,3",
"PublicDescription": "Counts all retired load instructions. This event accounts for SW prefetch instructions for loads.",
"PublicDescription": "Counts all retired load instructions. This event accounts for SW prefetch instructions of PREFETCHNTA or PREFETCHT0/1/2 or PREFETCHW.",
"SampleAfterValue": "1000003",
"UMask": "0x81"
},
{
"BriefDescription": "All retired store instructions.",
"BriefDescription": "Retired store instructions.",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3",
"Data_LA": "1",
......@@ -391,7 +416,7 @@
"L1_Hit_Indication": "1",
"PEBS": "1",
"PEBScounters": "0,1,2,3",
"PublicDescription": "Counts all retired store instructions. This event account for SW prefetch instructions and PREFETCHW instruction for stores.",
"PublicDescription": "Counts all retired store instructions.",
"SampleAfterValue": "1000003",
"UMask": "0x82"
},
......@@ -1013,6 +1038,17 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts demand reads for ownership (RFO), hardware prefetch RFOs (which bring data to L2), and software prefetches for exclusive ownership (PREFETCHW) that hit to a (M)odified cacheline in the L3 or snoop filter.",
"Counter": "0,1,2,3",
"EventCode": "0x2A,0x2B",
"EventName": "OCR.RFO_TO_CORE.L3_HIT_M",
"MSRIndex": "0x1a6,0x1a7",
"MSRValue": "0x1F80040022",
"Offcore": "1",
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts streaming stores that hit in the L3 or were snooped from another core's caches on the same socket.",
"Counter": "0,1,2,3",
......
......@@ -276,6 +276,17 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.",
"Counter": "0,1,2,3",
"EventCode": "0x2A,0x2B",
"EventName": "OCR.READS_TO_CORE.L3_MISS_LOCAL",
"MSRIndex": "0x1a6,0x1a7",
"MSRValue": "0x3F04C04477",
"Offcore": "1",
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that missed the L3 Cache and were supplied by the local socket (DRAM or PMM), whether or not in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts PMM or DRAM accesses that are controlled by the close or distant SNC Cluster. It does not count misses to the L3 which go to Local CXL Type 2 Memory or Local Non DRAM.",
"Counter": "0,1,2,3",
......
......@@ -174,6 +174,17 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts data load hardware prefetch requests to the L1 data cache that have any type of response.",
"Counter": "0,1,2,3",
"EventCode": "0x2A,0x2B",
"EventName": "OCR.HWPF_L1D.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
"MSRValue": "0x10400",
"Offcore": "1",
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts hardware prefetches (which bring data to L2) that have any type of response.",
"Counter": "0,1,2,3",
......@@ -207,6 +218,17 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts writebacks of modified cachelines and streaming stores that have any type of response.",
"Counter": "0,1,2,3",
"EventCode": "0x2A,0x2B",
"EventName": "OCR.MODIFIED_WRITE.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
"MSRValue": "0x10808",
"Offcore": "1",
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that have any type of response.",
"Counter": "0,1,2,3",
......@@ -344,9 +366,49 @@
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xa5",
"EventName": "RS.EMPTY",
"PEBScounters": "0,1,2,3,4,5,6,7",
"PublicDescription": "Counts cycles during which the reservation station (RS) is empty for this logical processor. This is usually caused when the front-end pipeline runs into starvation periods (e.g. branch mispredictions or i-cache misses)",
"SampleAfterValue": "1000003",
"Speculative": "1",
"UMask": "0x7"
},
{
"BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty.",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"CounterMask": "1",
"EdgeDetect": "1",
"EventCode": "0xa5",
"EventName": "RS.EMPTY_COUNT",
"Invert": "1",
"PEBScounters": "0,1,2,3,4,5,6,7",
"PublicDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to closely sample on front-end latency issues (see the FRONTEND_RETIRED event of designated precise events)",
"SampleAfterValue": "100003",
"Speculative": "1",
"UMask": "0x7"
},
{
"BriefDescription": "This event is deprecated. Refer to new event RS.EMPTY_COUNT",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"CounterMask": "1",
"EdgeDetect": "1",
"EventCode": "0xa5",
"EventName": "RS_EMPTY.COUNT",
"Invert": "1",
"PEBScounters": "0,1,2,3,4,5,6,7",
"SampleAfterValue": "100003",
"Speculative": "1",
"UMask": "0x7"
},
{
"BriefDescription": "This event is deprecated. Refer to new event RS.EMPTY",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xa5",
"EventName": "RS_EMPTY.CYCLES",
"PEBScounters": "0,1,2,3,4,5,6,7",
"PublicDescription": "Counts cycles during which the reservation station (RS) is empty for this logical processor.",
"SampleAfterValue": "1000003",
"Speculative": "1",
"UMask": "0x7"
......
......@@ -57,7 +57,6 @@
"EventCode": "0xb0",
"EventName": "ARITH.IDIV_ACTIVE",
"PEBScounters": "0,1,2,3,4,5,6,7",
"PublicDescription": "ARITH.IDIV_ACTIVE",
"SampleAfterValue": "1000003",
"Speculative": "1",
"UMask": "0x8"
......@@ -229,7 +228,7 @@
"UMask": "0x10"
},
{
"BriefDescription": "number of branch instructions retired that were mispredicted and taken. Non PEBS",
"BriefDescription": "number of branch instructions retired that were mispredicted and taken.",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xc5",
......@@ -393,6 +392,18 @@
"Speculative": "1",
"UMask": "0x3"
},
{
"BriefDescription": "Reference cycles when the core is not in halt state.",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x3c",
"EventName": "CPU_CLK_UNHALTED.REF_TSC_P",
"PEBScounters": "0,1,2,3,4,5,6,7",
"PublicDescription": "Counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Note: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'. The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'. After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.",
"SampleAfterValue": "2000003",
"Speculative": "1",
"UMask": "0x1"
},
{
"BriefDescription": "Core cycles when the thread is not in halt state",
"CollectPEBSRecord": "2",
......@@ -617,12 +628,13 @@
"UMask": "0x10"
},
{
"BriefDescription": "Number of all retired NOP instructions.",
"BriefDescription": "Retired NOP instructions.",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xc0",
"EventName": "INST_RETIRED.NOP",
"PEBScounters": "1,2,3,4,5,6,7",
"PublicDescription": "Counts all retired NOP or ENDBR32/64 instructions",
"SampleAfterValue": "2000003",
"UMask": "0x2"
},
......
......@@ -20,15 +20,6 @@
"UMaskExt": "0x00000000",
"Unit": "UPI LL"
},
{
"BriefDescription": "Clockticks in the UBOX using a dedicated 48-bit Fixed Counter",
"Counter": "FIXED",
"CounterType": "FIXED",
"EventCode": "0xff",
"EventName": "UNC_U_CLOCKTICKS",
"PerPkg": "1",
"Unit": "UBOX"
},
{
"BriefDescription": "IRP Clockticks",
"Counter": "0,1",
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment