Commit fa607370 authored by Ian Rogers, committed by Arnaldo Carvalho de Melo

perf vendor events intel: Refresh alderlake-n metrics

Update the alderlake-n events from 1.16 to 1.18 (no change) and
metrics. Generation was done using https://github.com/intel/perfmon.

Notable changes: TMA metrics are updated to version 4.5; TMA info
metrics are renamed from their node name to lower case and prefixed
by tma_info_; MetricThreshold expressions are added; and the smi_cost
metric group is added, replicating the existing hard-coded metrics in
stat-shadow.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Link: https://lore.kernel.org/r/20230219092848.639226-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
parent ad10c920
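
The renaming and threshold changes described in the commit message can be illustrated with a small Python sketch. It is not the generator used by https://github.com/intel/perfmon or anything in perf itself. The helper names (rename_info_metric, rewrite_expr, smi_cycles, exceeds_threshold) are hypothetical, the node list holds only the two old names visible in this diff (SLOTS and CLKS), FOO.SLOTS is a made-up event name, and the counter values are invented for illustration.

```python
import re

# Old info-metric node names that appear in the expressions of this diff.
# Only these two are taken from the diff; the list is illustrative.
INFO_NODES = ("SLOTS", "CLKS")


def rename_info_metric(node_name: str) -> str:
    """Lower-case a TMA info node name and add the tma_info_ prefix."""
    return "tma_info_" + node_name.lower()


def rewrite_expr(expr: str, nodes=INFO_NODES) -> str:
    """Replace stand-alone info node names in a MetricExpr string.

    The lookarounds skip names embedded in event names, such as the
    hypothetical FOO.SLOTS.
    """
    for node in nodes:
        pattern = r"(?<![\w.])" + re.escape(node) + r"(?![\w.])"
        expr = re.sub(pattern, rename_info_metric(node), expr)
    return expr


def smi_cycles(aperf: float, cycles: float, smi: float) -> float:
    """Python reading of the smi_cycles MetricExpr shown in the diff."""
    return (aperf - cycles) / aperf if smi > 0 else 0.0


def exceeds_threshold(value: float, limit: float) -> bool:
    """Mimic a 'metric > limit' MetricThreshold expression."""
    return value > limit


if __name__ == "__main__":
    print(rewrite_expr("TOPDOWN_FE_BOUND.ALL / SLOTS"))
    print(rewrite_expr("LD_HEAD.L1_BOUND_AT_RET / CLKS"))
    # Made-up counter values, chosen only for illustration.
    ratio = smi_cycles(aperf=1000, cycles=800, smi=3)
    print(ratio, exceeds_threshold(ratio, 0.1))
```

With the invented counters above, a fifth of the APERF cycles are unaccounted for, so the value exceeds the 0.1 limit from the smi_cycles MetricThreshold in the diff. When no SMI is counted, the expression evaluates to 0.
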
[ [
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to frontend stalls.", "BriefDescription": "C10 residency percent per package",
"MetricExpr": "TOPDOWN_FE_BOUND.ALL / SLOTS", "MetricExpr": "cstate_pkg@c10\\-residency@ / TSC",
"MetricGroup": "TopdownL1", "MetricGroup": "Power",
"MetricName": "tma_frontend_bound", "MetricName": "C10_Pkg_Residency",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to frontend bandwidth restrictions due to decode, predecode, cisc, and other limitations.",
"MetricExpr": "TOPDOWN_FE_BOUND.FRONTEND_LATENCY / SLOTS",
"MetricGroup": "TopdownL2;tma_frontend_bound_group",
"MetricName": "tma_frontend_latency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to instruction cache misses.", "BriefDescription": "C1 residency percent per core",
"MetricExpr": "TOPDOWN_FE_BOUND.ICACHE / SLOTS", "MetricExpr": "cstate_core@c1\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_latency_group", "MetricGroup": "Power",
"MetricName": "tma_icache", "MetricName": "C1_Core_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to Instruction Table Lookaside Buffer (ITLB) misses.", "BriefDescription": "C2 residency percent per package",
"MetricExpr": "TOPDOWN_FE_BOUND.ITLB / SLOTS", "MetricExpr": "cstate_pkg@c2\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_latency_group", "MetricGroup": "Power",
"MetricName": "tma_itlb", "MetricName": "C2_Pkg_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to BACLEARS, which occurs when the Branch Target Buffer (BTB) prediction or lack thereof, was corrected by a later branch predictor in the frontend", "BriefDescription": "C3 residency percent per package",
"MetricExpr": "TOPDOWN_FE_BOUND.BRANCH_DETECT / SLOTS", "MetricExpr": "cstate_pkg@c3\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_latency_group", "MetricGroup": "Power",
"MetricName": "tma_branch_detect", "MetricName": "C3_Pkg_Residency",
"PublicDescription": "Counts the number of issue slots that were not delivered by the frontend due to BACLEARS, which occurs when the Branch Target Buffer (BTB) prediction or lack thereof, was corrected by a later branch predictor in the frontend. Includes BACLEARS due to all branch types including conditional and unconditional jumps, returns, and indirect branches.",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to BTCLEARS, which occurs when the Branch Target Buffer (BTB) predicts a taken branch.", "BriefDescription": "C6 residency percent per core",
"MetricExpr": "TOPDOWN_FE_BOUND.BRANCH_RESTEER / SLOTS", "MetricExpr": "cstate_core@c6\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_latency_group", "MetricGroup": "Power",
"MetricName": "tma_branch_resteer", "MetricName": "C6_Core_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to frontend bandwidth restrictions due to decode, predecode, cisc, and other limitations.", "BriefDescription": "C6 residency percent per package",
"MetricExpr": "TOPDOWN_FE_BOUND.FRONTEND_BANDWIDTH / SLOTS", "MetricExpr": "cstate_pkg@c6\\-residency@ / TSC",
"MetricGroup": "TopdownL2;tma_frontend_bound_group", "MetricGroup": "Power",
"MetricName": "tma_frontend_bandwidth", "MetricName": "C6_Pkg_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to the microcode sequencer (MS).", "BriefDescription": "C7 residency percent per core",
"MetricExpr": "TOPDOWN_FE_BOUND.CISC / SLOTS", "MetricExpr": "cstate_core@c7\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_bandwidth_group", "MetricGroup": "Power",
"MetricName": "tma_cisc", "MetricName": "C7_Core_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to decode stalls.", "BriefDescription": "C7 residency percent per package",
"MetricExpr": "TOPDOWN_FE_BOUND.DECODE / SLOTS", "MetricExpr": "cstate_pkg@c7\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_bandwidth_group", "MetricGroup": "Power",
"MetricName": "tma_decode", "MetricName": "C7_Pkg_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to wrong predecodes.", "BriefDescription": "C8 residency percent per package",
"MetricExpr": "TOPDOWN_FE_BOUND.PREDECODE / SLOTS", "MetricExpr": "cstate_pkg@c8\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_bandwidth_group", "MetricGroup": "Power",
"MetricName": "tma_predecode", "MetricName": "C8_Pkg_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to other common frontend stalls not categorized.", "BriefDescription": "C9 residency percent per package",
"MetricExpr": "TOPDOWN_FE_BOUND.OTHER / SLOTS", "MetricExpr": "cstate_pkg@c9\\-residency@ / TSC",
"MetricGroup": "TopdownL3;tma_frontend_bandwidth_group", "MetricGroup": "Power",
"MetricName": "tma_other_fb", "MetricName": "C9_Pkg_Residency",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear", "BriefDescription": "Percentage of cycles spent in System Management Interrupts.",
"MetricExpr": "(SLOTS - (TOPDOWN_FE_BOUND.ALL + TOPDOWN_BE_BOUND.ALL + TOPDOWN_RETIRING.ALL)) / SLOTS", "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
"MetricGroup": "TopdownL1", "MetricGroup": "smi",
"MetricName": "tma_bad_speculation", "MetricName": "smi_cycles",
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window including relevant microcode flows and while uops are not yet available in the instruction queue (IQ). Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.", "MetricThreshold": "smi_cycles > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to branch mispredicts.", "BriefDescription": "Number of SMI interrupts.",
"MetricExpr": "TOPDOWN_BAD_SPECULATION.MISPREDICT / SLOTS", "MetricExpr": "msr@smi@",
"MetricGroup": "TopdownL2;tma_bad_speculation_group", "MetricGroup": "smi",
"MetricName": "tma_branch_mispredicts", "MetricName": "smi_num",
"ScaleUnit": "100%" "ScaleUnit": "1SMI#"
}, },
{ {
"BriefDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a machine clear (nuke) of any kind including memory ordering and memory disambiguation.", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to certain allocation restrictions.",
"MetricExpr": "TOPDOWN_BAD_SPECULATION.MACHINE_CLEARS / SLOTS", "MetricExpr": "TOPDOWN_BE_BOUND.ALLOC_RESTRICTIONS / tma_info_slots",
"MetricGroup": "TopdownL2;tma_bad_speculation_group", "MetricGroup": "TopdownL3;tma_L3_group;tma_resource_bound_group",
"MetricName": "tma_machine_clears", "MetricName": "tma_alloc_restriction",
"MetricThreshold": "tma_alloc_restriction > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to a machine clear (slow nuke).", "BriefDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls",
"MetricExpr": "TOPDOWN_BAD_SPECULATION.NUKE / SLOTS", "MetricExpr": "TOPDOWN_BE_BOUND.ALL / tma_info_slots",
"MetricGroup": "TopdownL3;tma_machine_clears_group", "MetricGroup": "TopdownL1;tma_L1_group",
"MetricName": "tma_nuke", "MetricName": "tma_backend_bound",
"MetricThreshold": "tma_backend_bound > 0.1",
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls. Note that uops must be available for consumption in order for this event to count. If a uop is not available (IQ is empty), this event will not count. The rest of these subevents count backend stalls, in cycles, due to an outstanding request which is memory bound vs core bound. The subevents are not slot based events and therefore can not be precisely added or subtracted from the Backend_Bound_Aux subevents which are slot based.",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to SMC. ", "BriefDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls",
"MetricExpr": "tma_nuke * (MACHINE_CLEARS.SMC / MACHINE_CLEARS.SLOW)", "MetricExpr": "tma_backend_bound",
"MetricGroup": "TopdownL4;tma_nuke_group", "MetricGroup": "TopdownL1;tma_L1_group",
"MetricName": "tma_smc", "MetricName": "tma_backend_bound_aux",
"MetricThreshold": "tma_backend_bound_aux > 0.2",
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls. Note that UOPS must be available for consumption in order for this event to count. If a uop is not available (IQ is empty), this event will not count. All of these subevents count backend stalls, in slots, due to a resource limitation. These are not cycle based events and therefore can not be precisely added or subtracted from the Backend_Bound subevents which are cycle based. These subevents are supplementary to Backend_Bound and can be used to analyze results from a resource perspective at allocation.",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to memory ordering. ", "BriefDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear",
"MetricExpr": "tma_nuke * (MACHINE_CLEARS.MEMORY_ORDERING / MACHINE_CLEARS.SLOW)", "MetricExpr": "(tma_info_slots - (TOPDOWN_FE_BOUND.ALL + TOPDOWN_BE_BOUND.ALL + TOPDOWN_RETIRING.ALL)) / tma_info_slots",
"MetricGroup": "TopdownL4;tma_nuke_group", "MetricGroup": "TopdownL1;tma_L1_group",
"MetricName": "tma_memory_ordering", "MetricName": "tma_bad_speculation",
"MetricThreshold": "tma_bad_speculation > 0.15",
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window including relevant microcode flows and while uops are not yet available in the instruction queue (IQ). Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to FP assists. ", "BriefDescription": "Counts the number of uops that are not from the microsequencer.",
"MetricExpr": "tma_nuke * (MACHINE_CLEARS.FP_ASSIST / MACHINE_CLEARS.SLOW)", "MetricExpr": "(TOPDOWN_RETIRING.ALL - UOPS_RETIRED.MS) / tma_info_slots",
"MetricGroup": "TopdownL4;tma_nuke_group", "MetricGroup": "TopdownL2;tma_L2_group;tma_retiring_group",
"MetricName": "tma_fp_assist", "MetricName": "tma_base",
"MetricThreshold": "tma_base > 0.6",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to memory disambiguation. ", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to BACLEARS, which occurs when the Branch Target Buffer (BTB) prediction or lack thereof, was corrected by a later branch predictor in the frontend",
"MetricExpr": "tma_nuke * (MACHINE_CLEARS.DISAMBIGUATION / MACHINE_CLEARS.SLOW)", "MetricExpr": "TOPDOWN_FE_BOUND.BRANCH_DETECT / tma_info_slots",
"MetricGroup": "TopdownL4;tma_nuke_group", "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_latency_group",
"MetricName": "tma_disambiguation", "MetricName": "tma_branch_detect",
"MetricThreshold": "tma_branch_detect > 0.05",
"PublicDescription": "Counts the number of issue slots that were not delivered by the frontend due to BACLEARS, which occurs when the Branch Target Buffer (BTB) prediction or lack thereof, was corrected by a later branch predictor in the frontend. Includes BACLEARS due to all branch types including conditional and unconditional jumps, returns, and indirect branches.",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to page faults. ", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to branch mispredicts.",
"MetricExpr": "tma_nuke * (MACHINE_CLEARS.PAGE_FAULT / MACHINE_CLEARS.SLOW)", "MetricExpr": "TOPDOWN_BAD_SPECULATION.MISPREDICT / tma_info_slots",
"MetricGroup": "TopdownL4;tma_nuke_group", "MetricGroup": "TopdownL2;tma_L2_group;tma_bad_speculation_group",
"MetricName": "tma_page_fault", "MetricName": "tma_branch_mispredicts",
"MetricThreshold": "tma_branch_mispredicts > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to a machine clear classified as a fast nuke due to memory ordering, memory disambiguation and memory renaming.", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to BTCLEARS, which occurs when the Branch Target Buffer (BTB) predicts a taken branch.",
"MetricExpr": "TOPDOWN_BAD_SPECULATION.FASTNUKE / SLOTS", "MetricExpr": "TOPDOWN_FE_BOUND.BRANCH_RESTEER / tma_info_slots",
"MetricGroup": "TopdownL3;tma_machine_clears_group", "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_latency_group",
"MetricName": "tma_fast_nuke", "MetricName": "tma_branch_resteer",
"MetricThreshold": "tma_branch_resteer > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to the microcode sequencer (MS).",
"MetricExpr": "TOPDOWN_BE_BOUND.ALL / SLOTS", "MetricExpr": "TOPDOWN_FE_BOUND.CISC / tma_info_slots",
"MetricGroup": "TopdownL1", "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_bandwidth_group",
"MetricName": "tma_backend_bound", "MetricName": "tma_cisc",
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls. Note that uops must be available for consumption in order for this event to count. If a uop is not available (IQ is empty), this event will not count. The rest of these subevents count backend stalls, in cycles, due to an outstanding request which is memory bound vs core bound. The subevents are not slot based events and therefore can not be precisely added or subtracted from the Backend_Bound_Aux subevents which are slot based.", "MetricThreshold": "tma_cisc > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles due to backend bound stalls that are core execution bound and not attributed to outstanding demand load or store stalls. ", "BriefDescription": "Counts the number of cycles due to backend bound stalls that are core execution bound and not attributed to outstanding demand load or store stalls.",
"MetricExpr": "max(0, tma_backend_bound - tma_load_store_bound)", "MetricExpr": "max(0, tma_backend_bound - tma_load_store_bound)",
"MetricGroup": "TopdownL2;tma_backend_bound_group", "MetricGroup": "TopdownL2;tma_L2_group;tma_backend_bound_group",
"MetricName": "tma_core_bound", "MetricName": "tma_core_bound",
"MetricThreshold": "tma_core_bound > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles the core is stalled due to stores or loads. ", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to decode stalls.",
"MetricExpr": "min(tma_backend_bound, LD_HEAD.ANY_AT_RET / CLKS + tma_store_bound)", "MetricExpr": "TOPDOWN_FE_BOUND.DECODE / tma_info_slots",
"MetricGroup": "TopdownL2;tma_backend_bound_group", "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_bandwidth_group",
"MetricName": "tma_load_store_bound", "MetricName": "tma_decode",
"MetricThreshold": "tma_decode > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles the core is stalled due to store buffer full.", "BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to memory disambiguation.",
"MetricExpr": "tma_st_buffer", "MetricExpr": "tma_nuke * (MACHINE_CLEARS.DISAMBIGUATION / MACHINE_CLEARS.SLOW)",
"MetricGroup": "TopdownL3;tma_load_store_bound_group", "MetricGroup": "TopdownL4;tma_L4_group;tma_nuke_group",
"MetricName": "tma_store_bound", "MetricName": "tma_disambiguation",
"MetricThreshold": "tma_disambiguation > 0.02",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a load block.", "BriefDescription": "Counts the number of cycles the core is stalled due to a demand load miss which hit in DRAM or MMIO (Non-DRAM).",
"MetricExpr": "LD_HEAD.L1_BOUND_AT_RET / CLKS", "MetricConstraint": "NO_GROUP_EVENTS",
"MetricGroup": "TopdownL3;tma_load_store_bound_group", "MetricExpr": "MEM_BOUND_STALLS.LOAD_DRAM_HIT / tma_info_clks - MEM_BOUND_STALLS_AT_RET_CORRECTION * MEM_BOUND_STALLS.LOAD_DRAM_HIT / MEM_BOUND_STALLS.LOAD",
"MetricName": "tma_l1_bound", "MetricGroup": "TopdownL3;tma_L3_group;tma_load_store_bound_group",
"MetricName": "tma_dram_bound",
"MetricThreshold": "tma_dram_bound > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a store forward block.", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to a machine clear classified as a fast nuke due to memory ordering, memory disambiguation and memory renaming.",
"MetricExpr": "LD_HEAD.ST_ADDR_AT_RET / CLKS", "MetricExpr": "TOPDOWN_BAD_SPECULATION.FASTNUKE / tma_info_slots",
"MetricGroup": "TopdownL4;tma_l1_bound_group", "MetricGroup": "TopdownL3;tma_L3_group;tma_machine_clears_group",
"MetricName": "tma_store_fwd", "MetricName": "tma_fast_nuke",
"MetricThreshold": "tma_fast_nuke > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a first level TLB miss.", "BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to FP assists.",
"MetricExpr": "LD_HEAD.DTLB_MISS_AT_RET / CLKS", "MetricExpr": "tma_nuke * (MACHINE_CLEARS.FP_ASSIST / MACHINE_CLEARS.SLOW)",
"MetricGroup": "TopdownL4;tma_l1_bound_group", "MetricGroup": "TopdownL4;tma_L4_group;tma_nuke_group",
"MetricName": "tma_stlb_hit", "MetricName": "tma_fp_assist",
"MetricThreshold": "tma_fp_assist > 0.02",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a second level TLB miss requiring a page walk.", "BriefDescription": "Counts the number of floating point operations per uop with all default weighting.",
"MetricExpr": "LD_HEAD.PGWALK_AT_RET / CLKS", "MetricExpr": "UOPS_RETIRED.FPDIV / tma_info_slots",
"MetricGroup": "TopdownL4;tma_l1_bound_group", "MetricGroup": "TopdownL3;tma_L3_group;tma_base_group",
"MetricName": "tma_stlb_miss", "MetricName": "tma_fp_uops",
"MetricThreshold": "tma_fp_uops > 0.2",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a number of other load blocks.", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to frontend bandwidth restrictions due to decode, predecode, cisc, and other limitations.",
"MetricExpr": "LD_HEAD.OTHER_AT_RET / CLKS", "MetricExpr": "TOPDOWN_FE_BOUND.FRONTEND_BANDWIDTH / tma_info_slots",
"MetricGroup": "TopdownL4;tma_l1_bound_group", "MetricGroup": "TopdownL2;tma_L2_group;tma_frontend_bound_group",
"MetricName": "tma_other_l1", "MetricName": "tma_frontend_bandwidth",
"MetricThreshold": "tma_frontend_bandwidth > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles a core is stalled due to a demand load which hit in the L2 Cache.", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to frontend stalls.",
"MetricExpr": "MEM_BOUND_STALLS.LOAD_L2_HIT / CLKS - MEM_BOUND_STALLS_AT_RET_CORRECTION * MEM_BOUND_STALLS.LOAD_L2_HIT / MEM_BOUND_STALLS.LOAD", "MetricExpr": "TOPDOWN_FE_BOUND.ALL / tma_info_slots",
"MetricGroup": "TopdownL3;tma_load_store_bound_group", "MetricGroup": "TopdownL1;tma_L1_group",
"MetricName": "tma_l2_bound", "MetricName": "tma_frontend_bound",
"MetricThreshold": "tma_frontend_bound > 0.2",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles a core is stalled due to a demand load which hit in the Last Level Cache (LLC) or other core with HITE/F/M.", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to frontend bandwidth restrictions due to decode, predecode, cisc, and other limitations.",
"MetricExpr": "MEM_BOUND_STALLS.LOAD_LLC_HIT / CLKS - MEM_BOUND_STALLS_AT_RET_CORRECTION * MEM_BOUND_STALLS.LOAD_LLC_HIT / MEM_BOUND_STALLS.LOAD", "MetricExpr": "TOPDOWN_FE_BOUND.FRONTEND_LATENCY / tma_info_slots",
"MetricGroup": "TopdownL3;tma_load_store_bound_group", "MetricGroup": "TopdownL2;tma_L2_group;tma_frontend_bound_group",
"MetricName": "tma_l3_bound", "MetricName": "tma_frontend_latency",
"MetricThreshold": "tma_frontend_latency > 0.15",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles the core is stalled due to a demand load miss which hit in DRAM or MMIO (Non-DRAM).", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to instruction cache misses.",
"MetricExpr": "MEM_BOUND_STALLS.LOAD_DRAM_HIT / CLKS - MEM_BOUND_STALLS_AT_RET_CORRECTION * MEM_BOUND_STALLS.LOAD_DRAM_HIT / MEM_BOUND_STALLS.LOAD", "MetricExpr": "TOPDOWN_FE_BOUND.ICACHE / tma_info_slots",
"MetricGroup": "TopdownL3;tma_load_store_bound_group", "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_latency_group",
"MetricName": "tma_dram_bound", "MetricName": "tma_icache",
"MetricThreshold": "tma_icache > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles the core is stalled due to a demand load miss which hits in the L2, LLC, DRAM or MMIO (Non-DRAM) but could not be correctly attributed or cycles in which the load miss is waiting on a request buffer.", "BriefDescription": "Percentage of total non-speculative loads with a address aliasing block",
"MetricExpr": "max(0, tma_load_store_bound - (tma_store_bound + tma_l1_bound + tma_l2_bound + tma_l3_bound + tma_dram_bound))", "MetricExpr": "100 * LD_BLOCKS.4K_ALIAS / MEM_UOPS_RETIRED.ALL_LOADS",
"MetricGroup": "TopdownL3;tma_load_store_bound_group", "MetricName": "tma_info_address_alias_blocks"
"MetricName": "tma_other_load_store",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls", "BriefDescription": "Ratio of all branches which mispredict",
"MetricExpr": "tma_backend_bound", "MetricExpr": "BR_MISP_RETIRED.ALL_BRANCHES / BR_INST_RETIRED.ALL_BRANCHES",
"MetricGroup": "TopdownL1", "MetricGroup": " ",
"MetricName": "tma_backend_bound_aux", "MetricName": "tma_info_branch_mispredict_ratio"
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls. Note that UOPS must be available for consumption in order for this event to count. If a uop is not available (IQ is empty), this event will not count. All of these subevents count backend stalls, in slots, due to a resource limitation. These are not cycle based events and therefore can not be precisely added or subtracted from the Backend_Bound subevents which are cycle based. These subevents are supplementary to Backend_Bound and can be used to analyze results from a resource perspective at allocation. ",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls", "BriefDescription": "Ratio between Mispredicted branches and unknown branches",
"MetricExpr": "tma_backend_bound", "MetricExpr": "BR_MISP_RETIRED.ALL_BRANCHES / BACLEARS.ANY",
"MetricGroup": "TopdownL2;tma_backend_bound_aux_group", "MetricGroup": " ",
"MetricName": "tma_resource_bound", "MetricName": "tma_info_branch_mispredict_to_unknown_branch_ratio"
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls. Note that uops must be available for consumption in order for this event to count. If a uop is not available (IQ is empty), this event will not count. ",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to memory reservation stalls in which a scheduler is not able to accept uops.", "BriefDescription": "",
"MetricExpr": "TOPDOWN_BE_BOUND.MEM_SCHEDULER / SLOTS", "MetricExpr": "CPU_CLK_UNHALTED.CORE",
"MetricGroup": "TopdownL3;tma_resource_bound_group", "MetricGroup": " ",
"MetricName": "tma_mem_scheduler", "MetricName": "tma_info_clks"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles, relative to the number of mem_scheduler slots, in which uops are blocked due to store buffer full", "BriefDescription": "",
"MetricExpr": "tma_mem_scheduler * (MEM_SCHEDULER_BLOCK.ST_BUF / MEM_SCHEDULER_BLOCK.ALL)", "MetricExpr": "CPU_CLK_UNHALTED.CORE_P",
"MetricGroup": "TopdownL4;tma_mem_scheduler_group", "MetricGroup": " ",
"MetricName": "tma_st_buffer", "MetricName": "tma_info_clks_p"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles, relative to the number of mem_scheduler slots, in which uops are blocked due to load buffer full", "BriefDescription": "Cycles Per Instruction",
"MetricExpr": "tma_mem_scheduler * MEM_SCHEDULER_BLOCK.LD_BUF / MEM_SCHEDULER_BLOCK.ALL", "MetricExpr": "tma_info_clks / INST_RETIRED.ANY",
"MetricGroup": "TopdownL4;tma_mem_scheduler_group", "MetricGroup": " ",
"MetricName": "tma_ld_buffer", "MetricName": "tma_info_cpi"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of cycles, relative to the number of mem_scheduler slots, in which uops are blocked due to RSV full relative ", "BriefDescription": "Average CPU Utilization",
"MetricExpr": "tma_mem_scheduler * MEM_SCHEDULER_BLOCK.RSV / MEM_SCHEDULER_BLOCK.ALL", "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC / TSC",
"MetricGroup": "TopdownL4;tma_mem_scheduler_group", "MetricGroup": " ",
"MetricName": "tma_rsv", "MetricName": "tma_info_cpu_utilization"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to IEC or FPC RAT stalls, which can be due to FIQ or IEC reservation stalls in which the integer, floating point or SIMD scheduler is not able to accept uops.", "BriefDescription": "Cycle cost per DRAM hit",
"MetricExpr": "TOPDOWN_BE_BOUND.NON_MEM_SCHEDULER / SLOTS", "MetricExpr": "MEM_BOUND_STALLS.LOAD_DRAM_HIT / MEM_LOAD_UOPS_RETIRED.DRAM_HIT",
"MetricGroup": "TopdownL3;tma_resource_bound_group", "MetricGroup": " ",
"MetricName": "tma_non_mem_scheduler", "MetricName": "tma_info_cycles_per_demand_load_dram_hit"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to the physical register file unable to accept an entry (marble stalls).", "BriefDescription": "Cycle cost per L2 hit",
"MetricExpr": "TOPDOWN_BE_BOUND.REGISTER / SLOTS", "MetricExpr": "MEM_BOUND_STALLS.LOAD_L2_HIT / MEM_LOAD_UOPS_RETIRED.L2_HIT",
"MetricGroup": "TopdownL3;tma_resource_bound_group", "MetricGroup": " ",
"MetricName": "tma_register", "MetricName": "tma_info_cycles_per_demand_load_l2_hit"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to the reorder buffer being full (ROB stalls).", "BriefDescription": "Cycle cost per LLC hit",
"MetricExpr": "TOPDOWN_BE_BOUND.REORDER_BUFFER / SLOTS", "MetricExpr": "MEM_BOUND_STALLS.LOAD_LLC_HIT / MEM_LOAD_UOPS_RETIRED.L3_HIT",
"MetricGroup": "TopdownL3;tma_resource_bound_group", "MetricGroup": " ",
"MetricName": "tma_reorder_buffer", "MetricName": "tma_info_cycles_per_demand_load_l3_hit"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to certain allocation restrictions.", "BriefDescription": "Percentage of all uops which are FPDiv uops",
"MetricExpr": "TOPDOWN_BE_BOUND.ALLOC_RESTRICTIONS / SLOTS", "MetricExpr": "100 * UOPS_RETIRED.FPDIV / UOPS_RETIRED.ALL",
"MetricGroup": "TopdownL3;tma_resource_bound_group", "MetricGroup": " ",
"MetricName": "tma_alloc_restriction", "MetricName": "tma_info_fpdiv_uop_ratio"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to scoreboards from the instruction queue (IQ), jump execution unit (JEU), or microcode sequencer (MS).", "BriefDescription": "Percentage of all uops which are IDiv uops",
"MetricExpr": "TOPDOWN_BE_BOUND.SERIALIZATION / SLOTS", "MetricExpr": "100 * UOPS_RETIRED.IDIV / UOPS_RETIRED.ALL",
"MetricGroup": "TopdownL3;tma_resource_bound_group", "MetricGroup": " ",
"MetricName": "tma_serialization", "MetricName": "tma_info_idiv_uop_ratio"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the numer of issue slots that result in retirement slots. ", "BriefDescription": "Percent of instruction miss cost that hit in DRAM",
"MetricExpr": "TOPDOWN_RETIRING.ALL / SLOTS", "MetricExpr": "100 * MEM_BOUND_STALLS.IFETCH_DRAM_HIT / MEM_BOUND_STALLS.IFETCH",
"MetricGroup": "TopdownL1", "MetricGroup": " ",
"MetricName": "tma_retiring", "MetricName": "tma_info_inst_miss_cost_dramhit_percent"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of uops that are not from the microsequencer. ", "BriefDescription": "Percent of instruction miss cost that hit in the L2",
"MetricExpr": "(TOPDOWN_RETIRING.ALL - UOPS_RETIRED.MS) / SLOTS", "MetricExpr": "100 * MEM_BOUND_STALLS.IFETCH_L2_HIT / MEM_BOUND_STALLS.IFETCH",
"MetricGroup": "TopdownL2;tma_retiring_group", "MetricGroup": " ",
"MetricName": "tma_base", "MetricName": "tma_info_inst_miss_cost_l2hit_percent"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of floating point operations per uop with all default weighting.", "BriefDescription": "Percent of instruction miss cost that hit in the L3",
"MetricExpr": "UOPS_RETIRED.FPDIV / SLOTS", "MetricExpr": "100 * MEM_BOUND_STALLS.IFETCH_LLC_HIT / MEM_BOUND_STALLS.IFETCH",
"MetricGroup": "TopdownL3;tma_base_group", "MetricGroup": " ",
"MetricName": "tma_fp_uops", "MetricName": "tma_info_inst_miss_cost_l3hit_percent"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of uops retired excluding ms and fp div uops.", "BriefDescription": "Instructions per Branch (lower number means higher occurance rate)",
"MetricExpr": "(TOPDOWN_RETIRING.ALL - UOPS_RETIRED.MS - UOPS_RETIRED.FPDIV) / SLOTS", "MetricExpr": "INST_RETIRED.ANY / BR_INST_RETIRED.ALL_BRANCHES",
"MetricGroup": "TopdownL3;tma_base_group", "MetricGroup": " ",
"MetricName": "tma_other_ret", "MetricName": "tma_info_ipbranch"
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Counts the number of uops that are from the complex flows issued by the micro-sequencer (MS)", "BriefDescription": "Instructions Per Cycle",
"MetricExpr": "UOPS_RETIRED.MS / SLOTS", "MetricExpr": "INST_RETIRED.ANY / tma_info_clks",
"MetricGroup": "TopdownL2;tma_retiring_group", "MetricGroup": " ",
"MetricName": "tma_ms_uops", "MetricName": "tma_info_ipc"
"PublicDescription": "Counts the number of uops that are from the complex flows issued by the micro-sequencer (MS). This includes uops from flows due to complex instructions, faults, assists, and inserted flows.",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "", "BriefDescription": "Instruction per (near) call (lower number means higher occurance rate)",
"MetricExpr": "CPU_CLK_UNHALTED.CORE", "MetricExpr": "INST_RETIRED.ANY / BR_INST_RETIRED.CALL",
"MetricName": "CLKS" "MetricGroup": " ",
"MetricName": "tma_info_ipcall"
}, },
{ {
"BriefDescription": "", "BriefDescription": "Instructions per Far Branch",
"MetricExpr": "CPU_CLK_UNHALTED.CORE_P", "MetricExpr": "INST_RETIRED.ANY / (BR_INST_RETIRED.FAR_BRANCH / 2)",
"MetricName": "CLKS_P" "MetricGroup": " ",
"MetricName": "tma_info_ipfarbranch"
}, },
{ {
"BriefDescription": "", "BriefDescription": "Instructions per Load",
"MetricExpr": "5 * CLKS", "MetricExpr": "INST_RETIRED.ANY / MEM_UOPS_RETIRED.ALL_LOADS",
"MetricName": "SLOTS" "MetricGroup": " ",
"MetricName": "tma_info_ipload"
}, },
{ {
"BriefDescription": "Instructions Per Cycle", "BriefDescription": "Number of Instructions per non-speculative Branch Misprediction",
"MetricExpr": "INST_RETIRED.ANY / CLKS", "MetricExpr": "INST_RETIRED.ANY / BR_MISP_RETIRED.ALL_BRANCHES",
"MetricName": "IPC" "MetricGroup": " ",
"MetricName": "tma_info_ipmispredict"
}, },
{ {
"BriefDescription": "Cycles Per Instruction", "BriefDescription": "Instructions per Store",
"MetricExpr": "CLKS / INST_RETIRED.ANY", "MetricExpr": "INST_RETIRED.ANY / MEM_UOPS_RETIRED.ALL_STORES",
"MetricName": "CPI" "MetricGroup": " ",
"MetricName": "tma_info_ipstore"
}, },
{ {
"BriefDescription": "Uops Per Instruction", "BriefDescription": "Fraction of cycles spent in Kernel mode",
"MetricExpr": "UOPS_RETIRED.ALL / INST_RETIRED.ANY", "MetricExpr": "cpu@CPU_CLK_UNHALTED.CORE@k / CPU_CLK_UNHALTED.CORE",
"MetricName": "UPI" "MetricGroup": " ",
"MetricName": "tma_info_kernel_utilization"
}, },
{ {
"BriefDescription": "Percentage of total non-speculative loads with a store forward or unknown store address block", "BriefDescription": "Percentage of total non-speculative loads that are splits",
"MetricExpr": "100 * LD_BLOCKS.DATA_UNKNOWN / MEM_UOPS_RETIRED.ALL_LOADS", "MetricExpr": "100 * MEM_UOPS_RETIRED.SPLIT_LOADS / MEM_UOPS_RETIRED.ALL_LOADS",
"MetricName": "Store_Fwd_Blocks" "MetricName": "tma_info_load_splits"
}, },
{ {
"BriefDescription": "Percentage of total non-speculative loads with a address aliasing block", "BriefDescription": "load ops retired per 1000 instruction",
"MetricExpr": "100 * LD_BLOCKS.4K_ALIAS / MEM_UOPS_RETIRED.ALL_LOADS", "MetricExpr": "1e3 * MEM_UOPS_RETIRED.ALL_LOADS / INST_RETIRED.ANY",
"MetricName": "Address_Alias_Blocks" "MetricGroup": " ",
"MetricName": "tma_info_memloadpki"
}, },
{ {
"BriefDescription": "Percentage of total non-speculative loads that are splits", "BriefDescription": "Percentage of all uops which are ucode ops",
"MetricExpr": "100 * MEM_UOPS_RETIRED.SPLIT_LOADS / MEM_UOPS_RETIRED.ALL_LOADS", "MetricExpr": "100 * UOPS_RETIRED.MS / UOPS_RETIRED.ALL",
"MetricName": "Load_Splits" "MetricGroup": " ",
"MetricName": "tma_info_microcode_uop_ratio"
}, },
{ {
"BriefDescription": "Instructions per Branch (lower number means higher occurrence rate)", "BriefDescription": "",
"MetricExpr": "INST_RETIRED.ANY / BR_INST_RETIRED.ALL_BRANCHES", "MetricExpr": "5 * tma_info_clks",
"MetricName": "IpBranch" "MetricGroup": " ",
"MetricName": "tma_info_slots"
}, },
{ {
"BriefDescription": "Instruction per (near) call (lower number means higher occurrence rate)", "BriefDescription": "Percentage of total non-speculative loads with a store forward or unknown store address block",
"MetricExpr": "INST_RETIRED.ANY / BR_INST_RETIRED.CALL", "MetricExpr": "100 * LD_BLOCKS.DATA_UNKNOWN / MEM_UOPS_RETIRED.ALL_LOADS",
"MetricName": "IpCall" "MetricName": "tma_info_store_fwd_blocks"
}, },
{ {
"BriefDescription": "Instructions per Load", "BriefDescription": "Average Frequency Utilization relative nominal frequency",
"MetricExpr": "INST_RETIRED.ANY / MEM_UOPS_RETIRED.ALL_LOADS", "MetricExpr": "tma_info_clks / CPU_CLK_UNHALTED.REF_TSC",
"MetricName": "IpLoad" "MetricGroup": " ",
"MetricName": "tma_info_turbo_utilization"
}, },
{ {
"BriefDescription": "Instructions per Store", "BriefDescription": "Uops Per Instruction",
"MetricExpr": "INST_RETIRED.ANY / MEM_UOPS_RETIRED.ALL_STORES", "MetricExpr": "UOPS_RETIRED.ALL / INST_RETIRED.ANY",
"MetricName": "IpStore" "MetricGroup": " ",
"MetricName": "tma_info_upi"
}, },
{ {
"BriefDescription": "Number of Instructions per non-speculative Branch Misprediction", "BriefDescription": "Percentage of all uops which are x87 uops",
"MetricExpr": "INST_RETIRED.ANY / BR_MISP_RETIRED.ALL_BRANCHES", "MetricExpr": "100 * UOPS_RETIRED.X87 / UOPS_RETIRED.ALL",
"MetricName": "IpMispredict" "MetricGroup": " ",
"MetricName": "tma_info_x87_uop_ratio"
}, },
{ {
"BriefDescription": "Instructions per Far Branch", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to Instruction Table Lookaside Buffer (ITLB) misses.",
"MetricExpr": "INST_RETIRED.ANY / (BR_INST_RETIRED.FAR_BRANCH / 2)", "MetricExpr": "TOPDOWN_FE_BOUND.ITLB / tma_info_slots",
"MetricName": "IpFarBranch" "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_latency_group",
"MetricName": "tma_itlb",
"MetricThreshold": "tma_itlb > 0.05",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Ratio of all branches which mispredict", "BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a load block.",
"MetricExpr": "BR_MISP_RETIRED.ALL_BRANCHES / BR_INST_RETIRED.ALL_BRANCHES", "MetricExpr": "LD_HEAD.L1_BOUND_AT_RET / tma_info_clks",
"MetricName": "Branch_Mispredict_Ratio" "MetricGroup": "TopdownL3;tma_L3_group;tma_load_store_bound_group",
"MetricName": "tma_l1_bound",
"MetricThreshold": "tma_l1_bound > 0.1",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Ratio between Mispredicted branches and unknown branches", "BriefDescription": "Counts the number of cycles a core is stalled due to a demand load which hit in the L2 Cache.",
"MetricExpr": "BR_MISP_RETIRED.ALL_BRANCHES / BACLEARS.ANY", "MetricConstraint": "NO_GROUP_EVENTS",
"MetricName": "Branch_Mispredict_to_Unknown_Branch_Ratio" "MetricExpr": "MEM_BOUND_STALLS.LOAD_L2_HIT / tma_info_clks - MEM_BOUND_STALLS_AT_RET_CORRECTION * MEM_BOUND_STALLS.LOAD_L2_HIT / MEM_BOUND_STALLS.LOAD",
"MetricGroup": "TopdownL3;tma_L3_group;tma_load_store_bound_group",
"MetricName": "tma_l2_bound",
"MetricThreshold": "tma_l2_bound > 0.1",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Percentage of all uops which are ucode ops", "BriefDescription": "Counts the number of cycles a core is stalled due to a demand load which hit in the Last Level Cache (LLC) or other core with HITE/F/M.",
"MetricExpr": "100 * UOPS_RETIRED.MS / UOPS_RETIRED.ALL", "MetricExpr": "MEM_BOUND_STALLS.LOAD_LLC_HIT / tma_info_clks - MEM_BOUND_STALLS_AT_RET_CORRECTION * MEM_BOUND_STALLS.LOAD_LLC_HIT / MEM_BOUND_STALLS.LOAD",
"MetricName": "Microcode_Uop_Ratio" "MetricGroup": "TopdownL3;tma_L3_group;tma_load_store_bound_group",
"MetricName": "tma_l3_bound",
"MetricThreshold": "tma_l3_bound > 0.1",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Percentage of all uops which are FPDiv uops", "BriefDescription": "Counts the number of cycles, relative to the number of mem_scheduler slots, in which uops are blocked due to load buffer full",
"MetricExpr": "100 * UOPS_RETIRED.FPDIV / UOPS_RETIRED.ALL", "MetricExpr": "tma_mem_scheduler * MEM_SCHEDULER_BLOCK.LD_BUF / MEM_SCHEDULER_BLOCK.ALL",
"MetricName": "FPDiv_Uop_Ratio" "MetricGroup": "TopdownL4;tma_L4_group;tma_mem_scheduler_group",
"MetricName": "tma_ld_buffer",
"MetricThreshold": "tma_ld_buffer > 0.05",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Percentage of all uops which are IDiv uops", "BriefDescription": "Counts the number of cycles the core is stalled due to stores or loads.",
"MetricExpr": "100 * UOPS_RETIRED.IDIV / UOPS_RETIRED.ALL", "MetricExpr": "min(tma_backend_bound, LD_HEAD.ANY_AT_RET / tma_info_clks + tma_store_bound)",
"MetricName": "IDiv_Uop_Ratio" "MetricGroup": "TopdownL2;tma_L2_group;tma_backend_bound_group",
"MetricName": "tma_load_store_bound",
"MetricThreshold": "tma_load_store_bound > 0.2",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Percentage of all uops which are x87 uops", "BriefDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a machine clear (nuke) of any kind including memory ordering and memory disambiguation.",
"MetricExpr": "100 * UOPS_RETIRED.X87 / UOPS_RETIRED.ALL", "MetricExpr": "TOPDOWN_BAD_SPECULATION.MACHINE_CLEARS / tma_info_slots",
"MetricName": "X87_Uop_Ratio" "MetricGroup": "TopdownL2;tma_L2_group;tma_bad_speculation_group",
"MetricName": "tma_machine_clears",
"MetricThreshold": "tma_machine_clears > 0.05",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Average Frequency Utilization relative nominal frequency", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to memory reservation stalls in which a scheduler is not able to accept uops.",
"MetricExpr": "CLKS / CPU_CLK_UNHALTED.REF_TSC", "MetricExpr": "TOPDOWN_BE_BOUND.MEM_SCHEDULER / tma_info_slots",
"MetricName": "Turbo_Utilization" "MetricGroup": "TopdownL3;tma_L3_group;tma_resource_bound_group",
"MetricName": "tma_mem_scheduler",
"MetricThreshold": "tma_mem_scheduler > 0.1",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Fraction of cycles spent in Kernel mode", "BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to memory ordering.",
"MetricExpr": "cpu@CPU_CLK_UNHALTED.CORE@k / CPU_CLK_UNHALTED.CORE", "MetricExpr": "tma_nuke * (MACHINE_CLEARS.MEMORY_ORDERING / MACHINE_CLEARS.SLOW)",
"MetricName": "Kernel_Utilization" "MetricGroup": "TopdownL4;tma_L4_group;tma_nuke_group",
"MetricName": "tma_memory_ordering",
"MetricThreshold": "tma_memory_ordering > 0.02",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Average CPU Utilization", "BriefDescription": "Counts the number of uops that are from the complex flows issued by the micro-sequencer (MS)",
"MetricExpr": "CPU_CLK_UNHALTED.REF_TSC / TSC", "MetricExpr": "UOPS_RETIRED.MS / tma_info_slots",
"MetricName": "CPU_Utilization" "MetricGroup": "TopdownL2;tma_L2_group;tma_retiring_group",
"MetricName": "tma_ms_uops",
"MetricThreshold": "tma_ms_uops > 0.05",
"PublicDescription": "Counts the number of uops that are from the complex flows issued by the micro-sequencer (MS). This includes uops from flows due to complex instructions, faults, assists, and inserted flows.",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Cycle cost per L2 hit", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to IEC or FPC RAT stalls, which can be due to FIQ or IEC reservation stalls in which the integer, floating point or SIMD scheduler is not able to accept uops.",
"MetricExpr": "MEM_BOUND_STALLS.LOAD_L2_HIT / MEM_LOAD_UOPS_RETIRED.L2_HIT", "MetricExpr": "TOPDOWN_BE_BOUND.NON_MEM_SCHEDULER / tma_info_slots",
"MetricName": "Cycles_per_Demand_Load_L2_Hit" "MetricGroup": "TopdownL3;tma_L3_group;tma_resource_bound_group",
"MetricName": "tma_non_mem_scheduler",
"MetricThreshold": "tma_non_mem_scheduler > 0.1",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Cycle cost per LLC hit", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to a machine clear (slow nuke).",
"MetricExpr": "MEM_BOUND_STALLS.LOAD_LLC_HIT / MEM_LOAD_UOPS_RETIRED.L3_HIT", "MetricExpr": "TOPDOWN_BAD_SPECULATION.NUKE / tma_info_slots",
"MetricName": "Cycles_per_Demand_Load_L3_Hit" "MetricGroup": "TopdownL3;tma_L3_group;tma_machine_clears_group",
"MetricName": "tma_nuke",
"MetricThreshold": "tma_nuke > 0.05",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Cycle cost per DRAM hit", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to other common frontend stalls not categorized.",
"MetricExpr": "MEM_BOUND_STALLS.LOAD_DRAM_HIT / MEM_LOAD_UOPS_RETIRED.DRAM_HIT", "MetricExpr": "TOPDOWN_FE_BOUND.OTHER / tma_info_slots",
"MetricName": "Cycles_per_Demand_Load_DRAM_Hit" "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_bandwidth_group",
"MetricName": "tma_other_fb",
"MetricThreshold": "tma_other_fb > 0.05",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Percent of instruction miss cost that hit in the L2", "BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a number of other load blocks.",
"MetricExpr": "100 * MEM_BOUND_STALLS.IFETCH_L2_HIT / MEM_BOUND_STALLS.IFETCH", "MetricExpr": "LD_HEAD.OTHER_AT_RET / tma_info_clks",
"MetricName": "Inst_Miss_Cost_L2Hit_Percent" "MetricGroup": "TopdownL4;tma_L4_group;tma_l1_bound_group",
"MetricName": "tma_other_l1",
"MetricThreshold": "tma_other_l1 > 0.05",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Percent of instruction miss cost that hit in the L3", "BriefDescription": "Counts the number of cycles the core is stalled due to a demand load miss which hits in the L2, LLC, DRAM or MMIO (Non-DRAM) but could not be correctly attributed or cycles in which the load miss is waiting on a request buffer.",
"MetricExpr": "100 * MEM_BOUND_STALLS.IFETCH_LLC_HIT / MEM_BOUND_STALLS.IFETCH", "MetricExpr": "max(0, tma_load_store_bound - (tma_store_bound + tma_l1_bound + tma_l2_bound + tma_l3_bound + tma_dram_bound))",
"MetricName": "Inst_Miss_Cost_L3Hit_Percent" "MetricGroup": "TopdownL3;tma_L3_group;tma_load_store_bound_group",
"MetricName": "tma_other_load_store",
"MetricThreshold": "tma_other_load_store > 0.1",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "Percent of instruction miss cost that hit in DRAM", "BriefDescription": "Counts the number of uops retired excluding ms and fp div uops.",
"MetricExpr": "100 * MEM_BOUND_STALLS.IFETCH_DRAM_HIT / MEM_BOUND_STALLS.IFETCH", "MetricExpr": "(TOPDOWN_RETIRING.ALL - UOPS_RETIRED.MS - UOPS_RETIRED.FPDIV) / tma_info_slots",
"MetricName": "Inst_Miss_Cost_DRAMHit_Percent" "MetricGroup": "TopdownL3;tma_L3_group;tma_base_group",
"MetricName": "tma_other_ret",
"MetricThreshold": "tma_other_ret > 0.3",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "load ops retired per 1000 instruction", "BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to page faults.",
"MetricExpr": "1e3 * MEM_UOPS_RETIRED.ALL_LOADS / INST_RETIRED.ANY", "MetricExpr": "tma_nuke * (MACHINE_CLEARS.PAGE_FAULT / MACHINE_CLEARS.SLOW)",
"MetricName": "MemLoadPKI" "MetricGroup": "TopdownL4;tma_L4_group;tma_nuke_group",
"MetricName": "tma_page_fault",
"MetricThreshold": "tma_page_fault > 0.02",
"ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C1 residency percent per core", "BriefDescription": "Counts the number of issue slots that were not delivered by the frontend due to wrong predecodes.",
"MetricExpr": "cstate_core@c1\\-residency@ / TSC", "MetricExpr": "TOPDOWN_FE_BOUND.PREDECODE / tma_info_slots",
"MetricGroup": "Power", "MetricGroup": "TopdownL3;tma_L3_group;tma_frontend_bandwidth_group",
"MetricName": "C1_Core_Residency", "MetricName": "tma_predecode",
"MetricThreshold": "tma_predecode > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C6 residency percent per core", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to the physical register file unable to accept an entry (marble stalls).",
"MetricExpr": "cstate_core@c6\\-residency@ / TSC", "MetricExpr": "TOPDOWN_BE_BOUND.REGISTER / tma_info_slots",
"MetricGroup": "Power", "MetricGroup": "TopdownL3;tma_L3_group;tma_resource_bound_group",
"MetricName": "C6_Core_Residency", "MetricName": "tma_register",
"MetricThreshold": "tma_register > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C7 residency percent per core", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to the reorder buffer being full (ROB stalls).",
"MetricExpr": "cstate_core@c7\\-residency@ / TSC", "MetricExpr": "TOPDOWN_BE_BOUND.REORDER_BUFFER / tma_info_slots",
"MetricGroup": "Power", "MetricGroup": "TopdownL3;tma_L3_group;tma_resource_bound_group",
"MetricName": "C7_Core_Residency", "MetricName": "tma_reorder_buffer",
"MetricThreshold": "tma_reorder_buffer > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C2 residency percent per package", "BriefDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls",
"MetricExpr": "cstate_pkg@c2\\-residency@ / TSC", "MetricExpr": "tma_backend_bound",
"MetricGroup": "Power", "MetricGroup": "TopdownL2;tma_L2_group;tma_backend_bound_aux_group",
"MetricName": "C2_Pkg_Residency", "MetricName": "tma_resource_bound",
"MetricThreshold": "tma_resource_bound > 0.2",
"PublicDescription": "Counts the total number of issue slots that were not consumed by the backend due to backend stalls. Note that uops must be available for consumption in order for this event to count. If a uop is not available (IQ is empty), this event will not count.",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C3 residency percent per package", "BriefDescription": "Counts the numer of issue slots that result in retirement slots.",
"MetricExpr": "cstate_pkg@c3\\-residency@ / TSC", "MetricExpr": "TOPDOWN_RETIRING.ALL / tma_info_slots",
"MetricGroup": "Power", "MetricGroup": "TopdownL1;tma_L1_group",
"MetricName": "C3_Pkg_Residency", "MetricName": "tma_retiring",
"MetricThreshold": "tma_retiring > 0.75",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C6 residency percent per package", "BriefDescription": "Counts the number of cycles, relative to the number of mem_scheduler slots, in which uops are blocked due to RSV full relative",
"MetricExpr": "cstate_pkg@c6\\-residency@ / TSC", "MetricExpr": "tma_mem_scheduler * MEM_SCHEDULER_BLOCK.RSV / MEM_SCHEDULER_BLOCK.ALL",
"MetricGroup": "Power", "MetricGroup": "TopdownL4;tma_L4_group;tma_mem_scheduler_group",
"MetricName": "C6_Pkg_Residency", "MetricName": "tma_rsv",
"MetricThreshold": "tma_rsv > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C7 residency percent per package", "BriefDescription": "Counts the number of issue slots that were not consumed by the backend due to scoreboards from the instruction queue (IQ), jump execution unit (JEU), or microcode sequencer (MS).",
"MetricExpr": "cstate_pkg@c7\\-residency@ / TSC", "MetricExpr": "TOPDOWN_BE_BOUND.SERIALIZATION / tma_info_slots",
"MetricGroup": "Power", "MetricGroup": "TopdownL3;tma_L3_group;tma_resource_bound_group",
"MetricName": "C7_Pkg_Residency", "MetricName": "tma_serialization",
"MetricThreshold": "tma_serialization > 0.1",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C8 residency percent per package", "BriefDescription": "Counts the number of machine clears relative to the number of nuke slots due to SMC.",
"MetricExpr": "cstate_pkg@c8\\-residency@ / TSC", "MetricExpr": "tma_nuke * (MACHINE_CLEARS.SMC / MACHINE_CLEARS.SLOW)",
"MetricGroup": "Power", "MetricGroup": "TopdownL4;tma_L4_group;tma_nuke_group",
"MetricName": "C8_Pkg_Residency", "MetricName": "tma_smc",
"MetricThreshold": "tma_smc > 0.02",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C9 residency percent per package", "BriefDescription": "Counts the number of cycles, relative to the number of mem_scheduler slots, in which uops are blocked due to store buffer full",
"MetricExpr": "cstate_pkg@c9\\-residency@ / TSC", "MetricExpr": "tma_store_bound",
"MetricGroup": "Power", "MetricGroup": "TopdownL4;tma_L4_group;tma_mem_scheduler_group",
"MetricName": "C9_Pkg_Residency", "MetricName": "tma_st_buffer",
"MetricThreshold": "tma_st_buffer > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
}, },
{ {
"BriefDescription": "C10 residency percent per package", "BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a first level TLB miss.",
"MetricExpr": "cstate_pkg@c10\\-residency@ / TSC", "MetricExpr": "LD_HEAD.DTLB_MISS_AT_RET / tma_info_clks",
"MetricGroup": "Power", "MetricGroup": "TopdownL4;tma_L4_group;tma_l1_bound_group",
"MetricName": "C10_Pkg_Residency", "MetricName": "tma_stlb_hit",
"MetricThreshold": "tma_stlb_hit > 0.05",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a second level TLB miss requiring a page walk.",
"MetricExpr": "LD_HEAD.PGWALK_AT_RET / tma_info_clks",
"MetricGroup": "TopdownL4;tma_L4_group;tma_l1_bound_group",
"MetricName": "tma_stlb_miss",
"MetricThreshold": "tma_stlb_miss > 0.05",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Counts the number of cycles the core is stalled due to store buffer full.",
"MetricExpr": "tma_mem_scheduler * (MEM_SCHEDULER_BLOCK.ST_BUF / MEM_SCHEDULER_BLOCK.ALL)",
"MetricGroup": "TopdownL3;tma_L3_group;tma_load_store_bound_group",
"MetricName": "tma_store_bound",
"MetricThreshold": "tma_store_bound > 0.1",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Counts the number of cycles that the oldest load of the load buffer is stalled at retirement due to a store forward block.",
"MetricExpr": "LD_HEAD.ST_ADDR_AT_RET / tma_info_clks",
"MetricGroup": "TopdownL4;tma_L4_group;tma_l1_bound_group",
"MetricName": "tma_store_fwd",
"MetricThreshold": "tma_store_fwd > 0.05",
"ScaleUnit": "100%" "ScaleUnit": "100%"
} }
] ]
Family-model,Version,Filename,EventType Family-model,Version,Filename,EventType
GenuineIntel-6-(97|9A|B7|BA|BF),v1.18,alderlake,core GenuineIntel-6-(97|9A|B7|BA|BF),v1.18,alderlake,core
GenuineIntel-6-BE,v1.16,alderlaken,core GenuineIntel-6-BE,v1.18,alderlaken,core
GenuineIntel-6-(1C|26|27|35|36),v4,bonnell,core GenuineIntel-6-(1C|26|27|35|36),v4,bonnell,core
GenuineIntel-6-(3D|47),v26,broadwell,core GenuineIntel-6-(3D|47),v26,broadwell,core
GenuineIntel-6-56,v7,broadwellde,core GenuineIntel-6-56,v7,broadwellde,core
......