Commit dd9168ab authored by Will Deacon

Merge branch 'for-next/perf' into for-next/core

* for-next/perf: (30 commits)
  arm: perf: Fix ARCH=arm build with GCC
  MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
  drivers/perf: add DesignWare PCIe PMU driver
  PCI: Move pci_clear_and_set_dword() helper to PCI header
  PCI: Add Alibaba Vendor ID to linux/pci_ids.h
  docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
  Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
  Documentation: arm64: Document the PMU event counting threshold feature
  arm64: perf: Add support for event counting threshold
  arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs
  KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h
  perf/arm_dmc620: Remove duplicate format attribute #defines
  arm: pmu: Share user ABI format mechanism with SPE
  arm64: perf: Include threshold control fields in PMEVTYPER mask
  arm: perf: Convert remaining fields to use GENMASK
  arm: perf: Use GENMASK for PMMIR fields
  arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
  arm: perf: Remove inlines from arm_pmuv3.c
  drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntax
  drivers/perf: Remove usage of the deprecated ida_simple_xx() API
  ...
parents 3b47bd8f bb339db4
======================================================================
Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
======================================================================
DesignWare Cores (DWC) PCIe PMU
===============================
The PMU is a PCIe configuration space register block provided by each PCIe Root
Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
injection, and Statistics).
As the name indicates, the RAS DES capability supports system level
debugging, AER error injection, and collection of statistics. To facilitate
collection of statistics, Synopsys DesignWare Cores PCIe controller
provides the following two features:
- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
time spent in each low-power LTSSM state) and
- one 32-bit counter for Event Counting (error and non-error events for
a specified lane)
Note: There is no interrupt for counter overflow.
Time Based Analysis
-------------------
Using this feature you can obtain information regarding RX/TX data
throughput and time spent in each low-power LTSSM state by the controller.
The PMU measures data in two categories:
- Group#0: Percentage of time the controller stays in LTSSM states.
- Group#1: Amount of data processed (Units of 16 bytes).
Lane Event counters
-------------------
Using this feature you can obtain error and non-error event counts for a
specific lane of the controller. The PMU event is selected by all of:
- Group i
- Event j within the Group i
- Lane k
Some of the events only exist for specific configurations.
DesignWare Cores (DWC) PCIe PMU Driver
=======================================
This driver adds a PMU device for each PCIe Root Port, named after the BDF of
the Root Port. For example,
30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
the PMU device name for this Root Port is dwc_rootport_3018.
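The suffix 3018 in this example matches the conventional 16-bit PCI encoding of bus 0x30, device 0x03, function 0x0, printed in hexadecimal. The sketch below reproduces that mapping; the helper names pci_bdf() and dwc_pmu_name() are hypothetical and are not taken from the driver, and the "dwc_rootport_%x" format string is an assumption inferred from the example name.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Pack bus/device/function into the conventional 16-bit BDF value:
 * bus in bits [15:8], device in bits [7:3], function in bits [2:0].
 */
static uint16_t pci_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/*
 * Format the PMU device name for a Root Port (assumed naming scheme).
 * Returns the number of characters snprintf() would have written.
 */
static int dwc_pmu_name(char *buf, size_t len,
			uint8_t bus, uint8_t dev, uint8_t fn)
{
	return snprintf(buf, len, "dwc_rootport_%x",
			(unsigned int)pci_bdf(bus, dev, fn));
}
```

For the Root Port at 30:03.0, pci_bdf(0x30, 0x03, 0x0) evaluates to 0x3018, which gives the device name used in the rest of this document.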
The DWC PCIe PMU driver registers a perf PMU driver, which provides a
description of the available events and configuration options in sysfs, see
/sys/bus/event_source/devices/dwc_rootport_{bdf}.
The "format" directory describes format of the config fields of the
perf_event_attr structure. The "events" directory provides configuration
templates for all documented events. For example,
"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".
The "perf list" command shall list the available events from sysfs, e.g.::
$# perf list | grep dwc_rootport
<...>
dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
<...>
dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
Time Based Analysis Event Usage
-------------------------------
Example usage of counting PCIe RX TLP data payload (Units of bytes)::
$# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
The average RX/TX bandwidth can be calculated using the following formula:
PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window
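As a minimal sketch of the formula above, assuming the payload count has already been read from the "perf stat" output in bytes and the measurement window is the elapsed time in seconds (the function name pcie_avg_bandwidth() is hypothetical):

```c
#include <stdint.h>

/*
 * Average bandwidth in bytes per second:
 * payload bytes divided by the measurement window in seconds.
 * Returns 0.0 for a non-positive window instead of dividing by zero.
 */
static double pcie_avg_bandwidth(uint64_t payload_bytes, double window_seconds)
{
	if (window_seconds <= 0.0)
		return 0.0;
	return (double)payload_bytes / window_seconds;
}
```

The same helper applies to both directions: pass the Rx_PCIe_TLP_Data_Payload count for RX bandwidth and the Tx_PCIe_TLP_Data_Payload count for TX bandwidth.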
Lane Event Usage
-------------------------------
Each lane has the same event set. To avoid generating a list of hundreds
of events, the user needs to specify the lane ID explicitly, e.g.::
$# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
The driver does not support sampling, therefore "perf record" will not
work. Per-task (without "-a") perf sessions are not supported.
...@@ -13,8 +13,8 @@ is one register for each counter. Counter 0 is special in that it always counts ...@@ -13,8 +13,8 @@ is one register for each counter. Counter 0 is special in that it always counts
interrupt is raised. If any other counter overflows, it continues counting, and interrupt is raised. If any other counter overflows, it continues counting, and
no interrupt is raised. no interrupt is raised.
The "format" directory describes format of the config (event ID) and config1 The "format" directory describes format of the config (event ID) and config1/2
(AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/ (AXI filter setting) fields of the perf_event_attr structure, see /sys/bus/event_source/
devices/imx8_ddr0/format/. The "events" directory describes the events types devices/imx8_ddr0/format/. The "events" directory describes the events types
hardware supported that can be used with perf tool, see /sys/bus/event_source/ hardware supported that can be used with perf tool, see /sys/bus/event_source/
devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented
...@@ -28,12 +28,11 @@ in DDR PMU, see /sys/bus/events_source/devices/imx8_ddr0/caps/. ...@@ -28,12 +28,11 @@ in DDR PMU, see /sys/bus/events_source/devices/imx8_ddr0/caps/.
AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write) AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write)
to count reading or writing matches filter setting. Filter setting is various to count reading or writing matches filter setting. Filter setting is various
from different DRAM controller implementations, which is distinguished by quirks from different DRAM controller implementations, which is distinguished by quirks
in the driver. You also can dump info from userspace, filter in "caps" directory in the driver. You also can dump info from userspace, "caps" directory show the
indicates whether PMU supports AXI ID filter or not; enhanced_filter indicates type of AXI filter (filter, enhanced_filter and super_filter). Value 0 for
whether PMU supports enhanced AXI ID filter or not. Value 0 for un-supported, and un-supported, and value 1 for supported.
value 1 for supported.
* With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0). * With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0, super_filter: 0).
Filter is defined with two configuration parts: Filter is defined with two configuration parts:
--AXI_ID defines AxID matching value. --AXI_ID defines AxID matching value.
--AXI_MASKING defines which bits of AxID are meaningful for the matching. --AXI_MASKING defines which bits of AxID are meaningful for the matching.
...@@ -65,7 +64,37 @@ value 1 for supported. ...@@ -65,7 +64,37 @@ value 1 for supported.
perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12 perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12
* With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1). * With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1, super_filter: 0).
This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits
counting the number of bytes (as opposed to the number of bursts) from DDR counting the number of bytes (as opposed to the number of bursts) from DDR
read and write transactions concurrently with another set of data counters. read and write transactions concurrently with another set of data counters.
* With DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk(filter: 0, enhanced_filter: 0, super_filter: 1).
The previous AXI filter has a limitation: it cannot filter different IDs
at the same time, because the filter is shared between counters. This quirk
extends the AXI ID filter. One improvement is that counters 1-3 each have their
own filter, so different IDs can be filtered concurrently. Another
improvement is that counters 1-3 support AXI PORT and CHANNEL selection, that is,
selecting either the address channel or the data channel.
The filter is defined with two configuration registers for each of counters 1-3.
--Counter N MASK COMP register - including AXI_ID and AXI_MASKING.
--Counter N MUX CNTL register - including AXI CHANNEL and AXI PORT.
- 0: address channel
- 1: data channel
The PMU in the DDR subsystem has only a single port (port0), so axi_port is
reserved and should be 0.
.. code-block:: bash
perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd
perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd
.. note::
axi_channel is inverted in userspace, and the driver reverts it
automatically. Users therefore do not need to specify axi_channel to
monitor the data channel of DDR transactions, which is the more
meaningful of the two.
...@@ -19,6 +19,7 @@ Performance monitor support ...@@ -19,6 +19,7 @@ Performance monitor support
arm_dsu_pmu arm_dsu_pmu
thunderx2-pmu thunderx2-pmu
alibaba_pmu alibaba_pmu
dwc_pcie_pmu
nvidia-pmu nvidia-pmu
meson-ddr-pmu meson-ddr-pmu
cxl cxl
......
...@@ -164,3 +164,75 @@ and should be used to mask the upper bits as needed. ...@@ -164,3 +164,75 @@ and should be used to mask the upper bits as needed.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
.. _tools/lib/perf/tests/test-evsel.c: .. _tools/lib/perf/tests/test-evsel.c:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
Event Counting Threshold
==========================================
Overview
--------
FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
events whose count meets a specified threshold condition. For example, if
threshold_compare is set to 2 ('Greater than or equal'), and the
threshold is set to 2, then the PMU counter will only increment
when an event would have previously incremented the PMU counter by 2 or
more on a single processor cycle.
To increment by 1 after passing the threshold condition instead of the
number of events on that cycle, add the 'threshold_count' option to the
command line.
How-to
------
These are the parameters for controlling the feature:
.. list-table::
:header-rows: 1
* - Parameter
- Description
* - threshold
- Value to threshold the event by. A value of 0 means that
thresholding is disabled and the other parameters have no effect.
* - threshold_compare
- | Comparison function to use, with the following values supported:
|
| 0: Not-equal
| 1: Equals
| 2: Greater-than-or-equal
| 3: Less-than
* - threshold_count
- If this is set, count by 1 after passing the threshold condition
instead of the value of the event on this cycle.
The threshold, threshold_compare and threshold_count values can be
provided per event, for example:
.. code-block:: sh
perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
-e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
In this example the stall_slot event will count by 2 or more on every
cycle where 2 or more stalls happen, and dtlb_walk will count by 1 on
every cycle where the number of dtlb walks was less than 10.
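The counting rule described above can be modelled per cycle as follows. This is only a behavioural sketch of the documented parameters, not the hardware definition; the enum names and the function threshold_increment() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values of threshold_compare as listed in the parameter table. */
enum threshold_compare {
	TH_NOT_EQUAL = 0,
	TH_EQUALS = 1,
	TH_GREATER_EQUAL = 2,
	TH_LESS_THAN = 3,
};

/*
 * Return how much the counter advances on one cycle, given the number
 * of events that occurred on that cycle.
 */
static uint32_t threshold_increment(uint32_t events, uint32_t threshold,
				    enum threshold_compare cmp,
				    bool threshold_count)
{
	bool pass;

	/* A threshold of 0 disables thresholding entirely. */
	if (threshold == 0)
		return events;

	switch (cmp) {
	case TH_NOT_EQUAL:
		pass = events != threshold;
		break;
	case TH_EQUALS:
		pass = events == threshold;
		break;
	case TH_GREATER_EQUAL:
		pass = events >= threshold;
		break;
	case TH_LESS_THAN:
		pass = events < threshold;
		break;
	default:
		return 0;	/* unknown comparison: count nothing */
	}

	if (!pass)
		return 0;

	/* With threshold_count set, count by 1 instead of the event value. */
	return threshold_count ? 1 : events;
}
```

With the stall_slot settings from the example (threshold=2, threshold_compare=2), a cycle with 3 stalls adds 3 and a cycle with 1 stall adds nothing. With the dtlb_walk settings (threshold=10, threshold_compare=3, threshold_count), a cycle with 4 walks adds 1.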
The maximum supported threshold value can be read from the caps of each
PMU, for example:
.. code-block:: sh
cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
0x000000ff
If a value higher than this is given, then opening the event will result
in an error. The highest possible maximum is 4095, as the config field
for threshold is limited to 12 bits, and the Perf tool will refuse to
parse higher values.
If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
0, and attempting to set a threshold value will also result in an error.
threshold_max will also read as 0 on aarch32 guests, even if the host
is running on hardware with the feature.
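A tool could pre-check a requested threshold against these limits before opening the event. The sketch below assumes the caller has already read the text of the threshold_max caps file into a string; the function name threshold_is_valid() and the macro THRESHOLD_FIELD_MAX are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Largest value that fits in the 12-bit threshold config field. */
#define THRESHOLD_FIELD_MAX 4095UL

/*
 * Check a requested threshold against the text read from
 * caps/threshold_max (for example "0x000000ff"). A threshold of 0
 * always passes because it means thresholding is disabled.
 */
static bool threshold_is_valid(unsigned long threshold, const char *caps_text)
{
	/* Base 0 lets strtoul() accept the 0x-prefixed hexadecimal form. */
	unsigned long max = strtoul(caps_text, NULL, 0);

	return threshold <= max && threshold <= THRESHOLD_FIELD_MAX;
}
```

When threshold_max reads 0, only a threshold of 0 passes, which matches the behaviour described for PMUs without FEAT_PMUv3_TH.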
...@@ -27,6 +27,9 @@ properties: ...@@ -27,6 +27,9 @@ properties:
- fsl,imx8mq-ddr-pmu - fsl,imx8mq-ddr-pmu
- fsl,imx8mp-ddr-pmu - fsl,imx8mp-ddr-pmu
- const: fsl,imx8m-ddr-pmu - const: fsl,imx8m-ddr-pmu
- items:
- const: fsl,imx8dxl-ddr-pmu
- const: fsl,imx8-ddr-pmu
reg: reg:
maxItems: 1 maxItems: 1
......
...@@ -21090,6 +21090,13 @@ L: linux-mmc@vger.kernel.org ...@@ -21090,6 +21090,13 @@ L: linux-mmc@vger.kernel.org
S: Maintained S: Maintained
F: drivers/mmc/host/dw_mmc* F: drivers/mmc/host/dw_mmc*
SYNOPSYS DESIGNWARE PCIE PMU DRIVER
M: Shuai Xue <xueshuai@linux.alibaba.com>
M: Jing Zhang <renyu.zj@linux.alibaba.com>
S: Supported
F: Documentation/admin-guide/perf/dwc_pcie_pmu.rst
F: drivers/perf/dwc_pcie_pmu.c
SYNOPSYS HSDK RESET CONTROLLER DRIVER SYNOPSYS HSDK RESET CONTROLLER DRIVER
M: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> M: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
S: Supported S: Supported
......
...@@ -268,10 +268,8 @@ static inline void armv6pmu_write_counter(struct perf_event *event, u64 value) ...@@ -268,10 +268,8 @@ static inline void armv6pmu_write_counter(struct perf_event *event, u64 value)
static void armv6pmu_enable_event(struct perf_event *event) static void armv6pmu_enable_event(struct perf_event *event)
{ {
unsigned long val, mask, evt, flags; unsigned long val, mask, evt;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
if (ARMV6_CYCLE_COUNTER == idx) { if (ARMV6_CYCLE_COUNTER == idx) {
...@@ -294,12 +292,10 @@ static void armv6pmu_enable_event(struct perf_event *event) ...@@ -294,12 +292,10 @@ static void armv6pmu_enable_event(struct perf_event *event)
* Mask out the current event and set the counter to count the event * Mask out the current event and set the counter to count the event
* that we're interested in. * that we're interested in.
*/ */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = armv6_pmcr_read(); val = armv6_pmcr_read();
val &= ~mask; val &= ~mask;
val |= evt; val |= evt;
armv6_pmcr_write(val); armv6_pmcr_write(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static irqreturn_t static irqreturn_t
...@@ -362,26 +358,20 @@ armv6pmu_handle_irq(struct arm_pmu *cpu_pmu) ...@@ -362,26 +358,20 @@ armv6pmu_handle_irq(struct arm_pmu *cpu_pmu)
static void armv6pmu_start(struct arm_pmu *cpu_pmu) static void armv6pmu_start(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags, val; unsigned long val;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = armv6_pmcr_read(); val = armv6_pmcr_read();
val |= ARMV6_PMCR_ENABLE; val |= ARMV6_PMCR_ENABLE;
armv6_pmcr_write(val); armv6_pmcr_write(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void armv6pmu_stop(struct arm_pmu *cpu_pmu) static void armv6pmu_stop(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags, val; unsigned long val;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = armv6_pmcr_read(); val = armv6_pmcr_read();
val &= ~ARMV6_PMCR_ENABLE; val &= ~ARMV6_PMCR_ENABLE;
armv6_pmcr_write(val); armv6_pmcr_write(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static int static int
...@@ -419,10 +409,8 @@ static void armv6pmu_clear_event_idx(struct pmu_hw_events *cpuc, ...@@ -419,10 +409,8 @@ static void armv6pmu_clear_event_idx(struct pmu_hw_events *cpuc,
static void armv6pmu_disable_event(struct perf_event *event) static void armv6pmu_disable_event(struct perf_event *event)
{ {
unsigned long val, mask, evt, flags; unsigned long val, mask, evt;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
if (ARMV6_CYCLE_COUNTER == idx) { if (ARMV6_CYCLE_COUNTER == idx) {
...@@ -444,20 +432,16 @@ static void armv6pmu_disable_event(struct perf_event *event) ...@@ -444,20 +432,16 @@ static void armv6pmu_disable_event(struct perf_event *event)
* of ETM bus signal assertion cycles. The external reporting should * of ETM bus signal assertion cycles. The external reporting should
* be disabled and so this should never increment. * be disabled and so this should never increment.
*/ */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = armv6_pmcr_read(); val = armv6_pmcr_read();
val &= ~mask; val &= ~mask;
val |= evt; val |= evt;
armv6_pmcr_write(val); armv6_pmcr_write(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void armv6mpcore_pmu_disable_event(struct perf_event *event) static void armv6mpcore_pmu_disable_event(struct perf_event *event)
{ {
unsigned long val, mask, flags, evt = 0; unsigned long val, mask, evt = 0;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
if (ARMV6_CYCLE_COUNTER == idx) { if (ARMV6_CYCLE_COUNTER == idx) {
...@@ -475,12 +459,10 @@ static void armv6mpcore_pmu_disable_event(struct perf_event *event) ...@@ -475,12 +459,10 @@ static void armv6mpcore_pmu_disable_event(struct perf_event *event)
* Unlike UP ARMv6, we don't have a way of stopping the counters. We * Unlike UP ARMv6, we don't have a way of stopping the counters. We
* simply disable the interrupt reporting. * simply disable the interrupt reporting.
*/ */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = armv6_pmcr_read(); val = armv6_pmcr_read();
val &= ~mask; val &= ~mask;
val |= evt; val |= evt;
armv6_pmcr_write(val); armv6_pmcr_write(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static int armv6_map_event(struct perf_event *event) static int armv6_map_event(struct perf_event *event)
......
...@@ -870,10 +870,8 @@ static void armv7_pmnc_dump_regs(struct arm_pmu *cpu_pmu) ...@@ -870,10 +870,8 @@ static void armv7_pmnc_dump_regs(struct arm_pmu *cpu_pmu)
static void armv7pmu_enable_event(struct perf_event *event) static void armv7pmu_enable_event(struct perf_event *event)
{ {
unsigned long flags;
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) { if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) {
...@@ -886,7 +884,6 @@ static void armv7pmu_enable_event(struct perf_event *event) ...@@ -886,7 +884,6 @@ static void armv7pmu_enable_event(struct perf_event *event)
* Enable counter and interrupt, and set the counter to count * Enable counter and interrupt, and set the counter to count
* the event that we're interested in. * the event that we're interested in.
*/ */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* /*
* Disable counter * Disable counter
...@@ -910,16 +907,12 @@ static void armv7pmu_enable_event(struct perf_event *event) ...@@ -910,16 +907,12 @@ static void armv7pmu_enable_event(struct perf_event *event)
* Enable counter * Enable counter
*/ */
armv7_pmnc_enable_counter(idx); armv7_pmnc_enable_counter(idx);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void armv7pmu_disable_event(struct perf_event *event) static void armv7pmu_disable_event(struct perf_event *event)
{ {
unsigned long flags;
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) { if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) {
...@@ -931,7 +924,6 @@ static void armv7pmu_disable_event(struct perf_event *event) ...@@ -931,7 +924,6 @@ static void armv7pmu_disable_event(struct perf_event *event)
/* /*
* Disable counter and interrupt * Disable counter and interrupt
*/ */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* /*
* Disable counter * Disable counter
...@@ -942,8 +934,6 @@ static void armv7pmu_disable_event(struct perf_event *event) ...@@ -942,8 +934,6 @@ static void armv7pmu_disable_event(struct perf_event *event)
* Disable interrupt for this counter * Disable interrupt for this counter
*/ */
armv7_pmnc_disable_intens(idx); armv7_pmnc_disable_intens(idx);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu) static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
...@@ -1009,24 +999,14 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu) ...@@ -1009,24 +999,14 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
static void armv7pmu_start(struct arm_pmu *cpu_pmu) static void armv7pmu_start(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Enable all counters */ /* Enable all counters */
armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E); armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void armv7pmu_stop(struct arm_pmu *cpu_pmu) static void armv7pmu_stop(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Disable all counters */ /* Disable all counters */
armv7_pmnc_write(armv7_pmnc_read() & ~ARMV7_PMNC_E); armv7_pmnc_write(armv7_pmnc_read() & ~ARMV7_PMNC_E);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc, static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc,
...@@ -1072,8 +1052,10 @@ static int armv7pmu_set_event_filter(struct hw_perf_event *event, ...@@ -1072,8 +1052,10 @@ static int armv7pmu_set_event_filter(struct hw_perf_event *event,
{ {
unsigned long config_base = 0; unsigned long config_base = 0;
if (attr->exclude_idle) if (attr->exclude_idle) {
return -EPERM; pr_debug("ARM performance counters do not support mode exclusion\n");
return -EOPNOTSUPP;
}
if (attr->exclude_user) if (attr->exclude_user)
config_base |= ARMV7_EXCLUDE_USER; config_base |= ARMV7_EXCLUDE_USER;
if (attr->exclude_kernel) if (attr->exclude_kernel)
...@@ -1492,14 +1474,10 @@ static void krait_clearpmu(u32 config_base) ...@@ -1492,14 +1474,10 @@ static void krait_clearpmu(u32 config_base)
static void krait_pmu_disable_event(struct perf_event *event) static void krait_pmu_disable_event(struct perf_event *event)
{ {
unsigned long flags;
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx; int idx = hwc->idx;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
/* Disable counter and interrupt */ /* Disable counter and interrupt */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Disable counter */ /* Disable counter */
armv7_pmnc_disable_counter(idx); armv7_pmnc_disable_counter(idx);
...@@ -1512,23 +1490,17 @@ static void krait_pmu_disable_event(struct perf_event *event) ...@@ -1512,23 +1490,17 @@ static void krait_pmu_disable_event(struct perf_event *event)
/* Disable interrupt for this counter */ /* Disable interrupt for this counter */
armv7_pmnc_disable_intens(idx); armv7_pmnc_disable_intens(idx);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void krait_pmu_enable_event(struct perf_event *event) static void krait_pmu_enable_event(struct perf_event *event)
{ {
unsigned long flags;
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx; int idx = hwc->idx;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
/* /*
* Enable counter and interrupt, and set the counter to count * Enable counter and interrupt, and set the counter to count
* the event that we're interested in. * the event that we're interested in.
*/ */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Disable counter */ /* Disable counter */
armv7_pmnc_disable_counter(idx); armv7_pmnc_disable_counter(idx);
...@@ -1548,8 +1520,6 @@ static void krait_pmu_enable_event(struct perf_event *event) ...@@ -1548,8 +1520,6 @@ static void krait_pmu_enable_event(struct perf_event *event)
/* Enable counter */ /* Enable counter */
armv7_pmnc_enable_counter(idx); armv7_pmnc_enable_counter(idx);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void krait_pmu_reset(void *info) static void krait_pmu_reset(void *info)
...@@ -1825,14 +1795,10 @@ static void scorpion_clearpmu(u32 config_base) ...@@ -1825,14 +1795,10 @@ static void scorpion_clearpmu(u32 config_base)
static void scorpion_pmu_disable_event(struct perf_event *event) static void scorpion_pmu_disable_event(struct perf_event *event)
{ {
unsigned long flags;
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx; int idx = hwc->idx;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
/* Disable counter and interrupt */ /* Disable counter and interrupt */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Disable counter */ /* Disable counter */
armv7_pmnc_disable_counter(idx); armv7_pmnc_disable_counter(idx);
...@@ -1845,23 +1811,17 @@ static void scorpion_pmu_disable_event(struct perf_event *event) ...@@ -1845,23 +1811,17 @@ static void scorpion_pmu_disable_event(struct perf_event *event)
/* Disable interrupt for this counter */ /* Disable interrupt for this counter */
armv7_pmnc_disable_intens(idx); armv7_pmnc_disable_intens(idx);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void scorpion_pmu_enable_event(struct perf_event *event) static void scorpion_pmu_enable_event(struct perf_event *event)
{ {
unsigned long flags;
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx; int idx = hwc->idx;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
/* /*
* Enable counter and interrupt, and set the counter to count * Enable counter and interrupt, and set the counter to count
* the event that we're interested in. * the event that we're interested in.
*/ */
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Disable counter */ /* Disable counter */
armv7_pmnc_disable_counter(idx); armv7_pmnc_disable_counter(idx);
...@@ -1881,8 +1841,6 @@ static void scorpion_pmu_enable_event(struct perf_event *event) ...@@ -1881,8 +1841,6 @@ static void scorpion_pmu_enable_event(struct perf_event *event)
/* Enable counter */ /* Enable counter */
armv7_pmnc_enable_counter(idx); armv7_pmnc_enable_counter(idx);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void scorpion_pmu_reset(void *info) static void scorpion_pmu_reset(void *info)
......
...@@ -203,10 +203,8 @@ xscale1pmu_handle_irq(struct arm_pmu *cpu_pmu) ...@@ -203,10 +203,8 @@ xscale1pmu_handle_irq(struct arm_pmu *cpu_pmu)
static void xscale1pmu_enable_event(struct perf_event *event) static void xscale1pmu_enable_event(struct perf_event *event)
{ {
unsigned long val, mask, evt, flags; unsigned long val, mask, evt;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
switch (idx) { switch (idx) {
...@@ -229,20 +227,16 @@ static void xscale1pmu_enable_event(struct perf_event *event) ...@@ -229,20 +227,16 @@ static void xscale1pmu_enable_event(struct perf_event *event)
return; return;
} }
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = xscale1pmu_read_pmnc(); val = xscale1pmu_read_pmnc();
val &= ~mask; val &= ~mask;
val |= evt; val |= evt;
xscale1pmu_write_pmnc(val); xscale1pmu_write_pmnc(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void xscale1pmu_disable_event(struct perf_event *event) static void xscale1pmu_disable_event(struct perf_event *event)
{ {
unsigned long val, mask, evt, flags; unsigned long val, mask, evt;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
switch (idx) { switch (idx) {
...@@ -263,12 +257,10 @@ static void xscale1pmu_disable_event(struct perf_event *event) ...@@ -263,12 +257,10 @@ static void xscale1pmu_disable_event(struct perf_event *event)
return; return;
} }
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = xscale1pmu_read_pmnc(); val = xscale1pmu_read_pmnc();
val &= ~mask; val &= ~mask;
val |= evt; val |= evt;
xscale1pmu_write_pmnc(val); xscale1pmu_write_pmnc(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static int static int
...@@ -300,26 +292,20 @@ static void xscalepmu_clear_event_idx(struct pmu_hw_events *cpuc, ...@@ -300,26 +292,20 @@ static void xscalepmu_clear_event_idx(struct pmu_hw_events *cpuc,
static void xscale1pmu_start(struct arm_pmu *cpu_pmu) static void xscale1pmu_start(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags, val; unsigned long val;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = xscale1pmu_read_pmnc(); val = xscale1pmu_read_pmnc();
val |= XSCALE_PMU_ENABLE; val |= XSCALE_PMU_ENABLE;
xscale1pmu_write_pmnc(val); xscale1pmu_write_pmnc(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void xscale1pmu_stop(struct arm_pmu *cpu_pmu) static void xscale1pmu_stop(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags, val; unsigned long val;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = xscale1pmu_read_pmnc(); val = xscale1pmu_read_pmnc();
val &= ~XSCALE_PMU_ENABLE; val &= ~XSCALE_PMU_ENABLE;
xscale1pmu_write_pmnc(val); xscale1pmu_write_pmnc(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static inline u64 xscale1pmu_read_counter(struct perf_event *event) static inline u64 xscale1pmu_read_counter(struct perf_event *event)
...@@ -549,10 +535,8 @@ xscale2pmu_handle_irq(struct arm_pmu *cpu_pmu) ...@@ -549,10 +535,8 @@ xscale2pmu_handle_irq(struct arm_pmu *cpu_pmu)
static void xscale2pmu_enable_event(struct perf_event *event) static void xscale2pmu_enable_event(struct perf_event *event)
{ {
unsigned long flags, ien, evtsel; unsigned long ien, evtsel;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
ien = xscale2pmu_read_int_enable(); ien = xscale2pmu_read_int_enable();
...@@ -587,18 +571,14 @@ static void xscale2pmu_enable_event(struct perf_event *event) ...@@ -587,18 +571,14 @@ static void xscale2pmu_enable_event(struct perf_event *event)
return; return;
} }
raw_spin_lock_irqsave(&events->pmu_lock, flags);
xscale2pmu_write_event_select(evtsel); xscale2pmu_write_event_select(evtsel);
xscale2pmu_write_int_enable(ien); xscale2pmu_write_int_enable(ien);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void xscale2pmu_disable_event(struct perf_event *event) static void xscale2pmu_disable_event(struct perf_event *event)
{ {
unsigned long flags, ien, evtsel, of_flags; unsigned long ien, evtsel, of_flags;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
int idx = hwc->idx; int idx = hwc->idx;
ien = xscale2pmu_read_int_enable(); ien = xscale2pmu_read_int_enable();
...@@ -638,11 +618,9 @@ static void xscale2pmu_disable_event(struct perf_event *event) ...@@ -638,11 +618,9 @@ static void xscale2pmu_disable_event(struct perf_event *event)
return; return;
} }
raw_spin_lock_irqsave(&events->pmu_lock, flags);
xscale2pmu_write_event_select(evtsel); xscale2pmu_write_event_select(evtsel);
xscale2pmu_write_int_enable(ien); xscale2pmu_write_int_enable(ien);
xscale2pmu_write_overflow_flags(of_flags); xscale2pmu_write_overflow_flags(of_flags);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static int static int
...@@ -663,26 +641,20 @@ xscale2pmu_get_event_idx(struct pmu_hw_events *cpuc, ...@@ -663,26 +641,20 @@ xscale2pmu_get_event_idx(struct pmu_hw_events *cpuc,
static void xscale2pmu_start(struct arm_pmu *cpu_pmu) static void xscale2pmu_start(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags, val; unsigned long val;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = xscale2pmu_read_pmnc() & ~XSCALE_PMU_CNT64; val = xscale2pmu_read_pmnc() & ~XSCALE_PMU_CNT64;
val |= XSCALE_PMU_ENABLE; val |= XSCALE_PMU_ENABLE;
xscale2pmu_write_pmnc(val); xscale2pmu_write_pmnc(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static void xscale2pmu_stop(struct arm_pmu *cpu_pmu) static void xscale2pmu_stop(struct arm_pmu *cpu_pmu)
{ {
unsigned long flags, val; unsigned long val;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
val = xscale2pmu_read_pmnc(); val = xscale2pmu_read_pmnc();
val &= ~XSCALE_PMU_ENABLE; val &= ~XSCALE_PMU_ENABLE;
xscale2pmu_write_pmnc(val); xscale2pmu_write_pmnc(val);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
} }
static inline u64 xscale2pmu_read_counter(struct perf_event *event) static inline u64 xscale2pmu_read_counter(struct perf_event *event)
......
...@@ -267,9 +267,8 @@ void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu) ...@@ -267,9 +267,8 @@ void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu)
u64 kvm_pmu_valid_counter_mask(struct kvm_vcpu *vcpu) u64 kvm_pmu_valid_counter_mask(struct kvm_vcpu *vcpu)
{ {
u64 val = kvm_vcpu_read_pmcr(vcpu) >> ARMV8_PMU_PMCR_N_SHIFT; u64 val = FIELD_GET(ARMV8_PMU_PMCR_N, kvm_vcpu_read_pmcr(vcpu));
val &= ARMV8_PMU_PMCR_N_MASK;
if (val == 0) if (val == 0)
return BIT(ARMV8_PMU_CYCLE_IDX); return BIT(ARMV8_PMU_CYCLE_IDX);
else else
...@@ -1136,8 +1135,7 @@ u8 kvm_arm_pmu_get_pmuver_limit(void) ...@@ -1136,8 +1135,7 @@ u8 kvm_arm_pmu_get_pmuver_limit(void)
*/ */
u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu) u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu)
{ {
u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0) & u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0);
~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
return pmcr | ((u64)vcpu->kvm->arch.pmcr_n << ARMV8_PMU_PMCR_N_SHIFT); return u64_replace_bits(pmcr, vcpu->kvm->arch.pmcr_n, ARMV8_PMU_PMCR_N);
} }
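Note on the hunk above: the open-coded shift-and-mask pairs are replaced by mask-only helpers (FIELD_GET and u64_replace_bits), so the field position lives in a single constant. A minimal user-space sketch of that idea follows. It assumes PMCR.N occupies bits [15:11], and the names demo_field_get, demo_replace_bits and DEMO_PMCR_N are hypothetical stand-ins, not the kernel macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: N field in bits [15:11]. */
#define DEMO_PMCR_N UINT64_C(0xF800)

/* Lowest set bit of a mask; dividing or multiplying by it shifts a field. */
static uint64_t demo_mask_lsb(uint64_t mask)
{
	assert(mask != 0);		/* an empty mask has no field position */
	return mask & (~mask + 1);
}

/* Extract the field selected by 'mask' from 'reg', right-aligned. */
static uint64_t demo_field_get(uint64_t mask, uint64_t reg)
{
	return (reg & mask) / demo_mask_lsb(mask);
}

/* Return 'reg' with the field selected by 'mask' replaced by 'val'. */
static uint64_t demo_replace_bits(uint64_t reg, uint64_t val, uint64_t mask)
{
	return (reg & ~mask) | ((val * demo_mask_lsb(mask)) & mask);
}
```

The argument order of demo_replace_bits (old value, new field value, mask) follows the call shown in the new column of the hunk.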
...@@ -877,7 +877,7 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx) ...@@ -877,7 +877,7 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
u64 pmcr, val; u64 pmcr, val;
pmcr = kvm_vcpu_read_pmcr(vcpu); pmcr = kvm_vcpu_read_pmcr(vcpu);
val = (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK; val = FIELD_GET(ARMV8_PMU_PMCR_N, pmcr);
if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX) { if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX) {
kvm_inject_undefined(vcpu); kvm_inject_undefined(vcpu);
return false; return false;
...@@ -1143,7 +1143,7 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, ...@@ -1143,7 +1143,7 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
u64 val) u64 val)
{ {
u8 new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK; u8 new_n = FIELD_GET(ARMV8_PMU_PMCR_N, val);
struct kvm *kvm = vcpu->kvm; struct kvm *kvm = vcpu->kvm;
mutex_lock(&kvm->arch.config_lock); mutex_lock(&kvm->arch.config_lock);
......
...@@ -11,8 +11,6 @@ ...@@ -11,8 +11,6 @@
#include <linux/types.h> #include <linux/types.h>
/* PCIe device related definition. */ /* PCIe device related definition. */
#define PCI_VENDOR_ID_ALIBABA 0x1ded
#define ERDMA_PCI_WIDTH 64 #define ERDMA_PCI_WIDTH 64
#define ERDMA_FUNC_BAR 0 #define ERDMA_FUNC_BAR 0
#define ERDMA_MISX_BAR 2 #define ERDMA_MISX_BAR 2
......
...@@ -598,3 +598,15 @@ int pci_write_config_dword(const struct pci_dev *dev, int where, ...@@ -598,3 +598,15 @@ int pci_write_config_dword(const struct pci_dev *dev, int where,
return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val); return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
} }
EXPORT_SYMBOL(pci_write_config_dword); EXPORT_SYMBOL(pci_write_config_dword);
void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos,
u32 clear, u32 set)
{
u32 val;
pci_read_config_dword(dev, pos, &val);
val &= ~clear;
val |= set;
pci_write_config_dword(dev, pos, val);
}
EXPORT_SYMBOL(pci_clear_and_set_config_dword);
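Note on the helper above: it is a plain read-modify-write of one config dword, clearing first and setting second. The sketch below models only the bit arithmetic on an ordinary 32-bit value; clear_and_set_u32 is a hypothetical name, and no PCI config access is performed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: applies the same clear-then-set order to a value. */
static uint32_t clear_and_set_u32(uint32_t val, uint32_t clear, uint32_t set)
{
	val &= ~clear;	/* drop every bit named in 'clear' */
	val |= set;	/* then raise every bit named in 'set' */
	return val;
}

/* Self-check: a bit present in both masks ends up set, because the
 * set step runs after the clear step. */
static void clear_and_set_selfcheck(void)
{
	assert(clear_and_set_u32(0xF0u, 0xF0u, 0x10u) == 0x10u);
}
```

Passing 0 as the clear mask gives a pure "set bits" operation, and passing 0 as the set mask gives a pure "clear bits" operation, which matches how the ASPM callers in this diff use the helper.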
...@@ -426,17 +426,6 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint) ...@@ -426,17 +426,6 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
} }
} }
static void pci_clear_and_set_dword(struct pci_dev *pdev, int pos,
u32 clear, u32 set)
{
u32 val;
pci_read_config_dword(pdev, pos, &val);
val &= ~clear;
val |= set;
pci_write_config_dword(pdev, pos, val);
}
/* Calculate L1.2 PM substate timing parameters */ /* Calculate L1.2 PM substate timing parameters */
static void aspm_calc_l12_info(struct pcie_link_state *link, static void aspm_calc_l12_info(struct pcie_link_state *link,
u32 parent_l1ss_cap, u32 child_l1ss_cap) u32 parent_l1ss_cap, u32 child_l1ss_cap)
...@@ -501,10 +490,12 @@ static void aspm_calc_l12_info(struct pcie_link_state *link, ...@@ -501,10 +490,12 @@ static void aspm_calc_l12_info(struct pcie_link_state *link,
cl1_2_enables = cctl1 & PCI_L1SS_CTL1_L1_2_MASK; cl1_2_enables = cctl1 & PCI_L1SS_CTL1_L1_2_MASK;
if (pl1_2_enables || cl1_2_enables) { if (pl1_2_enables || cl1_2_enables) {
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, pci_clear_and_set_config_dword(child,
PCI_L1SS_CTL1_L1_2_MASK, 0); child->l1ss + PCI_L1SS_CTL1,
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, PCI_L1SS_CTL1_L1_2_MASK, 0);
PCI_L1SS_CTL1_L1_2_MASK, 0); pci_clear_and_set_config_dword(parent,
parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1_2_MASK, 0);
} }
/* Program T_POWER_ON times in both ports */ /* Program T_POWER_ON times in both ports */
...@@ -512,22 +503,26 @@ static void aspm_calc_l12_info(struct pcie_link_state *link, ...@@ -512,22 +503,26 @@ static void aspm_calc_l12_info(struct pcie_link_state *link,
pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, ctl2); pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, ctl2);
/* Program Common_Mode_Restore_Time in upstream device */ /* Program Common_Mode_Restore_Time in upstream device */
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1); PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1);
/* Program LTR_L1.2_THRESHOLD time in both ports */ /* Program LTR_L1.2_THRESHOLD time in both ports */
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_LTR_L12_TH_VALUE | PCI_L1SS_CTL1_LTR_L12_TH_VALUE |
PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); PCI_L1SS_CTL1_LTR_L12_TH_SCALE,
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, ctl1);
PCI_L1SS_CTL1_LTR_L12_TH_VALUE | pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); PCI_L1SS_CTL1_LTR_L12_TH_VALUE |
PCI_L1SS_CTL1_LTR_L12_TH_SCALE,
ctl1);
if (pl1_2_enables || cl1_2_enables) { if (pl1_2_enables || cl1_2_enables) {
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 0, pci_clear_and_set_config_dword(parent,
pl1_2_enables); parent->l1ss + PCI_L1SS_CTL1, 0,
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 0, pl1_2_enables);
cl1_2_enables); pci_clear_and_set_config_dword(child,
child->l1ss + PCI_L1SS_CTL1, 0,
cl1_2_enables);
} }
} }
...@@ -687,10 +682,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state) ...@@ -687,10 +682,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
*/ */
/* Disable all L1 substates */ /* Disable all L1 substates */
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, 0); PCI_L1SS_CTL1_L1SS_MASK, 0);
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, 0); PCI_L1SS_CTL1_L1SS_MASK, 0);
/* /*
* If needed, disable L1, and it gets enabled later * If needed, disable L1, and it gets enabled later
* in pcie_config_aspm_link(). * in pcie_config_aspm_link().
...@@ -713,10 +708,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state) ...@@ -713,10 +708,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
val |= PCI_L1SS_CTL1_PCIPM_L1_2; val |= PCI_L1SS_CTL1_PCIPM_L1_2;
/* Enable what we need to enable */ /* Enable what we need to enable */
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, val); PCI_L1SS_CTL1_L1SS_MASK, val);
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, val); PCI_L1SS_CTL1_L1SS_MASK, val);
} }
static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
......
...@@ -217,6 +217,13 @@ config MARVELL_CN10K_DDR_PMU ...@@ -217,6 +217,13 @@ config MARVELL_CN10K_DDR_PMU
Enable perf support for Marvell DDR Performance monitoring Enable perf support for Marvell DDR Performance monitoring
event on CN10K platform. event on CN10K platform.
config DWC_PCIE_PMU
tristate "Synopsys DesignWare PCIe PMU"
depends on PCI
help
Enable perf support for Synopsys DesignWare PCIe PMU Performance
monitoring event on platform including the Alibaba Yitian 710.
source "drivers/perf/arm_cspmu/Kconfig" source "drivers/perf/arm_cspmu/Kconfig"
source "drivers/perf/amlogic/Kconfig" source "drivers/perf/amlogic/Kconfig"
......
...@@ -23,6 +23,7 @@ obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o ...@@ -23,6 +23,7 @@ obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
obj-$(CONFIG_DWC_PCIE_PMU) += dwc_pcie_pmu.o
obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/ obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/
obj-$(CONFIG_MESON_DDR_PMU) += amlogic/ obj-$(CONFIG_MESON_DDR_PMU) += amlogic/
obj-$(CONFIG_CXL_PMU) += cxl_pmu.o obj-$(CONFIG_CXL_PMU) += cxl_pmu.o
...@@ -524,8 +524,10 @@ static int m1_pmu_set_event_filter(struct hw_perf_event *event, ...@@ -524,8 +524,10 @@ static int m1_pmu_set_event_filter(struct hw_perf_event *event,
{ {
unsigned long config_base = 0; unsigned long config_base = 0;
if (!attr->exclude_guest) if (!attr->exclude_guest) {
return -EINVAL; pr_debug("ARM performance counters do not support mode exclusion\n");
return -EOPNOTSUPP;
}
if (!attr->exclude_kernel) if (!attr->exclude_kernel)
config_base |= M1_PMU_CFG_COUNT_KERNEL; config_base |= M1_PMU_CFG_COUNT_KERNEL;
if (!attr->exclude_user) if (!attr->exclude_user)
......
...@@ -811,7 +811,7 @@ static umode_t arm_cmn_event_attr_is_visible(struct kobject *kobj, ...@@ -811,7 +811,7 @@ static umode_t arm_cmn_event_attr_is_visible(struct kobject *kobj,
#define CMN_EVENT_HNF_OCC(_model, _name, _event) \ #define CMN_EVENT_HNF_OCC(_model, _name, _event) \
CMN_EVENT_HN_OCC(_model, hnf_##_name, CMN_TYPE_HNF, _event) CMN_EVENT_HN_OCC(_model, hnf_##_name, CMN_TYPE_HNF, _event)
#define CMN_EVENT_HNF_CLS(_model, _name, _event) \ #define CMN_EVENT_HNF_CLS(_model, _name, _event) \
CMN_EVENT_HN_CLS(_model, hnf_##_name, CMN_TYPE_HNS, _event) CMN_EVENT_HN_CLS(_model, hnf_##_name, CMN_TYPE_HNF, _event)
#define CMN_EVENT_HNF_SNT(_model, _name, _event) \ #define CMN_EVENT_HNF_SNT(_model, _name, _event) \
CMN_EVENT_HN_SNT(_model, hnf_##_name, CMN_TYPE_HNF, _event) CMN_EVENT_HN_SNT(_model, hnf_##_name, CMN_TYPE_HNF, _event)
......
...@@ -371,7 +371,7 @@ static inline u32 dsu_pmu_get_reset_overflow(void) ...@@ -371,7 +371,7 @@ static inline u32 dsu_pmu_get_reset_overflow(void)
return __dsu_pmu_get_reset_overflow(); return __dsu_pmu_get_reset_overflow();
} }
/** /*
* dsu_pmu_set_event_period: Set the period for the counter. * dsu_pmu_set_event_period: Set the period for the counter.
* *
* All DSU PMU event counters, except the cycle counter are 32bit * All DSU PMU event counters, except the cycle counter are 32bit
...@@ -602,7 +602,7 @@ static struct dsu_pmu *dsu_pmu_alloc(struct platform_device *pdev) ...@@ -602,7 +602,7 @@ static struct dsu_pmu *dsu_pmu_alloc(struct platform_device *pdev)
return dsu_pmu; return dsu_pmu;
} }
/** /*
* dsu_pmu_dt_get_cpus: Get the list of CPUs in the cluster * dsu_pmu_dt_get_cpus: Get the list of CPUs in the cluster
* from device tree. * from device tree.
*/ */
...@@ -632,7 +632,7 @@ static int dsu_pmu_dt_get_cpus(struct device *dev, cpumask_t *mask) ...@@ -632,7 +632,7 @@ static int dsu_pmu_dt_get_cpus(struct device *dev, cpumask_t *mask)
return 0; return 0;
} }
/** /*
* dsu_pmu_acpi_get_cpus: Get the list of CPUs in the cluster * dsu_pmu_acpi_get_cpus: Get the list of CPUs in the cluster
* from ACPI. * from ACPI.
*/ */
......
...@@ -445,7 +445,7 @@ __hw_perf_event_init(struct perf_event *event) ...@@ -445,7 +445,7 @@ __hw_perf_event_init(struct perf_event *event)
{ {
struct arm_pmu *armpmu = to_arm_pmu(event->pmu); struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
int mapping; int mapping, ret;
hwc->flags = 0; hwc->flags = 0;
mapping = armpmu->map_event(event); mapping = armpmu->map_event(event);
...@@ -470,11 +470,10 @@ __hw_perf_event_init(struct perf_event *event) ...@@ -470,11 +470,10 @@ __hw_perf_event_init(struct perf_event *event)
/* /*
* Check whether we need to exclude the counter from certain modes. * Check whether we need to exclude the counter from certain modes.
*/ */
if (armpmu->set_event_filter && if (armpmu->set_event_filter) {
armpmu->set_event_filter(hwc, &event->attr)) { ret = armpmu->set_event_filter(hwc, &event->attr);
pr_debug("ARM performance counters do not support " if (ret)
"mode exclusion\n"); return ret;
return -EOPNOTSUPP;
} }
/* /*
...@@ -893,7 +892,6 @@ struct arm_pmu *armpmu_alloc(void) ...@@ -893,7 +892,6 @@ struct arm_pmu *armpmu_alloc(void)
struct pmu_hw_events *events; struct pmu_hw_events *events;
events = per_cpu_ptr(pmu->hw_events, cpu); events = per_cpu_ptr(pmu->hw_events, cpu);
raw_spin_lock_init(&events->pmu_lock);
events->percpu_pmu = pmu; events->percpu_pmu = pmu;
} }
......
...@@ -15,6 +15,7 @@ ...@@ -15,6 +15,7 @@
#include <clocksource/arm_arch_timer.h> #include <clocksource/arm_arch_timer.h>
#include <linux/acpi.h> #include <linux/acpi.h>
#include <linux/bitfield.h>
#include <linux/clocksource.h> #include <linux/clocksource.h>
#include <linux/of.h> #include <linux/of.h>
#include <linux/perf/arm_pmu.h> #include <linux/perf/arm_pmu.h>
...@@ -169,7 +170,11 @@ armv8pmu_events_sysfs_show(struct device *dev, ...@@ -169,7 +170,11 @@ armv8pmu_events_sysfs_show(struct device *dev,
PMU_EVENT_ATTR_ID(name, armv8pmu_events_sysfs_show, config) PMU_EVENT_ATTR_ID(name, armv8pmu_events_sysfs_show, config)
static struct attribute *armv8_pmuv3_event_attrs[] = { static struct attribute *armv8_pmuv3_event_attrs[] = {
ARMV8_EVENT_ATTR(sw_incr, ARMV8_PMUV3_PERFCTR_SW_INCR), /*
* Don't expose the sw_incr event in /sys. It's not usable as writes to
* PMSWINC_EL0 will trap as PMUSERENR.{SW,EN}=={0,0} and event rotation
* means we don't have a fixed event<->counter relationship regardless.
*/
ARMV8_EVENT_ATTR(l1i_cache_refill, ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL), ARMV8_EVENT_ATTR(l1i_cache_refill, ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL),
ARMV8_EVENT_ATTR(l1i_tlb_refill, ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL), ARMV8_EVENT_ATTR(l1i_tlb_refill, ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL),
ARMV8_EVENT_ATTR(l1d_cache_refill, ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL), ARMV8_EVENT_ATTR(l1d_cache_refill, ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL),
...@@ -294,26 +299,66 @@ static const struct attribute_group armv8_pmuv3_events_attr_group = { ...@@ -294,26 +299,66 @@ static const struct attribute_group armv8_pmuv3_events_attr_group = {
.is_visible = armv8pmu_event_attr_is_visible, .is_visible = armv8pmu_event_attr_is_visible,
}; };
PMU_FORMAT_ATTR(event, "config:0-15"); /* User ABI */
PMU_FORMAT_ATTR(long, "config1:0"); #define ATTR_CFG_FLD_event_CFG config
PMU_FORMAT_ATTR(rdpmc, "config1:1"); #define ATTR_CFG_FLD_event_LO 0
#define ATTR_CFG_FLD_event_HI 15
#define ATTR_CFG_FLD_long_CFG config1
#define ATTR_CFG_FLD_long_LO 0
#define ATTR_CFG_FLD_long_HI 0
#define ATTR_CFG_FLD_rdpmc_CFG config1
#define ATTR_CFG_FLD_rdpmc_LO 1
#define ATTR_CFG_FLD_rdpmc_HI 1
#define ATTR_CFG_FLD_threshold_count_CFG config1 /* PMEVTYPER.TC[0] */
#define ATTR_CFG_FLD_threshold_count_LO 2
#define ATTR_CFG_FLD_threshold_count_HI 2
#define ATTR_CFG_FLD_threshold_compare_CFG config1 /* PMEVTYPER.TC[2:1] */
#define ATTR_CFG_FLD_threshold_compare_LO 3
#define ATTR_CFG_FLD_threshold_compare_HI 4
#define ATTR_CFG_FLD_threshold_CFG config1 /* PMEVTYPER.TH */
#define ATTR_CFG_FLD_threshold_LO 5
#define ATTR_CFG_FLD_threshold_HI 16
GEN_PMU_FORMAT_ATTR(event);
GEN_PMU_FORMAT_ATTR(long);
GEN_PMU_FORMAT_ATTR(rdpmc);
GEN_PMU_FORMAT_ATTR(threshold_count);
GEN_PMU_FORMAT_ATTR(threshold_compare);
GEN_PMU_FORMAT_ATTR(threshold);
static int sysctl_perf_user_access __read_mostly; static int sysctl_perf_user_access __read_mostly;
static inline bool armv8pmu_event_is_64bit(struct perf_event *event) static bool armv8pmu_event_is_64bit(struct perf_event *event)
{
return ATTR_CFG_GET_FLD(&event->attr, long);
}
static bool armv8pmu_event_want_user_access(struct perf_event *event)
{ {
return event->attr.config1 & 0x1; return ATTR_CFG_GET_FLD(&event->attr, rdpmc);
} }
static inline bool armv8pmu_event_want_user_access(struct perf_event *event) static u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr)
{ {
return event->attr.config1 & 0x2; u8 th_compare = ATTR_CFG_GET_FLD(attr, threshold_compare);
u8 th_count = ATTR_CFG_GET_FLD(attr, threshold_count);
/*
* The count bit is always the bottom bit of the full control field, and
* the comparison is the upper two bits, but it's not explicitly
* labelled in the Arm ARM. For the Perf interface we split it into two
* fields, so reconstruct it here.
*/
return (th_compare << 1) | th_count;
} }
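Note on armv8pmu_event_threshold_control(): per the comment in the function, the count bit is the bottom bit of the control field and the two comparison bits sit above it. A small stand-alone sketch of that packing, using the hypothetical name pack_threshold_control and range checks that are an assumption of this sketch rather than part of the kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the packing: compare in bits [2:1], count in bit 0. */
static uint8_t pack_threshold_control(uint8_t th_compare, uint8_t th_count)
{
	assert(th_compare <= 3);	/* two-bit field (config1 bits 3-4) */
	assert(th_count <= 1);		/* one-bit field (config1 bit 2) */
	return (uint8_t)((th_compare << 1) | th_count);
}
```

For instance, th_compare = 2 with th_count = 1 packs to the three-bit value 0b101.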
static struct attribute *armv8_pmuv3_format_attrs[] = { static struct attribute *armv8_pmuv3_format_attrs[] = {
&format_attr_event.attr, &format_attr_event.attr,
&format_attr_long.attr, &format_attr_long.attr,
&format_attr_rdpmc.attr, &format_attr_rdpmc.attr,
&format_attr_threshold.attr,
&format_attr_threshold_compare.attr,
&format_attr_threshold_count.attr,
NULL, NULL,
}; };
...@@ -327,7 +372,7 @@ static ssize_t slots_show(struct device *dev, struct device_attribute *attr, ...@@ -327,7 +372,7 @@ static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
{ {
struct pmu *pmu = dev_get_drvdata(dev); struct pmu *pmu = dev_get_drvdata(dev);
struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
u32 slots = cpu_pmu->reg_pmmir & ARMV8_PMU_SLOTS_MASK; u32 slots = FIELD_GET(ARMV8_PMU_SLOTS, cpu_pmu->reg_pmmir);
return sysfs_emit(page, "0x%08x\n", slots); return sysfs_emit(page, "0x%08x\n", slots);
} }
...@@ -339,8 +384,7 @@ static ssize_t bus_slots_show(struct device *dev, struct device_attribute *attr, ...@@ -339,8 +384,7 @@ static ssize_t bus_slots_show(struct device *dev, struct device_attribute *attr,
{ {
struct pmu *pmu = dev_get_drvdata(dev); struct pmu *pmu = dev_get_drvdata(dev);
struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
u32 bus_slots = (cpu_pmu->reg_pmmir >> ARMV8_PMU_BUS_SLOTS_SHIFT) u32 bus_slots = FIELD_GET(ARMV8_PMU_BUS_SLOTS, cpu_pmu->reg_pmmir);
& ARMV8_PMU_BUS_SLOTS_MASK;
return sysfs_emit(page, "0x%08x\n", bus_slots); return sysfs_emit(page, "0x%08x\n", bus_slots);
} }
...@@ -352,8 +396,7 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr, ...@@ -352,8 +396,7 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr,
{ {
struct pmu *pmu = dev_get_drvdata(dev); struct pmu *pmu = dev_get_drvdata(dev);
struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
u32 bus_width = (cpu_pmu->reg_pmmir >> ARMV8_PMU_BUS_WIDTH_SHIFT) u32 bus_width = FIELD_GET(ARMV8_PMU_BUS_WIDTH, cpu_pmu->reg_pmmir);
& ARMV8_PMU_BUS_WIDTH_MASK;
u32 val = 0; u32 val = 0;
/* Encoded as Log2(number of bytes), plus one */ /* Encoded as Log2(number of bytes), plus one */
...@@ -365,10 +408,38 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr, ...@@ -365,10 +408,38 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr,
static DEVICE_ATTR_RO(bus_width); static DEVICE_ATTR_RO(bus_width);
static u32 threshold_max(struct arm_pmu *cpu_pmu)
{
/*
* PMMIR.THWIDTH is readable and non-zero on aarch32, but it would be
* impossible to write the threshold in the upper 32 bits of PMEVTYPER.
*/
if (IS_ENABLED(CONFIG_ARM))
return 0;
/*
* The largest value that can be written to PMEVTYPER<n>_EL0.TH is
* (2 ^ PMMIR.THWIDTH) - 1.
*/
return (1 << FIELD_GET(ARMV8_PMU_THWIDTH, cpu_pmu->reg_pmmir)) - 1;
}
static ssize_t threshold_max_show(struct device *dev,
struct device_attribute *attr, char *page)
{
struct pmu *pmu = dev_get_drvdata(dev);
struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
return sysfs_emit(page, "0x%08x\n", threshold_max(cpu_pmu));
}
static DEVICE_ATTR_RO(threshold_max);
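Note on threshold_max(): the value reported through sysfs is (2 ^ PMMIR.THWIDTH) - 1, so a width of 0 yields a maximum of 0 and any non-zero threshold is rejected. A stand-alone sketch of the arithmetic only, with the hypothetical name threshold_max_for_width and the width passed in directly instead of being read from PMMIR:

```c
#include <assert.h>
#include <stdint.h>

/* Largest threshold representable in 'thwidth' bits. */
static uint32_t threshold_max_for_width(unsigned int thwidth)
{
	assert(thwidth < 32);	/* keep the shift well defined for uint32_t */
	return (UINT32_C(1) << thwidth) - 1;
}
```

With the 12-bit threshold field exposed in config1 (bits 5 to 16), a width of 12 gives a maximum of 4095.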
static struct attribute *armv8_pmuv3_caps_attrs[] = { static struct attribute *armv8_pmuv3_caps_attrs[] = {
&dev_attr_slots.attr, &dev_attr_slots.attr,
&dev_attr_bus_slots.attr, &dev_attr_bus_slots.attr,
&dev_attr_bus_width.attr, &dev_attr_bus_width.attr,
&dev_attr_threshold_max.attr,
NULL, NULL,
}; };
...@@ -397,7 +468,7 @@ static bool armv8pmu_has_long_event(struct arm_pmu *cpu_pmu) ...@@ -397,7 +468,7 @@ static bool armv8pmu_has_long_event(struct arm_pmu *cpu_pmu)
return (IS_ENABLED(CONFIG_ARM64) && is_pmuv3p5(cpu_pmu->pmuver)); return (IS_ENABLED(CONFIG_ARM64) && is_pmuv3p5(cpu_pmu->pmuver));
} }
static inline bool armv8pmu_event_has_user_read(struct perf_event *event) static bool armv8pmu_event_has_user_read(struct perf_event *event)
{ {
return event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT; return event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT;
} }
...@@ -407,7 +478,7 @@ static inline bool armv8pmu_event_has_user_read(struct perf_event *event) ...@@ -407,7 +478,7 @@ static inline bool armv8pmu_event_has_user_read(struct perf_event *event)
* except when we have allocated the 64bit cycle counter (for CPU * except when we have allocated the 64bit cycle counter (for CPU
* cycles event) or when user space counter access is enabled. * cycles event) or when user space counter access is enabled.
*/ */
static inline bool armv8pmu_event_is_chained(struct perf_event *event) static bool armv8pmu_event_is_chained(struct perf_event *event)
{ {
int idx = event->hw.idx; int idx = event->hw.idx;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
...@@ -428,36 +499,36 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event) ...@@ -428,36 +499,36 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
#define ARMV8_IDX_TO_COUNTER(x) \ #define ARMV8_IDX_TO_COUNTER(x) \
(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK) (((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
static inline u64 armv8pmu_pmcr_read(void) static u64 armv8pmu_pmcr_read(void)
{ {
return read_pmcr(); return read_pmcr();
} }
static inline void armv8pmu_pmcr_write(u64 val) static void armv8pmu_pmcr_write(u64 val)
{ {
val &= ARMV8_PMU_PMCR_MASK; val &= ARMV8_PMU_PMCR_MASK;
isb(); isb();
write_pmcr(val); write_pmcr(val);
} }
static inline int armv8pmu_has_overflowed(u32 pmovsr) static int armv8pmu_has_overflowed(u32 pmovsr)
{ {
return pmovsr & ARMV8_PMU_OVERFLOWED_MASK; return pmovsr & ARMV8_PMU_OVERFLOWED_MASK;
} }
static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx) static int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
{ {
return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx)); return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx));
} }
static inline u64 armv8pmu_read_evcntr(int idx) static u64 armv8pmu_read_evcntr(int idx)
{ {
u32 counter = ARMV8_IDX_TO_COUNTER(idx); u32 counter = ARMV8_IDX_TO_COUNTER(idx);
return read_pmevcntrn(counter); return read_pmevcntrn(counter);
} }
static inline u64 armv8pmu_read_hw_counter(struct perf_event *event) static u64 armv8pmu_read_hw_counter(struct perf_event *event)
{ {
int idx = event->hw.idx; int idx = event->hw.idx;
u64 val = armv8pmu_read_evcntr(idx); u64 val = armv8pmu_read_evcntr(idx);
...@@ -519,14 +590,14 @@ static u64 armv8pmu_read_counter(struct perf_event *event) ...@@ -519,14 +590,14 @@ static u64 armv8pmu_read_counter(struct perf_event *event)
return armv8pmu_unbias_long_counter(event, value); return armv8pmu_unbias_long_counter(event, value);
} }
static inline void armv8pmu_write_evcntr(int idx, u64 value) static void armv8pmu_write_evcntr(int idx, u64 value)
{ {
u32 counter = ARMV8_IDX_TO_COUNTER(idx); u32 counter = ARMV8_IDX_TO_COUNTER(idx);
write_pmevcntrn(counter, value); write_pmevcntrn(counter, value);
} }
static inline void armv8pmu_write_hw_counter(struct perf_event *event, static void armv8pmu_write_hw_counter(struct perf_event *event,
u64 value) u64 value)
{ {
int idx = event->hw.idx; int idx = event->hw.idx;
...@@ -552,15 +623,22 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value) ...@@ -552,15 +623,22 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
armv8pmu_write_hw_counter(event, value); armv8pmu_write_hw_counter(event, value);
} }
static inline void armv8pmu_write_evtype(int idx, u32 val) static void armv8pmu_write_evtype(int idx, unsigned long val)
{ {
u32 counter = ARMV8_IDX_TO_COUNTER(idx); u32 counter = ARMV8_IDX_TO_COUNTER(idx);
unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
ARMV8_PMU_INCLUDE_EL2 |
ARMV8_PMU_EXCLUDE_EL0 |
ARMV8_PMU_EXCLUDE_EL1;
val &= ARMV8_PMU_EVTYPE_MASK; if (IS_ENABLED(CONFIG_ARM64))
mask |= ARMV8_PMU_EVTYPE_TC | ARMV8_PMU_EVTYPE_TH;
val &= mask;
write_pmevtypern(counter, val); write_pmevtypern(counter, val);
} }
static inline void armv8pmu_write_event_type(struct perf_event *event) static void armv8pmu_write_event_type(struct perf_event *event)
{ {
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx; int idx = hwc->idx;
...@@ -594,7 +672,7 @@ static u32 armv8pmu_event_cnten_mask(struct perf_event *event) ...@@ -594,7 +672,7 @@ static u32 armv8pmu_event_cnten_mask(struct perf_event *event)
return mask; return mask;
} }
static inline void armv8pmu_enable_counter(u32 mask) static void armv8pmu_enable_counter(u32 mask)
{ {
/* /*
* Make sure event configuration register writes are visible before we * Make sure event configuration register writes are visible before we
...@@ -604,7 +682,7 @@ static inline void armv8pmu_enable_counter(u32 mask) ...@@ -604,7 +682,7 @@ static inline void armv8pmu_enable_counter(u32 mask)
write_pmcntenset(mask); write_pmcntenset(mask);
} }
static inline void armv8pmu_enable_event_counter(struct perf_event *event) static void armv8pmu_enable_event_counter(struct perf_event *event)
{ {
struct perf_event_attr *attr = &event->attr; struct perf_event_attr *attr = &event->attr;
u32 mask = armv8pmu_event_cnten_mask(event); u32 mask = armv8pmu_event_cnten_mask(event);
...@@ -616,7 +694,7 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event) ...@@ -616,7 +694,7 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
armv8pmu_enable_counter(mask); armv8pmu_enable_counter(mask);
} }
static inline void armv8pmu_disable_counter(u32 mask) static void armv8pmu_disable_counter(u32 mask)
{ {
write_pmcntenclr(mask); write_pmcntenclr(mask);
/* /*
...@@ -626,7 +704,7 @@ static inline void armv8pmu_disable_counter(u32 mask) ...@@ -626,7 +704,7 @@ static inline void armv8pmu_disable_counter(u32 mask)
isb(); isb();
} }
static inline void armv8pmu_disable_event_counter(struct perf_event *event) static void armv8pmu_disable_event_counter(struct perf_event *event)
{ {
struct perf_event_attr *attr = &event->attr; struct perf_event_attr *attr = &event->attr;
u32 mask = armv8pmu_event_cnten_mask(event); u32 mask = armv8pmu_event_cnten_mask(event);
...@@ -638,18 +716,18 @@ static inline void armv8pmu_disable_event_counter(struct perf_event *event) ...@@ -638,18 +716,18 @@ static inline void armv8pmu_disable_event_counter(struct perf_event *event)
armv8pmu_disable_counter(mask); armv8pmu_disable_counter(mask);
} }
static inline void armv8pmu_enable_intens(u32 mask) static void armv8pmu_enable_intens(u32 mask)
{ {
write_pmintenset(mask); write_pmintenset(mask);
} }
static inline void armv8pmu_enable_event_irq(struct perf_event *event) static void armv8pmu_enable_event_irq(struct perf_event *event)
{ {
u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx);
armv8pmu_enable_intens(BIT(counter)); armv8pmu_enable_intens(BIT(counter));
} }
static inline void armv8pmu_disable_intens(u32 mask) static void armv8pmu_disable_intens(u32 mask)
{ {
write_pmintenclr(mask); write_pmintenclr(mask);
isb(); isb();
...@@ -658,13 +736,13 @@ static inline void armv8pmu_disable_intens(u32 mask) ...@@ -658,13 +736,13 @@ static inline void armv8pmu_disable_intens(u32 mask)
isb(); isb();
} }
static inline void armv8pmu_disable_event_irq(struct perf_event *event) static void armv8pmu_disable_event_irq(struct perf_event *event)
{ {
u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx);
armv8pmu_disable_intens(BIT(counter)); armv8pmu_disable_intens(BIT(counter));
} }
static inline u32 armv8pmu_getreset_flags(void) static u32 armv8pmu_getreset_flags(void)
{ {
u32 value; u32 value;
...@@ -672,7 +750,7 @@ static inline u32 armv8pmu_getreset_flags(void) ...@@ -672,7 +750,7 @@ static inline u32 armv8pmu_getreset_flags(void)
value = read_pmovsclr(); value = read_pmovsclr();
/* Write to clear flags */ /* Write to clear flags */
value &= ARMV8_PMU_OVSR_MASK; value &= ARMV8_PMU_OVERFLOWED_MASK;
write_pmovsclr(value); write_pmovsclr(value);
return value; return value;
...@@ -914,9 +992,15 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, ...@@ -914,9 +992,15 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
struct perf_event_attr *attr) struct perf_event_attr *attr)
{ {
unsigned long config_base = 0; unsigned long config_base = 0;
struct perf_event *perf_event = container_of(attr, struct perf_event,
if (attr->exclude_idle) attr);
return -EPERM; struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu);
u32 th;
if (attr->exclude_idle) {
pr_debug("ARM performance counters do not support mode exclusion\n");
return -EOPNOTSUPP;
}
/* /*
* If we're running in hyp mode, then we *are* the hypervisor. * If we're running in hyp mode, then we *are* the hypervisor.
...@@ -945,6 +1029,22 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, ...@@ -945,6 +1029,22 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
if (attr->exclude_user) if (attr->exclude_user)
config_base |= ARMV8_PMU_EXCLUDE_EL0; config_base |= ARMV8_PMU_EXCLUDE_EL0;
/*
* If FEAT_PMUv3_TH isn't implemented, then THWIDTH (threshold_max) will
* be 0 and will also trigger this check, preventing it from being used.
*/
th = ATTR_CFG_GET_FLD(attr, threshold);
if (th > threshold_max(cpu_pmu)) {
pr_debug("PMU event threshold exceeds max value\n");
return -EINVAL;
}
if (IS_ENABLED(CONFIG_ARM64) && th) {
config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th);
config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC,
armv8pmu_event_threshold_control(attr));
}
/* /*
* Install the filter into config_base as this is used to * Install the filter into config_base as this is used to
* construct the event type. * construct the event type.
...@@ -1107,8 +1207,7 @@ static void __armv8pmu_probe_pmu(void *info) ...@@ -1107,8 +1207,7 @@ static void __armv8pmu_probe_pmu(void *info)
probe->present = true; probe->present = true;
/* Read the nb of CNTx counters supported from PMNC */ /* Read the nb of CNTx counters supported from PMNC */
cpu_pmu->num_events = (armv8pmu_pmcr_read() >> ARMV8_PMU_PMCR_N_SHIFT) cpu_pmu->num_events = FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read());
& ARMV8_PMU_PMCR_N_MASK;
/* Add the CPU cycles counter */ /* Add the CPU cycles counter */
cpu_pmu->num_events += 1; cpu_pmu->num_events += 1;
...@@ -1221,6 +1320,12 @@ static int name##_pmu_init(struct arm_pmu *cpu_pmu) \ ...@@ -1221,6 +1320,12 @@ static int name##_pmu_init(struct arm_pmu *cpu_pmu) \
return armv8_pmu_init(cpu_pmu, #name, armv8_pmuv3_map_event); \ return armv8_pmu_init(cpu_pmu, #name, armv8_pmuv3_map_event); \
} }
#define PMUV3_INIT_MAP_EVENT(name, map_event) \
static int name##_pmu_init(struct arm_pmu *cpu_pmu) \
{ \
return armv8_pmu_init(cpu_pmu, #name, map_event); \
}
PMUV3_INIT_SIMPLE(armv8_pmuv3) PMUV3_INIT_SIMPLE(armv8_pmuv3)
PMUV3_INIT_SIMPLE(armv8_cortex_a34) PMUV3_INIT_SIMPLE(armv8_cortex_a34)
...@@ -1247,51 +1352,24 @@ PMUV3_INIT_SIMPLE(armv8_neoverse_v1) ...@@ -1247,51 +1352,24 @@ PMUV3_INIT_SIMPLE(armv8_neoverse_v1)
PMUV3_INIT_SIMPLE(armv8_nvidia_carmel) PMUV3_INIT_SIMPLE(armv8_nvidia_carmel)
PMUV3_INIT_SIMPLE(armv8_nvidia_denver) PMUV3_INIT_SIMPLE(armv8_nvidia_denver)
static int armv8_a35_pmu_init(struct arm_pmu *cpu_pmu) PMUV3_INIT_MAP_EVENT(armv8_cortex_a35, armv8_a53_map_event)
{ PMUV3_INIT_MAP_EVENT(armv8_cortex_a53, armv8_a53_map_event)
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a35", armv8_a53_map_event); PMUV3_INIT_MAP_EVENT(armv8_cortex_a57, armv8_a57_map_event)
} PMUV3_INIT_MAP_EVENT(armv8_cortex_a72, armv8_a57_map_event)
PMUV3_INIT_MAP_EVENT(armv8_cortex_a73, armv8_a73_map_event)
static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu) PMUV3_INIT_MAP_EVENT(armv8_cavium_thunder, armv8_thunder_map_event)
{ PMUV3_INIT_MAP_EVENT(armv8_brcm_vulcan, armv8_vulcan_map_event)
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a53", armv8_a53_map_event);
}
static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a57", armv8_a57_map_event);
}
static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a72", armv8_a57_map_event);
}
static int armv8_a73_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a73", armv8_a73_map_event);
}
static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cavium_thunder", armv8_thunder_map_event);
}
static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_brcm_vulcan", armv8_vulcan_map_event);
}
static const struct of_device_id armv8_pmu_of_device_ids[] = { static const struct of_device_id armv8_pmu_of_device_ids[] = {
{.compatible = "arm,armv8-pmuv3", .data = armv8_pmuv3_pmu_init}, {.compatible = "arm,armv8-pmuv3", .data = armv8_pmuv3_pmu_init},
{.compatible = "arm,cortex-a34-pmu", .data = armv8_cortex_a34_pmu_init}, {.compatible = "arm,cortex-a34-pmu", .data = armv8_cortex_a34_pmu_init},
{.compatible = "arm,cortex-a35-pmu", .data = armv8_a35_pmu_init}, {.compatible = "arm,cortex-a35-pmu", .data = armv8_cortex_a35_pmu_init},
{.compatible = "arm,cortex-a53-pmu", .data = armv8_a53_pmu_init}, {.compatible = "arm,cortex-a53-pmu", .data = armv8_cortex_a53_pmu_init},
{.compatible = "arm,cortex-a55-pmu", .data = armv8_cortex_a55_pmu_init}, {.compatible = "arm,cortex-a55-pmu", .data = armv8_cortex_a55_pmu_init},
{.compatible = "arm,cortex-a57-pmu", .data = armv8_a57_pmu_init}, {.compatible = "arm,cortex-a57-pmu", .data = armv8_cortex_a57_pmu_init},
{.compatible = "arm,cortex-a65-pmu", .data = armv8_cortex_a65_pmu_init}, {.compatible = "arm,cortex-a65-pmu", .data = armv8_cortex_a65_pmu_init},
{.compatible = "arm,cortex-a72-pmu", .data = armv8_a72_pmu_init}, {.compatible = "arm,cortex-a72-pmu", .data = armv8_cortex_a72_pmu_init},
{.compatible = "arm,cortex-a73-pmu", .data = armv8_a73_pmu_init}, {.compatible = "arm,cortex-a73-pmu", .data = armv8_cortex_a73_pmu_init},
{.compatible = "arm,cortex-a75-pmu", .data = armv8_cortex_a75_pmu_init}, {.compatible = "arm,cortex-a75-pmu", .data = armv8_cortex_a75_pmu_init},
{.compatible = "arm,cortex-a76-pmu", .data = armv8_cortex_a76_pmu_init}, {.compatible = "arm,cortex-a76-pmu", .data = armv8_cortex_a76_pmu_init},
{.compatible = "arm,cortex-a77-pmu", .data = armv8_cortex_a77_pmu_init}, {.compatible = "arm,cortex-a77-pmu", .data = armv8_cortex_a77_pmu_init},
...@@ -1309,8 +1387,8 @@ static const struct of_device_id armv8_pmu_of_device_ids[] = { ...@@ -1309,8 +1387,8 @@ static const struct of_device_id armv8_pmu_of_device_ids[] = {
{.compatible = "arm,neoverse-n1-pmu", .data = armv8_neoverse_n1_pmu_init}, {.compatible = "arm,neoverse-n1-pmu", .data = armv8_neoverse_n1_pmu_init},
{.compatible = "arm,neoverse-n2-pmu", .data = armv9_neoverse_n2_pmu_init}, {.compatible = "arm,neoverse-n2-pmu", .data = armv9_neoverse_n2_pmu_init},
{.compatible = "arm,neoverse-v1-pmu", .data = armv8_neoverse_v1_pmu_init}, {.compatible = "arm,neoverse-v1-pmu", .data = armv8_neoverse_v1_pmu_init},
{.compatible = "cavium,thunder-pmu", .data = armv8_thunder_pmu_init}, {.compatible = "cavium,thunder-pmu", .data = armv8_cavium_thunder_pmu_init},
{.compatible = "brcm,vulcan-pmu", .data = armv8_vulcan_pmu_init}, {.compatible = "brcm,vulcan-pmu", .data = armv8_brcm_vulcan_pmu_init},
{.compatible = "nvidia,carmel-pmu", .data = armv8_nvidia_carmel_pmu_init}, {.compatible = "nvidia,carmel-pmu", .data = armv8_nvidia_carmel_pmu_init},
{.compatible = "nvidia,denver-pmu", .data = armv8_nvidia_denver_pmu_init}, {.compatible = "nvidia,denver-pmu", .data = armv8_nvidia_denver_pmu_init},
{}, {},
......
...@@ -206,28 +206,6 @@ static const struct attribute_group arm_spe_pmu_cap_group = { ...@@ -206,28 +206,6 @@ static const struct attribute_group arm_spe_pmu_cap_group = {
#define ATTR_CFG_FLD_inv_event_filter_LO 0 #define ATTR_CFG_FLD_inv_event_filter_LO 0
#define ATTR_CFG_FLD_inv_event_filter_HI 63 #define ATTR_CFG_FLD_inv_event_filter_HI 63
/* Why does everything I do descend into this? */
#define __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \
(lo) == (hi) ? #cfg ":" #lo "\n" : #cfg ":" #lo "-" #hi
#define _GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \
__GEN_PMU_FORMAT_ATTR(cfg, lo, hi)
#define GEN_PMU_FORMAT_ATTR(name) \
PMU_FORMAT_ATTR(name, \
_GEN_PMU_FORMAT_ATTR(ATTR_CFG_FLD_##name##_CFG, \
ATTR_CFG_FLD_##name##_LO, \
ATTR_CFG_FLD_##name##_HI))
#define _ATTR_CFG_GET_FLD(attr, cfg, lo, hi) \
((((attr)->cfg) >> lo) & GENMASK(hi - lo, 0))
#define ATTR_CFG_GET_FLD(attr, name) \
_ATTR_CFG_GET_FLD(attr, \
ATTR_CFG_FLD_##name##_CFG, \
ATTR_CFG_FLD_##name##_LO, \
ATTR_CFG_FLD_##name##_HI)
GEN_PMU_FORMAT_ATTR(ts_enable); GEN_PMU_FORMAT_ATTR(ts_enable);
GEN_PMU_FORMAT_ATTR(pa_enable); GEN_PMU_FORMAT_ATTR(pa_enable);
GEN_PMU_FORMAT_ATTR(pct_enable); GEN_PMU_FORMAT_ATTR(pct_enable);
......
// SPDX-License-Identifier: GPL-2.0
/*
* Synopsys DesignWare PCIe PMU driver
*
* Copyright (C) 2021-2023 Alibaba Inc.
*/
#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/cpuhotplug.h>
#include <linux/cpumask.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/perf_event.h>
#include <linux/pci.h>
#include <linux/platform_device.h>
#include <linux/smp.h>
#include <linux/sysfs.h>
#include <linux/types.h>
#define DWC_PCIE_VSEC_RAS_DES_ID 0x02
#define DWC_PCIE_EVENT_CNT_CTL 0x8
/*
* Event Counter Data Select includes two parts:
* - 27-24: Group number(4-bit: 0..0x7)
* - 23-16: Event number(8-bit: 0..0x13) within the Group
*
* Put them together as in TRM.
*/
#define DWC_PCIE_CNT_EVENT_SEL GENMASK(27, 16)
#define DWC_PCIE_CNT_LANE_SEL GENMASK(11, 8)
#define DWC_PCIE_CNT_STATUS BIT(7)
#define DWC_PCIE_CNT_ENABLE GENMASK(4, 2)
#define DWC_PCIE_PER_EVENT_OFF 0x1
#define DWC_PCIE_PER_EVENT_ON 0x3
#define DWC_PCIE_EVENT_CLEAR GENMASK(1, 0)
#define DWC_PCIE_EVENT_PER_CLEAR 0x1
#define DWC_PCIE_EVENT_CNT_DATA 0xC
#define DWC_PCIE_TIME_BASED_ANAL_CTL 0x10
#define DWC_PCIE_TIME_BASED_REPORT_SEL GENMASK(31, 24)
#define DWC_PCIE_TIME_BASED_DURATION_SEL GENMASK(15, 8)
#define DWC_PCIE_DURATION_MANUAL_CTL 0x0
#define DWC_PCIE_DURATION_1MS 0x1
#define DWC_PCIE_DURATION_10MS 0x2
#define DWC_PCIE_DURATION_100MS 0x3
#define DWC_PCIE_DURATION_1S 0x4
#define DWC_PCIE_DURATION_2S 0x5
#define DWC_PCIE_DURATION_4S 0x6
#define DWC_PCIE_DURATION_4US 0xFF
#define DWC_PCIE_TIME_BASED_TIMER_START BIT(0)
#define DWC_PCIE_TIME_BASED_CNT_ENABLE 0x1
#define DWC_PCIE_TIME_BASED_ANAL_DATA_REG_LOW 0x14
#define DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH 0x18
/* Event attributes */
#define DWC_PCIE_CONFIG_EVENTID GENMASK(15, 0)
#define DWC_PCIE_CONFIG_TYPE GENMASK(19, 16)
#define DWC_PCIE_CONFIG_LANE GENMASK(27, 20)
#define DWC_PCIE_EVENT_ID(event) FIELD_GET(DWC_PCIE_CONFIG_EVENTID, (event)->attr.config)
#define DWC_PCIE_EVENT_TYPE(event) FIELD_GET(DWC_PCIE_CONFIG_TYPE, (event)->attr.config)
#define DWC_PCIE_EVENT_LANE(event) FIELD_GET(DWC_PCIE_CONFIG_LANE, (event)->attr.config)
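The three config fields above (eventid in bits 0-15, type in bits 16-19, lane in bits 20-27) can be packed and unpacked in user space as sketched below. This is an illustration only and not part of the commit: the `CFG_*` macros and the `dwc_cfg_*` helpers are hypothetical names, and only the bit positions are taken from the `DWC_PCIE_CONFIG_*` masks.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout mirrors DWC_PCIE_CONFIG_EVENTID/TYPE/LANE in the driver. */
#define CFG_EVENTID_SHIFT 0
#define CFG_EVENTID_MASK  0xffffULL
#define CFG_TYPE_SHIFT    16
#define CFG_TYPE_MASK     0xfULL
#define CFG_LANE_SHIFT    20
#define CFG_LANE_MASK     0xffULL

/* Hypothetical helper: build a perf attr.config value from its fields. */
static uint64_t dwc_cfg_pack(uint16_t eventid, unsigned int type,
			     unsigned int lane)
{
	/* Reject values that would spill into a neighbouring field. */
	assert(type <= CFG_TYPE_MASK && lane <= CFG_LANE_MASK);

	return ((uint64_t)eventid << CFG_EVENTID_SHIFT) |
	       ((uint64_t)type << CFG_TYPE_SHIFT) |
	       ((uint64_t)lane << CFG_LANE_SHIFT);
}

/* Hypothetical helpers: extract one field, as FIELD_GET() does in-kernel. */
static unsigned int dwc_cfg_eventid(uint64_t config)
{
	return (config >> CFG_EVENTID_SHIFT) & CFG_EVENTID_MASK;
}

static unsigned int dwc_cfg_type(uint64_t config)
{
	return (config >> CFG_TYPE_SHIFT) & CFG_TYPE_MASK;
}

static unsigned int dwc_cfg_lane(uint64_t config)
{
	return (config >> CFG_LANE_SHIFT) & CFG_LANE_MASK;
}
```

For example, event ID 0x701 with type 1 (the second enumerator, the lane event type) on lane 2 packs to 0x210701.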
enum dwc_pcie_event_type {
DWC_PCIE_TIME_BASE_EVENT,
DWC_PCIE_LANE_EVENT,
DWC_PCIE_EVENT_TYPE_MAX,
};
#define DWC_PCIE_LANE_EVENT_MAX_PERIOD GENMASK_ULL(31, 0)
#define DWC_PCIE_MAX_PERIOD GENMASK_ULL(63, 0)
struct dwc_pcie_pmu {
struct pmu pmu;
struct pci_dev *pdev; /* Root Port device */
u16 ras_des_offset;
u32 nr_lanes;
struct list_head pmu_node;
struct hlist_node cpuhp_node;
struct perf_event *event[DWC_PCIE_EVENT_TYPE_MAX];
int on_cpu;
};
#define to_dwc_pcie_pmu(p) (container_of(p, struct dwc_pcie_pmu, pmu))
static int dwc_pcie_pmu_hp_state;
static struct list_head dwc_pcie_dev_info_head =
LIST_HEAD_INIT(dwc_pcie_dev_info_head);
static bool notify;
struct dwc_pcie_dev_info {
struct platform_device *plat_dev;
struct pci_dev *pdev;
struct list_head dev_node;
};
struct dwc_pcie_vendor_id {
int vendor_id;
};
static const struct dwc_pcie_vendor_id dwc_pcie_vendor_ids[] = {
{.vendor_id = PCI_VENDOR_ID_ALIBABA },
{} /* terminator */
};
static ssize_t cpumask_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(dev_get_drvdata(dev));
return cpumap_print_to_pagebuf(true, buf, cpumask_of(pcie_pmu->on_cpu));
}
static DEVICE_ATTR_RO(cpumask);
static struct attribute *dwc_pcie_pmu_cpumask_attrs[] = {
&dev_attr_cpumask.attr,
NULL
};
static struct attribute_group dwc_pcie_cpumask_attr_group = {
.attrs = dwc_pcie_pmu_cpumask_attrs,
};
struct dwc_pcie_format_attr {
struct device_attribute attr;
u64 field;
int config;
};
PMU_FORMAT_ATTR(eventid, "config:0-15");
PMU_FORMAT_ATTR(type, "config:16-19");
PMU_FORMAT_ATTR(lane, "config:20-27");
static struct attribute *dwc_pcie_format_attrs[] = {
&format_attr_type.attr,
&format_attr_eventid.attr,
&format_attr_lane.attr,
NULL,
};
static struct attribute_group dwc_pcie_format_attrs_group = {
.name = "format",
.attrs = dwc_pcie_format_attrs,
};
struct dwc_pcie_event_attr {
struct device_attribute attr;
enum dwc_pcie_event_type type;
u16 eventid;
u8 lane;
};
static ssize_t dwc_pcie_event_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct dwc_pcie_event_attr *eattr;
eattr = container_of(attr, typeof(*eattr), attr);
if (eattr->type == DWC_PCIE_LANE_EVENT)
return sysfs_emit(buf, "eventid=0x%x,type=0x%x,lane=?\n",
eattr->eventid, eattr->type);
else if (eattr->type == DWC_PCIE_TIME_BASE_EVENT)
return sysfs_emit(buf, "eventid=0x%x,type=0x%x\n",
eattr->eventid, eattr->type);
return 0;
}
#define DWC_PCIE_EVENT_ATTR(_name, _type, _eventid, _lane) \
(&((struct dwc_pcie_event_attr[]) {{ \
.attr = __ATTR(_name, 0444, dwc_pcie_event_show, NULL), \
.type = _type, \
.eventid = _eventid, \
.lane = _lane, \
}})[0].attr.attr)
#define DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(_name, _eventid) \
DWC_PCIE_EVENT_ATTR(_name, DWC_PCIE_TIME_BASE_EVENT, _eventid, 0)
#define DWC_PCIE_PMU_LANE_EVENT_ATTR(_name, _eventid) \
DWC_PCIE_EVENT_ATTR(_name, DWC_PCIE_LANE_EVENT, _eventid, 0)
static struct attribute *dwc_pcie_pmu_time_event_attrs[] = {
/* Group #0 */
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(one_cycle, 0x00),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(TX_L0S, 0x01),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(RX_L0S, 0x02),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L0, 0x03),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1, 0x04),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1_1, 0x05),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1_2, 0x06),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(CFG_RCVRY, 0x07),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(TX_RX_L0S, 0x08),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1_AUX, 0x09),
/* Group #1 */
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Tx_PCIe_TLP_Data_Payload, 0x20),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Rx_PCIe_TLP_Data_Payload, 0x21),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Tx_CCIX_TLP_Data_Payload, 0x22),
DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Rx_CCIX_TLP_Data_Payload, 0x23),
/*
* Leave it to the user to specify the lane ID to avoid generating
* a list of hundreds of events.
*/
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_ack_dllp, 0x600),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_update_fc_dllp, 0x601),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_ack_dllp, 0x602),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_update_fc_dllp, 0x603),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_nulified_tlp, 0x604),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_nulified_tlp, 0x605),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_duplicate_tl, 0x606),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_memory_write, 0x700),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_memory_read, 0x701),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_configuration_write, 0x702),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_configuration_read, 0x703),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_io_write, 0x704),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_io_read, 0x705),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_completion_without_data, 0x706),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_completion_with_data, 0x707),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_message_tlp, 0x708),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_atomic, 0x709),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_tlp_with_prefix, 0x70A),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_memory_write, 0x70B),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_memory_read, 0x70C),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_io_write, 0x70F),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_io_read, 0x710),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_completion_without_data, 0x711),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_completion_with_data, 0x712),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_message_tlp, 0x713),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_atomic, 0x714),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_tlp_with_prefix, 0x715),
DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_ccix_tlp, 0x716),
DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_ccix_tlp, 0x717),
NULL
};
static const struct attribute_group dwc_pcie_event_attrs_group = {
.name = "events",
.attrs = dwc_pcie_pmu_time_event_attrs,
};
static const struct attribute_group *dwc_pcie_attr_groups[] = {
&dwc_pcie_event_attrs_group,
&dwc_pcie_format_attrs_group,
&dwc_pcie_cpumask_attr_group,
NULL
};
static void dwc_pcie_pmu_lane_event_enable(struct dwc_pcie_pmu *pcie_pmu,
bool enable)
{
struct pci_dev *pdev = pcie_pmu->pdev;
u16 ras_des_offset = pcie_pmu->ras_des_offset;
if (enable)
pci_clear_and_set_config_dword(pdev,
ras_des_offset + DWC_PCIE_EVENT_CNT_CTL,
DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON);
else
pci_clear_and_set_config_dword(pdev,
ras_des_offset + DWC_PCIE_EVENT_CNT_CTL,
DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_OFF);
}
static void dwc_pcie_pmu_time_based_event_enable(struct dwc_pcie_pmu *pcie_pmu,
bool enable)
{
struct pci_dev *pdev = pcie_pmu->pdev;
u16 ras_des_offset = pcie_pmu->ras_des_offset;
pci_clear_and_set_config_dword(pdev,
ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_CTL,
DWC_PCIE_TIME_BASED_TIMER_START, enable);
}
static u64 dwc_pcie_pmu_read_lane_event_counter(struct perf_event *event)
{
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
struct pci_dev *pdev = pcie_pmu->pdev;
u16 ras_des_offset = pcie_pmu->ras_des_offset;
u32 val;
pci_read_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_DATA, &val);
return val;
}
static u64 dwc_pcie_pmu_read_time_based_counter(struct perf_event *event)
{
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
struct pci_dev *pdev = pcie_pmu->pdev;
int event_id = DWC_PCIE_EVENT_ID(event);
u16 ras_des_offset = pcie_pmu->ras_des_offset;
u32 lo, hi, ss;
u64 val;
/*
* The 64-bit value of the data counter is spread across two
* registers that are not synchronized. In order to read them
* atomically, ensure that the high 32 bits match before and after
* reading the low 32 bits.
*/
pci_read_config_dword(pdev,
ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH, &hi);
do {
/* snapshot the high 32 bits */
ss = hi;
pci_read_config_dword(
pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_LOW,
&lo);
pci_read_config_dword(
pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH,
&hi);
} while (hi != ss);
val = ((u64)hi << 32) | lo;
/*
* The Group#1 event measures the amount of data processed in 16-byte
* units. Simplify the end-user interface by multiplying the counter
* at the point of read.
*/
if (event_id >= 0x20 && event_id <= 0x23)
val *= 16;
return val;
}
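The high/low/high retry used by dwc_pcie_pmu_read_time_based_counter() can be exercised in user space. The sketch below is an illustration only, not part of the commit: `reg_read_fn`, `read_split_counter`, `fake_counter` and `fake_read` are hypothetical names, and the fake counter advances on every access purely to force a carry between the two halves.

```c
#include <stdint.h>

/* Hypothetical accessor: returns the 32-bit register at offset 'off'. */
typedef uint32_t (*reg_read_fn)(void *ctx, unsigned int off);

/*
 * Read a 64-bit counter exposed as two unsynchronized 32-bit registers.
 * The high half is sampled before and after the low half; if it changed,
 * a carry happened in between and the read is retried.
 */
static uint64_t read_split_counter(reg_read_fn rd, void *ctx,
				   unsigned int lo_off, unsigned int hi_off)
{
	uint32_t lo, hi, snap;

	hi = rd(ctx, hi_off);
	do {
		snap = hi;		/* snapshot the high 32 bits */
		lo = rd(ctx, lo_off);
		hi = rd(ctx, hi_off);
	} while (hi != snap);

	return ((uint64_t)hi << 32) | lo;
}

/* Toy counter that ticks by 'step' on every register access. */
struct fake_counter {
	uint64_t value;
	uint64_t step;
};

/* Offset 0 selects the low half, any other offset the high half. */
static uint32_t fake_read(void *ctx, unsigned int off)
{
	struct fake_counter *c = ctx;
	uint32_t v = off ? (uint32_t)(c->value >> 32) : (uint32_t)c->value;

	c->value += c->step;
	return v;
}
```

Starting the fake counter just below a 32-bit boundary shows the retry at work: the first pass sees the high half change and loops, so the result never combines a stale high word with a fresh low word.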
static void dwc_pcie_pmu_event_update(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event);
u64 delta, prev, now = 0;
do {
prev = local64_read(&hwc->prev_count);
if (type == DWC_PCIE_LANE_EVENT)
now = dwc_pcie_pmu_read_lane_event_counter(event);
else if (type == DWC_PCIE_TIME_BASE_EVENT)
now = dwc_pcie_pmu_read_time_based_counter(event);
} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
delta = (now - prev) & DWC_PCIE_MAX_PERIOD;
/* 32-bit counter for Lane Event Counting */
if (type == DWC_PCIE_LANE_EVENT)
delta &= DWC_PCIE_LANE_EVENT_MAX_PERIOD;
local64_add(delta, &event->count);
}
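The masking in dwc_pcie_pmu_event_update() is what keeps the delta correct when a counter narrower than 64 bits wraps. A minimal user-space sketch of that arithmetic follows; `counter_delta` is a hypothetical helper, not a function from the commit.

```c
#include <stdint.h>

/*
 * Delta between two readings of a free-running counter that is 'width'
 * bits wide. Unsigned subtraction wraps modulo 2^64, and masking to the
 * counter width discards the borrow when 'now' has wrapped past 'prev'.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t now, unsigned int width)
{
	uint64_t mask = width >= 64 ? ~0ULL : ((1ULL << width) - 1);

	return (now - prev) & mask;
}
```

With a 32-bit counter, a previous reading of 0xfffffff0 and a current reading of 0x10 give a delta of 0x20, which matches the number of increments across the wrap.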
static int dwc_pcie_pmu_event_init(struct perf_event *event)
{
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event);
struct perf_event *sibling;
u32 lane;
if (event->attr.type != event->pmu->type)
return -ENOENT;
/* We don't support sampling */
if (is_sampling_event(event))
return -EINVAL;
/* We cannot support task bound events */
if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK)
return -EINVAL;
if (event->group_leader != event &&
!is_software_event(event->group_leader))
return -EINVAL;
for_each_sibling_event(sibling, event->group_leader) {
if (sibling->pmu != event->pmu && !is_software_event(sibling))
return -EINVAL;
}
if (type < 0 || type >= DWC_PCIE_EVENT_TYPE_MAX)
return -EINVAL;
if (type == DWC_PCIE_LANE_EVENT) {
lane = DWC_PCIE_EVENT_LANE(event);
if (lane < 0 || lane >= pcie_pmu->nr_lanes)
return -EINVAL;
}
event->cpu = pcie_pmu->on_cpu;
return 0;
}
static void dwc_pcie_pmu_event_start(struct perf_event *event, int flags)
{
struct hw_perf_event *hwc = &event->hw;
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event);
hwc->state = 0;
local64_set(&hwc->prev_count, 0);
if (type == DWC_PCIE_LANE_EVENT)
dwc_pcie_pmu_lane_event_enable(pcie_pmu, true);
else if (type == DWC_PCIE_TIME_BASE_EVENT)
dwc_pcie_pmu_time_based_event_enable(pcie_pmu, true);
}
static void dwc_pcie_pmu_event_stop(struct perf_event *event, int flags)
{
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event);
struct hw_perf_event *hwc = &event->hw;
if (event->hw.state & PERF_HES_STOPPED)
return;
if (type == DWC_PCIE_LANE_EVENT)
dwc_pcie_pmu_lane_event_enable(pcie_pmu, false);
else if (type == DWC_PCIE_TIME_BASE_EVENT)
dwc_pcie_pmu_time_based_event_enable(pcie_pmu, false);
dwc_pcie_pmu_event_update(event);
hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
}
static int dwc_pcie_pmu_event_add(struct perf_event *event, int flags)
{
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
struct pci_dev *pdev = pcie_pmu->pdev;
struct hw_perf_event *hwc = &event->hw;
enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event);
int event_id = DWC_PCIE_EVENT_ID(event);
int lane = DWC_PCIE_EVENT_LANE(event);
u16 ras_des_offset = pcie_pmu->ras_des_offset;
u32 ctrl;
/* There is one counter per event type; fail if it is already in use */
if (pcie_pmu->event[type])
return -ENOSPC;
pcie_pmu->event[type] = event;
hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
if (type == DWC_PCIE_LANE_EVENT) {
/* EVENT_COUNTER_DATA_REG needs to be cleared manually */
ctrl = FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) |
FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) |
FIELD_PREP(DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_OFF) |
FIELD_PREP(DWC_PCIE_EVENT_CLEAR, DWC_PCIE_EVENT_PER_CLEAR);
pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL,
ctrl);
} else if (type == DWC_PCIE_TIME_BASE_EVENT) {
/*
 * TIME_BASED_ANAL_DATA_REG is a 64-bit register, so it can safely be
 * used with a manually controlled duration of any length. It is
 * cleared when the next measurement starts.
 */
ctrl = FIELD_PREP(DWC_PCIE_TIME_BASED_REPORT_SEL, event_id) |
FIELD_PREP(DWC_PCIE_TIME_BASED_DURATION_SEL,
DWC_PCIE_DURATION_MANUAL_CTL) |
DWC_PCIE_TIME_BASED_CNT_ENABLE;
pci_write_config_dword(
pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_CTL, ctrl);
}
if (flags & PERF_EF_START)
dwc_pcie_pmu_event_start(event, PERF_EF_RELOAD);
perf_event_update_userpage(event);
return 0;
}
static void dwc_pcie_pmu_event_del(struct perf_event *event, int flags)
{
struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event);
dwc_pcie_pmu_event_stop(event, flags | PERF_EF_UPDATE);
perf_event_update_userpage(event);
pcie_pmu->event[type] = NULL;
}
static void dwc_pcie_pmu_remove_cpuhp_instance(void *hotplug_node)
{
cpuhp_state_remove_instance_nocalls(dwc_pcie_pmu_hp_state, hotplug_node);
}
/*
 * Find the bound DES capability device info of a PCI device.
* @pdev: The PCI device.
*/
static struct dwc_pcie_dev_info *dwc_pcie_find_dev_info(struct pci_dev *pdev)
{
struct dwc_pcie_dev_info *dev_info;
list_for_each_entry(dev_info, &dwc_pcie_dev_info_head, dev_node)
if (dev_info->pdev == pdev)
return dev_info;
return NULL;
}
static void dwc_pcie_unregister_pmu(void *data)
{
struct dwc_pcie_pmu *pcie_pmu = data;
perf_pmu_unregister(&pcie_pmu->pmu);
}
static bool dwc_pcie_match_des_cap(struct pci_dev *pdev)
{
const struct dwc_pcie_vendor_id *vid;
u16 vsec = 0;
u32 val;
if (!pci_is_pcie(pdev) || !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT))
return false;
for (vid = dwc_pcie_vendor_ids; vid->vendor_id; vid++) {
vsec = pci_find_vsec_capability(pdev, vid->vendor_id,
DWC_PCIE_VSEC_RAS_DES_ID);
if (vsec)
break;
}
if (!vsec)
return false;
pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val);
if (PCI_VNDR_HEADER_REV(val) != 0x04)
return false;
pci_dbg(pdev,
"Detected PCIe Vendor-Specific Extended Capability RAS DES\n");
return true;
}
static void dwc_pcie_unregister_dev(struct dwc_pcie_dev_info *dev_info)
{
platform_device_unregister(dev_info->plat_dev);
list_del(&dev_info->dev_node);
kfree(dev_info);
}
static int dwc_pcie_register_dev(struct pci_dev *pdev)
{
struct platform_device *plat_dev;
struct dwc_pcie_dev_info *dev_info;
u32 bdf;
bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", bdf,
pdev, sizeof(*pdev));
if (IS_ERR(plat_dev))
return PTR_ERR(plat_dev);
dev_info = kzalloc(sizeof(*dev_info), GFP_KERNEL);
if (!dev_info) {
/* Do not leak the platform device registered above */
platform_device_unregister(plat_dev);
return -ENOMEM;
}
/* Cache the platform device to handle PCI device hotplug */
dev_info->plat_dev = plat_dev;
dev_info->pdev = pdev;
list_add(&dev_info->dev_node, &dwc_pcie_dev_info_head);
return 0;
}
static int dwc_pcie_pmu_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
struct device *dev = data;
struct pci_dev *pdev = to_pci_dev(dev);
struct dwc_pcie_dev_info *dev_info;
switch (action) {
case BUS_NOTIFY_ADD_DEVICE:
if (!dwc_pcie_match_des_cap(pdev))
return NOTIFY_DONE;
if (dwc_pcie_register_dev(pdev))
return NOTIFY_BAD;
break;
case BUS_NOTIFY_DEL_DEVICE:
dev_info = dwc_pcie_find_dev_info(pdev);
if (!dev_info)
return NOTIFY_DONE;
dwc_pcie_unregister_dev(dev_info);
break;
}
return NOTIFY_OK;
}
static struct notifier_block dwc_pcie_pmu_nb = {
.notifier_call = dwc_pcie_pmu_notifier,
};
static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
{
struct pci_dev *pdev = plat_dev->dev.platform_data;
struct dwc_pcie_pmu *pcie_pmu;
char *name;
u32 bdf, val;
u16 vsec;
int ret;
vsec = pci_find_vsec_capability(pdev, pdev->vendor,
DWC_PCIE_VSEC_RAS_DES_ID);
pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val);
bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", bdf);
if (!name)
return -ENOMEM;
pcie_pmu = devm_kzalloc(&plat_dev->dev, sizeof(*pcie_pmu), GFP_KERNEL);
if (!pcie_pmu)
return -ENOMEM;
pcie_pmu->pdev = pdev;
pcie_pmu->ras_des_offset = vsec;
pcie_pmu->nr_lanes = pcie_get_width_cap(pdev);
pcie_pmu->on_cpu = -1;
pcie_pmu->pmu = (struct pmu){
.name = name,
.parent = &pdev->dev,
.module = THIS_MODULE,
.attr_groups = dwc_pcie_attr_groups,
.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
.task_ctx_nr = perf_invalid_context,
.event_init = dwc_pcie_pmu_event_init,
.add = dwc_pcie_pmu_event_add,
.del = dwc_pcie_pmu_event_del,
.start = dwc_pcie_pmu_event_start,
.stop = dwc_pcie_pmu_event_stop,
.read = dwc_pcie_pmu_event_update,
};
/* Add this instance to the list used by the offline callback */
ret = cpuhp_state_add_instance(dwc_pcie_pmu_hp_state,
&pcie_pmu->cpuhp_node);
if (ret) {
pci_err(pdev, "Error %d registering hotplug @%x\n", ret, bdf);
return ret;
}
/* Unwind when the platform driver is removed */
ret = devm_add_action_or_reset(&plat_dev->dev,
dwc_pcie_pmu_remove_cpuhp_instance,
&pcie_pmu->cpuhp_node);
if (ret)
return ret;
ret = perf_pmu_register(&pcie_pmu->pmu, name, -1);
if (ret) {
pci_err(pdev, "Error %d registering PMU @%x\n", ret, bdf);
return ret;
}
ret = devm_add_action_or_reset(&plat_dev->dev, dwc_pcie_unregister_pmu,
pcie_pmu);
if (ret)
return ret;
return 0;
}
static int dwc_pcie_pmu_online_cpu(unsigned int cpu, struct hlist_node *cpuhp_node)
{
struct dwc_pcie_pmu *pcie_pmu;
pcie_pmu = hlist_entry_safe(cpuhp_node, struct dwc_pcie_pmu, cpuhp_node);
if (pcie_pmu->on_cpu == -1)
pcie_pmu->on_cpu = cpumask_local_spread(
0, dev_to_node(&pcie_pmu->pdev->dev));
return 0;
}
static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_node)
{
struct dwc_pcie_pmu *pcie_pmu;
struct pci_dev *pdev;
int node;
cpumask_t mask;
unsigned int target;
pcie_pmu = hlist_entry_safe(cpuhp_node, struct dwc_pcie_pmu, cpuhp_node);
/* Nothing to do if this CPU doesn't own the PMU */
if (cpu != pcie_pmu->on_cpu)
return 0;
pcie_pmu->on_cpu = -1;
pdev = pcie_pmu->pdev;
node = dev_to_node(&pdev->dev);
if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
target = cpumask_any(&mask);
else
target = cpumask_any_but(cpu_online_mask, cpu);
if (target >= nr_cpu_ids) {
pci_err(pdev, "There is no CPU to set\n");
return 0;
}
/* This PMU does NOT support interrupts, so just migrate the context. */
perf_pmu_migrate_context(&pcie_pmu->pmu, cpu, target);
pcie_pmu->on_cpu = target;
return 0;
}
static struct platform_driver dwc_pcie_pmu_driver = {
.probe = dwc_pcie_pmu_probe,
.driver = {.name = "dwc_pcie_pmu",},
};
static int __init dwc_pcie_pmu_init(void)
{
struct pci_dev *pdev = NULL;
bool found = false;
int ret;
for_each_pci_dev(pdev) {
if (!dwc_pcie_match_des_cap(pdev))
continue;
ret = dwc_pcie_register_dev(pdev);
if (ret) {
pci_dev_put(pdev);
return ret;
}
found = true;
}
if (!found)
return -ENODEV;
ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
"perf/dwc_pcie_pmu:online",
dwc_pcie_pmu_online_cpu,
dwc_pcie_pmu_offline_cpu);
if (ret < 0)
return ret;
dwc_pcie_pmu_hp_state = ret;
ret = platform_driver_register(&dwc_pcie_pmu_driver);
if (ret)
goto platform_driver_register_err;
ret = bus_register_notifier(&pci_bus_type, &dwc_pcie_pmu_nb);
if (ret)
goto platform_driver_register_err;
notify = true;
return 0;
platform_driver_register_err:
cpuhp_remove_multi_state(dwc_pcie_pmu_hp_state);
return ret;
}
static void __exit dwc_pcie_pmu_exit(void)
{
struct dwc_pcie_dev_info *dev_info, *tmp;
if (notify)
bus_unregister_notifier(&pci_bus_type, &dwc_pcie_pmu_nb);
list_for_each_entry_safe(dev_info, tmp, &dwc_pcie_dev_info_head, dev_node)
dwc_pcie_unregister_dev(dev_info);
platform_driver_unregister(&dwc_pcie_pmu_driver);
cpuhp_remove_multi_state(dwc_pcie_pmu_hp_state);
}
module_init(dwc_pcie_pmu_init);
module_exit(dwc_pcie_pmu_exit);
MODULE_DESCRIPTION("PMU driver for DesignWare Cores PCI Express Controller");
MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
MODULE_LICENSE("GPL v2");
...@@ -19,6 +19,8 @@ ...@@ -19,6 +19,8 @@
#define COUNTER_READ 0x20 #define COUNTER_READ 0x20
#define COUNTER_DPCR1 0x30 #define COUNTER_DPCR1 0x30
#define COUNTER_MUX_CNTL 0x50
#define COUNTER_MASK_COMP 0x54
#define CNTL_OVER 0x1 #define CNTL_OVER 0x1
#define CNTL_CLEAR 0x2 #define CNTL_CLEAR 0x2
...@@ -32,6 +34,13 @@ ...@@ -32,6 +34,13 @@
#define CNTL_CSV_SHIFT 24 #define CNTL_CSV_SHIFT 24
#define CNTL_CSV_MASK (0xFFU << CNTL_CSV_SHIFT) #define CNTL_CSV_MASK (0xFFU << CNTL_CSV_SHIFT)
#define READ_PORT_SHIFT 0
#define READ_PORT_MASK (0x7 << READ_PORT_SHIFT)
#define READ_CHANNEL_REVERT 0x00000008 /* bit 3 for read channel select */
#define WRITE_PORT_SHIFT 8
#define WRITE_PORT_MASK (0x7 << WRITE_PORT_SHIFT)
#define WRITE_CHANNEL_REVERT 0x00000800 /* bit 11 for write channel select */
#define EVENT_CYCLES_ID 0 #define EVENT_CYCLES_ID 0
#define EVENT_CYCLES_COUNTER 0 #define EVENT_CYCLES_COUNTER 0
#define NUM_COUNTERS 4 #define NUM_COUNTERS 4
...@@ -50,6 +59,7 @@ static DEFINE_IDA(ddr_ida); ...@@ -50,6 +59,7 @@ static DEFINE_IDA(ddr_ida);
/* DDR Perf hardware feature */ /* DDR Perf hardware feature */
#define DDR_CAP_AXI_ID_FILTER 0x1 /* support AXI ID filter */ #define DDR_CAP_AXI_ID_FILTER 0x1 /* support AXI ID filter */
#define DDR_CAP_AXI_ID_FILTER_ENHANCED 0x3 /* support enhanced AXI ID filter */ #define DDR_CAP_AXI_ID_FILTER_ENHANCED 0x3 /* support enhanced AXI ID filter */
#define DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER 0x4 /* support AXI ID PORT CHANNEL filter */
struct fsl_ddr_devtype_data { struct fsl_ddr_devtype_data {
unsigned int quirks; /* quirks needed for different DDR Perf core */ unsigned int quirks; /* quirks needed for different DDR Perf core */
...@@ -82,6 +92,11 @@ static const struct fsl_ddr_devtype_data imx8mp_devtype_data = { ...@@ -82,6 +92,11 @@ static const struct fsl_ddr_devtype_data imx8mp_devtype_data = {
.identifier = "i.MX8MP", .identifier = "i.MX8MP",
}; };
static const struct fsl_ddr_devtype_data imx8dxl_devtype_data = {
.quirks = DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER,
.identifier = "i.MX8DXL",
};
static const struct of_device_id imx_ddr_pmu_dt_ids[] = { static const struct of_device_id imx_ddr_pmu_dt_ids[] = {
{ .compatible = "fsl,imx8-ddr-pmu", .data = &imx8_devtype_data}, { .compatible = "fsl,imx8-ddr-pmu", .data = &imx8_devtype_data},
{ .compatible = "fsl,imx8m-ddr-pmu", .data = &imx8m_devtype_data}, { .compatible = "fsl,imx8m-ddr-pmu", .data = &imx8m_devtype_data},
...@@ -89,6 +104,7 @@ static const struct of_device_id imx_ddr_pmu_dt_ids[] = { ...@@ -89,6 +104,7 @@ static const struct of_device_id imx_ddr_pmu_dt_ids[] = {
{ .compatible = "fsl,imx8mm-ddr-pmu", .data = &imx8mm_devtype_data}, { .compatible = "fsl,imx8mm-ddr-pmu", .data = &imx8mm_devtype_data},
{ .compatible = "fsl,imx8mn-ddr-pmu", .data = &imx8mn_devtype_data}, { .compatible = "fsl,imx8mn-ddr-pmu", .data = &imx8mn_devtype_data},
{ .compatible = "fsl,imx8mp-ddr-pmu", .data = &imx8mp_devtype_data}, { .compatible = "fsl,imx8mp-ddr-pmu", .data = &imx8mp_devtype_data},
{ .compatible = "fsl,imx8dxl-ddr-pmu", .data = &imx8dxl_devtype_data},
{ /* sentinel */ } { /* sentinel */ }
}; };
MODULE_DEVICE_TABLE(of, imx_ddr_pmu_dt_ids); MODULE_DEVICE_TABLE(of, imx_ddr_pmu_dt_ids);
...@@ -144,6 +160,7 @@ static const struct attribute_group ddr_perf_identifier_attr_group = { ...@@ -144,6 +160,7 @@ static const struct attribute_group ddr_perf_identifier_attr_group = {
enum ddr_perf_filter_capabilities { enum ddr_perf_filter_capabilities {
PERF_CAP_AXI_ID_FILTER = 0, PERF_CAP_AXI_ID_FILTER = 0,
PERF_CAP_AXI_ID_FILTER_ENHANCED, PERF_CAP_AXI_ID_FILTER_ENHANCED,
PERF_CAP_AXI_ID_PORT_CHANNEL_FILTER,
PERF_CAP_AXI_ID_FEAT_MAX, PERF_CAP_AXI_ID_FEAT_MAX,
}; };
...@@ -157,6 +174,8 @@ static u32 ddr_perf_filter_cap_get(struct ddr_pmu *pmu, int cap) ...@@ -157,6 +174,8 @@ static u32 ddr_perf_filter_cap_get(struct ddr_pmu *pmu, int cap)
case PERF_CAP_AXI_ID_FILTER_ENHANCED: case PERF_CAP_AXI_ID_FILTER_ENHANCED:
quirks &= DDR_CAP_AXI_ID_FILTER_ENHANCED; quirks &= DDR_CAP_AXI_ID_FILTER_ENHANCED;
return quirks == DDR_CAP_AXI_ID_FILTER_ENHANCED; return quirks == DDR_CAP_AXI_ID_FILTER_ENHANCED;
case PERF_CAP_AXI_ID_PORT_CHANNEL_FILTER:
return !!(quirks & DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER);
default: default:
WARN(1, "unknown filter cap %d\n", cap); WARN(1, "unknown filter cap %d\n", cap);
} }
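The hunk above tests the new capability differently from the enhanced filter: `DDR_CAP_AXI_ID_FILTER_ENHANCED` is a two-bit value (0x3), so it is compared for equality after masking, while `DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER` is a single bit (0x4), so a plain bit test is enough. A minimal standalone sketch of that distinction follows. The `SKETCH_`/`sketch_` names are hypothetical and are not part of the driver.

```c
#include <stdbool.h>

/* Hypothetical copies of the quirk values shown in the diff. */
#define SKETCH_CAP_AXI_ID_FILTER               0x1
#define SKETCH_CAP_AXI_ID_FILTER_ENHANCED      0x3
#define SKETCH_CAP_AXI_ID_PORT_CHANNEL_FILTER  0x4

/* Multi-bit capability: every bit of the value must be present. */
static bool sketch_has_enhanced_filter(unsigned int quirks)
{
	return (quirks & SKETCH_CAP_AXI_ID_FILTER_ENHANCED) ==
	       SKETCH_CAP_AXI_ID_FILTER_ENHANCED;
}

/* Single-bit capability: a non-zero AND result is sufficient. */
static bool sketch_has_port_channel_filter(unsigned int quirks)
{
	return !!(quirks & SKETCH_CAP_AXI_ID_PORT_CHANNEL_FILTER);
}
```

With these definitions a quirks value of 0x1 reports neither capability, 0x3 reports only the enhanced filter, and 0x4 reports only the port/channel filter.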
...@@ -187,6 +206,7 @@ static ssize_t ddr_perf_filter_cap_show(struct device *dev, ...@@ -187,6 +206,7 @@ static ssize_t ddr_perf_filter_cap_show(struct device *dev,
static struct attribute *ddr_perf_filter_cap_attr[] = { static struct attribute *ddr_perf_filter_cap_attr[] = {
PERF_FILTER_EXT_ATTR_ENTRY(filter, PERF_CAP_AXI_ID_FILTER), PERF_FILTER_EXT_ATTR_ENTRY(filter, PERF_CAP_AXI_ID_FILTER),
PERF_FILTER_EXT_ATTR_ENTRY(enhanced_filter, PERF_CAP_AXI_ID_FILTER_ENHANCED), PERF_FILTER_EXT_ATTR_ENTRY(enhanced_filter, PERF_CAP_AXI_ID_FILTER_ENHANCED),
PERF_FILTER_EXT_ATTR_ENTRY(super_filter, PERF_CAP_AXI_ID_PORT_CHANNEL_FILTER),
NULL, NULL,
}; };
...@@ -272,11 +292,15 @@ static const struct attribute_group ddr_perf_events_attr_group = { ...@@ -272,11 +292,15 @@ static const struct attribute_group ddr_perf_events_attr_group = {
PMU_FORMAT_ATTR(event, "config:0-7"); PMU_FORMAT_ATTR(event, "config:0-7");
PMU_FORMAT_ATTR(axi_id, "config1:0-15"); PMU_FORMAT_ATTR(axi_id, "config1:0-15");
PMU_FORMAT_ATTR(axi_mask, "config1:16-31"); PMU_FORMAT_ATTR(axi_mask, "config1:16-31");
PMU_FORMAT_ATTR(axi_port, "config2:0-2");
PMU_FORMAT_ATTR(axi_channel, "config2:3-3");
static struct attribute *ddr_perf_format_attrs[] = { static struct attribute *ddr_perf_format_attrs[] = {
&format_attr_event.attr, &format_attr_event.attr,
&format_attr_axi_id.attr, &format_attr_axi_id.attr,
&format_attr_axi_mask.attr, &format_attr_axi_mask.attr,
&format_attr_axi_port.attr,
&format_attr_axi_channel.attr,
NULL, NULL,
}; };
...@@ -530,6 +554,7 @@ static int ddr_perf_event_add(struct perf_event *event, int flags) ...@@ -530,6 +554,7 @@ static int ddr_perf_event_add(struct perf_event *event, int flags)
int counter; int counter;
int cfg = event->attr.config; int cfg = event->attr.config;
int cfg1 = event->attr.config1; int cfg1 = event->attr.config1;
int cfg2 = event->attr.config2;
if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER) { if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER) {
int i; int i;
...@@ -553,6 +578,26 @@ static int ddr_perf_event_add(struct perf_event *event, int flags) ...@@ -553,6 +578,26 @@ static int ddr_perf_event_add(struct perf_event *event, int flags)
return -EOPNOTSUPP; return -EOPNOTSUPP;
} }
if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER) {
if (ddr_perf_is_filtered(event)) {
/* revert axi id masking(axi_mask) value */
cfg1 ^= AXI_MASKING_REVERT;
writel(cfg1, pmu->base + COUNTER_MASK_COMP + ((counter - 1) << 4));
if (cfg == 0x41) {
/* revert axi read channel(axi_channel) value */
cfg2 ^= READ_CHANNEL_REVERT;
cfg2 |= FIELD_PREP(READ_PORT_MASK, cfg2);
} else {
/* revert axi write channel(axi_channel) value */
cfg2 ^= WRITE_CHANNEL_REVERT;
cfg2 |= FIELD_PREP(WRITE_PORT_MASK, cfg2);
}
writel(cfg2, pmu->base + COUNTER_MUX_CNTL + ((counter - 1) << 4));
}
}
pmu->events[counter] = event; pmu->events[counter] = event;
hwc->idx = counter; hwc->idx = counter;
......
...@@ -617,7 +617,7 @@ static int ddr_perf_probe(struct platform_device *pdev) ...@@ -617,7 +617,7 @@ static int ddr_perf_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, pmu); platform_set_drvdata(pdev, pmu);
pmu->id = ida_simple_get(&ddr_ida, 0, 0, GFP_KERNEL); pmu->id = ida_alloc(&ddr_ida, GFP_KERNEL);
name = devm_kasprintf(&pdev->dev, GFP_KERNEL, DDR_PERF_DEV_NAME "%d", pmu->id); name = devm_kasprintf(&pdev->dev, GFP_KERNEL, DDR_PERF_DEV_NAME "%d", pmu->id);
if (!name) { if (!name) {
ret = -ENOMEM; ret = -ENOMEM;
...@@ -674,7 +674,7 @@ static int ddr_perf_probe(struct platform_device *pdev) ...@@ -674,7 +674,7 @@ static int ddr_perf_probe(struct platform_device *pdev)
cpuhp_remove_multi_state(pmu->cpuhp_state); cpuhp_remove_multi_state(pmu->cpuhp_state);
cpuhp_state_err: cpuhp_state_err:
format_string_err: format_string_err:
ida_simple_remove(&ddr_ida, pmu->id); ida_free(&ddr_ida, pmu->id);
dev_warn(&pdev->dev, "i.MX9 DDR Perf PMU failed (%d), disabled\n", ret); dev_warn(&pdev->dev, "i.MX9 DDR Perf PMU failed (%d), disabled\n", ret);
return ret; return ret;
} }
...@@ -688,7 +688,7 @@ static int ddr_perf_remove(struct platform_device *pdev) ...@@ -688,7 +688,7 @@ static int ddr_perf_remove(struct platform_device *pdev)
perf_pmu_unregister(&pmu->pmu); perf_pmu_unregister(&pmu->pmu);
ida_simple_remove(&ddr_ida, pmu->id); ida_free(&ddr_ida, pmu->id);
return 0; return 0;
} }
......
...@@ -383,8 +383,8 @@ static struct attribute *hisi_uc_pmu_events_attr[] = { ...@@ -383,8 +383,8 @@ static struct attribute *hisi_uc_pmu_events_attr[] = {
HISI_PMU_EVENT_ATTR(cpu_rd, 0x10), HISI_PMU_EVENT_ATTR(cpu_rd, 0x10),
HISI_PMU_EVENT_ATTR(cpu_rd64, 0x17), HISI_PMU_EVENT_ATTR(cpu_rd64, 0x17),
HISI_PMU_EVENT_ATTR(cpu_rs64, 0x19), HISI_PMU_EVENT_ATTR(cpu_rs64, 0x19),
HISI_PMU_EVENT_ATTR(cpu_mru, 0x1a), HISI_PMU_EVENT_ATTR(cpu_mru, 0x1c),
HISI_PMU_EVENT_ATTR(cycles, 0x9c), HISI_PMU_EVENT_ATTR(cycles, 0x95),
HISI_PMU_EVENT_ATTR(spipe_hit, 0xb3), HISI_PMU_EVENT_ATTR(spipe_hit, 0xb3),
HISI_PMU_EVENT_ATTR(hpipe_hit, 0xdb), HISI_PMU_EVENT_ATTR(hpipe_hit, 0xdb),
HISI_PMU_EVENT_ATTR(cring_rxdat_cnt, 0xfa), HISI_PMU_EVENT_ATTR(cring_rxdat_cnt, 0xfa),
......
...@@ -1239,6 +1239,8 @@ int pci_read_config_dword(const struct pci_dev *dev, int where, u32 *val); ...@@ -1239,6 +1239,8 @@ int pci_read_config_dword(const struct pci_dev *dev, int where, u32 *val);
int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val); int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val);
int pci_write_config_word(const struct pci_dev *dev, int where, u16 val); int pci_write_config_word(const struct pci_dev *dev, int where, u16 val);
int pci_write_config_dword(const struct pci_dev *dev, int where, u32 val); int pci_write_config_dword(const struct pci_dev *dev, int where, u32 val);
void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos,
u32 clear, u32 set);
int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val); int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val);
int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val); int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val);
......
...@@ -2605,6 +2605,8 @@ ...@@ -2605,6 +2605,8 @@
#define PCI_VENDOR_ID_TEKRAM 0x1de1 #define PCI_VENDOR_ID_TEKRAM 0x1de1
#define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29 #define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29
#define PCI_VENDOR_ID_ALIBABA 0x1ded
#define PCI_VENDOR_ID_TEHUTI 0x1fc9 #define PCI_VENDOR_ID_TEHUTI 0x1fc9
#define PCI_DEVICE_ID_TEHUTI_3009 0x3009 #define PCI_DEVICE_ID_TEHUTI_3009 0x3009
#define PCI_DEVICE_ID_TEHUTI_3010 0x3010 #define PCI_DEVICE_ID_TEHUTI_3010 0x3010
......
...@@ -59,12 +59,6 @@ struct pmu_hw_events { ...@@ -59,12 +59,6 @@ struct pmu_hw_events {
*/ */
DECLARE_BITMAP(used_mask, ARMPMU_MAX_HWEVENTS); DECLARE_BITMAP(used_mask, ARMPMU_MAX_HWEVENTS);
/*
* Hardware lock to serialize accesses to PMU registers. Needed for the
* read/modify/write sequences.
*/
raw_spinlock_t pmu_lock;
/* /*
* When using percpu IRQs, we need a percpu dev_id. Place it here as we * When using percpu IRQs, we need a percpu dev_id. Place it here as we
* already have to allocate this struct per cpu. * already have to allocate this struct per cpu.
...@@ -189,4 +183,26 @@ void armpmu_free_irq(int irq, int cpu); ...@@ -189,4 +183,26 @@ void armpmu_free_irq(int irq, int cpu);
#define ARMV8_SPE_PDEV_NAME "arm,spe-v1" #define ARMV8_SPE_PDEV_NAME "arm,spe-v1"
#define ARMV8_TRBE_PDEV_NAME "arm,trbe" #define ARMV8_TRBE_PDEV_NAME "arm,trbe"
/* Why does everything I do descend into this? */
#define __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \
(lo) == (hi) ? #cfg ":" #lo "\n" : #cfg ":" #lo "-" #hi
#define _GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \
__GEN_PMU_FORMAT_ATTR(cfg, lo, hi)
#define GEN_PMU_FORMAT_ATTR(name) \
PMU_FORMAT_ATTR(name, \
_GEN_PMU_FORMAT_ATTR(ATTR_CFG_FLD_##name##_CFG, \
ATTR_CFG_FLD_##name##_LO, \
ATTR_CFG_FLD_##name##_HI))
#define _ATTR_CFG_GET_FLD(attr, cfg, lo, hi) \
((((attr)->cfg) >> lo) & GENMASK_ULL(hi - lo, 0))
#define ATTR_CFG_GET_FLD(attr, name) \
_ATTR_CFG_GET_FLD(attr, \
ATTR_CFG_FLD_##name##_CFG, \
ATTR_CFG_FLD_##name##_LO, \
ATTR_CFG_FLD_##name##_HI)
#endif /* __ARM_PMU_H__ */ #endif /* __ARM_PMU_H__ */
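The `_ATTR_CFG_GET_FLD()` helper above extracts a bit field by shifting the config word right by `lo` and masking with `GENMASK_ULL(hi - lo, 0)`. The same arithmetic can be sketched in portable C as below. The function names and the example field position (bits 11:4) are assumptions made for illustration and do not correspond to any field defined in the kernel.

```c
#include <stdint.h>

/* Hypothetical stand-in for GENMASK_ULL(hi, lo): bits hi..lo set, 0 <= lo <= hi <= 63. */
static uint64_t sketch_genmask_ull(unsigned int hi, unsigned int lo)
{
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/* Same arithmetic as _ATTR_CFG_GET_FLD(): shift down, then keep (hi - lo + 1) bits. */
static uint64_t sketch_cfg_get_fld(uint64_t cfg, unsigned int lo, unsigned int hi)
{
	return (cfg >> lo) & sketch_genmask_ull(hi - lo, 0);
}
```

For a hypothetical field at bits 11:4, `sketch_cfg_get_fld(0x1234, 4, 11)` yields 0x23.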
...@@ -215,21 +215,27 @@ ...@@ -215,21 +215,27 @@
#define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ #define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/
#define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ #define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */
#define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */ #define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */
#define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */ #define ARMV8_PMU_PMCR_N GENMASK(15, 11) /* Number of counters supported */
#define ARMV8_PMU_PMCR_N_MASK 0x1f /* Mask for writable bits */
#define ARMV8_PMU_PMCR_MASK 0xff /* Mask for writable bits */ #define ARMV8_PMU_PMCR_MASK (ARMV8_PMU_PMCR_E | ARMV8_PMU_PMCR_P | \
ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_D | \
ARMV8_PMU_PMCR_X | ARMV8_PMU_PMCR_DP | \
ARMV8_PMU_PMCR_LC | ARMV8_PMU_PMCR_LP)
/* /*
* PMOVSR: counters overflow flag status reg * PMOVSR: counters overflow flag status reg
*/ */
#define ARMV8_PMU_OVSR_MASK 0xffffffff /* Mask for writable bits */ #define ARMV8_PMU_OVSR_P GENMASK(30, 0)
#define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_OVSR_MASK #define ARMV8_PMU_OVSR_C BIT(31)
/* Mask for writable bits is both P and C fields */
#define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C)
/* /*
* PMXEVTYPER: Event selection reg * PMXEVTYPER: Event selection reg
*/ */
#define ARMV8_PMU_EVTYPE_MASK 0xc800ffff /* Mask for writable bits */ #define ARMV8_PMU_EVTYPE_EVENT GENMASK(15, 0) /* Mask for EVENT bits */
#define ARMV8_PMU_EVTYPE_EVENT 0xffff /* Mask for EVENT bits */ #define ARMV8_PMU_EVTYPE_TH GENMASK_ULL(43, 32) /* arm64 only */
#define ARMV8_PMU_EVTYPE_TC GENMASK_ULL(63, 61) /* arm64 only */
/* /*
* Event filters for PMUv3 * Event filters for PMUv3
...@@ -244,19 +250,19 @@ ...@@ -244,19 +250,19 @@
/* /*
* PMUSERENR: user enable reg * PMUSERENR: user enable reg
*/ */
#define ARMV8_PMU_USERENR_MASK 0xf /* Mask for writable bits */
#define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */ #define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */
#define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */ #define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */
#define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */ #define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */
#define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */ #define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */
/* Mask for writable bits */
#define ARMV8_PMU_USERENR_MASK (ARMV8_PMU_USERENR_EN | ARMV8_PMU_USERENR_SW | \
ARMV8_PMU_USERENR_CR | ARMV8_PMU_USERENR_ER)
/* PMMIR_EL1.SLOTS mask */ /* PMMIR_EL1.SLOTS mask */
#define ARMV8_PMU_SLOTS_MASK 0xff #define ARMV8_PMU_SLOTS GENMASK(7, 0)
#define ARMV8_PMU_BUS_SLOTS GENMASK(15, 8)
#define ARMV8_PMU_BUS_SLOTS_SHIFT 8 #define ARMV8_PMU_BUS_WIDTH GENMASK(19, 16)
#define ARMV8_PMU_BUS_SLOTS_MASK 0xff #define ARMV8_PMU_THWIDTH GENMASK(23, 20)
#define ARMV8_PMU_BUS_WIDTH_SHIFT 16
#define ARMV8_PMU_BUS_WIDTH_MASK 0xf
/* /*
* This code is really good * This code is really good
......
...@@ -218,45 +218,54 @@ ...@@ -218,45 +218,54 @@
#define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ #define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/
#define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ #define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */
#define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */ #define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */
#define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */ #define ARMV8_PMU_PMCR_N GENMASK(15, 11) /* Number of counters supported */
#define ARMV8_PMU_PMCR_N_MASK 0x1f /* Mask for writable bits */
#define ARMV8_PMU_PMCR_MASK 0xff /* Mask for writable bits */ #define ARMV8_PMU_PMCR_MASK (ARMV8_PMU_PMCR_E | ARMV8_PMU_PMCR_P | \
ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_D | \
ARMV8_PMU_PMCR_X | ARMV8_PMU_PMCR_DP | \
ARMV8_PMU_PMCR_LC | ARMV8_PMU_PMCR_LP)
/* /*
* PMOVSR: counters overflow flag status reg * PMOVSR: counters overflow flag status reg
*/ */
#define ARMV8_PMU_OVSR_MASK 0xffffffff /* Mask for writable bits */ #define ARMV8_PMU_OVSR_P GENMASK(30, 0)
#define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_OVSR_MASK #define ARMV8_PMU_OVSR_C BIT(31)
/* Mask for writable bits is both P and C fields */
#define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C)
/* /*
* PMXEVTYPER: Event selection reg * PMXEVTYPER: Event selection reg
*/ */
#define ARMV8_PMU_EVTYPE_MASK 0xc800ffff /* Mask for writable bits */ #define ARMV8_PMU_EVTYPE_EVENT GENMASK(15, 0) /* Mask for EVENT bits */
#define ARMV8_PMU_EVTYPE_EVENT 0xffff /* Mask for EVENT bits */ #define ARMV8_PMU_EVTYPE_TH GENMASK(43, 32)
#define ARMV8_PMU_EVTYPE_TC GENMASK(63, 61)
/* /*
* Event filters for PMUv3 * Event filters for PMUv3
*/ */
#define ARMV8_PMU_EXCLUDE_EL1 (1U << 31) #define ARMV8_PMU_EXCLUDE_EL1 (1U << 31)
#define ARMV8_PMU_EXCLUDE_EL0 (1U << 30) #define ARMV8_PMU_EXCLUDE_EL0 (1U << 30)
#define ARMV8_PMU_INCLUDE_EL2 (1U << 27) #define ARMV8_PMU_EXCLUDE_NS_EL1 (1U << 29)
#define ARMV8_PMU_EXCLUDE_NS_EL0 (1U << 28)
#define ARMV8_PMU_INCLUDE_EL2 (1U << 27)
#define ARMV8_PMU_EXCLUDE_EL3 (1U << 26)
/* /*
* PMUSERENR: user enable reg * PMUSERENR: user enable reg
*/ */
#define ARMV8_PMU_USERENR_MASK 0xf /* Mask for writable bits */
#define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */ #define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */
#define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */ #define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */
#define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */ #define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */
#define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */ #define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */
/* Mask for writable bits */
#define ARMV8_PMU_USERENR_MASK (ARMV8_PMU_USERENR_EN | ARMV8_PMU_USERENR_SW | \
ARMV8_PMU_USERENR_CR | ARMV8_PMU_USERENR_ER)
/* PMMIR_EL1.SLOTS mask */ /* PMMIR_EL1.SLOTS mask */
#define ARMV8_PMU_SLOTS_MASK 0xff #define ARMV8_PMU_SLOTS GENMASK(7, 0)
#define ARMV8_PMU_BUS_SLOTS GENMASK(15, 8)
#define ARMV8_PMU_BUS_SLOTS_SHIFT 8 #define ARMV8_PMU_BUS_WIDTH GENMASK(19, 16)
#define ARMV8_PMU_BUS_SLOTS_MASK 0xff #define ARMV8_PMU_THWIDTH GENMASK(23, 20)
#define ARMV8_PMU_BUS_WIDTH_SHIFT 16
#define ARMV8_PMU_BUS_WIDTH_MASK 0xf
/* /*
* This code is really good * This code is really good
......
...@@ -42,13 +42,12 @@ struct pmreg_sets { ...@@ -42,13 +42,12 @@ struct pmreg_sets {
static uint64_t get_pmcr_n(uint64_t pmcr) static uint64_t get_pmcr_n(uint64_t pmcr)
{ {
return (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK; return FIELD_GET(ARMV8_PMU_PMCR_N, pmcr);
} }
static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n) static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n)
{ {
*pmcr = *pmcr & ~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT); u64p_replace_bits((__u64 *) pmcr, pmcr_n, ARMV8_PMU_PMCR_N);
*pmcr |= (pmcr_n << ARMV8_PMU_PMCR_N_SHIFT);
} }
static uint64_t get_counters_mask(uint64_t n) static uint64_t get_counters_mask(uint64_t n)
......
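The selftest hunk above replaces open-coded shift-and-mask arithmetic on `PMCR.N` with `FIELD_GET()` and `u64p_replace_bits()` driven by a single `GENMASK(15, 11)` definition. The sketch below shows, in plain userspace C, why the two forms are expected to agree. All `SKETCH_`/`sketch_` names are hypothetical stand-ins, not the kernel helpers, and they assume a non-zero, contiguous mask.

```c
#include <stdint.h>

/* Hypothetical stand-in for GENMASK_ULL(hi, lo). */
#define SKETCH_GENMASK_ULL(hi, lo) \
	((~0ULL >> (63 - (hi))) & (~0ULL << (lo)))

#define SKETCH_PMCR_N	SKETCH_GENMASK_ULL(15, 11)

/* Index of the lowest set bit of a non-zero mask. */
static unsigned int sketch_mask_shift(uint64_t mask)
{
	unsigned int shift = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Mask-driven read, in the spirit of FIELD_GET(mask, reg). */
static uint64_t sketch_field_get(uint64_t mask, uint64_t reg)
{
	return (reg & mask) >> sketch_mask_shift(mask);
}

/* Mask-driven update, in the spirit of u64p_replace_bits(). */
static uint64_t sketch_replace_bits(uint64_t reg, uint64_t val, uint64_t mask)
{
	return (reg & ~mask) | ((val << sketch_mask_shift(mask)) & mask);
}

/* The removed form: separate shift (11) and mask (0x1f) constants. */
static uint64_t sketch_old_get_pmcr_n(uint64_t pmcr)
{
	return (pmcr >> 11) & 0x1f;
}
```

Keeping the field position in one mask avoids having to keep a `_SHIFT`/`_MASK` pair in sync, which is the apparent motivation for the GENMASK conversions in this series.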