Commit 4d2e26a3 authored by Mauro Carvalho Chehab's avatar Mauro Carvalho Chehab

docs: powerpc: convert docs to ReST and rename to *.rst

Convert docs to ReST and add them to the arch-specific
book.

The conversion here was trivial, as almost every file there
was already using an elegant format close to ReST standard.

The changes were mostly to mark literal blocks and add a few
missing section title identifiers.

One note with regards to "--": on Sphinx, this can't be used
to identify a list, as it will format it badly. This can be
used, however, to identify a long hyphen - and "---" is an
even longer one.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> # cxl
parent 0a8ad0ff
......@@ -403,7 +403,7 @@ That is, the recovery API only requires that:
.. note::
Implementation details for the powerpc platform are discussed in
the file Documentation/powerpc/eeh-pci-error-recovery.txt
the file Documentation/powerpc/eeh-pci-error-recovery.rst
As of this writing, there is a growing list of device drivers with
patches implementing error recovery. Not all of these patches are in
......@@ -422,3 +422,6 @@ That is, the recovery API only requires that:
- drivers/net/cxgb3
- drivers/net/s2io.c
- drivers/net/qlge
The End
-------
......@@ -143,6 +143,7 @@ implementation.
arm64/index
ia64/index
m68k/index
powerpc/index
riscv/index
s390/index
sh/index
......
========================
The PowerPC boot wrapper
------------------------
========================
Copyright (C) Secret Lab Technologies Ltd.
PowerPC image targets compresses and wraps the kernel image (vmlinux) with
......@@ -21,6 +23,7 @@ it uses the wrapper script (arch/powerpc/boot/wrapper) to generate target
image. The details of the build system is discussed in the next section.
Currently, the following image format targets exist:
==================== ========================================================
cuImage.%: Backwards compatible uImage for older version of
U-Boot (for versions that don't understand the device
tree). This image embeds a device tree blob inside
......@@ -29,6 +32,7 @@ Currently, the following image format targets exist:
with boot wrapper code that extracts data from the old
bd_info structure and loads the data into the device
tree before jumping into the kernel.
Because of the series of #ifdefs found in the
bd_info structure used in the old U-Boot interfaces,
cuImages are platform specific. Each specific
......@@ -36,24 +40,28 @@ Currently, the following image format targets exist:
which populates the embedded device tree with data
from the platform specific bd_info file. The platform
specific cuImage platform init code can be found in
arch/powerpc/boot/cuboot.*.c. Selection of the correct
`arch/powerpc/boot/cuboot.*.c`. Selection of the correct
cuImage init code for a specific board can be found in
the wrapper structure.
dtbImage.%: Similar to zImage, except device tree blob is embedded
inside the image instead of provided by firmware. The
output image file can be either an elf file or a flat
binary depending on the platform.
dtbImages are used on systems which do not have an
interface for passing a device tree directly.
dtbImages are similar to simpleImages except that
dtbImages have platform specific code for extracting
data from the board firmware, but simpleImages do not
talk to the firmware at all.
PlayStation 3 support uses dtbImage. So do Embedded
Planet boards using the PlanetCore firmware. Board
specific initialization code is typically found in a
file named arch/powerpc/boot/<platform>.c; but this
can be overridden by the wrapper script.
simpleImage.%: Firmware independent compressed image that does not
depend on any particular firmware interface and embeds
a device tree blob. This image is a flat binary that
......@@ -61,6 +69,7 @@ Currently, the following image format targets exist:
Firmware cannot pass any configuration data to the
kernel with this image type and it depends entirely on
the embedded device tree for all information.
The simpleImage is useful for booting systems with
an unknown firmware interface or for booting from
a debugger when no firmware is present (such as on
......@@ -68,6 +77,7 @@ Currently, the following image format targets exist:
simpleImage makes is that RAM is correctly initialized
and that the MMU is either off or has RAM mapped to
base address 0.
simpleImage also supports inserting special platform
specific initialization code to the start of the bootup
sequence. The virtex405 platform uses this feature to
......@@ -81,9 +91,11 @@ Currently, the following image format targets exist:
named (virtex405-<board>.dts). Search the wrapper
script for 'virtex405' and see the file
arch/powerpc/boot/virtex405-head.S for details.
treeImage.%; Image format for used with OpenBIOS firmware found
on some ppc4xx hardware. This image embeds a device
tree blob inside the image.
uImage: Native image format used by U-Boot. The uImage target
does not add any boot code. It just wraps a compressed
vmlinux in the uImage data structure. This image
......@@ -91,12 +103,14 @@ Currently, the following image format targets exist:
a device tree to the kernel at boot. If using an older
version of U-Boot, then you need to use a cuImage
instead.
zImage.%: Image format which does not embed a device tree.
Used by OpenFirmware and other firmware interfaces
which are able to supply a device tree. This image
expects firmware to provide the device tree at boot.
Typically, if you have general purpose PowerPC
hardware then you want this image format.
==================== ========================================================
Image types which embed a device tree blob (simpleImage, dtbImage, treeImage,
and cuImage) all generate the device tree blob from a file in the
......
============
CPU Families
============
......@@ -8,8 +9,8 @@ and are supported by arch/powerpc.
Book3S (aka sPAPR)
------------------
- Hash MMU
- Mix of 32 & 64 bit
- Hash MMU
- Mix of 32 & 64 bit::
+--------------+ +----------------+
| Old POWER | --------------> | RS64 (threads) |
......@@ -108,8 +109,8 @@ Book3S (aka sPAPR)
IBM BookE
---------
- Software loaded TLB.
- All 32 bit
- Software loaded TLB.
- All 32 bit::
+--------------+
| 401 |
......@@ -155,8 +156,8 @@ IBM BookE
Motorola/Freescale 8xx
----------------------
- Software loaded with hardware assist.
- All 32 bit
- Software loaded with hardware assist.
- All 32 bit::
+-------------+
| MPC8xx Core |
......@@ -166,9 +167,9 @@ Motorola/Freescale 8xx
Freescale BookE
---------------
- Software loaded TLB.
- e6500 adds HW loaded indirect TLB entries.
- Mix of 32 & 64 bit
- Software loaded TLB.
- e6500 adds HW loaded indirect TLB entries.
- Mix of 32 & 64 bit::
+--------------+
| e200 |
......@@ -207,8 +208,8 @@ Freescale BookE
IBM A2 core
-----------
- Book3E, software loaded TLB + HW loaded indirect TLB entries.
- 64 bit
- Book3E, software loaded TLB + HW loaded indirect TLB entries.
- 64 bit::
+--------------+ +----------------+
| A2 core | --> | WSP |
......
============
CPU Features
============
Hollis Blanchard <hollis@austin.ibm.com>
5 Jun 2002
......@@ -32,7 +36,7 @@ anyways).
After detecting the processor type, the kernel patches out sections of code
that shouldn't be used by writing nop's over it. Using cpufeatures requires
just 2 macros (found in arch/powerpc/include/asm/cputable.h), as seen in head.S
transfer_to_handler:
transfer_to_handler::
#ifdef CONFIG_ALTIVEC
BEGIN_FTR_SECTION
......
====================================
Coherent Accelerator Interface (CXL)
====================================
......@@ -21,6 +22,8 @@ Introduction
Hardware overview
=================
::
POWER8/9 FPGA
+----------+ +---------+
| | | |
......@@ -59,14 +62,16 @@ Hardware overview
the fault. The context to which this fault is serviced is based on
who owns that acceleration function.
POWER8 <-----> PSL Version 8 is compliant to the CAIA Version 1.0.
POWER9 <-----> PSL Version 9 is compliant to the CAIA Version 2.0.
- POWER8 and PSL Version 8 are compliant to the CAIA Version 1.0.
- POWER9 and PSL Version 9 are compliant to the CAIA Version 2.0.
This PSL Version 9 provides new features such as:
* Interaction with the nest MMU on the P9 chip.
* Native DMA support.
* Supports sending ASB_Notify messages for host thread wakeup.
* Supports Atomic operations.
* ....
* etc.
Cards with a PSL9 won't work on a POWER8 system and cards with a
PSL8 won't work on a POWER9 system.
......@@ -147,7 +152,9 @@ User API
master devices.
A userspace library libcxl is available here:
https://github.com/ibm-capi/libcxl
This provides a C interface to this kernel API.
open
......@@ -165,7 +172,8 @@ open
When all available contexts are allocated the open call will fail
and return -ENOSPC.
Note: IRQs need to be allocated for each context, which may limit
Note:
IRQs need to be allocated for each context, which may limit
the number of contexts that can be created, and therefore
how many times the device can be opened. The POWER8 CAPP
supports 2040 IRQs and 3 are used by the kernel, so 2037 are
......@@ -186,7 +194,9 @@ ioctl
updated as userspace allocates and frees memory. This ioctl
returns once the AFU context is started.
Takes a pointer to a struct cxl_ioctl_start_work:
Takes a pointer to a struct cxl_ioctl_start_work
::
struct cxl_ioctl_start_work {
__u64 flags;
......@@ -269,7 +279,7 @@ read
The buffer passed to read() must be at least 4K bytes.
The result of the read will be a buffer of one or more events,
each event is of type struct cxl_event, of varying size.
each event is of type struct cxl_event, of varying size::
struct cxl_event {
struct cxl_event_header header;
......@@ -280,7 +290,9 @@ read
};
};
The struct cxl_event_header is defined as:
The struct cxl_event_header is defined as
::
struct cxl_event_header {
__u16 type;
......@@ -307,7 +319,9 @@ read
For future extensions and padding.
If the event type is CXL_EVENT_AFU_INTERRUPT then the event
structure is defined as:
structure is defined as
::
struct cxl_event_afu_interrupt {
__u16 flags;
......@@ -326,7 +340,9 @@ read
For future extensions and padding.
If the event type is CXL_EVENT_DATA_STORAGE then the event
structure is defined as:
structure is defined as
::
struct cxl_event_data_storage {
__u16 flags;
......@@ -356,7 +372,9 @@ read
For future extensions
If the event type is CXL_EVENT_AFU_ERROR then the event structure
is defined as:
is defined as
::
struct cxl_event_afu_error {
__u16 flags;
......@@ -393,15 +411,15 @@ open
ioctl
-----
CXL_IOCTL_DOWNLOAD_IMAGE:
CXL_IOCTL_VALIDATE_IMAGE:
CXL_IOCTL_DOWNLOAD_IMAGE / CXL_IOCTL_VALIDATE_IMAGE:
Starts and controls flashing a new FPGA image. Partial
reconfiguration is not supported (yet), so the image must contain
a copy of the PSL and AFU(s). Since an image can be quite large,
the caller may have to iterate, splitting the image in smaller
chunks.
Takes a pointer to a struct cxl_adapter_image:
Takes a pointer to a struct cxl_adapter_image::
struct cxl_adapter_image {
__u64 flags;
__u64 data;
......@@ -442,7 +460,7 @@ Udev rules
The following udev rules could be used to create a symlink to the
most logical chardev to use in any programming mode (afuX.Yd for
dedicated, afuX.Ys for afu directed), since the API is virtually
identical for each:
identical for each::
SUBSYSTEM=="cxl", ATTRS{mode}=="dedicated_process", SYMLINK="cxl/%b"
SUBSYSTEM=="cxl", ATTRS{mode}=="afu_directed", \
......
================================
Coherent Accelerator (CXL) Flash
================================
Introduction
============
......@@ -28,7 +32,7 @@ Introduction
responsible for the initialization of the adapter, setting up the
special path for user space access, and performing error recovery. It
communicates directly the Flash Accelerator Functional Unit (AFU)
as described in Documentation/powerpc/cxl.txt.
as described in Documentation/powerpc/cxl.rst.
The cxlflash driver supports two, mutually exclusive, modes of
operation at the device (LUN) level:
......@@ -58,7 +62,7 @@ Overview
The CXL Flash Adapter Driver establishes a master context with the
AFU. It uses memory mapped I/O (MMIO) for this control and setup. The
Adapter Problem Space Memory Map looks like this:
Adapter Problem Space Memory Map looks like this::
+-------------------------------+
| 512 * 64 KB User MMIO |
......@@ -375,7 +379,7 @@ CXL Flash Driver Host IOCTLs
Each host adapter instance that is supported by the cxlflash driver
has a special character device associated with it to enable a set of
host management function. These character devices are hosted in a
class dedicated for cxlflash and can be accessed via /dev/cxlflash/*.
class dedicated for cxlflash and can be accessed via `/dev/cxlflash/*`.
Applications can be written to perform various functions using the
host ioctl APIs below.
......
=====================
DAWR issues on POWER9
============================
=====================
On POWER9 the Data Address Watchpoint Register (DAWR) can cause a checkstop
if it points to cache inhibited (CI) memory. Currently Linux has no way to
disinguish CI memory when configuring the DAWR, so (for now) the DAWR is
disabled by this commit:
disabled by this commit::
commit 9654153158d3e0684a1bdb76dbababdb7111d5a0
Author: Michael Neuling <mikey@neuling.org>
......@@ -12,7 +13,7 @@ disabled by this commit:
powerpc: Disable DAWR in the base POWER9 CPU features
Technical Details:
============================
==================
DAWR has 6 different ways of being set.
1) ptrace
......@@ -37,7 +38,7 @@ DAWR on the migration.
For xmon, the 'bd' command will return an error on P9.
Consequences for users
============================
======================
For GDB watchpoints (ie 'watch' command) on POWER9 bare metal , GDB
will accept the command. Unfortunately since there is no hardware
......@@ -57,8 +58,8 @@ trapped in GDB. The watchpoint is remembered, so if the guest is
migrated back to the POWER8 host, it will start working again.
Force enabling the DAWR
=============================
Kernels (since ~v5.2) have an option to force enable the DAWR via:
=======================
Kernels (since ~v5.2) have an option to force enable the DAWR via::
echo Y > /sys/kernel/debug/powerpc/dawr_enable_dangerous
......@@ -86,5 +87,7 @@ dawr_enable_dangerous file will fail if the hypervisor doesn't support
writing the DAWR.
To double check the DAWR is working, run this kernel selftest:
tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
Any errors/failures/skips mean something is wrong.
DSCR (Data Stream Control Register)
================================================
===================================
DSCR (Data Stream Control Register)
===================================
DSCR register in powerpc allows user to have some control of prefetch of data
stream in the processor. Please refer to the ISA documents or related manual
......@@ -10,14 +11,17 @@ user interface.
(A) Data Structures:
(1) thread_struct:
(1) thread_struct::
dscr /* Thread DSCR value */
dscr_inherit /* Thread has changed default DSCR */
(2) PACA:
(2) PACA::
dscr_default /* per-CPU DSCR default value */
(3) sysfs.c:
(3) sysfs.c::
dscr_default /* System DSCR default value */
(B) Scheduler Changes:
......@@ -35,8 +39,8 @@ user interface.
(C) SYSFS Interface:
Global DSCR default: /sys/devices/system/cpu/dscr_default
CPU specific DSCR default: /sys/devices/system/cpu/cpuN/dscr
- Global DSCR default: /sys/devices/system/cpu/dscr_default
- CPU specific DSCR default: /sys/devices/system/cpu/cpuN/dscr
Changing the global DSCR default in the sysfs will change all the CPU
specific DSCR defaults immediately in their PACA structures. Again if
......
==========================
PCI Bus EEH Error Recovery
==========================
Linas Vepstas <linas@austin.ibm.com>
PCI Bus EEH Error Recovery
--------------------------
Linas Vepstas
<linas@austin.ibm.com>
12 January 2005
12 January 2005
Overview:
......@@ -153,7 +153,7 @@ beforehand.
Next, it uses the Linux kernel notifier chain/work queue mechanism to
allow any interested parties to find out about the failure. Device
drivers, or other parts of the kernel, can use
eeh_register_notifier(struct notifier_block *) to find out about EEH
`eeh_register_notifier(struct notifier_block *)` to find out about EEH
events. The event will include a pointer to the pci device, the
device node and some state info. Receivers of the event can "do as
they wish"; the default handler will be described further in this
......@@ -162,10 +162,13 @@ section.
To assist in the recovery of the device, eeh.c exports the
following functions:
rtas_set_slot_reset() -- assert the PCI #RST line for 1/8th of a second
rtas_configure_bridge() -- ask firmware to configure any PCI bridges
rtas_set_slot_reset()
assert the PCI #RST line for 1/8th of a second
rtas_configure_bridge()
ask firmware to configure any PCI bridges
located topologically under the pci slot.
eeh_save_bars() and eeh_restore_bars(): save and restore the PCI
eeh_save_bars() and eeh_restore_bars():
save and restore the PCI
config-space info for a device and any devices under it.
......@@ -191,7 +194,7 @@ events get delivered to user-space scripts.
Following is an example sequence of events that cause a device driver
close function to be called during the first phase of an EEH reset.
The following sequence is an example of the pcnet32 device driver.
The following sequence is an example of the pcnet32 device driver::
rpa_php_unconfig_pci_adapter (struct slot *) // in rpaphp_pci.c
{
......@@ -241,19 +244,20 @@ The following sequence is an example of the pcnet32 device driver.
}}}}}}
in drivers/pci/pci_driver.c,
struct device_driver->remove() is just pci_device_remove()
which calls struct pci_driver->remove() which is pcnet32_remove_one()
which calls unregister_netdev() (in net/core/dev.c)
which calls dev_close() (in net/core/dev.c)
which calls dev->stop() which is pcnet32_close()
which then does the appropriate shutdown.
in drivers/pci/pci_driver.c,
struct device_driver->remove() is just pci_device_remove()
which calls struct pci_driver->remove() which is pcnet32_remove_one()
which calls unregister_netdev() (in net/core/dev.c)
which calls dev_close() (in net/core/dev.c)
which calls dev->stop() which is pcnet32_close()
which then does the appropriate shutdown.
---
Following is the analogous stack trace for events sent to user-space
when the pci device is unconfigured.
when the pci device is unconfigured::
rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c
rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c
calls
pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
calls
......@@ -299,12 +303,12 @@ individual device drivers, so that the current design throws a wide net.
The biggest negative of the design is that it potentially disturbs
network daemons and file systems that didn't need to be disturbed.
-- A minor complaint is that resetting the network card causes
- A minor complaint is that resetting the network card causes
user-space back-to-back ifdown/ifup burps that potentially disturb
network daemons, that didn't need to even know that the pci
card was being rebooted.
-- A more serious concern is that the same reset, for SCSI devices,
- A more serious concern is that the same reset, for SCSI devices,
causes havoc to mounted file systems. Scripts cannot post-facto
unmount a file system without flushing pending buffers, but this
is impossible, because I/O has already been stopped. Thus,
......@@ -322,7 +326,7 @@ network daemons and file systems that didn't need to be disturbed.
from the block layer. It would be very natural to add an EEH
reset into this chain of events.
-- If a SCSI error occurs for the root device, all is lost unless
- If a SCSI error occurs for the root device, all is lost unless
the sysadmin had the foresight to run /bin, /sbin, /etc, /var
and so on, out of ramdisk/tmpfs.
......@@ -330,5 +334,3 @@ network daemons and file systems that didn't need to be disturbed.
Conclusions
-----------
There's forward progress ...
======================
Firmware-Assisted Dump
======================
Firmware-Assisted Dump
------------------------
July 2011
July 2011
The goal of firmware-assisted dump is to enable the dump of
a crashed system, and to do so from a fully-reset system, and
......@@ -27,11 +28,11 @@ in production use.
Comparing with kdump or other strategies, firmware-assisted
dump offers several strong, practical advantages:
-- Unlike kdump, the system has been reset, and loaded
- Unlike kdump, the system has been reset, and loaded
with a fresh copy of the kernel. In particular,
PCI and I/O devices have been reinitialized and are
in a clean, consistent state.
-- Once the dump is copied out, the memory that held the dump
- Once the dump is copied out, the memory that held the dump
is immediately available to the running kernel. And therefore,
unlike kdump, fadump doesn't need a 2nd reboot to get back
the system to the production configuration.
......@@ -40,17 +41,18 @@ The above can only be accomplished by coordination with,
and assistance from the Power firmware. The procedure is
as follows:
-- The first kernel registers the sections of memory with the
- The first kernel registers the sections of memory with the
Power firmware for dump preservation during OS initialization.
These registered sections of memory are reserved by the first
kernel during early boot.
-- When a system crashes, the Power firmware will save
- When a system crashes, the Power firmware will save
the low memory (boot memory of size larger of 5% of system RAM
or 256MB) of RAM to the previous registered region. It will
also save system registers, and hardware PTE's.
NOTE: The term 'boot memory' means size of the low memory chunk
NOTE:
The term 'boot memory' means size of the low memory chunk
that is required for a kernel to boot successfully when
booted with restricted memory. By default, the boot memory
size will be the larger of 5% of system RAM or 256MB.
......@@ -64,12 +66,12 @@ as follows:
as fadump uses a predefined offset to reserve memory
for boot memory dump preservation in case of a crash.
-- After the low memory (boot memory) area has been saved, the
- After the low memory (boot memory) area has been saved, the
firmware will reset PCI and other hardware state. It will
*not* clear the RAM. It will then launch the bootloader, as
normal.
-- The freshly booted kernel will notice that there is a new
- The freshly booted kernel will notice that there is a new
node (ibm,dump-kernel) in the device tree, indicating that
there is crash data available from a previous boot. During
the early boot OS will reserve rest of the memory above
......@@ -77,17 +79,18 @@ as follows:
size. This will make sure that the second kernel will not
touch any of the dump memory area.
-- User-space tools will read /proc/vmcore to obtain the contents
- User-space tools will read /proc/vmcore to obtain the contents
of memory, which holds the previous crashed kernel dump in ELF
format. The userspace tools may copy this info to disk, or
network, nas, san, iscsi, etc. as desired.
-- Once the userspace tool is done saving dump, it will echo
- Once the userspace tool is done saving dump, it will echo
'1' to /sys/kernel/fadump_release_mem to release the reserved
memory back to general use, except the memory required for
next firmware-assisted dump registration.
e.g.
e.g.::
# echo 1 > /sys/kernel/fadump_release_mem
Please note that the firmware-assisted dump feature
......@@ -95,7 +98,7 @@ is only available on Power6 and above systems with recent
firmware versions.
Implementation details:
----------------------
-----------------------
During boot, a check is made to see if firmware supports
this feature on that particular machine. If it does, then
......@@ -121,7 +124,7 @@ Allocator (CMA) for memory reservation if CMA is configured for kernel.
With CMA reservation this memory will be available for applications to
use it, while kernel is prevented from using it. With this fadump will
still be able to capture all of the kernel memory and most of the user
space memory except the user pages that were present in CMA region.
space memory except the user pages that were present in CMA region::
o Memory Reservation during first kernel
......@@ -166,7 +169,7 @@ The tools to examine the dump will be same as the ones
used for kdump.
How to enable firmware-assisted dump (fadump):
-------------------------------------
----------------------------------------------
1. Set config option CONFIG_FA_DUMP=y and build kernel.
2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
......@@ -177,7 +180,8 @@ How to enable firmware-assisted dump (fadump):
to specify size of the memory to reserve for boot memory dump
preservation.
NOTE: 1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
NOTE:
1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
use 'crashkernel=' to specify size of the memory to reserve
for boot memory dump preservation.
2. If firmware-assisted dump fails to reserve memory then it
......@@ -189,7 +193,7 @@ NOTE: 1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
old behaviour.
Sysfs/debugfs files:
------------
--------------------
Firmware-assisted dump feature uses sysfs file system to hold
the control files and debugfs file to display memory reserved region.
......@@ -197,20 +201,20 @@ the control files and debugfs file to display memory reserved region.
Here is the list of files under kernel sysfs:
/sys/kernel/fadump_enabled
This is used to display the fadump status.
0 = fadump is disabled
1 = fadump is enabled
- 0 = fadump is disabled
- 1 = fadump is enabled
This interface can be used by kdump init scripts to identify if
fadump is enabled in the kernel and act accordingly.
/sys/kernel/fadump_registered
This is used to display the fadump registration status as well
as to control (start/stop) the fadump registration.
0 = fadump is not registered.
1 = fadump is registered and ready to handle system crash.
- 0 = fadump is not registered.
- 1 = fadump is registered and ready to handle system crash.
To register fadump echo 1 > /sys/kernel/fadump_registered and
echo 0 > /sys/kernel/fadump_registered for un-register and stop the
......@@ -219,11 +223,10 @@ Here is the list of files under kernel sysfs:
easily integrated with kdump service start/stop.
/sys/kernel/fadump_release_mem
This file is available only when fadump is active during
second kernel. This is used to release the reserved memory
region that are held for saving crash dump. To release the
reserved memory echo 1 to it:
reserved memory echo 1 to it::
echo 1 > /sys/kernel/fadump_release_mem
......@@ -238,21 +241,21 @@ Here is the list of files under powerpc debugfs:
(Assuming debugfs is mounted on /sys/kernel/debug directory.)
/sys/kernel/debug/powerpc/fadump_region
This file shows the reserved memory regions if fadump is
enabled otherwise this file is empty. The output format
is:
is::
<region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
e.g.
Contents when fadump is registered during first kernel
Contents when fadump is registered during first kernel::
# cat /sys/kernel/debug/powerpc/fadump_region
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
Contents when fadump is active during second kernel
Contents when fadump is active during second kernel::
# cat /sys/kernel/debug/powerpc/fadump_region
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
......@@ -260,16 +263,17 @@ Here is the list of files under powerpc debugfs:
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
: [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
NOTE: Please refer to Documentation/filesystems/debugfs.txt on
NOTE:
Please refer to Documentation/filesystems/debugfs.txt on
how to mount the debugfs filesystem.
TODO:
-----
o Need to come up with the better approach to find out more
- Need to come up with the better approach to find out more
accurate boot memory size that is required for a kernel to
boot successfully when booted with restricted memory.
o The fadump implementation introduces a fadump crash info structure
- The fadump implementation introduces a fadump crash info structure
in the scratch area before the ELF core header. The idea of introducing
this structure is to pass some important crash info data to the second
kernel which will help second kernel to populate ELF core header with
......@@ -277,7 +281,9 @@ TODO:
design implementation does not address a possibility of introducing
additional fields (in future) to this structure without affecting
compatibility. Need to come up with the better approach to address this.
The possible approaches are:
1. Introduce version field for version tracking, bump up the version
whenever a new field is added to the structure in future. The version
field can be used to find out what fields are valid for the current
......@@ -285,8 +291,11 @@ TODO:
2. Reserve the area of predefined size (say PAGE_SIZE) for this
structure and have unused area as reserved (initialized to zero)
for future field additions.
The advantage of approach 1 over 2 is we don't need to reserve extra space.
---
Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
This document is based on the original documentation written for phyp
assisted dump by Linas Vepstas and Manish Ahuja.
.. SPDX-License-Identifier: GPL-2.0
=======
powerpc
=======
.. toctree::
:maxdepth: 1
bootwrapper
cpu_families
cpu_features
cxl
cxlflash
dawr-power9
dscr
eeh-pci-error-recovery
firmware-assisted-dump
hvcs
isa-versions
mpc52xx
pci_iov_resource_on_powernv
pmu-ebb
ptrace
qe_firmware
syscall64-abi
transactional_memory
.. only:: subproject and html
Indices
=======
* :ref:`genindex`
:orphan:
==========================
CPU to ISA Version Mapping
==========================
Mapping of some CPU versions to relevant ISA versions.
========= ====================
========= ====================================================================
CPU Architecture version
========= ====================
========= ====================================================================
Power9 Power ISA v3.0B
Power8 Power ISA v2.07
Power7 Power ISA v2.06
......@@ -24,7 +23,7 @@ PPC970 - PowerPC User Instruction Set Architecture Book I v2.01
- PowerPC Virtual Environment Architecture Book II v2.01
- PowerPC Operating Environment Architecture Book III v2.01
- Plus Altivec/VMX ~= 2.03
========= ====================
========= ====================================================================
Key Features
......@@ -60,9 +59,9 @@ Power5 No
PPC970 No
========== ====
========== ====================
========== ====================================
CPU Transactional Memory
========== ====================
========== ====================================
Power9 Yes (* see transactional_memory.txt)
Power8 Yes
Power7 No
......@@ -73,4 +72,4 @@ Power5++ No
Power5+ No
Power5 No
PPC970 No
========== ====================
========== ====================================
=============================
Linux 2.6.x on MPC52xx family
-----------------------------
=============================
For the latest info, go to http://www.246tNt.com/mpc52xx/
To compile/use :
- U-Boot:
- U-Boot::
# <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
if you wish to ).
# make lite5200_defconfig
......@@ -16,7 +18,8 @@ To compile/use :
=> tftpboot 400000 pRamdisk
=> bootm 200000 400000
- DBug:
- DBug::
# <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
if you wish to ).
# make lite5200_defconfig
......@@ -28,7 +31,8 @@ To compile/use :
DBug> dn -i zImage.initrd.lite5200
Some remarks :
Some remarks:
- The port is named mpc52xxx, and config options are PPC_MPC52xx. The MGT5100
is not supported, and I'm not sure anyone is interesting in working on it
so. I didn't took 5xxx because there's apparently a lot of 5xxx that have
......
===================================================
PCI Express I/O Virtualization Resource on Powerenv
===================================================
Wei Yang <weiyang@linux.vnet.ibm.com>
Benjamin Herrenschmidt <benh@au1.ibm.com>
Bjorn Helgaas <bhelgaas@google.com>
26 Aug 2014
This document describes the requirement from hardware for PCI MMIO resource
......@@ -10,6 +17,7 @@ Endpoints and the implementation on P8 (IODA2). The next two sections talks
about considerations on enabling SRIOV on IODA2.
1. Introduction to Partitionable Endpoints
==========================================
A Partitionable Endpoint (PE) is a way to group the various resources
associated with a device or a set of devices to provide isolation between
......@@ -35,6 +43,7 @@ is a completely separate HW entity that replicates the entire logic, so has
its own set of PEs, etc.
2. Implementation of Partitionable Endpoints on P8 (IODA2)
==========================================================
P8 supports up to 256 Partitionable Endpoints per PHB.
......@@ -149,6 +158,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
sense, but we haven't done it yet.
3. Considerations for SR-IOV on PowerKVM
========================================
* SR-IOV Background
......@@ -224,7 +234,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
IODA supports 256 PEs, so segmented windows contain 256 segments, so if
total_VFs is less than 256, we have the situation in Figure 1.0, where
segments [total_VFs, 255] of the M64 window may map to some MMIO range on
other devices:
other devices::
0 1 total_VFs - 1
+------+------+- -+------+------+
......@@ -243,7 +253,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
Figure 1.0 Direct map VF(n) BAR space
Our current solution is to allocate 256 segments even if the VF(n) BAR
space doesn't need that much, as shown in Figure 1.1:
space doesn't need that much, as shown in Figure 1.1::
0 1 total_VFs - 1 255
+------+------+- -+------+------+- -+------+------+
......@@ -269,6 +279,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
responds to segments [total_VFs, 255].
4. Implications for the Generic PCI Code
========================================
The PCIe SR-IOV spec requires that the base of the VF(n) BAR space be
aligned to the size of an individual VF BAR.
......
========================
PMU Event Based Branches
========================
......
======
Ptrace
======
GDB intends to support the following hardware debug features of BookE
processors:
......@@ -12,6 +16,7 @@ that GDB doesn't need to special-case each of them. We added the
following 3 new ptrace requests.
1. PTRACE_PPC_GETHWDEBUGINFO
============================
Query for GDB to discover the hardware debug features. The main info to
be returned here is the minimum alignment for the hardware watchpoints.
......@@ -22,9 +27,9 @@ adding special cases to GDB based on what it sees in AUXV.
Since we're at it, we added other useful info that the kernel can return to
GDB: this query will return the number of hardware breakpoints, hardware
watchpoints and whether it supports a range of addresses and a condition.
The query will fill the following structure provided by the requesting process:
The query will fill the following structure provided by the requesting process::
struct ppc_debug_info {
struct ppc_debug_info {
unit32_t version;
unit32_t num_instruction_bps;
unit32_t num_data_bps;
......@@ -32,46 +37,46 @@ struct ppc_debug_info {
unit32_t data_bp_alignment;
unit32_t sizeof_condition; /* size of the DVC register */
uint64_t features; /* bitmask of the individual flags */
};
};
features will have bits indicating whether there is support for:
features will have bits indicating whether there is support for::
#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
2. PTRACE_SETHWDEBUG
Sets a hardware breakpoint or watchpoint, according to the provided structure:
Sets a hardware breakpoint or watchpoint, according to the provided structure::
struct ppc_hw_breakpoint {
struct ppc_hw_breakpoint {
uint32_t version;
#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
#define PPC_BREAKPOINT_TRIGGER_READ 0x2
#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
#define PPC_BREAKPOINT_TRIGGER_READ 0x2
#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
uint32_t trigger_type; /* only some combinations allowed */
#define PPC_BREAKPOINT_MODE_EXACT 0x0
#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
#define PPC_BREAKPOINT_MODE_MASK 0x3
#define PPC_BREAKPOINT_MODE_EXACT 0x0
#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
#define PPC_BREAKPOINT_MODE_MASK 0x3
uint32_t addr_mode; /* address match mode */
#define PPC_BREAKPOINT_CONDITION_MODE 0x3
#define PPC_BREAKPOINT_CONDITION_NONE 0x0
#define PPC_BREAKPOINT_CONDITION_AND 0x1
#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
#define PPC_BREAKPOINT_CONDITION_OR 0x2
#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
#define PPC_BREAKPOINT_CONDITION_MODE 0x3
#define PPC_BREAKPOINT_CONDITION_NONE 0x0
#define PPC_BREAKPOINT_CONDITION_AND 0x1
#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
#define PPC_BREAKPOINT_CONDITION_OR 0x2
#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
uint32_t condition_mode; /* break/watchpoint condition flags */
uint64_t addr;
uint64_t addr2;
uint64_t condition_value;
};
};
A request specifies one event, not necessarily just one register to be set.
For instance, if the request is for a watchpoint with a condition, both the
......@@ -88,7 +93,7 @@ can't be allocated on the registers.
Some examples of using the structure to:
- set a breakpoint in the first breakpoint register
- set a breakpoint in the first breakpoint register::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
......@@ -98,7 +103,7 @@ Some examples of using the structure to:
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers on reads in the second watchpoint register
- set a watchpoint which triggers on reads in the second watchpoint register::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
......@@ -108,7 +113,7 @@ Some examples of using the structure to:
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers only with a specific value
- set a watchpoint which triggers only with a specific value::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
......@@ -118,7 +123,7 @@ Some examples of using the structure to:
p.addr2 = 0;
p.condition_value = (uint64_t) condition;
- set a ranged hardware breakpoint
- set a ranged hardware breakpoint::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
......@@ -128,7 +133,7 @@ Some examples of using the structure to:
p.addr2 = (uint64_t) end_range;
p.condition_value = 0;
- set a watchpoint in server processors (BookS)
- set a watchpoint in server processors (BookS)::
p.version = 1;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_RW;
......
Freescale QUICC Engine Firmware Uploading
-----------------------------------------
=========================================
Freescale QUICC Engine Firmware Uploading
=========================================
(c) 2007 Timur Tabi <timur at freescale.com>,
Freescale Semiconductor
Table of Contents
=================
.. Table of Contents
I - Software License for Firmware
......@@ -39,7 +39,7 @@ http://opensource.freescale.com. For other firmware files, please contact
your Freescale representative or your operating system vendor.
III - Description and Terminology
================================
=================================
In this document, the term 'microcode' refers to the sequence of 32-bit
integers that compose the actual QE microcode.
......@@ -89,7 +89,7 @@ being fixed in the RAM package utilizing they should be activated. This data
structure signals the microcode which of these virtual traps is active.
This structure contains 6 words that the application should copy to some
specific been defined. This table describes the structure.
specific been defined. This table describes the structure::
---------------------------------------------------------------
| Offset in | | Destination Offset | Size of |
......@@ -119,7 +119,7 @@ Extended Modes
This is a double word bit array (64 bits) that defines special functionality
which has an impact on the software drivers. Each bit has its own impact
and has special instructions for the s/w associated with it. This structure is
described in this table:
described in this table::
-----------------------------------------------------------------------
| Bit # | Name | Description |
......@@ -220,7 +220,8 @@ The 'model' field is a 16-bit number that matches the actual SOC. The
'major' and 'minor' fields are the major and minor revision numbers,
respectively, of the SOC.
For example, to match the 8323, revision 1.0:
For example, to match the 8323, revision 1.0::
soc.model = 8323
soc.major = 1
soc.minor = 0
......@@ -273,10 +274,10 @@ library and available to any driver that calles qe_get_firmware_info().
'reserved'.
After the last microcode is a 32-bit CRC. It can be calculated using
this algorithm:
this algorithm::
u32 crc32(const u8 *p, unsigned int len)
{
u32 crc32(const u8 *p, unsigned int len)
{
unsigned int i;
u32 crc = 0;
......@@ -286,7 +287,7 @@ u32 crc32(const u8 *p, unsigned int len)
crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
}
return crc;
}
}
VI - Sample Code for Creating Firmware Files
============================================
......
......@@ -5,11 +5,11 @@ Power Architecture 64-bit Linux system call ABI
syscall
=======
syscall calling sequence[*] matches the Power Architecture 64-bit ELF ABI
syscall calling sequence\ [1]_ matches the Power Architecture 64-bit ELF ABI
specification C function calling sequence, including register preservation
rules, with the following differences.
[*] Some syscalls (typically low-level management functions) may have
.. [1] Some syscalls (typically low-level management functions) may have
different calling sequences (e.g., rt_sigreturn).
Parameters and return value
......@@ -33,12 +33,14 @@ Register preservation rules
Register preservation rules match the ELF ABI calling sequence with the
following differences:
r0: Volatile. (System call number.)
r3: Volatile. (Parameter 1, and return value.)
r4-r8: Volatile. (Parameters 2-6.)
cr0: Volatile (cr0.SO is the return error condition)
cr1, cr5-7: Nonvolatile.
lr: Nonvolatile.
=========== ============= ========================================
r0 Volatile (System call number.)
r3 Volatile (Parameter 1, and return value.)
r4-r8 Volatile (Parameters 2-6.)
cr0 Volatile (cr0.SO is the return error condition)
cr1, cr5-7 Nonvolatile
lr Nonvolatile
=========== ============= ========================================
All floating point and vector data registers as well as control and status
registers are nonvolatile.
......@@ -90,9 +92,12 @@ The vsyscall may or may not use the caller's stack frame save areas.
Register preservation rules
---------------------------
r0: Volatile.
cr1, cr5-7: Volatile.
lr: Volatile.
=========== ========
r0 Volatile
cr1, cr5-7 Volatile
lr Volatile
=========== ========
Invocation
----------
......
============================
Transactional Memory support
============================
......@@ -17,9 +18,9 @@ instructions are presented to delimit transactions; transactions are
guaranteed to either complete atomically or roll back and undo any partial
changes.
A simple transaction looks like this:
A simple transaction looks like this::
begin_move_money:
begin_move_money:
tbegin
beq abort_handler
......@@ -34,7 +35,7 @@ begin_move_money:
b continue
abort_handler:
abort_handler:
... test for odd failures ...
/* Retry the transaction if it failed because it conflicted with
......@@ -123,7 +124,7 @@ Transaction-aware signal handlers can read the transactional register state
from the second ucontext. This will be necessary for crash handlers to
determine, for example, the address of the instruction causing the SIGSEGV.
Example signal handler:
Example signal handler::
void crash_handler(int sig, siginfo_t *si, void *uc)
{
......@@ -133,9 +134,9 @@ Example signal handler:
if (ucp_link) {
u64 msr = ucp->uc_mcontext.regs->msr;
/* May have transactional ucontext! */
#ifndef __powerpc64__
#ifndef __powerpc64__
msr |= ((u64)transactional_ucp->uc_mcontext.regs->msr) << 32;
#endif
#endif
if (MSR_TM_ACTIVE(msr)) {
/* Yes, we crashed during a transaction. Oops. */
fprintf(stderr, "Transaction to be restarted at 0x%llx, but "
......@@ -176,6 +177,7 @@ Failure cause codes used by kernel
These are defined in <asm/reg.h>, and distinguish different reasons why the
kernel aborted a transaction:
====================== ================================
TM_CAUSE_RESCHED Thread was rescheduled.
TM_CAUSE_TLBI Software TLB invalid.
TM_CAUSE_FAC_UNAV FP/VEC/VSX unavailable trap.
......@@ -184,6 +186,7 @@ kernel aborted a transaction:
TM_CAUSE_MISC Currently unused.
TM_CAUSE_ALIGNMENT Alignment fault.
TM_CAUSE_EMULATE Emulation that touched memory.
====================== ================================
These can be checked by the user program's abort handler as TEXASR[0:7]. If
bit 7 is set, it indicates that the error is consider persistent. For example
......@@ -203,7 +206,7 @@ POWER9
======
TM on POWER9 has issues with storing the complete register state. This
is described in this commit:
is described in this commit::
commit 4bb3c7a0208fc13ca70598efd109901a7cd45ae7
Author: Paul Mackerras <paulus@ozlabs.org>
......
......@@ -4468,7 +4468,7 @@ F: arch/powerpc/platforms/powernv/pci-cxl.c
F: drivers/misc/cxl/
F: include/misc/cxl*
F: include/uapi/misc/cxl.h
F: Documentation/powerpc/cxl.txt
F: Documentation/powerpc/cxl.rst
F: Documentation/ABI/testing/sysfs-class-cxl
CXLFLASH (IBM Coherent Accelerator Processor Interface CAPI Flash) SCSI DRIVER
......@@ -4479,7 +4479,7 @@ L: linux-scsi@vger.kernel.org
S: Supported
F: drivers/scsi/cxlflash/
F: include/uapi/scsi/cxlflash_ioctl.h
F: Documentation/powerpc/cxlflash.txt
F: Documentation/powerpc/cxlflash.rst
CYBERPRO FB DRIVER
M: Russell King <linux@armlinux.org.uk>
......@@ -12353,7 +12353,7 @@ F: Documentation/PCI/pci-error-recovery.rst
F: drivers/pci/pcie/aer.c
F: drivers/pci/pcie/dpc.c
F: drivers/pci/pcie/err.c
F: Documentation/powerpc/eeh-pci-error-recovery.txt
F: Documentation/powerpc/eeh-pci-error-recovery.rst
F: arch/powerpc/kernel/eeh*.c
F: arch/powerpc/platforms/*/eeh*.c
F: arch/powerpc/include/*/eeh*.h
......
......@@ -1531,7 +1531,7 @@ EXC_COMMON(trap_0b_common, 0xb00, unknown_exception)
*
* Call convention:
*
* syscall register convention is in Documentation/powerpc/syscall64-abi.txt
* syscall register convention is in Documentation/powerpc/syscall64-abi.rst
*
* For hypercalls, the register convention is as follows:
* r0 volatile
......
......@@ -419,7 +419,7 @@ static void qe_upload_microcode(const void *base,
/*
* Upload a microcode to the I-RAM at a specific address.
*
* See Documentation/powerpc/qe_firmware.txt for information on QE microcode
* See Documentation/powerpc/qe_firmware.rst for information on QE microcode
* uploading.
*
* Currently, only version 1 is supported, so the 'version' field must be
......
......@@ -47,7 +47,7 @@
* using the 2.6 Linux kernel kref construct.
*
* For direction on installation and usage of this driver please reference
* Documentation/powerpc/hvcs.txt.
* Documentation/powerpc/hvcs.rst.
*/
#include <linux/device.h>
......
......@@ -259,7 +259,7 @@ static inline int qe_alive_during_sleep(void)
/* Structure that defines QE firmware binary files.
*
* See Documentation/powerpc/qe_firmware.txt for a description of these
* See Documentation/powerpc/qe_firmware.rst for a description of these
* fields.
*/
struct qe_firmware {
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment