Commit b53ba588, authored by Mike Rapoport, committed by Jonathan Corbet

docs/vm: hwpoison.txt: convert to ReST format

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent 88ececc2
.. hwpoison:
========
hwpoison
========
What is hwpoison?
=================

Upcoming Intel CPUs have support for recovering from some memory errors
(``MCA recovery``). This requires the OS to declare a page "poisoned",
kill the processes associated with it and avoid using it in the future.

This patchkit implements the necessary infrastructure in the VM.
@@ -46,9 +53,10 @@ address. This in theory allows other applications to handle
memory failures too. The expectation is that nearly all applications
won't do that, but some very specialized ones might.

Failure recovery modes
======================

There are two (actually three) modes memory failure recovery can be in:

vm.memory_failure_recovery sysctl set to zero:
        All memory failures cause a panic. Do not attempt recovery.
@@ -67,9 +75,8 @@ late kill
        This is best for memory error unaware applications and is the default.
        Note some pages are always handled as late kill.

User control
============

vm.memory_failure_recovery
        See sysctl.txt
@@ -79,11 +86,19 @@ vm.memory_failure_early_kill
PR_MCE_KILL
        Set early/late kill mode/revert to system default

        arg1: PR_MCE_KILL_CLEAR:
                Revert to system default
        arg1: PR_MCE_KILL_SET:
                arg2 defines thread specific mode

                PR_MCE_KILL_EARLY:
                        Early kill
                PR_MCE_KILL_LATE:
                        Late kill
                PR_MCE_KILL_DEFAULT
                        Use system global default

        Note that if you want to have a dedicated thread which handles
        the SIGBUS(BUS_MCEERR_AO) on behalf of the process, you should
        call prctl(PR_MCE_KILL_EARLY) on the designated thread. Otherwise,
@@ -92,77 +107,64 @@ PR_MCE_KILL
PR_MCE_KILL_GET
        return current mode

Testing
=======

* madvise(MADV_HWPOISON, ....) (as root) - Poison a page in the
  process for testing

* hwpoison-inject module through debugfs ``/sys/kernel/debug/hwpoison/``

  corrupt-pfn
        Inject hwpoison fault at PFN echoed into this file. This does
        some early filtering to avoid corrupting unintended pages in
        test suites.

  unpoison-pfn
        Software-unpoison page at PFN echoed into this file. This way
        a page can be reused again. This only works for Linux
        injected failures, not for real memory failures.

  Note these injection interfaces are not stable and might change between
  kernel versions.

  corrupt-filter-dev-major, corrupt-filter-dev-minor
        Only handle memory failures to pages associated with the file
        system defined by block device major/minor. -1U is the
        wildcard value. This should be only used for testing with
        artificial injection.

  corrupt-filter-memcg
        Limit injection to pages owned by memgroup. Specified by inode
        number of the memcg.

        Example::

                mkdir /sys/fs/cgroup/mem/hwpoison

                usemem -m 100 -s 1000 &
                echo `jobs -p` > /sys/fs/cgroup/mem/hwpoison/tasks

                memcg_ino=$(ls -id /sys/fs/cgroup/mem/hwpoison | cut -f1 -d' ')
                echo $memcg_ino > /sys/kernel/debug/hwpoison/corrupt-filter-memcg

                page-types -p `pidof init`   --hwpoison  # shall do nothing
                page-types -p `pidof usemem` --hwpoison  # poison its pages

  corrupt-filter-flags-mask, corrupt-filter-flags-value
        When specified, only poison pages if ((page_flags & mask) ==
        value). This allows stress testing of many kinds of
        pages. The page_flags are the same as in /proc/kpageflags. The
        flag bits are defined in include/linux/kernel-page-flags.h and
        documented in Documentation/vm/pagemap.txt

* Architecture specific MCE injector

  x86 has mce-inject, mce-test

  Some portable hwpoison test programs in mce-test, see below.
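
The corrupt-filter-flags-mask / corrupt-filter-flags-value check described
above can be sketched in shell arithmetic. This is only an illustration of
the ((page_flags & mask) == value) test, not the kernel's code. The helper
name ``flags_match`` is hypothetical. The bit positions for LRU (5) and
DIRTY (4) are taken from the /proc/kpageflags description in the pagemap
documentation and should be checked against the running kernel's headers.

```shell
# Bit positions as listed for /proc/kpageflags (assumed, see lead-in).
KPF_DIRTY=4
KPF_LRU=5

# Succeed if a page with flags $1 passes the filter with mask $2 and
# value $3, i.e. ((page_flags & mask) == value).
flags_match() {
    [ $(( ($1 & $2) == $3 )) -eq 1 ]
}

# Filter for pages that are on the LRU and not dirty:
# look at both bits, require only the LRU bit to be set.
mask=$(( (1 << KPF_LRU) | (1 << KPF_DIRTY) ))
value=$(( 1 << KPF_LRU ))
printf 'mask=0x%x value=0x%x\n' "$mask" "$value"
```

The two numbers printed here are what one would echo into
corrupt-filter-flags-mask and corrupt-filter-flags-value before injecting.
A page whose flags are 0x20 (LRU only) passes ``flags_match``, while 0x30
(LRU and dirty) does not.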

References
==========

http://halobates.de/mce-lc09-2.pdf
        Overview presentation from LinuxCon 09
@@ -174,14 +176,11 @@ git://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
        x86 specific injector

Limitations
===========
- Not all page types are supported and never will be. Most kernel internal
  objects cannot be recovered, only LRU pages for now.
- Right now hugepage support is missing.

---
Andi Kleen, Oct 2009