Commit 2fcbc413 authored by Mike Rapoport, committed by Jonathan Corbet

docs/vm: ksm.txt: convert to ReST format

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent e3f2025a
.. _ksm:

=======================
Kernel Samepage Merging
=======================

KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y,
added to the Linux kernel in 2.6.32. See ``mm/ksm.c`` for its implementation,
and http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/

The KSM daemon ksmd periodically scans those areas of user memory which
@@ -51,110 +54,112 @@

Applications should be considerate in their use of MADV_MERGEABLE,
restricting its use to areas likely to benefit. KSM's scans may use a lot
of processing power: some installations will disable KSM for that reason.

The KSM daemon is controlled by sysfs files in ``/sys/kernel/mm/ksm/``,
readable by all but writable only by root:
pages_to_scan
        how many present pages to scan before ksmd goes to sleep,
        e.g. ``echo 100 > /sys/kernel/mm/ksm/pages_to_scan``.
        Default: 100 (chosen for demonstration purposes)

sleep_millisecs
        how many milliseconds ksmd should sleep before the next scan,
        e.g. ``echo 20 > /sys/kernel/mm/ksm/sleep_millisecs``.
        Default: 20 (chosen for demonstration purposes)

merge_across_nodes
        specifies if pages from different NUMA nodes can be merged.
        When set to 0, ksm merges only pages which physically reside
        in the memory area of the same NUMA node. That brings lower
        latency to access of shared pages. Systems with more nodes, at
        significant NUMA distances, are likely to benefit from the
        lower latency of setting 0. Smaller systems, which need to
        minimize memory usage, are likely to benefit from the greater
        sharing of setting 1 (default). You may wish to compare how
        your system performs under each setting, before deciding on
        which to use. The merge_across_nodes setting can be changed
        only when there are no KSM shared pages in the system: set run
        to 2 to unmerge pages first, then to 1 after changing
        merge_across_nodes, to remerge according to the new setting
        (see the sketch after this list).
        Default: 1 (merging across nodes as in earlier releases)

run
        set 0 to stop ksmd from running but keep merged pages,
        set 1 to run ksmd, e.g. ``echo 1 > /sys/kernel/mm/ksm/run``,
        set 2 to stop ksmd and unmerge all pages currently merged, but
        leave mergeable areas registered for the next run.
        Default: 0 (must be changed to 1 to activate KSM, except if
        CONFIG_SYSFS is disabled)

use_zero_pages
        specifies whether empty pages (i.e. allocated pages that only
        contain zeroes) should be treated specially. When set to 1,
        empty pages are merged with the kernel zero page(s) instead of
        with each other as would happen normally. This can improve
        the performance on architectures with coloured zero pages,
        depending on the workload. Care should be taken when enabling
        this setting, as it can potentially degrade the performance of
        KSM for some workloads, for example if the checksums of pages
        that are candidates for merging match the checksum of an empty
        page. This setting can be changed at any time; it is only
        effective for pages merged after the change.
        Default: 0 (normal KSM behaviour as in earlier releases)

max_page_sharing
        Maximum sharing allowed for each KSM page. This enforces a
        deduplication limit to keep the virtual memory rmap lists from
        growing too large. The minimum value is 2, as a newly created
        KSM page will have at least two sharers. The rmap walk has
        O(N) complexity, where N is the number of rmap_items
        (i.e. virtual mappings) that are sharing the page, which is in
        turn capped by max_page_sharing. So this effectively spreads
        the linear O(N) computational complexity from the rmap walk
        context over different KSM pages. The ksmd walk over the
        stable_node "chains" is also O(N), but N is the number of
        stable_node "dups", not the number of rmap_items, so it has no
        significant impact on ksmd performance. In practice the best
        stable_node "dup" candidate will be kept and found at the head
        of the "dups" list. The higher this value, the faster KSM will
        merge the memory (because there will be fewer stable_node dups
        queued into the stable_node chain->hlist to check for pruning)
        and the higher the deduplication factor will be, but the
        slower the worst case rmap walk could be for any given KSM
        page. Slowing down the rmap_walk means there will be higher
        latency for certain virtual memory operations happening during
        swapping, compaction, NUMA balancing and page migration, in
        turn decreasing responsiveness for the callers of those
        virtual memory operations. The scheduler latency of other
        tasks not involved with the VM operations doing the rmap walk
        is not affected by this parameter, as the rmap walks are
        always schedule friendly themselves.

stable_node_chains_prune_millisecs
        How frequently to walk the whole list of stable_node "dups"
        linked in the stable_node "chains" in order to prune stale
        stable_nodes. Smaller millisecs values will free up the KSM
        metadata with lower latency, but they will make ksmd use more
        CPU during the scan. This only applies to the stable_node
        chains, so it's a noop if not a single KSM page has hit
        max_page_sharing yet (there would be no stable_node chains in
        that case).
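A minimal sketch of how these controls combine in practice (an
illustrative addition, not part of the original document; it assumes a
root shell and the default sysfs mount point)::

        # activate ksmd (see the "run" entry above)
        echo 1 > /sys/kernel/mm/ksm/run

        # merge_across_nodes can only be changed while there are no
        # KSM shared pages: stop ksmd and unmerge everything first,
        echo 2 > /sys/kernel/mm/ksm/run
        # then select per-node merging,
        echo 0 > /sys/kernel/mm/ksm/merge_across_nodes
        # and restart ksmd so pages are remerged under the new setting.
        echo 1 > /sys/kernel/mm/ksm/run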
The effectiveness of KSM and MADV_MERGEABLE is shown in ``/sys/kernel/mm/ksm/``:

pages_shared
        how many shared pages are being used
pages_sharing
        how many more sites are sharing them i.e. how much saved
pages_unshared
        how many pages unique but repeatedly checked for merging
pages_volatile
        how many pages changing too fast to be placed in a tree
full_scans
        how many times all mergeable areas have been scanned
stable_node_chains
        number of stable node chains allocated; this is effectively
        the number of KSM pages that hit the max_page_sharing limit
stable_node_dups
        number of stable node dups queued into the stable_node chains
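Each of these counters is a single-value sysfs file, so (as an
illustrative aside, assuming a shell with read access to sysfs) all of
them can be printed at once with::

        grep . /sys/kernel/mm/ksm/*

This also prints the control files described earlier, since they live
in the same directory.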
A high ratio of pages_sharing to pages_shared indicates good sharing, but
a high ratio of pages_unshared to pages_sharing indicates wasted effort.
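An illustrative sketch of reading those two ratios (not part of the
original document; it assumes ``cat`` and ``awk`` are available)::

        cd /sys/kernel/mm/ksm
        awk -v shared="$(cat pages_shared)" \
            -v sharing="$(cat pages_sharing)" \
            -v unshared="$(cat pages_unshared)" '
        BEGIN {
                # guard against division by zero when KSM is idle
                if (shared)  printf "pages_sharing / pages_shared:   %.2f\n", sharing / shared
                if (sharing) printf "pages_unshared / pages_sharing: %.2f\n", unshared / sharing
        }'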