Commit 4d3467e1 authored by David Hildenbrand's avatar David Hildenbrand Committed by Linus Torvalds

mm: balloon: update comment about isolation/migration/compaction

Patch series "mm/kdump: allow to exclude pages that are logically
offline"

Right now, pages inflated as part of a balloon driver will be dumped by
dump tools like makedumpfile.  While XEN is able to check in the crash
kernel whether a certain pfn is actuall backed by memory in the
hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of
virtio-balloon, hv-balloon and VMWare balloon inflated memory will
essentially result in zero pages getting allocated by the hypervisor and
the dump getting filled with this data.

The allocation and reading of zero pages can directly be avoided if a
dumping tool could know which pages only contain stale information not
to be dumped.

Also for XEN, calling into the kernel and asking the hypervisor if a pfn
is backed can be avoided if the duming tool would skip such pages right
from the beginning.

Dumping tools have no idea whether a given page is part of a balloon
driver and shall not be dumped.  Esp.  PG_reserved cannot be used for
that purpose as all memory allocated during early boot is also
PG_reserved, see discussion at [1].  So some other way of indication is
required and a new page flag is frowned upon.

We have PG_balloon (MAPCOUNT value), which is essentially unused now.  I
suggest renaming it to something more generic (PG_offline) to mark pages
as logically offline.  This flag can than e.g.  also be used by
virtio-mem in the future to mark subsections as offline.  Or by other
code that wants to put pages logically offline (e.g.  later maybe
poisoned pages that shall no longer be used).

This series converts PG_balloon to PG_offline, allows dumping tools to
query the value to detect such pages and marks pages in the hv-balloon
and XEN balloon properly as PG_offline.  Note that virtio-balloon
already set pages to PG_balloon (and now PG_offline).

Please note that this is also helpful for a problem we were seeing under
Hyper-V: Dumping logically offline memory (pages kept fake offline while
onlining a section via online_page_callback) would under some condicions
result in a kernel panic when dumping them.

As I don't have access to neither XEN nor Hyper-V nor VMWare
installations, this was only tested with the virtio-balloon and pages
were properly skipped when dumping.  I'll also attach the makedumpfile
patch to this series.

[1] https://lkml.org/lkml/2018/7/20/566

This patch (of 8):

Commit b1123ea6 ("mm: balloon: use general non-lru movable page
feature") reworked balloon handling to make use of the general non-lru
movable page feature.  The big comment block in balloon_compaction.h
contains quite some outdated information.  Let's fix this.

Link: http://lkml.kernel.org/r/20181119101616.8901-2-david@redhat.comSigned-off-by: default avatarDavid Hildenbrand <david@redhat.com>
Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Pankaj gupta <pagupta@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent a9cd410a
...@@ -4,15 +4,18 @@ ...@@ -4,15 +4,18 @@
* *
* Common interface definitions for making balloon pages movable by compaction. * Common interface definitions for making balloon pages movable by compaction.
* *
* Despite being perfectly possible to perform ballooned pages migration, they * Balloon page migration makes use of the general non-lru movable page
* make a special corner case to compaction scans because balloon pages are not * feature.
* enlisted at any LRU list like the other pages we do compact / migrate. *
* page->private is used to reference the responsible balloon device.
* page->mapping is used in context of non-lru page migration to reference
* the address space operations for page isolation/migration/compaction.
* *
* As the page isolation scanning step a compaction thread does is a lockless * As the page isolation scanning step a compaction thread does is a lockless
* procedure (from a page standpoint), it might bring some racy situations while * procedure (from a page standpoint), it might bring some racy situations while
* performing balloon page compaction. In order to sort out these racy scenarios * performing balloon page compaction. In order to sort out these racy scenarios
* and safely perform balloon's page compaction and migration we must, always, * and safely perform balloon's page compaction and migration we must, always,
* ensure following these three simple rules: * ensure following these simple rules:
* *
* i. when updating a balloon's page ->mapping element, strictly do it under * i. when updating a balloon's page ->mapping element, strictly do it under
* the following lock order, independently of the far superior * the following lock order, independently of the far superior
...@@ -21,19 +24,8 @@ ...@@ -21,19 +24,8 @@
* +--spin_lock_irq(&b_dev_info->pages_lock); * +--spin_lock_irq(&b_dev_info->pages_lock);
* ... page->mapping updates here ... * ... page->mapping updates here ...
* *
* ii. before isolating or dequeueing a balloon page from the balloon device * ii. isolation or dequeueing procedure must remove the page from balloon
* pages list, the page reference counter must be raised by one and the * device page list under b_dev_info->pages_lock.
* extra refcount must be dropped when the page is enqueued back into
* the balloon device page list, thus a balloon page keeps its reference
* counter raised only while it is under our special handling;
*
* iii. after the lockless scan step have selected a potential balloon page for
* isolation, re-test the PageBalloon mark and the PagePrivate flag
* under the proper page lock, to ensure isolating a valid balloon page
* (not yet isolated, nor under release procedure)
*
* iv. isolation or dequeueing procedure must clear PagePrivate flag under
* page lock together with removing page from balloon device page list.
* *
* The functions provided by this interface are placed to help on coping with * The functions provided by this interface are placed to help on coping with
* the aforementioned balloon page corner case, as well as to ensure the simple * the aforementioned balloon page corner case, as well as to ensure the simple
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment