• David Hildenbrand's avatar
    mm/memory_hotplug: shrink zones when offlining memory · feee6b29
    David Hildenbrand authored
    We currently try to shrink a single zone when removing memory.  We use
    the zone of the first page of the memory we are removing.  If that
    memmap was never initialized (e.g., memory was never onlined), we will
    read garbage and can trigger kernel BUGs (due to a stale pointer):
    
        BUG: unable to handle page fault for address: 000000000000353d
        #PF: supervisor write access in kernel mode
        #PF: error_code(0x0002) - not-present page
        PGD 0 P4D 0
        Oops: 0002 [#1] SMP PTI
        CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
        Workqueue: kacpi_hotplug acpi_hotplug_work_fn
        RIP: 0010:clear_zone_contiguous+0x5/0x10
        Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
        RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
        RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
        RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
        R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
        R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
        FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         __remove_pages+0x4b/0x640
         arch_remove_memory+0x63/0x8d
         try_remove_memory+0xdb/0x130
         __remove_memory+0xa/0x11
         acpi_memory_device_remove+0x70/0x100
         acpi_bus_trim+0x55/0x90
         acpi_device_hotplug+0x227/0x3a0
         acpi_hotplug_work_fn+0x1a/0x30
         process_one_work+0x221/0x550
         worker_thread+0x50/0x3b0
         kthread+0x105/0x140
         ret_from_fork+0x3a/0x50
        Modules linked in:
        CR2: 000000000000353d
    
    Instead, shrink the zones when offlining memory or when onlining failed.
    Introduce and use remove_pfn_range_from_zone(() for that.  We now
    properly shrink the zones, even if we have DIMMs whereby
    
     - Some memory blocks fall into no zone (never onlined)
    
     - Some memory blocks fall into multiple zones (offlined+re-onlined)
    
     - Multiple memory blocks that fall into different zones
    
    Drop the zone parameter (with a potential dubious value) from
    __remove_pages() and __remove_section().
    
    Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
    Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
    Signed-off-by: default avatarDavid Hildenbrand <david@redhat.com>
    Reviewed-by: default avatarOscar Salvador <osalvador@suse.de>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
    Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Logan Gunthorpe <logang@deltatee.com>
    Cc: <stable@vger.kernel.org>	[5.0+]
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    feee6b29
memory_hotplug.c 48.1 KB