• Uladzislau Rezki (Sony)'s avatar
    mm/vmalloc: do not adjust the search size for alignment overhead · 9f531973
    Uladzislau Rezki (Sony) authored
    We used to include an alignment overhead into a search length, in that
    case we guarantee that a found area will definitely fit after applying a
    specific alignment that user specifies.  From the other hand we do not
    guarantee that an area has the lowest address if an alignment is >=
    PAGE_SIZE.
    
    It means that, when a user specifies a special alignment together with a
    range that corresponds to an exact requested size then an allocation
    will fail.  This is what happens to KASAN, it wants the free block that
    exactly matches a specified range during onlining memory banks:
    
        [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory82/state
        [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory83/state
        [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory85/state
        [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory84/state
        vmap allocation for size 16777216 failed: use vmalloc=<size> to increase size
        bash: vmalloc: allocation failure: 16777216 bytes, mode:0x6000c0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
        CPU: 4 PID: 1644 Comm: bash Kdump: loaded Not tainted 4.18.0-339.el8.x86_64+debug #1
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
        Call Trace:
         dump_stack+0x8e/0xd0
         warn_alloc.cold.90+0x8a/0x1b2
         ? zone_watermark_ok_safe+0x300/0x300
         ? slab_free_freelist_hook+0x85/0x1a0
         ? __get_vm_area_node+0x240/0x2c0
         ? kfree+0xdd/0x570
         ? kmem_cache_alloc_node_trace+0x157/0x230
         ? notifier_call_chain+0x90/0x160
         __vmalloc_node_range+0x465/0x840
         ? mark_held_locks+0xb7/0x120
    
    Fix it by making sure that find_vmap_lowest_match() returns lowest start
    address with any given alignment value, i.e.  for alignments bigger then
    PAGE_SIZE the algorithm rolls back toward parent nodes checking right
    sub-trees if the most left free block did not fit due to alignment
    overhead.
    
    Link: https://lkml.kernel.org/r/20211004142829.22222-1-urezki@gmail.com
    Fixes: 68ad4a33 ("mm/vmalloc.c: keep track of free blocks for vmap allocation")
    Signed-off-by: default avatarUladzislau Rezki (Sony) <urezki@gmail.com>
    Reported-by: default avatarPing Fang <pifang@redhat.com>
    Tested-by: default avatarDavid Hildenbrand <david@redhat.com>
    Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    9f531973
vmalloc.c 101 KB