• Roland Dreier's avatar
    x86, ioremap: Speed up check for RAM pages · c81c8a1e
    Roland Dreier authored
    In __ioremap_caller() (the guts of ioremap), we loop over the range of
    pfns being remapped and checks each one individually with page_is_ram().
    For large ioremaps, this can be very slow.  For example, we have a
    device with a 256 GiB PCI BAR, and ioremapping this BAR can take 20+
    seconds -- sometimes long enough to trigger the soft lockup detector!
    
    Internally, page_is_ram() calls walk_system_ram_range() on a single
    page.  Instead, we can make a single call to walk_system_ram_range()
    from __ioremap_caller(), and do our further checks only for any RAM
    pages that we find.  For the common case of MMIO, this saves an enormous
    amount of work, since the range being ioremapped doesn't intersect
    system RAM at all.
    
    With this change, ioremap on our 256 GiB BAR takes less than 1 second.
    Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
    Link: http://lkml.kernel.org/r/1399054721-1331-1-git-send-email-roland@kernel.orgSigned-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
    c81c8a1e
ioremap.c 10.8 KB