    module: add debug stats to help identify memory pressure · df3e764d
    Luis Chamberlain authored
    Loading modules with finit_module() can end up using vmalloc(), vmap()
    and vmalloc() again, for a total of up to 3 separate allocations in the
    worst case for a single module. We always kernel_read*() the module,
    and that's a vmalloc(). Then vmap() is used for module decompression;
    if the module is compressed, the original read buffer is freed and we
    use the decompressed buffer to stuff data into our copy of the module.
    The last allocation is specific to each architecture, but it is
    generally a series of vmalloc() calls, or a variation of vmalloc(), to
    handle ELF sections with special permissions.
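
To make the worst case above concrete, here is a toy userspace model of the allocation sequence. It is only a sketch: the function name, labels and byte sizes are hypothetical, and it does not reflect the kernel's actual accounting.

```python
def load_allocations(file_bytes, layout_bytes, decompressed_bytes=None):
    """List the allocations one finit_module() attempt may make, in order."""
    # The module file is always read into a vmalloc()'d buffer first.
    allocations = [("kernel_read*() buffer (vmalloc)", file_bytes)]
    if decompressed_bytes is not None:
        # Compressed module: a second buffer holds the decompressed image.
        allocations.append(("decompressed image (vmap)", decompressed_bytes))
    # Final, architecture-specific copy of the module.
    allocations.append(("module copy (vmalloc or variant)", layout_bytes))
    return allocations


# Hypothetical sizes in bytes, chosen only for illustration.
worst_case = load_allocations(100_000, 400_000, decompressed_bytes=300_000)
for label, size in worst_case:
    print(f"{label}: {size} bytes")
print("allocations:", len(worst_case),
      "total bytes:", sum(size for _, size in worst_case))
```

An uncompressed module skips the middle step, so the same call without decompressed_bytes yields two allocations instead of three.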
    
    Evaluation with new stress-ng module support [0] with just 100 ops [1]
    shows that you can easily end up using GiBs of memory, even with all
    the care we take in the kernel and userspace today to avoid loading
    modules which are already loaded. 100 ops seems to resemble the sort
    of pressure a system with about 400 CPUs can create on module loading.
    Although duplicate module requests caused by each CPU incurring a new
    module request are silly, and some of these are being fixed, we
    currently lack proper tooling to easily diagnose what happened, when
    it happened and who is likely to blame -- userspace or kernel module
    autoloading.
    
    Provide an initial set of stats which use debugfs to let us easily scrape
    post-boot information about failed loads. This sort of information can
    be used on production workloads to help avoid redundant memory
    pressure from finit_module().
    
    There are a few examples that can be provided:
    
    A 255 vCPU system without the next patch in this series applied:
    
    Startup finished in 19.143s (kernel) + 7.078s (userspace) = 26.221s
    graphical.target reached after 6.988s in userspace
    
    And 13.58 GiB of virtual memory space lost due to failed module loading:
    
    root@big ~ # cat /sys/kernel/debug/modules/stats
             Mods ever loaded       67
         Mods failed on kread       0
    Mods failed on decompress       0
      Mods failed on becoming       0
          Mods failed on load       1411
            Total module size       11464704
          Total mod text size       4194304
           Failed kread bytes       0
      Failed decompress bytes       0
        Failed becoming bytes       0
            Failed kmod bytes       14588526272
     Virtual mem wasted bytes       14588526272
             Average mod size       171115
        Average mod text size       62602
      Average fail load bytes       10339140
    Duplicate failed modules:
                  module-name        How-many-times                    Reason
                    kvm_intel                   249                      Load
                          kvm                   249                      Load
                    irqbypass                     8                      Load
             crct10dif_pclmul                   128                      Load
          ghash_clmulni_intel                    27                      Load
                 sha512_ssse3                    50                      Load
               sha512_generic                   200                      Load
                  aesni_intel                   249                      Load
                  crypto_simd                    41                      Load
                       cryptd                   131                      Load
                        evdev                     2                      Load
                    serio_raw                     1                      Load
                   virtio_pci                     3                      Load
                         nvme                     3                      Load
                    nvme_core                     3                      Load
        virtio_pci_legacy_dev                     3                      Load
        virtio_pci_modern_dev                     3                      Load
                       t10_pi                     3                      Load
                       virtio                     3                      Load
                 crc32_pclmul                     6                      Load
               crc64_rocksoft                     3                      Load
                 crc32c_intel                    40                      Load
                  virtio_ring                     3                      Load
                        crc64                     3                      Load
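
Output like the above can be scraped with a few lines of userspace code. The following is a minimal sketch, assuming the two-part layout shown here (counter lines, then the duplicate table); parse_stats is a hypothetical helper and SAMPLE is an excerpt of the output above. On a live system the text would come from /sys/kernel/debug/modules/stats, which needs debugfs mounted and root access.

```python
SAMPLE = """\
         Mods ever loaded       67
      Mods failed on load       1411
        Failed kmod bytes       14588526272
 Virtual mem wasted bytes       14588526272
Duplicate failed modules:
              module-name        How-many-times                    Reason
                kvm_intel                   249                      Load
                      kvm                   249                      Load
                irqbypass                     8                      Load
"""


def parse_stats(text):
    """Split the stats text into a counter dict and a list of duplicates."""
    counters, duplicates = {}, []
    in_duplicates = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if line.strip() == "Duplicate failed modules:":
            in_duplicates = True
            continue
        if in_duplicates:
            # Skip the column header, then read name, count and reason.
            if fields[0] == "module-name":
                continue
            if len(fields) >= 3 and fields[1].isdigit():
                reason = " ".join(fields[2:])
                duplicates.append((fields[0], int(fields[1]), reason))
        elif fields[-1].isdigit():
            # Counter line: the label is everything before the last field.
            counters[" ".join(fields[:-1])] = int(fields[-1])
    return counters, duplicates


counters, duplicates = parse_stats(SAMPLE)
wasted_gib = counters["Virtual mem wasted bytes"] / 2**30
print(f"{wasted_gib:.2f} GiB of virtual memory wasted")
for name, count, reason in sorted(duplicates, key=lambda d: -d[1]):
    print(f"{name}: {count} duplicate requests ({reason})")
```

The wasted total is about 13.587 GiB, which rounds to 13.59 and truncates to the 13.58 GiB quoted above.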
    
    The following output, from a simple 8 vCPU, 8 GiB KVM guest with the
    next patch in this series applied, shows that 226.53 MiB are wasted in
    virtual memory allocations due to duplicate module requests during
    boot. It also shows an average module memory size of 167.10 KiB and an
    average module .text + .init.text size of 61.13 KiB. The end shows all
    modules which were detected as duplicate requests and whether they
    failed early, after just the first kernel_read*() call, or late, after
    we've already allocated the private space for the module in
    layout_and_allocate(). A system with module decompression would reveal
    more wasted virtual memory space.
    
    We should put effort now into identifying the source of these duplicate
    module requests and trimming these down as much as possible. Larger
    systems will obviously show much more wasted virtual memory allocations.
    
    root@kmod ~ # cat /sys/kernel/debug/modules/stats
             Mods ever loaded       67
         Mods failed on kread       0
    Mods failed on decompress       0
      Mods failed on becoming       83
          Mods failed on load       16
            Total module size       11464704
          Total mod text size       4194304
           Failed kread bytes       0
      Failed decompress bytes       0
        Failed becoming bytes       228959096
            Failed kmod bytes       8578080
     Virtual mem wasted bytes       237537176
             Average mod size       171115
        Average mod text size       62602
      Avg fail becoming bytes       2758544
      Average fail load bytes       536130
    Duplicate failed modules:
                  module-name        How-many-times                    Reason
                    kvm_intel                     7                  Becoming
                          kvm                     7                  Becoming
                    irqbypass                     6           Becoming & Load
             crct10dif_pclmul                     7           Becoming & Load
          ghash_clmulni_intel                     7           Becoming & Load
                 sha512_ssse3                     6           Becoming & Load
               sha512_generic                     7           Becoming & Load
                  aesni_intel                     7                  Becoming
                  crypto_simd                     7           Becoming & Load
                       cryptd                     3           Becoming & Load
                        evdev                     1                  Becoming
                    serio_raw                     1                  Becoming
                         nvme                     3                  Becoming
                    nvme_core                     3                  Becoming
                       t10_pi                     3                  Becoming
                   virtio_pci                     3                  Becoming
                 crc32_pclmul                     6           Becoming & Load
               crc64_rocksoft                     3                  Becoming
                 crc32c_intel                     3                  Becoming
        virtio_pci_modern_dev                     2                  Becoming
        virtio_pci_legacy_dev                     1                  Becoming
                        crc64                     2                  Becoming
                       virtio                     2                  Becoming
                  virtio_ring                     2                  Becoming
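
The figures quoted for the 8 vCPU guest can be re-derived from the raw counters. The sketch below copies the values from the output above. Treating the wasted total as the sum of the four "Failed ... bytes" counters, and the averages as rounded-up integer divisions, are inferences from the numbers themselves, not statements about the kernel code.

```python
KIB, MIB = 1024, 1024**2

# Values copied from the 8 vCPU guest output above.
stats = {
    "Mods ever loaded": 67,
    "Mods failed on becoming": 83,
    "Mods failed on load": 16,
    "Total module size": 11464704,
    "Total mod text size": 4194304,
    "Failed kread bytes": 0,
    "Failed decompress bytes": 0,
    "Failed becoming bytes": 228959096,
    "Failed kmod bytes": 8578080,
    "Virtual mem wasted bytes": 237537176,
}


def ceil_div(total, count):
    """Integer division rounded up (assumed rounding, see lead-in)."""
    return -(-total // count)


# Inferred relation: the wasted total is the sum of the failure counters.
wasted = sum(stats[key] for key in (
    "Failed kread bytes", "Failed decompress bytes",
    "Failed becoming bytes", "Failed kmod bytes"))

avg_size = ceil_div(stats["Total module size"], stats["Mods ever loaded"])
avg_text = ceil_div(stats["Total mod text size"], stats["Mods ever loaded"])
avg_becoming = ceil_div(stats["Failed becoming bytes"],
                        stats["Mods failed on becoming"])
avg_load = ceil_div(stats["Failed kmod bytes"], stats["Mods failed on load"])

print(f"wasted: {wasted} bytes = {wasted / MIB:.2f} MiB")
print(f"average module size: {avg_size} bytes = {avg_size / KIB:.2f} KiB")
print(f"average text size: {avg_text} bytes = {avg_text / KIB:.2f} KiB")
print(f"average failed becoming: {avg_becoming} bytes")
print(f"average failed load: {avg_load} bytes")
```

With these inputs the computed values match the reported 237537176 wasted bytes and the averages 171115, 62602, 2758544 and 536130.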
    
    [0] https://github.com/ColinIanKing/stress-ng.git
    [1] echo 0 > /proc/sys/vm/oom_dump_tasks
        ./stress-ng --module 100 --module-name xfs
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>