1. 05 Sep, 2024 2 commits
  2. 29 Aug, 2024 35 commits
  3. 27 Aug, 2024 3 commits
    • s390/ftrace: Avoid calling unwinder in ftrace_return_address() · a84dd0d8
      Vasily Gorbik authored
      ftrace_return_address() is called extremely often from
      performance-critical code paths when debugging features like
      CONFIG_TRACE_IRQFLAGS are enabled. For example, with debug_defconfig,
      ftrace selftests on my LPAR currently execute ftrace_return_address()
      as follows:
      
      ftrace_return_address(0) - 0 times (common code uses __builtin_return_address(0) instead)
      ftrace_return_address(1) - 2,986,805,401 times (with this patch applied)
      ftrace_return_address(2) - 140 times
      ftrace_return_address(>2) - 0 times
      
      The use of __builtin_return_address(n) was replaced by return_address()
      with an unwinder call by commit cae74ba8 ("s390/ftrace:
      Use unwinder instead of __builtin_return_address()") because
      __builtin_return_address(n) simply walks the stack backchain and doesn't
      check for reaching the stack top. For shallow stacks with fewer than
      "n" frames, this results in reads at low addresses and random
      memory accesses.
      
      While calling the fully functional unwinder "works", it is very slow
      for this purpose. Moreover, potentially following stack switches and
      walking past IRQ context is simply the wrong thing to do for
      ftrace_return_address().
      
      Reimplement return_address() to essentially be __builtin_return_address(n)
      with checks for reaching the stack top. Since the ftrace_return_address(n)
      argument is always a constant, keep the implementation in the header,
      allowing both GCC and Clang to unroll the loop and optimize it to the
      bare minimum.
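
      The bounded back chain walk described above can be illustrated with a
      minimal userspace sketch. It is not the code from this patch: struct
      frame, frame_on_stack(), bounded_return_address() and
      demo_return_address() are hypothetical names, and the stack is
      simulated with a small array instead of real s390 stack frames.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical frame layout for illustration; not the s390 stack frame. */
      struct frame {
      	struct frame *back_chain;	/* caller's frame, NULL at the stack top */
      	uintptr_t return_addr;		/* return address saved in this frame */
      };

      /* True if sp points to a frame inside the stack bounds [lo, hi). */
      static inline int frame_on_stack(const struct frame *sp,
      				 const struct frame *lo,
      				 const struct frame *hi)
      {
      	return sp && sp >= lo && sp < hi;
      }

      /*
       * Follow the back chain n times, checking bounds before every dereference.
       * Returns 0 instead of reading past the stack top.
       */
      static inline uintptr_t bounded_return_address(const struct frame *sp,
      					       unsigned int n,
      					       const struct frame *lo,
      					       const struct frame *hi)
      {
      	unsigned int i;

      	for (i = 0; i < n; i++) {
      		if (!frame_on_stack(sp, lo, hi))
      			return 0;
      		sp = sp->back_chain;
      	}
      	return frame_on_stack(sp, lo, hi) ? sp->return_addr : 0;
      }

      /* Demo helper: a simulated three-frame stack with made-up addresses. */
      static uintptr_t demo_return_address(unsigned int n)
      {
      	static struct frame stack[3];

      	stack[0] = (struct frame){ &stack[1], 0x1000 };
      	stack[1] = (struct frame){ &stack[2], 0x2000 };
      	stack[2] = (struct frame){ NULL, 0x3000 };
      	assert(stack[2].back_chain == NULL);	/* chain must terminate */
      	return bounded_return_address(&stack[0], n, stack, stack + 3);
      }
      ```

      In this sketch, a request deeper than the simulated stack returns 0
      rather than dereferencing an out-of-bounds pointer. Because the helper
      is a static inline with a simple counted loop, a compiler is free to
      unroll it when n is a compile-time constant.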
      
      Fixes: cae74ba8 ("s390/ftrace: Use unwinder instead of __builtin_return_address()")
      Cc: stable@vger.kernel.org
      Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/build: Avoid relocation information in final vmlinux · 57216cc9
      Jens Remus authored
      Since commit 778666df ("s390: compile relocatable kernel without
      -fPIE") the kernel vmlinux ELF file is linked with --emit-relocs to
      preserve all relocations, so that all absolute relocations can be
      extracted using the 'relocs' tool to adjust them during boot.
      
      Port and adapt Petr Pavlu's x86 commit 9d9173e9 ("x86/build: Avoid
      relocation information in final vmlinux") to s390 to strip all
      relocations from the final vmlinux ELF file to optimize its size.
      The following is his original commit message with minor adaptations for s390:
      
      The Linux build process on s390 roughly consists of compiling all input
      files, statically linking them into a vmlinux ELF file, and then
      converting this file into an actual bootable bzImage file.
      
      In this process, vmlinux serves two main purposes:
      1) It is an intermediate build target on the way to produce the final
         bootable image.
      2) It is a file that is expected to be used by debuggers and standard
         ELF tooling to work with the built kernel.
      
      For the second purpose, a vmlinux file is typically collected by various
      package build recipes, such as distribution spec files, including the
      kernel's own tar-pkg target.
      
      When building the kernel, vmlinux also contains relocation information
      produced by using the --emit-relocs linker option. This is utilized by
      subsequent build steps to create relocs.S and produce a relocatable
      image. However, the information is not needed by debuggers and other
      standard ELF tooling.
      
      The issue is then that the collected vmlinux file and hence distribution
      packages end up unnecessarily large because of this extra data. The
      following is a size comparison of vmlinux v6.10 with and without the
      relocation information:
      
        | Configuration      | With relocs | Stripped relocs |
        | defconfig          |      696 MB |          320 MB |
        | -CONFIG_DEBUG_INFO |       48 MB |           32 MB |
      
      Optimize the resulting vmlinux by adding a postlink step that splits the
      relocation information into relocs.S and then strips it from the vmlinux
      binary.

      Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Jens Remus <jremus@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/ftrace: Use kernel ftrace trampoline for modules · d759be28
      Vasily Gorbik authored
      Now that both the kernel modules area and the kernel image itself are
      located within 4 GB, there is no longer a need to maintain a separate
      ftrace_plt trampoline. Use the existing trampoline in the kernel.

      Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>