1. 24 Jun, 2023 2 commits
  2. 22 Jun, 2023 9 commits
    • linux/export.h: rename 'sec' argument to 'license' · 8ed7e33a
      Masahiro Yamada authored
      Now, EXPORT_SYMBOL() is populated in two stages. In the first stage,
      all of EXPORT_SYMBOL/EXPORT_SYMBOL_GPL go into the same section,
      '.export_symbol'.
      
      'sec' does not make sense any more. Rename it to 'license'.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
    • modpost: show offset from symbol for section mismatch warnings · f2346278
      Masahiro Yamada authored
      Currently, modpost only shows the symbol names and section names, so it
      repeats the same message if there are multiple relocations in the same
      symbol. It is common for a relocation to span multiple instructions.
      
      It is better to show the offset from the symbol.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
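As a rough illustration of reporting a location as an offset from its symbol, the sketch below formats "symbol+0xoffset". The names `struct toy_sym` and `format_location()` are hypothetical and are not taken from modpost; the real code works on ELF symbol tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for an ELF symbol (not modpost's type). */
struct toy_sym {
	const char *name;
	unsigned long value;	/* start address of the symbol */
};

/*
 * Write "name+0xoff" for the relocation at 'addr' into 'buf'.
 * 'sym' must be the symbol that contains 'addr'.
 * Returns the number of characters snprintf() would have written.
 */
int format_location(char *buf, size_t len, const struct toy_sym *sym,
		    unsigned long addr)
{
	assert(addr >= sym->value);
	return snprintf(buf, len, "%s+0x%lx", sym->name, addr - sym->value);
}
```

With the offset included, two relocations inside the same symbol produce two distinct messages instead of one repeated line.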
    • modpost: merge two similar section mismatch warnings · 78dac1a2
      Masahiro Yamada authored
      In case of section mismatch, modpost shows slightly different messages.
      
      For extable section mismatch:
      
       "%s(%s+0x%lx): Section mismatch in reference to the %s:%s\n"
      
      For the other cases:
      
       "%s: section mismatch in reference: %s (section: %s) -> %s (section: %s)\n"
      
      They are similar. Merge them.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
    • kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion · 5e9e95cc
      Masahiro Yamada authored
      When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses
      the directory tree to determine which EXPORT_SYMBOL to trim. If an
      EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the
      second traverse, where some source files are recompiled with their
      EXPORT_SYMBOL() turned into a no-op.
      
      Linus stated negative opinions about this slowness in commits:
      
       - 5cf0fd59 ("Kbuild: disable TRIM_UNUSED_KSYMS option")
       - a555bdd0 ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some guarding")
      
      We can do this better now. The final data structures of EXPORT_SYMBOL
      are generated by the modpost stage, so modpost can selectively emit
      KSYMTAB entries that are really used by modules.
      
      Commit f73edc89 ("kbuild: unify two modpost invocations") is another
      ground-work to do this in a one-pass algorithm. With the list of modules,
      modpost sets sym->used if it is used by a module. modpost emits KSYMTAB
      only for symbols with sym->used==true.
      
      BTW, Nicolas explained why the trimming was implemented with recursion:
      
        https://lore.kernel.org/all/2o2rpn97-79nq-p7s2-nq5-8p83391473r@syhkavp.arg/
      
      Actually, we never achieved that level of optimization where the chain
      reaction of trimming comes into play because:
      
       - CONFIG_LTO_CLANG cannot remove any unused symbols
       - CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is enabled only for vmlinux,
         but not modules
      
      If deeper trimming is required, we need to revisit this, but I guess
      that is unlikely to happen.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
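The one-pass idea described above can be sketched roughly as follows: mark each exported symbol that some module references, then emit entries only for the marked ones. `struct toy_export`, `mark_used()` and `count_emitted()` are hypothetical names for illustration; modpost's real data structures and lookup (a hash table rather than a linear scan) differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified exported-symbol record. */
struct toy_export {
	const char *name;
	bool used;	/* set when some module references the symbol */
};

/* Mark every export that appears in a module's list of undefined symbols. */
void mark_used(struct toy_export *exports, size_t n_exports,
	       const char *const *undefs, size_t n_undefs)
{
	assert(exports && undefs);
	for (size_t i = 0; i < n_undefs; i++)
		for (size_t j = 0; j < n_exports; j++)
			if (strcmp(undefs[i], exports[j].name) == 0)
				exports[j].used = true;
}

/* Count the KSYMTAB entries that would be emitted: only the used ones. */
size_t count_emitted(const struct toy_export *exports, size_t n_exports)
{
	size_t n = 0;

	for (size_t i = 0; i < n_exports; i++)
		if (exports[i].used)
			n++;
	return n;
}
```

Because marking and emitting both happen in the same tool run, no source file has to be recompiled after the set of used symbols is known.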
    • modpost: use null string instead of NULL pointer for default namespace · 700c48b4
      Masahiro Yamada authored
      The default namespace is the null string, "".
      
      When set, the null string "" is converted to NULL:
      
        s->namespace = namespace[0] ? NOFAIL(strdup(namespace)) : NULL;
      
      When printed, the NULL pointer is converted back to the null string:
      
        sym->namespace ?: ""
      
      This saves the 1 byte of memory allocated for "", but hurts readability.
      
      In kernel space, we strive to save memory, but modpost is a userspace
      tool used to build the kernel. On modern systems, such a small piece of
      memory is not a big deal.
      
      Handle the namespace string as is.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
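A minimal sketch of handling the namespace string as is, assuming a hypothetical `struct toy_symbol` (not modpost's real symbol type). The empty string is stored like any other value, so readers of the field never need a NULL check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified symbol record. */
struct toy_symbol {
	char *namespace;	/* never NULL; "" means the default namespace */
};

/* Store a copy of the namespace string as is, including the empty string. */
void set_namespace(struct toy_symbol *s, const char *namespace)
{
	size_t len = strlen(namespace) + 1;

	s->namespace = malloc(len);
	assert(s->namespace);	/* stand-in for modpost's NOFAIL() */
	memcpy(s->namespace, namespace, len);
}

/* The default namespace is simply the empty string. */
bool in_default_namespace(const struct toy_symbol *s)
{
	return s->namespace[0] == '\0';
}
```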
    • modpost: squash sym_update_namespace() into sym_add_exported() · 6e7611c4
      Masahiro Yamada authored
      Pass a set of the name, license, and namespace to sym_add_exported().
      
      sym_update_namespace() is unneeded.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
    • modpost: check static EXPORT_SYMBOL* by modpost again · 6d62b1c4
      Masahiro Yamada authored
      Commit 31cb50b5 ("kbuild: check static EXPORT_SYMBOL* by script
      instead of modpost") moved the static EXPORT_SYMBOL* check from the
      modpost to a shell script because I thought it must be checked per
      compilation unit to avoid false negatives.
      
      I came up with an idea to do this in modpost, against combined ELF
      files. The relocation entries in ELF will find the correct exported
      symbol even if there exist symbols with the same name in different
      compilation units.
      
      Again, the same sample code.
      
        Makefile:
      
          obj-y += foo1.o foo2.o
      
        foo1.c:
      
          #include <linux/export.h>
          static void foo(void) {}
          EXPORT_SYMBOL(foo);
      
        foo2.c:
      
          void foo(void) {}
      
      Then, modpost can catch it correctly.
      
          MODPOST Module.symvers
        ERROR: modpost: vmlinux: local symbol 'foo' was exported
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
    • ia64,export.h: replace EXPORT_DATA_SYMBOL* with EXPORT_SYMBOL* · 7d59313f
      Masahiro Yamada authored
      With the previous refactoring, you can always use EXPORT_SYMBOL*.
      
      Replace two instances in ia64, then remove EXPORT_DATA_SYMBOL*.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
    • kbuild: generate KSYMTAB entries by modpost · ddb5cdba
      Masahiro Yamada authored
      Commit 7b453719 ("kbuild: link symbol CRCs at final link, removing
      CONFIG_MODULE_REL_CRCS") made modpost output CRCs in the same way
      whether the EXPORT_SYMBOL() is placed in *.c or *.S.
      
      For further cleanups, this commit applies a similar approach to the
      entire data structure of EXPORT_SYMBOL().
      
      The EXPORT_SYMBOL() compilation is split into two stages.
      
      When a source file is compiled, EXPORT_SYMBOL() will be converted into
      a dummy symbol in the .export_symbol section.
      
      For example,
      
          EXPORT_SYMBOL(foo);
          EXPORT_SYMBOL_NS_GPL(bar, BAR_NAMESPACE);
      
      will be encoded into the following assembly code:
      
          .section ".export_symbol","a"
          __export_symbol_foo:
                  .asciz ""                      /* license */
                  .asciz ""                      /* name space */
                  .balign 8
                  .quad foo                      /* symbol reference */
          .previous
      
          .section ".export_symbol","a"
          __export_symbol_bar:
                  .asciz "GPL"                   /* license */
                  .asciz "BAR_NAMESPACE"         /* name space */
                  .balign 8
                  .quad bar                      /* symbol reference */
          .previous
      
      They are mere markers to tell modpost the name, license, and namespace
      of the symbols. They will be dropped from the final vmlinux and modules
      because the *(.export_symbol) will go into /DISCARD/ in the linker script.
      
      Then, modpost extracts all the information about EXPORT_SYMBOL() from the
      .export_symbol section, and generates the final C code:
      
          KSYMTAB_FUNC(foo, "", "");
          KSYMTAB_FUNC(bar, "_gpl", "BAR_NAMESPACE");
      
      KSYMTAB_FUNC() (or KSYMTAB_DATA() if it is data) is expanded to struct
      kernel_symbol that will be linked to the vmlinux or a module.
      
      With this change, EXPORT_SYMBOL() works in the same way for *.c and *.S
      files, providing the following benefits.
      
      [1] Deprecate EXPORT_DATA_SYMBOL()
      
      In the old days, EXPORT_SYMBOL() was only available in C files. To export
      a symbol in *.S, EXPORT_SYMBOL() was placed in a separate *.c file.
      arch/arm/kernel/armksyms.c is one example written in the classic manner.
      
      Commit 22823ab4 ("EXPORT_SYMBOL() for asm") removed this limitation.
      Since then, EXPORT_SYMBOL() can be placed close to the symbol definition
      in *.S files. It was a nice improvement.
      
      However, as that commit mentioned, you need to use EXPORT_DATA_SYMBOL()
      for data objects on some architectures.
      
      In the new approach, modpost checks symbol's type (STT_FUNC or not),
      and outputs KSYMTAB_FUNC() or KSYMTAB_DATA() accordingly.
      
      There are only two users of EXPORT_DATA_SYMBOL:
      
        EXPORT_DATA_SYMBOL_GPL(empty_zero_page)    (arch/ia64/kernel/head.S)
        EXPORT_DATA_SYMBOL(ia64_ivt)               (arch/ia64/kernel/ivt.S)
      
      They are transformed as follows and output into .vmlinux.export.c:
      
        KSYMTAB_DATA(empty_zero_page, "_gpl", "");
        KSYMTAB_DATA(ia64_ivt, "", "");
      
      The other EXPORT_SYMBOL users in ia64 assembly are output as
      KSYMTAB_FUNC().
      
      EXPORT_DATA_SYMBOL() is now deprecated.
      
      [2] merge <linux/export.h> and <asm-generic/export.h>
      
      There are two similar header implementations:
      
        include/linux/export.h        for .c files
        include/asm-generic/export.h  for .S files
      
      Ideally, the functionality should be consistent between them, but they
      tend to diverge.
      
      Commit 8651ec01 ("module: add support for symbol namespaces.") did
      not support the namespace for *.S files.
      
      This commit shifts the essential implementation part to C, which supports
      EXPORT_SYMBOL_NS() for *.S files.
      
      <asm/export.h> and <asm-generic/export.h> will remain as a wrapper of
      <linux/export.h> for a while.
      
      They will be removed after #include <asm/export.h> directives are all
      replaced with #include <linux/export.h>.
      
      [3] Implement CONFIG_TRIM_UNUSED_KSYMS in one-pass algorithm (by a later commit)
      
      When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses
      the directory tree to determine which EXPORT_SYMBOL to trim. If an
      EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the
      second traverse, where some source files are recompiled with their
      EXPORT_SYMBOL() turned into a no-op.
      
      We can do this better now; modpost can selectively emit KSYMTAB entries
      that are really used by modules.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
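The second stage described in this commit, turning the marker information into a KSYMTAB line, might look roughly like the sketch below. `format_ksymtab()` is a hypothetical helper, not modpost's actual function; the output format and the "GPL" to "_gpl" mapping follow the examples in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one line of the generated C code.
 * 'license' is "" or "GPL", as stored in the .export_symbol marker;
 * 'is_func' tells whether the symbol's ELF type is STT_FUNC.
 * Returns the number of characters snprintf() would have written.
 */
int format_ksymtab(char *buf, size_t len, const char *name, bool is_func,
		   const char *license, const char *namespace)
{
	bool gpl_only = strcmp(license, "GPL") == 0;

	assert(gpl_only || license[0] == '\0');
	return snprintf(buf, len, "KSYMTAB_%s(%s, \"%s\", \"%s\");",
			is_func ? "FUNC" : "DATA", name,
			gpl_only ? "_gpl" : "", namespace);
}
```

Choosing between FUNC and DATA from the symbol type is what makes a separate EXPORT_DATA_SYMBOL() macro unnecessary in this scheme.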
  3. 15 Jun, 2023 1 commit
  4. 14 Jun, 2023 4 commits
    • ARC: define ASM_NL and __ALIGN(_STR) outside #ifdef __ASSEMBLY__ guard · 92e2921e
      Masahiro Yamada authored
      ASM_NL is useful not only in *.S files but also in .c files for using
      inline assembler in C code.
      
      On ARC, however, ASM_NL is evaluated inconsistently. It is expanded to
      a backquote (`) in *.S files, but a semicolon (;) in *.c files because
      arch/arc/include/asm/linkage.h defines it inside #ifdef __ASSEMBLY__,
      so the definition for C code falls back to the default value defined in
      include/linux/linkage.h.
      
      If ASM_NL is used in inline assembler in .c files, it will result in
      wrong assembly code because a semicolon is not an instruction separator,
      but the start of a comment for ARC.
      
      Move ASM_NL (also __ALIGN and __ALIGN_STR) out of the #ifdef.
      
      Fixes: 9df62f05 ("arch: use ASM_NL instead of ';' for assembler new line character in the macro")
      Fixes: 8d92e992 ("ARC: define __ALIGN_STR and __ALIGN symbols for ARC")
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • scripts/kallsyms: remove KSYM_NAME_LEN_BUFFER · 1c975da5
      Masahiro Yamada authored
      You do not need to decide the buffer size statically.
      
      Use getline() to grow the line buffer as needed.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nicolas Schier <n.schier@avm.de>
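For readers unfamiliar with getline(), here is a minimal sketch of reading lines without a statically sized buffer. `longest_line()` is a hypothetical example function and is unrelated to the actual kallsyms code.

```c
#define _POSIX_C_SOURCE 200809L	/* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Return the length of the longest line in 'in' (excluding the newline).
 * getline() reallocates 'line' as needed, so no fixed buffer size is chosen.
 */
size_t longest_line(FILE *in)
{
	char *line = NULL;
	size_t cap = 0, longest = 0;
	ssize_t len;

	assert(in);
	while ((len = getline(&line, &cap, in)) != -1) {
		if (len > 0 && line[len - 1] == '\n')
			len--;
		if ((size_t)len > longest)
			longest = (size_t)len;
	}
	free(line);	/* the caller of getline() owns the buffer */
	return longest;
}
```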
    • scripts/kallsyms: constify long_options · 92e74fb6
      Masahiro Yamada authored
      getopt_long() does not modify this.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nicolas Schier <n.schier@avm.de>
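A small sketch of a const option table passed to getopt_long(). The option names and `struct toy_opts` are made up for illustration and do not claim to match the real kallsyms option set.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option set, for illustration only. */
struct toy_opts {
	bool all_symbols;
	bool verbose;
};

/* Parse long options; returns 0 on success, -1 on an unknown option. */
int parse_opts(int argc, char *argv[], struct toy_opts *out)
{
	/* getopt_long() takes 'const struct option *', so this can be const. */
	static const struct option long_options[] = {
		{"all-symbols", no_argument, NULL, 'a'},
		{"verbose",     no_argument, NULL, 'v'},
		{NULL, 0, NULL, 0},
	};
	int opt;

	assert(out);
	out->all_symbols = out->verbose = false;
	optind = 1;	/* restart scanning so the function can be called again */
	opterr = 0;	/* keep the sketch quiet on unknown options */
	while ((opt = getopt_long(argc, argv, "", long_options, NULL)) != -1) {
		switch (opt) {
		case 'a':
			out->all_symbols = true;
			break;
		case 'v':
			out->verbose = true;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```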
    • Revert "[PATCH] uml: export symbols added by GCC hardened" · 8635e8df
      Masahiro Yamada authored
      This reverts commit cead61a6.
      
      It exported __stack_smash_handler and __guard, while they may not be
      defined by anyone.
      
      The code *declares* __stack_smash_handler and __guard. It does not
      create weak symbols. If no external library is linked, they are left
      undefined, but yet exported.
      
      If a loadable module tries to access non-existing symbols, bad things
      (a page fault, NULL pointer dereference, etc.) will happen. So, the
      current code is wrong and dangerous.
      
      If the code were written as follows, it would *define* them as weak
      symbols so modules would be able to get access to them.
      
        void (*__stack_smash_handler)(void *) __attribute__((weak));
        EXPORT_SYMBOL(__stack_smash_handler);
      
        long __guard __attribute__((weak));
        EXPORT_SYMBOL(__guard);
      
      In fact, modpost forbids exporting undefined symbols. It shows an error
      message if it detects such a mistake.
      
        ERROR: modpost: "..." [...] was exported without definition
      
      Unfortunately, it is checked only when the code is built as a module.
      The problem described above has gone unnoticed for a long time because
      arch/um/os-Linux/user_syms.c is always built-in.
      
      With a planned change in Kbuild, exporting undefined symbols will always
      result in a build error instead of a run-time error. It is a good thing,
      but we need to fix the breakage in advance.
      
      One fix is to define weak symbols as shown above. An alternative is to
      export them conditionally as follows:
      
        #ifdef CONFIG_STACKPROTECTOR
        extern void __stack_smash_handler(void *);
        EXPORT_SYMBOL(__stack_smash_handler);
      
        extern long __guard;
        EXPORT_SYMBOL(__guard);
        #endif
      
      This is what other architectures do; EXPORT_SYMBOL(__stack_chk_guard)
      is guarded by #ifdef CONFIG_STACKPROTECTOR.
      
      However, adding the #ifdef guard is not sensible because UML cannot
      enable the stack-protector in the first place! (Please note UML does
      not select HAVE_STACKPROTECTOR in Kconfig.)
      
      So, the code is already broken (and unused) in multiple ways.
      
      Just remove it.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
  5. 10 Jun, 2023 2 commits
  6. 08 Jun, 2023 2 commits
  7. 07 Jun, 2023 4 commits
  8. 06 Jun, 2023 1 commit
  9. 05 Jun, 2023 4 commits
    • kbuild: add $(CLANG_FLAGS) to KBUILD_CPPFLAGS · feb843a4
      Masahiro Yamada authored
      When preprocessing arch/*/kernel/vmlinux.lds.S, the target triple is
      not passed to $(CPP) because we add it only to KBUILD_{C,A}FLAGS.
      
      As a result, the linker script is preprocessed with predefined macros
      for the build host instead of the target.
      
      Assuming you use an x86 build machine, compare the following:
      
       $ clang -dM -E -x c /dev/null
       $ clang -dM -E -x c /dev/null -target aarch64-linux-gnu
      
      There is no actual problem presumably because our linker scripts do not
      rely on such predefined macros, but it is better to define correct ones.
      
      Move $(CLANG_FLAGS) to KBUILD_CPPFLAGS, so that all *.c, *.S, *.lds.S
      will be processed with the proper target triple.
      
      [Note]
      After the patch submission, we got an actual problem that needs this
      commit. (CBL issue 1859)
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1859
      Reported-by: Tom Rini <trini@konsulko.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
    • kbuild: Add CLANG_FLAGS to as-instr · cff6e7f5
      Nathan Chancellor authored
      A future change will move CLANG_FLAGS from KBUILD_{A,C}FLAGS to
      KBUILD_CPPFLAGS so that '--target' is available while preprocessing.
      When that occurs, the following errors appear multiple times when
      building ARCH=powerpc powernv_defconfig:
      
        ld.lld: error: vmlinux.a(arch/powerpc/kernel/head_64.o):(.text+0x12d4): relocation R_PPC64_ADDR16_HI out of range: -4611686018409717520 is not in [-2147483648, 2147483647]; references '__start___soft_mask_table'
        ld.lld: error: vmlinux.a(arch/powerpc/kernel/head_64.o):(.text+0x12e8): relocation R_PPC64_ADDR16_HI out of range: -4611686018409717392 is not in [-2147483648, 2147483647]; references '__stop___soft_mask_table'
      
      Diffing the .o.cmd files reveals that -DHAVE_AS_ATHIGH=1 is not present
      anymore, because as-instr only uses KBUILD_AFLAGS, which will no longer
      contain '--target'.
      
      Mirror Kconfig's as-instr and add CLANG_FLAGS explicitly to the
      invocation to ensure the target information is always present.
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • powerpc/vdso: Include CLANG_FLAGS explicitly in ldflags-y · a7e5eb53
      Nathan Chancellor authored
      A future change will move CLANG_FLAGS from KBUILD_{A,C}FLAGS to
      KBUILD_CPPFLAGS so that '--target' is available while preprocessing.
      When that occurs, the following error appears when building the compat
      PowerPC vDSO:
      
        clang: error: unsupported option '-mbig-endian' for target 'x86_64-pc-linux-gnu'
        make[3]: *** [.../arch/powerpc/kernel/vdso/Makefile:76: arch/powerpc/kernel/vdso/vdso32.so.dbg] Error 1
      
      Explicitly add CLANG_FLAGS to ldflags-y, so that '--target' will always
      be present.
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • mips: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation · 08f6554f
      Nathan Chancellor authored
      A future change will move CLANG_FLAGS from KBUILD_{A,C}FLAGS to
      KBUILD_CPPFLAGS so that '--target' is available while preprocessing.
      When that occurs, the following error appears when building ARCH=mips
      with clang (tip of tree error shown):
      
        clang: error: unsupported option '-mabi=' for target 'x86_64-pc-linux-gnu'
      
      Add KBUILD_CPPFLAGS in the CHECKFLAGS invocation to keep everything
      working after the move.
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
  10. 03 Jun, 2023 3 commits
    • modpost: detect section mismatch for R_ARM_REL32 · 2cb74946
      Masahiro Yamada authored
      For ARM, modpost fails to detect some types of section mismatches.
      
        [test code]
      
          .section .init.data,"aw"
          bar:
                  .long 0
      
          .section .data,"aw"
          .globl foo
          foo:
                  .long bar - .
      
      It is apparently a bad reference, but modpost does not report anything.
      
      The test code above produces the following relocations.
      
        Relocation section '.rel.data' at offset 0xe8 contains 1 entry:
         Offset     Info    Type            Sym.Value  Sym. Name
        00000000  00000403 R_ARM_REL32       00000000   .init.data
      
      Currently, R_ARM_REL32 is just skipped.
      
      Handle it like R_ARM_ABS32.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
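To read the "Info" column in the relocation dump above, the 32-bit r_info field can be split into a symbol index and a relocation type. The sketch below uses locally defined `TOY_` constants and hypothetical helper names rather than the system's <elf.h>; the numeric values 2 and 3 are the ABS32 and REL32 codes from "ELF for the Arm Architecture".

```c
#include <stdbool.h>
#include <stdint.h>

/* Relocation type numbers from "ELF for the Arm Architecture". */
#define TOY_R_ARM_ABS32	2
#define TOY_R_ARM_REL32	3

/* ELF32 r_info layout: symbol index in the upper 24 bits, type in the low byte. */
uint32_t rel_sym_index(uint32_t r_info)
{
	return r_info >> 8;
}

uint32_t rel_type(uint32_t r_info)
{
	return r_info & 0xff;
}

/* Sketch of the change: REL32 is examined in the same way as ABS32. */
bool is_checked_data_reloc(uint32_t r_info)
{
	uint32_t type = rel_type(r_info);

	return type == TOY_R_ARM_ABS32 || type == TOY_R_ARM_REL32;
}
```

For the entry shown above, 0x00000403 decodes to symbol index 4 and type 3, which is R_ARM_REL32.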
    • modpost: fix section_mismatch message for R_ARM_THM_{CALL,JUMP24,JUMP19} · 3310bae8
      Masahiro Yamada authored
      addend_arm_rel() processes R_ARM_THM_CALL, R_ARM_THM_JUMP24,
      R_ARM_THM_JUMP19 in a wrong way.
      
      Here is the test code.
      
      [test code for R_ARM_THM_JUMP24]
      
        .section .init.text,"ax"
        bar:
                bx      lr
      
        .section .text,"ax"
        .globl foo
        foo:
                b       bar
      
      [test code for R_ARM_THM_CALL]
      
        .section .init.text,"ax"
        bar:
                bx      lr
      
        .section .text,"ax"
        .globl foo
        foo:
                push    {lr}
                bl      bar
                pop     {pc}
      
      If you compile it with CONFIG_THUMB2_KERNEL=y, modpost will show the
      symbol name as (unknown).
      
        WARNING: modpost: vmlinux.o: section mismatch in reference: foo (section: .text) -> (unknown) (section: .init.text)
      
      (You need to use GNU linker instead of LLD to reproduce it.)
      
      Fix the code to make modpost show the correct symbol name. I checked
      arch/arm/kernel/module.c to learn the encoding of R_ARM_THM_CALL and
      R_ARM_THM_JUMP24. The module loader does not support R_ARM_THM_JUMP19, but
      I checked its encoding in the ARM ARM (Arm Architecture Reference Manual).
      
      The '+4' is the compensation for pc-relative instruction. It is
      documented in "ELF for the Arm Architecture" [1].
      
        "If the relocation is pc-relative then compensation for the PC bias
        (the PC value is 8 bytes ahead of the executing instruction in Arm
        state and 4 bytes in Thumb state) must be encoded in the relocation
        by the object producer."
      
      [1]: https://github.com/ARM-software/abi-aa/blob/main/aaelf32/aaelf32.rst
      
      Fixes: c9698e5c ("ARM: 7964/1: Detect section mismatches in thumb relocations")
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
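A hedged sketch of decoding the addend for R_ARM_THM_CALL / R_ARM_THM_JUMP24, including the '+4' PC-bias compensation mentioned above. The bit layout follows the Thumb-2 BL/B.W encoding; `thm_branch_addend()` is a hypothetical name, and this is not a copy of the modpost implementation.

```c
#include <stdint.h>

/*
 * Decode the branch offset of a Thumb-2 BL / B.W instruction pair and add
 * the Thumb PC bias of 4. 'upper' and 'lower' are the two 16-bit halfwords
 * in instruction order, already converted to host byte order.
 */
int32_t thm_branch_addend(uint16_t upper, uint16_t lower)
{
	uint32_t sign = (upper >> 10) & 1;
	uint32_t j1 = (lower >> 13) & 1;
	uint32_t j2 = (lower >> 11) & 1;
	int32_t offset;

	offset = (int32_t)((sign << 24) |
			   ((~(j1 ^ sign) & 1) << 23) |
			   ((~(j2 ^ sign) & 1) << 22) |
			   ((uint32_t)(upper & 0x03ff) << 12) |
			   ((uint32_t)(lower & 0x07ff) << 1));
	if (offset & 0x01000000)
		offset -= 0x02000000;	/* sign-extend the 25-bit value */
	return offset + 4;
}
```

An unresolved `bl` is typically assembled as the halfwords 0xf7ff 0xfffe, which encode an offset of -4; after the bias compensation the addend is 0, so the reference points exactly at the target symbol.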
    • modpost: detect section mismatch for R_ARM_THM_{MOVW_ABS_NC,MOVT_ABS} · cd1824fb
      Masahiro Yamada authored
      When CONFIG_THUMB2_KERNEL is enabled, modpost fails to detect some
      types of section mismatches.
      
        [test code]
      
          #include <linux/init.h>
      
          int __initdata foo;
          int get_foo(void) { return foo; }
      
      It is apparently a bad reference, but modpost does not report anything.
      
      The test code above produces the following relocations.
      
        Relocation section '.rel.text' at offset 0x1e8 contains 2 entries:
         Offset     Info    Type            Sym.Value  Sym. Name
        00000000  0000052f R_ARM_THM_MOVW_AB 00000000   .LANCHOR0
        00000004  00000530 R_ARM_THM_MOVT_AB 00000000   .LANCHOR0
      
      Currently, R_ARM_THM_MOVW_ABS_NC and R_ARM_THM_MOVT_ABS are just skipped.
      
      Add code to handle them. I checked arch/arm/kernel/module.c to learn
      how the offset is encoded in the instruction.
      
      One more thing to note for Thumb instructions - the st_value is an odd
      value, so you need to mask bit 0 to get the offset. Otherwise, you
      will get an off-by-one error in the nearest symbol look-up.
      
      It is documented in "ELF for the ARM Architecture" [1]:
      
        In addition to the normal rules for symbol values the following rules
        shall also apply to symbols of type STT_FUNC:
      
         * If the symbol addresses an Arm instruction, its value is the
           address of the instruction (in a relocatable object, the offset
           of the instruction from the start of the section containing it).
      
         * If the symbol addresses a Thumb instruction, its value is the
           address of the instruction with bit zero set (in a relocatable
           object, the section offset with bit zero set).
      
         * For the purposes of relocation the value used shall be the address
           of the instruction (st_value & ~1).
      
      [1]: https://github.com/ARM-software/abi-aa/blob/main/aaelf32/aaelf32.rst
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
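The two points above, the immediate encoding of Thumb-2 MOVW/MOVT and the bit 0 masking of st_value, can be sketched as follows. The function names are hypothetical; the bit positions follow the Thumb-2 instruction encoding (imm4:i:imm3:imm8).

```c
#include <stdint.h>

/* Decode the 16-bit immediate of a Thumb-2 MOVW/MOVT instruction pair. */
int32_t thm_mov_imm16(uint16_t upper, uint16_t lower)
{
	uint32_t imm = ((uint32_t)(upper & 0x000f) << 12) |	/* imm4 */
		       ((uint32_t)(upper & 0x0400) << 1) |	/* i */
		       ((uint32_t)(lower & 0x7000) >> 4) |	/* imm3 */
		       (uint32_t)(lower & 0x00ff);		/* imm8 */

	/* Sign-extend from bit 15. */
	return (int32_t)((imm ^ 0x8000) - 0x8000);
}

/* For a Thumb function symbol, clear bit 0 of st_value to get the address. */
uint32_t thumb_func_addr(uint32_t st_value)
{
	return st_value & ~UINT32_C(1);
}
```

Comparing a relocation address against `thumb_func_addr(st_value)` rather than the raw st_value avoids the off-by-one in the nearest-symbol search.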
  11. 02 Jun, 2023 4 commits
    • modpost: refactor find_fromsym() and find_tosym() · b1a9651d
      Masahiro Yamada authored
      find_fromsym() and find_tosym() are similar - both of them iterate
      in the .symtab section and return the nearest symbol.
      
      The difference between them is that find_tosym() allows a negative
      distance, but the distance must be less than 20.
      
      Factor out the common part into find_nearest_sym().
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
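A simplified sketch of what a shared nearest-symbol helper could look like, with the two differences expressed as parameters. `struct toy_sym` and `find_nearest()` are hypothetical; the real find_nearest_sym() walks the ELF .symtab and also filters by section and by symbol-name validity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified symbol: a name and a start address. */
struct toy_sym {
	const char *name;
	unsigned long value;
};

/*
 * Return the symbol nearest to 'addr', or NULL if none is close enough.
 * Symbols above 'addr' are considered only when 'allow_negative' is set;
 * candidates farther away than 'max_distance' are ignored.
 */
const struct toy_sym *find_nearest(const struct toy_sym *syms, size_t n,
				   unsigned long addr, bool allow_negative,
				   unsigned long max_distance)
{
	const struct toy_sym *near = NULL;

	for (size_t i = 0; i < n; i++) {
		unsigned long distance;

		if (addr >= syms[i].value)
			distance = addr - syms[i].value;
		else if (allow_negative)
			distance = syms[i].value - addr;
		else
			continue;

		if (distance <= max_distance) {
			max_distance = distance;	/* tighten the bound */
			near = &syms[i];
		}
		if (max_distance == 0)
			break;
	}
	return near;
}
```

A caller in the spirit of find_fromsym() would pass `allow_negative = false` with an unlimited distance, while one in the spirit of find_tosym() would pass `true` with a small bound.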
    • modpost: detect section mismatch for R_ARM_{MOVW_ABS_NC,MOVT_ABS} · 12ca2c67
      Masahiro Yamada authored
      For ARM defconfig (i.e. multi_v7_defconfig), modpost fails to detect
      some types of section mismatches.
      
        [test code]
      
          #include <linux/init.h>
      
          int __initdata foo;
          int get_foo(void) { return foo; }
      
      It is apparently a bad reference, but modpost does not report anything.
      
      The test code above produces the following relocations.
      
        Relocation section '.rel.text' at offset 0x200 contains 2 entries:
         Offset     Info    Type            Sym.Value  Sym. Name
        00000000  0000062b R_ARM_MOVW_ABS_NC 00000000   .LANCHOR0
        00000004  0000062c R_ARM_MOVT_ABS    00000000   .LANCHOR0
      
      Currently, R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS are just skipped.
      
      Add code to handle them. I checked arch/arm/kernel/module.c to learn
      how the offset is encoded in the instruction.
      
      The referenced symbol in relocation might be a local anchor.
      If is_valid_name() returns false, let's search for a better symbol name.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
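For reference, a minimal sketch of how the 16-bit immediate is laid out in an ARM-state MOVW/MOVT instruction (imm4 in bits 19:16, imm12 in bits 11:0). `arm_mov_imm16()` is a hypothetical name, and the code is an illustration rather than the modpost source.

```c
#include <stdint.h>

/* Decode the 16-bit immediate of an ARM-state MOVW/MOVT instruction. */
int32_t arm_mov_imm16(uint32_t inst)
{
	uint32_t imm = ((inst & 0xf0000) >> 4) | (inst & 0xfff);

	/* Sign-extend from bit 15. */
	return (int32_t)((imm ^ 0x8000) - 0x8000);
}
```

For example, `movw r3, #0x1234` is encoded as 0xe3013234 and decodes back to 0x1234.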
    • modpost: fix section mismatch message for R_ARM_{PC24,CALL,JUMP24} · 56a24b8c
      Masahiro Yamada authored
      addend_arm_rel() processes R_ARM_PC24, R_ARM_CALL, R_ARM_JUMP24 in a
      wrong way.
      
      Here is the test code.
      
      [test code for R_ARM_JUMP24]
      
        .section .init.text,"ax"
        bar:
                bx      lr
      
        .section .text,"ax"
        .globl foo
        foo:
                b       bar
      
      [test code for R_ARM_CALL]
      
        .section .init.text,"ax"
        bar:
                bx      lr
      
        .section .text,"ax"
        .globl foo
        foo:
                push    {lr}
                bl      bar
                pop     {pc}
      
      If you compile it with ARM multi_v7_defconfig, modpost will show the
      symbol name as (unknown).
      
        WARNING: modpost: vmlinux.o: section mismatch in reference: foo (section: .text) -> (unknown) (section: .init.text)
      
      (You need to use GNU linker instead of LLD to reproduce it.)
      
      Fix the code to make modpost show the correct symbol name.
      
      I imported (with adjustment) sign_extend32() from include/linux/bitops.h.
      
      The '+8' is the compensation for pc-relative instruction. It is
      documented in "ELF for the Arm Architecture" [1].
      
        "If the relocation is pc-relative then compensation for the PC bias
        (the PC value is 8 bytes ahead of the executing instruction in Arm
        state and 4 bytes in Thumb state) must be encoded in the relocation
        by the object producer."
      
      [1]: https://github.com/ARM-software/abi-aa/blob/main/aaelf32/aaelf32.rst
      
      Fixes: 56a974fa ("kbuild: make better section mismatch reports on arm")
      Fixes: 6e2e340b ("ARM: 7324/1: modpost: Fix section warnings for ARM for many compilers")
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
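The addend computation for ARM-state branches, including the '+8' compensation, can be sketched as below. The `sign_extend32()` here is a portable rewrite for illustration, not the version imported from include/linux/bitops.h, and `arm_branch_addend()` is a hypothetical name.

```c
#include <stdint.h>

/* Sign-extend 'value', treating bit 'index' as the sign bit. */
static int32_t sign_extend32(uint32_t value, int index)
{
	uint32_t sign_bit = UINT32_C(1) << index;

	value &= (sign_bit << 1) - 1;	/* keep bits [index:0] only */
	return (int32_t)((value ^ sign_bit) - sign_bit);
}

/*
 * Addend of an ARM-state B/BL instruction (R_ARM_PC24, R_ARM_CALL,
 * R_ARM_JUMP24): the 24-bit word offset, scaled to bytes and sign-extended,
 * plus 8 to compensate for the PC bias in Arm state.
 */
int32_t arm_branch_addend(uint32_t inst)
{
	return sign_extend32((inst & 0x00ffffff) << 2, 25) + 8;
}
```

An unresolved `b bar` is typically assembled as 0xeafffffe (offset -8), so the compensated addend is 0.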
    • modpost: fix section mismatch message for R_ARM_ABS32 · b7c63520
      Masahiro Yamada authored
      addend_arm_rel() processes R_ARM_ABS32 in a wrong way.
      
      Here is the test code.
      
        [test code 1]
      
          #include <linux/init.h>
      
          int __initdata foo;
          int get_foo(void) { return foo; }
      
      If you compile it with ARM versatile_defconfig, modpost will show the
      symbol name as (unknown).
      
        WARNING: modpost: vmlinux.o: section mismatch in reference: get_foo (section: .text) -> (unknown) (section: .init.data)
      
      (You need to use GNU linker instead of LLD to reproduce it.)
      
      If you compile it for other architectures, modpost will show the correct
      symbol name.
      
        WARNING: modpost: vmlinux.o: section mismatch in reference: get_foo (section: .text) -> foo (section: .init.data)
      
      For R_ARM_ABS32, addend_arm_rel() sets r->r_addend to a wrong value.
      
      I just mimicked the code in arch/arm/kernel/module.c.
      
      However, there is more difficulty for ARM.
      
      Here is another test case.
      
        [test code 2]
      
          #include <linux/init.h>
      
          int __initdata foo;
          int get_foo(void) { return foo; }
      
          int __initdata bar;
          int get_bar(void) { return bar; }
      
      With this commit applied, modpost will show the following messages
      for ARM versatile_defconfig:
      
        WARNING: modpost: vmlinux.o: section mismatch in reference: get_foo (section: .text) -> foo (section: .init.data)
        WARNING: modpost: vmlinux.o: section mismatch in reference: get_bar (section: .text) -> foo (section: .init.data)
      
      The reference from 'get_bar' to 'foo' seems wrong.
      
      I have no solution for this because it is true at the assembly level.
      
      In the following output, relocation at 0x1c is no longer associated
      with 'bar'. The two relocation entries point to the same symbol, and
      the offset to 'bar' is encoded in the instruction 'ldr r0, [r3, #4]'.
      
        Disassembly of section .text:
      
        00000000 <get_foo>:
           0: e59f3004          ldr     r3, [pc, #4]   @ c <get_foo+0xc>
           4: e5930000          ldr     r0, [r3]
           8: e12fff1e          bx      lr
           c: 00000000          .word   0x00000000
      
        00000010 <get_bar>:
          10: e59f3004          ldr     r3, [pc, #4]   @ 1c <get_bar+0xc>
          14: e5930004          ldr     r0, [r3, #4]
          18: e12fff1e          bx      lr
          1c: 00000000          .word   0x00000000
      
        Relocation section '.rel.text' at offset 0x244 contains 2 entries:
         Offset     Info    Type            Sym.Value  Sym. Name
        0000000c  00000c02 R_ARM_ABS32       00000000   .init.data
        0000001c  00000c02 R_ARM_ABS32       00000000   .init.data
      
      When find_elf_symbol() gets into a situation where relsym->st_name is
      zero, there is no guarantee to get the symbol name as written in C.
      
      I am keeping the current logic because it is useful in many architectures,
      but the symbol name is not always correct depending on the optimization.
      I left some comments in find_tosym().
      
      Fixes: 56a974fa ("kbuild: make better section mismatch reports on arm")
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
  12. 28 May, 2023 4 commits