    um: Mark all kernel symbols as local · d5027ca6
    Johannes Berg authored
    Ritesh reported a bug [1] against UML, noting that it crashed on
    startup. The backtrace shows the following (heavily redacted):
    
    (gdb) bt
    ...
     #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
     #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
     #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72
    ...
     #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
    ...
     #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
     #45 0x00007f8990968b85 in __getgrnam_r [...]
     #46 0x00007f89909d6b77 in grantpt [...]
     #47 0x00007f8990a9394e in __GI_openpty [...]
     #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
     #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
     #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45
     #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
     #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144
    
    indicating that the UML function openpty_cb() calls openpty(),
    which internally calls __getgrnam_r(), which causes the nsswitch
    machinery to get started.
    
    This loads, through lots of indirection that I snipped, the
    libcom_err.so.2 library, which (in an unknown function, "??")
    calls sem_init().
    
    Now, of course it wants to get libpthread's sem_init(), since
    it's linked against libpthread. However, the dynamic linker
    looks up that symbol against the binary first, and gets the
    kernel's sem_init().
    
    Hajime Tazaki noted that "objcopy -L" can localize a symbol,
    so the dynamic linker wouldn't do the lookup this way. I tried,
    but for some reason that didn't seem to work.
    
    Doing the same thing in the linker script instead does seem to
    work, though I cannot entirely explain why: it *also* works if I
    just add "VERSION { { global: *; }; }" instead, indicating that
    something else is happening that I don't really understand. It
    may be that explicitly doing that marks them with some kind of
    empty version, and that's different from the default.
    
    Explicitly marking them with a version breaks kallsyms, so that
    doesn't seem to be possible.
    
    Marking all the symbols as local seems correct, and does seem
    to address the issue, so do that. Also do it for the static link,
    since nsswitch libraries could still be loaded there.
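    
    For illustration, localizing everything from the linker script can
    be written as an anonymous version node, mirroring the "global"
    variant quoted above. This is only a sketch of the general form;
    where exactly it sits in dyn.lds.S and in the static-link script
    is not shown here:
    
        /* Anonymous version node: bind every symbol locally. */
        VERSION {
          {
            local: *;
          };
        }
    
    With all symbols local, none should end up exported through the
    dynamic symbol table, so a library loaded later should no longer
    be able to resolve its sem_init() reference to the kernel's
    function.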
    
    [1] https://bugs.debian.org/983379
    
    Reported-by: Ritesh Raj Sarraf <rrs@debian.org>
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
    Tested-By: Ritesh Raj Sarraf <rrs@debian.org>
    Signed-off-by: Richard Weinberger <richard@nod.at>
dyn.lds.S 5.22 KB