    [PATCH] Remove sys_call_table export · f960dc50
    Arjan van de Ven authored
    The following patch removes the export of the sys_call_table.
    
    There are no uses of this export that are valid and correct. The uses I've
    found so far are:
    
    1. Calling syscalls from inside kernel modules
    iBCS/Linux-abi used to do this (and this is the reason for the export
    in the first place); however, it no longer does, because newer gccs
    (2.96/3.x) don't allow function pointer calls with a mismatching type.
    Also, it's much better to just call the sys_foo functions directly (most
    are already exported, and exporting more if needed wouldn't be a
    problem; they are clearly a stable interface). Since gcc no longer
    allows this (and I doubt older ones allowed it on all platforms), I
    consider this an invalid and unneeded use.
    
    2. Install new syscalls from kernel modules
    LiS seems to be doing this. The correct way to do this is how NFS does
    it for its syscall, which doesn't need the syscall table to be
    exported. Without an in-kernel helper like the one NFS has, it is not
    possible to do this race-free wrt module unloads etc. I.e. this use of
    the export is unneeded and incorrect.
    
    3. Intercept system calls
    OProfile (and Intel's VTune, which is similar in function) used to do this;
    however, what they really need is a notification on certain
    events (exec() mostly). The way modules do this is to store the original
    function pointer and install a new one that calls the old one after storing
    whatever info they need. This mechanism breaks badly when
    multiple such modules do this and then unload and
    uninstall their handlers (by restoring their saved pointer,
    which may or may not point to a valid handler anymore).
    I.e. the use of the export here is just a bandaid for the lack of a
    proper mechanism, and is also incorrect and crash-prone.
    
    4. Extend system calls
    The mechanism for this is identical to the previous one, except
    that now the actual syscall behavior is changed. I don't think open source
    modules do this (generally they don't need to; just adding things to the
    kernel proper works for them), however I've
    seen IBM's closed source cluster fs do this.
    The objections to the mechanism are the same as in 3. Also,
    this effectively changes the userspace ABI, which is undesirable.
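
    The save-and-restore problem from items 3 and 4 can be shown with a
    small userspace sketch. This is not kernel code: table_slot stands in
    for one sys_call_table entry, and hook_module, module_load,
    module_unload, and the two hooks are hypothetical names made up for
    illustration. It only models pointer bookkeeping; in a real kernel the
    unloaded module's code would be freed, so a dangling slot means a crash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the syscall table. */
typedef long (*syscall_fn)(long);

static long real_exec(long arg) { return arg; }

static syscall_fn table_slot = real_exec;

/* Each simulated module remembers what it found in the slot at load time. */
struct hook_module {
    syscall_fn saved;   /* pointer that was in the slot when we loaded */
    int loaded;         /* nonzero while the module's code is "present" */
};

static struct hook_module mod_a, mod_b;

/* Wrappers: record whatever they need (omitted), then chain to the saved pointer. */
static long hook_a(long arg) { return mod_a.saved(arg); }
static long hook_b(long arg) { return mod_b.saved(arg); }

static void module_load(struct hook_module *m, syscall_fn hook)
{
    assert(hook != NULL);
    m->saved = table_slot;  /* store the original pointer */
    m->loaded = 1;
    table_slot = hook;      /* install our own */
}

static void module_unload(struct hook_module *m)
{
    table_slot = m->saved;  /* blindly restore, whatever the slot holds now */
    m->loaded = 0;
}

/* Returns 1 if the slot points at a hook whose module is no longer loaded. */
static int slot_is_dangling(void)
{
    if (table_slot == hook_a && !mod_a.loaded)
        return 1;
    if (table_slot == hook_b && !mod_b.loaded)
        return 1;
    return 0;
}

static void reset(void)
{
    table_slot = real_exec;
    mod_a.loaded = 0;
    mod_b.loaded = 0;
}

/* Load A, load B, then unload in the same order (A first). */
int unload_in_load_order(void)
{
    reset();
    module_load(&mod_a, hook_a);
    module_load(&mod_b, hook_b);   /* B saved a pointer to hook_a */
    module_unload(&mod_a);         /* slot = real_exec; B's hook is silently dropped */
    module_unload(&mod_b);         /* slot = hook_a, but A is gone */
    return slot_is_dangling();
}

/* Load A, load B, then unload strictly in reverse order (B first). */
int unload_in_reverse_order(void)
{
    reset();
    module_load(&mod_a, hook_a);
    module_load(&mod_b, hook_b);
    module_unload(&mod_b);         /* slot = hook_a, A still loaded */
    module_unload(&mod_a);         /* slot = real_exec */
    return slot_is_dangling();
}
```

    Only the strictly reverse unload order leaves the slot sane, and
    nothing forces independent modules to unload in that order.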
ksyms.c 16.1 KB