    [IA64] speedup ptrace by avoiding kernel-stack walk · 47651db7
    David Mosberger authored
    This patch changes the syscall entry path to store the
    current-frame-mask (CFM) in pt_regs->cr_ifs.  This just takes one
    extra instruction (a "dep" to clear the bits other than 0-37) and is
    free in terms of cycles.
    
    The advantage of doing this is that it lets ptrace() avoid having to
    walk the stack to determine the end of the user-level backing-store of
    a process which is in the middle of a system-call.  Since this is what
    strace does all the time, this speeds up strace quite a bit (by ~50%).
    More importantly, it makes the syscall and non-syscall cases much more
    symmetric, which is something I have always wanted.
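The arithmetic that replaces the stack walk can be sketched roughly as follows. This is a simplified user-space illustration, not the kernel code: `rse_skip_regs` is modeled on the kernel's `ia64_rse_skip_regs` helper but works on plain integer addresses, and `user_rbs_end`, its parameters, and the `ndirty` input are hypothetical names chosen for this example. It assumes the size-of-frame field occupies bits 0-6 of the saved CFM and that every 64th backing-store slot (slot 63) holds an RNaT collection rather than a register.

```c
#include <stdint.h>

/* Slot index (0-63) of an 8-byte backing-store address. */
static inline uint64_t rse_slot_num(uint64_t addr)
{
    return (addr >> 3) & 0x3f;
}

/*
 * Address reached after skipping num_regs registers from addr,
 * stepping over the RNaT collection stored in slot 63 of each group.
 * Modeled on the kernel's ia64_rse_skip_regs(); num_regs may be negative.
 */
static inline uint64_t rse_skip_regs(uint64_t addr, int64_t num_regs)
{
    int64_t delta = (int64_t)rse_slot_num(addr) + num_regs;

    if (num_regs < 0)
        delta -= 0x3e;
    return addr + (uint64_t)((num_regs + delta / 0x3f) * 8);
}

/*
 * Hypothetical helper: end of the user-level backing store for a task
 * stopped in a system call, given the saved bspstore, the number of
 * dirty registers (ndirty, assumed known), and the CFM saved in cr_ifs.
 */
static inline uint64_t user_rbs_end(uint64_t bspstore, int64_t ndirty,
                                    uint64_t cfm)
{
    int64_t sof = (int64_t)(cfm & 0x7f);   /* size of frame, bits 0-6 */

    return rse_skip_regs(bspstore, ndirty + sof);
}
```

With the frame size available directly from the saved CFM, the end address is a constant-time computation instead of an unwind of the kernel stack.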
    
    Note that the change to ivt.S looks big, but this is just a ripple
    effect of rescheduling instructions to keep syscall latency the same.
    All that's really going on there is that instead of storing 0 into
    the cr_ifs member, we store the low 38 bits of ar.pfs.
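As a rough C illustration of what the single "dep" instruction achieves, the sketch below keeps bits 0-37 of an ar.pfs value and decodes the resulting frame marker. The names `CFM_MASK`, `cfm_from_pfs`, `struct cfm_fields`, and `cfm_decode` are hypothetical and exist only for this example; the field layout follows the IA-64 frame-marker format (sof, sol, sor, rrb.gr, rrb.fr, rrb.pr).

```c
#include <stdint.h>

/* Bits 0-37 of ar.pfs hold the previous frame marker. */
#define CFM_MASK ((UINT64_C(1) << 38) - 1)

/* Decoded frame-marker fields (hypothetical struct for illustration). */
struct cfm_fields {
    unsigned sof;     /* bits 0-6:   size of frame */
    unsigned sol;     /* bits 7-13:  size of locals */
    unsigned sor;     /* bits 14-17: size of rotating portion */
    unsigned rrb_gr;  /* bits 18-24: register rename base, general */
    unsigned rrb_fr;  /* bits 25-31: register rename base, floating-point */
    unsigned rrb_pr;  /* bits 32-37: register rename base, predicate */
};

/* Clear everything except bits 0-37, as the "dep" does in ivt.S. */
static inline uint64_t cfm_from_pfs(uint64_t ar_pfs)
{
    return ar_pfs & CFM_MASK;
}

/* Split a 38-bit frame marker into its fields. */
static inline struct cfm_fields cfm_decode(uint64_t cfm)
{
    struct cfm_fields f;

    f.sof    = (unsigned)(cfm & 0x7f);
    f.sol    = (unsigned)((cfm >> 7) & 0x7f);
    f.sor    = (unsigned)((cfm >> 14) & 0xf);
    f.rrb_gr = (unsigned)((cfm >> 18) & 0x7f);
    f.rrb_fr = (unsigned)((cfm >> 25) & 0x7f);
    f.rrb_pr = (unsigned)((cfm >> 32) & 0x3f);
    return f;
}
```

The upper bits of ar.pfs (the previous epilog count and privilege level) are dropped by the mask, so the value stored in cr_ifs contains only the frame marker.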
    Signed-off-by: David Mosberger <davidm@hpl.hp.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>
ivt.S 46.8 KB