    proc: faster /proc/*/status · f7a5f132
    Alexey Dobriyan authored
    top(1) opens the following files for every PID:
    
    	/proc/*/stat
    	/proc/*/statm
    	/proc/*/status
    
    This patch switches /proc/*/status away from seq_printf().
    The result is a 13.5% speedup in elapsed time on the benchmark below.
    
    The benchmark is open("/proc/self/status") + read + close, repeated
    1,000,000 times.
    
    				BEFORE
    $ perf stat -r 10 taskset -c 3 ./proc-self-status
    
     Performance counter stats for 'taskset -c 3 ./proc-self-status' (10 runs):
    
          10748.474301      task-clock (msec)         #    0.954 CPUs utilized            ( +-  0.91% )
                    12      context-switches          #    0.001 K/sec                    ( +-  1.09% )
                     1      cpu-migrations            #    0.000 K/sec
                   104      page-faults               #    0.010 K/sec                    ( +-  0.45% )
        37,424,127,876      cycles                    #    3.482 GHz                      ( +-  0.04% )
         8,453,010,029      stalled-cycles-frontend   #   22.59% frontend cycles idle     ( +-  0.12% )
         3,747,609,427      stalled-cycles-backend    #  10.01% backend cycles idle       ( +-  0.68% )
        65,632,764,147      instructions              #    1.75  insn per cycle
                                                      #    0.13  stalled cycles per insn  ( +-  0.00% )
        13,981,324,775      branches                  # 1300.773 M/sec                    ( +-  0.00% )
           138,967,110      branch-misses             #    0.99% of all branches          ( +-  0.18% )
    
          11.263885428 seconds time elapsed                                          ( +-  0.04% )
          ^^^^^^^^^^^^
    
    				AFTER
    $ perf stat -r 10 taskset -c 3 ./proc-self-status
    
     Performance counter stats for 'taskset -c 3 ./proc-self-status' (10 runs):
    
           9010.521776      task-clock (msec)         #    0.925 CPUs utilized            ( +-  1.54% )
                    11      context-switches          #    0.001 K/sec                    ( +-  1.54% )
                     1      cpu-migrations            #    0.000 K/sec                    ( +- 11.11% )
                   103      page-faults               #    0.011 K/sec                    ( +-  0.60% )
        32,352,310,603      cycles                    #    3.591 GHz                      ( +-  0.07% )
         7,849,199,578      stalled-cycles-frontend   #   24.26% frontend cycles idle     ( +-  0.27% )
         3,269,738,842      stalled-cycles-backend    #  10.11% backend cycles idle       ( +-  0.73% )
        56,012,163,567      instructions              #    1.73  insn per cycle
                                                      #    0.14  stalled cycles per insn  ( +-  0.00% )
        11,735,778,795      branches                  # 1302.453 M/sec                    ( +-  0.00% )
            98,084,459      branch-misses             #    0.84% of all branches          ( +-  0.28% )
    
           9.741247736 seconds time elapsed                                          ( +-  0.07% )
           ^^^^^^^^^^^
    
    Link: http://lkml.kernel.org/r/20160806125608.GB1187@p183.telecom.by
    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Joe Perches <joe@perches.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
array.c 19.9 KB