  1. 21 Jul, 2010 3 commits
    • tracing: Reduce latency and remove percpu trace_seq · bc289ae9
      Lai Jiangshan authored
      __print_flags() and __print_symbolic() use percpu trace_seq:
      
      1) Its memory is allocated at compile time, so it wastes memory when tracing is not used.
      2) It is percpu data, so it wastes even more memory on multi-CPU systems.
      3) It disables preemption while executing its core routine
         "trace_seq_printf(s, "%s: ", #call);", which introduces latency.
      
      So we move this trace_seq to struct trace_iterator.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4C078350.7090106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • trace: Reorder struct ring_buffer_per_cpu to remove padding on 64bit · 985023de
      Richard Kennedy authored
      Reorder the structure to remove 8 bytes of padding on 64-bit builds.
      This shrinks the size to 128 bytes, allowing allocation from a smaller
      slab and needing one fewer cache line.
      Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
      LKML-Reference: <1269516456.2054.8.camel@localhost>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
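The effect of reordering members on structure size can be illustrated with a minimal sketch. The two structs below are hypothetical and much smaller than struct ring_buffer_per_cpu; they only show how placing the most strictly aligned member first can remove interior padding. Exact sizes depend on the target ABI (on a typical LP64 target such as x86-64 Linux the two sizes are 24 and 16 bytes).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a wide member sits between two narrow ones. */
struct padded {
    char flag;   /* followed by padding up to the alignment of 'count' */
    long count;
    char state;  /* followed by tail padding */
};

/* Same members, with the most strictly aligned member first. */
struct reordered {
    long count;
    char flag;
    char state;  /* only tail padding remains */
};

/* Number of bytes saved by the reordering on the current target. */
size_t padding_saved(void)
{
    /* Guard against unsigned underflow on an unusual ABI. */
    assert(sizeof(struct padded) >= sizeof(struct reordered));
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

Calling padding_saved() reports the saving for the compiler and target in use. On Linux, a tool such as pahole can report holes in real kernel structures.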
    • tracing: Allow to disable cmdline recording · e870e9a1
      Li Zefan authored
      We found that enabling even a single, rarely triggered trace event
      can add significant overhead to context switches.
      
      (lmbench context switch test)
       --------------------------------------------------
        2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
       ------ ------ ------ ------ ------ ------- -------
         2.19   2.3    2.21   2.56   2.13    2.54    2.07
         2.39   2.51   2.35   2.75   2.27    2.81    2.24
      
      The overhead is 6% ~ 11%.
      
      This is because when a trace event is enabled, 3 tracepoints (sched_switch,
      sched_wakeup, sched_wakeup_new) are activated to map pid to cmdname.
      
      We'd like to avoid this overhead, so add a trace option '(no)record-cmd'
      that allows cmdline recording to be disabled.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4C2D57F4.2050204@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
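The quoted 6% ~ 11% range can be reproduced from the lmbench table with a short calculation. This sketch assumes the first data row is the baseline and the second row was measured with a trace event enabled; the commit message does not label the rows, so that reading is an inference from the stated overhead. The array and function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* lmbench context switch results, in table order (2p/0K ... 16p/64K). */
/* Assumption: first row is the baseline, second row has an event enabled. */
static const double baseline[] = {2.19, 2.30, 2.21, 2.56, 2.13, 2.54, 2.07};
static const double traced[]   = {2.39, 2.51, 2.35, 2.75, 2.27, 2.81, 2.24};

/* Relative slowdown of 'with_event' over 'base', in percent. */
double overhead_percent(double base, double with_event)
{
    assert(base > 0.0);
    return (with_event - base) / base * 100.0;
}

/* Smallest and largest overhead across all table columns. */
void overhead_range(double *min_out, double *max_out)
{
    size_t n = sizeof(baseline) / sizeof(baseline[0]);
    double lo = overhead_percent(baseline[0], traced[0]);
    double hi = lo;

    for (size_t i = 1; i < n; i++) {
        double pct = overhead_percent(baseline[i], traced[i]);
        if (pct < lo)
            lo = pct;
        if (pct > hi)
            hi = pct;
    }
    *min_out = lo;
    *max_out = hi;
}
```

Under that assumption, the smallest overhead comes from the 2p/64K column and the largest from the 16p/16K column, which is consistent with the range given in the commit message.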
  2. 17 Jul, 2010 4 commits
  3. 16 Jul, 2010 3 commits
  4. 15 Jul, 2010 3 commits
  5. 12 Jul, 2010 1 commit
  6. 06 Jul, 2010 1 commit
  7. 05 Jul, 2010 14 commits
  8. 04 Jul, 2010 1 commit
    • ARM: 6205/1: perf: ensure counter delta is treated as unsigned · 446a5a8b
      Will Deacon authored
      Hardware performance counters on ARM are 32 bits wide, but atomic64_t
      variables are used to represent counter data in the hw_perf_event structure.
      
      The armpmu_event_update function right-shifts a signed 64-bit delta variable
      and adds the result to the event count. This can lead to shifting in sign-bits
      if the MSB of the 32-bit counter value is set. This results in perf output
      such as:
      
       Performance counter stats for 'sleep 20':
      
       18446744073460670464  cycles             <-- 0xFFFFFFFFF12A6000
              7783773  instructions             #      0.000 IPC
                  465  context-switches
                  161  page-faults
              1172393  branches
      
         20.154242147  seconds time elapsed
      
      This patch ensures that the delta value is treated as unsigned so that the
      right shift sets the upper bits to zero.
      
      Cc: <stable@kernel.org>
      Acked-by: Jamie Iles <jamie.iles@picochip.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
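The sign-extension problem described above can be sketched outside the kernel. The code below is not the armpmu_event_update source; the function names and the width parameter are assumptions made for illustration. It contrasts a signed delta, where the right shift of a negative value is implementation-defined in C and is an arithmetic shift on GCC and Clang, with an unsigned delta, where the right shift always fills the upper bits with zeros.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Delta between two raw reads of a counter that is 'bits' wide (1..63)
 * but stored in 64-bit variables. Shifting left first makes the
 * subtraction wrap at the counter width.
 */
uint64_t delta_unsigned(uint64_t prev, uint64_t now, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    unsigned shift = 64 - bits;
    uint64_t delta = (now << shift) - (prev << shift);

    return delta >> shift;  /* logical shift: upper bits become zero */
}

/*
 * Problematic variant: the delta is held in a signed type. If the top
 * bit of the counter value is set, the right shift may copy the sign
 * bit into the upper bits (implementation-defined behaviour).
 */
uint64_t delta_signed(uint64_t prev, uint64_t now, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    unsigned shift = 64 - bits;
    int64_t delta = (int64_t)((now << shift) - (prev << shift));

    delta >>= shift;        /* arithmetic shift on GCC and Clang */
    return (uint64_t)delta;
}
```

With a 32-bit counter going from 0 to 0xF12A6000, delta_unsigned returns 0xF12A6000, while delta_signed on a compiler that uses arithmetic shifts yields 0xFFFFFFFFF12A6000, the bogus cycle count shown in the perf output above.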
  9. 03 Jul, 2010 1 commit
  10. 02 Jul, 2010 9 commits