  1. 16 Sep, 2009 1 commit
    • perf sched: Account for lost events, increase default buffering · dc02bf71
      Ingo Molnar authored
      Output statistics on lost events and state machine anomalies:
      
         TOTAL:                |  14974.910 ms |    46384 |
        ---------------------------------------------------
         INFO: 8.865% lost events (19132 out of 215819, in 8 chunks)
         INFO: 0.198% state machine bugs (49 out of 24708) (due to lost events?)
      
      Also increase the default buffering to -m 1024 (4 MB). Since we
      use output multiplexing, that much space is needed.
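      
      As a quick check of the figures above, the percentages and the
      buffer size can be recomputed. This is only a sketch: the 4 KiB
      page size is an assumption, and the helper names are hypothetical
      (they are not part of perf).

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, so -m 1024 means 1024 pages


def lost_percent(lost, total):
    """Percentage of events (or atoms) that were lost or inconsistent."""
    return 100.0 * lost / total


def mmap_buffer_bytes(pages, page_size=PAGE_SIZE):
    """Size in bytes of a buffer made of `pages` pages."""
    return pages * page_size


print(f"{lost_percent(19132, 215819):.3f}% lost events")
print(f"{lost_percent(49, 24708):.3f}% state machine bugs")
print(f"-m 1024 is {mmap_buffer_bytes(1024) // (1024 * 1024)} MB")
```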
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 14 Sep, 2009 6 commits
    • perf sched: Add support for sched:sched_stat_runtime events · 39aeb52f
      mingo authored
      This allows more precise 'perf sched latency' output:
      
       ---------------------------------------------------------------------------------------
        Task                  |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
       ---------------------------------------------------------------------------------------
        ksoftirqd/0-4         |    0.010 ms |        2 | avg:    2.476 ms | max:    2.977 ms |
        perf-12328            |   15.844 ms |       66 | avg:    1.118 ms | max:    9.979 ms |
        bdi-default-235       |    0.009 ms |        1 | avg:    0.998 ms | max:    0.998 ms |
        events/1-8            |    0.020 ms |        2 | avg:    0.998 ms | max:    0.998 ms |
        events/0-7            |    0.018 ms |        2 | avg:    0.992 ms | max:    0.996 ms |
        sleep-12329           |    0.742 ms |        3 | avg:    0.906 ms | max:    2.289 ms |
        sshd-12122            |    0.163 ms |        2 | avg:    0.283 ms | max:    0.562 ms |
        loop-getpid-lon-12322 | 1023.636 ms |       69 | avg:    0.208 ms | max:    5.996 ms |
        loop-getpid-lon-12321 | 1038.638 ms |        5 | avg:    0.073 ms | max:    0.171 ms |
        migration/1-5         |    0.000 ms |        1 | avg:    0.006 ms | max:    0.006 ms |
       ---------------------------------------------------------------------------------------
        TOTAL:                | 2079.078 ms |      153 |
       -------------------------------------------------
      
      Also, streamline the code a bit more, add asserts for various state
      machine failures (they should be debugged if they occur) and tie up
      a few loose ends.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Print PIDs too · 08f69e6c
      mingo authored
      Often it's useful to know the PID of the task as well - print it
      out too.
      
      ( While at it, reformat the output to be a bit more
        paste-into-commit-logs friendly. )
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Fix 'perf sched latency' output on 32-bit systems · d1153389
      Ingo Molnar authored
      Before:
      
        -----------------------------------------------------------------------------------
         Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
        -----------------------------------------------------------------------------------
         perf              |4853313.251 ms |       10 | avg:    0.046 ms | max:    0.337 ms |
         flush-8:0         |2426659.202 ms |        5 | avg:    0.015 ms | max:    0.016 ms |
         sleep             |485331.966 ms |        1 | avg:    0.012 ms | max:    0.012 ms |
         ksoftirqd/1       |485331.320 ms |        1 | avg:    0.005 ms | max:    0.005 ms |
        -----------------------------------------------------------------------------------
         TOTAL:            |8250635.739 ms |       17 |
        ---------------------------------------------
      
      After:
      
        -----------------------------------------------------------------------------------
         Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
        -----------------------------------------------------------------------------------
         perf              |    0.206 ms |       10 | avg:    0.046 ms | max:    0.337 ms |
         flush-8:0         |    2.680 ms |        5 | avg:    0.015 ms | max:    0.016 ms |
         sleep             |    0.662 ms |        1 | avg:    0.012 ms | max:    0.012 ms |
         ksoftirqd/1       |    0.015 ms |        1 | avg:    0.005 ms | max:    0.005 ms |
        -----------------------------------------------------------------------------------
         TOTAL:            |    3.563 ms |       17 |
        ---------------------------------------------
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Implement counter output multiplexing · ea57c4f5
      Ingo Molnar authored
      Finish the -M/--multiplex option implementation:
      
       - separate it out from group_fd
      
       - correctly set it via the ioctl and don't mmap counters that
         are multiplexed
      
       - modify the perf record event loop to deal with buffer-less
         counters.
      
       - remove the -g option from perf sched record
      
       - account for unordered events in perf sched latency
      
       - (add -f to perf sched record to ease measurements)
      
       - skip idle threads (pid==0) in latency output
      
      The result is better latency output by 'perf sched latency':
      
       -----------------------------------------------------------------------------------
        Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
       -----------------------------------------------------------------------------------
        ksoftirqd/8       |    0.071 ms |        2 | avg:    0.458 ms | max:    0.913 ms |
        at-spi-registry   |    0.609 ms |       19 | avg:    0.013 ms | max:    0.023 ms |
        perf              |    3.316 ms |       16 | avg:    0.013 ms | max:    0.054 ms |
        Xorg              |    0.392 ms |       19 | avg:    0.011 ms | max:    0.018 ms |
        sleep             |    0.537 ms |        2 | avg:    0.009 ms | max:    0.009 ms |
       -----------------------------------------------------------------------------------
        TOTAL:            |    4.925 ms |       58 |
       ---------------------------------------------
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Fix processing of randomly serialized sched traces · aa1ab9d2
      Frederic Weisbecker authored
      Currently 'perf sched latency' can report latencies that are
      far too high:
      
       -----------------------------------------------------------------------------------
       Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
       -----------------------------------------------------------------------------------
       xfce4-panel       |    0.222 ms |        2 | avg: 4718.345 ms | max: 9436.493 ms |
       scsi_eh_3         |    3.962 ms |       36 | avg:   55.957 ms | max: 1977.829 ms |
      
      The cause is traces that are sometimes badly serialized across CPUs.
      For example, these are the raw traces behind the xfce4-panel result:
      
      (1)          [init]-0     [000]  1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120]
      (2)     xfce4-panel-4569  [000]  1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140]
      (3)            Xorg-4276  [001]  1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000]
      (4)            Xorg-4276  [001]  1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000]
      (5)            perf-5219  [000]  1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120]
      
      The traces are processed in the order they arrive. So in (2)
      xfce4-panel goes to sleep, it is first woken up in (3) and it is
      eventually scheduled in (5).
      
      The latency reported by perf sched is then 1504 - 1495 = 9 secs.
      But this is wrong: we assume the traces are nicely serialized,
      while we should actually trust the timestamps more.
      
      If we reorder by timestamps we get:
      
      (1)            Xorg-4276  [001]  1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000]
      (2)          [init]-0     [000]  1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120]
      (3)     xfce4-panel-4569  [000]  1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140]
      (4)            Xorg-4276  [001]  1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000]
      (5)            perf-5219  [000]  1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120]
      
      Now the trace makes more sense: xfce4-panel is sleeping, then it is
      woken up in (1) and scheduled in (2). It goes to sleep in (3), is
      woken up in (4) and is scheduled in (5).
      
      Now the latency captured between (1) and (2) is about 39.9 us,
      and between (4) and (5) it is about 2.1 ms.
      
      This pattern of bad serialization is the origin of the high
      latencies reported by perf sched.
      
      Basically, we need to check whether the wakeup time is later than
      the schedule-out time. If it is not, we need to tag the current
      work atom as invalid.
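      
      The effect of that check can be sketched with the five
      xfce4-panel events from the trace above. This is a simplified,
      hypothetical model of the latency state machine, not the perf
      source: the function name, the event kinds and the `validate`
      flag are assumptions made for illustration.

```python
# Events for xfce4-panel in arrival order, as (timestamp_ns, kind).
ARRIVAL_ORDER = [
    (1494_663_899_990, "sched_in"),   # (1)
    (1494_663_928_373, "sched_out"),  # (2)
    (1494_663_860_125, "wakeup"),     # (3)
    (1504_098_252_756, "wakeup"),     # (4)
    (1504_100_353_302, "sched_in"),   # (5)
]


def wakeup_latencies(events, validate=False):
    """Return wakeup-to-sched-in latencies (ns) for one task.

    With validate=True, an atom whose first wakeup is timestamped
    before the preceding sched-out is tagged invalid and skipped.
    """
    latencies = []
    sched_out_ts = None
    wake_ts = None
    invalid = False
    for ts, kind in events:
        if kind == "sched_out":
            sched_out_ts, wake_ts, invalid = ts, None, False
        elif kind == "wakeup":
            if wake_ts is None:  # only the first wakeup starts the clock
                wake_ts = ts
                if validate and sched_out_ts is not None and ts < sched_out_ts:
                    invalid = True
        elif kind == "sched_in":
            if wake_ts is not None and not invalid:
                latencies.append(ts - wake_ts)
            sched_out_ts, wake_ts, invalid = None, None, False
    return latencies


print(wakeup_latencies(ARRIVAL_ORDER))                 # bogus ~9.4 s latency
print(wakeup_latencies(ARRIVAL_ORDER, validate=True))  # bad atom is dropped
print(wakeup_latencies(sorted(ARRIVAL_ORDER)))         # timestamp order
```

      In arrival order the model reproduces the 9436.493 ms maximum
      shown in the table; with the check enabled that atom is simply
      ignored, and sorting by timestamp yields the two short latencies.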
      
      Besides that, we may later need to work on a better ordering of
      the traces given by the kernel.
      
      After this patch:
      
      xfce4-session     |    0.221 ms |        1 | avg:    0.538 ms | max:    0.538 ms |
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Add an option to multiplex counters in a single channel · d1302522
      Frederic Weisbecker authored
      Add an option to multiplex counter output into the channel of
      the group leader, i.e. the first counter opened:
      
      	-M --multiplex
      
      The effect is better serialized samples. This is especially
      useful for tracepoint samples that need to be well serialized
      for their post-processing.
      
      Also make use of this option in 'perf sched'.
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 13 Sep, 2009 24 commits
    • perf_counter, sched: Add sched_stat_runtime tracepoint · f977bb49
      Ingo Molnar authored
      This allows more precise tracking of how the scheduler accounts
      (and acts upon) a task having spent N nanoseconds of CPU time.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add 'perf sched trace', improve documentation · c13f0d3c
      Ingo Molnar authored
      Alias 'perf sched trace' to 'perf trace', for workflow completeness.
      
      Add a bit of documentation for perf sched.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Allow mmap if paranoid checks are turned off · 459ec28a
      Ingo Molnar authored
      Before:
      
        $ perf sched record -f sleep 1
        Error: failed to mmap with 1 (Operation not permitted)
      
      After:
      
        $ perf sched record -f sleep 1
        [ perf record: Captured and wrote 0.095 MB perf.data (~4161 samples) ]
      
      Note, this is only allowed if perfcounter_paranoid is set to
      the most permissive (non-default) value of -1.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Implement the 'perf sched record' subcommand · 1fc35b29
      Ingo Molnar authored
      Implement the 'perf sched record' subcommand that adds a
      default list of events, turns on raw sampling and system-wide
      tracing and passes off the rest of the command to perf record.
      
      This is more convenient than having to specify the events all
      the time.
      
      Before:
      
       $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1
      
      After:
      
       $ perf sched record -f sleep 1
      
      Also fix the event string parser, which assumed that the strings
      passed in can be modified. (In this case they can't be, as they
      come from a read-only constant section.)
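      
      The expansion done by the subcommand can be sketched as follows.
      This is a hypothetical Python illustration that only rebuilds
      the "Before" command line shown above; the function and constant
      names are assumptions and do not come from the perf source.

```python
# Event names taken from the "Before" command line in this commit message.
DEFAULT_SCHED_EVENTS = [
    "sched_switch",
    "sched_stat_wait",
    "sched_stat_sleep",
    "sched_stat_iowait",
    "sched_process_exit",
    "sched_process_fork",
    "sched_wakeup",
    "sched_migrate_task",
]


def sched_record_argv(user_args):
    """Prepend the default options and events, then pass the rest through."""
    argv = ["perf", "record", "-a", "-R"]  # system-wide, raw samples
    for name in DEFAULT_SCHED_EVENTS:
        argv += ["-e", f"sched:{name}:r"]
    argv += ["-c", "1"]
    return argv + list(user_args)


print(" ".join(sched_record_argv(["sleep", "1"])))
```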
      
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Clean up PID sorting logic · b5fae128
      Ingo Molnar authored
      Use a sort list for thread atoms insertion as well, instead of
      hardcoding the PID comparison.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Finish latency => atom rename and misc cleanups · b1ffe8f3
      Ingo Molnar authored
       - Rename 'latency' field/variable names to the better 'atom' ones
      
       - Reduce the number of #include lines and consolidate them
      
       - Gather file scope variables at the top of the file
      
       - Remove unused bits
      
      No change in functionality.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add 'perf sched latency' and 'perf sched replay' · f2858d8a
      Ingo Molnar authored
      Separate the option parsing cleanly and add two variants:
      
       - 'perf sched latency' (can be abbreviated via 'perf sched lat')
       - 'perf sched replay'  (can be abbreviated via 'perf sched rep')
      
      Also add a repeat count option to replay and add a separate
      set of options for replay.
      
      Do the sorting setup only in the latency sub-command.
      
      Display separate help screens for 'perf sched' and
      'perf sched replay -h' - i.e. further separation of the
      sub-commands.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Implement multidimensional sorting · daa1d7a5
      Frederic Weisbecker authored
      Implement multidimensional sorting in perf sched so that
      you can sort by number of switches, average latency,
      maximum latency or runtime.
      
      perf sched -l -s avg,max  (this is the default)
      
      -----------------------------------------------------------------------------------
       Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
      -----------------------------------------------------------------------------------
       gnome-power-man   |    0.113 ms |        1 | avg: 4998.531 ms | max: 4998.531 ms |
       xfdesktop         |    1.190 ms |        7 | avg:  136.475 ms | max:  940.933 ms |
       xfce-mcs-manage   |    2.194 ms |       22 | avg:   38.534 ms | max:  735.174 ms |
       notification-da   |    2.749 ms |       31 | avg:   27.436 ms | max:  731.791 ms |
       xfce4-session     |    3.343 ms |       28 | avg:   26.796 ms | max:  734.891 ms |
       xfwm4             |    3.159 ms |       22 | avg:   12.406 ms | max:  241.333 ms |
       xchat             |   42.789 ms |      214 | avg:   11.886 ms | max:  100.349 ms |
       xfce4-terminal    |    5.386 ms |       22 | avg:   11.414 ms | max:  241.611 ms |
       firefox           |  151.992 ms |      123 | avg:    9.543 ms | max:  153.717 ms |
       xfce4-panel       |   24.324 ms |       47 | avg:    8.189 ms | max:  242.352 ms |
       :5090             |    6.932 ms |      111 | avg:    8.131 ms | max:  102.665 ms |
       events/0          |    0.758 ms |       12 | avg:    1.964 ms | max:   21.879 ms |
       Xorg              |  280.558 ms |      340 | avg:    1.864 ms | max:   99.526 ms |
       geany             |   63.391 ms |      295 | avg:    1.099 ms | max:    9.334 ms |
       reiserfs/0        |    0.039 ms |        2 | avg:    0.854 ms | max:    1.487 ms |
       kondemand/0       |    8.251 ms |      245 | avg:    0.691 ms | max:   34.372 ms |
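      
      A multi-key sort of this kind can be sketched with a few rows
      from the table above. The key names other than avg and max, the
      descending order and the helper name are assumptions made for
      illustration; this is not how the perf code is written.

```python
# (task, runtime_ms, switches, avg_delay_ms, max_delay_ms), rows from the table
ROWS = [
    ("xfwm4", 3.159, 22, 12.406, 241.333),
    ("xchat", 42.789, 214, 11.886, 100.349),
    ("gnome-power-man", 0.113, 1, 4998.531, 4998.531),
    ("xfce4-terminal", 5.386, 22, 11.414, 241.611),
]

# Hypothetical mapping from sort key name to column index.
COLUMNS = {"runtime": 1, "switch": 2, "avg": 3, "max": 4}


def sort_rows(rows, keys="avg,max"):
    """Sort by the first key, breaking ties with the following keys."""
    indexes = [COLUMNS[name] for name in keys.split(",")]
    return sorted(rows, key=lambda row: tuple(row[i] for i in indexes),
                  reverse=True)


for row in sort_rows(ROWS):
    print(row)
```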
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Fix nsec to msec conversion · 73622626
      Frederic Weisbecker authored
      We are dividing a time in ns by 1e9. This is a nsec to sec
      conversion. What we want is msecs. Fix it by dividing by 1e6.
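      
      The difference between the two divisors is easy to see on a
      sample value. A minimal sketch, with hypothetical helper names:

```python
NSEC_PER_MSEC = 1_000_000       # 1e6
NSEC_PER_SEC = 1_000_000_000    # 1e9


def ns_to_ms(ns):
    """Correct conversion: nanoseconds to milliseconds."""
    return ns / NSEC_PER_MSEC


def ns_to_sec(ns):
    """What dividing by 1e9 actually computes: seconds."""
    return ns / NSEC_PER_SEC


latency_ns = 2_100_546
print(f"{ns_to_ms(latency_ns):.3f} ms")   # the value we want to display
print(f"{ns_to_sec(latency_ns):.3f} s")   # 1000 times smaller
```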
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Export the total, max latency and total runtime to thread atoms list · 66685678
      Frederic Weisbecker authored
      Add fields to the thread atom list that keep track of the
      total and max latencies and also the total runtime. This makes
      the output faster and also prepares for sorting.
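      
      The idea of keeping running aggregates next to the list, so that
      output and sorting do not have to walk every atom again, can be
      sketched like this. The class and field names are hypothetical
      and only loosely modelled on the description above.

```python
from dataclasses import dataclass, field


@dataclass
class WorkAtom:
    wake_up_ns: int
    sched_in_ns: int
    runtime_ns: int = 0


@dataclass
class ThreadAtoms:
    atoms: list = field(default_factory=list)
    total_lat_ns: int = 0
    max_lat_ns: int = 0
    total_runtime_ns: int = 0

    def add(self, atom):
        """Append an atom and update the aggregates in O(1)."""
        latency = atom.sched_in_ns - atom.wake_up_ns
        self.atoms.append(atom)
        self.total_lat_ns += latency
        self.max_lat_ns = max(self.max_lat_ns, latency)
        self.total_runtime_ns += atom.runtime_ns


thread = ThreadAtoms()
thread.add(WorkAtom(wake_up_ns=100, sched_in_ns=150, runtime_ns=30))
thread.add(WorkAtom(wake_up_ns=200, sched_in_ns=400, runtime_ns=20))
print(thread.total_lat_ns, thread.max_lat_ns, thread.total_runtime_ns)
```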
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add involuntarily sleeping task in work atoms · c6ced611
      Frederic Weisbecker authored
      Currently in perf sched, we are measuring the scheduler wakeup
      latencies.
      
      Now we also want to measure the time a task waits to be scheduled
      after it gets preempted.
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Rename struct lat_snapshot to struct work atoms · 17562205
      Frederic Weisbecker authored
      To measure the latencies, we capture the sched atoms data into
      a specific structure named struct lat_snapshot.
      
      As this structure can be used for other purposes of scheduler
      profiling and mirrors what happens in a thread work atom, let's
      rename it to struct work_atom and propagate this renaming to
      other function and structure names to keep it coherent.
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Output runtime and context switch totals · 3e304147
      Ingo Molnar authored
      After:
      
      -----------------------------------------------------------------------------------
       Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
      -----------------------------------------------------------------------------------
       make              |    0.678 ms |       13 | avg:    0.018 ms | max:    0.050 ms |
       gcc               |    0.014 ms |        2 | avg:    0.320 ms | max:    0.627 ms |
       gcc               |    0.000 ms |        2 | avg:    0.185 ms | max:    0.369 ms |
      ...
      -----------------------------------------------------------------------------------
       TOTAL:            |   21.316 ms |       63 |
      ---------------------------------------------
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add runtime stats · ea92ed5a
      Ingo Molnar authored
      Extend the latency tracking structure with scheduling atom
      runtime info - and sum it up during per task display.
      
      (Also clean up a few details.)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Display time in milliseconds, reorganize output · d9340c1d
      Ingo Molnar authored
      After:
      
      -----------------------------------------------------------------------------------
       Task              |  runtime ms | switches | average delay ms | maximum delay ms |
      -----------------------------------------------------------------------------------
       migration/0       |    0.000 ms |        1 | avg:    0.047 ms | max:    0.047 ms |
       ksoftirqd/0       |    0.000 ms |        1 | avg:    0.039 ms | max:    0.039 ms |
       migration/1       |    0.000 ms |        3 | avg:    0.013 ms | max:    0.016 ms |
       migration/3       |    0.000 ms |        2 | avg:    0.003 ms | max:    0.004 ms |
       migration/4       |    0.000 ms |        1 | avg:    0.022 ms | max:    0.022 ms |
       distccd           |    0.000 ms |        1 | avg:    0.004 ms | max:    0.004 ms |
       distccd           |    0.000 ms |        1 | avg:    0.014 ms | max:    0.014 ms |
       distccd           |    0.000 ms |        2 | avg:    0.000 ms | max:    0.000 ms |
       distccd           |    0.000 ms |        2 | avg:    0.012 ms | max:    0.019 ms |
       distccd           |    0.000 ms |        1 | avg:    0.002 ms | max:    0.002 ms |
       as                |    0.000 ms |        2 | avg:    0.019 ms | max:    0.019 ms |
       as                |    0.000 ms |        3 | avg:    0.015 ms | max:    0.017 ms |
       as                |    0.000 ms |        1 | avg:    0.009 ms | max:    0.009 ms |
       perf              |    0.000 ms |        1 | avg:    0.001 ms | max:    0.001 ms |
       gcc               |    0.000 ms |        1 | avg:    0.021 ms | max:    0.021 ms |
       run-mozilla.sh    |    0.000 ms |        2 | avg:    0.010 ms | max:    0.017 ms |
       mozilla-plugin-   |    0.000 ms |        1 | avg:    0.006 ms | max:    0.006 ms |
       gcc               |    0.000 ms |        2 | avg:    0.013 ms | max:    0.013 ms |
      -----------------------------------------------------------------------------------
      
      (The runtime ms column is not filled in yet.)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Clean up latency and replay sub-commands · 46f392c9
      Ingo Molnar authored
       - Separate the latency and the replay commands more cleanly
      
       - Use consistent naming
      
       - Display help page on 'perf sched' outlining commands,
         instead of aborting
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Add sched latency profiling · cdce9d73
      Frederic Weisbecker authored
      Add the -l --latency option that reports statistics about the
      scheduler latencies.
      
      For now, latencies are measured over the following
      sequence:
      
      - task A is sleeping (D or S state)
      - task B wakes up A
               ^
               |
               |
      
         latency timeframe
      
               |
               |
               v
      - task A is scheduled in
      
      Start by recording all scheduler events:
      
      	perf record -e sched:*
      
      and then fetch the results:
      
      	perf sched -l
      
       Tasks                     count          total              avg            max
      
      migration/0                  2             39849            19924           28826
      ksoftirqd/0                  7            756383           108054          373014
      migration/1                  5             45391             9078           10452
      ksoftirqd/1                  2            399055           199527          359130
      events/0                     8           4780110           597513         4500250
      events/1                     9           6353057           705895         2986012
      kblockd/0                   42          37805097           900121         5077684
      
      The snapshots are in nanoseconds.
      
      - Count: number of snapshots taken for the given task
      - Total: total latencies in nanosec
      - Avg  : average of latency between wake up and sched in
      - Max  : max snapshot latency
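      
      The relation between the columns can be checked against the
      table above: the Avg column is consistent with Total divided by
      Count, truncated to an integer. That reading is inferred from
      the printed figures, not from the perf source, and the names
      below are hypothetical.

```python
# (task, count, total_ns, avg_ns, max_ns), copied from the table above
ROWS = [
    ("migration/0", 2, 39849, 19924, 28826),
    ("ksoftirqd/0", 7, 756383, 108054, 373014),
    ("migration/1", 5, 45391, 9078, 10452),
    ("ksoftirqd/1", 2, 399055, 199527, 359130),
    ("events/0", 8, 4780110, 597513, 4500250),
    ("events/1", 9, 6353057, 705895, 2986012),
    ("kblockd/0", 42, 37805097, 900121, 5077684),
]


def average_ns(total_ns, count):
    """Average latency per snapshot, using integer division."""
    return total_ns // count


for task, count, total, avg, _max in ROWS:
    print(f"{task:<12} computed={average_ns(total, count):>7} printed={avg:>7}")
```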
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Make it easier to plug in new sub profilers · 419ab0d6
      Frederic Weisbecker authored
      Create a sched event structure of handlers into which the various
      sched event readers can plug their own callbacks.
      
      This makes it easier to add new perf sched sub-commands.
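      
      The shape of such a handler structure can be sketched as below.
      This is a hypothetical Python analogue, not the C structure from
      the patch; the slot names and the dispatch helper are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SchedHandlers:
    """One optional callback per scheduler event a sub-command cares about."""
    switch_event: Optional[Callable[[dict], None]] = None
    wakeup_event: Optional[Callable[[dict], None]] = None
    fork_event: Optional[Callable[[dict], None]] = None


# Hypothetical mapping from tracepoint name to handler slot.
EVENT_TO_SLOT = {
    "sched_switch": "switch_event",
    "sched_wakeup": "wakeup_event",
    "sched_process_fork": "fork_event",
}


def dispatch(handlers, name, event):
    """Call the plugged-in callback, if any. Returns True when handled."""
    slot = EVENT_TO_SLOT.get(name)
    callback = getattr(handlers, slot) if slot else None
    if callback is None:
        return False
    callback(event)
    return True


# A sub-command that only cares about wakeups plugs in a single callback.
wakeups = []
latency_handlers = SchedHandlers(wakeup_event=wakeups.append)
dispatch(latency_handlers, "sched_wakeup", {"pid": 4569})
dispatch(latency_handlers, "sched_switch", {"pid": 4569})
print(wakeups)
```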
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf sched: Fix bad event alignment · 46538818
      Frederic Weisbecker authored
      perf sched raises the following error when it meets a sched
      switch event:
      
      perf: builtin-sched.c:286: register_pid: Assertion `!(pid >= 65536)' failed.
      Abandon
      
      Currently on x86-64, the sched switch events have a hole in the
      middle of the structure:
      
      	u16 common_type;
      	u8 common_flags;
      	u8 common_preempt_count;
      	u32 common_pid;
      	u32 common_tgid;
      
      	char prev_comm[16];
      	u32 prev_pid;
      	u32 prev_prio;
      			<--- there
      	u64 prev_state;
      	char next_comm[16];
      	u32 next_pid;
      	u32 next_prio;
      
      GCC inserts a 4-byte hole there so that prev_state is u64
      aligned. The events are exported to userspace with this
      hole.
      
      But in userspace, in perf sched, we fetch the event using a
      structure that has a new field at the beginning: u32 size. This
      is because our trace is exported with its size as a field. With
      this new field, the hole in the middle disappears, because
      prev_state now falls on a well-aligned offset.
      
      And since we access the raw trace through a pointer to this
      struct, instead of reading prev_state, we are reading the hole.
      
      We could fix it by keeping the size separate from the struct,
      but there are actually a lot of other potential problems: some
      fields may be saved as long on a 64-bit system and later read
      as long on a 32-bit system. Also, this direct cast doesn't care
      about endianness differences between the traced machine and
      the machine on which we do the post-processing.
      
      So instead of using such dangerous direct casts, fetch the
      values using the trace parsing API that already takes care of
      all these problems.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      46538818
    • perf tools: Allow the specification of all tracepoints at once · bcd3279f
      Frederic Weisbecker authored
      Currently, when one wants to activate every tracepoint
      counter of a subsystem from perf record, the following sequence
      is needed:
      
        perf record -e subsys:ev1 -e subsys:ev2 -e subsys:ev3
      
      This may annoy the most patient of us.
      
      Now we can just do:
      
        perf record -e subsys:*
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      bcd3279f
    • perf sched: Tighten up the code · ad236fd2
      Ingo Molnar authored
      Various small cleanups - removal of debug printks and dead
      functions, etc.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ad236fd2
    • perf sched: Implement the scheduling workload replay engine · fbf94829
      Ingo Molnar authored
      Integrate the schedbench.c bits with the raw trace events
      that we get from the perf machinery, and activate the
      workload replayer/simulator.
      
      Example of a captured 'make -j' workload:
      
      $ perf sched
      
        run measurement overhead: 90 nsecs
        sleep measurement overhead: 2724743 nsecs
        the run test took 1000081 nsecs
        the sleep test took 2981111 nsecs
        version = 0.5
        ...
        nr_run_events:        70
        nr_sleep_events:      66
        nr_wakeup_events:     9
        target-less wakeups:  71
        multi-target wakeups: 47
        run events optimized: 139
        task      0 (                perf:      6607), nr_events: 2
        task      1 (                perf:      6608), nr_events: 6
        task      2 (                    :         0), nr_events: 1
        task      3 (                make:      6609), nr_events: 5
        task      4 (                  sh:      6610), nr_events: 4
        task      5 (                make:      6611), nr_events: 6
        task      6 (                  sh:      6612), nr_events: 4
        task      7 (                make:      6613), nr_events: 5
        task      8 (        migration/11:        25), nr_events: 1
        task      9 (        migration/13:        29), nr_events: 1
        task     10 (        migration/15:        33), nr_events: 1
        task     11 (         migration/9:        21), nr_events: 1
        task     12 (                  sh:      6614), nr_events: 4
        task     13 (                make:      6615), nr_events: 5
        task     14 (                  sh:      6616), nr_events: 4
        task     15 (                make:      6617), nr_events: 7
        task     16 (         migration/3:         9), nr_events: 1
        task     17 (         migration/5:        13), nr_events: 1
        task     18 (         migration/7:        17), nr_events: 1
        task     19 (         migration/1:         5), nr_events: 1
        task     20 (                  sh:      6618), nr_events: 4
        task     21 (                make:      6619), nr_events: 5
        task     22 (                  sh:      6620), nr_events: 4
        task     23 (                make:      6621), nr_events: 10
        task     24 (                  sh:      6623), nr_events: 3
        task     25 (                 gcc:      6624), nr_events: 4
        task     26 (                 gcc:      6625), nr_events: 4
        task     27 (                 gcc:      6626), nr_events: 5
        task     28 (            collect2:      6627), nr_events: 5
        task     29 (                  sh:      6622), nr_events: 1
        task     30 (                make:      6628), nr_events: 7
        task     31 (                  sh:      6630), nr_events: 4
        task     32 (                 gcc:      6631), nr_events: 4
        task     33 (                  sh:      6629), nr_events: 1
        task     34 (                 gcc:      6632), nr_events: 4
        task     35 (                 gcc:      6633), nr_events: 4
        task     36 (            collect2:      6634), nr_events: 4
        task     37 (                make:      6635), nr_events: 8
        task     38 (                  sh:      6637), nr_events: 4
        task     39 (                  sh:      6636), nr_events: 1
        task     40 (                 gcc:      6638), nr_events: 4
        task     41 (                 gcc:      6639), nr_events: 4
        task     42 (                 gcc:      6640), nr_events: 4
        task     43 (            collect2:      6641), nr_events: 4
        task     44 (                make:      6642), nr_events: 6
        task     45 (                  sh:      6643), nr_events: 5
        task     46 (                  sh:      6644), nr_events: 3
        task     47 (                  sh:      6645), nr_events: 4
        task     48 (                make:      6646), nr_events: 6
        task     49 (                  sh:      6647), nr_events: 3
        task     50 (                make:      6648), nr_events: 5
        task     51 (                  sh:      6649), nr_events: 5
        task     52 (                  sh:      6650), nr_events: 6
        task     53 (                make:      6651), nr_events: 4
        task     54 (                make:      6652), nr_events: 5
        task     55 (                make:      6653), nr_events: 4
        task     56 (                make:      6654), nr_events: 4
        task     57 (                make:      6655), nr_events: 5
        task     58 (                  sh:      6656), nr_events: 4
        task     59 (                 gcc:      6657), nr_events: 9
        task     60 (         ksoftirqd/3:        10), nr_events: 1
        task     61 (                 gcc:      6658), nr_events: 4
        task     62 (                make:      6659), nr_events: 5
        task     63 (                  sh:      6660), nr_events: 3
        task     64 (                 gcc:      6661), nr_events: 5
        task     65 (            collect2:      6662), nr_events: 4
        ------------------------------------------------------------
        #1  : 256.745, ravg: 256.74, cpu: 0.00 / 0.00
        #2  : 439.372, ravg: 275.01, cpu: 0.00 / 0.00
        #3  : 411.971, ravg: 288.70, cpu: 0.00 / 0.00
        #4  : 385.500, ravg: 298.38, cpu: 0.00 / 0.00
        #5  : 366.526, ravg: 305.20, cpu: 0.00 / 0.00
        #6  : 381.281, ravg: 312.81, cpu: 0.00 / 0.00
        #7  : 410.756, ravg: 322.60, cpu: 0.00 / 0.00
        #8  : 368.009, ravg: 327.14, cpu: 0.00 / 0.00
        #9  : 408.098, ravg: 335.24, cpu: 0.00 / 0.00
        #10 : 368.582, ravg: 338.57, cpu: 0.00 / 0.00
      
      I.e. we successfully analyzed the trace, replayed it
      via real threads and measured the replayed workload's
      scheduling properties.
      
      This is what it looked like in 'top' output:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        7164 mingo     20   0 1434m 8080  888 R 57.0  0.1   0:02.04 :perf
        7165 mingo     20   0 1434m 8080  888 R 41.8  0.1   0:01.52 :perf
        7228 mingo     20   0 1434m 8080  888 R 39.8  0.1   0:01.44 :gcc
        7225 mingo     20   0 1434m 8080  888 R 33.8  0.1   0:01.26 :gcc
        7202 mingo     20   0 1434m 8080  888 R 31.2  0.1   0:01.16 :sh
        7222 mingo     20   0 1434m 8080  888 R 25.2  0.1   0:00.96 :sh
        7211 mingo     20   0 1434m 8080  888 R 21.9  0.1   0:00.82 :sh
        7213 mingo     20   0 1434m 8080  888 D 19.2  0.1   0:00.74 :sh
        7194 mingo     20   0 1434m 8080  888 D 18.6  0.1   0:00.72 :make
      
      There are still various kinks in it - more patches to come.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      fbf94829
    • perf sched: Import schedbench.c · ec156764
      Ingo Molnar authored
      Import the schedbench.c tool that I wrote some time ago to
      simulate scheduler behavior but never finished. It's a good
      basis for perf sched nevertheless.
      
      Most of its guts are not hooked up to the perf event loop
      yet - that will be done in the patches to come.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ec156764
    • perf: Add 'perf sched' tool · 0a02ad93
      Ingo Molnar authored
      This turn-key tool allows scheduler measurements to be
      conducted and the results to be displayed numerically.
      
      First baby step towards that goal: clone the new command off of
      perf trace.
      
      Fix a few other details along the way:
      
       - add (minimal) perf trace documentation
      
       - reorder a few places
      
       - list perf trace in the mainporcelain list as well
         as it's a very useful utility.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0a02ad93
  4. 11 Sep, 2009 9 commits
    • Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 · 86d71014
      Linus Torvalds authored
      * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (87 commits)
        NFSv4: Disallow 'mount -t nfs4 -overs=2' and 'mount -t nfs4 -overs=3'
        NFS: Allow the "nfs" file system type to support NFSv4
        NFS: Move details of nfs4_get_sb() to a helper
        NFS: Refactor NFSv4 text-based mount option validation
        NFS: Mount option parser should detect missing "port="
        NFS: out of date comment regarding O_EXCL above nfs3_proc_create()
        NFS: Handle a zero-length auth flavor list
        SUNRPC: Ensure that sunrpc gets initialised before nfs, lockd, etc...
        nfs: fix compile error in rpc_pipefs.h
        nfs: Remove reference to generic_osync_inode from a comment
        SUNRPC: cache must take a reference to the cache detail's module on open()
        NFS: Use the DNS resolver in the mount code.
        NFS: Add a dns resolver for use with NFSv4 referrals and migration
        SUNRPC: Fix a typo in cache_pipefs_files
        nfs: nfs4xdr: optimize low level decoding
        nfs: nfs4xdr: get rid of READ_BUF
        nfs: nfs4xdr: simplify decode_exchange_id by reusing decode_opaque_inline
        nfs: nfs4xdr: get rid of COPYMEM
        nfs: nfs4xdr: introduce decode_sessionid helper
        nfs: nfs4xdr: introduce decode_verifier helper
        ...
      86d71014
    • Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev · 86373435
      Linus Torvalds authored
      * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (25 commits)
        pata_rz1000: use printk_once
        ahci: kill @force_restart and refine CLO for ahci_kick_engine()
        pata_cs5535: add pci id for AMD based CS5535 controllers
        ahci: Add AMD SB900 SATA/IDE controller device IDs
        drivers/ata: use resource_size
        sata_fsl: Defer non-ncq commands when ncq commands active
        libata: add SATA PMP revision information for spec 1.2
        libata: fix off-by-one error in ata_tf_read_block()
        ahci: Gigabyte GA-MA69VM-S2 can't do 64bit DMA
        ahci: make ahci_asus_m2a_vm_32bit_only() quirk more generic
        dmi: extend dmi_get_year() to dmi_get_date()
        dmi: fix date handling in dmi_get_year()
        libata: unbreak TPM filtering by reorganizing ata_scsi_pass_thru()
        sata_sis: convert to slave_link
        sata_sil24: always set protocol override for non-ATAPI data commands
        libata: Export AHCI capabilities
        libata: Delegate nonrot flag setting to SCSI
        [libata] Add pata_rdc driver for RDC ATA devices
        drivers/ata: Remove unnecessary semicolons
        libata: remove spindown skipping and warning
        ...
      86373435
    • Merge branch 'tracing-core-for-linus' of... · 483e3cd6
      Linus Torvalds authored
      Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (105 commits)
        ring-buffer: only enable ring_buffer_swap_cpu when needed
        ring-buffer: check for swapped buffers in start of committing
        tracing: report error in trace if we fail to swap latency buffer
        tracing: add trace_array_printk for internal tracers to use
        tracing: pass around ring buffer instead of tracer
        tracing: make tracing_reset safe for external use
        tracing: use timestamp to determine start of latency traces
        tracing: Remove mentioning of legacy latency_trace file from documentation
        tracing/filters: Defer pred allocation, fix memory leak
        tracing: remove users of tracing_reset
        tracing: disable buffers and synchronize_sched before resetting
        tracing: disable update max tracer while reading trace
        tracing: print out start and stop in latency traces
        ring-buffer: disable all cpu buffers when one finds a problem
        ring-buffer: do not count discarded events
        ring-buffer: remove ring_buffer_event_discard
        ring-buffer: fix ring_buffer_read crossing pages
        ring-buffer: remove unnecessary cpu_relax
        ring-buffer: do not swap buffers during a commit
        ring-buffer: do not reset while in a commit
        ...
      483e3cd6
    • Merge branch 'sched-core-for-linus' of... · 774a694f
      Linus Torvalds authored
      Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits)
        sched: Fix sched::sched_stat_wait tracepoint field
        sched: Disable NEW_FAIR_SLEEPERS for now
        sched: Keep kthreads at default priority
        sched: Re-tune the scheduler latency defaults to decrease worst-case latencies
        sched: Turn off child_runs_first
        sched: Ensure that a child can't gain time over it's parent after fork()
        sched: enable SD_WAKE_IDLE
        sched: Deal with low-load in wake_affine()
        sched: Remove short cut from select_task_rq_fair()
        sched: Turn on SD_BALANCE_NEWIDLE
        sched: Clean up topology.h
        sched: Fix dynamic power-balancing crash
        sched: Remove reciprocal for cpu_power
        sched: Try to deal with low capacity, fix update_sd_power_savings_stats()
        sched: Try to deal with low capacity
        sched: Scale down cpu_power due to RT tasks
        sched: Implement dynamic cpu_power
        sched: Add smt_gain
        sched: Update the cpu_power sum during load-balance
        sched: Add SD_PREFER_SIBLING
        ...
      774a694f
    • Merge branch 'perfcounters-core-for-linus' of... · 4f0ac854
      Linus Torvalds authored
      Merge branch 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
        perf tools: Avoid unnecessary work in directory lookups
        perf stat: Clean up statistics calculations a bit more
        perf stat: More advanced variance computation
        perf stat: Use stddev_mean in stead of stddev
        perf stat: Remove the limit on repeat
        perf stat: Change noise calculation to use stddev
        x86, perf_counter, bts: Do not allow kernel BTS tracing for now
        x86, perf_counter, bts: Correct pointer-to-u64 casts
        x86, perf_counter, bts: Fail if BTS is not available
        perf_counter: Fix output-sharing error path
        perf trace: Fix read_string()
        perf trace: Print out in nanoseconds
        perf tools: Seek to the end of the header area
        perf trace: Fix parsing of perf.data
        perf trace: Sample timestamps as well
        perf_counter: Introduce new (non-)paranoia level to allow raw tracepoint access
        perf trace: Sample the CPU too
        perf tools: Work around strict aliasing related warnings
        perf tools: Clean up warnings list in the Makefile
        perf tools: Complete support for dynamic strings
        ...
      4f0ac854
    • Merge branch 'oprofile-for-linus' of... · b9356c53
      Linus Torvalds authored
      Merge branch 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (55 commits)
        arch/x86/oprofile/op_model_amd.c: fix op_amd_handle_ibs() return type
        Revert "x86: oprofile/op_model_amd.c set return values for op_amd_handle_ibs()"
        x86/oprofile: Small coding style fixes
        x86/oprofile: Add counter reservation check for virtual counters
        x86/oprofile: Implement op_x86_virt_to_phys()
        oprofile: Adding switch counter to oprofile statistic variables
        x86/oprofile: Implement mux_clone()
        x86/oprofile: Enable multiplexing only if the model supports it
        x86/oprofile: Add function has_mux() to check multiplexing support
        x86/oprofile: Modify initialization of num_virt_counters
        x86/oprofile: Remove unused num_virt_controls from struct op_x86_model_spec
        x86/oprofile: Remove const qualifier from struct op_x86_model_spec
        x86/oprofile: Moving nmi_cpu_switch() in nmi_int.c
        x86/oprofile: Moving nmi_cpu_save/restore_mpx_registers() in nmi_int.c
        x86/oprofile: Moving nmi_setup_cpu_mux() in nmi_int.c
        x86/oprofile: Implement multiplexing setup/shutdown functions
        oprofile: Grouping multiplexing code in op_model_amd.c
        oprofile: Introduce op_x86_phys_to_virt()
        oprofile: Grouping multiplexing code in oprof.c
        oprofile: Remove oprofile_multiplexing_init()
        ...
      b9356c53
    • Merge branch 'irq-threaded-for-linus' of... · d90a7e86
      Linus Torvalds authored
      Merge branch 'irq-threaded-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'irq-threaded-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        genirq: Do not mask oneshot edge type interrupts
        genirq: Support nested threaded irq handling
        genirq: Add buslock support
        genirq: Add oneshot support
      d90a7e86
    • Merge branch 'irq-core-for-linus' of... · 12a49961
      Linus Torvalds authored
      Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        pci/intr_remapping: Allocate irq_iommu on node
        irq: Add irq_node() primitive
        irq: Make sure irq_desc for legacy irq get correct node setting
        genirq: Add prototype for handle_nested_irq()
        irq: Remove superfluous NULL pointer check in check_irq_resend()
        irq: Clean up by removing irqfixup MODULE_PARM_DESC()
        genirq: Fix comment describing suspend_device_irqs()
        genirq: Remove obsolete defines and typedefs
      12a49961
    • Merge branch 'core-rcu-for-linus' of... · eee2775d
      Linus Torvalds authored
      Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
        rcu: Move end of special early-boot RCU operation earlier
        rcu: Changes from reviews: avoid casts, fix/add warnings, improve comments
        rcu: Create rcutree plugins to handle hotplug CPU for multi-level trees
        rcu: Remove lockdep annotations from RCU's _notrace() API members
        rcu: Add #ifdef to suppress __rcu_offline_cpu() warning in !HOTPLUG_CPU builds
        rcu: Add CPU-offline processing for single-node configurations
        rcu: Add "notrace" to RCU function headers used by ftrace
        rcu: Remove CONFIG_PREEMPT_RCU
        rcu: Merge preemptable-RCU functionality into hierarchical RCU
        rcu: Simplify rcu_pending()/rcu_check_callbacks() API
        rcu: Use debugfs_remove_recursive() simplify code.
        rcu: Merge per-RCU-flavor initialization into pre-existing macro
        rcu: Fix online/offline indication for rcudata.csv trace file
        rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h
        rcu: Renamings to increase RCU clarity
        rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h
        rcu: Expunge lingering references to CONFIG_CLASSIC_RCU, optimize on !SMP
        rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation.
        rcu: Fix typo in rcu_irq_exit() comment header
        rcu: Make rcupreempt_trace.c look at offline CPUs
        ...
      eee2775d