1. 07 May, 2009 8 commits
    • Merge branch 'tracing/hw-branch-tracing' into tracing/core · 0ad5d703
      Ingo Molnar authored
      Merge reason: this topic is ready for upstream now. It passed
                    Oleg's review and Andrew had no further mm/*
                    objections/observations either.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'linus' into tracing/core · 44347d94
      Ingo Molnar authored
      Merge reason: tracing/core was on a .30-rc1 base and was missing out on
                    a handful of tracing fixes present in .30-rc5-almost.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: fix concurrent access to ftrace_events list, fix · d94fc523
      Li Zefan authored
      In filter_add_subsystem_pred() we should release event_mutex before
      calling filter_free_subsystem_preds(), since both functions hold
      event_mutex.
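
      The locking rule described above can be sketched in user space. This is
      only an illustration, not the kernel code: toy_mutex, add_subsystem_pred()
      and free_subsystem_preds() are hypothetical stand-ins, and the toy lock
      asserts instead of blocking so that a double acquisition is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive mutex: taking it twice is a bug. */
struct toy_mutex { bool held; };

static void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = true; }
static void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = false; }

static struct toy_mutex event_mutex;  /* hypothetical stand-in for the real lock */
static int preds_freed;               /* counts calls to the free helper */

/* Helper that takes the lock itself, like the free routine in the commit. */
static void free_subsystem_preds(void)
{
    toy_lock(&event_mutex);
    preds_freed++;
    toy_unlock(&event_mutex);
}

/* Returns 0 on success, -1 if the predicate is invalid. */
static int add_subsystem_pred(bool pred_is_valid)
{
    toy_lock(&event_mutex);
    if (!pred_is_valid) {
        /* Drop the lock first: the helper acquires it again, so calling
         * it with the lock held would self-deadlock. */
        toy_unlock(&event_mutex);
        free_subsystem_preds();
        return -1;
    }
    toy_unlock(&event_mutex);
    return 0;
}
```

      With a real blocking mutex, calling the helper before the unlock would
      hang instead of tripping the assertion.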
      
      [ Impact: fix deadlock when writing invalid pred into subsystem filter ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: tzanussi@gmail.com
      Cc: a.p.zijlstra@chello.nl
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      LKML-Reference: <4A028993.7020509@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: support for operator reserved characters in strings · 5928c3cc
      Frederic Weisbecker authored
      When we set a filter for an event, such as:
      
      echo "name == my_lock_name" > \
      	/debug/tracing/events/lockdep/lock_acquired/filter
      
      then the token types are parsed in the following order:
      
      - space
      - operator
      - parentheses
      - operand
      
      Because the operators and parentheses have a higher precedence
      than the operand characters, which is normal, we can't
      use any string containing such special characters:
      
      ()=<>!&|
      
      To support this and also avoid ambiguous interpretation by
      the parser or the human, we use double quotes, which
      follows the usual conventions of programming languages.
      
      After this patch you can still declare a string condition as
      before:
      
      echo name == myname
      
      But if you want to compare against a string containing an operator
      character, you can use double quotes:
      
      echo 'name == "&myname"'
      
      Don't forget to enclose the whole expression in single quotes, or
      the double quotes will be consumed by the shell before echo sees them.
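
      A minimal tokenizer sketch of the quoting rule described above. It is not
      the kernel parser: next_token(), the tok enum and the helper names are
      hypothetical, and the only behaviour it illustrates is that a
      double-quoted run is taken verbatim as an operand, even when it contains
      operator characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum tok { TOK_END, TOK_OPERATOR, TOK_PAREN, TOK_OPERAND };

static int is_op_char(char c) { return c != '\0' && strchr("=<>!&|", c) != NULL; }
static int is_paren(char c)   { return c == '(' || c == ')'; }

/* Reads one token starting at *pos, copies its text into out
 * (truncated to outsz - 1 bytes) and advances *pos past it. */
static enum tok next_token(const char **pos, char *out, size_t outsz)
{
    const char *p = *pos;
    size_t n = 0;
    enum tok type;

    assert(outsz > 0);
    while (*p == ' ')                       /* spaces only separate tokens */
        p++;

    if (*p == '\0') {
        type = TOK_END;
    } else if (*p == '"') {
        /* Quoted operand: everything up to the closing quote is literal. */
        p++;
        while (*p != '\0' && *p != '"') {
            if (n + 1 < outsz)
                out[n++] = *p;
            p++;
        }
        if (*p == '"')
            p++;
        type = TOK_OPERAND;
    } else if (is_paren(*p)) {
        if (n + 1 < outsz)
            out[n++] = *p;
        p++;
        type = TOK_PAREN;
    } else if (is_op_char(*p)) {
        while (is_op_char(*p)) {
            if (n + 1 < outsz)
                out[n++] = *p;
            p++;
        }
        type = TOK_OPERATOR;
    } else {
        /* Bare operand: ends at a space, quote, parenthesis or operator. */
        while (*p != '\0' && *p != ' ' && *p != '"' &&
               !is_paren(*p) && !is_op_char(*p)) {
            if (n + 1 < outsz)
                out[n++] = *p;
            p++;
        }
        type = TOK_OPERAND;
    }
    out[n] = '\0';
    *pos = p;
    return type;
}
```

      Fed the string name == "&myname", this sketch yields the operand name,
      the operator ==, and the operand &myname.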
      
      [ Impact: support strings with special characters for tracing filters ]
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • tracing/filters: support for filters of dynamic sized arrays · e8808c10
      Frederic Weisbecker authored
      Currently the filtering infrastructure supports numeric types
      and fixed-size array types well.
      
      But the recently added __string() field uses a specific
      indirect offset mechanism which requires a specific
      predicate. Until now it wasn't supported.
      
      This patch adds that support with very few changes: only a new
      predicate is needed, and the management of this specific field
      can be done through the usual string helpers in the filtering
      infrastructure.
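
      A sketch of what a predicate for an indirectly located string might look
      like. The layout is an assumption made for illustration: the field at
      field_offset is taken to hold the byte offset of the string within the
      record. The names struct pred and pred_strloc_match() are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical predicate: compare a dynamically placed string field. */
struct pred {
    int field_offset;     /* where the string's location is stored */
    const char *str_val;  /* string to compare against */
    size_t str_len;       /* number of bytes to compare */
    int negate;           /* 1 for "!=", 0 for "==" */
};

/* Returns 1 if the record matches the predicate, 0 otherwise. */
static int pred_strloc_match(const struct pred *pred, const void *event)
{
    const char *base = event;
    int str_loc;

    assert(pred != NULL && event != NULL);
    /* First hop: read the offset stored in the fixed part of the record. */
    memcpy(&str_loc, base + pred->field_offset, sizeof(str_loc));
    /* Second hop: the string itself lives at that offset. */
    return (strncmp(base + str_loc, pred->str_val, pred->str_len) == 0)
           ^ pred->negate;
}
```

      Apart from the extra indirection, the comparison is the same as for a
      fixed-size array, which is why only a new predicate is needed.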
      
      [ Impact: support all kinds of strings in the tracing filters ]
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • tracing: add hierarchical enabling of events · 8ae79a13
      Steven Rostedt authored
      With the current event directory, you can only enable individual events.
      The file debugfs/tracing/set_event is used to be able to enable or
      disable several events at once. But that can still be awkward.
      
      This patch adds hierarchical enabling of events. That is, each directory
      in debugfs/tracing/events has an "enable" file. This file can enable
      or disable all events within the directory and below.
      
       # echo 1 > /debugfs/tracing/events/enable
      
      will enable all events.
      
       # echo 1 > /debugfs/tracing/events/sched/enable
      
      will enable all events in the sched subsystem.
      
       # echo 1 > /debugfs/tracing/events/enable
       # echo 0 > /debugfs/tracing/events/irq/enable
      
      will enable all events, but then disable just the irq subsystem events.
      
      When reading one of these enable files, there are four results:
      
       0 - all events this file affects are disabled
       1 - all events this file affects are enabled
       X - there is a mixture of events enabled and disabled
       ? - this file does not affect any event
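
      The four read results listed above can be derived from the per-event
      states as in the following sketch. enable_state() is a hypothetical
      helper for illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Summarise n event states (non-zero = enabled) as '0', '1', 'X' or '?'. */
static char enable_state(const int *enabled, size_t n)
{
    int seen_on = 0, seen_off = 0;

    assert(enabled != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (enabled[i])
            seen_on = 1;
        else
            seen_off = 1;
    }
    if (seen_on && seen_off)
        return 'X';   /* mixture of enabled and disabled events */
    if (seen_on)
        return '1';   /* all affected events are enabled */
    if (seen_off)
        return '0';   /* all affected events are disabled */
    return '?';       /* the file affects no event at all */
}
```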
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: reset ring buffer when removing modules with events · 9456f0fa
      Steven Rostedt authored
      Li Zefan found that there's a race using the event ids of events and
      modules. When a module is loaded, an event id is incremented. We only
      have 16 bits for event ids (65536) and there is a possible (but highly
      unlikely) race that we could load and unload a module that registers
      events so many times that the event id counter overflows.
      
      When it overflows, it then restarts and goes looking for available
      ids. An id is available if it was added by a module and released.
      
      The race occurs when one module adds an id and is then removed.
      Another module loaded afterwards can use that same event id. But if the
      old module still had events in the ring buffer, the new module's callback
      would get bogus data. At best (and most likely) the output would just be
      garbage. But if the module for some reason used pointers (not recommended)
      then this could potentially crash.
      
      The safest thing to do is just reset the ring buffer if a module that
      registered events is removed.
      
      [ Impact: prevent unpredictable results of event id overflows ]
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <49FEAFD0.30106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: update sample with TRACE_INCLUDE_FILE · 71e1c8ac
      Steven Rostedt authored
      When creating trace events for ftrace, the header file with the TRACE_EVENT
      macros must also have a macro called TRACE_SYSTEM. This macro describes
      the name of the system the TRACE_EVENTS are defined for. It also doubles
      as a way for the define_trace.h file to include the file that included
      it.
      
      For example:
      
      in irq.h
      
       #define TRACE_SYSTEM irq
      
      [...]
      
       #include <trace/define_trace.h>
      
      The define_trace.h file will use TRACE_SYSTEM to include irq.h. But if the
      name of the trace system does not match the name of the trace header file,
      one can override it with:
      
       #define TRACE_INCLUDE_FILE foo_trace
      
      which will change define_trace.h to include foo_trace.h instead of foo.h.
      
      The sample comments this, but people that use the sample code will more
      likely use the code and not read the comments. This patch changes the
      sample code to use the TRACE_INCLUDE_FILE to better show developers how to
      use it.
      
      [ Impact: make sample less confusing to developers ]
      Reported-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 06 May, 2009 16 commits
    • ring-buffer: change test to be more latency friendly · 3e07a4f6
      Steven Rostedt authored
      The ring buffer benchmark/test runs a producer for 10 seconds.
      This is done with preemption and interrupts enabled. But if the kernel
      is not compiled with CONFIG_PREEMPT, it basically stops everything
      but interrupts for 10 seconds.
      
      Although this is just a test and is not for production, this attribute
      can be quite annoying. It can also spawn badness elsewhere.
      
      This patch solves the issues by calling "cond_resched" when the system
      is not compiled with CONFIG_PREEMPT. It also keeps track of the time
      spent to call cond_resched such that it does not go against the
      time calculations. That is, if the task schedules away, the time scheduled
      out is removed from the test data. Note, this only works for non PREEMPT
      because we do not know when the task is scheduled out if we have PREEMPT
      enabled.
      
      [ Impact: prevent test from stopping the world for 10 seconds ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: make moving the tail page a separate function · 6634ff26
      Steven Rostedt authored
      Ingo Molnar thought the code would be cleaner if we used a function call
      instead of a goto for moving the tail page. After implementing this,
      it seems that gcc still inlines the result and the output is pretty much
      the same. Since this is considered a cleaner approach, might as well
      implement it.
      
      [ Impact: code clean up ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: check for failed allocation in ring buffer benchmark · 00c81a58
      Steven Rostedt authored
      The ring buffer benchmark allocates a ring buffer read page but
      does not check the return value to see if a page was actually
      allocated. This patch fixes that.
      
      [ Impact: avoid NULL dereference ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: remove unneeded conditional in rb_reserve_next · 8e7abf1c
      Steven Rostedt authored
      The code in __rb_reserve_next checks on page overflow whether it is the
      original committer and then resets the page back to the original
      setting. Although this is fine, and the code is correct, it is
      a bit fragile. Some experimental work I did breaks it easily.
      
      The better and more robust solution is to have all committers that
      overflow the page simply subtract what they added.
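
      The "subtract what you added" rule can be sketched with C11 atomics.
      This is a user-space illustration under stated assumptions, not the
      ring buffer code: reserve() and PAGE_DATA_SIZE are hypothetical names,
      and the page is reduced to a single write counter.

```c
#include <assert.h>
#include <stdatomic.h>

#define PAGE_DATA_SIZE 4096L  /* hypothetical usable bytes per page */

/* Try to reserve length bytes on the page. Returns the start offset of
 * the reservation, or -1 if it would run past the end of the page. */
static long reserve(atomic_long *write, long length)
{
    long tail;

    assert(length > 0 && length <= PAGE_DATA_SIZE);
    /* Claim the space unconditionally; tail is the value before the add. */
    tail = atomic_fetch_add(write, length);
    if (tail + length > PAGE_DATA_SIZE) {
        /* Overflow: every writer undoes exactly its own addition, with
         * no special case for whoever happened to overflow first. */
        atomic_fetch_sub(write, length);
        return -1;
    }
    return tail;
}
```

      Because each writer removes only its own contribution, the counter ends
      up correct regardless of the order in which overflowing writers back out.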
      
      [ Impact: more robust ring buffer account management ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: small trave_events sample Makefile cleanup · 35cf723e
      Christoph Hellwig authored
      Use -I$(src) to add the current directory to the include path.
      
      [ Impact: cleanup ]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: trace_output.c, fix false positive compiler warning · 48dd0fed
      Jaswinder Singh Rajput authored
      This compiler warning:
      
        CC      kernel/trace/trace_output.o
       kernel/trace/trace_output.c: In function ‘register_ftrace_event’:
       kernel/trace/trace_output.c:544: warning: ‘list’ may be used uninitialized in this function
      
      Is wrong as 'list' is always initialized - but GCC (4.3.2) does not
      recognize this relationship properly.
      
      Work around the warning by initializing the variable to NULL.
      
      [ Impact: fix false positive compiler warning ]
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • blktrace: from-sector redundant in trace_block_remap · 22a7c31a
      Alan D. Brunelle authored
      Remove redundant from-sector parameter: it's /always/ the bio's sector
      passed in.
      
      [ Impact: cleanup ]
      Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49FF517C.7000503@hp.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • blktrace: correct remap names · a42aaa3b
      Alan D. Brunelle authored
      This attempts to clarify names utilized during block I/O remap
      operations (partition, volume manager). It correctly matches up the
      /from/ information for both device & sector. This takes in the concept
      from Kosaki Motohiro and extends it to include better naming for the
      "device_from" field.
      
      [ Impact: cleanup ]
      Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49FF4FAE.3000301@hp.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracepoint: trace_sched_migrate_task(): remove parameter · de1d7286
      Mathieu Desnoyers authored
      The orig_cpu parameter in trace_sched_migrate_task() is not necessary;
      it can be obtained by calling task_cpu(p) in the probe.
      
      [ Impact: micro-optimization ]
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      [ modified from Mathieu's patch. The original patch is at:
        http://marc.info/?l=linux-kernel&m=123791201716239&w=2 ]
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: zhaolei@cn.fujitsu.com
      Cc: laijs@cn.fujitsu.com
      LKML-Reference: <49FFFDB7.1050402@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: fix concurrent access to ftrace_events list · 20c8928a
      Li Zefan authored
      A module will add/remove its trace events when it gets loaded/unloaded, so
      the ftrace_events list is not "const", and concurrent access needs to be
      protected.
      
      This patch thus fixes races between loading/unloading modules and reading
      'available_events' or reading/writing 'set_event', etc.
      
      Below shows how to reproduce the race:
      
       # for ((; ;)) { cat /mnt/tracing/available_events; } > /dev/null &
       # for ((; ;)) { insmod trace-events-sample.ko; rmmod sample; } &
      
      After a while:
      
      BUG: unable to handle kernel paging request at 0010011c
      IP: [<c1080f27>] t_next+0x1b/0x2d
      ...
      Call Trace:
       [<c10c90e6>] ? seq_read+0x217/0x30d
       [<c10c8ecf>] ? seq_read+0x0/0x30d
       [<c10b4c19>] ? vfs_read+0x8f/0x136
       [<c10b4fc3>] ? sys_read+0x40/0x65
       [<c1002a68>] ? sysenter_do_call+0x12/0x36
      
      [ Impact: fix races when concurrent accessing ftrace_events list ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4A00F709.3080800@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: fix memory leak when unloading module · 2df75e41
      Li Zefan authored
      When unloading a module, memory allocated by init_preds() and
      trace_define_field() is not freed.
      
      [ Impact: fix memory leak ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4A00F6E0.3040503@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: make SAMPLE_TRACE_EVENTS default to n · 96d17980
      Li Zefan authored
      Normally a config option should default to n. This patch also makes the
      sample module-only, like SAMPLE_MARKERS and SAMPLE_TRACEPOINTS.
      
      [ Impact: don't build trace event sample by default ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A00F6C0.8090803@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: don't say hi when loading the trace event sample · fd6da10a
      Li Zefan authored
      The sample is useful for testing, and I'm using it. But after
      loading the module, it keeps saying hi every 10 seconds, this may
      be disturbing.
      
      Also Steven said commenting out the "hi" helped in causing races. :)
      
      [ Impact: make testing a bit easier ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A00F6AD.2070008@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: add benchmark and tester · 5092dbc9
      Steven Rostedt authored
      This patch adds code that can benchmark the ring buffer as well as
      test it. This code can be compiled into the kernel (not recommended)
      or as a module.
      
      A separate ring buffer is used so as not to interfere with other users,
      like ftrace. It creates a producer and a consumer (option to disable
      creation of the consumer) and will run for 10 seconds, then sleep for
      10 seconds and then repeat.
      
      While running, the producer will write 10-byte payloads into the ring
      buffer, containing just the current CPU number. The reader will
      continually try to read the buffer. The reader will alternate between
      reading the buffer event by event and reading it by full pages.
      
      The output is a pr_info, thus it will fill up the syslogs.
      
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9000349 (usecs)
        Overruns: 12578640
        Read:     5358440  (by events)
        Entries:  0
        Total:    17937080
        Missed:   0
        Hit:      17937080
        Entries per millisec: 1993
        501 ns per entry
        Sleeping for 10 secs
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9936350 (usecs)
        Overruns: 0
        Read:     28146644  (by pages)
        Entries:  74
        Total:    28146718
        Missed:   0
        Hit:      28146718
        Entries per millisec: 2832
        353 ns per entry
        Sleeping for 10 secs
      
      Time:      is the time the test ran
      Overruns:  the number of events that were overwritten and not read
      Read:      the number of events read (either by pages or events)
      Entries:   the number of entries left in the buffer
                       (the by pages will only read full pages)
      Total:     Entries + Read + Overruns
      Missed:    the number of entries that failed to write
      Hit:       the number of entries that were written
      
      The above example shows that it takes ~353 nanosecs per entry when
      there is a reader, reading by pages (and no overruns)
      
      The event by event reader slowed the producer down to 501 nanosecs.
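
      The figures in the sample output are consistent with plain integer
      arithmetic, as the following sketch shows. How the benchmark actually
      computes them is an assumption here. The struct and function names are
      hypothetical, and only the numbers come from the output above.

```c
#include <assert.h>

/* One benchmark report, using the field names from the output above. */
struct bench {
    unsigned long long time_usecs, overruns, read, entries, missed, hit;
};

/* Total: Entries + Read + Overruns */
static unsigned long long bench_total(const struct bench *b)
{
    return b->entries + b->read + b->overruns;
}

/* Entries per millisec: hits divided by whole milliseconds of run time. */
static unsigned long long entries_per_msec(const struct bench *b)
{
    unsigned long long msecs = b->time_usecs / 1000;

    return msecs ? b->hit / msecs : 0;
}

/* ns per entry: one millisecond (1,000,000 ns) divided by the rate. */
static unsigned long long ns_per_entry(const struct bench *b)
{
    unsigned long long rate = entries_per_msec(b);

    return rate ? 1000000ULL / rate : 0;
}
```

      For the event-by-event run this gives 17937080 / 9000 = 1993 entries
      per millisecond and 1000000 / 1993 = 501 ns per entry. For the by-pages
      run it gives 28146718 / 9936 = 2832 and 1000000 / 2832 = 353.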
      
      [ Impact: see how changes to the ring buffer affect stability and performance ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: move big if statement down · aa20ae84
      Steven Rostedt authored
      In the hot path of the ring buffer "__rb_reserve_next" there's a big
      if statement that does not even return back to the work flow.
      
      	code;
      
      	if (cross to next page) {
      
      		[ lots of code ]
      
      		return;
      	}
      
      	more code;
      
      The condition is even the unlikely path, although we do not denote it
      with an unlikely because gcc is fine with it. The condition is true when
      the write crosses a page boundary, and we need to start at a new page.
      
      Having this if statement makes it hard to read, but calling another
      function to do the work is also not appropriate, because we are using a lot
      of variables that were set before the if statement, and we do not want to
      send them as parameters.
      
      This patch changes it to a goto:
      
      	code;
      
      	if (cross to next page)
      		goto next_page;
      
      	more code;
      
      	return;
      
      next_page:
      
      	[ lots of code]
      
      This makes the code easier to understand, and a bit more obvious.
      
      The output from gcc is practically identical. For some reason, gcc decided
      to use different registers when I switched it to a goto. But other than that,
      the logic is the same.
      
      [ Impact: easier to read code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · 413f81eb
      Linus Torvalds authored
      * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
        drm/r128: fix r128 ioremaps to use ioremap_wc.
        drm: cleanup properly in drm_get_dev() failure paths
        drm: clean the map list before destroying the hash table
        drm: remove unreachable code in drm_sysfs.c
        drm: add control node checks missing from kms merge
        drm/kms: don't try to shortcut drm mode set function
        drm/radeon: bump minor version for occlusion queries support
  3. 05 May, 2009 16 commits