1. 14 Oct, 2008 40 commits
    • tracing/fastboot: get the initcall name before it disappears · 5601020f
      Frederic Weisbecker authored
      After some initcall traces, some initcall names may be inconsistent.
      That is because these functions are discarded with the .init section,
      and their names disappear from the symbol table as well.

      So we have to copy the function name into a sufficiently large buffer
      while appending the trace. This is not costly for the ring_buffer because
      the number of initcall entries is usually small.
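
A minimal userspace sketch of the idea above, not the patch itself. The struct `boot_entry`, the helper `boot_entry_set_name()` and the 64-byte size are all hypothetical names and values chosen for illustration. The point is that the entry keeps its own copy of the name, so it stays readable after the original string is gone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size, chosen for illustration only. */
#define BOOT_NAME_LEN 64

/* Hypothetical trace entry that owns a private copy of the name. */
struct boot_entry {
    char func[BOOT_NAME_LEN];
};

/* Copy the symbol name into the entry instead of storing a pointer to it. */
static void boot_entry_set_name(struct boot_entry *entry, const char *sym)
{
    assert(entry != NULL && sym != NULL);
    /* snprintf() truncates if needed and always NUL-terminates the buffer. */
    snprintf(entry->func, sizeof(entry->func), "%s", sym);
}
```

Storing a pointer instead of a copy would be cheaper, but the pointer would dangle once the source string is released, which is the failure the commit describes.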
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/fastboot: change the printing of boot tracer according to bootgraph.pl · cb5ab742
      Frederic Weisbecker authored
      Change the boot tracer output so that it can be parsed by
      the scripts/bootgraph.pl script.

      We now have to output two lines for each initcall, matching the
      printk calls in do_one_initcall() in init/main.c.
      We now need both the call time and the return time.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: fix build error · 77ae11f6
      Ingo Molnar authored
      fix:
      
       kernel/trace/ring_buffer.c: In function ‘rb_allocate_pages’:
       kernel/trace/ring_buffer.c:235: error: ‘cpu’ undeclared (first use in this function)
       kernel/trace/ring_buffer.c:235: error: (Each undeclared identifier is reported only once
       kernel/trace/ring_buffer.c:235: error: for each function it appears in.)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: preempt disable over interrupt disable · 38697053
      Steven Rostedt authored
      With the new ring buffer infrastructure in ftrace, I'm trying to make
      ftrace a little more lightweight.

      This patch converts a lot of the local_irq_save/restore into
      preempt_disable/enable.  The original preempt count in a lot of cases
      has to be sent in as a parameter so that it can be recorded correctly.
      Some places were recording it incorrectly before anyway.

      This is also laying the groundwork to make ftrace a little bit
      more reentrant, and remove all locking. The function tracers must
      still protect from reentrancy.

      Note: All the function tracers must be careful when using preempt_disable.
        They must do the following:
      
        resched = need_resched();
        preempt_disable_notrace();
        [...]
        if (resched)
      	preempt_enable_no_resched_notrace();
        else
      	preempt_enable_notrace();
      
      The reason is that if this function traces schedule() itself, the
      preempt_enable_notrace() will cause a schedule, which will lead
      us into a recursive failure.
      
      If we needed to reschedule before calling preempt_disable, we
      should have already scheduled. Since we did not, this most likely
      means that we should not, and that we are probably inside a schedule
      function.

      If resched was not set, we still need to catch the need resched
      flag being set while preemption was off, and the else branch at the
      end will catch that for us.
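
The reasoning above can be followed in a toy userspace model. This is only a sketch: the functions below are stand-ins that reuse the kernel names but model preemption with plain counters, and `tracer_callback()` and its `flag_set_meanwhile` parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel state; everything here is a model, not kernel code. */
static int preempt_count;
static bool resched_flag;
static int schedule_calls;

static bool need_resched(void) { return resched_flag; }
static void preempt_disable_notrace(void) { preempt_count++; }
static void preempt_enable_no_resched_notrace(void) { preempt_count--; }

/* Re-enabling preemption with the flag set triggers a (modelled) schedule. */
static void preempt_enable_notrace(void)
{
    if (--preempt_count == 0 && resched_flag) {
        resched_flag = false;
        schedule_calls++;
    }
}

/*
 * The pattern from the note. flag_set_meanwhile models an interrupt that
 * requests a reschedule while preemption is off.
 */
static void tracer_callback(bool flag_set_meanwhile)
{
    bool resched = need_resched();

    preempt_disable_notrace();
    assert(preempt_count > 0);      /* the trace would be recorded here */
    if (flag_set_meanwhile)
        resched_flag = true;
    if (resched)
        preempt_enable_no_resched_notrace();
    else
        preempt_enable_notrace();
}
```

In this model, a callback entered with the flag already set never calls the scheduler, while a flag raised during the callback is still honored on exit.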
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring_buffer: allocate buffer page pointer · e4c2ce82
      Steven Rostedt authored
      The current method of overlaying the page frame as the buffer page pointer
      can be very dangerous and limits our ability to do other things with
      a page from the buffer, like send it off to disk.
      
      This patch allocates the buffer_page instead of overlaying the page's
      page frame. The use of the buffer_page has hardly changed due to this.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: type cast filter+verifier · 7104f300
      Steven Rostedt authored
      The mmiotrace map had a bug that would typecast the entry from
      the trace to the wrong type. That is a known danger of C typecasts:
      there is absolutely zero checking done on them.
      
      Help that problem a bit by using a GCC extension to implement a
      type filter that restricts the types that a trace record can be
      cast into, and by adding a dynamic check (in debug mode) to verify
      the type of the entry.
      
      This patch adds a macro to assign all entries of ftrace using the type
      of the variable and checking the entry id. The typecasts are now done
      in the macro for only those types that it knows about, which should
      be all the types that are allowed to be read from the tracer.
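
A rough userspace sketch of such a type filter, assuming GCC or Clang (`__builtin_types_compatible_p` and `__typeof__` are compiler extensions). The entry structs, the ids and the macro names below are hypothetical and are not taken from the patch. The `assert()` stands in for the debug-mode dynamic check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry ids and record layouts, for illustration only. */
enum { TRACE_FN = 1, TRACE_CTX = 2 };

struct trace_entry { int type; };
struct fn_entry  { struct trace_entry ent; unsigned long ip; };
struct ctx_entry { struct trace_entry ent; int prev_pid; int next_pid; };

/* True at compile time when var is declared as a pointer to type. */
#define SAME_TYPE(var, type) \
    __builtin_types_compatible_p(__typeof__(var), type *)

/* Cast only if var has the listed type, and verify the entry id. */
#define IF_ASSIGN(var, entry, etype, id)        \
    if (SAME_TYPE(var, etype)) {                \
        assert((entry)->type == (id));          \
        var = (__typeof__(var))(entry);         \
        break;                                  \
    }

/* Types that are not listed here never receive the entry. */
#define trace_assign_type(var, entry)                           \
    do {                                                        \
        IF_ASSIGN(var, entry, struct fn_entry, TRACE_FN);       \
        IF_ASSIGN(var, entry, struct ctx_entry, TRACE_CTX);     \
        var = NULL;                                             \
    } while (0)
```

This sketch rejects an unlisted pointer type at run time by assigning NULL. Other designs can turn the same case into a build failure, which is stricter but depends on the optimizer removing the dead branches.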
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: adapt mmiotrace to the new type of print_line, fix · 797d3712
      Frederic Weisbecker authored
      Correct the type of the value returned by the trace_empty function.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring_buffer: implement new locking · d769041f
      Steven Rostedt authored
      The old "lock always" scheme had issues with lockdep, and was not very
      efficient anyway.

      This patch introduces a new design that is partially lockless on writes.
      Writes will add new entries to the per cpu pages by simply disabling
      interrupts. When a write needs to go to another page, then it will
      grab the lock.
      
      A new "read page" has been added so that the reader can pull out a page
      from the ring buffer to read without worrying about the writer writing over
      it. This allows us to not take the lock for all reads. The lock is
      now only taken when a read needs to go to a new page.
      
      This is far from lockless, and interrupts still need to be disabled,
      but it is a step towards a more lockless solution, and it also
      solves a lot of the issues that were noticed by the first conversion
      of ftrace to the ring buffers.
      
      Note: the ring_buffer_{un}lock API has been removed.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring_buffer: remove raw from local_irq_save · 70255b5e
      Steven Rostedt authored
      The raw_local_irq_save causes issues with lockdep. We don't need it,
      so replace it with local_irq_save.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: adapt the boot tracer to the new print_line type · 9e9efffb
      Frederic Weisbecker authored
      This patch adapts the boot tracer to the new type of the
      print_line callback.
      
      It still relays entries it doesn't support to default output
      functions.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: adapt mmiotrace to the new type of print_line · 07f4e4f7
      Frederic Weisbecker authored
      Adapt mmiotrace to the new print_line type.
      By default, it ignores (and consumes) types it doesn't support.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: fix pipe breaking · 9ff4b974
      Pekka Paalanen authored
      This patch fixes a bug which breaks the pipe when the seq is empty.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: change the type of the print_line callback · 2c4f035f
      Frederic Weisbecker authored
      We need a way to disambiguate the cases in which a print_line callback
      returns 0:

      _ There is not enough space to print the whole entry.
        Please flush the seq and retry.
      _ I can't handle this type of entry.

      This patch changes the return type of this callback to convey that
      information.

      Some changes have also been made in this V2:

      _ Only relay to default functions after the print_line callback fails.
      _ This patch doesn't fix the issue with the broken pipe (see patch 2/4 for that).

      Some things are still under discussion:

      _ Find better names for the enum print_line_t values.
      _ Change the type of print_trace_line into boolean.

      Patches to change that can be sent later.
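
A minimal sketch of a three-way return value for such a callback, written as plain userspace C. The enum value names, `sample_print_line()` and `print_one_line()` are hypothetical (the commit itself says the real names were still under discussion), and the fallback logic is an assumption about how a caller might use the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical value names, for illustration only. */
enum line_status {
    LINE_PARTIAL,    /* not enough room: flush and retry */
    LINE_HANDLED,    /* the entry was printed */
    LINE_UNHANDLED   /* this callback does not know the entry type */
};

typedef enum line_status (*print_line_fn)(int type, char *buf, size_t room);

/* Format one line and report whether it fit into the buffer. */
static enum line_status emit(char *buf, size_t room, const char *tag, int type)
{
    int n = snprintf(buf, room, "%s %d\n", tag, type);

    return (n < 0 || (size_t)n >= room) ? LINE_PARTIAL : LINE_HANDLED;
}

/* Example tracer callback that only understands entry type 1. */
static enum line_status sample_print_line(int type, char *buf, size_t room)
{
    if (type != 1)
        return LINE_UNHANDLED;
    return emit(buf, room, "sample", type);
}

/* Relay to the default output only when the tracer declines the entry. */
static enum line_status print_one_line(print_line_fn cb, int type,
                                       char *buf, size_t room)
{
    enum line_status st;

    assert(buf != NULL);
    st = cb ? cb(type, buf, room) : LINE_UNHANDLED;
    if (st == LINE_UNHANDLED)
        st = emit(buf, room, "default", type);
    return st;
}
```

With a plain 0/1 return, the caller could not tell "retry after a flush" apart from "try another printer"; the separate values make both decisions explicit.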
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: take advantage of variable length entries · 777e208d
      Steven Rostedt authored
      Now that the underlying ring buffer for ftrace holds variable length
      entries, we can take advantage of this by only storing the size of the
      actual event into the buffer. This happens to increase the number of
      entries in the buffer dramatically.
      
      We can also get rid of the "trace_cont" operation, but I'm keeping that
      until we have no more users. Some of the ftrace tracers can now change
      their code to adapt to this new feature.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: make work with new ring buffer · 3928a8a2
      Steven Rostedt authored
      This patch ports ftrace over to the new ring buffer.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring_buffer: reset buffer page when freeing · ed56829c
      Steven Rostedt authored
      Mathieu Desnoyers pointed out that the page frame needs to be reset
      when it is freed, otherwise we might trigger a BUG_ON in the page free code.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring_buffer: add paranoid check for buffer page · a7b13743
      Steven Rostedt authored
      If for some strange reason the buffer_page gets bigger, or the page struct
      gets smaller, I want to know this ASAP.  The best way is to not let the
      kernel compile.
      
      This patch adds code to test the size of the struct buffer_page against the
      page struct and will cause compile issues if the buffer_page ever gets bigger
      than the page struct.
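
A sketch of how a size check can stop the build, using stand-in structs. `struct page_model` and `struct buffer_page_model` are hypothetical layouts, not the kernel's, and the `BUILD_BUG_ON` definition shown is the classic negative-array-size form, written out here so the example compiles in userspace.

```c
#include <assert.h>   /* static_assert (C11) */

/* Stand-in layouts; the real struct page and struct buffer_page differ. */
struct page_model {
    unsigned long flags;
    void *mapping;
    unsigned long index;
    void *lru_next;
    void *lru_prev;
    unsigned long private_data;
    int count;
    int mapcount;
};

struct buffer_page_model {
    unsigned long long time_stamp;
    unsigned long size;
    void *list_next;
    void *list_prev;
};

/* A negative array size is a compile error, so a true cond breaks the build. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

static inline void check_buffer_page_size(void)
{
    BUILD_BUG_ON(sizeof(struct buffer_page_model) >
                 sizeof(struct page_model));
}

/* C11 alternative that also prints a readable message on failure. */
static_assert(sizeof(struct buffer_page_model) <= sizeof(struct page_model),
              "buffer_page_model must fit inside page_model");
```

Growing `struct buffer_page_model` past the size of `struct page_model` makes both checks fail at compile time, so nothing has to run to catch the problem.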
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: unified trace buffer · 7a8e76a3
      Steven Rostedt authored
      This is a unified tracing buffer that implements a ring buffer that
      hopefully everyone will eventually be able to use.
      
      The events recorded into the buffer have the following structure:
      
        struct ring_buffer_event {
      	u32 type:2, len:3, time_delta:27;
      	u32 array[];
        };
      
      The minimum size of an event is 8 bytes. All events are 4 byte
      aligned inside the buffer.
      
      There are 4 types (all internal use for the ring buffer, only
      the data type is exported to the interface users).
      
       RINGBUF_TYPE_PADDING: this type is used to note extra space at the end
      	of a buffer page.
      
       RINGBUF_TYPE_TIME_EXTENT: This type is used when the time between events
      	is greater than the 27 bit delta can hold. We add another
      	32 bits, and record that in its own event (8 byte size).
      
       RINGBUF_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to
      	help keep the buffer timestamps in sync.
      
      RINGBUF_TYPE_DATA: The event actually holds user data.
      
      The "len" field is only three bits. Since the data must be
      4 byte aligned, this field is shifted left by 2, giving a
      max length of 28 bytes. If the data load is greater than 28
      bytes, the first array field holds the full length of the
      data load and the len field is set to zero.
      
      Example, data size of 7 bytes:
      
      	type = RINGBUF_TYPE_DATA
      	len = 2
      	time_delta: <time-stamp> - <prev_event-time-stamp>
      	array[0..1]: <7 bytes of data> <1 byte empty>
      
      This event is saved in 12 bytes of the buffer.
      
      An event with 82 bytes of data:
      
      	type = RINGBUF_TYPE_DATA
      	len = 0
      	time_delta: <time-stamp> - <prev_event-time-stamp>
      	array[0]: 84 (Note the alignment)
      	array[1..21]: <82 bytes of data> <2 bytes empty>
      
      The above event is saved in 92 bytes (if my math is correct).
      82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length.
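
The two examples above can be checked with a small calculation. This is a sketch based only on the rules stated in this message (4-byte alignment, a 3-bit len field shifted left by 2, a 4-byte header, and an extra 4-byte length word for larger payloads). The names `rb_layout` and `rb_data_layout()` are hypothetical, and treating a payload of exactly 28 bytes as the small case is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define RB_ALIGNMENT      4u   /* events are 4 byte aligned */
#define RB_HEADER_SIZE    4u   /* type:2, len:3, time_delta:27 */
#define RB_MAX_SMALL_DATA 28u  /* largest value of len (7) shifted left by 2 */

/* Hypothetical result type for this sketch. */
struct rb_layout {
    uint32_t len_field;    /* value stored in the 3-bit len field */
    uint32_t length_word;  /* value of array[0], or 0 if it is not used */
    uint32_t total_bytes;  /* space the event occupies in the buffer */
};

static struct rb_layout rb_data_layout(uint32_t data_bytes)
{
    struct rb_layout l;
    uint32_t padded;

    assert(data_bytes > 0);
    /* Round the payload up to the next multiple of 4 bytes. */
    padded = (data_bytes + RB_ALIGNMENT - 1) & ~(RB_ALIGNMENT - 1);

    if (padded <= RB_MAX_SMALL_DATA) {
        l.len_field = padded >> 2;
        l.length_word = 0;
        l.total_bytes = RB_HEADER_SIZE + padded;
    } else {
        l.len_field = 0;
        l.length_word = padded;
        l.total_bytes = RB_HEADER_SIZE + 4 + padded;
    }
    return l;
}
```

Under these assumptions, `rb_data_layout(7)` gives len 2 and 12 bytes, and `rb_data_layout(82)` gives len 0, a length word of 84 and 92 bytes, matching the two worked examples.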
      
      Do not reference the above event struct directly. Use the following
      functions to gain access to the event table, since the
      ring_buffer_event structure may change in the future.
      
      ring_buffer_event_length(event): get the length of the event.
      	This is the size of the memory used to record this
      	event, and not the size of the data payload.
      
      ring_buffer_time_delta(event): get the time delta of the event
      	This returns the delta time stamp since the last event.
      	Note: Even though this is in the header, there should
      		be no reason to access this directly, except
      		for debugging.
      
      ring_buffer_event_data(event): get the data from the event
      	This is the function to use to get the actual data
      	from the event. Note, it is only a pointer to the
      	data inside the buffer. This data must be copied to
      	another location otherwise you risk it being written
      	over in the buffer.
      
      ring_buffer_lock: A way to lock the entire buffer.
      ring_buffer_unlock: unlock the buffer.
      
      ring_buffer_alloc: create a new ring buffer. Can choose between
      	overwrite or consumer/producer mode. Overwrite will
      	overwrite old data, whereas consumer/producer will
      	throw away new data if the consumer catches up with the
      	producer.  The consumer/producer is the default.
      
      ring_buffer_free: free the ring buffer.
      
      ring_buffer_resize: resize the buffer. Changes the size of each cpu
      	buffer. Note, it is up to the caller to ensure that
      	the buffer is not being used while this is happening.
      	This requirement may go away but do not count on it.
      
      ring_buffer_lock_reserve: locks the ring buffer and allocates an
      	entry on the buffer to write to.
      ring_buffer_unlock_commit: unlocks the ring buffer and commits it to
      	the buffer.
      
      ring_buffer_write: writes some data into the ring buffer.
      
      ring_buffer_peek: Look at the next item in the cpu buffer.
      ring_buffer_consume: get the next item in the cpu buffer and
      	consume it. That is, this function increments the head
      	pointer.
      
      ring_buffer_read_start: Start an iterator of a cpu buffer.
      	For now, this disables the cpu buffer, until you issue
      	a finish. This is just because we do not want the iterator
      	to be overwritten. This restriction may change in the future.
      	But note, this is used for static reading of a buffer which
      	is usually done "after" a trace. Live readings would want
      	to use the ring_buffer_consume above, which will not
      	disable the ring buffer.
      
      ring_buffer_read_finish: Finishes the read iterator and reenables
      	the ring buffer.
      
      ring_buffer_iter_peek: Look at the next item in the cpu iterator.
      ring_buffer_read: Read the iterator and increment it.
      ring_buffer_iter_reset: Reset the iterator to point to the beginning
      	of the cpu buffer.
      ring_buffer_iter_empty: Returns true if the iterator is at the end
      	of the cpu buffer.
      
      ring_buffer_size: returns the size in bytes of each cpu buffer.
      	Note, the real size is this times the number of CPUs.
      
      ring_buffer_reset_cpu: Sets the cpu buffer to empty
      ring_buffer_reset: sets all cpu buffers to empty
      
      ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a
      	cpu buffer of another buffer. This is handy when you
      	want to take a snap shot of a running trace on just one
      	cpu. Having a backup buffer, to swap with facilitates this.
      	Ftrace max latencies use this.
      
      ring_buffer_empty: Returns true if the ring buffer is empty.
      ring_buffer_empty_cpu: Returns true if the cpu buffer is empty.
      
      ring_buffer_record_disable: disable all cpu buffers (read only)
      ring_buffer_record_disable_cpu: disable a single cpu buffer (read only)
      ring_buffer_record_enable: enable all cpu buffers.
      ring_buffer_record_enable_cpu: enable a single cpu buffer.
      
      ring_buffer_entries: The number of entries in a ring buffer.
      ring_buffer_overruns: The number of entries removed due to writing wrap.
      
      ring_buffer_time_stamp: Get the time stamp used by the ring buffer
      ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp
      	into nanosecs.
      
      I still need to implement the GTOD feature, which needs support from
      the cpu frequency infrastructure.  This can be done at a later
      time without affecting the ring buffer interface.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: give time for wakeup test to run · 5aa60c60
      Steven Rostedt authored
      It is possible that the testing thread in the ftrace wakeup test does not
      run before we stop the trace. This will cause the trace to fail since nothing
      will be in the buffers.
      
      This patch adds a small wait in the wakeup test to allow for the woken task
      to run and be traced.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: don't consume unhandled entries by boot tracer · 7c572ac0
      Frédéric Weisbecker authored
      When the boot tracer can't handle an entry output, it returns 1.
      It should return 0 so that the entry is relayed to the other output functions.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace/fastboot: disable tracers self-tests when boot tracer is selected · 3ce2b920
      Frédéric Weisbecker authored
      The tracing engine resets the ring buffer, and the tracers touch it
      too during self-tests. These self-tests happen during tracer registration
      and work against boot tracing, which is logging initcalls.

      We have to disable tracing self-tests if the boot tracer is selected.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: launch boot tracing after pre-smp initcalls · 3bf77af6
      Frédéric Weisbecker authored
      Launch the boot tracing inside the initcall_debug area. The old printk
      calls have not been removed, to keep the old way of initcall tracing for
      backward compatibility.
      
      [ mingo@elte.hu: resolved conflicts ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: give an entry on the config for boot tracer · 1f5c2abb
      Frédéric Weisbecker authored
      Add an entry to the kernel config to select the boot tracer.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: make tracing suitable to run the boot tracer · b5ad384e
      Frédéric Weisbecker authored
      The tracing engine now has to be initialized in early_initcall to set the
      boot tracer. Only the debugfs settings will be initialized at
      fs_initcall time.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: add the boot tracer · d13744cd
      Frédéric Weisbecker authored
      Add the boot/initcall tracer.
      
      Its primary purpose is to be able to trace the initcalls.
      
      It is intended to be used with scripts/bootgraph.pl after some small
      improvements.
      
      Note that it is not active after its init. To avoid tracing (and so
      crashing) before the whole tracing engine init, you have to explicitly
      call start_boot_trace() after do_pre_smp_initcalls() to enable it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/fastboot: add a script to visualize the kernel boot process / time · aa5d9151
      Arjan van de Ven authored
      When optimizing the kernel boot time, it's very valuable to visualize
      what is going on at which time. In addition, with the fastboot asynchronous
      initcall level, it's very valuable to see which initcall gets run where
      and when.
      
      This patch adds a script to turn a dmesg into an SVG graph (that can be
      shown with tools such as Inkscape, GIMP or Firefox) and a small change
      to the initcall code to print the PID of the thread calling the initcall
      (so that the script can work out the parallelism).
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
    • markers: bit-field is not thread-safe nor smp-safe · 1b7ae37c
      Lai Jiangshan authored
      Bit-fields are neither thread-safe nor SMP-safe.

      struct marker_entry.rcu_pending is not protected by any lock
      in the RCU callback free_old_closure(),
      so we must turn it into a safe type.

      Detail:

      Suppose rcu_pending and ptype are stored in struct marker_entry.tmp1.
      
      free_old_closure() side:           change ptype side:
      
                                      |  load struct marker_entry.tmp1
      --------------------------------|--------------------------------
                                      |  change ptype bit in tmp1
      load struct marker_entry.tmp1   |
      change rcu_pending bit in tmp1  |
      store tmp1                      |
      --------------------------------|--------------------------------
                                      |  store tmp1
      
      The result is the same as if free_old_closure() had not changed the
      rcu_pending bit: a bug! It causes a redundant rcu_barrier_sched() call,
      which is not too harmful.
      
      ----- corresponding:
      
      free_old_closure() side:           change ptype side:
      
      load struct marker_entry.tmp1   |
      --------------------------------|--------------------------------
                                      |  load struct marker_entry.tmp1
      change rcu_pending bit in tmp1  |
                                      |  change ptype bit in tmp1
                                      |  store tmp1
      --------------------------------|--------------------------------
      store tmp1                      |
      
      The result is the same as if the ptype side had not changed the ptype
      bit: a bug! It causes marker_probe_cb() to access invalid memory.
      Oops!
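
Both interleavings can be replayed deterministically in a single thread by writing out the load, modify and store steps that a bit-field update performs on the shared word. This is only a sketch: the bit masks and function names are assumptions, and a plain `unsigned char` stands in for the storage unit called tmp1 above.

```c
#include <assert.h>

/* Assumed bit positions inside the shared storage unit "tmp1". */
#define RCU_PENDING_BIT 0x01u
#define PTYPE_BIT       0x02u

/* First diagram: the ptype side loads first and stores last. */
static unsigned char ptype_side_stores_last(unsigned char tmp1)
{
    unsigned char ptype_copy, rcu_copy;

    ptype_copy = tmp1;                                  /* ptype side: load */
    ptype_copy = (unsigned char)(ptype_copy | PTYPE_BIT);
    rcu_copy = tmp1;                                    /* free_old_closure(): load */
    rcu_copy = (unsigned char)(rcu_copy & ~RCU_PENDING_BIT);
    tmp1 = rcu_copy;                                    /* free_old_closure(): store */
    tmp1 = ptype_copy;                                  /* ptype side: late store wins */
    return tmp1;
}

/* Second diagram: free_old_closure() loads first and stores last. */
static unsigned char rcu_side_stores_last(unsigned char tmp1)
{
    unsigned char ptype_copy, rcu_copy;

    rcu_copy = tmp1;                                    /* free_old_closure(): load */
    ptype_copy = tmp1;                                  /* ptype side: load */
    rcu_copy = (unsigned char)(rcu_copy & ~RCU_PENDING_BIT);
    ptype_copy = (unsigned char)(ptype_copy | PTYPE_BIT);
    tmp1 = ptype_copy;                                  /* ptype side: store */
    tmp1 = rcu_copy;                                    /* free_old_closure(): late store wins */
    return tmp1;
}

/* Self-check: in each case one of the two updates is silently lost. */
static void check_lost_updates(void)
{
    /* rcu_pending should have been cleared, but it is still set. */
    assert(ptype_side_stores_last(RCU_PENDING_BIT) ==
           (RCU_PENDING_BIT | PTYPE_BIT));
    /* ptype should have been set, but it is still clear. */
    assert(rcu_side_stores_last(RCU_PENDING_BIT) == 0);
}
```

Giving each flag its own storage unit, or updating the word atomically, removes the shared read-modify-write that loses the update.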
      
      see also: http://en.wikipedia.org/wiki/Bit_field
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: fix unchecked format · 48043bcd
      Lai Jiangshan authored
      When the second, third, ... probe is registered, its format is
      not checked. This patch fixes it.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: turn marker_synchronize_unregister() into an inline · 53c8c8fd
      Mathieu Desnoyers authored
      Turn marker synchronize unregister into a static inline. There is no
      reason to keep it as a macro over a static inline.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: re-enable fast batch registration · ed86a590
      Mathieu Desnoyers authored
      Lai Jiangshan discovered a reentrancy issue with markers and fixed it by
      adding synchronize_sched() calls at each registration/unregistration.

      It works, but it removes the ability to do batch
      registration/unregistration and can cause registration of ~100 markers
      to take about 30 seconds on a loaded machine (synchronize_sched() is
      much slower on such workloads).

      This patch implements a version of the fix which won't slow down marker batch
      registration/unregistration. It also goes back to the original non-synchronized
      reg/unreg.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • sputrace: use marker_synchronize_unregister() · 5b9261d9
      Mathieu Desnoyers authored
      We need a marker_synchronize_unregister() before the end of exit() to make sure
      every probe caller has exited the non-preemptible section and thus is not
      executing the probe code anymore.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: Jeremy Kerr <jk@ozlabs.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: documentation fix for teardown · 91a8d46c
      Mathieu Desnoyers authored
      Document the need for a marker_synchronize_unregister() before the end of
      exit() to make sure every probe caller has exited the non-preemptible
      section and thus is not executing the probe code anymore.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: probe example, fix teardown · 531d2975
      Mathieu Desnoyers authored
      We need a marker_synchronize_unregister() before the end of exit() to make sure
      every probe caller has exited the non-preemptible section and thus is not
      executing the probe code anymore.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: fix unregister bug and reenter bug, cleanup · e2d3b75d
      Mathieu Desnoyers authored
      Use the new rcu_read_lock_sched/unlock_sched() in marker code around the call
      site instead of preempt_disable/enable(). This makes the code easier to
      review.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: marker_synchronize_unregister() · e98d0eab
      Mathieu Desnoyers authored
      Create marker_synchronize_unregister() which must be called before the end of
      exit() to make sure every probe caller has exited the non-preemptible section
      and thus is not executing the probe code anymore.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracepoints: fix reentrancy · 9a1e9693
      Mathieu Desnoyers authored
      The tracepoints had the same reentrancy problem as the markers. Apply a
      similar fix using an rcu_barrier after each tracepoint mutex lock.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracepoints: use rcu sched · ca2db6cf
      Mathieu Desnoyers authored
      Make tracepoints use rcu sched. (cleanup)
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • markers: fix unregister bug and reenter bug · d74185ed
      Lai Jiangshan authored
      unregister bug:

      Code using markers typically calls marker_probe_unregister()
      and then destroys the data that marker_probe_func needs (or
      unloads the module). This is a bug when the corresponding
      marker_probe_func is still running (on other CPUs), because
      it is using data that is being destroyed or is already destroyed.

      We should call synchronize_sched() after marker_update_probes().

      reenter bug:

      marker_probe_register(), marker_probe_unregister() and
      marker_probe_unregister_private_data() are not reentrant-safe
      functions. These 3 functions release markers_mutex and then
      acquire it again and do "entry->oldptr = old; ...", but entry->oldptr
      may still be in use, because these 3 functions may be reentered while
      markers_mutex is released.

      We use synchronize_sched() instead of call_rcu_sched() to fix
      this bug. Actually we could do:
      "
      if (entry->rcu_pending)
      		rcu_barrier_sched();
      "
      after acquiring markers_mutex again, but synchronize_sched()
      is better and simpler, since these 3 functions are not on a critical path.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86/ftrace: use uaccess in atomic context · ac2b86fd
      Frédéric Weisbecker authored
      With latest -tip I get this bug:
      
      [   49.439988] in_atomic():0, irqs_disabled():1
      [   49.440118] INFO: lockdep is turned off.
      [   49.440118] Pid: 2814, comm: modprobe Tainted: G        W 2.6.27-rc7 #4
      [   49.440118]  [<c01215e1>] __might_sleep+0xe1/0x120
      [   49.440118]  [<c01148ea>] ftrace_modify_code+0x2a/0xd0
      [   49.440118]  [<c01148a2>] ? ftrace_test_p6nop+0x0/0xa
      [   49.440118]  [<c016e80e>] __ftrace_update_code+0xfe/0x2f0
      [   49.440118]  [<c01148a2>] ? ftrace_test_p6nop+0x0/0xa
      [   49.440118]  [<c016f190>] ftrace_convert_nops+0x50/0x80
      [   49.440118]  [<c016f1d6>] ftrace_init_module+0x16/0x20
      [   49.440118]  [<c015498b>] load_module+0x185b/0x1d30
      [   49.440118]  [<c01767a0>] ? find_get_page+0x0/0xf0
      [   49.440118]  [<c02463c0>] ? sprintf+0x0/0x30
      [   49.440118]  [<c034e012>] ? mutex_lock_interruptible_nested+0x1f2/0x350
      [   49.440118]  [<c0154eb3>] sys_init_module+0x53/0x1b0
      [   49.440118]  [<c0352340>] ? do_page_fault+0x0/0x740
      [   49.440118]  [<c0104012>] syscall_call+0x7/0xb
      [   49.440118]  =======================
      
      It is because ftrace_modify_code() calls copy_to_user and
      copy_from_user.
      These functions were inserted on the assumption that there
      couldn't be any race condition, but copy_[to/from]_user might
      sleep and __ftrace_update_code is called with interrupts disabled
      by local_irq_save.

      These functions were introduced by this commit:
      d5e92e8978fd2574e415dc2792c5eb592978243d:
      "ftrace: x86 use copy from user function"
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: suppress trivial sparse signedness warnings · 37a52f5e
      Harvey Harrison authored
      Could just as easily change the three casts to cast to the correct
      type...this patch changes the type of ftrace_nop instead.
      
      Suppresses sparse warnings:
      
       arch/x86/kernel/ftrace.c:157:14: warning: incorrect type in assignment (different signedness)
       arch/x86/kernel/ftrace.c:157:14:    expected long *static [toplevel] ftrace_nop
       arch/x86/kernel/ftrace.c:157:14:    got unsigned long *<noident>
       arch/x86/kernel/ftrace.c:161:14: warning: incorrect type in assignment (different signedness)
       arch/x86/kernel/ftrace.c:161:14:    expected long *static [toplevel] ftrace_nop
       arch/x86/kernel/ftrace.c:161:14:    got unsigned long *<noident>
       arch/x86/kernel/ftrace.c:165:14: warning: incorrect type in assignment (different signedness)
       arch/x86/kernel/ftrace.c:165:14:    expected long *static [toplevel] ftrace_nop
       arch/x86/kernel/ftrace.c:165:14:    got unsigned long *<noident>
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>