1. 15 Mar, 2013 40 commits
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Optimize trace_printk() with one arg to use trace_puts() · 9d3c752c
      Steven Rostedt (Red Hat) authored
      Although trace_printk() is extremely fast, especially when it uses
      trace_bprintk() (writes args straight to buffer instead of inserting
      into string), it still has the overhead of calling one of the printf
      sprintf() functions, that need to scan the fmt string to determine
      what, if any args it has.
      
      This is a waste of precious CPU cycles if the printk format has no
      args but a single constant string. It is better to use trace_puts()
      which does not have the overhead of the fmt scanning.
      
      But wouldn't it be nice if the developer didn't have to think about
      such things, and the compile would just do it for them?
      
        trace_printk("this string has no args\n");
        [...]
        trace_printk("this sting does %p %d\n", foo, bar);
      
      As tracing is critical to have the least amount of overhead,
      especially when dealing with race conditions, and you want to
      eliminate any "Heisenbugs", you want the trace_printk() to use the
      fastest possible means of tracing.
      
      Currently the macro magic determines if it will use trace_bprintk()
      or if the fmt is a dynamic string (a variable), it will fall
      back to the slow trace_printk() method that does a full snprintf()
      before copying it into the buffer, where as trace_bprintk() only
      copys the pointer to the fmt and the args into the buffer.
      
      Well, now there's a way to spend some more Hogwarts cash and come
      up with new fancy macro magic.
      
        #define trace_printk(fmt, ...)			\
        do {							\
      	char _______STR[] = __stringify((__VA_ARGS__));	\
      	if (sizeof(_______STR) > 3)			\
      		do_trace_printk(fmt, ##__VA_ARGS__);	\
      	else						\
      		trace_puts(fmt);			\
        } while (0)
      
      The above needs a bit of explaining (both here and in the comments).
      
      By stringifying the __VA_ARGS__, we can, at compile time, determine
      the number of args that are being passed to trace_printk(). The extra
      parenthesis are required, otherwise the compiler complains about
      too many parameters for __stringify if there is more than one arg.
      
      When there are no args, the __stringify((__VA_ARGS__)) converts into
      "()\0", a string of 3 characters. Anything else, will be a string
      containing more than 3 characters. Now we assign that string to a
      dynamic char array, and then take the sizeof() of that array.
      If it is greater than 3 characters, we know trace_printk() has args
      and we need to do the full "do_trace_printk()" on them, otherwise
      it was only passed a single arg and we can optimize to use trace_puts().
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarSteven "The King of Nasty Macros!" Rostedt <rostedt@goodmis.org>
      9d3c752c
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add trace_puts() for even faster trace_printk() tracing · 09ae7234
      Steven Rostedt (Red Hat) authored
      The trace_printk() is extremely fast and is very handy as it can be
      used in any context (including NMIs!). But it still requires scanning
      the fmt string for parsing the args. Even the trace_bprintk() requires
      a scan to know what args will be saved, although it doesn't copy the
      format string itself.
      
      Several times trace_printk() has no args, and wastes cpu cycles scanning
      the fmt string.
      
      Adding trace_puts() allows the developer to use an even faster
      tracing method that only saves the pointer to the string in the
      ring buffer without doing any format parsing at all. This will
      help remove even more of the "Heisenbug" effect, when debugging.
      
      Also fixed up the F_printk()s for the ftrace internal bprint and print events.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      09ae7234
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Fix the branch tracer that broke with buffer change · 153e8ed9
      Steven Rostedt (Red Hat) authored
      The changce to add the trace_buffer struct to have the trace array
      have both the main buffer and max buffer broke the branch tracer
      because the change did not update that code. As the branch tracer
      adds a significant amount of overhead, and must be selected via
      a selection (not a allyesconfig) it was missed in testing.
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      153e8ed9
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add alloc_snapshot kernel command line parameter · 55034cd6
      Steven Rostedt (Red Hat) authored
      If debugging the kernel, and the developer wants to use
      tracing_snapshot() in places where tracing_snapshot_alloc() may
      be difficult (or more likely, the developer is lazy and doesn't
      want to bother with tracing_snapshot_alloc() at all), then adding
      
        alloc_snapshot
      
      to the kernel command line parameter will tell ftrace to allocate
      the snapshot buffer (if configured) when it allocates the main
      tracing buffer.
      
      I also noticed that ring_buffer_expanded and tracing_selftest_disabled
      had inconsistent use of boolean "true" and "false" with "0" and "1".
      I cleaned that up too.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      55034cd6
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Move the tracing selftest code into its own function · f4e781c0
      Steven Rostedt (Red Hat) authored
      Move the tracing startup selftest code into its own function and
      when not enabled, always have that function succeed.
      
      This makes the register_tracer() function much more readable.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f4e781c0
    • Steven Rostedt (Red Hat)'s avatar
      ring-buffer: Do not use schedule_work_on() for current CPU · f5eb5588
      Steven Rostedt (Red Hat) authored
      The ring buffer updates when done while the ring buffer is active,
      needs to be completed on the CPU that is used for the ring buffer
      per_cpu buffer. To accomplish this, schedule_work_on() is used to
      schedule work on the given CPU.
      
      Now there's no reason to use schedule_work_on() if the process
      doing the update happens to be on the CPU that it is processing.
      It has already filled the requirement. Instead, just do the work
      and continue.
      
      This is needed for tracing_snapshot_alloc() where it may be called
      really early in boot, where the work queues have not been set up yet.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f5eb5588
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add internal tracing_snapshot() functions · ad909e21
      Steven Rostedt (Red Hat) authored
      The new snapshot feature is quite handy. It's a way for the user
      to take advantage of the spare buffer that, until then, only
      the latency tracers used to "snapshot" the buffer when it hit
      a max latency. Now users can trigger a "snapshot" manually when
      some condition is hit in a program. But a snapshot currently can
      not be triggered by a condition inside the kernel.
      
      With the addition of tracing_snapshot() and tracing_snapshot_alloc(),
      snapshots can now be taking when a condition is hit, and the
      developer wants to snapshot the case without stopping the trace.
      
      Note, any snapshot will overwrite the old one, so take care
      in how this is done.
      
      These new functions are to be used like tracing_on(), tracing_off()
      and trace_printk() are. That is, they should never be called
      in the mainline Linux kernel. They are solely for the purpose
      of debugging.
      
      The tracing_snapshot() will not allocate a buffer, but it is
      safe to be called from any context (except NMIs). But if a
      snapshot buffer isn't allocated when it is called, it will write
      to the live buffer, complaining about the lack of a snapshot
      buffer, and then stop tracing (giving you the "permanent snapshot").
      
      tracing_snapshot_alloc() will allocate the snapshot buffer if
      it was not already allocated and then take the snapshot. This routine
      *may sleep*, and must be called from context that can sleep.
      The allocation is done with GFP_KERNEL and not atomic.
      
      If you need a snapshot in an atomic context, say in early boot,
      then it is best to call the tracing_snapshot_alloc() before then,
      where it will allocate the buffer, and then you can use the
      tracing_snapshot() anywhere you want and still get snapshots.
      
      Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ad909e21
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Prevent deleting instances when they are being read · a695cb58
      Steven Rostedt (Red Hat) authored
      Add a ref count to the trace_array structure and prevent removal
      of instances that have open descriptors.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a695cb58
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add per_cpu directory into tracing instances · 121aaee7
      Steven Rostedt (Red Hat) authored
      Add the per_cpu directory to the created tracing instances:
      
        cd /sys/kernel/debug/tracing/instances
        mkdir foo
        ls foo/per_cpu/cpu0
      buffer_size_kb	snapshot_raw  trace	  trace_pipe_raw
      snapshot	stats	      trace_pipe
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      121aaee7
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add snapshot feature to instances · ce9bae55
      Steven Rostedt (Red Hat) authored
      Add the "snapshot" file to the the multi-buffer instances.
      
        cd /sys/kernel/debug/tracing/instances
        mkdir foo
        ls foo
      buffer_size_kb  buffer_total_size_kb  events  free_buffer  set_event
      snapshot  trace  trace_clock  trace_marker  trace_options  trace_pipe
      tracing_on
        cat foo/snapshot
       # tracer: nop
       #
       #
       # * Snapshot is freed *
       #
       # Snapshot commands:
       # echo 0 > snapshot : Clears and frees snapshot buffer
       # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
       #                      Takes a snapshot of the main buffer.
       # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
       #                      (Doesn't have to be '2' works with any number that
       #                       is not a '0' or '1')
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ce9bae55
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Consolidate buffer allocation code · 737223fb
      Steven Rostedt (Red Hat) authored
      There's a bit of duplicate code in creating the trace buffers for
      the normal trace buffer and the max trace buffer among the instances
      and the main global_trace. This code can be consolidated and cleaned
      up a bit making the code cleaner and more readable as well as less
      duplication.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      737223fb
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Have trace_array keep track if snapshot buffer is allocated · 45ad21ca
      Steven Rostedt (Red Hat) authored
      The snapshot buffer belongs to the trace array not the tracer that is
      running. The trace array should be the data structure that keeps track
      of whether or not the snapshot buffer is allocated, not the tracer
      desciptor. Having the trace array keep track of it makes modifications
      so much easier.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      45ad21ca
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add snapshot_raw to extract the raw data from snapshot · 6de58e62
      Steven Rostedt (Red Hat) authored
      Add a 'snapshot_raw' per_cpu file that allows tools to read the raw
      binary data of the snapshot buffer.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      6de58e62
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add config option to allow snapshot to swap per cpu · 0b85ffc2
      Steven Rostedt (Red Hat) authored
      When the preempt or irq latency tracers are enabled, they require
      the ring buffer to be able to swap the per cpu sub buffers between
      two main buffers. This adds a slight overhead to tracing as the
      trace recording needs to perform some checks to synchronize
      between recording and swaps that might be happening on other CPUs.
      
      The config RING_BUFFER_ALLOW_SWAP is set when a user of the ring
      buffer needs the "swap cpu" feature, otherwise the extra checks
      are not implemented and removed from the tracing overhead.
      
      The snapshot feature will swap per CPU if the RING_BUFFER_ALLOW_SWAP
      config is set. But that only gets set by things like OPROFILE
      and the irqs and preempt latency tracers.
      
      This config is added to let the user decide to include this feature
      with the snapshot agnostic from whether or not another user of
      the ring buffer sets this config.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      0b85ffc2
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add snapshot in the per_cpu trace directories · f1affcaa
      Steven Rostedt (Red Hat) authored
      Add the snapshot file into the per_cpu tracing directories to allow
      them to be read for an individual cpu. This also allows to clear
      an individual cpu from the snapshot buffer.
      
      If the kernel allows it (CONFIG_RING_BUFFER_ALLOW_SWAP is set), then
      echoing in '1' into one of the per_cpu snapshot files will do an
      individual cpu buffer swap instead of the entire file.
      
      Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f1affcaa
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Consolidate max_tr into main trace_array structure · 12883efb
      Steven Rostedt (Red Hat) authored
      Currently, the way the latency tracers and snapshot feature works
      is to have a separate trace_array called "max_tr" that holds the
      snapshot buffer. For latency tracers, this snapshot buffer is used
      to swap the running buffer with this buffer to save the current max
      latency.
      
      The only items needed for the max_tr is really just a copy of the buffer
      itself, the per_cpu data pointers, the time_start timestamp that states
      when the max latency was triggered, and the cpu that the max latency
      was triggered on. All other fields in trace_array are unused by the
      max_tr, making the max_tr mostly bloat.
      
      This change removes the max_tr completely, and adds a new structure
      called trace_buffer, that holds the buffer pointer, the per_cpu data
      pointers, the time_start timestamp, and the cpu where the latency occurred.
      
      The trace_array, now has two trace_buffers, one for the normal trace and
      one for the max trace or snapshot. By doing this, not only do we remove
      the bloat from the max_trace but the instances of traces can now use
      their own snapshot feature and not have just the top level global_trace have
      the snapshot feature and latency tracers for itself.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      12883efb
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Enable snapshot when any latency tracer is enabled · 22cffc2b
      Steven Rostedt (Red Hat) authored
      The snapshot utility is extremely useful, and does not add any more
      overhead in memory when another latency tracer is enabled. They use
      the snapshot underneath. There's no reason to hide the snapshot file
      when a latency tracer has been enabled in the kernel.
      
      If any of the latency tracers (irq, preempt or wakeup) is enabled
      then also select the snapshot facility.
      
      Note, snapshot can be enabled without the latency tracers enabled.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      22cffc2b
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Clear all trace buffers when unloaded module event was used · 873c642f
      Steven Rostedt (Red Hat) authored
      Currently we do not know what buffer a module event was enabled in.
      On unload, it is safest to clear all buffer instances, not just the
      top level buffer.
      
      Todo: Clear only the buffer that the event was used in. The
      infrastructure is there to do this, but it makes the code a bit
      more complex. Lets get the current code vetted before we add that.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      873c642f
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Only clear trace buffer on module unload if event was traced · 575380da
      Steven Rostedt (Red Hat) authored
      Currently, when a module with events is unloaded, the trace buffer is
      cleared. This is just a safety net in case the module might have some
      strange callback when its event is outputted. But there's no reason
      to reset the buffer if the module didn't have any of its events traced.
      
      Add a flag to the event "call" structure called WAS_ENABLED and gets set
      when the event is ever enabled, and this flag never gets cleared. When a
      module gets unloaded, if any of its events have this flag set, then the
      trace buffer will get cleared.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      575380da
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add comment for trace event flag IGNORE_ENABLE · 2a30c11f
      Steven Rostedt (Red Hat) authored
      All the trace event flags have comments but the IGNORE_ENABLE flag
      which is set for ftrace internal events that should not be enabled
      via the debugfs "enable" file. That is, if the top level enable file
      is set, it will enable all events. It use to just check the ftrace
      event call descriptor "reg" field and skip those whithout it, but now
      some ftrace internal events have a reg field but still need to be
      skipped. The flag was created to ignore those events.
      
      Now document it.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      2a30c11f
    • Steven Rostedt (Red Hat)'s avatar
      ring-buffer: Init waitqueue for blocked readers · f1dc6725
      Steven Rostedt (Red Hat) authored
      The move of blocked readers to the ring buffer left out the
      init of the wait queue that is used. Tests missed this due to running
      stress tests against the buffers, which didn't allow for any
      readers to end up waiting. Running a simple read and wait triggered
      a bug.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f1dc6725
    • Li Zefan's avatar
      tracing: Fix some section mismatch warnings · 523c8113
      Li Zefan authored
      As we've added __init annotation to field-defining functions, we should
      add __refdata annotation to event_call variables, which reference those
      functions.
      
      Link: http://lkml.kernel.org/r/51343C1F.2050502@huawei.comReported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarLi Zefan <lizefan@huawei.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      523c8113
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Fix trace events build without modules · 315326c1
      Steven Rostedt (Red Hat) authored
      The new multi-buffers added a descriptor that kept track of module
      events, and the directories they use, with struct ftace_module_file_ops.
      This is used to add a ref count to keep modules from unloading while
      their files are being accessed.
      
      As the descriptor is only needed when CONFIG_MODULES is enabled, it
      is only declared when the config is enabled. But that struct is
      dereferenced in a few areas outside the #ifdef CONFIG_MODULES.
      
      By adding some helper routines and moving code around a little,
      events can be compiled again without modules.
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      315326c1
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Add __per_cpu annotation to trace array percpu data pointer · 34ef61b1
      Steven Rostedt (Red Hat) authored
      With the conversion of the data array to per cpu, sparse now complains
      about the use of per_cpu_ptr() on the variable. But The variable is
      allocated with alloc_percpu() and is fine to use. But since the structure
      that contains the data variable does not annotate it as such, sparse
      gives out a lot of false warnings.
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      34ef61b1
    • Li Zefan's avatar
      tracing/syscalls: Annotate field-defining functions with __init · b8aae39f
      Li Zefan authored
      These two functions are called during kernel boot only.
      
      Link: http://lkml.kernel.org/r/51258796.7020704@huawei.comSigned-off-by: default avatarLi Zefan <lizefan@huawei.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      b8aae39f
    • Li Zefan's avatar
      tracing: Annotate event field-defining functions with __init · 7e4f44b1
      Li Zefan authored
      Those functions are called either during kernel boot or module init.
      
      Before:
      
      $ dmesg | grep 'Freeing unused kernel memory'
      Freeing unused kernel memory: 1208k freed
      Freeing unused kernel memory: 1360k freed
      Freeing unused kernel memory: 1960k freed
      
      After:
      
      $ dmesg | grep 'Freeing unused kernel memory'
      Freeing unused kernel memory: 1236k freed
      Freeing unused kernel memory: 1388k freed
      Freeing unused kernel memory: 1960k freed
      
      Link: http://lkml.kernel.org/r/5125877D.5000201@huawei.comSigned-off-by: default avatarLi Zefan <lizefan@huawei.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      7e4f44b1
    • Li Zefan's avatar
      tracing: Add a helper function for event print functions · f71130de
      Li Zefan authored
      Move duplicate code in event print functions to a helper function.
      
      This shrinks the size of the kernel by ~13K.
      
         text    data     bss     dec     hex filename
      6596137 1743966 10138672        18478775        119f6b7 vmlinux.o.old
      6583002 1743849 10138672        18465523        119c2f3 vmlinux.o.new
      
      Link: http://lkml.kernel.org/r/51258746.2060304@huawei.comSigned-off-by: default avatarLi Zefan <lizefan@huawei.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f71130de
    • Steven Rostedt (Red Hat)'s avatar
      tracing/ring-buffer: Move poll wake ups into ring buffer code · 15693458
      Steven Rostedt (Red Hat) authored
      Move the logic to wake up on ring buffer data into the ring buffer
      code itself. This simplifies the tracing code a lot and also has the
      added benefit that waiters on one of the instance buffers can be woken
      only when data is added to that instance instead of data added to
      any instance.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      15693458
    • Steven Rostedt's avatar
      tracing: Fix read blocking on trace_pipe_raw · b627344f
      Steven Rostedt authored
      If the ring buffer is empty, a read to trace_pipe_raw wont block.
      The tracing code has the infrastructure to wake up waiting readers,
      but the trace_pipe_raw doesn't take advantage of that.
      
      When a read is done to trace_pipe_raw without the O_NONBLOCK flag
      set, have the read block until there's data in the requested buffer.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      b627344f
    • Steven Rostedt's avatar
      tracing: Fix polling on trace_pipe_raw · cc60cdc9
      Steven Rostedt authored
      The trace_pipe_raw never implemented polling and this was casing
      issues for several utilities. This is now implemented.
      
      Blocked reads still are on the TODO list.
      Reported-by: default avatarMauro Carvalho Chehab <mchehab@redhat.com>
      Tested-by: default avatarMauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      cc60cdc9
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Do not block on splice if either file or splice NONBLOCK flag is set · 189e5784
      Steven Rostedt (Red Hat) authored
      Currently only the splice NONBLOCK flag is checked to determine if
      the splice read should block or not. But the file descriptor NONBLOCK
      flag also needs to be checked.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      189e5784
    • Steven Rostedt's avatar
      tracing: Use direct field, type and system names · 92edca07
      Steven Rostedt authored
      The names used to display the field and type in the event format
      files are copied, as well as the system name that is displayed.
      
      All these names are created by constant values passed in.
      If one of theses values were to be removed by a module, the module
      would also be required to remove any event it created.
      
      By using the strings directly, we can save over 100K of memory.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      92edca07
    • Steven Rostedt's avatar
      tracing: Use kmem_cache_alloc instead of kmalloc in trace_events.c · d1a29143
      Steven Rostedt authored
      The event structures used by the trace events are mostly persistent,
      but they are also allocated by kmalloc, which is not the best at
      allocating space for what is used. By converting these kmallocs
      into kmem_cache_allocs, we can save over 50K of space that is
      permanently allocated.
      
      After boot we have:
      
       slab name          active allocated size
       ---------          ------ --------- ----
      ftrace_event_file    979   1005     56   67    1
      ftrace_event_field   2301   2310     48   77    1
      
      The ftrace_event_file has at boot up 979 active objects out of
      1005 allocated in the slabs. Each object is 56 bytes. In a normal
      kmalloc, that would allocate 64 bytes for each object.
      
       1005 - 979  = 26 objects not used
       26 * 56 = 1456 bytes wasted
      
      But if we used kmalloc:
      
       64 - 56 = 8 bytes unused per allocation
       8 * 979 = 7832 bytes wasted
      
       7832 - 1456 = 6376 bytes in savings
      
      Doing the same for ftrace_event_field where there's 2301 objects
      allocated in a slab that can hold 2310 with 48 bytes each we have:
      
       2310 - 2301 = 9 objects not used
       9 * 48 = 432 bytes wasted
      
      A kmalloc would also use 64 bytes per object:
      
       64 - 48 = 16 bytes unused per allocation
       16 * 2301 = 36816 bytes wasted!
      
       36816 - 432 = 36384 bytes in savings
      
      This change gives us a total of 42760 bytes in savings. At least
      on my machine, but as there's a lot of these persistent objects
      for all configurations that use trace points, this is a net win.
      
      Thanks to Ezequiel Garcia for his trace_analyze presentation which
      pointed out the wasted space in my code.
      
      Cc: Ezequiel Garcia <elezegarcia@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      d1a29143
    • Steven Rostedt's avatar
      tracing: Get trace_events kernel command line working again · 77248221
      Steven Rostedt authored
      With the new descriptors used to allow multiple buffers in the
      tracing directory added, the kernel command line parameter
      trace_events=... no longer works. This is because the top level
      (global) trace array now has a list of descriptors associated
      with the events and the files in the debugfs directory. But in
      early bootup, when the command line is processed and the events
      enabled, the trace array list of events has not been set up yet.
      
      Without the list of events in the trace array, the setting of
      events to record will fail because it would not match any events.
      
      The solution is to set up the top level array in two stages.
      The first is to just add the ftrace file descriptors that just point
      to the events. This will allow events to be enabled and start tracing.
      The second stage is called after the filesystem is set up, and this
      stage will create the debugfs event files and directories associated
      with the trace array events.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      77248221
    • Steven Rostedt's avatar
      tracing: Add rmdir to remove multibuffer instances · 0c8916c3
      Steven Rostedt authored
      Add a method to the hijacked dentry descriptor of the
      "instances" directory to allow for rmdir to remove an
      instance of a multibuffer.
      
      Example:
      
        cd /debug/tracing/instances
        mkdir hello
        ls
      hello/
        rmdir hello
        ls
      
      Like the mkdir method, the i_mutex is dropped for the instances
      directory. The instances directory is created at boot up and can
      not be renamed or removed. The trace_types_lock mutex is used to
      synchronize adding and removing of instances.
      
      I've run several stress tests with different threads trying to
      create and delete directories of the same name, and it has stood
      up fine.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      0c8916c3
    • Steven Rostedt's avatar
      tracing: Add interface to allow multiple trace buffers · 277ba044
      Steven Rostedt authored
      Add the interface ("instances" directory) to add multiple buffers
      to ftrace. To create a new instance, simply do a mkdir in the
      instances directory:
      
      This will create a directory with the following:
      
       # cd instances
       # mkdir foo
       # ls foo
      buffer_size_kb        free_buffer  trace_clock    trace_pipe
      buffer_total_size_kb  set_event    trace_marker   tracing_enabled
      events/               trace        trace_options  tracing_on
      
      Currently only events are able to be set, and there isn't a way
      to delete a buffer when one is created (yet).
      
      Note, the i_mutex lock is dropped from the parent "instances"
      directory during the mkdir operation. As the "instances" directory
      can not be renamed or deleted (created on boot), I do not see
      any harm in dropping the lock. The creation of the sub directories
      is protected by trace_types_lock mutex, which only lets one
      instance get into the code path at a time. If two tasks try to
      create or delete directories of the same name, only one will occur
      and the other will fail with -EEXIST.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      277ba044
    • Steven Rostedt's avatar
      tracing: Make syscall events suitable for multiple buffers · 12ab74ee
      Steven Rostedt authored
      Currently the syscall events record into the global buffer. But if
      multiple buffers are in place, then we need to have syscall events
      record in the proper buffers.
      
      By adding descriptors to pass to the syscall event functions, the
      syscall events can now record into the buffers that have been assigned
      to them (one event may be applied to mulitple buffers).
      
      This will allow tracing high volume syscalls along with seldom occurring
      syscalls without losing the seldom syscall events.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      12ab74ee
    • Steven Rostedt's avatar
      tracing: Replace the static global per_cpu arrays with allocated per_cpu · a7603ff4
      Steven Rostedt authored
      The global and max-tr currently use static per_cpu arrays for the CPU data
      descriptors. But in order to get new allocated trace_arrays, they need to
      be allocated per_cpu arrays. Instead of using the static arrays, switch
      the global and max-tr to use allocated data.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a7603ff4
    • Steven Rostedt's avatar
      tracing: Pass the ftrace_file to the buffer lock reserve code · ccb469a1
      Steven Rostedt authored
      Pass the struct ftrace_event_file *ftrace_file to the
      trace_event_buffer_lock_reserve() (new function that replaces the
      trace_current_buffer_lock_reserver()).
      
      The ftrace_file holds a pointer to the trace_array that is in use.
      In the case of multiple buffers with different trace_arrays, this
      allows different events to be recorded into different buffers.
      
      Also fixed some of the stale comments in include/trace/ftrace.h
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ccb469a1
    • Steven Rostedt's avatar
      tracing: Encapsulate global_trace and remove dependencies on global vars · 2b6080f2
      Steven Rostedt authored
      The global_trace variable in kernel/trace/trace.c has been kept 'static' and
      local to that file so that it would not be used too much outside of that
      file. This has paid off, even though there were lots of changes to make
      the trace_array structure more generic (not depending on global_trace).
      
      Removal of a lot of direct usages of global_trace is needed to be able to
      create more trace_arrays such that we can add multiple buffers.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      2b6080f2