1. 03 Nov, 2015 6 commits
    • ring_buffer: Do not complete benchmark reader too early · 8b46ff69
      Petr Mladek authored
      It seems that complete(&read_done) might be called too early
      in some situations.
      
      1st scenario:
      -------------
      
      CPU0					CPU1
      
      ring_buffer_producer_thread()
        wake_up_process(consumer);
        wait_for_completion(&read_start);
      
      					ring_buffer_consumer_thread()
      					  complete(&read_start);
      
        ring_buffer_producer()
          # producing data in
          # the do-while cycle
      
      					  ring_buffer_consumer();
      					    # reading data
      					    # got error
      					    # set kill_test = 1;
      					    set_current_state(
      						TASK_INTERRUPTIBLE);
      					    if (reader_finish)  # false
      					    schedule();
      
          # producer still in the middle of
          # do-while cycle
          if (consumer && !(cnt % wakeup_interval))
            wake_up_process(consumer);
      
      					    # spurious wakeup
      					    while (!reader_finish &&
      						   !kill_test)
      					    # leaving because
      					    # kill_test == 1
      					    reader_finish = 0;
      					    complete(&read_done);
      
      1st BANG: We might access an uninitialized "read_done" if this is
      	  the first round.
      
          # producer finally leaving
          # the do-while cycle because kill_test == 1;
      
          if (consumer) {
            reader_finish = 1;
            wake_up_process(consumer);
            wait_for_completion(&read_done);
      
      2nd BANG: This wait will never complete because the consumer has
      	  already done the completion.
      
      2nd scenario:
      -------------
      
      CPU0					CPU1
      
      ring_buffer_producer_thread()
        wake_up_process(consumer);
        wait_for_completion(&read_start);
      
      					ring_buffer_consumer_thread()
      					  complete(&read_start);
      
        ring_buffer_producer()
          # CPU3 removes the module	  <--- difference from
          # and stops producer          <--- the 1st scenario
          if (kthread_should_stop())
            kill_test = 1;
      
      					  ring_buffer_consumer();
      					    while (!reader_finish &&
      						   !kill_test)
      					    # kill_test == 1 => we never go
      					    # into the top level while()
      					    reader_finish = 0;
      					    complete(&read_done);
      
          # producer still in the middle of
          # do-while cycle
          if (consumer && !(cnt % wakeup_interval))
            wake_up_process(consumer);
      
      					    # spurious wakeup
      					    while (!reader_finish &&
      						   !kill_test)
      					    # leaving because kill_test == 1
      					    reader_finish = 0;
      					    complete(&read_done);
      
      BANG: We are in the same "bang" situations as in the 1st scenario.
      
      Root of the problem:
      --------------------
      
      ring_buffer_consumer() must complete "read_done" only when the
      "reader_finish" variable is set. The completion must not be skipped
      due to other conditions.
      
      Note that we still must keep the check for "reader_finish" in a loop
      because there might be spurious wakeups as described in the
      above scenarios.
      
      Solution:
      ----------
      
      The top level cycle in ring_buffer_consumer() now finishes only when
      "reader_finish" is set. The data are read in a "while-do" cycle so
      that they are not read after an error (kill_test == 1) or a spurious
      wakeup.
      
      In addition, "reader_finish" is manipulated by the producer thread.
      Therefore we add READ_ONCE() to make sure that a fresh value is read
      in each iteration. We also add the corresponding barrier to
      synchronize the sleep check.
      
      Next we set the state back to TASK_RUNNING for the situation where we
      did not sleep.
      
      Purely out of paranoia, we initialize both completions statically.
      This is safer in case there are other races that we are unaware of.
      
      As a side effect we could remove the memory barrier from
      ring_buffer_producer_thread(). IMHO, this was the reason for
      the barrier. ring_buffer_reset() uses spin locks that should
      provide the needed memory barrier for using the buffer.
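      
      A sketch of the fixed consumer loop (simplified; the inner read loop
      and variable details are abbreviated, so treat this as an outline of
      the code in kernel/trace/ring_buffer_benchmark.c, not the exact diff):
      
          static void ring_buffer_consumer(void)
          {
                  /* the producer sets reader_finish; READ_ONCE() makes sure
                   * a fresh value is read in each iteration */
                  while (!READ_ONCE(reader_finish)) {
                          int found = 1;
      
                          /* read data only while there is no error and no
                           * finish request */
                          while (found && !kill_test) {
                                  /* ... consume the available events ... */
                          }
      
                          set_current_state(TASK_INTERRUPTIBLE);
      
                          /* the sleep check pairs with the producer's
                           * wake_up_process() */
                          if (reader_finish)
                                  break;
      
                          schedule();
                  }
                  /* restore the state even if we did not sleep */
                  __set_current_state(TASK_RUNNING);
                  reader_finish = 0;
                  complete(&read_done);
          }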
      
      Link: http://lkml.kernel.org/r/1441629518-32712-2-git-send-email-pmladek@suse.com
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Remove redundant TP_ARGS redefining · fb8c2293
      Dmitry Safonov authored
      TP_ARGS is not used anywhere in trace.h or trace_entries.h.
      As a first step I left just the #undef TP_ARGS and saw no errors,
      so remove the redefinition entirely.
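      
      The removed redefinition presumably followed the standard TP_ARGS
      pattern from include/linux/tracepoint.h (a sketch; the exact removed
      lines may differ):
      
          #undef TP_ARGS
          #define TP_ARGS(args...) args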
      
      Link: http://lkml.kernel.org/r/1446576560-14085-1-git-send-email-0x7f454c46@gmail.com
      Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Rename max_stack_lock to stack_trace_max_lock · d332736d
      Steven Rostedt (Red Hat) authored
      Now that max_stack_lock is a global variable, it needs a name that is
      unlikely to collide. Rename it to follow the naming convention used by
      the other stack_trace variables.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Allow arch-specific stack tracer · bb99d8cc
      AKASHI Takahiro authored
      A stack frame may be used in a different way depending on the CPU
      architecture. Thus it is not always appropriate to slurp the stack
      contents, as the current check_stack() does, in order to calculate a
      stack index (height) at a given function call. At least not on arm64.
      In addition, there is a possibility that we will mistakenly detect a
      stale stack frame which has not been overwritten.
      
      This patch makes check_stack() a weak function so that an
      arch-specific version can be implemented later.
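      
      A minimal sketch of the weak-function pattern this relies on (bodies
      abbreviated; the generic definition lives in the core stack tracer,
      and an architecture overrides it by providing a strong definition
      with the same signature in its own file):
      
          /* generic version, marked weak */
          void __weak check_stack(unsigned long ip, unsigned long *stack)
          {
                  /* ... slurp the stack contents to find the max depth ... */
          }
      
          /* arch-specific file: the linker prefers this strong definition
           * over the __weak one */
          void check_stack(unsigned long ip, unsigned long *stack)
          {
                  /* ... walk the frames the way this architecture needs ... */
          }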
      
      Link: http://lkml.kernel.org/r/1446182741-31019-5-git-send-email-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • recordmcount: arm64: Replace the ignored mcount call with nop · 2ee8a74f
      Li Bin authored
      Currently, recordmcount only records functions that are in the
      following sections:
      .text/.ref.text/.sched.text/.spinlock.text/.irqentry.text/
      .kprobes.text/.text.unlikely
      
      For a function that is not in these sections, the mcount call stays
      in place and is not replaced when the kernel boots up, which brings
      performance overhead, e.g. for do_mem_abort (in the .exception.text
      section). This patch makes recordmcount turn such mcount calls into
      nops, as sketched below.
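      
      For reference, the section whitelist corresponds to a check in
      scripts/recordmcount.c along these lines (slightly abbreviated):
      
          static int is_mcounted_section_name(char const *const txtname)
          {
                  return strcmp(".text",          txtname) == 0 ||
                         strcmp(".ref.text",      txtname) == 0 ||
                         strcmp(".sched.text",    txtname) == 0 ||
                         strcmp(".spinlock.text", txtname) == 0 ||
                         strcmp(".irqentry.text", txtname) == 0 ||
                         strcmp(".kprobes.text",  txtname) == 0 ||
                         strcmp(".text.unlikely", txtname) == 0;
          }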
      
      Link: http://lkml.kernel.org/r/1446019445-14421-1-git-send-email-huawei.libin@huawei.com
      Link: http://lkml.kernel.org/r/1446193864-24593-4-git-send-email-huawei.libin@huawei.com
      
      Cc: <lkp@intel.com>
      Cc: <catalin.marinas@arm.com>
      Cc: <takahiro.akashi@linaro.org>
      Cc: <stable@vger.kernel.org> # 3.18+
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • recordmcount: Fix endianness handling bug for nop_mcount · c84da8b9
      libin authored
      In nop_mcount(), shdr->sh_offset and relp->r_offset must be converted
      to the host byte order before use, otherwise recordmcount can trigger
      a segmentation fault when the recordmcount binary and the processed
      file.o have different endianness.
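      
      The fix presumably applies recordmcount's byte-order helper to the
      on-disk values before computing the file offset to patch, along these
      lines (a sketch using the _w() wrapper from recordmcount.h):
      
          if (make_nop)
                  ret = make_nop((void *)ehdr,
                                 _w(shdr->sh_offset) + _w(relp->r_offset));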
      
      Link: http://lkml.kernel.org/r/563806C7.7070606@huawei.com
      
      Cc: <stable@vger.kernel.org> # 3.0+
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 02 Nov, 2015 14 commits
  3. 26 Oct, 2015 5 commits
    • tracing: Fix sparse RCU warning · fb662288
      Steven Rostedt (Red Hat) authored
      p_start() and p_stop() are matching seq_file callbacks. Teach sparse
      that the rcu_read_lock_sched() taken by p_start() is released by
      p_stop().
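      
      A minimal sketch of the annotations involved (the seq_file bodies are
      abbreviated; __acquires()/__releases() tell sparse that the apparent
      lock imbalance in each function is intentional):
      
          static void *p_start(struct seq_file *m, loff_t *pos)
                  __acquires(RCU)
          {
                  rcu_read_lock_sched();
                  /* ... look up and return the element for *pos ... */
                  return NULL;
          }
      
          static void p_stop(struct seq_file *m, void *p)
                  __releases(RCU)
          {
                  rcu_read_unlock_sched();
          }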
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Check all tasks on each CPU when filtering pids · 8ca532ad
      Steven Rostedt (Red Hat) authored
      My tests found that if a task is already running but not filtered
      when set_event_pid is modified, then it can still be traced.
      
      Call on_each_cpu() to check whether the task currently running on each
      CPU should be filtered, and update the per-CPU flags of tr->data
      appropriately.
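      
      A sketch of the idea (names follow the tracing code; the pid-check
      helper and the locking context are abbreviated and assumed):
      
          static void ignore_task_cpu(void *data)
          {
                  struct trace_array *tr = data;
                  struct trace_pid_list *pid_list;
      
                  /* runs with interrupts disabled on each CPU */
                  pid_list = rcu_dereference_protected(tr->filtered_pids,
                                          mutex_is_locked(&event_mutex));
      
                  this_cpu_write(tr->trace_buffer.data->ignore_pid,
                                 check_ignore_pid(pid_list, current));
          }
      
          /* after the pid list is updated: */
          on_each_cpu(ignore_task_cpu, tr, 1);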
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Implement event pid filtering · 3fdaf80f
      Steven Rostedt (Red Hat) authored
      Add the necessary hooks to use the pids loaded in set_event_pid to filter
      all the events enabled in the tracing instance that match the pids listed.
      
      Two probes are added to both the sched_switch and sched_wakeup
      tracepoints: one called before the other probes and one called after
      them. The pre probes set the necessary flags to let the other probes
      know whether they should be traced or not.
      
      The sched_switch pre probe will set the "ignore_pid" flag if neither
      the previous task nor the next task has a matching pid.
      
      The sched_switch post probe will set the "ignore_pid" flag if the
      next task does not have a matching pid.
      
      The pre probe allows for probes tracing sched_switch to be traced if
      necessary.
      
      The sched_wakeup pre probe will set the "ignore_pid" flag if neither the
      current task nor the wakee task has a matching pid.
      
      The sched_wakeup post probe will set the "ignore_pid" flag if the current
      task does not have a matching pid.
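      
      A sketch of the sched_switch pre probe (signature abbreviated; the
      check_ignore_pid() helper name is an assumption for illustration):
      
          static void
          event_filter_pid_sched_switch_probe_pre(void *data,
                                                  struct task_struct *prev,
                                                  struct task_struct *next)
          {
                  struct trace_array *tr = data;
                  struct trace_pid_list *pid_list;
      
                  pid_list = rcu_dereference_sched(tr->filtered_pids);
      
                  /* ignore only if neither prev nor next matches a pid */
                  this_cpu_write(tr->trace_buffer.data->ignore_pid,
                                 check_ignore_pid(pid_list, prev) &&
                                 check_ignore_pid(pid_list, next));
          }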
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add set_event_pid directory for future use · 49090107
      Steven Rostedt (Red Hat) authored
      Create a tracing file called set_event_pid, which currently has no
      function, but will be used to filter all events for the tracing
      instance by the pids that are added to the file.
      
      The reason no functionality is added with this commit is that it
      focuses on creating and removing the pids in a safe manner. Tests can
      then be run against this change to make sure things are correct before
      hooking features to the list of pids.
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracepoint: Give priority to probes of tracepoints · 7904b5c4
      Steven Rostedt (Red Hat) authored
      In order to guarantee that a probe will be called before other probes that
      are attached to a tracepoint, there needs to be a mechanism to provide
      priority of one probe over the others.
      
      Add a prio field to struct tracepoint_func, which lets the probes be
      sorted by the priority set in the structure. If no priority is
      specified, a priority of 10 is used (this is a macro, and perhaps may
      be changed in the future).
      
      Now probes may be added to affect other probes that are attached to a
      tracepoint with a guaranteed order.
      
      One use case would be to allow tracing of tracepoints to filter by
      pid. A special (higher priority) probe may be added to the
      sched_switch tracepoint to set the necessary flags of the other
      tracepoints, notifying them whether they should be traced or not.
      Even when a trace event is enabled at the sched_switch tracepoint
      too, the order of the two is then not random; see the sketch below.
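      
      A sketch of the registration interface this introduces (my_probe,
      my_data, other_probe and other_data are placeholders;
      TRACEPOINT_DEFAULT_PRIO is the default of 10, and a higher prio value
      makes the probe run earlier):
      
          /* run my_probe before any default-priority probes */
          ret = tracepoint_probe_register_prio(tp, (void *)my_probe, my_data,
                                               TRACEPOINT_DEFAULT_PRIO + 1);
      
          /* plain tracepoint_probe_register() uses the default priority */
          ret = tracepoint_probe_register(tp, (void *)other_probe, other_data);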
      
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  4. 22 Oct, 2015 1 commit
  5. 21 Oct, 2015 5 commits
  6. 20 Oct, 2015 1 commit
  7. 16 Oct, 2015 1 commit
  8. 14 Oct, 2015 1 commit
  9. 01 Oct, 2015 2 commits
  10. 30 Sep, 2015 4 commits