1. 03 Nov, 2015 6 commits
    • ring_buffer: Do not complete benchmark reader too early · 8b46ff69
      Petr Mladek authored
      It seems that complete(&read_done) might be called too early
      in some situations.
      
      1st scenario:
      -------------
      
      CPU0					CPU1
      
      ring_buffer_producer_thread()
        wake_up_process(consumer);
        wait_for_completion(&read_start);
      
      					ring_buffer_consumer_thread()
      					  complete(&read_start);
      
        ring_buffer_producer()
          # producing data in
          # the do-while cycle
      
      					  ring_buffer_consumer();
      					    # reading data
      					    # got error
      					    # set kill_test = 1;
      					    set_current_state(
      						TASK_INTERRUPTIBLE);
      					    if (reader_finish)  # false
      					    schedule();
      
          # producer still in the middle of
          # do-while cycle
          if (consumer && !(cnt % wakeup_interval))
            wake_up_process(consumer);
      
      					    # spurious wakeup
      					    while (!reader_finish &&
      						   !kill_test)
      					    # leaving because
      					    # kill_test == 1
      					    reader_finish = 0;
      					    complete(&read_done);
      
      1st BANG: We might access an uninitialized "read_done" if this is
      	  the first round.
      
          # producer finally leaving
          # the do-while cycle because kill_test == 1;
      
          if (consumer) {
            reader_finish = 1;
            wake_up_process(consumer);
            wait_for_completion(&read_done);
      
      2nd BANG: This wait will never complete because the consumer has
      	  already done the completion.
      
      2nd scenario:
      -------------
      
      CPU0					CPU1
      
      ring_buffer_producer_thread()
        wake_up_process(consumer);
        wait_for_completion(&read_start);
      
      					ring_buffer_consumer_thread()
      					  complete(&read_start);
      
        ring_buffer_producer()
          # CPU3 removes the module	  <--- difference from
          # and stops producer          <--- the 1st scenario
          if (kthread_should_stop())
            kill_test = 1;
      
      					  ring_buffer_consumer();
      					    while (!reader_finish &&
      						   !kill_test)
      					    # kill_test == 1 => we never go
      					    # into the top level while()
      					    reader_finish = 0;
      					    complete(&read_done);
      
          # producer still in the middle of
          # do-while cycle
          if (consumer && !(cnt % wakeup_interval))
            wake_up_process(consumer);
      
      					    # spurious wakeup
      					    while (!reader_finish &&
      						   !kill_test)
      					    # leaving because kill_test == 1
      					    reader_finish = 0;
      					    complete(&read_done);
      
      BANG: We are in the same "bang" situations as in the 1st scenario.
      
      Root of the problem:
      --------------------
      
      ring_buffer_consumer() must complete "read_done" only when the
      "reader_finish" variable is set. The completion must not be skipped
      due to other conditions.
      
      Note that we still must keep the check for "reader_finish" in a loop
      because there might be spurious wakeups as described in the
      above scenarios.
      
      Solution:
      ----------
      
      The top level cycle in ring_buffer_consumer() now finishes only when
      "reader_finish" is set. The data are read in a "while-do" cycle so
      that they are not read after an error (kill_test == 1) or a spurious
      wakeup.
      
      In addition, "reader_finish" is manipulated by the producer thread.
      Therefore we add READ_ONCE() to make sure that a fresh value is read
      in each iteration. We also add the corresponding barrier to
      synchronize the sleep check.
      
      Next we set the state back to TASK_RUNNING for the situation where we
      did not sleep.
      
      Purely out of paranoia, we initialize both completions statically.
      This is safer in case there are other races that we are unaware of.
      
      As a side effect we could remove the memory barrier from
      ring_buffer_producer_thread(). IMHO, this was the reason for
      the barrier. ring_buffer_reset() uses spin locks that should
      provide the needed memory barrier for using the buffer.
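      
      A sketch of the fixed consumer loop (simplified; the inner read loop
      and variable details are abbreviated, so treat this as an outline of
      the code in kernel/trace/ring_buffer_benchmark.c, not the exact diff):
      
          static void ring_buffer_consumer(void)
          {
                  /* the producer sets reader_finish; READ_ONCE() makes sure
                   * a fresh value is read in each iteration */
                  while (!READ_ONCE(reader_finish)) {
                          int found = 1;
      
                          /* read data only while there is no error and no
                           * finish request */
                          while (found && !kill_test) {
                                  /* ... consume the available events ... */
                          }
      
                          set_current_state(TASK_INTERRUPTIBLE);
      
                          /* the sleep check pairs with the producer's
                           * wake_up_process() */
                          if (reader_finish)
                                  break;
      
                          schedule();
                  }
                  /* restore the state even if we did not sleep */
                  __set_current_state(TASK_RUNNING);
                  reader_finish = 0;
                  complete(&read_done);
          }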
      
      Link: http://lkml.kernel.org/r/1441629518-32712-2-git-send-email-pmladek@suse.com
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Remove redundant TP_ARGS redefining · fb8c2293
      Dmitry Safonov authored
      TP_ARGS is not used anywhere in trace.h or trace_entries.h.
      As a first step I left just the #undef TP_ARGS and saw no errors,
      so remove the redefinition entirely.
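      
      The removed redefinition presumably followed the standard TP_ARGS
      pattern from include/linux/tracepoint.h (a sketch; the exact removed
      lines may differ):
      
          #undef TP_ARGS
          #define TP_ARGS(args...) args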
      
      Link: http://lkml.kernel.org/r/1446576560-14085-1-git-send-email-0x7f454c46@gmail.com
      Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Rename max_stack_lock to stack_trace_max_lock · d332736d
      Steven Rostedt (Red Hat) authored
      Now that max_stack_lock is a global variable, it needs a name that is
      unlikely to collide. Rename it to follow the naming convention used by
      the other stack_trace variables.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Allow arch-specific stack tracer · bb99d8cc
      AKASHI Takahiro authored
      A stack frame may be used in a different way depending on the CPU
      architecture. Thus it is not always appropriate to slurp the stack
      contents, as the current check_stack() does, in order to calculate a
      stack index (height) at a given function call. At least not on arm64.
      In addition, there is a possibility that we will mistakenly detect a
      stale stack frame which has not been overwritten.
      
      This patch makes check_stack() a weak function so that an
      arch-specific version can be implemented later.
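      
      A minimal sketch of the weak-function pattern this relies on (bodies
      abbreviated; the generic definition lives in the core stack tracer,
      and an architecture overrides it by providing a strong definition
      with the same signature in its own file):
      
          /* generic version, marked weak */
          void __weak check_stack(unsigned long ip, unsigned long *stack)
          {
                  /* ... slurp the stack contents to find the max depth ... */
          }
      
          /* arch-specific file: the linker prefers this strong definition
           * over the __weak one */
          void check_stack(unsigned long ip, unsigned long *stack)
          {
                  /* ... walk the frames the way this architecture needs ... */
          }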
      
      Link: http://lkml.kernel.org/r/1446182741-31019-5-git-send-email-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • recordmcount: arm64: Replace the ignored mcount call with nop · 2ee8a74f
      Li Bin authored
      Currently, recordmcount only records functions that are in the
      following sections:
      .text/.ref.text/.sched.text/.spinlock.text/.irqentry.text/
      .kprobes.text/.text.unlikely
      
      For a function that is not in these sections, the mcount call stays
      in place and is not replaced when the kernel boots up, which brings
      performance overhead, e.g. for do_mem_abort (in the .exception.text
      section). This patch makes recordmcount turn such mcount calls into
      nops, as sketched below.
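      
      For reference, the section whitelist corresponds to a check in
      scripts/recordmcount.c along these lines (slightly abbreviated):
      
          static int is_mcounted_section_name(char const *const txtname)
          {
                  return strcmp(".text",          txtname) == 0 ||
                         strcmp(".ref.text",      txtname) == 0 ||
                         strcmp(".sched.text",    txtname) == 0 ||
                         strcmp(".spinlock.text", txtname) == 0 ||
                         strcmp(".irqentry.text", txtname) == 0 ||
                         strcmp(".kprobes.text",  txtname) == 0 ||
                         strcmp(".text.unlikely", txtname) == 0;
          }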
      
      Link: http://lkml.kernel.org/r/1446019445-14421-1-git-send-email-huawei.libin@huawei.com
      Link: http://lkml.kernel.org/r/1446193864-24593-4-git-send-email-huawei.libin@huawei.com
      
      Cc: <lkp@intel.com>
      Cc: <catalin.marinas@arm.com>
      Cc: <takahiro.akashi@linaro.org>
      Cc: <stable@vger.kernel.org> # 3.18+
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • recordmcount: Fix endianness handling bug for nop_mcount · c84da8b9
      libin authored
      In nop_mcount(), shdr->sh_offset and relp->r_offset must be converted
      to the host byte order before use, otherwise recordmcount can trigger
      a segmentation fault when the recordmcount binary and the processed
      file.o have different endianness.
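      
      The fix presumably applies recordmcount's byte-order helper to the
      on-disk values before computing the file offset to patch, along these
      lines (a sketch using the _w() wrapper from recordmcount.h):
      
          if (make_nop)
                  ret = make_nop((void *)ehdr,
                                 _w(shdr->sh_offset) + _w(relp->r_offset));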
      
      Link: http://lkml.kernel.org/r/563806C7.7070606@huawei.com
      
      Cc: <stable@vger.kernel.org> # 3.0+
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 02 Nov, 2015 14 commits
  3. 26 Oct, 2015 5 commits
    • tracing: Fix sparse RCU warning · fb662288
      Steven Rostedt (Red Hat) authored
      p_start() and p_stop() are matching seq_file callbacks. Teach sparse
      that the rcu_read_lock_sched() taken by p_start() is released by
      p_stop().
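      
      A minimal sketch of the annotations involved (the seq_file bodies are
      abbreviated; __acquires()/__releases() tell sparse that the apparent
      lock imbalance in each function is intentional):
      
          static void *p_start(struct seq_file *m, loff_t *pos)
                  __acquires(RCU)
          {
                  rcu_read_lock_sched();
                  /* ... look up and return the element for *pos ... */
                  return NULL;
          }
      
          static void p_stop(struct seq_file *m, void *p)
                  __releases(RCU)
          {
                  rcu_read_unlock_sched();
          }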
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Check all tasks on each CPU when filtering pids · 8ca532ad
      Steven Rostedt (Red Hat) authored
      My tests found that if a task is already running but not filtered
      when set_event_pid is modified, then it can still be traced.
      
      Call on_each_cpu() to check whether the task currently running on each
      CPU should be filtered, and update the per-CPU flags of tr->data
      appropriately.
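      
      A sketch of the idea (names follow the tracing code; the pid-check
      helper and the locking context are abbreviated and assumed):
      
          static void ignore_task_cpu(void *data)
          {
                  struct trace_array *tr = data;
                  struct trace_pid_list *pid_list;
      
                  /* runs with interrupts disabled on each CPU */
                  pid_list = rcu_dereference_protected(tr->filtered_pids,
                                          mutex_is_locked(&event_mutex));
      
                  this_cpu_write(tr->trace_buffer.data->ignore_pid,
                                 check_ignore_pid(pid_list, current));
          }
      
          /* after the pid list is updated: */
          on_each_cpu(ignore_task_cpu, tr, 1);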
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Implement event pid filtering · 3fdaf80f
      Steven Rostedt (Red Hat) authored
      Add the necessary hooks to use the pids loaded in set_event_pid to filter
      all the events enabled in the tracing instance that match the pids listed.
      
      Two probes are added to both the sched_switch and sched_wakeup
      tracepoints: one called before the other probes and one called after
      them. The pre probes set the necessary flags to let the other probes
      know whether they should be traced or not.
      
      The sched_switch pre probe will set the "ignore_pid" flag if neither
      the previous task nor the next task has a matching pid.
      
      The sched_switch post probe will set the "ignore_pid" flag if the
      next task does not have a matching pid.
      
      The pre probe allows for probes tracing sched_switch to be traced if
      necessary.
      
      The sched_wakeup pre probe will set the "ignore_pid" flag if neither the
      current task nor the wakee task has a matching pid.
      
      The sched_wakeup post probe will set the "ignore_pid" flag if the current
      task does not have a matching pid.
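      
      A sketch of the sched_switch pre probe (signature abbreviated; the
      check_ignore_pid() helper name is an assumption for illustration):
      
          static void
          event_filter_pid_sched_switch_probe_pre(void *data,
                                                  struct task_struct *prev,
                                                  struct task_struct *next)
          {
                  struct trace_array *tr = data;
                  struct trace_pid_list *pid_list;
      
                  pid_list = rcu_dereference_sched(tr->filtered_pids);
      
                  /* ignore only if neither prev nor next matches a pid */
                  this_cpu_write(tr->trace_buffer.data->ignore_pid,
                                 check_ignore_pid(pid_list, prev) &&
                                 check_ignore_pid(pid_list, next));
          }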
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add set_event_pid directory for future use · 49090107
      Steven Rostedt (Red Hat) authored
      Create a tracing file called set_event_pid, which currently has no
      function, but will be used to filter all events for the tracing
      instance by the pids that are added to the file.
      
      The reason no functionality is added with this commit is that it
      focuses on creating and removing the pids in a safe manner. Tests can
      then be run against this change to make sure things are correct before
      hooking features to the list of pids.
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracepoint: Give priority to probes of tracepoints · 7904b5c4
      Steven Rostedt (Red Hat) authored
      In order to guarantee that a probe will be called before other probes that
      are attached to a tracepoint, there needs to be a mechanism to provide
      priority of one probe over the others.
      
      Add a prio field to struct tracepoint_func, which lets the probes be
      sorted by the priority set in the structure. If no priority is
      specified, a priority of 10 is used (this is a macro, and perhaps may
      be changed in the future).
      
      Now probes may be added to affect other probes that are attached to a
      tracepoint with a guaranteed order.
      
      One use case would be to allow tracing of tracepoints to filter by
      pid. A special (higher priority) probe may be added to the
      sched_switch tracepoint to set the necessary flags of the other
      tracepoints, notifying them whether they should be traced or not.
      Even when a trace event is enabled at the sched_switch tracepoint
      too, the order of the two is then not random; see the sketch below.
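      
      A sketch of the registration interface this introduces (my_probe,
      my_data, other_probe and other_data are placeholders;
      TRACEPOINT_DEFAULT_PRIO is the default of 10, and a higher prio value
      makes the probe run earlier):
      
          /* run my_probe before any default-priority probes */
          ret = tracepoint_probe_register_prio(tp, (void *)my_probe, my_data,
                                               TRACEPOINT_DEFAULT_PRIO + 1);
      
          /* plain tracepoint_probe_register() uses the default priority */
          ret = tracepoint_probe_register(tp, (void *)other_probe, other_data);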
      
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  4. 22 Oct, 2015 1 commit
  5. 21 Oct, 2015 5 commits
  6. 20 Oct, 2015 1 commit
  7. 16 Oct, 2015 1 commit
  8. 14 Oct, 2015 1 commit
  9. 01 Oct, 2015 2 commits
  10. 30 Sep, 2015 4 commits