Commit 72357590 authored by Beau Belgrave, committed by Steven Rostedt (Google)

tracing/user_events: Use remote writes for event enablement

As part of the discussions for user_events aligned with user space
tracers, it was determined that user programs should register an aligned
value to set or clear a bit when an event becomes enabled. Currently a
shared page is being used that requires mmap(). Remove the shared page
implementation and move to a user registered address implementation.

In this new model, user programs specify 3 new values during event
registration. The first is the address to update when the event
is either enabled or disabled. The second is the bit to set/clear to
reflect the event being enabled. The third is the size of the value at
the specified address.
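
The three registration values can be sketched as follows. This is only an illustration, not the kernel's code: `struct user_reg_sketch` is a hypothetical local mirror of the `user_reg` fields shown in this commit's header diff (it leaves out the old `status_bit` output, on the assumption that it goes away together with the shared page), and `fill_reg()` is an invented helper. A real program would presumably use the definitions from `<linux/user_events.h>` and hand the structure to the kernel at registration time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the user_reg fields from this commit's header diff. */
struct user_reg_sketch {
	uint32_t size;        /* Input: size of this structure */
	uint8_t  enable_bit;  /* Input: bit in the enable value to set/clear */
	uint8_t  enable_size; /* Input: size in bytes of the enable value */
	uint16_t flags;       /* Input: reserved for future use, set to 0 */
	uint64_t enable_addr; /* Input: address of the enable value */
	uint64_t name_args;   /* Input: pointer to the event name string */
	uint32_t write_index; /* Output: index to use when writing data */
} __attribute__((__packed__));

/* Packed, so the fields add up with no padding: 4+1+1+2+8+8+4 bytes. */
static_assert(sizeof(struct user_reg_sketch) == 28, "unexpected padding");

/* The value the kernel would update remotely; bit 31 is given to it here. */
static uint32_t enabled;

/* Invented helper: fill in the three new inputs plus the event name. */
static void fill_reg(struct user_reg_sketch *reg, const char *name)
{
	memset(reg, 0, sizeof(*reg));
	reg->size = sizeof(*reg);
	reg->enable_addr = (uint64_t)(uintptr_t)&enabled; /* 1st: address */
	reg->enable_bit = 31;                             /* 2nd: bit */
	reg->enable_size = sizeof(enabled);               /* 3rd: size */
	reg->name_args = (uint64_t)(uintptr_t)name;
}
```

After registration the program would only need to test `enabled` to decide whether to generate the event.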

This allows for a local 32/64-bit value in user programs to support
both kernel and user tracers. As an example, setting bit 31 for kernel
tracers when the event becomes enabled allows for user tracers to use
the other bits for ref counts or other flags. The kernel side updates
the bit atomically; user programs also need to update these values
atomically.
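
A minimal sketch of the bit-sharing scheme described above, using C11 atomics on the user side. All names here are hypothetical; `fake_kernel_enable()` and `fake_kernel_disable()` merely stand in for the kernel's remote write so the example can run on its own, and the use of `_Atomic uint32_t` assumes it has the same representation as a plain 32-bit value, which holds on mainstream platforms.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_ENABLE_BIT  31u /* reserved for the kernel tracer */
#define KERNEL_ENABLE_MASK (UINT32_C(1) << KERNEL_ENABLE_BIT)
#define USER_REF_MASK      (KERNEL_ENABLE_MASK - 1) /* bits 0..30: user ref count */

/* The value whose address would be registered with the kernel. */
static _Atomic uint32_t event_enabled;

/* A user tracer attaches: bump the ref count held in the low bits. */
static void user_tracer_attach(void)
{
	uint32_t old = atomic_fetch_add(&event_enabled, 1);

	/* A full count would have carried into the kernel's bit. */
	assert((old & USER_REF_MASK) != USER_REF_MASK);
}

/* A user tracer detaches: drop its reference. */
static void user_tracer_detach(void)
{
	uint32_t old = atomic_fetch_sub(&event_enabled, 1);

	assert((old & USER_REF_MASK) != 0);
}

/* Stand-ins for the kernel's remote write, for demonstration only. */
static void fake_kernel_enable(void)
{
	atomic_fetch_or(&event_enabled, KERNEL_ENABLE_MASK);
}

static void fake_kernel_disable(void)
{
	atomic_fetch_and(&event_enabled, ~KERNEL_ENABLE_MASK);
}

/* Generate the event if any tracer, kernel or user, wants it. */
static bool event_is_enabled(void)
{
	return atomic_load(&event_enabled) != 0;
}
```

Because every update is a single atomic read-modify-write, a kernel-side bit flip and a user-side ref count change cannot overwrite each other.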

User provided addresses must be aligned on a natural boundary. This
allows for single page checking and prevents odd behaviors such as an
enable value straddling 2 pages instead of a single page. Currently
page faults are only logged; future patches will handle these.
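
The alignment rule can be illustrated with a small check. This is a sketch under stated assumptions, not the kernel's validation code: the helper names are invented, the 4096-byte page size is assumed for the demonstration, and the accepted sizes of 4 and 8 bytes are taken from the 32/64-bit values mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(4096) /* assumed page size for the demo */

/*
 * Natural alignment only rules out straddling because the page size is a
 * multiple of the largest allowed value size.
 */
static_assert(SKETCH_PAGE_SIZE % 8 == 0, "page size must be a multiple of 8");

/* Naturally aligned: a 4-byte value at a multiple of 4, an 8-byte value at a multiple of 8. */
static bool enable_addr_is_valid(uint64_t addr, uint8_t size)
{
	if (size != 4 && size != 8)
		return false;
	return (addr % size) == 0;
}

/* True if the first and last byte of the value fall in different pages. */
static bool straddles_page(uint64_t addr, uint8_t size)
{
	return (addr / SKETCH_PAGE_SIZE) != ((addr + size - 1) / SKETCH_PAGE_SIZE);
}
```

For example, a 4-byte value at address 4094 is misaligned and covers bytes 4094 through 4097, so it crosses from the first page into the second, while any address that passes `enable_addr_is_valid()` stays within one page.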

Link: https://lkml.kernel.org/r/20230328235219.203-4-beaub@linux.microsoft.com
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
parent fd593511
@@ -9,13 +9,63 @@
#ifndef _LINUX_USER_EVENTS_H
#define _LINUX_USER_EVENTS_H
#include <linux/list.h>
#include <linux/refcount.h>
#include <linux/mm_types.h>
#include <linux/workqueue.h>
#include <uapi/linux/user_events.h>
#ifdef CONFIG_USER_EVENTS
struct user_event_mm {
	struct list_head link;
	struct list_head enablers;
	struct mm_struct *mm;
	struct user_event_mm *next;
	refcount_t refcnt;
	refcount_t tasks;
	struct rcu_work put_rwork;
};
#endif
extern void user_event_mm_dup(struct task_struct *t,
			      struct user_event_mm *old_mm);
extern void user_event_mm_remove(struct task_struct *t);
static inline void user_events_fork(struct task_struct *t,
				    unsigned long clone_flags)
{
	struct user_event_mm *old_mm;
	if (!t || !current->user_event_mm)
		return;
	old_mm = current->user_event_mm;
	if (clone_flags & CLONE_VM) {
		t->user_event_mm = old_mm;
		refcount_inc(&old_mm->tasks);
		return;
	}
	user_event_mm_dup(t, old_mm);
}
static inline void user_events_execve(struct task_struct *t)
{
	if (!t || !t->user_event_mm)
		return;
	user_event_mm_remove(t);
}
static inline void user_events_exit(struct task_struct *t)
{
	if (!t || !t->user_event_mm)
		return;
	user_event_mm_remove(t);
}
#else
static inline void user_events_fork(struct task_struct *t,
				    unsigned long clone_flags)
{
@@ -28,5 +78,6 @@ static inline void user_events_execve(struct task_struct *t)
static inline void user_events_exit(struct task_struct *t)
{
}
#endif /* CONFIG_USER_EVENTS */
#endif /* _LINUX_USER_EVENTS_H */
@@ -27,12 +27,21 @@ struct user_reg {
	/* Input: Size of the user_reg structure being used */
	__u32 size;
	/* Input: Bit in enable address to use */
	__u8 enable_bit;
	/* Input: Enable size in bytes at address */
	__u8 enable_size;
	/* Input: Flags for future use, set to 0 */
	__u16 flags;
	/* Input: Address to update when enabled */
	__u64 enable_addr;
	/* Input: Pointer to string with event name, description and flags */
	__u64 name_args;
	/* Output: Bitwise index of the event within the status page */
	__u32 status_bit;
	/* Output: Index of the event to use when writing data */
	__u32 write_index;
} __attribute__((__packed__));
......
@@ -798,9 +798,10 @@ config USER_EVENTS
can be used like an existing kernel trace event. User trace
events are generated by writing to a tracefs file. User
processes can determine if their tracing events should be
generated by memory mapping a tracefs file and checking for
an associated byte being non-zero.
generated by registering a value and bit with the kernel
that reflects when it is enabled or not.
See Documentation/trace/user_events.rst.
If in doubt, say N.
config HIST_TRIGGERS
......
This diff is collapsed.