Commit e65f7ae7 authored by Masami Hiramatsu, committed by Steven Rostedt (VMware)

tracing/probe: Support user-space dereference

Support user-space dereference syntax for probe event arguments
to dereference the data-structure or array in user-space.

The syntax is just adding 'u' before an offset value.

 +|-u<OFFSET>(<FETCHARG>)

e.g. +u8(%ax), +u0(+0(%si))
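
To illustrate the syntax above, here is a minimal user-space sketch of how the "+|-[u]<OFFSET>" prefix could be recognized. It is not the kernel parser: `parse_deref_prefix`, `struct deref_prefix`, and the `DEREF_KERNEL`/`DEREF_USER` names are hypothetical, and `strtol()` stands in for the kernel's `kstrtol()`.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for FETCH_OP_DEREF / FETCH_OP_UDEREF. */
enum deref_kind { DEREF_KERNEL, DEREF_USER };

struct deref_prefix {
	enum deref_kind kind;	/* which address space to read from */
	long offset;		/* signed offset applied to the base address */
	const char *inner;	/* points at the '(' that opens <FETCHARG> */
};

/*
 * Parse the "+|-[u]<OFFSET>" part of a dereference argument.
 * Returns 0 on success or -EINVAL on malformed input.
 */
static int parse_deref_prefix(const char *arg, struct deref_prefix *out)
{
	char num[32];
	char *end;
	const char *p, *brace;
	size_t len;
	int negative;

	assert(arg != NULL && out != NULL);

	if (arg[0] != '+' && arg[0] != '-')
		return -EINVAL;
	negative = (arg[0] == '-');
	p = arg + 1;

	out->kind = DEREF_KERNEL;
	if (*p == 'u') {	/* 'u' selects user-space dereference */
		out->kind = DEREF_USER;
		p++;
	}

	brace = strchr(p, '(');
	if (!brace)		/* the kernel reports DEREF_NEED_BRACE here */
		return -EINVAL;

	len = (size_t)(brace - p);
	if (len == 0 || len >= sizeof(num))
		return -EINVAL;
	memcpy(num, p, len);
	num[len] = '\0';

	/* Reject a second sign or leading blanks that strtol() would accept. */
	if (!isdigit((unsigned char)num[0]))
		return -EINVAL;

	errno = 0;
	out->offset = strtol(num, &end, 0);
	if (errno || *end != '\0')
		return -EINVAL;
	if (negative)
		out->offset = -out->offset;

	out->inner = brace;
	return 0;
}
```

With this sketch, "+u8(%ax)" yields a user-space dereference at offset 8, "-u4(%si)" yields offset -4, and "+0(%si)" stays a normal (kernel) dereference. The patch itself reaches the same result differently: it copies the sign over the 'u' and advances the pointer, so that the existing offset parsing can be reused unchanged.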

For example, to probe do_sched_setscheduler(pid, policy, param)
and record param->sched_priority, you can add a new probe as
below:

 p do_sched_setscheduler priority=+u0($arg3)

Note that kprobe events provide this syntax but do not switch the
dereference method automatically, because on some archs we cannot
tell whether a given address is in user space or kernel space.

So, as with "ustring", this is an option for the user, who has to
choose the dereference method carefully.
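
The point about choosing the dereference method can be sketched with a toy model. The code below assumes two disjoint, simulated address ranges and is only an illustration: `struct space`, `space_read`, `run_deref_chain`, and the `OP_*` names are hypothetical, standing in for the kernel's `probe_kernel_read()`/`probe_user_read()` and the `FETCH_OP_DEREF`/`FETCH_OP_UDEREF` ops touched by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ops mirroring FETCH_OP_DEREF / FETCH_OP_UDEREF. */
enum op { OP_DEREF, OP_UDEREF, OP_END };

struct insn {
	enum op op;
	long offset;
};

/* A simulated address space: 64 bytes mapped at address 'base'. */
struct space {
	unsigned long base;
	unsigned char mem[64];
};

/* Copy 'size' bytes at 'addr' out of 's', or fail like a faulting read. */
static int space_read(const struct space *s, void *dest,
		      unsigned long addr, size_t size)
{
	unsigned long off;

	if (addr < s->base)
		return -EFAULT;
	off = addr - s->base;
	if (off > sizeof(s->mem) || size > sizeof(s->mem) - off)
		return -EFAULT;
	memcpy(dest, s->mem + off, size);
	return 0;
}

/*
 * Walk a chain of dereference ops. Each op reads one word from the
 * space it names; the reader is never switched automatically.
 */
static int run_deref_chain(const struct insn *code, unsigned long val,
			   const struct space *kern,
			   const struct space *user, unsigned long *out)
{
	int ret;

	assert(code != NULL && out != NULL);

	for (; code->op != OP_END; code++) {
		const struct space *s =
			(code->op == OP_UDEREF) ? user : kern;

		ret = space_read(s, &val,
				 val + (unsigned long)code->offset,
				 sizeof(val));
		if (ret)
			return ret;
	}
	*out = val;
	return 0;
}
```

In this model, a chain using OP_UDEREF on an address inside the simulated user range returns the stored value, while the same chain written with OP_DEREF fails with -EFAULT. On real hardware the outcome of a wrong choice depends on the architecture, which is why the commit leaves the decision to the user.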

Link: http://lkml.kernel.org/r/155789872187.26965.4468456816590888687.stgit@devnote2
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
parent 88903c46
...@@ -51,7 +51,7 @@ Synopsis of kprobe_events ...@@ -51,7 +51,7 @@ Synopsis of kprobe_events
$argN : Fetch the Nth function argument. (N >= 1) (\*1) $argN : Fetch the Nth function argument. (N >= 1) (\*1)
$retval : Fetch return value.(\*2) $retval : Fetch return value.(\*2)
$comm : Fetch current task comm. $comm : Fetch current task comm.
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(\*3) +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*3)(\*4)
NAME=FETCHARG : Set NAME as the argument name of FETCHARG. NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
...@@ -61,6 +61,7 @@ Synopsis of kprobe_events ...@@ -61,6 +61,7 @@ Synopsis of kprobe_events
(\*1) only for the probe on function entry (offs == 0). (\*1) only for the probe on function entry (offs == 0).
(\*2) only for return probe. (\*2) only for return probe.
(\*3) this is useful for fetching a field of data structures. (\*3) this is useful for fetching a field of data structures.
(\*4) "u" means user-space dereference. See :ref:`user_mem_access`.
Types Types
----- -----
...@@ -79,10 +80,7 @@ wrong, but '+8($stack):x8[8]' is OK.) ...@@ -79,10 +80,7 @@ wrong, but '+8($stack):x8[8]' is OK.)
String type is a special type, which fetches a "null-terminated" string from String type is a special type, which fetches a "null-terminated" string from
kernel space. This means it will fail and store NULL if the string container kernel space. This means it will fail and store NULL if the string container
has been paged out. "ustring" type is an alternative of string for user-space. has been paged out. "ustring" type is an alternative of string for user-space.
Note that kprobe-event provides string/ustring types, but doesn't change it See :ref:`user_mem_access` for more info..
automatically. So user has to decide if the targe string in kernel or in user
space carefully. On some arch, if you choose wrong one, it always fails to
record string data.
The string array type is a bit different from other types. For other base The string array type is a bit different from other types. For other base
types, <base-type>[1] is equal to <base-type> (e.g. +0(%di):x32[1] is same types, <base-type>[1] is equal to <base-type> (e.g. +0(%di):x32[1] is same
as +0(%di):x32.) But string[1] is not equal to string. The string type itself as +0(%di):x32.) But string[1] is not equal to string. The string type itself
...@@ -97,6 +95,25 @@ Symbol type('symbol') is an alias of u32 or u64 type (depends on BITS_PER_LONG) ...@@ -97,6 +95,25 @@ Symbol type('symbol') is an alias of u32 or u64 type (depends on BITS_PER_LONG)
which shows given pointer in "symbol+offset" style. which shows given pointer in "symbol+offset" style.
For $comm, the default type is "string"; any other type is invalid. For $comm, the default type is "string"; any other type is invalid.
.. _user_mem_access:
User Memory Access
------------------
Kprobe events supports user-space memory access. For that purpose, you can use
either user-space dereference syntax or 'ustring' type.
The user-space dereference syntax allows you to access a field of a data
structure in user-space. This is done by adding the "u" prefix to the
dereference syntax. For example, +u4(%si) means it will read memory from the
address in the register %si offset by 4, and the memory is expected to be in
user-space. You can use this for strings too, e.g. +u0(%si):string will read
a string from the address in the register %si that is expected to be in user-
space. 'ustring' is a shortcut way of performing the same task. That is,
+0(%si):ustring is equivalent to +u0(%si):string.
Note that kprobe-event provides the user-memory access syntax but it doesn't
use it transparently. This means if you use normal dereference or string type
for user memory, it might fail, and may always fail on some archs. The user
has to carefully check if the target data is in kernel or user space.
Per-Probe Event Filtering Per-Probe Event Filtering
------------------------- -------------------------
......
...@@ -42,16 +42,18 @@ Synopsis of uprobe_tracer ...@@ -42,16 +42,18 @@ Synopsis of uprobe_tracer
@+OFFSET : Fetch memory at OFFSET (OFFSET from same file as PATH) @+OFFSET : Fetch memory at OFFSET (OFFSET from same file as PATH)
$stackN : Fetch Nth entry of stack (N >= 0) $stackN : Fetch Nth entry of stack (N >= 0)
$stack : Fetch stack address. $stack : Fetch stack address.
$retval : Fetch return value.(*) $retval : Fetch return value.(\*1)
$comm : Fetch current task comm. $comm : Fetch current task comm.
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**) +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*2)(\*3)
NAME=FETCHARG : Set NAME as the argument name of FETCHARG. NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
(x8/x16/x32/x64), "string" and bitfield are supported. (x8/x16/x32/x64), "string" and bitfield are supported.
(*) only for return probe. (\*1) only for return probe.
(**) this is useful for fetching a field of data structures. (\*2) this is useful for fetching a field of data structures.
(\*3) Unlike kprobe event, "u" prefix will just be ignored, becuse uprobe
events can access only user-space memory.
Types Types
----- -----
......
...@@ -4842,10 +4842,11 @@ static const char readme_msg[] = ...@@ -4842,10 +4842,11 @@ static const char readme_msg[] =
"\t args: <name>=fetcharg[:type]\n" "\t args: <name>=fetcharg[:type]\n"
"\t fetcharg: %<register>, @<address>, @<symbol>[+|-<offset>],\n" "\t fetcharg: %<register>, @<address>, @<symbol>[+|-<offset>],\n"
#ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
"\t $stack<index>, $stack, $retval, $comm, $arg<N>\n" "\t $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
#else #else
"\t $stack<index>, $stack, $retval, $comm\n" "\t $stack<index>, $stack, $retval, $comm,\n"
#endif #endif
"\t +|-[u]<offset>(<fetcharg>)\n"
"\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, string, symbol,\n" "\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, string, symbol,\n"
"\t b<bit-width>@<bit-offset>/<container-size>, ustring,\n" "\t b<bit-width>@<bit-offset>/<container-size>, ustring,\n"
"\t <type>\\[<array-size>\\]\n" "\t <type>\\[<array-size>\\]\n"
......
...@@ -952,6 +952,12 @@ probe_mem_read(void *dest, void *src, size_t size) ...@@ -952,6 +952,12 @@ probe_mem_read(void *dest, void *src, size_t size)
return probe_kernel_read(dest, src, size); return probe_kernel_read(dest, src, size);
} }
static nokprobe_inline int
probe_mem_read_user(void *dest, void *src, size_t size)
{
return probe_user_read(dest, src, size);
}
/* Note that we don't verify it, since the code does not come from user space */ /* Note that we don't verify it, since the code does not come from user space */
static int static int
process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest, process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
......
...@@ -324,6 +324,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type, ...@@ -324,6 +324,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
{ {
struct fetch_insn *code = *pcode; struct fetch_insn *code = *pcode;
unsigned long param; unsigned long param;
int deref = FETCH_OP_DEREF;
long offset = 0; long offset = 0;
char *tmp; char *tmp;
int ret = 0; int ret = 0;
...@@ -396,9 +397,14 @@ parse_probe_arg(char *arg, const struct fetch_type *type, ...@@ -396,9 +397,14 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
break; break;
case '+': /* deref memory */ case '+': /* deref memory */
arg++; /* Skip '+', because kstrtol() rejects it. */
/* fall through */
case '-': case '-':
if (arg[1] == 'u') {
deref = FETCH_OP_UDEREF;
arg[1] = arg[0];
arg++;
}
if (arg[0] == '+')
arg++; /* Skip '+', because kstrtol() rejects it. */
tmp = strchr(arg, '('); tmp = strchr(arg, '(');
if (!tmp) { if (!tmp) {
trace_probe_log_err(offs, DEREF_NEED_BRACE); trace_probe_log_err(offs, DEREF_NEED_BRACE);
...@@ -434,7 +440,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type, ...@@ -434,7 +440,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
} }
*pcode = code; *pcode = code;
code->op = FETCH_OP_DEREF; code->op = deref;
code->offset = offset; code->offset = offset;
} }
break; break;
...@@ -573,14 +579,15 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size, ...@@ -573,14 +579,15 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
/* Store operation */ /* Store operation */
if (!strcmp(parg->type->name, "string") || if (!strcmp(parg->type->name, "string") ||
!strcmp(parg->type->name, "ustring")) { !strcmp(parg->type->name, "ustring")) {
if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_IMM && if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_UDEREF &&
code->op != FETCH_OP_COMM) { code->op != FETCH_OP_IMM && code->op != FETCH_OP_COMM) {
trace_probe_log_err(offset + (t ? (t - arg) : 0), trace_probe_log_err(offset + (t ? (t - arg) : 0),
BAD_STRING); BAD_STRING);
ret = -EINVAL; ret = -EINVAL;
goto fail; goto fail;
} }
if (code->op != FETCH_OP_DEREF || parg->count) { if ((code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM) ||
parg->count) {
/* /*
* IMM and COMM is pointing actual address, those must * IMM and COMM is pointing actual address, those must
* be kept, and if parg->count != 0, this is an array * be kept, and if parg->count != 0, this is an array
...@@ -594,7 +601,8 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size, ...@@ -594,7 +601,8 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
} }
} }
/* If op == DEREF, replace it with STRING */ /* If op == DEREF, replace it with STRING */
if (!strcmp(parg->type->name, "ustring")) if (!strcmp(parg->type->name, "ustring") ||
code->op == FETCH_OP_UDEREF)
code->op = FETCH_OP_ST_USTRING; code->op = FETCH_OP_ST_USTRING;
else else
code->op = FETCH_OP_ST_STRING; code->op = FETCH_OP_ST_STRING;
...@@ -603,6 +611,9 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size, ...@@ -603,6 +611,9 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
} else if (code->op == FETCH_OP_DEREF) { } else if (code->op == FETCH_OP_DEREF) {
code->op = FETCH_OP_ST_MEM; code->op = FETCH_OP_ST_MEM;
code->size = parg->type->size; code->size = parg->type->size;
} else if (code->op == FETCH_OP_UDEREF) {
code->op = FETCH_OP_ST_UMEM;
code->size = parg->type->size;
} else { } else {
code++; code++;
if (code->op != FETCH_OP_NOP) { if (code->op != FETCH_OP_NOP) {
......
...@@ -92,9 +92,11 @@ enum fetch_op { ...@@ -92,9 +92,11 @@ enum fetch_op {
FETCH_OP_FOFFS, /* File offset: .immediate */ FETCH_OP_FOFFS, /* File offset: .immediate */
// Stage 2 (dereference) op // Stage 2 (dereference) op
FETCH_OP_DEREF, /* Dereference: .offset */ FETCH_OP_DEREF, /* Dereference: .offset */
FETCH_OP_UDEREF, /* User-space Dereference: .offset */
// Stage 3 (store) ops // Stage 3 (store) ops
FETCH_OP_ST_RAW, /* Raw: .size */ FETCH_OP_ST_RAW, /* Raw: .size */
FETCH_OP_ST_MEM, /* Mem: .offset, .size */ FETCH_OP_ST_MEM, /* Mem: .offset, .size */
FETCH_OP_ST_UMEM, /* Mem: .offset, .size */
FETCH_OP_ST_STRING, /* String: .offset, .size */ FETCH_OP_ST_STRING, /* String: .offset, .size */
FETCH_OP_ST_USTRING, /* User String: .offset, .size */ FETCH_OP_ST_USTRING, /* User String: .offset, .size */
// Stage 4 (modify) op // Stage 4 (modify) op
......
...@@ -64,6 +64,8 @@ static nokprobe_inline int ...@@ -64,6 +64,8 @@ static nokprobe_inline int
fetch_store_string_user(unsigned long addr, void *dest, void *base); fetch_store_string_user(unsigned long addr, void *dest, void *base);
static nokprobe_inline int static nokprobe_inline int
probe_mem_read(void *dest, void *src, size_t size); probe_mem_read(void *dest, void *src, size_t size);
static nokprobe_inline int
probe_mem_read_user(void *dest, void *src, size_t size);
/* From the 2nd stage, routine is same */ /* From the 2nd stage, routine is same */
static nokprobe_inline int static nokprobe_inline int
...@@ -77,14 +79,21 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val, ...@@ -77,14 +79,21 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val,
stage2: stage2:
/* 2nd stage: dereference memory if needed */ /* 2nd stage: dereference memory if needed */
while (code->op == FETCH_OP_DEREF) { do {
if (code->op == FETCH_OP_DEREF) {
lval = val; lval = val;
ret = probe_mem_read(&val, (void *)val + code->offset, ret = probe_mem_read(&val, (void *)val + code->offset,
sizeof(val)); sizeof(val));
} else if (code->op == FETCH_OP_UDEREF) {
lval = val;
ret = probe_mem_read_user(&val,
(void *)val + code->offset, sizeof(val));
} else
break;
if (ret) if (ret)
return ret; return ret;
code++; code++;
} } while (1);
s3 = code; s3 = code;
stage3: stage3:
...@@ -109,6 +118,9 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val, ...@@ -109,6 +118,9 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val,
case FETCH_OP_ST_MEM: case FETCH_OP_ST_MEM:
probe_mem_read(dest, (void *)val + code->offset, code->size); probe_mem_read(dest, (void *)val + code->offset, code->size);
break; break;
case FETCH_OP_ST_UMEM:
probe_mem_read_user(dest, (void *)val + code->offset, code->size);
break;
case FETCH_OP_ST_STRING: case FETCH_OP_ST_STRING:
loc = *(u32 *)dest; loc = *(u32 *)dest;
ret = fetch_store_string(val + code->offset, dest, base); ret = fetch_store_string(val + code->offset, dest, base);
......
...@@ -140,6 +140,13 @@ probe_mem_read(void *dest, void *src, size_t size) ...@@ -140,6 +140,13 @@ probe_mem_read(void *dest, void *src, size_t size)
return copy_from_user(dest, vaddr, size) ? -EFAULT : 0; return copy_from_user(dest, vaddr, size) ? -EFAULT : 0;
} }
static nokprobe_inline int
probe_mem_read_user(void *dest, void *src, size_t size)
{
return probe_mem_read(dest, src, size);
}
/* /*
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location. * length and relative data location.
......