generate proper usdt code to prevent llvm meddling with ctx->#fields
Qin reported a test case where llvm still interferes with ctx->#field accesses.
For code like below:

  switch (ctx->ip) {
  case 0x7fdf2ede9820ULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->r12; return 0;
  case 0x7fdf2edecd9cULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->bx; return 0;
  }
The compiler still generates:

  # r1 is the pointer to the ctx
  r1 += 24                # offset of ctx->r12
  goto LBB0_4
LBB0_3:
  r1 += 40                # offset of ctx->bx
LBB0_4:
  r3 = *(u64 *)(r1 + 0)   # single sunk load through a variable pointer
The verifier will reject the above code since the last load is not in "ctx + field_offset"
form: the ctx pointer is modified (r1 += 24 or r1 += 40) before the load, so the verifier
can no longer statically determine which context field is being accessed.
The responsible llvm optimization pass is CFGSimplifyPass, implemented mainly in
llvm/lib/Transforms/Utils/SimplifyCFG.cpp. The routine that performs this particular
optimization is SinkThenElseCodeToEnd, and the routine canSinkInstructions decides
whether an instruction is a candidate for sinking.
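To illustrate the transformation, here is a hypothetical C-level sketch (pick, pick_sunk
and cond are made-up names; the 24/40 offsets match ctx->r12 and ctx->bx in the
assembly above, per the x86-64 struct pt_regs layout):

  /* Before sinking: each branch loads through "ctx + constant offset",
     a form the verifier accepts, even with the volatile annotation. */
  static uint64_t pick(struct pt_regs *ctx, int cond)
  {
      if (cond)
          return *(volatile uint64_t *)&ctx->r12;   /* ctx + 24 */
      return *(volatile uint64_t *)&ctx->bx;        /* ctx + 40 */
  }

  /* After SinkThenElseCodeToEnd: the two loads are merged into a single
     load through a branch-selected pointer, i.e. the "modified ctx
     pointer" shape the verifier rejects. */
  static uint64_t pick_sunk(struct pt_regs *ctx, int cond)
  {
      volatile uint64_t *p = cond ? (volatile uint64_t *)&ctx->r12
                                  : (volatile uint64_t *)&ctx->bx;
      return *p;
  }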
Unfortunately, a volatile load/store is not among the conditions that prevent this
optimization. Inline assembly, however, is: canSinkInstructions refuses to sink
inline-assembly instructions, which blocks any further sinking.
In this patch, instead of using volatile to annotate the ctx->#field access, we emit a
normal ctx->#field access but place a compiler inline assembly memory barrier

  __asm__ __volatile__("" : : : "memory");

after the field access.
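With this change, each generated case has a shape along the following lines (a minimal
sketch of the intent, not the verbatim output of the usdt code generator):

  switch (ctx->ip) {
  case 0x7fdf2ede9820ULL:
          *((int64_t *)dest) = ctx->r12;
          /* Inline asm is never a sinking candidate, so the backward
             scan in SinkThenElseCodeToEnd stops at the barrier and the
             load above keeps its "ctx + constant offset" form. */
          __asm__ __volatile__("" : : : "memory");
          return 0;
  case 0x7fdf2edecd9cULL:
          *((int64_t *)dest) = ctx->bx;
          __asm__ __volatile__("" : : : "memory");
          return 0;
  }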
Tested with the usdt unit test case, the usdt_samples example, and a couple of usdt unit
tests developed in the past.
Signed-off-by: Yonghong Song <yhs@fb.com>