    bpf: Fix syscall's stackmap lookup potential deadlock · ae26a710
    Martin KaFai Lau authored
    [ Upstream commit 7c4cd051 ]
    
    The map_lookup_elem used to avoid acquiring a spinlock
    in order to optimize the reader.
    
    That held until commit 557c0c6e ("bpf: convert stackmap to pre-allocation").
    The syscall's map_lookup_elem(stackmap) calls bpf_stackmap_copy().
    bpf_stackmap_copy() may find the elem no longer needed after the copy is done.
    If that is the case, pcpu_freelist_push() saves this elem for reuse later.
    This push requires a spinlock.
    
    If a tracing bpf_prog runs in the middle of the syscall's
    map_lookup_elem(stackmap) and calls bpf_get_stackid(stackmap),
    which also requires the same pcpu_freelist's spinlock, it may
    end up in a deadlock, as reported by Eric Dumazet in
    https://patchwork.ozlabs.org/patch/1030266/
    
    The situation is the same as the syscall's map_update_elem() which
    needs to acquire the pcpu_freelist's spinlock and could race
    with tracing bpf_prog.  Hence, this patch fixes it by protecting
    bpf_stackmap_copy() with this_cpu_inc(bpf_prog_active)
    to prevent tracing bpf_prog from running.
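
    The guard described above can be sketched as a small user-space
    model. This is only an illustration under stated assumptions, not
    the kernel code: struct cpu_model and every model_* function are
    hypothetical names, a plain int stands in for this CPU's
    bpf_prog_active counter, and a bool stands in for the
    pcpu_freelist spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; none of these names are kernel APIs. */
struct cpu_model {
    int  prog_active;     /* stands in for this CPU's bpf_prog_active */
    bool freelist_locked; /* stands in for the pcpu_freelist spinlock */
    bool deadlocked;      /* set if the lock is requested while held  */
};

/* Take the modeled lock. A real spinlock would spin forever if the
 * same CPU already held it; the model only records that fact. */
static void model_lock(struct cpu_model *cpu)
{
    if (cpu->freelist_locked)
        cpu->deadlocked = true;
    cpu->freelist_locked = true;
}

static void model_unlock(struct cpu_model *cpu)
{
    cpu->freelist_locked = false;
}

/* Models a tracing prog that calls bpf_get_stackid() and so needs the
 * lock. It is skipped when the counter shows another user is active.
 * Returns true if the prog body actually ran. */
static bool model_tracing_prog(struct cpu_model *cpu)
{
    bool ran = false;

    if (++cpu->prog_active == 1) {
        model_lock(cpu);
        model_unlock(cpu);
        ran = true;
    }
    cpu->prog_active--;
    return ran;
}

/* Models the syscall lookup path with a trace event firing while the
 * lock is held. 'guarded' selects whether the counter is raised first.
 * Returns true if the nested tracing prog ran. */
static bool model_syscall_lookup(struct cpu_model *cpu, bool guarded)
{
    bool prog_ran;

    if (guarded)
        cpu->prog_active++;             /* this_cpu_inc(bpf_prog_active) */

    model_lock(cpu);                    /* pcpu_freelist_push() locks    */
    prog_ran = model_tracing_prog(cpu); /* event fires mid-lookup        */
    model_unlock(cpu);

    if (guarded) {
        assert(cpu->prog_active > 0);
        cpu->prog_active--;             /* this_cpu_dec(bpf_prog_active) */
    }
    return prog_ran;
}
```

    In this model, calling model_syscall_lookup() with guarded set to
    false lets the nested prog run and marks the CPU as deadlocked,
    while guarded set to true causes the nested prog to be skipped and
    leaves the flag clear.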
    
    A later commit, f1a2e44a ("bpf: add queue and stack maps"), adds a
    syscall map_lookup_elem that also acquires a spinlock and races
    with tracing bpf_prog similarly.
    Hence, this patch is forward looking and protects the majority
    of the map lookups.  bpf_map_offload_lookup_elem() is the exception
    since it is for network bpf_prog only (i.e. never called by tracing
    bpf_prog).
    
    Fixes: 557c0c6e ("bpf: convert stackmap to pre-allocation")
    Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
syscall.c 55.9 KB