    bpf: Using rcu_read_lock for bpf_sk_storage_map iterator · c69d2ddb
    Yonghong Song authored
    
    
    If a bucket contains a lot of sockets, then while bpf_iter is
    traversing that bucket, concurrent userspace bpf_map_update_elem()
    calls and bpf program bpf_sk_storage_{get,delete}() calls may see
    undesirable delays, since they have to compete with bpf_iter for
    the bucket lock.
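
    As a rough illustration (not the exact bpf_sk_storage.c code; the
    struct names, the element type and the visit() callback below are
    made up for the sketch), the pre-patch traversal pattern looks
    roughly like this, with the per-bucket lock held across the whole
    bucket walk:

        /* Sketch only: illustrative types and names.  The iterator used
         * to hold the per-bucket lock for the entire bucket, so
         * bpf_map_update_elem() and bpf_sk_storage_{get,delete}() on any
         * socket hashed to that bucket had to wait for it.
         */
        #include <linux/list.h>
        #include <linux/spinlock.h>

        struct elem {                           /* stands in for the storage element */
                struct hlist_node node;
        };

        struct bucket {                         /* stands in for the map bucket */
                struct hlist_head list;
                raw_spinlock_t lock;            /* taken by iterator and writers alike */
        };

        static void visit(struct elem *e);      /* stands in for the bpf_iter prog */

        static void iterate_bucket_locked(struct bucket *b)
        {
                struct elem *e;

                raw_spin_lock_bh(&b->lock);     /* blocks concurrent update/delete */
                hlist_for_each_entry(e, &b->list, node)
                        visit(e);
                raw_spin_unlock_bh(&b->lock);
        }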
    
    Note that the number of buckets for bpf_sk_storage_map
    is roughly the same as the number of cpus. So if there
    are lots of sockets in the system, each bucket could
    contain lots of sockets.
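
    A sketch of how the bucket count is derived, assuming it mirrors
    the map-allocation logic of this era (the helper name below is
    made up): the count scales with CPUs, not with sockets, so more
    sockets simply mean longer per-bucket chains.

        #include <linux/cpumask.h>
        #include <linux/kernel.h>
        #include <linux/log2.h>

        /* Sketch, assuming the allocation path rounds the CPU count up to a
         * power of two and uses at least 2 buckets.  The bucket count never
         * grows with the number of sockets stored in the map.
         */
        static u32 sk_storage_nbuckets(void)
        {
                u32 nbuckets = roundup_pow_of_two(num_possible_cpus());

                return max_t(u32, 2, nbuckets);
        }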
    
    Different use cases may see different delays. Here, using the
    bpf_iter selftest subtest bpf_sk_storage_map, I hacked the kernel
    with ktime_get_mono_fast_ns() to measure how long a bucket stays
    locked while the bpf_iter prog traverses it (a sketch of that
    instrumentation follows the table below). This gives the maximum
    incurred delay as a function of the number of elements in the
    bucket.
        # elems in each bucket          delay(ns)
          64                            17000
          256                           72512
          2048                          875246
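
    The instrumentation itself is not part of this commit; a hedged
    sketch of the idea, reusing iterate_bucket_locked() from the sketch
    above, is simply a pair of ktime_get_mono_fast_ns() reads around
    the locked section:

        #include <linux/printk.h>
        #include <linux/timekeeping.h>
        #include <linux/types.h>

        /* Sketch of the measurement hack: timestamp around the bucket-locked
         * region and report how long the lock was held.
         */
        static void iterate_bucket_timed(struct bucket *b)
        {
                u64 start, delta;

                start = ktime_get_mono_fast_ns();
                iterate_bucket_locked(b);       /* see the sketch above */
                delta = ktime_get_mono_fast_ns() - start;

                pr_info("bucket held for %llu ns\n", delta);
        }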
    
    The potential delays grow further as a bucket accumulates even
    more elements. Using rcu_read_lock() is a reasonable compromise
    here. It may lose some precision, e.g., the iterator can access
    stale sockets, but it will not hurt the performance of bpf
    programs or user space applications that also try to get, update
    or delete map elements.
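
    The post-patch traversal pattern, again as a simplified sketch and
    not the literal diff (the real change is in the bpf_iter paths for
    bpf_sk_storage_map), walks the bucket under RCU instead of the
    bucket lock:

        #include <linux/rcupdate.h>
        #include <linux/rculist.h>

        /* Sketch, reusing the struct bucket/elem/visit() names from above:
         * writers are no longer blocked by the iterator, and the iterator
         * may observe elements that are concurrently being removed (the
         * "stale sockets" mentioned above); that is the accepted trade-off.
         */
        static void iterate_bucket_rcu(struct bucket *b)
        {
                struct elem *e;

                rcu_read_lock();
                hlist_for_each_entry_rcu(e, &b->list, node)
                        visit(e);
                rcu_read_unlock();
        }
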
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Cc: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20200916224645.720172-1-yhs@fb.com
bpf_sk_storage.c 21.4 KB