• Eric Dumazet's avatar
    net: sched: do not acquire qdisc spinlock in qdisc/class stats dump · edb09eb1
    Eric Dumazet authored
    Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by Google BwE host
    agent [1] are problematic at scale :
    
    For each qdisc/class found in the dump, we currently lock the root qdisc
    spinlock in order to get stats. Sampling stats every 5 seconds from
    thousands of HTB classes is a challenge when the root qdisc spinlock is
    under high pressure. Not only the dumps take time, they also slow
    down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases.
    
    An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
    that might need the qdisc lock in fq_codel_dump_stats() and
    fq_codel_dump_class_stats()
    
    In v2 of this patch, I now use the Qdisc running seqcount to provide
    consistent reads of packets/bytes counters, regardless of 32/64 bit arches.
    
    I also changed rate estimators to use the same infrastructure
    so that they no longer need to lock root qdisc lock.
    
    [1]
    http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdfSigned-off-by: default avatarEric Dumazet <edumazet@google.com>
    Cc: Cong Wang <xiyou.wangcong@gmail.com>
    Cc: Jamal Hadi Salim <jhs@mojatatu.com>
    Cc: John Fastabend <john.fastabend@gmail.com>
    Cc: Kevin Athey <kda@google.com>
    Cc: Xiaotian Pei <xiaotian@google.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    edb09eb1
sch_prio.c 8.42 KB