    net: introduce DST_NOCACHE flag · c7d4426a
    Eric Dumazet authored
    While doing stress tests with the IP route cache disabled and multiqueue
    devices, I noticed very high contention on one rwlock used in the
    neighbour code.
    
    When many CPUs are trying to send frames (possibly using a high
    performance multiqueue device) to the same neighbour, they fight for the
    neigh->lock rwlock in order to call neigh_hh_init(), and fight on
    hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test()).
    
    But we don't need to call neigh_hh_init() for dst entries that are used
    only once. It costs at least four atomic operations, on two contended
    cache lines, plus the high contention on the neigh->lock rwlock.
    
    Introduce a new dst flag, DST_NOCACHE, that is set when a dst was not
    inserted in the route cache.
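
    The effect of the flag can be illustrated with a small userspace
    sketch. This is not the kernel patch itself: the struct, the flag
    value, and every sketch_* name below are hypothetical stand-ins,
    and the counter only models how often the expensive header-cache
    setup would be reached.

```c
#include <stdbool.h>

/* Hypothetical flag value, chosen for this sketch only. */
#define SKETCH_DST_NOCACHE 0x0010u

/* Minimal stand-in for a dst entry; the real structure has many more fields. */
struct sketch_dst {
    unsigned int flags;
};

/* Counts how often the costly init path is taken. */
static int hh_init_calls;

/* Stand-in for neigh_hh_init(): in the kernel this takes neigh->lock
 * and touches hh->hh_refcnt; here it only bumps a counter. */
static void sketch_neigh_hh_init(void)
{
    hh_init_calls++;
}

/* A dst that was never inserted in the route cache is used only once,
 * so setting up a cached hardware header for it is wasted work. */
static bool sketch_should_init_hh(const struct sketch_dst *dst)
{
    return !(dst->flags & SKETCH_DST_NOCACHE);
}

/* Output path: skip the init entirely for uncached dst entries. */
static void sketch_output(const struct sketch_dst *dst)
{
    if (sketch_should_init_hh(dst))
        sketch_neigh_hh_init();
    /* ... frame transmission would follow here ... */
}
```

    In this sketch, calling sketch_output() on a dst whose flags include
    SKETCH_DST_NOCACHE leaves hh_init_calls unchanged, while a dst
    without the flag still goes through the init path.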
    
    With the stress test bench, sending 160000000 frames on one neighbour,
    results are:
    
    Before patch:
    
    real	2m28.406s
    user	0m11.781s
    sys	36m17.964s
    
    
    After patch:
    
    real	1m26.532s
    user	0m12.185s
    sys	20m3.903s
    
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
neighbour.c 65.5 KB