• Jesper Dangaard Brouer's avatar
    Revert "net: use lib/percpu_counter API for fragmentation mem accounting" · 021c60ff
    Jesper Dangaard Brouer authored
    
    [ Upstream commit fb452a1a ]
    
    This reverts commit 6d7b857d.
    
    There is a bug in fragmentation codes use of the percpu_counter API,
    that can cause issues on systems with many CPUs.
    
    The frag_mem_limit() just reads the global counter (fbc->count),
    without considering other CPUs can have upto batch size (130K) that
    haven't been subtracted yet.  Due to the 3MBytes lower thresh limit,
    this become dangerous at >=24 CPUs (3*1024*1024/130000=24).
    
    The correct API usage would be to use __percpu_counter_compare() which
    does the right thing, and takes into account the number of (online)
    CPUs and batch size, to account for this and call __percpu_counter_sum()
    when needed.
    
    We choose to revert the use of the lib/percpu_counter API for frag
    memory accounting for several reasons:
    
    1) On systems with CPUs > 24, the heavier fully locked
       __percpu_counter_sum() is always invoked, which will be more
       expensive than the atomic_t that is reverted to.
    
    Given systems with more than 24 CPUs are becoming common this doesn't
    seem like a good option.  To mitigate this, the batch size could be
    decreased and thresh be increased.
    
    2) The add_frag_mem_limit+sub_frag_mem_limit pairs happen on the RX
       CPU, before SKBs are pushed into sockets on remote CPUs.  Given
       NICs can only hash on L2 part of the IP-header, the NIC-RXq's will
       likely be limited.  Thus, a fair chance that atomic add+dec happen
       on the same CPU.
    
    Revert note that commit 1d6119ba ("net: fix percpu memory leaks")
    removed init_frag_mem_limit() and instead use inet_frags_init_net().
    After this revert, inet_frags_uninit_net() becomes empty.
    
    Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting")
    Fixes: 1d6119ba ("net: fix percpu memory leaks")
    Signed-off-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
    Acked-by: default avatarFlorian Westphal <fw@strlen.de>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    021c60ff
inet_fragment.c 10.3 KB