    bnx2x: uses build_skb() in receive path · e52fcb24
    Eric Dumazet authored
    bnx2x uses the following formula to compute its rx_buf_sz:
    
    dev->mtu + 2*L1_CACHE_BYTES + 14 + 8 + 8 + 2
    
    The core networking code then adds NET_SKB_PAD and
    SKB_DATA_ALIGN(sizeof(struct skb_shared_info)).
    
    The final allocated size for the skb head on x86_64 (L1_CACHE_BYTES = 64,
    MTU = 1500) is 2112 bytes, which SLUB/SLAB rounds up to 4096 bytes.
    
    Since skb truesize is then bigger than SK_MEM_QUANTUM, we have a lot of
    false sharing because of mem_reclaim in the UDP stack.
    
    One possible way to halve truesize is to reduce the need by 64 bytes
    (2112 -> 2048 bytes).
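
The size arithmetic above can be checked with a small userspace sketch. This is an illustration only, not driver code: the helper names `rx_buf_sz_for_mtu()` and `slab_round_up()` are hypothetical, L1_CACHE_BYTES is hard-coded to the x86_64 value quoted above, and the allocator is modeled as plain power-of-two rounding, which is a simplification of the real SLUB/SLAB size classes.

```c
/* Assumption: x86_64 cache line size, as quoted in the commit message. */
#define L1_CACHE_BYTES 64u

/*
 * Driver-side RX buffer size, using the formula from the commit message:
 * dev->mtu + 2*L1_CACHE_BYTES + 14 + 8 + 8 + 2
 * (hypothetical helper name, not a bnx2x function).
 */
static unsigned int rx_buf_sz_for_mtu(unsigned int mtu)
{
	return mtu + 2 * L1_CACHE_BYTES + 14 + 8 + 8 + 2;
}

/*
 * Simplified allocator model: round a request up to the next power of two.
 * Real kmalloc size classes are more detailed; this is only meant to show
 * why 2112 bytes lands in a 4096-byte object while 2048 bytes does not.
 */
static unsigned int slab_round_up(unsigned int size)
{
	unsigned int bucket = 1;

	while (bucket < size)
		bucket <<= 1;
	return bucket;
}
```

Under these assumptions, `rx_buf_sz_for_mtu(1500)` gives the driver's share of the buffer, `slab_round_up(2112)` lands in the 4096-byte class, and saving one cache line (`slab_round_up(2112 - L1_CACHE_BYTES)`) fits exactly in the 2048-byte class.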
    
    Instead of allocating a full cache line at the end of the packet for
    alignment, we can use the fact that skb_shared_info sits at the end of
    skb->head, and we can use this room if we convert bnx2x to the new
    build_skb() infrastructure.
    
    skb_shared_info is initialized only after the hardware has finished its
    transfer, so initializing it can safely overwrite the final padding.
    
    Using build_skb() also reduces cache line misses in the driver, since we
    use cache-hot skbs instead of cold ones. The number of in-flight sk_buff
    structures is lower, and they are recycled while still hot.
    
    Performance results:
    
    820,000 pps on a single-threaded RX UDP benchmark, instead of 720,000 pps.
    
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    CC: Eilon Greenstein <eilong@broadcom.com>
    CC: Ben Hutchings <bhutchings@solarflare.com>
    CC: Tom Herbert <therbert@google.com>
    CC: Jamal Hadi Salim <hadi@mojatatu.com>
    CC: Stephen Hemminger <shemminger@vyatta.com>
    CC: Thomas Graf <tgraf@infradead.org>
    CC: Herbert Xu <herbert@gondor.apana.org.au>
    CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    Acked-by: Eilon Greenstein <eilong@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>