Commit c8c8b127 authored by Eric Dumazet, committed by David S. Miller

udp: under rx pressure, try to condense skbs

Under UDP flood, many softirq producers try to add packets to the
UDP receive queue, while a single user thread burns one CPU trying
to dequeue packets as fast as possible.

Two parts of the per-packet cost are:
- copying payload from kernel space to user space,
- freeing memory pieces associated with skb.

If the socket is under pressure, the softirq handler(s) can try to pull
the packet payload into skb->head, if it fits.

This means the softirq handler(s) can free/reuse the page fragment
immediately, instead of letting udp_recvmsg() do this hundreds of usec
later, possibly from another node.

Additional gains:
- We reduce skb->truesize and thus can store more packets per SO_RCVBUF.
- We avoid cache line misses at copyout() time and consume_skb() time,
  and avoid one put_page() with potential alien freeing on NUMA hosts.

This comes at the cost of a copy, bounded by the available tail room, which
is usually small. (We might have to fix GRO_MAX_HEAD, which looks bigger
than necessary.)

This patch gave me about a 5% increase in throughput in my tests.

The skb_condense() helper could probably be used in other contexts.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 2408022e
@@ -1966,6 +1966,8 @@ static inline int pskb_may_pull(struct sk_buff *skb, unsigned int len)
 	return __pskb_pull_tail(skb, len - skb_headlen(skb)) != NULL;
 }
 
+void skb_condense(struct sk_buff *skb);
+
 /**
  *	skb_headroom - bytes at buffer head
  *	@skb: buffer to check
......
@@ -4931,3 +4931,31 @@ struct sk_buff *pskb_extract(struct sk_buff *skb, int off,
 	return clone;
 }
 EXPORT_SYMBOL(pskb_extract);
+
+/**
+ * skb_condense - try to get rid of fragments/frag_list if possible
+ * @skb: buffer
+ *
+ * Can be used to save memory before skb is added to a busy queue.
+ * If packet has bytes in frags and enough tail room in skb->head,
+ * pull all of them, so that we can free the frags right now and adjust
+ * truesize.
+ * Notes:
+ *	We do not reallocate skb->head thus can not fail.
+ *	Caller must re-evaluate skb->truesize if needed.
+ */
+void skb_condense(struct sk_buff *skb)
+{
+	if (!skb->data_len ||
+	    skb->data_len > skb->end - skb->tail ||
+	    skb_cloned(skb))
+		return;
+
+	/* Nice, we can free page frag(s) right now */
+	__pskb_pull_tail(skb, skb->data_len);
+
+	/* Now adjust skb->truesize, since __pskb_pull_tail() does
+	 * not do this.
+	 */
+	skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
+}
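
The three early-return checks and the truesize adjustment in skb_condense() can be tried out in user space. The following is only a minimal sketch under stated assumptions: `struct toy_skb`, `toy_skb_init()`, `toy_skb_condense()` and the `TOY_*` sizes are hypothetical names and values invented for illustration, a single heap buffer stands in for the page fragment(s), and a plain memcpy() stands in for __pskb_pull_tail(). It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All sizes below are illustrative assumptions, not kernel values. */
#define TOY_SKB_OVERHEAD 512u	/* stand-in for sk_buff + shared info */
#define TOY_HEAD_SIZE    256u	/* capacity of the linear area */
#define TOY_FRAG_COST    2048u	/* memory charged for one page fragment */

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct toy_skb {
	unsigned char head[TOY_HEAD_SIZE];	/* linear area, like skb->head */
	size_t tail;				/* bytes used in head */
	unsigned char *frag;			/* one heap "fragment" or NULL */
	size_t data_len;			/* bytes stored in frag */
	size_t truesize;			/* memory charged to the socket */
	bool cloned;				/* shared data must not be touched */
};

/* Build a packet with hdr_len bytes in head and payload_len bytes in a fragment. */
static int toy_skb_init(struct toy_skb *skb, size_t hdr_len, size_t payload_len)
{
	if (hdr_len > TOY_HEAD_SIZE)
		return -1;
	memset(skb, 0, sizeof(*skb));
	skb->tail = hdr_len;
	skb->truesize = TOY_SKB_OVERHEAD + TOY_HEAD_SIZE;
	if (payload_len) {
		skb->frag = calloc(1, payload_len);
		if (!skb->frag)
			return -1;
		skb->data_len = payload_len;
		skb->truesize += TOY_FRAG_COST;
	}
	return 0;
}

/* Same three checks as skb_condense(); returns true if the fragment was freed. */
static bool toy_skb_condense(struct toy_skb *skb)
{
	if (!skb->data_len ||				/* already linear */
	    skb->data_len > TOY_HEAD_SIZE - skb->tail ||	/* does not fit in tail room */
	    skb->cloned)				/* data is shared */
		return false;

	assert(skb->frag != NULL);	/* data_len != 0 implies a fragment */

	/* Copy the payload into the tail room, then release the fragment. */
	memcpy(skb->head + skb->tail, skb->frag, skb->data_len);
	skb->tail += skb->data_len;
	free(skb->frag);
	skb->frag = NULL;
	skb->data_len = 0;

	/* Only the fixed overhead and the linear area remain charged. */
	skb->truesize = TOY_SKB_OVERHEAD + TOY_HEAD_SIZE;
	return true;
}
```

With these assumed sizes, a packet with 64 bytes in the head and a 100-byte fragment is condensed and its truesize falls from 2816 to 768. A 200-byte fragment is left alone, because only 192 bytes of tail room remain.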
@@ -1199,7 +1199,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct sk_buff_head *list = &sk->sk_receive_queue;
 	int rmem, delta, amt, err = -ENOMEM;
-	int size = skb->truesize;
+	int size;
 
 	/* try to avoid the costly atomic add/sub pair when the receive
 	 * queue is full; always allow at least a packet
@@ -1208,6 +1208,16 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 	if (rmem > sk->sk_rcvbuf)
 		goto drop;
 
+	/* Under mem pressure, it might be helpful to help udp_recvmsg()
+	 * having linear skbs :
+	 * - Reduce memory overhead and thus increase receive queue capacity
+	 * - Less cache line misses at copyout() time
+	 * - Less work at consume_skb() (less alien page frag freeing)
+	 */
+	if (rmem > (sk->sk_rcvbuf >> 1))
+		skb_condense(skb);
+
+	size = skb->truesize;
 	/* we drop only if the receive buf is full and the receive
 	 * queue contains some other skb
 	 */
......
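
To see why condensing above the half-full mark increases receive queue capacity, the accounting can be modelled in user space. This is a rough sketch, not the kernel logic: `should_condense()` and `packets_until_full()` are hypothetical helpers, the truesize values passed in are made-up inputs, every packet is assumed to be condensable, and no reader drains the queue. Only the `rmem > rcvbuf` drop test and the `rmem > (rcvbuf >> 1)` threshold are taken from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same threshold as the test added to __udp_enqueue_schedule_skb(). */
static bool should_condense(size_t rmem, size_t rcvbuf)
{
	return rmem > (rcvbuf >> 1);
}

/*
 * Count how many packets are accepted before the first drop, with no
 * reader draining the queue. Each packet is charged full_truesize while
 * the queue is at most half full, and condensed_truesize afterwards.
 */
static size_t packets_until_full(size_t rcvbuf, size_t full_truesize,
				 size_t condensed_truesize)
{
	size_t rmem = 0, count = 0;

	/* A zero-sized charge would never fill the buffer. */
	if (!full_truesize || !condensed_truesize)
		return 0;

	for (;;) {
		size_t size;

		if (rmem > rcvbuf)	/* mirrors: if (rmem > sk->sk_rcvbuf) goto drop; */
			break;

		size = should_condense(rmem, rcvbuf) ? condensed_truesize
						     : full_truesize;
		rmem += size;		/* charge the packet to the socket */
		count++;
	}
	return count;
}
```

For example, with an assumed 10000-byte receive buffer and packets charged 2816 bytes uncondensed and 768 bytes condensed, the model accepts 8 packets, against 4 when no condensing takes place (passing 2816 for both sizes).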