1. 01 Oct, 2010 27 commits
  2. 30 Sep, 2010 7 commits
  3. 29 Sep, 2010 6 commits
    • Eric Dumazet's avatar
      net: rename netdev rx_queue to ingress_queue · bfa5ae63
      Eric Dumazet authored
      There is some confusion with rx_queue name after RPS, and net drivers
      private rx_queue fields.
      
      I suggest to rename "struct net_device"->rx_queue to ingress_queue.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bfa5ae63
    • Eric Dumazet's avatar
      ip6tnl: percpu stats accounting · 8560f226
      Eric Dumazet authored
      Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets.
      
      Other seldom used fields are kept in netdev->stats structure, possibly
      unsafe.
      
      This is a preliminary work to support lockless transmit path, and
      correct RX stats, that are already unsafe.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8560f226
    • Eric Dumazet's avatar
      ipip: enable lockless xmits · 153f0943
      Eric Dumazet authored
      IPIP tunnels can benefit from lockless xmits, using NETIF_F_LLTX
      
      Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending
      10000000 UDP frames via one ipip tunnel (size:200 bytes per frame)
      
      Before patch :
      real	2m53.321s
      user	0m10.277s
      sys	46m0.597s
      
      After patch:
      real	0m32.063s
      user	0m9.237s
      sys	8m16.255s
      
      Last problem to solve is the contention on dst :
      
      16118.00 28.3% __ip_route_output_key         vmlinux
       6135.00 10.8% dst_release                   vmlinux
       3220.00  5.6% ip_finish_output              vmlinux
       2149.00  3.8% ip_route_output_flow          vmlinux
       1575.00  2.8% ip_append_data                vmlinux
       1481.00  2.6% ip_push_pending_frames        vmlinux
       1349.00  2.4% __xfrm_lookup                 vmlinux
       1216.00  2.1% csum_partial_copy_generic     vmlinux
       1208.00  2.1% udp_sendmsg                   vmlinux
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      153f0943
    • Eric Dumazet's avatar
      ip_gre: lockless xmit · b790e01a
      Eric Dumazet authored
      GRE tunnels can benefit from lockless xmits, using NETIF_F_LLTX
      
      Note: If tunnels are created with the "oseq" option, LLTX is not
      enabled :
      
      Even using an atomic_t o_seq, we would increase chance for packets being
      out of order at receiver.
      
      Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending
      10000000 UDP frames via one gre tunnel (size:200 bytes per frame)
      
      Before patch :
      real	3m0.094s
      user	0m9.365s
      sys	47m50.103s
      
      After patch:
      real	0m29.756s
      user	0m11.097s
      sys	7m33.012s
      
      Last problem to solve is the contention on dst :
      
      38660.00 21.4% __ip_route_output_key          vmlinux
      20786.00 11.5% dst_release                    vmlinux
      14191.00  7.8% __xfrm_lookup                  vmlinux
      12410.00  6.9% ip_finish_output               vmlinux
       4540.00  2.5% ip_push_pending_frames         vmlinux
       4427.00  2.4% ip_append_data                 vmlinux
       4265.00  2.4% __alloc_skb                    vmlinux
       4140.00  2.3% __ip_local_out                 vmlinux
       3991.00  2.2% dev_queue_xmit                 vmlinux
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b790e01a
    • Eric Dumazet's avatar
      sit: enable lockless xmits · 8df40d10
      Eric Dumazet authored
      SIT tunnels can benefit from lockless xmits, using NETIF_F_LLTX
      
      Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending
      10000000 UDP frames via one sit tunnel (size:220 bytes per frame)
      
      Before patch :
      
      real	3m15.399s
      user	0m9.185s
      sys	51m55.403s
      
      75029.00 87.5% _raw_spin_lock            vmlinux
       1090.00  1.3% dst_release               vmlinux
        902.00  1.1% dev_queue_xmit            vmlinux
        627.00  0.7% sock_wfree                vmlinux
        613.00  0.7% ip6_push_pending_frames   ipv6.ko
        505.00  0.6% __ip_route_output_key     vmlinux
      
      After patch:
      
      real	1m1.387s
      user	0m12.489s
      sys	15m58.868s
      
      28239.00 23.3% dst_release               vmlinux
      13570.00 11.2% ip6_push_pending_frames   ipv6.ko
      13118.00 10.8% ip6_append_data           ipv6.ko
       7995.00  6.6% __ip_route_output_key     vmlinux
       7924.00  6.5% sk_dst_check              vmlinux
       5015.00  4.1% udpv6_sendmsg             ipv6.ko
       3594.00  3.0% sock_alloc_send_pskb      vmlinux
       3135.00  2.6% sock_wfree                vmlinux
       3055.00  2.5% ip6_sk_dst_lookup         ipv6.ko
       2473.00  2.0% ip_finish_output          vmlinux
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8df40d10
    • Eric Dumazet's avatar
      sit: fix percpu stats accounting · dd4080ee
      Eric Dumazet authored
      commit 15fc1f70 (sit: percpu stats accounting) forgot the fallback
      tunnel case (sit0), and can crash pretty fast.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dd4080ee