1. 07 Aug, 2009 4 commits
    • net: Avoid enqueuing skb for default qdiscs · bbd8a0d3
      Krishna Kumar authored
      dev_queue_xmit enqueues an skb and calls qdisc_run, which
      dequeues the skb and transmits it. In most cases, the skb that
      is enqueued is the same one that is dequeued (unless the
      queue gets stopped, or multiple CPUs write to the same queue
      and race with qdisc_run). For default qdiscs, we can remove
      the redundant enqueue/dequeue and simply transmit the skb,
      since the default qdisc is work-conserving.
      
      The patch uses a new flag, TCQ_F_CAN_BYPASS, to identify the
      default fast queue. The controversial part of the patch is
      incrementing qlen when an skb is requeued; this avoids
      checks like the second line below:
      
      +  } else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
      >>         !q->gso_skb &&
      +          !test_and_set_bit(__QDISC_STATE_RUNNING, &q->state)) {
      
      Results of 2 hours of testing with multiple netperf sessions (1,
      2, 4, 8, 12 sessions on a 4-CPU System-X). The BW numbers are
      aggregate Mb/s across iterations, tested with this version on
      System-X boxes with Chelsio 10 Gbps cards:
      
      ----------------------------------
      Size |  ORG BW          NEW BW   |
      ----------------------------------
      128K |  156964          159381   |
      256K |  158650          162042   |
      ----------------------------------
      
      Changes from ver1:
      
      1. Move sch_direct_xmit declaration from sch_generic.h to
         pkt_sched.h
      2. Update qdisc basic statistics for direct xmit path.
      3. Set qlen to zero in qdisc_reset.
      4. Changed some function names to more meaningful ones.
      Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
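The bypass condition described in the commit above can be modelled in user space. The following is only a minimal sketch, not the kernel code: `toy_qdisc`, `TOY_F_CAN_BYPASS`, `toy_test_and_set` and `toy_try_bypass` are hypothetical names, the flag value is made up, and the test-and-set helper is not atomic. It assumes, as the commit message states, that requeued skbs are counted in qlen, so no separate gso_skb check is needed.

```c
#include <stdbool.h>

/* Hypothetical value; stands in for TCQ_F_CAN_BYPASS. */
#define TOY_F_CAN_BYPASS 0x4u

/* Toy model of the qdisc fields the bypass decision looks at. */
struct toy_qdisc {
    unsigned int flags;
    unsigned int qlen;   /* queued packets, requeued ones included */
    bool running;        /* stands in for __QDISC_STATE_RUNNING */
};

/* Sets the flag and returns its previous value.
 * Unlike the kernel's test_and_set_bit, this is NOT atomic. */
bool toy_test_and_set(bool *flag)
{
    bool old = *flag;
    *flag = true;
    return old;
}

/* Returns true when the packet may be transmitted directly:
 * the qdisc allows bypass, nothing is queued, and no other
 * context is already running the qdisc. The running flag is
 * only claimed when the first two conditions hold. */
bool toy_try_bypass(struct toy_qdisc *q)
{
    return (q->flags & TOY_F_CAN_BYPASS) &&
           q->qlen == 0 &&
           !toy_test_and_set(&q->running);
}
```

Because a requeued packet raises qlen in this model, the `q->qlen == 0` test alone rejects the bypass while a requeued packet is pending.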
    • mlx4_en: Not using Shared Receive Queues · 9f519f68
      Yevgeny Petrilin authored
      We use a 1:1 mapping between QPs and SRQs on the receive side,
      so an additional level of indirection is not required. The receive
      buffers are now allocated for the RSS QPs.
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4_en: Using real number of rings as RSS map size · b6b912e0
      Yevgeny Petrilin authored
      There is no point in using more QPs than the actual number of receive rings.
      If the RSS function for two streams gives the same result modulo the number
      of rings, they will arrive at the same RX ring anyway.
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
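The modulo argument in the commit above can be illustrated with a small sketch. It assumes the ring is selected as the RSS hash modulo the ring count, which is how the commit message describes the effect; `toy_rx_ring` is a hypothetical helper and not part of the mlx4_en driver.

```c
#include <stdint.h>

/* Assumption: ring selection is hash modulo ring count, as the
 * commit message describes. num_rings must be non-zero. */
unsigned int toy_rx_ring(uint32_t rss_hash, unsigned int num_rings)
{
    return rss_hash % num_rings;
}
```

With 4 rings, hash values 5 and 9 both select ring 1, so giving each of them its own QP would not spread the two streams any further.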
    • mlx4_en: Adaptive moderation policy change · a35ee541
      Yevgeny Petrilin authored
      If the net device is identified as a "sender" (the number of sent packets
      is higher than the number of received packets and the incoming packets are
      small), set the moderation time to its low limit.
      We do this because the incoming packets are ACKs, and we don't want to delay them.
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
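The "sender" rule from the commit above might be sketched as follows. This is an illustration under stated assumptions, not the driver's code: the struct, the function names and the 128-byte "small packet" threshold are all hypothetical, and treating a sample with no received packets as "not a sender" is a choice made here for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold for what counts as a "small" packet. */
#define TOY_SMALL_PKT_BYTES 128u

/* Packet counters gathered over one sampling interval. */
struct toy_sample {
    uint64_t tx_packets;
    uint64_t rx_packets;
    uint64_t rx_bytes;
};

/* A device is treated as a "sender" when it transmits more packets
 * than it receives and the received packets are small on average
 * (likely ACKs). */
bool toy_is_sender(const struct toy_sample *s)
{
    if (s->rx_packets == 0)
        return false; /* nothing incoming to classify */
    uint64_t avg_rx_size = s->rx_bytes / s->rx_packets;
    return s->tx_packets > s->rx_packets &&
           avg_rx_size < TOY_SMALL_PKT_BYTES;
}

/* Senders get the low moderation limit so ACKs are not delayed;
 * everything else keeps the currently selected moderation time. */
unsigned int toy_pick_moder_time(const struct toy_sample *s,
                                 unsigned int current_usecs,
                                 unsigned int low_usecs)
{
    return toy_is_sender(s) ? low_usecs : current_usecs;
}
```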
  2. 06 Aug, 2009 9 commits
  3. 05 Aug, 2009 22 commits
  4. 04 Aug, 2009 5 commits