23 Apr, 2019 (6 commits)
    • net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow · c2273219
      Shay Agroskin authored
      Under high packet rate TX workloads that span multiple CPUs, much of
      the HCA's resources are spent on prefetching TX descriptors, which
      limits transmission rates.
      This patch mitigates the problem by moving some of the work to the
      CPU, reducing the HW data prefetch overhead for small packets (<= 256B).
      
      When forwarding packets with XDP, a packet smaller than a certain
      size (set to ~256 bytes) is sent inline within its WQE TX descriptor
      (mem-copied) whenever the hardware TX queue is congested beyond a
      pre-defined watermark.
      
      This better utilizes the HW resources (one fewer packet data
      prefetch per inlined packet) and allows better scalability, at the
      cost of CPU usage (which now memcpy's the packet into the WQE).
      
      To balance the load between HW and CPU and reach the maximum packet
      rate, we use watermarks to detect how congested the HW is, and move
      the workload back and forth between HW and CPU accordingly (see the
      sketch after this entry).
      
      Performance:
      Tested packet rate for UDP 64-byte multi-stream traffic
      over two dual-port ConnectX-5 100Gbps NICs.
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
      
      * Tested with hyper-threading disabled
      
      XDP_TX:
      
      |          | before | after   | change |
      | 24 rings | 51Mpps | 116Mpps | +126%  |
      | 1 ring   | 12Mpps | 12Mpps  | same   |
      
      XDP_REDIRECT:
      
      ** The figures below are the transmit rate, not the redirection
      rate, which may be larger and is not affected by this patch.
      
      |          | before  | after   | change |
      | 32 rings | 64Mpps  | 92Mpps  | +43%   |
      | 1 ring   | 6.4Mpps | 6.4Mpps | same   |
      
      As we can see, the feature significantly improves scaling without
      hurting single-ring performance.
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
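      A minimal user-space C sketch of the watermark-based inline decision
      described above. The constants, struct, and function names here are
      illustrative assumptions, not the driver's actual identifiers.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        /* Illustrative thresholds; the real driver derives its own values. */
        #define XDP_INLINE_MAX_LEN  256u  /* inline only packets <= ~256B */
        #define SQ_CONGESTION_WMARK 128u  /* pending descriptors that mark congestion */

        struct xdp_tx_queue {
            uint32_t outstanding;  /* descriptors posted but not yet completed */
        };

        /* Decide whether to memcpy the packet into the WQE (inline, CPU cost)
         * or to post a pointer and let the HW prefetch the data (DMA). */
        static bool should_inline(const struct xdp_tx_queue *sq, uint32_t pkt_len)
        {
            return pkt_len <= XDP_INLINE_MAX_LEN &&
                   sq->outstanding >= SQ_CONGESTION_WMARK;
        }

        int main(void)
        {
            struct xdp_tx_queue sq = { .outstanding = 200 };

            printf("64B packet, congested queue -> inline: %d\n", should_inline(&sq, 64));
            sq.outstanding = 10;
            printf("64B packet, idle queue      -> inline: %d\n", should_inline(&sq, 64));
            printf("1500B packet, idle queue    -> inline: %d\n", should_inline(&sq, 1500));
            return 0;
        }

      While the queue is below the watermark, packets stay on the regular
      DMA path; once it is congested, small packets are copied inline so
      the HW has one fewer data prefetch to perform.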
    • net/mlx5e: XDP, Add TX MPWQE session counter · 73cab880
      Shay Agroskin authored
      This counter tracks how many TX MPWQE sessions are started in the
      XDP SQ, in the XDP TX/REDIRECT flows. It is counted per channel and
      aggregated into the global stats (sketched after this entry).
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
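      A rough user-space sketch of keeping such a counter per channel and
      summing it into the global stat, as described above. The struct and
      field names are assumptions for illustration only.

        #include <stdint.h>
        #include <stdio.h>

        #define NUM_CHANNELS 4  /* illustrative channel count */

        /* Hypothetical per-channel XDP SQ stats. */
        struct xdpsq_stats {
            uint64_t mpwqe_sessions;  /* TX MPWQE sessions started */
        };

        static struct xdpsq_stats channel_stats[NUM_CHANNELS];

        /* Called whenever a new TX MPWQE session is opened on a channel. */
        static void xdpsq_session_start(int ch)
        {
            channel_stats[ch].mpwqe_sessions++;
        }

        /* The global stat is the sum over all channels. */
        static uint64_t global_mpwqe_sessions(void)
        {
            uint64_t sum = 0;
            for (int ch = 0; ch < NUM_CHANNELS; ch++)
                sum += channel_stats[ch].mpwqe_sessions;
            return sum;
        }

        int main(void)
        {
            xdpsq_session_start(0);
            xdpsq_session_start(0);
            xdpsq_session_start(2);
            printf("xdp mpwqe sessions (global): %llu\n",
                   (unsigned long long)global_mpwqe_sessions());
            return 0;
        }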
    • net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush · 15143bf5
      Tariq Toukan authored
      The XDP redirect flush indication belongs to the receive queue,
      not to its XDP send queue.
      
      To reflect this, move the indication to a new bit in rq->flags (see
      the sketch after this entry).
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
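      A simplified user-space sketch of carrying the flush indication on
      the RQ itself and acting on it at the end of the napi poll, as the
      commit above describes. The flag names and poll structure are
      illustrative assumptions, not the driver's code.

        #include <stdbool.h>
        #include <stdio.h>

        /* Bit indices into rq->flags (illustrative). */
        enum rq_flag {
            RQ_FLAG_XDP_XMIT,      /* XDP_TX descriptors were queued */
            RQ_FLAG_XDP_REDIRECT,  /* a packet was redirected from this RQ */
        };

        struct rq {
            unsigned long flags;
        };

        static void rq_set_flag(struct rq *rq, enum rq_flag f)
        {
            rq->flags |= 1UL << f;
        }

        static bool rq_test_and_clear_flag(struct rq *rq, enum rq_flag f)
        {
            bool was_set = rq->flags & (1UL << f);

            rq->flags &= ~(1UL << f);
            return was_set;
        }

        /* End of a napi poll: flush redirects only if this RQ redirected. */
        static void napi_poll_end(struct rq *rq)
        {
            if (rq_test_and_clear_flag(rq, RQ_FLAG_XDP_REDIRECT))
                printf("flush XDP redirects for this rq\n");  /* stand-in for the real flush call */
        }

        int main(void)
        {
            struct rq rq = { .flags = 0 };

            rq_set_flag(&rq, RQ_FLAG_XDP_REDIRECT);  /* RX path redirected a packet */
            napi_poll_end(&rq);                      /* flushes */
            napi_poll_end(&rq);                      /* nothing to flush */
            return 0;
        }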
    • net/mlx5e: XDP, Fix shifted flag index in RQ bitmap · f03590f7
      Tariq Toukan authored
      Values in enum mlx5e_rq_flag are used as bit indices.
      The intention was to use them with no BIT(i) wrapping.

      This is not a functional bug fix, as the same (shifted) flag bit
      is consistently used for all set, test, and clear operations (see
      the demo after this entry).
      
      Fixes: 121e8927 ("net/mlx5e: Refactor RQ XDP_TX indication")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
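      A small user-space demo of the index-versus-mask mix-up described
      above: passing a BIT(i) mask to helpers that expect a bit index sets
      a shifted bit, yet stays consistent across set/test/clear. The
      helpers below mimic the kernel's __set_bit()/test_bit() pattern but
      are local stand-ins.

        #include <stdbool.h>
        #include <stdio.h>

        #define BIT(n) (1UL << (n))

        /* Index-based helpers: they expect a bit *index*, not a mask. */
        static void set_bit_idx(unsigned long *word, unsigned int nr)
        {
            *word |= 1UL << nr;
        }

        static bool test_bit_idx(unsigned long word, unsigned int nr)
        {
            return word & (1UL << nr);
        }

        int main(void)
        {
            unsigned int flag_as_mask  = BIT(0);  /* old enum style: mask == 1 */
            unsigned int flag_as_index = 0;       /* fixed enum style: plain index */
            unsigned long flags = 0;

            set_bit_idx(&flags, flag_as_mask);    /* sets bit 1, not bit 0 */
            printf("mask-style:  flags = 0x%lx, test = %d\n",
                   flags, test_bit_idx(flags, flag_as_mask));  /* shifted, but consistent */

            flags = 0;
            set_bit_idx(&flags, flag_as_index);   /* sets bit 0, as intended */
            printf("index-style: flags = 0x%lx, test = %d\n",
                   flags, test_bit_idx(flags, flag_as_index));
            return 0;
        }

      Because every set, test, and clear used the same shifted value, the
      old code still behaved correctly; the fix simply defines the enum
      values as plain indices.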
    • net/mlx5e: RX, Support multiple outstanding UMR posts · fd9b4be8
      Tariq Toukan authored
      The buffer mapping of the Multi-Packet WQEs (of Striding RQ)
      is done via UMR posts, one UMR WQE per RX MPWQE.
      
      A single MPWQE is capable of serving many incoming packets, usually
      more than the budget of a single napi cycle.
      Hence, posting a single UMR WQE per napi cycle (and handling its
      completion in the next cycle) works fine in many common cases,
      but not always.
      
      When an XDP program is loaded, every MPWQE is capable of serving
      fewer packets, to satisfy the packet-per-page requirement.
      Thus, for the same number of packets more MPWQEs (and UMR posts)
      are needed (twice as many for the default MTU), leaving less latency
      room for the UMR completions.
      
      In this patch, we add support for multiple outstanding UMR posts,
      to allow faster gap closure between consuming MPWQEs and reposting
      them back into the WQ.

      For better SW and HW locality, we combine the UMR posts in bulks of
      (at least) two (see the sketch after this entry).

      This is expected to improve packet rate at high CPU scale.
      
      Performance test:
      As expected, there is a huge improvement at large scale (48 cores).
      
      xdp_redirect_map, 64B UDP multi-stream.
      Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
      
      Before: Unstable, 7 to 30 Mpps
      After:  Stable,   at 70.5 Mpps
      
      No degradation in other tested scenarios.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
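      A schematic user-space sketch of the refill idea above: allow
      several UMR posts to be outstanding at once, and repost consumed
      MPWQEs in bulks of at least two. The constants and names are
      illustrative assumptions, not the driver's.

        #include <stdint.h>
        #include <stdio.h>

        #define WQ_SIZE      64u  /* MPWQE slots in the RX work queue (illustrative) */
        #define UMR_WQE_BULK  2u  /* repost in bulks of at least two */

        struct rx_wq {
            uint32_t posted;      /* MPWQEs currently owned by HW, incl. in-flight UMRs */
        };

        /* Called from the napi cycle after consuming MPWQEs: repost as
         * many bulks as currently fit, instead of one UMR per cycle. */
        static unsigned int repost_mpwqes(struct rx_wq *wq)
        {
            uint32_t missing = WQ_SIZE - wq->posted;
            unsigned int posted_now = 0;

            while (missing >= UMR_WQE_BULK) {
                /* ...build and ring a UMR WQE covering UMR_WQE_BULK MPWQEs... */
                wq->posted += UMR_WQE_BULK;
                missing    -= UMR_WQE_BULK;
                posted_now += UMR_WQE_BULK;
            }
            return posted_now;
        }

        int main(void)
        {
            struct rx_wq wq = { .posted = WQ_SIZE };
            unsigned int reposted;

            wq.posted -= 7;  /* 7 MPWQEs were consumed this napi cycle */
            reposted = repost_mpwqes(&wq);
            printf("reposted %u MPWQEs, %u still missing\n",
                   reposted, WQ_SIZE - wq.posted);
            return 0;
        }

      With several UMRs allowed in flight, the gap between consuming an
      MPWQE and having it available again closes faster, which is what
      stabilizes the large-scale redirect rate reported above.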