1. 18 May, 2022 3 commits
    • Merge branch 'net-smc-send-and-write-inline-optimization-for-smc' · 68a0bd67
      Jakub Kicinski authored
      Guangguan Wang says:
      
      ====================
      net/smc: send and write inline optimization for smc
      
      Send cdc msgs and write data inline if the qp has sufficient inline
      space, which helps reduce latency.
      
      In my test environment, which consists of 2 VMs running on the same
      physical host with NICs (ConnectX-4Lx) working in SR-IOV mode,
      qperf shows a 0.4us-1.3us improvement in latency.
      
      Test command:
      server: smc_run taskset -c 1 qperf
      client: smc_run taskset -c 1 qperf <server ip> -oo \
      		msg_size:1:2K:*2 -t 30 -vu tcp_lat
      
      The results are shown below:
      msgsize     before       after
      1B          11.9 us      10.6 us (-1.3 us)
      2B          11.7 us      10.7 us (-1.0 us)
      4B          11.7 us      10.7 us (-1.0 us)
      8B          11.6 us      10.6 us (-1.0 us)
      16B         11.7 us      10.7 us (-1.0 us)
      32B         11.7 us      10.6 us (-1.1 us)
      64B         11.7 us      11.2 us (-0.5 us)
      128B        11.6 us      11.2 us (-0.4 us)
      256B        11.8 us      11.2 us (-0.6 us)
      512B        11.8 us      11.3 us (-0.5 us)
      1KB         11.9 us      11.5 us (-0.4 us)
      2KB         12.1 us      11.5 us (-0.6 us)
      ====================
      
      Link: https://lore.kernel.org/r/20220516055137.51873-1-guangguan.wang@linux.alibaba.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
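The message sizes in the results table follow from the `-oo msg_size:1:2K:*2` loop in the test command. The sketch below enumerates that sweep, assuming the loop starts at 1 byte, multiplies by 2 each step, and stops after 2K (2048 bytes). `demo_sweep_sizes` is a hypothetical helper written for illustration and is not part of qperf.

```c
#include <stddef.h>

/*
 * Fill `out` with first, first * factor, ... up to and including `last`,
 * and return how many sizes were written (at most `cap`).
 * Hypothetical helper: it only mirrors the doubling sweep of the test.
 */
static size_t demo_sweep_sizes(size_t first, size_t last, size_t factor,
			       size_t *out, size_t cap)
{
	size_t n = 0;

	if (first == 0 || factor < 2)
		return 0; /* such a sweep would never advance */
	for (size_t s = first; s <= last && n < cap; s *= factor)
		out[n++] = s;
	return n;
}
```

Calling `demo_sweep_sizes(1, 2048, 2, sizes, 16)` yields 12 sizes, from 1 to 2048, which matches the 12 rows of the table.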
    • net/smc: rdma write inline if qp has sufficient inline space · 793a7df6
      Guangguan Wang authored
      Rdma write with the inline flag when sending small packets,
      whose length is shorter than the qp's max_inline_data, can
      help reduce latency.
      
      In my test environment, which consists of 2 VMs running on the same
      physical host with NICs (ConnectX-4Lx) working in SR-IOV mode,
      qperf shows a 0.5us-0.7us improvement in latency for messages of
      up to 32B.
      
      Test command:
      server: smc_run taskset -c 1 qperf
      client: smc_run taskset -c 1 qperf <server ip> -oo \
      		msg_size:1:2K:*2 -t 30 -vu tcp_lat
      
      The results are shown below:
      msgsize     before       after
      1B          11.2 us      10.6 us (-0.6 us)
      2B          11.2 us      10.7 us (-0.5 us)
      4B          11.3 us      10.7 us (-0.6 us)
      8B          11.2 us      10.6 us (-0.6 us)
      16B         11.3 us      10.7 us (-0.6 us)
      32B         11.3 us      10.6 us (-0.7 us)
      64B         11.2 us      11.2 us (0 us)
      128B        11.2 us      11.2 us (0 us)
      256B        11.2 us      11.2 us (0 us)
      512B        11.4 us      11.3 us (-0.1 us)
      1KB         11.4 us      11.5 us (+0.1 us)
      2KB         11.5 us      11.5 us (0 us)
      Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
      Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
      Tested-by: kernel test robot <lkp@intel.com>
      Acked-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
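The inline decision described in this commit can be sketched in plain C. This is a minimal user-space illustration under stated assumptions, not the kernel patch itself: `struct demo_qp`, `struct demo_wr`, `DEMO_SEND_INLINE`, and `demo_prepare_write` are hypothetical names standing in for the queue pair, the work request, and the inline send flag. Treating a payload exactly as long as `max_inline_data` as fitting inline is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the verbs inline send flag. */
#define DEMO_SEND_INLINE 0x1u

/* Hypothetical queue pair: only its inline capacity matters here. */
struct demo_qp {
	uint32_t max_inline_data; /* bytes the device accepts inline */
};

/* Hypothetical work request for an RDMA write. */
struct demo_wr {
	const void *payload;
	size_t len;
	unsigned int send_flags;
};

/*
 * Flag the write as inline only when the payload fits in the queue
 * pair's inline space. Larger payloads keep send_flags at 0, so they
 * take the regular, non-inline path.
 */
static void demo_prepare_write(const struct demo_qp *qp, struct demo_wr *wr,
			       const void *payload, size_t len)
{
	wr->payload = payload;
	wr->len = len;
	wr->send_flags = 0;
	if (len <= qp->max_inline_data) /* boundary choice is an assumption */
		wr->send_flags |= DEMO_SEND_INLINE;
}
```

With a hypothetical inline capacity of 32 bytes, a 16-byte write is flagged inline while a 64-byte write is not. The capacity of a real device has to be read from its queue pair attributes.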
    • net/smc: send cdc msg inline if qp has sufficient inline space · b632eb06
      Guangguan Wang authored
      As a cdc msg is 44B long, cdc msgs can be sent inline on
      most rdma devices, which can help reduce sending latency.
      
      In my test environment, which consists of 2 VMs running on the same
      physical host with NICs (ConnectX-4Lx) working in SR-IOV mode,
      qperf shows a 0.4us-0.7us improvement in latency.
      
      Test command:
      server: smc_run taskset -c 1 qperf
      client: smc_run taskset -c 1 qperf <server ip> -oo \
      		msg_size:1:2K:*2 -t 30 -vu tcp_lat
      
      The results are shown below:
      msgsize     before       after
      1B          11.9 us      11.2 us (-0.7 us)
      2B          11.7 us      11.2 us (-0.5 us)
      4B          11.7 us      11.3 us (-0.4 us)
      8B          11.6 us      11.2 us (-0.4 us)
      16B         11.7 us      11.3 us (-0.4 us)
      32B         11.7 us      11.3 us (-0.4 us)
      64B         11.7 us      11.2 us (-0.5 us)
      128B        11.6 us      11.2 us (-0.4 us)
      256B        11.8 us      11.2 us (-0.6 us)
      512B        11.8 us      11.4 us (-0.4 us)
      1KB         11.9 us      11.4 us (-0.5 us)
      2KB         12.1 us      11.5 us (-0.6 us)
      Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
      Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
      Tested-by: kernel test robot <lkp@intel.com>
      Acked-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
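The check implied by this commit, whether a 44B cdc msg fits in a queue pair's inline space, can be sketched as follows. This is an illustration only, not the kernel code: `DEMO_CDC_MSG_LEN`, `DEMO_SEND_INLINE`, `demo_cdc_fits_inline`, and `demo_cdc_send_flags` are hypothetical names, and only the 44B length comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* CDC message length as stated in the commit message. */
#define DEMO_CDC_MSG_LEN 44u

/* Hypothetical stand-in for the verbs inline send flag. */
#define DEMO_SEND_INLINE 0x1u

/* True when a queue pair's inline space can hold a whole CDC message. */
static bool demo_cdc_fits_inline(uint32_t max_inline_data)
{
	return max_inline_data >= DEMO_CDC_MSG_LEN;
}

/* Send flags for a CDC message on a queue pair with the given inline space. */
static unsigned int demo_cdc_send_flags(uint32_t max_inline_data)
{
	return demo_cdc_fits_inline(max_inline_data) ? DEMO_SEND_INLINE : 0u;
}
```

Because the cdc msg length is fixed, the check depends only on the queue pair and could be evaluated once per queue pair rather than once per message. That placement is a design suggestion, not something the commit message states.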
  2. 17 May, 2022 9 commits
  3. 16 May, 2022 28 commits