- 10 Oct, 2017 8 commits
-
-
Test authored
See previous commit for details about why latency is bad for TCP payload > MSS without TSO. Compared to latest neo1-neo2 timings from Oct05 (C-states disabled, no rx-delay) it improves:

    TCP1            ~45μs   -> ~45μs
    TCP1472         ~430μs                  # TCP lat. anomaly
    TCP1400                 -> ~120μs       # finally
    TCP1500                 -> ~130μs       # fixed
    TCP4096         ~285μs  -> ~170μs       # !

    ZEO             ~670μs  -> ~580-1045μs (?)
    NEO/pylite      ~605μs  -> ~600-700μs  (?)  (Cpy)
    NEO/pylite      ~505μs  -> ~525-580μs  (?)  (Cgo)
    NEO/pysql       ~900μs  -> ~820-930μs  (?)  (Cpy)
    NEO/pysql       ~780μs  -> ~740-800μs  (?)  (Cgo)
    NEO/go          ~430μs  -> ~360μs  (Cpy)    # <-- NOTE
    NEO/go          ~210μs  -> ~160μs  (Cgo)    # <-- NOTE
    NEO/go-nosha1   ~190μs  -> ~140μs           # <-- NOTE

Not sure about the noise in pure-py runs, but given that raw TCP latency clearly improves, this should be a good change to make.
-
Kirill Smelkov authored
On neo1 - neo2 without TSO, latency becomes very poor in lat_tcp when the payload size becomes greater than the TCP MSS (lat_tcp -m 1448 ~ 130μs; lat_tcp -m 1449 ~ 500μs and more).
-
Kirill Smelkov authored
e.g. neotest cpustat ls
-
Kirill Smelkov authored
-
Kirill Smelkov authored
Same as before with added information. In particular z6001-z600 shows there is no step-wise ~400μs TCP RR increase while going 1400B -> 1500B, as is currently the case on RTL.
-
Kirill Smelkov authored
-
Test authored
Same as before but with more information and adjusted ping and TCP RR. In particular TCP RR shows:

    TCPRR1400 ~ 120μs
    TCPRR1500 ~ 430μs

The step-wise increase happens when the TCP packet no longer fits into 1 ethernet frame. It does not happen on z600* with their current settings.
-
Kirill Smelkov authored
-
- 09 Oct, 2017 6 commits
-
-
Test authored
Same as Oct05 5μs C-latency with added:

- pystones
- t(sha1) py & go
- zhash.go working in prefetch mode
- 16 clients loading 1 server at the same time
-
Kirill Smelkov authored
Same as Oct05 5μs C-latency with added:

- pystones
- t(sha1) py & go
- zhash.go working in prefetch mode
- 16 clients loading 1 server at the same time
-
Kirill Smelkov authored
-
Kirill Smelkov authored
The scenario could be: nl.shutdown sets nl.connTab=nil and then iterates a connTab snapshot taken under nl.connMu lock. If so, this activity (which calls e.g. Conn.shutdown) could be running in parallel with conn.lightClose -> conn.release(). The bug triggers under neotest:

    zhash.go +prefetch128: crc32:bf9deea9 ; oid=0..8499 nread=34134938 t=9.010696705s (1.060081ms / object) x=zhash.go +prefetch128
    ==================
    WARNING: DATA RACE
    Write at 0x00c424b55ed0 by goroutine 147:
      lab.nexedi.com/kirr/neo/go/neo.(*Conn).reinit()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:258 +0x102
      lab.nexedi.com/kirr/neo/go/neo.(*Conn).release()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:240 +0x38
      lab.nexedi.com/kirr/neo/go/neo.(*Conn).lightClose()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:1633 +0xa4
      lab.nexedi.com/kirr/neo/go/neo.(*NodeLink).Ask1()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:1719 +0xde
      lab.nexedi.com/kirr/neo/go/neo/client.(*Client)._Load()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/client/client.go:466 +0x5d3
      lab.nexedi.com/kirr/neo/go/neo/client.(*Client).Load()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/client/client.go:392 +0xa8
      lab.nexedi.com/kirr/neo/go/zodb/storage.(*Cache).loadRCE()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/cache.go:297 +0xfb

    Previous write at 0x00c424b55ed0 by goroutine 75:
      sync/atomic.AddInt32()
          /home/kirr/src/tools/go/go/src/runtime/race_amd64.s:269 +0xb
      sync.(*Mutex).Unlock()
          /home/kirr/src/tools/go/go/src/sync/mutex.go:182 +0x54
      sync.(*Once).Do()
          /home/kirr/src/tools/go/go/src/sync/once.go:46 +0x95
      lab.nexedi.com/kirr/neo/go/neo.(*Conn).shutdownRX()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:471 +0xb1
      lab.nexedi.com/kirr/neo/go/neo.(*Conn).shutdown()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:442 +0x62
      lab.nexedi.com/kirr/neo/go/neo.(*NodeLink).shutdown.func1.1()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:408 +0x18f

    Goroutine 147 (running) created at:
      lab.nexedi.com/kirr/neo/go/zodb/storage.(*Cache).Prefetch()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/cache.go:198 +0xbb
      main.zhash.func2()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/t/zhash.go:134 +0x86
      main.zhash.func4()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/t/zhash.go:157 +0x8f
      main.zhash()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/t/zhash.go:193 +0xc7b
      main.main()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/t/zhash.go:107 +0x48d

    Goroutine 75 (finished) created at:
      lab.nexedi.com/kirr/neo/go/neo.(*NodeLink).shutdown.func1()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:391 +0x116
      sync.(*Once).Do()
          /home/kirr/src/tools/go/go/src/sync/once.go:44 +0xe1
      lab.nexedi.com/kirr/neo/go/neo.(*NodeLink).shutdown()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:383 +0x99
      lab.nexedi.com/kirr/neo/go/neo.(*NodeLink).serveRecv()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:748 +0x651
    ==================

Without -race zhash.go instrumentation, it was showing itself as "panic: unlock of unlocked mutex" in:

    sync.(*Mutex).Unlock()
        /home/kirr/src/tools/go/go/src/sync/mutex.go:182 +0x54
    sync.(*Once).Do()
        /home/kirr/src/tools/go/go/src/sync/once.go:46 +0x95
    lab.nexedi.com/kirr/neo/go/neo.(*Conn).shutdownRX()
        /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:471 +0xb1
    lab.nexedi.com/kirr/neo/go/neo.(*Conn).shutdown()
        /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:442 +0x62
    lab.nexedi.com/kirr/neo/go/neo.(*NodeLink).shutdown.func1.1()
        /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/neo/connection.go:408 +0x18f
-
Kirill Smelkov authored
It was failing in a strange way, e.g.:

    panic: Buf: refcnt < 0
    fatal error: panic holding locks

    goroutine 19 [running]:
    runtime.throw(0x5f9d65, 0x13)
        /home/kirr/src/tools/go/go/src/runtime/panic.go:605 +0x95 fp=0xc42003cc20 sp=0xc42003cc00 pc=0x452935
    panic(0x5c21c0, 0x611f60)
        /home/kirr/src/tools/go/go/src/runtime/panic.go:451 +0x585 fp=0xc42003ccc8 sp=0xc42003cc20 pc=0x4524c5
    lab.nexedi.com/kirr/neo/go/zodb.(*Buf).Release(0xc4200900e0)
        /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/buffer.go:112 +0x161 fp=0xc42003ccf8 sp=0xc42003ccc8 pc=0x59ebc1
    lab.nexedi.com/kirr/neo/go/zodb.TestBufAllocFree(0xc4200a80f0)
        /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/buffer_test.go:111 +0x6ae fp=0xc42003cf78 sp=0xc42003ccf8 pc=0x5a137e
    testing.tRunner(0xc4200a80f0, 0x600930)
        /home/kirr/src/tools/go/go/src/testing/testing.go:746 +0x16d fp=0xc42003cfd0 sp=0xc42003cf78 pc=0x51688d
    runtime.goexit()
        /home/kirr/src/tools/go/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003cfd8 sp=0xc42003cfd0 pc=0x4835f1
    created by testing.(*T).Run
        /home/kirr/src/tools/go/go/src/testing/testing.go:789 +0x569

    goroutine 1 [chan receive]:
    testing.(*T).Run(0xc4200a8000, 0x5f9180, 0x10, 0x600930, 0xc420051c50)
        /home/kirr/src/tools/go/go/src/testing/testing.go:790 +0x59b
    testing.runTests.func1(0xc4200a8000)
        /home/kirr/src/tools/go/go/src/testing/testing.go:1004 +0xa8
    testing.tRunner(0xc4200a8000, 0xc420051d90)
        /home/kirr/src/tools/go/go/src/testing/testing.go:746 +0x16d
    testing.runTests(0xc420090040, 0x6ed4a0, 0x7, 0x7, 0x4159b0)
        /home/kirr/src/tools/go/go/src/testing/testing.go:1002 +0x522
    testing.(*M).Run(0xc420051f20, 0x1d7ab28)
        /home/kirr/src/tools/go/go/src/testing/testing.go:921 +0x207
    main.main()
        lab.nexedi.com/kirr/neo/go/zodb/_test/_testmain.go:56 +0x1d4
    exit status 2
    FAIL    lab.nexedi.com/kirr/neo/go/zodb 0.006s

because upon release the object was not put back into the pool; we were then getting another object from the pool, complaining, but still doing buf.Release.
-
Kirill Smelkov authored
The race detector does not know that Probe.Detach works under world-stopped, and that this way it cannot break the consistency of the probes list attached to a trace event: on event signalling, a probe will either be run or not run at all. And we do not mind that, e.g. while Detach was in progress, a probe was read from the traceevent list and decided to be run, and the probe function was actually called just after Detach finished.

For this reason, tell the race detector to not take into account memory reads/writes performed while the world is stopped. If we do not, it complains e.g. this way:

    ==================
    WARNING: DATA RACE
    Read at 0x00c42000d760 by goroutine 7:
      lab.nexedi.com/kirr/neo/go/zodb/storage._traceCacheGCFinish_run()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/xcommon/tracing/tracing.go:265 +0x81
      lab.nexedi.com/kirr/neo/go/zodb/storage.traceCacheGCFinish()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/ztrace.go:22 +0x63
      lab.nexedi.com/kirr/neo/go/zodb/storage.(*Cache).gc()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/cache.go:497 +0x62c
      lab.nexedi.com/kirr/neo/go/zodb/storage.(*Cache).gcmain()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/cache.go:478 +0x4c

    Previous write at 0x00c42000d760 by goroutine 6:
      lab.nexedi.com/kirr/neo/go/xcommon/tracing.(*Probe).Detach()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/xcommon/tracing/tracing.go:319 +0x103
      lab.nexedi.com/kirr/neo/go/xcommon/tracing.(*ProbeGroup).Done()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/xcommon/tracing/tracing.go:344 +0xa5
      lab.nexedi.com/kirr/neo/go/zodb/storage.TestCache()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/cache_test.go:576 +0x7f94
      testing.tRunner()
          /home/kirr/src/tools/go/go/src/testing/testing.go:746 +0x16c

    Goroutine 7 (running) created at:
      lab.nexedi.com/kirr/neo/go/zodb/storage.NewCache()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/cache.go:129 +0x227
      lab.nexedi.com/kirr/neo/go/zodb/storage.TestCache()
          /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/cache_test.go:165 +0x7b1
      testing.tRunner()
          /home/kirr/src/tools/go/go/src/testing/testing.go:746 +0x16c

    Goroutine 6 (finished) created at:
      testing.(*T).Run()
          /home/kirr/src/tools/go/go/src/testing/testing.go:789 +0x568
      testing.runTests.func1()
          /home/kirr/src/tools/go/go/src/testing/testing.go:1004 +0xa7
      testing.tRunner()
          /home/kirr/src/tools/go/go/src/testing/testing.go:746 +0x16c
      testing.runTests()
          /home/kirr/src/tools/go/go/src/testing/testing.go:1002 +0x521
      testing.(*M).Run()
          /home/kirr/src/tools/go/go/src/testing/testing.go:921 +0x206
      main.main()
          lab.nexedi.com/kirr/neo/go/zodb/storage/_test/_testmain.go:44 +0x1d3
    ==================
-
- 06 Oct, 2017 5 commits
-
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-
- 05 Oct, 2017 21 commits
-
-
Kirill Smelkov authored
Max < 3x (nprefetch=8) in go-go case.
-
Kirill Smelkov authored
Improves max 2x, probably because of the Go scheduler.
-
Kirill Smelkov authored
-
Test authored
Compared to the run without C-states disabled (but with RX delay = 0, as before) it improves:

    ping56          36μs        -> 33μs
    ping1472        ~150μs      -> ~107μs
    TCP1            ~65μs       -> ~45μs
    TCP1472         ~440-515μs  -> ~420μs   (still very bad)
    TCP4096         ~370μs      -> ~280μs   (still bad)
    ZEO:            ~900μs      -> ~670μs
    NEO/pylite:     ~1050μs     -> ~600μs   (Cpy)
    NEO/pylite:     ~800μs      -> ~505μs   (Cgo)
    NEO/pysql:      ~1700μs     -> ~900μs   (Cpy)
    NEO/pysql:      ~1300μs     -> ~790μs   (Cgo)
    NEO/go:         ~730μs      -> ~430μs   (Cpy)
    NEO/go:         ~370μs      -> ~210μs   (Cgo)
    NEO/go-nosha1:  ~280μs      -> ~190μs

Raw TCP timings are still bad.
-
Test authored
It improves:

    ZEO:            ~500μs      -> ~470μs
    NEO/pylite:     ~640μs      -> ~460μs   (Cpy)
    NEO/pylite:     ~415μs      -> ~350μs   (Cgo)
    NEO/pysql:      ~1300μs     -> ~790μs   (Cpy)
    NEO/pysql:      ~850μs      -> ~650μs   (Cgo)
    NEO/go:         ~350μs      -> ~275μs   (Cpy)
    NEO/go:         ~78μs       -> ~78μs    (Cgo)
    NEO/go-nosha1:  ~45μs       -> ~45μs
-
Kirill Smelkov authored
-
Test authored
Same as on Oct04 - we just add C-states profiling. In particular TCP1472 and TCP4096 are awful. C-states are not yet disabled.
-
Test authored
Same as oct04 - we just add C-states profiling. C-states are not yet disabled.
-
Kirill Smelkov authored
-
Kirill Smelkov authored
Compared to just C-states disabled it improves:

    ping56          ~76μs       -> ~40μs
    ping1472        ~150μs      -> ~120μs
    TCP1            ~90μs       -> ~50μs
    TCP1472         ~180μs      -> ~145μs
    TCP4096         ~220-230μs  -> ~175μs
    ZEO:            ~660μs      -> ~660μs
    NEO/pylite:     ~630μs      -> ~575μs   (Cpy)
    NEO/pylite:     ~505μs      -> ~460μs   (Cgo)
    NEO/pysql:      ~930μs      -> ~880μs   (Cpy)
    NEO/pysql:      ~810μs      -> ~750μs   (Cgo)
    NEO/go:         ~430μs      -> ~380μs   (Cpy)
    NEO/go:         ~215μs      -> ~170μs   (Cgo)
    NEO/go-nosha1:  ~195μs      -> ~150μs
-
Kirill Smelkov authored
It improves:

    ping56          ~
    ping1472        ~
    TCP1            ~92μs       -> ~89μs    (c -> c)
    TCP1            ~120μs      -> ~91μs    (c -> go)
    TCP1            ~120μs      -> ~90μs    (c <- c)
    TCP1            ~120μs      -> ~90μs    (go <- c)
    TCP1472         ~220-250μs  -> ~180μs
    TCP4096         ~270-300μs  -> ~220-230μs
    ZEO:            ~750μs      -> ~660μs
    NEO/pylite:     ~850μs      -> ~630μs   (Cpy)
    NEO/pylite:     ~640μs      -> ~505μs   (Cgo)
    NEO/pysql:      ~1500μs     -> ~930μs   (Cpy)
    NEO/pysql:      ~1350μs     -> ~810μs   (Cgo)
    NEO/go:         ~600μs      -> ~430μs   (Cpy)
    NEO/go:         ~320μs      -> ~215μs   (Cgo)
    NEO/go-nosha1:  ~260μs      -> ~195μs
-
Kirill Smelkov authored
-
Kirill Smelkov authored
It improves a lot:

    ZEO:            ~660μs      -> ~490μs
    NEO/pylite:     ~750μs      -> ~470μs   (Cpy)
    NEO/pylite:     ~500μs      -> ~370μs   (Cgo)
    NEO/pysql:      ~1450μs     -> ~780μs   (Cpy)
    NEO/pysql:      ~1000μs     -> ~670μs   (Cgo)
    NEO/go:         ~450μs      -> ~280μs   (Cpy)
    NEO/go:         ~90μs       -> ~80μs    (Cgo)
    NEO/go-nosha1:  ~60μs       -> ~55μs
-
Kirill Smelkov authored
Same as on oct04 but now we add C-states profile. C-states are not yet disabled.
-
Kirill Smelkov authored
-
Kirill Smelkov authored
~ the same as on oct04 only we add C-states profiling. C-states are not yet disabled.
-
Kirill Smelkov authored
py timings improve a bit - ~15-20μs per Python part (i.e. ~30-40μs if both client and server are in Python). go-go timings stay ~ the same.
-
Kirill Smelkov authored
Timings are ~ the same as on Sep17, but now a lot more info about the system is added + a C-states profile. C-states are not yet disabled.
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-