-
Kirill Smelkov authored
This release is driven by wendelin.core v2 needs with one of the changes being that now most of the library was moved into nogil code and can be used fully from inside nogil world(*). Python modules are now just wrappers of their nogil counterparts. The way for Python and nogil worlds to communicate is also provided. The move to nogil required many other enhancements along the way. Please see CHANGELOG for overview. The move to nogil brought some speedup automatically. Below are benchmark results of this release compared to pygolang v0.0.4 (1573d101) for python-level benchmarks (we have only those at present): (on i7@2.6GHz) thread runtime: name old time/op new time/op delta go 18.3µs ± 0% 18.3µs ± 1% ~ (p=1.000 n=10+10) chan 2.91µs ± 3% 2.99µs ± 5% +2.73% (p=0.022 n=10+10) select 3.57µs ± 3% 3.57µs ± 4% ~ (p=0.720 n=9+10) def 55.0ns ± 0% 54.0ns ± 0% -1.82% (p=0.002 n=8+10) func_def 43.8µs ± 2% 44.1µs ± 1% +0.64% (p=0.035 n=10+9) call 64.0ns ± 0% 66.3ns ± 1% +3.59% (p=0.000 n=10+10) func_call 1.05µs ± 1% 1.24µs ± 0% +17.80% (p=0.000 n=10+7) try_finally 138ns ± 0% 137ns ± 1% -0.51% (p=0.003 n=10+10) defer 2.32µs ± 1% 2.63µs ± 1% +13.52% (p=0.000 n=10+10) workgroup_empty 38.0µs ± 1% 24.1µs ± 1% -36.43% (p=0.000 n=10+10) workgroup_raise 47.7µs ± 1% 28.2µs ± 0% -40.76% (p=0.000 n=10+10) gevent runtime: name old time/op new time/op delta go 16.9µs ± 1% 17.2µs ± 2% +1.94% (p=0.000 n=10+10) chan 7.43µs ± 0% 7.82µs ± 0% +5.34% (p=0.000 n=10+7) select 10.5µs ± 0% 11.2µs ± 0% +6.74% (p=0.000 n=10+10) def 63.0ns ± 0% 57.6ns ± 1% -8.57% (p=0.000 n=9+10) func_def 44.0µs ± 1% 44.2µs ± 1% ~ (p=0.063 n=10+10) call 67.0ns ± 0% 64.0ns ± 0% -4.48% (p=0.002 n=8+10) func_call 1.06µs ± 1% 1.23µs ± 1% +16.50% (p=0.000 n=10+10) try_finally 144ns ± 0% 136ns ± 0% -5.90% (p=0.000 n=10+10) defer 2.37µs ± 1% 2.61µs ± 1% +10.07% (p=0.000 n=10+10) workgroup_empty 57.0µs ± 0% 55.0µs ± 2% -3.53% (p=0.000 n=10+9) workgroup_raise 72.4µs ± 0% 69.6µs ± 6% -3.95% (p=0.035 n=9+10) workgroup_* changes for thread runtime is the speedup I am talking about. defer/func_call slowdown is due to added exception chaining. We did not optimize Python-level defer yet, and if/when that would be needed, it should be possible to optimize by moving pydefer implementation into Cython. (*) go and channels were moved into nogil world in Pygolang v0.0.3 + v0.0.4 . Now it is the rest of the library that was moved with packages like context, time, sync etc. wendelin.core v2 needs nogil to run pinner thread on client side to support isolation property in cooperation with wcfs: since there is a `client -> wcfs -> pinner` loop: - - - - - - | | pinner <------. | | wcfs client -------^ | | - - - - - - client process the pinner thread would deadlock if it tries to take the GIL because client thread can be holding GIL already while accessing wcfs-mmaped memory (think doing e.g. `x = A[i]` in Python).
c5c3071b