- 28 Feb, 2020 4 commits
-
-
Kirill Smelkov authored
-
Kirill Smelkov authored
On macOS and Windows, Python2 is built with --enable-unicode=ucs2, which makes it use UTF-16 encoding for unicode characters, so for characters at U+10000 and above it uses surrogate encoding with _2_ unicode points, for example:

    >>> import sys
    >>> sys.maxunicode
    65535                   <-- NOTE indicates UCS2 build
    >>> s = u'\U00012345'
    >>> s
    u'\U00012345'
    >>> s.encode('utf-8')
    '\xf0\x92\x8d\x85'
    >>> len(s)
    2                       <-- NOTE _not_ 1
    >>> s[0]
    u'\ud808'
    >>> s[1]
    u'\udf45'

This leads to e.g. b tests failing for

    # tbytes                  tunicode
    (b"\xf0\x90\x8c\xbc",     u'\U0001033c'),  # Valid 4 Octet Sequence '𐌼'

    >       assert b(tunicode) == tbytes
    E       AssertionError: assert '\xed\xa0\x80\xed\xbc\xbc' == '\xf0\x90\x8c\xbc'
    E         - \xed\xa0\x80\xed\xbc\xbc
    E         + \xf0\x90\x8c\xbc

because on a UCS2 python build u'\U0001033c' is represented as 2 unicode points:

    >>> s = u'\U0001033c'
    >>> len(s)
    2
    >>> s[0]
    u'\ud800'
    >>> s[1]
    u'\udf3c'
    >>> s[0].encode('utf-8')
    '\xed\xa0\x80'
    >>> s[1].encode('utf-8')
    '\xed\xbc\xbc'

-> Fix it by detecting a UCS2 build and working around it by manually combining such surrogate unicode pairs appropriately.

A reference on the subject: https://matthew-brett.github.io/pydagogue/python_unicode.html#utf-16-ucs2-builds-of-python-and-32-bit-unicode-code-points
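The surrogate-combining step described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names `combine_surrogates` and `runes` are hypothetical, and the sketch works on plain integers (UTF-16 code units in, integer runes out) so that it behaves the same regardless of how the interpreter was built.

```python
def combine_surrogates(hi, lo):
    """Combine a UTF-16 surrogate pair, given as two integers, into one code point.

    Hypothetical helper for illustration; not the function used in the commit.
    """
    if not (0xD800 <= hi <= 0xDBFF):
        raise ValueError("not a high surrogate: 0x%x" % hi)
    if not (0xDC00 <= lo <= 0xDFFF):
        raise ValueError("not a low surrogate: 0x%x" % lo)
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)


def runes(units):
    """Convert a sequence of UTF-16 code units (integers) into a list of integer runes.

    A valid high+low surrogate pair becomes one rune; everything else,
    including a lone surrogate, is passed through unchanged.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_pair = (0xD800 <= u <= 0xDBFF and
                   i + 1 < len(units) and
                   0xDC00 <= units[i + 1] <= 0xDFFF)
        if is_pair:
            out.append(combine_surrogates(u, units[i + 1]))
            i += 2
        else:
            out.append(u)
            i += 1
    return out


# The two code units from the session above combine back into U+12345.
print(hex(combine_surrogates(0xD808, 0xDF45)))
print([hex(r) for r in runes([0x41, 0xD808, 0xDF45])])
```

Working on integers rather than on unicode objects sidesteps the `len(s) == 2` problem entirely, which is also the motivation for representing runes as integers.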
-
Kirill Smelkov authored
This is a preparatory step for the next patch, where we'll be fixing strconv for Python2 builds with --enable-unicode=ucs2, where a unicode character can take _2_ unicode points. In that general case relying on unicode objects to represent runes is not good, because many things generally do not work for U+10000 and above, e.g. ord breaks:

    >>> import sys
    >>> sys.maxunicode
    65535                   <-- NOTE indicates UCS2 build
    >>> s = u'\U00012345'
    >>> s
    u'\U00012345'
    >>> s.encode('utf-8')
    '\xf0\x92\x8d\x85'
    >>> len(s)
    2                       <-- NOTE _not_ 1
    >>> ord(s)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: ord() expected a character, but string of length 2 found

so we switch to representing runes as integers, similarly to what Go does.
-
Kirill Smelkov authored
Commit 8af78fc5 (pyx.build: v↑ setuptools_dso (1.2 -> 1.4)) upgraded setuptools_dso to 1.4, but starting from https://github.com/mdavidsaver/setuptools_dso/commit/3f3ff746 setuptools_dso uses multiprocessing, and so pyx.build, when running under gpython, started to hang. This is a known gevent problem - see e.g. https://github.com/gevent/gevent/issues/993. The problem manifested itself as the pyx.build unit test hanging under Python3.

Fix it by installing the gevent multiprocessing plugin, which is automatically used/activated by gevent.monkey.patch_all(). geventmp says it is pre-alpha, but by using it we can unhang pyx.build tests, which is a better state than before. Another future possibility would be to use https://github.com/jgehrcke/gipc wrapped into a multiprocessing-compatible API.
-
- 27 Feb, 2020 3 commits
-
-
Kirill Smelkov authored
This is top-level documentation for error chaining that was promised and marked as TODO in

- fd95c88a (golang, errors, fmt: Error chaining (C++/Pyx))
- 17798442 (golang: Expose error at Py level)
- 78d0c76f (golang: Teach pyerror to be a base class)
- 337de0d7 (golang, errors, fmt: Error chaining (Python))
- 03f88c0b (errors: Take .__cause__ into account)
-
Kirill Smelkov authored
This provides top-level documentation for b and u that was promised and marked as TODO in bcb95cd5 (golang: Provide b, u for strings).
-
Kirill Smelkov authored
Pychan provides __eq__ (see 2c8063f4 "*: Channels must be compared by ==, not by "is" even for nilchan"), but does not provide __ne__. At the same time, in 17798442 (golang: Expose error at Py level) we had to define both pyerror.__eq__ and pyerror.__ne__, because without __ne__ pyerror != pyerror was not working correctly.

As it turns out, pychan != pychan already works ok, because pychan does not have a base class, and for that case Cython automatically generates __ne__ based on __eq__:

    https://github.com/cython/cython/blob/0.29.14-629-ga73815042/Cython/Compiler/ModuleNode.py#L1963-L1976
    https://github.com/cython/cython/commit/b75d2942afab

Add a corresponding comment and extend tests to make sure it is indeed so.
-
- 20 Feb, 2020 1 commit
-
-
Kirill Smelkov authored
The Go version does not provide this, but the topic of sync.RWMutex downgrading has been raised several times, at least in

    https://github.com/golang/go/issues/4026
    https://github.com/golang/go/issues/23513
    https://groups.google.com/forum/#!topic/golang-nuts/MmIDUzl8HA0
    ...

Atomic downgrading is often useful to avoid the race window between Unlock and RLock and, as a consequence, the need to recheck things after RLock. We can put this complexity and logic into a well-defined RWMutex primitive instead of leaving it to be solved by every RWMutex user.
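To make the idea of atomic downgrading concrete, here is a minimal sketch built on `threading.Condition`. It is not the implementation from this commit: the class layout and the method name `UnlockToRLock` are assumptions made for illustration. The point it shows is that the write lock is released and the read lock acquired under one internal lock, so no other writer can slip in between.

```python
import threading


class RWMutex(object):
    """Readers-writer mutex with writer preference (illustrative sketch only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._nreaders = 0      # number of readers currently holding the lock
        self._writer = False    # whether a writer currently holds the lock
        self._nwaitw = 0        # number of writers waiting to acquire the lock

    def RLock(self):
        with self._cond:
            # Writer preference: new readers also wait while a writer is queued.
            while self._writer or self._nwaitw > 0:
                self._cond.wait()
            self._nreaders += 1

    def RUnlock(self):
        with self._cond:
            self._nreaders -= 1
            if self._nreaders == 0:
                self._cond.notify_all()

    def Lock(self):
        with self._cond:
            self._nwaitw += 1
            while self._writer or self._nreaders > 0:
                self._cond.wait()
            self._nwaitw -= 1
            self._writer = True

    def Unlock(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

    def UnlockToRLock(self):
        """Atomically downgrade from write-locked to read-locked (hypothetical name)."""
        with self._cond:
            # Both steps happen under self._cond, so there is no window in
            # which the mutex is free and another writer could take it.
            self._writer = False
            self._nreaders += 1
            self._cond.notify_all()


mu = RWMutex()
mu.Lock()            # exclusive access: modify shared state here
mu.UnlockToRLock()   # keep reading the state just written, without rechecking
mu.RUnlock()
```

With a plain `Unlock()` followed by `RLock()`, another writer could change the protected state in between, and the caller would have to recheck it after `RLock()`.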
-
- 17 Feb, 2020 1 commit
-
-
Kirill Smelkov authored
Provide sync.RWMutex, which can be useful for cases when there are multiple simultaneous readers and less frequent writer(s). This implements a readers-writer mutex with preference for writers, similarly to the Go version.
-
- 12 Feb, 2020 1 commit
-
-
Kirill Smelkov authored
Only io.EOF and io.ErrUnexpectedEOF for now. Moved here from wcfs of wendelin.core.
-
- 11 Feb, 2020 3 commits
-
-
Kirill Smelkov authored
A Python error can have links to other errors by means of both .Unwrap() and .__cause__ . These ways are both explicit and so should be treated by e.g. errors.Is as present in the error's error chain.

It is a bit unclear, at least initially, how to linearise and order error chain traversal at divergence points - for exception objects where both .Unwrap() and .__cause__ are !None. However, a closer look suggests a linearisation rule: traverse into .__cause__ after going through the .Unwrap() part - please see details in the documentation added into _error.pyx

-> Teach errors.Is to do this traversal, so that now e.g. an exception raised as

    raise X from Y

will be treated by errors.Is as being both X and Y, even if any of X or Y also has its own error chain via .Unwrap().

Top-level documentation is TODO.
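One possible reading of that linearisation rule is sketched below. It is an assumption-laden illustration, not the code from _error.pyx: `WrapError`, `iter_chain` and `errors_is` are hypothetical names, and matching is simplified to plain `==` (which for ordinary exception objects means identity).

```python
class WrapError(Exception):
    """Hypothetical error that wraps another one and exposes it via .Unwrap()."""

    def __init__(self, msg, inner):
        super().__init__(msg)
        self.inner = inner

    def Unwrap(self):
        return self.inner


def iter_chain(err):
    """Yield err, then its .Unwrap() part, then its .__cause__ part."""
    if err is None:
        return
    yield err
    unwrap = getattr(err, "Unwrap", None)
    if callable(unwrap):
        yield from iter_chain(unwrap())
    yield from iter_chain(getattr(err, "__cause__", None))


def errors_is(err, target):
    """Report whether any error in err's chain matches target (simplified)."""
    return any(e == target for e in iter_chain(err))


y = ValueError("y")
z = KeyError("z")
try:
    raise WrapError("x", z) from y   # x wraps z and has y as its cause
except WrapError as e:
    x = e

print(errors_is(x, z), errors_is(x, y))
```

Here the chain of `x` is visited in the order x, z, y: the `.Unwrap()` part first, and only then `.__cause__`.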
-
Kirill Smelkov authored
Following the errors model in Go and fd95c88a (golang, errors, fmt: Error chaining (C++/Pyx)), let's add support at Python level for errors to wrap each other and to be inspected/unwrapped:

- an error can additionally provide a way to unwrap itself, if it provides an .Unwrap() method. .__cause__ is not taken into account yet, but will be in a follow-up patch;
- errors.Is(err) tests whether an item in the error's chain matches target;
- `fmt.Errorf("... : %w", ... err)` is similar to `"... : %s" % (..., err)` but the resulting error, when unwrapped, will return err.
- errors.Unwrap is not exposed, as chaining through both .Unwrap() and .__cause__ will need more than just "current element" as unwrapping state (i.e. the errors.Unwrap API is insufficient - see next patch), and in practice users of errors.Unwrap() are very rare.

Support for error chaining through .__cause__ will follow in the next patch.

Top-level documentation is TODO.

See https://blog.golang.org/go1.13-errors for an error chaining overview.
-
Kirill Smelkov authored
It is surprising to have an exception class that cannot be derived from. Besides, in the future we'll use subclassing from golang.error as an indicator that an error is "well-defined" (in simple words - it does not need a traceback to be interpreted).
-
- 10 Feb, 2020 1 commit
-
-
Kirill Smelkov authored
The first step to expose errors and error chaining to Python:

- Add pyerror that wraps a pyx/nogil C-level error and is exposed as golang.error at py level.
- py errors must be compared by ==, not by "is"
- Add (py) errors.New to create a new error from text.
- a C-level error that has .Unwrap is exposed with .Unwrap at py level, but full py-level chaining will be implemented in a follow-up patch.
- py error does not support inheritance yet.

Top-level documentation is TODO.
-
- 06 Feb, 2020 1 commit
-
-
Kirill Smelkov authored
Following the errors model in Go, let's add support for errors to wrap other errors and to be inspected/unwrapped:

- an error can additionally provide a way to unwrap itself, if it implements the errorWrapper interface;
- errors.Unwrap(err) tries to extract the wrapped error;
- errors.Is(err) tests whether an item in the error's chain matches target;
- `fmt.errorf("... : %w", ... err)` is similar to `fmt.errorf("... : %s", ... err.c_str())` but the resulting error, when unwrapped, will return err.

Add C++ implementation for the above + tests. Python analogs will follow in the next patches.

Top-level documentation is TODO.

See https://blog.golang.org/go1.13-errors for an error chaining overview.
-
- 04 Feb, 2020 16 commits
-
-
Kirill Smelkov authored
Package cxx was added in 9785f2d3 (cxx: New package), but the interface that cxx::dict provided turned out to be not optimal: dict.get was returning (v, ok), and dict.pop ----//---

Correct dict.get and dict.pop to return just the value and, similarly to the channels API, provide additional dict.get_ and dict.pop_ - extended versions that also return ok:

    dict.get(k)  -> v
    dict.pop(k)  -> v
    dict.get_(k) -> (v, ok)
    dict.pop_(k) -> (v, ok)

This time add tests.
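The shape of that API split can be mimicked in a few lines of Python. This is only an analogy for the C++ cxx::dict described above: the `Dict` class is hypothetical, and `None` stands in for whatever value the C++ version returns for a missing key, which the text does not specify.

```python
class Dict(object):
    """Hypothetical Python mimic of the get/get_ and pop/pop_ split."""

    def __init__(self, **items):
        self._d = dict(items)

    def get_(self, k):
        """Extended form: return (value, ok)."""
        if k in self._d:
            return self._d[k], True
        return None, False

    def get(self, k):
        """Plain form: return just the value."""
        v, _ = self.get_(k)
        return v

    def pop_(self, k):
        """Extended form: remove k and return (value, ok)."""
        if k in self._d:
            return self._d.pop(k), True
        return None, False

    def pop(self, k):
        """Plain form: remove k and return just the value."""
        v, _ = self.pop_(k)
        return v


d = Dict(a=1)
print(d.get("a"), d.get_("missing"))
```

The plain forms keep the common call sites short, while the underscore forms are there for callers that need to tell "missing" apart from "present with an empty value".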
-
Kirill Smelkov authored
Follow the scheme established and used for all other packages, because we will soon have fmt pyx part which, if named as fmt.pyx, will intersect and conflict with fmt.py .
-
Kirill Smelkov authored
errors.New was added in a245ab56 (errors: New package) without test.
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-
Kirill Smelkov authored
-
Kirill Smelkov authored
Makes it easier to understand which test it is, and where it is, when one fails.
-
Kirill Smelkov authored
Currently libgolang_test.cpp contains tests for code in libgolang.cpp and for code that lives in other libgolang packages - sync, fmt, etc. It is becoming tight and we are going to split libgolang_test.cpp and move package tests to their corresponding files - e.g. to sync_test.cpp and the like. Move common assertion utilities into a shared header before that as a preparatory step.
-
Kirill Smelkov authored
Just use builtins and cimported things that we have at pyx level.
-
Kirill Smelkov authored
U is the preferred way to make sure an object is a unicode string.
-
Kirill Smelkov authored
This will allow integrating qq with u in the next patch. Moving to compiled code for string processing functions is also generally better for performance.
-
Kirill Smelkov authored
With Python3 I got tired of constantly using .encode() and .decode(); of getting an exception if the original argument was unicode on e.g. b.decode(); of getting an exception on raw bytes that are invalid UTF-8; of not being able to use a bytes literal with non-ASCII characters, etc.

So instead of this pain provide two functions that make sure an object is either bytes or unicode:

- b converts str/unicode/bytes s to UTF-8 encoded bytestring.

      Bytes input is preserved as-is:

          b(bytes_input) == bytes_input

      Unicode input is UTF-8 encoded. The encoding always succeeds.
      b is reverse operation to u - the following invariant is always true:

          b(u(bytes_input)) == bytes_input

- u converts str/unicode/bytes s to unicode string.

      Unicode input is preserved as-is:

          u(unicode_input) == unicode_input

      Bytes input is UTF-8 decoded. The decoding always succeeds and input
      information is not lost: non-valid UTF-8 bytes are decoded into
      surrogate codes ranging from U+DC80 to U+DCFF.
      u is reverse operation to b - the following invariant is always true:

          u(b(unicode_input)) == unicode_input

NOTE: encoding _and_ decoding *never* fail nor lose information. This is achieved by using the 'surrogateescape' error handler on Python3, and providing a manual fallback that behaves the same way on Python2.

The naming is chosen with the idea that b(something) resembles b"something", and u(something) resembles u"something".

This, even being only a part of the strings solution discussed in [1], should help handle byte- and unicode- strings in a more robust and distraction-free way.

Top-level documentation is TODO.

[1] zodbtools!13
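On Python3 the core of this behaviour can be approximated with the standard 'surrogateescape' error handler, as the note above says. The sketch below is a simplified approximation and not the pygolang implementation: it handles only `bytes` and `str`, has no Python2 fallback, and does not attempt to cover unicode input with lone surrogates outside U+DC80..U+DCFF.

```python
def b(s):
    """Return s as a UTF-8 encoded bytestring (simplified Python3-only sketch)."""
    if isinstance(s, bytes):
        return s                                   # bytes are preserved as-is
    return s.encode("utf-8", "surrogateescape")    # U+DC80..U+DCFF -> raw bytes


def u(s):
    """Return s as a unicode string (simplified Python3-only sketch)."""
    if isinstance(s, str):
        return s                                   # unicode is preserved as-is
    return s.decode("utf-8", "surrogateescape")    # invalid bytes -> U+DC80..U+DCFF


raw = b"\xff\xfeabc"        # not valid UTF-8
text = u(raw)               # decoding still succeeds
print(repr(text), b(text) == raw)
```

The invalid bytes survive the round trip because each one is mapped to its own surrogate code on decoding and mapped back to the same byte on encoding.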
-
Kirill Smelkov authored
This continues 60f6db6f (libgolang: Provide nil as alias for nullptr and NULL): I've tried to compile pygolang with Clang on my Debian 10 workstation and got:

    $ CC=clang CXX=clang++ python setup.py build_dso -i

    In file included from ./golang/fmt.h:32:
    ./golang/libgolang.h:381:11: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
    constexpr nullptr_t nil = nullptr;
              ^~~~~~~~~
              std::nullptr_t
    /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8/bits/c++config.h:242:29: note: 'std::nullptr_t' declared here
      typedef decltype(nullptr)     nullptr_t;
                                    ^
    :
    In file included from ./golang/context.h
    In file included from golang/runtime/libgolang.cpp:30:
    ./golang/libgolang.h:381:11: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
    constexpr nullptr_t nil = nullptr;
              ^~~~~~~~~
              std::nullptr_t
    /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8/bits/c++config.h:242:29: note: 'std::nullptr_t' declared here
      typedef decltype(nullptr)     nullptr_t;
                                    ^
    :39:
    ./golang/libgolang.h:381:11: error: unknown type
    In file included from golang/fmt.cpp:25:
    In file included from ./golang/fmt.h:32:
    ./golang/libgolang.h:421:17: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
        inline chan(nullptr_t) { _ch = nil; }
                    ^~~~~~~~~
                    std::nullptr_t
    ...

It seems that with GCC, and with Clang under macOS, nullptr_t is automatically provided in the builtin namespace, while with older Clang on Linux (clang version 7.0.1-8) it is provided only in the std:: namespace - rightfully, as nullptr_t is described to be present there:

    https://en.cppreference.com/w/cpp/types/nullptr_t

This way we either have to correct all occurrences of nullptr_t to std::nullptr_t, or do something similar by providing nil under golang:: . To reduce noise I prefer the latter and let it be named as Nil.
-
Kirill Smelkov authored
The code was assigning nil to local, _not_ global _tblockforever. As a result _tblockforever was left set with a test hook even after leaving test context. Fix it.

The bug was there starting from 3b241983 (Port/move channels to C/C++/Pyx).

Had to change `= nil` to `= NULL` because with nil Cython complains as

    def __exit__(pypanicWhenBlocked t, typ, val, tb):
        global _tblockforever
        _tblockforever = nil
                         ^
    ------------------------------------------------------------

    golang/_golang_test.pyx:86:25: Cannot assign type 'nullptr_t' to 'void (*)(void) nogil'

This is https://github.com/cython/cython/issues/3314.
-
Kirill Smelkov authored
It's a leftover originating from b073f6df (time: Move/Port timers to C++/Pyx nogil).
-
Kirill Smelkov authored
Instead of `pyctx.ctx = nil` it was just `ctx = nil` - i.e. it assigned nil to a local variable instead of changing pyctx instance data. We were not observing this bug because Cython, for C++ fields of cdef classes, automatically emits in-place destructor calls in the generated __dealloc__

    https://github.com/cython/cython/blob/0.29.14-11-g8c620c388/Cython/Compiler/ModuleNode.py#L1477-L1478

and so there was no leak. However we want to be explicit, and the code was not correct. Fix it.

The bug was there from 2a359791 (context: Move/Port context package to C++/Pyx nogil).
-
- 17 Jan, 2020 3 commits
-
-
Kirill Smelkov authored
Convert Pyx part of the project to use nil instead of NULL. Not every usage of NULL was converted and some places were left to use NULL where changing it to nil currently hits Cython compilation error: https://github.com/cython/cython/issues/3314
-
Kirill Smelkov authored
Convert C++ part of the project to use nil instead of NULL/nullptr. We do not convert pyx part yet, because Cython currently does not understand that nullptr_t has properties of NULL and with e.g. the following change

    --- a/golang/_context.pyx
    +++ b/golang/_context.pyx
    @@ -116,7 +116,7 @@ cdef cppclass _PyValue (_interface, gobject) nogil:
         __dealloc__():
             with gil:
                 obj = <object>this.pyobj
    -            this.pyobj = NULL
    +            this.pyobj = nil
                 Py_DECREF(obj)

errors as

    Error compiling Cython file:
    ------------------------------------------------------------
    ...
            if __decref():
                del self

        __dealloc__():
            with gil:
                obj = <object>this.pyobj
                this.pyobj = nil
                             ^
    ------------------------------------------------------------

    golang/_context.pyx:119:25: Cannot assign type 'nullptr_t' to 'PyObject *'

https://github.com/cython/cython/issues/3314
-
Kirill Smelkov authored
Nil is more native to Go.
-
- 13 Jan, 2020 1 commit
-
- 08 Jan, 2020 1 commit
-
-
Kirill Smelkov authored
Projects that use pyx.build (e.g. wendelin.core) need recent setuptools_dso fixes:

    https://github.com/mdavidsaver/setuptools_dso/issues/5
    https://github.com/mdavidsaver/setuptools_dso/commit/67d717a6
    https://github.com/mdavidsaver/setuptools_dso/commit/e40f5883
    https://github.com/mdavidsaver/setuptools_dso/commit/40b492ab

Increase setuptools_dso version in pygolang's build requirement for uniformity as well.
-
- 06 Dec, 2019 1 commit
-
-
Kirill Smelkov authored
Providing a pygolang-specific DSO is needed because using just setuptools_dso.DSO in an external project results in e.g. "<golang/libgolang.h>" not being found.
-
- 27 Nov, 2019 3 commits
-
-
Kirill Smelkov authored
This release is driven by wendelin.core v2 needs, with one of the changes being that now most of the library was moved into nogil code and can be used fully from inside nogil world(*). Python modules are now just wrappers of their nogil counterparts. The way for Python and nogil worlds to communicate is also provided.

The move to nogil required many other enhancements along the way. Please see CHANGELOG for overview.

The move to nogil brought some speedup automatically. Below are benchmark results of this release compared to pygolang v0.0.4 (1573d101) for python-level benchmarks (we have only those at present):

    (on i7@2.6GHz)

    thread runtime:

    name             old time/op  new time/op  delta
    go               18.3µs ± 0%  18.3µs ± 1%     ~     (p=1.000 n=10+10)
    chan             2.91µs ± 3%  2.99µs ± 5%   +2.73%  (p=0.022 n=10+10)
    select           3.57µs ± 3%  3.57µs ± 4%     ~     (p=0.720 n=9+10)
    def              55.0ns ± 0%  54.0ns ± 0%   -1.82%  (p=0.002 n=8+10)
    func_def         43.8µs ± 2%  44.1µs ± 1%   +0.64%  (p=0.035 n=10+9)
    call             64.0ns ± 0%  66.3ns ± 1%   +3.59%  (p=0.000 n=10+10)
    func_call        1.05µs ± 1%  1.24µs ± 0%  +17.80%  (p=0.000 n=10+7)
    try_finally       138ns ± 0%   137ns ± 1%   -0.51%  (p=0.003 n=10+10)
    defer            2.32µs ± 1%  2.63µs ± 1%  +13.52%  (p=0.000 n=10+10)
    workgroup_empty  38.0µs ± 1%  24.1µs ± 1%  -36.43%  (p=0.000 n=10+10)
    workgroup_raise  47.7µs ± 1%  28.2µs ± 0%  -40.76%  (p=0.000 n=10+10)

    gevent runtime:

    name             old time/op  new time/op  delta
    go               16.9µs ± 1%  17.2µs ± 2%   +1.94%  (p=0.000 n=10+10)
    chan             7.43µs ± 0%  7.82µs ± 0%   +5.34%  (p=0.000 n=10+7)
    select           10.5µs ± 0%  11.2µs ± 0%   +6.74%  (p=0.000 n=10+10)
    def              63.0ns ± 0%  57.6ns ± 1%   -8.57%  (p=0.000 n=9+10)
    func_def         44.0µs ± 1%  44.2µs ± 1%     ~     (p=0.063 n=10+10)
    call             67.0ns ± 0%  64.0ns ± 0%   -4.48%  (p=0.002 n=8+10)
    func_call        1.06µs ± 1%  1.23µs ± 1%  +16.50%  (p=0.000 n=10+10)
    try_finally       144ns ± 0%   136ns ± 0%   -5.90%  (p=0.000 n=10+10)
    defer            2.37µs ± 1%  2.61µs ± 1%  +10.07%  (p=0.000 n=10+10)
    workgroup_empty  57.0µs ± 0%  55.0µs ± 2%   -3.53%  (p=0.000 n=10+9)
    workgroup_raise  72.4µs ± 0%  69.6µs ± 6%   -3.95%  (p=0.035 n=9+10)

The workgroup_* changes for the thread runtime are the speedup I am talking about. The defer/func_call slowdown is due to added exception chaining. We did not optimize Python-level defer yet, and if/when that would be needed, it should be possible to optimize by moving the pydefer implementation into Cython.

(*) go and channels were moved into nogil world in Pygolang v0.0.3 + v0.0.4 . Now it is the rest of the library that was moved, with packages like context, time, sync etc.

wendelin.core v2 needs nogil to run the pinner thread on the client side to support the isolation property in cooperation with wcfs: since there is a `client -> wcfs -> pinner` loop:

          - - - - - -
         |           |
            pinner <------.
         |           |   wcfs
            client -------^
         |           |
          - - - - - -
         client process

the pinner thread would deadlock if it tries to take the GIL, because the client thread can be holding the GIL already while accessing wcfs-mmaped memory (think doing e.g. `x = A[i]` in Python).
-
Kirill Smelkov authored
For base functionality we have overview in the readme itself, but for packages we have only their listing with brief overview and no documentation for in-package functionality. Let's have at least links to .h/.pxd/.py where package functionality is documented.
-
Kirill Smelkov authored
See commit 5a99b769 (libgolang: Start providing interfaces) for context.
-