Commits · 2c20c0553f36b8b82bab6bc9059f1249c5d1ea80 · Kirill Smelkov / pygolang

09 Oct, 2022 17 commits

golang_str: tests: Deep replacer · 2c20c055

Kirill Smelkov authored Oct 07, 2022

deepReplace returns a clone of an object in which all internal objects
selected by a predicate are replaced via the provided replacement
function. We will use this functionality in the following patches to
organize testing of bstr/ustr methods: a method will first be invoked on
regular str, then on bstr/ustr, and the results will be compared.
The results are usually different, because e.g. u'a b c'.split() returns
[u'a', u'b', u'c'] while b('a b c').split() should return
[b('a'), b('b'), b('c')]. We want to make sure that the second result is
exactly the first result with all instances of unicode replaced by bstr.
That's where deep replacer will be used.

The deep replacement itself is implemented via the pickle reduce/rebuild
protocol: we disassemble and reconstruct objects. And while an object is
disassembled, we try to apply the replacement recursively. Since this
functionality is not trivial, it also comes with its own test.
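
A minimal sketch of the deep-replacement idea, for illustration only. It
assumes plain builtin containers and does not use the pickle
reduce/rebuild protocol that the commit describes; the name
`deep_replace` is hypothetical and is not the test helper itself.

```python
def deep_replace(obj, predicate, replace):
    """Return a copy of obj where every nested object selected by
    predicate is substituted with replace(object).

    Simplification: only dict/list/tuple/set/frozenset are traversed.
    """
    if predicate(obj):
        return replace(obj)
    if isinstance(obj, dict):
        return {deep_replace(k, predicate, replace):
                deep_replace(v, predicate, replace)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set, frozenset)):
        return type(obj)(deep_replace(x, predicate, replace) for x in obj)
    return obj  # a leaf that is not selected is kept as is


# Stand-in for "replace every unicode string by its bytestring form".
result = deep_replace(['a', ('b', {'c': 'd'})],
                      lambda x: isinstance(x, str),
                      lambda s: s.encode('utf-8'))
```

Here `result` has the same shape as the input, with every str replaced
by bytes, which mirrors the split() comparison described above.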

golang_str: Fix bstr.tp_print(flags=print_repr) · 510cf8d1

Kirill Smelkov authored Oct 07, 2022

On py2 objects are printed via their .tp_repr slot with flags=0
(contrary to Py_PRINT_RAW which requests to print str -
https://docs.python.org/2.7/c-api/object.html#c.PyObject_Print)

We were not handling repr'ing inside our tp_print implementation, and
as a result e.g. b('мир') was printed on the interactive console as
'\xd0\xbc\xd0\xb8\xd1\x80' instead of b('мир').

Fix it.

golang_str: bstr/ustr repr · 386844d3

Kirill Smelkov authored Oct 07, 2022

Teach bstr/ustr to provide repr of themselves: it goes as b(...) and
u(...) where u stands for human-readable repr of contained data.
Human-readable means that non-ascii printable unicode characters are
shown as-is instead of escaping them, for example:

    >>> x = u'αβγ'
    >>> x
    'αβγ'
    >>> y = b(x)
    >>> y
    b('αβγ')				<-- NOTE not b(b'\xce\xb1\xce\xb2\xce\xb3')
    >>> x.encode('utf-8')
    b'\xce\xb1\xce\xb2\xce\xb3'

strconv, golang_str: Switch quote, unquote and qq to always return bstr · 604a7765

Kirill Smelkov authored Oct 07, 2022

bstr is becoming the default pygolang string type, and it mixes fine
with bytes, unicode and ustr. Previously e.g. strconv.quote checked the
type of its input and tried to return a result of the same type. This is
no longer necessary, since bstr is intended to be used universally and
to be interoperable with all other string types.

golang_str: bstr/ustr support for + and * · bbbb58f0

Kirill Smelkov authored Oct 07, 2022

Add support for +, *, += and *= operators to bstr and ustr.

For * the rhs should be an integer, and the result, as with std strings,
is the lhs repeated rhs times.

For + the other argument could be any supported string - bstr/ustr /
unicode/bytes/bytearray. And the result is always bstr or ustr:

    u()   +     *     ->  u()
    b()   +     *     ->  b()
    u''   +  u()/b()  ->  u()
    u''   +  u''      ->  u''
    b''   +  u()/b()  ->  b()
    b''   +      b''  ->  b''
    barr  +  u()/b()  ->  barr

in particular if lhs is bstr or ustr, the result will remain exactly of
original lhs type. This should be handy when one has e.g. bstr at hand
and wants to incrementally append something to it.

And if lhs is bytes/unicode, but we append bstr/ustr to it, we "upgrade"
the result to bstr/ustr correspondingly. Only if lhs is bytearray does
the result stay bytearray, because it is logical for an object that was
mutable in the beginning to remain mutable after appending.

As before, bytearray.__add__ and friends need to be patched a bit for
bytearray not to reject ustr.
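
The result-type table above can be restated as a small lookup function.
This is only a sketch over type names given as strings; the function
`add_result_type` is hypothetical, and combinations not listed in the
table are deliberately rejected rather than guessed.

```python
PYGOLANG_TYPES = ('bstr', 'ustr')
UPGRADE = {'bytes': 'bstr', 'unicode': 'ustr'}


def add_result_type(lhs, rhs):
    """Return the type name of `lhs + rhs` according to the table."""
    if lhs in PYGOLANG_TYPES:
        return lhs                  # u() + * -> u(),  b() + * -> b()
    if lhs == 'bytearray' and rhs in PYGOLANG_TYPES:
        return 'bytearray'          # a mutable lhs stays mutable
    if lhs in UPGRADE and rhs in PYGOLANG_TYPES:
        return UPGRADE[lhs]         # u'' -> u(),  b'' -> b()
    if lhs in UPGRADE and lhs == rhs:
        return lhs                  # u'' + u'' and b'' + b'' are unchanged
    raise ValueError("combination not covered by the table: %s + %s"
                     % (lhs, rhs))
```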

golang_str: bstr/ustr pickle support · ebd18f3f

Kirill Smelkov authored Oct 07, 2022

Without explicitly overriding __reduce_ex__ pickling was failing for
protocols < 2:

    _________________________ test_strings_pickle __________________________

        def test_strings_pickle():
            bs = b("мир")
            us = u("май")

            #from pickletools import dis
            for proto in range(0, pickle.HIGHEST_PROTOCOL):
    >           p_bs = pickle.dumps(bs, proto)

    golang/golang_str_test.py:282:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = b'\xd0\xbc\xd0\xb8\xd1\x80', proto = 0

        def _reduce_ex(self, proto):
    >       assert proto < 2
    E       RecursionError: maximum recursion depth exceeded in comparison

    /usr/lib/python3.9/copyreg.py:56: RecursionError

See added comments for details.
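
For readers unfamiliar with `__reduce_ex__`, here is a generic sketch of
overriding it on a bytes subclass so that every protocol rebuilds the
object from a plain-bytes payload. The class `PickleFriendlyBytes` and
the helper `rebuild` are hypothetical; the actual bstr/ustr override is
explained in the comments added by the commit.

```python
class PickleFriendlyBytes(bytes):
    def __reduce_ex__(self, protocol):
        # Rebuild as cls(payload) for every protocol. In this sketch
        # bytes(self) yields a plain bytes copy, so the payload itself
        # never needs the subclass to be reduced again.
        return (type(self), (bytes(self),))


def rebuild(obj, protocol):
    """Emulate what unpickling does with a (callable, args) reduce value."""
    factory, args = obj.__reduce_ex__(protocol)
    return factory(*args)
```

With a class defined in an importable module, `pickle.dumps(obj, proto)`
followed by `pickle.loads` performs the same reconstruction.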

golang_str: bstr/ustr iteration · a72c1c1a

Kirill Smelkov authored Oct 07, 2022

Even though bstr is semantically array of bytes, while ustr is array of
unicode characters, iterating them _both_ yields unicode characters.
This goes in line with Go approach described in "Strings, bytes, runes
and characters in Go"[1] and allows for both ustr _and_ bstr to be used
as strings in unicode world.

Even though this diverges (just a bit) from py2 str behaviour, and
diverges more from py3 bytes behaviour, I have not hit any problem in
practice due to this divergence. In other words the semantics of
bytestrings used in Go - to iterate them as unicode characters - is
sound. For reference, it is the authors of Go who originally invented
UTF-8 - see [2] for details.

See also [3] for our discussion with Jérome on this topic.

[1] https://blog.golang.org/strings
[2] https://www.cl.cam.ac.uk/~mgk25/ucs/UTF-8-Plan9-paper.pdf
[3] nexedi/zodbtools!13 (comment 81646)
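
The difference between iterating raw bytes and iterating a bytestring as
unicode characters can be sketched with stdlib tools alone. The helper
`iter_chars` is hypothetical and yields plain str, whereas the types in
this commit have their own UTF-8 decoder and their own result types.

```python
def iter_chars(data):
    """Iterate UTF-8 encoded bytes as unicode characters, Go-style."""
    # surrogateescape keeps invalid bytes representable instead of failing
    return iter(data.decode('utf-8', 'surrogateescape'))


raw = 'мир'.encode('utf-8')
as_chars = list(iter_chars(raw))   # three characters
as_ints = list(raw)                # py3 bytes iteration: six integers
```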

golang_str: bstr/ustr index access · 04be919b

Kirill Smelkov authored Oct 07, 2022

Implement access to bstr/ustr by [index] and by slice. The result of
[index] access - similarly to standard str - is of the same bstr/ustr
type and contains one character:

  - ustr[i] returns ustr with one unicode character taken from i'th character of original string, while
  - bstr[i] returns bstr with one byte taken from i'th byte of original bytestring.

This follows str/unicode semantics on both py2/py3 and bytes semantics
on py2, but diverges from bytes semantics on py3. I originally tried to
follow bytes/py3 semantics - for bstr[i] to return an integer instead of
a 1-byte character - but later found several compatibility breakages due
to it. I contemplated this divergence for a long time and finally
decided to follow string semantics for both ustr and bstr. This
preserves backward compatibility with Python2 and also allows bstr to be
a practically drop-in replacement for the str type.

To get an ordinal corresponding to retrieved character, one can use
standard `ord`, e.g. as in `ord(bstr[i])`. This will always return an
integer for all bstr/ustr/str/unicode. Similarly to standard `chr` and
`unichr`, we also provide two utility functions - `uchr` and `bbyte` to
create 1-character and 1-byte ustr/bstr correspondingly.
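
The py3 divergence is easy to see on plain bytes. The sketch below uses
a hypothetical helper `byte_at` to emulate "indexing returns a 1-byte
string" on top of standard bytes; it is not the bstr implementation.

```python
def byte_at(data, i):
    """Return a 1-byte bytes object for position i (string semantics)."""
    # range(...)[i] normalises negative indices and raises IndexError
    # for out-of-range ones, like regular indexing does.
    i = range(len(data))[i]
    return data[i:i + 1]


data = b'abc'
py3_item = data[1]           # an integer on py3
one_byte = byte_at(data, 1)  # a bytestring of length 1
ordinal = ord(one_byte)      # ord() maps it back to the integer
```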

golang_str: Add test for memoryview(bstr) · 105d03d4

Kirill Smelkov authored Oct 07, 2022

Verify that it works as expected, and that memoryview(ustr) is rejected,
because ustr is semantically array of unicode characters, not bytes.

No change to the code - just add tests for current status which is
already working as expected.

golang_str: Teach b/u to accept objects with buffer interface · d7e55bb0

Kirill Smelkov authored Oct 07, 2022

b/u now convert such objects to bstr/ustr, decoding the buffer data as
if it were bytes. This is needed if e.g. we have data in mmap or
numpy.ndarray and want to convert the data to a string. The conversion
is always explicit, via a call to b/u. The bstr/ustr constructors, in
contrast, preserve their behaviour of matching the unicode constructor:
they do not convert automatically but stringify the object instead, e.g.:

    In [1]: bdata = b'hello 123'

    In [2]: mview = memoryview(bdata)

    In [3]: str(mview)
    Out[3]: '<memory at 0x7fb226b26700>'	# NOTE _not_ b'hello 123'
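
For contrast, an explicit conversion of a buffer object to text can be
sketched with the stdlib as follows. The helper `text_from_buffer` is
hypothetical and returns plain str; it only illustrates the "decode the
buffer data as if it was bytes" step and is not the b/u code.

```python
def text_from_buffer(obj):
    """Decode any object exposing the buffer interface as UTF-8 text."""
    raw = bytes(memoryview(obj))   # copy the buffer contents
    return raw.decode('utf-8', 'surrogateescape')


mview = memoryview(b'hello 123')
explicit = text_from_buffer(mview)   # the buffer contents as text
stringified = str(mview)             # '<memory at 0x...>'
```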

golang_str: Treat bytearray also as bytestring, just mutable · e4d5cb21

Kirill Smelkov authored Oct 07, 2022

bytearray was introduced in Python as a mutable version of bytes. It has
all string methods (e.g. .capitalize(), .islower(), etc), and it also
supports % formatting. In other words it has all attributes of a
byte-string, the only difference from bytes being that bytearray is
mutable. This makes bytearray handy when a string is constructed
incrementally, step by step, without the overhead of creating and
destroying many bytes objects.

So, since bytearray is also a bytestring, similarly to bytes, let's add
support to interoperate with bytearray to bstr and ustr:

- b/u and bstr/ustr now accept bytearray as argument and treat it as bytestring.
- bytearray() constructor, similarly to bytes() and unicode()
  constructors, now also accepts bstr/ustr and creates a bytearray object
  corresponding to the byte-stream of the input.

For the latter point to work we need to patch bytearray.__init__() a bit,
since, contrary to bytes.__init__(), it does not pay attention to
whether provided argument has __bytes__ method or not.

golang_str: Implement bstr/ustr constructors · 781802d4

Kirill Smelkov authored Oct 06, 2022

Both bstr and ustr constructors mimic constructor of unicode(= str on py3) -
an object is either stringified, or decoded if it provides buffer
interface, or the constructor is invoked with optional encoding and
errors argument:

    # py2
    class unicode(basestring)
     |  unicode(object='') -> unicode object
     |  unicode(string[, encoding[, errors]]) -> unicode object

    # py3
    class str(object)
     |  str(object='') -> str
     |  str(bytes_or_buffer[, encoding[, errors]]) -> str

Stringification of all bstr/ustr / unicode/bytes is handled
automatically with the meaning to convert to created type via b or u.

We follow unicode semantic for both ustr _and_ bstr, because bstr/ustr
are intended to be used as strings.

golang_str: Teach bstr/ustr to compare wrt any string with automatic coercion · 54c2a3cf

Kirill Smelkov authored Oct 05, 2022

So that e.g. `bstr == <any string type>` works. We want `bstr == ustr`
to work because we intend those types to be interoperable. We also want
e.g. `bstr == "a_string"` to work because we want bstr to be
interoperable with standard strings. In general we want to have full
automatic interoperability with all string types, so that e.g. `bstr == X`
works for X being all bstr, ustr, unicode, bytes (and later bytearray).

For now we add support only for comparison operators. But later, we
will be adding support for e.g. +, string methods, etc - and in all
those operations we will be following the same approach: to have
automatic interoperability with all string types out of the box.
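
A pure-Python sketch of comparison with automatic coercion, written for
py3. The class `CoercingBytes` is hypothetical: it only shows the idea
of coercing the other operand to bytes before comparing, and it assumes
UTF-8 with the stdlib `surrogateescape` handler.

```python
class CoercingBytes(bytes):
    """bytes subclass whose ==/!= accept str, bytes and bytearray."""

    @staticmethod
    def _as_bytes(other):
        if isinstance(other, str):
            return other.encode('utf-8', 'surrogateescape')
        if isinstance(other, (bytes, bytearray)):
            return bytes(other)
        return None                    # not a string type

    def __eq__(self, other):
        raw = self._as_bytes(other)
        if raw is None:
            return NotImplemented
        return bytes.__eq__(self, raw)

    def __ne__(self, other):
        eq = self.__eq__(other)
        return eq if eq is NotImplemented else not eq

    # defining __eq__ would otherwise make the class unhashable
    __hash__ = bytes.__hash__
```

Because str.__eq__ returns NotImplemented for a non-str operand, Python
falls back to the reflected `CoercingBytes.__eq__`, so the comparison
works with the standard string on either side.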

The text added to README reflects this.

The patch to unicode.tp_richcompare on py2 illustrates our approach to
adjust builtin types when absolutely needed. In this particular case
original builtin unicode.__eq__(unicode, bstr) is always returning False
for non-ASCII bstr even despite bstr having .__unicode__() method. Our
adjustment is non-intrusive - we adjust unicode behaviour only wrt bstr
and it stays exactly the same as before wrt all other types.

We anyway do that with care and add a test that verifies that behaviour
of what we patched stays unaffected when used outside of bstr/ustr
context.

golang_str: Infrastructure to patch builtin types · 34667355

Kirill Smelkov authored Oct 06, 2022

_patch_slot(typ, slotname, func) installs func into typ's
dict[slotname]. For example, in the next patch we will need to adjust
unicode.__eq__ on py2 so that it does not reject bstr by always assuming
that `unicode == bstr` is False. We will do it by patching
unicode.__eq__ to first check whether rhs is bstr and handle that case
with our code, while tail-calling the original unicode.__eq__ for all
other types.
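
The "handle our type first, otherwise defer to the original" pattern can
be sketched on ordinary Python classes. Builtin types cannot be modified
with setattr from Python code, which is why the commit needs dedicated
infrastructure; `patch_method`, `Word` and `ByteWord` below are
hypothetical stand-ins and not part of pygolang.

```python
def patch_method(cls, name, make_wrapper):
    """Replace cls.<name> by make_wrapper(original); return the original."""
    original = getattr(cls, name)
    setattr(cls, name, make_wrapper(original))
    return original


class Word:                      # stands in for the type being patched
    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Word) and self.text == other.text


class ByteWord:                  # stands in for bstr
    def __init__(self, raw):
        self.raw = raw


def eq_aware_of_byteword(original_eq):
    def __eq__(self, other):
        if isinstance(other, ByteWord):          # our special case
            return self.text == other.raw.decode('utf-8')
        return original_eq(self, other)          # everything else
    return __eq__


patch_method(Word, '__eq__', eq_aware_of_byteword)
```

Behaviour for all operands other than ByteWord stays exactly as it was
before patching, which is the non-intrusive property the commit aims for.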

golang_str: Refresh b/u and bstr/ustr docstrings · 88b21b40

Kirill Smelkov authored Oct 05, 2022

Document explicitly which types b/u accept and how they are handled.
Change bstr/ustr docstrings to also be more explicit.

Documentation changes only.

golang_str: Make bytes(bstr) -> bstr, unicode(ustr) -> ustr · b7cda092

Kirill Smelkov authored Oct 05, 2022

In other words casting to bytes/unicode preserves pygolang string to
remain pygolang string.

Without the changes to bstr/ustr added test fails as e.g.

    >       assert bytes  (bs) is bs
    E       AssertionError: assert b'\xd0\xbc\xd0\xb8\xd1\x80' is b'\xd0\xbc\xd0\xb8\xd1\x80'
    E        +  where b'\xd0\xbc\xd0\xb8\xd1\x80' = bytes(b'\xd0\xbc\xd0\xb8\xd1\x80')

in other words bytes(bstr) was creating a copy and changing type to bytes.
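
On py3 one way to make `bytes(x)` return the very same object is a
`__bytes__` method that returns self. The sketch below shows this on
hypothetical classes; whether bstr achieves it in exactly this way is an
assumption.

```python
class PlainSub(bytes):
    pass                      # default: bytes(obj) makes a plain copy


class StickyBytes(bytes):
    def __bytes__(self):
        return self           # casting to bytes keeps the object


plain = PlainSub(b'abc')
sticky = StickyBytes(b'abc')
plain_cast = bytes(plain)
sticky_cast = bytes(sticky)
```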

golang_str: Extend tests a bit · 85c4615d

Kirill Smelkov authored Oct 05, 2022

Extend current coverage for b/u tests more explicitly verifying
resulting type (`type(·) is ...` instead of `isinstance(·, ...)`),
verifying unicode(bstr)->ustr and bytes(ustr)->bstr, and str() of both
bstr and ustr.

Move the check for "no custom attributes" from test_qq to generic
test_strings_basic, because now verified string types are publicly
accessible, not only via qq.

Small cosmetics in benchmarks - by reusing hereby introduced xbytes()
utility.

No change for the code itself - the tests just add verification to
current status.

08 Oct, 2022 1 commit

golang_str: Start exposing Pygolang string types publicly · 1f99393d

Kirill Smelkov authored Oct 05, 2022

In 2020 in edc7aaab (golang: Teach qq to be usable with both bytes and
str format whatever type qq argument is) I added custom bytes- and
unicode- like types for qq to return instead of str with the idea for
qq's result to be interoperable with both bytes and unicode. Citing that patch:

    qq is used to quote strings or byte-strings. The following example
    illustrates the problem we are currently hitting in zodbtools with
    Python3:

        >>> "hello %s" % qq("мир")
        'hello "мир"'

        >>> b"hello %s" % qq("мир")
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'

        >>> "hello %s" % qq(b("мир"))
        'hello "мир"'

        >>> b"hello %s" % qq(b("мир"))
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'

    i.e. one way or another if type of format string and what qq returns do not
    match it creates a TypeError.

    We want qq(obj) to be useable with both string and bytestring format.

    For that let's teach qq to return special str- and bytes- derived types that
    know how to automatically convert to str->bytes and bytes->str via b/u
    correspondingly. This way formatting works whatever types combination it was
    for format and for qq, and the whole result has the same type as format.

    For now we teach only qq to use new types and don't generally expose
    _str and _unicode to be returned by b and u yet. However we might do so
    in the future after incrementally gaining a bit more experience.

So two years later I gained that experience and found that having a
string type that can interoperate with both bytes and unicode is
generally useful. It is useful for practical backward compatibility with
Python2 and for simplicity of programming, avoiding a constant stream of
encode/decode noise. Thus the day to expose Pygolang string types for
general use has come.
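
The mechanism from the cited patch can be sketched on py3 with a str
subclass that also knows how to convert itself to bytes. The class
`BothWays` is hypothetical; it relies only on the documented rule that
bytes %-formatting accepts objects implementing `__bytes__`.

```python
class BothWays(str):
    """str subclass usable in both str and bytes % formatting."""

    def __bytes__(self):
        return self.encode('utf-8')


quoted = BothWays('"мир"')
as_text = "hello %s" % quoted      # result is str
as_bytes = b"hello %s" % quoted    # result is bytes, no TypeError
```

In both cases the result has the type of the format string, which is the
property the cited patch asks of qq's return value.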

This patch does the first small step: it exposes bytes- and unicode-
like types (now named bstr and ustr) publicly. It switches b and u to
return bstr and ustr correspondingly instead of bytes and unicode. This
is a change in behaviour, but hopefully it should not break anything, as
there are not many b/u users currently and bstr and ustr are intended to
be drop-in replacements for standard string types.

Next patches will enhance bstr/ustr step by step until they really are
drop-in replacements for standard string types.

See nexedi/zodbtools!13 (comment 81646)
for preliminary discussion from 2019.

See also "Python 3 Losses: Nexedi Perspective"[1] and associated "cost
overview"[2] for related presentation by Jean-Paul from 2018.

[1] https://www.nexedi.com/NXD-Presentation.Multicore.PyconFR.2018?portal_skin=CI_slideshow#/20/1
[2] https://www.nexedi.com/NXD-Presentation.Multicore.PyconFR.2018?portal_skin=CI_slideshow#/20

05 Oct, 2022 2 commits

py.bench: Automatically discover benchmarks in test files · ffb40903

Kirill Smelkov authored Oct 04, 2022

Since the beginning (9bf03d9c "py.bench: New command to benchmark python
code similarly to `go test -bench`") py.bench was automatically
discovering benchmarks in bench_*.py files only. This was inherited from
wendelin.core which keeps its benchmarks in those files.

However in pygolang, following Go convention(*), we already have several
benchmarks that reside together with tests in same *_test.py files. And
currently just running py.bench does not discover them.

-> Let's fix this and teach py.bench to automatically discover
benchmarks in the test files by default as well.

Pytest's default is to look for tests in test_*.py and *_test.py (+).
Add those patterns and also keep bench_*.py for backward compatibility.
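
The effect of the three filename patterns can be illustrated with the
stdlib `fnmatch` module. This is only a sketch of the matching rule; the
names `BENCH_FILE_PATTERNS` and `is_benchmark_file` are hypothetical and
do not show how py.bench passes the patterns to pytest.

```python
from fnmatch import fnmatch

# bench_*.py is kept for backward compatibility; the other two are
# pytest's default patterns for test files.
BENCH_FILE_PATTERNS = ('bench_*.py', 'test_*.py', '*_test.py')


def is_benchmark_file(filename):
    """Tell whether a file name would be searched for benchmarks."""
    return any(fnmatch(filename, pat) for pat in BENCH_FILE_PATTERNS)


found = [f for f in ('golang_str_test.py', 'bench_x.py', 'setup.py')
         if is_benchmark_file(f)]
```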

Before this patch running py.bench inside pygolang repository does not
run any benchmark at all. After the patch py.bench runs all the
benchmarks by default:

    (z-dev) kirr@deca:~/src/tools/go/pygolang$ py.bench
    ========================= test session starts ==========================
    platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.10.0, pluggy-0.13.1
    rootdir: /home/kirr/src/tools/go/pygolang
    plugins: timeout-1.4.2, profiling-1.7.0, mock-2.0.0
    collected 18 items

    pymod: golang/golang_str_test.py
    Benchmarkstddecode              2000000 0.756 µs/op
    Benchmarkudecode                20000   74.359 µs/op
    Benchmarkstdencode              3000000 0.327 µs/op
    Benchmarkbencode                40000   32.613 µs/op

    pymod: golang/golang_test.py
    Benchmarkpyx_select_nogil       500000  2.051 µs/op
    Benchmarkpyx_go_nogil           90000   12.177 µs/op
    Benchmarkpyx_chan_nogil         600000  1.826 µs/op
    Benchmarkgo                     80000   13.267 µs/op
    Benchmarkchan                   500000  2.076 µs/op
    Benchmarkselect                 300000  3.835 µs/op
    Benchmarkdef                    30000000        0.035 µs/op
    Benchmarkfunc_def               40000   29.387 µs/op
    Benchmarkcall                   30000000        0.043 µs/op
    Benchmarkfunc_call              2000000 0.819 µs/op
    Benchmarktry_finally            20000000        0.096 µs/op
    Benchmarkdefer                  600000  1.755 µs/op

    pymod: golang/sync_test.py
    Benchmarkworkgroup_empty        40000   25.807 µs/op
    Benchmarkworkgroup_raise        40000   31.637 µs/op                     [100%]

    =========================== warnings summary ===========================

(*) see https://pkg.go.dev/cmd/go#hdr-Test_packages
(+) see https://docs.pytest.org/en/7.1.x/reference/reference.html#confval-python_files

/reviewed-by @jerome
/reviewed-on !20

golang_str: Speedup utf-8 decoding a bit on py2 · 9cb7b210

Kirill Smelkov authored Oct 04, 2022

We recently moved our custom UTF-8 encoding/decoding routines to Cython.
Now we can start taking speedup advantage on C level to make our own
UTF-8 decoder a bit less horribly slow on py2:

    name       old time/op  new time/op  delta
    stddecode   752ns ± 0%   743ns ± 0%   -1.19%  (p=0.000 n=9+10)
    udecode     216µs ± 0%    75µs ± 0%  -65.19%  (p=0.000 n=9+10)
    stdencode   328ns ± 2%   327ns ± 1%     ~     (p=0.252 n=10+9)
    bencode    34.1µs ± 1%  32.1µs ± 1%   -5.92%  (p=0.000 n=10+10)

So it is ~ 3x speedup for u(), but still significantly slower compared
to std unicode.decode('utf-8').
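
As a worked check of the "~ 3x" figure, the speedup and relative delta
can be recomputed from the rounded udecode timings in the table. The
helper names are illustrative; the table's own -65.19% is presumably
derived from unrounded measurements, so the value obtained here differs
slightly.

```python
def speedup(old, new):
    """How many times faster the new timing is."""
    return old / new


def delta_percent(old, new):
    """Relative change of the timing, in percent (negative = faster)."""
    return (new - old) / old * 100.0


udecode_speedup = speedup(216.0, 75.0)       # 216µs -> 75µs
udecode_delta = delta_percent(216.0, 75.0)   # close to the table's delta
```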

Only low-hanging fruit here: make _utf8_decode_rune a bit faster,
since it sits in the innermost loop. In the future
_utf8_decode_surrogateescape might be reworked as well to avoid
constructing resulting unicode via py-level list of py-unicode character
objects. And similarly for _utf8_encode_surrogateescape.

On py3 the performance of std and u/b decode/encode is approximately the same.

/trusted-by @jerome
/reviewed-on !19

04 Oct, 2022 4 commits

golang_str,strconv: Fix decoding of rune-error · 598eb479

Kirill Smelkov authored Oct 03, 2022

Error rune (u+fffd) is returned by _utf8_decode_rune to indicate an
error in decoding. But the error rune itself is valid unicode codepoint:

   >>> x = u"�"
   >>> x
   u'\ufffd'
   >>> x.encode('utf-8')
   '\xef\xbf\xbd'

This way only (r=_rune_error, size=1) should be treated by the caller as
utf8 decoding error.

But e.g. strconv.quote was not careful to also inspect the size, and this way
was quoting � into just "\xef" instead of "\xef\xbf\xbd".
_utf8_decode_surrogateescape was also subject to similar error.

-> Fix it.
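
The distinction can be sketched with a small Go-style decoder built on
the stdlib codec. `decode_rune` and `is_decode_error` are hypothetical
helpers written for py3 and are not the _utf8_decode_rune routine; they
only show why the caller must look at the size as well as at the rune.

```python
RUNE_ERROR = '\ufffd'


def decode_rune(data):
    """Decode the first character of UTF-8 data; return (rune, size).

    Invalid input yields (RUNE_ERROR, 1); empty input yields
    (RUNE_ERROR, 0), as in Go's utf8.DecodeRune.
    """
    if not data:
        return RUNE_ERROR, 0
    for size in range(1, 5):           # a UTF-8 sequence is 1..4 bytes
        try:
            return data[:size].decode('utf-8'), size
        except UnicodeDecodeError:
            continue
    return RUNE_ERROR, 1


def is_decode_error(rune, size):
    # A validly encoded U+FFFD has size 3, so size must be checked too.
    return rune == RUNE_ERROR and size == 1


valid = decode_rune('\ufffd'.encode('utf-8'))   # ('\ufffd', 3)
broken = decode_rune(b'\xef')                   # ('\ufffd', 1)
```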

Without the fix e.g. added test for strconv.quote fails as

    >           assert quote(tin) == tquoted
    E           assert '"\xef"' == '"�"'
    E             - "\xef"
    E             + "�"

/reviewed-by @jerome
/reviewed-at nexedi/pygolang!18

golang_str: Move py3/py2 conditioning into _utf8_{encode,decode}_surrogateescape · ea5abe71

Kirill Smelkov authored Oct 03, 2022

So that those routines could be just called and do what is expected
without the caller caring whether it is py2 or py3. We will soon need to
use those routines from several callsites, and having that py2/py3
conditioning being spread over all usage places would be inconvenient.

/reviewed-by @jerome
/reviewed-at nexedi/pygolang!18

strconv: Move functionality related to UTF8 encode/decode into _golang_str · 50b8cb7e

Kirill Smelkov authored Oct 03, 2022

- Move _utf8_decode_rune, _utf8_decode_surrogateescape, _utf8_encode_surrogateescape out from strconv into _golang_str
- Factor _bstr/_ustr code into pyb/pyu. _bstr/_ustr become plain wrappers over pyb/pyu.
- Work around the emerged golang ↔ strconv dependency with an at-runtime import.

Moved routines belong to the main part of golang strings processing
-> their home should be in _golang_str.pyx

/reviewed-by @jerome
/reviewed-at nexedi/pygolang!18

golang: Move strings-related code to _golang_str "submodule" · e72a459f

Kirill Smelkov authored Oct 03, 2022

We are going to significantly extend py-strings related functionality
soon - to the point where the amount of strings-related code will be
approximately the same as the amount of all other python-related code
inside the golang module.

-> First move everything related to py strings to dedicated
_golang_str.pyx as a preparatory step.

Keep that new file included from _golang.pyx instead of being real new
module, because we want strings functionality to be provided by golang
main namespace itself, and to ease internal code interdependencies.

Plain code movement.

/reviewed-by @jerome
/reviewed-at nexedi/pygolang!18

26 Jan, 2022 15 commits

pygolang v0.1 · 7b72d418

Kirill Smelkov authored Jan 26, 2022

golang: Fix print(_pystr) · 08dc5d10

Kirill Smelkov authored Jan 24, 2022

On Python2 without .tp_print printing _pystr crashes as:

    pygolang$ ./golang/testprog/golang_test_str.py
    Traceback (most recent call last):
      File "./golang/testprog/golang_test_str.py", line 39, in <module>
        main()
      File "./golang/testprog/golang_test_str.py", line 34, in main
        print("print(qq(b)):", qq(sb))
    RuntimeError: print recursion

See added comments for details.

os += ReadFile · 2a35ef5b

Kirill Smelkov authored Jan 26, 2022

Add convenient utility to read whole file and return its content
similarly to Go. The code is taken from wendelin.core:

https://lab.nexedi.com/nexedi/wendelin.core/blob/wendelin.core-2.0.alpha1-18-g38dde766/wcfs/client/wcfs_misc.cpp#L246-281

Nogil signals · e18adbab

Kirill Smelkov authored Jan 24, 2022

Provide os/signal package that can be used to setup signal delivery to nogil
channels. This way for user code signal handling becomes regular handling of a
signalling channel instead of being something special or limited to only-main
python thread. The rationale for why we need it is explained below:

There are several problems with regular python's stdlib signal module:

1. Python2 does not call signal handler from under blocked lock.acquire.
This means that if the main thread is blocked waiting on a semaphore,
signal delivery will be delayed indefinitely, similarly to e.g. problem
described in nxdtest!14 (comment 147527)
where raising KeyboardInterrupt is delayed after SIGINT for many,
potentially unbounded, seconds until ~semaphore wait finishes.

Note that Python3 does not have this problem wrt stdlib locks and
semaphores, but read below for the next point.

2. all pygolang communication operations (channels send/recv, sync.Mutex,
sync.RWMutex, sync.Sema, sync.WaitGroup, sync.WorkGroup, ...) run with
GIL released, but if blocked do not handle EINTR and do not schedule
python signal handler to run (on main thread).

Even if we could theoretically adjust this behaviour of pygolang at python
level to match Python3, there are also C++ and pyx/nogil worlds. And we want gil
and nogil worlds to interoperate (see https://pypi.org/project/pygolang/#cython-nogil-api),
so that e.g. if completely nogil code happens to run on the main thread,
signal handling is still possible, even if that signal handling was setup at
python level.

With signals delivered to nogil channels, both the nogil world and the
python world can set up signal handlers and be notified of signals
regardless of whether the main python thread is currently blocked in a
nogil wait or not.
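
The user-facing model, where handling a signal becomes receiving from a
queue-like object, can be sketched with the stdlib alone. The names
`make_forwarder` and `notify` are hypothetical and are not the os/signal
API of this package; being built on the stdlib signal module, the sketch
also keeps the limitations listed above (handlers run only on the main
python thread).

```python
import queue
import signal


def make_forwarder(q):
    """Build a handler that forwards the received signal number into q."""
    def handler(signum, frame):
        q.put(signum)       # SimpleQueue.put is documented as reentrant
    return handler


def notify(q, signum):
    """Install forwarding of signum into q; must run on the main thread."""
    return signal.signal(signum, make_forwarder(q))


# Simulate a delivery by invoking the handler directly.
sigq = queue.SimpleQueue()
make_forwarder(sigq)(signal.SIGINT, None)
received = sigq.get_nowait()
```

After `notify(sigq, signal.SIGINT)` the consumer simply waits in
`sigq.get()` instead of relying on KeyboardInterrupt.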

/reviewed-on !17

golang: Provide __pystr internally · ce507f4e

Kirill Smelkov authored Jan 24, 2022

To convert an object to str of current python.
It will be handy to use __pystr when implementing __str__ methods.

/reviewed-on nexedi/pygolang!17

Nogil IO · 4690460b

Kirill Smelkov authored Jan 24, 2022

Provide C++ package "os" with File, Pipe, etc similarly to what is
provided on Go side. The package works through IO methods provided by
runtimes.

We need IO facility because os/signal package will need to use
pipe in cooperative IO mode in its receiving-loop goroutine.

os.h and os.cpp are based on drafts from wendelin.core:

https://lab.nexedi.com/nexedi/wendelin.core/blob/wendelin.core-2.0.alpha1-18-g38dde766/wcfs/client/wcfs_misc.h
https://lab.nexedi.com/nexedi/wendelin.core/blob/wendelin.core-2.0.alpha1-18-g38dde766/wcfs/client/wcfs_misc.cpp

/reviewed-on !17

libgolang/gevent: Put explicit try/catch boundary for tasks spawned via go · 07cae4e9

Kirill Smelkov authored Jan 24, 2022

Else as https://github.com/python-greenlet/greenlet/pull/285
demonstrates there can be segmentation faults and crashes due to
exceptions from one greenlet propagating to C stack of another greenlet.

No test here. I've tried to do it, but with gevent (contrary to plain
greenlets), spawning new task only schedules corresponding greenlet to
run in the end of current event loop cycle instead of switching to
created greenlet immediately. With this delaying, it was hard for me to
develop corresponding test in a reasonable time.

Hopefully having the test I've done for greenlet itself + hereby
protection is good enough.

/reviewed-on nexedi/pygolang!17

internal/atomic: New package · d358fa75

Kirill Smelkov authored Jan 24, 2022

This package provides special kind of atomic that is automatically reset
to zero after fork in child. This kind of atomic will be used in os
package to implement IO that does not deadlock in Close after fork.

/reviewed-on nexedi/pygolang!17

internal/syscall: New package · c2471014

Kirill Smelkov authored Jan 24, 2022

This package provides wrappers to OS system calls.
It offers only the minimal functionality that will be used to implement
the os and os/signal packages.

/reviewed-on !17

libgolang: Export _runtime internally · 3a131a51

Kirill Smelkov authored Jan 24, 2022

Package os will need to access runtime operations to implement IO.

/reviewed-on nexedi/pygolang!17

libgolang/{thread,gevent}: Switch runtimes to C++ · 3a838d24

Kirill Smelkov authored Jan 24, 2022

We will soon need to use common functionality(*) from both runtimes and
other packages. The "other packages" are all C++ and it is handy to keep
common functionality in C++ as well. While we could also maintain
`extern "C"` interface, it duplicates the work. Let's switch everything
to C++ to ease further maintenance.

(*) e.g. package internal/syscall from both runtimes and from package os.

/reviewed-on nexedi/pygolang!17

libgolang: Provide std::hash for chan · 1fad944d

Kirill Smelkov authored Jan 24, 2022

Without defined std::hash it is not possible to use channels as keys in
dict or set. We will be using set<chan> in os/signal package
implementation.

/reviewed-on !17

pyx.build: Add runtime/_libgolang.pxd to dependencies · 60de9538

Kirill Smelkov authored Jan 24, 2022

I forgot to include it in ad00be70 (libgolang: Introduce runtimes).

/reviewed-on nexedi/pygolang!17

pyx.build: Simplify listing dependencies · ff9beb02

Kirill Smelkov authored Jan 24, 2022

/reviewed-on !17

manifest += .nxdtest · ad84cf76

Kirill Smelkov authored Jan 24, 2022

    $ check-manifest
    lists of files in version control and sdist do not match!
    missing from sdist:
      .nxdtest

/reviewed-on !17

08 Dec, 2021 1 commit

pygolang v0.0.9 · e503beb0

Kirill Smelkov authored Dec 08, 2021