- 04 Nov, 2019 3 commits
-
-
Kirill Smelkov authored
e.g. go -> pygo, select -> pyselect, etc. This will make the followup transition to non-py mode more clear.
-
Kirill Smelkov authored
Similarly to 0e838833 (time: Turn Ticker and Timer into cdef classes) rework all classes in context package to be cdef classes: - cdef their attributes (else accessing any of them raises AttributeError). But don't change any of them to be used in cimport mode yet - this will be done later as separate steps.
-
Kirill Smelkov authored
In preparation to start migrating, at least partially, context functionality to nogil mode, first, similarly to package time (see 32f34607 "time: Move code to pyx"), move the code from context.py to _context.pyx. This is straight code movement + associated setup.py & co adjusts. We don't move just to context.pyx (note no _ prefix), since we will need to continue distinguishing pyx/nogil from py objects/functions.
-
- 31 Oct, 2019 1 commit
-
-
Kirill Smelkov authored
Gcc 5.4 from Ubuntu 16.04 LTS complains e.g. ./golang/libgolang.h: In function ‘golang::_selcase golang::_selrecv_(golang::_chan*, void*, bool*)’: ./golang/libgolang.h:227:5: sorry, unimplemented: non-trivial designated initializers not supported }; ^ The problem started to appear after 47111d3e (libgolang: Teach select to accept inplace tx data) when we moved _selcase.ptxrx inside union. Let's add workaround for older compilers, even though e.g. gcc 8.3 from Debian 10 accepts existing code just fine.
-
- 30 Oct, 2019 1 commit
-
-
Kirill Smelkov authored
Provide sync.WaitGroup that can be used directly from C++ and Pyx/nogil codes. Python-level sync.WaitGroup becomes small wrapper around pyx/nogil one. Python-level tests should be enough to cover C++/Pyx functionality at zero-level approximation.
-
- 24 Oct, 2019 1 commit
-
-
Kirill Smelkov authored
Provide sync.Once that can be used directly from C++ and Pyx/nogil worlds. Python-level sync.Once becomes small wrapper around pyx/nogil sync.Once .
-
- 22 Oct, 2019 1 commit
-
-
Kirill Smelkov authored
Libgolang, since 3b241983 (Port/move channels to C/C++/Pyx), already had defer macro implemented and used, but only internally. The reason it was not yet exposed as public API is that there is a difference with Go's defer in that deferred function is called at end of current scope instead of end of current function, and I was a bit reluctant to expose defer with different-than-Go semantic. However even with this difference defer is useful, and the difference can be documented. Unfortunately it is not easy to correctly fix the difference, so the most practical way for now is to expose defer as it is. I've also contemplated how to avoid using macro, but without a macro, users will have to explicitly declare placeholder variable for every defer call which goes against usability. Since defer is exposed as macro, I've also contemplated to expose it as something like `libgolang_defer` with the idea to avoid name conflicts, and so that users - that are using defer - will be doing `#define defer libgolang_defer`. However I ended up not doing that and exposing `defer` macro with its own name. My rationale is: - grepping /usr/include/ for \<defer\> on my system did not showed any real usage. - Qt also #defines `slots` and `signals` and that does not cause problems in practice. -> expose `defer` macro into public C++ API as is, so that it can be used not only inside libgolang.cpp . For example I myself need defer functionality in C++ part of wendelin.core.
-
- 18 Oct, 2019 3 commits
-
-
Kirill Smelkov authored
Else the exception, even if it was recovered from, will be included as cause for next raised exception. See added test for details.
-
Kirill Smelkov authored
Python3 chains exceptions, so that e.g. if exc1 is raised and, while it was not handled, another exc2 is raised, exc2 will be linked to exc1 via exc2.__context__ attribute and exc1 will be included into exc2 traceback printout. However many projects still use Python2 and there is no similar chaining functionality there. This way exc1 is completely lost. Since defer code is in our hands, we can teach it to implement exception chaining even on Python2 by carefully analyzing what happens in _GoFrame.__exit__(). Implementing chaining itself is relatively easy, but is only part of the story. Even if an exception is chained with its cause, but exception dump does not show the cause, the chaining will be practically useless. With this in mind this patches settles not only on implementing chaining itself, but on also giving a promise that chained cause exceptions will be included into traceback dumps as well. To realize this promise we adjust all exception dumping funcitons in traceback module and carefully install adjusted traceback.print_exception() into sys.excepthook. This amends python interactive sessions and programs run by python interpreter to include causes in exception dumps. "Careful" here means that we don't change sys.excepthook if on golang module load we see that sys.excepthook was already changed by some other module - e.g. due to IPython session running because IPython installs its own sys.excepthook. In such cases we don't install our sys.excepthook, but we also provide integration patches that add exception chaining support for traceback dump functionality in popular third-party software. The patches (currently for IPython and Pytest) are activated automatically, but only when/if corresponding software is imported and actually used. This should give practically good implementation of the promise - a user can now rely on seeing exception cause in traceback dump whatever way python programs are run. The implementation takes https://pypi.org/project/pep3134/ experience into account [1]. peak.utils.imports [2,3] is used to be notified when/if third-party module is imported. [1] https://github.com/9seconds/pep3134/ [2] https://pypi.org/project/Importing/ [3] http://peak.telecommunity.com/DevCenter/Importing This patch originally started as hacky workaround in wendelin.core because in wcfs tests I was frequently hitting situations, where exception raised by an assert was hidden by another exception raised in further generic teardown check. For example wcfs tests check that wcfs is unmounted after every test run [4] and if that fails it was hiding problems raised by an assert. As the result I was constantly guessing and adding code like [5] to find what was actually breaking. At some point I added hacky workaround for defer to print cause exception not to loose it [6]. [7] has more context and background discussion on this topic. [4] https://lab.nexedi.com/kirr/wendelin.core/blob/49e73a6d/wcfs/wcfs_test.py#L70 [5] https://lab.nexedi.com/kirr/wendelin.core/blob/49e73a6d/wcfs/wcfs_test.py#L853-857 [6] kirr/wendelin.core@c00d94c7 [7] zodbtools!13 (comment 81553) After this patch, on Python2 defer(cleanup1) defer(cleanup2) defer(cleanup3) ... is no longer just a syntatic sugar for try: try: try: ... finally: cleanup3() finally: cleanup2() finally: cleanup1()
-
Kirill Smelkov authored
- _pyrun runs the command and returns full information: exitcode, stdout, stderr. - pyrun runs the command and raises exception if ran command fails. We will need _pyrun in the next patch to test that particular command fails and access its stderr.
-
- 15 Oct, 2019 2 commits
-
-
Kirill Smelkov authored
_makesema declaration was giving a warning when compiling with C: In file included from golang/runtime/libgolang_test_c.c:26: ./golang/libgolang.h:237:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes] LIBGOLANG_API _sema *_makesema(); ^~~~~~~~~~~~~ Fix the prototype to be valid for both C and C++. Fixes: 34b7a1f4.
-
Kirill Smelkov authored
E.g. wendelin.core amends build_ext class and so it needs a way to import it.
-
- 14 Oct, 2019 9 commits
-
-
Kirill Smelkov authored
- use .c.chan_double() which gives chan[double] pyx/nogil way to access Ticker.c and Timer.c. Use the channels via pyx/nogil API from inside. - use pyx/nogil sleep and now; This gets time.pyx codebase closer to be used from pyx/nogil mode. NOTE: unless something like pyx/nogil memory management emerges[1] we are relying on Python to manage memory of Ticker and Timer classes. If we just spawn e.g. Ticker.__tick via pyx/nogil go, the thread that is spawned won't be holding a reference to Ticker object, and once the ticker goes out of scope in original thread (while its channel .c might be still in scope), __tick will segfault accessing freed Ticker object. To workaround it we use the following pattern: nogilready = chan(dtype='C.structZ') pygo(mymeth) nogilready.recv() def mymeth(MyObject self, pychan nogilready) with nogil: nogilready.chan_structZ().close() self._mymeth() cdef void _mymeth(MyObject self) nogil: ... where python reference to MyObject will be held in spawned thread during its lifetime, while the service provided by mymeth will be done under nogil. [1] https://www.nexedi.com/blog/NXD-Document.Blog.Cypclass
-
Kirill Smelkov authored
This will allow to use the channels in nogil mode including in followup patches touching time.pyx codebase.
-
Kirill Smelkov authored
C.structZ is empty structure and chan[C.structZ] can be used in pyx/nogil world. This way context tree initially created from Python can be extended in pyx/nogil and e.g. root context can be canceled from Python, which will correctly transfer the cancelation signal to pyx/nogil world too.
-
Kirill Smelkov authored
Introduce notion of data type (dtype similarly to NumPy) into pychan and teach it to accept for send objects only matching that dtype. Likewise teach pychan to decode raw bytes received from underlying channel into Python object correspodningg to pychan dtype. For C dtypes, e.g. 'C.int', 'C.double' etc, contrary to chan of python objects, the transfer can be done without depending on Python GIL. This way channels of such C-level dtypes can be used to implement interaction in between Python and nogil worlds.
-
Kirill Smelkov authored
In a followup commit we are going to add channel types (e.g. chan of double, chan of int, etc) and accordingly there will be several nil channel objects, e.g. nil(dtype=int), nil(dtype=double) etc, which will be separate python objects. Even without data types, another planned change is to add directional channels, e.g. a channel instance that can only send, but not receive and vice versa(*). This way for the same underlying channel object, there can be several pychan objects that point to it - even for nil channel - e.g. nilchan and `niltx = nilchan.txonly()` that creates another pychan object pointing to the same underlying nil. Since we want all channels (of the same type) that point to the same underlying channel to compare as same, we cannot use "is" for comparison and have to use ==. In other words channels, both at C and Python level, should be perceived as pointers, of which there can be multiple ones pointing to the same location, and thus == has to be used to compare them. (*) see https://golang.org/ref/spec#Channel_types for details.
-
Kirill Smelkov authored
For example - def send(pych, obj): + def send(pychan pych, obj): Even though Cython allows not to annotate self with type, this makes the code more clear.
-
Kirill Smelkov authored
This prepares pyselect codebase for future logic where all channel element types will be sent via inplace _selcase data. For PyObject we could previously go with "wiring ptx through pycase[1]", but for arbitrary type, that has to be first converted from Python object to C-level object, we would have to store the result somewhere, and that would mean extra allocation and pyselect code complexity increase, even for cases that don't use anything but chan[object]. So do the preparation and switch pyselect into using inplace tx.
-
Kirill Smelkov authored
Currently select, via _selcase, requires users to provide pointers to tx and rx buffers. However if element type itself can fit into a pointer word, we can put the element directly into _selcase and mark the case with a flag, that it contains inplace data instead of referring to external storage. This will be helpful in upcoming patch where we'll teach pychan to work with several element types, not only pyobject elements. This patch does careful introduction of _selcase.flags - in such a way that the size of _selcase stays the same as it was before by using bitfields. The .ptxrx pointer is unioned with newly introduced inplace uint64 .itxrx data, which is used by select instead of .ptxrx if the flag is set. The usage of uint64 should not increase _selcase size on 64-bit platforms. Then _selcase.ptx() and .prx() accessors are adapted accordingly and the rest of the changes are corresponding test and _chanselect2<onstack=false> adaptation. This functionality is kind of low-level and is not exposed via any _selsend() or chan.sends() API changes. Whenever inplace tx should be used, the case should be prepared either completely manually, or with e.g. first calling _selsend() and then manually changing .flags and .itxrx. Added test serves as the example on how to do it. Inplace rx is currently forbidden - because supporting that would require to drop const from casev select argument. However in the future, for symmetry, we might want to support that as well. P.S. Since write to selcase.itxrx requires casting pointers e.g. like this: *(int *)&sel[0].itxrx = 12345; it breaks C99 strict aliasing and by default compiler can generate bad code on such pattern. To the problem we adapt the build system to default compiler to no-strict-aliasing (as many other projects do, e.g. Linux kernel) with the idea that in many cases where strict aliasing was intended to help it actually does not, because e.g. pointer types are the same, while explicitly marking pointers with `restrict` keyword does help indeed. Nothing new for Python2 here, as it is using -fno-strict-aliasing by itself. However Python3 is compiling without -fno-strict-aliasing: https://python.org/dev/peps/pep-3123 .
-
Kirill Smelkov authored
In the next patch we are going to teach _selcase to support both external and inplace data. Before that let's do a couple of preparatory things. This patch: introduces .ptx() and .prx() accessors to get to corresponding data buffer associated with _selcase. Convert _selcase users to use .ptx() and .prx() instead of doing direct .ptxrx access. This way when we'll add inplace support to _selcase, we'll need to adapt only accessors, not clients. The only place that is left using .ptxrx directly is one tricky place in _chanselect2<onstack=false> where gevent-related code needs to carefully deal with proxying tx/rx buffers due to STACK_DEAD_WHILE_PARKED. NOTE even though new accessors may panic, libgolang.cpp always calls them after checking that the conditions for ptx/prx calls are valid.
-
- 13 Oct, 2019 3 commits
-
-
Kirill Smelkov authored
In the next patches we are going to teach _selcase to support both external and inplace data. Before that let's do a couple of preparatory things. This patch: rename .data -> .ptxrx . The new name is more clear: - "p" prefix aligns with other libgolang style, e.g. ptx or prx. - "txrx" suffix says that this is used for both send and recv. Just plain renaming, nothing else in this patch.
-
Kirill Smelkov authored
Before passing objects to _chanselect for send, pyselect increfs them, as just send does, to indicate that one object reference is passed to channel buffer. On exit, since only one case is actually executed by select, pyselect needs to decref incref'ed object from not executed cases. Pyselect already implements the latter cleanup, but currently the cleanup is executed only if control flow reaches _chanselect at all. Which is a bug, since pyselect can panic or raise an exception just in the middle of preparation phase. -> Fix it by associating the finally-decref cleanup with whole prepare+_chanselect code. Without the fix, the second part of added test (abnormal exit) fails e.g. like: @mark.skipif(not hasattr(sys, 'getrefcount'), # skipped e.g. on PyPy reason="needs sys.getrefcount") def test_select_refleak(): ch1 = chan() ch2 = chan() obj1 = object() obj2 = object() tx1 = (ch1.send, obj1) tx2 = (ch2.send, obj2) # normal exit gc.collect() nref1 = sys.getrefcount(obj1) nref2 = sys.getrefcount(obj2) _, _rx = select( tx1, # 0 tx2, # 1 default, # 2 ) assert (_, _rx) == (2, None) gc.collect() assert sys.getrefcount(obj1) == nref1 gc.collect() assert sys.getrefcount(obj1) == nref2 # abnormal exit with raises(AttributeError) as exc: select( tx1, # 0 tx2, # 1 'zzz', # 2 causes pyselect to panic ) assert exc.value.args == ("'str' object has no attribute '__self__'",) gc.collect() > assert sys.getrefcount(obj1) == nref1 E assert 4 == 3 E -4 E +3 golang/golang_test.py:690: AssertionError The bug was introduced in 3b241983 (Port/move channels to C/C++/Pyx).
-
Kirill Smelkov authored
We will soon rework pychan to be python wrapper not only for chan<object>, but also for other channels of various C types - e.g. chan<structZ>, chan<int>, etc. To prepare for this let's first rework pychan from using chan[PyObject*] into raw _chan* functions. This will allow us to use the same functions over raw channels while dynamically dispatching on channel element type.
-
- 05 Oct, 2019 3 commits
-
-
Kirill Smelkov authored
-
Kirill Smelkov authored
e.g. go -> pygo, select -> pyselect, etc. This will make the followup transition to non-py mode more clear.
-
Kirill Smelkov authored
- cdef their attributes (else accessing any of them raises AttributeError), in particular use pychan and sync.Mutex in cimport mode. - change `with mu` into explicit lock/unlock (since pyx sync.Mutex does not support with). - use INFINITY instead of None for empty dt.
-
- 04 Oct, 2019 4 commits
-
-
Kirill Smelkov authored
In preparation to start migrating, at least partly, time functionality to nogil mode, move the code from time.py to _time.pyx . This is straight code movement except now -> pynow, and sleep -> pysleep replaces, since in _time.pyx now and sleep were already referring to nogil versions. We don't move just to time.pyx (note no _ prefix), since we will need to continue distinguishing pyx/nogil from py objects/functions, e.g. pyx time.second is C constant, while pyx time.pysecond is pyobject exported to python world as time.second.
-
Kirill Smelkov authored
One place in _time.pyx was overlooked in ce8152a2 (pyx api: Provide sleep).
-
Kirill Smelkov authored
Similarly to 78d85cdc (sync: threading.Event -> chan) replace everywhere threaing.Lock usage with sync.Mutex . This brings 2 goods: - sync.Mutex becomes more well tested; - we untie ourselves from threading python module (threading.Lock was the last user).
-
Kirill Smelkov authored
Until now libgolang had semaphore and mutex functionality, but implemented only internally and not exposed to users. Since for its pinner thread wendelin.core v2 needs not only nogil channels, but also nogil mutexes, timers and so on, it is now time to start providing that. We start with mutexes: - Expose Mutex from insides of libgolang.cpp to public API in libgolang.h . We actually expose both Mutex and Sema because semaphores are sometimes also useful for low-level synchronization, and because it is easier to export Mutex and Sema both at the same time. - Provide pyx/nogil API to those as sync.Mutex and sync.Sema. - Provide corresponding python wrappers. - Add Pyx-level test for semaphore wakeup when wakeup is done not by the same thread which did the original acquire. This is the test that was promised in 5142460d (libgolang/thread: Add links to upstream PyThread_release_lock bug), and it used to corrupt memory and deadlock on macOS due to CPython & PyPy runtime bugs: https://bugs.python.org/issue38106 -> https://github.com/python/cpython/pull/16047 https://bitbucket.org/pypy/pypy/issues/3072 Note about python wrappers: At present, due to https://github.com/cython/cython/issues/3165, C-level panic is not properly translated into Py-level exception in (Py)Sema/Mutex constructors. This should not pose a real problem in practice though.
-
- 03 Oct, 2019 4 commits
-
-
Kirill Smelkov authored
- Fix references to time::sleep and _tasknanosleep in library overview. - add golang::time:: comment before openeing corresponding namespace.
-
Kirill Smelkov authored
We will soon need this functionality for sync_test.py to import tests from _sync_test.pyx . In the future many modules will need to import X_test.pyx from X_test.py as well.
-
Kirill Smelkov authored
- it is in Go style to panic on memory allocation failure. - _makechan can already panic, instead of returning NULL, on e.g. ch._mu initialization failure. - we were already not checking in several places for NULL after _makechan() call (e.g. in golang/runtime/libgolang_test_c.c). This switches _makechan to panic on memory allocation failure and settles on style of future C-level API calls to panic instead of returning NULL on failure.
-
Kirill Smelkov authored
sys is imported in the beginning of main, so there is no need to import it the second time in the middle of main. The bug was there since gpython day1 - since 32a21d5b (gpython: Python interpreter with support for lightweight threads).
-
- 17 Sep, 2019 5 commits
-
-
Kirill Smelkov authored
This release is bugfix-only. Compared to pygolang v0.0.3 (4ca65816) the change in speed is likely within noise: (on i7@2.6GHz) thread runtime: name old time/op new time/op delta go 18.3µs ± 1% 18.3µs ± 1% ~ (p=0.218 n=10+10) chan 2.97µs ± 5% 2.97µs ± 8% ~ (p=0.781 n=10+10) select 3.59µs ± 2% 3.55µs ± 5% ~ (p=0.447 n=9+10) def 56.0ns ± 0% 55.0ns ± 0% -1.79% (p=0.000 n=10+10) func_def 43.7µs ± 1% 43.8µs ± 1% +0.35% (p=0.029 n=10+10) call 65.0ns ± 0% 62.3ns ± 1% -4.15% (p=0.000 n=10+10) func_call 1.06µs ± 1% 1.04µs ± 0% -1.26% (p=0.000 n=10+8) try_finally 137ns ± 1% 137ns ± 0% ~ (p=1.000 n=10+10) defer 2.32µs ± 0% 2.33µs ± 1% +0.43% (p=0.000 n=9+10) workgroup_empty 37.6µs ± 1% 37.1µs ± 2% -1.29% (p=0.003 n=10+10) workgroup_raise 47.9µs ± 1% 47.6µs ± 0% -0.63% (p=0.001 n=9+9) gevent runtime: name old time/op new time/op delta go 15.8µs ± 0% 16.1µs ± 1% +2.18% (p=0.000 n=9+10) chan 7.36µs ± 0% 7.21µs ± 0% -1.97% (p=0.000 n=8+10) select 10.4µs ± 0% 10.5µs ± 0% +0.71% (p=0.000 n=10+10) def 57.0ns ± 0% 55.0ns ± 0% -3.51% (p=0.000 n=10+10) func_def 43.3µs ± 1% 44.1µs ± 2% +1.81% (p=0.000 n=10+10) call 66.0ns ± 0% 65.0ns ± 0% -1.52% (p=0.000 n=10+10) func_call 1.04µs ± 1% 1.06µs ± 1% +1.48% (p=0.000 n=10+10) try_finally 137ns ± 1% 136ns ± 0% -1.31% (p=0.000 n=10+10) defer 2.32µs ± 0% 2.31µs ± 1% ~ (p=0.472 n=8+10) workgroup_empty 56.0µs ± 0% 55.7µs ± 0% -0.49% (p=0.000 n=10+10) workgroup_raise 71.3µs ± 1% 71.7µs ± 1% +0.62% (p=0.001 n=10+10)
-
Kirill Smelkov authored
Like we already do for e.g. _chan, to increase probability that if something is used after free we get a NULL-dereference crash instead of more hard-to-understand segfault.
-
Kirill Smelkov authored
Similarly to situation described in dcf4ebd1 (libgolang: Fix chan.close to dequeue subscribers atomically), select can be also accessing a channel object at the time of wakeup when that channel could be already destroyed: select queues waiters to channels recv/send queues and upon wakeup needs to dequeue them. This requires locking channels, not to mention that a channel destroy with non-empty subscribers queue will trigger bug panic. Contrary to the fix for recv, we cannot rework select not to access channel objects after wakeup, because for select upon wakeup all queued channels could be already destroyed, not only selected one. Thus the fix here is to incref/decref the channels for the duration where we need to access them. The bug was not caught by existing tests and was noted while doing libgolang.cpp review for concurrency issues. With added test (hereby fix is served by a bit amended _test_close_wakeup_all) the bug, if not fixed, renders itself as e.g. the following under TSAN: WARNING: ThreadSanitizer: data race (pid=4421) Write of size 8 at 0x7b1400000650 by thread T9: #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:649 (libtsan.so.0+0x2b46a) #1 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:643 (libtsan.so.0+0x2b46a) #2 golang::_chan::decref() golang/runtime/libgolang.cpp:479 (liblibgolang.so.0.1+0x4822) #3 _chanxdecref golang/runtime/libgolang.cpp:461 (liblibgolang.so.0.1+0x487a) #4 golang::chan<int>::operator=(decltype(nullptr)) golang/libgolang.h:296 (_golang_test.so+0x14cf9) #5 operator() golang/runtime/libgolang_test.cpp:304 (_golang_test.so+0x14cf9) #6 __invoke_impl<void, __test_close_wakeup_all(bool)::<lambda()>&> /usr/include/c++/8/bits/invoke.h:60 (_golang_test.so+0x14cf9) #7 __invoke<__test_close_wakeup_all(bool)::<lambda()>&> /usr/include/c++/8/bits/invoke.h:95 (_golang_test.so+0x14cf9) #8 __call<void> /usr/include/c++/8/functional:400 (_golang_test.so+0x14cf9) #9 operator()<> /usr/include/c++/8/functional:484 (_golang_test.so+0x14cf9) #10 _M_invoke /usr/include/c++/8/bits/std_function.h:297 (_golang_test.so+0x14cf9) #11 std::function<void ()>::operator()() const /usr/include/c++/8/bits/std_function.h:687 (_golang_test.so+0x1850c) #12 operator() golang/libgolang.h:273 (_golang_test.so+0x1843a) #13 _FUN golang/libgolang.h:271 (_golang_test.so+0x1843a) #14 <null> <null> (python2.7+0x1929e3) Previous read of size 8 at 0x7b1400000650 by thread T10: #0 golang::Sema::acquire() golang/runtime/libgolang.cpp:168 (liblibgolang.so.0.1+0x413a) #1 golang::Mutex::lock() golang/runtime/libgolang.cpp:179 (liblibgolang.so.0.1+0x424a) #2 operator() golang/runtime/libgolang.cpp:1044 (liblibgolang.so.0.1+0x424a) #3 _M_invoke /usr/include/c++/8/bits/std_function.h:297 (liblibgolang.so.0.1+0x424a) #4 std::function<void ()>::operator()() const /usr/include/c++/8/bits/std_function.h:687 (liblibgolang.so.0.1+0x5f07) #5 golang::_deferred::~_deferred() golang/runtime/libgolang.cpp:215 (liblibgolang.so.0.1+0x5f07) #6 __chanselect2 golang/runtime/libgolang.cpp:1044 (liblibgolang.so.0.1+0x5f07) #7 _chanselect2<true> golang/runtime/libgolang.cpp:968 (liblibgolang.so.0.1+0x6665) #8 _chanselect golang/runtime/libgolang.cpp:963 (liblibgolang.so.0.1+0x6665) #9 select golang/libgolang.h:386 (_golang_test.so+0x14fc1) #10 operator() golang/runtime/libgolang_test.cpp:320 (_golang_test.so+0x14fc1) #11 __invoke_impl<void, __test_close_wakeup_all(bool)::<lambda()>&> /usr/include/c++/8/bits/invoke.h:60 (_golang_test.so+0x14fc1) #12 __invoke<__test_close_wakeup_all(bool)::<lambda()>&> /usr/include/c++/8/bits/invoke.h:95 (_golang_test.so+0x14fc1) #13 __call<void> /usr/include/c++/8/functional:400 (_golang_test.so+0x14fc1) #14 operator()<> /usr/include/c++/8/functional:484 (_golang_test.so+0x14fc1) #15 _M_invoke /usr/include/c++/8/bits/std_function.h:297 (_golang_test.so+0x14fc1) #16 std::function<void ()>::operator()() const /usr/include/c++/8/bits/std_function.h:687 (_golang_test.so+0x1850c) #17 operator() golang/libgolang.h:273 (_golang_test.so+0x183da) #18 _FUN golang/libgolang.h:271 (_golang_test.so+0x183da) #19 <null> <null> (python2.7+0x1929e3) Thread T9 (tid=4661, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:915 (libtsan.so.0+0x2be1b) #1 PyThread_start_new_thread <null> (python2.7+0x19299f) #2 _taskgo golang/runtime/libgolang.cpp:123 (liblibgolang.so.0.1+0x3f98) #3 go<__test_close_wakeup_all(bool)::<lambda()> > golang/libgolang.h:271 (_golang_test.so+0x16c94) #4 __test_close_wakeup_all(bool) golang/runtime/libgolang_test.cpp:298 (_golang_test.so+0x16c94) #5 _test_close_wakeup_all_vsselect() golang/runtime/libgolang_test.cpp:342 (_golang_test.so+0x16f64) #6 __pyx_pf_6golang_12_golang_test_24test_close_wakeup_all_vsselect golang/_golang_test.cpp:4013 (_golang_test.so+0xd92a) #7 __pyx_pw_6golang_12_golang_test_25test_close_wakeup_all_vsselect golang/_golang_test.cpp:3978 (_golang_test.so+0xd92a) #8 PyEval_EvalFrameEx <null> (python2.7+0xf68b4) Thread T10 (tid=4662, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:915 (libtsan.so.0+0x2be1b) #1 PyThread_start_new_thread <null> (python2.7+0x19299f) #2 _taskgo golang/runtime/libgolang.cpp:123 (liblibgolang.so.0.1+0x3f98) #3 go<__test_close_wakeup_all(bool)::<lambda()> > golang/libgolang.h:271 (_golang_test.so+0x16d96) #4 __test_close_wakeup_all(bool) golang/runtime/libgolang_test.cpp:315 (_golang_test.so+0x16d96) #5 _test_close_wakeup_all_vsselect() golang/runtime/libgolang_test.cpp:342 (_golang_test.so+0x16f64) #6 __pyx_pf_6golang_12_golang_test_24test_close_wakeup_all_vsselect golang/_golang_test.cpp:4013 (_golang_test.so+0xd92a) #7 __pyx_pw_6golang_12_golang_test_25test_close_wakeup_all_vsselect golang/_golang_test.cpp:3978 (_golang_test.so+0xd92a) #8 PyEval_EvalFrameEx <null> (python2.7+0xf68b4) and reliably crashes under regular builds.
-
Kirill Smelkov authored
During the second phase of select - after all cases were polled and noone was found to be ready - cases are queued to corresponding channels recv and send queues. While that queueing is in progress, a case, that was already queued, could win (see f0b592b4 "select: Don't let both a queued and a tried cases win at the same time" for details). If such won case is detected, we break out of queuing loop, but currently don't wait for that case to become ready. This is a bug, because when a case is marked as won, its data is not yet copied - for example for won recv case if we don't wait for that case to become ready, we will be returning from select while corresponding *prx and *rxok for recv waiter is still being copied in progress. An example TSAN reports for this bug are as follows: (1) WARNING: ThreadSanitizer: data race (pid=8223) Read of size 1 at 0x7b1800000a48 by main thread: #0 __chanselect2 golang/runtime/libgolang.cpp:1112 (liblibgolang.so.0.1+0x5fd6) #1 _chanselect2<true> golang/runtime/libgolang.cpp:949 (liblibgolang.so.0.1+0x6665) #2 _chanselect golang/runtime/libgolang.cpp:944 (liblibgolang.so.0.1+0x6665) #3 __pyx_f_6golang_7_golang__chanselect_pyexc golang/_golang.cpp:5896 (_golang.so+0x1deac) #4 __pyx_pf_6golang_7_golang_4pyselect golang/_golang.cpp:4935 (_golang.so+0x1deac) #5 __pyx_pw_6golang_7_golang_5pyselect golang/_golang.cpp:4355 (_golang.so+0x1deac) #6 PyEval_EvalFrameEx <null> (python2.7+0xf0e49) Previous write of size 1 at 0x7b1800000a48 by thread T57: #0 golang::_RecvSendWaiting::wakeup(bool) golang/runtime/libgolang.cpp:346 (liblibgolang.so.0.1+0x459d) #1 golang::_chan::_tryrecv(void*, bool*) golang/runtime/libgolang.cpp:730 (liblibgolang.so.0.1+0x511d) #2 __chanselect2 golang/runtime/libgolang.cpp:1074 (liblibgolang.so.0.1+0x5d4b) #3 _chanselect2<true> golang/runtime/libgolang.cpp:949 (liblibgolang.so.0.1+0x6665) #4 _chanselect golang/runtime/libgolang.cpp:944 (liblibgolang.so.0.1+0x6665) #5 __pyx_f_6golang_7_golang__chanselect_pyexc golang/_golang.cpp:5896 (_golang.so+0x1deac) #6 __pyx_pf_6golang_7_golang_4pyselect golang/_golang.cpp:4935 (_golang.so+0x1deac) #7 __pyx_pw_6golang_7_golang_5pyselect golang/_golang.cpp:4355 (_golang.so+0x1deac) #8 PyEval_EvalFrameEx <null> (python2.7+0xf0e49) #9 <null> <null> (python2.7+0x1929e3) Location is heap block of size 96 at 0x7b1800000a20 allocated by main thread: #0 calloc ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:623 (libtsan.so.0+0x2b323) #1 calloc ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:618 (libtsan.so.0+0x2b323) #2 __chanselect2 golang/runtime/libgolang.cpp:1018 (liblibgolang.so.0.1+0x5b8c) #3 _chanselect2<true> golang/runtime/libgolang.cpp:949 (liblibgolang.so.0.1+0x6665) #4 _chanselect golang/runtime/libgolang.cpp:944 (liblibgolang.so.0.1+0x6665) #5 __pyx_f_6golang_7_golang__chanselect_pyexc golang/_golang.cpp:5896 (_golang.so+0x1deac) #6 __pyx_pf_6golang_7_golang_4pyselect golang/_golang.cpp:4935 (_golang.so+0x1deac) #7 __pyx_pw_6golang_7_golang_5pyselect golang/_golang.cpp:4355 (_golang.so+0x1deac) #8 PyEval_EvalFrameEx <null> (python2.7+0xf0e49) Thread T57 (tid=13758, finished) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:915 (libtsan.so.0+0x2be1b) #1 PyThread_start_new_thread <null> (python2.7+0x19299f) #2 _taskgo golang/runtime/libgolang.cpp:123 (liblibgolang.so.0.1+0x3f98) #3 __pyx_f_6golang_7_golang__taskgo_pyexc golang/_golang.cpp:5926 (_golang.so+0x16f7e) #4 __pyx_pf_6golang_7_golang_2pygo golang/_golang.cpp:2399 (_golang.so+0x16f7e) #5 __pyx_pw_6golang_7_golang_3pygo golang/_golang.cpp:2324 (_golang.so+0x16f7e) #6 PyEval_EvalFrameEx <null> (python2.7+0xf0e49) (2) WARNING: ThreadSanitizer: data race (pid=14185) Write of size 8 at 0x7b1800003000 by thread T95: #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:649 (libtsan.so.0+0x2b46a) #1 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:643 (libtsan.so.0+0x2b46a) #2 operator() golang/runtime/libgolang.cpp:1023 (liblibgolang.so.0.1+0x44bd) #3 _M_invoke /usr/include/c++/8/bits/std_function.h:297 (liblibgolang.so.0.1+0x44bd) #4 std::function<void ()>::operator()() const /usr/include/c++/8/bits/std_function.h:687 (liblibgolang.so.0.1+0x5fd8) #5 golang::_deferred::~_deferred() golang/runtime/libgolang.cpp:215 (liblibgolang.so.0.1+0x5fd8) #6 __chanselect2 golang/runtime/libgolang.cpp:1023 (liblibgolang.so.0.1+0x5fd8) #7 _chanselect2<true> golang/runtime/libgolang.cpp:949 (liblibgolang.so.0.1+0x6736) #8 _chanselect golang/runtime/libgolang.cpp:944 (liblibgolang.so.0.1+0x6736) #9 __pyx_f_6golang_7_golang__chanselect_pyexc golang/_golang.cpp:5896 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1e562) #10 __pyx_pf_6golang_7_golang_4pyselect golang/_golang.cpp:4935 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1e562) #11 __pyx_pw_6golang_7_golang_5pyselect golang/_golang.cpp:4355 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1e562) #12 _PyCFunction_FastCallDict Objects/methodobject.c:231 (python3.6+0xd4db9) #13 pythread_wrapper Python/thread_pthread.h:205 (python3.6+0x6c5d6) Previous read of size 8 at 0x7b1800003000 by main thread: #0 golang::_RecvSendWaiting::wakeup(bool) golang/runtime/libgolang.cpp:347 (liblibgolang.so.0.1+0x4769) #1 golang::_chan::_trysend(void const*) golang/runtime/libgolang.cpp:661 (liblibgolang.so.0.1+0x5781) #2 _chanselect golang/runtime/libgolang.cpp:901 (liblibgolang.so.0.1+0x64d9) #3 __pyx_f_6golang_7_golang__chanselect_pyexc golang/_golang.cpp:5896 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1e562) #4 __pyx_pf_6golang_7_golang_4pyselect golang/_golang.cpp:4935 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1e562) #5 __pyx_pw_6golang_7_golang_5pyselect golang/_golang.cpp:4355 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1e562) #6 _PyCFunction_FastCallDict Objects/methodobject.c:231 (python3.6+0xd4db9) Thread T95 (tid=16547, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:915 (libtsan.so.0+0x2be1b) #1 PyThread_start_new_thread Python/thread_pthread.h:252 (python3.6+0x6c67e) #2 _taskgo golang/runtime/libgolang.cpp:123 (liblibgolang.so.0.1+0x4158) #3 __pyx_f_6golang_7_golang__taskgo_pyexc golang/_golang.cpp:5926 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1a9b5) #4 __pyx_pf_6golang_7_golang_2pygo golang/_golang.cpp:2399 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1a9b5) #5 __pyx_pw_6golang_7_golang_3pygo golang/_golang.cpp:2324 (_golang.cpython-36m-x86_64-linux-gnu.so+0x1a9b5) #6 _PyCFunction_FastCallDict Objects/methodobject.c:231 (python3.6+0xd4db9) -> Fix it by always waiting for WaitGroup's won case to become ready. The bug was introduced in 3b241983 (Port/move channels to C/C++/Pyx). Before that - when channels were implemented at Python level, we were always waiting on select's group. Added test catches the bug on all - even not under TSAN - builds.
-
Kirill Smelkov authored
Currently chan.close iterates all send/recv subscribers unlocking/relocking the channel for each and notifying dequeued subscriber with channel unlocked. This leads to that even if channel had only one subscriber, chan.close accesses chan._mu again - after notifying that subscriber. That in turn means that an idiom where e.g. a done channel is passed to worker, which worker closes at the end, and main task waiting on the done and destroying done right after wakeup cannot work - because close, internally, accesses already destroyed channel as the following TSAN report shows for _test_go_c: WARNING: ThreadSanitizer: data race (pid=7143) Write of size 8 at 0x7b1400000650 by main thread: #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:649 (libtsan.so.0+0x2b46a) #1 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:643 (libtsan.so.0+0x2b46a) #2 golang::_chan::decref() golang/runtime/libgolang.cpp:470 (liblibgolang.so.0.1+0x47f2) #3 _chanxdecref golang/runtime/libgolang.cpp:452 (liblibgolang.so.0.1+0x484a) #4 _test_go_c golang/runtime/libgolang_test_c.c:86 (_golang_test.so+0x13a2e) #5 __pyx_pf_6golang_12_golang_test_12test_go_c golang/_golang_test.cpp:3340 (_golang_test.so+0xcbaa) #6 __pyx_pw_6golang_12_golang_test_13test_go_c golang/_golang_test.cpp:3305 (_golang_test.so+0xcbaa) #7 PyEval_EvalFrameEx <null> (python2.7+0xf68b4) Previous read of size 8 at 0x7b1400000650 by thread T8: #0 golang::Sema::acquire() golang/runtime/libgolang.cpp:164 (liblibgolang.so.0.1+0x410a) #1 golang::Mutex::lock() golang/runtime/libgolang.cpp:175 (liblibgolang.so.0.1+0x4c82) #2 golang::_chan::close() golang/runtime/libgolang.cpp:754 (liblibgolang.so.0.1+0x4c82) #3 _chanclose golang/runtime/libgolang.cpp:732 (liblibgolang.so.0.1+0x4d1a) #4 _work golang/runtime/libgolang_test_c.c:92 (_golang_test.so+0x136cc) #5 <null> <null> (python2.7+0x1929e3) Thread T8 (tid=7311, finished) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:915 (libtsan.so.0+0x2be1b) #1 PyThread_start_new_thread <null> (python2.7+0x19299f) #2 _taskgo golang/runtime/libgolang.cpp:119 (liblibgolang.so.0.1+0x3f68) #3 _test_go_c golang/runtime/libgolang_test_c.c:84 (_golang_test.so+0x13a1c) #4 __pyx_pf_6golang_12_golang_test_12test_go_c golang/_golang_test.cpp:3340 (_golang_test.so+0xcbaa) #5 __pyx_pw_6golang_12_golang_test_13test_go_c golang/_golang_test.cpp:3305 (_golang_test.so+0xcbaa) #6 PyEval_EvalFrameEx <null> (python2.7+0xf68b4) -> Fix close to dequeue all channel's subscribers atomically, and notify them all after channel is unlocked and _no_ longer accessed. Close was already working this way when channels were done at Python level, but in 3b241983 (Port/move channels to C/C++/Pyx) I introduced this bug while trying to avoid additional memory allocation in close. Added test catches the bug on all - even not under TSAN - builds. ---- Added test also reveals another bug: recv<onstack=false> uses channel after wakeup, and, as at the time of wakeup the channel could be already destroyed, that segfaults. Fix it by pre-reading in recv everything needed from _chan object before going to sleep. This fix cannot go separately from close fix, as fixed close is required for recv-uses-chan-after-wakeup testcase.
-