Commits · f263d971e9237b5214f211e96815306dc3fdab2b · Kirill Smelkov / wendelin.core

20 Nov, 2020 4 commits
- . ci · f263d971
  Kirill Smelkov authored Nov 20, 2020
  
  f263d971
- . ci · 1fb412f9
  Kirill Smelkov authored Nov 20, 2020
  
  1fb412f9
- X onResyncCallback for ZODB4 · 533a4cfa
  Kirill Smelkov authored Nov 20, 2020
  
  533a4cfa
- X zconn_at for ZODB4 · 1c3b7750
  Kirill Smelkov authored Nov 20, 2020
  
  1c3b7750
18 Nov, 2020 1 commit
- . ci · cd5041bb
  Kirill Smelkov authored Nov 18, 2020
  
  cd5041bb
17 Nov, 2020 1 commit
- . · 8aedb049
  Kirill Smelkov authored Nov 17, 2020
  
  8aedb049
09 Nov, 2020 4 commits
- . · 266ae7c0
  Kirill Smelkov authored Nov 09, 2020
  
  266ae7c0
- . · a7c14c25
  Kirill Smelkov authored Nov 09, 2020
  
  a7c14c25
- . · 35191c8a
  Kirill Smelkov authored Nov 09, 2020
  
  35191c8a
- . · 651eaf72
  Kirill Smelkov authored Nov 09, 2020
  
  651eaf72
08 Nov, 2020 2 commits

. · 29cced29
Kirill Smelkov authored Nov 08, 2020

29cced29

XY bigfile/py: Garbage-collect BigFile <=> BigFileH cycles · 1be4d730

Kirill Smelkov authored Nov 08, 2020

Since ZBigFile keeps references to the fileh objects that are created
through it, it forms a file <=> fileh cycle that is not collected without
cyclic GC:

https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L497
https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L566-571

We did not notice this leak until now because it is small, but with the
upcoming wendelin.core 2 it is important to release a fileh, because
there is a WCFS connection associated with the fileh, and if the fileh is
not released, that connection also stays alive, keeping on-WCFS resources
in use and preventing WCFS from being unmounted cleanly.

-> Add cyclic GC support to PyBigFile / PyBigFileH

NOTE: we still don't allow PyVMA <=> PyBigFileH cycles to be collected,
because fileh_close called from fileh.__del__ asserts that there are no
live mappings left. See added comments for details. There is no
known practical need to use such cycles, so this should be ok.

See also other patches on cyclic GC topic:

- 450ad804 (bigarray: ArrayRef support for BigArray)  // adds cyclic GC support for PyVMA
- d97641d2 (bigfile/py: Properly untrack PyVMA from GC before dealloc)
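
The file <=> fileh cycle described in this message can be reproduced in plain Python. The sketch below is only an illustration: `File` and `FileH` are hypothetical stand-ins, not the real ZBigFile / PyBigFileH types, and the behaviour shown relies on CPython reference counting. It shows that such a cycle survives `del` and goes away only once the cyclic collector runs.

```python
import gc
import weakref


class FileH(object):
    """Hypothetical stand-in for a file handle; keeps a reference to its file."""
    def __init__(self, file):
        self.file = file


class File(object):
    """Hypothetical stand-in for a file that remembers the handles opened through it."""
    def __init__(self):
        self.filehs = []

    def fileh_open(self):
        fileh = FileH(self)
        self.filehs.append(fileh)   # file -> fileh, and fileh -> file: a cycle
        return fileh


def demo_cycle():
    was_enabled = gc.isenabled()
    gc.disable()                    # only reference counting is active now
    try:
        f = File()
        fileh = f.fileh_open()
        ref = weakref.ref(fileh)
        del f, fileh                # refcounts never reach zero because of the cycle
        alive_without_gc = ref() is not None
        gc.collect()                # an explicit cyclic collection reclaims the cycle
        alive_after_gc = ref() is not None
        return alive_without_gc, alive_after_gc
    finally:
        if was_enabled:
            gc.enable()
```

Calling `demo_cycle()` reports whether the handle was still alive before and after the explicit collection. An extension type needs to take part in cyclic GC for the second step to be able to reclaim it.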

1be4d730

05 Nov, 2020 4 commits

X wcfs: v↑ NEO/go (error context fixups) · 21913d02
Kirill Smelkov authored Nov 05, 2020

21913d02

X bigfile/_file_zodb: Fix ZSync to close not only wconn, but also wconn.wc... · a9a82d5a

Kirill Smelkov authored Nov 05, 2020

X bigfile/_file_zodb: Fix ZSync to close not only wconn, but also wconn.wc through which wconn was created

pywconnOf, before creating wconn, performs wc=wcfs.join(zurl), which
creates a new filesystem-level connection to the WCFS server. This wc is
used only to create wconn. So if we do not close wc after releasing wconn,
it will leak an opened file descriptor, e.g. to .wcfs/zurl, and prevent
tests from finishing cleanly.

a9a82d5a

. · 4e23152e
Kirill Smelkov authored Nov 05, 2020

4e23152e

X wcfs: Don't noise ZWatcher trace logs with "select ..." · 0e60e9ff

Kirill Smelkov authored Nov 05, 2020

It's just a debugging print: helpful for debugging zwatcher, but not helpful
for understanding which events the system was observing.

0e60e9ff

04 Nov, 2020 3 commits
- X wcfs: v↑ NEO/go (fixup for NEO/py v1.12 protocol + "object not found" from loadBefore) · 3f84a1e2
  Kirill Smelkov authored Nov 04, 2020
  
  3f84a1e2
- X wcfs: tests: Print which files are still opened on wcfs if `fusermount -u` fails · 112720f3
  Kirill Smelkov authored Nov 04, 2020
```
Helps to understand why wcfs cannot be unmounted.
```
  112720f3
- fixup! X bigfile/_file_zodb: Fix logic around ZSync usage · 571cb737
  Kirill Smelkov authored Nov 04, 2020
```
Don't use regular mutex to protect _zsyncReg updates as this can
deadlock because one of _zsyncReg mutators (on_zconn_dealloc) is invoked
by automatic GC that can be triggered any time.
```
  571cb737
03 Nov, 2020 13 commits

. · 066d7203
Kirill Smelkov authored Nov 03, 2020

066d7203

X bigfile/_file_zodb: Fix logic around ZSync usage · 8bf8f23b

Kirill Smelkov authored Nov 03, 2020

The logic inside ZSync was correct, but it was incorrect to attach zsync
to zconn to stay alive and react when that zconn is garbage collected:
zsync._on_zconn_dealloc was not called because zsync itself was garbage
collected too.

This fixes many failures where wconn and the associated pinner were not
released even though the ZODB DB was correctly closed.
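
The effect described here can be illustrated with plain `weakref` callbacks. The sketch below is not the real ZSync code: `ZConn`, `ZSync` and `run` are hypothetical stand-ins, and it assumes that the connection object is part of a reference cycle, so that it is reclaimed by the CPython cyclic collector. If the object holding the weak reference is garbage together with its referent, the callback is not invoked; if it is kept alive elsewhere, it is.

```python
import gc
import weakref


class ZConn(object):
    """Hypothetical connection that is part of a reference cycle."""
    def __init__(self):
        self.me = self          # self-cycle: only the cyclic GC can reclaim it


class ZSync(object):
    """Hypothetical watcher that wants to react when zconn goes away."""
    def __init__(self, zconn, log):
        self.log = log
        self._ref = weakref.ref(zconn, self._on_zconn_dealloc)

    def _on_zconn_dealloc(self, _):
        self.log.append("dealloc")


def run(attach_to_zconn):
    log = []
    registry = []               # stands in for a registry that outlives zconn
    zconn = ZConn()
    zsync = ZSync(zconn, log)
    if attach_to_zconn:
        zconn._zsync = zsync    # zsync lives and dies together with zconn
    else:
        registry.append(zsync)  # zsync is kept alive independently of zconn
    del zsync, zconn
    gc.collect()
    return log
```

`run(True)` models attaching zsync to zconn, and `run(False)` models keeping it in a separate registry. Only the second variant gets the deallocation notification.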

8bf8f23b

X wcfs: tests: Run `fusermount -u` the second time if we had to kill wcfs · 3a6bd764

Kirill Smelkov authored Nov 03, 2020

This makes sure to clean up the stale / broken FUSE connection from
/proc/mounts, and keeps the uninformative `assert not is_mountpoint` from
being raised and adding more noise to the already very verbose wcfs-kill-dump.

3a6bd764

bigfile/_file_zodb: Test for ZSync · 75857c32
Kirill Smelkov authored Nov 03, 2020
```
Exercise the logic that keeps wconn <-> zconn in sync.
```
75857c32

X wcfs: client: Provide Conn.at() · 24378c46

Kirill Smelkov authored Nov 03, 2020

To know which DB state a WCFS connection corresponds to. This is similar
to zodb.Connection.At() in ZODB/go and to zconn_at in ZODB/py.

wconn.at() will be used in the next patch to verify ZSync.

24378c46

. · 2be847c4
Kirill Smelkov authored Nov 03, 2020

2be847c4
X tests: Don't try to stop wcfs that is already exited · f118617b
Kirill Smelkov authored Nov 03, 2020

f118617b

X setup: Add build dependency information · 84404f8f

Kirill Smelkov authored Nov 03, 2020

Manually, because there is no automatic dependency tracking in
setuptools...

Dependency tracking is needed to avoid miscompilation after incremental
update under SlapOS/buildout/testnode/... when e.g. only .h was changed.
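
The kind of check that explicit dependency information makes possible can be sketched with the standard library alone. This is a generic illustration, not the code added to setup.py: `needs_rebuild` and the file names are hypothetical. For reference, `setuptools.Extension` accepts a `depends` list for this purpose; whether this commit uses it is not stated here.

```python
import os
import tempfile


def needs_rebuild(target, deps):
    """Return True if target is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    target_mtime = os.stat(target).st_mtime
    return any(os.stat(dep).st_mtime > target_mtime for dep in deps)


def demo_header_change():
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "mod.cpp")   # hypothetical source file
    hdr = os.path.join(workdir, "mod.h")     # hypothetical header it includes
    out = os.path.join(workdir, "mod.so")    # hypothetical build product
    for path in (src, hdr, out):
        open(path, "w").close()

    os.utime(src, (100, 100))
    os.utime(hdr, (100, 100))
    os.utime(out, (200, 200))                # built after both inputs
    fresh = needs_rebuild(out, [src, hdr])

    os.utime(hdr, (300, 300))                # only the header changes afterwards
    stale = needs_rebuild(out, [src, hdr])
    return fresh, stale
```

If the header were not listed among the dependencies, the second check would still report the build product as up to date, which is the miscompilation scenario described above.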

84404f8f

X tests: Stop wcfs spawned during tests · f622e751

Kirill Smelkov authored Nov 03, 2020

Tests inside wcfs/ care to do this, but e.g. test.py/fs-wcfs autospawns
wcfs servers during regular bigfile tests. If we don't stop spawned
wcfs, those processes will leak, and also they keep `nxdtest
test.py/*-wcfs` in "hung" state, because nxdtest is waiting for wcfs to
stop as wcfs stdout is connected to nxdtest input.

Currently this kills wcfs abruptly, because graceful pinner shutdown is
not yet implemented there.

f622e751

X wcfs: tests: Split tDB into -> tDB + tWCFS · 6dec74e7

Kirill Smelkov authored Nov 03, 2020

tWCFS is responsible for starting/mounting/unmounting/stopping wcfs
tDB uses tWCFS and provides commit/test service on top.

We'll use tWCFS in the next patch to unmount/stop WCFS processes that
are automatically spawned during test.py

6dec74e7

X wcfs: tests: Don't use testmntpt everywhere · bc9eb16f

Kirill Smelkov authored Nov 03, 2020

Once a WCFS instance is created, use wc.mountpoint to refer to where this
wcfs is mounted. It does not change anything right now, but in
follow-up patches we'll reuse the code from wcfs_test to work on any wc,
not necessarily mounted on testmntpt.

bc9eb16f

X bigfile/_file_zodb: Import wendelin.wcfs, not just wcfs · 2ba7cb52

Kirill Smelkov authored Nov 03, 2020

Else, when running tests in-tree, `import wcfs` and `import wendelin.wcfs`
give two different modules, and inspecting e.g. wendelin.wcfs at
teardown sees fresh module state (_wcregistry), because it was wcfs
that was actually used.

Also, plain `import wcfs` raises ImportError when run out of tree.
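
The "two different modules" effect is general Python import behaviour and can be reproduced without wendelin.core. In the sketch below `demo_pkg`, `demo_mod` and `_registry` are hypothetical names standing in for `wendelin`, `wcfs` and `_wcregistry`. The same file is imported once through its package and once as a top-level module, because both the package parent and the package directory are on `sys.path`.

```python
import importlib
import os
import sys
import tempfile


def import_twice():
    root = tempfile.mkdtemp()
    pkgdir = os.path.join(root, "demo_pkg")
    os.mkdir(pkgdir)
    open(os.path.join(pkgdir, "__init__.py"), "w").close()
    with open(os.path.join(pkgdir, "demo_mod.py"), "w") as f:
        f.write("_registry = {}\n")      # module-level state

    # Both the package parent and the package directory are importable roots,
    # similar to running tests from inside the source tree.
    sys.path[:0] = [root, pkgdir]
    try:
        importlib.invalidate_caches()
        qualified = importlib.import_module("demo_pkg.demo_mod")
        bare = importlib.import_module("demo_mod")
    finally:
        sys.path.remove(root)
        sys.path.remove(pkgdir)
    return qualified, bare


qualified, bare = import_twice()
bare._registry["conn"] = "opened"        # state recorded via the bare name ...
```

State written through the bare module is not visible through the package-qualified one, which matches the "fresh module state" observation above.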

2ba7cb52

fixup! X wcfs: client: Handle fork · 0ed6b8b6

Kirill Smelkov authored Nov 03, 2020

Starting from 3f83469c, Conn and WatchLink inherit from an
interface, which makes them use virtual functions. Without the
destructor also being virtual, this emits the following warnings:

    wcfs/client/wcfs.cpp: In member function ‘virtual void wcfs::_Conn::decref()’:
    wcfs/client/wcfs.cpp:1531:16: warning: deleting object of polymorphic class type ‘wcfs::_Conn’ which has non-virtual destructor might cause undefined behavior [-Wdelete-non-virtual-dtor]
             delete this;
                    ^~~~

    wcfs/client/wcfs_watchlink.cpp: In member function ‘virtual void wcfs::_WatchLink::decref()’:
    wcfs/client/wcfs_watchlink.cpp:514:16: warning: deleting object of polymorphic class type ‘wcfs::_WatchLink’ which has non-virtual destructor might cause undefined behavior [-Wdelete-non-virtual-dtor]
             delete this;
                    ^~~~

0ed6b8b6

02 Nov, 2020 1 commit

X wcfs: test: Fix thinko in getting /sys/fs/fuse/connection/<X> for wcfs · 78f36993

Kirill Smelkov authored Nov 02, 2020

FUSE puts X into st_dev's minor, which, for minors <= 255, is the same as st_dev.
However, when there are many connections and the minor goes above 255, the minor becomes != st_dev:

    In [2]: os.makedev(0, 254)
    Out[2]: 254

    In [3]: os.makedev(0, 255)
    Out[3]: 255

    In [5]: os.makedev(0, 256)
    Out[5]: 1048576

As a result we were constructing the wrong path, and if wcfs was failing we were
also failing to kill it, with something like:

    t = <wcfs.wcfs_test.tDB object at 0x7fef78043260>

        @func
        def __init__(t):
            t.root = testdb.dbopen()
            def _(): # close/unlock db if __init__ fails
                exc = sys.exc_info()[1]
                if exc is not None:
                    dbclose(t.root)
            defer(_)

            assert not os.path.exists(testmntpt)
            t.wc = wcfs.join(testzurl, autostart=True)
            assert os.path.exists(testmntpt)
            assert is_mountpoint(testmntpt)

            # force-unmount wcfs on timeout to unstuck current test and let it fail.
            # Force-unmount can be done reliably only by writing into
            # /sys/fs/fuse/connections/<X>/abort. For everything else there are
            # cases, when wcfs, even after receiving `kill -9`, will be stuck in kernel.
            # ( git.kernel.org/linus/a131de0a482a makes in-kernel FUSE client to
            #   still wait for request completion even after fatal signal )
    >       t._wcfuseabort   = open("/sys/fs/fuse/connections/%d/abort" % os.stat(testmntpt).st_dev, "w")
    E       IOError: [Errno 2] No such file or directory: '/sys/fs/fuse/connections/2097264/abort'

    wcfs/wcfs_test.py:236: IOError

In the above failure st_dev=2097264 corresponds to X=624:

    In [6]: os.minor(2097264)
    Out[6]: 624
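
Based on the observation above, the connection directory can be derived from the minor number rather than from st_dev itself. This is a minimal sketch of that idea, not necessarily the exact code of the fix; `fuse_conn_dir` and `fuse_abort_path` are hypothetical helper names, and the numeric encoding of st_dev is the Linux one shown in the session above.

```python
import os


def fuse_conn_dir(st_dev):
    """Return the /sys/fs/fuse/connections/<X> directory for a FUSE st_dev."""
    # X is the minor number of the device, not the raw st_dev value.
    return "/sys/fs/fuse/connections/%d" % os.minor(st_dev)


def fuse_abort_path(mntpt):
    """Return the path of the abort control file for the FUSE mount at mntpt."""
    return fuse_conn_dir(os.stat(mntpt).st_dev) + "/abort"
```

For small device numbers the result is the same as with the old formula, which is why the problem only showed up once the minor exceeded 255.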

78f36993

01 Nov, 2020 2 commits
- . · e4e5571a
  Kirill Smelkov authored Nov 01, 2020
  
  e4e5571a
- X wcfs: v↑ NEO/go (add support for NEO/py v1.12 protocol) · 69bb395a
  Kirill Smelkov authored Nov 01, 2020
  
  69bb395a
30 Oct, 2020 5 commits

. · 9040bc20
Kirill Smelkov authored Oct 30, 2020

9040bc20
. · ad860ba6
Kirill Smelkov authored Oct 30, 2020

ad860ba6

X wcfs: client: Handle fork · 3f83469c

Kirill Smelkov authored Oct 30, 2020

Without special care a forked child may interfere in parent-wcfs
exchange via Python GC -> PyFileH.__del__ -> FileH.close -> message to
WCFS sent from the child. This actually happens for real when running
test.py/neo-wcfs because NEO test cluster spawns master and storage
nodes with just fork without exec.

-> detach from wcfs in child right after fork and deactivate all
mappings in order not to provide stale data. See top-level comments
added to wcfs/client/wcfs.cpp for details.
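
The detach-in-child idea can be sketched at the Python level with `os.register_at_fork` (Python 3.7+). The real client is written in C++, and a later commit message mentions pthread_atfork; the `Conn` class below is a hypothetical stand-in, not the wcfs client API. After fork, the child marks its copy of the connection as detached, so that a later close does not send anything to the server on the parent's behalf.

```python
import os


class Conn(object):
    """Hypothetical client connection that must not talk to the server from a forked child."""

    def __init__(self):
        self.detached = False
        # Note: registering a bound method keeps this object alive for the
        # lifetime of the process; acceptable for a sketch.
        os.register_at_fork(after_in_child=self._after_fork_in_child)

    def _after_fork_in_child(self):
        self.detached = True    # the child only inherited a copy of the connection

    def close(self):
        if self.detached:
            return "skipped"    # nothing is sent to the server from the child
        return "sent"           # the parent would notify the server here


def close_in_child(conn):
    """Fork, call conn.close() in the child, and report its result to the parent."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                # child
        os.close(r)
        os.write(w, conn.close().encode())
        os._exit(0)
    os.close(w)                 # parent
    with os.fdopen(r) as f:
        result = f.read()
    os.waitpid(pid, 0)
    return result
```

The parent's connection stays fully functional, while the child's copy becomes inert right after fork.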

3f83469c

X wcfs: tests: Factor-out waiting for a general condition to become true into waitfor · c2c35851

Kirill Smelkov authored Oct 30, 2020

Currently in wcfs_test.py there is only waiting for a proc
(subprocess.Popen instance) to become ready. However in the next patch
we'll need to wait via polling for another condition.

-> Generalize the pollwait code into waitfor* variants, and make
procwait* use waitfor* internally.
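
A generalization along these lines might look as follows. The names follow the waitfor* / procwait* naming used above, but the signatures, the polling interval and the exception type are assumptions made for illustration, not the actual wcfs_test.py code.

```python
import time


def waitfor_(timeout, condf, dt=0.01):
    """Poll condf() until it returns true or timeout (in seconds) expires.

    Returns True if the condition became true, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condf():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(dt)


def waitfor(timeout, condf):
    """Like waitfor_, but raise on timeout."""
    if not waitfor_(timeout, condf):
        raise RuntimeError("timeout waiting for condition")


def procwait_(proc, timeout):
    """Wait for a subprocess.Popen instance to exit; a special case of waitfor_."""
    return waitfor_(timeout, lambda: proc.poll() is not None)


def procwait(proc, timeout):
    """Like procwait_, but raise on timeout."""
    waitfor(timeout, lambda: proc.poll() is not None)
```

With this split, waiting for a process becomes just one condition among others, and any other polled condition can reuse the same timeout handling.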

c2c35851

X wcfs: client: os: Factor syserr -> string into _sysErrString · 17f98edc

Kirill Smelkov authored Oct 30, 2020

Currently the code to convert `int err` or errno into a string is used
only in _pathError, but in the next patches we'll need it to also handle
errors from pthread_atfork. -> Factor it out into a separate function.

17f98edc