1. 17 Nov, 2020 2 commits
    • bigfile/py: Garbage-collect BigFile <=> BigFileH cycles · a6a8f5ba
      Kirill Smelkov authored
      Since ZBigFile keeps references to the fileh objects created through
      it, it forms a file <=> fileh cycle that is not collected without
      cyclic GC:
      
      https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L497
      https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L566-571
      
      We did not notice this leak until now because it is small, but with
      the upcoming wendelin.core 2 it is important to release a fileh: there
      is a WCFS connection associated with each fileh, and if the fileh is
      not released, that connection also stays alive, keeping on-WCFS
      resources in use and preventing WCFS from being unmounted cleanly.
      
      -> Add cyclic GC support to PyBigFile / PyBigFileH
      
      NOTE: we still don't allow PyVMA <=> PyBigFileH cycles to be collected,
      because fileh_close called from fileh.__del__ asserts that there are no
      live mappings left. See added comments for details. There is no
      known practical need to use such cycles, so this should be ok.
      
      See also other patches on cyclic GC topic:
      
      - 450ad804 (bigarray: ArrayRef support for BigArray)  // adds cyclic GC support for PyVMA
      - d97641d2 (bigfile/py: Properly untrack PyVMA from GC before dealloc)
      
      /proposed-for-review-on !12
      a6a8f5ba
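The file <=> fileh cycle described in a6a8f5ba can be illustrated with a minimal pure-Python sketch. `File` and `FileH` below are hypothetical stand-ins, not the real PyBigFile / PyBigFileH types. Pure-Python classes take part in cyclic GC automatically, whereas a C extension type has to opt in, which is what the commit adds for the real types.

```python
import gc
import weakref


class File:
    """Hypothetical stand-in for a BigFile-like object."""
    def __init__(self):
        self.handles = []            # file -> fileh references

    def open(self):
        fileh = FileH(self)
        self.handles.append(fileh)
        return fileh


class FileH:
    """Hypothetical stand-in for a handle that points back to its file."""
    def __init__(self, file):
        self.file = file             # fileh -> file reference closes the cycle


def cycle_survives_refcounting():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()                     # rule out an automatic collection in between
    try:
        f = File()
        probe = weakref.ref(f.open())
        del f                        # drop the last external reference
        alive_before = probe() is not None
        gc.collect()                 # the cyclic collector reclaims file <=> fileh
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()
```

Reference counting alone leaves the pair alive after `del f`; only the explicit collection releases it.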
    • bigfile/py: Move PyVMA's support for cyclic GC close to pyvma_dealloc · 7cc35422
      Kirill Smelkov authored
      The logic in pyvma_traverse and pyvma_clear needs to be synchronized
      with PyVMA deallocation. In the next patch we'll be amending this
      logic, and keeping all those functions together will help the reader.
      
      For reference: PyVMA support for cyclic GC was introduced in
      450ad804 (bigarray: ArrayRef support for BigArray). See also d97641d2
      (bigfile/py: Properly untrack PyVMA from GC before dealloc).
      
      /proposed-for-review-on !12
      7cc35422
  2. 03 Nov, 2020 2 commits
    • t/tfault-run: Require bash · a702d410
      Kirill Smelkov authored
      Otherwise when /bin/sh is dash it fails with
      
          t/tfault-run: 35: test: on_pagefault: unexpected operator
      a702d410
    • t/tfault-run: Clear state from previous run before starting · cf92dfca
      Kirill Smelkov authored
      Otherwise, if previous test.fault failed, tfault-run fails to start, e.g.
      
          >>> test.fault
          $ make test.fault # MAKEFLAGS=-j1
          x86_64-linux-gnu-gcc -pthread -g -Wall -D_GNU_SOURCE -std=gnu99 -fplan9-extensions -Wno-declaration-after-statement -Wno-error=declaration-after-statement  -Iinclude -I3rdparty/ccan -I3rdparty/include   bigfile/tests/tfault.c lib/bug.c lib/utils.c 3rdparty/ccan/ccan/tap/tap.c  -o bigfile/tests/tfault.t
          t/tfault-run bigfile/tests/tfault.t faultr on_pagefault
          mkdir: cannot create directory ‘t/tfault-run.faultr’: File exists
          Makefile:186: recipe for target 'faultr.tfault' failed
          make: *** [faultr.tfault] Error 1
          rm bigfile/tests/tfault.t
          error   test.fault      0.433s  # 1t 1e 0f 0s
      cf92dfca
  3. 02 Nov, 2020 1 commit
  4. 11 Sep, 2020 1 commit
  5. 17 May, 2020 2 commits
  6. 17 Apr, 2020 1 commit
  7. 15 Apr, 2020 10 commits
    • setup: Fix hooking of git_lsfiles in PEP517 mode · bd1fb19e
      Kirill Smelkov authored
      In PEP517 mode setup.py is sourced - not executed - and the build fails
      with ImportError like this:
      
          Preparing wheel metadata ... error
          ERROR: Command errored out with exit status 1:
           command: /home/kirr/src/wendelin/venv/z-dev/bin/python2 /home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp2F3aEs
               cwd: /home/kirr/src/wendelin/wendelin.core
          Complete output (53 lines):
          running dist_info
          creating /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info
          writing requirements to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/requires.txt
          writing /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/PKG-INFO
          writing top-level names to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/top_level.txt
          writing dependency_links to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/dependency_links.txt
          writing entry points to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/entry_points.txt
          writing manifest file '/tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/SOURCES.txt'
          package init file '__init__.py' not found (or not a regular file)
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py", line 257, in <module>
              main()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py", line 240, in main
              json_out['return_val'] = hook(**hook_input['kwargs'])
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py", line 110, in prepare_metadata_for_build_wheel
              return hook(metadata_directory, config_settings)
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/build_meta.py", line 155, in prepare_metadata_for_build_wheel
              self.run_setup()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/build_meta.py", line 234, in run_setup
              self).run_setup(setup_script=setup_script)
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/build_meta.py", line 141, in run_setup
              exec(compile(code, __file__, 'exec'), locals())
            File "setup.py", line 374, in <module>
              """.splitlines()]
            File "/home/kirr/src/tools/go/pygolang/golang/pyx/build.py", line 118, in setup
              setuptools_dso.setup(**kw)
            File "/home/kirr/src/tools/py/pypa/setuptools_dso/src/setuptools_dso/__init__.py", line 37, in setup
              _setup(**kws)
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/__init__.py", line 145, in setup
              return distutils.core.setup(**attrs)
            File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
              dist.run_commands()
            File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
              self.run_command(cmd)
            File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
              cmd_obj.run()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/dist_info.py", line 31, in run
              egg_info.run()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 296, in run
              self.find_sources()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 303, in find_sources
              mm.run()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 534, in run
              self.add_defaults()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 574, in add_defaults
              rcfiles = list(walk_revctrl())
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/sdist.py", line 20, in walk_revctrl
              for item in ep.load()(dirname):
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2434, in load
              return self.resolve()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2444, in resolve
              raise ImportError(str(exc))
          ImportError: 'module' object has no attribute 'git_lsfiles'
      
      See comments added to register_as_entrypoint for explanation of what
      happens.
      
      Wendelin.core will soon switch to PEP517 mode (by adding pyproject.toml)
      to build-require Cython, Pygolang and friends.
      bd1fb19e
    • lib/zodb: Add zstor_2zurl - way to convert a ZODB storage into URL to access it · 6637d216
      Kirill Smelkov authored
      Wendelin.core 2 will need to spawn a WCFS filesystem server that
      accesses the same ZODB database as the program that spawns it. The
      database is passed to WCFS in the form of a URL[1,2]. Even though
      zodburi provides a way to convert a URL into a ZODB storage instance,
      there is currently no way to do the reverse: to convert a ZODB storage
      instance into a URL to access it(*). So we have to build it on our own.
      
      Provide zstor_2zurl stub that currently works for FileStorage only.
      ZEO and NEO support is TODO.
      
      In the future we might want to move this functionality into
      zodbtools/py.
      
      [1] https://lab.nexedi.com/nexedi/zodbtools/blob/a2e4dd23/zodbtools/help.py#L27-53
      [2] https://lab.nexedi.com/kirr/neo/blob/3d909114/go/zodb/zodbtools/help.go#L25-51
      
      (*) contrary to ZODB/go where this functionality is provided out of the box:
          https://godoc.org/lab.nexedi.com/kirr/neo/go/zodb#IStorage
      6637d216
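A minimal sketch of the storage -> URL direction described in 6637d216, handling a single storage kind and refusing everything else. `FakeFileStorage` and `storage_to_url` are hypothetical names for illustration; this is not the actual zstor_2zurl code and does not use the real ZODB classes.

```python
class FakeFileStorage:
    """Hypothetical stand-in for a file-backed storage; not the ZODB class."""
    def __init__(self, path):
        self.path = path


def storage_to_url(stor):
    """Return a URL for `stor`, or raise for storage kinds not handled yet."""
    if isinstance(stor, FakeFileStorage):
        return "file://" + stor.path
    # Other storage kinds are left unimplemented, mirroring the TODO above.
    raise NotImplementedError("don't know how to build a URL for %r" % (stor,))
```

Raising for unknown storages, rather than guessing, keeps a wrong URL from being handed to the spawned server.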
    • lib/zodb: Add patch to ZODB.Connection to support callback on connection DB view change · 959ae2d0
      Kirill Smelkov authored
      Wendelin.core 2 will need to be notified whenever a client
      ZODB.Connection changes its database view, so that it can readjust the
      WCFS-level client connection accordingly.
      
      ZODB.Connection can change its view either on connection reopen or,
      even without reopen, at the start of a new transaction.
      
      This patch implements ZODB.Connection.onResyncCallback for ZODB5 only.
      
      ZODB4 and ZODB3 support is TODO.
      959ae2d0
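The callback idea from 959ae2d0 can be sketched with a toy connection class. `ToyConnection`, `on_resync`, `reopen` and `new_transaction` are hypothetical names; the sketch only shows that both ways of changing the view go through one notification point, and it says nothing about how the real patch hooks into ZODB.Connection.

```python
class ToyConnection:
    """Hypothetical connection whose database view can change."""
    def __init__(self):
        self.at = 0                      # database state currently viewed
        self._resync_callbacks = []

    def on_resync(self, callback):
        """Register callback(conn) to run after every view change."""
        self._resync_callbacks.append(callback)

    def _resync(self, at):
        self.at = at
        for callback in self._resync_callbacks:
            callback(self)

    def reopen(self, at):
        self._resync(at)                 # view changes on reopen ...

    def new_transaction(self, at):
        self._resync(at)                 # ... and on start of a new transaction


seen = []
conn = ToyConnection()
conn.on_resync(lambda c: seen.append(c.at))
conn.reopen(at=5)
conn.new_transaction(at=7)
```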
    • lib/zodb: Add zconn_at draft (ZODB5 only) · 3bd82127
      Kirill Smelkov authored
      For wendelin.core v2 we need a way to know which particular database
      state an application-level ZODB connection is viewing. Knowing that
      state, the WCFS client library will interact with the WCFS filesystem
      server and, in simple terms, request the server to provide data as of
      that particular database state.
      
      Contrary to ZODB/go[1], ZODB/py does not provide the functionality to
      obtain the DB state of a connection view, so we have to build it
      ourselves. Let us call zconn_at the function that, for a client ZODB
      connection, returns the database state corresponding to its view.
      
      It is relatively easy to implement zconn_at for ZODB5, since ZODB5
      adopted MVCC uniformly, and this patch does just that. However, all
      currently released ZODB5 versions have a race in Connection.open() vs
      invalidations[2], so the first ZODB5 release with which the zconn_at
      implemented here will work reliably should be the upcoming ZODB 5.5.2.
      
      It is TODO to implement zconn_at for ZODB4 and ZODB3, which organize
      things differently.
      
      Please note what would happen if zconn_at gave an answer that is even
      slightly incorrect: the wcfs client would ask the wcfs server to
      provide array data as of a different database state than the one
      viewed by the current on-client ZODB connection. As a result, data
      accessed via ZBigArray would _not_ correspond to all other data
      accessed via the regular ZODB mechanism. In other words, that would be
      data corruption.
      
      [1] https://godoc.org/lab.nexedi.com/kirr/neo/go/zodb#Connection
      [2] https://github.com/zopefoundation/ZODB/issues/290
      3bd82127
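One small piece of the zconn_at idea can be sketched in isolation, assuming the connection view is described by an exclusive upper bound `before` (the view contains all transactions with tid < before). Under that assumption the state being viewed is `before - 1`. ZODB transaction IDs are 8-byte strings ordered as big-endian integers; the function name `before_to_at` is hypothetical and this is not claimed to be the code from the commit.

```python
import struct


def before_to_at(before):
    """Convert an exclusive 8-byte TID bound into the last TID it includes."""
    (n,) = struct.unpack(">Q", before)   # TID as an unsigned 64-bit integer
    if n == 0:
        raise ValueError("no database state precedes TID 0")
    return struct.pack(">Q", n - 1)
```

An off-by-one here would be exactly the "even slightly incorrect" answer the commit message warns about, so the conversion is worth keeping in one place.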
    • lib/zodb: Add zmajor - way to know under which ZODB 3, 4 or 5 we are running · 8c0b7471
      Kirill Smelkov authored
      This will be needed in the following patches to know how to inject
      zconn_at or zconn resync functionality into particular ZODB version.
      8c0b7471
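How a major-version number could drive the choice of implementation can be sketched as follows. The helper names and the dispatch table are hypothetical; the sketch takes a version string as input and does not show how 8c0b7471 actually detects the installed ZODB.

```python
def major_version(version):
    """Return the leading numeric component of a dotted version string."""
    return int(version.split(".")[0])


def pick_implementation(version, by_major):
    """Select an implementation keyed by major version, or fail loudly."""
    major = major_version(version)
    try:
        return by_major[major]
    except KeyError:
        raise NotImplementedError("ZODB%d is not supported" % major)
```

For example, `pick_implementation("5.5.1", {5: impl_for_zodb5})` would return `impl_for_zodb5`, while a version with no registered implementation raises instead of silently picking a wrong one.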
    • bigfile/zodb: Cosmetics · c671aaea
      Kirill Smelkov authored
      - mention in comments that _ZBigFileH not only proxies changes from
        virtmem -> ZODB, but also the other way: virtmem <- ZODB.
      - refresh comments, fix typo.
      c671aaea
    • bigfile/file.h: Cosmetics · 927458f6
      Kirill Smelkov authored
      - Provide brief top-level overview + refresh loadblk/storeblk/release comments.
      - Add `typedef struct bigfile_ops bigfile_ops` that we usually add for all structs.
      927458f6
    • bigfile/virtmem: vma_page_addr: Kill wrong XXX · 34ed82c6
      Kirill Smelkov authored
      It is valid to compare a Page and a VMA only if they belong to the same
      fileh.
      34ed82c6
    • bigfile/virtmem: Factor-out checking whether `page->fpgoffset` is in file-range covered by `vma` · d53a480f
      Kirill Smelkov authored
      -> into vma_page_infilerange().
      
      We will soon need to use this functionality from several places.
      d53a480f
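The kind of check that d53a480f factors out can be sketched in Python, assuming a mapping is described by its address range and by the file page mapped at its start. The field names and the page size are assumptions made for the illustration, not the C definitions from virtmem.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size, for illustration only


@dataclass
class Vma:
    addr_start: int   # first mapped address
    addr_stop: int    # one past the last mapped address
    f_pgoffset: int   # file page mapped at addr_start


def page_in_file_range(vma, page_f_pgoffset):
    """True if the file page lies inside the file range covered by vma."""
    npages = (vma.addr_stop - vma.addr_start) // PAGE_SIZE
    return vma.f_pgoffset <= page_f_pgoffset < vma.f_pgoffset + npages
```

The range is half-open: a three-page mapping that starts at file page 10 covers pages 10, 11 and 12, but not 13.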
    • bigfile/virtmem: fileh_mmap: Refactor a bit · 516f4625
      Kirill Smelkov authored
      Start preparing vma early, not after the call to mem_valloc.
      This codeflow will be more convenient when we add mmap-through-wcfs codepath.
      516f4625
  8. 14 Apr, 2020 4 commits
  9. 01 Apr, 2020 2 commits
  10. 18 Dec, 2019 8 commits
    • bigfile/py: Move data structures to public .h file · 907bd9d4
      Kirill Smelkov authored
      This is needed so that e.g. a Python class implemented in C or Cython
      (cdef class) could inherit from PyBigFile.
      
      Don't put _bigfile.h into separate include/ directory, and keep it along
      main .c file, similarly to how pygolang is organized.
      907bd9d4
    • bigfile/py: Provide package-level documentation · 3684d164
      Kirill Smelkov authored
      Provide package-level documentation that gives brief overview of what
      this package does. Split internal notes into separate comment.
      3684d164
    • bigfile/py: Stop using Plan9 C extensions · bf44905b
      Kirill Smelkov authored
      Starting from 5755a6b3 (Basic setup.py / Makefile to build/install/sdist
      stuff + bigfile.so skeleton) and 35eb95c2 (bigfile: Python wrapper
      around virtual memory subsystem) we were using Plan9 C extensions[1] for
      simple inheritance. Those extensions are supported by GCC with the
      -fplan9-extensions option. However, that option is supported only for
      C; for C++ it does not work at all, and the compiler rejects Plan9
      syntax with an error.
      
      Soon we'll need to add another extension - written in C++ - to
      wendelin.core. This extension will provide the client side of WCFS and
      integrate it with virtmem. In that extension we'll need to use
      _bigfile data structures - in particular we'll need to use PyBigFile
      and extend it with further `cdef class` children written in Cython/C++.
      
      This patch prepares for that: first stop using Plan9 C extensions in
      _bigfile py module data structures and adapt the code correspondingly.
      In the next patch we'll move those data structures into an .h file.
      
      We don't drop -fplan9-extensions from setup.py, because Plan9-style
      inheritance continues to be used internally by virtmem - e.g. in
      ram_shmfs.c and friends.
      
      It is a bit of a pity to drop that good stuff, but given that we'll
      need C++ for the WCFS client in order to use other good stuff provided
      by pygolang[2], it is a reasonable compromise.
      
      [1] http://9p.io/sys/doc/comp.html  "Extensions" section
      [2] https://pypi.org/project/pygolang
      bf44905b
    • bigfile/zodb: Factor-out LivePersistent into -> lib/zodb · c02776e9
      Kirill Smelkov authored
      It has long been marked as "XXX move to common place".
      c02776e9
    • bigfile/zodb: FIXME invalidations are not working correctly on blocks topology change · 8c32c9f6
      Kirill Smelkov authored
      I noticed this while working on WCFS: if the file blocks topology
      changes, the invalidation process does not work correctly. It is also
      not correct with respect to live cache pressure.
      
      Add FIXME in the code and test for live cache pressure.
      
      kirr/wendelin.core@5a4562fc
      kirr/wendelin.core@48eb692f
      kirr/wendelin.core@d1a579b2
      kirr/wendelin.core@69c94fbc
      8c32c9f6
    • bigfile/zodb: Explain why we always mark ZBlk object changed if block data change · d27ade8e
      Kirill Smelkov authored
      For ZBlk0 this is trivial, but for ZBlk1 it may seem that we could
      avoid changing the ZBlk object itself and mark only the pointed-to
      ZData object as changed. However, that would not be correct if we
      consider invalidations.
      
      Noticed while working on WCFS.
      d27ade8e
    • *: Add package-level documentation to ZODB-related packages · d5e0d2f9
      Kirill Smelkov authored
      Add package-level documentation to
      
        - bigfile/file_zodb.py,
        - bigarray/array_zodb.py, and
        - lib/zodb.py
      
      The most interesting read is file_zodb.py .
      
      Slightly improve documentation for functions in a couple of places.
      
      Improving the documentation was long overdue, and this commit improves
      it only slightly.
      d5e0d2f9
    • tests: Keep ZEO test database on /tmp/ · 70c998c1
      Kirill Smelkov authored
      We already keep FileStorage test database on /tmp/ and NEO itself (via
      neo.tests.functional.NEOCluster) also keeps test data on tmpfs. However
      test database for ZEO was created in current directory and was wearing
      out SSD unnecessarily.
      
      FIXME zeo_forker currently does not provide an API to keep all server
      files in a particular place. So the server conf and log are still
      emitted in the current directory, but at least we move Data.fs away.
      Since conf and log are uniquely named, e.g. server-<XXX>.conf and
      tmpYYY.log, and only Data.fs was named non-uniquely, moving Data.fs
      into a unique per-server place also helps with-ZEO tests to execute
      correctly in parallel with `tox -p`.
      70c998c1
  11. 04 Dec, 2019 2 commits
  12. 29 Sep, 2019 1 commit
  13. 18 Sep, 2019 1 commit
  14. 15 Jul, 2019 1 commit
    • bigfile/py: It is ok to have .fileh_open() as BigFile method · 7ee02038
      Kirill Smelkov authored
      There was an XXX about whether fileh_open should be a BigFile method
      or a global function. However, a global function would anyway need to
      accept a file parameter to indicate which file is opened, which in
      turn suggests that it should be a file method. Remove the XXX.
      7ee02038
  15. 12 Jul, 2019 2 commits
    • */tests: Use defer instead of finally · 2b457640
      Kirill Smelkov authored
      try/finally was used in a couple of places to save/restore default ZBlk
      format setting. Move the restore part close to save with the help of
      defer.
      2b457640
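The save/restore pattern from 2b457640 can be sketched with the standard library only. `contextlib.ExitStack` is used here as a stand-in for pygolang's `defer`, so this is a different mechanism with the same intent: registering the restore right next to the save. The `settings` dictionary and function names are hypothetical.

```python
from contextlib import ExitStack

settings = {"zblk_format": "ZBlk0"}      # hypothetical global setting


def with_finally():
    saved = settings["zblk_format"]
    try:
        settings["zblk_format"] = "ZBlk1"
        return settings["zblk_format"]
    finally:
        settings["zblk_format"] = saved  # restore sits far from the save


def with_deferred_restore():
    with ExitStack() as deferred:
        saved = settings["zblk_format"]
        # Register the restore immediately after saving the old value.
        deferred.callback(settings.__setitem__, "zblk_format", saved)
        settings["zblk_format"] = "ZBlk1"
        return settings["zblk_format"]
```

Both functions leave `settings` unchanged on exit; the second keeps the save and its undo on adjacent lines, which is the readability gain the commit is after.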
    • *: Use defer for dbclose & friends · 5c8340d2
      Kirill Smelkov authored
      For tests this makes sure that if one test fails, the following tests
      won't fail merely because they cannot lock the test database.
      
      For regular code (demo_zbigarray.py) this is also a good thing to do -
      to always close the database regardless of whether an exception was
      raised before the program reached the end of main.
      
      Pygolang becomes regular - not test only - dependency. Being regular
      dependency is currently required only by demo_zbigarray.py, but it will
      be also used in upcoming wcfs, so adding pygolang into wendelin.core
      dependencies aligns with the plan.
      
      dbclose now uses defer almost everywhere - there are still a few
      places in tests where one test function opens/closes the test database
      multiple times - those were not (yet ?) converted.
      5c8340d2