Commits · e020026a4eb2208fb5fd3153a093b7fd77676e25 · Kirill Smelkov / neo

01 Feb, 2019 1 commit

go/zodb: Complete (Py)Stateful & Ghostable · e020026a

Kirill Smelkov authored Feb 01, 2019

- require (Py)GetState and make it clear that it is called by
  persistency machinery only on non-ghost objects.
- make it clear that (Py)SetState is called by persistency machinery
  only on ghost objects.
- make it clear that DropState is called by persistency machinery only
  on non-ghost objects.

For btree, PyGetState is marked as TODO, which we'll fill in incrementally
(a draft of the code is ready, but there is no test for now).

e020026a

30 Jan, 2019 3 commits

go/zodb/fs1: tests: Factor-out committing a transaction via ZODB/py into common place · 7f14e2cb
Kirill Smelkov authored Jan 29, 2019
```
We will soon need to be able to commit in tests for ZODB itself.
```
7f14e2cb

go/zodb: Require drivers to provide at₀ on open · a6580062

Kirill Smelkov authored Jan 29, 2019

and to deliver to watchq all events in the (at₀, +∞] range, and only those.

This continues 4d2c8b1d (go/zodb: Require drivers to provide
notifications for database change events) and makes initialization
semantics to process invalidations easier for both application-level ZODB
layer, and for low-level users that work with drivers directly:

- there is no possibility that an event will come from watchq with tid <
  current user head.

- there is no possibility that the watcher will start delivering events
  after at₀, but not immediately after it, i.e. users can rely on not
  losing an event.

These correctness invariants should be easy to provide in drivers.

a6580062

go/zodb: DB.connv -> DB.pool · 2799f995
Kirill Smelkov authored Jan 29, 2019
```
pool is a better name for "pool of unused connections" compared to connv.
```
2799f995

29 Jan, 2019 2 commits

go/zodb: Connection -= .stor · 0806740d

Kirill Smelkov authored Jan 29, 2019

Connection has .db and db has .stor - there is no need to keep a separate
.stor on the Connection. This thinko was there from the very beginning of
Connection (533f0c73 "go/zodb: DB - application-level handle to database (very draft)").

0806740d

go/transaction: Fix Abort to wait for synchronizers completion · 7e0c944f

Kirill Smelkov authored Jan 29, 2019

There was a thinko in transaction.Abort - it was spawning synchronizers
under a waitgroup, but the wg.Wait() call was forgotten. This way, e.g. in
ZODB, if a transaction was aborted, the corresponding connection might not
yet have been returned to the DB pool when Abort returned.

Fix it.
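
The pattern described above can be sketched as follows. This is only an illustration, not the actual go/transaction code: the `Synchronizer` interface, `connReturner` and `abort` are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// Synchronizer is a hypothetical stand-in for a transaction synchronizer.
type Synchronizer interface {
	AfterCompletion()
}

// connReturner is a hypothetical synchronizer that records that its
// connection was handed back to the pool.
type connReturner struct {
	mu       sync.Mutex
	returned bool
}

func (c *connReturner) AfterCompletion() {
	c.mu.Lock()
	c.returned = true
	c.mu.Unlock()
}

// abort notifies all synchronizers concurrently and returns only after
// every one of them has finished.
func abort(syncv []Synchronizer) {
	var wg sync.WaitGroup
	for _, s := range syncv {
		wg.Add(1)
		go func(s Synchronizer) {
			defer wg.Done()
			s.AfterCompletion()
		}(s)
	}
	wg.Wait() // without this, abort could return before synchronizers ran
}

func main() {
	c := &connReturner{}
	abort([]Synchronizer{c})
	fmt.Println("returned to pool:", c.returned)
}
```

With `wg.Wait()` in place, a caller can rely on all synchronizers having completed by the time `abort` returns.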

Test is TODO for the time when, hopefully, tracetest is generally ready.

7e0c944f

28 Jan, 2019 1 commit

go/zodb += TidFromTime · a9e2badf

Kirill Smelkov authored Jan 28, 2019

In ZODB a transaction ID is connected with time. We already have
functionality to convert tid to time (see bac6c953 "go/zodb: Tid
connection with time"), but the functionality for converting the other
way - time -> tid - was missing.

Fix it.

a9e2badf

18 Jan, 2019 6 commits

go/zodb: Expose retrieving ZODB class name of a Go object as public API · 2f061c0c

Kirill Smelkov authored Jan 18, 2019

It is useful in situations where one needs to print potentially Broken
objects properly, because %T does not print that detail and using %#v
creates too much noise.

2f061c0c

go/zodb: Allow to set live cache control policy · cab4dd52

Kirill Smelkov authored Jan 18, 2019

This allows for applications to tune eviction strategy for their needs.
The interface to do so (LiveCacheControl) was defined previously in
c67ff9ea (go/zodb: Connection: Allow applications to tune live-cache
eviction policy), but so far there was no way to install cache
controller. Fix it.

cab4dd52

go/zodb: Expose access to connection's live cache as public API · e2d902b0
Kirill Smelkov authored Jan 18, 2019
```
Wendelin.core (wcfs) needs to check whether an object is currently
cached or not.
```
e2d902b0

go/zodb/fs1: Teach it to support notifications on database change · db852511

Kirill Smelkov authored Jan 18, 2019

Following-up on 4d2c8b1d (go/zodb: Require drivers to provide
notifications for database change events) let's teach FileStorage to
support watching for database changes:

- add a watcher that observes changes to the data file via fsnotify.

- when we see that the file was updated - there is a tricky case to distinguish
  in-progress writes from just some garbage added at tail. See comments in
  _watcher for details.

- if the watcher sees there are indeed new transactions, the watcher
  updates index & txnh{Min,Max} and sends the corresponding event to watchq.

- since index / txnh{Min,Max} can now be changed (by watcher) -> they
  are now protected by mu.

- consequently operations like LastTid, LastOid, Load, ... are all taught to
  observe index / txnh{Min,Max} state in read-snapshot way.

- add down & friends to indicate that the storage is no longer
  operational, if watcher sees there is a problem with data file.

- Open is reworked to start from loading/rebuilding index first, and then tail
  to watcher to detect whether what's after found topPos is garbage or another
  in-progress transaction. Consequently it is now possible to correctly open
  filestorage that is being currently written to and has in-progress transaction
  at tail.
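
The "read-snapshot" access mentioned in the list above can be sketched like this. The `storage` type, its fields and method names are hypothetical simplifications for illustration; the real FileStorage keeps more state and the real watcher does considerably more work.

```go
package main

import (
	"fmt"
	"sync"
)

// storage is a hypothetical, simplified stand-in for the state that the
// watcher updates and that readers observe.
type storage struct {
	mu      sync.RWMutex
	index   map[uint64]int64 // oid -> file position of the latest data record
	lastTid uint64           // tid of the last fully written transaction
}

// lookup reads index and lastTid under one read lock, so the caller gets a
// mutually consistent pair even if the watcher is updating concurrently.
func (s *storage) lookup(oid uint64) (pos int64, head uint64, ok bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	pos, ok = s.index[oid]
	return pos, s.lastTid, ok
}

// applyTxn is what a watcher would call once it has verified that a new
// transaction is completely present in the data file.
func (s *storage) applyTxn(tid uint64, changed map[uint64]int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for oid, pos := range changed {
		s.index[oid] = pos
	}
	s.lastTid = tid
}

func main() {
	s := &storage{index: map[uint64]int64{}}
	s.applyTxn(5, map[uint64]int64{1: 100})
	pos, head, ok := s.lookup(1)
	fmt.Println(pos, head, ok)
}
```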

The patch is a bit big, but the changes here are all tightly
interrelated.

db852511

go/zodb/fs1: Factor out making zodb.OpError into FileStorage.zerr · 8eb0988e

Kirill Smelkov authored Jan 18, 2019

There are already 3 places where we return zodb.OpError, and there will be more.
Make a helper function to create those errors with higher signal/noise.

8eb0988e

go/zodb: Require drivers to provide notifications for database change events · 4d2c8b1d

Kirill Smelkov authored Jan 17, 2019

A ZODB database can be changed by local client as well as another
process. A notification channel is thus needed for local cache and
database view to know they have to update to current database state.

This patch builds the interface of how such notifications should be
provided by drivers. Contrary to ZODB/py it is required that every
driver provide it.

However we will be providing driver support incrementally and for now
all drivers behave as if the database is not changing.

A note on why Watchq is passed to driver as options: low-level ZODB
users, who might want to work with drivers directly, might not need it,
and this way with Watchq not present in driver options they will continue to
observe the same driver behaviour as before watchq was introduced. In
practice many low-level utilities don't need notification support, and
it would not be good to require them all to update their open calls and to
provide a watchq drainer just to avoid getting stuck.
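
The design choice can be illustrated with a small sketch. `Event`, `DriverOptions`, `driver`, `open` and `notify` below are hypothetical simplifications and not the package's actual definitions; the point is only that a nil channel in the options means "no notifications wanted".

```go
package main

import "fmt"

// Event is a hypothetical database change event.
type Event struct{ Tid uint64 }

// DriverOptions is a hypothetical options struct passed to a driver on open.
type DriverOptions struct {
	ReadOnly bool
	Watchq   chan<- Event // nil: the caller does not want notifications
}

type driver struct{ watchq chan<- Event }

func open(opt *DriverOptions) *driver {
	return &driver{watchq: opt.Watchq}
}

// notify delivers ev only if the opener provided a watchq; it reports
// whether the event was sent. With nil watchq nothing can block.
func (d *driver) notify(ev Event) bool {
	if d.watchq == nil {
		return false
	}
	d.watchq <- ev
	return true
}

func main() {
	// a low-level tool that does not care about changes: nothing to drain
	d1 := open(&DriverOptions{ReadOnly: true})
	fmt.Println(d1.notify(Event{Tid: 1}))

	// a client that wants notifications provides the channel and drains it
	q := make(chan Event, 1)
	d2 := open(&DriverOptions{Watchq: q})
	fmt.Println(d2.notify(Event{Tid: 2}), (<-q).Tid)
}
```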

TODO Watchq should be extended to also receive errors from watcher, so
that clients could be notified when there is something wrong with the
database.

4d2c8b1d

17 Jan, 2019 9 commits

go/zodb: Decouple driver-specific options from OpenOptions · 559a1be7

Kirill Smelkov authored Jan 17, 2019

OpenOptions are the high-level options clients give to ZODB when they want to
open a high-level IStorage, and not all of those options apply to drivers
- for example NoCache is meaningless since the cache is provided by ZODB
infrastructure, not by a driver.

On the other hand we are going to introduce driver-specific options
that either low-level users or the ZODB infrastructure itself will use when
opening drivers.

For these reasons decouple driver-specific options from OpenOptions into
DriverOptions.

559a1be7

go/zodb: Decouple IStorage from IStorageDriver · e81f490a

Kirill Smelkov authored Jan 17, 2019

IStorage is what we give to users. IStorageDriver is what is internally
provided by drivers to common ZODB infrastructure that in turn
implements easy-to-use IStorage interface for clients.

So far those two were close to each other. However we need to decouple
them, or else every driver will have to implement the same high-level
functionality that users expect from IStorage. What we want to do is to
offload drivers from that work.

In the next patches IStorageDriver will evolve with driver-specific
bits, while IStorage will evolve with higher-level stuff
that clients will use.

e81f490a

go/zodb/fs1: tests: Factor-out checking for Python/ZODB presence into common place · 46636b16

Kirill Smelkov authored Jan 17, 2019

We will soon need to check for Python and other modules in more tests.
Now the checking function is generic - see xtesting.NeedPy docs.

46636b16

go/zodb: Give ConnOptions its own String · 2fc04a08

Kirill Smelkov authored Jan 17, 2019

Connection options could be prepared/logged not only in DB.Open, so
instead of teaching only DB.Open how to print them, teach ConnOptions to
represent itself in human-readable form.
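
In Go this is typically done by implementing fmt.Stringer. A minimal sketch, assuming a simplified `ConnOptions` with only two illustrative fields; the field set and the output format here are assumptions, not the package's actual ones.

```go
package main

import "fmt"

// ConnOptions is a simplified, hypothetical stand-in for connection options.
type ConnOptions struct {
	At     uint64 // database state to view; 0 means "current head"
	NoSync bool   // do not sync with storage before opening
}

// String implements fmt.Stringer, so the options can be logged from any
// place, not only from the code that opens the connection.
func (opt *ConnOptions) String() string {
	at := "head"
	if opt.At != 0 {
		at = fmt.Sprintf("%016x", opt.At)
	}
	sync := "sync"
	if opt.NoSync {
		sync = "nosync"
	}
	return fmt.Sprintf("(@%s, %s)", at, sync)
}

func main() {
	opt := &ConnOptions{At: 0x0285cbac258bf266}
	fmt.Printf("open %s\n", opt) // %s picks up String automatically
}
```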

2fc04a08

go/zodb/zodbtools: Refer to `zodb help tidrange` about how history range should be specified · 218a9c6f
Kirill Smelkov authored Jan 17, 2019
```
This syncs to nexedi/zodbtools@f7eff5fe
(*: Refer to `zodb help tidrange` about how history range should be
specified) in zodbtools/py.
```
218a9c6f
go/zodb/zodbtools: Minor godoc cosmetics · d36da842
Kirill Smelkov authored Jan 17, 2019

d36da842
go/zodb/btree: Expose iterating through leaf tree nodes as public API · 36bb9b1e
Kirill Smelkov authored Jan 17, 2019
```
Done for consistency with 313d2d78 (go/zodb/btree: Expose access to
BTree/Bucket entries as public API).
```
36bb9b1e

go/zodb/btree: {Min,Max}Key · a47146ea

Kirill Smelkov authored Jan 17, 2019

Provide functionality to query for key-range limit for all children
under a tree node.

a47146ea

go/zodb/btree: Minor cosmetics · 07869c01

Kirill Smelkov authored Jan 17, 2019

- Refer to zodb interfaces as zodb.<name>;
- in tests bmapping is about LOBTree, not BTree. This fixes up c1ba9a90
  (go/zodb/btree: Turn it into template).

07869c01

13 Dec, 2018 1 commit
- go/neo/t/{tcpu,tzodb}.py: Move hashing utilities & hashRegistry to -> zodbtools · e5354e76
  Kirill Smelkov authored Dec 13, 2018
```
Moved here:

	nexedi/zodbtools@e973d519
```
  e5354e76
04 Dec, 2018 1 commit
- go/zodb/fs1: Make it render better in godoc · fe751a15
  Kirill Smelkov authored Dec 04, 2018
```
- Put dot after the subject,
- Indent lists,
- ...
```
  fe751a15
08 Oct, 2018 9 commits

go/zodb: Add support for class aliases · 4eed282c

Kirill Smelkov authored Oct 08, 2018

Sometimes class name of an object changes, but to support loading
previously-saved objects, the old class name has to be also supported.

For example wendelin.core has ZBlk0 class, but historically used just
"ZBlk" name for it:

https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.12-6-g318efce/bigfile/file_zodb.py#L377

Both class names have to be supported on loading and resolve to a
ZBlk0-typed runtime object.

4eed282c

go/zodb: Fix copy-paste thinko in PyStateful tests · df936d75
Kirill Smelkov authored Oct 08, 2018
```
The code here is testing t.zodb.MyObject class, not unknown.
```
df936d75

go/zodb: RegisterClass: Prevent double registration of a type · 6372c7a4

Kirill Smelkov authored Oct 08, 2018

Currently RegisterClass was catching double registration of a ZODB class
(a string), but not a Go type.

We want to prevent double registration of a Go type, because when saving
in-RAM state to ZODB we have to translate Go type -> ZODB class.

Fix it.

6372c7a4

go/zodb/btree: Expose access to BTree/Bucket entries as public API · 313d2d78

Kirill Smelkov authored Oct 08, 2018

Traditionally BTrees in ZODB/py expose point query and iteration APIs.
However they don't allow a BTree to be scanned through concurrently.

For example in wendelin.core each ZBlk1 consists of an IOBTree with 512
chunks

https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.12-6-g318efce/bigfile/file_zodb.py#L267

and loading those chunks from ZODB one-by-one serially is very slow.

Expose a way to retrieve all children of a B⁺ tree node. This way
loading them all could be started in parallel thus significantly
reducing overall latency if a range or whole BTree needs to be fetched.
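
Once all children are known, their loading can be started concurrently. A minimal sketch: `loadChunk` is a hypothetical placeholder that simulates one slow storage round-trip, and `loadAll` is an illustrative name, not part of the btree package.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// loadChunk is a hypothetical placeholder for loading one child from the
// database; the sleep simulates storage latency.
func loadChunk(key int) string {
	time.Sleep(10 * time.Millisecond)
	return fmt.Sprintf("chunk-%d", key)
}

// loadAll starts loading of all keys at once and waits for all of them.
// Results are stored by index, so the output order matches the input order.
func loadAll(keys []int) []string {
	out := make([]string, len(keys))
	var wg sync.WaitGroup
	for i, k := range keys {
		wg.Add(1)
		go func(i, k int) {
			defer wg.Done()
			out[i] = loadChunk(k) // each goroutine writes its own slot
		}(i, k)
	}
	wg.Wait()
	return out
}

func main() {
	start := time.Now()
	chunks := loadAll([]int{0, 1, 2, 3})
	fmt.Println(chunks, "loaded in", time.Since(start))
}
```

With serial loading the total time grows with the number of chunks; here it stays close to the latency of a single load, as long as the storage can serve requests concurrently.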

313d2d78

go/zodb/btree: Rearrange data structure definitions · 8d21e8cc

Kirill Smelkov authored Oct 08, 2018

In the next patch we are going to expose access to BTree/Bucket entries
as public API. This will turn _BTreeItem into Entry and will also add
BucketEntry data type. Before doing that rearrange the order in which
the data structures go:

	- BTree,
	- Entry (_BTreeItem for now),
	- Bucket
	- BucketEntry (not present for now).

Only code movement - no other change.

8d21e8cc

go/zodb/btree: Draft overview · d124ea04
Kirill Smelkov authored Oct 08, 2018
```
Very brief and incomplete.
```
d124ea04

go/zodb/btree: Verify that BTree children are always all of the same kind · 2e9e5cab

Kirill Smelkov authored Oct 08, 2018

They are all either BTree or all Buckets.

See https://github.com/zopefoundation/ZODB/blob/3.10.7-4-gb8d7a8567/src/BTrees/Development.txt#L231
for details

2e9e5cab

go/zodb/btree: Verify that BTree/Bucket keys come in sorted order · 50f570c8
Kirill Smelkov authored Oct 08, 2018
```
This is one of BTree invariants - check it on load.
```
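
A minimal sketch of such a check, assuming int64 keys and that keys must be strictly increasing; `checkSorted` is an illustrative name, not the package's function.

```go
package main

import "fmt"

// checkSorted verifies that keys are strictly increasing and reports the
// first violation it finds.
func checkSorted(keys []int64) error {
	for i := 1; i < len(keys); i++ {
		if keys[i-1] >= keys[i] {
			return fmt.Errorf("keys not sorted: [%d]=%d >= [%d]=%d",
				i-1, keys[i-1], i, keys[i])
		}
	}
	return nil
}

func main() {
	fmt.Println(checkSorted([]int64{1, 2, 5}))
	fmt.Println(checkSorted([]int64{1, 5, 2}))
}
```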
50f570c8

go/zodb/btree: Turn it into template · c1ba9a90

Kirill Smelkov authored Oct 08, 2018

2dba8607 (go/zodb/btree: New package to work with ZODB BTrees (draft))
added btree module with btree.BTree essentially being LOBTree (int64 key
-> object). Since for wendelin.core we also need IOBTree (int32 key ->
object), which is used in ZBlk1

	https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.12-6-g318efce/bigfile/file_zodb.py#L267
	https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.12-6-g318efce/bigfile/file_zodb.py#L374

let's turn btree module into template internally and generate code for
both LOBTree and IOBTree.

For reference, BTree/py takes a similar approach with respect to
templating.

c1ba9a90

02 Oct, 2018 1 commit
- go/zodb: Connection += .At() · fcab58ca
  Kirill Smelkov authored Oct 02, 2018
```
To know database state corresponding to the connection.
```
  fcab58ca
01 Oct, 2018 1 commit

go/zodb: Don't truncate Tid time precision to 1µs · 9112f21e

Kirill Smelkov authored Oct 01, 2018

The format of tid assumes ~ ns precision, and it is only formatted to µs
precision by default. So don't truncate TimeStamp value when computing
it from Tid, and perform the µs-rounding only on formatting.

The float numbers are not always exactly as in python. For example the
following program

	tidv = [
	    0x0000000000000000,
	    0x0285cbac258bf266,
	    0x0285cbad27ae14e6,
	    0x037969f722a53488,
	    0x03b84285d71c57dd,
	    0x03caa84275fc1166,
	]

	for tid in tidv:
	    t = TimeStamp.TimeStamp(p64(tid))
	    print '0x%016x %s %.9f\t%.9f' % (tid, t, t.timeTime(), t.second())

prints:

	0x0000000000000000 1900-01-01 00:00:00.000000 -2208988800.000000000     0.000000000
	0x0285cbac258bf266 1979-01-03 21:00:08.800000 284245208.800000191       8.800000185
	0x0285cbad27ae14e6 1979-01-03 21:01:09.300001 284245269.300001621       9.300001496	<-- ex here
	0x037969f722a53488 2008-10-24 05:11:08.120000 1224825068.119999886      8.119999878
	0x03b84285d71c57dd 2016-07-01 09:41:50.416574 1467366110.416574001      50.416573989
	0x03caa84275fc1166 2018-10-01 16:34:27.652650 1538411667.652649879      27.652650112

the difference is due to floating point operation ordering, because
TimeStamp.timeTime() loses precision - e.g. for the marked case:

	In [8]: '%.10f' % (281566860.000000000 + 9.300001496)
	Out[8]: '281566869.3000015020'

We don't try to mimic Python's float64 behaviour exactly - because it even
differs between the PURE_PYTHON=y and C TimeStamp implementations. However we
don't let that limit our timestamp precision to only 1µs.

In other words we keep on maintaining exact compatibility with Python on
printing, but timestamp values themselves now have ~ ns precision.
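
The idea of keeping full precision in the value and rounding only when formatting can be sketched as follows. The layout is the usual ZODB TimeStamp one (high 32 bits: minutes on a 12 x 31-day calendar from 1900; low 32 bits: seconds scaled by 2^32/60); `tidFields` and `tidString` are hypothetical names, and this sketch does not claim bit-exact compatibility with Python in corner cases such as seconds rounding up to 60.

```go
package main

import "fmt"

// tidFields decodes tid into calendar fields plus seconds as float64.
// The seconds value is not truncated: it keeps everything the low 32 bits
// of the tid can express.
func tidFields(tid uint64) (y, mo, d, h, m int, sec float64) {
	hi := int(tid >> 32)
	lo := uint32(tid)

	m = hi % 60
	hi /= 60
	h = hi % 24
	hi /= 24
	d = hi%31 + 1
	hi /= 31
	mo = hi%12 + 1
	y = hi/12 + 1900

	sec = float64(lo) * 60 / (1 << 32)
	return
}

// tidString rounds to µs only here, at formatting time.
func tidString(tid uint64) string {
	y, mo, d, h, m, sec := tidFields(tid)
	return fmt.Sprintf("%04d-%02d-%02d %02d:%02d:%09.6f", y, mo, d, h, m, sec)
}

func main() {
	tid := uint64(0x0285cbad27ae14e6)
	_, _, _, _, _, sec := tidFields(tid)
	fmt.Printf("0x%016x %s %.9f\n", tid, tidString(tid), sec)
}
```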

9112f21e

28 Sep, 2018 1 commit

go/zodb/fs1/index: Don't rely on []byte being pickled as string · c72aaa0d

Kirill Smelkov authored Sep 28, 2018

As https://github.com/kisielk/og-rek/pull/57 suggests, []byte was being
pickled as string only unintentionally, and that might change.

We are already explicitly checking for string in corresponding index
load place:

	https://lab.nexedi.com/kirr/neo/blob/2dba8607/go/zodb/storage/fs1/index.go#L282

so it is better we also explicitly save the bits as string.

If we don't and https://github.com/kisielk/og-rek/pull/57 gets accepted,
tests will fail:

	--- FAIL: TestIndexSaveLoad (0.00s)
	    index_test.go:176: index load: /tmp/t-index893650059/458967662/1.fs.index: pickle @6: invalid oidPrefix: type []uint8
	Traceback (most recent call last):
	  File "./py/indexcmp", line 41, in <module>
	    main()
	  File "./py/indexcmp", line 29, in main
	    d2 = fsIndex.load(path2)
	  File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/fsIndex.py", line 138, in load
	    data[ensure_bytes(k)] = fsBucket().fromString(ensure_bytes(v))
	  File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/fsIndex.py", line 71, in ensure_bytes
	    return s.encode('ascii') if not isinstance(s, bytes) else s
	AttributeError: 'bytearray' object has no attribute 'encode'
	--- FAIL: TestIndexSaveToPy (0.04s)
	    index_test.go:218: zodb/py read/compare index: exit status 1

c72aaa0d

09 Aug, 2018 4 commits

go/zodb/btree: New package to work with ZODB BTrees (draft) · 2dba8607
Kirill Smelkov authored Aug 09, 2018
```
Provide minimal support for BTrees.LOBTree Get for now.
```
2dba8607

go/zodb: DB - application-level handle to database (very draft) · 533f0c73

Kirill Smelkov authored Aug 09, 2018

DB represents a handle to database at application level and contains pool
of connections. DB.Open opens database connection. The connection will be
automatically put back into DB pool for future reuse after corresponding
transaction is complete. DB thus provides service to maintain live objects
cache and reuse live objects from transaction to transaction.

Note that it is possible to have several DB handles to the same database.
This might be useful if application accesses distinctly different sets of
objects in different transactions and knows beforehand which set it will be
next time. Then, to avoid huge cache misses, it makes sense to keep DB
handles opened for every possible case of application access.

TODO handle invalidations.

533f0c73

go/zodb: Connection: Allow applications to tune live-cache eviction policy · c67ff9ea

Kirill Smelkov authored Aug 09, 2018

For example Wendelin.core wcfs will need to keep some types of objects
(e.g. BigFile index) always in RAM for efficiency.

Provide corresponding interface that application could use to install
such live-cache eviction decision-making tuning.

c67ff9ea

go/zodb: Implement Connection · fb343a6f

Kirill Smelkov authored Aug 09, 2018

Connection represents an application-level view of a ZODB database.
It has groups of in-RAM application-level objects associated with it.
The objects are isolated from both changes in further database
transactions and from changes to in-RAM objects in other connections.

Connection, as objects group manager, is responsible for handling
object -> object database references. For this to work it keeps

{} oid -> obj

dict and uses it to find the already loaded object when another object
persistently references a particular oid. Since it is related, pydata handling
of persistent references is correspondingly implemented in this patch.

The dict must keep weak references on objects. The following text
explains the rationale:

if Connection keeps strong link to obj, just
obj.PDeactivate will not fully release obj if there are no
references to it from other objects:

- deactivate will release obj state (ok)
- but there will be still reference from connection `oid -> obj` map to this object,
which means the object won't be garbage-collected.

-> we can solve it by using "weak" pointers in the map.

NOTE we cannot use regular map and arbitrarily manually "gc" entries
there periodically: since for an obj we don't know whether other
objects are referencing it, we can't just remove obj's oid from
the map - if we do so and there are other live objects that
reference obj, user code can still reach obj via those
references. On the other hand, if another, not yet loaded, object
also references obj and gets loaded, traversing reference from
that loaded object will load second copy of obj, thus breaking 1
object in db <-> 1 live object invariant:

	A  →  B  →  C
	↓           |
	D <--------- - - -> D2 (wrong)

- A activate
- D activate
- B activate
- D gc, A still keeps link on D
- C activate -> it needs to get to D, but D was removed from objtab
-> new D2 is wrongly created

that's why we have to depend on Go's GC to know whether there are
still live references left or not. And that in turn means finalizers
and thus weak references.

some link on the subject:
https://groups.google.com/forum/#!topic/golang-nuts/PYWxjT2v6ps

fb343a6f