enhancements

898ec9a6 · Jim Fulton · d4da45f2 · 898ec9a6
Commit 898ec9a6 authored Feb 19, 2014 by Jim Fulton
Show whitespace changes
Inline Side-by-side

Showing with 186 additions and 82 deletions

documentation/tutorial.rst documentation/tutorial.rst +186 -82

No files found.
--- a/documentation/tutorial.rst
+++ b/documentation/tutorial.rst
@@ -8,10 +8,15 @@ of how to develop an application which stores its data in the ZODB.
 Introduction
 ============

-Lets have a look at a simple piece of code that we want to turn into using
-ZODB::
+To save application data in ZODB, you'll generally define classes that
+subclass ``persistent.Persistent``::
+
+    # account.py
+
+    import persistent
+
+    class Account(persistent.Persistent):

-    class Account(object):
        def __init__(self):
            self.balance = 0.0

@@ -22,120 +27,219 @@ ZODB::
            assert amount < self.balance
            self.balance -= amount

-This code defines a simple class that holds the balance of a bank account and provides two methods to manipulate the balance: deposit and cash.
+This code defines a simple class that holds the balance of a bank
+account and provides two methods to manipulate the balance: deposit
+and cash.
+
+Subclassing ``Persistent`` provides a number of features:
+
+- The database will automatically track object changes made by setting
+  attributes [#changed]_.
+
+- Data will be saved in it's own database record.
+
+  You can save data that doesn't subclass ``Persistent``, but it will be
+  stored in the database record of whatever persistent object
+  references it.
+
+- Objects will have unique persistent identity.
+
+  Multiple objects can refer to the same persistent object and they'll
+  continue to refer to the same object even after being saved
+  and loaded from the database.
+
+  Non-persistent objects are essentially owned by their containing
+  persistent object and if multiple persistent objects refer to the
+  same non-persistent subobject, they'll (eventually) get their own
+  copies.
+
+Note that we put the class in a named module.  Classes aren't stored
+in the ZODB [#persistentclasses]_.  They exist on the file system and
+their names, consisting of their class and module names, are stored in
+the database. It's sometimes tempting to create persistent classes in
+scripts or in interactive sessions, but if you do, then their module
+name will be ``'__main__'`` and you'll always have to define them that
+way.

 Installation
 ============

 Before being able to use ZODB we have to install it, using easy_install. Note
-that the actual package name is called "ZODB3"::
+that the actual package name is called "ZODB::
+
+    $ pip install ZODB
+
+Creating Databases
+==================
+
+When a program wants to use the ZODB it has to establish a connection,
+like any other database. For the ZODB we need 3 different parts: a
+storage, a database and finally a connection::
+
+    import ZODB, ZODB.FileStorage

-    $ easy_install ZODB3
-    ...
-    $ python
-    >>> import ZODB
+    storage = ZODB.FileStorage.FileStorage('mydata.fs')
+    db = ZODB.DB(storage)
+    connection = db.open()
+    root = connection.root

-ZODB is now installed and can be imported from your Python installation.
+ZODB has a pluggable storage framework.  This means there are a
+variety of storage implementations to meet different needs, from
+in-memory databases, to databases stored in local files, to databases
+on remote database servers, and specialized databases for compression,
+encryption, and so on.  In the example above, we created a database
+that stores it's data in a local file, using the ``FileStorage``
+class.

-    If you do not have easy_install available on your system, follow the
-    `EasyInstall
-    <http://peak.telecommunity.com/DevCenter/EasyInstall#installation-instructions>`_
-    installation instructions.
+Having a storage, we then use it to instantiate a database, which we
+then connect to by calling ``open()``.  A process with multiple
+threads will often have multiple connections to the same database,
+with different threads having different connections.

-    There are other installation mechanisms available to manage the
-    installation of Python packages. This tutorial assumes that you are using
-    a plain Python installation and install ZODB globally.
+There are a number of convenient shortcuts you can use for some of the
+commonly used storages:

+- You can pass a file name to the ``DB`` constructor to have it construct
+  a FileStorage for you::

-Configuration
-=============
+    db = ZODB.DB('mydata.fs')

-When a program wants to use the ZODB it has to establish a connection, like any other database. For the ZODB we need 3 different parts: a storage, a database and finally a connection:
+  You can pass None to create an in-memory database::

-    >>> from ZODB.FileStorage import FileStorage
-    >>> from ZODB.DB import DB
-    >>> storage = FileStorage('Data.fs')
-    >>> db = DB(storage)
-    >>> connection = db.open()
-    >>> root = connection.root()
+    memory_db = ZODB.DB(None)

-We create a storage called FileStorage, which is the current standard storage
-used by virtually everyone. It keeps track of all data in a single file as
-stated by the first parameter. From this storage we create a database and
-finally open up a connection. Finally, we retrieve the database root object
-from the connection that we opened.
+- If you're only going to use one connection, you can call the
+  ``connection`` function::
+
+    connection = ZODB.connection('mydata.fs')
+    memory_connection = ZODB.connection(None)

 Storing objects
 ===============

-To store an object in the ZODB we simply attach it to any other object that
-already lives in the database. Hence, the root object functions as a
-boot-strapping point. The root object is a dictionary and you can start storing
-your objects directly in there:
+To store an object in the ZODB we simply attach it to any other object
+that already lives in the database. Hence, the root object functions
+as a boot-strapping point.  The root object is meant to serve as a
+namespace for top-level objects in your database.  We could store
+account objects directly on the root object::
+
+    import account
+
+    # Probably a bad idea:
+    root.account1 = account.Account()
+
+But if you're going to store many objects, you'll want to use a
+collection object [#root]_::
+
+    import account, BTrees.OOBTree
+
+    root.accounts = BTrees.OOBTree.BTree()
+    root.accounts['account-1'] = Account()
+
+Another common practice is to store a persistent object in the root of
+the database that provides an application-specific root::
+
+    root.accounts = AccountManagementApplication()
+
+That can facilitate encapsulation of an application that shares a
+database with other applications.  This is a little bit like using
+modules to avoid namespace colisions in Python programs.

->>> root['account-1'] = Account()
->>> root['account-2'] = Account()
+Containers and search
+=====================

-    Frameworks like Zope only
-    create a single object in the ZODB root representing the application itself and
-    then let all sub-objects be referenced from there on. They choose names like
-    'app' for the first object they place in the ZODB.
+BTrees provide the core scalable containers and indexing facility for
+ZODB. There are different families of BTrees.  The most general are
+OOBTrees, which have object keys and values. There are specialized
+BTrees that support integer keys and values.  Integers can be stored
+more efficiently, and compared more quickly than objects and their
+often used as application-level object identifiers.  It's critical,
+when using BTrees, to make sure that it's keys have a stable ordering.

+ZODB doesn't provide a query engine.  The primary way to access
+objects in ZODB is by traversing (accessing attributes or items, or
+calling methods) other objects.  Object traversal is typically much
+faster than search.
+
+You can use BTrees to build indexes for efficient search, when
+necessary.  If your application is search centric, or if you prefer to
+approach data access that way, then ZODB might not be the best
+technology for you.

 Transactions
 ============

-You now have two objects placed in your root object and in your database.
-However, they are not permanently stored yet. The ZODB uses transactions and to
-make your changes permanent, you have to commit the transaction:
+You now have objects in your root object and in your database.
+However, they are not permanently stored yet. The ZODB uses
+transactions and to make your changes permanent, you have to commit
+the transaction::

->>> import transaction
->>> transaction.commit()
->>> root.keys()
-['account-1', 'account-2']
+   import transaction

-Now you can stop and start your application and look at the root object again,
-and you will find the entries 'account-1' and 'account-2' to be still be
-present and be the objects you created.
+   transaction.commit()

-    Objects that have not been stored in
-    the ZODB yet are not rolled back by an abort.
+Now you can stop and start your application and look at the root object again,
+and you will find the data you saved.

 If your application makes changes during a transaction and finds that it does
 not want to commit those changes, then you can abort the transaction and have
-the changes rolled back for you:
+the changes rolled back [#rollback]_ for you::

->>> del root['account-1']
->>> root.keys()
-['account-2']
->>> transaction.abort()
->>> root.keys()
-['account-1', 'account-2']
-
-Persistent objects
-==================
+   transaction.abort()

-One last aspect that we need to cover are persistent objects themselves. The
-ZODB will be happy to store almost any Python object that you pass to it (it
-won't store files for example). But in order for noticing which objects have
-changed the ZODB needs those objects to cooperate with the database. In
-general, you just subclass `persistent.Persistent` to make this happen. So our
-example class from above would read like::
-
-    import persistent
-
-    class Account(persistent.Persistent):
-        # ... same code as above ...
+Transactions are a very powerful way to protect the integrity of a
+database.  Transactions have the property that all of the changes made
+in a transaction are saved, or none of them are.  If in the midst of a
+program, there's an error after making changes, you can simply abort
+the transaction (or not commit it) and all of the intermediate changes
+you make are automatically discarded.

-..
+Memory Management
+=================

-    Have a look at the reference documentation about the rules of persistency and
-    to find out more about specialised persistent objects like BTrees.
+ZODB manages moving objects in and out of memory for you.  The unit of
+storage is the persistent object.  When you access attributes of a
+persistent objects, it's loaded from the database automatically, if
+necessary. If too many objects are in memory, then objects used least
+recently are evicted [#eviction]_.  The maximum number of objects or
+bytes in memory is configurable,

 Summary
 =======

 You have seen how to install ZODB and how to open a database in your
-application and to start storing objects in it. We also touched the two simple
-transaction commands: commit and abort. The reference documentation contains
-sections with more information on the individual topics.
+application and to start storing objects in it. We also touched the
+two simple transaction commands: ``commit`` and ``abort``. The
+reference documentation contains sections with more information on the
+individual topics.
+
+.. [#changed] 
+   You can manually mark an object as changed by setting it's
+   ``_p_changed__`` attribute to ``True``. You might do this if you
+   update a subobject, such as a standard Python ``list`` or ``set``,
+   that doesn't subclass ``Persistent``.
+
+.. [#persistentclasses]
+   Actually, there is semi-experimental support for storing classes in
+   the database, but applications rarely do this.
+
+.. [#root]
+   The root object is a fairy simple persistent object that's stored
+   in a single database record.  If you stored many objects in it,
+   it's database record would become very large, causing updates to be
+   inefficient and causing memory to be used ineffeciently.
+
+   Another reason not to store items directly in the root object is
+   that doing so would make adding a second collection of objects
+   later awkward.
+
+.. [#rollback]
+   A caveat is that ZODB can only roll back changes to objects that
+   have been stored and committed to the database.  Objects not
+   previously committed can't be rolled back because there's no
+   previous state to roll back to.
+
+.. [#eviction]
+   Objects aren't actually evicted, but their state is released, so
+   they take up much less memory and any objects they referenced can
+   be removed from memory.