Commit 7e38f374 authored by Tim Peters's avatar Tim Peters

Merge rev 37347 from 2.8 branch.

Deleted this directory, which is actually part of ZODB.
Will stitch in again via svn:externals.
parent 2b77c337
ZEO Documentation
=================
This directory contains ZEO documentation.
howto.txt
The first place to look.
It provides a high-level overview of ZEO features and details on
how to install and configure clients and servers. Includes a
configuration reference.
cache.txt
Explains how the client cache works.
trace.txt
Describe cache trace tool used to determine ideal cache size.
ZopeREADME.txt
A somewhat dated description of how to integrate ZEO with Zope.
It provides a few hints that are not yet in howto.txt.
Zope Enterprise Objects (ZEO)
Installation
ZEO 2.0 requires Zope 2.4 or higher and Python 2.1 or higher.
If you use Python 2.1, we recommend the latest minor release
(2.1.3 as of this writing) because it includes a few bug fixes
that affect ZEO.
Put the package (the ZEO directory, without any wrapping directory
included in a distribution) in your Zope lib/python.
The setup.py script in the top-level ZEO directory can also be
used. Run "python setup.py install --home=ZOPE" where ZOPE is the
top-level Zope directory.
You can test ZEO before installing it with the test script::
python test.py -v
Run the script with the -h option for a full list of options. The
ZEO 2.0b2 release contains 122 unit tests on Unix.
Starting (and configuring) the ZEO Server
To start the storage server, go to your Zope install directory and
run::
python lib/python/ZEO/start.py -p port_number
This run the storage sever under zdaemon. zdaemon automatically
restarts programs that exit unexpectedly.
The server and the client don't have to be on the same machine.
If they are on the same machine, then you can use a Unix domain
socket::
python lib/python/ZEO/start.py -U filename
The start script provides a number of options not documented here.
See doc/start.txt for more information.
Running Zope as a ZEO client
To get Zope to use the server, create a custom_zodb module,
custom_zodb.py, in your Zope install directory, so that Zope uses a
ClientStorage::
from ZEO.ClientStorage import ClientStorage
Storage = ClientStorage(('', port_number))
You can specify a host name (rather than '') if you want. The port
number is, of course, the port number used to start the storage
server.
You can also give the name of a Unix domain socket file::
from ZEO.ClientStorage import ClientStorage
Storage = ClientStorage(filename)
There are a number of configuration options available for the
ClientStorage. See doc/ClientStorage.txt for details.
If you want a persistent client cache which retains cache contents
across ClientStorage restarts, you need to define the environment
variable, ZEO_CLIENT, or set the client keyword argument to the
constructor to a unique name for the client. This is needed so
that unique cache name files can be computed. Otherwise, the
client cache is stored in temporary files which are removed when
the ClientStorage shuts down. For example, to start two Zope
processes with unique caches, use something like::
python z2.py -P8700 ZEO_CLIENT=8700
python z2.py -P8800 ZEO_CLIENT=8800
Zope product installation
Normally, Zope updates the Zope database during startup to reflect
product changes or new products found. It makes no sense for
multiple ZEO clients to do the same installation. Further, if
different clients have different software installed, the correct
state of the database is ambiguous.
Zope will not modify the Zope database during product installation
if the environment variable ZEO_CLIENT is set.
Normally, Zope ZEO clients should be run with ZEO_CLIENT set so
that product initialization is not performed.
If you do install new Zope products, then you need to take a
special step to cause the new products to be properly registered
in the database. The easiest way to do this is to start Zope
once with the environment variable FORCE_PRODUCT_LOAD set.
The interaction between ZEO and Zope product installation is
unfortunate.
ZEO Client Cache
The Client cache provides a disk based cache for each ZEO client.
The client cache allows reads to be done from local disk rather than
by remote access to the storage server.
The cache may be persistent or transient. If the cache is
persistent, then the cache files are retained for use after process
restarts. A non-persistent cache uses temporary files that are
removed when the client storage is closed.
The client cache is managed as two files. The cache manager
endeavors to maintain the two files at sizes less than or equal to
one half the cache size. One of the cache files is designated the
"current" cache file. The other cache file is designated the "old"
cache file, if it exists. All writes are done to the current cache
files. When transactions are committed on the client, transactions
are not split between cache files. Large transactions may cause
cache files to be larger than one half the target cache size.
The life of the cache is as follows:
- When the cache is created, the first of the two cache files is
created and designated the "current" cache file.
- Cache records are written to the cache file, either as
transactions commit locally, or as data are loaded from the
server.
- When the cache file size exceeds one half the cache size, the
second cache file is created and designated the "current" cache
file. The first cache file becomes the "old" cache file.
- Cache records are written to the new current cache file, either as
transactions commit locally, or as data are loaded from the
server.
- When a cache hit is found in the old cache file, it is copied to
the current cache file.
- When the current cache file size exceeds one half the cache size, the
first cache file is recreated and designated the "current" cache
file. The second cache file becomes the "old" cache file.
and so on.
Persistent cache files are created in the directory named in the
'var' argument to the ClientStorage (see ClientStorage.txt) or in
the 'var' subdirectory of the directory given by the INSTANCE_HOME
builtin (created by Zope), or in the current working directory.
Persistent cache files have names of the form::
cstorage-client-n.zec
where:
storage -- the storage name
client -- the client name, as given by the 'ZEO_CLIENT' environment
variable or the 'client' argument provided when creating a client
storage.
n -- '0' for the first cache file and '1' for the second.
For example, the second cache file for storage 'spam' and client 8881
would be named 'cspam-8881-1.zec'.
This diff is collapsed.
ZEO Client Cache Tracing
========================
An important question for ZEO users is: how large should the ZEO
client cache be? ZEO 2 (as of ZEO 2.0b2) has a new feature that lets
you collect a trace of cache activity and tools to analyze this trace,
enabling you to make an informed decision about the cache size.
Don't confuse the ZEO client cache with the Zope object cache. The
ZEO client cache is only used when an object is not in the Zope object
cache; the ZEO client cache avoids roundtrips to the ZEO server.
Enabling Cache Tracing
----------------------
To enable cache tracing, set the environment variable ZEO_CACHE_TRACE
to the name of a file to which the ZEO client process can write. ZEO
will append a hyphen and the storage name to the filename, to
distinguish different storages. If the file doesn't exist, the ZEO
will try to create it. If there are problems with the file, a log
message is written to the standard Zope log file. To start or stop
tracing, the ZEO client process (typically a Zope application server)
must be restarted.
The trace file can grow pretty quickly; on a moderately loaded server,
we observed it growing by 5 MB per hour. The file consists of binary
records, each 24 bytes long; a detailed description of the record
lay-out is given in stats.py. No sensitive data is logged.
Analyzing a Cache Trace
-----------------------
The stats.py command-line tool is the first-line tool to analyze a
cache trace. Its default output consists of two parts: a one-line
summary of essential statistics for each segment of 15 minutes,
interspersed with lines indicating client restarts and "cache flip
events" (more about those later), followed by a more detailed summary
of overall statistics.
The most important statistic is probably the "hit rate", a percentage
indicating how many requests to load an object could be satisfied from
the cache. Hit rates around 70% are good. 90% is probably close to
the theoretical maximum. If you see a hit rate under 60% you can
probably improve the cache performance (and hence your Zope
application server's performance) by increasing the ZEO cache size.
This is normally configured using the cache_size keyword argument to
the ClientStorage() constructor in your custom_zodb.py file. The
default cache size is 20 MB.
The stats.py tool shows its command line syntax when invoked without
arguments. The tracefile argument can be a gzipped file if it has a
.gz extension. It will read from stdin (assuming uncompressed data)
if the tracefile argument is '-'.
Simulating Different Cache Sizes
--------------------------------
Based on a cache trace file, you can make a prediction of how well the
cache might do with a different cache size. The simul.py tool runs an
accurate simulation of the ZEO client cache implementation based upon
the events read from a trace file. A new simulation is started each
time the trace file records a client restart event; if a trace file
contains more than one restart event, a separate line is printed for
each simulation, and line with overall statistics is added at the end.
Example, assuming the trace file is in /tmp/cachetrace.log::
$ python simul.py -s 100 /tmp/cachetrace.log
START TIME DURATION LOADS HITS INVALS WRITES FLIPS HITRATE
Sep 4 11:59 38:01 59833 40473 257 20 2 67.6%
$
This shows that with a 100 MB cache size, the cache hit rate is
67.6%. So let's try this again with a 200 MB cache size::
$ python simul.py -s 200 /tmp/cachetrace.log
START TIME DURATION LOADS HITS INVALS WRITES FLIPS HITRATE
Sep 4 11:59 38:01 59833 40921 258 20 1 68.4%
$
This showed hardly any improvement. So let's try a 300 MB cache
size::
$ python2.0 simul.py -s 300 /tmp/cachetrace.log
ZEOCacheSimulation, cache size 300,000,000 bytes
START TIME DURATION LOADS HITS INVALS WRITES FLIPS HITRATE
Sep 4 11:59 38:01 59833 40921 258 20 0 68.4%
$
This shows that for this particular trace file, the maximum attainable
hit rate is 68.4%. This is probably caused by the fact that nearly a
third of the objects mentioned in the trace were loaded only once --
the cache only helps if an object is loaded more than once.
The simul.py tool also supports simulating different cache
strategies. Since none of these are implemented, these are not
further documented here.
Cache Flips
-----------
The cache uses two files, which are managed as follows:
- Data are written to file 0 until file 0 exceeds limit/2 in size.
- Data are written to file 1 until file 1 exceeds limit/2 in size.
- File 0 is truncated to size 0 (or deleted and recreated).
- Data are written to file 0 until file 0 exceeds limit/2 in size.
- File 1 is truncated to size 0 (or deleted and recreated).
- Data are written to file 1 until file 1 exceeds limit/2 in size.
and so on.
A switch from file 0 to file 1 is called a "cache flip". At all cache
flips except the first, half of the cache contents is wiped out. This
affects cache performance. How badly this impact is can be seen from
the per-15-minutes summaries printed by stats.py. The -i option lets
you choose a smaller summary interval which shows the impact more
acutely.
The simul.py tool shows the number of cache flips in the FLIPS column.
If you see more than one flip per hour the cache may be too small.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment