tests/cluster: speedup waiting a bit
Kirill Smelkov authored
NEO functional tests use pdb.wait() in a few places, for example in
NEOCluster .run(), .start() and .expectCondition(). The wait
implementation uses polling with an exponentially growing wait period.

With the following instrumentation

	--- a/neo/tests/cluster.py
	+++ b/neo/tests/cluster.py
	@@ -236,6 +236,7 @@ def wait(self, test, timeout):
	                         return False
	             finally:
	                 cluster_dict.release()
	+            print 'next_sleep:', next_sleep
	             sleep(next_sleep)
	         return True

during execution of functional tests it is not uncommon to see the
following sleep periods

	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.15
	next_sleep: 0.225
	next_sleep: 0.3375
	next_sleep: 0.50625
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.15
	next_sleep: 0.225
	next_sleep: 0.3375
	next_sleep: 0.50625
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.15
	next_sleep: 0.225
	next_sleep: 0.3375
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.1
	next_sleep: 0.15
	next_sleep: 0.225
	next_sleep: 0.3375
	next_sleep: 0.50625

.

Without going as far as reworking the wait mechanism to use real
notifications instead of polling, it was observed that the exponential
progression tends to produce sleeps that are too coarse. The initial 0.1s
interval was also found to be too long.
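
To illustrate why a growing period is coarse, the sketch below simulates when a poller first notices a condition that becomes true at a given moment. It is not the code from neo/tests/cluster.py: the growth factor of 1.5 is inferred from the printed values above (0.1, 0.15, 0.225, ...), the fixed 0.01s period is an assumption (one tenth of the initial 0.1s), and all function names are hypothetical.

```python
import itertools


def growing(initial=0.1, factor=1.5):
    """Yield sleep periods that grow geometrically (factor inferred from the log)."""
    period = initial
    while True:
        yield period
        period *= factor


def fixed(period=0.01):
    """Yield the same (assumed) sleep period forever."""
    return itertools.repeat(period)


def detection_time(ready_at, periods):
    """Simulated time at which a poller first sees a condition that turns
    true at `ready_at`, when it checks, then sleeps for the next period."""
    now = 0.0
    for period in periods:
        if now >= ready_at:
            return now
        now += period
    return None  # only reached if `periods` is finite and runs out


if __name__ == "__main__":
    for name, schedule in (("growing", growing()), ("fixed", fixed())):
        seen = detection_time(0.6, schedule)
        print("%s: noticed at %.4fs, %.4fs late" % (name, seen, seen - 0.6))
```

With a condition that becomes true at 0.6s, the growing schedule checks at 0, 0.1, 0.25, 0.475 and 0.8125s, so the poller is more than 0.2s late, while a fixed short period is late by at most about one period.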

This patch removes the exponential period growth and reduces the period by
one order of magnitude. On my computer the functional test timings are:

before patch:

	Functional tests

	28 Tests, 0 Failed

	Title                     : TestRunner
	Date                      : 2018-04-04
	Node                      : deco
	Machine                   : x86_64
	System                    : Linux
	Python                    : 2.7.14

	Directory                 : /tmp/neo_tests/1522868674.115798
	Status                    : 100.000%
	NEO_TESTS_ADAPTER         : SQLite

	                               NEO TESTS REPORT

	              Test Module |  run  | unexpected | expected | skipped |  time
	--------------------------+-------+------------+----------+---------+----------
	                   Client |    6  |       .    |      .   |     .   |   8.51s
	                  Cluster |    7  |       .    |      .   |     .   |   9.84s
	                   Master |    4  |       .    |      .   |     .   |   9.68s
	                  Storage |   11  |       .    |      .   |     .   |  20.76s
	--------------------------+-------+------------+----------+---------+----------
	     neo.tests.functional |       |            |          |         |
	--------------------------+-------+------------+----------+---------+----------
	                  Summary |   28  |       .    |      .   |     .   |  48.79s
	--------------------------+-------+------------+----------+---------+----------

after patch:

	Functional tests

	28 Tests, 0 Failed

	Title                     : TestRunner
	Date                      : 2018-04-04
	Node                      : deco
	Machine                   : x86_64
	System                    : Linux
	Python                    : 2.7.14

	Directory                 : /tmp/neo_tests/1522868527.624376
	Status                    : 100.000%
	NEO_TESTS_ADAPTER         : SQLite

	                               NEO TESTS REPORT

	              Test Module |  run  | unexpected | expected | skipped |  time
	--------------------------+-------+------------+----------+---------+----------
	                   Client |    6  |       .    |      .   |     .   |   7.38s
	                  Cluster |    7  |       .    |      .   |     .   |   7.05s
	                   Master |    4  |       .    |      .   |     .   |   8.22s
	                  Storage |   11  |       .    |      .   |     .   |  19.22s
	--------------------------+-------+------------+----------+---------+----------
	     neo.tests.functional |       |            |          |         |
	--------------------------+-------+------------+----------+---------+----------
	                  Summary |   28  |       .    |      .   |     .   |  41.87s
	--------------------------+-------+------------+----------+---------+----------

In other words, a ~14% improvement (48.79s to 41.87s) in the total time to run the functional tests.
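
The overall and per-module gains can be recomputed from the two reports above. The sketch below copies the timings by hand and is only an illustration; `improvement` is a name introduced here, not part of NEO.

```python
# Timings in seconds, copied from the "before patch" and "after patch" reports.
BEFORE = {"Client": 8.51, "Cluster": 9.84, "Master": 9.68,
          "Storage": 20.76, "Summary": 48.79}
AFTER = {"Client": 7.38, "Cluster": 7.05, "Master": 8.22,
         "Storage": 19.22, "Summary": 41.87}


def improvement(before, after):
    """Return the relative time reduction as a percentage of `before`."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for module in ("Client", "Cluster", "Master", "Storage", "Summary"):
        print("%-8s %5.1f%%" % (module, improvement(BEFORE[module], AFTER[module])))
```

The Cluster module benefits most, which is consistent with it relying heavily on the wait helper, although that reading is an interpretation rather than something the reports state.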
2bef65b7

NEO is a distributed, redundant and scalable implementation of ZODB API. NEO stands for Nexedi Enterprise Object.

Overview

A NEO cluster is composed of the following types of nodes:

  • "master" nodes (mandatory, 1 or more)

    Take care of transactionality. Only one master node is active at any given time (it is called the "primary master"); extra masters are spares (called "secondary masters").

  • "storage" nodes (mandatory, 1 or more)

    Stores data, preserving history. All available storage nodes are in use simultaneously. This offers redundancy and data distribution. Available backends: MySQL (InnoDB, RocksDB or TokuDB), SQLite

  • "admin" nodes (mandatory for startup, optional after)

    Accepts commands from neoctl tool and transmits them to the primary master, and monitors cluster state.

  • "client" nodes

    Well... Something needing to store/load data in a NEO cluster.

ZODB API is fully implemented except:

  • pack: only old revisions of objects are removed (it should be possible to use zc.zodbdgc for garbage collection)
  • blobs: not implemented (not considered yet)

Any ZODB storage, such as FileStorage, can be converted to NEO instantaneously, which means the database is operational before all data are imported. There's also a tool to convert back to FileStorage.

For more detailed information about features related to scalability, see the Architecture and Characteristics section of https://neo.nexedi.com/.

Requirements

Installation

  1. NEO can be installed like any other egg (see setup.py). Alternatively, you can simply make the neo directory available for Python to import (for example, by adding its container directory to the PYTHONPATH environment variable).

  2. Write a neo.conf file like the example provided. If you use MySQL, you'll also need to create one database per storage node.

  3. Start all required nodes:

    $ neomaster -f neo.conf
    $ neostorage -f neo.conf -s storage1
    $ neostorage -f neo.conf -s storage2
    $ neoadmin -f neo.conf

  4. Tell the cluster to initialize storage nodes:

    $ neoctl -a <admin> start

  5. Clients can connect when the cluster is in RUNNING state:

    $ neoctl -a <admin> print cluster
    RUNNING

  6. See the importer.conf file to import an existing database, or the neoctl command for more administrative tasks.

Alternatively, you can use the neosimple command to quickly set up a cluster for testing.
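
When the startup steps above are scripted, the check from step 5 can be automated. The following is a minimal sketch, assuming neoctl is on the PATH and prints only the state name, as in the example output above. The function names, the timeout and the polling period are hypothetical, and the admin address must be supplied by the caller.

```python
import subprocess
import time


def cluster_state(admin, run=subprocess.check_output):
    """Return the state printed by `neoctl -a <admin> print cluster`.

    `run` is injectable for testing; by default a failing neoctl raises
    subprocess.CalledProcessError.
    """
    out = run(["neoctl", "-a", admin, "print", "cluster"])
    if isinstance(out, bytes):
        out = out.decode()
    return out.strip()


def wait_running(admin, timeout=30.0, period=0.5,
                 run=subprocess.check_output, sleep=time.sleep):
    """Poll the cluster state until it is RUNNING or `timeout` seconds pass."""
    deadline = time.time() + timeout
    while True:
        if cluster_state(admin, run) == "RUNNING":
            return True
        if time.time() >= deadline:
            return False
        sleep(period)
```

A deployment script could call `wait_running(admin_address)` after `neoctl -a <admin> start` and only launch clients once it returns True.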

How to use

First make sure Python can import the 'neo.client' package.

In zope

  1. Edit your zope.conf, add a neo import and edit the zodb_db section by replacing its filestorage subsection by a NEOStorage one. It should look like:

    %import neo.client
    <zodb_db main>
        <NEOStorage>
            master_nodes 127.0.0.1:10000
            name <cluster name>
        </NEOStorage>
        mount-point /
    </zodb_db>

  2. Start zope

In a Python script

Just create the storage object and play with it:

from neo.client.Storage import Storage
s = Storage(master_nodes="127.0.0.1:10010", name="main")
...

The "name" and "master_nodes" parameters have the same meaning as in the configuration file.
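
As an illustration of "playing with" the storage, the sketch below wraps it in a regular ZODB database and commits one change. It assumes the ZODB and transaction packages are installed and that a cluster named "main" is reachable at the address used in the example above. The function names and the "counter" key are hypothetical. Imports are kept inside the functions so that loading the sketch does not itself require NEO.

```python
def open_root(master_nodes="127.0.0.1:10010", name="main"):
    """Open a ZODB database on top of a NEO storage and return (db, conn, root)."""
    from ZODB import DB
    from neo.client.Storage import Storage

    storage = Storage(master_nodes=master_nodes, name=name)
    db = DB(storage)
    conn = db.open()
    return db, conn, conn.root()


def bump_counter():
    """Increment a counter stored in the root object and commit it."""
    import transaction

    db, conn, root = open_root()
    try:
        root["counter"] = root.get("counter", 0) + 1
        transaction.commit()
        return root["counter"]
    finally:
        conn.close()
        db.close()  # also closes the underlying storage
```

Calling `bump_counter()` against a running cluster should return a value that grows by one on each call, since the counter is persisted in NEO.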

Shutting down

Before shutting down NEO, all clients, such as Zope instances, should be stopped so that the cluster becomes idle. This is required for multi-DB setups, to prevent critical failures in the second phase of TPC (two-phase commit).

A cluster (i.e. masters+storages+admin) can be stopped gracefully by putting it in STOPPING state using neoctl:

neoctl -a <admin> set cluster STOPPING

This can also be done manually, which helps if your cluster is in a bad state:

  • Stop all master nodes first with a SIGINT or SIGTERM, so that storage nodes don't enter the OUT_OF_DATE state.
  • Next stop remaining nodes with a SIGINT or SIGTERM.

Master-slave asynchronous replication

This is the recommended way to back up a NEO cluster. Once a cluster with appropriate upstream_cluster & upstream_masters configuration is started, you can switch it into backup mode using:

neoctl -a <admin> set cluster STARTING_BACKUP

It remembers it is in such mode when it is stopped, and it can be put back into normal mode (RUNNING) by setting it into STOPPING_BACKUP state.

Packs are currently not replicated, which means packing should always be done up to a TID that is already fully replicated, so that the backup cluster has a full history (and not random holes).

SSL support

In addition to any external solution like OpenVPN, NEO has builtin SSL support to authenticate and encrypt communications between nodes.

All commands and configuration files have options to specify a CA certificate, the node certificate and the node private key. A certificate can be shared by several nodes.

NEO always uses the latest SSL protocol supported by the Python interpreter, without fallback to older versions. A "SSL: WRONG_VERSION_NUMBER" error means that a node runs in an older environment (Python + OpenSSL) than others.

Note also that you can't mix non-SSL nodes and SSL nodes, even between an upstream cluster and a backup one. If you do, connections can get stuck, or fail with malformed packets or SSL handshake errors.

Deployment

NEO has no built-in deployment features such as process daemonization. We use supervisor with a configuration like the one below:

[group:neo]
programs=master_01,storage_01,admin

[program:storage_01]
priority=10
command=neostorage -s storage_01 -f /neo/neo.conf

[program:master_01]
priority=20
command=neomaster -s master_01 -f /neo/neo.conf

[program:admin]
priority=20
command=neoadmin -s admin -f /neo/neo.conf

Developers

Developers interested in NEO may refer to the NEO web site and subscribe to the following mailing lists:

Automated test results are published at https://www.erp5.com/quality/integration/P-ERP5.Com.Unit%20Tests/Base_viewListMode?proxy_form_id=WebSection_viewERP5UnitTestForm&proxy_field_id=listbox&proxy_field_selection_name=WebSection_viewERP5UnitTestForm_listbox_selection&reset=1&listbox_title=NEO-%25

Commercial Support

Nexedi provides commercial support for NEO: https://www.nexedi.com/