setup.py · 05bf48de48a6feafc81ba340c8770f8aea62065a · Levin Zimmermann / neoppod

importer: fetch and process the data to import in a separate process · 05bf48de

Julien Muchembled authored May 02, 2018

A new subprocess is used to:
- fetch data from the source DB
- repickle to change oids (when merging several DBs)
- compress
- checksum

This is mostly useful for the second step, which is much slower than any other
step and does not release the GIL.

By using a second CPU core, it is also often possible to use a higher
compression level for free (e.g. zlib=9). Smaller data can in turn speed up
the writing process.
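
To judge what a higher level buys on a given dataset, one can compare output
sizes per zlib level. This is only an illustrative sketch: the helper name
`compressed_sizes` is hypothetical and not part of the importer.

```python
import zlib


def compressed_sizes(data, levels=(1, 6, 9)):
    """Map each zlib level to the size in bytes of the compressed output.

    Hypothetical helper for illustration; not taken from the importer code.
    """
    return {level: len(zlib.compress(data, level)) for level in levels}
```

Calling it on a representative sample of records shows whether level 9 saves
enough bytes to be worth the extra CPU time spent in the subprocess.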

In addition to greatly speeding up the import by parallelizing fetch+process
with write, it also makes the main process more responsive to queries from
client nodes.
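
The overall shape can be sketched as follows. This is a minimal illustration,
not the code of this commit: the names `process`, `_worker` and
`import_records` are hypothetical, the repickling step is left out, and the
choice of SHA-1 computed over the compressed data is an assumption.

```python
import hashlib
import zlib
from multiprocessing import Pipe, Process


def process(oid, data, level=9):
    """Compress one record and checksum the compressed form.

    Assumption: SHA-1 over the compressed bytes; the real importer may differ.
    """
    compressed = zlib.compress(data, level)
    checksum = hashlib.sha1(compressed).digest()
    return oid, compressed, checksum


def _worker(conn, records):
    """Child process: process every record, then send None as end marker."""
    try:
        for oid, data in records:
            conn.send(process(oid, data))
        conn.send(None)
    finally:
        conn.close()


def import_records(records, write):
    """Parent process: write each record as soon as the child produces it."""
    parent_conn, child_conn = Pipe(duplex=False)  # parent reads, child writes
    child = Process(target=_worker, args=(child_conn, records))
    child.start()
    child_conn.close()  # the parent keeps only the reading end
    for oid, compressed, checksum in iter(parent_conn.recv, None):
        write(oid, compressed, checksum)
    child.join()
```

Here `records` is any iterable of `(oid, data)` pairs that can be handed to
the child process (a plain list is the safe choice) and `write` is a callback
that stores one processed record. Because the CPU-bound work runs in another
process, it does not hold the GIL of the process that writes.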


setup.py 3.4 KB
