Commit 0c6f0850 authored by Kirill Smelkov

fixup! ZBigFile: Add ZBlk format option 'h' (heuristic)

Rework the benchmark:

- cleanup benchmark directory after benchmark run.
  I'm running out of memory after several benchmark runs because my /tmp is on
  tmpfs, and it is generally a leak not to clean up after a test run.

- Use only [size]int instead of [size][2]int as test array.
  The array shape beyond the major dimension is orthogonal to testing how the
  storage behaves.

- Benchmark both append and random-write workloads, not only append.
  It is generally good to run the benchmark and have the full set of numbers.

- Fix off-by-one error in accessrand: random.randint(a,b) returns [a,b], not
  [a,b), and so using A[randint(0, arraysize)] can result in A[arraysize],
  which goes beyond the last array element.
  Also replace arraysize with len(A) for better clarity.
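
  To illustrate the difference (illustrative snippet, not part of the patch):

      random.randint(0, len(A))    # inclusive: may yield len(A) -> out of range
      random.randrange(0, len(A))  # half-open: always a valid index in [0, len(A))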

- Rework the read-access benchmark to robustly never access the same block twice.
  Previously the code just set niter=10 and hoped that no block would be hit
  more than once, but the benchmarks do not use that many blocks, and blindly
  selecting 10 of them at random starts to overlap. The updated code makes
  sure to load any block at most once.
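
  Schematically the updated code does (see the new benchmark code below):

      blkv = list(range(len(A) // blklen))  # indices of all blocks
      random.shuffle(blkv)
      ...
      blk = blkv.pop()                      # each block is read at most once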

- Do not manually set sys.path when running the benchmark:
  When tests are run, wendelin.core is expected to be installed in development
  mode via e.g. `pip install -e`, or, under buildout, via a custom python
  interpreter that has the wendelin.core egg on its path. This way the path
  setup is already such that import wendelin.core should work. And if that
  were not the case, we would have to adjust sys.path in every test and demo
  program.

- Use unified benchmarking format for the output, so that tools like benchstat
  could be used to aggregate and compare results.
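
  For example the output now looks like the following (numbers are illustrative):

      BenchmarkAppendSize/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]    1    24.0 MB
      BenchmarkAppendRandRead/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]    10    0.850 ms/blk

  and benchstat can then aggregate and compare such output from several runs.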

- Remove code to raise RLIMIT_NOFILE.
  In the benchmark we use only one array, and the number of needed file
  descriptors is proportional to the number of used arrays. In other words
  the benchmark should not be a heavy user of file descriptors.

  With `ulimit -n 20` the benchmarks run just ok, while the system
  default is usually 1024 or similar.

- Remove usage of bash - the benchmark now spawns subprocesses from itself via Python code.
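
  Each ZBlk format is benchmarked in its own subprocess along these lines:

      p = multiprocessing.Process(target=_)  # _ selects the ZBlk format and runs the workload
      p.start()
      p.join()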

- Restructure the code for clarity.

- Rename the benchmark to start with bench_ similarly to other existing
  benchmarks.
parent 4f314ee0
# Copyright (C) 2023 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
# Test to compare disk-space and access-speed of the different ZBlk format options:
#
# - ZBlk0
# - ZBlk1
# - auto
import os
import random
import resource
import tempfile
import timeit
import sys
from time import time, sleep
# Add relative module path, to run tests on local code
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', '.'))
from golang import defer, func
import numpy as np
import transaction
import ZODB, ZODB.FileStorage
from wendelin.bigarray.array_zodb import ZBigArray
ms = 1e-3
random.seed(10)
# Avoid error due to too many open file descriptors.
cur_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
new_limit = (cur_limit[1], cur_limit[1])
resource.setrlimit(resource.RLIMIT_NOFILE, new_limit)
storage_path = tempfile.mktemp(prefix='zblkbenchmark')
# Declare test parameters.
change_percentage_set = tuple(
float(n) for n in os.environ.get('change_percentage_set', '0.2').split(','))
change_count = int(os.environ.get('change_count', '1000'))
arrsize = int(os.environ.get('arrsize', '1000000'))
change_type = os.environ.get('change_type', 'setitem') # setitem or append
# Utility functions
def randarr(size=1000000):
return np.array([[random.randint(1, 1000), random.randint(1, 1000)] for _ in range(size)])
def setrand(A, blksize_length, change_percentage=1):
size = int(blksize_length * change_percentage)
blk_index = random.randint(0, int(arrsize / blksize_length) - 1)
blk_offset = blk_index * blksize_length
# Ensure we don't always only change the beginning of a block
blk_offset = blk_offset + random.randint(0, blksize_length - size)
A[blk_offset:blk_offset+size][:] = randarr(size)
transaction.commit()
def accessrand(A):
# force load of ZBlk data via reading ndarray element
A[random.randint(0, arrsize), 0]
def fillup(root):
root.A.append([[0, 0] for _ in range(arrsize)])
transaction.commit()
def change_setitem(root):
A = root.A[:]
blksize_length = get_blksize_length(root)
for _ in range(change_count):
change_percentage = random.choice(change_percentage_set)
setrand(A, blksize_length, change_percentage)
transaction.commit()
def change_append(root):
A = root.A
blksize_length = get_blksize_length(root)
for _ in range(change_count):
change_percentage = random.choice(change_percentage_set)
size = int(blksize_length * change_percentage)
A.append(randarr(size))
transaction.commit()
def get_blksize_length(root):
return root.A.zfile.blksize / 16
traceload = False
delayload = False
@func
def root(func):
storage = ZODB.FileStorage.FileStorage(storage_path)
stor_load = storage.load
stor_loadBefore = storage.loadBefore
def loadBefore(oid, tid):
if traceload:
print 'loadBefore %r %r' % (oid, tid)
# simulate loading latency as actually seen on NEO.
# there I was seeing latencies up to _1_ millisecond, but even with
# "modest" 0.2 ms it really shows in the figures.
#
# (activated only during read benchmark to avoid wasting time
# while preparing data)
if delayload:
sleep(0.2 * ms)
return stor_loadBefore(oid, tid)
def load(oid):
print 'load %r' % oid
1/0 # should not call load at all
return stor_load(oid)
storage.loadBefore = loadBefore
storage.load = load
db = ZODB.DB(storage)
connection = db.open()
root = connection.root
defer(connection.close)
defer(db.close)
defer(storage.close)
func(root)
@root
def setup(root):
root.A = A = ZBigArray(shape=[1, 2], dtype=int)
transaction.commit()
if change_type == "setitem":
root(fillup)
root(change_setitem)
elif change_type == "append":
root(change_append)
else:
raise NotImplementedError(change_type)
print("\tZODB storage size: %s MB" % (os.path.getsize(storage_path) / float(10**6)))
@root
def access(root):
global traceload, delayload
a = root.A[:] # create BigArray -> ndarray view only once
delayload = True
def _():
t0 = time()
accessrand(a)
t1 = time()
random.seed(10)
# niter should be small to avoid getting into a situation where most blocks become loaded into the cache
# and we start to measure the time of hot access without any ZODB loading
niter=10
taccess = timeit.timeit(_, number=niter) / niter
print("\tAccess time: %.3f ms / blk (initially cold; might get warmer during benchmark)" % (taccess/ms))
#!/usr/bin/env python
# Copyright (C) 2023 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
# Test to compare disk-space and access-speed of the different ZBlk format options:
#
# - ZBlk0
# - ZBlk1
# - auto
from __future__ import print_function, absolute_import, division
import os
import random
import tempfile
import timeit
import shutil
import multiprocessing
from time import sleep
ms = 1e-3
from golang import defer, func
import numpy as np
import transaction
import ZODB, ZODB.FileStorage
from wendelin.bigarray.array_zodb import ZBigArray
from wendelin.bigfile import file_zodb
# IWriteWorkLoad represents a write workload type:
class IWriteWorkLoad:
# do_write should perform all write operations of the workload on the
# database associated with root object.
def do_write(wrk, root): raise NotImplementedError()
# args should be set to string with arguments used to parameterize this workload.
args = ''
# benchwrk benchmarks database size and read speed under write workload wrk.
#
# the benchmark is run for all supported ZBlk formats.
def benchwrk(wrk):
# run each benchmark in separate process so that there is no chance they
# somehow affect each other.
zblk_fmtv = list(file_zodb.ZBlk_fmt_registry.keys())
zblk_fmtv.sort()
for zblk_fmt in zblk_fmtv:
def _():
file_zodb.ZBlk_fmt_write = zblk_fmt
_benchwrk(wrk)
p = multiprocessing.Process(target=_)
p.start()
p.join()
@func
def _benchwrk(wrk):
tmpd = tempfile.mkdtemp('', 'zblkbench')
def _():
shutil.rmtree(tmpd)
defer(_)
storage_path = '%s/data.fs' % tmpd
# with_db runs f(root) on a freshly-opened connection to the test database.
traceload = False
delayload = False
@func
def with_db(f):
storage = ZODB.FileStorage.FileStorage(storage_path)
defer(storage.close)
# simulate loading latency as actually seen on NEO.
# there I was seeing latencies up to _1_ millisecond, but even with
# "modest" 0.2 ms it really shows in the figures.
#
# (activated only during read benchmark to avoid wasting time
# while preparing data)
tloaddelay = 0.2 * ms
stor_load = storage.load
stor_loadBefore = storage.loadBefore
def loadBefore(oid, tid):
if traceload:
print('# loadBefore %r %r' % (oid, tid))
if delayload:
sleep(tloaddelay)
return stor_loadBefore(oid, tid)
def load(oid):
# load is used on plain ZODB4; ZODB5 and ZODB4-wc2 use loadBefore only
if traceload:
print('# load %r' % (oid,))
# see loadBefore above
if delayload:
sleep(tloaddelay)
return stor_load(oid)
storage.loadBefore = loadBefore
storage.load = load
db = ZODB.DB(storage) ; defer(db.close)
connection = db.open() ; defer(connection.close)
root = connection.root
f(root)
# create test database with an empty array, then run the specified write workload
# and see how big the ZODB storage becomes.
@with_db
def _(root):
root.A = ZBigArray(shape=[0], dtype=int)
transaction.commit()
random.seed(10)
wrk.do_write(root)
transaction.commit() # just in case
def emitbench(name, data):
wrkname = wrk.__class__.__name__
benchprefix = "Benchmark%s%s/zblk=%s/%s" % (wrkname, name, file_zodb.ZBlk_fmt_write, wrk.args)
print('%s\t%s' % (benchprefix, data))
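# report the resulting database size; the leading "1" is the iteration count in Go benchmark format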
emitbench("Size", "1\t%.1f MB" % (os.path.getsize(storage_path) / 1E6))
# now benchmark random reads.
delayload = True
@with_db
def _(root):
A = root.A
blklen = arr_blklen(A)
# make sure we never read the same block twice - else we will start to
# measure time of hot access without any ZODB loading
random.seed(10)
blkv = list(range(len(A) // blklen))
random.shuffle(blkv)
a = A[:]
def _():
blk = blkv.pop()
# force load of ZBlk data via reading ndarray element from inside the block
a[blk*blklen]
niter = min(len(blkv), 10)
assert niter >= 3, niter
taccess = timeit.timeit(_, number=niter) / niter
emitbench("RandRead", "%d %.3f ms/blk" % (niter, taccess/ms))
# Append simulates a workload where data is appended in chunks to the end of the array.
class Append(IWriteWorkLoad):
def __init__(wrk, change_count, change_percentage_set):
wrk.change_count = change_count
wrk.change_percentage_set = change_percentage_set
wrk.args = "change_count=%d/change_percentage_set=%s" % (
change_count, repr(change_percentage_set).replace(' ',''))
def do_write(wrk, root):
A = root.A
for _ in range(wrk.change_count):
change_percentage = random.choice(wrk.change_percentage_set)
size = int(arr_blklen(A) * change_percentage)
A.append(randarr(size))
transaction.commit()
# RandWrite simulates a workload where data is written randomly into the array.
class RandWrite(IWriteWorkLoad):
def __init__(wrk, arrsize, change_count, change_percentage_set):
wrk.arrsize = arrsize
wrk.change_count = change_count
wrk.change_percentage_set = change_percentage_set
wrk.args = "arrsize=%d/change_count=%d/change_percentage_set=%s" % (
arrsize, change_count,
repr(change_percentage_set).replace(' ',''))
def do_write(wrk, root):
A = root.A
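# initial fillup: materialize the whole array with zeros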
A.append([0]*wrk.arrsize)
transaction.commit()
for _ in range(wrk.change_count):
change_percentage = random.choice(wrk.change_percentage_set)
setrand(A, change_percentage)
transaction.commit()
# Utility functions
# randarr returns random [size]int array.
def randarr(size):
return np.array([random.randint(1, 1000) for _ in range(size)])
# setrand makes random write access to ZBigArray A.
#
# The amount of changed data is a fraction of the underlying block size.
# Only one block is changed.
def setrand(A, change_percentage):
blklen = arr_blklen(A)
change_size = int(blklen * change_percentage)
blk_index = random.randrange(0, len(A) // blklen)
blk_offset = blk_index * blklen
# Ensure we don't always only change the beginning of a block
blk_offset = blk_offset + random.randint(0, blklen - change_size)
A[blk_offset:blk_offset+change_size][:] = randarr(change_size)
# arr_blklen returns how many ZBigArray items make up a block in the underlying ZBigFile.
def arr_blklen(A):
assert isinstance(A, ZBigArray)
assert len(A.shape) == 1
assert A.zfile.blksize % A.itemsize == 0
return A.zfile.blksize // A.itemsize
# ---- benchmarks we want to run ----
def main():
_ = benchwrk
_(Append( 500, [0.014])) # appends of ~ 30K
_(RandWrite(1000000, 500, [0.2])) # small change size, so that heuristic always uses ZBlk1
_(RandWrite(1000000, 500, [1])) # big change size, so that heuristic always uses ZBlk0
_(RandWrite(1000000, 500, [0.2, 1])) # Mix between change size so that heuristic switches
# between ZBlk0 and ZBlk1
if __name__ == '__main__':
main()
#!/usr/bin/env bash
# Copyright (C) 2023 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
# Test to compare disk-space and access-speed of the different ZBlk format options:
#
# - ZBlk0
# - ZBlk1
# - auto
#
# The heuristic 'auto' should behave as well as ZBlk0 in case of wide changes
# and as well as ZBlk1 in case of small changes.
function test {
function t {
zblkfmt=$1
echo "Run tests with format $zblkfmt:"
echo ""
export WENDELIN_CORE_ZBLK_FMT=$zblkfmt
python bigfile/tests/_test_zblk_fmt
echo ""
echo ""
}
change_percentage_set=$1
change_count=$2
arrsize=$3
change_type=$4
echo "---------------------------------------------"
echo "---------------------------------------------"
echo "Set change_percentage_set to $change_percentage_set"
echo "Set change_count to $change_count"
echo "Set arrsize to $arrsize"
echo "Set change_type to $change_type"
echo ""
export change_percentage_set=$change_percentage_set
export change_count=$change_count
export arrsize=$arrsize
export change_type=$change_type
t auto
t ZBlk0
t ZBlk1
echo ""
echo "---------------------------------------------"
echo "---------------------------------------------"
echo ""
}
echo "Run append tests"
test 0.014 500 500000 "append"
# TODO(add 'small changes after initial fillup' optimization, see
# 'bigfile/file_zodb/ZBigFile_zblk_fmt_heuristic' for more details)
# echo "Run setitem tests"
#
# echo "Use only a very small change size, so that heuristic always uses ZBlk1"
# test 0.2 500 1000000 "setitem"
#
# echo "Use only a very big change size, so that heuristic always uses ZBlk0"
# test 1 500 1000000 "setitem"
#
# echo "Mix between change size so that heuristic switches between ZBlk0 and ZBlk1"
# test 0.2,1 500 1000000 "setitem"