    fixup! ZBigFile: Add ZBlk format option 'h' (heuristic) · 0c6f0850
    Kirill Smelkov authored
    Rework the benchmark:
    
    - Clean up the benchmark directory after the benchmark run.
      I'm running out of memory after several benchmark runs because my /tmp is on
      tmpfs, and in general it is a leak not to clean up after a test run.
    
    - Use only [size]int instead of [size][2]int as the test array.
      Beyond the major dimension, array shape is orthogonal to testing how storage behaves.
    
    - Benchmark both append and random write workloads, not only append.
      It is generally good to run the benchmark and have the full set of numbers.
    
    - Fix off-by-one error in accessrand: random.randint(a,b) returns a value in
      [a,b], not [a,b), so A[randint(0, arraysize)] can result in A[arraysize],
      which goes beyond the last array element.
      Also replace arraysize with len(A) for clarity.
    
    - Rework the read access benchmark to robustly never access the same block twice.
      Previously the code just set niter=10 and hoped that no block would be
      hit more than once, but the benchmarks do not have that many blocks, and
      blindly selecting 10 of them at random starts to overlap.
      The updated code makes sure to load any block at most once.
    
    - Do not manually set sys.path when running the benchmark:
      When tests are run, wendelin.core is expected to be installed in development
      mode via e.g. `pip install -e`, or, under buildout, via a custom python
      interpreter that has the wendelin.core egg on its path. This way the path setup
      is already such that import wendelin.core works. If that were not
      the case, we would have to adjust sys.path in every test or demo program.
    
    - Use a unified benchmarking format for the output, so that tools like benchstat
      can be used to aggregate and compare results.
    
    - Remove code to raise RLIMIT_NOFILE.
      In the benchmark we use only one array, and the number of file
      descriptors needed is proportional to the number of arrays used. In other
      words, the benchmark should not be a heavy user of file descriptors.

      With `ulimit -n 20` the benchmarks run just fine, while the system
      default is usually 1024 or similar.
    
    - Remove usage of bash - the benchmark spawns processes from itself via python code.
    
    - Restructure the code for clarity.
    
    - Rename the benchmark to start with bench_ similarly to other existing
      benchmarks.
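
The cleanup item above can be sketched as follows. This is only a minimal illustration and not the code from the commit: `run_in_tmpdir` and `example_bench` are hypothetical names, and the workload is a stand-in that writes a single small file.

```python
import os
import shutil
import tempfile

def run_in_tmpdir(bench):
    """Run bench(workdir) in a fresh temporary directory and always remove it."""
    workdir = tempfile.mkdtemp(prefix="bench_")
    try:
        return bench(workdir)
    finally:
        # Runs even if bench raises, so nothing is left behind on a tmpfs /tmp.
        shutil.rmtree(workdir, ignore_errors=True)

def example_bench(workdir):
    # Hypothetical workload: write one small file into the work directory.
    path = os.path.join(workdir, "data.bin")
    with open(path, "wb") as f:
        f.write(b"\0" * 1024)
    return os.path.getsize(path)

size = run_in_tmpdir(example_bench)
```

Putting the removal in `finally` is what keeps repeated runs from accumulating data when a run fails midway.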
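
The off-by-one in the accessrand item comes from `random.randint` including its upper bound. A minimal sketch of the safe form, assuming a plain list `A` as a stand-in for the benchmark array and a hypothetical helper `random_index`:

```python
import random

A = list(range(8))  # stand-in for the benchmark array (assumption)

def random_index(seq):
    # random.randint(a, b) can return b itself, so the upper bound must be
    # len(seq) - 1; random.randrange(len(seq)) is an equivalent alternative.
    return random.randint(0, len(seq) - 1)

# Every index drawn this way is valid, so indexing never raises IndexError.
values = [A[random_index(A)] for _ in range(1000)]
```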
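
One way to guarantee that no block is read twice is to sample block indices without replacement. The sketch below shows that idea only; it is not necessarily how the reworked benchmark does it, and `blocks_to_read`, `nblocks` and `nmax` are hypothetical names.

```python
import random

def blocks_to_read(nblocks, nmax):
    """Return up to nmax distinct block indices in random order."""
    # random.sample draws without replacement, so no index repeats;
    # min() caps the request when there are fewer blocks than nmax.
    return random.sample(range(nblocks), min(nmax, nblocks))

few = blocks_to_read(4, 10)     # only 4 blocks exist, so all 4 are returned once
many = blocks_to_read(100, 10)  # 10 distinct blocks out of 100
```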
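
benchstat reads result lines in the Go benchmark text format: a name starting with `Benchmark`, an iteration count, then value/unit pairs. A hedged sketch of emitting such a line follows; the benchmark name, the `s/op` unit and the numbers are made up for illustration and are not taken from the actual benchmark output.

```python
def format_result(name, niter, seconds_per_op):
    # One result line: "Benchmark<Name> <iterations> <value> <unit>".
    return "Benchmark%s %d %.3f s/op" % (name, niter, seconds_per_op)

line = format_result("WriteAppend", 10, 0.125)
print(line)  # prints: BenchmarkWriteAppend 10 0.125 s/op
```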
bench_zblkfmt 8.11 KB