Commit 332ca39e authored by Tim Peters's avatar Tim Peters

Merge rev 29291 from ZODB 3.3 branch.

Port from ZODB 3.2.

Give fsIndex an efficient maxKey() implementation.

This will (in a later checkin) be used to give FileStorage an "obviously
correct" way to determine the largest oid used in an .fs.index file.
parent 139dafbc
......@@ -96,6 +96,15 @@ This happens if an attempt is made to add distinct objects to the cache
that have the same oid (object identifier). ZODB should never do this,
but it's possible for application code to force such an attempt.
fsIndex
-------
An efficient ``maxKey()`` method was implemented for the ``fsIndex`` class.
This makes it possible to determine the largest oid in a ``FileStorage``
index efficiently, directly, and reliably, replacing a more delicate scheme
that tried to keep track of this by saving an oid high water mark in the
index file and incrementally updating it.
What's new in ZODB3 3.3.1a1?
============================
......
......@@ -24,7 +24,7 @@
# 2. We limit the data size to 48 bits. This should allow databases
# as large as 256 terabytes.
#
# Mostof the space is consumed by items in the mappings from 2-byte
# Most of the space is consumed by items in the mappings from 2-byte
# suffix to 6-byte data. This should reduce the overall memory usage to
# 8-16 bytes per OID.
#
......@@ -125,3 +125,32 @@ class fsIndex(object):
def values(self):
return list(self.itervalues())
def maxKey(self):
# This is less general than the BTree method of the same name: we
# only care about the largest key in the entire tree. By
# construction, that's the largest oid in use in the associated
# FileStorage.
keys = self._data.keys()
if not keys:
# This is the same exception a BTree maxKey() raises when the
# tree is empty.
raise ValueError("empty tree")
# We expect that keys is small, since each fsBTree in _data.values()
# can hold as many as 2**16 = 64K entries. So this max() should go
# fast too. Regardless, there's no faster way to find the largest
# prefix.
biggest_prefix = max(keys)
tree = self._data[biggest_prefix]
# Obscure: what if tree is actually empty? We're relying here on
# that this class doesn't implement __delitem__: once a key gets
# into an fsIndex, the only way it can go away is by invoking
# clear(). Therefore nothing in _data.values() is ever empty.
#
# Note that because `tree` is an fsBTree, its maxKey() method is very
# efficient.
assert tree
return biggest_prefix + tree.maxKey()
......@@ -12,6 +12,8 @@
#
##############################################################################
import unittest
import random
from ZODB.fsIndex import fsIndex
from ZODB.utils import p64
......@@ -111,6 +113,22 @@ class Test(unittest.TestCase):
for i, item in enumerate(items):
self.assertEqual(item, (p64(i * 1000), (i * 1000L + 1)))
def testMaxKey(self):
index = self.index
index.clear()
# An empty index should complain.
self.assertRaises(ValueError, index.maxKey)
# Now build up a tree with random values, and check maxKey at each
# step.
correct_max = "" # smaller than anything we'll add
for i in range(1000):
key = p64(random.randrange(100000000))
index[key] = i
correct_max = max(correct_max, key)
index_max = index.maxKey()
self.assertEqual(index_max, correct_max)
def test_suite():
loader=unittest.TestLoader()
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment