client: fix possible data corruption after conflict resolutions with replicas
This really fixes the bug described in commit 40bac312, which could probably be reverted. It only reduced the probability of failure. What happened is that the second conflict on 'a' for t3 what first reported by an answer to first store with: - a base serial at which a=0 - a conflict serial at which a=7 However, the cached data is not 8 anymore but 12, since a second store already occurred after the first conflict (reported by the other storage node). When this conflict was resolved before receiving the conflict for second store, it gave: resolve(old=0, saved=7, new=12) -> 19 instead of: resolve(old=4, saved=7, new=12) -> 15 (if we still had the data of the first store, we could also do resolve(old=0, saved=7, new=8) but that would be inefficient from a memory point of view) The bug was difficult to reproduce. testNotifyReplicated had to be run many many times before that race conditions trigger it. The test was changed to enforce some of them, and the above scenario now happens almost always.
Showing
Please register or sign in to comment