mhnsw: closest neighbor precalc heuristic
This is based on the heuristic that if a candidate neighbor has a very close neighbor of its own, then this close neighbor is likely to be a candidate neighbor itself. This means that the loop comparing a candidate with all neighbors might be replaced by a single check, provided we know the distance between the candidate and its closest neighbor, and that distance can be precalculated.

This gives the most speedup when the number of neighbors and the number of dimensions are large. In the tests it was a 2.5-3x speedup, with the recall being worse by 0.1%-1%.

Incidentally, in the opposite case (few neighbors, few dimensions) it gives both little speedup and notably worse recall. Tests have shown a 1.13x speedup with recall going down by ~20% in the worst (smallest) case. Thus, this heuristic is only enabled above a certain threshold.
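
To illustrate the idea, here is a minimal Python sketch of one possible reading of the heuristic. It is not the mhnsw code. The function names, the `alpha` parameter, and the use of Euclidean distance are assumptions made for illustration, and the enabling threshold is not modeled. The baseline discards a candidate if it is closer to some already selected neighbor than to the target. The shortcut replaces that inner loop with a single comparison against the candidate's precalculated distance to its closest neighbor.

```python
import math


def closest_neighbor_distances(points):
    """Precalculate, for every point, the distance to its closest other point.

    Hypothetical helper: in a real graph this value would be maintained
    per node as links are added, not recomputed by brute force.
    """
    return {
        p: min(math.dist(p, q) for q in points if q != p)
        for p in points
    }


def select_with_loop(target, candidates, max_neighbors, alpha=1.0):
    """Baseline: compare each candidate with every already selected neighbor."""
    selected = []
    for cand in sorted(candidates, key=lambda c: math.dist(c, target)):
        if len(selected) >= max_neighbors:
            break
        limit = math.dist(cand, target) / alpha
        # Discard the candidate if some selected neighbor is at least
        # as close to it as the (scaled) target distance.
        if not any(math.dist(cand, n) <= limit for n in selected):
            selected.append(cand)
    return selected


def select_with_precalc(target, candidates, closest, max_neighbors, alpha=1.0):
    """Shortcut (assumed reading): one comparison instead of the inner loop."""
    selected = []
    for cand in sorted(candidates, key=lambda c: math.dist(c, target)):
        if len(selected) >= max_neighbors:
            break
        limit = math.dist(cand, target) / alpha
        # Assume the candidate's closest neighbor is itself among the
        # selected neighbors, so only its precalculated distance is checked.
        if selected and closest[cand] <= limit:
            continue
        selected.append(cand)
    return selected
```

The shortcut does one comparison per candidate instead of one distance computation per selected neighbor, which is consistent with the speedup growing with the number of neighbors and dimensions. It can also discard a candidate whose closest neighbor was never selected, which is one plausible source of the recall loss described above.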