• Sergei Golubchik's avatar
    mhnsw: inter-statement shared cache · 1dc75e2e
    Sergei Golubchik authored
    * preserve the graph in memory between statements
    * keep it in a TABLE_SHARE, available for concurrent searches
    * nodes are generally read-only, walking the graph doesn't change them
    * distance to target is cached, calculated only once
    * SIMD-optimized bloom filter detects visited nodes
    * nodes are stored in an array, not List, to better utilize bloom filter
    * auto-adjusting heuristic to estimate the number of visited nodes
      (to configure the bloom filter)
    * many threads can concurrently walk the graph. MEM_ROOT and Hash_set
      are protected with a mutex, but walking doesn't need them
    * up to 8 threads can concurrently load nodes into the cache,
      nodes are partitioned into 8 mutexes (8 is chosen arbitrarily, might
      need tuning)
    * concurrent editing is not supported though
    * this is fine for MyISAM, TL_WRITE protects the TABLE_SHARE and the
      graph (note that TL_WRITE_CONCURRENT_INSERT is not allowed, because an
      INSERT into the main table means multiple UPDATEs in the graph)
    * InnoDB uses secondary transaction-level caches linked in a list in
      in thd->ha_data via a fake handlerton
    * on rollback the secondary cache is discarded, on commit nodes
      from the secondary cache are invalidated in the shared cache
      while it is exclusively locked
    * on savepoint rollback both caches are flushed. this can be improved
      in the future with a row visibility callback
    * graph size is controlled by @@mhnsw_cache_size, the cache is flushed
      when it reaches the threshold
    1dc75e2e
sql_base.cc 323 KB