Commit a893b326 authored by Dylan Griffith

Set 1s server side timeout on Elasticsearch counts

These count requests are loaded one per tab every time the search page
loads. This means a single search for one type of document will trigger
up to 7 other searches just to get the counts for the other tabs.

These tab counts are often very expensive requests too, especially
relative to the cheaper searches. For example, an issue search may take
1s while a blobs count takes 30s. Because the thread pool on the
Elasticsearch side is limited, we regularly see these count queries
cause queuing, which slows down otherwise fast searches on GitLab.com.

As such we want to set a timeout on these. For now this is only a
server side Elasticsearch timeout, which is a soft limit: Elasticsearch
is asynchronous, so it may take longer than the timeout to realise the
query has timed out and cancel it. As such we may see searches take a
few seconds before they time out even though the timeout is 1s. This is
not perfect, but benchmarking in the related issue shows it can still
drastically improve throughput, and it is one of the easiest steps to
take now.

One thing to note about this approach is that users will still see a
count in the event of a timeout. The count may be partial and therefore
lower than the true count. If they switch to the tab they will see the
true count. I think this is probably still better than displaying
nothing, since the main value of the tab counts is showing whether or
not there are any results on that tab at all.
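
One way a caller could tell a partial count from a complete one is the
`timed_out` flag that Elasticsearch includes in search responses. The
following is a minimal sketch, assuming the response is available as a
plain Hash with string keys; `count_from_response` is a hypothetical
helper and is not part of this change.

```ruby
# Hypothetical helper, not part of this commit: extract a tab count from an
# Elasticsearch search response and report whether it may be partial.
def count_from_response(response)
  total = response.dig('hits', 'total')
  # Elasticsearch 7+ reports the total as { "value" => n, "relation" => ... };
  # older versions report a bare integer.
  count = total.is_a?(Hash) ? total['value'] : total
  { count: count.to_i, partial: response['timed_out'] == true }
end

# Example response shaped like a count-only search that hit the 1s timeout.
response = {
  'timed_out' => true,
  'hits' => { 'total' => { 'value' => 42, 'relation' => 'eq' } }
}
result = count_from_response(response)
puts "count=#{result[:count]} partial=#{result[:partial]}"
```

A UI could use the `partial` flag to render the count differently, though
this commit deliberately keeps showing the plain number.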

Later we may wish to introduce client side timeouts on our ES client,
but this is trickier to accomplish since we use a single client
configuration which has a global timeout for all Elasticsearch queries.
Additionally, client side timeouts will result in errors that we may
wish to handle specially to show some indicator on the tab.
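
The error handling mentioned above could look roughly like the sketch
below. It is an illustration only: `count_with_deadline` is a
hypothetical wrapper, and Ruby's stdlib `Timeout` is used just to keep
the example self-contained. A real implementation would more likely rely
on the HTTP client's own request timeout.

```ruby
require 'timeout'

# Hypothetical wrapper, not part of this commit: run a count request with a
# client side deadline and return nil when the deadline is exceeded, so the
# caller can render an indicator on the tab instead of a number.
def count_with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  nil
end

# A fast count returns its value; a slow one is cut off and yields nil.
fast = count_with_deadline(1) { 12 }
slow = count_with_deadline(0.05) { sleep(1); 12 }
puts "fast=#{fast.inspect} slow=#{slow.inspect}"
```

Unlike the server side timeout, this path never produces a partial count,
which is why it needs its own handling in the UI.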

Read more at https://gitlab.com/gitlab-org/gitlab/-/issues/301146
parent 6f28cd1d
---
title: Set 1s server side timeout on Elasticsearch counts
merge_request: 53435
author:
type: performance
@@ -10,6 +10,11 @@ module Elastic
 def search(query, search_options = {})
   es_options = routing_options(search_options)
+  # Counts need to be fast as we load one count per type of document
+  # on every page load. Fail early if they are slow since they don't
+  # need to be accurate.
+  es_options[:timeout] = '1s' if search_options[:count_only]
   # Calling elasticsearch-ruby method
   super(query, es_options)
 end
@@ -40,6 +40,7 @@ RSpec.shared_examples 'does not load results for count only queries' do |scopes|
       expect(request.dig(:body, :size)).to eq(0)
       expect(request.dig(:body, :query, :bool, :must)).to be_blank
       expect(request[:highlight]).to be_blank
+      expect(request.dig(:params, :timeout)).to eq('1s')
     end
   end
 end
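
To try the behaviour of the change outside the GitLab codebase, the
override can be reproduced in isolation. This is a sketch under stated
assumptions: `FakeBaseSearch` stands in for the elasticsearch-ruby method
reached through `super`, and the `routing_options` body is a hypothetical
simplification, since the real one is not shown in the diff.

```ruby
# Stand-in for the library method that the real code reaches via `super`.
# It simply echoes back what it received.
module FakeBaseSearch
  def search(query, options = {})
    { query: query, options: options }
  end
end

class CountAwareSearcher
  include FakeBaseSearch

  def search(query, search_options = {})
    es_options = routing_options(search_options)
    # Same rule as the commit: only count-only requests get the 1s timeout.
    es_options[:timeout] = '1s' if search_options[:count_only]
    super(query, es_options)
  end

  private

  # Hypothetical simplification of the real routing_options helper.
  def routing_options(search_options)
    search_options[:routing] ? { routing: search_options[:routing] } : {}
  end
end

searcher = CountAwareSearcher.new
p searcher.search({ match_all: {} }, { count_only: true })[:options]
p searcher.search({ match_all: {} }, {})[:options]
```

The first call carries the `timeout` option and the second does not, which
is the same property the spec line added above asserts on the request params.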