Commit 219e7da8 authored by Dylan Griffith

Merge branch '214357-add-a-rake-task-to-create-a-new-elasticsearch-index' into 'master'

Resolve "Add a rake task to create a new ElasticSearch index"

Closes #214357

See merge request gitlab-org/gitlab!29598
parents f3e69da7 2ac9d704
......@@ -100,7 +100,7 @@ def instrument_classes(instrumentation)
instrumentation.instrument_instance_methods(Gitlab::Elastic::ProjectSearchResults)
instrumentation.instrument_instance_methods(Gitlab::Elastic::Indexer)
instrumentation.instrument_instance_methods(Gitlab::Elastic::SnippetSearchResults)
instrumentation.instrument_methods(Gitlab::Elastic::Helper)
instrumentation.instrument_instance_methods(Gitlab::Elastic::Helper)
instrumentation.instrument_instance_methods(Elastic::ApplicationVersionedSearch)
instrumentation.instrument_instance_methods(Elastic::ProjectsSearch)
......
......@@ -203,7 +203,7 @@ The best place to start is to determine if the issue is with creating an empty i
If it is, check on the Elasticsearch side to determine if the `gitlab-production` (the
name for the GitLab index) exists. If it exists, manually delete it on the Elasticsearch
side and attempt to recreate it from the
[`create_empty_index`](../../integration/elasticsearch.md#gitlab-elasticsearch-rake-tasks)
[`recreate_index`](../../integration/elasticsearch.md#gitlab-elasticsearch-rake-tasks)
Rake task.
If you still encounter issues, try creating an index manually on the Elasticsearch
......
......@@ -174,7 +174,7 @@ If no namespaces or projects are selected, no Elasticsearch indexing will take p
CAUTION: **Warning**:
If you have already indexed your instance, you will have to regenerate the index in order to delete all existing data
for filtering to work correctly. To do this run the Rake tasks `gitlab:elastic:create_empty_index` and
for filtering to work correctly. To do this run the Rake tasks `gitlab:elastic:recreate_index` and
`gitlab:elastic:clear_index_status`. Afterwards, removing a namespace or a project from the list will delete the data
from the Elasticsearch index as expected.
......@@ -406,7 +406,7 @@ There are several Rake tasks available to you via the command line:
- [`sudo gitlab-rake gitlab:elastic:index`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- This is a wrapper task. It does the following:
- `sudo gitlab-rake gitlab:elastic:create_empty_index`
- `sudo gitlab-rake gitlab:elastic:recreate_index`
- `sudo gitlab-rake gitlab:elastic:clear_index_status`
- `sudo gitlab-rake gitlab:elastic:index_projects`
- `sudo gitlab-rake gitlab:elastic:index_snippets`
......@@ -414,27 +414,24 @@ There are several Rake tasks available to you via the command line:
- This iterates over all projects and queues Sidekiq jobs to index them in the background.
- [`sudo gitlab-rake gitlab:elastic:index_projects_status`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- This determines the overall status of the indexing. It is done by counting the total number of indexed projects, dividing by a count of the total number of projects, then multiplying by 100.
- [`sudo gitlab-rake gitlab:elastic:create_empty_index`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- This generates an empty index on the Elasticsearch side, deleting the existing one if present.
- [`sudo gitlab-rake gitlab:elastic:clear_index_status`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- This deletes all instances of IndexStatus for all projects.
- [`sudo gitlab-rake gitlab:elastic:delete_index`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
NOTE: **Note:**
The `INDEX_NAME` parameter is optional and will use the default index name from the current `RAILS_ENV` if not set.
- [`sudo gitlab-rake gitlab:elastic:create_empty_index[<INDEX_NAME>]`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- This generates an empty index on the Elasticsearch side only if it doesn't already exist.
- [`sudo gitlab-rake gitlab:elastic:delete_index[<INDEX_NAME>]`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- This removes the GitLab index on the Elasticsearch instance.
- [`sudo gitlab-rake gitlab:elastic:recreate_index`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- Does the same thing as `sudo gitlab-rake gitlab:elastic:create_empty_index`
- [`sudo gitlab-rake gitlab:elastic:recreate_index[<INDEX_NAME>]`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- This is a wrapper task. It does the following:
- `sudo gitlab-rake gitlab:elastic:delete_index[<INDEX_NAME>]`
- `sudo gitlab-rake gitlab:elastic:create_empty_index[<INDEX_NAME>]`
- [`sudo gitlab-rake gitlab:elastic:index_snippets`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- Performs an Elasticsearch import that indexes the snippets data.
- [`sudo gitlab-rake gitlab:elastic:projects_not_indexed`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- Displays which projects are not indexed.
- [`sudo gitlab-rake gitlab:elastic:reindex_to_another_cluster[<SOURCE_CLUSTER_URL>,<DESTINATION_CLUSTER_URL>]`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake)
- Creates a new index in the destination cluster from the source index using
Elasticsearch "reindex from remote", where the source index is copied to the
destination. This is useful when migrating to a new cluster because it should be
quicker than reindexing via GitLab.
NOTE: **Note:**
Your source cluster must be whitelisted in your destination cluster's Elasticsearch
settings. See [Reindex from remote](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#reindex-from-remote).
### Environment Variables
......
......@@ -10,7 +10,7 @@ class Admin::ElasticsearchController < Admin::ApplicationController
# POST
# Scheduling indexing jobs
def enqueue_index
if Gitlab::Elastic::Helper.index_exists?
if Gitlab::Elastic::Helper.default.index_exists?
::Elastic::IndexProjectsService.new.execute
notice = _('Elasticsearch indexing started')
......
......@@ -46,7 +46,7 @@ module EE
def prevent_elasticsearch_indexing_update?
!application_setting.elasticsearch_indexing &&
::Gitlab::Utils.to_boolean(params[:elasticsearch_indexing]) &&
!::Gitlab::Elastic::Helper.index_exists?
!::Gitlab::Elastic::Helper.default.index_exists?
end
end
end
......
---
title: Enable creation of a custom index with Rake
merge_request: 29598
author: mbergeron
type: added
......@@ -3,8 +3,33 @@
module Gitlab
module Elastic
class Helper
attr_reader :version, :client
attr_accessor :index_name
def initialize(
version: ::Elastic::MultiVersionUtil::TARGET_VERSION,
client: nil,
index_name: nil)
proxy = self.class.create_proxy(version)
@client = client || proxy.client
@index_name = index_name || proxy.index_name
@version = version
end
class << self
def create_proxy(version = nil)
Project.__elasticsearch__.version(version)
end
def default
@default ||= self.new
end
end
# rubocop: disable CodeReuse/ActiveRecord
def self.create_empty_index(version = ::Elastic::MultiVersionUtil::TARGET_VERSION, client = nil)
def create_empty_index
settings = {}
mappings = {}
......@@ -22,10 +47,6 @@ module Gitlab
mappings.deep_merge!(klass.__elasticsearch__.mappings.to_hash)
end
proxy = Project.__elasticsearch__.version(version)
client ||= proxy.client
index_name = proxy.index_name
create_index_options = {
index: index_name,
body: {
......@@ -44,65 +65,35 @@ module Gitlab
create_index_options[:include_type_name] = true
end
if client.indices.exists? index: index_name
client.indices.delete index: index_name
if client.indices.exists?(index: index_name)
raise "Index '#{index_name}' already exists, use `recreate_index` to recreate it."
end
client.indices.create create_index_options
end
# rubocop: enable CodeReuse/ActiveRecord
def self.reindex_to_another_cluster(source_cluster_url, destination_cluster_url, version = ::Elastic::MultiVersionUtil::TARGET_VERSION)
proxy = Project.__elasticsearch__.version(version)
index_name = proxy.index_name
destination_client = Gitlab::Elastic::Client.build(url: destination_cluster_url)
create_empty_index(version, destination_client)
optimize_for_write_settings = { index: { number_of_replicas: 0, refresh_interval: "-1" } }
destination_client.indices.put_settings(index: index_name, body: optimize_for_write_settings)
source_addressable = Addressable::URI.parse(source_cluster_url)
response = destination_client.reindex(body: {
source: {
remote: {
host: source_addressable.omit(:user, :password).to_s,
username: source_addressable.user,
password: source_addressable.password
},
index: index_name
},
dest: {
index: index_name
}
}, wait_for_completion: false)
response['task']
def delete_index
result = client.indices.delete(index: index_name)
result['acknowledged']
rescue ::Elasticsearch::Transport::Transport::Errors::NotFound => e
Gitlab::ErrorTracking.log_exception(e)
false
end
def self.delete_index(version = ::Elastic::MultiVersionUtil::TARGET_VERSION)
Project.__elasticsearch__.version(version).delete_index!
end
def self.index_exists?(version = ::Elastic::MultiVersionUtil::TARGET_VERSION)
proxy = Project.__elasticsearch__.version(version)
client = proxy.client
index_name = proxy.index_name
client.indices.exists? index: index_name # rubocop:disable CodeReuse/ActiveRecord
def index_exists?
client.indices.exists?(index: index_name) # rubocop:disable CodeReuse/ActiveRecord
end
# Calls Elasticsearch refresh API to ensure data is searchable
# immediately.
# https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html
def self.refresh_index
Project.__elasticsearch__.refresh_index!
def refresh_index
client.indices.refresh(index: index_name)
end
def self.index_size(version = ::Elastic::MultiVersionUtil::TARGET_VERSION)
Project.__elasticsearch__.version(version).client.indices.stats['indices'][Project.__elasticsearch__.index_name]['total']
def index_size
client.indices.stats['indices'][index_name]['total']
end
end
end
......
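The helper change above moves from class methods to an instance that carries its own `client` and `index_name`. `create_empty_index` now refuses to overwrite an existing index, and `delete_index` returns `false` instead of raising when the index is missing. A minimal sketch of that contract follows. `FakeIndices`, `FakeClient`, `IndexHelper`, and `DEFAULT_INDEX_NAME` are hypothetical stand-ins invented for illustration; they are not the real `elasticsearch-ruby` client or `Gitlab::Elastic::Helper`.

```ruby
# Hypothetical in-memory stand-in for the Elasticsearch indices API.
# It only mimics the three calls the sketch needs, so no cluster is required.
class FakeIndices
  class NotFound < StandardError; end

  def initialize
    @names = []
  end

  def exists?(index:)
    @names.include?(index)
  end

  def create(index:)
    @names << index
    { 'acknowledged' => true }
  end

  def delete(index:)
    # Array#delete returns nil when the element is absent.
    raise NotFound, index unless @names.delete(index)

    { 'acknowledged' => true }
  end
end

FakeClient = Struct.new(:indices)

# Simplified, assumed shape of an instance-based helper.
class IndexHelper
  DEFAULT_INDEX_NAME = 'gitlab-test' # assumed default, for illustration only

  attr_reader :client, :index_name

  def initialize(client:, index_name: nil)
    @client = client
    @index_name = index_name || DEFAULT_INDEX_NAME
  end

  # Refuses to overwrite an existing index; callers must delete it first.
  def create_empty_index
    if client.indices.exists?(index: index_name)
      raise "Index '#{index_name}' already exists, use `recreate_index` to recreate it."
    end

    client.indices.create(index: index_name)
  end

  # Returns true when the index was deleted, false when it did not exist.
  def delete_index
    client.indices.delete(index: index_name)['acknowledged']
  rescue FakeIndices::NotFound
    false
  end

  def index_exists?
    client.indices.exists?(index: index_name)
  end
end

client = FakeClient.new(FakeIndices.new)
helper = IndexHelper.new(client: client, index_name: 'custom-index-name')
```

Because the client is injected, a spec can exercise a custom index name without touching the default index, which is roughly what the `described_class.new(index_name: index_name)` context in the new spec does.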
......@@ -7,7 +7,7 @@ namespace :gitlab do
# to use this configuration during a full re-index anyways.
ENV['UPDATE_INDEX'] = nil
Rake::Task["gitlab:elastic:create_empty_index"].invoke
Rake::Task["gitlab:elastic:recreate_index"].invoke
Rake::Task["gitlab:elastic:clear_index_status"].invoke
Rake::Task["gitlab:elastic:index_projects"].invoke
Rake::Task["gitlab:elastic:index_snippets"].invoke
......@@ -56,27 +56,34 @@ namespace :gitlab do
end
desc "GitLab | Elasticsearch | Create empty index"
task create_empty_index: :environment do
Gitlab::Elastic::Helper.create_empty_index
puts "Index created".color(:green)
end
task :create_empty_index, [:index_name] => [:environment] do |t, args|
helper = Gitlab::Elastic::Helper.new(index_name: args[:index_name])
helper.create_empty_index
desc "GitLab | Elasticsearch | Clear indexing status"
task clear_index_status: :environment do
IndexStatus.delete_all
puts "Index status has been reset".color(:green)
puts "Index '#{helper.index_name}' has been created.".color(:green)
end
desc "GitLab | Elasticsearch | Delete index"
task delete_index: :environment do
Gitlab::Elastic::Helper.delete_index
puts "Index deleted".color(:green)
task :delete_index, [:index_name] => [:environment] do |t, args|
helper = Gitlab::Elastic::Helper.new(index_name: args[:index_name])
if helper.delete_index
puts "Index '#{helper.index_name}' has been deleted".color(:green)
else
puts "Index '#{helper.index_name}' was not found".color(:green)
end
end
desc "GitLab | Elasticsearch | Recreate index"
task recreate_index: :environment do
Gitlab::Elastic::Helper.create_empty_index
puts "Index recreated".color(:green)
task :recreate_index, [:index_name] => [:environment] do |t, args|
Rake::Task["gitlab:elastic:delete_index"].invoke(*args)
Rake::Task["gitlab:elastic:create_empty_index"].invoke(*args)
end
desc "GitLab | Elasticsearch | Clear indexing status"
task clear_index_status: :environment do
IndexStatus.delete_all
puts "Index status has been reset".color(:green)
end
desc "GitLab | Elasticsearch | Display which projects are not indexed"
......@@ -90,12 +97,6 @@ namespace :gitlab do
end
end
desc "GitLab | Elasticsearch | Reindex to another cluster"
task :reindex_to_another_cluster, [:source_cluster_url, :dest_cluster_url] => :environment do |_, args|
task_id = Gitlab::Elastic::Helper.reindex_to_another_cluster(args.source_cluster_url, args.dest_cluster_url)
puts "Reindexing to another cluster started with task id: #{task_id}".color(:green)
end
def project_id_batches(&blk)
relation = Project
......
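The reworked `recreate_index` task in this file is a thin wrapper that forwards its own `index_name` argument to `delete_index` and `create_empty_index` through `invoke(*args)`. The sketch below shows that forwarding pattern in isolation. The `demo` namespace and the `calls` array are hypothetical, and the task bodies only record what they received instead of talking to Elasticsearch.

```ruby
require 'rake'

extend Rake::DSL # make `namespace` and `task` available in a plain script

calls = [] # records which subtask ran and with which index name

namespace :demo do
  task :delete_index, [:index_name] do |_t, args|
    calls << [:delete, args[:index_name]]
  end

  task :create_empty_index, [:index_name] do |_t, args|
    calls << [:create, args[:index_name]]
  end

  # Wrapper task: splatting the Rake::TaskArguments object passes the
  # positional values on to each subtask, in order.
  task :recreate_index, [:index_name] do |_t, args|
    Rake::Task['demo:delete_index'].invoke(*args)
    Rake::Task['demo:create_empty_index'].invoke(*args)
  end
end

Rake::Task['demo:recreate_index'].invoke('custom-index-name')
```

When the wrapper is invoked without an argument, the splat expands to nothing and each subtask sees a `nil` index name, which is the case where the real helper falls back to its default index name.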
......@@ -11,7 +11,7 @@ describe Admin::ElasticsearchController do
end
it 'starts indexing' do
expect(Gitlab::Elastic::Helper).to(receive(:index_exists?)).and_return(true)
expect(Gitlab::Elastic::Helper.default).to(receive(:index_exists?)).and_return(true)
expect_next_instance_of(::Elastic::IndexProjectsService) do |service|
expect(service).to receive(:execute)
end
......@@ -24,7 +24,7 @@ describe Admin::ElasticsearchController do
context 'without an index' do
before do
allow(Gitlab::Elastic::Helper).to(receive(:index_exists?)).and_return(false)
allow(Gitlab::Elastic::Helper.default).to(receive(:index_exists?)).and_return(false)
end
it 'does nothing and returns 404' do
......
......@@ -10,7 +10,7 @@ describe 'Admin updates EE-only settings' do
stub_env('IN_MEMORY_APPLICATION_SETTINGS', 'false')
sign_in(create(:admin))
allow(License).to receive(:feature_available?).and_return(true)
allow(Gitlab::Elastic::Helper).to receive(:index_exists?).and_return(true)
allow(Gitlab::Elastic::Helper.default).to receive(:index_exists?).and_return(true)
end
context 'Geo settings' do
......
# frozen_string_literal: true
require 'fast_spec_helper'
require 'webmock/rspec'
require 'spec_helper'
describe Gitlab::Elastic::Helper do
describe '.index_exists' do
it 'returns correct values' do
described_class.create_empty_index
subject(:helper) { described_class.default }
expect(described_class.index_exists?).to eq(true)
shared_context 'with an existing index' do
before do
helper.create_empty_index
end
end
after do
helper.delete_index
end
describe '.new' do
it 'has the proper default values' do
expect(helper).to have_attributes(
version: ::Elastic::MultiVersionUtil::TARGET_VERSION,
index_name: ::Elastic::Latest::Config.index_name)
end
context 'with a custom `index_name`' do
let(:index_name) { 'custom-index-name' }
described_class.delete_index
subject(:helper) { described_class.new(index_name: index_name) }
expect(described_class.index_exists?).to eq(false)
it 'has the proper `index_name`' do
expect(helper).to have_attributes(index_name: index_name)
end
end
end
describe 'reindex_to_another_cluster' do
it 'creates an empty index and triggers a reindex' do
_version_check_request = stub_request(:get, 'http://newcluster.example.com:9200/')
.to_return(status: 200, body: { version: { number: '7.5.1' } }.to_json)
_index_exists_check = stub_request(:head, 'http://newcluster.example.com:9200/gitlab-test')
.to_return(status: 404, body: +'')
create_cluster_request = stub_request(:put, 'http://newcluster.example.com:9200/gitlab-test')
.to_return(status: 200, body: +'')
optimize_settings_for_write_request = stub_request(:put, 'http://newcluster.example.com:9200/gitlab-test/_settings')
.with(body: { index: { number_of_replicas: 0, refresh_interval: "-1" } })
.to_return(status: 200, body: +'')
reindex_request = stub_request(:post, 'http://newcluster.example.com:9200/_reindex?wait_for_completion=false')
.with(
body: {
source: {
remote: {
host: 'http://oldcluster.example.com:9200/',
username: 'olduser',
password: 'oldpass'
},
index: 'gitlab-test'
},
dest: {
index: 'gitlab-test'
}
}).to_return(status: 200,
headers: { "Content-Type" => "application/json" },
body: { task: 'abc123' }.to_json)
source_url = 'http://olduser:oldpass@oldcluster.example.com:9200/'
dest_url = 'http://newcluster.example.com:9200/'
task = Gitlab::Elastic::Helper.reindex_to_another_cluster(source_url, dest_url)
expect(task).to eq('abc123')
assert_requested create_cluster_request
assert_requested optimize_settings_for_write_request
assert_requested reindex_request
describe '#create_empty_index' do
context 'without an existing index' do
it 'creates the index' do
helper.create_empty_index
expect(helper.index_exists?).to eq(true)
end
end
context 'when there is an index' do
include_context 'with an existing index'
it 'raises an error' do
expect { helper.create_empty_index }.to raise_error
end
end
end
describe '#delete_index' do
subject { helper.delete_index }
context 'without an existing index' do
it 'fails gracefully' do
is_expected.to be_falsy
end
end
context 'when there is an index' do
include_context 'with an existing index'
it { is_expected.to be_truthy }
end
end
describe '#index_exists?' do
subject { helper.index_exists? }
context 'without an existing index' do
it { is_expected.to be_falsy }
end
context 'when there is an index' do
include_context 'with an existing index'
it { is_expected.to be_truthy }
end
end
end
......@@ -46,7 +46,7 @@ describe ApplicationSettings::UpdateService do
with_them do
before do
allow(Gitlab::Elastic::Helper).to(receive(:index_exists?)).and_return(index_exists)
allow(Gitlab::Elastic::Helper.default).to(receive(:index_exists?)).and_return(index_exists)
allow(service.application_setting).to(receive(:elasticsearch_indexing)).and_return(indexing_enabled)
end
......
......@@ -3,11 +3,12 @@
RSpec.configure do |config|
config.before(:each, :elastic) do
Elastic::ProcessBookkeepingService.clear_tracking!
Gitlab::Elastic::Helper.create_empty_index
Gitlab::Elastic::Helper.default.delete_index
Gitlab::Elastic::Helper.default.create_empty_index
end
config.after(:each, :elastic) do
Gitlab::Elastic::Helper.delete_index
Gitlab::Elastic::Helper.default.delete_index
Elastic::ProcessBookkeepingService.clear_tracking!
end
......
......@@ -10,6 +10,6 @@ module ElasticsearchHelpers
end
def refresh_index!
::Gitlab::Elastic::Helper.refresh_index
::Gitlab::Elastic::Helper.default.refresh_index
end
end
......@@ -10,7 +10,7 @@ describe 'gitlab:elastic namespace rake tasks', :elastic do
describe 'index' do
it 'calls all indexing tasks in order' do
expect(Rake::Task['gitlab:elastic:create_empty_index']).to receive(:invoke).ordered
expect(Rake::Task['gitlab:elastic:recreate_index']).to receive(:invoke).ordered
expect(Rake::Task['gitlab:elastic:clear_index_status']).to receive(:invoke).ordered
expect(Rake::Task['gitlab:elastic:index_projects']).to receive(:invoke).ordered
expect(Rake::Task['gitlab:elastic:index_snippets']).to receive(:invoke).ordered
......@@ -73,11 +73,12 @@ describe 'gitlab:elastic namespace rake tasks', :elastic do
end
end
describe 'reindex_to_another_cluster' do
it 'calls reindex_to_another_cluster' do
expect(Gitlab::Elastic::Helper).to receive(:reindex_to_another_cluster).with('http://oldcluster.example.com:9300/', 'http://newcluster.example.com:9300/')
describe 'recreate_index' do
it 'calls all related subtasks in order' do
expect(Rake::Task['gitlab:elastic:delete_index']).to receive(:invoke).ordered
expect(Rake::Task['gitlab:elastic:create_empty_index']).to receive(:invoke).ordered
run_rake_task 'gitlab:elastic:reindex_to_another_cluster', 'http://oldcluster.example.com:9300/', 'http://newcluster.example.com:9300/'
run_rake_task 'gitlab:elastic:recreate_index'
end
end
end