Commit 953e8d8d authored by Evan Read

Merge branch 'mk/mitigate-some-geo-ha-setup-pitfalls-docs' into 'master'

Mitigate a few Geo HA setup pitfalls in the documentation

Closes #7601

See merge request gitlab-org/gitlab-ee!8117
parents 20409996 fc97edb1
......@@ -83,7 +83,14 @@ must disable the primary.
[update the primary domain DNS record](#step-4-optional-updating-the-primary-domain-dns-record),
you may wish to lower the TTL now to speed up propagation.
### Step 3. Promoting a secondary Geo replica
### Step 3. Promoting a **secondary** node
NOTE: **Note:**
A new **secondary** should not be added at this time. If you want to add a new
**secondary**, do this after you have completed the entire process of promoting
the **secondary** to the **primary**.
#### Promoting a **secondary** node running on a single machine
1. SSH in to your **secondary** and log in as root:
......@@ -91,43 +98,66 @@ must disable the primary.
sudo -i
```
1. Edit `/etc/gitlab/gitlab.rb` to reflect its new status as primary by
removing the following line:
1. Edit `/etc/gitlab/gitlab.rb` to reflect its new status as **primary** by
removing any lines that enabled the `geo_secondary_role`:
```ruby
## REMOVE THIS LINE
## In pre-11.5 documentation, the role was enabled as follows. Remove this line.
geo_secondary_role['enable'] = true
```
A new secondary should not be added at this time. If you want to add a new
secondary, do this after you have completed the entire process of promoting
the secondary to the primary.
## In 11.5+ documentation, the role was enabled as follows. Remove this line.
roles ['geo_secondary_role']
```
1. Promote the secondary to primary. Execute:
1. Promote the **secondary** to **primary**. Execute:
```bash
gitlab-ctl promote-to-primary-node
```
1. Verify you can connect to the newly promoted primary using the URL used
previously for the secondary.
1. Success! The secondary has now been promoted to primary.
1. Verify you can connect to the newly promoted **primary** using the URL used
previously for the **secondary**.
1. Success! The **secondary** has now been promoted to **primary**.
#### Promoting a node with HA
#### Promoting a **secondary** node with HA
The `gitlab-ctl promote-to-primary-node` command cannot be used yet in conjunction with
High Availability or with multiple machines, as it can only perform changes on
a single one.
The `gitlab-ctl promote-to-primary-node` command cannot be used yet in
conjunction with High Availability or with multiple machines, as it can only
perform changes on a **secondary** with only a single machine. Instead, you must
do this manually.
The command above does the following changes:
1. SSH in to the database node in the **secondary** and trigger PostgreSQL to
promote to read-write:
- Promotes the PostgreSQL secondary to primary
- Executes `gitlab-ctl reconfigure` to apply the changes in `/etc/gitlab/gitlab.rb`
- Runs `gitlab-rake geo:set_secondary_as_primary`
```bash
sudo gitlab-pg-ctl promote
```
1. Edit `/etc/gitlab/gitlab.rb` on every machine in the **secondary** to
reflect its new status as **primary** by removing any lines that enabled the
`geo_secondary_role`:
```ruby
## In pre-11.5 documentation, the role was enabled as follows. Remove this line.
geo_secondary_role['enable'] = true
## In 11.5+ documentation, the role was enabled as follows. Remove this line.
roles ['geo_secondary_role']
```
After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) on each
machine so the changes take effect.
1. Promote the **secondary** to **primary**. SSH into a single application
server and execute:
```bash
sudo gitlab-rake geo:set_secondary_as_primary
```
You need to make sure all the affected machines no longer have `geo_secondary_role['enable'] = true` in
`/etc/gitlab/gitlab.rb`, that you execute the database promotion on the required database nodes
and you execute the `gitlab-rake geo:set_secondary_as_primary` in a machine running the application server.
1. Verify you can connect to the newly promoted **primary** using the URL used
previously for the **secondary**.
1. Success! The **secondary** has now been promoted to **primary**.
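The manual HA procedure depends on every machine in the **secondary** having the secondary role removed from `/etc/gitlab/gitlab.rb` before it is reconfigured. A minimal sketch of a check that could be run on each machine, assuming the two line forms shown above are the only ways the role was enabled. The function name `find_geo_secondary_role` is hypothetical and not part of Omnibus GitLab:

```bash
#!/bin/bash
# Hypothetical helper (not part of Omnibus GitLab): print any uncommented
# lines in a gitlab.rb file that still enable the Geo secondary role.
# Returns 0 if the file is clean, 1 if matching lines were printed,
# and 2 if the file cannot be read.
find_geo_secondary_role() {
  local config_file="${1:-/etc/gitlab/gitlab.rb}"
  if [ ! -r "$config_file" ]; then
    echo "cannot read $config_file" >&2
    return 2
  fi
  # Match either the pre-11.5 form or a `roles` line naming
  # geo_secondary_role. Lines starting with '#' do not match.
  if grep -nE "^[[:space:]]*(geo_secondary_role\['enable'\][[:space:]]*=[[:space:]]*true|roles[[:space:]]*\[.*geo_secondary_role)" "$config_file"; then
    return 1
  fi
  return 0
}

# Usage on each machine in the secondary:
#   find_geo_secondary_role /etc/gitlab/gitlab.rb && echo "role removed"
```

A non-zero return with printed line numbers indicates lines that still need to be removed before reconfiguring that machine.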
### Step 4. (Optional) Updating the primary domain DNS record
......@@ -198,7 +228,7 @@ and after that you also need two extra steps.
```ruby
## Enable a Geo Primary role (if you haven't yet)
geo_primary_role['enable'] = true
roles ['geo_primary_role']
##
# Primary and Secondary addresses
......
......@@ -3,7 +3,7 @@
NOTE: **Note:**
This is the documentation for the Omnibus GitLab packages. For installations
from source, follow the
[**database replication for installations from source**][database-source] guide.
[Geo database replication (source)](database_source.md) guide.
NOTE: **Note:**
If your GitLab installation uses external PostgreSQL, the Omnibus roles
......@@ -101,10 +101,12 @@ The following guide assumes that:
This command will also read the `postgresql['sql_replication_user']` Omnibus
setting in case you have changed `gitlab_replicator` username to something
else.
If you are using an external database not managed by Omnibus GitLab, you need
to create the replicator user and define a password for it manually.
Check [How to create replication user][database-source-primary] documentation.
For information on how to create a replication user, refer to the
[appropriate step](database_source.md#step-1-configure-the-primary-server)
in [Geo database replication (source)](database_source.md).
1. Configure PostgreSQL to listen on network interfaces
......@@ -161,14 +163,14 @@ The following guide assumes that:
## Geo Primary role
## - configure dependent flags automatically to enable Geo
##
geo_primary_role['enable'] = true
roles ['geo_primary_role']
##
## Primary address
## - replace '1.2.3.4' with the primary public or VPC address
##
postgresql['listen_address'] = '1.2.3.4'
##
# Primary and Secondary addresses
# - replace '1.2.3.4' with the primary public or VPC address
......@@ -263,8 +265,8 @@ The following guide assumes that:
gitlab-ctl stop sidekiq
```
NOTE: **Note**:
This step is important so we don't try to execute anything before the node is fully configured.
NOTE: **Note**:
This step is important so we don't try to execute anything before the node is fully configured.
1. [Check TCP connectivity][rake-maintenance] to the primary's PostgreSQL server:
......@@ -272,7 +274,7 @@ The following guide assumes that:
gitlab-rake gitlab:tcp_check[1.2.3.4,5432]
```
NOTE: **Note**:
NOTE: **Note**:
If this step fails, you may be using the wrong IP address, or a firewall may
be preventing access to the server. Check the IP address, paying close
attention to the difference between public and private addresses and ensure
......@@ -325,8 +327,8 @@ The following guide assumes that:
## Geo Secondary role
## - configure dependent flags automatically to enable Geo
##
geo_secondary_role['enable'] = true
roles ['geo_secondary_role']
##
## Secondary address
## - replace '5.6.7.8' with the secondary public or VPC address
......@@ -336,7 +338,7 @@ The following guide assumes that:
##
## Database credentials password (defined previously in primary node)
## - replicate same values here as defined in primary node
## - replicate same values here as defined in primary node
##
postgresql['sql_user_password'] = 'fca0b89a972d69f00eb3ec98a5838484'
gitlab_rails['db_password'] = 'mypassword'
......@@ -348,7 +350,7 @@ The following guide assumes that:
```
For external PostgreSQL instances, [see additional instructions][external postgresql].
If you bring a former primary back online to serve as a secondary then you also need to remove `geo_primary_role['enable'] = true`.
If you bring a former **primary** node back online to serve as a **secondary** node, then you also need to remove `roles ['geo_primary_role']` or `geo_primary_role['enable'] = true`.
1. Reconfigure GitLab for the changes to take effect:
......@@ -362,7 +364,7 @@ The following guide assumes that:
gitlab-ctl restart postgresql
gitlab-ctl reconfigure
```
This last reconfigure will provision the FDW configuration and enable it.
### Step 3. Initiate the replication process
......@@ -484,7 +486,7 @@ You only need to follow the steps below if you are not using the managed
PostgreSQL from an Omnibus GitLab package.
Geo secondary nodes use a tracking database to keep track of replication
status and recover automatically from some replication issues.
status and recover automatically from some replication issues.
This is a separate PostgreSQL installation that can be configured to use
FDW to connect with the secondary database for improved performance.
......@@ -495,10 +497,10 @@ the instructions below:
1. Edit `/etc/gitlab/gitlab.rb` with the connection params and credentials
```ruby
# note this is shared between both databases,
# note this is shared between both databases,
# make sure you define the same password in both
gitlab_rails['db_password'] = 'mypassword'
geo_secondary['db_host'] = '2.3.4.5' # change to the correct public IP
geo_secondary['db_port'] = 5431 # change to the correct port
geo_secondary['db_fdw'] = true # enable FDW
......@@ -510,7 +512,7 @@ the instructions below:
```bash
gitlab-ctl reconfigure
```
1. Run the tracking database migrations:
```bash
......@@ -521,36 +523,36 @@ the instructions below:
Save the script below in a file, for example `/tmp/geo_fdw.sh`, and modify the connection
params to match your environment. Execute it to set up the FDW connection.
```bash
#!/bin/bash
# Secondary Database connection params:
DB_HOST="5.6.7.8" # change to the public IP or VPC private IP
DB_NAME="gitlabhq_production"
DB_USER="gitlab"
DB_PORT="5432"
# Tracking Database connection params:
GEO_DB_HOST="2.3.4.5" # change to the public IP or VPC private IP
GEO_DB_NAME="gitlabhq_geo_production"
GEO_DB_USER="gitlab_geo"
GEO_DB_PORT="5432"
query_exec () {
gitlab-psql -h "$GEO_DB_HOST" -d "$GEO_DB_NAME" -p "$GEO_DB_PORT" -c "${1}"
}
query_exec "CREATE EXTENSION postgres_fdw;"
query_exec "CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '${DB_HOST}', dbname '${DB_NAME}', port '${DB_PORT}');"
query_exec "CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}');"
query_exec "CREATE SCHEMA gitlab_secondary;"
query_exec "GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO ${GEO_DB_USER};"
```
NOTE: **Note:** The script template above uses `gitlab-psql` as it's intended to be executed from the Geo machine,
but you can change it to `psql` and run it from any machine that has access to the database.
1. Restart GitLab
```bash
......@@ -621,8 +623,6 @@ Read the [troubleshooting document](troubleshooting.md).
[tracking]: database_source.md#enable-tracking-database-on-the-secondary-server
[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
[toc]: index.md#using-omnibus-gitlab
[database-source]: database_source.md
[database-source-primary]: database_source.md#step-1-configure-the-primary-server
[rake-maintenance]: ../../raketasks/maintenance.md
[pg-docs-ssl]: https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION
[pg-docs-runtime-conn]: https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html
......
......@@ -26,165 +26,165 @@ The only external way to access the two Geo deployments is by HTTPS at
## Redis and PostgreSQL High Availability
The primary and secondary Redis and PostgreSQL should be configured
for high availability. Because of the additional complexity involved
for high availability. Because of the additional complexity involved
in setting up this configuration for PostgreSQL and Redis
it is not covered by this Geo HA documentation.
The two services will instead be configured such that
they will each run on a single machine.
For more information about setting up a highly available PostgreSQL cluster and Redis cluster using the omnibus package see the high availability documentation for
[PostgreSQL][postgresql-ha] and [Redis][redis-ha], respectively.
From these instructions you will need the following for the examples below:
* `gitlab_rails['db_password']` for the PostgreSQL "DB password"
* `redis['password']` for the Redis "Redis password"
[PostgreSQL](../../high_availability/database.md) and
[Redis](../../high_availability/redis.md), respectively.
NOTE: **Note:**
It is possible to use cloud hosted services for PostgreSQL and Redis but this is beyond the scope of this document.
### Prerequisites
Make sure you have GitLab EE installed using the
[Omnibus package](https://about.gitlab.com/installation).
### Step 1: Configure the Geo Backend Services
## Prerequisites: A working GitLab HA cluster
On the **primary** backend servers configure the following services:
This cluster will serve as the **primary** node. Use the
[GitLab HA documentation](../../high_availability/README.md) to set this up.
* [Redis][redis-ha] for high availability.
* [NFS Server][nfs-ha] for repository, LFS, and upload storage.
* [PostgreSQL][postgresql-ha] for high availability.
## Configure the GitLab cluster to be the **primary** node
On the **secondary** backend servers configure the following services:
The following steps enable a GitLab cluster to serve as the **primary** node.
* [Redis][redis-ha] for high availability.
* [NFS Server][nfs-ha] which will store data that is synchronized from the Geo primary.
### Step 1: Configure the **primary** frontend servers
### Step 2: Configure the Postgres services on the Geo Secondary
1. Configure the [secondary Geo PostgreSQL database][database]
as a read-only secondary of the primary Geo PostgreSQL database.
1. Configure the Geo tracking database on the secondary server, to do this modify `/etc/gitlab/gitlab.rb`:
1. Edit `/etc/gitlab/gitlab.rb` and add the following:
```ruby
geo_postgresql['enable'] = true
geo_postgresql['listen_address'] = '10.1.4.1'
##
## Enable the Geo primary role
##
roles ['geo_primary_role']
geo_secondary['auto_migrate'] = true
geo_secondary['db_host'] = '10.1.4.1'
geo_secondary['db_password'] = 'Geo tracking DB password'
##
## Disable automatic migrations
##
gitlab_rails['auto_migrate'] = false
```
NOTE: **Note:**
Be sure that other non-postgresql services are disabled by setting `enable` to `false` in
the [gitlab.rb configuration][gitlab-rb-template].
After making these changes, [reconfigure GitLab][gitlab-reconfigure] so the changes take effect.
After making these changes be sure to run `sudo gitlab-ctl reconfigure` so that they take effect.
NOTE: **Note:** PostgreSQL and Redis should have already been disabled on the
application servers, and connections from the application servers to those
services on the backend servers configured, during normal GitLab HA set up. See
high availability configuration documentation for
[PostgreSQL](../../high_availability/database.md#configuring-the-application-nodes)
and [Redis](../../high_availability/redis.md#example-configuration-for-the-gitlab-application).
### Step 3: Set up the LoadBalancer
The **primary** database will require modification later, as part of
[step 2](#step-2-configure-the-main-read-only-replica-postgresql-database-on-the-secondary-node).
In this topology there will need to be a load balancers at each geographical location
to route traffic to the application servers.
## Configure a **secondary** node
See the [Load Balancer for GitLab HA][load-balancer-ha]
documentation for more information.
A **secondary** cluster is similar to any other GitLab HA cluster, with two
major differences:
### Step 4: Configure the Geo Frontend Application Servers
* The main PostgreSQL database is a read-only replica of the **primary** node's
PostgreSQL database.
* There is also a single PostgreSQL database for the **secondary** cluster,
called the "tracking database", which tracks the synchronization state of
various resources.
In the architecture overview there are two machines running the GitLab application
services. These services are enabled selectively in the configuration. Additionally
the addresses of the remote endpoints for PostgreSQL and Redis will need to be specified.
Therefore, we will set up the HA components one-by-one, and include deviations
from the normal HA setup.
#### On the GitLab Primary Frontend servers
### Step 1: Configure the Redis and NFS services on the **secondary** node
1. Edit `/etc/gitlab/gitlab.rb` and ensure the following to disable PostgreSQL and Redis from running locally.
Configure the following services, again using the non-Geo high availability
documentation:
```ruby
##
## Disable PostgreSQL on the local machine and connect to the remote
##
* [Configuring Redis for GitLab HA](../../high_availability/redis.md) for high
availability.
* [NFS](../../high_availability/nfs.md) which will store data that is
synchronized from the **primary** node.
postgresql['enable'] = false
gitlab_rails['auto_migrate'] = false
gitlab_rails['db_host'] = '10.0.3.1'
gitlab_rails['db_password'] = 'plaintext DB password'
### Step 2: Configure the main read-only replica PostgreSQL database on the **secondary** node
##
## Disable Redis on the local machine and connect to the remote
##
NOTE: **Note:** The following documentation assumes the database will be run on
only a single machine, rather than as a PostgreSQL cluster.
redis['enable'] = false
gitlab_rails['redis_host'] = '10.0.2.1'
gitlab_rails['redis_password'] = 'Redis password'
Configure the [**secondary** database](database.md) as a read-only replica of
the **primary** database. Be sure to follow the
[External PostgreSQL instances](database.md#external-postgresql-instances)
section.
geo_primary_role['enable'] = true
```
### Step 3: Configure the tracking database on the **secondary** node
NOTE: **Note:**
If you had set up PostgreSQL cluster using the omnibus package and you had set up `postgresql['sql_user_password'] = 'md5 digest of secret'` setting, keep in mind that `gitlab_rails['db_password']` setting mentioned above contains the plaintext password.
NOTE: **Note:** This documentation assumes the tracking database will be run on
only a single machine, rather than as a PostgreSQL cluster.
NOTE: **Note:**
Make sure that current node IP is listed in `postgresql['md5_auth_cidr_addresses']` setting of your remote database.
Configure the
[tracking database](database.md#tracking-database-for-the-secondary-nodes).
#### On the GitLab Secondary Frontend servers
### Step 4: Configure the frontend application servers on the **secondary** node
On the secondary the remote endpoint for the PostgreSQL Geo database will
be specified.
In the architecture overview, there are two machines running the GitLab
application services. These services are enabled selectively in the
configuration.
1. Edit `/etc/gitlab/gitlab.rb` and ensure the following to disable PostgreSQL and
Redis from running locally. Configure the secondary to connect to the Geo tracking database.
Configure the application servers following
[Configuring GitLab for HA](../../high_availability/gitlab.md), then make the
following modifications:
1. Edit `/etc/gitlab/gitlab.rb` on each application server in the **secondary**
cluster, and add the following:
```ruby
##
## Disable PostgreSQL on the local machine and connect to the remote
## Enable the Geo secondary role
##
roles ['geo_secondary_role', 'application_role']
postgresql['enable'] = false
##
## Disable automatic migrations
##
gitlab_rails['auto_migrate'] = false
##
## Configure the connection to the tracking DB. And disable application
## servers from running tracking databases.
##
geo_secondary['db_host'] = '10.1.4.1'
geo_secondary['db_password'] = 'plaintext Geo tracking DB password'
geo_postgresql['enable'] = false
##
## Configure connection to the streaming replica database, if you haven't
## already
##
gitlab_rails['db_host'] = '10.1.3.1'
gitlab_rails['db_password'] = 'plaintext DB password'
##
## Disable Redis on the local machine and connect to the remote
## Configure connection to Redis, if you haven't already
##
redis['enable'] = false
gitlab_rails['redis_host'] = '10.1.2.1'
gitlab_rails['redis_password'] = 'Redis password'
##
## Enable the geo secondary role and configure the
## geo tracking database
## If you are using custom users not managed by Omnibus, you need to specify
## UIDs and GIDs like below, and ensure they match between servers in a
## cluster to avoid permissions issues
##
geo_secondary_role['enable'] = true
geo_secondary['db_host'] = '10.1.4.1'
geo_secondary['db_password'] = 'Geo tracking DB password'
geo_postgresql['enable'] = false
user['uid'] = 9000
user['gid'] = 9000
web_server['uid'] = 9001
web_server['gid'] = 9001
registry['uid'] = 9002
registry['gid'] = 9002
```
NOTE: **Note:**
If you had set up PostgreSQL cluster using the omnibus package and you had set up `postgresql['sql_user_password'] = 'md5 digest of secret'` setting, keep in mind that `gitlab_rails['db_password']` setting mentioned above contains the plaintext password.
If you set up a PostgreSQL cluster using the Omnibus package and configured
`postgresql['sql_user_password'] = 'md5 digest of secret'`, keep in
mind that `gitlab_rails['db_password']` and `geo_secondary['db_password']`
mentioned above contain the plaintext passwords. These are used to let the Rails
servers connect to the databases.
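To make the digest versus plaintext distinction concrete: PostgreSQL's MD5 password digest is the MD5 hash of the password immediately followed by the username. The sketch below computes that digest with `md5sum`. The function name `pg_md5_digest` is hypothetical, and on an Omnibus machine `gitlab-ctl pg-password-md5 <username>` remains the supported way to generate the value:

```bash
#!/bin/bash
# Hypothetical helper: compute the hex MD5 digest PostgreSQL derives from a
# password and username, that is, md5 of the password followed by the username.
pg_md5_digest() {
  local password="$1" username="$2"
  # printf avoids the trailing newline that echo would add to the input.
  printf '%s%s' "$password" "$username" | md5sum | cut -d' ' -f1
}

# The digest belongs in postgresql['sql_user_password'] on the database node.
# The plaintext password belongs in gitlab_rails['db_password'] (and
# geo_secondary['db_password']) on the application servers.
# Example: pg_md5_digest 'mypassword' 'gitlab'
```

Because the username is part of the hashed input, the same password produces a different digest for each database user.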
NOTE: **Note:**
Make sure that the current node's IP is listed in the `postgresql['md5_auth_cidr_addresses']` setting of your remote database.
After making these changes [Reconfigure GitLab][gitlab-reconfigure] so that they take effect.
On the primary the following GitLab frontend services will be enabled:
* gitlab-pages
* gitlab-workhorse
* logrotate
* nginx
* registry
* remote-syslog
* sidekiq
* unicorn
After making these changes [Reconfigure GitLab][gitlab-reconfigure] so the changes take effect.
On the secondary the following GitLab frontend services will be enabled:
......@@ -201,11 +201,13 @@ On the secondary the following GitLab frontend services will be enabled:
Verify these services by running `sudo gitlab-ctl status` on the frontend
application servers.
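When checking several application servers, it can help to reduce the status output to only the services that are not running. A minimal sketch, assuming `gitlab-ctl status` prints runit-style lines of the form `STATE: NAME: ...` (for example `run: nginx: (pid 972) 7s; run: log: (pid 971) 7s`). The function name `services_not_running` is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: read `gitlab-ctl status` output on stdin and print the
# name of every service whose state field is not "run:".
# Assumes one service per line in the form "STATE: NAME: ...".
services_not_running() {
  awk 'NF >= 2 && $1 != "run:" { sub(/:$/, "", $2); print $2 }'
}

# Usage on a frontend application server:
#   sudo gitlab-ctl status | services_not_running
```

Empty output means every listed service reported a `run:` state. Services that are disabled on the machine are not listed at all, so compare the full status output against the expected service list as well.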
### Step 5: Set up the LoadBalancer for the **secondary** node
In this topology, a load balancer is required at each geographic location to
route traffic to the application servers.
See [Load Balancer for GitLab HA](../../high_availability/load_balancer.md) for
more information.
[diagram-source]: https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit
[gitlab-reconfigure]: ../../restart_gitlab.md#omnibus-gitlab-reconfigure
[redis-ha]: ../../high_availability/redis.md
[postgresql-ha]: ../../high_availability/database.md
[nfs-ha]: ../../high_availability/nfs.md
[load-balancer-ha]: ../../high_availability/load_balancer.md
[database]: database.md
[gitlab-rb-template]: https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-config-template/gitlab.rb.template