Commit 8f3fa21c authored by Nick Thomas

Merge branch 'docs-geo-10-5-improvements' into 'master'

Improve Geo documentation

Closes #4921, #4889, #4883, #4892, #4882, #4886, #4890, and #4482

See merge request gitlab-org/gitlab-ee!4585
parents 1fc7ad5f 68bb1d0d
......@@ -8,13 +8,11 @@ restore your original configuration. This process consists of two steps:
## Configure the former primary to be a secondary
Since the former primary will be out of sync with the current primary, the first
step is to bring the former primary up to date. There is one downside though,
some uploads and repositories that have been deleted during an idle period of a
primary node, will not be deleted from the disk but the overall sync will be
much faster. As an alternative, you can set up a
[GitLab instance from scratch](../replication/index.md#setup-instructions) to
workaround this downside.
Since the former primary will be out of sync with the current primary, the first step is
to bring the former primary up to date. Note that deletions of data stored on disk, such as
repositories and uploads, will not be replayed when bringing the former primary back
into sync, which may result in increased disk usage.
Alternatively, you can [set up a new secondary GitLab instance][setup-geo] to avoid this.
To bring the former primary up to date:
......@@ -25,24 +23,24 @@ To bring the former primary up to date:
sudo gitlab-ctl start
```
NOTE: **Note:** If you [disabled the primary permanently](index.md#step-2-permanently-disable-the-primary),
NOTE: **Note:** If you [disabled the primary permanently][disaster-recovery-disable-primary],
you need to undo those steps now. For Debian/Ubuntu you just need to run
`sudo systemctl enable gitlab-runsvdir`. For CentOS 6, you need to install
`sudo systemctl enable gitlab-runsvdir`. For CentOS 6, you need to install
the GitLab instance from scratch and set it up as a secondary node by
following the [setup instructions](../replication/index.md#setup-instructions).
following the [setup instructions][setup-geo].
In this case you don't need to follow the next step.
1. [Setup database replication](../replication/database.md). Note that in this
1. [Setup database replication][database-replication]. Note that in this
case, primary refers to the current primary, and secondary refers to the
former primary.
If you have lost your original primary, follow the
[setup instructions](../replication/index.md#setup-instructions) to set up a new secondary.
[setup instructions][setup-geo] to set up a new secondary.
## Promote the secondary to primary
When the initial replication is complete and the primary and secondary are
closely in sync, you can do a [planned failover](planned_failover.md).
closely in sync, you can do a [planned failover].
## Restore the secondary node
......@@ -50,3 +48,8 @@ If your objective is to have two nodes again, you need to bring your secondary
node back online as well by repeating the first step
([configure the former primary to be a secondary](#configure-the-former-primary-to-be-a-secondary))
for the secondary node.
[setup-geo]: ../replication/index.md#setup-instructions
[database-replication]: ../replication/database.md
[disaster-recovery-disable-primary]: index.md#step-2-permanently-disable-the-primary
[planned failover]: planned_failover.md
# Disaster Recovery
Geo replicates your database and your Git repositories. We will
support and replicate more data in the future, that will enable you to
Geo replicates your database, your Git repositories, and a few other assets.
We will support and replicate more data in the future, which will enable you to
fail over with minimal effort in a disaster situation.
See [Geo current limitations](../replication/index.md#current-limitations)
for more information.
See [Geo current limitations][geo-limitations] for more information.
CAUTION: **Warning:**
Disaster recovery for multi-secondary configurations is in **Alpha**.
For the latest updates, check the multi-secondary [Disaster Recovery epic][gitlab-org&65].
## Promoting secondary Geo replica in single-secondary configurations
......@@ -19,7 +22,7 @@ immediately after following these instructions.
### Step 1. Allow replication to finish if possible
If the secondary is still replicating data from the primary, follow
[the planned failover docs](planned_failover.md) as closely as possible in
[the planned failover docs][planned-failover] as closely as possible in
order to avoid unnecessary data loss.
### Step 2. Permanently disable the primary
......@@ -30,7 +33,7 @@ that has not been replicated to the secondary. This data should be treated
as lost if you proceed.
If an outage on your primary happens, you should do everything possible to
avoid a split-brain situation where writes can occur to two different GitLab
avoid a split-brain situation where writes can occur in two different GitLab
instances, complicating recovery efforts. So to prepare for the failover, we
must disable the primary.
......@@ -46,21 +49,31 @@ must disable the primary.
sudo systemctl disable gitlab-runsvdir
```
On some operating systems such as CentOS 6, an easy way to prevent GitLab
from being started if the machine reboots isn't available
(see [Omnibus issue #3058](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058)).
> **CentOS only**: In CentOS 6 or older, there is no easy way to prevent GitLab from being
started if the machine reboots (see [gitlab-org/omnibus-gitlab#3058]).
It may be safest to uninstall the GitLab package completely:
```bash
yum remove gitlab-ee
```
> **Ubuntu 14.04 LTS**: If you are using an older version of Ubuntu
or any other distro based on the Upstart init system, you can prevent GitLab
from starting if the machine reboots by doing the following:
```bash
initctl stop gitlab-runsvdir
echo 'manual' > /etc/init/gitlab-runsvdir.override
initctl reload-configuration
```
1. If you do not have SSH access to your primary, take the machine offline and
prevent it from rebooting by any means at your disposal.
Since there are many ways you may prefer to accomplish this, we will avoid a
single recommendation. You may need to:
- Reconfigure the load balancers
- Change DNS records (e.g., point the primary DNS record to the secondary node in order to stop usage of the primary)
- Change DNS records (e.g., point the primary DNS record to the secondary
node in order to stop usage of the primary)
- Stop the virtual servers
- Block traffic through a firewall
- Revoke object storage permissions from the primary
......@@ -96,13 +109,29 @@ must disable the primary.
previously for the secondary.
1. Success! The secondary has now been promoted to primary.
#### Promoting a node with HA
The `gitlab-ctl promote-to-primary-node` command cannot be used yet in conjunction with
High Availability or with multiple machines, as it can only perform changes on
a single one.
The command above makes the following changes:
- Promotes the PostgreSQL secondary to primary
- Executes `gitlab-ctl reconfigure` to apply the changes in `/etc/gitlab/gitlab.rb`
- Runs `gitlab-rake geo:set_secondary_as_primary`
You need to make sure that none of the affected machines still has `geo_secondary_role['enable'] = true` in
`/etc/gitlab/gitlab.rb`, that you execute the database promotion on the required database nodes,
and that you run `gitlab-rake geo:set_secondary_as_primary` on a machine running the application server.
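As a rough aid for the `gitlab.rb` check described above, the following sketch reports whether a given configuration file still enables the secondary role. The function name `geo_secondary_enabled` and the whitespace-tolerant pattern are illustrative assumptions, not part of Omnibus GitLab. It only inspects the file, ignores commented-out lines, and would need to be run on each affected machine.

```shell
# Hypothetical helper (not an Omnibus GitLab command): exits 0 if the given
# file contains an uncommented geo_secondary_role['enable'] = true line.
geo_secondary_enabled() {
  local config="${1:-/etc/gitlab/gitlab.rb}"
  grep -Eq "^[[:space:]]*geo_secondary_role\['enable'\][[:space:]]*=[[:space:]]*true" "$config"
}

# Example (not run here):
#   if geo_secondary_enabled /etc/gitlab/gitlab.rb; then
#     echo "secondary role is still enabled on this machine"
#   fi
```

A plain text match is used here instead of evaluating the Ruby file, so settings assembled dynamically in `gitlab.rb` would not be detected.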
### Step 4. (Optional) Updating the primary domain DNS record
Updating the DNS records for the primary domain to point to the secondary
will prevent the need to update all references to the primary domain to the
secondary domain, like changing Git remotes and API URLs.
1. SSH in to your **secondary** and login as root:
1. SSH into your **secondary** and login as root:
```bash
sudo -i
......@@ -141,20 +170,17 @@ secondary domain, like changing Git remotes and API URLs.
Promoting a secondary to primary using the process above does not enable
Geo on the new primary.
To bring a new secondary online, follow the
[Geo setup instructions](../replication/index.md#setup-instructions).
To bring a new secondary online, follow the [Geo setup instructions][setup-geo].
## Promoting secondary Geo replica in multi-secondary configurations
CAUTION: **Caution:**
Disaster Recovery for multi-secondary configurations is in
**Alpha** development. Do not use this as your only Disaster Recovery
strategy as you may lose data.
CAUTION: **Warning:**
Disaster Recovery for multi-secondary configurations is in **Alpha** development.
Do not use this as your only Disaster Recovery strategy as you may lose data.
Disaster Recovery does not yet support systems with multiple
secondary Geo replicas (e.g., one primary and two or more secondaries). We are
working on it, see [#4284](https://gitlab.com/gitlab-org/gitlab-ee/issues/4284)
for details.
working on it, see [gitlab-org/gitlab-ee#4284] for details.
## Troubleshooting
......@@ -168,6 +194,15 @@ after a failover.
If you still have access to the old primary node, you can follow the
instructions in the
[Upgrading to GitLab 10.5](../replication/updating_the_geo_nodes.md#upgrading-to-gitlab-105)
[Upgrading to GitLab 10.5][updating-geo]
section to resolve the error. Otherwise, the secret is lost and you'll need to
[reset two-factor authentication for all users](../../../security/two_factor_authentication.md#disabling-2fa-for-everyone).
[reset two-factor authentication for all users][sec-tfa].
[gitlab-org&65]: https://gitlab.com/groups/gitlab-org/-/epics/65
[geo-limitations]: ../replication/index.md#current-limitations
[planned-failover]: planned_failover.md
[setup-geo]: ../replication/index.md#setup-instructions
[updating-geo]: ../replication/updating_the_geo_nodes.md#upgrading-to-gitlab-105
[sec-tfa]: ../../../security/two_factor_authentication.md#disabling-2fa-for-everyone
[gitlab-org/omnibus-gitlab#3058]: https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058
[gitlab-org/gitlab-ee#4284]: https://gitlab.com/gitlab-org/gitlab-ee/issues/4284
......@@ -4,7 +4,7 @@ A planned failover is similar to a disaster recovery scenario, except you are ab
to notify users of the maintenance window, and allow data to finish replicating to
secondaries.
Please read this entire document as well as [Disaster Recovery](index.md)
Please read this entire document as well as [Disaster Recovery][disaster-recovery]
before proceeding.
## Notify users of scheduled maintenance
......@@ -13,9 +13,8 @@ On the primary, navigate to **Admin Area > Messages**, add a broadcast message.
You can check under **Admin Area > Geo Nodes** to estimate how long it will
take to finish syncing. An example message would be:
>
A scheduled maintenance will take place at XX:XX UTC. We expect it to take
less than 1 hour.
> A scheduled maintenance will take place at XX:XX UTC. We expect it to take
less than 1 hour.
On the secondary, you may need to clear the cache for the broadcast message
to show up.
......@@ -35,5 +34,7 @@ IP.
## Promote the secondary
Finally, follow the [Disaster Recovery docs](index.md) to promote the secondary
Finally, follow the [Disaster Recovery docs][disaster-recovery] to promote the secondary
to a primary.
[disaster-recovery]: index.md
......@@ -3,14 +3,14 @@
>**Note:**
This is the documentation for the Omnibus GitLab packages. For installations
from source, follow the [**Geo nodes configuration for installations
from source**](configuration_source.md) guide.
from source**][configuration-source] guide.
## Configuring a new secondary node
>**Note:**
This is the final step in setting up a secondary Geo node. Stages of the
setup process must be completed in the documented order.
Before attempting the steps in this stage, [complete all prior stages](index.md#using-omnibus-gitlab).
Before attempting the steps in this stage, [complete all prior stages][setup-geo-omnibus].
The basic steps of configuring a secondary node are to replicate required
configurations between the primary and the secondaries; to configure a tracking
......@@ -19,19 +19,18 @@ database on each secondary; and to start GitLab on the secondary node.
You are encouraged to first read through all the steps before executing them
in your testing/production environment.
>**Notes:**
> **Notes:**
- **Do not** set up any custom authentication on the secondary nodes; this will be
handled by the primary node.
- **Do not** add anything in the secondaries' Geo nodes admin area
(**Admin Area ➔ Geo Nodes**). This is handled solely by the primary node.
- Any change that requires access to the **Admin Area** needs to be done in the
primary node, as the secondary node is a read-only replica.
### Step 1. Manually replicate secret GitLab values
GitLab stores a number of secret values in the `/etc/gitlab/gitlab-secrets.json`
file which *must* match between the primary and secondary nodes. Until there is
a means of automatically replicating these between nodes (see
[issue #3789](https://gitlab.com/gitlab-org/gitlab-ee/issues/3789)), they must
be manually replicated to the secondary.
a means of automatically replicating these between nodes (see issue [gitlab-org/gitlab-ee#3789]),
they must be manually replicated to the secondary.
1. SSH into the **primary** node, and execute the command below:
......@@ -76,21 +75,12 @@ be manually replicated to the secondary.
gitlab-ctl reconfigure
```
Once reconfigured, the secondary will automatically start
replicating missing data from the primary in a process known as backfill.
Meanwhile, the primary node will start to notify the secondary of any changes, so
that the secondary can act on those notifications immediately.
Make sure the secondary instance is
running and accessible. You can login to the secondary node
with the same credentials as used in the primary.
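Because the secrets file from Step 1 must match on both nodes, one way to confirm the copy is to compare a checksum of the file on each node. This is a minimal sketch, assuming GNU coreutils `sha256sum` is installed. The helper name `file_digest` is hypothetical and not a GitLab command.

```shell
# Hypothetical helper: prints only the SHA-256 digest of the given file,
# so the output from two nodes can be compared directly.
file_digest() {
  sha256sum "$1" | awk '{print $1}'
}

# Example (not run here), executed as root on the primary and on the secondary:
#   file_digest /etc/gitlab/gitlab-secrets.json
# The two printed digests should be identical.
```

Comparing digests avoids printing the secret values themselves to the terminal.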
### Step 2. Manually replicate primary SSH host keys
GitLab integrates with the system-installed SSH daemon, designating a user
(typically named git) through which all access requests are handled.
In a [Disaster Recovery](../disaster_recovery/index.md) situation, GitLab system
In a [Disaster Recovery] situation, GitLab system
administrators will promote a secondary Geo replica to a primary and they can
update the DNS records for the primary domain to point to the secondary to prevent
the need to update all references to the primary domain to the secondary domain,
......@@ -140,13 +130,49 @@ keys must be manually replicated to the secondary node.
service ssh restart
```
### Step 3. (Optional) Enabling hashed storage (from GitLab 10.0)
### Step 3. Add the secondary GitLab node
1. Visit the **primary** node's **Admin Area ➔ Geo Nodes**
(`/admin/geo_nodes`) in your browser.
1. Add the secondary node by providing its full URL. **Do NOT** check the box
'This is a primary node'.
1. Optionally, choose which namespaces should be replicated by the
secondary node. Leave blank to replicate all. Read more in
[selective replication](#selective-replication).
1. Click the **Add node** button.
1. SSH into your GitLab **secondary** server and restart the services:
```
gitlab-ctl restart
```
Check if there are any common issues with your Geo setup by running:
```
gitlab-rake gitlab:geo:check
```
1. SSH into your GitLab **primary** server and login as root to verify that the
secondary is reachable and to check for any common issues with your Geo setup:
>**Warning**
Hashed storage is in **Beta**. It is not considered production-ready. See
[Hashed Storage](../../repository_storage_types.md) for more detail,
and for the latest updates, check
[infrastructure issue #2821](https://gitlab.com/gitlab-com/infrastructure/issues/2821).
```
gitlab-rake gitlab:geo:check
```
Once added to the admin panel and restarted, the secondary will automatically start
replicating missing data from the primary in a process known as **backfill**.
Meanwhile, the primary node will start to notify the secondary of any changes, so
that the secondary can act on those notifications immediately.
Make sure the secondary instance is running and accessible.
You can login to the secondary node with the same credentials as used in the primary.
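Services can take a moment to come back after a restart, so a check may fail on the first attempt. The sketch below retries a command a fixed number of times. The name `retry_until` and its arguments are hypothetical, and it assumes the command being retried signals failure through a non-zero exit status, which has not been verified here for `gitlab-rake gitlab:geo:check`.

```shell
# Hypothetical helper: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (not run here): try the Geo check up to 5 times, 30 seconds apart.
#   retry_until 5 30 gitlab-rake gitlab:geo:check
```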
### Step 4. (Optional) Enabling hashed storage (from GitLab 10.0)
CAUTION: **Warning**:
Hashed storage is in **Beta**. It is not considered production-ready. See
[Hashed Storage] for more detail, and for the latest updates, check
infrastructure issue [gitlab-com/infrastructure#2821].
Using hashed storage significantly improves Geo replication - project and group
renames no longer require synchronization between nodes.
......@@ -157,24 +183,24 @@ renames no longer require synchronization between nodes.
![](img/hashed_storage.png)
### Step 4. (Optional) Configuring the secondary to trust the primary
### Step 5. (Optional) Configuring the secondary to trust the primary
You can safely skip this step if your primary uses a CA-issued HTTPS certificate.
If your primary is using a self-signed certificate for *HTTPS* support, you will
need to add that certificate to the secondary's trust store. Retrieve the
certificate from the primary and follow
[these instructions](https://docs.gitlab.com/omnibus/settings/ssl.html)
[these instructions][omnibus-ssl]
on the secondary.
### Step 5. Enable Git access over HTTP/HTTPS
### Step 6. Enable Git access over HTTP/HTTPS
Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone
method to be enabled. Navigate to **Admin Area ➔ Settings**
(`/admin/application_settings`) on the primary node, and set
`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`.
### Step 6. Verify proper functioning of the secondary node
### Step 7. Verify proper functioning of the secondary node
Congratulations! Your secondary geo node is now configured!
......@@ -190,7 +216,7 @@ node's Geo Nodes dashboard in your browser.
![Geo dashboard](img/geo_node_dashboard.png)
If your installation isn't working properly, check the
[troubleshooting document](troubleshooting.md).
[troubleshooting document].
The two most obvious issues that can become apparent in the dashboard are:
......@@ -198,7 +224,7 @@ The two most obvious issues that can become apparent in the dashboard are:
1. Instance to instance notification not working. In that case, it can be
something of the following:
- You are using a custom certificate or custom CA (see the
[troubleshooting document](troubleshooting.md))
[troubleshooting document])
- The instance is firewalled (check your firewall rules)
Please note that disabling a secondary node will stop the sync process.
......@@ -206,7 +232,7 @@ Please note that disabling a secondary node will stop the sync process.
Please note that if `git_data_dirs` is customized on the primary for multiple
repository shards you must duplicate the same configuration on the secondary.
Point your users to the ["Using a Geo Server" guide](using_a_geo_server.md).
Point your users to the ["Using a Geo Server" guide][using-geo].
Currently, this is what is synced:
......@@ -245,3 +271,13 @@ See the [updating the Geo nodes document](updating_the_geo_nodes.md).
## Troubleshooting
See the [troubleshooting document](troubleshooting.md).
[configuration-source]: configuration_source.md
[setup-geo-omnibus]: index.md#using-omnibus-gitlab
[Hashed Storage]: ../../repository_storage_types.md
[Disaster Recovery]: ../disaster_recovery/index.md
[gitlab-org/gitlab-ee#3789]: https://gitlab.com/gitlab-org/gitlab-ee/issues/3789
[gitlab-com/infrastructure#2821]: https://gitlab.com/gitlab-com/infrastructure/issues/2821
[omnibus-ssl]: https://docs.gitlab.com/omnibus/settings/ssl.html
[troubleshooting document]: troubleshooting.md
[using-geo]: using_a_geo_server.md
......@@ -3,14 +3,14 @@
>**Note:**
This is the documentation for installations from source. For installations
using the Omnibus GitLab packages, follow the
[**Omnibus Geo nodes configuration**](configuration.md) guide.
[**Omnibus Geo nodes configuration**][configuration] guide.
## Configuring a new secondary node
>**Note:**
This is the final step in setting up a secondary Geo node. Stages of the setup
process must be completed in the documented order. Before attempting the steps
in this stage, [complete all prior stages](index.md#using-gitlab-installed-from-source).
in this stage, [complete all prior stages][setup-geo-source].
The basic steps of configuring a secondary node are to replicate required
configurations between the primary and the secondaries; to configure a tracking
......@@ -30,8 +30,7 @@ in your testing/production environment.
GitLab stores a number of secret values in the `/home/git/gitlab/config/secrets.yml`
file which *must* match between the primary and secondary nodes. Until there is
a means of automatically replicating these between nodes (see
[issue #3789](https://gitlab.com/gitlab-org/gitlab-ee/issues/3789)), they must
a means of automatically replicating these between nodes (see [gitlab-org/gitlab-ee#3789]), they must
be manually replicated to the secondary.
1. SSH into the **primary** node, and execute the command below:
......@@ -71,12 +70,6 @@ be manually replicated to the secondary.
chmod 0600 /home/git/gitlab/config/secrets.yml
```
1. Restart GitLab for the changes to take effect:
```bash
service gitlab restart
```
Once restarted, the secondary will automatically start replicating missing data
from the primary in a process known as backfill. Meanwhile, the primary node
will start to notify the secondary of any changes, so that the secondary can
......@@ -87,13 +80,50 @@ the secondary node with the same credentials as used in the primary.
### Step 2. Manually replicate primary SSH host keys
Read [Manually replicate primary SSH host keys](configuration.md#step-2-manually-replicate-primary-ssh-host-keys)
Read [Manually replicate primary SSH host keys][configuration-replicate-ssh]
### Step 3. (Optional) Enabling hashed storage (from GitLab 10.0)
### Step 3. Add the secondary GitLab node
Read [Enabling Hashed Storage](configuration.md#step-3-optional-enabling-hashed-storage-from-gitlab-10-0)
1. Visit the **primary** node's **Admin Area ➔ Geo Nodes**
(`/admin/geo_nodes`) in your browser.
1. Add the secondary node by providing its full URL. **Do NOT** check the box
'This is a primary node'.
1. Optionally, choose which namespaces should be replicated by the
secondary node. Leave blank to replicate all. Read more in
[selective replication](#selective-replication).
1. Click the **Add node** button.
1. SSH into your GitLab **secondary** server and restart the services:
### Step 4. (Optional) Configuring the secondary to trust the primary
```bash
service gitlab restart
```
Check if there are any common issues with your Geo setup by running:
```bash
bundle exec rake gitlab:geo:check
```
1. SSH into your GitLab **primary** server and login as root to verify that the
secondary is reachable and to check for any common issues with your Geo setup:
```bash
bundle exec rake gitlab:geo:check
```
Once reconfigured, the secondary will automatically start
replicating missing data from the primary in a process known as backfill.
Meanwhile, the primary node will start to notify the secondary of any changes, so
that the secondary can act on those notifications immediately.
Make sure the secondary instance is running and accessible.
You can login to the secondary node with the same credentials as used in the primary.
### Step 4. (Optional) Enabling hashed storage (from GitLab 10.0)
Read [Enabling Hashed Storage][configuration-hashed-storage]
### Step 5. (Optional) Configuring the secondary to trust the primary
You can safely skip this step if your primary uses a CA-issued HTTPS certificate.
......@@ -109,22 +139,31 @@ cp primary.geo.example.com.crt /usr/local/share/ca-certificates
update-ca-certificates
```
### Step 5. Enable Git access over HTTP/HTTPS
### Step 6. Enable Git access over HTTP/HTTPS
Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone
method to be enabled. Navigate to **Admin Area ➔ Settings**
(`/admin/application_settings`) on the primary node, and set
`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`.
### Step 6. Verify proper functioning of the secondary node
### Step 7. Verify proper functioning of the secondary node
Read [Verify proper functioning of the secondary node](configuration.md#step-6-verify-proper-functioning-of-the-secondary-node).
Read [Verify proper functioning of the secondary node][configuration-verify-node].
## Selective synchronization
Read [Selective synchronization](configuration.md#selective-synchronization).
Read [Selective synchronization][configuration-selective-replication].
## Troubleshooting
Read the [troubleshooting document](troubleshooting.md).
Read the [troubleshooting document][troubleshooting].
[setup-geo-source]: index.md#using-gitlab-installed-from-source
[gitlab-org/gitlab-ee#3789]: https://gitlab.com/gitlab-org/gitlab-ee/issues/3789
[configuration]: configuration.md
[configuration-hashed-storage]: configuration.md#step-4-optional-enabling-hashed-storage-from-gitlab-10-0
[configuration-replicate-ssh]: configuration.md#step-2-manually-replicate-primary-ssh-host-keys
[configuration-selective-replication]: configuration.md#selective-synchronization
[configuration-verify-node]: configuration.md#step-7-verify-proper-functioning-of-the-secondary-node
[troubleshooting]: troubleshooting.md
......@@ -3,7 +3,7 @@
>**Note:**
This is the documentation for the Omnibus GitLab packages. For installations
from source, follow the
[**database replication for installations from source**](database_source.md) guide.
[**database replication for installations from source**][database-source] guide.
>**Note:**
If your GitLab installation uses external PostgreSQL, the Omnibus roles
......@@ -32,8 +32,7 @@ connect to the secondary database servers (which are also read-only).
In database documentation you may see "primary" being referenced as "master"
and "secondary" as either "slave" or "standby" server (read-only).
We recommend using [PostgreSQL replication
slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
We recommend using [PostgreSQL replication slots][replication-slots-article]
to ensure that the primary retains all the data necessary for the secondaries to
recover. See below for more details.
......@@ -89,7 +88,7 @@ The following guide assumes that:
gitlab_rails['db_password'] = 'mypassword'
```
1. Omnibus GitLab already has a [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication)
1. Omnibus GitLab already has a [replication user]
called `gitlab_replicator`. You must set the password for this user manually.
You will be prompted to enter a password:
......@@ -100,6 +99,10 @@ The following guide assumes that:
This command will also read the `postgresql['sql_replication_user']` Omnibus
setting in case you have changed `gitlab_replicator` username to something
else.
If you are using an external database not managed by Omnibus GitLab, you need
to create the replicator user and define a password for it manually.
See the [How to create replication user][database-source-primary] documentation.
1. Configure PostgreSQL to listen on network interfaces
......@@ -128,21 +131,19 @@ The following guide assumes that:
In most cases, the following addresses will be used to configure GitLab
Geo:
| Configuration | Address |
|-----|-----|
| `postgresql['listen_address']` | Primary's private address |
| `postgresql['trust_auth_cidr_addresses']` | Primary's private address |
| `postgresql['md5_auth_cidr_addresses']` | Secondary's public addresses |
| Configuration | Address |
|-----------------------------------------|---------------------------------------------|
| `postgresql['listen_address']` | Primary's public or VPC private address |
| `postgresql['md5_auth_cidr_addresses']` | Secondary's public or VPC private addresses |
If you are using Google Cloud Platform, SoftLayer, or any other vendor that
provides a virtual private cloud you can use the secondary's private
provides a virtual private cloud (VPC) you can use the secondary's private
address (corresponds to "internal address" for Google Cloud Platform) for
`postgresql['md5_auth_cidr_addresses']`.
`postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`.
The `listen_address` option opens PostgreSQL up to network connections
with the interface corresponding to the given address. See [the PostgreSQL
documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html)
for more details.
documentation][pg-docs-runtime-conn] for more details.
Depending on your network configuration, the suggested addresses may not
be correct. If your primary and secondary connect over a local
......@@ -158,14 +159,13 @@ The following guide assumes that:
##
## Primary address
## - replace '1.2.3.4' with the primary private address
## - replace '1.2.3.4' with the primary public or VPC address
##
postgresql['listen_address'] = '1.2.3.4'
postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','1.2.3.4/32']
##
# Secondary addresses
# - replace '5.6.7.8' with the secondary public address
# - replace '5.6.7.8' with the secondary public or VPC address
##
postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32']
......@@ -192,7 +192,7 @@ The following guide assumes that:
You may also want to edit the `wal_keep_segments` and `max_wal_senders` to
match your database replication requirements. Consult the [PostgreSQL -
Replication documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html)
Replication documentation][pg-docs-runtime-replication]
for more information.
1. Save the file and reconfigure GitLab for the database listen changes and
......@@ -241,46 +241,30 @@ The following guide assumes that:
will need it when setting up the secondary! The certificate is not sensitive
data.
### Step 2. Add the secondary GitLab node
To prevent the secondary geo node from trying to act as the primary once the
database is replicated, the secondary geo node must be added on the
primary before the database is replicated.
### Step 2. Configure the secondary server
1. Visit the **primary** node's **Admin Area ➔ Geo Nodes**
(`/admin/geo_nodes`) in your browser.
1. Add the secondary node by providing its full URL. **Do NOT** check the box
'This is a primary node'.
1. Optionally, choose which namespaces should be replicated by the
secondary node. Leave blank to replicate all. Read more in
[selective replication](#selective-replication).
1. Click the **Add node** button.
1. SSH into your GitLab **primary** server and login as root to verify the
secondary is reachable:
1. SSH into your GitLab **secondary** server and login as root:
```
gitlab-rake gitlab:geo:check
sudo -i
```
The new secondary geo node will have the status **Unhealthy**. This is expected
because we have not yet configured the secondary server. This is the next step.
### Step 3. Configure the secondary server
1. SSH into your GitLab **secondary** server and login as root:
1. Stop the application server and Sidekiq:
```
sudo -i
gitlab-ctl stop unicorn
gitlab-ctl stop sidekiq
```
1. [Check TCP connectivity](../../raketasks/maintenance.md) to the
primary's PostgreSQL server:
> **Note**: This step is important so we don't try to execute anything before the node is fully configured.
1. [Check TCP connectivity][rake-maintenance] to the primary's PostgreSQL server:
```bash
gitlab-rake gitlab:tcp_check[1.2.3.4,5432]
```
If this step fails, you may be using the wrong IP address, or a firewall may
> **Note**: If this step fails, you may be using the wrong IP address, or a firewall may
be preventing access to the server. Check the IP address, paying close
attention to the difference between public and private addresses and ensure
that, if a firewall is present, the secondary is permitted to connect to the
......@@ -329,9 +313,8 @@ because we have not yet configured the secondary server. This is the next step.
```ruby
# Secondary addresses
# - replace '5.6.7.8' with the secondary private address
# - replace '5.6.7.8' with the secondary public or VPC address
postgresql['listen_address'] = '5.6.7.8'
postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','5.6.7.8/32']
postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32']
# gitlab database user's password (defined previously)
......@@ -339,11 +322,8 @@ because we have not yet configured the secondary server. This is the next step.
# enable fdw for the geo tracking database
geo_secondary['db_fdw'] = true
```
1. Edit `/etc/gitlab/gitlab.rb` and add the following:
```ruby
# make this a secondary Geo node
geo_secondary_role['enable'] = true
```
......@@ -362,7 +342,7 @@ because we have not yet configured the secondary server. This is the next step.
gitlab-ctl restart postgresql
```
### Step 4. Initiate the replication process
### Step 3. Initiate the replication process
Below we provide a script that connects the database on the secondary node to
the database on the primary node, replicates the database, and creates the
......@@ -408,7 +388,7 @@ data before running `pg_basebackup`.
(e.g., you know the network path is secure, or you are using a site-to-site
VPN). This is **not** safe over the public Internet!
- You can read more details about each `sslmode` in the
[PostgreSQL documentation](https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
[PostgreSQL documentation][pg-docs-ssl];
the instructions above are carefully written to ensure protection against
both passive eavesdroppers and active "man-in-the-middle" attackers.
- Change the `--slot-name` to the name of the replication slot
......@@ -458,7 +438,7 @@ max_replication_slots = 1 # number of secondary instances
hot_standby = on
```
Th `geo_secondary_role` makes configuration changes to `postgresql.conf` and
The `geo_secondary_role` makes configuration changes to `postgresql.conf` and
enables the Geo Log Cursor (`geo_logcursor`) and secondary tracking database
on the secondary. It adds the following PostgreSQL settings for this database to
the default settings:
......@@ -486,8 +466,16 @@ MySQL replication is not supported for Geo.
Read the [troubleshooting document](troubleshooting.md).
[replication-slots-article]: https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75
[pgback]: http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html
[replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication
[external postgresql]: #external-postgresql-instances
[tracking]: database_source.md#enable-tracking-database-on-the-secondary-server
[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
[toc]: index.md#using-omnibus-gitlab
[database-source]: database_source.md
[database-source-primary]: database_source.md#step-1-configure-the-primary-server
[rake-maintenance]: ../../raketasks/maintenance.md
[pg-docs-ssl]: https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION
[pg-docs-runtime-conn]: https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html
[pg-docs-runtime-replication]: https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html
......@@ -3,7 +3,7 @@
>**Note:**
This is the documentation for installations from source. For installations
using the Omnibus GitLab packages, follow the
[**database replication for Omnibus GitLab**](database.md) guide.
[**database replication for Omnibus GitLab**][database] guide.
>**Note:**
The stages of the setup process must be completed in the documented order.
......@@ -26,8 +26,7 @@ connect to secondary database servers (which are read-only too).
In many databases' documentation you will see "primary" referred to as "master"
and "secondary" as either "slave" or "standby" server (read-only).
We recommend using [PostgreSQL replication
slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
We recommend using [PostgreSQL replication slots][replication-slots-article]
to ensure the primary retains all the data necessary for the secondaries to
recover. See below for more details.
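
As a rough sketch of how replication slots can be inspected on the primary, the snippet below filters the rows of the `pg_replication_slots` view for slots that are not active. The `inactive_slots` helper is hypothetical (it is not part of GitLab), and the usage line assumes a source installation where `postgres` is the database superuser:

```bash
#!/bin/bash
# Hypothetical helper (not part of GitLab): reads "slot_name,active" rows on
# stdin and prints the names of the slots whose `active` column is `f`.
inactive_slots() {
  awk -F',' '$2 == "f" { print $1 }'
}

# Assumed usage on the primary of a source installation:
#   sudo -u postgres psql -At -F',' \
#     -c "SELECT slot_name, active FROM pg_replication_slots;" | inactive_slots
```

An inactive slot still makes the primary retain WAL data, so any slot printed here is worth investigating.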
......@@ -60,8 +59,12 @@ The following guide assumes that:
1. Create a [replication user] named `gitlab_replicator`:
```bash
sudo -u postgres psql -c "CREATE USER gitlab_replicator REPLICATION ENCRYPTED PASSWORD 'thepassword';"
```sql
-- Create a new user 'gitlab_replicator'
CREATE USER gitlab_replicator;
-- Set/change the password and grant the replication privilege
ALTER USER gitlab_replicator WITH REPLICATION ENCRYPTED PASSWORD 'replicationpasswordhere';
```
1. Make sure the `gitlab` database user has a password defined
......@@ -149,12 +152,11 @@ The following guide assumes that:
The `listen_address` option opens PostgreSQL up to external connections
with the interface corresponding to the given IP. See [the PostgreSQL
documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html)
for more details.
documentation][pg-docs-runtime-conn] for more details.
You may also want to edit the `wal_keep_segments` and `max_wal_senders` to
match your database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html)
for more information.
match your database replication requirements. Consult the
[PostgreSQL - Replication documentation][pg-docs-runtime-replication] for more information.
1. Set the access control on the primary to allow TCP connections using the
server's public IP and set the connection from the secondary to require a
......@@ -162,8 +164,7 @@ The following guide assumes that:
`/etc/postgresql/9.x/main/pg_hba.conf`):
```bash
host all all 127.0.0.1/32 trust
host all all 1.2.3.4/32 trust
host all all 1.2.3.4/32 md5
host replication gitlab_replicator 5.6.7.8/32 md5
```
......@@ -173,8 +174,7 @@ The following guide assumes that:
address:
```bash
host all all 127.0.0.1/32 trust
host all all 1.2.3.4/32 trust
host all all 1.2.3.4/32 md5
host replication gitlab_replicator 5.6.7.8/32 md5
host replication gitlab_replicator 11.22.33.44/32 md5
```
......@@ -200,13 +200,9 @@ The following guide assumes that:
`netstat -plnt` to make sure that PostgreSQL is listening to the server's
public IP.
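
The `netstat` check above could be scripted along these lines. This is only a sketch: `is_listening` is a hypothetical helper name, and `1.2.3.4` stands in for the primary's public IP as in the examples above:

```bash
#!/bin/bash
# Hypothetical helper: reads `netstat -plnt` output on stdin and succeeds if
# a listener is bound to the given address and port.
is_listening() {
  local address="$1" port="$2"
  # Fixed-string match with surrounding spaces, so that 1.2.3.4:5432 does not
  # also match 11.2.3.4:5432 or 1.2.3.4:54321.
  grep -Fq -- " ${address}:${port} "
}

# Assumed usage on the primary:
#   sudo netstat -plnt | is_listening 1.2.3.4 5432 && echo "PostgreSQL is listening"
```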
### Step 2. Add the secondary GitLab node
Follow the steps in ["add the secondary GitLab node"](database.md#step-2-add-the-secondary-gitlab-node).
### Step 2. Configure the secondary server
### Step 3. Configure the secondary server
Follow the first steps in ["configure the secondary server"](database.md#step-3-configure-the-secondary-server),
Follow the first steps in ["configure the secondary server"][database-replication],
but note that since you are installing from source, the username and
group listed as `gitlab-psql` in those steps should be replaced by `postgres`
instead. After completing the "Test that the `gitlab-psql` user can connect to
......@@ -293,7 +289,7 @@ node.
Then edit `database_geo.yml` to add `fdw: true` to
the `production:` block.
### Step 4. Initiate the replication process
### Step 3. Initiate the replication process
Below we provide a script that connects the database on the secondary node to
the database on the primary node, replicates the database, and creates the
......@@ -315,7 +311,7 @@ data before running `pg_basebackup`.
1. Save the snippet below in a file, let's say `/tmp/replica.sh`. Modify the
embedded paths if necessary:
```bash
```
#!/bin/bash
PORT="5432"
......@@ -332,7 +328,8 @@ data before running `pg_basebackup`.
read SSLMODE
echo Stopping PostgreSQL and all GitLab services
gitlab-ctl stop
sudo service gitlab stop
sudo service postgresql stop
echo Backing up postgresql.conf
sudo -u postgres mv /var/opt/gitlab/postgresql/data/postgresql.conf /var/opt/gitlab/postgresql/
......@@ -356,8 +353,8 @@ data before running `pg_basebackup`.
echo Restoring postgresql.conf
sudo -u postgres mv /var/opt/gitlab/postgresql/postgresql.conf /var/opt/gitlab/postgresql/data/
echo Starting PostgreSQL and all GitLab services
gitlab-ctl start
echo Starting PostgreSQL
sudo service postgresql start
```
1. Run it with:
......@@ -375,7 +372,7 @@ data before running `pg_basebackup`.
**not** safe over the public Internet!
You can read more details about each `sslmode` in the
[PostgreSQL documentation](https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
[PostgreSQL documentation][pg-docs-ssl];
the instructions above are carefully written to ensure protection against
both passive eavesdroppers and active "man-in-the-middle" attackers.
......@@ -389,7 +386,14 @@ MySQL replication is not supported for Geo.
Read the [troubleshooting document](troubleshooting.md).
[replication-slots-article]: https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75
[pgback]: http://www.postgresql.org/docs/9.6/static/app-pgbasebackup.html
[replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication
[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
[toc]: index.md#using-gitlab-installed-from-source
[database]: database.md
[add-geo-node]: configuration.md#step-3-add-the-secondary-gitlab-node
[database-replication]: database.md#step-2-configure-the-secondary-server
[pg-docs-ssl]: https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION
[pg-docs-runtime-conn]: https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html
[pg-docs-runtime-replication]: https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html
# Docker Registry for a secondary node
You can set up a [Docker Registry](https://docs.docker.com/registry/) on your
You can set up a [Docker Registry] on your
secondary Geo node that mirrors the one on the primary Geo node.
## Storage support
CAUTION: **Warning:**
If you use [local storage](../../container_registry.md#container-registry-storage-driver)
If you use [local storage][registry-storage]
for the Container Registry you **cannot** replicate it to the secondary Geo node.
Docker Registry currently supports a few types of storage. If you choose a
distributed storage (`azure`, `gcs`, `s3`, `swift`, or `oss`) for your Docker
Registry on a primary Geo node, you can use the same storage for a secondary
Docker Registry as well. For more information, read the
[Load balancing considerations](https://docs.docker.com/registry/deploying/#load-balancing-considerations)
[Load balancing considerations][registry-load-balancing]
when deploying the Registry, and how to set up the storage driver for GitLab's
integrated [Container Registry](../../container_registry.md#container-registry-storage-driver).
integrated [Container Registry][registry-storage].
[ee]: https://about.gitlab.com/products/
[Docker Registry]: https://docs.docker.com/registry/
[registry-storage]: ../../container_registry.md#container-registry-storage-driver
[registry-load-balancing]: https://docs.docker.com/registry/deploying/#load-balancing-considerations
......@@ -8,7 +8,7 @@ described, it is possible to adapt these instructions to your needs.
![Geo HA Diagram](../../img/high_availability/geo-ha-diagram.png)
_[diagram source - gitlab employees only](https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit)_
_[diagram source - gitlab employees only][diagram-source]_
The topology above assumes that the primary and secondary Geo clusters
are located in two separate locations, on their own virtual network
......@@ -21,7 +21,7 @@ The only external way to access the two Geo deployments is by HTTPS at
`gitlab.us.example.com` and `gitlab.eu.example.com` in the example above.
> **Note:** The primary and secondary Geo deployments must be able to
> communicate to each other over HTTPS.
communicate to each other over HTTPS.
## Redis and PostgreSQL High Availability
......@@ -33,8 +33,7 @@ The two services will instead be configured such that
they will each run on a single machine.
For more information about setting up a highly available PostgreSQL cluster and Redis cluster using the Omnibus package, see the high availability documentation for
[PostgreSQL](../../high_availability/database.md) and
[Redis](../../high_availability/redis.md), respectively.
[PostgreSQL][postgresql-ha] and [Redis][redis-ha], respectively.
From these instructions you will need the following for the examples below:
* `gitlab_rails['db_password']` for the PostgreSQL "DB password"
......@@ -53,18 +52,18 @@ Make sure you have GitLab EE installed using the
On the **primary** backend servers configure the following services:
* [Redis](../../high_availability/redis.md) for high availability.
* [NFS Server](../../high_availability/nfs.md) for repository, LFS, and upload storage.
* [PostgreSQL](../../high_availability/database.md) for high availability.
* [Redis][redis-ha] for high availability.
* [NFS Server][nfs-ha] for repository, LFS, and upload storage.
* [PostgreSQL][postgresql-ha] for high availability.
On the **secondary** backend servers configure the following services:
* [Redis](../../high_availability/redis.md) for high availability.
* [NFS Server](../../high_availability/nfs.md) which will store data that is synchronized from the Geo primary.
* [Redis][redis-ha] for high availability.
* [NFS Server][nfs-ha] which will store data that is synchronized from the Geo primary.
### Step 2: Configure the Postgres services on the Geo Secondary
1. Configure the [secondary Geo PostgreSQL database](database.md)
1. Configure the [secondary Geo PostgreSQL database][database]
as a read-only secondary of the primary Geo PostgreSQL database.
1. Configure the Geo tracking database on the secondary server, to do this modify `/etc/gitlab/gitlab.rb`:
......@@ -82,7 +81,7 @@ On the **secondary** backend servers configure the following services:
NOTE: **Note:**
Be sure that other non-postgresql services are disabled by setting `enable` to `false` in
the [gitlab.rb configuration](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-config-template/gitlab.rb.template).
the [gitlab.rb configuration][gitlab-rb-template].
After making these changes be sure to run `sudo gitlab-ctl reconfigure` so that they take effect.
......@@ -91,7 +90,7 @@ After making these changes be sure to run `sudo gitlab-ctl reconfigure` so that
In this topology there will need to be a load balancer at each geographical location
to route traffic to the application servers.
See the [Load Balancer for GitLab HA](../../high_availability/load_balancer.md)
See the [Load Balancer for GitLab HA][load-balancer-ha]
documentation for more information.
### Step 4: Configure the Geo Frontend Application Servers
......@@ -130,7 +129,8 @@ the addresses of the remote endpoints for PostgreSQL and Redis will need to be s
On the secondary the remote endpoint for the PostgreSQL Geo database will
be specified.
1. Edit `/etc/gitlab/gitlab.rb` and ensure the following to disable PostgreSQL and Redis from running locally. Configure the secondary to connect to the Geo tracking database.
1. Edit `/etc/gitlab/gitlab.rb` and set the following to prevent PostgreSQL and
Redis from running locally. Also configure the secondary to connect to the Geo tracking database.
```ruby
......@@ -164,7 +164,7 @@ be specified.
```
After making these changes [Reconfigure GitLab][] so that they take effect.
After making these changes [Reconfigure GitLab][gitlab-reconfigure] so that they take effect.
On the primary the following GitLab frontend services will be enabled:
......@@ -192,5 +192,11 @@ On the secondary the following GitLab frontend services will be enabled:
Verify these services by running `sudo gitlab-ctl status` on the frontend
application servers.
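
As a small illustration of that verification step, the sketch below lists the services that `gitlab-ctl status` reports as down. The `down_services` helper is hypothetical, and it assumes the usual runit-style status lines (`run: <service>: ...` and `down: <service>: ...`):

```bash
#!/bin/bash
# Hypothetical helper: reads `gitlab-ctl status` output on stdin and prints
# the name of every service reported as down.
down_services() {
  awk '$1 == "down:" { sub(/:$/, "", $2); print $2 }'
}

# Assumed usage on a frontend application server:
#   sudo gitlab-ctl status | down_services
```

No output means every service listed by `gitlab-ctl status` is running.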
[reconfigure GitLab]: ../../restart_gitlab.md#omnibus-gitlab-reconfigure
[restart GitLab]: ../../restart_gitlab.md#omnibus-gitlab-restart
[diagram-source]: https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit
[gitlab-reconfigure]: ../../restart_gitlab.md#omnibus-gitlab-reconfigure
[redis-ha]: ../../high_availability/redis.md
[postgresql-ha]: ../../high_availability/database.md
[nfs-ha]: ../../high_availability/nfs.md
[load-balancer-ha]: ../../high_availability/load_balancer.md
[database]: database.md
[gitlab-rb-template]: https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-config-template/gitlab.rb.template
......@@ -7,7 +7,7 @@
basic Geo features, or latest version for a better experience.
- You should make sure that all nodes run the same GitLab version.
- Geo requires PostgreSQL 9.6 and Git 2.9 in addition to GitLab's usual
[minimum requirements](../../../install/requirements.md)
[minimum requirements][install-requirements]
- Using Geo in combination with High Availability is considered **GA** in GitLab Enterprise Edition 10.4
>**Note:**
......@@ -51,14 +51,13 @@ to reading any data available in the GitLab web interface (see [current limitati
improving speed for distributed teams
- Helps reduce the loading time for automated tasks,
custom integrations and internal workflows
- Quickly failover to a Geo secondary in a
[Disaster Recovery](../disaster_recovery/index.md) scenario
- Allows [planned failover](../disaster_recovery/planned_failover.md) to a Geo secondary
- Quickly failover to a Geo secondary in a [Disaster Recovery][disaster-recovery] scenario
- Allows [planned failover] to a Geo secondary
## Architecture
The following diagram illustrates the underlying architecture of Geo
([source diagram](https://docs.google.com/drawings/d/1Abw0P_H0Ew1-2Lj_xPDRWP87clGIke-1fil7_KQqrtE/edit)).
([source diagram]).
![Geo architecture](img/geo_architecture.png)
......@@ -88,7 +87,7 @@ current version of OpenSSH:
Note that CentOS 6 and 7.0 ship with an old version of OpenSSH that does not
support a feature that Geo requires. See the [documentation on Geo SSH
access](../../operations/fast_ssh_key_lookup.md) for more details.
access][fast-ssh-lookup] for more details.
### LDAP
......@@ -100,7 +99,7 @@ tokens will still work.
Check with your LDAP provider for instructions on how to set up
replication. For example, OpenLDAP provides [these
instructions](https://www.openldap.org/doc/admin24/replication.html).
instructions][ldap-replication].
### Geo Tracking Database
......@@ -143,18 +142,16 @@ If you installed GitLab using the Omnibus packages (highly recommended):
1. [Install GitLab Enterprise Edition][install-ee] on the server that will serve
as the **secondary** Geo node. Do not create an account or login to the new
secondary node.
1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary**
1. [Upload the GitLab License][upload-license] on the **primary**
Geo node to unlock Geo.
1. [Setup the database replication](database.md) (`primary (read-write) <->
1. [Setup the database replication][database] (`primary (read-write) <->
secondary (read-only)` topology).
1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md),
1. [Configure fast lookup of authorized SSH keys in the database][fast-ssh-lookup],
this step is required and needs to be done on both the primary AND secondary nodes.
1. [Configure GitLab](configuration.md) to set the primary and secondary nodes.
1. Optional: [Configure a secondary LDAP server](../../auth/ldap.md)
1. [Configure GitLab][configuration] to set the primary and secondary nodes.
1. Optional: [Configure a secondary LDAP server][config-ldap]
for the secondary. See [notes on LDAP](#ldap).
1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
[install-ee]: https://about.gitlab.com/downloads-ee/ "GitLab Enterprise Edition Omnibus packages downloads page"
1. [Follow the "Using a Geo Server" guide][using-geo].
### Using GitLab installed from source
......@@ -163,50 +160,60 @@ If you installed GitLab from source:
1. [Install GitLab Enterprise Edition][install-ee-source] on the server that
will serve as the **secondary** Geo node. Do not create an account or login
to the new secondary node.
1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary**
1. [Upload the GitLab License][upload-license] on the **primary**
Geo node to unlock Geo.
1. [Setup the database replication](database_source.md) (`primary (read-write)
1. [Setup the database replication][database-source] (`primary (read-write)
<-> secondary (read-only)` topology).
1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md),
1. [Configure fast lookup of authorized SSH keys in the database][fast-ssh-lookup],
do this step for both primary AND secondary nodes.
1. [Configure GitLab](configuration_source.md) to set the primary and secondary
1. [Configure GitLab][configuration-source] to set the primary and secondary
nodes.
1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
[install-ee-source]: https://docs.gitlab.com/ee/install/installation.html "GitLab Enterprise Edition installation from source"
1. [Follow the "Using a Geo Server" guide][using-geo].
## Configuring Geo
Read through the [Geo configuration](configuration.md) documentation.
Read through the [Geo configuration][configuration] documentation.
## Updating the Geo nodes
Read how to [update your Geo nodes to the latest GitLab version](updating_the_geo_nodes.md).
Read how to [update your Geo nodes to the latest GitLab version][updating-geo].
## Configuring Geo HA
Read through the [Geo High Availability documentation](high_availability.md).
Read through the [Geo High Availability documentation][ha].
## Configuring Geo with Object storage
When you have object storage enabled, please consult the
[Geo with Object Storage](object_storage.md) documentation.
[Geo with Object Storage][object-storage] documentation.
## Replicating the Container Registry
## Disaster Recovery
Read how to [replicate the Container Registry](docker_registry.md).
Read through the [Disaster Recovery documentation][disaster-recovery] to learn how to use Geo to mitigate data loss and
restore services in a disaster scenario.
### Replicating the Container Registry
Read how to [replicate the Container Registry][docker-registry].
## Current limitations
- You cannot push code to secondary nodes, see [3912](https://gitlab.com/gitlab-org/gitlab-ee/issues/3912) for details.
> **IMPORTANT**: This list of limitations tracks only the latest version. If you are on an older version,
extra limitations may apply.
- You cannot push code to secondary nodes, see [gitlab-org/gitlab-ee#3912] for details.
- The primary node has to be online for OAuth login to happen (existing sessions and Git are not affected)
- It works for repos, wikis, issues, and merge requests, but it does not work for job logs, artifacts, GitLab Pages, and Docker images of the Container
Registry (by default, but you can configure it separately, see [replicate the Container Registry](docker_registry.md) for details)
- The installation takes multiple manual steps that together can take about an hour depending on circumstances; we are working on improving this experience, see [#2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details.
- It works for repos, wikis, issues, merge requests, file attachments, artifacts, and job logs, but it does not work for
GitLab Pages and Docker images of the Container Registry (by default, but you can configure it separately,
see [replicate the Container Registry][docker-registry] for details).
- The installation takes multiple manual steps that together can take about an hour depending on circumstances; we are
working on improving this experience, see [gitlab-org/omnibus-gitlab#2978] for details.
- Real-time updates of issues/merge requests (e.g. via long polling) don't work on the secondary
- Broadcast messages set on the primary won't be seen on the secondary without a cache flush (e.g. gitlab-rake cache:clear)
## Frequently Asked Questions
Read more in the [Geo FAQ](faq.md).
Read more in the [Geo FAQ][faq].
## Log files
......@@ -225,16 +232,39 @@ This message shows that Geo detected that a repository update was needed for pro
## Security of Geo
Read the [security review](security_review.md) page.
Read the [security review][security-review] page.
## Tuning Geo
Read the [Geo tuning](tuning.md) documentation.
Read the [Geo tuning][tunning] documentation.
## Troubleshooting
Read the [troubleshooting document](troubleshooting.md).
Read the [troubleshooting document][troubleshooting].
[ee]: https://about.gitlab.com/products/ "GitLab Enterprise Edition landing page"
[install-requirements]: ../../../install/requirements.md
[install-ee]: https://about.gitlab.com/downloads-ee/ "GitLab Enterprise Edition Omnibus packages downloads page"
[install-ee-source]: https://docs.gitlab.com/ee/install/installation.html "GitLab Enterprise Edition installation from source"
[disaster-recovery]: ../disaster_recovery/index.md
[planned failover]: ../disaster_recovery/planned_failover.md
[fast-ssh-lookup]: ../../operations/fast_ssh_key_lookup.md
[upload-license]: ../../../user/admin_area/license.md
[database]: database.md
[database-source]: database_source.md
[configuration]: configuration.md
[configuration-source]: configuration_source.md
[config-ldap]: ../../auth/ldap.md
[using-geo]: using_a_geo_server.md
[updating-geo]: updating_the_geo_nodes.md
[ha]: high_availability.md
[object-storage]: object_storage.md
[docker-registry]: docker_registry.md
[faq]: faq.md
[security-review]: security_review.md
[tunning]: tuning.md
[troubleshooting]: troubleshooting.md
[source diagram]: https://docs.google.com/drawings/d/1Abw0P_H0Ew1-2Lj_xPDRWP87clGIke-1fil7_KQqrtE/edit
[ldap-replication]: https://www.openldap.org/doc/admin24/replication.html
[gitlab-org/gitlab-ee#3912]: https://gitlab.com/gitlab-org/gitlab-ee/issues/3912
[gitlab-org/omnibus-gitlab#2978]: https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978
......@@ -66,7 +66,7 @@ be set on the primary database. In GitLab 9.4, we have made this setting
default to 1. You may need to increase this value if you have more Geo
secondary nodes. Be sure to restart PostgreSQL for this to take
effect. See the [PostgreSQL replication
setup](database.md#postgresql-replication) guide for more details.
setup][database-pg-replication] guide for more details.
#### How do I fix the message, "FATAL: could not start WAL streaming: ERROR: replication slot "geo_secondary_my_domain_com" does not exist"?
......@@ -76,8 +76,8 @@ process](database.md) on the secondary.
#### How do I fix the message, "Command exceeded allowed execution time" when setting up replication?
This may happen while [initiating the replication process](database.md#step-4-initiate-the-replication-process) on the Geo secondary, and indicates that your
initial dataset is too large to be replicated in the default timeout (30 minutes).
This may happen while [initiating the replication process][database-start-replication] on the Geo secondary,
and indicates that your initial dataset is too large to be replicated in the default timeout (30 minutes).
Re-run `gitlab-ctl replicate-geo-database`, but include a larger value for
`--backup-timeout`:
......@@ -91,8 +91,8 @@ the default thirty minutes. Adjust as required for your installation.
#### How do I fix the message, "PANIC: could not write to file 'pg_xlog/xlogtemp.123': No space left on device"
Determine if you have any unused replication slots in the primary database. This can cause large amounts of log data to build up in `pg_xlog`.
Removing the unused slots can reduce the amount of space used in the `pg_xlog`.
Determine if you have any unused replication slots in the primary database. This can cause large amounts of
log data to build up in `pg_xlog`. Removing the unused slots can reduce the amount of space used in the `pg_xlog`.
1. Start a PostgreSQL console session:
......@@ -100,7 +100,8 @@ Removing the unused slots can reduce the amount of space used in the `pg_xlog`.
sudo gitlab-psql gitlabhq_production
```
Note that using `gitlab-rails dbconsole` will not work, because managing replication slots requires superuser permissions.
> Note that using `gitlab-rails dbconsole` will not work, because managing replication slots requires
superuser permissions.
2. View your replication slots with
......@@ -111,9 +112,10 @@ Removing the unused slots can reduce the amount of space used in the `pg_xlog`.
Slots where `active` is `f` are not active.
- When this slot should be active, because you have a secondary configured using that slot,
log in to that secondary and check the PostgreSQL logs to see why replication is not running.
log in to that secondary and check the PostgreSQL logs to see why replication is not running.
- If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it with the following in the PostgreSQL console session:
- If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it with the following in the
PostgreSQL console session:
```sql
SELECT pg_drop_replication_slot('name_of_extra_slot');
......@@ -139,3 +141,6 @@ sudo gitlab-ctl reconfigure
This will increase the timeout to three hours (10800 seconds). Choose a time
long enough to accommodate a full clone of your largest repositories.
[database-start-replication]: database.md#step-3-initiate-the-replication-process
[database-pg-replication]: database.md#postgresql-replication
......@@ -13,5 +13,7 @@ but this may not lead to a more downloads in parallel unless the number of
available Sidekiq threads is also increased. For example, if repository sync
capacity is increased from 25 to 50, you may also want to increase the number
of Sidekiq threads from 25 to 50. See the [Sidekiq concurrency
documentation](../../operations/extra_sidekiq_processes.html#concurrency)
documentation][sidekiq-concurrency]
for more details.
[sidekiq-concurrency]: ../../operations/extra_sidekiq_processes.html#concurrency
......@@ -34,7 +34,7 @@ If you do not perform this step, you may find that two-factor authentication
To prevent SSH requests to the newly promoted primary node from failing
due to SSH host key mismatch when updating the primary domain's DNS record
you should perform the step to [Manually replicate primary SSH host keys](configuration.md#step-2-manually-replicate-primary-ssh-host-keys) in each
you should perform the step to [Manually replicate primary SSH host keys][configuration-replicate-ssh] in each
secondary node.
## Upgrading to GitLab 10.4
......@@ -49,7 +49,7 @@ In GitLab 10.2, synchronizing secondaries over SSH was deprecated. In 10.3,
support is removed entirely. All installations will switch to the HTTP/HTTPS
cloning method instead. Before upgrading, ensure that all your Geo nodes are
configured to use this method and that it works for your installation. In
particular, ensure that [Git access over HTTP/HTTPS is enabled](configuration.md#step-5-enable-git-access-over-http-https).
particular, ensure that [Git access over HTTP/HTTPS is enabled][configuration-git-over-http].
Synchronizing repositories over the public Internet using HTTP is insecure, so
you should ensure that you have HTTPS configured before upgrading. Note that
......@@ -63,7 +63,7 @@ Support for TLS-secured PostgreSQL replication has been added. If you are
currently using PostgreSQL replication across the open internet without an
external means of securing the connection (e.g., a site-to-site VPN), then you
should immediately reconfigure your primary and secondary PostgreSQL instances
according to the [updated instructions](#database.md).
according to the [updated instructions][database].
If you *are* securing the connections externally and wish to continue doing so,
ensure you include the new option `--sslmode=prefer` in future invocations of
......@@ -96,10 +96,9 @@ secondary if ever promoted to a primary:
### Hashed Storage
>**Warning**
CAUTION: **Warning:**
Hashed storage is in **Alpha**. It is considered experimental and not
production-ready. See [Hashed
Storage](../../repository_storage_types.md) for more detail.
production-ready. See [Hashed Storage] for more detail.
If you previously enabled Hashed Storage and migrated all your existing
projects to Hashed Storage, disabling hashed storage will not migrate projects
......@@ -108,19 +107,17 @@ migrated we recommend leaving Hashed Storage enabled.
## Upgrading to GitLab 10.1
>**Warning**
CAUTION: **Warning:**
Hashed storage is in **Alpha**. It is considered experimental and not
production-ready. See [Hashed
Storage](../../repository_storage_types.md) for more detail.
production-ready. See [Hashed Storage] for more detail.
[Hashed storage](../../repository_storage_types.md) was introduced
in GitLab 10.0, and a [migration path](../../raketasks/storage.md)
[Hashed storage] was introduced in GitLab 10.0, and a [migration path][hashed-migration]
for existing repositories was added in GitLab 10.1.
## Upgrading to GitLab 10.0
Since GitLab 10.0, we require all **Geo** systems to [use SSH key lookups via
the database](../../operations/fast_ssh_key_lookup.md) to avoid having to maintain consistency of the
the database][ssh-fast-lookup] to avoid having to maintain consistency of the
`authorized_keys` file for SSH access. Failing to do this will prevent users
from being able to clone via SSH.
......@@ -158,10 +155,12 @@ instructions below.
When in doubt, it does not hurt to do a resync. The easiest way to do this in
Omnibus is the following:
1. Install GitLab on the primary server
1. Make sure you have Omnibus GitLab on the primary server
1. Run `gitlab-ctl reconfigure` and `gitlab-ctl restart postgresql`. This will enable replication slots on the primary database.
1. Check the steps about defining `postgresql['sql_user_password']` and `gitlab_rails['db_password']`.
1. Make sure `postgresql['max_replication_slots']` matches the number of secondary Geo nodes.
1. Install GitLab on the secondary server.
1. Re-run the [database replication process](database.md#step-3-initiate-the-replication-process).
1. Re-run the [database replication process][database-replication].
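
As a rough illustration of the settings named in the steps above, the primary's `/etc/gitlab/gitlab.rb` might contain entries like the following. This is only a sketch: the values are placeholders rather than values from this document, and a single secondary node is assumed. Refer to the database replication instructions for the values to use in your installation.

```ruby
# /etc/gitlab/gitlab.rb on the primary (illustrative fragment; all values are placeholders)

# Password for the database user; the database setup steps describe the expected format
postgresql['sql_user_password'] = '<placeholder>'

# Password the GitLab Rails application uses to connect to the database
gitlab_rails['db_password'] = '<placeholder>'

# One replication slot per secondary Geo node (assuming a single secondary here)
postgresql['max_replication_slots'] = 1
```

Changes to this file take effect after running `gitlab-ctl reconfigure` and `gitlab-ctl restart postgresql`.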
## Special update notes for 9.0.x
......@@ -255,7 +254,7 @@ is prepended with the relevant node for better clarity:
```
1. **[secondary]** Create the `replica.sh` script as described in the
[database configuration document](database.md#step-3-initiate-the-replication-process).
[database configuration document][database-source-replication].
1. **[secondary]** Run the recovery script using the credentials from the
previous step:
......@@ -315,3 +314,11 @@ and it is required since 10.0.
1. Repeat this step for every secondary node
[update]: ../../../update/README.md
[database]: database.md
[database-replication]: database.md#step-3-initiate-the-replication-process
[database-source-replication]: database_source.md#step-3-initiate-the-replication-process
[Hashed Storage]: ../../repository_storage_types.md
[hashed-migration]: ../../raketasks/storage.md
[ssh-fast-lookup]: ../../operations/fast_ssh_key_lookup.md
[configuration-replicate-ssh]: configuration.md#step-2-manually-replicate-primary-ssh-host-keys
[configuration-git-over-http]: configuration.md#step-6-enable-git-access-over-http-https