Move geo docs to administration/geo/replication

Commit 0c047594 authored Feb 15, 2018 by James Ramsay
34 changed files
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
+# Geo configuration
+
+>**Note:**
+This is the documentation for the Omnibus GitLab packages. For installations
+from source, follow the [**Geo nodes configuration for installations
+from source**](configuration_source.md) guide.
+
+## Configuring a new secondary node
+
+>**Note:**
+This is the final step in setting up a secondary Geo node. Stages of the
+setup process must be completed in the documented order.
+Before attempting the steps in this stage, [complete all prior stages](README.md#using-omnibus-gitlab).
+
+The basic steps of configuring a secondary node are to replicate required
+configurations between the primary and the secondaries; to configure a tracking
+database on each secondary; and to start GitLab on the secondary node.
+
+You are encouraged to first read through all the steps before executing them
+in your testing/production environment.
+
+>**Notes:**
+- **Do not** set up any custom authentication on the secondary nodes; this will be
+  handled by the primary node.
+- **Do not** add anything in the secondary's Geo nodes admin area
+  (**Admin Area ➔ Geo Nodes**). This is handled solely by the primary node.
+
+### Step 1. Manually replicate secret GitLab values
+
+GitLab stores a number of secret values in the `/etc/gitlab/gitlab-secrets.json`
+file which *must* match between the primary and secondary nodes. Until there is
+a means of automatically replicating these between nodes (see
+[issue #3789](https://gitlab.com/gitlab-org/gitlab-ee/issues/3789)), they must
+be manually replicated to the secondary.
+
+1. SSH into the **primary** node, and execute the command below:
+
+    ```bash
+    sudo cat /etc/gitlab/gitlab-secrets.json
+    ```
+
+    This will display the secrets that need to be replicated, in JSON format.
+
+1. SSH into the **secondary** node and log in as the `root` user:
+
+    ```
+    sudo -i
+    ```
+
+1. Make a backup of any existing secrets:
+
+    ```bash
+    mv /etc/gitlab/gitlab-secrets.json /etc/gitlab/gitlab-secrets.json.`date +%F`
+    ```
+
+1. Copy `/etc/gitlab/gitlab-secrets.json` from the primary to the secondary, or
+   copy-and-paste the file contents between nodes:
+
+    ```bash
+    sudo editor /etc/gitlab/gitlab-secrets.json
+
+    # paste the output of the `cat` command you ran on the primary
+    # save and exit
+    ```
+
+1. Ensure the file permissions are correct:
+
+    ```bash
+    chown root:root /etc/gitlab/gitlab-secrets.json
+    chmod 0600 /etc/gitlab/gitlab-secrets.json
+    ```
+
+1. Reconfigure the secondary node for the change to take effect:
+
+    ```
+    gitlab-ctl reconfigure
+    ```
+
+Once reconfigured, the secondary will automatically start
+replicating missing data from the primary in a process known as backfill.
+Meanwhile, the primary node will start to notify the secondary of any changes, so
+that the secondary can act on those notifications immediately.
+
+Make sure the secondary instance is
+running and accessible. You can log in to the secondary node
+with the same credentials as used on the primary.
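+
+As a quick check that the copy in Step 1 succeeded, the checksum of the secrets
+file can be compared across nodes. The following is a minimal sketch, not part
+of GitLab: `file_sha256` and `secrets_match` are hypothetical helper names, and
+it assumes `sha256sum` (GNU coreutils) is available on both nodes.
+
+```bash
+# Print only the SHA-256 digest of a file (hypothetical helper).
+file_sha256() {
+  sha256sum "$1" | awk '{print $1}'
+}
+
+# Succeed when the local file's digest equals the digest printed on the primary.
+# Usage: secrets_match <local-file> <digest-from-primary>
+secrets_match() {
+  [ "$(file_sha256 "$1")" = "$2" ]
+}
+```
+
+For example, run `file_sha256 /etc/gitlab/gitlab-secrets.json` on the primary,
+then `secrets_match /etc/gitlab/gitlab-secrets.json <digest>` on the secondary.
+A non-zero exit status means the files differ.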
+
+### Step 2. Manually replicate primary SSH host keys
+
+GitLab integrates with the system-installed SSH daemon, designating a user
+(typically named `git`) through which all access requests are handled.
+
+In a [Disaster Recovery](../disaster_recovery/index.md) situation, GitLab system
+administrators will promote a secondary Geo replica to a primary. They can then
+update the DNS records for the primary domain to point to the secondary, which
+avoids having to update every reference to the primary domain, such as Git
+remotes and API URLs.
+
+Without further preparation, all SSH requests to the newly promoted primary node
+would then fail due to an SSH host key mismatch. To prevent this, the primary SSH host
+keys must be manually replicated to the secondary node.
+
+1. SSH into the **secondary** node and log in as the `root` user:
+
+    ```
+    sudo -i
+    ```
+
+1. Make a backup of any existing SSH host keys:
+
+    ```bash
+    find /etc/ssh -iname 'ssh_host_*' -exec cp {} {}.backup.`date +%F` \;
+    ```
+
+1. SSH into the **primary** node, and execute the command below:
+
+    ```bash
+    sudo find /etc/ssh -iname 'ssh_host_*' -not -iname '*.pub'
+    ```
+
+1. For each file in that list, copy the file from the primary node to
+   the **same** location on your **secondary** node, replacing the existing file.
+
+1. On your **secondary** node, ensure the file permissions are correct:
+
+    ```bash
+    chown root:root /etc/ssh/ssh_host_*
+    chmod 0600 /etc/ssh/ssh_host_*
+    ```
+
+1. Regenerate the public keys from the private keys:
+
+    ```bash
+    find /etc/ssh -iname 'ssh_host_*' -not -iname '*.pub' -not -iname '*.backup*' -exec sh -c 'ssh-keygen -y -f "{}" > "{}.pub"' \;
+    ```
+
+1. Restart sshd:
+
+    ```bash
+    service ssh restart
+    ```
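+
+Because `sshd` ignores private host keys whose permissions are too open, it can
+help to confirm the file modes before restarting. A minimal sketch, assuming
+GNU `find`; `list_loose_host_keys` is a hypothetical helper name:
+
+```bash
+# Print every ssh_host_* file in the given directory whose mode is not 0600.
+# The pattern is quoted so the shell passes it to find unexpanded.
+list_loose_host_keys() {
+  find "$1" -maxdepth 1 -type f -iname 'ssh_host_*' ! -perm 600
+}
+```
+
+Running `list_loose_host_keys /etc/ssh` on the secondary should print nothing
+once the `chmod` step in Step 2 has been applied.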
+
+### Step 3. (Optional) Enabling hashed storage (from GitLab 10.0)
+
+>**Warning:**
+Hashed storage is in **Beta**. It is not considered production-ready. See
+[Hashed Storage](../repository_storage_types.md) for more detail,
+and for the latest updates, check
+[infrastructure issue #2821](https://gitlab.com/gitlab-com/infrastructure/issues/2821).
+
+Using hashed storage significantly improves Geo replication - project and group
+renames no longer require synchronization between nodes.
+
+1. Visit the **primary** node's **Admin Area ➔ Settings**
+   (`/admin/application_settings`) in your browser
+1. In the `Repository Storages` section, check `Create new projects using hashed storage paths`:
+
+    ![](img/hashed-storage.png)
+
+### Step 4. (Optional) Configuring the secondary to trust the primary
+
+You can safely skip this step if your primary uses a CA-issued HTTPS certificate.
+
+If your primary is using a self-signed certificate for *HTTPS* support, you will
+need to add that certificate to the secondary's trust store. Retrieve the
+certificate from the primary and follow
+[these instructions](https://docs.gitlab.com/omnibus/settings/ssl.html)
+on the secondary.
+
+### Step 5. Enable Git access over HTTP/HTTPS
+
+Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone
+method to be enabled. Navigate to **Admin Area ➔ Settings**
+(`/admin/application_settings`) on the primary node, and set
+`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`.
+
+### Step 6. Verify proper functioning of the secondary node
+
+Congratulations! Your secondary Geo node is now configured!
+
+You can log in to the secondary node with the same credentials you used on the
+primary. Visit the secondary node's **Admin Area ➔ Geo Nodes**
+(`/admin/geo_nodes`) in your browser to check if it's correctly identified as a
+secondary Geo node and if Geo is enabled.
+
+The initial replication, or 'backfill', will probably still be in progress. You
+can monitor the synchronization process on each Geo node from the primary
+node's Geo Nodes dashboard in your browser.
+
+![Geo dashboard](img/geo-node-dashboard.png)
+
+If your installation isn't working properly, check the
+[troubleshooting document](troubleshooting.md).
+
+The two most obvious issues that can become apparent in the dashboard are:
+
+1. Database replication not working well.
+1. Instance-to-instance notification not working. In that case, the cause can be
+   one of the following:
+     - You are using a custom certificate or custom CA (see the
+       [troubleshooting document](troubleshooting.md)).
+     - The instance is firewalled (check your firewall rules).
+
+Please note that disabling a secondary node will stop the sync process.
+
+Please note that if `git_data_dirs` is customized on the primary for multiple
+repository shards you must duplicate the same configuration on the secondary.
+
+Point your users to the ["Using a Geo Server" guide](using_a_geo_server.md).
+
+Currently, this is what is synced:
+
+* Git repositories
+* Wikis
+* LFS objects
+* Issues, merge requests, snippets, and comment attachments
+* Users, groups, and project avatars
+
+## Selective synchronization
+
+Geo supports selective synchronization, which allows admins to choose
+which projects should be synchronized by secondary nodes.
+
+It is important to note that selective synchronization does not:
+
+1. Restrict permissions from secondary nodes.
+1. Hide project metadata from secondary nodes.
+   * Since Geo currently relies on PostgreSQL replication, all project metadata
+     gets replicated to secondary nodes, but repositories that have not been
+     selected will be empty.
+1. Reduce the number of events generated for the Geo event log.
+   * The primary generates events as long as any secondaries are present.
+     Selective synchronization restrictions are implemented on the secondaries,
+     not the primary.
+
+A subset of projects can be chosen, either by group or by storage shard. The
+former is ideal for replicating data belonging to a subset of users, while the
+latter is more suited to progressively rolling out Geo to a large GitLab
+instance.
+
+## Upgrading Geo
+
+See the [updating the Geo nodes document](updating_the_geo_nodes.md).
+
+## Troubleshooting
+
+See the [troubleshooting document](troubleshooting.md).
--- a/doc/administration/geo/replication/configuration_source.md
+++ b/doc/administration/geo/replication/configuration_source.md
+# Geo configuration
+
+>**Note:**
+This is the documentation for installations from source. For installations
+using the Omnibus GitLab packages, follow the
+[**Omnibus Geo nodes configuration**](configuration.md) guide.
+
+## Configuring a new secondary node
+
+>**Note:**
+This is the final step in setting up a secondary Geo node. Stages of the setup
+process must be completed in the documented order. Before attempting the steps
+in this stage, [complete all prior stages](README.md#using-gitlab-installed-from-source).
+
+The basic steps of configuring a secondary node are to replicate required
+configurations between the primary and the secondaries; to configure a tracking
+database on each secondary; and to start GitLab on the secondary node.
+
+You are encouraged to first read through all the steps before executing them
+in your testing/production environment.
+
+
+>**Notes:**
+- **Do not** set up any custom authentication on the secondary nodes; this will be
+  handled by the primary node.
+- **Do not** add anything in the secondary's Geo nodes admin area
+  (**Admin Area ➔ Geo Nodes**). This is handled solely by the primary node.
+
+### Step 1. Manually replicate secret GitLab values
+
+GitLab stores a number of secret values in the `/home/git/gitlab/config/secrets.yml`
+file which *must* match between the primary and secondary nodes. Until there is
+a means of automatically replicating these between nodes (see
+[issue #3789](https://gitlab.com/gitlab-org/gitlab-ee/issues/3789)), they must
+be manually replicated to the secondary.
+
+1. SSH into the **primary** node, and execute the command below:
+
+    ```bash
+    sudo cat /home/git/gitlab/config/secrets.yml
+    ```
+
+    This will display the secrets that need to be replicated, in YAML format.
+
+1. SSH into the **secondary** node and log in as the `git` user:
+
+    ```bash
+    sudo -i -u git
+    ```
+
+1. Make a backup of any existing secrets:
+
+    ```bash
+    mv /home/git/gitlab/config/secrets.yml /home/git/gitlab/config/secrets.yml.`date +%F`
+    ```
+
+1. Copy `/home/git/gitlab/config/secrets.yml` from the primary to the secondary, or
+   copy-and-paste the file contents between nodes:
+
+    ```bash
+    sudo editor /home/git/gitlab/config/secrets.yml
+
+    # paste the output of the `cat` command you ran on the primary
+    # save and exit
+    ```
+
+1. Ensure the file permissions are correct:
+
+    ```bash
+    chown git:git /home/git/gitlab/config/secrets.yml
+    chmod 0600 /home/git/gitlab/config/secrets.yml
+    ```
+
+1. Restart GitLab for the changes to take effect:
+
+    ```bash
+    service gitlab restart
+    ```
+
+Once restarted, the secondary will automatically start replicating missing data
+from the primary in a process known as backfill. Meanwhile, the primary node
+will start to notify the secondary of any changes, so that the secondary can
+act on those notifications immediately.
+
+Make sure the secondary instance is running and accessible. You can log in to
+the secondary node with the same credentials as used on the primary.
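+
+The backup command in Step 1 can be wrapped so that it never overwrites an
+earlier backup made on the same day. A minimal sketch; `backup_with_date` is a
+hypothetical helper name, not a GitLab command:
+
+```bash
+# Move a file to <file>.<YYYY-MM-DD>, refusing to clobber an existing backup.
+backup_with_date() {
+  target="$1.$(date +%F)"
+  if [ -e "$target" ]; then
+    echo "backup already exists: $target" >&2
+    return 1
+  fi
+  mv "$1" "$target"
+}
+```
+
+For example, `backup_with_date /home/git/gitlab/config/secrets.yml` behaves like
+the `mv` command in Step 1, but returns a non-zero status instead of replacing
+a backup that already exists.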
+
+### Step 2. Manually replicate primary SSH host keys
+
+Read [Manually replicate primary SSH host keys](configuration.md#step-2-manually-replicate-primary-ssh-host-keys)
+
+### Step 3. (Optional) Enabling hashed storage (from GitLab 10.0)
+
+Read [Enabling Hashed Storage](configuration.md#step-3-optional-enabling-hashed-storage-from-gitlab-10-0)
+
+### Step 4. (Optional) Configuring the secondary to trust the primary
+
+You can safely skip this step if your primary uses a CA-issued HTTPS certificate.
+
+If your primary is using a self-signed certificate for *HTTPS* support, you will
+need to add that certificate to the secondary's trust store. Retrieve the
+certificate from the primary and follow your distribution's instructions for
+adding it to the secondary's trust store. In Debian/Ubuntu, for example, with a
+certificate file of `primary.geo.example.com.crt`, you would follow these steps:
+
+```
+sudo -i
+cp primary.geo.example.com.crt /usr/local/share/ca-certificates
+update-ca-certificates
+```
+
+### Step 5. Enable Git access over HTTP/HTTPS
+
+Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone
+method to be enabled. Navigate to **Admin Area ➔ Settings**
+(`/admin/application_settings`) on the primary node, and set
+`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`.
+
+### Step 6. Verify proper functioning of the secondary node
+
+Read [Verify proper functioning of the secondary node](configuration.md#step-6-verify-proper-functioning-of-the-secondary-node).
+
+
+## Selective synchronization
+
+Read [Selective synchronization](configuration.md#selective-synchronization).
+
+## Troubleshooting
+
+Read the [troubleshooting document](troubleshooting.md).
--- a/doc/administration/geo/replication/database.md
+++ b/doc/administration/geo/replication/database.md
+# Geo database replication
+
+>**Note:**
+This is the documentation for the Omnibus GitLab packages. For installations
+from source, follow the
+[**database replication for installations from source**](database_source.md) guide.
+
+>**Note:**
+If your GitLab installation uses external PostgreSQL, the Omnibus roles
+will not be able to perform all necessary configuration steps. Refer to the
+section on [External PostgreSQL][external postgresql] for additional instructions.
+
+>**Note:**
+The stages of the setup process must be completed in the documented order.
+Before attempting the steps in this stage, [complete all prior stages][toc].
+
+This document describes the minimal steps you have to take in order to
+replicate your primary GitLab database to a secondary node's database. You may
+have to change some values according to your database setup, how big it is, etc.
+
+You are encouraged to first read through all the steps before executing them
+in your testing/production environment.
+
+
+## PostgreSQL replication
+
+The GitLab primary node where the write operations happen will connect to
+the primary database server, and the secondary nodes which are read-only will
+connect to the secondary database servers (which are also read-only).
+
+>**Note:**
+In database documentation you may see "primary" being referenced as "master"
+and "secondary" as either "slave" or "standby" server (read-only).
+
+We recommend using [PostgreSQL replication
+slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
+to ensure that the primary retains all the data necessary for the secondaries to
+recover. See below for more details.
+
+The following guide assumes that:
+
+- You are using Omnibus and therefore you are using PostgreSQL 9.6 or later
+  which includes the [`pg_basebackup` tool][pgback] and improved
+  [Foreign Data Wrapper][FDW] support.
+- You have a primary node already set up (the GitLab server you are
+  replicating from), running Omnibus' PostgreSQL (or equivalent version), and
+  you have a new secondary server set up with the same versions of the OS,
+  PostgreSQL, and GitLab on all nodes.
+- The IP of the primary server for our examples will be `1.2.3.4`, whereas the
+  secondary's IP will be `5.6.7.8`. Note that the primary and secondary servers
+  **must** be able to communicate over these addresses. More on this in the
+  guide below.
+
+
+### Step 1. Configure the primary server
+
+1. SSH into your GitLab **primary** server and log in as root:
+
+    ```bash
+    sudo -i
+    ```
+
+1. Execute the command below to define the node as primary Geo node:
+
+    ```bash
+    gitlab-ctl set-geo-primary-node
+    ```
+
+    This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`.
+
+1. GitLab 10.4 and up only: Do the following to make sure the `gitlab` database user has a password defined.
+
+    Generate an MD5 hash of the desired password:
+
+    ```bash
+    gitlab-ctl pg-password-md5 gitlab
+    # Enter password: mypassword
+    # Confirm password: mypassword
+    # fca0b89a972d69f00eb3ec98a5838484
+    ```
+
+    Edit `/etc/gitlab/gitlab.rb`:
+
+    ```ruby
+    # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab`
+    postgresql['sql_user_password'] = 'fca0b89a972d69f00eb3ec98a5838484'
+
+    # If you have HA setup, this must be present in all nodes as well
+    gitlab_rails['db_password'] = 'mypassword'
+    ```
+
+1. Omnibus GitLab already has a [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication)
+   called `gitlab_replicator`. You must set the password for this user manually.
+   You will be prompted to enter a password:
+
+    ```bash
+    gitlab-ctl set-replication-password
+    ```
+
+    This command will also read the `postgresql['sql_replication_user']` Omnibus
+    setting in case you have changed the `gitlab_replicator` username to something
+    else.
+
+1. Configure PostgreSQL to listen on network interfaces
+
+    For security reasons, PostgreSQL does not listen on any network interfaces
+    by default. However, Geo requires the secondary to be able to
+    connect to the primary's database. For this reason, we need the address of
+    each node. Note: For external PostgreSQL instances, see [additional instructions][external postgresql].
+
+    If you are using a cloud provider, you can look up the addresses for each
+    Geo node through your cloud provider's management console.
+
+    To look up the address of a Geo node, SSH into the Geo node and execute:
+
+    ```bash
+    ##
+    ## Private address
+    ##
+    ip route get 255.255.255.255 | awk '{print "Private address:", $NF; exit}'
+
+    ##
+    ## Public address
+    ##
+    echo "External address: $(curl ipinfo.io/ip)"
+    ```
+
+    In most cases, the following addresses will be used to configure GitLab
+    Geo:
+
+    | Configuration | Address |
+    |-----|-----|
+    | `postgresql['listen_address']` | Primary's private address |
+    | `postgresql['trust_auth_cidr_addresses']` | Primary's private address |
+    | `postgresql['md5_auth_cidr_addresses']` | Secondary's public addresses |
+
+    If you are using Google Cloud Platform, SoftLayer, or any other vendor that
+    provides a virtual private cloud you can use the secondary's private
+    address (corresponds to "internal address" for Google Cloud Platform) for
+    `postgresql['md5_auth_cidr_addresses']`.
+
+    The `listen_address` option opens PostgreSQL up to network connections
+    with the interface corresponding to the given address. See [the PostgreSQL
+    documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html)
+    for more details.
+
+    Depending on your network configuration, the suggested addresses may not
+    be correct. If your primary and secondary connect over a local
+    area network, or a virtual network connecting availability zones like
+    [Amazon's VPC](https://aws.amazon.com/vpc/) or [Google's VPC](https://cloud.google.com/vpc/)
+    you should use the secondary's private address for `postgresql['md5_auth_cidr_addresses']`.
+
+    Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
+    addresses with addresses appropriate to your network configuration:
+
+    ```ruby
+    geo_primary_role['enable'] = true
+
+    ##
+    ## Primary address
+    ## - replace '1.2.3.4' with the primary private address
+    ##
+    postgresql['listen_address'] = '1.2.3.4'
+    postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','1.2.3.4/32']
+
+    ##
+    ## Secondary addresses
+    ## - replace '5.6.7.8' with the secondary public address
+    ##
+    postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32']
+
+    ##
+    ## Replication settings
+    ## - set this to be the number of Geo secondary nodes you have
+    ##
+    postgresql['max_replication_slots'] = 1
+    # postgresql['max_wal_senders'] = 10
+    # postgresql['wal_keep_segments'] = 10
+
+    ##
+    ## Disable automatic database migrations temporarily
+    ## (until PostgreSQL is restarted and listening on the private address).
+    ##
+    gitlab_rails['auto_migrate'] = false
+    ```
+
+1. Optional: If you want to add another secondary, the relevant setting would look like:
+
+    ```ruby
+    postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32','9.10.11.12/32']
+    ```
+
+    You may also want to edit the `wal_keep_segments` and `max_wal_senders` to
+    match your database replication requirements. Consult the [PostgreSQL -
+    Replication documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html)
+    for more information.
+
+1. Save the file and reconfigure GitLab for the database listen changes and
+   the replication slot changes to be applied.
+
+    ```bash
+    gitlab-ctl reconfigure
+    ```
+
+    Restart PostgreSQL for its changes to take effect:
+
+    ```bash
+    gitlab-ctl restart postgresql
+    ```
+
+1. Re-enable migrations now that PostgreSQL is restarted and listening on the
+   private address.
+
+    Edit `/etc/gitlab/gitlab.rb` and **change** the configuration to `true`:
+
+    ```ruby
+    gitlab_rails['auto_migrate'] = true
+    ```
+
+    Save the file and reconfigure GitLab:
+
+    ```bash
+    gitlab-ctl reconfigure
+    ```
+
+1. Now that the PostgreSQL server is set up to accept remote connections, run
+   `netstat -plnt | grep 5432` to make sure that PostgreSQL is listening on port
+   `5432` on the primary server's private address.
+
+1. A certificate was automatically generated when GitLab was reconfigured. This
+   will be used automatically to protect your PostgreSQL traffic from
+   eavesdroppers, but to protect against active ("man-in-the-middle") attackers,
+   the secondary needs a copy of the certificate. Make a copy of the PostgreSQL
+    `server.crt` file on the primary node by running this command:
+
+    ```bash
+    cat ~gitlab-psql/data/server.crt
+    ```
+
+    Copy the output into a clipboard or into a local file. You
+    will need it when setting up the secondary! The certificate is not sensitive
+    data.
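+
+The `netstat` check in Step 1 can be scripted. A minimal sketch, assuming the
+column layout printed by `netstat -plnt` on Linux (local address in the fourth
+column, state in the sixth); `is_listening` is a hypothetical helper name:
+
+```bash
+# Read `netstat -plnt` output on stdin and succeed if a socket is in LISTEN
+# state on exactly <address>:<port>.
+# Usage: netstat -plnt | is_listening 1.2.3.4 5432
+is_listening() {
+  awk -v want="$1:$2" '$4 == want && $6 == "LISTEN" { found = 1 } END { exit !found }'
+}
+```
+
+The match is exact on purpose: if PostgreSQL were listening on all interfaces,
+`netstat` would typically report `0.0.0.0:5432`, which this check would not
+accept as the primary's private address.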
+
+### Step 2. Add the secondary GitLab node
+
+To prevent the secondary Geo node from trying to act as the primary once the
+database is replicated, the secondary Geo node must be added on the
+primary before the database is replicated.
+
+1. Visit the **primary** node's **Admin Area ➔ Geo Nodes**
+   (`/admin/geo_nodes`) in your browser.
+1. Add the secondary node by providing its full URL. **Do NOT** check the box
+   'This is a primary node'.
+1. Optionally, choose which namespaces should be replicated by the
+   secondary node. Leave blank to replicate all. Read more in
+   [selective synchronization](configuration.md#selective-synchronization).
+1. Click the **Add node** button.
+1. SSH into your GitLab **primary** server and log in as root to verify the
+   secondary is reachable:
+
+    ```
+    gitlab-rake gitlab:geo:check
+    ```
+
+The new secondary Geo node will have the status **Unhealthy**. This is expected
+because we have not yet configured the secondary server. This is the next step.
+
+### Step 3. Configure the secondary server
+
+1. SSH into your GitLab **secondary** server and log in as root:
+
+    ```
+    sudo -i
+    ```
+
+1. [Check TCP connectivity](../raketasks/maintenance.md) to the
+   primary's PostgreSQL server:
+
+    ```bash
+    gitlab-rake gitlab:tcp_check[1.2.3.4,5432]
+    ```
+
+    If this step fails, you may be using the wrong IP address, or a firewall may
+    be preventing access to the server. Check the IP address, paying close
+    attention to the difference between public and private addresses and ensure
+    that, if a firewall is present, the secondary is permitted to connect to the
+    primary on port 5432.
+
+1. Create a file `server.crt` in the secondary server, with the content you got on the last step of the primary setup:
+
+    ```
+    editor server.crt
+    ```
+
+
+1. Set up PostgreSQL TLS verification on the secondary
+
+    Install the `server.crt` file:
+
+    ```bash
+    install -D -o gitlab-psql -g gitlab-psql -m 0400 -T server.crt ~gitlab-psql/.postgresql/root.crt
+    ```
+
+    PostgreSQL will now only recognize that exact certificate when verifying TLS
+    connections. The certificate can only be replicated by someone with access
+    to the private key, which is **only** present on the primary node.
+
+1. Test that the `gitlab-psql` user can connect to the primary's database:
+
+    ```bash
+    sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql --list -U gitlab_replicator -d "dbname=gitlabhq_production sslmode=verify-ca" -W -h 1.2.3.4
+    ```
+
+    When prompted, enter the password you set in the first step for the
+    `gitlab_replicator` user. If all worked correctly, you should see
+    the list of the primary's databases.
+
+    A failure to connect here indicates that the TLS configuration is incorrect.
+    Ensure that the contents of `~gitlab-psql/data/server.crt` on the primary
+    match the contents of `~gitlab-psql/.postgresql/root.crt` on the secondary.
+
+1. Configure PostgreSQL to enable FDW support
+
+    This step is similar to how we configured the primary instance.
+    The settings below are required for FDW support, even when using a single node.
+
+    Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
+    addresses with addresses appropriate to your network configuration:
+
+    ```ruby
+    # Secondary addresses
+    # - replace '5.6.7.8' with the secondary private address
+    postgresql['listen_address'] = '5.6.7.8'
+    postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','5.6.7.8/32']
+    postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32']
+
+    # gitlab database user's password (defined previously)
+    gitlab_rails['db_password'] = 'mypassword'
+
+    # enable fdw for the geo tracking database
+    geo_secondary['db_fdw'] = true
+    ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the following:
+
+    ```ruby
+    geo_secondary_role['enable'] = true
+    ```
+
+    For external PostgreSQL instances, [see additional instructions][external postgresql].
+    If you bring a former primary back online to serve as a secondary, then you also need to remove `geo_primary_role['enable'] = true`.
+
+1. Reconfigure GitLab for the changes to take effect:
+
+    ```bash
+    gitlab-ctl reconfigure
+    ```
+
+1. Restart PostgreSQL for its changes to take effect:
+
+    ```bash
+    gitlab-ctl restart postgresql
+    ```
+
+### Step 4. Initiate the replication process
+
+Below we provide a script that connects the database on the secondary node to
+the database on the primary node, replicates the database, and creates the
+needed files for streaming replication.
+
+The directories used are the defaults that are set up in Omnibus. If you have
+changed any defaults or are using a source installation, adjust the
+directories and paths accordingly.
+
+>**Warning:**
+Make sure to run this on the **secondary** server as it removes all of PostgreSQL's
+data before running `pg_basebackup`.
+
+1. SSH into your GitLab **secondary** server and log in as root:
+
+    ```
+    sudo -i
+    ```
+
+1. Choose a database-friendly name for your secondary to use as the
+   replication slot name. For example, if your domain is
+   `secondary.geo.example.com`, you may use `secondary_example` as the slot
+   name as shown in the commands below.
+
+1. Execute the command below to start a backup/restore and begin the replication.
+
+    >**Warning:** Each Geo secondary must have its own unique replication slot name.
+    Using the same slot name between two secondaries will break PostgreSQL replication.
+
+    ```bash
+    gitlab-ctl replicate-geo-database --slot-name=secondary_example --host=1.2.3.4
+    ```
+
+    When prompted, enter the _plaintext_ password you set up for the `gitlab_replicator`
+    user in the first step.
+
+    This command also takes a number of additional options. You can use `--help`
+    to list them all, but here are a couple of tips:
+       - If PostgreSQL is listening on a non-standard port, add `--port=` as well.
+       - If your database is too large to be transferred in 30 minutes, you will need
+         to increase the timeout, e.g., `--backup-timeout=3600` if you expect the
+         initial replication to take under an hour.
+       - Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether
+         (e.g., you know the network path is secure, or you are using a site-to-site
+         VPN). This is **not** safe over the public Internet!
+       - You can read more details about each `sslmode` in the
+         [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
+         the instructions above are carefully written to ensure protection against
+         both passive eavesdroppers and active "man-in-the-middle" attackers.
+       - Change the `--slot-name` to the name of the replication slot
+         to be used on the primary database. The script will attempt to create the
+         replication slot automatically if it does not exist.
+       - If you're repurposing an old server into a Geo secondary, you'll need to
+         add `--force` to the command line.
+       - On a non-production machine, you can skip the backup step by adding
+         `--skip-backup`, if you are really sure this is what you want.
+
+1. Verify that the secondary is configured correctly and that the primary is
+   reachable:
+
+    ```
+    gitlab-rake gitlab:geo:check
+    ```
+
+The replication process is now complete.
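+
+PostgreSQL replication slot names may contain only lowercase letters, digits,
+and underscores, so a domain name cannot be used as a slot name as-is. A
+minimal sketch of deriving a per-secondary slot name from its domain;
+`slot_name_for` is a hypothetical helper, and mapping every other character to
+`_` is just one possible convention:
+
+```bash
+# Lowercase the input and replace every character outside [a-z0-9_] with "_".
+slot_name_for() {
+  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9_' '_'
+}
+```
+
+For example, `slot_name_for secondary.geo.example.com` prints
+`secondary_geo_example_com`, which could then be passed as `--slot-name`.
+Deriving the name from each secondary's own domain also helps keep slot names
+unique across secondaries.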
+
+### External PostgreSQL instances
+
+For installations using external PostgreSQL instances, the `geo_primary_role`
+and `geo_secondary_role` include configuration changes that must be applied
+manually.
+
+The `geo_primary_role` makes configuration changes to `pg_hba.conf` and
+`postgresql.conf` on the primary:
+
+```
+##
+## Geo Primary
+## - pg_hba.conf
+##
+host    replication gitlab_replicator <trusted secondary IP>/32     md5
+```
+
+```
+##
+## Geo Primary Role
+## - postgresql.conf
+##
+sql_replication_user = gitlab_replicator
+wal_level = hot_standby
+max_wal_senders = 10
+wal_keep_segments = 50
+max_replication_slots = 1 # number of secondary instances
+hot_standby = on
+```
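+
+With several secondaries, the primary's `pg_hba.conf` needs one replication
+entry per secondary address. A minimal sketch that prints such entries in the
+same format as the `pg_hba.conf` example in this section;
+`pg_hba_replication_lines` is a hypothetical helper, and `gitlab_replicator` is
+assumed to be the replication user:
+
+```bash
+# Print one pg_hba.conf replication entry per IP address given as an argument.
+pg_hba_replication_lines() {
+  for ip in "$@"; do
+    printf 'host    replication gitlab_replicator %s/32     md5\n' "$ip"
+  done
+}
+```
+
+For example, `pg_hba_replication_lines 5.6.7.8 9.10.11.12` prints two entries
+that can be reviewed and then appended to `pg_hba.conf` by hand. Remember to
+raise `max_replication_slots` to match the number of secondaries.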
+
+The `geo_secondary_role` makes configuration changes to `postgresql.conf` and
+enables the Geo Log Cursor (`geo_logcursor`) and secondary tracking database
+on the secondary. It adds the following PostgreSQL settings for this database
+to the default settings:
+
+```
+##
+## Geo Secondary Role
+## - postgresql.conf
+##
+wal_level = hot_standby
+max_wal_senders = 10
+wal_keep_segments = 10
+hot_standby = on
+```
+
+Geo secondary nodes use a tracking database to keep track of replication
+status and recover automatically from some replication issues. Follow the
+instructions for [enabling tracking database on the secondary server][tracking].
+
+## MySQL replication
+
+MySQL replication is not supported for Geo.
+
+## Troubleshooting
+
+Read the [troubleshooting document](troubleshooting.md).
+
+[pgback]: http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html
+[external postgresql]: #external-postgresql-instances
+[tracking]: database_source.md#enable-tracking-database-on-the-secondary-server
+[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
+[toc]: README.md#using-omnibus-gitlab
--- a/doc/administration/geo/replication/database_source.md
+++ b/doc/administration/geo/replication/database_source.md
+# Geo database replication
+
+>**Note:**
+This is the documentation for installations from source. For installations
+using the Omnibus GitLab packages, follow the
+[**database replication for Omnibus GitLab**](database.md) guide.
+
+>**Note:**
+The stages of the setup process must be completed in the documented order.
+Before attempting the steps in this stage, [complete all prior stages][toc].
+
+This document describes the minimal steps you have to take in order to
+replicate your primary GitLab database to a secondary node's database. You may
+have to change some values according to your database setup, how big it is, etc.
+
+You are encouraged to first read through all the steps before executing them
+in your testing/production environment.
+
+## PostgreSQL replication
+
+The GitLab primary node, where the write operations happen, will connect to
+the primary database server, and the secondary nodes, which are read-only, will
+connect to secondary database servers (which are read-only too).
+
+>**Note:**
+In many databases' documentation, you will see "primary" referred to as "master"
+and "secondary" as either a "slave" or "standby" server (read-only).
+
+We recommend using [PostgreSQL replication
+slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
+to ensure the primary retains all the data necessary for the secondaries to
+recover. See below for more details.
+
+The following guide assumes that:
+
+- You are using PostgreSQL 9.6 or later which includes the
+  [`pg_basebackup` tool][pgback] and improved [Foreign Data Wrapper][FDW] support.
+- You have a primary node already set up (the GitLab server you are
+  replicating from), running PostgreSQL 9.6 or later, and
+  you have a new secondary server set up with the same versions of the OS,
+  PostgreSQL, and GitLab on all nodes.
+- The IP of the primary server for our examples will be `1.2.3.4`, whereas the
+  secondary's IP will be `5.6.7.8`. Note that the primary and secondary servers
+  **must** be able to communicate over these addresses. These IP addresses can either
+  be public or private.
+
+### Step 1. Configure the primary server
+
+1. SSH into your GitLab **primary** server and log in as root:
+
+    ```bash
+    sudo -i
+    ```
+
+1. Add this node as the Geo primary by running:
+
+    ```bash
+    bundle exec rake geo:set_primary_node
+    ```
+
+1. Create a [replication user] named `gitlab_replicator`:
+
+    ```bash
+    sudo -u postgres psql -c "CREATE USER gitlab_replicator REPLICATION ENCRYPTED PASSWORD 'thepassword';"
+    ```
+    
+1. Make sure the `gitlab` database user has a password defined:
+
+    ```bash
+    sudo -u postgres psql -d template1 -c "ALTER USER gitlab WITH ENCRYPTED PASSWORD 'mydatabasepassword';"
+    ```
+    
+1. Edit the content of `database.yml` in `production:` and add the password as in the example below:
+
+    ```yaml
+    #
+    # PRODUCTION
+    #
+    production:
+      adapter: postgresql
+      encoding: unicode
+      database: gitlabhq_production
+      pool: 10
+      username: gitlab
+      password: mydatabasepassword
+      host: /var/opt/gitlab/geo-postgresql
+    ```
+
+1. Set up TLS support for the PostgreSQL primary server
+
+    > **Warning**: Only skip this step if you **know** that PostgreSQL traffic
+    > between the primary and secondary will be secured through some other
+    > means, e.g., a known-safe physical network path or a site-to-site VPN that
+    > you have configured.
+
+    If you are replicating your database across the open Internet, it is
+    **essential** that the connection is TLS-secured. Correctly configured, this
+    provides protection against both passive eavesdroppers and active
+    "man-in-the-middle" attackers.
+
+    To generate a self-signed certificate and key, run this command:
+
+    ```bash
+    openssl req -nodes -batch -x509 -newkey rsa:4096 -keyout server.key -out server.crt -days 3650
+    ```
+
+    This will create two files - `server.key` and `server.crt` - that you can
+    use for authentication.
+
+    Copy them to the correct location for your PostgreSQL installation:
+
+    ```bash
+    # Copying a self-signed certificate and key
+    install -o postgres -g postgres -m 0400 -T server.crt ~postgres/9.x/main/data/server.crt
+    install -o postgres -g postgres -m 0400 -T server.key ~postgres/9.x/main/data/server.key
+    ```
+
+    Add this configuration to `postgresql.conf`, removing any existing
+    configuration for `ssl_cert_file` or `ssl_key_file`:
+
+    ```
+    ssl = on
+    ssl_cert_file='server.crt'
+    ssl_key_file='server.key'
+    ```
+
+1. Edit `postgresql.conf` to configure the primary server for streaming replication
+   (for Debian/Ubuntu that would be `/etc/postgresql/9.x/main/postgresql.conf`):
+
+    ```
+    listen_addresses = '1.2.3.4'
+    wal_level = hot_standby
+    max_wal_senders = 5
+    min_wal_size = 80MB
+    max_wal_size = 1GB
+    max_replication_slots = 1 # Number of Geo secondary nodes
+    wal_keep_segments = 10
+    hot_standby = on
+    ```
+
+    Be sure to set `max_replication_slots` to the number of Geo secondary
+    nodes that you may potentially have (at least 1).
+
+    For security reasons, PostgreSQL by default only listens on the local
+    interface (e.g. 127.0.0.1). However, Geo needs to communicate
+    between the primary and secondary nodes over a common network, such as a
+    corporate LAN or the public Internet. For this reason, we need to
+    configure PostgreSQL to listen on more interfaces.
+
+    The `listen_addresses` option opens PostgreSQL up to external connections
+    on the interface corresponding to the given IP. See [the PostgreSQL
+    documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html)
+    for more details.
+
+    You may also want to edit the `wal_keep_segments` and `max_wal_senders` to
+    match your database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html)
+    for more information.
+
+1. Set the access control on the primary to allow TCP connections using the
+   server's public IP and set the connection from the secondary to require a
+   password.  Edit `pg_hba.conf` (for Debian/Ubuntu that would be
+   `/etc/postgresql/9.x/main/pg_hba.conf`):
+
+    ```bash
+    host    all             all                      127.0.0.1/32    trust
+    host    all             all                      1.2.3.4/32      trust
+    host    replication     gitlab_replicator        5.6.7.8/32      md5
+    ```
+
+    Where `1.2.3.4` is the public IP address of the primary server, and `5.6.7.8`
+    the public IP address of the secondary one. If you want to add another
+    secondary, add one more row like the replication one and change the IP
+    address:
+
+    ```bash
+    host    all             all                      127.0.0.1/32    trust
+    host    all             all                      1.2.3.4/32      trust
+    host    replication     gitlab_replicator        5.6.7.8/32      md5
+    host    replication     gitlab_replicator        11.22.33.44/32  md5
+    ```
+
+1. Restart PostgreSQL for the changes to take effect.
+
+1. Choose a database-friendly name for your secondary to use as the
+   replication slot name. For example, if your domain is
+   `secondary.geo.example.com`, you may use `secondary_example` as the slot
+   name.
+
+1. Create the replication slot on the primary:
+
+    ```bash
+    $ sudo -u postgres psql -c "SELECT * FROM pg_create_physical_replication_slot('secondary_example');"
+      slot_name         | xlog_position
+      ------------------+---------------
+      secondary_example |
+      (1 row)
+    ```
+
+1. Now that the PostgreSQL server is set up to accept remote connections, run
+   `netstat -plnt` to make sure that PostgreSQL is listening to the server's
+   public IP.
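+
+One possible way to derive a database-friendly slot name from the secondary's
+domain is sketched below. PostgreSQL replication slot names may contain only
+lowercase letters, numbers, and underscores. The helper name
+`slot_name_from_fqdn` is hypothetical, and its result is longer than the
+hand-picked `secondary_example` used above.
+
+```bash
+#!/bin/bash
+
+# Lowercase the domain and replace every character that is not a lowercase
+# letter, digit, or underscore with an underscore.
+slot_name_from_fqdn() {
+  printf '%s' "$1" \
+    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
+    | LC_ALL=C tr -c 'a-z0-9_' '_'
+}
+
+# Example with the domain used in this guide.
+slot_name_from_fqdn 'secondary.geo.example.com'
+echo
+```
+
+Whatever name you settle on, use the same one when creating the slot on the
+primary and when configuring the secondary.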
+
+### Step 2. Add the secondary GitLab node
+
+Follow the steps in ["add the secondary GitLab node"](database.md#step-2-add-the-secondary-gitlab-node).
+
+### Step 3. Configure the secondary server
+
+Follow the first steps in ["configure the secondary server"](database.md#step-3-configure-the-secondary-server),
+but note that since you are installing from source, the username and
+group listed as `gitlab-psql` in those steps should be replaced by `postgres`
+instead. After completing the "Test that the `gitlab-psql` user can connect to
+the primary's database" step, continue here:
+
+1. Edit `postgresql.conf` to configure the secondary for streaming replication
+   (for Debian/Ubuntu that would be `/etc/postgresql/9.*/main/postgresql.conf`):
+
+    ```bash
+    wal_level = hot_standby
+    max_wal_senders = 5
+    max_wal_size = 480MB  # replaces checkpoint_segments = 10, which was removed in PostgreSQL 9.5
+    wal_keep_segments = 10
+    hot_standby = on
+    ```
+
+1. Restart PostgreSQL for the changes to take effect.
+
+#### Enable tracking database on the secondary server
+
+Geo secondary nodes use a tracking database to keep track of replication status
+and recover automatically from some replication issues. Follow the steps below to create
+the tracking database.
+
+1. On the secondary node, run the following command to create `database_geo.yml` with the
+information of your secondary PostgreSQL instance:
+
+    ```bash
+    sudo cp /home/git/gitlab/config/database_geo.yml.postgresql /home/git/gitlab/config/database_geo.yml
+    ```
+
+1. Edit the content of `database_geo.yml` in `production:` as in the example below:
+
+    ```yaml
+    #
+    # PRODUCTION
+    #
+    production:
+      adapter: postgresql
+      encoding: unicode
+      database: gitlabhq_geo_production
+      pool: 10
+      username: gitlab_geo
+      # password:
+      host: /var/opt/gitlab/geo-postgresql
+    ```
+
+1. Create the database `gitlabhq_geo_production` on the PostgreSQL instance of the secondary
+node.
+
+1. Set up the Geo tracking database:
+
+    ```bash
+    bundle exec rake geo:db:migrate
+    ```
+
+1. Configure the [PostgreSQL FDW][FDW] connection and credentials:
+
+    Save the script below in a file, for example `/tmp/geo_fdw.sh`, and modify
+    the connection parameters to match your environment.
+    
+    ```bash
+    #!/bin/bash
+ 
+    # Secondary Database connection params:
+    DB_HOST="/var/opt/gitlab/postgresql"
+    DB_NAME="gitlabhq_production"
+    DB_USER="gitlab"
+    DB_PORT="5432"
+    
+    # Tracking Database connection params:
+    GEO_DB_HOST="/var/opt/gitlab/geo-postgresql"
+    GEO_DB_NAME="gitlabhq_geo_production"
+    GEO_DB_USER="gitlab_geo"
+    GEO_DB_PORT="5432"
+ 
+    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE EXTENSION postgres_fdw;"
+    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '${DB_HOST}', dbname '${DB_NAME}', port '${DB_PORT}' );"
+    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}');"
+    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE SCHEMA gitlab_secondary;"
+    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO ${GEO_DB_USER};"
+    ```
+
+    Then edit `database_geo.yml` and add `fdw: true` to the `production:`
+    block.
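+
+Before running the FDW script, it can help to preview the SQL it will send.
+The following is a dry-run sketch that uses the same connection parameters;
+the function name `fdw_statements` is an assumption, and nothing here connects
+to a database.
+
+```bash
+#!/bin/bash
+
+# Secondary database connection params (same values as in the FDW script).
+DB_HOST="/var/opt/gitlab/postgresql"
+DB_NAME="gitlabhq_production"
+DB_USER="gitlab"
+DB_PORT="5432"
+
+# Tracking database user.
+GEO_DB_USER="gitlab_geo"
+
+# Print the FDW setup statements, one per line, without executing them.
+fdw_statements() {
+  cat <<EOF
+CREATE EXTENSION postgres_fdw;
+CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '${DB_HOST}', dbname '${DB_NAME}', port '${DB_PORT}');
+CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}');
+CREATE SCHEMA gitlab_secondary;
+GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO ${GEO_DB_USER};
+EOF
+}
+
+fdw_statements
+```
+
+Note that `${VAR}` expands a shell variable, whereas `$(VAR)` would try to run
+a command named `VAR` and substitute its output.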
+
+### Step 4. Initiate the replication process
+
+Below we provide a script that connects the database on the secondary node to
+the database on the primary node, replicates the database, and creates the
+needed files for streaming replication.
+
+The directories used are the defaults for Debian/Ubuntu. If you have changed
+any defaults, replace the directories and paths accordingly.
+
+>**Warning:**
+Make sure to run this on the **secondary** server as it removes all PostgreSQL's
+data before running `pg_basebackup`.
+
+1. SSH into your GitLab **secondary** server and log in as root:
+
+    ```bash
+    sudo -i
+    ```
+
+1. Save the snippet below in a file, let's say `/tmp/replica.sh`. Modify the
+   embedded paths if necessary:
+
+    ```bash
+    #!/bin/bash
+
+    PORT="5432"
+    USER="gitlab_replicator"
+    echo ---------------------------------------------------------------
+    echo WARNING: Make sure this script is run from the secondary server
+    echo ---------------------------------------------------------------
+    echo
+    echo Enter the IP or FQDN of the primary PostgreSQL server
+    read HOST
+    echo Enter the password for $USER@$HOST
+    read -s PASSWORD
+    echo Enter the required sslmode
+    read SSLMODE
+
+    echo Stopping PostgreSQL and all GitLab services
+    gitlab-ctl stop
+
+    echo Backing up postgresql.conf
+    sudo -u postgres mv /var/opt/gitlab/postgresql/data/postgresql.conf /var/opt/gitlab/postgresql/
+
+    echo Cleaning up old cluster directory
+    sudo -u postgres rm -rf /var/opt/gitlab/postgresql/data
+    rm -f /tmp/postgresql.trigger
+
+    echo Starting base backup as the replicator user
+    echo Enter the password for $USER@$HOST
+    sudo -u postgres /opt/gitlab/embedded/bin/pg_basebackup -h $HOST -D /var/opt/gitlab/postgresql/data -U gitlab_replicator -v -x -P
+
+    echo Writing recovery.conf file
+    sudo -u postgres bash -c "cat > /var/opt/gitlab/postgresql/data/recovery.conf <<- _EOF1_
+      standby_mode = 'on'
+      primary_conninfo = 'host=$HOST port=$PORT user=$USER password=$PASSWORD sslmode=$SSLMODE'
+      trigger_file = '/tmp/postgresql.trigger'
+    _EOF1_
+    "
+
+    echo Restoring postgresql.conf
+    sudo -u postgres mv /var/opt/gitlab/postgresql/postgresql.conf /var/opt/gitlab/postgresql/data/
+
+    echo Starting PostgreSQL and all GitLab services
+    gitlab-ctl start
+    ```
+
+1. Run it with:
+
+    ```bash
+    bash /tmp/replica.sh
+    ```
+
+    When prompted, enter the IP/FQDN of the primary, and the password you set up
+    for the `gitlab_replicator` user in the first step.
+
+    You should use `verify-ca` for the `sslmode`. You can use `disable` if you
+    are happy to skip PostgreSQL TLS authentication altogether (e.g., you know
+    the network path is secure, or you are using a site-to-site VPN). This is
+    **not** safe over the public Internet!
+
+    You can read more details about each `sslmode` in the
+    [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
+    the instructions above are carefully written to ensure protection against
+    both passive eavesdroppers and active "man-in-the-middle" attackers.
+
+The replication process is now complete.
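+
+Because `replica.sh` writes whatever `sslmode` is typed at the prompt straight
+into `recovery.conf`, a small guard like the following could catch typos
+first. The function name `valid_sslmode` is hypothetical; the accepted values
+are the six modes listed in the PostgreSQL libpq documentation.
+
+```bash
+#!/bin/bash
+
+# Succeed only if the argument is one of the sslmode values libpq accepts.
+valid_sslmode() {
+  case "$1" in
+    disable|allow|prefer|require|verify-ca|verify-full) return 0 ;;
+    *) return 1 ;;
+  esac
+}
+
+# Example: one valid value and one typo.
+for mode in verify-ca verfy-ca; do
+  if valid_sslmode "$mode"; then
+    echo "$mode: ok"
+  else
+    echo "$mode: not a valid sslmode"
+  fi
+done
+```
+
+In the script, such a check would fit right after `read SSLMODE`, before any
+services are stopped.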
+
+## MySQL replication
+
+MySQL replication is not supported for Geo.
+
+## Troubleshooting
+
+Read the [troubleshooting document](troubleshooting.md).
+
+[pgback]: http://www.postgresql.org/docs/9.6/static/app-pgbasebackup.html
+[replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication
+[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
+[toc]: README.md#using-gitlab-installed-from-source
--- a/doc/administration/geo/replication/docker_registry.md
+++ b/doc/administration/geo/replication/docker_registry.md
+# Docker Registry for a secondary node
+
+You can set up a [Docker Registry](https://docs.docker.com/registry/) on your
+secondary Geo node that mirrors the one on the primary Geo node.
+
+## Storage support
+
+CAUTION: **Warning:**
+If you use [local storage](../container_registry.md#container-registry-storage-driver)
+for the Container Registry you **cannot** replicate it to the secondary Geo node.
+
+Docker Registry currently supports a few types of storage. If you choose a
+distributed storage (`azure`, `gcs`, `s3`, `swift`, or `oss`) for your Docker
+Registry on a primary Geo node, you can use the same storage for a secondary
+Docker Registry as well. For more information, read the
+[Load balancing considerations](https://docs.docker.com/registry/deploying/#load-balancing-considerations)
+when deploying the Registry, and how to set up the storage driver for GitLab's
+integrated [Container Registry](../container_registry.md#container-registry-storage-driver).
+
+[ee]: https://about.gitlab.com/products/
--- a/doc/administration/geo/replication/faq.md
+++ b/doc/administration/geo/replication/faq.md
+# Geo Frequently Asked Questions
+
+## Can I use Geo in a disaster recovery situation?
+
+Yes, but there are limitations to what we replicate (see
+[What data is replicated to a secondary node?](#what-data-is-replicated-to-a-secondary-node)).
+
+Read the documentation for [Disaster Recovery](../disaster_recovery/index.md).
+
+## What data is replicated to a secondary node?
+
+We currently replicate project repositories, LFS objects, generated
+attachments / avatars and the whole database. This means user accounts,
+issues, merge requests, groups, project data, etc., will be available for
+query. We currently don't replicate artifact data (`shared/folder`).
+
+## Can I git push to a secondary node?
+
+No. All write operations (including `git push`) must be performed on your
+primary node.
+
+## How long does it take to have a commit replicated to a secondary node?
+
+All replication operations are asynchronous and are queued to be dispatched in
+a batched request every 10 minutes. Besides that, it depends on a lot of other
+factors including the amount of traffic, how big your commit is, the
+connectivity between your nodes, your hardware, etc.
+
+## What if the SSH server runs at a different port?
+
+We send the clone URL from the primary server to any secondaries, so it
+doesn't matter. If the primary is running on port `2200`, the clone URL will
+reflect that.
+
+## Is this possible to set up a Docker Registry for a secondary node that mirrors the one on a primary node?
+
+Yes. See [Docker Registry for a secondary Geo node](docker_registry.md).
--- a/doc/administration/geo/replication/high_availability.md
+++ b/doc/administration/geo/replication/high_availability.md
+# Geo High Availability
+
+This document describes a minimal reference architecture for running Geo
+in a high availability configuration. If your HA setup differs from the one
+described, it is possible to adapt these instructions to your needs.
+
+## Architecture overview
+
+![Geo HA Diagram](../img/high_availability/geo-ha-diagram.png)
+
+_[diagram source - gitlab employees only](https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit)_
+
+The topology above assumes that the primary and secondary Geo clusters
+are located in two separate locations, on their own virtual network
+with private IP addresses. The network is configured such that all machines within
+one geographic location can communicate with each other using their private IP addresses.
+The IP addresses given are examples and may be different depending on the
+network topology of your deployment.
+
+The only external way to access the two Geo deployments is by HTTPS at
+`gitlab.us.example.com` and `gitlab.eu.example.com` in the example above.
+
+> **Note:** The primary and secondary Geo deployments must be able to
+> communicate to each other over HTTPS.
+
+## Redis and PostgreSQL High Availability
+
+The primary and secondary Redis and PostgreSQL should be configured
+for high availability.  Because of the additional complexity involved
+in setting up this configuration for PostgreSQL and Redis
+it is not covered by this Geo HA documentation.
+The two services will instead be configured such that
+they will each run on a single machine.
+
+For more information about setting up a highly available PostgreSQL cluster and Redis cluster using the Omnibus package, see the high availability documentation for
+[PostgreSQL](../high_availability/database.md) and
+[Redis](../high_availability/redis.md), respectively.
+
+From these instructions you will need the following for the examples below:
+* `gitlab_rails['db_password']` for the PostgreSQL "DB password"
+* `redis['password']` for the Redis "Redis password"
+
+NOTE: **Note:**
+It is possible to use cloud hosted services for PostgreSQL and Redis but this is beyond the scope of this document.
+
+### Prerequisites
+
+Make sure you have GitLab EE installed using the
+[Omnibus package](https://about.gitlab.com/installation).
+
+
+### Step 1: Configure the Geo Backend Services
+
+On the **primary** backend servers configure the following services:
+
+* [Redis](../high_availability/redis.md) for high availability.
+* [NFS Server](../high_availability/nfs.md) for repository, LFS, and upload storage.
+* [PostgreSQL](../high_availability/database.md) for high availability.
+
+On the **secondary** backend servers configure the following services:
+
+* [Redis](../high_availability/redis.md) for high availability.
+* [NFS Server](../high_availability/nfs.md) which will store data that is synchronized from the Geo primary.
+
+### Step 2: Configure the Postgres services on the Geo Secondary
+
+1. Configure the [secondary Geo PostgreSQL database](../gitlab-geo/database.md)
+ as a read-only secondary of the primary Geo PostgreSQL database.
+
+1. To configure the Geo tracking database on the secondary server, modify `/etc/gitlab/gitlab.rb`:
+
+    ```ruby
+    geo_postgresql['enable'] = true
+
+    geo_postgresql['listen_address'] = '10.1.4.1'
+    geo_postgresql['trust_auth_cidr_addresses'] = ['10.1.0.0/16']
+
+    geo_secondary['auto_migrate'] = true
+    geo_secondary['db_host'] = '10.1.4.1'
+    geo_secondary['db_password'] = 'Geo tracking DB password'
+    ```
+
+NOTE: **Note:**
+Be sure that other non-postgresql services are disabled by setting `enable` to `false` in
+the [gitlab.rb configuration](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-config-template/gitlab.rb.template).
+
+After making these changes be sure to run `sudo gitlab-ctl reconfigure` so that they take effect.
+
+### Step 3: Setup the LoadBalancer
+
+In this topology, a load balancer is needed at each geographical location
+to route traffic to the application servers.
+
+See the [Load Balancer for GitLab HA](../high_availability/load_balancer.md)
+documentation for more information.
+
+### Step 4: Configure the Geo Frontend Application Servers
+
+In the architecture overview there are two machines running the GitLab application
+services.  These services are enabled selectively in the configuration. Additionally
+the addresses of the remote endpoints for PostgreSQL and Redis will need to be specified.
+
+#### On the GitLab Primary Frontend servers
+
+1. Edit `/etc/gitlab/gitlab.rb` and ensure the following settings are present to prevent PostgreSQL and Redis from running locally:
+
+    ```ruby
+    ##
+    ## Disable PostgreSQL on the local machine and connect to the remote
+    ##
+
+    postgresql['enable'] = false
+    gitlab_rails['auto_migrate'] = false
+    gitlab_rails['db_host'] = '10.0.3.1'
+    gitlab_rails['db_password'] = 'DB password'
+
+    ##
+    ## Disable Redis on the local machine and connect to the remote
+    ##
+
+    redis['enable'] = false
+    gitlab_rails['redis_host'] = '10.0.2.1'
+    gitlab_rails['redis_password'] = 'Redis password'
+
+    geo_primary_role['enable'] = true
+    ```
+
+#### On the GitLab Secondary Frontend servers
+
+On the secondary the remote endpoint for the PostgreSQL Geo database will
+be specified.
+
+1. Edit `/etc/gitlab/gitlab.rb` and ensure the following settings are present to prevent PostgreSQL and Redis from running locally, and to configure the secondary to connect to the Geo tracking database:
+
+
+    ```ruby
+    ##
+    ## Disable PostgreSQL on the local machine and connect to the remote
+    ##
+
+    postgresql['enable'] = false
+    gitlab_rails['auto_migrate'] = false
+    gitlab_rails['db_host'] = '10.1.3.1'
+    gitlab_rails['db_password'] = 'DB password'
+
+    ##
+    ## Disable Redis on the local machine and connect to the remote
+    ##
+
+    redis['enable'] = false
+    gitlab_rails['redis_host'] = '10.1.2.1'
+    gitlab_rails['redis_password'] = 'Redis password'
+
+
+    ##
+    ## Enable the geo secondary role and configure the
+    ## geo tracking database
+    ##
+
+    geo_secondary_role['enable'] = true
+    geo_secondary['db_host'] = '10.1.4.1'
+    geo_secondary['db_password'] = 'Geo tracking DB password'
+    geo_postgresql['enable'] = false
+    ```
+
+
+After making these changes [Reconfigure GitLab][] so that they take effect.
+
+On the primary the following GitLab frontend services will be enabled:
+
+* gitlab-pages
+* gitlab-workhorse
+* logrotate
+* nginx
+* registry
+* remote-syslog
+* sidekiq
+* unicorn
+
+On the secondary the following GitLab frontend services will be enabled:
+
+* geo-logcursor
+* gitlab-pages
+* gitlab-workhorse
+* logrotate
+* nginx
+* registry
+* remote-syslog
+* sidekiq
+* unicorn
+
+Verify these services by running `sudo gitlab-ctl status` on the frontend
+application servers.
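+
+A minimal sketch of automating that check is shown below. It reads
+`gitlab-ctl status` output on standard input and prints any expected service
+that is not in the `run:` state. The function name `missing_services` is
+hypothetical, and the sample status lines assume the usual
+`run: <service>: (pid ...)` output format.
+
+```bash
+#!/bin/bash
+
+# Read status output on stdin and print each expected service (given as
+# arguments) that has no "run: <service>:" line.
+missing_services() {
+  local status service
+  status="$(cat)"
+  for service in "$@"; do
+    if ! printf '%s\n' "$status" | grep -q "^run: ${service}:"; then
+      echo "$service"
+    fi
+  done
+}
+
+# Sample output (assumed format) in which sidekiq is down.
+sample='run: nginx: (pid 1201) 3600s; run: log: (pid 1190) 3600s
+down: sidekiq: 12s, normally up; run: log: (pid 1195) 3600s
+run: unicorn: (pid 1210) 3600s; run: log: (pid 1199) 3600s'
+
+printf '%s\n' "$sample" | missing_services nginx sidekiq unicorn
+```
+
+On a real frontend server, the input would come from `sudo gitlab-ctl status`
+instead of the sample text, with the service names taken from the lists above.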
+
+[reconfigure GitLab]: ../restart_gitlab.md#omnibus-gitlab-reconfigure
+[restart GitLab]: ../restart_gitlab.md#omnibus-gitlab-restart
--- a/doc/administration/geo/replication/img/geo_architecture.png
+++ b/doc/administration/geo/replication/img/geo_architecture.png
--- a/doc/administration/geo/replication/img/geo_node_dashboard.png
+++ b/doc/administration/geo/replication/img/geo_node_dashboard.png
--- a/doc/administration/geo/replication/img/geo_node_healthcheck.png
+++ b/doc/administration/geo/replication/img/geo_node_healthcheck.png
--- a/doc/administration/geo/replication/img/geo_overview.png
+++ b/doc/administration/geo/replication/img/geo_overview.png
--- a/doc/administration/geo/replication/img/hashed_storage.png
+++ b/doc/administration/geo/replication/img/hashed_storage.png
--- a/doc/administration/geo/replication/index.md
+++ b/doc/administration/geo/replication/index.md
+# Geo (Geo Replication)
+
+> **Notes:**
+- Geo is part of [GitLab Premium][ee].
+- Introduced in GitLab Enterprise Edition 8.9.
+  We recommend you use it with at least GitLab Enterprise Edition 10.0 for
+  basic Geo features, or latest version for a better experience.
+- You should make sure that all nodes run the same GitLab version.
+- Geo requires PostgreSQL 9.6 and Git 2.9 in addition to GitLab's usual
+  [minimum requirements](../install/requirements.md)
+- Using Geo in combination with High Availability is considered **GA** in GitLab Enterprise Edition 10.4
+
+>**Note:**
+Geo changes significantly from release to release. Upgrades **are**
+supported and [documented](#updating-the-geo-nodes), but you should ensure that
+you're following the right version of the documentation for your installation!
+The best way to do this is to follow the documentation from the `/help` endpoint
+on your **primary** node, but you can also navigate to [this page on GitLab.com](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/doc/gitlab-geo/README.md)
+and choose the appropriate release from the `tags` dropdown, e.g., `v10.0.0-ee`.
+
+Geo allows you to replicate your GitLab instance to other geographical
+locations as a read-only fully operational version.
+
+## Overview
+
+If you have two or more teams geographically spread out, but your GitLab
+instance is in a single location, fetching large repositories can take a long
+time.
+
+Your Geo instance can be used for cloning and fetching projects, in addition to
+reading any data. This will make working with large repositories over large
+distances much faster.
+
+![Geo overview](img/geo_overview.png)
+
+When Geo is enabled, we refer to your original instance as a **primary** node
+and the replicated read-only ones as **secondaries**.
+
+Keep in mind that:
+
+- Secondaries talk to the primary to get user data for logins (API) and to
+  replicate repositories, LFS Objects and Attachments (HTTPS + JWT).
+- Since GitLab Premium 10.0, the primary no longer talks to
+  secondaries to notify for changes (API).
+
+## Use-cases
+
+- Can be used for cloning and fetching projects, in addition
+to reading any data available in the GitLab web interface (see [current limitations](#current-limitations))
+- Overcomes slow connection between distant offices, saving time by
+improving speed for distributed teams
+- Helps reduce the loading time for automated tasks,
+custom integrations and internal workflows
+- Quickly fail-over to a Geo secondary in a
+[Disaster Recovery](../disaster_recovery/index.md) scenario
+- Allows [planned fail-over](../disaster_recovery/planned_fail_over.md) to a Geo secondary
+
+## Architecture
+
+The following diagram illustrates the underlying architecture of Geo:
+
+![Geo architecture](img/geo_architecture.png)
+
+[Source diagram](https://docs.google.com/drawings/d/1Abw0P_H0Ew1-2Lj_xPDRWP87clGIke-1fil7_KQqrtE/edit)
+
+In this diagram, there is one Geo primary node and one secondary. The
+secondary clones repositories via git over HTTPS. Attachments, LFS objects, and
+other files are downloaded via HTTPS using the GitLab API to authenticate,
+with a special endpoint protected by JWT.
+
+Writes to the database and Git repositories can only be performed on the Geo
+primary node. The secondary node receives database updates via PostgreSQL
+streaming replication.
+
+Note that the secondary needs two different PostgreSQL databases: a read-only
+instance that streams data from the main GitLab database and another used
+internally by the secondary node to record what data has been replicated.
+
+In the secondary nodes there is an additional daemon: Geo Log Cursor.
+
+## Geo Recommendations
+
+We highly recommend that you install Geo on an operating system that supports
+OpenSSH 6.9 or higher. The following operating systems are known to ship with a
+current version of OpenSSH:
+
+    * CentOS 7.4
+    * Ubuntu 16.04
+
+Note that CentOS 6 and 7.0 ship with an old version of OpenSSH that does not
+support a feature that Geo requires. See the [documentation on Geo SSH
+access](../operations/fast_ssh_key_lookup.md) for more details.
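+
+To check a server against the OpenSSH 6.9 recommendation, a version comparison
+along these lines may help. The helper `version_ge` is hypothetical and relies
+on `sort -V` from GNU coreutils; the `7.4` value is a placeholder for whatever
+`ssh -V` reports on your system.
+
+```bash
+#!/bin/bash
+
+# Succeed if version $1 is greater than or equal to version $2.
+version_ge() {
+  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
+}
+
+installed="7.4"   # placeholder; take this from the output of `ssh -V`
+if version_ge "$installed" "6.9"; then
+  echo "OpenSSH $installed meets the 6.9 recommendation"
+else
+  echo "OpenSSH $installed is older than 6.9"
+fi
+```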
+
+### LDAP
+
+We recommend that if you use LDAP on your primary that you also set up a
+secondary LDAP server for the secondary Geo node. Otherwise, users will not be
+able to perform Git operations over HTTP(s) on the **secondary** Geo node
+using HTTP Basic Authentication. However, Git via SSH and personal access
+tokens will still work.
+
+Check with your LDAP provider for instructions on how to set up
+replication. For example, OpenLDAP provides [these
+instructions](https://www.openldap.org/doc/admin24/replication.html).
+
+### Geo Tracking Database
+
+We use the tracking database as metadata to control what needs to be
+updated on the disk of the local instance (for example, download new assets,
+fetch new LFS Objects or fetch changes from a repository that has recently been
+updated).
+
+Because the replicated instance is read-only, we need this additional instance
+per secondary location.
+
+### Geo Log Cursor
+
+This daemon reads a log of events replicated by the primary node to the secondary
+database and updates the Geo Tracking Database with changes that need to be
+executed.
+
+When something is marked to be updated in the tracking database, asynchronous
+jobs running on the secondary node will execute the required operations and
+update the state.
+
+This new architecture allows us to be resilient to connectivity issues between the
+nodes. It doesn't matter whether the outage lasted a few minutes or several days: the secondary
+instance will be able to replay all the events in the correct order and get in
+sync again.
+
+## Setup instructions
+
+These instructions assume you have a working instance of GitLab. They will
+guide you through making your existing instance the primary Geo node and
+adding secondary Geo nodes.
+
+The steps below should be followed in the order they appear. **Make sure the
+GitLab version is the same on all nodes.**
+
+### Using Omnibus GitLab
+
+If you installed GitLab using the Omnibus packages (highly recommended):
+
+1. [Install GitLab Enterprise Edition][install-ee] on the server that will serve
+   as the **secondary** Geo node. Do not create an account or log in to the new
+   secondary node.
+1. [Upload the GitLab License](../user/admin_area/license.md) on the **primary**
+   Geo node to unlock Geo.
+1. [Set up the database replication](database.md) (`primary (read-write) <->
+   secondary (read-only)` topology).
+1. [Configure fast lookup of authorized SSH keys in the database](../operations/fast_ssh_key_lookup.md).
+   This step is required and must be done on both the primary AND secondary nodes.
+1. [Configure GitLab](configuration.md) to set the primary and secondary nodes.
+1. Optional: [Configure a secondary LDAP server](../auth/ldap.md)
+   for the secondary. See [notes on LDAP](#ldap).
+1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
+
+[install-ee]: https://about.gitlab.com/downloads-ee/ "GitLab Enterprise Edition Omnibus packages downloads page"
+
+### Using GitLab installed from source
+
+If you installed GitLab from source:
+
+1. [Install GitLab Enterprise Edition][install-ee-source] on the server that
+   will serve as the **secondary** Geo node. Do not create an account or log in
+   to the new secondary node.
+1. [Upload the GitLab License](../user/admin_area/license.md) on the **primary**
+   Geo node to unlock Geo.
+1. [Set up the database replication](database_source.md) (`primary (read-write)
+   <-> secondary (read-only)` topology).
+1. [Configure fast lookup of authorized SSH keys in the database](../operations/fast_ssh_key_lookup.md).
+   Do this step on both the primary AND secondary nodes.
+1. [Configure GitLab](configuration_source.md) to set the primary and secondary
+   nodes.
+1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
+
+[install-ee-source]: https://docs.gitlab.com/ee/install/installation.html "GitLab Enterprise Edition installation from source"
+
+## Configuring Geo
+
+Read through the [Geo configuration](configuration.md) documentation.
+
+## Updating the Geo nodes
+
+Read how to [update your Geo nodes to the latest GitLab version](updating_the_geo_nodes.md).
+
+## Configuring Geo HA
+
+Read through the [Geo High Availability documentation](ha.md).
+
+## Configuring Geo with Object storage
+
+When you have object storage enabled, please consult the
+[Geo with Object Storage](object_storage.md) documentation.
+
+## Replicating the Container Registry
+
+Read how to [replicate the Container Registry](docker_registry.md).
+
+## Current limitations
+
+- You cannot push code to secondary nodes. See [#3912](https://gitlab.com/gitlab-org/gitlab-ee/issues/3912) for details.
+- The primary node has to be online for OAuth login to happen (existing sessions and Git are not affected).
+- Geo works for repos, wikis, issues, and merge requests, but it does not work for job logs, artifacts, GitLab Pages, or Docker images of the Container
+  Registry. The Container Registry can be configured separately; see [replicate the Container Registry](docker_registry.md) for details.
+- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience; see [#2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details.
+
+## Frequently Asked Questions
+
+Read more in the [Geo FAQ](faq.md).
+
+## Log files
+
+Since GitLab 9.5, Geo stores structured log messages in a `geo.log` file. For
+Omnibus installations, this file can be found in
+`/var/log/gitlab/gitlab-rails/geo.log`. This file contains information about
+when Geo attempts to sync repositories and files. Each line in the file contains a
+separate JSON entry that can be ingested into Elasticsearch, Splunk, etc. For
+example:
+
+```json
+{"severity":"INFO","time":"2017-08-06T05:40:16.104Z","message":"Repository update","project_id":1,"source":"repository","resync_repository":true,"resync_wiki":true,"class":"Gitlab::Geo::LogCursor::Daemon","cursor_delay_s":0.038}
+```
+
+This message shows that Geo detected that a repository update was needed for project 1.
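+
Because each line is a standalone JSON entry, simple text tools are enough for a quick look before the log reaches a dedicated ingestion pipeline. The following is a minimal sketch that lists the project IDs of `Repository update` entries. The function name `repository_update_project_ids` is hypothetical, and the pattern assumes the compact key layout of the sample entry above; a JSON-aware tool would be more robust.

```bash
#!/usr/bin/env bash
# Print the project_id of every "Repository update" entry read from stdin.
# Assumes compact JSON with no spaces around ":" as in the sample entry.
repository_update_project_ids() {
  grep -F '"message":"Repository update"' \
    | sed -n 's/.*"project_id":\([0-9][0-9]*\).*/\1/p'
}

# Demonstrate on the sample entry rather than on the live log file.
sample='{"severity":"INFO","time":"2017-08-06T05:40:16.104Z","message":"Repository update","project_id":1,"source":"repository","resync_repository":true,"resync_wiki":true,"class":"Gitlab::Geo::LogCursor::Daemon","cursor_delay_s":0.038}'
printf '%s\n' "$sample" | repository_update_project_ids
```

On an Omnibus installation, the same function can read the real file, for example `sudo cat /var/log/gitlab/gitlab-rails/geo.log | repository_update_project_ids | sort -n | uniq -c`, to see which projects are updated most often.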
+
+## Security of Geo
+
+Read the [security review](security_review.md) page.
+
+## Tuning Geo
+
+Read the [Geo tuning](tuning.md) documentation.
+
+## Troubleshooting
+
+Read the [troubleshooting document](troubleshooting.md).
+
+[ee]: https://about.gitlab.com/products/ "GitLab Enterprise Edition landing page"
--- a/doc/administration/geo/replication/object_storage.md
+++ b/doc/administration/geo/replication/object_storage.md
+# Geo with Object storage
+
+Geo can be used in combination with Object Storage (AWS S3, or
+other compatible object storage).
+
+## Configuration
+
+At this time, if object storage is enabled on the primary, it must also be
+enabled on the secondary.
+
+The secondary nodes can use the same storage bucket as the primary, or
+they can use a replicated storage bucket. At this time GitLab does not
+take care of content replication in object storage.
+
+For LFS, follow the documentation to
+[set up LFS object storage](../workflow/lfs/lfs_administration.md#setting-up-s3-compatible-object-storage).
+
+For CI job artifacts, there is similar documentation to configure
+[jobs artifact object storage](../job_artifacts.md#using-object-storage).
+
+Complete these steps on all nodes, primary **and** secondary.
+
+## Replication
+
+When using Amazon S3, you can use
+[CRR](https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html) to
+have automatic replication between the bucket used by the primary and
+the bucket used by the secondary.
+
+If you are using Google Cloud Storage, consider using
+[Multi-Regional Storage](https://cloud.google.com/storage/docs/storage-classes#multi-regional).
+Or you can use the [Storage Transfer Service](https://cloud.google.com/storage/transfer/),
+although this only supports daily synchronization.
+
+For manual synchronization, or scheduled by `cron`, please have a look at:
+
+- [`s3cmd sync`](http://s3tools.org/s3cmd-sync)
+- [`gsutil rsync`](https://cloud.google.com/storage/docs/gsutil/commands/rsync)
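+
As an illustration of the manual route, the sketch below picks the tool that matches the bucket URL scheme and prints the command instead of running it. The function name `bucket_sync_command` and the bucket names are hypothetical. It assumes `s3cmd` or `gsutil` is already installed and configured with credentials that can read the source bucket and write to the destination.

```bash
#!/usr/bin/env bash
# Print the sync command matching the bucket URL scheme; nothing is executed.
bucket_sync_command() {
  local src="$1" dst="$2"
  case "$src" in
    s3://*) echo "s3cmd sync $src $dst" ;;
    gs://*) echo "gsutil rsync -r $src $dst" ;;
    *) echo "unsupported bucket URL: $src" >&2; return 1 ;;
  esac
}

# Hypothetical bucket names, for illustration only.
bucket_sync_command "s3://primary-bucket/" "s3://secondary-bucket/"
bucket_sync_command "gs://primary-bucket" "gs://secondary-bucket"
```

Once the printed command looks right for your buckets, it can be run by hand or placed in a `cron` entry on a host that has access to both buckets.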
--- a/doc/administration/geo/replication/security_review.md
+++ b/doc/administration/geo/replication/security_review.md
+The following security review of the Geo feature set focuses on security
+aspects of the feature as they apply to customers running their own GitLab
+instances. The review questions are based in part on the [application security architecture](https://www.owasp.org/index.php/Application_Security_Architecture_Cheat_Sheet)
+questions from [owasp.org](https://www.owasp.org).
+
+
+
+## Business Model
+
+### What geographic areas does the application service?
+
+- This varies by customer. Geo allows customers to deploy to multiple areas,
+   and they get to choose where they are.
+- Region and node selection is entirely manual.
+
+
+
+## Data Essentials
+
+### What data does the application receive, produce, and process?
+
+- Geo streams almost all data held by a GitLab instance between sites. This
+  includes full database replication, most files (user-uploaded attachments,
+  etc) and repository + wiki data. In a typical configuration, this will
+  happen across the public Internet, and be TLS-encrypted.
+- PostgreSQL replication is TLS-encrypted.
+- See also: [only TLSv1.2 should be supported](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2948)
+
+### How can the data be classified into categories according to its sensitivity?
+
+- GitLab’s model of sensitivity is centered around public vs. internal vs.
+private projects. Geo replicates them all indiscriminately. “Selective sync”
+exists for files and repositories (but not database content), which would permit
+only less-sensitive projects to be replicated to a secondary if desired.
+- See also: [developing a data classification policy](https://gitlab.com/gitlab-com/security/issues/4).
+
+### What data backup and retention requirements have been defined for the application?
+
+- Geo is designed to provide replication of a certain subset of the application
+data. It is part of the solution, rather than part of the problem.
+
+
+
+## End-Users
+
+### Who are the application's end‐users?
+
+- Geo nodes (secondaries) are created in regions that are distant (in terms of
+  Internet latency) from the main GitLab installation (the primary). They are
+  intended to be used by anyone who would ordinarily use the primary, who finds
+  that the secondary is closer to them (in terms of Internet latency).
+
+### How do the end‐users interact with the application?
+
+- A Geo secondary node provides all the interfaces a Geo primary node does
+(notably an HTTP/HTTPS web application, and HTTP/HTTPS or SSH git repository
+access), but is constrained to read-only activities. The principal use case is
+envisioned to be cloning git repositories from the secondary in favor of the
+primary, but end-users may use the GitLab web interface to view projects,
+issues, merge requests, snippets, etc.
+
+### What security expectations do the end‐users have?
+
+- The replication process must be secure. It would typically be unacceptable to
+transmit the entire database contents or all files and repositories across the
+public Internet in plaintext, for instance.
+- The Geo secondary must have the same access controls over its content as the
+primary - unauthenticated users must not be able to gain access to privileged
+information on the primary by querying the secondary.
+- Attackers must not be able to impersonate the secondary to the primary, and
+thus gain access to privileged information.
+
+
+
+## Administrators
+
+### Who has administrative capabilities in the application?
+
+- Nothing Geo-specific. Any user where `admin: true` is set in the database is
+considered an admin with super-user privileges.
+- See also: [more granular access control](https://gitlab.com/gitlab-org/gitlab-ce/issues/32730)
+(not geo-specific)
+- Much of Geo’s integration (database replication, for instance) must be
+configured with the application, typically by system administrators.
+
+### What administrative capabilities does the application offer?
+
+- Geo secondaries may be added, modified, or removed by users with
+administrative access.
+- The replication process may be controlled (start/stop) via the Sidekiq
+administrative controls.
+
+
+
+## Network
+
+### What details regarding routing, switching, firewalling, and load‐balancing have been defined?
+
+- Geo requires the primary and secondary to be able to communicate with each
+other across a TCP/IP network. In particular, the secondaries must be able to
+access HTTP/HTTPS and PostgreSQL services on the primary.
+
+### What core network devices support the application?
+
+- Varies from customer to customer.
+
+### What network performance requirements exist?
+
+- Maximum replication speed between primary and secondary is limited by the
+available bandwidth between sites. No hard requirements exist - time to complete
+replication (and ability to keep up with changes on the primary) is a function
+of the size of the data set, tolerance for latency, and available network
+capacity.
+
+### What private and public network links support the application?
+
+- Customers choose their own networks. As sites are intended to be
+geographically separated, it is envisioned that replication traffic will pass
+over the public Internet in a typical deployment, but this is not a requirement.
+
+
+
+## Systems
+
+### What operating systems support the application?
+
+- Geo imposes no additional restrictions on operating system (see the
+  [GitLab installation](https://about.gitlab.com/installation/) page for more
+  details), however we recommend using the operating systems listed in the [Geo documentation](http://docs.gitlab.com/ee/gitlab-geo/#geo-recommendations). 
+
+
+### What details regarding required OS components and lock‐down needs have been defined?
+
+- The recommended installation method (Omnibus) packages most components itself.
+A from-source installation method exists. Both are documented at
+https://docs.gitlab.com/ee/gitlab-geo/
+- There are significant dependencies on the system-installed OpenSSH daemon (Geo
+  requires users to set up custom authentication methods) and the omnibus or
+  system-provided PostgreSQL daemon (it must be configured to listen on TCP,
+  additional users and replication slots must be added, etc).
+- The process for dealing with security updates (for example, if there is a
+  significant vulnerability in OpenSSH or other services, and the customer
+  wants to patch those services on the OS) is identical to the non-Geo
+  situation: security updates to OpenSSH would be provided to the user via the
+  usual distribution channels. Geo introduces no delay there.
+
+
+
+## Infrastructure Monitoring
+
+### What network and system performance monitoring requirements have been defined?
+
+- None specific to Geo.
+
+### What mechanisms exist to detect malicious code or compromised application components?
+
+- None specific to Geo.
+
+### What network and system security monitoring requirements have been defined?
+
+- None specific to Geo.
+
+
+
+## Virtualization and Externalization
+
+### What aspects of the application lend themselves to virtualization?
+
+- All.
+
+### What virtualization requirements have been defined for the application?
+
+- Nothing Geo-specific, but everything in GitLab needs to have full
+functionality in such an environment.
+
+### What aspects of the product may or may not be hosted via the cloud computing model?
+
+- GitLab is “cloud native” and this applies to Geo as much as to the rest of the
+product. Deployment in clouds is a common and supported scenario.
+
+### If applicable, what approach(es) to cloud computing will be taken (Managed Hosting versus "Pure" Cloud, a "full machine" approach such as AWS-EC2 versus a "hosted database" approach such as AWS-RDS and Azure, etc)?
+
+- To be decided by our customers, according to their operational needs.
+
+
+
+## Environment
+
+### What frameworks and programming languages have been used to create the application?
+
+- Ruby on Rails, Ruby.
+
+### What process, code, or infrastructure dependencies have been defined for the application?
+
+- Nothing specific to Geo.
+
+### What databases and application servers support the application?
+
+- PostgreSQL >= 9.6, Redis, Sidekiq, Unicorn.
+
+### How will database connection strings, encryption keys, and other sensitive components be stored, accessed, and protected from unauthorized detection?
+
+- There are some Geo-specific values. Some are shared secrets which must be
+securely transmitted from the primary to the secondary at setup time. Our
+documentation recommends transmitting them from the primary to the system
+administrator via SSH, and then back out to the secondary in the same manner.
+In particular, this includes the PostgreSQL replication credentials and a secret
+key (`db_key_base`) which is used to decrypt certain columns in the database.
+The `db_key_base` secret is stored unencrypted on the filesystem, in
+`/etc/gitlab/gitlab-secrets.json`, along with a number of other secrets. There is
+no at-rest protection for them.
+
+
+
+## Data Processing
+
+### What data entry paths does the application support?
+
+- Data is entered via the web application exposed by GitLab itself. Some data is
+also entered using system administration commands on the GitLab servers (e.g.,
+  `gitlab-ctl set-primary-node`).
+- Secondaries also receive inputs via PostgreSQL streaming replication from the
+primary.
+
+### What data output paths does the application support?
+
+- Primaries output via PostgreSQL streaming replication to the secondary.
+Otherwise, principally via the web application exposed by GitLab itself, and via
+SSH `git clone` operations initiated by the end-user.
+
+### How does data flow across the application's internal components?
+
+- Secondaries and primaries interact via HTTP/HTTPS (secured with JSON web
+  tokens) and via PostgreSQL streaming replication.
+- Within a primary or secondary, the SSOT is the filesystem and the database
+(including Geo tracking database on secondary). The various internal components
+are orchestrated to make alterations to these stores.
+
+### What data input validation requirements have been defined?
+
+- Secondaries must have a faithful replication of the primary’s data.
+
+### What data does the application store and how?
+
+- Git repositories and files, tracking information related to them, and the
+GitLab database contents.
+
+### What data is or may need to be encrypted and what key management requirements have been defined?
+
+- Neither primaries nor secondaries encrypt Git repository or filesystem data at
+rest. A subset of database columns are encrypted at rest using the `db_otp_key`,
+a static secret shared across all hosts in a GitLab deployment.
+- In transit, data should be encrypted, although the application does permit
+communication to proceed unencrypted. The two main transits are the secondary’s
+replication process for PostgreSQL, and for git repositories/files. Both should
+be protected using TLS, with the keys for that managed via Omnibus per existing
+configuration for end-user access to GitLab.
+
+### What capabilities exist to detect the leakage of sensitive data?
+
+- Comprehensive system logs exist, tracking every connection to GitLab and
+PostgreSQL.
+
+### What encryption requirements have been defined for data in transit - including transmission over WAN, LAN, SecureFTP, or publicly accessible protocols such as http: and https:?
+
+- Data must have the option to be encrypted in transit, and be secure against
+both passive and active attack (e.g., MITM attacks should not be possible).
+
+
+
+## Access
+
+### What user privilege levels does the application support?
+
+- Geo adds one type of privilege: secondaries can access a special Geo API to
+download files over HTTP/HTTPS, and to clone repositories using HTTP/HTTPS.
+
+### What user identification and authentication requirements have been defined?
+
+- Geo secondaries identify to Geo primaries via OAuth or JWT authentication
+based on the shared database (HTTP access) or a PostgreSQL replication user (for
+database replication). The database replication also requires IP-based access
+controls to be defined.
+
+### What user authorization requirements have been defined?
+
+- Secondaries must only be able to *read* data. They are not currently able to
+mutate data on the primary.
+
+### What session management requirements have been defined?
+
+- Geo JWTs are defined to last for only two minutes before needing to be
+regenerated.
+
+### What access requirements have been defined for URI and Service calls?
+
+- A Geo secondary makes many calls to the primary's API. This is how file
+replication proceeds, for instance. This endpoint is only accessible with a JWT
+token.
+- The primary also makes calls to the secondary to get status information.
+
+
+
+## Application Monitoring
+
+### What application auditing requirements have been defined? How are audit and debug logs accessed, stored, and secured?
+
+- Structured JSON log is written to the filesystem, and can also be ingested
+into a Kibana installation for further analysis.
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
+# Geo Troubleshooting
+
+>**Note:**
+This list is an attempt to document all the moving parts that can go wrong.
+We are working on getting all these steps verified automatically in a
+rake task in the future.
+
+Setting up Geo requires careful attention to details and sometimes it's easy to
+miss a step. Here is a list of questions you should ask to try to detect
+what you need to fix (all commands and path locations are for Omnibus installs):
+
+#### First check the health of the secondary
+
+Visit the primary node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`) in
+your browser. We perform the following health checks on each secondary node
+to help identify if something is wrong:
+
+- Is the node running?
+- Is the node's secondary database configured for streaming replication?
+- Is the node's secondary tracking database configured?
+- Is the node's secondary tracking database connected?
+- Is the node's secondary tracking database up-to-date?
+
+![Geo health check](img/geo-node-healthcheck.png)
+
+There is also an option to check the status of the secondary node by running a special rake task:
+
+```
+sudo gitlab-rake geo:status
+```
+
+#### Is Postgres replication working?
+
+#### Are my nodes pointing to the correct database instance?
+
+You should make sure your primary Geo node points to the instance with
+writing permissions.
+
+Any secondary nodes should point only to read-only instances.
+
+#### Can Geo detect my current node correctly?
+
+Geo uses the defined node from the `Admin ➔ Geo` screen, and tries to match
+it with the value defined in the `/etc/gitlab/gitlab.rb` configuration file.
+The relevant line looks like: `external_url "http://gitlab.example.com"`.
+
+To check if the node on the current machine is correctly detected type:
+
+```bash
+sudo gitlab-rails runner "puts Gitlab::Geo.current_node.inspect"
+```
+
+and expect something like:
+
+```
+#<GeoNode id: 2, schema: "https", host: "gitlab.example.com", port: 443, relative_url_root: "", primary: false, ...>
+```
+
+By running the command above, `primary` should be `true` when executed in
+the primary node, and `false` on any secondary.
+
+#### How do I fix the message, "ERROR:  replication slots can only be used if max_replication_slots > 0"?
+
+This means that the `max_replication_slots` PostgreSQL variable needs to
+be set on the primary database. In GitLab 9.4, we have made this setting
+default to 1. You may need to increase this value if you have more Geo
+secondary nodes. Be sure to restart PostgreSQL for this to take
+effect. See the [PostgreSQL replication
+setup](database.md#postgresql-replication) guide for more details.
+
+#### How do I fix the message, "FATAL:  could not start WAL streaming: ERROR:  replication slot "geo_secondary_my_domain_com" does not exist"?
+
+This occurs when PostgreSQL does not have a replication slot for the
+secondary by that name. You may want to rerun the [replication
+process](database.md) on the secondary.
+
+#### How do I fix the message, "Command exceeded allowed execution time" when setting up replication?
+
+This may happen while [initiating the replication process](database.md#step-4-initiate-the-replication-process) on the Geo secondary, and indicates that your
+initial dataset is too large to be replicated in the default timeout (30 minutes).
+
+Re-run `gitlab-ctl replicate-geo-database`, but include a larger value for
+`--backup-timeout`:
+
+```bash
+sudo gitlab-ctl replicate-geo-database --host=primary.geo.example.com --slot-name=secondary_geo_example_com --backup-timeout=21600
+```
+
+This will give the initial replication up to six hours to complete, rather than
+the default thirty minutes. Adjust as required for your installation.
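+
The `--backup-timeout` value is given in seconds, so it helps to derive it from the number of hours you want to allow. The following is a small sketch of that arithmetic. The function name `backup_timeout_seconds` is hypothetical, and the script only prints the command, using the example host and slot name from above, rather than running it.

```bash
#!/usr/bin/env bash
# Convert a whole number of hours into seconds for --backup-timeout.
backup_timeout_seconds() {
  echo $(( $1 * 3600 ))
}

# Print (rather than run) the replication command with a six-hour timeout.
# The host and slot name are the example values used above.
echo "sudo gitlab-ctl replicate-geo-database" \
  "--host=primary.geo.example.com" \
  "--slot-name=secondary_geo_example_com" \
  "--backup-timeout=$(backup_timeout_seconds 6)"
```

Replace the `6` with a value that comfortably exceeds the time a full base backup takes over your link.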
+
+#### How do I fix the message, "PANIC: could not write to file 'pg_xlog/xlogtemp.123': No space left on device"?
+
+Determine if you have any unused replication slots in the primary database. Unused slots can cause large amounts of log data to build up in `pg_xlog`.
+Removing them can reduce the amount of space used in `pg_xlog`.
+
+1. Start a PostgreSQL console session:
+
+    ```bash
+    sudo gitlab-psql gitlabhq_production
+    ```
+
+    Note that using `gitlab-rails dbconsole` will not work, because managing replication slots requires superuser permissions.
+
+2. View your replication slots with
+
+     ```sql
+     SELECT * FROM pg_replication_slots;
+     ```
+
+Slots where `active` is `f` are not active.
+
+- If the slot should be active because you have a secondary configured to use it,
+log in to that secondary and check the PostgreSQL logs to find out why replication is not running.
+
+- If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it by running the following in the PostgreSQL console session:
+
+    ```sql
+    SELECT pg_drop_replication_slot('name_of_extra_slot');
+    ```
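+
To list only the inactive slots without opening an interactive console, the query output can be filtered in the shell. The following is a sketch under stated assumptions: the function name `inactive_slots` is hypothetical, and it assumes that `gitlab-psql` passes the standard `psql` options `-d`, `-A`, `-t`, and `-c` through unchanged.

```bash
#!/usr/bin/env bash
# Read "slot_name|active" rows (psql unaligned, tuples-only output) on stdin
# and print the names of slots whose active column is "f".
inactive_slots() {
  awk -F '|' '$2 == "f" { print $1 }'
}

# Only query the database where the Omnibus wrapper is available.
if command -v gitlab-psql >/dev/null 2>&1; then
  sudo gitlab-psql -d gitlabhq_production -At \
    -c 'SELECT slot_name, active FROM pg_replication_slots;' | inactive_slots
fi
```

Review each name that is printed before dropping anything, since a slot may be inactive only because its secondary is temporarily offline.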
+
+#### Very large repositories never successfully synchronize on the secondary
+
+GitLab places a timeout on all repository clones, including project imports
+and Geo synchronization operations. If a fresh `git clone` of a repository
+on the primary takes more than a few minutes, you may be affected by this.
+To increase the timeout, add the following line to `/etc/gitlab/gitlab.rb`
+on the secondary:
+
+```ruby
+gitlab_rails['gitlab_shell_git_timeout'] = 10800
+```
+
+Then reconfigure GitLab:
+
+```bash
+sudo gitlab-ctl reconfigure
+```
+
+This will increase the timeout to three hours (10800 seconds). Choose a time
+long enough to accommodate a full clone of your largest repositories.
--- a/doc/administration/geo/replication/tuning.md
+++ b/doc/administration/geo/replication/tuning.md
+# Tuning Geo
+
+## Changing the sync capacity values
+
+In the Geo admin page (`/admin/geo_nodes`), there are several variables that
+can be tuned to improve performance of Geo:
+
+* Repository sync capacity
+* File sync capacity
+
+Increasing these values will increase the number of jobs that are scheduled,
+but this may not lead to more downloads in parallel unless the number of
+available Sidekiq threads is also increased. For example, if repository sync
+capacity is increased from 25 to 50, you may also want to increase the number
+of Sidekiq threads from 25 to 50. See the [Sidekiq concurrency
+documentation](../operations/extra_sidekiq_processes.html#concurrency)
+for more details.
--- a/doc/administration/geo/replication/updating_the_geo_nodes.md
+++ b/doc/administration/geo/replication/updating_the_geo_nodes.md
+# Updating the Geo nodes
+
+Depending on which version of Geo you are updating to/from, there may be
+different steps.
+
+## General update steps
+
+In order to update the Geo nodes when a new GitLab version is released,
+all you need to do is update GitLab itself:
+
+1. Log into each node (primary and secondaries)
+1. [Update GitLab][update]
+1. [Update tracking database on secondary node](#update-tracking-database-on-secondary-node) when
+   the tracking database is enabled.
+1. [Test](#check-status-after-updating) primary and secondary nodes, and check version in each.
+
+## Upgrading to GitLab 10.5
+
+For Geo Disaster Recovery to work with minimum downtime, your Geo secondary
+should use the same set of secrets as the primary. However, setup instructions
+prior to the 10.5 release only synchronized the `db_key_base` secret.
+
+To rectify this error on existing installations, you should **overwrite** the
+contents of `/etc/gitlab/gitlab-secrets.json` on the secondary node with the
+contents of `/etc/gitlab/gitlab-secrets.json` on the primary node, then run the
+following command on the secondary node:
+
+```bash
+sudo gitlab-ctl reconfigure
+```
+
+If you do not perform this step, you may find that two-factor authentication
+[is broken following DR](faq.md#i-followed-the-disaster-recovery-instructions-and-now-two-factor-auth-is-broken).
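+
To confirm that the two files match without printing any secret values, you can compare checksums instead of contents. The following is a minimal sketch, assuming `sha256sum` from GNU coreutils is available on both nodes. The function names `file_digest` and `digests_match` are hypothetical.

```bash
#!/usr/bin/env bash
# Print only the SHA-256 digest of a file, so its contents never reach the terminal.
file_digest() {
  sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed when two non-empty digests (one collected on each node) are identical.
digests_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# On each node, run as root:
#   file_digest /etc/gitlab/gitlab-secrets.json
# then compare the two printed values, for example:
#   digests_match "$primary_digest" "$secondary_digest" && echo "secrets match"
```

If the digests differ, repeat the overwrite and run `sudo gitlab-ctl reconfigure` on the secondary again.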
+
+To prevent SSH requests to the newly promoted primary node from failing
+due to an SSH host key mismatch when updating the primary domain's DNS record,
+you should perform the step to [Manually replicate primary SSH host keys](configuration.md#step-2-manually-replicate-primary-ssh-host-keys) in each
+secondary node.
+
+## Upgrading to GitLab 10.4
+
+There are no Geo-specific steps to take!
+
+## Upgrading to GitLab 10.3
+
+### Support for SSH repository synchronization removed
+
+In GitLab 10.2, synchronizing secondaries over SSH was deprecated. In 10.3,
+support is removed entirely. All installations will switch to the HTTP/HTTPS
+cloning method instead. Before upgrading, ensure that all your Geo nodes are
+configured to use this method and that it works for your installation. In
+particular, ensure that [Git access over HTTP/HTTPS is enabled](configuration.md#step-5-enable-git-access-over-http-https).
+
+Synchronizing repositories over the public Internet using HTTP is insecure, so
+you should ensure that you have HTTPS configured before upgrading. Note that
+file synchronization is **also** insecure in these cases!
+
+## Upgrading to GitLab 10.2
+
+### Secure PostgreSQL replication
+
+Support for TLS-secured PostgreSQL replication has been added. If you are
+currently using PostgreSQL replication across the open internet without an
+external means of securing the connection (e.g., a site-to-site VPN), then you
+should immediately reconfigure your primary and secondary PostgreSQL instances
+according to the [updated instructions](database.md).
+
+If you *are* securing the connections externally and wish to continue doing so,
+ensure you include the new option `--sslmode=prefer` in future invocations of
+`gitlab-ctl replicate-geo-database`.
+
+### HTTPS repository sync
+
+Support for replicating repositories and wikis over HTTP/HTTPS has been added.
+Replicating over SSH has been deprecated, and support for this option will be
+removed in a future release.
+
+To switch to HTTP/HTTPS replication, log into the primary node as an admin and visit
+**Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`). For each secondary listed,
+press the "Edit" button, change the "Repository cloning" setting from
+"SSH (deprecated)" to "HTTP/HTTPS", and press "Save changes". This should take
+effect immediately.
+
+Any new secondaries should be created using HTTP/HTTPS replication - this is the
+default setting.
+
+After you've verified that HTTP/HTTPS replication is working, you should remove
+the now-unused SSH keys from your secondaries, as they may cause problems if the
+secondary is ever promoted to a primary:
+
+1. **[secondary]** Log in to **all** your secondary nodes and run:
+
+    ```bash
+    sudo -u git -H rm ~git/.ssh/id_rsa ~git/.ssh/id_rsa.pub
+    ```
+
+### Hashed Storage
+
+>**Warning**
+Hashed storage is in **Alpha**. It is considered experimental and not
+production-ready. See [Hashed
+Storage](../repository_storage_types.md) for more detail.
+
+If you previously enabled Hashed Storage and migrated all your existing
+projects to Hashed Storage, disabling hashed storage will not migrate projects
+to their previous project based storage path. As such, once enabled and
+migrated we recommend leaving Hashed Storage enabled.
+
+## Upgrading to GitLab 10.1
+
+>**Warning**
+Hashed storage is in **Alpha**. It is considered experimental and not
+production-ready. See [Hashed
+Storage](../repository_storage_types.md) for more detail.
+
+[Hashed storage](../repository_storage_types.md) was introduced
+in GitLab 10.0, and a [migration path](../raketasks/storage.md)
+for existing repositories was added in GitLab 10.1.
+
+## Upgrading to GitLab 10.0
+
+Since GitLab 10.0, we require all **Geo** systems to [use SSH key lookups via
+the database](../operations/fast_ssh_key_lookup.md) to avoid having to maintain consistency of the
+`authorized_keys` file for SSH access. Failing to do this will prevent users
+from being able to clone via SSH.
+
+Note that in older versions of Geo, attachments downloaded on the secondary
+nodes would be saved to the wrong directory. We recommend that you do the
+following to clean this up.
+
+On the SECONDARY Geo nodes, run as root:
+
+```sh
+mv /var/opt/gitlab/gitlab-rails/working /var/opt/gitlab/gitlab-rails/working.old
+mkdir /var/opt/gitlab/gitlab-rails/working
+chmod 700 /var/opt/gitlab/gitlab-rails/working
+chown git:git /var/opt/gitlab/gitlab-rails/working
+```
+
+You may delete `/var/opt/gitlab/gitlab-rails/working.old` at any time.
+
+Once this is done, we advise restarting GitLab on the secondary nodes for the
+new working directory to be used:
+
+```
+sudo gitlab-ctl restart
+```
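To confirm that the recreated working directory has the intended mode and owner, a check along the following lines may help. It is only a sketch: `check_working_dir` is a hypothetical name, and the default owner `git` simply mirrors the `chown` command above.

```shell
# Hypothetical check: succeed only if DIR is a directory whose mode is
# exactly 0700 and whose owner is OWNER (default: git, as in the chown above).
check_working_dir() {
  dir="$1"
  owner="${2:-git}"
  # -prune keeps find from descending; the path is printed only when
  # every test matches, so empty output means the check failed.
  [ -n "$(find "$dir" -prune -type d -perm 700 -user "$owner" 2>/dev/null)" ]
}
```

For example, `check_working_dir /var/opt/gitlab/gitlab-rails/working && echo OK` prints `OK` only when the directory matches. The group is not checked here; `ls -ld` shows it if needed.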
+
+## Upgrading from GitLab 9.3 or older
+
+If you started running Geo on GitLab 9.3 or older, we recommend that you
+resync your secondary PostgreSQL databases to use replication slots. If you
+started using Geo with GitLab 9.4 or 10.x, no further action should be
+required because replication slots are used by default. However, if you
+started with GitLab 9.3 and upgraded later, you should still follow the
+instructions below.
+
+When in doubt, it does not hurt to do a resync. The easiest way to do this in
+Omnibus is the following:
+
+  1. Install GitLab on the primary server.
+  1. Run `gitlab-ctl reconfigure` and `gitlab-ctl restart postgresql`. This will enable replication slots on the primary database.
+  1. Install GitLab on the secondary server.
+  1. Re-run the [database replication process](database.md#step-3-initiate-the-replication-process).
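If it is unclear whether replication slots are already in use, the primary database can be asked directly. The sketch below counts rows in `pg_replication_slots`, a standard PostgreSQL view. The default command `sudo gitlab-psql -d gitlabhq_production` is an assumption about an Omnibus installation, and `count_replication_slots` and the `PSQL` override are hypothetical names introduced for illustration.

```shell
# Sketch: count the replication slots defined on the primary database.
# The default psql command is an assumption for Omnibus installations;
# set PSQL to another command if yours differs.
count_replication_slots() {
  # The expansion is left unquoted on purpose so it splits into command + args.
  ${PSQL:-sudo gitlab-psql -d gitlabhq_production} -t -A \
    -c 'SELECT count(*) FROM pg_replication_slots;'
}
```

A result of `0` on the primary would suggest that no secondary is using a replication slot, in which case the resync described above is likely worthwhile.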
+
+## Special update notes for 9.0.x
+
+> **IMPORTANT**:
+With GitLab 9.0, the PostgreSQL version is upgraded to 9.6 and manual steps are
+required in order to update the secondary nodes and keep the Streaming
+Replication working. Downtime is required, so plan ahead.
+
+The following steps apply only if you upgrade from GitLab 8.17 to
+9.0+. For previous versions, update to GitLab 8.17 first before attempting to
+upgrade to 9.0+.
+
+---
+
+Make sure to follow the steps in the exact order they appear below, and pay
+extra attention to which node (primary/secondary) you execute them on. Each step
+is prefixed with the relevant node for clarity:
+
+1. **[secondary]** Log in to **all** your secondary nodes and stop all services:
+
+    ```bash
+    sudo gitlab-ctl stop
+    ```
+
+1. **[secondary]** Make a backup of the `recovery.conf` file on **all**
+   secondary nodes to preserve PostgreSQL's credentials:
+
+    ```
+    sudo cp /var/opt/gitlab/postgresql/data/recovery.conf /var/opt/gitlab/
+    ```
+
+1. **[primary]** Update the primary node to GitLab 9.0 following the
+   [regular update docs][update]. At the end of the update, the primary node
+   will be running with PostgreSQL 9.6.
+
+1. **[primary]** To prevent a de-synchronization of the repository replication,
+   stop all services except `postgresql` as we will use it to re-initialize the
+   secondary node's database:
+
+    ```
+    sudo gitlab-ctl stop
+    sudo gitlab-ctl start postgresql
+    ```
+
+1. **[secondary]** Run the following steps on each of the secondaries:
+
+    1. **[secondary]** Stop all services:
+
+        ```
+        sudo gitlab-ctl stop
+        ```
+
+    1. **[secondary]** Prevent running database migrations:
+
+        ```
+        sudo touch /etc/gitlab/skip-auto-migrations
+        ```
+
+    1. **[secondary]** Move the old database to another directory:
+
+        ```
+        sudo mv /var/opt/gitlab/postgresql{,.bak}
+        ```
+
+    1. **[secondary]** Update to GitLab 9.0 following the [regular update docs][update].
+       At the end of the update, the node will be running with PostgreSQL 9.6.
+
+    1. **[secondary]** Make sure all services are up:
+
+        ```
+        sudo gitlab-ctl start
+        ```
+
+    1. **[secondary]** Reconfigure GitLab:
+
+        ```
+        sudo gitlab-ctl reconfigure
+        ```
+
+    1. **[secondary]** Run the PostgreSQL upgrade command:
+
+        ```
+        sudo gitlab-ctl pg-upgrade
+        ```
+
+    1. **[secondary]** Look up the stored database credentials, which you will
+       need to re-initialize the replication:
+
+        ```
+        sudo grep -s primary_conninfo /var/opt/gitlab/recovery.conf
+        ```
+
+    1. **[secondary]** Create the `replica.sh` script as described in the
+       [database configuration document](database.md#step-3-initiate-the-replication-process).
+
+    1. **[secondary]** Run the `replica.sh` script using the credentials you
+       retrieved earlier:
+
+        ```
+        sudo bash /tmp/replica.sh
+        ```
+
+    1. **[secondary]** Reconfigure GitLab:
+
+        ```
+        sudo gitlab-ctl reconfigure
+        ```
+
+    1. **[secondary]** Start all services:
+
+        ```
+        sudo gitlab-ctl start
+        ```
+
+    1. **[secondary]** Repeat the steps for the rest of the secondaries.
+
+1. **[primary]** After all secondaries are updated, start all services on the
+   primary:
+
+    ```
+    sudo gitlab-ctl start
+    ```
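The `primary_conninfo` line printed by the `grep` step above packs several `key=value` settings into one quoted string. As a rough illustration of pulling out a single setting (for example the host or user to replicate from), the sketch below splits that line on spaces and quotes. `conninfo_field` is a hypothetical helper, and it assumes values that contain no spaces or quote characters.

```shell
# Hypothetical helper: print one key=value field from the primary_conninfo
# line of a recovery.conf backup. Assumes simple values without spaces.
conninfo_field() {
  file="$1"
  key="$2"
  grep -s primary_conninfo "$file" \
    | tr " '" '\n\n' \
    | sed -n "s/^$key=//p" \
    | head -n 1
}
```

For instance, `sudo sh -c '...'` around `conninfo_field /var/opt/gitlab/recovery.conf host` is not needed if the backup is readable by your user; otherwise run the function in a root shell.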
+
+## Check status after updating
+
+Now that the update process is complete, you may want to check whether
+everything is working correctly:
+
+1. Run the Geo Rake task on all nodes. Everything should be green:
+
+    ```
+    sudo gitlab-rake gitlab:geo:check
+    ```
+
+1. Check the primary's Geo dashboard for any errors.
+1. Test the data replication by pushing code to the primary and checking that
+   it is received by the secondaries.
+
+## Update tracking database on secondary node
+
+After updating a secondary node, you might need to run migrations on
+the tracking database. The tracking database was added in GitLab 9.1,
+and has been required since GitLab 10.0.
+
+1. Run database migrations on the tracking database:
+
+    ```
+    sudo gitlab-rake geo:db:migrate
+    ```
+
+1. Repeat this step for every secondary node.
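With several secondaries, repeating the migration by hand is easy to get wrong. One possible approach, sketched below, is a loop over SSH with a dry-run mode. `migrate_tracking_dbs` and the `DRY_RUN` variable are hypothetical, and the sketch assumes you can SSH to each secondary as a user allowed to run `sudo gitlab-rake`.

```shell
# Hypothetical helper: run the tracking-database migration on each secondary
# given as an argument. Set DRY_RUN=1 to print the commands instead.
migrate_tracking_dbs() {
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf 'ssh %s sudo gitlab-rake geo:db:migrate\n' "$host"
    else
      # Stop at the first secondary whose migration fails.
      ssh "$host" sudo gitlab-rake geo:db:migrate || return 1
    fi
  done
}
```

Running it first with `DRY_RUN=1` and a list of host names shows exactly what would be executed before anything is changed.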
+
+[update]: ../update/README.md
--- a/doc/administration/geo/replication/using_a_geo_server.md
+++ b/doc/administration/geo/replication/using_a_geo_server.md
+[//]: # (Please update EE::GitLab::GeoGitAccess::GEO_SERVER_DOCS_URL if this file is moved)
+
+# Using a Geo Server
+
+After you set up the [database replication and configure the Geo nodes][req],
+there are a few things to consider:
+
+1. Users need an extra step to be able to fetch code from the secondary and push
+   to the primary:
+
+     1. Clone the repository as you would normally do, but from the secondary node:
+
+         ```bash
+         git clone git@secondary.gitlab.example.com:user/repo.git
+         ```
+
+     1. Change the remote push URL to always push to the primary, following this example:
+
+         ```bash
+         git remote set-url --push origin git@primary.gitlab.example.com:user/repo.git
+         ```
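The two commands above can be combined so that a fresh clone is set up for Geo in one go. This is a sketch only: `clone_via_geo` is a hypothetical helper, and both URLs and the destination directory are supplied by the caller.

```shell
# Sketch: clone from the secondary, then route pushes to the primary.
clone_via_geo() {
  secondary_url="$1"
  primary_url="$2"
  dest="$3"
  git clone "$secondary_url" "$dest" &&
    git -C "$dest" remote set-url --push origin "$primary_url"
}
```

Afterwards, `git remote -v` inside the clone should list the secondary URL for fetch and the primary URL for push.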
+
+[req]: README.md#setup-instructions
--- a/doc/gitlab-geo/README.md
+++ b/doc/gitlab-geo/README.md
-# Geo (Geo Replication)
-
-> **Notes:**
- Geo is part of [GitLab Premium][ee].
- Introduced in GitLab Enterprise Edition 8.9.
-  We recommend you use it with at least GitLab Enterprise Edition 10.0 for
-  basic Geo features, or latest version for a better experience.
- You should make sure that all nodes run the same GitLab version.
- Geo requires PostgreSQL 9.6 and Git 2.9 in addition to GitLab's usual
-  [minimum requirements](../install/requirements.md)
- Using Geo in combination with High Availability is considered **GA** in GitLab Enterprise Edition 10.4
-
->**Note:**
-Geo changes significantly from release to release. Upgrades **are**
-supported and [documented](#updating-the-geo-nodes), but you should ensure that
-you're following the right version of the documentation for your installation!
-The best way to do this is to follow the documentation from the `/help` endpoint
-on your **primary** node, but you can also navigate to [this page on GitLab.com](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/doc/gitlab-geo/README.md)
-and choose the appropriate release from the `tags` dropdown, e.g., `v10.0.0-ee`.
-
-Geo allows you to replicate your GitLab instance to other geographical
-locations as a read-only fully operational version.
-
-## Overview
-
-If you have two or more teams geographically spread out, but your GitLab
-instance is in a single location, fetching large repositories can take a long
-time.
-
-Your Geo instance can be used for cloning and fetching projects, in addition to
-reading any data. This will make working with large repositories over large
-distances much faster.
-
-![Geo overview](img/geo-overview.png)
-
-When Geo is enabled, we refer to your original instance as a **primary** node
-and the replicated read-only ones as **secondaries**.
-
-Keep in mind that:
-
- Secondaries talk to the primary to get user data for logins (API) and to
-  replicate repositories, LFS Objects and Attachments (HTTPS + JWT).
- Since GitLab Premium 10.0, the primary no longer talks to
-  secondaries to notify for changes (API).
-
-## Use-cases
-
- Can be used for cloning and fetching projects, in addition
-to reading any data available in the GitLab web interface (see [current limitations](#current-limitations))
- Overcomes slow connection between distant offices, saving time by
-improving speed for distributed teams
- Helps reducing the loading time for automated tasks,
-custom integrations and internal workflows
- Quickly fail-over to a Geo secondary in a
-[Disaster Recovery](../administration/disaster_recovery/index.md) scenario
- Allows [planned fail-over](../administration/disaster_recovery/planned_fail_over.md) to a Geo secondary
-
-## Architecture
-
-The following diagram illustrates the underlying architecture of Geo:
-
-![Geo architecture](img/geo-architecture.png)
-
-[Source diagram](https://docs.google.com/drawings/d/1Abw0P_H0Ew1-2Lj_xPDRWP87clGIke-1fil7_KQqrtE/edit)
-
-In this diagram, there is one Geo primary node and one secondary. The
-secondary clones repositories via git over HTTPS. Attachments, LFS objects, and
-other files are downloaded via HTTPS using the GitLab API to authenticate,
-with a special endpoint protected by JWT.
-
-Writes to the database and Git repositories can only be performed on the Geo
-primary node. The secondary node receives database updates via PostgreSQL
-streaming replication.
-
-Note that the secondary needs two different PostgreSQL databases: a read-only
-instance that streams data from the main GitLab database and another used
-internally by the secondary node to record what data has been replicated.
-
-In the secondary nodes there is an additional daemon: Geo Log Cursor.
-
-## Geo Recommendations
-
-We highly recommend that you install Geo on an operating system that supports
-OpenSSH 6.9 or higher. The following operating systems are known to ship with a
-current version of OpenSSH:
-
-    * CentOS 7.4
-    * Ubuntu 16.04
-
-Note that CentOS 6 and 7.0 ship with an old version of OpenSSH that do not
-support a feature that Geo requires. See the [documentation on Geo SSH
-access](../administration/operations/fast_ssh_key_lookup.md) for more details.
-
-### LDAP
-
-We recommend that if you use LDAP on your primary that you also set up a
-secondary LDAP server for the secondary Geo node. Otherwise, users will not be
-able to perform Git operations over HTTP(s) on the **secondary** Geo node
-using HTTP Basic Authentication. However, Git via SSH and personal access
-tokens will still work.
-
-Check with your LDAP provider for instructions on on how to set up
-replication. For example, OpenLDAP provides [these
-instructions](https://www.openldap.org/doc/admin24/replication.html).
-
-### Geo Tracking Database
-
-We use the tracking database as metadata to control what needs to be
-updated on the disk of the local instance (for example, download new assets,
-fetch new LFS Objects or fetch changes from a repository that has recently been
-updated).
-
-Because the replicated instance is read-only, we need this additional instance
-per secondary location.
-
-### Geo Log Cursor
-
-This daemon reads a log of events replicated by the primary node to the secondary
-database and updates the Geo Tracking Database with changes that need to be
-executed.
-
-When something is marked to be updated in the tracking database, asynchronous
-jobs running on the secondary node will execute the required operations and
-update the state.
-
-This new architecture allows us to be resilient to connectivity issues between the
-nodes. It doesn't matter if it was just a few minutes or days. The secondary
-instance will be able to replay all the events in the correct order and get in
-sync again.
-
-## Setup instructions
-
-These instructions assume you have a working instance of GitLab. They will
-guide you through making your existing instance the primary Geo node and
-adding secondary Geo nodes.
-
-The steps below should be followed in the order they appear. **Make sure the
-GitLab version is the same on all nodes.**
-
-### Using Omnibus GitLab
-
-If you installed GitLab using the Omnibus packages (highly recommended):
-
-1. [Install GitLab Enterprise Edition][install-ee] on the server that will serve
-   as the **secondary** Geo node. Do not create an account or login to the new
-   secondary node.
-1. [Upload the GitLab License](../user/admin_area/license.md) on the **primary**
-   Geo node to unlock Geo.
-1. [Setup the database replication](database.md) (`primary (read-write) <->
-   secondary (read-only)` topology).
-1. [Configure fast lookup of authorized SSH keys in the database](../administration/operations/fast_ssh_key_lookup.md),
-   this step is required and needs to be done on both the primary AND secondary nodes.
-1. [Configure GitLab](configuration.md) to set the primary and secondary nodes.
-1. Optional: [Configure a secondary LDAP server](../administration/auth/ldap.md)
-   for the secondary. See [notes on LDAP](#ldap).
-1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
-
-[install-ee]: https://about.gitlab.com/downloads-ee/ "GitLab Enterprise Edition Omnibus packages downloads page"
-
-### Using GitLab installed from source
-
-If you installed GitLab from source:
-
-1. [Install GitLab Enterprise Edition][install-ee-source] on the server that
-   will serve as the **secondary** Geo node. Do not create an account or login
-   to the new secondary node.
-1. [Upload the GitLab License](../user/admin_area/license.md) on the **primary**
-   Geo node to unlock Geo.
-1. [Setup the database replication](database_source.md) (`primary (read-write)
-   <-> secondary (read-only)` topology).
-1. [Configure fast lookup of authorized SSH keys in the database](../administration/operations/fast_ssh_key_lookup.md),
-   do this step for both primary AND secondary nodes.
-1. [Configure GitLab](configuration_source.md) to set the primary and secondary
-   nodes.
-1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
-
-[install-ee-source]: https://docs.gitlab.com/ee/install/installation.html "GitLab Enterprise Edition installation from source"
-
-## Configuring Geo
-
-Read through the [Geo configuration](configuration.md) documentation.
-
-## Updating the Geo nodes
-
-Read how to [update your Geo nodes to the latest GitLab version](updating_the_geo_nodes.md).
-
-## Configuring Geo HA
-
-Read through the [Geo High Availability documentation](ha.md).
-
-## Configuring Geo with Object storage
-
-When you have object storage enabled, please consult the
-[Geo with Object Storage](object_storage.md) documentation.
-
-## Replicating the Container Registry
-
-Read how to [replicate the Container Registry](docker_registry.md).
-
-## Current limitations
-
- You cannot push code to secondary nodes, see [3912](https://gitlab.com/gitlab-org/gitlab-ee/issues/3912) for details.
- The primary node has to be online for OAuth login to happen (existing sessions and Git are not affected)
- It works for repos, wikis, issues, and merge requests, but it does not work for job logs, artifacts, GitLab Pages, and Docker images of the Container
-  Registry (by default, but you can configure it separately, see [replicate the Container Registry](docker_registry.md) for details)
- The installation takes multiple manual steps that together can take about an hour depending on circumstances; we are working on improving this experience, see [#2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details.
-
-## Frequently Asked Questions
-
-Read more in the [Geo FAQ](faq.md).
-
-## Log files
-
-Since GitLab 9.5, Geo stores structured log messages in a `geo.log` file. For
-Omnibus installations, this file can be found in
-`/var/log/gitlab/gitlab-rails/geo.log`. This file contains information about
-when Geo attempts to sync repositories and files. Each line in the file contains a
-separate JSON entry that can be ingested into Elasticsearch, Splunk, etc. For
-example:
-
-```json
-{"severity":"INFO","time":"2017-08-06T05:40:16.104Z","message":"Repository update","project_id":1,"source":"repository","resync_repository":true,"resync_wiki":true,"class":"Gitlab::Geo::LogCursor::Daemon","cursor_delay_s":0.038}
-```
-
-This message shows that Geo detected that a repository update was needed for project 1.
-
-## Security of Geo
-
-Read the [security review](security-review.md) page.
-
-## Tuning Geo
-
-Read the [Geo tuning](tuning.md) documentation.
-
-## Troubleshooting
-
-Read the [troubleshooting document](troubleshooting.md).
-
-[ee]: https://about.gitlab.com/products/ "GitLab Enterprise Edition landing page"
-[install-ee]: https://about.gitlab.com/downloads-ee/ "GitLab Enterprise Edition Omnibus packages downloads page"
-[install-ee-source]: https://docs.gitlab.com/ee/install/installation.html "GitLab Enterprise Edition installation from source"
+This document was moved to [another location](../administration/geo/index.md).
--- a/doc/gitlab-geo/configuration.md
+++ b/doc/gitlab-geo/configuration.md
-# Geo configuration
-
->**Note:**
-This is the documentation for the Omnibus GitLab packages. For installations
-from source, follow the [**Geo nodes configuration for installations
-from source**](configuration_source.md) guide.
-
-## Configuring a new secondary node
-
->**Note:**
-This is the final step in setting up a secondary Geo node. Stages of the
-setup process must be completed in the documented order.
-Before attempting the steps in this stage, [complete all prior stages](README.md#using-omnibus-gitlab).
-
-The basic steps of configuring a secondary node are to replicate required
-configurations between the primary and the secondaries; to configure a tracking
-database on each secondary; and to start GitLab on the secondary node.
-
-You are encouraged to first read through all the steps before executing them
-in your testing/production environment.
-
->**Notes:**
- **Do not** setup any custom authentication in the secondary nodes, this will be
-  handled by the primary node.
- **Do not** add anything in the secondaries Geo nodes admin area
-  (**Admin Area ➔ Geo Nodes**). This is handled solely by the primary node.
-
-### Step 1. Manually replicate secret GitLab values
-
-GitLab stores a number of secret values in the `/etc/gitlab/gitlab-secrets.json`
-file which *must* match between the primary and secondary nodes. Until there is
-a means of automatically replicating these between nodes (see
-[issue #3789](https://gitlab.com/gitlab-org/gitlab-ee/issues/3789)), they must
-be manually replicated to the secondary.
-
-1. SSH into the **primary** node, and execute the command below:
-
-    ```bash
-    sudo cat /etc/gitlab/gitlab-secrets.json
-    ```
-
-    This will display the secrets that need to be replicated, in JSON format.
-
-1. SSH into the **secondary** node and login as the `root` user:
-
-    ```
-    sudo -i
-    ```
-
-1. Make a backup of any existing secrets:
-
-    ```bash
-    mv /etc/gitlab/gitlab-secrets.json /etc/gitlab/gitlab-secrets.json.`date +%F`
-    ```
-
-1. Copy `/etc/gitlab/gitlab-secrets.json` from the primary to the secondary, or
-   copy-and-paste the file contents between nodes:
-
-    ```bash
-    sudo editor /etc/gitlab/gitlab-secrets.json
-
-    # paste the output of the `cat` command you ran on the primary
-    # save and exit
-    ```
-
-1. Ensure the file permissions are correct:
-
-    ```bash
-    chown root:root /etc/gitlab/gitlab-secrets.json
-    chmod 0600 /etc/gitlab/gitlab-secrets.json
-    ```
-
-1. Reconfigure the secondary node for the change to take effect:
-
-    ```
-    gitlab-ctl reconfigure
-    ```
-
-Once reconfigured, the secondary will automatically start
-replicating missing data from the primary in a process known as backfill.
-Meanwhile, the primary node will start to notify the secondary of any changes, so
-that the secondary can act on those notifications immediately.
-
-Make sure the secondary instance is
-running and accessible. You can login to the secondary node
-with the same credentials as used in the primary.
-
-### Step 2. Manually replicate primary SSH host keys
-
-GitLab integrates with the system-installed SSH daemon, designating a user
-(typically named git) through which all access requests are handled.
-
-In a [Disaster Recovery](../administration/disaster_recovery/index.md) situation, GitLab system
-administrators will promote a secondary Geo replica to a primary and they can
-update the DNS records for the primary domain to point to the secondary to prevent
-the need to update all references to the primary domain to the secondary domain,
-like changing Git remotes and API URLs.
-
-This will cause all SSH requests to the newly promoted primary node from
-failing due to SSH host key mismatch. To prevent this, the primary SSH host
-keys must be manually replicated to the secondary node.
-
-1. SSH into the **secondary** node and login as the `root` user:
-
-    ```
-    sudo -i
-    ```
-
-1. Make a backup of any existing SSH host keys:
-
-    ```bash
-    find /etc/ssh -iname ssh_host_* -exec cp {} {}.backup.`date +%F` \;
-    ```
-
-1. SSH into the **primary** node, and execute the command below:
-
-    ```bash
-    sudo find /etc/ssh -iname ssh_host_* -not -iname '*.pub'
-    ```
-
-1. For each file in that list replace the file from the primary node to
-   the **same** location on your **secondary** node.
-
-1. On your **secondary** node, ensure the file permissions are correct:
-
-    ```bash
-    chown root:root /etc/ssh/ssh_host_*
-    chmod 0600 /etc/ssh/ssh_host_*
-    ```
-
-1. Regenerate the public keys from the private keys:
-
-    ```bash
-    find /etc/ssh -iname ssh_host_* -not -iname '*.backup*' -exec sh -c 'ssh-keygen -y -f "{}" > "{}.pub"' \;
-    ```
-
-1. Restart sshd:
-
-    ```bash
-    service ssh restart
-    ```
-
-### Step 3. (Optional) Enabling hashed storage (from GitLab 10.0)
-
->**Warning**
-Hashed storage is in **Beta**. It is not considered production-ready. See
-[Hashed Storage](../administration/repository_storage_types.md) for more detail,
-and for the latest updates, check
-[infrastructure issue #2821](https://gitlab.com/gitlab-com/infrastructure/issues/2821).
-
-Using hashed storage significantly improves Geo replication - project and group
-renames no longer require synchronization between nodes.
-
-1. Visit the **primary** node's **Admin Area ➔ Settings**
-   (`/admin/application_settings`) in your browser
-1. In the `Repository Storages` section, check `Create new projects using hashed storage paths`:
-
-    ![](img/hashed-storage.png)
-
-### Step 4. (Optional) Configuring the secondary to trust the primary
-
-You can safely skip this step if your primary uses a CA-issued HTTPS certificate.
-
-If your primary is using a self-signed certificate for *HTTPS* support, you will
-need to add that certificate to the secondary's trust store. Retrieve the
-certificate from the primary and follow
-[these instructions](https://docs.gitlab.com/omnibus/settings/ssl.html)
-on the secondary.
-
-### Step 5. Enable Git access over HTTP/HTTPS
-
-Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone
-method to be enabled. Navigate to **Admin Area ➔ Settings**
-(`/admin/application_settings`) on the primary node, and set
-`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`.
-
-### Step 6. Verify proper functioning of the secondary node
-
-Congratulations! Your secondary geo node is now configured!
-
-You can login to the secondary node with the same credentials you used on the
-primary. Visit the secondary node's **Admin Area ➔ Geo Nodes**
-(`/admin/geo_nodes`) in your browser to check if it's correctly identified as a
-secondary Geo node and if Geo is enabled.
-
-The initial replication, or 'backfill', will probably still be in progress. You
-can monitor the synchronization process on each geo node from the primary
-node's Geo Nodes dashboard in your browser.
-
-![Geo dashboard](img/geo-node-dashboard.png)
-
-If your installation isn't working properly, check the
-[troubleshooting document](troubleshooting.md).
-
-The two most obvious issues that can become apparent in the dashboard are:
-
-1. Database replication not working well
-1. Instance to instance notification not working. In that case, it can be
-   something of the following:
-     - You are using a custom certificate or custom CA (see the
-       [troubleshooting document](troubleshooting.md))
-     - The instance is firewalled (check your firewall rules)
-
-Please note that disabling a secondary node will stop the sync process.
-
-Please note that if `git_data_dirs` is customized on the primary for multiple
-repository shards you must duplicate the same configuration on the secondary.
-
-Point your users to the ["Using a Geo Server" guide](using_a_geo_server.md).
-
-Currently, this is what is synced:
-
-* Git repositories
-* Wikis
-* LFS objects
-* Issues, merge requests, snippets, and comment attachments
-* Users, groups, and project avatars
-
-## Selective synchronization
-
-Geo supports selective synchronization, which allows admins to choose
-which projects should be synchronized by secondary nodes.
-
-It is important to note that selective synchronization does not:
-
-1. Restrict permissions from secondary nodes.
-1. Hide project metadata from secondary nodes.
-  * Since Geo currently relies on PostgreSQL replication, all project metadata
-    gets replicated to secondary nodes, but repositories that have not been
-    selected will be empty.
-1. Reduce the number of events generated for the Geo event log
-  * The primary generates events as long as any secondaries are present.
-    Selective synchronization restrictions are implemented on the secondaries,
-    not the primary.
-
-A subset of projects can be chosen, either by group or by storage shard. The
-former is ideal for replicating data belonging to a subset of users, while the
-latter is more suited to progressively rolling out Geo to a large GitLab
-instance.
-
-## Upgrading Geo
-
-See the [updating the Geo nodes document](updating_the_geo_nodes.md).
-
-## Troubleshooting
-
-See the [troubleshooting document](troubleshooting.md).
+This document was moved to [another location](../administration/geo/configuration.md).
--- a/doc/gitlab-geo/configuration_source.md
+++ b/doc/gitlab-geo/configuration_source.md
-# Geo configuration
-
->**Note:**
-This is the documentation for installations from source. For installations
-using the Omnibus GitLab packages, follow the
-[**Omnibus Geo nodes configuration**](configuration.md) guide.
-
-## Configuring a new secondary node
-
->**Note:**
-This is the final step in setting up a secondary Geo node. Stages of the setup
-process must be completed in the documented order. Before attempting the steps
-in this stage, [complete all prior stages](README.md#using-gitlab-installed-from-source).
-
-The basic steps of configuring a secondary node are to replicate required
-configurations between the primary and the secondaries; to configure a tracking
-database on each secondary; and to start GitLab on the secondary node.
-
-You are encouraged to first read through all the steps before executing them
-in your testing/production environment.
-
-
->**Notes:**
- **Do not** setup any custom authentication in the secondary nodes, this will be
-  handled by the primary node.
- **Do not** add anything in the secondaries Geo nodes admin area
-  (**Admin Area ➔ Geo Nodes**). This is handled solely by the primary node.
-
-### Step 1. Manually replicate secret GitLab values
-
-GitLab stores a number of secret values in the `/home/git/gitlab/config/secrets.yml`
-file which *must* match between the primary and secondary nodes. Until there is
-a means of automatically replicating these between nodes (see
-[issue #3789](https://gitlab.com/gitlab-org/gitlab-ee/issues/3789)), they must
-be manually replicated to the secondary.
-
-1. SSH into the **primary** node, and execute the command below:
-
-    ```bash
-    sudo cat /home/git/gitlab/config/secrets.yml
-    ```
-
-    This will display the secrets that need to be replicated, in YAML format.
-
-1. SSH into the **secondary** node and login as the `git` user:
-
-    ```bash
-    sudo -i -u git
-    ```
-
-1. Make a backup of any existing secrets:
-
-    ```bash
-    mv /home/git/gitlab/config/secrets.yml /home/git/gitlab/config/secrets.yml.`date +%F`
-    ```
-
-1. Copy `/home/git/gitlab/config/secrets.yml` from the primary to the secondary, or
-   copy-and-paste the file contents between nodes:
-
-    ```bash
-    sudo editor /home/git/gitlab/config/secrets.yml
-
-    # paste the output of the `cat` command you ran on the primary
-    # save and exit
-    ```
-
-1. Ensure the file permissions are correct:
-
-    ```bash
-    chown git:git /home/git/gitlab/config/secrets.yml
-    chmod 0600 /home/git/gitlab/config/secrets.yml
-    ```
-
-1. Restart GitLab for the changes to take effect:
-
-    ```bash
-    service gitlab restart
-    ```
-
-Once restarted, the secondary will automatically start replicating missing data
-from the primary in a process known as backfill. Meanwhile, the primary node
-will start to notify the secondary of any changes, so that the secondary can
-act on those notifications immediately.
-
-Make sure the secondary instance is running and accessible. You can login to
-the secondary node with the same credentials as used in the primary.
-
-### Step 2. Manually replicate primary SSH host keys
-
-Read [Manually replicate primary SSH host keys](configuration.md#step-2-manually-replicate-primary-ssh-host-keys)
-
-### Step 3. (Optional) Enabling hashed storage (from GitLab 10.0)
-
-Read [Enabling Hashed Storage](configuration.md#step-3-optional-enabling-hashed-storage-from-gitlab-10-0)
-
-### Step 4. (Optional) Configuring the secondary to trust the primary
-
-You can safely skip this step if your primary uses a CA-issued HTTPS certificate.
-
-If your primary is using a self-signed certificate for *HTTPS* support, you will
-need to add that certificate to the secondary's trust store. Retrieve the
-certificate from the primary and follow your distribution's instructions for
-adding it to the secondary's trust store. In Debian/Ubuntu, for example, with a
-certificate file of `primary.geo.example.com.crt`, you would follow these steps:
-
-```
-sudo -i
-cp primary.geo.example.com.crt /usr/local/share/ca-certificates
-update-ca-certificates
-```
-
-### Step 5. Enable Git access over HTTP/HTTPS
-
-Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone
-method to be enabled. Navigate to **Admin Area ➔ Settings**
-(`/admin/application_settings`) on the primary node, and set
-`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`.
-
-### Step 6. Verify proper functioning of the secondary node
-
-Read [Verify proper functioning of the secondary node](configuration.md#step-6-verify-proper-functioning-of-the-secondary-node).
-
-
-## Selective synchronization
-
-Read [Selective synchronization](configuration.md#selective-synchronization).
-
-## Troubleshooting
-
-Read the [troubleshooting document](troubleshooting.md).
+This document was moved to [another location](../administration/geo/configuration_source.md).
--- a/doc/gitlab-geo/database.md
+++ b/doc/gitlab-geo/database.md
-# Geo database replication
-
->**Note:**
-This is the documentation for the Omnibus GitLab packages. For installations
-from source, follow the
-[**database replication for installations from source**](database_source.md) guide.
-
->**Note:**
-If your GitLab installation uses external PostgreSQL, the Omnibus roles
-will not be able to perform all necessary configuration steps. Refer to the
-section on [External PostgreSQL][external postgresql] for additional instructions.
-
->**Note:**
-The stages of the setup process must be completed in the documented order.
-Before attempting the steps in this stage, [complete all prior stages][toc].
-
-This document describes the minimal steps you have to take in order to
-replicate your primary GitLab database to a secondary node's database. You may
-have to change some values according to your database setup, how big it is, etc.
-
-You are encouraged to first read through all the steps before executing them
-in your testing/production environment.
-
-
-## PostgreSQL replication
-
-The GitLab primary node where the write operations happen will connect to
-the primary database server, and the secondary nodes which are read-only will
-connect to the secondary database servers (which are also read-only).
-
->**Note:**
-In database documentation you may see "primary" being referenced as "master"
-and "secondary" as either "slave" or "standby" server (read-only).
-
-We recommend using [PostgreSQL replication
-slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
-to ensure that the primary retains all the data necessary for the secondaries to
-recover. See below for more details.
-
-The following guide assumes that:
-
-- You are using Omnibus and therefore you are using PostgreSQL 9.6 or later
-  which includes the [`pg_basebackup` tool][pgback] and improved
-  [Foreign Data Wrapper][FDW] support.
-- You have a primary node already set up (the GitLab server you are
-  replicating from), running Omnibus' PostgreSQL (or equivalent version), and
-  you have a new secondary server set up with the same versions of the OS,
-  PostgreSQL, and GitLab on all nodes.
-- The IP of the primary server for our examples will be `1.2.3.4`, whereas the
-  secondary's IP will be `5.6.7.8`. Note that the primary and secondary servers
-  **must** be able to communicate over these addresses. More on this in the
-  guide below.
-
-
-### Step 1. Configure the primary server
-
-1. SSH into your GitLab **primary** server and login as root:
-
-    ```bash
-    sudo -i
-    ```
-
-1. Execute the command below to define the node as primary Geo node:
-
-    ```bash
-    gitlab-ctl set-geo-primary-node
-    ```
-
-    This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`.
-
-1. GitLab 10.4 and up only: Do the following to make sure the `gitlab` database user has a password defined.
-
-    Generate an MD5 hash of the desired password:
-
-    ```bash
-    gitlab-ctl pg-password-md5 gitlab
-    # Enter password: mypassword
-    # Confirm password: mypassword
-    # fca0b89a972d69f00eb3ec98a5838484
-    ```
-
-    Edit `/etc/gitlab/gitlab.rb`:
-
-    ```ruby
-    # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab`
-    postgresql['sql_user_password'] = 'fca0b89a972d69f00eb3ec98a5838484'
-
-    # If you have HA setup, this must be present in all nodes as well
-    gitlab_rails['db_password'] = 'mypassword'
-    ```
-
-1. Omnibus GitLab already has a [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication)
-   called `gitlab_replicator`. You must set the password for this user manually.
-   You will be prompted to enter a password:
-
-    ```bash
-    gitlab-ctl set-replication-password
-    ```
-
-    This command will also read the `postgresql['sql_replication_user']` Omnibus
-    setting in case you have changed `gitlab_replicator` username to something
-    else.
-
-1. Configure PostgreSQL to listen on network interfaces
-
-    For security reasons, PostgreSQL does not listen on any network interfaces
-    by default. However, Geo requires the secondary to be able to
-    connect to the primary's database. For this reason, we need the address of
-    each node. Note: For external PostgreSQL instances, see [additional instructions][external postgresql].
-
-    If you are using a cloud provider, you can look up the addresses for each
-    Geo node through your cloud provider's management console.
-
-    To look up the address of a Geo node, SSH in to the Geo node and execute:
-
-    ```bash
-    ##
-    ## Private address
-    ##
-    ip route get 255.255.255.255 | awk '{print "Private address:", $NF; exit}'
-
-    ##
-    ## Public address
-    ##
-    echo "External address: $(curl ipinfo.io/ip)"
-    ```
-
-    In most cases, the following addresses will be used to configure GitLab
-    Geo:
-
-    | Configuration | Address |
-    |-----|-----|
-    | `postgresql['listen_address']` | Primary's private address |
-    | `postgresql['trust_auth_cidr_addresses']` | Primary's private address |
-    | `postgresql['md5_auth_cidr_addresses']` | Secondary's public addresses |
-
-    If you are using Google Cloud Platform, SoftLayer, or any other vendor that
-    provides a virtual private cloud you can use the secondary's private
-    address (corresponds to "internal address" for Google Cloud Platform) for
-    `postgresql['md5_auth_cidr_addresses']`.
-
-    The `listen_address` option opens PostgreSQL up to network connections
-    with the interface corresponding to the given address. See [the PostgreSQL
-    documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html)
-    for more details.
-
-    Depending on your network configuration, the suggested addresses may not
-    be correct. If your primary and secondary connect over a local
-    area network, or a virtual network connecting availability zones like
-    [Amazon's VPC](https://aws.amazon.com/vpc/) or [Google's VPC](https://cloud.google.com/vpc/)
-    you should use the secondary's private address for `postgresql['md5_auth_cidr_addresses']`.
-
-    Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
-    addresses with addresses appropriate to your network configuration:
-
-    ```ruby
-    geo_primary_role['enable'] = true
-
-    ##
-    ## Primary address
-    ## - replace '1.2.3.4' with the primary private address
-    ##
-    postgresql['listen_address'] = '1.2.3.4'
-    postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','1.2.3.4/32']
-
-    ##
-    ## Secondary addresses
-    ## - replace '5.6.7.8' with the secondary public address
-    ##
-    postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32']
-
-    ##
-    ## Replication settings
-    ## - set this to be the number of Geo secondary nodes you have
-    ##
-    postgresql['max_replication_slots'] = 1
-    # postgresql['max_wal_senders'] = 10
-    # postgresql['wal_keep_segments'] = 10
-
-    ##
-    ## Disable automatic database migrations temporarily
-    ## (until PostgreSQL is restarted and listening on the private address).
-    ##
-    gitlab_rails['auto_migrate'] = false
-    ```
-
-1. Optional: If you want to add another secondary, the relevant setting would look like:
-
-    ```ruby
-    postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32','9.10.11.12/32']
-    ```
-
-    You may also want to edit the `wal_keep_segments` and `max_wal_senders` to
-    match your database replication requirements. Consult the [PostgreSQL -
-    Replication documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html)
-    for more information.
-
-1. Save the file and reconfigure GitLab for the database listen changes and
-   the replication slot changes to be applied.
-
-    ```bash
-    gitlab-ctl reconfigure
-    ```
-
-    Restart PostgreSQL for its changes to take effect:
-
-    ```bash
-    gitlab-ctl restart postgresql
-    ```
-
-1. Re-enable migrations now that PostgreSQL is restarted and listening on the
-   private address.
-
-    Edit `/etc/gitlab/gitlab.rb` and **change** the configuration to `true`:
-
-    ```ruby
-    gitlab_rails['auto_migrate'] = true
-    ```
-
-    Save the file and reconfigure GitLab:
-
-    ```bash
-    gitlab-ctl reconfigure
-    ```
-
-1. Now that the PostgreSQL server is set up to accept remote connections, run
-   `netstat -plnt | grep 5432` to make sure that PostgreSQL is listening on port
-   `5432` on the primary server's private address.
-
-1. A certificate was automatically generated when GitLab was reconfigured. This
-   will be used automatically to protect your PostgreSQL traffic from
-   eavesdroppers, but to protect against active ("man-in-the-middle") attackers,
-   the secondary needs a copy of the certificate. Make a copy of the PostgreSQL
-    `server.crt` file on the primary node by running this command:
-
-    ```bash
-    cat ~gitlab-psql/data/server.crt
-    ```
-
-    Copy the output into a clipboard or into a local file. You
-    will need it when setting up the secondary! The certificate is not sensitive
-    data.
-
-### Step 2. Add the secondary GitLab node
-
-To prevent the secondary Geo node from trying to act as the primary once the
-database is replicated, the secondary Geo node must be added on the
-primary before the database is replicated.
-
-1. Visit the **primary** node's **Admin Area ➔ Geo Nodes**
-   (`/admin/geo_nodes`) in your browser.
-1. Add the secondary node by providing its full URL. **Do NOT** check the box
-   'This is a primary node'.
-1. Optionally, choose which namespaces should be replicated by the
-   secondary node. Leave blank to replicate all. Read more in
-   [selective synchronization](configuration.md#selective-synchronization).
-1. Click the **Add node** button.
-1. SSH into your GitLab **primary** server and login as root to verify the
-   secondary is reachable:
-
-    ```
-    gitlab-rake gitlab:geo:check
-    ```
-
-The new secondary Geo node will have the status **Unhealthy**. This is expected
-because we have not yet configured the secondary server. This is the next step.
-
-### Step 3. Configure the secondary server
-
-1. SSH into your GitLab **secondary** server and login as root:
-
-    ```
-    sudo -i
-    ```
-
-1. [Check TCP connectivity](../administration/raketasks/maintenance.md) to the
-   primary's PostgreSQL server:
-
-    ```bash
-    gitlab-rake gitlab:tcp_check[1.2.3.4,5432]
-    ```
-
-    If this step fails, you may be using the wrong IP address, or a firewall may
-    be preventing access to the server. Check the IP address, paying close
-    attention to the difference between public and private addresses and ensure
-    that, if a firewall is present, the secondary is permitted to connect to the
-    primary on port 5432.
-
-1. Create a file `server.crt` in the secondary server, with the content you got on the last step of the primary setup:
-
-    ```
-    editor server.crt
-    ```
-
-
-1. Set up PostgreSQL TLS verification on the secondary
-
-    Install the `server.crt` file:
-
-    ```bash
-    install -D -o gitlab-psql -g gitlab-psql -m 0400 -T server.crt ~gitlab-psql/.postgresql/root.crt
-    ```
-
-    PostgreSQL will now only recognize that exact certificate when verifying TLS
-    connections. The certificate can only be replicated by someone with access
-    to the private key, which is **only** present on the primary node.
-
-1. Test that the `gitlab-psql` user can connect to the primary's database:
-
-    ```bash
-    sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql --list -U gitlab_replicator -d "dbname=gitlabhq_production sslmode=verify-ca" -W -h 1.2.3.4
-    ```
-
-    When prompted enter the password you set in the first step for the
-    `gitlab_replicator` user. If all worked correctly, you should see
-    the list of primary's databases.
-
-    A failure to connect here indicates that the TLS configuration is incorrect.
-    Ensure that the contents of `~gitlab-psql/data/server.crt` on the primary
-    match the contents of `~gitlab-psql/.postgresql/root.crt` on the secondary.
-
-1. Configure PostgreSQL to enable FDW support
-
-    This step is similar to how we configured the primary instance.
-    It is required to enable FDW support, even if you are using a single node.
-
-    Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
-    addresses with addresses appropriate to your network configuration:
-
-    ```ruby
-    # Secondary addresses
-    # - replace '5.6.7.8' with the secondary private address
-    postgresql['listen_address'] = '5.6.7.8'
-    postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','5.6.7.8/32']
-    postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32']
-
-    # gitlab database user's password (defined previously)
-    gitlab_rails['db_password'] = 'mypassword'
-
-    # enable fdw for the geo tracking database
-    geo_secondary['db_fdw'] = true
-    ```
-
-1. Edit `/etc/gitlab/gitlab.rb` and add the following:
-
-    ```ruby
-    geo_secondary_role['enable'] = true
-    ```
-
-    For external PostgreSQL instances, [see additional instructions][external postgresql].
-    If you bring a former primary back online to serve as a secondary then you also need to remove `geo_primary_role['enable'] = true`.
-
-1. Reconfigure GitLab for the changes to take effect:
-
-    ```bash
-    gitlab-ctl reconfigure
-    ```
-
-1. Restart PostgreSQL for its changes to take effect:
-
-    ```bash
-    gitlab-ctl restart postgresql
-    ```
-
-### Step 4. Initiate the replication process
-
-Below we provide a script that connects the database on the secondary node to
-the database on the primary node, replicates the database, and creates the
-needed files for streaming replication.
-
-The directories used are the defaults that are set up in Omnibus. If you have
-changed any defaults or are using a source installation, configure it as you
-see fit replacing the directories and paths.
-
->**Warning:**
-Make sure to run this on the **secondary** server as it removes all PostgreSQL's
-data before running `pg_basebackup`.
-
-1. SSH into your GitLab **secondary** server and login as root:
-
-    ```
-    sudo -i
-    ```
-
-1. Choose a database-friendly name for your secondary to
-   use as the replication slot name. For example, if your domain is
-   `secondary.geo.example.com`, you may use `secondary_example` as the slot
-   name as shown in the commands below.
-
-1. Execute the command below to start a backup/restore and begin the replication.
-   >**Warning:** Each Geo secondary must have its own unique replication slot name.
-   Using the same slot name between two secondaries will break PostgreSQL replication.
-
-    ```bash
-    gitlab-ctl replicate-geo-database --slot-name=secondary_example --host=1.2.3.4
-    ```
-
-    When prompted, enter the _plaintext_ password you set up for the `gitlab_replicator`
-    user in the first step.
-
-    This command also takes a number of additional options. You can use `--help`
-    to list them all, but here are a couple of tips:
-       - If PostgreSQL is listening on a non-standard port, add `--port=` as well.
-       - If your database is too large to be transferred in 30 minutes, you will need
-         to increase the timeout, e.g., `--backup-timeout=3600` if you expect the
-         initial replication to take under an hour.
-       - Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether
-         (e.g., you know the network path is secure, or you are using a site-to-site
-         VPN). This is **not** safe over the public Internet!
-       - You can read more details about each `sslmode` in the
-         [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
-         the instructions above are carefully written to ensure protection against
-         both passive eavesdroppers and active "man-in-the-middle" attackers.
-       - Change the `--slot-name` to the name of the replication slot
-         to be used on the primary database. The script will attempt to create the
-         replication slot automatically if it does not exist.
-       - If you're repurposing an old server into a Geo secondary, you'll need to
-         add `--force` to the command line.
-       - When not on a production machine, you can skip the backup step, if you
-         are really sure this is what you want, by adding `--skip-backup`.
-
-1. Verify that the secondary is configured correctly and that the primary is
-   reachable:
-
-    ```
-    gitlab-rake gitlab:geo:check
-    ```
-
-The replication process is now complete.
-
-### External PostgreSQL instances
-
-For installations using external PostgreSQL instances, the `geo_primary_role`
-and `geo_secondary_role` include configuration changes that must be applied
-manually.
-
-The `geo_primary_role` makes configuration changes to `pg_hba.conf` and
-`postgresql.conf` on the primary:
-
-```
-##
-## Geo Primary
-## - pg_hba.conf
-##
-host    replication gitlab_replicator <trusted secondary IP>/32     md5
-```
-
-```
-##
-## Geo Primary Role
-## - postgresql.conf
-##
-sql_replication_user = gitlab_replicator
-wal_level = hot_standby
-max_wal_senders = 10
-wal_keep_segments = 50
-max_replication_slots = 1 # number of secondary instances
-hot_standby = on
-```
-
-The `geo_secondary_role` makes configuration changes to `postgresql.conf` and
-enables the Geo Log Cursor (`geo_logcursor`) and secondary tracking database
-on the secondary. It adds the following PostgreSQL settings to the default
-settings:
-
-```
-##
-## Geo Secondary Role
-## - postgresql.conf
-##
-wal_level = hot_standby
-max_wal_senders = 10
-wal_keep_segments = 10
-hot_standby = on
-```
-
-Geo secondary nodes use a tracking database to keep track of replication
-status and recover automatically from some replication issues. Follow the
-instructions for [enabling tracking database on the secondary server][tracking].
-
-## MySQL replication
-
-MySQL replication is not supported for Geo.
-
-## Troubleshooting
-
-Read the [troubleshooting document](troubleshooting.md).
-
-[pgback]: http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html
-[external postgresql]: #external-postgresql-instances
-[tracking]: database_source.md#enable-tracking-database-on-the-secondary-server
-[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
-[toc]: README.md#using-omnibus-gitlab
+This document was moved to [another location](../administration/geo/database.md).
--- a/doc/gitlab-geo/database_source.md
+++ b/doc/gitlab-geo/database_source.md
-# Geo database replication
-
->**Note:**
-This is the documentation for installations from source. For installations
-using the Omnibus GitLab packages, follow the
-[**database replication for Omnibus GitLab**](database.md) guide.
-
->**Note:**
-The stages of the setup process must be completed in the documented order.
-Before attempting the steps in this stage, [complete all prior stages][toc].
-
-This document describes the minimal steps you have to take in order to
-replicate your primary GitLab database to a secondary node's database. You may
-have to change some values according to your database setup, how big it is, etc.
-
-You are encouraged to first read through all the steps before executing them
-in your testing/production environment.
-
-## PostgreSQL replication
-
-The GitLab primary node where the write operations happen will connect to
-the primary database server, and the secondary nodes which are read-only will
-connect to the secondary database servers (which are also read-only).
-
->**Note:**
-In database documentation you may see "primary" being referenced as "master"
-and "secondary" as either "slave" or "standby" server (read-only).
-
-We recommend using [PostgreSQL replication
-slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
-to ensure the primary retains all the data necessary for the secondaries to
-recover. See below for more details.
-
-The following guide assumes that:
-
-- You are using PostgreSQL 9.6 or later which includes the
-  [`pg_basebackup` tool][pgback] and improved [Foreign Data Wrapper][FDW] support.
-- You have a primary node already set up (the GitLab server you are
-  replicating from), running PostgreSQL 9.6 or later, and
-  you have a new secondary server set up with the same versions of the OS,
-  PostgreSQL, and GitLab on all nodes.
-- The IP of the primary server for our examples will be `1.2.3.4`, whereas the
-  secondary's IP will be `5.6.7.8`. Note that the primary and secondary servers
-  **must** be able to communicate over these addresses. These IP addresses can either
-  be public or private.
-
-### Step 1. Configure the primary server
-
-1. SSH into your GitLab **primary** server and login as root:
-
-    ```bash
-    sudo -i
-    ```
-
-1. Add this node as the Geo primary by running:
-
-    ```bash
-    bundle exec rake geo:set_primary_node
-    ```
-
-1. Create a [replication user] named `gitlab_replicator`:
-
-    ```bash
-    sudo -u postgres psql -c "CREATE USER gitlab_replicator REPLICATION ENCRYPTED PASSWORD 'thepassword';"
-    ```
-    
-1. Make sure the `gitlab` database user has a password defined:
-
-    ```bash
-    sudo -u postgres psql -d template1 -c "ALTER USER gitlab WITH ENCRYPTED PASSWORD 'mydatabasepassword';"
-    ```
-    
-1. Edit the content of `database.yml` in `production:` and add the password as in the example below:
-
-    ```yaml
-    #
-    # PRODUCTION
-    #
-    production:
-      adapter: postgresql
-      encoding: unicode
-      database: gitlabhq_production
-      pool: 10
-      username: gitlab
-      password: mydatabasepassword
-      host: /var/opt/gitlab/geo-postgresql
-    ```
-
-1. Set up TLS support for the PostgreSQL primary server
-
-    > **Warning**: Only skip this step if you **know** that PostgreSQL traffic
-    > between the primary and secondary will be secured through some other
-    > means, e.g., a known-safe physical network path or a site-to-site VPN that
-    > you have configured.
-
-    If you are replicating your database across the open Internet, it is
-    **essential** that the connection is TLS-secured. Correctly configured, this
-    provides protection against both passive eavesdroppers and active
-    "man-in-the-middle" attackers.
-
-    To generate a self-signed certificate and key, run this command:
-
-    ```bash
-    openssl req -nodes -batch -x509 -newkey rsa:4096 -keyout server.key -out server.crt -days 3650
-    ```
-
-    This will create two files - `server.key` and `server.crt` - that you can
-    use for authentication.
-
-    Copy them to the correct location for your PostgreSQL installation:
-
-    ```bash
-    # Copying a self-signed certificate and key
-    install -o postgres -g postgres -m 0400 -T server.crt ~postgres/9.x/main/data/server.crt
-    install -o postgres -g postgres -m 0400 -T server.key ~postgres/9.x/main/data/server.key
-    ```
-
-    Add this configuration to `postgresql.conf`, removing any existing
-    configuration for `ssl_cert_file` or `ssl_key_file`:
-
-    ```
-    ssl = on
-    ssl_cert_file='server.crt'
-    ssl_key_file='server.key'
-    ```
-
-1. Edit `postgresql.conf` to configure the primary server for streaming replication
-   (for Debian/Ubuntu that would be `/etc/postgresql/9.x/main/postgresql.conf`):
-
-    ```
-    listen_addresses = '1.2.3.4'
-    wal_level = hot_standby
-    max_wal_senders = 5
-    min_wal_size = 80MB
-    max_wal_size = 1GB
-    max_replication_slots = 1 # Number of Geo secondary nodes
-    wal_keep_segments = 10
-    hot_standby = on
-    ```
-
-    Be sure to set `max_replication_slots` to the number of Geo secondary
-    nodes that you may potentially have (at least 1).
-
-    For security reasons, PostgreSQL by default only listens on the local
-    interface (e.g. 127.0.0.1). However, Geo needs to communicate
-    between the primary and secondary nodes over a common network, such as a
-    corporate LAN or the public Internet. For this reason, we need to
-    configure PostgreSQL to listen on more interfaces.
-
-    The `listen_addresses` option opens PostgreSQL up to external connections
-    with the interface corresponding to the given IP. See [the PostgreSQL
-    documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html)
-    for more details.
-
-    You may also want to edit the `wal_keep_segments` and `max_wal_senders` to
-    match your database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html)
-    for more information.
-
-1. Set the access control on the primary to allow TCP connections using the
-   server's public IP and set the connection from the secondary to require a
-   password.  Edit `pg_hba.conf` (for Debian/Ubuntu that would be
-   `/etc/postgresql/9.x/main/pg_hba.conf`):
-
-    ```bash
-    host    all             all                      127.0.0.1/32    trust
-    host    all             all                      1.2.3.4/32      trust
-    host    replication     gitlab_replicator        5.6.7.8/32      md5
-    ```
-
-    Where `1.2.3.4` is the public IP address of the primary server, and `5.6.7.8`
-    the public IP address of the secondary one. If you want to add another
-    secondary, add one more row like the replication one and change the IP
-    address:
-
-    ```bash
-    host    all             all                      127.0.0.1/32    trust
-    host    all             all                      1.2.3.4/32      trust
-    host    replication     gitlab_replicator        5.6.7.8/32      md5
-    host    replication     gitlab_replicator        11.22.33.44/32  md5
-    ```
-
-1. Restart PostgreSQL for the changes to take effect.
-
-1. Choose a database-friendly name for your secondary to use as the
-   replication slot name. For example, if your domain is
-   `secondary.geo.example.com`, you may use `secondary_example` as the slot
-   name.
-
-1. Create the replication slot on the primary:
-
-    ```bash
-    $ sudo -u postgres psql -c "SELECT * FROM pg_create_physical_replication_slot('secondary_example');"
-      slot_name         | xlog_position
-      ------------------+---------------
-      secondary_example |
-      (1 row)
-    ```
-
-1. Now that the PostgreSQL server is set up to accept remote connections, run
-   `netstat -plnt` to make sure that PostgreSQL is listening on the server's
-   public IP.
-
-### Step 2. Add the secondary GitLab node
-
-Follow the steps in ["add the secondary GitLab node"](database.md#step-2-add-the-secondary-gitlab-node).
-
-### Step 3. Configure the secondary server
-
-Follow the first steps in ["configure the secondary server"](database.md#step-3-configure-the-secondary-server),
-but note that since you are installing from source, the username and
-group listed as `gitlab-psql` in those steps should be replaced by `postgres`
-instead. After completing the "Test that the `gitlab-psql` user can connect to
-the primary's database" step, continue here:
-
-1. Edit `postgresql.conf` to configure the secondary for streaming replication
-   (for Debian/Ubuntu that would be `/etc/postgresql/9.*/main/postgresql.conf`):
-
-    ```bash
-    wal_level = hot_standby
-    max_wal_senders = 5
-    max_wal_size = 1GB
-    wal_keep_segments = 10
-    hot_standby = on
-    ```
-
-1. Restart PostgreSQL for the changes to take effect.
-
-#### Enable tracking database on the secondary server
-
-Geo secondary nodes use a tracking database to keep track of replication status
-and recover automatically from some replication issues. Follow the steps below to create
-the tracking database.
-
-1. On the secondary node, run the following command to create `database_geo.yml` with the
-information of your secondary PostgreSQL instance:
-
-    ```bash
-    sudo cp /home/git/gitlab/config/database_geo.yml.postgresql /home/git/gitlab/config/database_geo.yml
-    ```
-
-1. Edit the content of `database_geo.yml` in `production:` as in the example below:
-
-    ```yaml
-    #
-    # PRODUCTION
-    #
-    production:
-      adapter: postgresql
-      encoding: unicode
-      database: gitlabhq_geo_production
-      pool: 10
-      username: gitlab_geo
-      # password:
-      host: /var/opt/gitlab/geo-postgresql
-    ```
-
-1. Create the database `gitlabhq_geo_production` on the PostgreSQL instance of the secondary
-node.
-
-1. Set up the Geo tracking database:
-
-    ```bash
-    bundle exec rake geo:db:migrate
-    ```
-
-1. Configure the [PostgreSQL FDW][FDW] connection and credentials:
-
-    Save the script below in a file, for example `/tmp/geo_fdw.sh`, and modify the
-    connection parameters to match your environment.
-    
-    ```bash
-    #!/bin/bash
- 
-    # Secondary Database connection params:
-    DB_HOST="/var/opt/gitlab/postgresql"
-    DB_NAME="gitlabhq_production"
-    DB_USER="gitlab"
-    DB_PORT="5432"
-    
-    # Tracking Database connection params:
-    GEO_DB_HOST="/var/opt/gitlab/geo-postgresql"
-    GEO_DB_NAME="gitlabhq_geo_production"
-    GEO_DB_USER="gitlab_geo"
-    GEO_DB_PORT="5432"
- 
-    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE EXTENSION postgres_fdw;"
-    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '${DB_HOST}', dbname '${DB_NAME}', port '${DB_PORT}' );"
-    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}');"
-    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "CREATE SCHEMA gitlab_secondary;"
-    sudo -u postgres psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO ${GEO_DB_USER};"
-    ```
-
-    Then edit `database_geo.yml` and add `fdw: true` to the `production:`
-    block.
-
-### Step 4. Initiate the replication process
-
-Below we provide a script that connects the database on the secondary node to
-the database on the primary node, replicates the database, and creates the
-needed files for streaming replication.
-
-The directories used are the defaults for Debian/Ubuntu. If you have changed
-any defaults, configure it as you see fit replacing the directories and paths.
-
->**Warning:**
-Make sure to run this on the **secondary** server as it removes all PostgreSQL's
-data before running `pg_basebackup`.
-
-1. SSH into your GitLab **secondary** server and login as root:
-
-    ```bash
-    sudo -i
-    ```
-
-1. Save the snippet below in a file, let's say `/tmp/replica.sh`. Modify the
-   embedded paths if necessary:
-
-    ```bash
-    #!/bin/bash
-
-    PORT="5432"
-    USER="gitlab_replicator"
-    echo ---------------------------------------------------------------
-    echo WARNING: Make sure this script is run from the secondary server
-    echo ---------------------------------------------------------------
-    echo
-    echo Enter the IP or FQDN of the primary PostgreSQL server
-    read HOST
-    echo Enter the password for $USER@$HOST
-    read -s PASSWORD
-    echo Enter the required sslmode
-    read SSLMODE
-
-    echo Stopping PostgreSQL and all GitLab services
-    gitlab-ctl stop
-
-    echo Backing up postgresql.conf
-    sudo -u postgres mv /var/opt/gitlab/postgresql/data/postgresql.conf /var/opt/gitlab/postgresql/
-
-    echo Cleaning up old cluster directory
-    sudo -u postgres rm -rf /var/opt/gitlab/postgresql/data
-    rm -f /tmp/postgresql.trigger
-
-    echo Starting base backup as the replicator user
-    echo Enter the password for $USER@$HOST
-    sudo -u postgres /opt/gitlab/embedded/bin/pg_basebackup -h $HOST -p $PORT -D /var/opt/gitlab/postgresql/data -U $USER -v -x -P
-
-    echo Writing recovery.conf file
-    sudo -u postgres bash -c "cat > /var/opt/gitlab/postgresql/data/recovery.conf <<- _EOF1_
-      standby_mode = 'on'
-      primary_conninfo = 'host=$HOST port=$PORT user=$USER password=$PASSWORD sslmode=$SSLMODE'
-      trigger_file = '/tmp/postgresql.trigger'
-    _EOF1_
-    "
-
-    echo Restoring postgresql.conf
-    sudo -u postgres mv /var/opt/gitlab/postgresql/postgresql.conf /var/opt/gitlab/postgresql/data/
-
-    echo Starting PostgreSQL and all GitLab services
-    gitlab-ctl start
-    ```
-
-1. Run it with:
-
-    ```bash
-    bash /tmp/replica.sh
-    ```
-
-    When prompted, enter the IP/FQDN of the primary, and the password you set up
-    for the `gitlab_replicator` user in the first step.
-
-    You should use `verify-ca` for the `sslmode`. You can use `disable` if you
-    are happy to skip PostgreSQL TLS authentication altogether (e.g., you know
-    the network path is secure, or you are using a site-to-site VPN). This is
-    **not** safe over the public Internet!
-
-    You can read more details about each `sslmode` in the
-    [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
-    the instructions above are carefully written to ensure protection against
-    both passive eavesdroppers and active "man-in-the-middle" attackers.
-
-The replication process is now over.
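As a rough sketch of the `sslmode` handling described above, the script could check the entered value before writing `recovery.conf`. The helper names below are hypothetical and the host and password are placeholders; the accepted values are the six `sslmode` settings listed in the PostgreSQL documentation.

```bash
#!/bin/bash

# Hypothetical helper: succeeds only for sslmode values that libpq accepts.
is_valid_sslmode() {
  case "$1" in
    disable|allow|prefer|require|verify-ca|verify-full) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints the primary_conninfo line for recovery.conf,
# or fails with a message on stderr if the sslmode is not recognized.
build_primary_conninfo() {
  local host="$1" port="$2" user="$3" password="$4" sslmode="$5"
  if ! is_valid_sslmode "$sslmode"; then
    echo "Unsupported sslmode: $sslmode" >&2
    return 1
  fi
  printf "primary_conninfo = 'host=%s port=%s user=%s password=%s sslmode=%s'\n" \
    "$host" "$port" "$user" "$password" "$sslmode"
}

# Placeholder values (assumptions), shown only to illustrate the call.
build_primary_conninfo "primary.geo.example.com" "5432" "gitlab_replicator" "example-password" "verify-ca"
```

Rejecting a mistyped mode up front avoids starting PostgreSQL with a `recovery.conf` it cannot use to connect.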
-
-## MySQL replication
-
-MySQL replication is not supported for Geo.
-
-## Troubleshooting
-
-Read the [troubleshooting document](troubleshooting.md).
-
-[pgback]: http://www.postgresql.org/docs/9.6/static/app-pgbasebackup.html
-[replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication
-[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
-[toc]: README.md#using-gitlab-installed-from-source
+This document was moved to [another location](../administration/geo/database_source.md).
--- a/doc/gitlab-geo/docker_registry.md
+++ b/doc/gitlab-geo/docker_registry.md
-# Docker Registry for a secondary node
-
-You can set up a [Docker Registry](https://docs.docker.com/registry/) on your
-secondary Geo node that mirrors the one on the primary Geo node.
-
-## Storage support
-
-CAUTION: **Warning:**
-If you use [local storage](../administration/container_registry.md#container-registry-storage-driver)
-for the Container Registry you **cannot** replicate it to the secondary Geo node.
-
-Docker Registry currently supports a few types of storage. If you choose a
-distributed storage (`azure`, `gcs`, `s3`, `swift`, or `oss`) for your Docker
-Registry on a primary Geo node, you can use the same storage for a secondary
-Docker Registry as well. For more information, read the
-[Load balancing considerations](https://docs.docker.com/registry/deploying/#load-balancing-considerations)
-when deploying the Registry, and how to set up the storage driver for GitLab's
-integrated [Container Registry](../administration/container_registry.md#container-registry-storage-driver).
-
-[ee]: https://about.gitlab.com/products/
+This document was moved to [another location](../administration/geo/docker_registry.md).
--- a/doc/gitlab-geo/faq.md
+++ b/doc/gitlab-geo/faq.md
-# Geo Frequently Asked Questions
-
-## Can I use Geo in a disaster recovery situation?
-
-Yes, but there are limitations to what we replicate (see
-[What data is replicated to a secondary node?](#what-data-is-replicated-to-a-secondary-node)).
-
-Read the documentation for [Disaster Recovery](../administration/disaster_recovery/index.md).
-
-## What data is replicated to a secondary node?
-
-We currently replicate project repositories, LFS objects, generated
-attachments / avatars and the whole database. This means user accounts,
-issues, merge requests, groups, project data, etc., will be available for
-query. We currently don't replicate artifact data (`shared/folder`).
-
-## Can I git push to a secondary node?
-
-No. All writing operations (this includes `git push`) must be done in your
-primary node.
-
-## How long does it take to have a commit replicated to a secondary node?
-
-All replication operations are asynchronous and are queued to be dispatched in
-a batched request every 10 minutes. Besides that, it depends on a lot of other
-factors including the amount of traffic, how big your commit is, the
-connectivity between your nodes, your hardware, etc.
-
-## What if the SSH server runs at a different port?
-
-We send the clone URL from the primary server to any secondaries, so it
-doesn't matter. If the primary is running on port `2200`, the clone URL will
-reflect that.
-
-## Is it possible to set up a Docker Registry for a secondary node that mirrors the one on a primary node?
-
-Yes. See [Docker Registry for a secondary Geo node](docker_registry.md).
+This document was moved to [another location](../administration/geo/faq.md).
--- a/doc/gitlab-geo/ha.md
+++ b/doc/gitlab-geo/ha.md
-# Geo High Availability
-
-This document describes a minimal reference architecture for running Geo
-in a high availability configuration. If your HA setup differs from the one
-described, it is possible to adapt these instructions to your needs.
-
-## Architecture overview
-
-![Geo HA Diagram](../administration/img/high_availability/geo-ha-diagram.png)
-
-_[diagram source - gitlab employees only](https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit)_
-
-The topology above assumes that the primary and secondary Geo clusters
-are located in two separate locations, on their own virtual network
-with private IP addresses. The network is configured such that all machines within
-one geographic location can communicate with each other using their private IP addresses.
-The IP addresses given are examples and may be different depending on the
-network topology of your deployment.
-
-The only external way to access the two Geo deployments is by HTTPS at
-`gitlab.us.example.com` and `gitlab.eu.example.com` in the example above.
-
-> **Note:** The primary and secondary Geo deployments must be able to
-> communicate with each other over HTTPS.
-
-## Redis and PostgreSQL High Availability
-
-The primary and secondary Redis and PostgreSQL should be configured
-for high availability.  Because of the additional complexity involved
-in setting up this configuration for PostgreSQL and Redis
-it is not covered by this Geo HA documentation.
-The two services will instead be configured such that
-they will each run on a single machine.
-
-For more information about setting up a highly available PostgreSQL cluster and Redis cluster using the Omnibus package, see the high availability documentation for
-[PostgreSQL](../administration/high_availability/database.md) and
-[Redis](../administration/high_availability/redis.md), respectively.
-
-From these instructions you will need the following for the examples below:
-* `gitlab_rails['db_password']` for the PostgreSQL "DB password"
-* `redis['password']` for the Redis "Redis password"
-
-NOTE: **Note:**
-It is possible to use cloud hosted services for PostgreSQL and Redis but this is beyond the scope of this document.
-
-### Prerequisites
-
-Make sure you have GitLab EE installed using the
-[Omnibus package](https://about.gitlab.com/installation).
-
-
-### Step 1: Configure the Geo Backend Services
-
-On the **primary** backend servers configure the following services:
-
-* [Redis](../administration/high_availability/redis.md) for high availability.
-* [NFS Server](../administration/high_availability/nfs.md) for repository, LFS, and upload storage.
-* [PostgreSQL](../administration/high_availability/database.md) for high availability.
-
-On the **secondary** backend servers configure the following services:
-
-* [Redis](../administration/high_availability/redis.md) for high availability.
-* [NFS Server](../administration/high_availability/nfs.md) which will store data that is synchronized from the Geo primary.
-
-### Step 2: Configure the Postgres services on the Geo Secondary
-
-1. Configure the [secondary Geo PostgreSQL database](../gitlab-geo/database.md)
- as a read-only secondary of the primary Geo PostgreSQL database.
-
-1. To configure the Geo tracking database on the secondary server, modify `/etc/gitlab/gitlab.rb`:
-
-    ```ruby
-    geo_postgresql['enable'] = true
-
-    geo_postgresql['listen_address'] = '10.1.4.1'
-    geo_postgresql['trust_auth_cidr_addresses'] = ['10.1.0.0/16']
-
-    geo_secondary['auto_migrate'] = true
-    geo_secondary['db_host'] = '10.1.4.1'
-    geo_secondary['db_password'] = 'Geo tracking DB password'
-    ```
-
-NOTE: **Note:**
-Be sure that other non-PostgreSQL services are disabled by setting `enable` to `false` in
-the [gitlab.rb configuration](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-config-template/gitlab.rb.template).
-
-After making these changes be sure to run `sudo gitlab-ctl reconfigure` so that they take effect.
-
-### Step 3: Set up the load balancer
-
-In this topology, a load balancer is needed at each geographical location
-to route traffic to the application servers.
-
-See the [Load Balancer for GitLab HA](../administration/high_availability/load_balancer.md)
-documentation for more information.
-
-### Step 4: Configure the Geo Frontend Application Servers
-
-In the architecture overview there are two machines running the GitLab application
-services.  These services are enabled selectively in the configuration. Additionally
-the addresses of the remote endpoints for PostgreSQL and Redis will need to be specified.
-
-#### On the GitLab Primary Frontend servers
-
-1. Edit `/etc/gitlab/gitlab.rb` and make sure it contains the following settings, which stop PostgreSQL and Redis from running locally:
-
-    ```ruby
-    ##
-    ## Disable PostgreSQL on the local machine and connect to the remote
-    ##
-
-    postgresql['enable'] = false
-    gitlab_rails['auto_migrate'] = false
-    gitlab_rails['db_host'] = '10.0.3.1'
-    gitlab_rails['db_password'] = 'DB password'
-
-    ##
-    ## Disable Redis on the local machine and connect to the remote
-    ##
-
-    redis['enable'] = false
-    gitlab_rails['redis_host'] = '10.0.2.1'
-    gitlab_rails['redis_password'] = 'Redis password'
-
-    geo_primary_role['enable'] = true
-    ```
-
-#### On the GitLab Secondary Frontend servers
-
-On the secondary the remote endpoint for the PostgreSQL Geo database will
-be specified.
-
-1. Edit `/etc/gitlab/gitlab.rb` and make sure it contains the following settings, which stop PostgreSQL and Redis from running locally and configure the secondary to connect to the Geo tracking database:
-
-
-    ```ruby
-    ##
-    ## Disable PostgreSQL on the local machine and connect to the remote
-    ##
-
-    postgresql['enable'] = false
-    gitlab_rails['auto_migrate'] = false
-    gitlab_rails['db_host'] = '10.1.3.1'
-    gitlab_rails['db_password'] = 'DB password'
-
-    ##
-    ## Disable Redis on the local machine and connect to the remote
-    ##
-
-    redis['enable'] = false
-    gitlab_rails['redis_host'] = '10.1.2.1'
-    gitlab_rails['redis_password'] = 'Redis password'
-
-
-    ##
-    ## Enable the geo secondary role and configure the
-    ## geo tracking database
-    ##
-
-    geo_secondary_role['enable'] = true
-    geo_secondary['db_host'] = '10.1.4.1'
-    geo_secondary['db_password'] = 'Geo tracking DB password'
-    geo_postgresql['enable'] = false
-    ```
-
-
-After making these changes [Reconfigure GitLab][] so that they take effect.
-
-On the primary the following GitLab frontend services will be enabled:
-
-* gitlab-pages
-* gitlab-workhorse
-* logrotate
-* nginx
-* registry
-* remote-syslog
-* sidekiq
-* unicorn
-
-On the secondary the following GitLab frontend services will be enabled:
-
-* geo-logcursor
-* gitlab-pages
-* gitlab-workhorse
-* logrotate
-* nginx
-* registry
-* remote-syslog
-* sidekiq
-* unicorn
-
-Verify these services by running `sudo gitlab-ctl status` on the frontend
-application servers.
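The verification step can be partly automated. The sketch below assumes that `gitlab-ctl status` prints one line per service beginning with `run: <service>:` when the service is up; the helper name `missing_services` is hypothetical.

```bash
#!/bin/bash

# Hypothetical helper: reads `gitlab-ctl status` output on stdin and prints
# each expected service (given as arguments) that has no "run:" line.
missing_services() {
  local status_output service
  status_output="$(cat)"
  for service in "$@"; do
    if ! grep -q "^run: ${service}:" <<< "$status_output"; then
      echo "$service"
    fi
  done
}

# Example usage on a secondary frontend server (not executed here):
#   sudo gitlab-ctl status | missing_services geo-logcursor gitlab-workhorse nginx sidekiq unicorn
```

An empty result means every listed service reported a running state; any name printed needs a closer look.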
-
-[reconfigure GitLab]: ../administration/restart_gitlab.md#omnibus-gitlab-reconfigure
-[restart GitLab]: ../administration/restart_gitlab.md#omnibus-gitlab-restart
+This document was moved to [another location](../administration/geo/high_availability.md).
--- a/doc/gitlab-geo/object_storage.md
+++ b/doc/gitlab-geo/object_storage.md
-# Geo with Object storage
-
-Geo can be used in combination with Object Storage (AWS S3, or
-other compatible object storage).
-
-## Configuration
-
-At this time, if object storage is enabled on the primary, it must also be
-enabled on the secondary.
-
-The secondary nodes can use the same storage bucket as the primary, or
-they can use a replicated storage bucket. At this time GitLab does not
-take care of content replication in object storage.
-
-For LFS, follow the documentation to
-[set up LFS object storage](../workflow/lfs/lfs_administration.md#setting-up-s3-compatible-object-storage).
-
-For CI job artifacts, there is similar documentation to configure
-[jobs artifact object storage](../administration/job_artifacts.md#using-object-storage).
-
-Complete these steps on all nodes, primary **and** secondary.
-
-## Replication
-
-When using Amazon S3, you can use
-[CRR](https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html) to
-have automatic replication between the bucket used by the primary and
-the bucket used by the secondary.
-
-If you are using Google Cloud Storage, consider using
-[Multi-Regional Storage](https://cloud.google.com/storage/docs/storage-classes#multi-regional).
-Or you can use the [Storage Transfer Service](https://cloud.google.com/storage/transfer/),
-although this only supports daily synchronization.
-
-For manual synchronization, or synchronization scheduled by `cron`, see:
-
-- [`s3cmd sync`](http://s3tools.org/s3cmd-sync)
-- [`gsutil rsync`](https://cloud.google.com/storage/docs/gsutil/commands/rsync)
+This document was moved to [another location](../administration/geo/object_storage.md).
--- a/doc/gitlab-geo/security-review.md
+++ b/doc/gitlab-geo/security-review.md
-The following security review of the Geo feature set focuses on security
-aspects of the feature as they apply to customers running their own GitLab
-instances. The review questions are based in part on the [application security architecture](https://www.owasp.org/index.php/Application_Security_Architecture_Cheat_Sheet)
-questions from [owasp.org](https://www.owasp.org).
-
-
-
-## Business Model
-
-### What geographic areas does the application service?
-
-- This varies by customer. Geo allows customers to deploy to multiple areas,
-  and they get to choose where they are.
-- Region and node selection is entirely manual.
-
-
-
-## Data Essentials
-
-### What data does the application receive, produce, and process?
-
-- Geo streams almost all data held by a GitLab instance between sites. This
-  includes full database replication, most files (user-uploaded attachments,
-  etc) and repository + wiki data. In a typical configuration, this will
-  happen across the public Internet, and be TLS-encrypted.
-- PostgreSQL replication is TLS-encrypted.
-- See also: [only TLSv1.2 should be supported](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2948)
-
-### How can the data be classified into categories according to its sensitivity?
-
-- GitLab’s model of sensitivity is centered around public vs. internal vs.
-  private projects. Geo replicates them all indiscriminately. “Selective sync”
-  exists for files and repositories (but not database content), which would permit
-  only less-sensitive projects to be replicated to a secondary if desired.
-- See also: [developing a data classification policy](https://gitlab.com/gitlab-com/security/issues/4).
-
-### What data backup and retention requirements have been defined for the application?
-
-- Geo is designed to provide replication of a certain subset of the application
-data. It is part of the solution, rather than part of the problem.
-
-
-
-## End-Users
-
-### Who are the application's end‐users?
-
-- Geo nodes (secondaries) are created in regions that are distant (in terms of
-  Internet latency) from the main GitLab installation (the primary). They are
-  intended to be used by anyone who would ordinarily use the primary, who finds
-  that the secondary is closer to them (in terms of Internet latency).
-
-### How do the end‐users interact with the application?
-
-- A Geo secondary node provides all the interfaces a Geo primary node does
-(notably an HTTP/HTTPS web application, and HTTP/HTTPS or SSH git repository
-access), but is constrained to read-only activities. The principal use case is
-envisioned to be cloning git repositories from the secondary in favor of the
-primary, but end-users may use the GitLab web interface to view projects,
-issues, merge requests, snippets, etc.
-
-### What security expectations do the end‐users have?
-
-- The replication process must be secure. It would typically be unacceptable to
-  transmit the entire database contents or all files and repositories across the
-  public Internet in plaintext, for instance.
-- The Geo secondary must have the same access controls over its content as the
-  primary - unauthenticated users must not be able to gain access to privileged
-  information on the primary by querying the secondary.
-- Attackers must not be able to impersonate the secondary to the primary, and
-  thus gain access to privileged information.
-
-
-
-## Administrators
-
-### Who has administrative capabilities in the application?
-
-- Nothing Geo-specific. Any user where `admin: true` is set in the database is
-  considered an admin with super-user privileges.
-- See also: [more granular access control](https://gitlab.com/gitlab-org/gitlab-ce/issues/32730)
-  (not geo-specific)
-- Much of Geo’s integration (database replication, for instance) must be
-  configured with the application, typically by system administrators.
-
-### What administrative capabilities does the application offer?
-
-- Geo secondaries may be added, modified, or removed by users with
-  administrative access.
-- The replication process may be controlled (start/stop) via the Sidekiq
-  administrative controls.
-
-
-
-## Network
-
-### What details regarding routing, switching, firewalling, and load‐balancing have been defined?
-
-- Geo requires the primary and secondary to be able to communicate with each
-other across a TCP/IP network. In particular, the secondaries must be able to
-access HTTP/HTTPS and PostgreSQL services on the primary.
-
-### What core network devices support the application?
-
-- Varies from customer to customer.
-
-### What network performance requirements exist?
-
-- Maximum replication speed between primary and secondary is limited by the
-available bandwidth between sites. No hard requirements exist - time to complete
-replication (and ability to keep up with changes on the primary) is a function
-of the size of the data set, tolerance for latency, and available network
-capacity.
-
-### What private and public network links support the application?
-
-- Customers choose their own networks. As sites are intended to be
-geographically separated, it is envisioned that replication traffic will pass
-over the public Internet in a typical deployment, but this is not a requirement.
-
-
-
-## Systems
-
-### What operating systems support the application?
-
-- Geo imposes no additional restrictions on operating system (see the
-  [GitLab installation](https://about.gitlab.com/installation/) page for more
-  details), however we recommend using the operating systems listed in the [Geo documentation](http://docs.gitlab.com/ee/gitlab-geo/#geo-recommendations). 
-
-
-### What details regarding required OS components and lock‐down needs have been defined?
-
-- The recommended installation method (Omnibus) packages most components itself.
-  A from-source installation method exists. Both are documented at
-  https://docs.gitlab.com/ee/gitlab-geo/
-- There are significant dependencies on the system-installed OpenSSH daemon (Geo
-  requires users to set up custom authentication methods) and the omnibus or
-  system-provided PostgreSQL daemon (it must be configured to listen on TCP,
-  additional users and replication slots must be added, etc).
-- The process for dealing with security updates (for example, if there is a
-  significant vulnerability in OpenSSH or other services, and the customer
-  wants to patch those services on the OS) is identical to the non-Geo
-  situation: security updates to OpenSSH would be provided to the user via the
-  usual distribution channels. Geo introduces no delay there.
-
-
-
-## Infrastructure Monitoring
-
-### What network and system performance monitoring requirements have been defined?
-
-- None specific to Geo.
-
-### What mechanisms exist to detect malicious code or compromised application components?
-
-- None specific to Geo.
-
-### What network and system security monitoring requirements have been defined?
-
-- None specific to Geo.
-
-
-
-## Virtualization and Externalization
-
-### What aspects of the application lend themselves to virtualization?
-
-- All.
-
-### What virtualization requirements have been defined for the application?
-
-- Nothing Geo-specific, but everything in GitLab needs to have full
-functionality in such an environment.
-
-### What aspects of the product may or may not be hosted via the cloud computing model?
-
-- GitLab is “cloud native” and this applies to Geo as much as to the rest of the
-product. Deployment in clouds is a common and supported scenario.
-
-### If applicable, what approach(es) to cloud computing will be taken (Managed Hosting versus "Pure" Cloud, a "full machine" approach such as AWS-EC2 versus a "hosted database" approach such as AWS-RDS and Azure, etc)?
-
-- To be decided by our customers, according to their operational needs.
-
-
-
-## Environment
-
-### What frameworks and programming languages have been used to create the application?
-
-- Ruby on Rails, Ruby.
-
-### What process, code, or infrastructure dependencies have been defined for the application?
-
-- Nothing specific to Geo.
-
-### What databases and application servers support the application?
-
-- PostgreSQL >= 9.6, Redis, Sidekiq, Unicorn.
-
-### How will database connection strings, encryption keys, and other sensitive components be stored, accessed, and protected from unauthorized detection?
-
-- There are some Geo-specific values. Some are shared secrets which must be
-securely transmitted from the primary to the secondary at setup time. Our
-documentation recommends transmitting them from the primary to the system
-administrator via SSH, and then back out to the secondary in the same manner.
-In particular, this includes the PostgreSQL replication credentials and a secret
-key (`db_key_base`) which is used to decrypt certain columns in the database.
-The `db_key_base` secret is stored unencrypted on the filesystem, in
-`/etc/gitlab/gitlab-secrets.json`, along with a number of other secrets. There is
-no at-rest protection for them.
-
-
-
-## Data Processing
-
-### What data entry paths does the application support?
-
-- Data is entered via the web application exposed by GitLab itself. Some data is
-  also entered using system administration commands on the GitLab servers (e.g.,
-  `gitlab-ctl set-primary-node`).
-- Secondaries also receive inputs via PostgreSQL streaming replication from the
-  primary.
-
-### What data output paths does the application support?
-
-- Primaries output via PostgreSQL streaming replication to the secondary.
-Otherwise, principally via the web application exposed by GitLab itself, and via
-SSH `git clone` operations initiated by the end-user.
-
-### How does data flow across the application's internal components?
-
-- Secondaries and primaries interact via HTTP/HTTPS (secured with JSON web
-  tokens) and via PostgreSQL streaming replication.
-- Within a primary or secondary, the SSOT is the filesystem and the database
-  (including Geo tracking database on secondary). The various internal components
-  are orchestrated to make alterations to these stores.
-
-### What data input validation requirements have been defined?
-
-- Secondaries must have a faithful replication of the primary’s data.
-
-### What data does the application store and how?
-
-- Git repositories and files, tracking information related to them, and the
-  GitLab database contents.
-
-### What data is or may need to be encrypted and what key management requirements have been defined?
-
-- Neither primaries nor secondaries encrypt Git repository or filesystem data at
-  rest. A subset of database columns are encrypted at rest using the `db_otp_key`,
-  a static secret shared across all hosts in a GitLab deployment.
-- In transit, data should be encrypted, although the application does permit
-  communication to proceed unencrypted. The two main transits are the secondary’s
-  replication process for PostgreSQL, and for git repositories/files. Both should
-  be protected using TLS, with the keys for that managed via Omnibus per existing
-  configuration for end-user access to GitLab.
-
-### What capabilities exist to detect the leakage of sensitive data?
-
-- Comprehensive system logs exist, tracking every connection to GitLab and
-PostgreSQL.
-
-### What encryption requirements have been defined for data in transit - including transmission over WAN, LAN, SecureFTP, or publicly accessible protocols such as http: and https:?
-
-- Data must have the option to be encrypted in transit, and be secure against
-both passive and active attack (e.g., MITM attacks should not be possible).
-
-
-
-## Access
-
-### What user privilege levels does the application support?
-
-- Geo adds one type of privilege: secondaries can access a special Geo API to
-download files over HTTP/HTTPS, and to clone repositories using HTTP/HTTPS.
-
-### What user identification and authentication requirements have been defined?
-
-- Geo secondaries identify to Geo primaries via OAuth or JWT authentication
-based on the shared database (HTTP access) or a PostgreSQL replication user (for
-database replication). The database replication also requires IP-based access
-controls to be defined.
-
-### What user authorization requirements have been defined?
-
-- Secondaries must only be able to *read* data. They are not currently able to
-mutate data on the primary.
-
-### What session management requirements have been defined?
-
-- Geo JWTs are defined to last for only two minutes before needing to be
-regenerated.
-
-### What access requirements have been defined for URI and Service calls?
-
-- A Geo secondary makes many calls to the primary's API. This is how file
-  replication proceeds, for instance. This endpoint is only accessible with a JWT
-  token.
-- The primary also makes calls to the secondary to get status information.
-
-
-
-## Application Monitoring
-
-### What application auditing requirements have been defined? How are audit and debug logs accessed, stored, and secured?
-
-- A structured JSON log is written to the filesystem, and can also be ingested
-  into a Kibana installation for further analysis.
+This document was moved to [another location](../administration/geo/security_review.md).
--- a/doc/gitlab-geo/troubleshooting.md
+++ b/doc/gitlab-geo/troubleshooting.md
-# Geo Troubleshooting
-
->**Note:**
-This list is an attempt to document all the moving parts that can go wrong.
-We are working on getting all these steps verified automatically by a
-rake task in the future.
-
-Setting up Geo requires careful attention to details and sometimes it's easy to
-miss a step. Here is a list of questions you should ask to try to detect
-what you need to fix (all commands and path locations are for Omnibus installs):
-
-#### First check the health of the secondary
-
-Visit the primary node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`) in
-your browser. We perform the following health checks on each secondary node
-to help identify if something is wrong:
-
-- Is the node running?
-- Is the node's secondary database configured for streaming replication?
-- Is the node's secondary tracking database configured?
-- Is the node's secondary tracking database connected?
-- Is the node's secondary tracking database up-to-date?
-
-![Geo health check](img/geo-node-healthcheck.png)
-
-There is also an option to check the status of the secondary node by running a special rake task:
-
-```
-sudo gitlab-rake geo:status
-```
-
-#### Is Postgres replication working?
-
-#### Are my nodes pointing to the correct database instance?
-
-You should make sure your primary Geo node points to the instance with
-writing permissions.
-
-Any secondary nodes should point only to read-only instances.
-
-#### Can Geo detect my current node correctly?
-
-Geo uses the defined node from the `Admin ➔ Geo` screen, and tries to match
-it with the value defined in the `/etc/gitlab/gitlab.rb` configuration file.
-The relevant line looks like: `external_url "http://gitlab.example.com"`.
-
-To check if the node on the current machine is correctly detected, type:
-
-```bash
-sudo gitlab-rails runner "puts Gitlab::Geo.current_node.inspect"
-```
-
-and expect something like:
-
-```
-#<GeoNode id: 2, schema: "https", host: "gitlab.example.com", port: 443, relative_url_root: "", primary: false, ...>
-```
-
-By running the command above, `primary` should be `true` when executed in
-the primary node, and `false` on any secondary.
-
-#### How do I fix the message, "ERROR:  replication slots can only be used if max_replication_slots > 0"?
-
-This means that the `max_replication_slots` PostgreSQL variable needs to
-be set on the primary database. In GitLab 9.4, we have made this setting
-default to 1. You may need to increase this value if you have more Geo
-secondary nodes. Be sure to restart PostgreSQL for this to take
-effect. See the [PostgreSQL replication
-setup](database.md#postgresql-replication) guide for more details.
-
-#### How do I fix the message, "FATAL:  could not start WAL streaming: ERROR:  replication slot "geo_secondary_my_domain_com" does not exist"?
-
-This occurs when PostgreSQL does not have a replication slot for the
-secondary by that name. You may want to rerun the [replication
-process](database.md) on the secondary.
-
-#### How do I fix the message, "Command exceeded allowed execution time" when setting up replication?
-
-This may happen while [initiating the replication process](database.md#step-4-initiate-the-replication-process) on the Geo secondary, and indicates that your
-initial dataset is too large to be replicated in the default timeout (30 minutes).
-
-Re-run `gitlab-ctl replicate-geo-database`, but include a larger value for
-`--backup-timeout`:
-
-```bash
-sudo gitlab-ctl replicate-geo-database --host=primary.geo.example.com --slot-name=secondary_geo_example_com --backup-timeout=21600
-```
-
-This will give the initial replication up to six hours to complete, rather than
-the default thirty minutes. Adjust as required for your installation.
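The `--backup-timeout` value is given in seconds. A small sketch for deriving it from a number of hours, using a hypothetical helper named `hours_to_seconds` that is not part of GitLab:

```bash
#!/bin/bash

# Hypothetical helper: converts whole hours to seconds.
hours_to_seconds() {
  echo $(( $1 * 3600 ))
}

# Six hours, the value used in the command above.
hours_to_seconds 6

# Example usage (not executed here):
#   sudo gitlab-ctl replicate-geo-database --host=primary.geo.example.com \
#     --slot-name=secondary_geo_example_com --backup-timeout="$(hours_to_seconds 6)"
```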
-
-#### How do I fix the message, "PANIC: could not write to file 'pg_xlog/xlogtemp.123': No space left on device"?
-
-Determine if you have any unused replication slots in the primary database. Unused slots can cause large amounts of log data to build up in `pg_xlog`.
-Removing the unused slots can reduce the amount of space used in `pg_xlog`.
-
-1. Start a PostgreSQL console session:
-
-    ```bash
-    sudo gitlab-psql gitlabhq_production
-    ```
-
-    Note that using `gitlab-rails dbconsole` will not work, because managing replication slots requires superuser permissions.
-
-2. View your replication slots with:
-
-     ```sql
-     SELECT * FROM pg_replication_slots;
-     ```
-
-Slots where `active` is `f` are not active.
-
-If a slot should be active because you have a secondary configured to use it,
-log in to that secondary and check the PostgreSQL logs to find out why replication is not running.
-
-If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it by running the following in the PostgreSQL console session:
-
-    ```sql
-    SELECT pg_drop_replication_slot('name_of_extra_slot');
-    ```
-
-#### Very large repositories never successfully synchronize on the secondary
-
-GitLab places a timeout on all repository clones, including project imports
-and Geo synchronization operations. If a fresh `git clone` of a repository
-on the primary takes more than a few minutes, you may be affected by this.
-To increase the timeout, add the following line to `/etc/gitlab/gitlab.rb`
-on the secondary:
-
-```ruby
-gitlab_rails['gitlab_shell_git_timeout'] = 10800
-```
-
-Then reconfigure GitLab:
-
-```bash
-sudo gitlab-ctl reconfigure
-```
-
-This will increase the timeout to three hours (10800 seconds). Choose a time
-long enough to accommodate a full clone of your largest repositories.
+This document was moved to [another location](../administration/geo/troubleshooting.md).
--- a/doc/gitlab-geo/tuning.md
+++ b/doc/gitlab-geo/tuning.md
-# Tuning Geo
-
-## Changing the sync capacity values
-
-In the Geo admin page (`/admin/geo_nodes`), there are several variables that
-can be tuned to improve performance of Geo:
-
-* Repository sync capacity
-* File sync capacity
-
-Increasing these values will increase the number of jobs that are scheduled,
-but this may not lead to more downloads in parallel unless the number of
-available Sidekiq threads is also increased. For example, if repository sync
-capacity is increased from 25 to 50, you may also want to increase the number
-of Sidekiq threads from 25 to 50. See the [Sidekiq concurrency
-documentation](../administration/operations/extra_sidekiq_processes.html#concurrency)
-for more details.
+This document was moved to [another location](../administration/geo/tuning.md).
--- a/doc/gitlab-geo/updating_the_geo_nodes.md
+++ b/doc/gitlab-geo/updating_the_geo_nodes.md
-# Updating the Geo nodes
-
-Depending on which version of Geo you are updating to/from, there may be
-different steps.
-
-## General update steps
-
-In order to update the Geo nodes when a new GitLab version is released,
-all you need to do is update GitLab itself:
-
-1. Log into each node (primary and secondaries)
-1. [Update GitLab][update]
-1. [Update tracking database on secondary node](#update-tracking-database-on-secondary-node) when
-   the tracking database is enabled.
-1. [Test](#check-status-after-updating) primary and secondary nodes, and check version in each.
-
-## Upgrading to GitLab 10.5
-
-For Geo Disaster Recovery to work with minimum downtime, your Geo secondary
-should use the same set of secrets as the primary. However, setup instructions
-prior to the 10.5 release only synchronized the `db_key_base` secret.
-
-To rectify this error on existing installations, you should **overwrite** the
-contents of `/etc/gitlab/gitlab-secrets.json` on the secondary node with the
-contents of `/etc/gitlab/gitlab-secrets.json` on the primary node, then run the
-following command on the secondary node:
-
-```bash
-sudo gitlab-ctl reconfigure
-```
-
-If you do not perform this step, you may find that two-factor authentication
-[is broken following DR](faq.md#i-followed-the-disaster-recovery-instructions-and-now-two-factor-auth-is-broken).
-
-To prevent SSH requests to the newly promoted primary node from failing
-due to an SSH host key mismatch when updating the primary domain's DNS record,
-you should perform the step to [Manually replicate primary SSH host keys](configuration.md#step-2-manually-replicate-primary-ssh-host-keys) on each
-secondary node.
-
-## Upgrading to GitLab 10.4
-
-There are no Geo-specific steps to take!
-
-## Upgrading to GitLab 10.3
-
-### Support for SSH repository synchronization removed
-
-In GitLab 10.2, synchronizing secondaries over SSH was deprecated. In 10.3,
-support is removed entirely. All installations will switch to the HTTP/HTTPS
-cloning method instead. Before upgrading, ensure that all your Geo nodes are
-configured to use this method and that it works for your installation. In
-particular, ensure that [Git access over HTTP/HTTPS is enabled](configuration.md#step-5-enable-git-access-over-http-https).
-
-Synchronizing repositories over the public Internet using HTTP is insecure, so
-you should ensure that you have HTTPS configured before upgrading. Note that
-file synchronization is **also** insecure in these cases!
-
-## Upgrading to GitLab 10.2
-
-### Secure PostgreSQL replication
-
-Support for TLS-secured PostgreSQL replication has been added. If you are
-currently using PostgreSQL replication across the open internet without an
-external means of securing the connection (e.g., a site-to-site VPN), then you
-should immediately reconfigure your primary and secondary PostgreSQL instances
-according to the [updated instructions](database.md).
-
-If you *are* securing the connections externally and wish to continue doing so,
-ensure you include the new option `--sslmode=prefer` in future invocations of
-`gitlab-ctl replicate-geo-database`.
-
-### HTTPS repository sync
-
-Support for replicating repositories and wikis over HTTP/HTTPS has been added.
-Replicating over SSH has been deprecated, and support for this option will be
-removed in a future release.
-
-To switch to HTTP/HTTPS replication, log into the primary node as an admin and visit
-**Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`). For each secondary listed,
-press the "Edit" button, change the "Repository cloning" setting from
-"SSH (deprecated)" to "HTTP/HTTPS", and press "Save changes". This should take
-effect immediately.
-
-Any new secondaries should be created using HTTP/HTTPS replication - this is the
-default setting.
-
-After you've verified that HTTP/HTTPS replication is working, you should remove
-the now-unused SSH keys from your secondaries, as they may cause problems if the
-secondary is ever promoted to a primary:
-
-1. **[secondary]** Log in to **all** your secondary nodes and run:
-
-    ```bash
-    sudo -u git -H rm ~git/.ssh/id_rsa ~git/.ssh/id_rsa.pub
-    ```
-
-### Hashed Storage
-
->**Warning**
-Hashed storage is in **Alpha**. It is considered experimental and not
-production-ready. See [Hashed
-Storage](../administration/repository_storage_types.md) for more detail.
-
-If you previously enabled Hashed Storage and migrated all your existing
-projects to Hashed Storage, disabling Hashed Storage will not migrate projects
-back to their previous project-based storage path. As such, once enabled and
-migrated, we recommend leaving Hashed Storage enabled.
-
-## Upgrading to GitLab 10.1
-
->**Warning**
-Hashed storage is in **Alpha**. It is considered experimental and not
-production-ready. See [Hashed
-Storage](../administration/repository_storage_types.md) for more detail.
-
-[Hashed storage](../administration/repository_storage_types.md) was introduced
-in GitLab 10.0, and a [migration path](../administration/raketasks/storage.md)
-for existing repositories was added in GitLab 10.1.
-
-## Upgrading to GitLab 10.0
-
-Since GitLab 10.0, we require all **Geo** systems to [use SSH key lookups via
-the database](../administration/operations/fast_ssh_key_lookup.md) to avoid having to maintain consistency of the
-`authorized_keys` file for SSH access. Failing to do this will prevent users
-from being able to clone via SSH.
-
-Note that in older versions of Geo, attachments downloaded on the secondary
-nodes would be saved to the wrong directory. We recommend that you do the
-following to clean this up.
-
-On the SECONDARY Geo nodes, run as root:
-
-```sh
-mv /var/opt/gitlab/gitlab-rails/working /var/opt/gitlab/gitlab-rails/working.old
-mkdir /var/opt/gitlab/gitlab-rails/working
-chmod 700 /var/opt/gitlab/gitlab-rails/working
-chown git:git /var/opt/gitlab/gitlab-rails/working
-```
-
-You may delete `/var/opt/gitlab/gitlab-rails/working.old` any time.
-
-Once this is done, we advise restarting GitLab on the secondary nodes for the
-new working directory to be used:
-
-```
-sudo gitlab-ctl restart
-```
-
-## Upgrading from GitLab 9.3 or older
-
-If you started running Geo on GitLab 9.3 or older, we recommend that you
-resync your secondary PostgreSQL databases to use replication slots. If you
-started using Geo with GitLab 9.4 or 10.x, no further action should be
-required because replication slots are used by default. However, if you
-started with GitLab 9.3 and upgraded later, you should still follow the
-instructions below.
-
-When in doubt, it does not hurt to do a resync. The easiest way to do this in
-Omnibus is the following:
-
-  1. Install GitLab on the primary server
-  1. Run `gitlab-ctl reconfigure` and `gitlab-ctl restart postgresql`. This will enable replication slots on the primary database.
-  1. Install GitLab on the secondary server.
-  1. Re-run the [database replication process](database.md#step-3-initiate-the-replication-process).
-
-## Special update notes for 9.0.x
-
-> **IMPORTANT**:
-With GitLab 9.0, the PostgreSQL version is upgraded to 9.6 and manual steps are
-required in order to update the secondary nodes and keep the Streaming
-Replication working. Downtime is required, so plan ahead.
-
-The following steps apply only if you upgrade from GitLab 8.17 to
-9.0+. For previous versions, update to GitLab 8.17 first before attempting to
-upgrade to 9.0+.
-
---
-
-Make sure to follow the steps in the exact order in which they appear below, and pay
-extra attention to which node (primary/secondary) you execute them on! Each step
-is prefixed with the relevant node for clarity:
-
-1. **[secondary]** Log in to **all** your secondary nodes and stop all services:
-
-    ```bash
-    sudo gitlab-ctl stop
-    ```
-
-1. **[secondary]** Make a backup of the `recovery.conf` file on **all**
-   secondary nodes to preserve PostgreSQL's credentials:
-
-    ```
-    sudo cp /var/opt/gitlab/postgresql/data/recovery.conf /var/opt/gitlab/
-    ```
-
-1. **[primary]** Update the primary node to GitLab 9.0 following the
-   [regular update docs][update]. At the end of the update, the primary node
-   will be running with PostgreSQL 9.6.
-
-1. **[primary]** To prevent a de-synchronization of the repository replication,
-   stop all services except `postgresql` as we will use it to re-initialize the
-   secondary node's database:
-
-    ```
-    sudo gitlab-ctl stop
-    sudo gitlab-ctl start postgresql
-    ```
-
-1. **[secondary]** Run the following steps on each of the secondaries:
-
-    1. **[secondary]**  Stop all services:
-
-        ```
-        sudo gitlab-ctl stop
-        ```
-
-    1. **[secondary]** Prevent running database migrations:
-
-        ```
-        sudo touch /etc/gitlab/skip-auto-migrations
-        ```
-
-    1. **[secondary]** Move the old database to another directory:
-
-        ```
-        sudo mv /var/opt/gitlab/postgresql{,.bak}
-        ```
-
-    1. **[secondary]** Update to GitLab 9.0 following the [regular update docs][update].
-       At the end of the update, the node will be running with PostgreSQL 9.6.
-
-    1. **[secondary]** Make sure all services are up:
-
-        ```
-        sudo gitlab-ctl start
-        ```
-
-    1. **[secondary]** Reconfigure GitLab:
-
-        ```
-        sudo gitlab-ctl reconfigure
-        ```
-
-    1. **[secondary]** Run the PostgreSQL upgrade command:
-
-          ```
-          sudo gitlab-ctl pg-upgrade
-          ```
-
-    1. **[secondary]** See the stored credentials for the database that you will
-       need to re-initialize the replication:
-
-        ```
-        sudo grep -s primary_conninfo /var/opt/gitlab/recovery.conf
-        ```
-
-    1. **[secondary]** Create the `replica.sh` script as described in the
-       [database configuration document](database.md#step-3-initiate-the-replication-process).
-
-    1. **[secondary]** Run the recovery script using the credentials from the
-       previous step:
-
-        ```
-        sudo bash /tmp/replica.sh
-        ```
-
-    1. **[secondary]** Reconfigure GitLab:
-
-        ```
-        sudo gitlab-ctl reconfigure
-        ```
-
-    1. **[secondary]** Start all services:
-
-        ```
-        sudo gitlab-ctl start
-        ```
-
-    1. **[secondary]** Repeat the steps for the rest of the secondaries.
-
-1. **[primary]** After all secondaries are updated, start all services in
-   primary:
-
-    ```
-    sudo gitlab-ctl start
-    ```
-
-## Check status after updating
-
-Now that the update process is complete, you may want to check whether
-everything is working correctly:
-
-1. Run the Geo Rake task on all nodes; everything should be green:
-
-    ```
-    sudo gitlab-rake gitlab:geo:check
-    ```
-
-1. Check the primary's Geo dashboard for any errors.
-1. Test the data replication by pushing code to the primary and checking that it
-   is received by the secondaries.
-
-## Update tracking database on secondary node
-
-After updating a secondary node, you might need to run migrations on
-the tracking database. The tracking database was added in GitLab 9.1,
-and it is required since 10.0.
-
-1. Run database migrations on the tracking database:
-
-    ```
-    sudo gitlab-rake geo:db:migrate
-    ```
-
-1. Repeat this step for every secondary node.
-
-[update]: ../update/README.md
+This document was moved to [another location](../administration/geo/updating_the_geo_nodes.md).
--- a/doc/gitlab-geo/using_a_geo_server.md
+++ b/doc/gitlab-geo/using_a_geo_server.md
-[//]: # (Please update EE::GitLab::GeoGitAccess::GEO_SERVER_DOCS_URL if this file is moved)
-
-# Using a Geo Server
-
-After you set up the [database replication and configure the Geo nodes][req],
-there are a few things to consider:
-
-1. Users need an extra step to be able to fetch code from the secondary and push
-   to the primary:
-
-     1. Clone the repository as you would normally do, but from the secondary node:
-
-         ```bash
-         git clone git@secondary.gitlab.example.com:user/repo.git
-         ```
-
-     1. Change the remote push URL to always push to the primary, following this example:
-
-         ```bash
-         git remote set-url --push origin git@primary.gitlab.example.com:user/repo.git
-         ```
-
-[req]: README.md#setup-instructions
+This document was moved to [another location](../administration/geo/using_a_geo_server.md).
--- a/ee/lib/ee/gitlab/geo_git_access.rb
+++ b/ee/lib/ee/gitlab/geo_git_access.rb
@@ -4,7 +4,7 @@ module EE
      include ::Gitlab::ConfigHelper
      include ::EE::GitlabRoutingHelper

-      GEO_SERVER_DOCS_URL = 'https://docs.gitlab.com/ee/gitlab-geo/using_a_geo_server.html'.freeze
+      GEO_SERVER_DOCS_URL = 'https://docs.gitlab.com/ee/administration/geo/using_a_geo_server.html'.freeze

      protected