1. [PostgreSQL](database.md#postgresql-in-a-scaled-environment) with [PgBouncer](https://docs.gitlab.com/ee/administration/high_availability/pgbouncer.html)
1. [Redis](redis.md#redis-in-a-scaled-environment)
1. [Gitaly](gitaly.md) (recommended) and / or [NFS](nfs.md)[^4]
1. [GitLab application nodes](gitlab.md)
   - With [Object Storage service enabled](../gitaly/index.md#eliminating-nfs-altogether)[^3]
1. [Load Balancer(s)](load_balancer.md)[^2]
1. [Monitoring node (Prometheus and Grafana)](monitoring_node.md)
### Full Scaling
...
...
is split into separate Sidekiq and Unicorn/Workhorse nodes. One indication that
this architecture is required is if Sidekiq queues begin to periodically increase
in size, indicating that there is contention or there are not enough resources.
- 1 or more PostgreSQL nodes
- 1 or more Redis nodes
- 1 or more Gitaly storage servers
- 1 or more Object Storage services[^3] and / or NFS storage server[^4]
- 2 or more Sidekiq nodes
...
...
the basis of the GitLab.com architecture. While this scales well, it also comes
with the added complexity of many more nodes to configure, manage, and monitor.
- 3 PostgreSQL nodes
- 1 or more PgBouncer nodes (with associated internal load balancers)
- 4 or more Redis nodes (2 separate clusters for persistent and cache data)
- 3 Consul nodes
- 3 Sentinel nodes
...
...
users are, how much automation you use, mirroring, and repo/change size.

The recommended configuration for a PostgreSQL HA setup requires:
- `repmgrd` - A service to monitor, and handle failover in case of a failure
- `Consul` agent - Used for service discovery, to alert other nodes when failover occurs
- A minimum of three `Consul` server nodes
- A minimum of one `pgbouncer` service node, but it's recommended to have one per database node
- An internal load balancer (TCP) is required when there is more than one `pgbouncer` service node (see the sketch after this list)
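A minimal sketch of how these components map onto Omnibus node configuration, using the bundled roles in `/etc/gitlab/gitlab.rb` (the role names are the standard Omnibus ones; node counts and any further attributes are assumptions to adapt to your topology):

```ruby
# Database node (one of three): PostgreSQL, repmgrd, and a Consul agent
roles ['postgres_role']

# A PgBouncer node (ideally one per database node) would instead use:
# roles ['pgbouncer_role']

# A Consul server node (one of three) would instead use:
# roles ['consul_role']
# consul['configuration'] = { server: true }
```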
You also need to take into consideration the underlying network topology,
making sure you have redundant connectivity between all Database and GitLab instances,
...
...
Database nodes run two services with PostgreSQL:
- Repmgrd. Monitors the database and handles failover in case of a failure. On failure, the old master node is automatically evicted from the cluster, and should be rejoined manually once recovered.
- Consul. Monitors the status of each node in the database cluster and tracks its health in a service definition on the Consul cluster.
  Alongside each PgBouncer, there is a Consul agent that watches the status of the PostgreSQL service. If that status changes, Consul runs a script which updates the configuration and reloads PgBouncer, as sketched below.
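On the PgBouncer nodes, the piece that wires this behavior up is a Consul watcher on the `postgresql` service. A minimal sketch for `/etc/gitlab/gitlab.rb`:

```ruby
# Watch the postgresql service in Consul; when the master changes, the
# bundled watcher script updates the PgBouncer config and reloads it
consul['watchers'] = %w(postgresql)
```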
##### Connection flow
Each service in the package comes with a set of [default ports](https://docs.gitlab.com/omnibus/package-information/defaults.html#ports). You may need to make specific firewall rules for the connections listed below:
- Application servers connect to either PgBouncer directly via its [default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#pgbouncer) or via a configured Internal Load Balancer (TCP) that serves multiple PgBouncers, as sketched after this list.
- PgBouncer connects to the primary database server's [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
- Repmgr connects to the database server's [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
- Postgres secondaries connect to the primary database server's [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
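On the application side, this is just the database host and port in `/etc/gitlab/gitlab.rb`. A minimal sketch, where the address is an assumption standing in for your PgBouncer node or internal load balancer:

```ruby
# Point Rails at PgBouncer (or at the internal TCP load balancer in
# front of several PgBouncers) instead of directly at PostgreSQL
gitlab_rails['db_host'] = '10.6.0.20' # assumed internal LB address
gitlab_rails['db_port'] = 6432        # PgBouncer default port
```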
...
...
attributes set, but the following need to be set.

Here we'll show you some fully expanded example configurations.
##### Example recommended setup
This example uses 3 Consul servers, 3 PgBouncer servers (with associated internal load balancer), 3 PostgreSQL servers, and 1 application node.
We start with all servers on the same 10.6.0.0/16 private network range; they can connect to each other freely on those addresses.
...
...
Here is a list and description of each machine and the assigned IP:
- `10.6.0.11`: Consul 1
- `10.6.0.12`: Consul 2
- `10.6.0.13`: Consul 3
- `10.6.0.20`: Internal Load Balancer
- `10.6.0.21`: PgBouncer 1
- `10.6.0.22`: PgBouncer 2
- `10.6.0.23`: PgBouncer 3
- `10.6.0.31`: PostgreSQL master
- `10.6.0.32`: PostgreSQL secondary
- `10.6.0.33`: PostgreSQL secondary
- `10.6.0.41`: GitLab application

All passwords are set to `toomanysecrets`; please do not use this password or derived hashes. The `external_url` for GitLab is `http://gitlab.example.com`.
Please note that after the initial configuration, if a failover occurs, the PostgreSQL master will change to one of the available secondaries until it is failed back.
...
...
```ruby
consul['configuration'] = {
  server: true,
  retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
consul['monitoring_service_discovery'] = true
```
[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
##### Example recommended setup for PgBouncer servers
On each server, edit `/etc/gitlab/gitlab.rb`:
```ruby
# Disable all components except PgBouncer and Consul agent
roles ['pgbouncer_role']
```
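Beyond disabling the other components, each PgBouncer node also needs its PgBouncer users and Consul agent configured. A minimal sketch, reusing the example network above (the `*_PASSWORD_HASH` literals are placeholders for hashes generated with `gitlab-ctl pg-password-md5 <user>`):

```ruby
# Accounts PgBouncer and the Consul agent authenticate with
pgbouncer['users'] = {
  'gitlab-consul': {
    password: 'CONSUL_PASSWORD_HASH'
  },
  'pgbouncer': {
    password: 'PGBOUNCER_PASSWORD_HASH'
  }
}

# Join the local Consul agent to the example Consul servers
consul['configuration'] = {
  retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
```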
[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
##### Internal load balancer setup
An internal load balancer (TCP) is then required to be set up to serve each PgBouncer node (in this example, on the IP `10.6.0.20`). An example of how to do this can be found in the [PgBouncer Configure Internal Load Balancer](pgbouncer.md#configure-the-internal-load-balancer) section.
##### Example recommended setup for PostgreSQL servers
After deploying the configuration, follow these steps:
1. On `10.6.0.31`, our primary database

   Enable the `pg_trgm` extension:
...
...
   ```sql
   CREATE EXTENSION pg_trgm;
   ```
1. On `10.6.0.32`, our first standby database

   Make this node a standby of the primary:
...
...
   ```sh
   gitlab-ctl repmgr standby setup 10.6.0.31
   ```
1. On `10.6.0.33`, our second standby database

   Make this node a standby of the primary:
...
...
   ```sh
   gitlab-ctl repmgr standby setup 10.6.0.31
   ```
1. On `10.6.0.41`, our application server

   Set the `gitlab-consul` user's PgBouncer password to `toomanysecrets`:
...
...
#### Example minimal setup
This example uses 3 PostgreSQL servers, and 1 application node (with PgBouncer set up alongside).
It differs from the [recommended setup](#example-recommended-setup) by moving the Consul servers onto the same machines we use for PostgreSQL.
The trade-off is a reduced server count at the cost of increased operational complexity: you now need to handle PostgreSQL [failover](#failover-procedure) and [restore](#restore-procedure) procedures, in addition to [Consul outage recovery](consul.md#outage-recovery), on the same set of machines. A sketch of such a combined node follows.
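A minimal sketch of one such combined node's `/etc/gitlab/gitlab.rb`; the addresses are assumptions to adapt to your own network:

```ruby
# Combined PostgreSQL + Consul server node (minimal setup)
roles ['postgres_role']

consul['configuration'] = {
  server: true,
  # the three combined nodes form the Consul server cluster themselves
  retry_join: %w(10.6.0.21 10.6.0.22 10.6.0.23)
}
```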
As part of its High Availability stack, GitLab Premium includes a bundled version of [PgBouncer](https://pgbouncer.github.io/) that can be managed through `/etc/gitlab/gitlab.rb`. PgBouncer is used to seamlessly migrate database connections between servers in a failover scenario. Additionally, it can be used in a non-HA setup to pool connections, speeding up response time while reducing resource usage.

In an HA setup, it's recommended to run a PgBouncer node separately for each database node, with an internal load balancer (TCP) serving each accordingly.
## Operations
...
...
1. Make sure you collect [`CONSUL_SERVER_NODES`](database.md#consul-information), [`CONSUL_PASSWORD_HASH`](database.md#consul-information), and [`PGBOUNCER_PASSWORD_HASH`](database.md#pgbouncer-information) before executing the next step.
1. On each node, edit the `/etc/gitlab/gitlab.rb` config file and replace values noted in the `# START user configuration` section as below:
```ruby
# Disable all components except PgBouncer and Consul agent
...
...
#### PgBouncer Checkpoint
1. Ensure each node is talking to the current master:
```sh
gitlab-ctl pgb-console # You will be prompted for PGBOUNCER_PASSWORD
...
...
(2 rows)
```
#### Configure the internal load balancer
If you're running more than one PgBouncer node, as recommended, you'll need to set up a TCP internal load balancer to serve them correctly. This can be done with any reputable TCP load balancer.

As an example, here's how you could do it with [HAProxy](https://www.haproxy.org/):
```
global
    log /dev/log local0
    log localhost local1 notice
    log stdout format raw local0

defaults
    log global
    default-server inter 10s fall 3 rise 2
    balance leastconn

frontend internal-pgbouncer-tcp-in
    bind *:6432
    mode tcp
    option tcplog

    default_backend pgbouncer

backend pgbouncer
    mode tcp
    option tcp-check

    server pgbouncer1 <ip>:6432 check
    server pgbouncer2 <ip>:6432 check
    server pgbouncer3 <ip>:6432 check
```
Refer to your preferred load balancer's documentation for further guidance.
### Running PgBouncer as part of a non-HA GitLab installation
1. Generate `PGBOUNCER_USER_PASSWORD_HASH` with the command `gitlab-ctl pg-password-md5 pgbouncer`
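   The hash is then used in `/etc/gitlab/gitlab.rb` on the database node. A minimal sketch of the non-HA PgBouncer settings, with `PGBOUNCER_USER_PASSWORD_HASH` standing in for the value generated above:

   ```ruby
   # Enable the bundled PgBouncer and pool connections to the local database
   pgbouncer['enable'] = true
   pgbouncer['databases'] = {
     gitlabhq_production: {
       host: '127.0.0.1',
       user: 'pgbouncer',
       password: 'PGBOUNCER_USER_PASSWORD_HASH'
     }
   }
   ```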