cloud "**Object Storage**" as object_storage #white
"R_Queues_Replica_Node_1..2"
"R_Queues_Sentinel_1..3"
elb -[#6a9be7]-> gitlab
}
elb -[#6a9be7]--> monitor
state Gitaly {
gitlab -[#32CD32]> sidekiq
"Gitaly_1..2"
gitlab -[#32CD32]--> ilb
}
gitlab -[#32CD32]-> object_storage
gitlab -[#32CD32]---> redis
state BackgroundJobs {
gitlab -[hidden]-> monitor
"Sidekiq_1..4"
gitlab -[hidden]-> consul
}
sidekiq -[#ff8dd1]--> ilb
state ApplicationServer {
sidekiq -[#ff8dd1]-> object_storage
"GitLab_Rails_1..5"
sidekiq -[#ff8dd1]---> redis
}
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
state LoadBalancer {
"LoadBalancer_1"
ilb -[#9370DB]-> gitaly_cluster
}
ilb -[#9370DB]-> database
state ApplicationMonitoring {
consul .[#e76a9b]u-> gitlab
"Prometheus"
consul .[#e76a9b]u-> sidekiq
"Grafana"
consul .[#e76a9b]> monitor
}
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
state PgBouncer {
consul .[#e76a9b,norank]--> redis
"Internal_Load_Balancer"
"PgBouncer_1..3"
monitor .[#7FFFD4]u-> gitlab
}
monitor .[#7FFFD4]u-> sidekiq
monitor .[#7FFFD4]> consul
monitor .[#7FFFD4]-> database
monitor .[#7FFFD4]-> gitaly_cluster
monitor .[#7FFFD4,norank]--> redis
monitor .[#7FFFD4]> ilb
monitor .[#7FFFD4,norank]u--> elb
@enduml
```
The Google Cloud Platform (GCP) architectures were built and tested using the
...
@@ -120,19 +130,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built-in solution for these restrictions in the future, but in the meantime a non-HA PostgreSQL server
can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) and [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
To set up GitLab and its components to accommodate up to 25,000 users:
1. [Configure the external load balancing node](#configure-the-external-load-balancer)
1. [Configure the external load balancer](#configure-the-external-load-balancer)
   to handle the load balancing of the GitLab application services nodes.
1. [Configure the internal load balancer](#configure-the-internal-load-balancer)
   to handle the load balancing of GitLab application internal connections.
1. [Configure Consul](#configure-consul).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
1. [Configure the internal load balancing node](#configure-the-internal-load-balancer).
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
## Configure Redis
Using [Redis](https://redis.io/) in a scalable environment is possible using a **Primary** x **Replica**
...
@@ -1302,19 +1344,283 @@ To configure the Sentinel Queues server:
</a>
</a>
</div>
</div>
## Configure Gitaly
## Configure Gitaly Cluster

NOTE:
[Gitaly Cluster](../gitaly/praefect.md) support
for the Reference Architectures is being
worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified,
some Architecture specs will likely change as a result to support the new
and improved design.

[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
specifically the number of projects and those projects' sizes. It's recommended
that a Gitaly server node stores no more than 5 TB of data. Depending on your
repository storage requirements, you may require additional Gitaly server nodes.

[Gitaly Cluster](../gitaly/praefect.md) is a GitLab-provided and recommended fault-tolerant solution for storing Git repositories.
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.

The recommended cluster setup includes the following components:

- 3 Gitaly nodes: Replicated storage of Git repositories.
- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
  is required for Praefect database connections to be made highly available.
- 1 load balancer: A load balancer is required for Praefect. The
  [internal load balancer](#configure-the-internal-load-balancer) will be used.

This section will detail how to configure the recommended standard setup in order.
For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).

### Configure Praefect PostgreSQL

Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.

If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).
#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab
The following IPs will be used as an example:
- `10.6.0.141`: Praefect PostgreSQL
First, make sure to [install](https://about.gitlab.com/install/)
the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
1. SSH in to the Praefect PostgreSQL node.
1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
and confirmation. Use the value that is output by this command in the next
step as the value of `<praefect_postgresql_password_hash>`:
```shell
sudo gitlab-ctl pg-password-md5 praefect
```
1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Consul
roles ['postgres_role']
repmgr['enable'] = false
patroni['enable'] = false

# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['max_connections'] = 200

gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
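
The password hash used for the Praefect PostgreSQL user can also be reproduced outside of `gitlab-ctl`, for example when preparing configuration from an automation script. The following is a minimal sketch, assuming the hash is the plain MD5 hex digest of the password concatenated with the username (the scheme PostgreSQL uses for MD5 passwords, without the `md5` prefix). The `pg_password_md5` helper name is hypothetical and not part of Omnibus GitLab; compare its output with `sudo gitlab-ctl pg-password-md5 praefect` before relying on it.

```ruby
require 'digest/md5'

# Hypothetical helper: MD5 hex digest of password + username.
# Assumption: this mirrors the value printed by
# `gitlab-ctl pg-password-md5 <username>`.
def pg_password_md5(password, username)
  Digest::MD5.hexdigest("#{password}#{username}")
end

if __FILE__ == $PROGRAM_NAME
  # Placeholder password for illustration only; never hard-code real secrets.
  puts pg_password_md5('example-password', 'praefect')
end
```

Because the username is part of the digest, the hash must be regenerated if a username other than `praefect` is used.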
#### Praefect HA PostgreSQL third-party solution
[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
Praefect's database is recommended if aiming for full High Availability.
There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:
- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
Once the database is set up, follow the [post-configuration](#praefect-postgresql-post-configuration) steps.
#### Praefect PostgreSQL post-configuration
After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.
We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.
This is how this would work with an Omnibus GitLab PostgreSQL setup:
1. SSH in to the Praefect PostgreSQL node.
1. Connect to the PostgreSQL server with administrative access.
   Use the `gitlab-psql` user here, as it's added by default in Omnibus.
   The database `template1` is used because it is created by default on all PostgreSQL servers.
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
### Configure Praefect
Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
it. This section details how to configure it.
Praefect requires several secret tokens to secure communications across the Cluster:
- `<praefect_external_token>`: Used for repositories hosted on your Gitaly cluster, which can only be accessed by Gitaly clients that carry this token.
- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.
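
The external and internal tokens should be long random strings. One way to generate them is sketched below using Ruby's standard `SecureRandom` module. The `generate_token` helper and the 32-byte length are illustrative assumptions, not GitLab requirements.

```ruby
require 'securerandom'

# Hypothetical helper: returns a random hex string (two characters per byte).
def generate_token(bytes = 32)
  SecureRandom.hex(bytes)
end

if __FILE__ == $PROGRAM_NAME
  puts "praefect_external_token: #{generate_token}"
  puts "praefect_internal_token: #{generate_token}"
end
```

Generate each token once and reuse the same value on every node that needs it, rather than generating a fresh value per node.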
Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
the details of each Gitaly node that makes up the cluster. Each storage is also given a name
and this name is used in several areas of the configuration. In this guide, the name of the storage will be
`default`. Also, this guide is geared towards new installs. If you are upgrading an existing environment
to use Gitaly Cluster, you may need to use a different name.
Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more information.
The following IPs will be used as an example:
- `10.6.0.131`: Praefect 1
- `10.6.0.132`: Praefect 2
- `10.6.0.133`: Praefect 3
To configure the Praefect nodes, on each one:
1. SSH in to the Praefect server.
1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
   package of your choice. Be sure to follow _only_ installation steps 1 and 2
   on the page.
1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
```ruby
# Avoid running unnecessary services on the Praefect server
postgresql['enable'] = false
redis['enable'] = false
nginx['enable'] = false
puma['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
grafana['enable'] = false

# If you run a separate monitoring node you can disable these services
alertmanager['enable'] = false
prometheus['enable'] = false

# Praefect Configuration
praefect['enable'] = true
praefect['listen_addr'] = '0.0.0.0:2305'

gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Praefect External Token
# This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster
cloud "**Object Storage**" as object_storage #white
state ApplicationMonitoring {
elb -[#6a9be7]-> gitlab
"Prometheus"
elb -[#6a9be7]--> monitor
"Grafana"
}
gitlab -[#32CD32]> sidekiq
gitlab -[#32CD32]--> ilb
state PgBouncer {
gitlab -[#32CD32]-> object_storage
"Internal_Load_Balancer"
gitlab -[#32CD32]---> redis
"PgBouncer_1..3"
gitlab -[hidden]-> monitor
}
gitlab -[hidden]-> consul
sidekiq -[#ff8dd1]--> ilb
sidekiq -[#ff8dd1]-> object_storage
sidekiq -[#ff8dd1]---> redis
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
ilb -[#9370DB]-> gitaly_cluster
ilb -[#9370DB]-> database
consul .[#e76a9b]u-> gitlab
consul .[#e76a9b]u-> sidekiq
consul .[#e76a9b]> monitor
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
consul .[#e76a9b,norank]--> redis
monitor .[#7FFFD4]u-> gitlab
monitor .[#7FFFD4]u-> sidekiq
monitor .[#7FFFD4]> consul
monitor .[#7FFFD4]-> database
monitor .[#7FFFD4]-> gitaly_cluster
monitor .[#7FFFD4,norank]--> redis
monitor .[#7FFFD4]> ilb
monitor .[#7FFFD4,norank]u--> elb
@enduml
```
The Google Cloud Platform (GCP) architectures were built and tested using the
...
@@ -106,19 +135,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built-in solution for these restrictions in the future, but in the meantime a non-HA PostgreSQL server
can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) and [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
To set up GitLab and its components to accommodate up to 3,000 users:
1. [Configure the external load balancing node](#configure-the-external-load-balancer)
1. [Configure the external load balancer](#configure-the-external-load-balancer)
   to handle the load balancing of the GitLab application services nodes.
1. [Configure the internal load balancer](#configure-the-internal-load-balancer)
   to handle the load balancing of GitLab application internal connections.
1. [Configure Redis](#configure-redis).
1. [Configure Consul and Sentinel](#configure-consul-and-sentinel).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
1. [Configure the internal load balancing node](#configure-the-internal-load-balancer).
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
## Configure Redis
Using [Redis](https://redis.io/) in a scalable environment is possible using a **Primary** x **Replica**
...
@@ -925,45 +1030,96 @@ The following IPs will be used as an example:
</a>
</a>
</div>
</div>
### Configure the internal load balancer

If you're running more than one PgBouncer node as recommended, then at this time you'll need to set
up a TCP internal load balancer to serve each correctly.

The following IP will be used as an example:

- `10.6.0.20`: Internal Load Balancer

Here's how you could do it with [HAProxy](https://www.haproxy.org/):

```plaintext
global
    log /dev/log local0
    log localhost local1 notice
    log stdout format raw local0

defaults
    log global
    default-server inter 10s fall 3 rise 2
    balance leastconn

frontend internal-pgbouncer-tcp-in
    bind *:6432
    mode tcp
    option tcplog

    default_backend pgbouncer

backend pgbouncer
    mode tcp
    option tcp-check

    server pgbouncer1 10.6.0.21:6432 check
    server pgbouncer2 10.6.0.22:6432 check
    server pgbouncer3 10.6.0.23:6432 check
```

Refer to your preferred Load Balancer's documentation for further guidance.

## Configure Gitaly Cluster

[Gitaly Cluster](../gitaly/praefect.md) is a GitLab-provided and recommended fault-tolerant solution for storing Git repositories.
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.

The recommended cluster setup includes the following components:

- 3 Gitaly nodes: Replicated storage of Git repositories.
- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
  is required for Praefect database connections to be made highly available.
- 1 load balancer: A load balancer is required for Praefect. The
  [internal load balancer](#configure-the-internal-load-balancer) will be used.

This section will detail how to configure the recommended standard setup in order.
For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).

### Configure Praefect PostgreSQL

Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.

If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).

#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab

The following IPs will be used as an example:

- `10.6.0.141`: Praefect PostgreSQL
First, make sure to [install](https://about.gitlab.com/install/)
the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
1. SSH in to the Praefect PostgreSQL node.
1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
and confirmation. Use the value that is output by this command in the next
step as the value of `<praefect_postgresql_password_hash>`:
```shell
sudo gitlab-ctl pg-password-md5 praefect
```
1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Consul
roles ['postgres_role']
repmgr['enable'] = false
patroni['enable'] = false

# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['max_connections'] = 200

gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
...
@@ -971,19 +1127,186 @@ Refer to your preferred Load Balancer's documentation for further guidance.
</a>
</a>
</div>
</div>
## Configure Gitaly

NOTE:
[Gitaly Cluster](../gitaly/praefect.md) support
for the Reference Architectures is being
worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified,
some Architecture specs will likely change as a result to support the new
and improved design.

[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
specifically the number of projects and those projects' sizes. It's recommended
that a Gitaly server node stores no more than 5 TB of data. Depending on your
repository storage requirements, you may require additional Gitaly server nodes.

#### Praefect HA PostgreSQL third-party solution

[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
Praefect's database is recommended if aiming for full High Availability.

There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:

- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.

Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).

Once the database is set up, follow the [post-configuration](#praefect-postgresql-post-configuration) steps.

#### Praefect PostgreSQL post-configuration

After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.

We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.

This is how this would work with an Omnibus GitLab PostgreSQL setup:
1. SSH in to the Praefect PostgreSQL node.
1. Connect to the PostgreSQL server with administrative access.
   Use the `gitlab-psql` user here, as it's added by default in Omnibus.
   The database `template1` is used because it is created by default on all PostgreSQL servers.
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
### Configure Praefect
Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
it. This section details how to configure it.
Praefect requires several secret tokens to secure communications across the Cluster:
- `<praefect_external_token>`: Used for repositories hosted on your Gitaly cluster, which can only be accessed by Gitaly clients that carry this token.
- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.
Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
the details of each Gitaly node that makes up the cluster. Each storage is also given a name
and this name is used in several areas of the configuration. In this guide, the name of the storage will be
`default`. Also, this guide is geared towards new installs. If you are upgrading an existing environment
to use Gitaly Cluster, you may need to use a different name.
Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more information.
The following IPs will be used as an example:
- `10.6.0.131`: Praefect 1
- `10.6.0.132`: Praefect 2
- `10.6.0.133`: Praefect 3
To configure the Praefect nodes, on each one:
1. SSH in to the Praefect server.
1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
   package of your choice. Be sure to follow _only_ installation steps 1 and 2
   on the page.
1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
```ruby
# Avoid running unnecessary services on the Praefect server
postgresql['enable'] = false
redis['enable'] = false
nginx['enable'] = false
puma['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
grafana['enable'] = false

# If you run a separate monitoring node you can disable these services
alertmanager['enable'] = false
prometheus['enable'] = false

# Praefect Configuration
praefect['enable'] = true
praefect['listen_addr'] = '0.0.0.0:2305'

gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Praefect External Token
# This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster
cloud "**Object Storage**" as object_storage #white
"R_Queues_Replica_Node_1..2"
"R_Queues_Sentinel_1..3"
elb -[#6a9be7]-> gitlab
}
elb -[#6a9be7]--> monitor
state Gitaly {
gitlab -[#32CD32]> sidekiq
"Gitaly_1..2"
gitlab -[#32CD32]--> ilb
}
gitlab -[#32CD32]-> object_storage
gitlab -[#32CD32]---> redis
state BackgroundJobs {
gitlab -[hidden]-> monitor
"Sidekiq_1..4"
gitlab -[hidden]-> consul
}
sidekiq -[#ff8dd1]--> ilb
state ApplicationServer {
sidekiq -[#ff8dd1]-> object_storage
"GitLab_Rails_1..12"
sidekiq -[#ff8dd1]---> redis
}
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
state LoadBalancer {
"LoadBalancer_1"
ilb -[#9370DB]-> gitaly_cluster
}
ilb -[#9370DB]-> database
state ApplicationMonitoring {
consul .[#e76a9b]u-> gitlab
"Prometheus"
consul .[#e76a9b]u-> sidekiq
"Grafana"
consul .[#e76a9b]> monitor
}
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
state PgBouncer {
consul .[#e76a9b,norank]--> redis
"Internal_Load_Balancer"
"PgBouncer_1..3"
monitor .[#7FFFD4]u-> gitlab
}
monitor .[#7FFFD4]u-> sidekiq
monitor .[#7FFFD4]> consul
monitor .[#7FFFD4]-> database
monitor .[#7FFFD4]-> gitaly_cluster
monitor .[#7FFFD4,norank]--> redis
monitor .[#7FFFD4]> ilb
monitor .[#7FFFD4,norank]u--> elb
@enduml
```
The Google Cloud Platform (GCP) architectures were built and tested using the
...
@@ -120,19 +130,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built-in solution for these restrictions in the future, but in the meantime a non-HA PostgreSQL server
can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) and [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
To set up GitLab and its components to accommodate up to 50,000 users:
1. [Configure the external load balancing node](#configure-the-external-load-balancer)
1. [Configure the external load balancer](#configure-the-external-load-balancer)
   to handle the load balancing of the GitLab application services nodes.
1. [Configure the internal load balancer](#configure-the-internal-load-balancer)
   to handle the load balancing of GitLab application internal connections.
1. [Configure Consul](#configure-consul).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
1. [Configure the internal load balancing node](#configure-the-internal-load-balancer).
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
## Configure Redis
Using [Redis](https://redis.io/) in a scalable environment is possible using a **Primary** x **Replica**
...
@@ -1302,19 +1351,283 @@ To configure the Sentinel Queues server:
</a>
</a>
</div>
</div>
## Configure Gitaly
## Configure Gitaly Cluster

NOTE:
[Gitaly Cluster](../gitaly/praefect.md) support
for the Reference Architectures is being
worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified,
some Architecture specs will likely change as a result to support the new
and improved design.

[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
specifically the number of projects and those projects' sizes. It's recommended
that a Gitaly server node stores no more than 5 TB of data. Depending on your
repository storage requirements, you may require additional Gitaly server nodes.

[Gitaly Cluster](../gitaly/praefect.md) is a GitLab-provided and recommended fault-tolerant solution for storing Git repositories.
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.

The recommended cluster setup includes the following components:

- 3 Gitaly nodes: Replicated storage of Git repositories.
- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
  is required for Praefect database connections to be made highly available.
- 1 load balancer: A load balancer is required for Praefect. The
  [internal load balancer](#configure-the-internal-load-balancer) will be used.

This section will detail how to configure the recommended standard setup in order.
For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).

### Configure Praefect PostgreSQL

Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.

If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).
#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab
The following IPs will be used as an example:
- `10.6.0.141`: Praefect PostgreSQL
First, make sure to [install](https://about.gitlab.com/install/)
the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
1. SSH in to the Praefect PostgreSQL node.
1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
and confirmation. Use the value that is output by this command in the next
step as the value of `<praefect_postgresql_password_hash>`:
```shell
sudo gitlab-ctl pg-password-md5 praefect
```
1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Consul
roles(['postgres_role'])
repmgr['enable'] = false
patroni['enable'] = false

# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['max_connections'] = 200

gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
#### Praefect HA PostgreSQL third-party solution
[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
Praefect's database is recommended if aiming for full High Availability.
There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:
- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.

Examples of solutions that meet these requirements could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).

Once the database is set up, follow the [post-configuration](#praefect-postgresql-post-configuration) steps.

#### Praefect PostgreSQL post-configuration
After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.
We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.
This is how this would work with an Omnibus GitLab PostgreSQL setup:

1. SSH in to the Praefect PostgreSQL node.
1. Connect to the PostgreSQL server with administrative access.
   Use the `gitlab-psql` user for this, as it's added by default in Omnibus.
   The database `template1` is used because it is created by default on all PostgreSQL servers.
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
### Configure Praefect
Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
it. This section details how to configure it.
Praefect requires several secret tokens to secure communications across the Cluster:
- `<praefect_external_token>`: Protects the repositories hosted on your Gitaly cluster. Only Gitaly clients that carry this token can access them.
- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.

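
One way to produce values for the two tokens is a cryptographically secure random generator. The following is a minimal sketch using Ruby's standard `securerandom` library. The helper name `generate_praefect_token` and the 32-byte length are assumptions made for illustration, not requirements stated by Praefect.

```ruby
require 'securerandom'

# Hypothetical helper (not part of GitLab): returns a random token as a hex string.
# 32 random bytes are encoded as 64 hexadecimal characters.
def generate_praefect_token(bytes = 32)
  SecureRandom.hex(bytes)
end

# Generate one value per token so the external and internal tokens differ.
praefect_external_token = generate_praefect_token
praefect_internal_token = generate_praefect_token

puts "praefect_external_token: #{praefect_external_token}"
puts "praefect_internal_token: #{praefect_internal_token}"
```

Keep the generated values somewhere safe, because the same tokens need to be supplied wherever `<praefect_external_token>` and `<praefect_internal_token>` appear in this guide.
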
Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
the details of each Gitaly node that makes up the cluster. Each storage is also given a name,
and this name is used in several areas of the configuration. In this guide, the name of the storage will be
`default`. This guide is also geared towards new installs. If you're upgrading an existing environment
to use Gitaly Cluster, you may need to use a different name.
Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more information.

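
To make the idea of a virtual storage concrete, here is a rough sketch of the data it describes, written as a plain Ruby hash so it can be run on its own. The Gitaly node names, IP addresses, and port are hypothetical placeholders, and the exact setting name and shape expected in `/etc/gitlab/gitlab.rb` may differ between versions, so check the Praefect documentation before copying anything.

```ruby
# Sketch of one virtual storage named 'default' containing three Gitaly nodes.
# Node names, addresses, and the port are hypothetical placeholders.
virtual_storages = {
  'default' => {
    'gitaly-1' => {
      'address' => 'tcp://10.6.0.91:8075',
      'token' => '<praefect_internal_token>',
      'primary' => true
    },
    'gitaly-2' => {
      'address' => 'tcp://10.6.0.92:8075',
      'token' => '<praefect_internal_token>'
    },
    'gitaly-3' => {
      'address' => 'tcp://10.6.0.93:8075',
      'token' => '<praefect_internal_token>'
    }
  }
}

# Sanity check: each virtual storage should designate exactly one primary node.
virtual_storages.each do |storage_name, nodes|
  primaries = nodes.select { |_name, node| node['primary'] }.keys
  unless primaries.size == 1
    raise "#{storage_name}: expected one primary, got #{primaries.size}"
  end
  puts "#{storage_name}: #{nodes.size} nodes, primary is #{primaries.first}"
end
```

The storage name (`default` here) is the key that the rest of the configuration refers to, which is why an existing environment may need a different name.
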
The following IPs will be used as an example:
- `10.6.0.131`: Praefect 1
- `10.6.0.132`: Praefect 2
- `10.6.0.133`: Praefect 3

To configure the Praefect nodes, on each one:
1. SSH in to the Praefect server.
1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
   package of your choice. Be sure to follow _only_ installation steps 1 and 2
   on the page.
1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
```ruby
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
nginx['enable'] = false
puma['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
grafana['enable'] = false

# If you run a separate monitoring node you can disable these services
alertmanager['enable'] = false
prometheus['enable'] = false

# Praefect Configuration
praefect['enable'] = true
praefect['listen_addr'] = '0.0.0.0:2305'

gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Praefect External Token
# This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster
cloud "**Object Storage**" as object_storage #white
}
elb -[#6a9be7]-> gitlab
state ApplicationMonitoring {
elb -[#6a9be7]--> monitor
"Prometheus"
"Grafana"
gitlab -[#32CD32]> sidekiq
}
gitlab -[#32CD32]--> ilb
gitlab -[#32CD32]-> object_storage
state PgBouncer {
gitlab -[#32CD32]---> redis
"Internal_Load_Balancer"
gitlab -[hidden]-> monitor
"PgBouncer_1..3"
gitlab -[hidden]-> consul
}
sidekiq -[#ff8dd1]--> ilb
sidekiq -[#ff8dd1]-> object_storage
sidekiq -[#ff8dd1]---> redis
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
ilb -[#9370DB]-> gitaly_cluster
ilb -[#9370DB]-> database
consul .[#e76a9b]u-> gitlab
consul .[#e76a9b]u-> sidekiq
consul .[#e76a9b]> monitor
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
consul .[#e76a9b,norank]--> redis
monitor .[#7FFFD4]u-> gitlab
monitor .[#7FFFD4]u-> sidekiq
monitor .[#7FFFD4]> consul
monitor .[#7FFFD4]-> database
monitor .[#7FFFD4]-> gitaly_cluster
monitor .[#7FFFD4,norank]--> redis
monitor .[#7FFFD4]> ilb
monitor .[#7FFFD4,norank]u--> elb
@enduml
```
```
The Google Cloud Platform (GCP) architectures were built and tested using the
...
@@ -106,19 +134,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built-in solution for these restrictions in the future, but in the meantime a non-HA PostgreSQL server
can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) and [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).

## Setup components

To set up GitLab and its components to accommodate up to 5,000 users:

1. [Configure the external load balancer](#configure-the-external-load-balancer)
   to handle the load balancing of the GitLab application services nodes.
1. [Configure the internal load balancer](#configure-the-internal-load-balancer)
   to handle the load balancing of GitLab application internal connections.
1. [Configure Redis](#configure-redis).
1. [Configure Consul and Sentinel](#configure-consul-and-sentinel).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
## Configure Redis

Using [Redis](https://redis.io/) in a scalable environment is possible using a **Primary** x **Replica**
...
@@ -924,45 +1028,96 @@ The following IPs will be used as an example:
</a>
</div>
### Configure the internal load balancer

If you're running more than one PgBouncer node as recommended, then at this time you'll need to set
up a TCP internal load balancer to serve each correctly.

The following IP will be used as an example:

- `10.6.0.20`: Internal Load Balancer

Here's how you could do it with [HAProxy](https://www.haproxy.org/):

```plaintext
global
    log /dev/log local0
    log localhost local1 notice
    log stdout format raw local0

defaults
    log global
    default-server inter 10s fall 3 rise 2
    balance leastconn

frontend internal-pgbouncer-tcp-in
    bind *:6432
    mode tcp
    option tcplog

    default_backend pgbouncer

backend pgbouncer
    mode tcp
    option tcp-check

    server pgbouncer1 10.6.0.21:6432 check
    server pgbouncer2 10.6.0.22:6432 check
    server pgbouncer3 10.6.0.23:6432 check
```

Refer to your preferred Load Balancer's documentation for further guidance.

## Configure Gitaly Cluster

[Gitaly Cluster](../gitaly/praefect.md) is a GitLab provided and recommended fault tolerant solution for storing Git repositories.
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.

The recommended cluster setup includes the following components:

- 3 Gitaly nodes: Replicated storage of Git repositories.
- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
  is required for Praefect database connections to be made highly available.
- 1 load balancer: A load balancer is required for Praefect. The
  [internal load balancer](#configure-the-internal-load-balancer) will be used.

This section will detail how to configure the recommended standard setup in order.
For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).

### Configure Praefect PostgreSQL

Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.

If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).

#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab

The following IPs will be used as an example:

- `10.6.0.141`: Praefect PostgreSQL

First, make sure to [install](https://about.gitlab.com/install/)
the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
1. SSH in to the Praefect PostgreSQL node.
1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
and confirmation. Use the value that is output by this command in the next
step as the value of `<praefect_postgresql_password_hash>`:
```shell
sudo gitlab-ctl pg-password-md5 praefect
```
1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Consul
roles(['postgres_role'])
repmgr['enable'] = false
patroni['enable'] = false

# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['max_connections'] = 200

gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
...
@@ -970,19 +1125,186 @@ Refer to your preferred Load Balancer's documentation for further guidance.
</a>
</div>

## Configure Gitaly

NOTE:
[Gitaly Cluster](../gitaly/praefect.md) support
for the Reference Architectures is being
worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified,
some Architecture specs will likely change as a result to support the new
and improved design.

[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
specifically the number of projects and those projects' sizes. It's recommended
that a Gitaly server node stores no more than 5 TB of data. Depending on your
repository storage requirements, you may require additional Gitaly server nodes.

#### Praefect HA PostgreSQL third-party solution

[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
Praefect's database is recommended if aiming for full High Availability.

There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:

- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.

Examples of solutions that meet these requirements could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).

Once the database is set up, follow the [post-configuration](#praefect-postgresql-post-configuration) steps.

#### Praefect PostgreSQL post-configuration

After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.

We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.

This is how this would work with an Omnibus GitLab PostgreSQL setup:

1. SSH in to the Praefect PostgreSQL node.
1. Connect to the PostgreSQL server with administrative access.
   Use the `gitlab-psql` user for this, as it's added by default in Omnibus.
   The database `template1` is used because it is created by default on all PostgreSQL servers.
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
### Configure Praefect
Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
it. This section details how to configure it.
Praefect requires several secret tokens to secure communications across the Cluster:
- `<praefect_external_token>`: Protects the repositories hosted on your Gitaly cluster. Only Gitaly clients that carry this token can access them.
- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.

Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
the details of each Gitaly node that makes up the cluster. Each storage is also given a name,
and this name is used in several areas of the configuration. In this guide, the name of the storage will be
`default`. This guide is also geared towards new installs. If you're upgrading an existing environment
to use Gitaly Cluster, you may need to use a different name.
Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more information.

The following IPs will be used as an example:
- `10.6.0.131`: Praefect 1
- `10.6.0.132`: Praefect 2
- `10.6.0.133`: Praefect 3

To configure the Praefect nodes, on each one:
1. SSH in to the Praefect server.
1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
   package of your choice. Be sure to follow _only_ installation steps 1 and 2
   on the page.
1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
```ruby
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
nginx['enable'] = false
puma['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
grafana['enable'] = false

# If you run a separate monitoring node you can disable these services
alertmanager['enable'] = false
prometheus['enable'] = false

# Praefect Configuration
praefect['enable'] = true
praefect['listen_addr'] = '0.0.0.0:2305'

gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false

# Configure the Consul agent
consul['enable'] = true
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true

# START user configuration
# Please set the real values as explained in Required Information section
#
# Praefect External Token
# This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster