Commit 9d23f8f9 authored by Evan Read, committed by Kati Paizee

Edit new primary election content

parent 0bd09359
......@@ -108,7 +108,7 @@ The availability objectives for Gitaly clusters are:
Gitaly Cluster supports:
- [Strong consistency](praefect.md#strong-consistency) of the secondary replicas.
- [Automatic failover](praefect.md#automatic-failover-and-leader-election) from the primary to the secondary.
- [Automatic failover](praefect.md#automatic-failover-and-primary-election-strategies) from the primary to the secondary.
- Reporting of possible data loss if replication queue is non-empty.
- Marking repositories as [read only](praefect.md#read-only-mode) if data loss is detected to prevent data inconsistencies.
......@@ -248,10 +248,10 @@ Gitaly Cluster and [Geo](../geo/index.md) both provide redundancy. However the r
The following table outlines the major differences between Gitaly Cluster and Geo:
| Tool | Nodes | Locations | Latency tolerance | Failover | Consistency | Provides redundancy for |
|:---------------|:---------|:----------|:-------------------|:----------------------------------------------------------------|:-----------------------------------------|:------------------------|
| Gitaly Cluster | Multiple | Single | Approximately 1 ms | [Automatic](praefect.md#automatic-failover-and-leader-election) | [Strong](praefect.md#strong-consistency) | Data storage in Git |
| Geo | Multiple | Multiple | Up to one minute | [Manual](../geo/disaster_recovery/index.md) | Eventual | Entire GitLab instance |
| Tool | Nodes | Locations | Latency tolerance | Failover | Consistency | Provides redundancy for |
|:---------------|:---------|:----------|:-------------------|:----------------------------------------------------------------------------|:-----------------------------------------|:------------------------|
| Gitaly Cluster | Multiple | Single | Approximately 1 ms | [Automatic](praefect.md#automatic-failover-and-primary-election-strategies) | [Strong](praefect.md#strong-consistency) | Data storage in Git |
| Geo | Multiple | Multiple | Up to one minute | [Manual](../geo/disaster_recovery/index.md) | Eventual | Entire GitLab instance |
For more information, see:
......
......@@ -1079,32 +1079,33 @@ You can configure:
current assignments: gitaly-1, gitaly-2
```
## Automatic failover and leader election
## Automatic failover and primary election strategies
Praefect regularly checks the health of each backend Gitaly node. This
information can be used to automatically failover to a new primary node if the
current primary node is found to be unhealthy.
Praefect regularly checks the health of each Gitaly node. This is used to automatically fail over
to a newly-elected primary Gitaly node if the current primary node is found to be unhealthy.
### Election strategies
We recommend using [repository-specific primary nodes](#repository-specific-primary-nodes),
which is [planned to be the only available election strategy](https://gitlab.com/gitlab-org/gitaly/-/issues/3574)
We recommend using [repository-specific primary nodes](#repository-specific-primary-nodes). This is
[planned to be the only available election strategy](https://gitlab.com/gitlab-org/gitaly/-/issues/3574)
from GitLab 14.0.
In the future, we are likely to implement support for:
### Repository-specific primary nodes
> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/3492) in GitLab 13.12.
- A [Consul](../consul.md) strategy.
- A cloud-native strategy.
Gitaly Cluster supports electing repository-specific primary Gitaly nodes. Repository-specific
Gitaly primary nodes are enabled in `/etc/gitlab/gitlab.rb` by setting
`praefect['failover_election_strategy'] = 'per_repository'`.
#### Repository-specific primary nodes
Praefect's [deprecated election strategies](#deprecated-election-strategies):
Praefect's earlier election strategies elected a primary for each virtual storage, which was used as the
primary for each repository in the virtual storage. This model prevented horizontal scaling of a virtual
storage. The primary Gitaly node needed a replica of each repository and thus became the bottleneck.
- Elected a primary Gitaly node for each virtual storage, which was used as the primary node for
each repository in the virtual storage.
- Prevented horizontal scaling of a virtual storage. The primary Gitaly node needed a replica of
each repository and thus became the bottleneck.
The `per_repository` election strategy solves this problem by electing a primary separately for each repository.
Combined with [configurable replication factors](#configure-replication-factor), you can horizontally
scale storage capacity and distribute write load across Gitaly nodes.
The `per_repository` election strategy solves this problem by electing a primary Gitaly node separately for each
repository. Combined with [configurable replication factors](#configure-replication-factor), you can
horizontally scale storage capacity and distribute write load across Gitaly nodes.
Primary elections are run when:
......@@ -1128,16 +1129,13 @@ If there are no healthy secondary nodes for a repository:
- The unhealthy primary node is demoted and the repository is left without a primary node.
- Operations that require a primary node fail until a primary is successfully elected.
Repository-specific primaries are enabled in `/etc/gitlab/gitlab.rb` by setting
`praefect['failover_election_strategy'] = 'per_repository'`.
##### Migrate to repository-specific primary nodes
#### Migrate to repository-specific primary Gitaly nodes
New Gitaly clusters can start using the `per_repository` election strategy immediately.
New Gitaly Clusters can start using the `per_repository` election strategy immediately.
To migrate existing clusters:
1. Praefect didn't historically keep database records of every repository stored on the cluster. When
1. Praefect nodes didn't historically keep database records of every repository stored on the cluster. When
the `per_repository` election strategy is configured, Praefect expects to have database records of
each repository. A [background migration](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/2749) is
included in GitLab 13.6 and later to create any missing database records for repositories. Before migrating
......@@ -1161,35 +1159,39 @@ To migrate existing clusters:
The migration runs when Praefect starts up. If the migration is unsuccessful, you can restart
a Praefect node to reattempt it. The migration only runs with `sql` election strategy configured.
1. Running two different election strategies side by side can cause a split brain, where different Praefects
consider repositories to have different primaries. To avoid this, all Praefects should be shut down prior
to changing the election strategy.
1. Running two different election strategies side by side can cause a split brain, where different
Praefect nodes consider repositories to have different primaries. To avoid this, shut down
all Praefect nodes before changing the election strategy.
This can be done by running `gitlab-ctl stop praefect` on the Praefect nodes.
Do this by running `gitlab-ctl stop praefect` on the Praefect nodes.
1. On the Praefect nodes, configure the election strategy in `/etc/gitlab/gitlab.rb` with
`praefect['failover_election_strategy'] = 'per_repository'`.
1. Finally, run `gitlab-ctl reconfigure` to reconfigure and restart the Praefects.
1. Finally, run `gitlab-ctl reconfigure` to reconfigure and restart the Praefect nodes.
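
As a minimal sketch of the configuration that the migration steps above end with, assuming an Omnibus GitLab installation where Praefect is configured through `/etc/gitlab/gitlab.rb`, the setting on each Praefect node would look like the following. The comments only restate the commands named in the steps; nothing else about the file is implied.

```ruby
# /etc/gitlab/gitlab.rb on every Praefect node.
# Before editing, stop Praefect on all Praefect nodes: gitlab-ctl stop praefect
praefect['failover_election_strategy'] = 'per_repository'
# After saving the file, apply the change and restart Praefect: gitlab-ctl reconfigure
```

Applying the same value on every Praefect node before any of them is restarted keeps two election strategies from running side by side.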
#### Deprecated election strategies
### Deprecated election strategies
WARNING:
The below election strategies are deprecated and are scheduled for removal in GitLab 14.0.
Migrate to [repository-specific primary nodes](#repository-specific-primary-nodes).
- **PostgreSQL:** Enabled by default until GitLab 14.0, and equivalent to:
`praefect['failover_election_strategy'] = 'sql'`. This configuration
option allows multiple Praefect nodes to coordinate via the
PostgreSQL database to elect a primary Gitaly node. This configuration
causes Praefect nodes to elect a new primary, monitor its health,
and elect a new primary if the current one has not been reachable in
10 seconds by a majority of the Praefect nodes.
`praefect['failover_election_strategy'] = 'sql'`.
This configuration option:
- Allows multiple Praefect nodes to coordinate via the PostgreSQL database to elect a primary
Gitaly node.
- Causes Praefect nodes to elect a new primary Gitaly node, monitor its health, and elect a new primary
Gitaly node if the current one is not reached within 10 seconds by a majority of the Praefect
nodes.
- **Memory:** Enabled by setting `praefect['failover_election_strategy'] = 'local'`
in `/etc/gitlab/gitlab.rb` on the Praefect node. If a sufficient number of health
checks fail for the current primary backend Gitaly node, and new primary will
be elected. **Do not use with multiple Praefect nodes!** Using with multiple
Praefect nodes is likely to result in a split brain.
in `/etc/gitlab/gitlab.rb` on the Praefect node.
If a sufficient number of health checks fail for the current primary Gitaly node, a new primary is
elected. **Do not use with multiple Praefect nodes!** Using with multiple Praefect nodes is
likely to result in a split brain.
## Primary Node Failure
......