This uses ActiveRecord's `update_all` method to update all rows in a single
query. This in turn makes it much harder for this code to overload a database.
## Use read replicas when possible
In a DB cluster we have many read replicas and one primary. A classic way of scaling the DB is to have read-only actions performed by the replicas. We use [load balancing](https://docs.gitlab.com/ee/administration/database_load_balancing.html) to distribute this load. This allows the pool of replicas to grow as the pressure on the DB grows.
By default, queries use read-only replicas. However, due to [primary sticking](https://docs.gitlab.com/ee/administration/database_load_balancing.html#primary-sticking), GitLab sticks to using the primary for a certain period of time after a write, and reverts to the secondaries once they have either caught up or 30 seconds have passed. This can lead to a considerable amount of unnecessary load on the primary. In this [merge request](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/56849), we introduced the `without_sticky_writes` block to prevent switching to the primary. This [merge request example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/57328) shows a case where queries would stick to the primary, and how to prevent this by using `without_sticky_writes`.
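
To make the sticking behavior concrete, here is a minimal, self-contained sketch. It is a toy model and not GitLab's implementation: `ToySession`, `write!`, and `read_target` are hypothetical names, and only the block name `without_sticky_writes` is borrowed from the text above. The sketch models the 30-second limit only and ignores the "replicas have caught up" condition.

```ruby
# Toy model of primary sticking. NOT GitLab's implementation:
# ToySession and its methods are hypothetical, for illustration only.
class ToySession
  STICK_SECONDS = 30 # upper bound on sticking, as described above

  # The clock is injectable so the behavior can be checked without sleeping.
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @stuck_until = nil
    @ignore_writes = false
  end

  # Record a write. Unless suppressed, later reads are pinned to the primary.
  def write!
    @stuck_until = @clock.call + STICK_SECONDS unless @ignore_writes
  end

  # Writes performed inside the block do not pin the session to the primary.
  def without_sticky_writes
    previous = @ignore_writes
    @ignore_writes = true
    yield
  ensure
    @ignore_writes = previous
  end

  # Where the next read-only query would be routed.
  def read_target
    stuck? ? :primary : :replica
  end

  private

  def stuck?
    !@stuck_until.nil? && @clock.call < @stuck_until
  end
end

session = ToySession.new
session.without_sticky_writes { session.write! }
session.read_target # => :replica
```

In this model, the same `write!` outside the block would make `read_target` return `:primary` until the 30 seconds have elapsed, which is the extra primary load the block is meant to avoid.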
Internally, our database load balancer classifies the queries based on their main statement (`select`, `update`, `delete`, etc.). When in doubt, it redirects the queries to the primary database. Hence, there are some common cases in which the load balancer sends the queries to the primary unnecessarily:
...
- In-flight connection configuration set
- Sidekiq background jobs
Worse, after the above queries are executed, GitLab [sticks to the primary](https://docs.gitlab.com/ee/administration/database_load_balancing.html#primary-sticking). In [this merge request](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/56476), we introduced the `use_replica_if_possible` block to make the queries inside it prefer the replicas. That MR is also an example of how we redirected a costly, time-consuming query to the replicas.
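
The classification and sticking described in this section can be sketched with a small toy router. This is an assumption-laden illustration, not the real load balancer: `ToyRouter`, `classify`, and `route` are hypothetical, the regular expression is a deliberately naive stand-in for "main statement" detection, and the block method only borrows the name `use_replica_if_possible` from the text above.

```ruby
# Toy query router. NOT GitLab's load balancer:
# all names and rules here are hypothetical, for illustration only.
class ToyRouter
  def initialize
    @stuck_to_primary = false
    @prefer_replica = false
  end

  # Classify a query by its main statement. Anything that is not
  # clearly a SELECT is treated as a write ("when in doubt, primary").
  def classify(sql)
    sql.strip.match?(/\Aselect\b/i) ? :read : :write
  end

  # Decide where a query goes, and stick to the primary after a write.
  def route(sql)
    if classify(sql) == :read
      return :replica if @prefer_replica || !@stuck_to_primary

      return :primary
    end

    @stuck_to_primary = true
    :primary
  end

  # Reads inside the block prefer the replicas even when stuck.
  def use_replica_if_possible
    previous = @prefer_replica
    @prefer_replica = true
    yield
  ensure
    @prefer_replica = previous
  end
end

router = ToyRouter.new
router.route("SET statement_timeout TO '30s'") # => :primary, and the router now sticks
router.route("SELECT 1")                       # => :primary
router.use_replica_if_possible do
  router.route("SELECT 1")                     # => :replica
end
```

A connection-configuration statement such as `SET` is not a write to any table, yet the naive classifier sends it to the primary and the reads that follow pay for it. Wrapping the expensive read in the block is what moves it back to a replica in this model.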