-
Yorick Peterse authored
This adds support for balancing queries amongst multiple database hosts. Web requests will stick to using the primary for a little while after a write took place, removing the need for synchronous replication. Load balancing is disabled for Sidekiq since using this could lead to race conditions, and Sidekiq mostly performs writes anyway. == Balancing Balancing is done using a simple round-robin algorithm. The first time a connection is needed the first host is used, then the second, third, etc. This logic resets to the first host once reaching the end of the hosts list. The code added in this commit _only_ load balances queries sent from models. The code does _not_ touch ActiveRecord::Base.connection. This means that direct use of this method will result in the queries being sent to the primary. == Configuration Configuration is done by adding a YAML section to config/database.yml. For example: production: load_balancing: hosts: - 10.0.0.1 - 10.0.0.2 All hosts will use the same authentication credentials. == Sticking When a write is performed the query is sent to the primary. Any queries executed after this point are also sent to the primary. At the end of a request some session details are stored for the current user, these details are used to stick to the primary for as long as necessary (or until the data expires). This prevents the user from running into cases where they write data to the primary, read from the secondary, and the data isn't available yet (e.g. leading to an HTTP 404 error). == Overhead The load balancing code has minimal overhead. Instead of parsing raw SQL queries it hooks into Rails specific methods to determine what host to use for a query. == Transactions Transactions are always executed on the primary, even if they don't perform any writes. Once a transaction completes a session will stick to the primary. This is based on transactions almost always being used for writes (there's little benefit to using a transaction for only reads). == Prepared Statements Prepared statements don't work well when queries are being distributed amongst hosts. As a result GitLab will automatically disable prepared statements when load balancing is enabled. Disabling prepared statements has no impact on response timings, and may even reduce the memory usage of PostgreSQL. == Failovers The load balancing code is capable of dealing with database failovers. In the event of a secondary being unavailable the load balancer will mark it as offline and use the next available secondary. If no secondaries are available the primary is used instead. Secondaries that are marked as offline are checked again automatically, preventing a host from being marked as offline forever. In the event of a connection error when writing to the primary the code will suspend the caller, then retry the operation up to 3 times. Every retry the sleep time will increase exponentially. All of this means that in the event of a DB restart or failover some requests may take a bit longer to complete; instead of the application immediately returning an error.
623db07d