An error occurred fetching the project authors.
- 04 Apr, 2017 1 commit
-
-
Yorick Peterse authored
There were 3 problems with sticking: 1. When sticking to a request (based on a previous request) we would always overwrite the WAL pointer, leading to a user being stuck to the primary for too long. 2. Sticking was not working for the Grape API. The API in particular is tricky because we have to stick either using a user, CI runner, or a CI build. 3. Refreshing of permissions did not lead to the refreshed users being stuck to the primary. This could result in users not being able to access new resources for a brief moment of time, or them still being able to see old resources. To solve this some of the Grape related logic is handled by injecting EE specific modules in the right places. This ensures we can stick as early as possible, using the right data. The same technique is applied for various parts of the CI codebase.
-
- 14 Mar, 2017 1 commit
-
-
Yorick Peterse authored
This adds support for balancing queries amongst multiple database hosts. Web requests will stick to using the primary for a little while after a write took place, removing the need for synchronous replication. Load balancing is disabled for Sidekiq since using this could lead to race conditions, and Sidekiq mostly performs writes anyway. == Balancing Balancing is done using a simple round-robin algorithm. The first time a connection is needed the first host is used, then the second, third, etc. This logic resets to the first host once reaching the end of the hosts list. The code added in this commit _only_ load balances queries sent from models. The code does _not_ touch ActiveRecord::Base.connection. This means that direct use of this method will result in the queries being sent to the primary. == Configuration Configuration is done by adding a YAML section to config/database.yml. For example: production: load_balancing: hosts: - 10.0.0.1 - 10.0.0.2 All hosts will use the same authentication credentials. == Sticking When a write is performed the query is sent to the primary. Any queries executed after this point are also sent to the primary. At the end of a request some session details are stored for the current user, these details are used to stick to the primary for as long as necessary (or until the data expires). This prevents the user from running into cases where they write data to the primary, read from the secondary, and the data isn't available yet (e.g. leading to an HTTP 404 error). == Overhead The load balancing code has minimal overhead. Instead of parsing raw SQL queries it hooks into Rails specific methods to determine what host to use for a query. == Transactions Transactions are always executed on the primary, even if they don't perform any writes. Once a transaction completes a session will stick to the primary. This is based on transactions almost always being used for writes (there's little benefit to using a transaction for only reads). == Prepared Statements Prepared statements don't work well when queries are being distributed amongst hosts. As a result GitLab will automatically disable prepared statements when load balancing is enabled. Disabling prepared statements has no impact on response timings, and may even reduce the memory usage of PostgreSQL. == Failovers The load balancing code is capable of dealing with database failovers. In the event of a secondary being unavailable the load balancer will mark it as offline and use the next available secondary. If no secondaries are available the primary is used instead. Secondaries that are marked as offline are checked again automatically, preventing a host from being marked as offline forever. In the event of a connection error when writing to the primary the code will suspend the caller, then retry the operation up to 3 times. Every retry the sleep time will increase exponentially. All of this means that in the event of a DB restart or failover some requests may take a bit longer to complete; instead of the application immediately returning an error.
-