This page is a development guide for application secrets.
## Warning: Before you add a new secret to application secrets
Before you add a new secret to [`config/initializers/01_secret_token.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/initializers/01_secret_token.rb),
make sure you also update Omnibus GitLab or updates fail. Omnibus is responsible for writing the `secrets.yml` file.
If Omnibus doesn't know about a secret, Rails attempts to write to the file, but this fails because Rails doesn't have write access.
The same rules apply to Cloud Native GitLab charts: you must update the charts first.
If you need the secret to have the same value on each node (which is usually the case), make sure it's configured for all
GitLab.com environments before changing this file.
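
For context, the generation logic behaves roughly like the following simplified sketch (this is not the actual code in `01_secret_token.rb`, and `my_new_secret` is a hypothetical secret name). It illustrates why the Rails process needs write access to `secrets.yml` when a secret is missing:

```ruby
require 'yaml'
require 'securerandom'

# Simplified sketch only; the real logic lives in config/initializers/01_secret_token.rb.
# `my_new_secret` is a hypothetical secret name used for illustration.
secrets_file = Rails.root.join('config', 'secrets.yml')
all_secrets  = File.exist?(secrets_file) ? YAML.load_file(secrets_file) : {}
env_secrets  = all_secrets[Rails.env] ||= {}

# When a secret Rails knows about is missing from the file, a value is generated...
env_secrets['my_new_secret'] ||= SecureRandom.hex(64)

# ...and written back. On Omnibus installations this write fails, because Omnibus
# owns secrets.yml and the Rails process has no write access to it.
File.write(secrets_file, YAML.dump(all_secrets))
```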
...
...
## Further iteration
We may deprecate or remove the automatic secret generation in `01_secret_token.rb` in the future.
Please see [issue 222690](https://gitlab.com/gitlab-org/gitlab/-/issues/222690) for more information.
As [Werner Vogels](https://twitter.com/Werner), the CTO at Amazon Web Services, famously put it, **Everything fails, all the time**.
As a developer, it's as important to consider the failure modes in which your software may operate as it is to consider normal operation. Doing so can mean the difference between a minor hiccup leading to a scattering of `500` errors experienced by a tiny fraction of users, and a full site outage that affects all users for an extended period.
To paraphrase [Tolstoy](https://en.wikipedia.org/wiki/Anna_Karenina_principle), _all happy servers are alike, but all failing servers are failing in their own way_. Luckily, there are ways we can attempt to simulate these failure modes, and the chaos endpoints are tools for assisting in this process.
...
...
Replace `secret` with your own secret token.
## Invoking chaos
After you have enabled the chaos endpoints and restarted the application, you can start testing using the endpoints.
By default, when invoking a chaos endpoint, the web worker process which receives the request handles it. This means, for example, that if the Kill
operation is invoked, the Puma or Unicorn worker process handling the request is killed. To test these operations in Sidekiq, the `async` parameter on
each endpoint can be set to `true`. This runs the chaos process in a Sidekiq worker.
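
For example, a minimal sketch of triggering the kill operation asynchronously from a Ruby script, assuming a local instance on port 3000 and the chaos secret `secret` passed as the `token` query parameter:

```ruby
require 'net/http'
require 'uri'

# Assumes a local instance on port 3000 and a chaos secret of `secret`.
uri = URI.parse('http://localhost:3000/-/chaos/kill?async=true&token=secret')

response = Net::HTTP.get_response(uri)

# With `async=true`, the kill runs in a Sidekiq worker, so the web request
# itself still returns a response.
puts response.code
```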
## Memory leaks
To simulate a memory leak in your application, use the `/-/chaos/leakmem` endpoint.
The memory is not retained after the request finishes. Once the request completes, the Ruby garbage collector attempts to recover the memory.
```plaintext
GET /-/chaos/leakmem
```
...
...
This endpoint is similar to the CPU Spin endpoint but simulates off-processor activity, such as network calls to backend services. It sleeps for a given `duration_s`.
As with the CPU Spin endpoint, this may lead to your request timing out if `duration_s` exceeds the configured limit.
```plaintext
GET /-/chaos/sleep
```
...
...
A convenient script, [`bin/import-project`](https://gitlab.com/gitlab-org/quality/performance/blob/master/bin/import-project), is provided with the [performance](https://gitlab.com/gitlab-org/quality/performance) project to import the Project tarball into a GitLab environment via the API from the terminal.
To use the script, some preparation is required if you haven't done it already:
1. First, set up [`Ruby`](https://www.ruby-lang.org/en/documentation/installation/) and [`Ruby Bundler`](https://bundler.io) if they aren't already available on the machine.
1. Next, install the required Ruby Gems via Bundler with `bundle install`.
...
...
For details on how to use `bin/import-project`, run:

```shell
bin/import-project --help
```
The process should take up to 15 minutes for the project to import fully. The script checks the status periodically and exits after the import has completed.
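
One way to picture that periodic check is a poll of the project import status endpoint of the API, roughly like the sketch below. This is an illustration, not the script's actual implementation; the host, project ID `42`, and `GITLAB_TOKEN` variable are placeholders, and `bin/import-project` does all of this for you:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Placeholder host, project ID, and token; bin/import-project handles this for you.
uri = URI.parse('https://gitlab.example.com/api/v4/projects/42/import')

loop do
  request = Net::HTTP::Get.new(uri)
  request['PRIVATE-TOKEN'] = ENV.fetch('GITLAB_TOKEN')

  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end

  status = JSON.parse(response.body)['import_status']
  puts "import_status: #{status}"

  break if %w[finished failed].include?(status)

  sleep 30
end
```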
### Importing via GitHub
...
...
There is also an option to import the project via GitHub:
1. Create the group `qa-perf-testing`
1. Import the GitLab FOSS repository that's [mirrored on GitHub](https://github.com/gitlabhq/gitlabhq) into the group via the UI.
This import method takes longer than the other methods, and the time varies depending on several factors. It's recommended to use the other methods.
### Importing via a Rake task
...
...
The `namespace_path` does not exist.
For example, one of the groups or subgroups is mistyped or missing
or you've specified the project name in the path.
The task only creates the project.
If you want to import it to a new group or subgroup, create it first.
##### `Exception: No such file or directory @ rb_sysopen - (filename)`
...
...
The project name specified in `project_path` is not valid for one of the specified reasons.
Only put the project name in `project_path`. For example, if you provide a path of subgroups,
it fails with this error because `/` is not a valid character in a project name.
##### `Name has already been taken and Path has already been taken`
...
...
For Performance testing, we should:
- Count the number of executed SQL queries during the restore.
- Observe the number of GC cycles.
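
As a point of reference, this kind of measurement can be sketched from a Rails console as follows. This is an illustrative sketch, not the snippet mentioned below, and the restore call itself is left as a comment:

```ruby
require 'benchmark'

query_count = 0
counter = lambda do |_name, _start, _finish, _id, payload|
  # Skip schema reflection queries; count everything else.
  query_count += 1 unless payload[:name] == 'SCHEMA'
end

gc_cycles_before = GC.stat(:count)

elapsed = Benchmark.realtime do
  ActiveSupport::Notifications.subscribed(counter, 'sql.active_record') do
    # Run the restore here, for example the call into Project::TreeRestorer.
  end
end

puts "time: #{elapsed.round(2)}s, " \
     "SQL queries: #{query_count}, " \
     "GC cycles: #{GC.stat(:count) - gc_cycles_before}"
```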
You can use this snippet: `https://gitlab.com/gitlab-org/gitlab/snippets/1924954` (must be logged in), which restores the project and measures the execution time of `Project::TreeRestorer`, the number of SQL queries, and the number of GC cycles.
You can execute the script from the `gdk/gitlab` directory like this:
Additionally, the pattern that is currently used to update the project statistics
(the callback) doesn't scale adequately. It is currently one of the largest
[database queries transactions on production](https://gitlab.com/gitlab-org/gitlab/-/issues/29070)
that takes the most time overall. We can't add one more query to it as
it increases the transaction's length.
Because of all of the above, we can't apply the same pattern to store
and update the namespaces statistics, as the `namespaces` table is one
...
...
Even though this approach would make aggregating much easier, it has some major downsides:
- We'd have to migrate **all namespaces** by adding and filling a new column. Because of the size of the table, the time and cost of doing so would be significant. The background migration would take approximately `153h`, see <https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/29772>.
- The background migration has to be shipped one release earlier, delaying the functionality by another milestone.
### Attempt E (final): Update the namespace storage statistics in an async way
This approach consists of continuing to use the incremental statistics updates we already have,
but we refresh them through Sidekiq jobs and in different transactions:
1. Create a second table (`namespace_aggregation_schedules`) with two columns `id` and `namespace_id`.
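
A minimal sketch of what the migration for that table could look like (illustrative only; details such as the index, constraints, and migration version are assumptions):

```ruby
# Illustrative migration sketch for the schedules table described in step 1.
class CreateNamespaceAggregationSchedules < ActiveRecord::Migration[6.0]
  def change
    create_table :namespace_aggregation_schedules do |t|
      # One pending aggregation per namespace.
      t.references :namespace, null: false, index: { unique: true }, foreign_key: { on_delete: :cascade }
    end
  end
end
```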
The measuring module is a tool that allows you to measure a service's execution and log measurements such as:
- RSS memory usage
- Server worker ID
The measuring module logs these measurements into a structured log called [`service_measurement.log`](../administration/logs.md#service_measurementlog),
as a single entry for each service execution.
For GitLab.com, `service_measurement.log` is ingested in Elasticsearch and Kibana as part of our monitoring solution.
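
To make the shape of this concrete, the instrumentation can be pictured roughly as follows. This is a generic sketch, not GitLab's actual measuring implementation, and the field names are illustrative:

```ruby
require 'json'
require 'logger'

# Generic sketch of structured service measurement; field names are illustrative.
def measure_service(service_name, logger: Logger.new($stdout))
  gc_count_before = GC.stat(:count)
  started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  result = yield

  logger.info(JSON.generate(
    class: service_name,
    duration_s: (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at).round(4),
    gc_count: GC.stat(:count) - gc_count_before,
    rss_bytes: `ps -o rss= -p #{Process.pid}`.to_i * 1024, # RSS as reported by ps, in bytes
    pid: Process.pid
  ))

  result
end

# Example: any block can be measured and logged as a single structured entry.
# `MyService` is a hypothetical service name.
measure_service('MyService') { sleep 0.1 }
```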