Commit 6eaea52a authored by Russell Dickenson, committed by Amy Qualls

Remove instances of future tense from several Development Guide pages

parent 548ba5c2
......@@ -29,8 +29,8 @@ This page is a development guide for application secrets.
## Warning: Before you add a new secret to application secrets
Before you add a new secret to [`config/initializers/01_secret_token.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/initializers/01_secret_token.rb),
make sure you also update Omnibus GitLab or updates will fail. Omnibus is responsible for writing the `secrets.yml` file.
If Omnibus doesn't know about a secret, Rails will attempt to write to the file, but this will fail because Rails doesn't have write access.
make sure you also update Omnibus GitLab or updates fail. Omnibus is responsible for writing the `secrets.yml` file.
If Omnibus doesn't know about a secret, Rails attempts to write to the file, but this fails because Rails doesn't have write access.
The same rules apply to the Cloud Native GitLab charts: you must update the charts first.
If you need the secret to have the same value on each node (which is usually the case), make sure it's configured for all
GitLab.com environments before changing this file.
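As an illustration only (the key name and variable below are hypothetical, not the real contents of `01_secret_token.rb`), the kind of change this warning is about is adding one more generated key, which Omnibus GitLab and the charts must also know how to write:
```ruby
# Illustrative sketch only: a hypothetical new generated secret.
# Omnibus GitLab and the Cloud Native GitLab charts must also learn to write
# this key into secrets.yml, because Rails cannot write the file itself.
require 'securerandom'

new_defaults = {
  'my_new_secret_key_base' => SecureRandom.hex(64),
}
```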
......@@ -44,5 +44,5 @@ GitLab.com environments prior to changing this file.
## Further iteration
We might deprecate/remove this automatic secret generation '01_secret_token.rb' in the future.
Please see [this issue](https://gitlab.com/gitlab-org/gitlab/-/issues/222690) for more information.
We may either deprecate or remove this automatic secret generation (`01_secret_token.rb`) in the future.
Please see [issue 222690](https://gitlab.com/gitlab-org/gitlab/-/issues/222690) for more information.
......@@ -8,7 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
As [Werner Vogels](https://twitter.com/Werner), the CTO at Amazon Web Services, famously put it, **Everything fails, all the time**.
As a developer, it's as important to consider the failure modes in which your software will operate as much as normal operation. Doing so can mean the difference between a minor hiccup leading to a scattering of `500` errors experienced by a tiny fraction of users and a full site outage that affects all users for an extended period.
As a developer, it's just as important to consider the failure modes in which your software may operate as it is to consider normal operation. Doing so can mean the difference between a minor hiccup leading to a scattering of `500` errors experienced by a tiny fraction of users, and a full site outage that affects all users for an extended period.
To paraphrase [Tolstoy](https://en.wikipedia.org/wiki/Anna_Karenina_principle), _all happy servers are alike, but all failing servers are failing in their own way_. Luckily, there are ways we can attempt to simulate these failure modes, and the chaos endpoints are tools for assisting in this process.
......@@ -40,17 +40,17 @@ Replace `secret` with your own secret token.
## Invoking chaos
Once you have enabled the chaos endpoints and restarted the application, you can start testing using the endpoints.
After you have enabled the chaos endpoints and restarted the application, you can start testing using the endpoints.
By default, when invoking a chaos endpoint, the web worker process which receives the request will handle it. This means, for example, that if the Kill
operation is invoked, the Puma or Unicorn worker process handling the request will be killed. To test these operations in Sidekiq, the `async` parameter on
each endpoint can be set to `true`. This will run the chaos process in a Sidekiq worker.
By default, when invoking a chaos endpoint, the web worker process which receives the request handles it. This means, for example, that if the Kill
operation is invoked, the Puma or Unicorn worker process handling the request is killed. To test these operations in Sidekiq, the `async` parameter on
each endpoint can be set to `true`. This runs the chaos process in a Sidekiq worker.
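A hedged sketch of the dispatch this paragraph describes; the method and class names below are hypothetical, not GitLab's actual chaos implementation:
```ruby
# Illustrative sketch only: with async=false the chaos action runs inside the
# web worker that received the request; with async=true it is pushed to a
# Sidekiq background worker instead.
def trigger_chaos(duration_s, async:)
  if async
    ChaosWorker.perform_async(duration_s) # hypothetical Sidekiq worker
  else
    run_chaos_inline(duration_s)          # hypothetical inline helper (Puma/Unicorn worker)
  end
end
```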
## Memory leaks
To simulate a memory leak in your application, use the `/-/chaos/leakmem` endpoint.
The memory is not retained after the request finishes. After the request has completed, the Ruby garbage collector will attempt to recover the memory.
The memory is not retained after the request finishes. After the request has completed, the Ruby garbage collector attempts to recover the memory.
```plaintext
GET /-/chaos/leakmem
......@@ -85,7 +85,7 @@ GET /-/chaos/cpu_spin?duration_s=50&async=true
| Attribute | Type | Required | Description |
| ------------ | ------- | -------- | --------------------------------------------------------------------- |
| `duration_s` | integer | no | Duration, in seconds, that the core will be used. Defaults to 30s |
| `duration_s` | integer | no | Duration, in seconds, that the core is used. Defaults to 30s |
| `async` | boolean | no | Set to true to consume CPU in a Sidekiq background worker process |
```shell
......@@ -110,7 +110,7 @@ GET /-/chaos/db_spin?duration_s=50&async=true
| Attribute | Type | Required | Description |
| ------------ | ------- | -------- | --------------------------------------------------------------------------- |
| `interval_s` | float | no | Interval, in seconds, for every DB request. Defaults to 1s |
| `duration_s` | integer | no | Duration, in seconds, that the core will be used. Defaults to 30s |
| `duration_s` | integer | no | Duration, in seconds, that the core is used. Defaults to 30s |
| `async` | boolean | no | Set to true to perform the operation in a Sidekiq background worker process |
```shell
......@@ -120,9 +120,9 @@ curl "http://localhost:3000/-/chaos/db_spin?interval_s=1&duration_s=60&token=sec
## Sleep
This endpoint is similar to the CPU Spin endpoint but simulates off-processor activity, such as network calls to backend services. It will sleep for a given duration_s.
This endpoint is similar to the CPU Spin endpoint but simulates off-processor activity, such as network calls to backend services. It sleeps for a given `duration_s`.
As with the CPU Spin endpoint, this may lead to your request timing out if duration_s exceeds the configured limit.
As with the CPU Spin endpoint, this may lead to your request timing out if `duration_s` exceeds the configured limit.
```plaintext
GET /-/chaos/sleep
......@@ -132,7 +132,7 @@ GET /-/chaos/sleep?duration_s=50&async=true
| Attribute | Type | Required | Description |
| ------------ | ------- | -------- | ---------------------------------------------------------------------- |
| `duration_s` | integer | no | Duration, in seconds, that the request will sleep for. Defaults to 30s |
| `duration_s` | integer | no | Duration, in seconds, that the request sleeps for. Defaults to 30s |
| `async` | boolean | no | Set to true to sleep in a Sidekiq background worker process |
```shell
......@@ -142,7 +142,7 @@ curl "http://localhost:3000/-/chaos/sleep?duration_s=60&token=secret"
## Kill
This endpoint will simulate the unexpected death of a worker process using a `kill` signal.
This endpoint simulates the unexpected death of a worker process using a `kill` signal.
Because this endpoint uses the `KILL` signal, the worker isn't given an
opportunity to clean up or shut down.
......
......@@ -50,28 +50,28 @@ called `Gitlab::GithubImport::AdvanceStageWorker`.
### 1. RepositoryImportWorker
This worker will kick off the import process by simply scheduling a job for the
This worker starts the import process by scheduling a job for the
next worker.
### 2. Stage::ImportRepositoryWorker
This worker will import the repository and wiki, scheduling the next stage when
This worker imports the repository and wiki, scheduling the next stage when
done.
### 3. Stage::ImportBaseDataWorker
This worker will import base data such as labels, milestones, and releases. This
work is done in a single thread since it can be performed fast enough that we
This worker imports base data such as labels, milestones, and releases. This
work is done in a single thread because it can be performed fast enough that we
don't need to perform this work in parallel.
### 4. Stage::ImportPullRequestsWorker
This worker will import all pull requests. For every pull request a job for the
This worker imports all pull requests. For every pull request a job for the
`Gitlab::GithubImport::ImportPullRequestWorker` worker is scheduled.
### 5. Stage::ImportIssuesAndDiffNotesWorker
This worker will import all issues and pull request comments. For every issue, we
This worker imports all issues and pull request comments. For every issue, we
schedule a job for the `Gitlab::GithubImport::ImportIssueWorker` worker. For
pull request comments, we instead schedule jobs for the
`Gitlab::GithubImport::DiffNoteImporter` worker.
......@@ -91,14 +91,14 @@ This worker imports regular comments for both issues and pull requests. For
every comment, we schedule a job for the
`Gitlab::GithubImport::ImportNoteWorker` worker.
Regular comments have to be imported at the end since the GitHub API used
Regular comments have to be imported at the end because the GitHub API used
returns comments for both issues and pull requests. This means we have to wait
for all issues and pull requests to be imported before we can import regular
comments.
### 7. Stage::FinishImportWorker
This worker will wrap up the import process by performing some housekeeping
This worker completes the import process by performing some housekeeping
(such as flushing any caches) and by marking the import as completed.
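A hedged sketch of the per-stage pattern these workers share; the helper `each_pull_request` is a placeholder and the code is simplified, not the actual implementation:
```ruby
# Illustrative sketch: a stage worker schedules one job per object, tracks them
# with Gitlab::JobWaiter, and hands over to AdvanceStageWorker when done scheduling.
class ImportPullRequestsWorker # simplified stand-in for the real stage worker
  include Sidekiq::Worker

  def perform(project_id)
    waiter = Gitlab::JobWaiter.new
    jobs = 0

    each_pull_request(project_id) do |pull_request| # placeholder helper
      Gitlab::GithubImport::ImportPullRequestWorker
        .perform_async(project_id, pull_request, waiter.key)
      jobs += 1
    end

    # The next stage starts only after all jobs tracked by the waiter report back.
    Gitlab::GithubImport::AdvanceStageWorker
      .perform_async(project_id, { waiter.key => jobs }, :issues_and_diff_notes)
  end
end
```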
## Advancing stages
......@@ -113,22 +113,22 @@ The first approach should only be used by workers that perform all their work in
a single thread, while `AdvanceStageWorker` should be used for everything else.
The way `AdvanceStageWorker` works is fairly simple. When scheduling a job it
will be given a project ID, a list of Redis keys, and the name of the next
is given a project ID, a list of Redis keys, and the name of the next
stage. The Redis keys (produced by `Gitlab::JobWaiter`) are used to check if the
currently running stage has been completed or not. If the stage has not yet been
completed `AdvanceStageWorker` will reschedule itself. Once a stage finishes
`AdvanceStageworker` will refresh the import JID (more on this below) and
completed `AdvanceStageWorker` reschedules itself. After a stage finishes
`AdvanceStageWorker` refreshes the import JID (more on this below) and
schedules the worker of the next stage.
To reduce the number of `AdvanceStageWorker` jobs scheduled this worker will
briefly wait for jobs to complete before deciding what the next action should
be. For small projects, this may slow down the import process a bit, but it will
also reduce pressure on the system as a whole.
To reduce the number of `AdvanceStageWorker` jobs scheduled, this worker
briefly waits for jobs to complete before deciding what the next action should
be. For small projects, this may slow down the import process a bit, but it
also reduces pressure on the system as a whole.
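A hedged sketch of that advancing logic; the helper methods are elided and the shape is simplified from the real worker:
```ruby
# Illustrative sketch: check the JobWaiter keys, reschedule while jobs remain,
# and advance to the next stage once everything has finished.
class AdvanceStageWorker # simplified stand-in
  include Sidekiq::Worker

  INTERVAL = 30 # seconds to wait before checking the waiters again

  def perform(project_id, waiters, next_stage)
    remaining = wait_for_jobs(waiters) # elided: briefly waits on the Redis keys

    if remaining.empty?
      refresh_import_jid(project_id)   # elided: keeps the import from being marked as stuck
      next_stage_worker(next_stage).perform_async(project_id)
    else
      self.class.perform_in(INTERVAL, project_id, remaining, next_stage)
    end
  end
end
```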
## Refreshing import JIDs
GitLab includes a worker called `Gitlab::Import::StuckProjectImportJobsWorker`
that will periodically run and mark project imports as failed if they have been
that periodically runs and marks project imports as failed if they have been
running for more than 15 hours. For GitHub projects, this poses a bit of a
problem: importing large projects could take several hours depending on how
often we hit the GitHub rate limit (more on this below), but we don't want
......@@ -151,7 +151,7 @@ because we need the Email address of users in order to map them to GitLab users.
We handle this by doing the following:
1. Once we hit the rate limit all jobs will automatically reschedule themselves
1. After we hit the rate limit, all jobs automatically reschedule themselves
in such a way that they are not executed until the rate limit has been reset.
1. We cache the mapping of GitHub users to GitLab users in Redis.
......@@ -164,7 +164,7 @@ perform:
1. One API call to get the user's Email address.
1. Two database queries to see if a corresponding GitLab user exists. One query
will try to find the user based on the GitHub user ID, while the second query
tries to find the user based on the GitHub user ID, while the second query
is used to find the user using their GitHub Email address.
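The two database lookups described above might look roughly like this; the sketch assumes the `Identity` and `User` models and is illustrative, not the actual importer code:
```ruby
# Illustrative sketch: first try the GitHub identity (provider + external UID),
# then fall back to the GitHub Email address.
def find_gitlab_user_id(github_id, github_email)
  Identity.where(provider: :github, extern_uid: github_id).limit(1).pluck(:user_id).first ||
    User.where(email: github_email).limit(1).pluck(:id).first
end
```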
Because this process is quite expensive we cache the result of these lookups in
......@@ -186,11 +186,11 @@ positive lookup, we refresh the TTL automatically. The TTL of false lookups is
never refreshed.
Because of this caching layer, it's possible newly registered GitLab accounts
won't be linked to their corresponding GitHub accounts. This, however, will sort
itself out once the cached keys expire.
aren't linked to their corresponding GitHub accounts. This, however, is resolved
after the cached keys expire.
The user cache lookup is shared across projects. This means that the more
projects get imported the fewer GitHub API calls will be needed.
The user cache lookup is shared across projects. This means that the more
projects that are imported, the fewer GitHub API calls are needed.
The code for this resides in:
......
......@@ -29,7 +29,7 @@ This method ignores all the errors silently (including the ones related to `GITA
A convenient script, [`bin/import-project`](https://gitlab.com/gitlab-org/quality/performance/blob/master/bin/import-project), is provided with the [performance](https://gitlab.com/gitlab-org/quality/performance) project to import the Project tarball into a GitLab environment via API from the terminal.
Note that to use the script, it will require some preparation if you haven't done so already:
Note that the script requires some preparation if you haven't already done so:
1. First, set up [`Ruby`](https://www.ruby-lang.org/en/documentation/installation/) and [`Ruby Bundler`](https://bundler.io) if they aren't already available on the machine.
1. Next, install the required Ruby Gems via Bundler with `bundle install`.
......@@ -40,7 +40,7 @@ For details how to use `bin/import-project`, run:
bin/import-project --help
```
The process should take up to 15 minutes for the project to import fully. The script will keep checking periodically for the status and exit once import has completed.
The process should take up to 15 minutes for the project to import fully. The script checks the status periodically and exits after the import has completed.
### Importing via GitHub
......@@ -49,7 +49,7 @@ There is also an option to [import the project via GitHub](../user/project/impor
1. Create the group `qa-perf-testing`
1. Import the GitLab FOSS repository that's [mirrored on GitHub](https://github.com/gitlabhq/gitlabhq) into the group via the UI.
This method will take longer to import than the other methods and will depend on several factors. It's recommended to use the other methods.
This method takes longer to import than the other methods and depends on several factors. It's recommended to use the other methods.
### Importing via a Rake task
......@@ -94,7 +94,7 @@ The `namespace_path` does not exist.
For example, one of the groups or subgroups is mistyped or missing
or you've specified the project name in the path.
The task will only create the project.
The task only creates the project.
If you want to import it to a new group or subgroup, create it first.
##### `Exception: No such file or directory @ rb_sysopen - (filename)`
......@@ -118,8 +118,8 @@ with '-', end in '.git' or end in '.atom'
The project name specified in `project_path` is not valid for one of the specified reasons.
Only put the project name in `project_path`. If, for example, you put a path of subgroups in there
it will fail with this error as `/` is not a valid character in a project name.
Only put the project name in `project_path`. For example, if you provide a path of subgroups
it fails with this error as `/` is not a valid character in a project name.
##### `Name has already been taken and Path has already been taken`
......@@ -190,7 +190,7 @@ For Performance testing, we should:
- Count the number of executed SQL queries during the restore.
- Observe the number of GC cycles happening.
You can use this snippet: `https://gitlab.com/gitlab-org/gitlab/snippets/1924954` (must be logged in), which will restore the project, and measure the execution time of `Project::TreeRestorer`, number of SQL queries and number of GC cycles happening.
You can use this snippet: `https://gitlab.com/gitlab-org/gitlab/snippets/1924954` (must be logged in), which restores the project and measures the execution time of `Project::TreeRestorer`, the number of SQL queries, and the number of GC cycles.
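The snippet itself isn't reproduced here; a minimal sketch of that kind of measurement, with the restored work left as a placeholder, might look like this:
```ruby
# Illustrative sketch: time a block of work and count its SQL queries and GC cycles.
require 'benchmark'

query_count = 0
ActiveSupport::Notifications.subscribe('sql.active_record') { query_count += 1 }

gc_before = GC.stat(:count)
elapsed = Benchmark.realtime do
  # Placeholder for the actual restore, for example a Project::TreeRestorer call.
  Project.count
end

puts "time: #{elapsed.round(2)}s, queries: #{query_count}, GC cycles: #{GC.stat(:count) - gc_before}"
```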
You can execute the script from the `gdk/gitlab` directory like this:
......@@ -200,7 +200,7 @@ bundle exec rails r /path_to_sript/script.rb project_name /path_to_extracted_pr
## Troubleshooting
In this section we'll detail any known issues we've seen when trying to import a project and how to manage them.
This section details known issues we've seen when trying to import a project and how to manage them.
### Gitaly calls error when importing
......
......@@ -33,7 +33,7 @@ Additionally, the pattern that is currently used to update the project statistic
(the callback) doesn't scale adequately. It is currently one of the largest
[database queries transactions on production](https://gitlab.com/gitlab-org/gitlab/-/issues/29070)
that takes the most time overall. We can't add one more query to it as
it will increase the transaction's length.
it increases the transaction's length.
Because of all of the above, we can't apply the same pattern to store
and update the namespaces statistics, as the `namespaces` table is one
......@@ -137,12 +137,12 @@ WHERE namespace_id IN (
Even though this approach would make aggregating much easier, it has some major downsides:
- We'd have to migrate **all namespaces** by adding and filling a new column. Because of the size of the table, dealing with time/cost will not be great. The background migration will take approximately `153h`, see <https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/29772>.
- We'd have to migrate **all namespaces** by adding and filling a new column. Because of the size of the table, the time and cost involved would be significant. The background migration would take approximately `153h`, see <https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/29772>.
- The background migration has to be shipped one release earlier, delaying the functionality by another milestone.
### Attempt E (final): Update the namespace storage statistics in an async way
This approach consists of keep using the incremental statistics updates we currently already have,
This approach consists of continuing to use the incremental statistics updates we already have,
but we refresh them through Sidekiq jobs and in different transactions:
1. Create a second table (`namespace_aggregation_schedules`) with two columns `id` and `namespace_id`.
......
......@@ -7,7 +7,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Query Count Limits
Each controller or API endpoint is allowed to execute up to 100 SQL queries and
in test environments we'll raise an error when this threshold is exceeded.
in test environments we raise an error when this threshold is exceeded.
## Solving Failing Tests
......@@ -20,18 +20,18 @@ solutions to this problem:
You should only resort to whitelisting when an existing controller or endpoint
is to blame as in this case reducing the number of SQL queries can take a lot of
effort. Newly added controllers and endpoints are not allowed to execute more
than 100 SQL queries and no exceptions will be made for this rule. _If_ a large
than 100 SQL queries and no exceptions are made for this rule. _If_ a large
number of SQL queries is necessary to perform certain work, it's best to have
this work performed by Sidekiq instead of doing this directly in a web request.
## Whitelisting
In the event that you _have_ to whitelist a controller you'll first need to
In the event that you _have_ to whitelist a controller, you must first
create an issue. This issue should (preferably in the title) mention the
controller or endpoint and include the appropriate labels (`database`,
`performance`, and at least a team specific label such as `Discussion`).
Once the issue has been created you can whitelist the code in question. For
After the issue has been created you can whitelist the code in question. For
Rails controllers it's best to create a `before_action` hook that runs as early
as possible. The called method in turn should call
`Gitlab::QueryLimiting.whitelist('issue URL here')`. For example:
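A minimal sketch of such a hook follows; the controller name and issue URL are placeholders rather than the original example:
```ruby
class MyController < ApplicationController
  # Runs as early as possible so the allowance covers the whole request.
  before_action :whitelist_query_limiting, only: [:show]

  def show
    # ... action that currently needs more than 100 SQL queries ...
  end

  private

  def whitelist_query_limiting
    # Placeholder URL; point this at the issue created for this controller.
    Gitlab::QueryLimiting.whitelist('https://gitlab.com/gitlab-org/gitlab/-/issues/ISSUE_ID')
  end
end
```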
......
......@@ -19,7 +19,7 @@ The measuring module is a tool that allows to measure a service's execution, and
- RSS memory usage
- Server worker ID
The measuring module will log these measurements into a structured log called [`service_measurement.log`](../administration/logs.md#service_measurementlog),
The measuring module logs these measurements into a structured log called [`service_measurement.log`](../administration/logs.md#service_measurementlog),
as a single entry for each service execution.
For GitLab.com, `service_measurement.log` is ingested in Elasticsearch and Kibana as part of our monitoring solution.
......@@ -43,7 +43,7 @@ DummyService.prepend(Measurable)
If you are prepending a module from the `EE` namespace with EE features, you need to prepend `Measurable` after prepending the `EE` module.
This way, `Measurable` will be at the bottom of the ancestor chain, in order to measure execution of `EE` features as well:
This way, `Measurable` is at the bottom of the ancestor chain, in order to measure execution of `EE` features as well:
```ruby
class DummyService
......@@ -69,7 +69,7 @@ def extra_attributes_for_measurement
end
```
After the measurement module is injected in the service, it will be behind a generic feature flag.
After the measurement module is injected in the service, it is behind a generic feature flag.
To actually use it, you need to enable measuring for the desired service by enabling the feature flag.
### Enabling measurement using feature flags
......