Commit ea4ad773 authored by Craig Norris

Merge branch '354845-edit-workhorse-page' into 'master'

Polish the new-features Workhorse page

See merge request gitlab-org/gitlab!84762
parents ae14c73f f841b505
@@ -7,40 +7,72 @@ info: To determine the technical writer assigned to the Stage/Group associated w ...
 # Adding new features to Workhorse
 
 GitLab Workhorse is a smart reverse proxy for GitLab. It handles
-"long" HTTP requests such as file downloads, file uploads, Git
-push/pull and Git archive downloads.
-
-Workhorse itself is not a feature, but there are [several features in GitLab](gitlab_features.md) that would not work efficiently without Workhorse.
-
-At a first glance, it may look like Workhorse is just a pipeline for processing HTTP streams so that you can reduce the amount of logic in your Ruby on Rails controller, but there are good reasons to avoid treating it like that.
-
-Engineers embarking on the quest of offloading a feature to Workhorse often find that the endeavor is much higher than what originally anticipated. In part because of the new programming language (only a few engineers at GitLab are Go developers), in part because of the demanding requirements for Workhorse. Workhorse is stateless, memory and disk usage must be kept under tight control, and the request should not be slowed down in the process.
-
+[long HTTP requests](#what-are-long-requests), such as:
+
+- File downloads.
+- File uploads.
+- Git pushes and pulls.
+- Git archive downloads.
+
+Workhorse itself is not a feature, but [several features in GitLab](gitlab_features.md)
+would not work efficiently without Workhorse.
-## Can I add a new feature to Workhorse?
-
-We suggest to follow this route only if absolutely necessary and no other options are available.
-
-Splitting a feature between the Rails code-base and Workhorse is deliberately choosing to introduce technical debt. It adds complexity to the system and coupling between the two components.
-
-- Building features using Workhorse has a considerable complexity cost, so you should prefer designs based on Rails requests and Sidekiq jobs.
-- Even when using Rails+Sidekiq is "more work" than using Rails+Workhorse, Rails+Sidekiq is easier to maintain in the long term because Workhorse is unique to GitLab while Rails+Sidekiq is an industry standard.
-- For "global" behaviors around web requests consider using a Rack middleware instead of Workhorse.
-- Generally speaking, we should only use Rails+Workhorse if the HTTP client expects behavior that is not reasonable to implement in Rails, like "long" requests.
-
-## What is a "long" request?
+
+At a first glance, Workhorse appears to be just a pipeline for processing HTTP
+streams to reduce the amount of logic in your Ruby on Rails controller. However,
+don't treat it that way. Engineers trying to offload a feature to Workhorse often
+find it takes more work than originally anticipated:
+
+- It's a new programming language, and only a few engineers at GitLab are Go developers.
+- Workhorse has demanding requirements:
+  - It's stateless.
+  - Memory and disk usage must be kept under tight control.
+  - The request should not be slowed down in the process.
+
-
-There is one order of magnitude between Workhorse and Puma RAM usage. Having connection open for a period longer than milliseconds is a problem because of the amount of RAM it monopolizes once it reaches the Ruby on Rails controller.
-
-So far we identified two classes of "long" requests: data transfers and HTTP long polling.
-
-`git push`, `git pull`, uploading or downloading an artifact, the CI runner waiting for a new job are all good examples of long requests.
-
-With the rise of cloud-native installations, Workhorse's feature-set was extended to add object storage direct-upload, to get rid of the shared Network File System (NFS) drives.
-
-In 2020 @nolith presented at FOSDEM a talk titled [_Speed up the monolith. Building a smart reverse proxy in Go_](https://archive.fosdem.org/2020/schedule/event/speedupmonolith/).
-You can watch the recording for more details on the history of Workhorse and the NFS removal.
-
-[Uploads development documentation](../uploads.md)
-contains the most common use-cases for adding a new type of upload and may answer all of your questions.
+## Avoid adding new features
+
+We suggest adding new features only if absolutely necessary and no other options exist.
+Splitting a feature between the Rails codebase and Workhorse is a deliberate choice
+to introduce technical debt. It adds complexity to the system, and coupling between
+the two components:
+
+- Building features using Workhorse has a considerable complexity cost, so you should
+  prefer designs based on Rails requests and Sidekiq jobs.
+- Even when using Rails-and-Sidekiq is more work than using Rails-and-Workhorse,
+  Rails-and-Sidekiq is easier to maintain in the long term. Workhorse is unique
+  to GitLab, while Rails-and-Sidekiq is an industry standard.
+- For global behaviors around web requests, consider using a Rack middleware
+  instead of Workhorse.
-
-If you still think we should add a new feature to Workhorse, please open an issue explaining **what you want to implement** and **why it can't be implemented in our Ruby code-base**. Workhorse maintainers will be happy to help you assessing the situation.
+- Generally speaking, use Rails-and-Workhorse only if the HTTP client expects
+  behavior that is not reasonable to implement in Rails, like long requests.
+
+## What are long requests?
+
+One order of magnitude exists between Workhorse and Puma RAM usage. Having a connection
+open for longer than milliseconds is problematic due to the amount of RAM
+it monopolizes after it reaches the Ruby on Rails controller. We've identified two classes
+of long requests: data transfers and HTTP long polling. Some examples:
+
+- `git push`.
+- `git pull`.
+- Uploading or downloading an artifact.
+- A CI runner waiting for a new job.
+
+With the rise of cloud-native installations, Workhorse's feature set was extended
+to add object storage direct-upload. This change removed the need for the shared
+Network File System (NFS) drives.
+
+If you still think we should add a new feature to Workhorse, open an issue for the
+Workhorse maintainers and explain:
+
+1. What you want to implement.
+1. Why it can't be implemented in our Ruby codebase.
+
+The Workhorse maintainers can help you assess the situation.
+
+## Related topics
+
+- In 2020, `@nolith` presented the talk
+  ["Speed up the monolith. Building a smart reverse proxy in Go"](https://archive.fosdem.org/2020/schedule/event/speedupmonolith/)
+  at FOSDEM. The talk includes more details on the history of Workhorse and the NFS removal.
+- The [uploads development documentation](../uploads.md) contains the most common
+  use cases for adding a new type of upload.