Merge branch '284-remove-zip-buffer' into 'master'
Alessio Caiazza authored
Remove an in-memory buffer for LSIF transformation

Closes #284

See merge request gitlab-org/gitlab-workhorse!586
8fd7b24b
Name Last commit Last update
_support Verify changelog file presence in CI NO CHANGELOG
changelogs/unreleased Merge branch '284-remove-zip-buffer' into 'master'
cmd Limit memory footprint of artifacts metadata processing
doc Update environment websocket route
internal Merge branch '284-remove-zip-buffer' into 'master'
testdata Test the behaviour of SendURL with less-common upstreams
.gitignore Ignore toml files in project root
.gitlab-ci.yml Verify changelog file presence in CI NO CHANGELOG
CHANGELOG Update CHANGELOG for 8.43.0
CONTRIBUTING.md Add a boilerplate CONTRIBUTING.md document
LICENSE Update LICENSE year to 2017
Makefile Update make to generate `testdata/scratch` dir
PROCESS.md Merge branch 'security-release-process' into 'master'
README.md Merge branch 'patch-1' into 'master'
VERSION Update VERSION to 8.43.0
authorization_test.go Fix semantical merge conflict (double import)
backend.go Fix backend URL parsing
backend_test.go Use testify/require in top level tests
cable_test.go Tests: consistent helper function names
channel_test.go Use testify/require in top level tests
config.toml.example Add Azure Blob Storage support
gitaly_integration_test.go Make more use of testify/require
gitaly_test.go Merge branch 'jv-test-gitaly-graceful-stop' into 'master'
go.mod Add Azure Blob Storage support
go.sum Add Azure Blob Storage support
jobs_test.go
logging.go
main.go
main_test.go
proxy_test.go
raven.go
sendfile_test.go
tools.go
upload_test.go

gitlab-workhorse

Gitlab-workhorse is a smart reverse proxy for GitLab. It handles "large" HTTP requests such as file downloads, file uploads, Git push/pull and Git archive downloads.

Quick facts (how does Workhorse work)

  • Workhorse can handle some requests without involving Rails at all: for example, JavaScript files and CSS files are served straight from disk.
  • Workhorse can modify responses sent by Rails: for example if you use send_file in Rails then gitlab-workhorse will open the file on disk and send its contents as the response body to the client.
  • Workhorse can take over requests after asking permission from Rails. Example: handling git clone.
  • Workhorse can modify requests before passing them to Rails. Example: when handling a Git LFS upload Workhorse first asks permission from Rails, then it stores the request body in a tempfile, then it sends a modified request containing the tempfile path to Rails.
  • Workhorse can manage long-lived WebSocket connections for Rails. Example: handling the terminal websocket for environments.
  • Workhorse does not connect to Postgres, only to Rails and (optionally) Redis.
  • We assume that all requests that reach Workhorse pass through an upstream proxy such as NGINX or Apache first.
  • Workhorse does not accept HTTPS connections.
  • Workhorse does not clean up idle client connections.
  • We assume that all requests to Rails pass through Workhorse.

For more information see 'A brief history of gitlab-workhorse'.

Usage

  gitlab-workhorse [OPTIONS]

Options:
  -apiCiLongPollingDuration duration
      Long polling duration for job requesting for runners (default 50s - enabled) (default 50ns)
  -apiLimit uint
      Number of API requests allowed at single time
  -apiQueueDuration duration
      Maximum queueing duration of requests (default 30s)
  -apiQueueLimit uint
      Number of API requests allowed to be queued
  -authBackend string
      Authentication/authorization backend (default "http://localhost:8080")
  -authSocket string
      Optional: Unix domain socket to dial authBackend at
  -cableBackend string
      Optional: ActionCable backend (default authBackend)
  -cableSocket string
      Optional: Unix domain socket to dial cableBackend at (default authSocket)
  -config string
      TOML file to load config from
  -developmentMode
      Allow the assets to be served from Rails app
  -documentRoot string
      Path to static files content (default "public")
  -listenAddr string
      Listen address for HTTP server (default "localhost:8181")
  -listenNetwork string
      Listen 'network' (tcp, tcp4, tcp6, unix) (default "tcp")
  -listenUmask int
      Umask for Unix socket
  -logFile string
      Log file location
  -logFormat string
      Log format to use defaults to text (text, json, structured, none) (default "text")
  -pprofListenAddr string
      pprof listening address, e.g. 'localhost:6060'
  -prometheusListenAddr string
      Prometheus listening address, e.g. 'localhost:9229'
  -proxyHeadersTimeout duration
      How long to wait for response headers when proxying the request (default 5m0s)
  -secretPath string
      File with secret key to authenticate with authBackend (default "./.gitlab_workhorse_secret")
  -version
      Print version and exit

The 'auth backend' refers to the GitLab Rails application. The name is a holdover from when gitlab-workhorse only handled Git push/pull over HTTP.

Gitlab-workhorse can listen on either a TCP or a Unix domain socket. It can also open a second listening TCP listening socket with the Go net/http/pprof profiler server.

Gitlab-workhorse can listen on redis events (currently only builds/register for runners). This requires you to pass a valid TOML config file via -config flag. For regular setups it only requires the following (replacing the string with the actual socket)

Redis

Gitlab-workhorse integrates with Redis to do long polling for CI build requests. This is configured via two things:

  • Redis settings in the TOML config file
  • The -apiCiLongPollingDuration command line flag to control polling behavior for CI build requests

It is OK to enable Redis in the config file but to leave CI polling disabled; this just results in an idle Redis pubsub connection. The opposite is not possible: CI long polling requires a correct Redis configuration.

Below we discuss the options for the [redis] section in the config file.

[redis]
URL = "unix:///var/run/gitlab/redis.sock"
Password = "my_awesome_password"
Sentinel = [ "tcp://sentinel1:23456", "tcp://sentinel2:23456" ]
SentinelMaster = "mymaster"
  • URL takes a string in the format unix://path/to/redis.sock or tcp://host:port.
  • Password is only required if your redis instance is password-protected
  • Sentinel is used if you are using Sentinel. NOTE that if both Sentinel and URL are given, only Sentinel will be used

Optional fields are as follows:

[redis]
DB = 0
ReadTimeout = "1s"
KeepAlivePeriod = "5m"
MaxIdle = 1
MaxActive = 1
  • DB is the Database to connect to. Defaults to 0
  • ReadTimeout is how long a redis read-command can take. Defaults to 1s
  • KeepAlivePeriod is how long the redis connection is to be kept alive without anything flowing through it. Defaults to 5m
  • MaxIdle is how many idle connections can be in the redis-pool at once. Defaults to 1
  • MaxActive is how many connections the pool can keep. Defaults to 1

Relative URL support

If you are mounting GitLab at a relative URL, e.g. example.com/gitlab, then you should also use this relative URL in the authBackend setting:

gitlab-workhorse -authBackend http://localhost:8080/gitlab

Interaction of authBackend and authSocket

The interaction between authBackend and authSocket can be a bit confusing. It comes down to: if authSocket is set it overrides the host part of authBackend but not the relative path.

In table form:

authBackend authSocket Workhorse connects to? Rails relative URL
unset unset localhost:8080 /
http://localhost:3000 unset localhost:3000 /
http://localhost:3000/gitlab unset localhost:3000 /gitlab
unset /path/to/socket /path/to/socket /
http://localhost:3000 /path/to/socket /path/to/socket /
http://localhost:3000/gitlab /path/to/socket /path/to/socket /gitlab

The same applies to cableBackend and cableSocket.

Installation

To install gitlab-workhorse you need Go 1.8 or newer and GNU Make.

To install into /usr/local/bin run make install.

make install

To install into /foo/bin set the PREFIX variable.

make install PREFIX=/foo

On some operating systems, such as FreeBSD, you may have to use gmake instead of make.

Dependencies

Exiftool

Workhorse uses exiftool for removing EXIF data (which may contain sensitive information) from uploaded images. If you installed GitLab:

  • Using the Omnibus package, you're all set. NOTE that if you are using CentOS Minimal, you may need to install perl package: yum install perl

  • From source, make sure exiftool is installed:

    # Debian/Ubuntu
    sudo apt-get install libimage-exiftool-perl
    
    # RHEL/CentOS
    sudo yum install perl-Image-ExifTool

GraphicsMagick (experimental)

Workhorse has an experimental feature that allows us to rescale images on-the-fly. If you do not run Workhorse in a container where the gm tool is already installed, you will have to install it on your host machine instead:

macOS

brew install graphicsmagick

Debian/Ubuntu

sudo apt-get install graphicsmagick

For installation on other platforms, please consult http://www.graphicsmagick.org/README.html.

Note that Omnibus containers already come with gm installed.

Error tracking

GitLab-Workhorse supports remote error tracking with Sentry. To enable this feature set the GITLAB_WORKHORSE_SENTRY_DSN environment variable. You can also set the GITLAB_WORKHORSE_SENTRY_ENVIRONMENT environment variable to use the Sentry environment functionality to separate staging, production and development.

Omnibus (/etc/gitlab/gitlab.rb):

gitlab_workhorse['env'] = {
    'GITLAB_WORKHORSE_SENTRY_DSN' => 'https://foobar'
    'GITLAB_WORKHORSE_SENTRY_ENVIRONMENT' => 'production'
}

Source installations (/etc/default/gitlab):

export GITLAB_WORKHORSE_SENTRY_DSN='https://foobar'
export GITLAB_WORKHORSE_SENTRY_ENVIRONMENT='production'

Tests

Run the tests with:

make clean test

Coverage / what to test

Each feature in gitlab-workhorse should have an integration test that verifies that the feature 'kicks in' on the right requests and leaves other requests unaffected. It is better to also have package-level tests for specific behavior but the high-level integration tests should have the first priority during development.

It is OK if a feature is only covered by integration tests.

Distributed Tracing

Workhorse supports distributed tracing through LabKit using OpenTracing APIs.

By default, no tracing implementation is linked into the binary, but different OpenTracing providers can be linked in using build tags/build constraints. This can be done by setting the BUILD_TAGS make variable.

For more details of the supported providers, see LabKit, but as an example, for Jaeger tracing support, include the tags: BUILD_TAGS="tracer_static tracer_static_jaeger".

make BUILD_TAGS="tracer_static tracer_static_jaeger"

Once Workhorse is compiled with an opentracing provider, the tracing configuration is configured via the GITLAB_TRACING environment variable.

For example:

GITLAB_TRACING=opentracing://jaeger ./gitlab-workhorse

Continuous Profiling

Workhorse supports continuous profiling through LabKit using Stackdriver Profiler.

By default, the Stackdriver Profiler implementation is linked in the binary using build tags, though it's not required and can be skipped.

For example:

make BUILD_TAGS=""

Once Workhorse is compiled with Continuous Profiling, the profiler configuration can be set via GITLAB_CONTINUOUS_PROFILING environment variable.

For example:

GITLAB_CONTINUOUS_PROFILING="stackdriver?service=workhorse&service_version=1.0.1&project_id=test-123 ./gitlab-workhorse"

More information about see the LabKit monitoring docs.

License

This code is distributed under the MIT license, see the LICENSE file.