Commits · 015c6f5fd38faff274e611fd84d4f0e764101adf · Kirill Smelkov / gitlab-workhorse

11 Mar, 2016 1 commit
- . · 015c6f5f
  Kirill Smelkov authored Mar 11, 2016
  
  015c6f5f
28 Feb, 2016 2 commits

blob/auth: Cache auth backend reply for 30s · 86b2a17f

Kirill Smelkov authored Dec 09, 2015

In the previous patch we added code to serve blob content by running `git cat-file
...` directly, but every such request still triggers a request to the slow RoR-based
auth backend, which is bad for performance.

Let's cache the auth backend reply for a short period of time, e.g. 30 seconds, which
changes the situation dramatically:

If we have a lot of requests to the same repository, we query the auth backend only
for every Nth request. With e.g. 100 raw blob requests/s and a 30 s cache lifetime,
N = 100 * 30 = 3000, which means that the previous load on the RoR code essentially
goes away.

On the other hand, as we still query the auth backend once in a while and refresh the
cache, we will not miss changes in project settings. A potential delay of e.g.
25 seconds for a project to become public, or vice versa to become private, does
no real harm.

The cache is designed so that the read-side codepath can execute in
parallel and is not blocked by eventual cache updates.

Overall this improves performance a lot:

  (on an 8-CPU i7-3770S with 16GB of RAM, 2001:67c:1254:e:8b::c776 is on localhost)

  # request is handled by gitlab-workhorse, but without auth caching
  $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
  Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    1 threads and 40 connections
    Thread Stats   Avg      Stdev     Max   +/- Stdev
      Latency   458.42ms   66.26ms 766.12ms   84.76%
      Req/Sec    85.38     16.59   120.00     82.00%
    Latency Distribution
       50%  459.26ms
       75%  490.09ms
       90%  523.95ms
       99%  611.33ms
    853 requests in 10.01s, 1.51MB read
  Requests/sec:     85.18
  Transfer/sec:    154.90KB

  # request goes to gitlab-workhorse with auth caching (this patch)
  $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
  Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    1 threads and 40 connections
    Thread Stats   Avg      Stdev     Max   +/- Stdev
      Latency    34.52ms   19.28ms 288.63ms   74.74%
      Req/Sec     1.20k   127.21     1.39k    85.00%
    Latency Distribution
       50%   32.67ms
       75%   42.73ms
       90%   56.26ms
       99%   99.86ms
    11961 requests in 10.01s, 21.24MB read
  Requests/sec:   1194.51
  Transfer/sec:      2.12MB

i.e. it is a ~14x improvement.

86b2a17f

Teach gitlab-workhorse to serve requests to get raw blobs · 3dfb8ed5

Kirill Smelkov authored Dec 09, 2015

Currently GitLab serves requests to get raw blobs via Ruby-on-Rails code and
Unicorn. Because RoR/Unicorn is relatively heavyweight, in an environment where
there are a lot of simultaneous requests to get raw blobs, this works very slowly
and the server is constantly overloaded.

On the other hand, to get raw blob content we do not need anything from the RoR
framework: we only need access to the project git repository on the filesystem
and to know whether access to the data there should be granted or
not. That means it is possible to handle '.../raw/....' requests directly
in the more lightweight and performant gitlab-workhorse.

As gitlab-workhorse is written in Go, and Go has good concurrency/parallelism
support and is generally much faster than Ruby, moving the raw blob serving task to
it makes sense and should be a net win.

In this patch we add infrastructure to process GET requests for '/raw/...':

- extract project / ref and path from URL
- query auth backend for whether download access should be granted or not
- emit blob content via spawning external `git cat-file`

I've tried to make the output as close as possible to the one emitted by the RoR
code, with the idea that for users the change should be transparent.

As in this patch we still query the auth backend for every request to get a blob,
the RoR code remains heavily loaded, so essentially there is no speedup yet:

  (on an 8-CPU i7-3770S with 16GB of RAM, 2001:67c:1254:e:8b::c776 is on localhost)

  # without patch: request eventually goes to unicorn  (9 unicorn workers)
  $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
  Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    1 threads and 40 connections
    Thread Stats   Avg      Stdev     Max   +/- Stdev
      Latency   461.16ms   63.44ms 809.80ms   84.18%
      Req/Sec    84.84     17.02   131.00     80.00%
    Latency Distribution
       50%  460.21ms
       75%  492.83ms
       90%  524.67ms
       99%  636.49ms
    847 requests in 10.01s, 1.57MB read
  Requests/sec:     84.64
  Transfer/sec:    161.10KB

  # with this patch: request handled by gitlab-workhorse
  $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
  Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    1 threads and 40 connections
    Thread Stats   Avg      Stdev     Max   +/- Stdev
      Latency   458.42ms   66.26ms 766.12ms   84.76%
      Req/Sec    85.38     16.59   120.00     82.00%
    Latency Distribution
       50%  459.26ms
       75%  490.09ms
       90%  523.95ms
       99%  611.33ms
    853 requests in 10.01s, 1.51MB read
  Requests/sec:     85.18
  Transfer/sec:    154.90KB

In the next patch we'll cache auth backend replies and that will improve
performance dramatically.

NOTE 20160228: there is internal/git/blob.go, which tries to serve raw data via
    gitlab-workhorse but still asks Unicorn about the blob->sha1 mapping
    etc. That work started in

        86aaa133 (Prototype blobs via workhorse, @jacobvosmaer)

    and was inspired by this patch. It diverges from what
    we can do if we serve all blob data just by gitlab-workhorse (see
    next patch), so we just avoid git/blob.go, put our stuff into
    git/xblob.go and tweak routes, essentially deactivating the git/blob.go
    code.

3dfb8ed5

12 Feb, 2016 1 commit
- Version 0.6.4 · 3f8da4ae
  Jacob Vosmaer authored Feb 12, 2016
  
  3f8da4ae
08 Feb, 2016 3 commits

Merge branch 'git-raw-content-length' into 'master' · 369ad5b4

Jacob Vosmaer authored Feb 08, 2016

Unset Content-Length for raw Git blobs

Fixes https://gitlab.com/gitlab-org/gitlab-workhorse/merge_requests/30#note_3560130

Failure to get raw Git blob via /api/v3 with error:

error: SendBlob: copy git cat-file stdout: Conn.Write wrote more than the declared Content-Length

See merge request !37

369ad5b4

Unset Content-Length for raw Git blobs · 275946fe
Jacob Vosmaer authored Feb 08, 2016

275946fe
Increase default ProxyHeadersTimeout to 5 minutes · 55ce310d
Jacob Vosmaer authored Feb 08, 2016

55ce310d

02 Feb, 2016 3 commits
- Version 0.6.3 · fb0deba8
  Jacob Vosmaer authored Feb 02, 2016
  
  fb0deba8
- Merge branch 'raw-blob' into 'master' · 575e78b9
  Jacob Vosmaer authored Feb 02, 2016
```
Blobs via workhorse

Works together with https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/2451

See merge request !30
```
  575e78b9
- No "--" for "git cat-file"? · 8167a305
  Jacob Vosmaer authored Feb 02, 2016
  
  8167a305
01 Feb, 2016 8 commits
- Use 'git cat-file blob' instead of 'git show' · 39a4a3ea
  Jacob Vosmaer authored Feb 01, 2016
```
This is meant as an extra layer of defense against untrusted user
input.
```
  39a4a3ea
- Merge branch 'patch-1' into 'master' · e5250487
  Jacob Vosmaer authored Feb 01, 2016
```
Fixed typo



See merge request !35
```
  e5250487
- Use some more constants · a6655446
  Jacob Vosmaer authored Feb 01, 2016
  
  a6655446
- Assert internal header removal · a3fc1506
  Jacob Vosmaer authored Feb 01, 2016
  
  a3fc1506
- Add test for sending Git blobs · b6dcb7c8
  Jacob Vosmaer authored Feb 01, 2016
  
  b6dcb7c8
- Forgot to use blob prefix constant · 90d98776
  Jacob Vosmaer authored Feb 01, 2016
  
  90d98776
- Fixed typo · 9f741495
  Marco Vito Moscaritolo authored Feb 01, 2016
  
  9f741495
- Use only one header to send git blobs · 825e2afb
  Jacob Vosmaer authored Feb 01, 2016
  
  825e2afb
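
As an illustration of the "extra layer of defense" idea from 39a4a3ea above: a
hedged sketch, not the workhorse code. `git cat-file blob` only prints blob
contents, whereas `git show` accepts many kinds of revisions and formats them.
The additional check that the ID is a 40-character lowercase hex string, and the
names `blobIDPattern` and `blobCommand`, are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"regexp"
)

// blobIDPattern accepts only a full SHA-1 object ID in lowercase hex.
var blobIDPattern = regexp.MustCompile(`^[0-9a-f]{40}$`)

// blobCommand prepares `git --git-dir=<dir> cat-file blob <id>` and rejects
// anything that does not look like an object ID, e.g. option-like strings.
func blobCommand(gitDir, blobID string) (*exec.Cmd, error) {
	if !blobIDPattern.MatchString(blobID) {
		return nil, errors.New("invalid blob ID")
	}
	return exec.Command("git", "--git-dir="+gitDir, "cat-file", "blob", blobID), nil
}

func main() {
	// Both the repository path and the IDs are made-up example values.
	for _, id := range []string{
		"0123456789abcdef0123456789abcdef01234567",
		"--some-option",
	} {
		cmd, err := blobCommand("/path/to/repo.git", id)
		if err != nil {
			fmt.Printf("%q rejected: %v\n", id, err)
			continue
		}
		fmt.Printf("%q accepted: %v\n", id, cmd.Args)
	}
}
```
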
28 Jan, 2016 1 commit
- Use blob IDs, not commit IDs · 0a5bd0fe
  Jacob Vosmaer authored Jan 28, 2016
  
  0a5bd0fe
26 Jan, 2016 8 commits
- Version 0.6.2 · 7a8ab7a2
  Jacob Vosmaer authored Jan 26, 2016
  
  7a8ab7a2
- Merge branch 'fix/metadata-full-structure' into 'master' · 205486e3
  Jacob Vosmaer authored Jan 26, 2016
```
Add missing entries in build artifacts archive metadata

We need full directory structure, but since ZIP does not require it, we
need to calculate missing entries for directories when we generate
archive metadata.

Closes gitlab-org/gitlab-ce#12634

cc @jacobvosmaer @ayufan 

See merge request !34
```
  205486e3
- Revert break when traversing directories for zip entries · d4ee3a82
  Grzegorz Bizon authored Jan 26, 2016
  
  d4ee3a82
- Extend specs for metadata, improve internal loop · bb374fbd
  Grzegorz Bizon authored Jan 26, 2016
  
  bb374fbd
- Add specs for generated artifacts metadata paths · d76b7599
  Grzegorz Bizon authored Jan 26, 2016
  
  d76b7599
- Do not make assumptions about non existing entries in artifacts · 53e345d2
  Grzegorz Bizon authored Jan 26, 2016
  
  53e345d2
- Sort metadata entries before passing to IO buffer · 57c3a8a8
  Grzegorz Bizon authored Jan 26, 2016
  
  57c3a8a8
- Refactor function that generates artifacts metadata · f1d320ed
  Grzegorz Bizon authored Jan 26, 2016
  
  f1d320ed
25 Jan, 2016 6 commits
- Fixup merge · 16dc2cea
  Jacob Vosmaer authored Jan 25, 2016
  
  16dc2cea
- Merge branch 'master' of https://gitlab.com/gitlab-org/gitlab-workhorse into raw-blob · 51d1256c
  Jacob Vosmaer authored Jan 25, 2016
  
  51d1256c
- Allocate map with initial size that is eq to entries size · 3353d2e7
  Grzegorz Bizon authored Jan 25, 2016
  
  3353d2e7
- Use Go `map` instead of iterating entries multiple times · 524fda2f
  Grzegorz Bizon authored Jan 25, 2016
  
  524fda2f
- Add missing entries in build artifacts archive metadata · eebe7a29
  Grzegorz Bizon authored Jan 25, 2016
```
We need full directory structure, but since ZIP does not require it, we
need to calculate missing entries for directories when we generate
archive metadata.
```
  eebe7a29
- Improve error messages · 4fe063cf
  Jacob Vosmaer authored Jan 25, 2016
  
  4fe063cf
22 Jan, 2016 3 commits
- Merge branch 'go-1.5.3' into 'master' · 1bb40cdf
  Jacob Vosmaer authored Jan 22, 2016
```
Use Go 1.5.3



See merge request !33
```
  1bb40cdf
- Merge branch 'destdir' into 'master' · 257fe7c4
  Jacob Vosmaer authored Jan 22, 2016
```
Add DESTDIR support to "make install"

Fixes https://gitlab.com/gitlab-org/gitlab-workhorse/issues/8

See merge request !27
```
  257fe7c4
- Use Go 1.5.3 · 9c974f0d
  Jacob Vosmaer authored Jan 22, 2016
  
  9c974f0d
21 Jan, 2016 4 commits
- Merge branch 'master' into destdir · 34a9138d
  Jacob Vosmaer authored Jan 21, 2016
  
  34a9138d
- Version 0.6.1 · 4464eb25
  Jacob Vosmaer authored Jan 21, 2016
  
  4464eb25
- Update changelog for 0.6.1 · 56e1512a
  Jacob Vosmaer authored Jan 21, 2016
  
  56e1512a
- Merge branch 'zip-subprocess' into 'master' · d40c6979
  Jacob Vosmaer authored Jan 21, 2016
```
Use gitlab-zip-cat to send zip entries

Fixes https://gitlab.com/gitlab-org/gitlab-workhorse/issues/17

See merge request !31
```
  d40c6979