- 12 Sep, 2018 2 commits
-
-
Alain Takoudjou authored
helper method: helper.Fail500 now requires 3 arguments; the second argument is now an *http.Request. See: 3aaadb1b
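A call site after this change might look like the following minimal sketch (the handler and error value are illustrative; only the Fail500 signature comes from the commit):

    package main

    import (
        "errors"
        "net/http"

        "gitlab.com/gitlab-org/gitlab-workhorse/internal/helper"
    )

    func handler(w http.ResponseWriter, r *http.Request) {
        // Before: helper.Fail500(w, err)
        // After: the *http.Request is passed as the second argument,
        // presumably so the helper can log request context.
        helper.Fail500(w, r, errors.New("something went wrong"))
    }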
-
Alain Takoudjou authored
update patches to import packages from gitlab.com/gitlab-org/gitlab-workhorse/internal instead of locally, as the way gitlab-workhorse is installed has changed
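Illustratively (the "before" form is hypothetical; the exact old path depends on how the source tree was laid out):

    import (
        // before: a package imported from the local source tree, e.g.
        //   "gitlab-workhorse/internal/helper"
        // after: the canonical path under the new install scheme:
        "gitlab.com/gitlab-org/gitlab-workhorse/internal/helper"
    )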
-
- 23 Aug, 2018 6 commits
-
-
Kirill Smelkov authored
@rafael approached me and asked why URLs like https://gitlab-ci-token:XXX@hostname/group/project/raw/master/file work in curl, but not in Chrome for AJAX requests. After investigation it turned out they do not work in wget either, giving a 302 redirect to http://localhost:8080/users/sign_in:

    kirr@deco:~$ wget https://gitlab-ci-token:XXX@lab.nexedi.com/kirr/test/raw/master/hello.txt
    --2018-06-04 13:14:04--  https://gitlab-ci-token:*password*@lab.nexedi.com/kirr/test/raw/master/hello.txt
    Resolving lab.nexedi.com (lab.nexedi.com)... 176.31.129.213, 85.118.38.162
    Connecting to lab.nexedi.com (lab.nexedi.com)|176.31.129.213|:443... connected.
    HTTP request sent, awaiting response... 302 Found
    Location: http://localhost:8080/users/sign_in [following]
    --2018-06-04 13:14:04--  http://localhost:8080/users/sign_in
    Resolving localhost (localhost)... 127.0.0.1, ::1
    Connecting to localhost (localhost)|127.0.0.1|:8080... failed: Connection refused.
    Connecting to localhost (localhost)|::1|:8080... failed: Connection refused.

The reason is that most clients (in accordance with RFC 2617 / RFC 7617) first send the request without the Authorization header and retry with that header only if the server challenges them to (*), while our authorization code was handling HTTP Basic auth only when the Authorization header was already provided, without ever issuing a challenge on the server side.

Fix it by checking the Rails backend reply for 302, which it gives for unauthorized non-raw requests, and converting that on our side into an HTTP Basic auth challenge if the raw request does not carry any token. This way user:password in URLs now works for both wget and Chrome. If any token was provided we leave the Rails auth response as is, because we handle user/password only for the "no token provided at all" case.

(*) see https://en.wikipedia.org/wiki/Basic_access_authentication for an overview.

/cc @alain.takoudjou, @jerome
/reviewed-on !2
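A minimal sketch of the server-side conversion described above (names and shapes are illustrative, not the actual workhorse code):

    package main

    import "net/http"

    // If the auth backend answered 302 (redirect to sign-in) and the raw
    // request carried no token at all, reply 401 with a WWW-Authenticate
    // challenge so clients retry with the Authorization header set.
    func maybeChallengeBasicAuth(w http.ResponseWriter, backendStatus int, haveToken bool) bool {
        if backendStatus == http.StatusFound && !haveToken {
            w.Header().Set("WWW-Authenticate", `Basic realm="GitLab"`)
            w.WriteHeader(http.StatusUnauthorized)
            return true // challenge sent; the client retries with credentials
        }
        return false // leave the backend response as is
    }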
-
Kirill Smelkov authored
Adjust the test, because the download-archive format has changed (see the fixup to the first patch in the nxd series) while `git fetch` expects the old way.
-
Kirill Smelkov authored
[ Not sent upstream. The patch was not sent upstream because the previous 2 raw blob patches were not accepted (see details there). OTOH it is very handy in the SlapOS environment to use CI token auth for raw downloading, so just carry it with us as NXD. ]

There are cases when using user:password for /raw/... access is handy:

- when using a query parameter for auth (private_token) is not convenient for some reason (e.g. the client processing software does not handle queries well when generating URLs);

- when we do not want to organize many artificial users and use their tokens, but instead just use the per-project, automatically set up gitlab-ci-token : <ci-token> artificial user & "password", which is already handled by the auth backend for `git fetch` requests.

Handling is easy: if the main auth backend rejects access and there is user:password in the original request, we retry asking the auth backend the same way `git fetch` would. Access is granted if either of the two ways of asking the auth backend succeeds. This way both private tokens / cookies and HTTP auth are supported.
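A sketch of that double-ask logic, with hypothetical helper names:

    package main

    import "net/http"

    // askAuthBackend is a hypothetical stand-in for querying the Rails
    // auth backend in a given mode ("raw" download vs `git fetch` style).
    func askAuthBackend(r *http.Request, mode string) bool { /* ... */ return false }

    // Access is granted if either way of asking the auth backend succeeds.
    func authorizeRaw(r *http.Request) bool {
        if askAuthBackend(r, "raw") {
            return true // private token / cookie path
        }
        if _, _, ok := r.BasicAuth(); ok {
            // user:password present: retry the same way `git fetch` would.
            return askAuthBackend(r, "git-fetch")
        }
        return false
    }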
-
Kirill Smelkov authored
[ Sent upstream: https://gitlab.com/gitlab-org/gitlab-workhorse/merge_requests/17 This patch was sent upstream but was not accepted, for the "complexity" reason of the auth cache, despite the fact that it provides more than an order of magnitude speedup. Just carry it with us as NXD. ]

In the previous patch we added code to serve blob content by running `git cat-file ...` directly, but for every such request a request to the slow RoR-based auth backend is made, which is bad for performance.

Let's cache the auth backend reply for a small period of time, e.g. 30 seconds, which changes the situation dramatically: if we have a lot of requests to the same repository, we query the auth backend only for every Nth request; with a 30s cache and e.g. 100 raw blob requests/s, N=3000, which means the previous load on the RoR code essentially goes away. On the other hand, as we query the auth backend once in a while and refresh the cache, we will not miss potential changes in project settings. A potential e.g. 25 second delay for a project to become public, or vice versa to become private, does no real harm.

The cache is done with the idea of allowing the read-side codepath to execute in parallel, without being blocked by eventual cache updates.

Overall this improves performance a lot (on an 8-CPU i7-3770S with 16GB of RAM; 2001:67c:1254:e:8b::c776 is on localhost):

    # request is handled by gitlab-workhorse, but without auth caching
    $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
      1 threads and 40 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   458.42ms   66.26ms 766.12ms   84.76%
        Req/Sec    85.38     16.59   120.00     82.00%
      Latency Distribution
         50%  459.26ms
         75%  490.09ms
         90%  523.95ms
         99%  611.33ms
      853 requests in 10.01s, 1.51MB read
    Requests/sec:     85.18
    Transfer/sec:    154.90KB

    # request goes to gitlab-workhorse with auth caching (this patch)
    $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
      1 threads and 40 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    34.52ms   19.28ms 288.63ms   74.74%
        Req/Sec     1.20k   127.21     1.39k    85.00%
      Latency Distribution
         50%   32.67ms
         75%   42.73ms
         90%   56.26ms
         99%   99.86ms
      11961 requests in 10.01s, 21.24MB read
    Requests/sec:   1194.51
    Transfer/sec:      2.12MB

i.e. it is a ~14x improvement.
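The read-mostly cache could look like this minimal sketch (an assumed shape, not the actual NXD code): lookups run in parallel under a read lock; only refreshes take the write lock.

    package main

    import (
        "sync"
        "time"
    )

    type authEntry struct {
        ok      bool      // cached auth backend verdict
        expires time.Time // entry is stale after this
    }

    type authCache struct {
        mu sync.RWMutex
        m  map[string]authEntry // keyed e.g. by project
    }

    func newAuthCache() *authCache {
        return &authCache{m: make(map[string]authEntry)}
    }

    // lookup only holds the read lock, so many raw-blob requests can
    // check the cache in parallel without blocking each other.
    func (c *authCache) lookup(key string) (ok, hit bool) {
        c.mu.RLock()
        e, exists := c.m[key]
        c.mu.RUnlock()
        if !exists || time.Now().After(e.expires) {
            return false, false // miss or stale: ask the real backend, then store
        }
        return e.ok, true
    }

    func (c *authCache) store(key string, ok bool, ttl time.Duration) {
        c.mu.Lock()
        c.m[key] = authEntry{ok: ok, expires: time.Now().Add(ttl)}
        c.mu.Unlock()
    }

With the 30-second TTL from the commit message, a store would be e.g. cache.store(project, ok, 30*time.Second).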
-
Kirill Smelkov authored
During 0.6.4..0.6.5 upstream reworked the way the request about downloading an archive is replied to. Before, it was JSON in the body; after, it is JSON in the headers, handled via the so-called "senddata" workhorse mechanism: https://gitlab.com/gitlab-org/gitlab-workhorse/commit/153527fb Adjust our patch accordingly regarding asking whether it is ok to download from the repository or not.
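Roughly, the consumer side of "senddata" looks like this sketch (the header name matches the upstream mechanism; the exact encoding details here are assumptions):

    package main

    import (
        "encoding/base64"
        "fmt"
        "net/http"
        "strings"
    )

    // The backend reply carries "<kind>:<base64 JSON params>" in a header
    // instead of JSON in the response body.
    func parseSendData(resp *http.Response) (kind string, params []byte, err error) {
        v := resp.Header.Get("Gitlab-Workhorse-Send-Data")
        i := strings.IndexByte(v, ':')
        if i < 0 {
            return "", nil, fmt.Errorf("senddata: malformed header %q", v)
        }
        params, err = base64.URLEncoding.DecodeString(v[i+1:])
        return v[:i], params, err
    }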
-
Kirill Smelkov authored
[ Sent upstream: https://gitlab.com/gitlab-org/gitlab-workhorse/merge_requests/17 This patch was sent upstream but was not accepted, for the "complexity" reason of the auth cache (next patch), despite the fact that it provides more than an order of magnitude speedup. Just carry it with us as NXD. ]

Currently GitLab serves requests to get raw blobs via Ruby-on-Rails code and Unicorn. Because RoR/Unicorn is relatively heavyweight, in an environment with a lot of simultaneous requests for raw blobs this works very slowly and the server is constantly overloaded.

On the other hand, to get raw blob content we do not need anything from the RoR framework - we only need access to the project's git repository on the filesystem, and to know whether access to the data should be granted or not. That means it is possible to handle '.../raw/...' requests directly in the more lightweight and performant gitlab-workhorse. As gitlab-workhorse is written in Go, and Go has good concurrency/parallelism support and is generally much faster than Ruby, moving the raw blob serving task to it makes sense and should be a net win.

In this patch we add the infrastructure to process a GET request for '/raw/...':

- extract project / ref and path from the URL
- query the auth backend for whether download access should be granted or not
- emit blob content by spawning an external `git cat-file`

I've tried to mimic the output to be as close as possible to the one emitted by the RoR code, with the idea that for users the change should be transparent.

As in this patch we still query the auth backend for every blob request, the RoR code remains heavily loaded, so essentially there is no speedup yet (on an 8-CPU i7-3770S with 16GB of RAM; 2001:67c:1254:e:8b::c776 is on localhost):

    # without patch: request eventually goes to unicorn (9 unicorn workers)
    $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
      1 threads and 40 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   461.16ms   63.44ms 809.80ms   84.18%
        Req/Sec    84.84     17.02   131.00     80.00%
      Latency Distribution
         50%  460.21ms
         75%  492.83ms
         90%  524.67ms
         99%  636.49ms
      847 requests in 10.01s, 1.57MB read
    Requests/sec:     84.64
    Transfer/sec:    161.10KB

    # with this patch: request handled by gitlab-workhorse
    $ ./wrk -c40 -d10 -t1 --latency http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
    Running 10s test @ http://[2001:67c:1254:e:8b::c776]:7777/nexedi/slapos/raw/master/software/wendelin/software.cfg
      1 threads and 40 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   458.42ms   66.26ms 766.12ms   84.76%
        Req/Sec    85.38     16.59   120.00     82.00%
      Latency Distribution
         50%  459.26ms
         75%  490.09ms
         90%  523.95ms
         99%  611.33ms
      853 requests in 10.01s, 1.51MB read
    Requests/sec:     85.18
    Transfer/sec:    154.90KB

In the next patch we'll cache requests to the auth backend, and that will improve performance dramatically.

NOTE 20160228: there is internal/git/blob.go trying to get raw data via gitlab-workhorse, but still asking Unicorn about the blob->sha1 mapping etc. That work started in 86aaa133 (Prototype blobs via workhorse, @jacobvosmaer) and was inspired by this patch. It goes out of line compared to what we can do if we serve all blob data just from gitlab-workhorse (see next patch), so we avoid git/blob.go, put our stuff into git/xblob.go, and tweak the routes, essentially deactivating the git/blob.go code.
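The `git cat-file` step might look like this simplified sketch (the handler shape and error handling are illustrative; the real patch also mimics the RoR response headers):

    package main

    import (
        "net/http"
        "os/exec"
    )

    func serveRawBlob(w http.ResponseWriter, repoPath, ref, path string) {
        // `git cat-file blob <ref>:<path>` prints the raw blob to stdout.
        cmd := exec.Command("git", "--git-dir="+repoPath, "cat-file", "blob", ref+":"+path)
        cmd.Stdout = w // stream blob bytes straight to the client
        if err := cmd.Run(); err != nil {
            // Simplified: if streaming already started, the headers are out
            // and this 404 cannot be delivered; real code would check first.
            http.Error(w, "not found", http.StatusNotFound)
        }
    }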
-
- 20 Jan, 2017 2 commits
-
-
Nick Thomas authored
Release v1.3.0

See merge request !116
-
Nick Thomas authored
-
- 19 Jan, 2017 8 commits
-
-
Nick Thomas authored
Fix stalled HTTP fetches with large payloads

Closes #92, gitlab-ce#25916, and gitlab-com/infrastructure#941

See merge request !110
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
Jacob Vosmaer (GitLab) authored
Add helper.IsContentType and use it everywhere

Closes #90

See merge request !114
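One plausible shape for such a helper (assumed, not copied from the MR): compare only the media type, so parameters like charset do not break the comparison.

    package helper

    import "mime"

    // IsContentType reports whether actual's media type equals expected, e.g.
    // IsContentType("application/json", "application/json; charset=utf-8")
    // would be true.
    func IsContentType(expected, actual string) bool {
        mediaType, _, err := mime.ParseMediaType(actual)
        return err == nil && mediaType == expected
    }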
-
Nick Thomas authored
-
- 17 Jan, 2017 1 commit
-
-
Jacob Vosmaer authored
-
- 16 Jan, 2017 1 commit
-
-
Jacob Vosmaer authored
-
- 15 Jan, 2017 6 commits
-
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
Stan Hu authored
-
Nick Thomas authored
Add unit test for upload-pack handling

See merge request !111
-
- 13 Jan, 2017 9 commits
-
-
Stan Hu authored
-
Stan Hu authored
-
Jacob Vosmaer (GitLab) authored
Allow nested namespaces in git URLs

Closes #74

See merge request !80
-
Stan Hu authored
-
Stan Hu authored
-
Stan Hu authored
-
Stan Hu authored
-
Stan Hu authored
-
Stan Hu authored
For fetches over HTTP, Workhorse executes git-upload-pack and first attempts to send all the input data to stdin before reading from the stdout pipe. However, when the payload is large, the stdout pipe may fill up, causing git-upload-pack to stop reading from stdin. Workhorse is then deadlocked, since it is waiting to send more data to a buffer that will never be drained. An additional side effect is that git-upload-pack processes get left around; these processes are cleaned up only after Workhorse is restarted.

This fix modifies the git-upload-pack behavior to consume the entire HTTP input first, so that reading the data from stdout and sending the reply can be performed in a separate goroutine.

Closes #92
Closes gitlab-org/gitlab-ce#25916
Closes gitlab-com/infrastructure#941
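In outline, the fix amounts to the following sketch (simplified; names are illustrative): buffer the whole request body first, so feeding stdin can never block behind an unread stdout pipe.

    package main

    import (
        "bytes"
        "io"
        "net/http"
        "os/exec"
    )

    func handleUploadPack(w http.ResponseWriter, r *http.Request, repoPath string) error {
        // 1. Consume the entire HTTP input up front.
        var body bytes.Buffer
        if _, err := io.Copy(&body, r.Body); err != nil {
            return err
        }

        cmd := exec.Command("git", "upload-pack", "--stateless-rpc", repoPath)
        cmd.Stdin = &body // os/exec drains this buffer in its own goroutine
        stdout, err := cmd.StdoutPipe()
        if err != nil {
            return err
        }
        if err := cmd.Start(); err != nil {
            return err
        }

        // 2. Stream the reply; stdout is read independently of stdin, so a
        // full pipe can no longer deadlock git-upload-pack.
        if _, err := io.Copy(w, stdout); err != nil {
            return err
        }
        return cmd.Wait()
    }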
-
- 12 Jan, 2017 4 commits
-
-
Nick Thomas authored
-
Nick Thomas authored
-
Nick Thomas authored
Proxy GET /info/refs to Gitaly

See merge request !105
-
Ahmad Sherif authored
-
- 04 Jan, 2017 1 commit
-
-
Nick Thomas authored
Catch _all_ multipart NextPart() errors

Closes #89

See merge request !108
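The pattern in the title, as a short illustrative sketch: io.EOF from NextPart() means "no more parts"; every other error must be propagated, not swallowed.

    package main

    import (
        "io"
        "mime/multipart"
    )

    func readParts(r *multipart.Reader) error {
        for {
            part, err := r.NextPart()
            if err == io.EOF {
                return nil // clean end of the multipart stream
            }
            if err != nil {
                return err // catch _all_ other NextPart() errors
            }
            // ... process part ...
            part.Close()
        }
    }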
-