1. 09 Dec, 2015 2 commits
    • blob: Cache auth backend reply for 30s · 2beb8c95
      Kirill Smelkov authored
      In the previous patch we added code to serve blob content by running `git cat-file
      ...` directly, but every such request still triggers a request to the slow RoR-based auth
      backend, which is bad for performance.
      
      Let's cache the auth backend reply for a short period of time, e.g. 30 seconds, which
      changes the situation dramatically:
      
      If we have a lot of requests to the same repository, we query the auth backend only
      for every Nth request. With e.g. 100 raw blob requests/s and a 30-second cache,
      N = 100 * 30 = 3000, which means that the previous load on the RoR code essentially
      goes away.
      
      On the other hand, since we still query the auth backend once in a while and refresh the
      cache, we will not miss changes in project settings. A delay of
      e.g. 25 seconds for a project to become public, or vice versa to become
      private, does no real harm.
      
      The cache is designed to let the read-side codepath execute in
      parallel and not be blocked by eventual cache updates.
      
      Overall this improves performance a lot:
      
        (on an 8-CPU i7-3770S with 16GB of RAM)
      
        # request goes to gitlab-workhorse with the following added to nginx conf
        # location ~ ^/[\w\.-]+/[\w\.-]+/raw/ {
        #   error_page 418 = @gitlab-workhorse;
        #   return 418;
        # }
        # but without auth caching
        $ ./wrk -c40 -d10 -t1 --latency https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
        Running 10s test @ https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
          1 threads and 40 connections
          Thread Stats   Avg      Stdev     Max   +/- Stdev
            Latency   549.37ms  220.53ms   1.69s    84.74%
            Req/Sec    71.01     25.49   160.00     70.71%
          Latency Distribution
             50%  514.66ms
             75%  584.32ms
             90%  767.46ms
             99%    1.37s
          709 requests in 10.01s, 1.26MB read
        Requests/sec:     70.83
        Transfer/sec:    128.79KB
      
        # request goes to gitlab-workhorse with auth caching (this patch)
        $ ./wrk -c40 -d10 -t1 --latency https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
        Running 10s test @ https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
          1 threads and 40 connections
          Thread Stats   Avg      Stdev     Max   +/- Stdev
            Latency    35.18ms   20.78ms 291.34ms   72.79%
            Req/Sec     1.18k   135.34     1.34k    88.00%
          Latency Distribution
             50%   33.96ms
             75%   44.35ms
             90%   58.70ms
             99%  104.76ms
          11704 requests in 10.01s, 20.78MB read
        Requests/sec:   1169.75
        Transfer/sec:      2.08MB
      
      i.e. it is a ~17x improvement.
    • Teach gitlab-workhorse to serve requests to get raw blobs · 1b274d0d
      Kirill Smelkov authored
      Currently GitLab serves requests to get raw blobs via Ruby-on-Rails code and
      Unicorn. Because RoR/Unicorn is relatively heavyweight, in an environment where
      there are a lot of simultaneous requests to get raw blobs this works very slowly
      and the server is constantly overloaded.
      
      On the other hand, to get raw blob content we do not need anything from the RoR
      framework: we only need access to the project's git repository on the filesystem
      and to know whether access to the data there should be granted.
      That means it is possible to adjust the Nginx frontend to route '.../raw/....'
      requests to a more lightweight and performant program that does this particular
      task, and that will be a net win.
      
      As gitlab-workhorse is written in Go, and Go has good concurrency/parallelism
      support and is generally much faster than Ruby, adding the raw blob serving task to
      it makes sense.
      
      In this patch we add infrastructure to process GET requests for '/raw/...':
      
      - extract project / ref and path from URL
      - query auth backend for whether download access should be granted or not
      - emit blob content via spawning external `git cat-file`
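
The first and last steps above can be sketched as follows. This is an illustration under stated assumptions, not the code from this patch: the helper names `splitRawURL`, `refPathCandidates` and `catFileCmd` and the `/repos/...` repository location are hypothetical, the auth backend query is left out, and the git command is only built, not run. Because a ref may itself contain slashes, the sketch lists every possible ref/path split instead of guessing one.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// splitRawURL splits "/<namespace>/<project>/raw/<ref>/<path>" into the
// project ("<namespace>/<project>") and the remaining "<ref>/<path>" part.
func splitRawURL(urlPath string) (project, refPath string, ok bool) {
	parts := strings.SplitN(strings.TrimPrefix(urlPath, "/"), "/", 4)
	if len(parts) != 4 || parts[2] != "raw" || parts[3] == "" {
		return "", "", false
	}
	return parts[0] + "/" + parts[1], parts[3], true
}

// refPathCandidates lists every way to cut "<ref>/<path>" at a slash,
// since a ref such as "feature/x" contains slashes itself.
func refPathCandidates(refPath string) [][2]string {
	var out [][2]string
	for i := 1; i < len(refPath)-1; i++ {
		if refPath[i] == '/' {
			out = append(out, [2]string{refPath[:i], refPath[i+1:]})
		}
	}
	return out
}

// catFileCmd builds (but does not start) `git cat-file blob <ref>:<path>`
// for the given repository directory.
func catFileCmd(repoDir, ref, path string) *exec.Cmd {
	return exec.Command("git", "--git-dir="+repoDir, "cat-file", "blob", ref+":"+path)
}

func main() {
	url := "/root/slapos/raw/master/software/wendelin/software.cfg"
	project, refPath, ok := splitRawURL(url)
	if !ok {
		fmt.Println("not a raw blob URL")
		return
	}
	fmt.Println("project:", project)
	for _, c := range refPathCandidates(refPath) {
		// "/repos/<project>.git" is a made-up location for this example.
		fmt.Println(catFileCmd("/repos/"+project+".git", c[0], c[1]).Args)
	}
}
```

A server would try the candidates in order and stream the output of the first `git cat-file` that succeeds, after the auth backend has granted access.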
      
      I've tried to make the output as close as possible to the one emitted by the RoR code,
      with the idea that the change should be transparent for users.
      
      As in this patch we query the auth backend for every request to get a blob, the RoR
      code is still heavily loaded, so essentially there is no speedup yet:
      
        (on an 8-CPU i7-3770S with 16GB of RAM)
      
        # request goes to unicorn  (9 unicorn workers)
        $ ./wrk -c40 -d10 -t1 --latency https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
        Running 10s test @ https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
          1 threads and 40 connections
          Thread Stats   Avg      Stdev     Max   +/- Stdev
            Latency   553.06ms  166.39ms   1.29s    80.06%
            Req/Sec    69.53     23.12   140.00     71.72%
          Latency Distribution
             50%  525.41ms
             75%  615.63ms
             90%  774.48ms
             99%    1.05s
          695 requests in 10.02s, 1.38MB read
        Requests/sec:     69.38
        Transfer/sec:    141.47KB
      
        # request goes to gitlab-workhorse with the following added to nginx conf
        # location ~ ^/[\w\.-]+/[\w\.-]+/raw/ {
        #   error_page 418 = @gitlab-workhorse;
        #   return 418;
        # }
        $ ./wrk -c40 -d10 -t1 --latency https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
        Running 10s test @ https://[2001:67c:1254:e:8b::c776]:7777/root/slapos/raw/master/software/wendelin/software.cfg
          1 threads and 40 connections
          Thread Stats   Avg      Stdev     Max   +/- Stdev
            Latency   549.37ms  220.53ms   1.69s    84.74%
            Req/Sec    71.01     25.49   160.00     70.71%
          Latency Distribution
             50%  514.66ms
             75%  584.32ms
             90%  767.46ms
             99%    1.37s
          709 requests in 10.01s, 1.26MB read
        Requests/sec:     70.83
        Transfer/sec:    128.79KB
      
      In the next patch we'll cache replies from the auth backend, which will improve
      performance dramatically.
  2. 08 Dec, 2015 4 commits
  3. 07 Dec, 2015 3 commits
  4. 02 Dec, 2015 1 commit
  5. 29 Nov, 2015 2 commits
  6. 25 Nov, 2015 1 commit
  7. 24 Nov, 2015 3 commits
    • Merge branch 'y/gitcommand-path' into 'master' · 0d0bd209
      Jacob Vosmaer authored
      gitCommand: Pass $HOME to git as well
      
      Git has 3 places for configs:
      
         - system
         - global (per user), and
         - local  (per repository)
      
      System config location is hardcoded at git compile time (usually to
      $prefix/etc/gitconfig). Local configuration is usually picked up because we
      pass --git-dir to the subcommand. But global configuration is currently not
      picked up at all, because the HOME env variable is not passed to git.
      
      Pass $HOME through and let git see its "global" config.
      
      Currently GitLab omnibus stores gitlab user name/email + "autocrlf =
      true" in global config, so missing it should not be a blocker for
      receive/send-pack operations. But having it is more correct and can be
      handy in the future if/when more git operations are done from under
      gitlab-workhorse.
      
      Having $HOME properly set is also needed when one cannot change the system
      git config and has to put site-wide configuration into the global git
      config under $HOME.
      
      That was the case I've hit and the reason for this patch.
      
      See merge request !10
    • gitCommand: Pass $HOME to git as well · b5f1b803
      Kirill Smelkov authored
      Git has 3 places for configs:
      
          - system
          - global (per user), and
          - local  (per repository)
      
      System config location is hardcoded at git compile time (usually to
      $prefix/etc/gitconfig). Local configuration is usually picked up because we
      pass --git-dir to the subcommand. But global configuration is currently not
      picked up at all, because the HOME env variable is not passed to git.
      
      Pass $HOME through and let git see its "global" config.
      
      Currently GitLab omnibus stores gitlab user name/email + "autocrlf =
      true" in global config, so missing it should not be a blocker for
      receive/send-pack operations. But having it is more correct and can be
      handy in the future if/when more git operations are done from under
      gitlab-workhorse.
      
      Having $HOME properly set is also needed when one cannot change the system
      git config and has to put site-wide configuration into the global git
      config under $HOME.
      
      That was the case I've hit and the reason for this patch.
    • Merge branch 'shared-runner-tests' into 'master' · 15f3268d
      Jacob Vosmaer authored
      Download and install go before running tests
      
      This change allows the build to run on 'shared runners'.
      
      See merge request !13
  8. 23 Nov, 2015 3 commits
  9. 19 Nov, 2015 1 commit
  10. 16 Nov, 2015 3 commits
  11. 13 Nov, 2015 2 commits
    • Release 0.4.1 · c75879e6
      Kamil Trzcinski authored
    • Merge branch 'artifacts' into 'master' · 199976a4
      Kamil Trzciński authored
      Implement multipart form rewriting to support upload offloading
      
      1. This parses multipart form data and saves every file it finds into TempPath, which is received from Rails by calling the authorize request. The rewritten multipart form data contains `file.path`, the location where the temporary file is stored, and `file.name`, the original file name as given in Content-Disposition. The temporary file is removed afterwards if it is not consumed by GitLab Rails. If the body is not multipart, the request is forwarded.
      
      2. All artifact downloads are offloaded by exposing the X-Sendfile-Type extension.
      
      
      See merge request !5
  12. 12 Nov, 2015 3 commits
  13. 10 Nov, 2015 12 commits