Commit ca500e64 authored by Marcel Amirault

Merge branch 'sh-ci-pre-clone-docs' into 'master'

Document pre-clone step on GitLab.com

Closes #118468

See merge request gitlab-org/gitlab!23579
parents 56ba4736 531c9335
......@@ -489,6 +489,71 @@ for more information.
Consult the [Review Apps](testing_guide/review_apps.md) dedicated page for more information.
## Pre-clone step
The `gitlab-org/gitlab` project on GitLab.com uses a [pre-clone step](https://gitlab.com/gitlab-org/gitlab/issues/39134)
to seed the project with a recent archive of the repository. This is done for
several reasons:
- It speeds up builds because an 800 MB download only takes seconds, as opposed to a full Git clone.
- It significantly reduces load on the file server, as smaller deltas mean less time spent in `git pack-objects`.
The pre-clone step works by using the `CI_PRE_CLONE_SCRIPT` variable
[defined by GitLab.com shared runners](../user/gitlab_com/index.md#pre-clone-script).
The `CI_PRE_CLONE_SCRIPT` is currently defined as a project CI/CD
variable:
```shell
echo "Downloading archived master..."
wget -O /tmp/gitlab.tar.gz https://storage.googleapis.com/gitlab-ci-git-repo-cache/project-278964/gitlab-master.tar.gz
if [ ! -f /tmp/gitlab.tar.gz ]; then
    echo "Repository cache not available, cloning a new directory..."
    exit
fi
rm -rf $CI_PROJECT_DIR
echo "Extracting tarball into $CI_PROJECT_DIR..."
mkdir -p $CI_PROJECT_DIR
cd $CI_PROJECT_DIR
tar xzf /tmp/gitlab.tar.gz
rm -f /tmp/gitlab.tar.gz
chmod a+w $CI_PROJECT_DIR
```
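One caveat worth sketching: `wget -O` creates the output file before any data arrives, so a failed download may leave an empty file behind that still passes a `-f` test. The helper below is a minimal sketch of a stricter variant, not the script configured on GitLab.com. The function name `seed_from_archive` and its two arguments are hypothetical; it checks that the archive is non-empty with `-s` before replacing the target directory.

```shell
# Hypothetical helper (not part of the GitLab.com configuration):
# seed a build directory from a local archive.
# Returns 1 without touching the directory if the archive is missing or empty.
seed_from_archive() {
  archive="$1"
  target_dir="$2"

  # Refuse to run with an empty target, so "rm -rf" never sees an empty path.
  if [ -z "$target_dir" ]; then
    echo "No target directory given" >&2
    return 1
  fi

  # -s is true only when the file exists and is larger than zero bytes.
  if [ ! -s "$archive" ]; then
    echo "Repository cache not available, falling back to a regular clone..."
    return 1
  fi

  rm -rf "$target_dir"
  mkdir -p "$target_dir"
  # -C extracts into the target directory without changing the current one.
  tar xzf "$archive" -C "$target_dir"
  chmod a+w "$target_dir"
}
```

A pre-clone script could call it as `seed_from_archive /tmp/gitlab.tar.gz "$CI_PROJECT_DIR"` after the download and ignore a non-zero return, so that the job still falls back to a normal fetch.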
The first step of the script downloads `gitlab-master.tar.gz` from
Google Cloud Storage. There is a [GitLab CI job named `cache-repo`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/cache-repo.gitlab-ci.yml#L5)
that is responsible for keeping that archive up-to-date. Every two hours
on a scheduled pipeline, it does the following:
1. Creates a fresh clone of the `gitlab-org/gitlab` repository on GitLab.com.
1. Saves the data as a `.tar.gz`.
1. Uploads it into the Google Cloud Storage bucket.
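The three steps above can be sketched in shell roughly as follows. This is an illustration only, not a copy of the real `cache-repo` job: the function names `build_repo_archive` and `upload_archive` are hypothetical, and the upload assumes that `gsutil` is installed and already authenticated.

```shell
# Hypothetical helper: clone a repository and pack the working copy,
# including its .git directory, into a .tar.gz archive.
build_repo_archive() {
  repo_url="$1"
  output="$2"
  workdir="$(mktemp -d)"

  # Step 1: create a fresh clone.
  if ! git clone --quiet "$repo_url" "$workdir/repo"; then
    rm -rf "$workdir"
    return 1
  fi

  # Step 2: save the data as a .tar.gz.
  tar czf "$output" -C "$workdir/repo" .
  status=$?

  rm -rf "$workdir"
  return "$status"
}

# Hypothetical helper: step 3, upload the archive to a bucket path.
# Assumes gsutil is installed and has credentials for the bucket.
upload_archive() {
  archive="$1"
  destination="$2"   # for example gs://<bucket>/<path>/gitlab-master.tar.gz
  gsutil cp "$archive" "$destination"
}
```

A scheduled job might then run `build_repo_archive` against the project URL and pass the resulting file to `upload_archive`.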
When a CI job runs with this configuration, you'll see something like
this:
```shell
$ eval "$CI_PRE_CLONE_SCRIPT"
Downloading archived master...
Extracting tarball into /builds/group/project...
Fetching changes...
Reinitialized existing Git repository in /builds/group/project/.git/
```
The `Reinitialized existing Git repository` message shows that
the pre-clone step worked: the build directory already contained a
repository when the runner ran `git init`. That `git init` run
overwrites the Git configuration with the appropriate settings to fetch
from the GitLab repository.
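To see where that message comes from, you can reproduce it locally. The sketch below uses a throwaway directory (`demo_dir` is just an illustrative name) and runs `git init` twice. The second run finds an existing `.git` directory and reports a reinitialization, which is what the runner's `git init` reports after the tarball has been extracted.

```shell
# Create a scratch directory and initialize a repository in it.
demo_dir="$(mktemp -d)"
git init --quiet "$demo_dir"

# Running "git init" again on the same directory reports a reinitialization.
# LC_ALL=C keeps the message in English regardless of the local locale.
reinit_output="$(LC_ALL=C git init "$demo_dir")"
echo "$reinit_output"

rm -rf "$demo_dir"
```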
`CI_REPO_CACHE_CREDENTIALS` contains the Google Cloud service account
JSON for uploading to the `gitlab-ci-git-repo-cache` bucket. These
credentials are stored in the 1Password GitLab.com Production vault.
Note that this bucket should be located in the same continent as the
runner, or [network egress charges will apply](https://cloud.google.com/storage/pricing).
---
[Return to Development documentation](README.md)
......@@ -144,6 +144,26 @@ Below are the shared Runners settings.
| Default Docker image | `ruby:2.5` | - |
| `privileged` (run [Docker in Docker](https://hub.docker.com/_/docker/)) | `true` | `false` |
#### Pre-clone script
Linux Shared Runners on GitLab.com provide a way to run commands in a CI
job before the Runner attempts to run `git init` and `git fetch` to
download a GitLab repository. The
[pre_clone_script](https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-section)
can be used for:
- Seeding the build directory with repository data
- Sending a request to a server
- Downloading assets from a CDN
- Any other commands that must run before the `git init`
To use this feature, define a [CI/CD variable](../../ci/variables/README.md#via-the-ui) called
`CI_PRE_CLONE_SCRIPT` that contains a Bash script.
[This example](../../development/pipelines.md#pre-clone-step)
demonstrates how you might use a pre-clone step to seed the build
directory.
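The shared runners pass the variable's contents to `eval`, so whatever the variable holds runs in the same shell before the repository is fetched. The following sketch imitates that locally. The script body and the `seeded` variable are made up for illustration; only the `eval "$CI_PRE_CLONE_SCRIPT"` call mirrors the runner configuration.

```shell
# Illustrative variable content; on GitLab.com this would come from the
# project's CI/CD variable settings.
CI_PRE_CLONE_SCRIPT='echo "Seeding build directory..."
seeded=yes'

# The same call the runner makes before "git init" and "git fetch".
eval "$CI_PRE_CLONE_SCRIPT"
echo "seeded=$seeded"

# When the variable is empty or unset, eval runs nothing and succeeds,
# so projects that do not define it are unaffected.
CI_PRE_CLONE_SCRIPT=''
eval "$CI_PRE_CLONE_SCRIPT"
empty_status=$?
echo "exit status with empty script: $empty_status"
```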
#### `config.toml`
The full contents of our `config.toml` are:
......@@ -164,6 +184,7 @@ sentry_dsn = "X"
request_concurrency = X
url = "https://gitlab.com/"
token = "SHARED_RUNNER_TOKEN"
pre_clone_script = "eval \"$CI_PRE_CLONE_SCRIPT\""
executor = "docker+machine"
environment = [
"DOCKER_DRIVER=overlay2",
......