Commit 2db542c5 authored by Gabriel Mazetto's avatar Gabriel Mazetto

Added file storage documentation and updated hash storage one

parent 2f644452
......@@ -9,7 +9,7 @@ mapping structure from the projects URLs:
* Project's repository: `#{namespace}/#{project_name}.git`
* Project's wiki: `#{namespace}/#{project_name}.wiki.git`
This structure made simple to migrate from existing solutions to GitLab and easy for Administrators to find where the
repository is stored.
......@@ -27,7 +27,7 @@ of load in big installations, and can be even worst if they are using any type o
Last, for GitLab Geo, this storage type means we have to synchronize the disk state, replicate renames in the correct
order or we may end-up with wrong repository or missing data temporarily.
This pattern also exists in other objects stored in GitLab, like issue Attachments, GitLab Pages artifacts,
This pattern also exists in other objects stored in GitLab, like issue Attachments, GitLab Pages artifacts,
Docker Containers for the integrated Registry, etc.
## Hashed Storage
......@@ -62,9 +62,9 @@ you will never mistakenly restore a repository in the wrong project (considering
### How to migrate to Hashed Storage
In GitLab, go to **Admin > Settings**, find the **Repository Storage** section and select
In GitLab, go to **Admin > Settings**, find the **Repository Storage** section and select
"_Create new projects using hashed storage paths_".
To migrate your existing projects to the new storage type, check the specific [rake tasks].
[ce-28283]: https://gitlab.com/gitlab-org/gitlab-ce/issues/28283
......@@ -79,14 +79,14 @@ coverage status below.
Note that things stored in an S3 compatible endpoint will not have the downsides mentioned earlier, if they are not
prefixed with `#{namespace}/#{project_name}`, which is true for CI Cache and LFS Objects.
| Storable Object | Legacy Storage | Hashed Storage | S3 Compatible | GitLab Version |
| ----------------| -------------- | -------------- | ------------- | -------------- |
| Storable Object | Legacy Storage | Hashed Storage | S3 Compatible | GitLab Version |
| --------------- | -------------- | -------------- | ------------- | -------------- |
| Repository | Yes | Yes | - | 10.0 |
| Attachments | Yes | Yes | - | 10.2 |
| Avatars | Yes | No | - | - |
| Avatars | Yes | No | - | - |
| Pages | Yes | No | - | - |
| Docker Registry | Yes | No | - | - |
| CI Build Logs | No | No | - | - |
| CI Artifacts | No | No | - | - |
| CI Build Logs | No | No | - | - |
| CI Artifacts | No | No | Yes (EEP) | - |
| CI Cache | No | No | Yes | - |
| LFS Objects | Yes | No | Yes (EEP) | - |
| LFS Objects | Yes | No | Yes (EEP) | - |
# File Storage in GitLab
We use the [CarrierWave] gem to handle file upload, store and retrieval.
There are many places where file uploading is used, according to contexts:
* System
- Instance Logo (logo visible in sign in/sign up pages)
- Header Logo (one displayed in the navigation bar)
* Group
- Group avatars
* User
- User avatars
- User snippet attachments
* Project
- Project avatars
- Issues/MR Markdown attachments
- Issues/MR Legacy Markdown attachments
- CI Build Artifacts
- LFS Objects
## Disk storage
GitLab started saving everything on local disk. While directory location changed from previous versions,
they are still not 100% standardized. You can see them below:
| Description | In DB? | Relative path | Uploader class | model_type |
| ------------------------------------- | ------ | ----------------------------------------------------------- | ---------------------- | ---------- |
| Instance logo | yes | uploads/-/system/appearance/logo/:id/:filename | `AttachmentUploader` | Appearance |
| Header logo | yes | uploads/-/system/appearance/header_logo/:id/:filename | `AttachmentUploader` | Appearance |
| Group avatars | yes | uploads/-/system/group/avatar/:id/:filename | `AvatarUploader` | Group |
| User avatars | yes | uploads/-/system/user/avatar/:id/:filename | `AvatarUploader` | User |
| User snippet attachments | yes | uploads/-/system/personal_snippet/:id/:random_hex/:filename | `PersonalFileUploader` | Snippet |
| Project avatars | yes | uploads/-/system/project/avatar/:id/:filename | `AvatarUploader` | Project |
| Issues/MR Markdown attachments | yes | uploads/:project_path_with_namespace/:random_hex/:filename | `FileUploader` | Project |
| Issues/MR Legacy Markdown attachments | no | uploads/-/system/note/attachment/:id/:filename | `AttachmentUploader` | Note |
| CI Artifacts (CE) | yes | shared/artifacts/:year_:month/:project_id/:id | `ArtifactUploader` | Ci::Build |
| LFS Objects (CE) | yes | shared/lfs-objects/:hex/:hex/:object_hash | `LfsObjectUploader` | LfsObject |
CI Artifacts and LFS Objects behave differently in CE and EE. In CE they inherit the `GitlabUploader`
while in EE they inherit the `ObjectStoreUploader` and store files in and S3 API compatible object store.
In the case of Issues/MR Markdown attachments, there is a different approach using the [Hashed Storage] layout,
instead of basing the path into a mutable variable `:project_path_with_namespace`, it's possible to use the
hash of the project ID instead, if project migrates to the new approach (introduced in 10.2).
[CarrierWave]: https://github.com/carrierwaveuploader/carrierwave
[Hashed Storage]: ../administration/repository_storage_types.md
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment