• Yorick Peterse's avatar
    Constrain environment deployments to project IDs · f04094a4
    Yorick Peterse authored
    When querying the deployments of an environment the query Rails produces
    will be along the lines of the following:
    
        SELECT *
        FROM deployments
        WHERE environment_id = X
    
    For queries such as this (or queries that use this as their base and add
    more conditions) there is no meaningful index that can be used as long
    as deployments.project_id is not part of a WHERE clause.
    
    To work around this we change that "has_many :deployments" relation to
    always add a "WHERE project_id = X" condition. This means that queries
    filtering deployments can make better use of the existing indexes. For
    example, when filtering by deployments.iid this will result in the
    following query:
    
        SELECT *
        FROM deployments
        WHERE environment_id = X
        AND project_id = Y
        AND iid = Z
    
    This means PostgreSQL can use the existing index on
    (project_id, environment_id, iid) instead of having to use a different
    index (or none at all) and having to scan over a large amount of data.
    
    Query plan wise this means that instead of this query and plan:
    
        EXPLAIN (BUFFERS, ANALYZE)
        SELECT deployments.*
        FROM deployments
        WHERE deployments.environment_id = 5
        AND deployments.iid = 225;
    
        Index Scan using index_deployments_on_project_id_and_iid on deployments  (cost=0.42..14465.75 rows=1 width=117) (actual time=6.394..38.048 rows=1 loops=1)
          Index Cond: (iid = 225)
          Filter: (environment_id = 5)
          Rows Removed by Filter: 839
          Buffers: shared hit=4534
        Planning time: 0.076 ms
        Execution time: 38.073 ms
    
    We produce the following query and plan:
    
        EXPLAIN (BUFFERS, ANALYZE)
        SELECT deployments.*
        FROM deployments
        WHERE deployments.environment_id = 5
        AND deployments.iid = 225
        AND deployments.project_id = 1292351;
    
        Index Scan using index_deployments_on_project_id_and_iid on deployments  (cost=0.42..4.45 rows=1 width=117) (actual time=0.018..0.018 rows=1 loops=1)
          Index Cond: ((project_id = 1292351) AND (iid = 225))
          Filter: (environment_id = 5)
          Buffers: shared hit=4
        Planning time: 0.088 ms
        Execution time: 0.039 ms
    
    On GitLab.com these changes result in a (roughly) 11x improvement in SQL
    timings for the CI environment status endpoint.
    
    Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/36877
    f04094a4
ci-environment-status-performance.yml 95 Bytes