• Toon Claes's avatar
    UNION of SELECT/WHERE is faster than WHERE on UNION · 01c6323d
    Toon Claes authored
    Instead of applying WHERE on a UNION, apply the WHERE on each of the seperate
    SELECT statements, and do UNION on that.
    
    Local tests with about 2_000_000 projects:
     - 1_500_000 private projects
     -    40_000 internal projects
     -   400_000 public projects
    
    For the API endpoint `/api/v4/projects?visibility=private` the slowest query was:
    
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE ...
    ```
    
    The original query took 1073.8ms.
    The query refactored to UNION of SELECT/WHERE took 2.3ms.
    
    The original query was:
    
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE "projects"."pending_delete" = $1
      AND (projects.id IN
             (SELECT "projects"."id"
              FROM "projects"
              INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
              WHERE "projects"."pending_delete" = 'f'
                AND "project_authorizations"."user_id" = 23
              UNION SELECT "projects"."id"
              FROM "projects"
              WHERE "projects"."visibility_level" IN (20,
                                                      10)))
      AND "projects"."visibility_level" = $2
      AND "projects"."archived" = $3
    ORDER BY "projects"."created_at" DESC
    LIMIT 20
    OFFSET 0 [["pending_delete", "f"],
           ["visibility_level", 0],
           ["archived", "f"]]
    ```
    
    The refactored query:
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE "projects"."pending_delete" = $1
      AND (projects.id IN
             (SELECT "projects"."id"
              FROM "projects"
              INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
              WHERE "projects"."pending_delete" = 'f'
                AND "project_authorizations"."user_id" = 23
                AND "projects"."visibility_level" = 0
                AND "projects"."archived" = 'f'
              UNION SELECT "projects"."id"
              FROM "projects"
              WHERE "projects"."visibility_level" IN (20,
                                                      10)
                AND "projects"."visibility_level" = 0
                AND "projects"."archived" = 'f'))
    ORDER BY "projects"."created_at" DESC
    LIMIT 20
    OFFSET 0 [["pending_delete", "f"]]
    ```
    01c6323d
tc-improve-project-api-perf.yml 108 Bytes