Remove FDW mentions from Geo development docs

parent 2920ab1c
...@@ -209,121 +209,12 @@ To migrate the tracking database, run: ...@@ -209,121 +209,12 @@ To migrate the tracking database, run:
bundle exec rake geo:db:migrate bundle exec rake geo:db:migrate
``` ```
### Foreign Data Wrapper
> Introduced in GitLab 10.1.
Foreign Data Wrapper ([FDW](#fdw)) is used by the [Geo Log Cursor](#geo-log-cursor) and improves
the performance of many synchronization operations.
FDW is a PostgreSQL extension ([`postgres_fdw`](https://www.postgresql.org/docs/11/postgres-fdw.html)) that is enabled within
the Geo Tracking Database (on a **secondary** node), which allows it
to connect to the read-only database replica and perform queries and filter
data from both instances.
This persistent connection is configured as an FDW server
named `gitlab_secondary`. This configuration exists within the database's user
context only. To access the `gitlab_secondary`, GitLab needs to use the
same database user that had previously been configured.
The Geo Tracking Database accesses the read-only database replica via FDW as a regular user,
limited by its own restrictions. The credentials are configured as a
`USER MAPPING` associated with the `SERVER` mapped previously
(`gitlab_secondary`).
FDW configuration and credentials definition are managed automatically by the
Omnibus GitLab `gitlab-ctl reconfigure` command.
#### Refreshing the Foreign Tables
Whenever a new Geo node is configured or the database schema changes on the
**primary** node, you must refresh the foreign tables on the **secondary** node
by running the following:
```shell
bundle exec rake geo:db:refresh_foreign_tables
```
Failure to do this will prevent the **secondary** node from
functioning properly. The **secondary** node will generate error
messages, as the following PostgreSQL error:
```sql
ERROR: relation "gitlab_secondary.ci_job_artifacts" does not exist at character 323
STATEMENT: SELECT a.attname, format_type(a.atttypid, a.atttypmod),
pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod
FROM pg_attribute a LEFT JOIN pg_attrdef d
ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE a.attrelid = '"gitlab_secondary"."ci_job_artifacts"'::regclass
AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum
```
#### Accessing data from a Foreign Table
At the SQL level, all you have to do is `SELECT` data from `gitlab_secondary.*`.
Here's an example of how to access all projects from the Geo Tracking Database's FDW:
```sql
SELECT * FROM gitlab_secondary.projects;
```
As a more real-world example, this is how you filter for unarchived projects
on the Tracking Database:
```sql
SELECT project_registry.*
FROM project_registry
JOIN gitlab_secondary.projects
ON (project_registry.project_id = gitlab_secondary.projects.id
AND gitlab_secondary.projects.archived IS FALSE)
```
At the ActiveRecord level, we have additional Models that represent the
foreign tables. They must be mapped in a slightly different way, and they are read-only.
Check the existing FDW models in `ee/app/models/geo/fdw` for reference.
From a developer's perspective, it's no different than creating a model that
represents a Database View.
With the examples above, you can access the projects with:
```ruby
Geo::Fdw::Project.all
```
and to access the `ProjectRegistry` filtering by unarchived projects:
```ruby
# We have to use Arel here:
project_registry_table = Geo::ProjectRegistry.arel_table
fdw_project_table = Geo::Fdw::Project.arel_table
project_registry_table.join(fdw_project_table)
.on(project_registry_table[:project_id].eq(fdw_project_table[:id]))
.where((fdw_project_table[:archived]).eq(true)) # if you append `.to_sql` you can check generated query
```
## Finders ## Finders
Geo uses [Finders](https://gitlab.com/gitlab-org/gitlab/tree/master/app/finders), Geo uses [Finders](https://gitlab.com/gitlab-org/gitlab/tree/master/app/finders),
which are classes take care of the heavy lifting of looking up which are classes take care of the heavy lifting of looking up
projects/attachments/etc. in the tracking database and main database. projects/attachments/etc. in the tracking database and main database.
### Finders Performance
The Finders need to compare data from the main database with data in
the tracking database. For example, counting the number of synced
projects normally involves retrieving the project IDs from one
database and checking their state in the other database. This is slow
and requires a lot of memory.
To overcome this, the Finders use [FDW](#fdw), or Foreign Data
Wrappers. This allows a regular `JOIN` between the main database and
the tracking database.
## Redis ## Redis
Redis on the **secondary** node works the same as on the **primary** Redis on the **secondary** node works the same as on the **primary**
...@@ -397,12 +288,6 @@ migration do not need to run on the secondary nodes. ...@@ -397,12 +288,6 @@ migration do not need to run on the secondary nodes.
A database on each Geo **secondary** node that keeps state for the node A database on each Geo **secondary** node that keeps state for the node
on which it resides. Read more in [Using the Tracking database](#using-the-tracking-database). on which it resides. Read more in [Using the Tracking database](#using-the-tracking-database).
### FDW
Foreign Data Wrapper, or FDW, is a feature built-in in PostgreSQL. It
allows data to be queried from different data sources. In Geo, it's
used to query data from different PostgreSQL instances.
## Geo Event Log ## Geo Event Log
The Geo **primary** stores events in the `geo_event_log` table. Each The Geo **primary** stores events in the `geo_event_log` table. Each
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment