Commit c4ea733a authored by Mikołaj Wawrzyniak's avatar Mikołaj Wawrzyniak

Merge branch '330842-use-instrumented-classes-in-usage-ping' into 'master'

[RUN ALL RSPEC] [RUN AS-IF-FOSS] Use instrumented classes in Usage Ping

See merge request gitlab-org/gitlab!64168
parents edab8275 2e92c887
...@@ -13,6 +13,7 @@ introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/65336 ...@@ -13,6 +13,7 @@ introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/65336
time_frame: none time_frame: none
data_source: system data_source: system
data_category: Standard data_category: Standard
instrumentation_class: CollectedDataCategoriesMetric
distribution: distribution:
- ce - ce
- ee - ee
......
...@@ -377,7 +377,9 @@ happened over time, such as how many CI pipelines have run. They are monotonic a ...@@ -377,7 +377,9 @@ happened over time, such as how many CI pipelines have run. They are monotonic a
Observations are facts collected from one or more GitLab instances and can carry arbitrary data. There are no Observations are facts collected from one or more GitLab instances and can carry arbitrary data. There are no
general guidelines around how to collect those, due to the individual nature of that data. general guidelines around how to collect those, due to the individual nature of that data.
There are several types of counters which are all found in `usage_data.rb`: ### Types of counters
There are several types of counters in `usage_data.rb`:
- **Ordinary Batch Counters:** Simple count of a given ActiveRecord_Relation - **Ordinary Batch Counters:** Simple count of a given ActiveRecord_Relation
- **Distinct Batch Counters:** Distinct count of a given ActiveRecord_Relation in a given column - **Distinct Batch Counters:** Distinct count of a given ActiveRecord_Relation in a given column
...@@ -388,6 +390,19 @@ There are several types of counters which are all found in `usage_data.rb`: ...@@ -388,6 +390,19 @@ There are several types of counters which are all found in `usage_data.rb`:
NOTE: NOTE:
Only use the provided counter methods. Each counter method contains a built in fail safe to isolate each counter to avoid breaking the entire Service Ping. Only use the provided counter methods. Each counter method contains a built in fail safe to isolate each counter to avoid breaking the entire Service Ping.
### Using instrumentation classes
We recommend you use [instrumentation classes](metrics_instrumentation.md) in `usage_data.rb` where possible.
For example, we have the following instrumentation class:
`lib/gitlab/usage/metrics/instrumentations/count_boards_metric.rb`.
You should add it to `usage_data.rb` as follows:
```ruby
boards: add_metric('CountBoardsMetric', time_frame: 'all'),
```
### Why batch counting ### Why batch counting
For large tables, PostgreSQL can take a long time to count rows due to MVCC [(Multi-version Concurrency Control)](https://en.wikipedia.org/wiki/Multiversion_concurrency_control). Batch counting is a counting method where a single large query is broken into multiple smaller queries. For example, instead of a single query querying 1,000,000 records, with batch counting, you can execute 100 queries of 10,000 records each. Batch counting is useful for avoiding database timeouts as each batch query is significantly shorter than one single long running query. For large tables, PostgreSQL can take a long time to count rows due to MVCC [(Multi-version Concurrency Control)](https://en.wikipedia.org/wiki/Multiversion_concurrency_control). Batch counting is a counting method where a single large query is broken into multiple smaller queries. For example, instead of a single query querying 1,000,000 records, with batch counting, you can execute 100 queries of 10,000 records each. Batch counting is useful for avoiding database timeouts as each batch query is significantly shorter than one single long running query.
......
...@@ -86,6 +86,21 @@ module Gitlab ...@@ -86,6 +86,21 @@ module Gitlab
end end
``` ```
## Support for instrumentation classes
There is support for:
- `count`, `distinct_count` for [database metrics](#database-metrics).
- [Redis HLL metrics](#redis-hyperloglog-metrics).
- [Generic metrics](#generic-metrics), which are metrics based on settings or configurations.
Currently, there is no support for:
- `add`, `sum`, `histogram`, `estimate_batch_distinct_count` for database metrics.
- Regular Redis counters.
You can [track the progress to support these](https://gitlab.com/groups/gitlab-org/-/epics/6118).
## Creating a new metric instrumentation class ## Creating a new metric instrumentation class
To create a stub instrumentation for a Service Ping metric, you can use a dedicated [generator](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/generators/gitlab/usage_metric_generator.rb): To create a stub instrumentation for a Service Ping metric, you can use a dedicated [generator](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/generators/gitlab/usage_metric_generator.rb):
......
...@@ -63,6 +63,9 @@ any of the following Service Ping files: ...@@ -63,6 +63,9 @@ any of the following Service Ping files:
Read the [stages file](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/stages.yml). Read the [stages file](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/stages.yml).
- Check the file location. Consider the time frame, and if the file should be under `ee`. - Check the file location. Consider the time frame, and if the file should be under `ee`.
- Check the tiers. - Check the tiers.
- Metrics instrumentations
- Recommend to use metrics instrumentation for new metrics addded to service with
[limitations](metrics_instrumentation.md#support-for-instrumentation-classes)
- Approve the MR, and relabel the MR with `~"product intelligence::approved"`. - Approve the MR, and relabel the MR with `~"product intelligence::approved"`.
## Review workload distribution ## Review workload distribution
......
...@@ -67,6 +67,7 @@ RSpec.describe 'Every metric definition' do ...@@ -67,6 +67,7 @@ RSpec.describe 'Every metric definition' do
allow(Gitlab::Geo).to receive(:enabled?).and_return(true) allow(Gitlab::Geo).to receive(:enabled?).and_return(true)
stub_licensed_features(requirements: true) stub_licensed_features(requirements: true)
stub_prometheus_queries stub_prometheus_queries
stub_usage_data_connections
end end
it 'is included in the Usage Ping hash structure' do it 'is included in the Usage Ping hash structure' do
......
...@@ -15,6 +15,10 @@ module Gitlab ...@@ -15,6 +15,10 @@ module Gitlab
@time_frame = time_frame @time_frame = time_frame
@options = options @options = options
end end
def instrumentation
value
end
end end
end end
end end
......
...@@ -5,7 +5,7 @@ module Gitlab ...@@ -5,7 +5,7 @@ module Gitlab
module Metrics module Metrics
module Instrumentations module Instrumentations
class CollectedDataCategoriesMetric < GenericMetric class CollectedDataCategoriesMetric < GenericMetric
def value value do
::ServicePing::PermitDataCategoriesService.new.execute ::ServicePing::PermitDataCategoriesService.new.execute
end end
end end
......
...@@ -59,6 +59,10 @@ module Gitlab ...@@ -59,6 +59,10 @@ module Gitlab
Gitlab::Usage::Metrics::Query.for(self.class.metric_operation, relation, self.class.column) Gitlab::Usage::Metrics::Query.for(self.class.metric_operation, relation, self.class.column)
end end
def instrumentation
to_sql
end
def suggested_name def suggested_name
Gitlab::Usage::Metrics::NameSuggestion.for( Gitlab::Usage::Metrics::NameSuggestion.for(
self.class.metric_operation, self.class.metric_operation,
......
...@@ -15,9 +15,7 @@ module Gitlab ...@@ -15,9 +15,7 @@ module Gitlab
FALLBACK = -1 FALLBACK = -1
class << self class << self
attr_reader :metric_operation, :metric_value attr_reader :metric_value
@metric_operation = :alt
def fallback(custom_fallback = FALLBACK) def fallback(custom_fallback = FALLBACK)
return @metric_fallback if defined?(@metric_fallback) return @metric_fallback if defined?(@metric_fallback)
...@@ -30,6 +28,11 @@ module Gitlab ...@@ -30,6 +28,11 @@ module Gitlab
end end
end end
def initialize(time_frame: 'none', options: {})
@time_frame = time_frame
@options = options
end
def value def value
alt_usage_data(fallback: self.class.fallback) do alt_usage_data(fallback: self.class.fallback) do
self.class.metric_value.call self.class.metric_value.call
...@@ -37,9 +40,7 @@ module Gitlab ...@@ -37,9 +40,7 @@ module Gitlab
end end
def suggested_name def suggested_name
Gitlab::Usage::Metrics::NameSuggestion.for( Gitlab::Usage::Metrics::NameSuggestion.for(:alt)
self.class.metric_operation
)
end end
end end
end end
......
...@@ -12,11 +12,6 @@ module Gitlab ...@@ -12,11 +12,6 @@ module Gitlab
# events: # events:
# - g_analytics_valuestream # - g_analytics_valuestream
# end # end
class << self
attr_reader :metric_operation
@metric_operation = :redis
end
def initialize(time_frame:, options: {}) def initialize(time_frame:, options: {})
super super
...@@ -36,9 +31,7 @@ module Gitlab ...@@ -36,9 +31,7 @@ module Gitlab
end end
def suggested_name def suggested_name
Gitlab::Usage::Metrics::NameSuggestion.for( Gitlab::Usage::Metrics::NameSuggestion.for(:redis)
self.class.metric_operation
)
end end
private private
......
...@@ -10,6 +10,12 @@ module Gitlab ...@@ -10,6 +10,12 @@ module Gitlab
uncached_data.deep_stringify_keys.dig(*key_path.split('.')) uncached_data.deep_stringify_keys.dig(*key_path.split('.'))
end end
def add_metric(metric, time_frame: 'none')
metric_class = "Gitlab::Usage::Metrics::Instrumentations::#{metric}".constantize
metric_class.new(time_frame: time_frame).suggested_name
end
private private
def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil) def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil)
......
...@@ -72,8 +72,8 @@ module Gitlab ...@@ -72,8 +72,8 @@ module Gitlab
def license_usage_data def license_usage_data
{ {
recorded_at: recorded_at, recorded_at: recorded_at,
uuid: alt_usage_data { Gitlab::CurrentSettings.uuid }, uuid: add_metric('UuidMetric'),
hostname: alt_usage_data { Gitlab.config.gitlab.host }, hostname: add_metric('HostnameMetric'),
version: alt_usage_data { Gitlab::VERSION }, version: alt_usage_data { Gitlab::VERSION },
installation_type: alt_usage_data { installation_type }, installation_type: alt_usage_data { installation_type },
active_user_count: count(User.active), active_user_count: count(User.active),
...@@ -93,7 +93,7 @@ module Gitlab ...@@ -93,7 +93,7 @@ module Gitlab
{ {
counts: { counts: {
assignee_lists: count(List.assignee), assignee_lists: count(List.assignee),
boards: count(Board), boards: add_metric('CountBoardsMetric', time_frame: 'all'),
ci_builds: count(::Ci::Build), ci_builds: count(::Ci::Build),
ci_internal_pipelines: count(::Ci::Pipeline.internal), ci_internal_pipelines: count(::Ci::Pipeline.internal),
ci_external_pipelines: count(::Ci::Pipeline.external), ci_external_pipelines: count(::Ci::Pipeline.external),
...@@ -138,7 +138,7 @@ module Gitlab ...@@ -138,7 +138,7 @@ module Gitlab
in_review_folder: count(::Environment.in_review_folder), in_review_folder: count(::Environment.in_review_folder),
grafana_integrated_projects: count(GrafanaIntegration.enabled), grafana_integrated_projects: count(GrafanaIntegration.enabled),
groups: count(Group), groups: count(Group),
issues: count(Issue, start: minimum_id(Issue), finish: maximum_id(Issue)), issues: add_metric('CountIssuesMetric', time_frame: 'all'),
issues_created_from_gitlab_error_tracking_ui: count(SentryIssue), issues_created_from_gitlab_error_tracking_ui: count(SentryIssue),
issues_with_associated_zoom_link: count(ZoomMeeting.added_to_issue), issues_with_associated_zoom_link: count(ZoomMeeting.added_to_issue),
issues_using_zoom_quick_actions: distinct_count(ZoomMeeting, :issue_id), issues_using_zoom_quick_actions: distinct_count(ZoomMeeting, :issue_id),
...@@ -257,7 +257,7 @@ module Gitlab ...@@ -257,7 +257,7 @@ module Gitlab
ldap_encrypted_secrets_enabled: alt_usage_data(fallback: nil) { Gitlab::Auth::Ldap::Config.encrypted_secrets.active? }, ldap_encrypted_secrets_enabled: alt_usage_data(fallback: nil) { Gitlab::Auth::Ldap::Config.encrypted_secrets.active? },
operating_system: alt_usage_data(fallback: nil) { operating_system }, operating_system: alt_usage_data(fallback: nil) { operating_system },
gitaly_apdex: alt_usage_data { gitaly_apdex }, gitaly_apdex: alt_usage_data { gitaly_apdex },
collected_data_categories: alt_usage_data(fallback: []) { Gitlab::Usage::Metrics::Instrumentations::CollectedDataCategoriesMetric.new(time_frame: 'none').value } collected_data_categories: add_metric('CollectedDataCategoriesMetric', time_frame: 'none')
} }
} }
end end
...@@ -644,8 +644,9 @@ module Gitlab ...@@ -644,8 +644,9 @@ module Gitlab
# Omitted because of encrypted properties: `projects_jira_cloud_active`, `projects_jira_server_active` # Omitted because of encrypted properties: `projects_jira_cloud_active`, `projects_jira_server_active`
# rubocop: disable CodeReuse/ActiveRecord # rubocop: disable CodeReuse/ActiveRecord
def usage_activity_by_stage_plan(time_period) def usage_activity_by_stage_plan(time_period)
time_frame = time_period.present? ? '28d' : 'none'
{ {
issues: distinct_count(::Issue.where(time_period), :author_id), issues: add_metric('CountUsersCreatingIssuesMetric', time_frame: time_frame),
notes: distinct_count(::Note.where(time_period), :author_id), notes: distinct_count(::Note.where(time_period), :author_id),
projects: distinct_count(::Project.where(time_period), :creator_id), projects: distinct_count(::Project.where(time_period), :creator_id),
todos: distinct_count(::Todo.where(time_period), :author_id), todos: distinct_count(::Todo.where(time_period), :author_id),
......
...@@ -5,6 +5,12 @@ module Gitlab ...@@ -5,6 +5,12 @@ module Gitlab
SQL_METRIC_DEFAULT = -3 SQL_METRIC_DEFAULT = -3
class << self class << self
def add_metric(metric, time_frame: 'none')
metric_class = "Gitlab::Usage::Metrics::Instrumentations::#{metric}".constantize
metric_class.new(time_frame: time_frame).instrumentation
end
def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil) def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil)
SQL_METRIC_DEFAULT SQL_METRIC_DEFAULT
end end
......
...@@ -5,6 +5,12 @@ module Gitlab ...@@ -5,6 +5,12 @@ module Gitlab
# See https://gitlab.com/gitlab-org/gitlab/-/merge_requests/41091 # See https://gitlab.com/gitlab-org/gitlab/-/merge_requests/41091
class UsageDataQueries < UsageData class UsageDataQueries < UsageData
class << self class << self
def add_metric(metric, time_frame: 'none')
metric_class = "Gitlab::Usage::Metrics::Instrumentations::#{metric}".constantize
metric_class.new(time_frame: time_frame).instrumentation
end
def count(relation, column = nil, *args, **kwargs) def count(relation, column = nil, *args, **kwargs)
Gitlab::Usage::Metrics::Query.for(:count, relation, column) Gitlab::Usage::Metrics::Query.for(:count, relation, column)
end end
......
...@@ -44,6 +44,12 @@ module Gitlab ...@@ -44,6 +44,12 @@ module Gitlab
DISTRIBUTED_HLL_FALLBACK = -2 DISTRIBUTED_HLL_FALLBACK = -2
MAX_BUCKET_SIZE = 100 MAX_BUCKET_SIZE = 100
def add_metric(metric, time_frame: 'none')
metric_class = "Gitlab::Usage::Metrics::Instrumentations::#{metric}".constantize
metric_class.new(time_frame: time_frame).value
end
def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil) def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil)
if batch if batch
Gitlab::Database::BatchCount.batch_count(relation, column, batch_size: batch_size, start: start, finish: finish) Gitlab::Database::BatchCount.batch_count(relation, column, batch_size: batch_size, start: start, finish: finish)
......
...@@ -16,6 +16,14 @@ RSpec.describe Gitlab::Usage::Metrics::NamesSuggestions::Generator do ...@@ -16,6 +16,14 @@ RSpec.describe Gitlab::Usage::Metrics::NamesSuggestions::Generator do
end end
end end
describe '#add_metric' do
let(:metric) {'CountIssuesMetric' }
it 'computes the suggested name for given metric' do
expect(described_class.add_metric(metric)).to eq('count_issues')
end
end
context 'for count with default column metrics' do context 'for count with default column metrics' do
it_behaves_like 'name suggestion' do it_behaves_like 'name suggestion' do
# corresponding metric is collected with count(Board) # corresponding metric is collected with count(Board)
......
...@@ -42,6 +42,10 @@ RSpec.describe Gitlab::UsageDataMetrics do ...@@ -42,6 +42,10 @@ RSpec.describe Gitlab::UsageDataMetrics do
it 'includes usage_activity_by_stage_monthly keys' do it 'includes usage_activity_by_stage_monthly keys' do
expect(subject[:usage_activity_by_stage_monthly][:plan]).to include(:issues) expect(subject[:usage_activity_by_stage_monthly][:plan]).to include(:issues)
end end
it 'includes settings keys' do
expect(subject[:settings]).to include(:collected_data_categories)
end
end end
end end
end end
...@@ -5,6 +5,14 @@ require 'spec_helper' ...@@ -5,6 +5,14 @@ require 'spec_helper'
RSpec.describe Gitlab::UsageDataNonSqlMetrics do RSpec.describe Gitlab::UsageDataNonSqlMetrics do
let(:default_count) { Gitlab::UsageDataNonSqlMetrics::SQL_METRIC_DEFAULT } let(:default_count) { Gitlab::UsageDataNonSqlMetrics::SQL_METRIC_DEFAULT }
describe '#add_metric' do
let(:metric) { 'UuidMetric' }
it 'computes the metric value for given metric' do
expect(described_class.add_metric(metric)).to eq(Gitlab::CurrentSettings.uuid)
end
end
describe '.count' do describe '.count' do
it 'returns default value for count' do it 'returns default value for count' do
expect(described_class.count(User)).to eq(default_count) expect(described_class.count(User)).to eq(default_count)
......
...@@ -7,6 +7,14 @@ RSpec.describe Gitlab::UsageDataQueries do ...@@ -7,6 +7,14 @@ RSpec.describe Gitlab::UsageDataQueries do
allow(ActiveRecord::Base.connection).to receive(:transaction_open?).and_return(false) allow(ActiveRecord::Base.connection).to receive(:transaction_open?).and_return(false)
end end
describe '#add_metric' do
let(:metric) { 'CountBoardsMetric' }
it 'builds the query for given metric' do
expect(described_class.add_metric(metric)).to eq('SELECT COUNT("boards"."id") FROM "boards"')
end
end
describe '.count' do describe '.count' do
it 'returns the raw SQL' do it 'returns the raw SQL' do
expect(described_class.count(User)).to start_with('SELECT COUNT("users"."id") FROM "users"') expect(described_class.count(User)).to start_with('SELECT COUNT("users"."id") FROM "users"')
......
...@@ -5,6 +5,14 @@ require 'spec_helper' ...@@ -5,6 +5,14 @@ require 'spec_helper'
RSpec.describe Gitlab::Utils::UsageData do RSpec.describe Gitlab::Utils::UsageData do
include Database::DatabaseHelpers include Database::DatabaseHelpers
describe '#add_metric' do
let(:metric) { 'UuidMetric'}
it 'computes the metric value for given metric' do
expect(described_class.add_metric(metric)).to eq(Gitlab::CurrentSettings.uuid)
end
end
describe '#count' do describe '#count' do
let(:relation) { double(:relation) } let(:relation) { double(:relation) }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment