Commit 326a45e9 authored by Kamil Trzciński

Merge branch '289838-explore-gc-settings' into 'master'

Add scripts to collect GC setting results

See merge request gitlab-org/gitlab!49718
parents c571bcca f6d99723
......@@ -130,6 +130,8 @@ As a follow up to finding `N+1` queries with Bullet, consider writing a [QueryRe
## Settings that impact performance
### Application settings
1. The `development` environment works with hot-reloading enabled by default. This makes Rails check for file changes on every request and creates potential lock contention, because hot reloading is single-threaded.
1. The `development` environment can load code lazily once a request is fired, which results in the first request always being slow.
......@@ -140,3 +142,34 @@ To disable those features for profiling/benchmarking set the `RAILS_PROFILE` env
- restart GDK with `gdk restart`
*This environment variable is only applicable for the development mode.*
### GC settings
Ruby's garbage collector (GC) can be tuned via a variety of environment variables that will directly impact application performance.
The following table lists these variables along with their default values.
| Environment variable | Default value |
|--|--|
| `RUBY_GC_HEAP_INIT_SLOTS` | `10000` |
| `RUBY_GC_HEAP_FREE_SLOTS` | `4096` |
| `RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO` | `0.20` |
| `RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO` | `0.40` |
| `RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO` | `0.65` |
| `RUBY_GC_HEAP_GROWTH_FACTOR` | `1.8` |
| `RUBY_GC_HEAP_GROWTH_MAX_SLOTS` | `0 (disable)` |
| `RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR` | `2.0` |
| `RUBY_GC_MALLOC_LIMIT(_MIN)` | `(16 * 1024 * 1024 /* 16MB */)` |
| `RUBY_GC_MALLOC_LIMIT_MAX` | `(32 * 1024 * 1024 /* 32MB */)` |
| `RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR` | `1.4` |
| `RUBY_GC_OLDMALLOC_LIMIT(_MIN)` | `(16 * 1024 * 1024 /* 16MB */)` |
| `RUBY_GC_OLDMALLOC_LIMIT_MAX` | `(128 * 1024 * 1024 /* 128MB */)` |
| `RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR` | `1.2` |
([Source](https://github.com/ruby/ruby/blob/45b29754cfba8435bc4980a87cd0d32c648f8a2e/gc.c#L254-L308))
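As a minimal sketch of how one of these variables takes effect, the snippet below starts a throwaway Ruby child process with and without `RUBY_GC_MALLOC_LIMIT` set and prints the `malloc_increase_bytes_limit` value that `GC.stat` reports in each. The helper name `gc_stat_in_child` and the chosen value are illustrative assumptions, and the exact numbers depend on the Ruby version in use.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: runs a short-lived Ruby child process with the given
# environment and returns the integer GC.stat reports for `key` in that process.
def gc_stat_in_child(env, key)
  code = "puts GC.stat(:#{key})"
  out, status = Open3.capture2(env, RbConfig.ruby, '-e', code)
  raise "child process failed: #{status}" unless status.success?

  Integer(out.strip)
end

# GC environment variables are read at interpreter start-up, so they must be
# set in the environment of the process being measured.
default_limit = gc_stat_in_child({}, :malloc_increase_bytes_limit)
tuned_limit = gc_stat_in_child({ 'RUBY_GC_MALLOC_LIMIT' => '8388608' }, :malloc_increase_bytes_limit)

puts "default: #{default_limit}"
puts "tuned:   #{tuned_limit}"
```

Because the variables are only read at start-up, changing them from inside an already running process has no effect on that process.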
GitLab may decide to change these settings in order to speed up application performance, lower memory requirements, or both.
You can see how each of these settings affects GC performance, memory use, and application start-up time for an idle instance of
GitLab by running the `scripts/perf/gc/collect_gc_stats.rb` script. It outputs GC stats and general timing data to standard
output as CSV.
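The CSV can be post-processed with a few lines of Ruby. The sketch below assumes the column layout defined by `CSV_HEADER` in the script; the helper names `parse_rows` and `slowest` are hypothetical, and the two sample rows contain placeholder numbers, not measurements.

```ruby
# Column layout copied from CSV_HEADER in collect_gc_stats.rb.
GC_KEYS = %w[
  minor_gc_count major_gc_count heap_live_slots heap_free_slots
  total_allocated_pages total_freed_pages malloc_increase_bytes
  malloc_increase_bytes_limit oldmalloc_increase_bytes
  oldmalloc_increase_bytes_limit
].freeze
HEADER = (%w[setting value] + GC_KEYS +
          %w[RSS gc_time_s cpu_utime_s cpu_stime_s real_time_s]).freeze

# Hypothetical helper: turns the script's CSV output into an array of hashes
# keyed by column name. The header line is skipped if present.
def parse_rows(csv_text)
  lines = csv_text.lines.map(&:chomp).reject(&:empty?)
  lines.shift if lines.first&.start_with?('setting,')
  # The -1 limit keeps empty fields, such as the empty value of DEFAULTS.
  lines.map { |line| HEADER.zip(line.split(',', -1)).to_h }
end

# Hypothetical helper: returns the `count` rows with the longest wall-clock time.
def slowest(rows, count = 3)
  rows.max_by(count) { |row| row['real_time_s'].to_f }
end

# Placeholder input for illustration only; these are NOT measured values.
sample = <<~CSV
  #{HEADER.join(',')}
  DEFAULTS,,10,2,100,50,20,0,1000,2000,3000,4000,500,0.5,1.0,0.2,1.5
  RUBY_GC_HEAP_INIT_SLOTS,100000,5,1,100,50,20,0,1000,2000,3000,4000,600,0.3,0.9,0.2,1.2
CSV

rows = parse_rows(sample)
slowest(rows, 1).each do |row|
  puts "#{row['setting']}=#{row['value']} real_time_s=#{row['real_time_s']}"
end
```

In practice the input would come from a file the script's standard output was redirected to, for example by reading it with `File.read`.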
#!/usr/bin/env ruby
# frozen_string_literal: true
####
# Loads GitLab application classes with a variety of GC settings and prints
# GC stats and timing data to standard out as CSV.
#
# The degree of parallelism can be increased by setting the PAR environment
# variable (default: 2).
require 'benchmark'
SETTINGS = {
  'DEFAULTS' => [''],
  # Default: 10_000
  'RUBY_GC_HEAP_INIT_SLOTS' => %w[100000 1000000 5000000],
  # Default: 1.8
  'RUBY_GC_HEAP_GROWTH_FACTOR' => %w[1.2 1 0.8],
  # Default: 0 (disabled)
  'RUBY_GC_HEAP_GROWTH_MAX_SLOTS' => %w[10000 100000 1000000],
  # Default: 2.0
  'RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR' => %w[2.5 1.5 1],
  # Default: 4096 (= 2^12)
  'RUBY_GC_HEAP_FREE_SLOTS' => %w[16384 2048 0],
  # Default: 0.20 (20%)
  'RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO' => %w[0.1 0.01 0.001],
  # Default: 0.40 (40%)
  'RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO' => %w[0.2 0.01 0.001],
  # Default: 0.65 (65%)
  'RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO' => %w[0.2 0.02 0.002],
  # Default: 16MB
  'RUBY_GC_MALLOC_LIMIT' => %w[8388608 4194304 1048576],
  # Default: 32MB
  'RUBY_GC_MALLOC_LIMIT_MAX' => %w[16777216 8388608 1048576],
  # Default: 1.4
  'RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR' => %w[1.6 1 0.8],
  # Default: 16MB
  'RUBY_GC_OLDMALLOC_LIMIT' => %w[8388608 4194304 1048576],
  # Default: 128MB
  'RUBY_GC_OLDMALLOC_LIMIT_MAX' => %w[33554432 16777216 1048576],
  # Default: 1.2
  'RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR' => %w[1.4 1 0.8]
}.freeze
USED_GCSTAT_KEYS = [
  :minor_gc_count,
  :major_gc_count,
  :heap_live_slots,
  :heap_free_slots,
  :total_allocated_pages,
  :total_freed_pages,
  :malloc_increase_bytes,
  :malloc_increase_bytes_limit,
  :oldmalloc_increase_bytes,
  :oldmalloc_increase_bytes_limit
].freeze
CSV_USED_GCSTAT_KEYS = USED_GCSTAT_KEYS.join(',')
CSV_HEADER = "setting,value,#{CSV_USED_GCSTAT_KEYS},RSS,gc_time_s,cpu_utime_s,cpu_stime_s,real_time_s\n"
SCRIPT_PATH = __dir__
RAILS_ROOT = "#{SCRIPT_PATH}/../../../"
def collect_stats(setting, value)
  warn "Testing #{setting} = #{value} ..."
  env = {
    setting => value,
    'RAILS_ROOT' => RAILS_ROOT,
    'SETTING_CSV' => "#{setting},#{value}",
    'GC_STAT_KEYS' => CSV_USED_GCSTAT_KEYS
  }
  system(env, 'ruby', "#{SCRIPT_PATH}/print_gc_stats.rb")
end
par = ENV['PAR']&.to_i || 2
batch_size = (SETTINGS.size.to_f / par).ceil
batches = SETTINGS.each_slice(batch_size)
warn "Requested parallelism: #{par} (batches: #{batches.size}, batch size: #{batch_size})"
puts CSV_HEADER
elapsed = Benchmark.realtime do
  threads = batches.each_with_index.map do |settings_batch, n|
    Thread.new do
      settings_batch.each do |setting, values|
        values.each do |v|
          collect_stats(setting, v)
        end
      end
    end
  end
  threads.each(&:join)
end
warn "All done in #{elapsed} sec"
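The batching arithmetic in `collect_gc_stats.rb` above means the number of threads can be lower than the requested `PAR` value, because the batch size is rounded up. The sketch below reproduces that calculation for the 15 entries in `SETTINGS` (14 variables plus `DEFAULTS`); the helper name `batch_plan` is a hypothetical name used only for illustration.

```ruby
# Hypothetical helper mirroring the script's batching: the batch size is
# ceil(setting_count / par) and each_slice yields one batch per thread.
def batch_plan(setting_count, par)
  batch_size = (setting_count.to_f / par).ceil
  batches = (1..setting_count).each_slice(batch_size).map(&:size)
  { batch_size: batch_size, batches: batches }
end

[2, 4, 7].each do |par|
  plan = batch_plan(15, par)
  puts "PAR=#{par}: batch size #{plan[:batch_size]}, batches #{plan[:batches].inspect}"
end
# PAR=2: batch size 8, batches [8, 7]
# PAR=4: batch size 4, batches [4, 4, 4, 3]
# PAR=7: batch size 3, batches [3, 3, 3, 3, 3]
```

With `PAR=7`, for example, only five threads are started, which matches the batch count the script logs through its `warn` call.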
# frozen_string_literal: true
# Promotes survivors from eden to old gen and runs a compaction.
#
# aka "Nakayoshi GC"
#
# https://github.com/puma/puma/blob/de632261ac45d7dd85230c83f6af6dd720f1cbd9/lib/puma/util.rb#L26-L35
def nakayoshi_gc
  4.times { GC.start(full_mark: false) }
  GC.compact
end
# GC::Profiler is used elsewhere in the code base, so we provide a way for it
# to be used exclusively by this script, or otherwise results will be tainted.
module GC::Profiler
  class << self
    attr_accessor :use_exclusive

    %i[enable disable clear].each do |method|
      alias_method "#{method}_orig", "#{method}"

      define_method(method) do
        if use_exclusive
          warn "GC::Profiler: ignoring call to #{method}"
          return
        end

        send("#{method}_orig") # rubocop: disable GitlabSecurity/PublicSend
      end
    end
  end
end
GC::Profiler.enable
GC::Profiler.use_exclusive = true
require 'benchmark'
RAILS_ROOT = ENV['RAILS_ROOT']
tms = Benchmark.measure do
  require RAILS_ROOT + 'config/boot'
  require RAILS_ROOT + 'config/environment'
end
GC::Profiler.use_exclusive = false
nakayoshi_gc
gc_stats = GC.stat
warn gc_stats.inspect
gc_total_time = GC::Profiler.total_time
GC::Profiler.report($stderr)
GC::Profiler.disable
gc_stat_keys = ENV['GC_STAT_KEYS'].to_s.split(',').map(&:to_sym)
values = []
values << ENV['SETTING_CSV']
values += gc_stat_keys.map { |k| gc_stats[k] }
values << ::Gitlab::Metrics::System.memory_usage_rss
values << gc_total_time
values << tms.utime + tms.cutime
values << tms.stime + tms.cstime
values << tms.real
puts values.join(',')
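The measurement pattern used by the script above (wrapping the workload in `Benchmark.measure` while `GC::Profiler` is enabled, then reading `GC.stat`) can be tried in isolation. The following is a minimal sketch, not part of the merge request: the helper name `measure_with_gc` and the string-allocating workload are assumptions made for illustration, and it reports GC count deltas instead of the absolute values the script prints.

```ruby
require 'benchmark'

# Hypothetical helper: measures a block with Benchmark and GC::Profiler and
# returns GC counts (as deltas), total GC time, and wall-clock time.
def measure_with_gc
  GC::Profiler.enable
  GC::Profiler.clear
  before = GC.stat
  tms = Benchmark.measure { yield }
  after = GC.stat
  {
    minor_gcs: after[:minor_gc_count] - before[:minor_gc_count],
    major_gcs: after[:major_gc_count] - before[:major_gc_count],
    gc_time_s: GC::Profiler.total_time, # seconds spent in GC while profiling
    real_time_s: tms.real
  }
ensure
  GC::Profiler.disable
end

# Stand-in workload that allocates many short strings.
result = measure_with_gc { 200_000.times.map { |i| "string #{i}" } }
puts result
```

The script itself does not need deltas because each setting is measured in a fresh child process, so the absolute `GC.stat` values already describe only the application boot.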