Prevent amplification of ReactiveCachingWorker jobs upon failures

When `ReactiveCachingWorker` hits an SSL or other exception that occurs quickly and reliably, automatically rescheduling a new worker could lead to an excessive number of jobs being scheduled. This happens because not only does the failed job get rescheduled in a minute, but each Sidekiq retry also adds more rescheduled jobs.

On busy instances, this can become an issue because large numbers of running `ReactiveCachingWorker` jobs can cause high rates of `ExclusiveLease` reads and possibly saturate the Redis server with queries.

We now disable this automatic rescheduling on failure and rely on Sidekiq to perform its 3 retries with a backoff period.

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/64176
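
The effect of dropping `ensure` can be illustrated with a minimal, self-contained sketch. This is not the GitLab code: `FakeScheduler`, the two `enqueuing_update_*` variants, `scheduled_after_failure`, and the 60-second interval are hypothetical stand-ins used only to show that a reschedule placed under `ensure` runs even when the block raises, while a reschedule placed after `yield` runs only on success.

```ruby
# Hypothetical stand-in for ReactiveCachingWorker: it only records
# what would have been scheduled instead of talking to Sidekiq.
class FakeScheduler
  attr_reader :scheduled

  def initialize
    @scheduled = []
  end

  def perform_in(interval, *args)
    @scheduled << [interval, *args]
  end
end

# Old behavior (sketch): `ensure` reschedules even when the block raises.
def enqueuing_update_with_ensure(scheduler)
  yield
ensure
  scheduler.perform_in(60, :refresh) # 60 is an assumed placeholder interval
end

# New behavior (sketch): reschedule only if the block returns normally.
def enqueuing_update_without_ensure(scheduler)
  yield
  scheduler.perform_in(60, :refresh)
end

# Runs one variant with a block that always fails and reports how many
# follow-up jobs were scheduled despite the failure.
def scheduled_after_failure(variant)
  scheduler = FakeScheduler.new
  begin
    send(variant, scheduler) { raise "foo" }
  rescue RuntimeError
    # In the real system, Sidekiq would take over here and retry the job.
  end
  scheduler.scheduled.size
end

old_count = scheduled_after_failure(:enqueuing_update_with_ensure)
new_count = scheduled_after_failure(:enqueuing_update_without_ensure)
puts "with ensure: #{old_count}, without ensure: #{new_count}"
```

In the `ensure` variant, every failed attempt leaves one extra scheduled job behind, and each Sidekiq retry of that attempt would do the same; without `ensure`, a failed attempt schedules nothing and only Sidekiq's own retries remain.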

Commit a28844ea authored Jul 06, 2019 by Stan Hu
3 changed files
--- a/app/models/concerns/reactive_caching.rb
+++ b/app/models/concerns/reactive_caching.rb
@@ -178,7 +178,7 @@ module ReactiveCaching

    def enqueuing_update(*args)
      yield
-    ensure
+
      ReactiveCachingWorker.perform_in(self.class.reactive_cache_refresh_interval, self.class, id, *args)
    end
  end

--- a/changelogs/unreleased/sh-disable-reactive-caching-automatic-retries.yml
+++ b/changelogs/unreleased/sh-disable-reactive-caching-automatic-retries.yml
+---
+title: Prevent amplification of ReactiveCachingWorker jobs upon failures
+merge_request: 30432
+author:
+type: performance
--- a/spec/models/concerns/reactive_caching_spec.rb
+++ b/spec/models/concerns/reactive_caching_spec.rb
@@ -206,8 +206,9 @@ describe ReactiveCaching, :use_clean_rails_memory_store_caching do
          expect(read_reactive_cache(instance)).to eq("preexisting")
        end

-        it 'enqueues a repeat worker' do
-          expect_reactive_cache_update_queued(instance)
+        it 'does not enqueue a repeat worker' do
+          expect(ReactiveCachingWorker)
+            .not_to receive(:perform_in)

          expect { go! }.to raise_error("foo")
        end