1. 09 Aug, 2018 1 commit
  2. 08 Aug, 2018 2 commits
    • dm snapshot: improve performance by switching out_of_order_list to rbtree · 3db2776d
      David Jeffery authored
      copy_complete()'s processing of out_of_order_list can result in
      quadratic complexity in the worst case.  As a result it consumed too
      much CPU and caused a significant loss in performance.
      
      Fix this by converting out_of_order_list to an rbtree.  This improved
      a dm-snapshot test copy workload from 32 seconds to 4 seconds.
      Signed-off-by: David Jeffery <djeffery@redhat.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Tested-by: Brett Hull <bhull@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
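
The ordering problem described in this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is Python rather than kernel C, the names `InOrderCompleter`, `next_seq`, `pending` and `committed` are hypothetical, and a binary heap stands in for the rbtree because both give logarithmic insertion. It is not the dm-snapshot code.

```python
import heapq


class InOrderCompleter:
    """Release completions strictly in sequence order (hypothetical model)."""

    def __init__(self):
        self.next_seq = 0    # next sequence number allowed to commit
        self.pending = []    # min-heap of sequence numbers that finished early
        self.committed = []  # order in which sequence numbers were released

    def complete(self, seq):
        if seq != self.next_seq:
            # Finished out of order: park it. A heap insert is O(log n),
            # whereas walking a sorted linked list is O(n) per insert.
            heapq.heappush(self.pending, seq)
            return
        self.committed.append(seq)
        self.next_seq += 1
        # Drain every parked completion that is now next in line.
        while self.pending and self.pending[0] == self.next_seq:
            self.committed.append(heapq.heappop(self.pending))
            self.next_seq += 1
```

Feeding completions in the order 3, 1, 2, 0 parks the first three and releases all four once 0 arrives. With a sorted list, n parked entries can cost on the order of n steps each to insert, which is where the quadratic worst case comes from.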
    • dm kcopyd: avoid softlockup in run_complete_job · 784c9a29
      John Pittman authored
      It was reported that softlockups occur when using dm-snapshot on top of
      slow (rbd) storage.  E.g.:
      
      [ 4047.990647] watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [kworker/10:23:26177]
      ...
      [ 4048.034151] Workqueue: kcopyd do_work [dm_mod]
      [ 4048.034156] RIP: 0010:copy_callback+0x41/0x160 [dm_snapshot]
      ...
      [ 4048.034190] Call Trace:
      [ 4048.034196]  ? __chunk_is_tracked+0x70/0x70 [dm_snapshot]
      [ 4048.034200]  run_complete_job+0x5f/0xb0 [dm_mod]
      [ 4048.034205]  process_jobs+0x91/0x220 [dm_mod]
      [ 4048.034210]  ? kcopyd_put_pages+0x40/0x40 [dm_mod]
      [ 4048.034214]  do_work+0x46/0xa0 [dm_mod]
      [ 4048.034219]  process_one_work+0x171/0x370
      [ 4048.034221]  worker_thread+0x1fc/0x3f0
      [ 4048.034224]  kthread+0xf8/0x130
      [ 4048.034226]  ? max_active_store+0x80/0x80
      [ 4048.034227]  ? kthread_bind+0x10/0x10
      [ 4048.034231]  ret_from_fork+0x35/0x40
      [ 4048.034233] Kernel panic - not syncing: softlockup: hung tasks
      
      Fix this by calling cond_resched() after run_complete_job()'s callout to
      the dm_kcopyd_notify_fn (which is dm-snap.c:copy_callback in the above
      trace).
      Signed-off-by: John Pittman <jpittman@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
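
The idea of yielding after each completion callback can be shown with a loose user-space analogy. This sketch assumes Python's asyncio as a stand-in for the kernel scheduler, with `await asyncio.sleep(0)` playing a role similar in spirit to cond_resched(). The names `run_complete_jobs`, `other_work` and `demo` are hypothetical, and nothing here is kcopyd code.

```python
import asyncio


async def run_complete_jobs(callbacks, log):
    """Run each completion callback, yielding to other tasks in between."""
    for cb in callbacks:
        cb(log)
        # Voluntary yield point: without it, this loop would run every
        # callback back to back and starve the other task.
        await asyncio.sleep(0)


async def other_work(log, ticks):
    """Stand-in for any other task that needs CPU time."""
    for _ in range(ticks):
        log.append("tick")
        await asyncio.sleep(0)


def demo():
    log = []
    callbacks = [lambda l, i=i: l.append(f"job{i}") for i in range(3)]

    async def main():
        await asyncio.gather(run_complete_jobs(callbacks, log),
                             other_work(log, 3))

    asyncio.run(main())
    return log
```

Calling `demo()` returns a log in which "tick" entries appear between the job entries, because the callback loop gives up the CPU after every callback instead of only when the whole batch is done.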
  3. 07 Aug, 2018 2 commits
    • dm cache metadata: save in-core policy_hint_size to on-disk superblock · fd2fa954
      Mike Snitzer authored
      policy_hint_size starts as 0 during __write_initial_superblock().  It
      isn't until the policy is loaded that policy_hint_size is set in-core
      (cmd->policy_hint_size).  But it never got recorded in the on-disk
      superblock because __commit_transaction() didn't deal with transferring
      the in-core cmd->policy_hint_size to the on-disk superblock.
      
      The in-core cmd->policy_hint_size gets initialized by metadata_open()'s
      __begin_transaction_flags() which re-reads all superblock fields.
      Because the superblock's policy_hint_size was never properly stored when
      the cache was created, hints_array_available() would always return false
      when re-activating a previously created cache.  This means
      __load_mappings() always considered the hints invalid and never made use
      of the hints (these hints served to optimize).
      
      Another detrimental side-effect of this oversight is that the cache_check
      utility would fail with: "invalid hint width: 0"
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
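
The failure mode in this commit (an in-core field that is never copied to the on-disk superblock at commit time) can be modelled in a few lines. This is a hedged user-space sketch: the class `CacheMetadata`, the dict standing in for the superblock, the `save_hint_size` switch and the simplified `hints_usable` check are all hypothetical and only mirror the behaviour the message describes.

```python
class CacheMetadata:
    """Toy model of in-core metadata backed by an on-disk superblock."""

    def __init__(self, superblock):
        self.superblock = superblock  # dict standing in for the disk
        # The initial superblock is written with a hint size of 0.
        self.superblock.setdefault("policy_hint_size", 0)
        # Opening the metadata re-reads the field into the in-core copy.
        self.policy_hint_size = self.superblock["policy_hint_size"]

    def load_policy(self, hint_size):
        # Loading a policy only updates the in-core value.
        self.policy_hint_size = hint_size

    def commit(self, save_hint_size=True):
        # save_hint_size=False models the behaviour before the fix.
        if save_hint_size:
            self.superblock["policy_hint_size"] = self.policy_hint_size

    def hints_usable(self, policy_hint_size):
        # Simplified assumption: hints are usable only if sizes match.
        return self.policy_hint_size == policy_hint_size


def reopen_after_commit(save_hint_size):
    superblock = {}
    cmd = CacheMetadata(superblock)
    cmd.load_policy(hint_size=4)
    cmd.commit(save_hint_size=save_hint_size)
    reopened = CacheMetadata(superblock)  # re-activate the same cache
    return reopened.hints_usable(4)
```

In this model `reopen_after_commit(False)` reports the hints as unusable because the reopened instance reads back 0, while `reopen_after_commit(True)` finds the stored size and accepts them.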
    • dm thin: stop no_space_timeout worker when switching to write-mode · 75294442
      Hou Tao authored
      Now both check_for_space() and do_no_space_timeout() will read & write
      pool->pf.error_if_no_space.  If these functions run concurrently, as
      shown in the following case, the default setting of "queue_if_no_space"
      can get lost.
      
      precondition:
          * error_if_no_space = false (aka "queue_if_no_space")
          * pool is in Out-of-Data-Space (OODS) mode
          * no_space_timeout worker has been queued
      
      CPU 0:                          CPU 1:
      // delete a thin device
      process_delete_mesg()
      // check_for_space() invoked by commit()
      set_pool_mode(pool, PM_WRITE)
          pool->pf.error_if_no_space = \
           pt->requested_pf.error_if_no_space
      
      				// timeout, pool is still in OODS mode
      				do_no_space_timeout
      				    // "queue_if_no_space" config is lost
      				    pool->pf.error_if_no_space = true
          pool->pf.mode = new_mode
      
      Fix it by stopping the no_space_timeout worker when switching to write mode.
      
      Fixes: bcc696fa ("dm thin: stay in out-of-data-space mode once no_space_timeout expires")
      Cc: stable@vger.kernel.org
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
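
The shape of the fix (stop the delayed worker before restoring the requested setting and changing mode) can be sketched in user space. This is an illustration under assumptions: `threading.Timer` stands in for the delayed work item, `cancel()` followed by `join()` approximates a synchronous cancel, and the class `Pool` with its methods and string mode names is hypothetical, not the dm-thin code.

```python
import threading


class Pool:
    """Toy model of a pool with a no-space timeout worker."""

    def __init__(self, requested_error_if_no_space=False):
        self.requested_error_if_no_space = requested_error_if_no_space
        self.error_if_no_space = requested_error_if_no_space
        self.mode = "write"
        self._timer = None

    def enter_out_of_data_space(self, timeout):
        self.mode = "out_of_data_space"
        self._timer = threading.Timer(timeout, self._no_space_timeout)
        self._timer.start()

    def _no_space_timeout(self):
        # Fires if the pool stayed out of space for the whole timeout.
        if self.mode == "out_of_data_space" and not self.error_if_no_space:
            self.error_if_no_space = True

    def switch_to_write(self):
        # Stop the worker first so it cannot overwrite the setting
        # while the mode switch is in progress.
        if self._timer is not None:
            self._timer.cancel()
            self._timer.join()  # wait in case the callback already started
            self._timer = None
        self.error_if_no_space = self.requested_error_if_no_space
        self.mode = "write"
```

Because the timer is cancelled and joined before `error_if_no_space` is restored, the "queue_if_no_space" setting (False here) survives the switch back to write mode regardless of when the timeout would have fired.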
  4. 31 Jul, 2018 1 commit
  5. 30 Jul, 2018 1 commit
    • dm thin: include metadata_low_watermark threshold in pool status · 63c8ecb6
      Andy Grover authored
      The metadata low watermark threshold is set by the kernel.  But the
      kernel depends on userspace to extend the thinpool metadata device when
      the threshold is crossed.
      
      Since the metadata low watermark threshold is not visible to userspace,
      upon receiving an event, userspace cannot tell that the kernel wants the
      metadata device extended, instead of some other eventing condition.
      Making it visible (but not settable) enables userspace to affirmatively
      know the kernel is asking for a metadata device extension, by comparing
      metadata_low_watermark against nr_free_blocks_metadata, also reported in
      status.
      
      Current solutions like dmeventd have their own thresholds for extending
      the data and metadata devices, and both devices are checked against
      their thresholds on each event.  This lessens the value of the kernel-set
      threshold, since userspace will either extend the metadata device sooner,
      when receiving another event; or will receive the metadata lowater event
      and do nothing, if dmeventd's threshold is less than the kernel's.
      (This second case is dangerous. The metadata lowater event will not be
      re-sent, so no further event will be generated before the metadata
      device is out of space, unless some other event causes userspace to
      recheck its thresholds.)
      Signed-off-by: Andy Grover <agrover@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
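
The comparison that userspace can make once the watermark is visible might look like the following. This is a minimal sketch with assumptions stated plainly: the function name is hypothetical, the two values are taken as already extracted from pool status (no status-line parsing is shown), and treating "at or below the watermark" as the trigger is an assumption rather than something the message specifies.

```python
def kernel_wants_metadata_extension(nr_free_blocks_metadata,
                                    metadata_low_watermark):
    """Return True if free metadata blocks have reached the watermark.

    Assumption: the threshold counts as crossed when the free block
    count is less than or equal to the watermark.
    """
    return nr_free_blocks_metadata <= metadata_low_watermark
```

A monitoring daemon could call this on each event and extend the metadata device when it returns True, instead of relying only on its own, possibly lower, threshold.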
  6. 27 Jul, 2018 16 commits
  7. 22 Jul, 2018 8 commits
  8. 21 Jul, 2018 9 commits