1. 01 Nov, 2017 10 commits
  2. 31 Oct, 2017 1 commit
  3. 30 Oct, 2017 6 commits
    • bcache: explicitly destroy mutex while exiting · 330a4db8
      Liang Chen authored
      mutex_destroy() does nothing most of the time, but it is better to call
      it to make the code future proof, and it also has some meaning
      for things like mutex debugging.
      
      As Coly pointed out in a previous review, bcache_exit() may not be
      able to handle all the references properly if userspace registers
      cache and backing devices right before bch_debug_init() runs and
      bch_debug_init() fails later. So do not expose the userspace interface
      until everything is ready, to avoid that issue.
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Michael Lyle <mlyle@lyle.org>
      Reviewed-by: Coly Li <colyli@suse.de>
      Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix wrong cache_misses statistics · c1573137
      tang.junhui authored
      Currently, cache-missed IOs are identified by s->cache_miss, but there
      are many situations in which a missed IO is never assigned a value for
      s->cache_miss in cached_dev_cache_miss(), for example a bypassed IO
      (s->iop.bypass = 1), or a failed cache_bio allocation. In these situations,
      the code goes to out_put or out_submit with s->cache_miss still NULL, which
      leads bch_mark_cache_accounting() to treat the IO as a hit.
      
      [ML: applied by 3-way merge]
      Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
      Reviewed-by: Michael Lyle <mlyle@lyle.org>
      Reviewed-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
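
The accounting problem described in this commit message can be illustrated with a small userspace sketch. Everything here is hypothetical: `struct sketch_search`, the `cache_missed` flag, and the helper names are stand-ins chosen for illustration, not the kernel's actual definitions, and the flag-based check is only one plausible way to record a miss on every path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for bcache's per-request search state. */
struct sketch_search {
    void *cache_miss;    /* only set when a miss bio is actually issued */
    bool  bypass;        /* request bypasses the cache */
    bool  cache_missed;  /* assumed extra flag: set on every miss path */
};

/* Pointer-based test from the commit message: a NULL pointer counts as a hit. */
static bool sketch_is_hit_by_pointer(const struct sketch_search *s)
{
    assert(s != NULL);
    return s->cache_miss == NULL;
}

/* Flag-based test: the IO is a hit only if no miss path was ever entered. */
static bool sketch_is_hit_by_flag(const struct sketch_search *s)
{
    assert(s != NULL);
    return !s->cache_missed;
}

/* Models the miss handler for a bypassed request: no miss bio is set up. */
static void sketch_miss_bypassed(struct sketch_search *s)
{
    assert(s != NULL);
    s->cache_missed = true;  /* record the miss before any early exit */
    s->bypass = true;
    /* early exit: s->cache_miss stays NULL */
}
```

With a bypassed request, the pointer-based check reports a hit while the flag-based check reports a miss, which is the discrepancy the commit message describes.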
    • bcache: update bucket_in_use in real time · d44c2f9e
      Tang Junhui authored
      bucket_in_use is updated in the gc thread, which is triggered by
      invalidating or writing sectors_to_gc dirty data. That is a long interval.
      Therefore, when we compare it with the threshold, the value is often stale,
      which leads to inaccurate judgments and often results in bucket depletion.
      
      We sent a patch before that updated bucket_in_use periodically in the
      gc thread, which Coly thought would lead to high latency. In this patch,
      we add avail_nbuckets to record the count of available buckets, and we
      calculate bucket_in_use in real time whenever a bucket is allocated or
      freed.
      
      [edited by ML: eliminated some whitespace errors]
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Signed-off-by: Michael Lyle <mlyle@lyle.org>
      Reviewed-by: Michael Lyle <mlyle@lyle.org>
      Reviewed-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
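
A minimal userspace sketch of the idea in this commit message: keep a running count of available buckets and recompute the in-use percentage on every allocation and free. The struct and function names below are hypothetical, and the percentage formula is an assumption made for illustration; only `avail_nbuckets` and `bucket_in_use` come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the cache set's bucket counters. */
struct sketch_cache_set {
    size_t       nbuckets;        /* total number of buckets */
    size_t       avail_nbuckets;  /* buckets currently available */
    unsigned int in_use;          /* percentage of buckets in use, 0..100 */
};

/* Recompute the in-use percentage from the running counters. */
static void sketch_update_in_use(struct sketch_cache_set *c)
{
    assert(c->nbuckets > 0 && c->avail_nbuckets <= c->nbuckets);
    c->in_use = (unsigned int)((c->nbuckets - c->avail_nbuckets) * 100 /
                               c->nbuckets);
}

/* Allocation path: one fewer available bucket, then refresh the statistic. */
static void sketch_bucket_alloc(struct sketch_cache_set *c)
{
    if (c->avail_nbuckets > 0) {
        c->avail_nbuckets--;
        sketch_update_in_use(c);
    }
}

/* Free path: one more available bucket, then refresh the statistic. */
static void sketch_bucket_free(struct sketch_cache_set *c)
{
    if (c->avail_nbuckets < c->nbuckets) {
        c->avail_nbuckets++;
        sketch_update_in_use(c);
    }
}
```

Because the update is a constant-time computation on each alloc or free, the statistic stays current without waiting for a gc pass.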
    • bcache: convert cached_dev.count from atomic_t to refcount_t · 3b304d24
      Elena Reshetova authored
      atomic_t variables are currently used to implement reference
      counters with the following properties:
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situations and be exploitable.
      
      The variable cached_dev.count is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      Suggested-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: David Windsor <dwindsor@gmail.com>
      Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
      Reviewed-by: Michael Lyle <mlyle@lyle.org>
      Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
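
The properties listed in this commit message (no increment from zero, protection against overflow and underflow) can be sketched in userspace with C11 atomics. This is not the kernel's refcount_t implementation: the `sketch_refcount_*` names and the choice to saturate at `UINT_MAX` are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogue of a saturating reference counter. */
typedef struct { atomic_uint refs; } sketch_refcount_t;

static void sketch_refcount_set(sketch_refcount_t *r, unsigned int n)
{
    assert(r != NULL);
    atomic_store(&r->refs, n);
}

/* Take a reference unless the counter already reached zero. */
static bool sketch_refcount_inc_not_zero(sketch_refcount_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == 0)
            return false;   /* object is already being freed */
        if (old == UINT_MAX)
            return true;    /* saturated: stay pinned instead of wrapping */
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));
    return true;
}

/* Drop a reference; returns true when the caller must free the object. */
static bool sketch_refcount_dec_and_test(sketch_refcount_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == 0 || old == UINT_MAX)
            return false;   /* refuse to underflow or to leave saturation */
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));
    return old == 1;        /* the previous value was the last reference */
}
```

A plain atomic counter would wrap around in both error cases, while this sketch leaks the object at worst, which is the trade-off the commit message argues for.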
    • bcache: only permit to recovery read error when cache device is clean · d59b2379
      Coly Li authored
      When bcache does read I/Os, for example in writeback or writethrough mode,
      if a read request on the cache device fails, bcache will try to recover
      the request by reading from the cached device. If the data on the cached
      device is not synced with the cache device, then the requester will get
      stale data.
      
      For critical storage systems like databases, providing stale data from
      recovery may result in application-level data corruption, which is
      unacceptable.
      
      With this patch, for a failed read request in writeback or writethrough
      mode, recovery of a recoverable read request only happens when the cache
      device is clean. That is to say, all data on the cached device is up to date.
      
      For other cache modes in bcache, read requests will never hit
      cached_dev_read_error(), so they don't need this patch.
      
      Please note, because the cache mode can be switched arbitrarily at run time,
      a writethrough mode might have been switched from a writeback mode. Therefore
      checking dc->has_data in writethrough mode still makes sense.
      
      Changelog:
      v4: Fix parens error pointed out by Michael Lyle.
      v3: In his response, Kent Overstreet considers recovering stale data a
          bug to fix, and an option to permit it unnecessary. So in this
          version the sysfs file is removed.
      v2: rename sysfs entry from allow_stale_data_on_failure  to
          allow_stale_data_on_failure, and fix the confusing commit log.
      v1: initial patch posted.
      
      [small change to patch comment spelling by mlyle]
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Michael Lyle <mlyle@lyle.org>
      Reported-by: Arne Wolf <awolf@lenovo.com>
      Reviewed-by: Michael Lyle <mlyle@lyle.org>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Nix <nix@esperi.org.uk>
      Cc: Kai Krakow <hurikhan77@gmail.com>
      Cc: Eric Wheeler <bcache@lists.ewheeler.net>
      Cc: Junhui Tang <tang.junhui@zte.com.cn>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: Fix a race between blk_cleanup_queue() and timeout handling · 4e9b6f20
      Bart Van Assche authored
      Make sure that, if the timeout timer fires after a queue has been
      marked "dying", the affected requests are finished.
      Reported-by: chenxiang (M) <chenxiang66@hisilicon.com>
      Fixes: commit 287922eb ("block: defer timeouts to a workqueue")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Tested-by: chenxiang (M) <chenxiang66@hisilicon.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 25 Oct, 2017 7 commits
  5. 24 Oct, 2017 1 commit
  6. 17 Oct, 2017 2 commits
    • kyber: fix hang on domain token wait queue · 8cf46660
      Omar Sandoval authored
      When we're getting a domain token, if we fail to get a token on our
      first attempt, we put the current hardware queue on a wait queue and
      then try again just in case a token was freed after our initial attempt
      but before we got on the wait queue. If this second attempt succeeds, we
      currently leave the hardware queue on the wait queue. Usually this is
      okay; we'll just run the hardware queue one extra time when another
      token is freed. However, if the hardware queue doesn't have any other
      requests waiting, then when it gets the extra wakeup, it won't have
      anything to free and therefore won't wake up any other hardware queues.
      If tokens are limited, then we won't make forward progress and the
      device will hang.
      Reported-by: Bin Zha <zhabin.zb@alibaba-inc.com>
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nullb: fix error return code in null_init() · 30c516d7
      Wei Yongjun authored
      Fix to return error code -ENOMEM from the null_alloc_dev() error
      handling case instead of 0, as done elsewhere in this function.
      
      Fixes: 2984c868 ("nullb: factor disk parameters")
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
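
The class of bug fixed here, an allocation failure that jumps to a shared error label without setting the return code, can be sketched in plain C. The `sketch_*` names and the `fail_alloc` switch are hypothetical and exist only to exercise the failure path; this is not the null_blk driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_dev { int id; };

/* Hypothetical allocator; fail_alloc forces the failure path for the demo. */
static struct sketch_dev *sketch_alloc_dev(int fail_alloc)
{
    return fail_alloc ? NULL : calloc(1, sizeof(struct sketch_dev));
}

static int sketch_init(int fail_alloc)
{
    int ret = 0;
    struct sketch_dev *dev;

    dev = sketch_alloc_dev(fail_alloc);
    if (!dev) {
        ret = -ENOMEM;  /* without this assignment the function returns 0 */
        goto err;
    }

    /* ... further setup would go here ... */
    free(dev);
    assert(ret == 0);   /* the success path must not carry an error code */
    return ret;

err:
    /* shared exit for every failure path */
    return ret;
}
```

Returning 0 from the error label would tell the caller that initialization succeeded even though nothing was allocated, so each failure branch sets `ret` before the jump.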
  7. 16 Oct, 2017 13 commits