1. 13 Apr, 2023 6 commits
    • blk-throttle: only enable blk-stat when BLK_DEV_THROTTLING_LOW · 8e15dfbd
      Chengming Zhou authored
      blk_throtl_register() unconditionally enables blk-stat for the gendisk
      at registration, even when BLK_DEV_THROTTLING_LOW is not configured.
      
      Since kernels typically have only the BLK_DEV_THROTTLING config and the
      BLK_DEV_THROTTLING_LOW config is still in the EXPERIMENTAL state, we can
      just skip blk-stat when !BLK_DEV_THROTTLING_LOW.
      Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Link: https://lore.kernel.org/r/20230413062805.2081970-2-chengming.zhou@linux.dev
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blk-stat: fix QUEUE_FLAG_STATS clear · 20de765f
      Chengming Zhou authored
      We need to set QUEUE_FLAG_STATS for two cases:
      1. blk_stat_enable_accounting()
      2. blk_stat_add_callback()
      
      So we should clear it only when ((q->stats->accounting == 0) &&
      list_empty(&q->stats->callbacks)).
      
      blk_stat_disable_accounting() only checks whether q->stats->accounting
      is 0 before clearing the flag; this patch fixes that.
      
      Also add a list_empty(&q->stats->callbacks) check when enabling,
      because otherwise the flag is already set.
      
      The bug can be reproduced on a kernel without BLK_DEV_THROTTLING
      (since it unconditionally enables accounting; see the next patch).
      
        # cat /sys/block/sr0/queue/scheduler
        none mq-deadline [bfq]
      
        # cat /sys/kernel/debug/block/sr0/state
        SAME_COMP|IO_STAT|INIT_DONE|STATS|REGISTERED|NOWAIT|30
      
        # echo none > /sys/block/sr0/queue/scheduler
      
        # cat /sys/kernel/debug/block/sr0/state
        SAME_COMP|IO_STAT|INIT_DONE|REGISTERED|NOWAIT
      
        # cat /sys/block/sr0/queue/wbt_lat_usec
        75000
      
      We can see that after changing the elevator from "bfq" to "none", the
      "STATS" flag is lost even though the WBT callback still needs it.
      
      Fixes: 68497092 ("block: make queue stat accounting a reference")
      Cc: <stable@vger.kernel.org> # v5.17+
      Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Link: https://lore.kernel.org/r/20230413062805.2081970-1-chengming.zhou@linux.dev
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
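The clearing rule described in the blk-stat fix above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct stats_model` and the `model_*` functions are hypothetical stand-ins for the kernel's `q->stats` state and for blk_stat_enable_accounting(), blk_stat_disable_accounting() and the callback add/remove paths, and a plain counter replaces the callbacks list. Locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the per-queue stats state; not the kernel's structures. */
struct stats_model {
    int accounting;    /* references taken by model_enable_accounting() */
    int nr_callbacks;  /* stands in for the q->stats->callbacks list */
    bool flag_stats;   /* stands in for QUEUE_FLAG_STATS */
};

static void model_enable_accounting(struct stats_model *s)
{
    /* Set the flag only on the first reference and only when no
     * callback is registered; otherwise it is already set. */
    if (s->accounting++ == 0 && s->nr_callbacks == 0)
        s->flag_stats = true;
}

static void model_disable_accounting(struct stats_model *s)
{
    assert(s->accounting > 0);
    /* Clear only when both users of the flag are gone. */
    if (--s->accounting == 0 && s->nr_callbacks == 0)
        s->flag_stats = false;
}

static void model_add_callback(struct stats_model *s)
{
    s->nr_callbacks++;
    s->flag_stats = true;
}

static void model_remove_callback(struct stats_model *s)
{
    assert(s->nr_callbacks > 0);
    /* Same rule from the other side: keep the flag while accounting is on. */
    if (--s->nr_callbacks == 0 && s->accounting == 0)
        s->flag_stats = false;
}
```

In this model, adding a callback (as WBT does), then enabling and disabling accounting (as switching away from bfq does), leaves `flag_stats` set, which is the behavior the reproduction above shows was missing before the fix.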
    • blk-iolatency: Make initialization lazy · a13696b8
      Tejun Heo authored
      Other rq_qos policies such as wbt and iocost are lazy-initialized when they
      are configured for the first time for the device but iolatency is
      initialized unconditionally from blkcg_init_disk() during gendisk init. Lazy
      init is beneficial because rq_qos policies add runtime overhead when
      initialized as every IO has to walk all registered rq_qos callbacks.
      
      This patch switches iolatency to lazy initialization too so that it only
      registers its rq_qos policy when it is first configured.
      
      Note that there is a known race condition between blkcg config file writes
      and del_gendisk() and this patch makes iolatency susceptible to it by
      exposing the init path to race against the deletion path. However, that
      problem already exists in iocost and is being worked on.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Link: https://lore.kernel.org/r/20230413000649.115785-5-tj@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
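The motivation for lazy initialization in the commit above (every IO walks all registered rq_qos callbacks, so an unused policy should not be registered) can be sketched as a toy model. Everything here is an assumption made for illustration: `struct policy_model`, `struct disk_model`, `model_submit_io()` and `model_iolat_configure()` are hypothetical names, not kernel APIs, and the known race against del_gendisk() is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* One registered policy in the toy chain (stands in for a struct rq_qos). */
struct policy_model {
    const char *name;
    struct policy_model *next;
};

/* Toy disk: a chain of registered policies plus a lazily registered one. */
struct disk_model {
    struct policy_model *chain;
    struct policy_model iolat;
    bool iolat_registered;
};

/* Every IO walks the whole chain; return how many callbacks it visited. */
static int model_submit_io(const struct disk_model *d)
{
    int visited = 0;

    for (const struct policy_model *p = d->chain; p != NULL; p = p->next)
        visited++;
    return visited;
}

/* The first configuration registers the policy; later calls find it present. */
static void model_iolat_configure(struct disk_model *d)
{
    if (!d->iolat_registered) {
        d->iolat.name = "iolatency";
        d->iolat.next = d->chain;
        d->chain = &d->iolat;
        d->iolat_registered = true;
    }
}
```

A zero-initialized `struct disk_model` pays no per-IO cost for the policy until `model_iolat_configure()` is called for the first time, and repeated configuration does not register it twice.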
    • blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/ · 33049187
      Tejun Heo authored
      The name was too generic given that there are multiple blkcg rq-qos
      policies.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Link: https://lore.kernel.org/r/20230413000649.115785-4-tj@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: Restructure blkg_conf_prep() and friends · faffaab2
      Tejun Heo authored
      We want to support lazy init of rq-qos policies so that iolatency is enabled
      lazily on configuration instead of gendisk initialization. The way blkg
      config helpers are structured now is a bit awkward for that. Let's
      restructure:
      
      * blkcg_conf_open_bdev() is renamed to blkg_conf_open_bdev(). The blkcg_
        prefix was used because the bdev opening step is blkg-independent.
        However, the distinction is too subtle and confuses more than helps. Let's
        switch to blkg prefix so that it's consistent with the type and other
        helper names.
      
      * struct blkg_conf_ctx now remembers the original input string and is always
        initialized by the new blkg_conf_init().
      
      * blkg_conf_open_bdev() is updated to take a pointer to blkg_conf_ctx like
        blkg_conf_prep() and can be called multiple times safely. Instead of
        modifying the double pointer to input string directly,
        blkg_conf_open_bdev() now sets blkg_conf_ctx->body.
      
      * blkg_conf_finish() is renamed to blkg_conf_exit() for symmetry and now
        must be called on all blkg_conf_ctx's which were initialized with
        blkg_conf_init().
      
      Combined, this allows users to either open the bdev first or do it
      all together in blkg_conf_prep(), which will help implement lazy init of
      rq-qos policies.
      
      blkg_conf_init/exit() will also be used to implement synchronization
      against device removal. This is necessary because iolat / iocost are
      configured through cgroupfs instead of one of the files under
      /sys/block/DEVICE. As cgroupfs operations aren't synchronized with the
      block layer, the lazy init and other configuration operations may race
      against device removal. This patch makes blkg_conf_init/exit() used
      consistently for all cgroup-originating configurations, making them a
      good place to implement explicit synchronization.
      
      Users are updated accordingly. No behavior change is intended by this patch.
      
      v2: bfq wasn't updated in v1 causing a build error. Fixed.
      
      v3: Update the description to include future use of blkg_conf_init/exit() as
          synchronization points.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Yu Kuai <yukuai1@huaweicloud.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20230413000649.115785-3-tj@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
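The init / open / prep / exit lifecycle described in the restructuring commit above can be sketched in user space. This is a toy model, not the kernel implementation: `struct conf_ctx_model` and the `model_conf_*` functions are hypothetical stand-ins for struct blkg_conf_ctx, blkg_conf_init(), blkg_conf_open_bdev(), blkg_conf_prep() and blkg_conf_exit(), and the "MAJ:MIN body" input format is assumed for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for struct blkg_conf_ctx; the fields are illustrative. */
struct conf_ctx_model {
    const char *input;  /* original input string, remembered by init */
    const char *body;   /* rest of the input after the device number */
    bool bdev_open;     /* whether the "device" is currently open */
    int opens;          /* number of real opens performed */
};

static void model_conf_init(struct conf_ctx_model *ctx, const char *input)
{
    memset(ctx, 0, sizeof(*ctx));
    ctx->input = input;
}

/* Parse "MAJ:MIN body" and set ctx->body; safe to call more than once. */
static int model_conf_open_bdev(struct conf_ctx_model *ctx)
{
    unsigned int major, minor;
    int consumed = 0;

    if (ctx->bdev_open)
        return 0;
    if (sscanf(ctx->input, "%u:%u%n", &major, &minor, &consumed) != 2)
        return -1;
    ctx->body = ctx->input + consumed;
    while (*ctx->body == ' ')
        ctx->body++;
    ctx->bdev_open = true;
    ctx->opens++;
    return 0;
}

/* Opens the device itself when the caller has not already done so. */
static int model_conf_prep(struct conf_ctx_model *ctx)
{
    return model_conf_open_bdev(ctx);
}

/* Must be called on every ctx that went through model_conf_init(). */
static void model_conf_exit(struct conf_ctx_model *ctx)
{
    ctx->bdev_open = false;
    ctx->body = NULL;
}
```

A caller that needs the device early (for example to lazily initialize a policy) can call `model_conf_open_bdev()` before `model_conf_prep()`; a caller that does not can call `model_conf_prep()` alone. Either way, `model_conf_exit()` runs on every path, including error paths, which is what makes the init/exit pair a natural place for later synchronization.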
    • blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish() · 83462a6c
      Tejun Heo authored
      Now that all RCU flavors have been combined, holding a spin lock,
      disabling irq, or disabling preemption each implies an RCU read lock, so
      there's no need to use rcu_read_[un]lock() explicitly while holding
      queue_lock. This shouldn't cause any behavior changes.
      
      v2: Description updated. Leave __acquires/release on queue_lock alone.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20230413000649.115785-2-tj@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 12 Apr, 2023 7 commits
  3. 06 Apr, 2023 3 commits
  4. 05 Apr, 2023 5 commits
  5. 03 Apr, 2023 5 commits
  6. 02 Apr, 2023 9 commits
  7. 27 Mar, 2023 2 commits
  8. 20 Mar, 2023 1 commit
  9. 16 Mar, 2023 2 commits