1. 19 May, 2014 4 commits
  2. 10 May, 2014 5 commits
  3. 09 May, 2014 6 commits
    • block: only calculate part_in_flight() once · 7276d02e
      Jens Axboe authored
      We first check if we have inflight IO, then retrieve that
      same number again. Usually this isn't that costly since the
      chance of having the data dirtied in between is small, but
      there's no reason for calling part_in_flight() twice.
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • blk-mq: fix race in IO start accounting · cf4b50af
      Jens Axboe authored
      Commit c6d600c6 opened up a small race where we could attempt to
      account IO completion on a request, racing with IO start accounting.
      Fix this up by ensuring that we've accounted for IO start before
      inserting the request.
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • blk-mq: use sparser tag layout for lower queue depth · 59d13bf5
      Jens Axboe authored
      For best performance, spreading tags over multiple cachelines
      makes the tagging more efficient on multicore systems. But since
      we have 8 * sizeof(unsigned long) tags per cacheline, we don't
      always get a nice spread.
      
      Attempt to spread the tags over at least 4 cachelines, using fewer
      bits per unsigned long if we have to. This improves
      tagging performance in setups with 32-128 tags. For higher depths,
      the spread is the same as before (BITS_PER_LONG tags per cacheline).
      Signed-off-by: Jens Axboe <axboe@fb.com>
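      The spreading rule described in this commit can be sketched as follows.
      This is a minimal illustration, not the code from the patch: the helper
      names tags_per_word() and words_needed() and the MIN_WORDS constant are
      assumptions made for this example, and the word size is passed in as a
      parameter so the arithmetic does not depend on the host. It only
      computes how many tags go into each word; placing each word in its own
      cacheline is left out.

```c
#include <assert.h>

#define MIN_WORDS 4u /* assumed target: spread tags over at least 4 words */

/*
 * Pick the number of tags stored per word. Start from a full word
 * (word_bits tags) and halve until 'depth' tags occupy at least
 * MIN_WORDS words. Very small depths keep the full word.
 */
static unsigned int tags_per_word(unsigned int depth, unsigned int word_bits)
{
    unsigned int per_word = word_bits;

    /* word_bits must be a non-zero power of two for the halving to work */
    assert(word_bits != 0 && (word_bits & (word_bits - 1)) == 0);

    if (depth >= MIN_WORDS) {
        while (per_word * MIN_WORDS > depth)
            per_word >>= 1;
    }
    return per_word;
}

/* Number of words needed to hold 'depth' tags with that layout. */
static unsigned int words_needed(unsigned int depth, unsigned int word_bits)
{
    unsigned int per_word = tags_per_word(depth, word_bits);

    return (depth + per_word - 1) / per_word;
}
```

      With 64-bit words, a depth of 32 gives 8 tags per word and a depth of
      128 gives 32 tags per word, both spread over 4 words. From a depth of
      256 upward the full 64 tags per word are used, matching the "same as
      before" case in the commit message.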
    • blk-mq: implement new and more efficient tagging scheme · 4bb659b1
      Jens Axboe authored
      blk-mq currently uses percpu_ida for tag allocation. But that only
      works well if the ratio between tag space and number of CPUs is
      sufficiently high. For most devices and systems, that is not the
      case. The end result is that we either only utilize the tag space
      partially, or we end up attempting to fully exhaust it and run
      into lots of lock contention with stealing between CPUs. This is
      not optimal.
      
      This new tagging scheme is a hybrid bitmap allocator. It uses
      two tricks to both be SMP friendly and allow full exhaustion
      of the space:
      
      1) We cache the last allocated (or freed) tag on a per blk-mq
         software context basis. This allows us to limit the space
         we have to search. The key element here is not caching it
         in the shared tag structure, otherwise we end up dirtying
         more shared cache lines on each allocate/free operation.
      
      2) The tag space is split into cache line sized groups, and
         each context will start off randomly in that space. Even up
         to full utilization of the space, this divides the tag users
         efficiently into cache line groups, avoiding dirtying the same
         one both between allocators and between allocator and freer.
      
      This scheme shows drastically better behaviour, both on small
      tag spaces and on large ones. It has been tested extensively
      to show better performance for all the cases blk-mq cares about.
      Signed-off-by: Jens Axboe <axboe@fb.com>
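      The two tricks above can be illustrated with a small user-space sketch.
      It is not the blk-mq implementation: the types tag_map and tag_ctx, the
      function names, the fixed sizes, and the caller-supplied seed standing
      in for the random starting point are all assumptions made for this
      example. It uses C11 atomics in place of the kernel's bit operations.

```c
#include <assert.h>
#include <stdatomic.h>

#define CACHE_LINE    64   /* assumed cache line size in bytes */
#define TAGS_PER_WORD 64u  /* assumed: 64 tags per group */
#define NR_WORDS      4u   /* assumed number of groups */
#define NR_TAGS       (TAGS_PER_WORD * NR_WORDS)

/* One group of tags, aligned so each group sits in its own cache line. */
struct tag_word {
    _Alignas(CACHE_LINE) atomic_ullong bits;
};

/* Shared tag space: one bit per tag, split into cache line sized groups. */
struct tag_map {
    struct tag_word words[NR_WORDS];
};

/* Per-context cache, kept outside the shared map so updating it does
 * not dirty a shared cache line. */
struct tag_ctx {
    unsigned int last_tag;
};

static void tag_map_init(struct tag_map *map)
{
    for (unsigned int i = 0; i < NR_WORDS; i++)
        atomic_init(&map->words[i].bits, 0ULL);
}

/* Start each context at the beginning of a group chosen by 'seed'
 * (a stand-in for a random choice). */
static void tag_ctx_init(struct tag_ctx *ctx, unsigned int seed)
{
    ctx->last_tag = (seed % NR_WORDS) * TAGS_PER_WORD;
}

/* Returns a free tag, or -1 if the whole space is exhausted. */
static int tag_alloc(struct tag_map *map, struct tag_ctx *ctx)
{
    unsigned int start = ctx->last_tag % NR_TAGS;

    for (unsigned int i = 0; i < NR_TAGS; i++) {
        unsigned int tag = (start + i) % NR_TAGS;
        unsigned long long mask = 1ULL << (tag % TAGS_PER_WORD);
        atomic_ullong *word = &map->words[tag / TAGS_PER_WORD].bits;

        /* Cheap read first, so busy tags do not cost a write. */
        if (atomic_load(word) & mask)
            continue;
        /* Atomic test-and-set: we own the tag only if the bit was clear. */
        if (!(atomic_fetch_or(word, mask) & mask)) {
            ctx->last_tag = tag;
            return (int)tag;
        }
    }
    return -1;
}

static void tag_free(struct tag_map *map, struct tag_ctx *ctx, unsigned int tag)
{
    assert(tag < NR_TAGS);
    atomic_fetch_and(&map->words[tag / TAGS_PER_WORD].bits,
                     ~(1ULL << (tag % TAGS_PER_WORD)));
    ctx->last_tag = tag; /* the next allocation starts its search here */
}
```

      Two contexts seeded into different groups allocate from different
      cache lines until their groups fill up, yet either one can still walk
      the whole space, so full exhaustion remains possible.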
    • blk-mq: initialize struct request fields individually · af76e555
      Christoph Hellwig authored
      This allows us to avoid a non-atomic memset over ->atomic_flags as well
      as killing lots of duplicate initializations.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • blk-mq: update a hotplug comment for grammar · 9fccfed8
      Jens Axboe authored
      Signed-off-by: Jens Axboe <axboe@fb.com>
  4. 07 May, 2014 1 commit
  5. 02 May, 2014 3 commits
  6. 30 Apr, 2014 4 commits
  7. 28 Apr, 2014 1 commit
  8. 25 Apr, 2014 2 commits
    • block: fold __blk_add_timer into blk_add_timer · c4a634f4
      Christoph Hellwig authored
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • blk-mq: respect rq_affinity · 38535201
      Christoph Hellwig authored
      The blk-mq code is using its own version of the I/O completion affinity
      tunables, which causes a few issues:
      
       - the rq_affinity sysfs file doesn't work for blk-mq devices, even if it
         still is present, thus breaking existing tuning setups.
       - the rq_affinity = 1 mode, which is the default for legacy request based
         drivers, isn't implemented at all.
       - blk-mq drivers don't implement any completion affinity with the default
         flag settings.
      
      This patch removes the blk-mq ipi_redirect flag and sysfs file, as well
      as the internal BLK_MQ_F_SHOULD_IPI flag, and replaces them with code that
      respects the queue-wide rq_affinity flags and also implements the
      rq_affinity = 1 mode.
      
      This means I/O completion affinity can now only be tuned block-queue wide
      instead of per context, which seems more sensible to me anyway.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  9. 24 Apr, 2014 2 commits
    • blk-mq: fix race with timeouts and requeue events · 87ee7b11
      Jens Axboe authored
      If a requeue event races with a timeout, we can get into the
      situation where we attempt to complete a request from the
      timeout handler when it's no longer started. This causes a crash.
      So have the timeout handler check that REQ_ATOM_STARTED is still
      set on the request - if not, we ignore the event. If this happens,
      the request has now been marked as complete. As a consequence, we
      need to make sure to clear REQ_ATOM_COMPLETE in blk_mq_start_request(),
      so as to maintain proper request state.
      Signed-off-by: Jens Axboe <axboe@fb.com>
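      The state handling described in this commit can be modelled in user
      space as below. This is a simplified sketch, not the kernel code: the
      struct layout, the bit values, the helper names, and the assumption
      that the timeout path claims the request by setting the COMPLETE bit
      before checking STARTED are all choices made for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers in atomic_flags (values assumed for this sketch). */
enum { REQ_ATOM_COMPLETE = 0, REQ_ATOM_STARTED = 1 };

struct request {
    atomic_ulong atomic_flags;
};

static bool test_flag(struct request *rq, int bit)
{
    return atomic_load(&rq->atomic_flags) & (1UL << bit);
}

/* Sets the bit and reports whether it was already set. */
static bool test_and_set_flag(struct request *rq, int bit)
{
    unsigned long mask = 1UL << bit;

    return atomic_fetch_or(&rq->atomic_flags, mask) & mask;
}

static void clear_flag(struct request *rq, int bit)
{
    atomic_fetch_and(&rq->atomic_flags, ~(1UL << bit));
}

/* Starting a request also clears a COMPLETE bit that an ignored
 * timeout may have left behind. */
static void start_request(struct request *rq)
{
    test_and_set_flag(rq, REQ_ATOM_STARTED);
    clear_flag(rq, REQ_ATOM_COMPLETE);
}

/* Requeueing takes the request out of the started state. */
static void requeue_request(struct request *rq)
{
    assert(test_flag(rq, REQ_ATOM_STARTED));
    clear_flag(rq, REQ_ATOM_STARTED);
}

/* Returns true if the timeout should be handled for this request. */
static bool timeout_handler(struct request *rq)
{
    /* Claim the request; if it is already marked complete, back off. */
    if (test_and_set_flag(rq, REQ_ATOM_COMPLETE))
        return false;
    /* A requeue won the race: ignore the event. COMPLETE stays set
     * until the request is started again. */
    if (!test_flag(rq, REQ_ATOM_STARTED))
        return false;
    return true;
}
```

      In this model a timeout that fires after a requeue is ignored, and the
      next start_request() call restores a clean state so that a later
      timeout on the restarted request is handled normally.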
    • Revert "blk-mq: initialize req->q in allocation" · 70ab0b2d
      Jens Axboe authored
      This reverts commit 6a3c8a3a.
      
      We need selective clearing of the request to make the init-at-free
      time completely safe. Otherwise we end up stomping on
      rq->atomic_flags, which we don't want to do.
  10. 23 Apr, 2014 1 commit
  11. 22 Apr, 2014 3 commits
  12. 21 Apr, 2014 5 commits
  13. 17 Apr, 2014 1 commit
  14. 16 Apr, 2014 2 commits
    • Jens Axboe's avatar
      fb1be433
    • block: relax when to modify the timeout timer · f793aa53
      Jens Axboe authored
      Since we are now, by default, applying timer slack to expiry times,
      the logic for when to modify a timer in the block code is suboptimal.
      The block layer keeps a forward rolling timer per queue for all
      requests, and modifies this timer if a request has a shorter timeout
      than what the current expiry time is. However, this breaks down
      when slack is applied to our rounded timer values. Then each new
      request ends up modifying the timer, since we're still a little
      in front of the timer + slack.
      
      Fix this by allowing a tolerance of HZ / 2, since the timeout handling
      doesn't need to be very precise. This drastically cuts down
      the number of timer modifications we have to make.
      Signed-off-by: Jens Axboe <axboe@fb.com>
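      The tolerance check can be sketched in user space as follows. This is
      an illustration under stated assumptions, not the patch itself: the
      function should_modify_timer() is a hypothetical name, HZ is fixed at
      1000 for the example, and time_before() is reimplemented locally as a
      wraparound-safe comparison in the style of the kernel macro.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 1000UL /* assumed tick rate for this sketch */

/* Wraparound-safe "a is earlier than b" for tick counters. */
static bool time_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/*
 * Decide whether the per-queue timer should be re-armed for a request
 * expiring at new_expiry. An idle timer is always armed. A pending
 * timer is only pulled forward if the new expiry is earlier by at
 * least HZ / 2 ticks.
 */
static bool should_modify_timer(bool pending, unsigned long current_expiry,
                                unsigned long new_expiry)
{
    if (!pending)
        return true;
    if (!time_before(new_expiry, current_expiry))
        return false;
    return current_expiry - new_expiry >= HZ / 2;
}

/* Usage example: only a large enough difference re-arms the timer. */
void timer_tolerance_example(void)
{
    assert(should_modify_timer(false, 0, 0));          /* timer idle */
    assert(!should_modify_timer(true, 10000, 9900));   /* 100 ticks earlier */
    assert(should_modify_timer(true, 10000, 9400));    /* 600 ticks earlier */
}
```

      With these numbers, a request that would expire only 100 ticks before
      the pending timer leaves the timer alone, which is the case that slack
      used to turn into a timer modification on every new request.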