1. 23 Aug, 2010 10 commits
    • cfq-iosched: blktrace print per slice sector stats · c4e7893e
      Vivek Goyal authored
      o Divyesh had gotten rid of this code in the past. I want to re-introduce it
        back as it helps me a lot during debugging.
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Reviewed-by: Divyesh Shah <dpshah@google.com>
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • cfq-iosched: Implement tunable group_idle · 80bdf0c7
      Vivek Goyal authored
      o Implement a new tunable group_idle, which allows idling on the group
        instead of a cfq queue. Hence one can set slice_idle = 0 and not idle
        on the individual queues but idle on the group. This way, on fast storage,
        we get fairness between groups while overall throughput also improves.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
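
The configuration described in this commit (slice_idle = 0 with group idling left on) could be applied from user space roughly as follows. This is a minimal sketch, assuming the CFQ tunables are exposed under `/sys/block/<dev>/queue/iosched/`; the helper names `cfq_tunable_path` and `cfq_write_tunable` and the device name are hypothetical and not part of the patch.

```c
#include <stdio.h>

/*
 * Build the sysfs path of a CFQ tunable for one block device.
 * Assumption: tunables live under /sys/block/<dev>/queue/iosched/.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int cfq_tunable_path(char *buf, size_t len, const char *dev, const char *tunable)
{
    int n = snprintf(buf, len, "/sys/block/%s/queue/iosched/%s", dev, tunable);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write an unsigned value to a tunable file. Returns 0 on success, -1 on error. */
int cfq_write_tunable(const char *path, unsigned int value)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fprintf(f, "%u\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would build the path for `slice_idle` on a device such as `sda` (an example name) and write 0 to it, leaving `group_idle` at a non-zero value so that idling still happens per group.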
    • cfq-iosched: Do group share accounting in IOPS when slice_idle=0 · 02b35081
      Vivek Goyal authored
      o Implement another CFQ mode where we charge group in terms of number
        of requests dispatched instead of measuring the time. Measuring in terms
        of time is not possible when we are driving deeper queue depths and there
        are requests from multiple cfq queues in the request queue.
      
      o This mode currently gets activated if one sets slice_idle=0 and associated
        disk supports NCQ. Again the idea is that on an NCQ disk with idling disabled
        most of the queues will dispatch 1 or more requests and then cfq queue
        expiry happens and we don't have a way to measure time. So start providing
        fairness in terms of IOPS.
      
      o Currently IOPS mode works only with cfq group scheduling. CFQ is following
        different scheduling algorithms for queue and group scheduling. These IOPS
        stats are used only for group scheduling, hence in non-group mode nothing
        should change.
      
      o For CFQ group scheduling one can disable slice idling so that we don't idle
        on queue and drive deeper request queue depths (achieving better throughput),
        at the same time group idle is enabled so one should get service
        differentiation among groups.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
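
The mode switch described in this commit can be illustrated with a small model. This is only a sketch under stated assumptions: the structs `sched_state` and `queue_slice` and the functions `iops_mode` and `group_charge` are hypothetical simplifications, not the kernel's actual data structures or code.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the scheduler state; not the kernel's structs. */
struct sched_state {
    unsigned int slice_idle; /* slice_idle tunable; 0 disables queue idling */
    bool ncq;                /* true if the disk supports NCQ */
};

/* Hypothetical per-slice accounting for one cfq queue. */
struct queue_slice {
    unsigned int time_used;  /* time consumed in this slice */
    unsigned int dispatched; /* requests dispatched in this slice */
};

/* IOPS mode is active when queue idling is disabled on an NCQ-capable disk. */
bool iops_mode(const struct sched_state *s)
{
    return s->slice_idle == 0 && s->ncq;
}

/* Amount charged to the owning group when the queue's slice expires. */
unsigned int group_charge(const struct sched_state *s, const struct queue_slice *q)
{
    return iops_mode(s) ? q->dispatched : q->time_used;
}
```

With `slice_idle` set to 0 on an NCQ disk, a queue that dispatched 3 requests is charged 3 units regardless of elapsed time; in every other configuration the model falls back to the time used.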
    • cfq-iosched: Do not idle if slice_idle=0 · b6508c16
      Vivek Goyal authored
      Do not idle either on cfq queue or service tree if slice_idle=0. User does
      not want any queue or service tree idling. Currently even if slice_idle=0,
      we were waiting for request to finish before expiring the queue and that
      can lead to lower queue depths.
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • cciss: disable doorbell reset on reset_devices · 75230ff2
      Stephen M. Cameron authored
      The doorbell reset initially appears to work correctly,
      the controller resets, comes up, some i/o can even be
      done, but on at least some Smart Arrays in some servers,
      it eventually causes a subsequent controller lockup due
      to some kind of PCIe error, and kdump can end up leaving
      the root filesystem in an unbootable state.  For this
      reason, until the problem is fixed, or at least isolated
      to certain hardware enough to be avoided, the doorbell
      reset should not be used at all.
      Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • blkio: Fix return code for mkdir calls · 96aa1b41
      Ciju Rajan K authored
      If the cgroup hierarchy for blkio control groups is deeper than two
      levels, the kernel should not allow the creation of further levels. The
      mkdir system call is not expected to return EINVAL. This patch
      replaces EINVAL with the more appropriate EPERM.
      Signed-off-by: Ciju Rajan K <ciju@linux.vnet.ibm.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
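
The depth check and its return code might look like the following in a simplified model. This is a sketch, assuming that only the hierarchy root may have children; `cg_node` and `blkio_may_create_child` are hypothetical names and do not reproduce the kernel's cgroup code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup directory node. */
struct cg_node {
    const struct cg_node *parent; /* NULL for the hierarchy root */
};

/*
 * Decide whether a child may be created under `parent` (must be non-NULL).
 * Only the root may have children in this model, which caps the hierarchy
 * at two levels. Returns 0 if allowed, -EPERM otherwise.
 */
int blkio_may_create_child(const struct cg_node *parent)
{
    if (parent->parent != NULL)
        return -EPERM; /* EPERM rather than EINVAL, as the patch describes */
    return 0;
}
```

Under this model, creating a directory below the root succeeds, while creating one below a first-level group is refused with EPERM.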
    • Merge branch 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev · 9ee47476
      Linus Torvalds authored
      * 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev:
        radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags
        radix-tree: clear all tags in radix_tree_node_rcu_free
    • Linux 2.6.36-rc2 · 76be97c1
      Linus Torvalds authored
    • radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags · 144dcfc0
      Dave Chinner authored
      Commit ebf8aa44 ("radix-tree:
      implement function radix_tree_range_tag_if_tagged") does not safely
      set tags on intermediate tree nodes. The code walks down the tree
      setting tags before it has fully resolved the path to the leaf under
      the assumption there will be a leaf slot with the tag set in the
      range it is searching.
      
      Unfortunately, this is not a valid assumption - we can abort after
      setting a tag on an intermediate node if we overrun the number of
      tags we are allowed to set in a batch, or stop scanning because we
      have passed the last scan index before we reach a leaf slot with
      the tag we are searching for set.
      
      As a result, we can leave the function with tags set on intermediate
      nodes which can be tripped over later by tag-based lookups. The
      result of these stale tags is that lookup may end prematurely or
      livelock because the lookup cannot make progress.
      
      The fix for the problem involves recording the traversal path we
      take to the leaf nodes, and only propagating the tags back up the
      tree once the tag is set in the leaf node slot. We are already
      recording the path for efficient traversal, so there is no
      additional overhead to do the intermediate node tag setting in
      this manner.
      
      This fixes a radix tree lookup livelock triggered by the new
      writeback sync livelock avoidance code introduced in commit
      f446daae ("mm: implement writeback
      livelock avoidance using page tagging").
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Acked-by: Jan Kara <jack@suse.cz>
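
The fix described in this commit, tagging the leaf first and only then propagating the tag up the recorded path, can be sketched on a toy model. The types `node` and `path_step` and the function `tag_leaf_and_propagate` are hypothetical simplifications (one tag, at most 32 slots per node) and are not the kernel's radix tree implementation.

```c
#include <stddef.h>

/* Hypothetical, simplified node: one tag bit per slot, slots 0..31. */
struct node {
    unsigned int tag_bits;
};

/* One step of the recorded traversal path: the slot taken in a given node. */
struct path_step {
    struct node *node;
    unsigned int slot;
};

/*
 * Tag the leaf slot at the end of `path`, then propagate the tag toward the
 * root. path[0] is the root step and path[depth - 1] is the leaf step.
 * Propagation stops at the first ancestor slot that is already tagged,
 * because everything above it was tagged by an earlier call.
 */
void tag_leaf_and_propagate(struct path_step *path, size_t depth)
{
    if (depth == 0)
        return;

    /* Set the tag in the leaf slot first. */
    path[depth - 1].node->tag_bits |= 1u << path[depth - 1].slot;

    /* Walk back up the recorded path: indices depth - 2 down to 0. */
    for (size_t i = depth - 1; i-- > 0; ) {
        unsigned int bit = 1u << path[i].slot;
        if (path[i].node->tag_bits & bit)
            break;
        path[i].node->tag_bits |= bit;
    }
}
```

In this model a scan that aborts before reaching a tagged leaf never calls the function, so no intermediate node is left carrying a tag without a tagged leaf beneath it.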
    • radix-tree: clear all tags in radix_tree_node_rcu_free · b6dd0865
      Dave Chinner authored
      Commit f446daae ("mm: implement
      writeback livelock avoidance using page tagging") introduced a new
      radix tree tag, increasing the number of tags in each node from 2 to
      3. It did not, however, fix up the code in
      radix_tree_node_rcu_free() that cleans up after radix_tree_shrink()
      and hence could leave stray tags set in the new tag array.
      
      The result is that the livelock avoidance code added in the
      above commit would hit stale tags when doing tag based lookups,
      resulting in livelocks when trying to traverse the tree.
      
      Fix this problem in radix_tree_node_rcu_free() so it doesn't happen
      again in the future by using a loop to walk all the tags up to
      RADIX_TREE_MAX_TAGS to clear the stray tags radix_tree_shrink()
      leaves behind.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Acked-by: Nick Piggin <npiggin@kernel.dk>
      Acked-by: Jan Kara <jack@suse.cz>
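
The loop-based cleanup described in this commit can be modelled as below. This is a sketch under stated assumptions: `node_model`, `tag_clear` and `clear_slot_tags` are hypothetical stand-ins with one word of tag bits per tag, and the value 3 for `RADIX_TREE_MAX_TAGS` is taken from the commit message rather than from the kernel headers.

```c
#define RADIX_TREE_MAX_TAGS 3 /* per the commit message: raised from 2 to 3 */

/* Hypothetical node model: one word of per-slot tag bits for each tag. */
struct node_model {
    unsigned long tags[RADIX_TREE_MAX_TAGS];
};

/* Clear the bit for slot `offset` in the bitmap of tag `tag`. */
static void tag_clear(struct node_model *n, unsigned int tag, unsigned int offset)
{
    n->tags[tag] &= ~(1ul << offset);
}

/*
 * Clear slot `offset` in every tag bitmap. Looping up to RADIX_TREE_MAX_TAGS
 * means a newly added tag is covered without touching this function again.
 */
void clear_slot_tags(struct node_model *n, unsigned int offset)
{
    for (unsigned int tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++)
        tag_clear(n, tag, offset);
}
```

Compared with clearing each tag by name, the loop keeps working when the tag count changes, which is the property the commit message calls out.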
  2. 22 Aug, 2010 12 commits
  3. 21 Aug, 2010 6 commits
  4. 20 Aug, 2010 12 commits