1. 12 Oct, 2011 7 commits
    • xfs: let xfs_bwrite callers handle the xfs_buf_relse · c2b006c1
      Christoph Hellwig authored
      Remove the xfs_buf_relse from xfs_bwrite and let the caller handle it to
      mirror the delwri and read paths.
      
      Also remove the mount pointer passed to xfs_bwrite, which is superfluous now
      that we have a mount pointer in the buftarg.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
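      A minimal sketch of the resulting caller pattern, assuming the pre- and
      post-patch xfs_bwrite() signatures described above (illustrative, not a
      literal excerpt from the tree):

          /*
           * Before: error = xfs_bwrite(mp, bp); and xfs_bwrite released the
           * buffer itself.  After: the caller owns the release, mirroring
           * the delwri and read paths.
           */
          error = xfs_bwrite(bp);   /* synchronous write; buffer stays locked and held */
          xfs_buf_relse(bp);        /* caller drops the lock and its reference */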
    • xfs: call xfs_buf_delwri_queue directly · 61551f1e
      Christoph Hellwig authored
      Unify the ways we add buffers to the delwri queue by always calling
      xfs_buf_delwri_queue directly.  The xfs_bdwrite function is removed and
      open-coded in its callers, and the two places setting XBF_DELWRI while a
      buffer is locked and expecting xfs_buf_unlock to pick it up are converted
      to call xfs_buf_delwri_queue directly, too.  Also replace the
      XFS_BUF_UNDELAYWRITE macro with direct calls to xfs_buf_delwri_dequeue
      to make the explicit queuing/dequeuing more obvious.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
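      A hedged sketch of one plausible open-coded replacement for xfs_bdwrite()
      as described above (not the literal diff):

          /* queue for delayed write, then drop our lock and reference */
          xfs_buf_delwri_queue(bp);
          xfs_buf_relse(bp);

      Likewise, call sites that used XFS_BUF_UNDELAYWRITE(bp) now call
      xfs_buf_delwri_dequeue(bp) directly, so queuing and dequeuing are
      visible at the call sites.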
    • xfs: move more delwri setup into xfs_buf_delwri_queue · 5a8ee6ba
      Christoph Hellwig authored
      Do not transfer a reference held by the caller to the buffer on the list,
      or decrement it in xfs_buf_delwri_queue, but instead grab a new reference
      if needed, and let the caller drop its own reference.  Also move setting
      of the XBF_DELWRI and XBF_ASYNC flags into xfs_buf_delwri_queue, and
      only do it if needed.  Note that for now buffers queued from xfs_buf_unlock
      already have XBF_DELWRI set, but that will change in the following patches.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
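      An illustrative sketch of the reference handling described above, using
      the 2011-era bp->b_hold and bp->b_list fields and a hypothetical dwq
      pointer for the buftarg's delwri list (a simplified model, not the
      tree's exact code):

          if (list_empty(&bp->b_list)) {
                  atomic_inc(&bp->b_hold);         /* reference held by the list */
                  bp->b_flags |= XBF_DELWRI | XBF_ASYNC;
                  list_add_tail(&bp->b_list, dwq);
          }
          /* the caller keeps, and later drops, its own reference */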
    • xfs: remove the unlock argument to xfs_buf_delwri_queue · 527cfdf1
      Christoph Hellwig authored
      We can just unlock the buffer in the caller.  The decrement of b_hold
      would also be needed in the !unlock case; we just never hit that case
      currently, given that the caller handles it.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
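      A sketch of the simplified calling convention, assuming the pre-patch
      signature took an unlock flag as implied above (illustrative only):

          /* before: xfs_buf_delwri_queue(bp, 1) unlocked the buffer for us */
          xfs_buf_delwri_queue(bp);        /* no unlock argument any more */
          xfs_buf_unlock(bp);              /* the caller unlocks itself */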
    • xfs: remove delwri buffer handling from xfs_buf_iorequest · 375ec69d
      Christoph Hellwig authored
      We cannot ever reach xfs_buf_iorequest for a buffer with XBF_DELWRI set,
      given that all write handlers make sure that the buffer is removed from
      the delwri queue beforehand, and we never do reads with the XBF_DELWRI flag
      set (which the code would not handle correctly anyway).
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
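      The invariant this relies on, expressed as a hypothetical assertion
      (the exact replacement code may differ):

          /* write paths dequeue before submission; reads never set XBF_DELWRI */
          ASSERT(!(bp->b_flags & XBF_DELWRI));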
    • xfs: don't serialise adjacent concurrent direct IO appending writes · 7271d243
      Dave Chinner authored
      For append write workloads, extending the file requires a certain
      amount of exclusive locking to be done up front to ensure sanity in
      things like ensuring that we've zeroed any allocated regions
      between the old EOF and the start of the new IO.
      
      For single threads, this typically isn't a problem, and for large
      IOs we don't serialise enough for it to be a problem for two
      threads on really fast block devices. However for smaller IO and
      larger thread counts we have a problem.
      
      Take 4 concurrent sequential, single block sized and aligned IOs.
      After the first IO is submitted but before it completes, we end up
      with this state:
      
              IO 1    IO 2    IO 3    IO 4
            +-------+-------+-------+-------+
            ^       ^
            |       |
            |       |
            |       |
            |       \- ip->i_new_size
            \- ip->i_size
      
      And the IO is done without exclusive locking because offset <=
      ip->i_size. When we submit IO 2, we see offset > ip->i_size, and
      grab the IO lock exclusive, because there is a chance we need to do
      EOF zeroing. However, there is already an IO in progress that avoids
      the need for IO zeroing because offset <= ip->i_new_size. Hence we
      could avoid holding the IO lock exclusive for this, and after
      submission of the second IO we'd end up in this state:
      
              IO 1    IO 2    IO 3    IO 4
            +-------+-------+-------+-------+
            ^               ^
            |               |
            |               |
            |               |
            |               \- ip->i_new_size
            \- ip->i_size
      
      There is no need to grab the i_mutex or the IO lock in exclusive
      mode if we don't need to invalidate the page cache. Taking these
      locks on every direct IO effectively serialises them, as taking the IO
      lock in exclusive mode has to wait for all shared holders to drop
      the lock. That only happens when the IO is complete, so effectively it
      prevents dispatch of concurrent direct IO writes to the same inode.
      
      And so you can see that for the third concurrent IO, we'd avoid
      exclusive locking for the same reason we avoided the exclusive lock
      for the second IO.
      
      Fixing this is a bit more complex than that, because we need to hold
      a write-submission-local value of ip->i_new_size so that clearing
      the value is only done if no other thread has updated it before our
      IO completes.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
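
      A simplified model of the submission-time decision described above, using
      the 2011-era ip->i_size and ip->i_new_size fields (this condenses the
      reasoning in the message; it is not the literal patch):

          /*
           * Only take the iolock exclusive when the write starts beyond both
           * the current EOF and any in-flight appending IO, i.e. when EOF
           * zeroing may actually be required.
           */
          if (pos > ip->i_size && pos > ip->i_new_size)
                  iolock = XFS_IOLOCK_EXCL;
          else
                  iolock = XFS_IOLOCK_SHARED;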
    • xfs: don't serialise direct IO reads on page cache checks · 0c38a251
      Dave Chinner authored
      There is no need to grab the i_mutex or the IO lock in exclusive
      mode if we don't need to invalidate the page cache. Taking these
      locks on every direct IO effectively serialises them, as taking the IO
      lock in exclusive mode has to wait for all shared holders to drop
      the lock. That only happens when the IO is complete, so effectively it
      prevents dispatch of concurrent direct IO reads to the same inode.
      
      Fix this by taking the IO lock shared to check the page cache state,
      and only drop it and retake the IO lock in exclusive mode if there is
      work to be done. Hence for the normal direct IO case, no exclusive
      locking will occur.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Joern Engel <joern@logfs.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
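
      A hedged sketch of the locking pattern described above, using the
      xfs_rw_ilock() helpers from that era (simplified; the real patch also
      re-checks the page cache after retaking the lock):

          xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
          if (inode->i_mapping->nrpages) {
                  /* rare case: retake exclusive to flush and invalidate the cache */
                  xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
                  xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
                  /* flush and invalidate cached pages over the IO range here */
                  xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
          }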
  2. 05 Oct, 2011 1 commit
  3. 04 Oct, 2011 12 commits
  4. 03 Oct, 2011 18 commits
  5. 02 Oct, 2011 2 commits