1. 04 Jul, 2014 3 commits
    • Eric Sandeen's avatar
      xfs: don't emit corruption noise on fs probes · 22a16208
      Eric Sandeen authored
      commit 31625f28 upstream.
      
      If we get EWRONGFS due to probing of non-xfs filesystems,
      there's no need to issue the scary corruption error and backtrace.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reviewed-by: default avatarMark Tinguely <tinguely@sgi.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      22a16208
    • Dave Chinner's avatar
      xfs: prevent deadlock trying to cover an active log · 192fedc5
      Dave Chinner authored
      commit 2c6e24ce upstream.
      
      Recent analysis of a deadlocked XFS filesystem from a kernel
      crash dump indicated that the filesystem was stuck waiting for log
      space. The short story of the hang on the RHEL6 kernel is this:
      
      	- the tail of the log is pinned by an inode
      	- the inode has been pushed by the xfsaild
      	- the inode has been flushed to it's backing buffer and is
      	  currently flush locked and hence waiting for backing
      	  buffer IO to complete and remove it from the AIL
      	- the backing buffer is marked for write - it is on the
      	  delayed write queue
      	- the inode buffer has been modified directly and logged
      	  recently due to unlinked inode list modification
      	- the backing buffer is pinned in memory as it is in the
      	  active CIL context.
      	- the xfsbufd won't start buffer writeback because it is
      	  pinned
      	- xfssyncd won't force the log because it sees the log as
      	  needing to be covered and hence wants to issue a dummy
      	  transaction to move the log covering state machine along.
      
      Hence there is no trigger to force the CIL to the log and hence
      unpin the inode buffer and therefore complete the inode IO, remove
      it from the AIL and hence move the tail of the log along, allowing
      transactions to start again.
      
      Mainline kernels also have the same deadlock, though the signature
      is slightly different - the inode buffer never reaches the delayed
      write lists because xfs_buf_item_push() sees that it is pinned and
      hence never adds it to the delayed write list that the xfsaild
      flushes.
      
      There are two possible solutions here. The first is to simply force
      the log before trying to cover the log and so ensure that the CIL is
      emptied before we try to reserve space for the dummy transaction in
      the xfs_log_worker(). While this might work most of the time, it is
      still racy and is no guarantee that we don't get stuck in
      xfs_trans_reserve waiting for log space to come free. Hence it's not
      the best way to solve the problem.
      
      The second solution is to modify xfs_log_need_covered() to be aware
      of the CIL. We only should be attempting to cover the log if there
      is no current activity in the log - covering the log is the process
      of ensuring that the head and tail in the log on disk are identical
      (i.e. the log is clean and at idle). Hence, by definition, if there
      are items in the CIL then the log is not at idle and so we don't
      need to attempt to cover it.
      
      When we don't need to cover the log because it is active or idle, we
      issue a log force from xfs_log_worker() - if the log is idle, then
      this does nothing.  However, if the log is active due to there being
      items in the CIL, it will force the items in the CIL to the log and
      unpin them.
      
      In the case of the above deadlock scenario, instead of
      xfs_log_worker() getting stuck in xfs_trans_reserve() attempting to
      cover the log, it will instead force the log, thereby unpinning the
      inode buffer, allowing IO to be issued and complete and hence
      removing the inode that was pinning the tail of the log from the
      AIL. At that point, everything will start moving along again. i.e.
      the xfs_log_worker turns back into a watchdog that can alleviate
      deadlocks based around pinned items that prevent the tail of the log
      from being moved...
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      192fedc5
    • Jie Liu's avatar
      xfs: fix the wrong new_size/rnew_size at xfs_iext_realloc_direct() · 5b093812
      Jie Liu authored
      commit 17ec81c1 upstream.
      
      At xfs_iext_realloc_direct(), the new_size is changed by adding
      if_bytes if originally the extent records are stored at the inline
      extent buffer, and we have to switch from it to a direct extent
      list for those new allocated extents, this is wrong. e.g,
      
      Create a file with three extents which was showing as following,
      
      xfs_io -f -c "truncate 100m" /xfs/testme
      
      for i in $(seq 0 5 10); do
      	offset=$(($i * $((1 << 20))))
      	xfs_io -c "pwrite $offset 1m" /xfs/testme
      done
      
      Inline
      ------
      irec:	if_bytes	bytes_diff	new_size
      1st	0		16		16
      2nd	16		16		32
      
      Switching
      ---------						rnew_size
      3rd	32		16		48 + 32 = 80	roundup=128
      
      In this case, the desired value of new_size should be 48, and then
      it will be roundup to 64 and be assigned to rnew_size.
      
      However, this issue has been covered by resetting the if_bytes to
      the new_size which is calculated at the begnning of xfs_iext_add()
      before leaving out this function, and in turn make the rnew_size
      correctly again. Hence, this can not be detected via xfstestes.
      
      This patch fix above problem and revise the new_size comments at
      xfs_iext_realloc_direct() to make it more readable.  Also, fix the
      comments while switching from the inline extent buffer to a direct
      extent list to reflect this change.
      Signed-off-by: default avatarJie Liu <jeff.liu@oracle.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5b093812
  2. 02 Jul, 2014 37 commits