1. 18 Dec, 2013 1 commit
  2. 17 Dec, 2013 9 commits
    • xfs: abort metadata writeback on permanent errors · ac8809f9
      Dave Chinner authored
      If we are doing async writeback of metadata, we can get write errors
      but have nobody to report them to. At the moment, we simply attempt
      to reissue the write from io completion in the hope that it's a
      transient error.
      
      When it's not a transient error, the buffer is stuck forever in
      this loop, and we cannot break out of it. Eventually, unmount will
      hang because the AIL cannot be emptied and everything goes downhill
      from there.
      
      To solve this problem, only retry the write IO once before aborting
      it. We don't throw the buffer away because some transient errors can
      last minutes (e.g. FC path failover) or even hours (thin
      provisioned devices that have run out of backing space) before they
      go away. Hence we really want to keep trying until we can't try any
      more.
      
      Because the buffer was not cleaned, however, it does not get removed
      from the AIL and hence the next pass across the AIL will start IO on
      it again. As such, we still get the "retry forever" semantics that
      we currently have, but we allow other access to the buffer in the
      meantime. Meanwhile the filesystem can continue to modify the
      buffer and relog it, so the IO errors won't hang the log or the
      filesystem.
      
      Now when we are pushing the AIL, we can see all these "permanent IO
      error" buffers and we can issue a warning about failures before we
      retry the IO. We can also catch these buffers when unmounting and
      issue a corruption warning, too.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
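The retry behaviour described in this commit message can be illustrated with a toy model. This is a minimal sketch, not the kernel implementation: the `Buffer` class, `io_completion()`, `push_ail()` and the `device_ok` callback are all hypothetical names invented for illustration. It assumes one immediate retry from IO completion, after which the buffer stays dirty in the AIL and is retried, with a warning, on the next push.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Toy stand-in for a metadata buffer tracked in the AIL (hypothetical)."""
    name: str
    dirty: bool = True
    retried: bool = False       # set after the single retry from IO completion
    write_failed: bool = False  # set when the last writeback was aborted


def io_completion(buf, write_ok):
    """Handle completion of an async write.

    Returns True if the write should be reissued immediately.
    """
    if write_ok:
        buf.dirty = False
        buf.retried = False
        buf.write_failed = False
        return False
    if not buf.retried:
        buf.retried = True      # first failure: retry once
        return True
    buf.retried = False         # second failure: abort, but keep the buffer dirty
    buf.write_failed = True
    return False


def push_ail(ail, device_ok):
    """One pass over the AIL: warn about previously failed buffers, then write."""
    warnings = []
    for buf in ail:
        if buf.write_failed:
            warnings.append(f"{buf.name}: retrying after failed writeback")
        while io_completion(buf, device_ok(buf)):
            pass
    # Only cleaned buffers leave the AIL; failed ones are seen again next pass.
    ail[:] = [b for b in ail if b.dirty]
    return warnings
```

In this model a failing device costs two write attempts per push rather than an unbounded loop inside IO completion, while the "retry forever" semantics are preserved across pushes.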
    • xfs: swalloc doesn't align allocations properly · 33177f05
      Dave Chinner authored
      When swalloc is specified as a mount option, allocations are
      supposed to be aligned to the stripe width rather than the stripe
      unit of the underlying filesystem. However, it does not do this.
      
      What the implementation does is round up the allocation size to a
      stripe width, hence ensuring that all allocations span a full stripe
      width. It does not, however, ensure that the allocation is aligned
      to a stripe width, and hence the allocations can span multiple
      underlying stripes and so still see RMW cycles for things like
      direct IO on MD RAID.
      
      So, if the swalloc mount option is set, change the allocation
      alignment in xfs_bmap_btalloc() to use the stripe width rather than
      the stripe unit.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
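The difference between rounding the size up to a stripe width and aligning the start to a stripe width can be shown with a little arithmetic. The sketch below is illustrative only: `round_up()`, `stripes_touched()` and `place_extent()` are hypothetical helpers, not functions from xfs_bmap_btalloc(), and the geometry (stripe unit 16 blocks, stripe width 64 blocks) is an assumed example.

```python
def round_up(value, multiple):
    """Round value up to the next multiple (integer arithmetic)."""
    return -(-value // multiple) * multiple


def stripes_touched(start, length, stripe_width):
    """Number of full-width stripes that the extent [start, start+length) overlaps."""
    first = start // stripe_width
    last = (start + length - 1) // stripe_width
    return last - first + 1


def place_extent(hint, length, stripe_unit, stripe_width, align_to_width):
    """Return (start, length) for an allocation near `hint`.

    The length is always rounded up to a stripe width; the start is aligned
    to the stripe width only when align_to_width is set, otherwise to the
    stripe unit (the behaviour the commit message describes as the bug).
    """
    length = round_up(length, stripe_width)
    alignment = stripe_width if align_to_width else stripe_unit
    return round_up(hint, alignment), length


# Assumed geometry: stripe unit 16 blocks, stripe width 64 blocks.
old_start, old_len = place_extent(20, 50, 16, 64, align_to_width=False)
new_start, new_len = place_extent(20, 50, 16, 64, align_to_width=True)
```

With these numbers the stripe-unit-aligned extent starts at block 32 and straddles two stripes, whereas the stripe-width-aligned extent starts at block 64 and covers exactly one, which is the case that avoids RMW cycles.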
    • xfs: remove xfsbdstrat error · 83a0adc3
      Christoph Hellwig authored
      The xfsbdstrat helper is a small but useless wrapper for xfs_buf_iorequest that
      handles the case of a shut down filesystem.  Most of the users have private,
      uncached buffers that can just be freed in this case, but the complex error
      handling in xfs_bioerror_relse messes up the case when it's called without
      a locked buffer.
      
      Remove xfsbdstrat and open-code the error handling in the callers.  All but
      one can simply return an error and don't need to deal with buffer state,
      and the one caller that cares about the buffer state could do with a major
      cleanup as well, but we'll defer that to later.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: align initial file allocations correctly · 6e708bcf
      Dave Chinner authored
      The function xfs_bmap_isaeof() is used to indicate that an
      allocation is occurring at or past the end of file, and as such
      should be aligned to the underlying storage geometry if possible.
      
      Commit 27a3f8f2 ("xfs: introduce xfs_bmap_last_extent") changed the
      behaviour of this function for empty files - it turned off
      allocation alignment for this case accidentally. Hence large initial
      allocations from direct IO are not getting correctly aligned to the
      underlying geometry, and that is causing write performance to drop in
      alignment-sensitive configurations.
      
      Fix it by considering allocation into empty files as requiring
      aligned allocation again.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit f9b395a8)
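The empty-file case described above can be sketched as a small predicate. This is a simplified model under stated assumptions, not the body of xfs_bmap_isaeof(): `is_at_eof()` is a hypothetical function, and a file is represented as a sorted list of `(start, length)` extents in blocks.

```python
def is_at_eof(extents, offset):
    """Return True if an allocation at `offset` is at or past end of file.

    `extents` is a sorted list of (start, length) tuples. An empty file has
    no last extent, so it is treated as an at-EOF allocation, which keeps
    alignment to the storage geometry enabled for the initial allocation.
    """
    if not extents:
        return True
    last_start, last_len = extents[-1]
    return offset >= last_start + last_len
```

For example, `is_at_eof([], 0)` is true, so a large initial direct IO write into a new file would be eligible for aligned allocation in this model.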
    • MAINTAINERS: fix incorrect mail address of XFS maintainer · 809625ca
      Namjae Jeon authored
      When I tried to send patches to the XFS maintainers, the mail to Dave
      was returned with a delivery failure message. Dave Chinner's mail
      address appears to be incorrect, so fix it.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit db10bddc)
    • xfs: fix infinite loop by detaching the group/project hints from user dquot · 718cc6f8
      Jie Liu authored
      xfs_quota(8) hangs if group/project quota is turned off before
      user quota is off. This can be reproduced 100% of the time by:
        # mount -ouquota,gquota /dev/sda7 /xfs
        # mkdir /xfs/test
        # xfs_quota -xc 'off -g' /xfs <-- hangs up
        # echo w > /proc/sysrq-trigger
        # dmesg
      
        SysRq : Show Blocked State
        task                        PC stack   pid father
        xfs_quota       D 0000000000000000     0 27574   2551 0x00000000
        [snip]
        Call Trace:
        [<ffffffff81aaa21d>] schedule+0xad/0xc0
        [<ffffffff81aa327e>] schedule_timeout+0x35e/0x3c0
        [<ffffffff8114b506>] ? mark_held_locks+0x176/0x1c0
        [<ffffffff810ad6c0>] ? call_timer_fn+0x2c0/0x2c0
        [<ffffffffa0c25380>] ? xfs_qm_shrink_count+0x30/0x30 [xfs]
        [<ffffffff81aa3306>] schedule_timeout_uninterruptible+0x26/0x30
        [<ffffffffa0c26155>] xfs_qm_dquot_walk+0x235/0x260 [xfs]
        [<ffffffffa0c059d8>] ? xfs_perag_get+0x1d8/0x2d0 [xfs]
        [<ffffffffa0c05805>] ? xfs_perag_get+0x5/0x2d0 [xfs]
        [<ffffffffa0b7707e>] ? xfs_inode_ag_iterator+0xae/0xf0 [xfs]
        [<ffffffffa0c22280>] ? xfs_trans_free_dqinfo+0x50/0x50 [xfs]
        [<ffffffffa0b7709f>] ? xfs_inode_ag_iterator+0xcf/0xf0 [xfs]
        [<ffffffffa0c261e6>] xfs_qm_dqpurge_all+0x66/0xb0 [xfs]
        [<ffffffffa0c2497a>] xfs_qm_scall_quotaoff+0x20a/0x5f0 [xfs]
        [<ffffffffa0c2b8f6>] xfs_fs_set_xstate+0x136/0x180 [xfs]
        [<ffffffff8136cf7a>] do_quotactl+0x53a/0x6b0
        [<ffffffff812fba4b>] ? iput+0x5b/0x90
        [<ffffffff8136d257>] SyS_quotactl+0x167/0x1d0
        [<ffffffff814cf2ee>] ? trace_hardirqs_on_thunk+0x3a/0x3f
        [<ffffffff81abcd19>] system_call_fastpath+0x16/0x1b
      
      It's fine if we turn user quota off first and then turn off the other
      kinds of quota if they are enabled, since the group/project dquot
      refcount drops to zero once the user quota is off. Otherwise,
      the refcount of those dquots is non-zero because the user dquots might
      refer to them as hint(s).  Hence, the above operation causes an infinite
      loop in xfs_qm_dquot_walk() while trying to purge the dquot cache.
      
      This problem has been around since Linux 3.4; it was introduced by:
        [ b84a3a96 xfs: remove the per-filesystem list of dquots ]
      
      Originally we would release the group dquot pointers that the user
      dquots may be carrying around as a hint via xfs_qm_detach_gdquots().
      However, with the above change, no such work is done before
      purging the group/project dquot cache.
      
      In order to solve this problem, this patch introduces a special routine,
      xfs_qm_dqpurge_hints(), which releases the group/project dquot
      pointers the user dquots may be carrying around as a hint, and then
      proceeds to purge the user dquot cache if requested.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit df8052e7)
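The hang described in this commit message comes down to reference counts that never reach zero. The following toy model is an assumption-laden sketch, not the XFS quota code: `Dquot`, `attach_hint()`, `purge_hints()` and `purge()` are hypothetical names, and the purge loop is bounded by `max_passes` so that the example terminates where the kernel walk would spin.

```python
class Dquot:
    """Toy dquot with a refcount and an optional group hint (hypothetical)."""

    def __init__(self, kind, ident):
        self.kind = kind
        self.ident = ident
        self.refcount = 0
        self.group_hint = None


def attach_hint(user_dq, group_dq):
    """A user dquot caches a group dquot as a hint and holds a reference to it."""
    group_dq.refcount += 1
    user_dq.group_hint = group_dq


def purge_hints(user_dquots):
    """Drop the hint references held by user dquots before purging."""
    for udq in user_dquots:
        if udq.group_hint is not None:
            udq.group_hint.refcount -= 1
            udq.group_hint = None


def purge(cache, max_passes=3):
    """Free every unreferenced dquot; return whatever could not be freed.

    The real walk retries until the cache is empty; the pass limit here
    only exists so the model terminates.
    """
    for _ in range(max_passes):
        cache[:] = [dq for dq in cache if dq.refcount > 0]
        if not cache:
            break
    return cache
```

Purging the group cache while a user dquot still holds a hint leaves the group dquot behind on every pass; calling `purge_hints()` first lets the same purge complete.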
    • xfs: fix assertion failure at xfs_setattr_nonsize · 5c227278
      Jie Liu authored
      On a CRC-enabled v5 superblock, changing a file's ownership can
      trigger an ASSERT failure in xfs_setattr_nonsize() if both group and
      project quota are enabled, i.e.:
      
      [  305.337609] XFS: Assertion failed: !XFS_IS_PQUOTA_ON(mp), file: fs/xfs/xfs_iops.c, line: 621
      [  305.339250] Kernel BUG at ffffffffa0a7fa32 [verbose debug info unavailable]
      [  305.383939] Call Trace:
      [  305.385536]  [<ffffffffa0a7d95a>] xfs_setattr_nonsize+0x69a/0x720 [xfs]
      [  305.387142]  [<ffffffffa0a7dea9>] xfs_vn_setattr+0x29/0x70 [xfs]
      [  305.388727]  [<ffffffff811ca388>] notify_change+0x1a8/0x350
      [  305.390298]  [<ffffffff811ac39d>] chown_common+0xfd/0x110
      [  305.391868]  [<ffffffff811ad6bf>] SyS_fchownat+0xaf/0x110
      [  305.393440]  [<ffffffff811ad760>] SyS_lchown+0x20/0x30
      [  305.394995]  [<ffffffff8170f7dd>] system_call_fastpath+0x1a/0x1f
      [  305.399870] RIP  [<ffffffffa0a7fa32>] assfail+0x22/0x30 [xfs]
      
      This fix adjusts the assertion to check whether the superblock
      supports both quota inodes.
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit 5a01dd54)
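The adjusted assertion can be read as a simple predicate. The sketch below is only an assumption about its shape, inferred from the commit message and the failed `!XFS_IS_PQUOTA_ON(mp)` check in the trace; `old_assertion()` and `adjusted_assertion()` are hypothetical names and the boolean parameters stand in for the real superblock and mount-flag tests.

```python
def old_assertion(pquota_on: bool) -> bool:
    """The check from the trace: project quota must not be on."""
    return not pquota_on


def adjusted_assertion(has_both_quota_inodes: bool, pquota_on: bool) -> bool:
    """Assumed shape of the fix: project quota being on is acceptable when
    the superblock supports both quota inodes."""
    return has_both_quota_inodes or not pquota_on
```

Under this reading, a v5 filesystem with group and project quota both enabled fails the old check but passes the adjusted one.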
    • xfs: fix false assertion at xfs_qm_vop_create_dqattach · 30d161c9
      Jie Liu authored
      After the previous fix, there is still another ASSERT failure when any
      type of quota is turned off while fsstress is running.
      
      Backtrace in this case:
      
      [   50.867897] XFS: Assertion failed: XFS_IS_GQUOTA_ON(mp), file: fs/xfs/xfs_qm.c, line: 2118
      [   50.867924] ------------[ cut here ]------------
      ... <snip>
      [   50.867957] Kernel BUG at ffffffffa0b55a32 [verbose debug info unavailable]
      [   50.867999] invalid opcode: 0000 [#1] SMP
      [   50.869407] Call Trace:
      [   50.869446]  [<ffffffffa0bc408a>] xfs_qm_vop_create_dqattach+0x19a/0x2d0 [xfs]
      [   50.869512]  [<ffffffffa0b9cc45>] xfs_create+0x5c5/0x6a0 [xfs]
      [   50.869564]  [<ffffffffa0b5307c>] xfs_vn_mknod+0xac/0x1d0 [xfs]
      [   50.869615]  [<ffffffffa0b531d6>] xfs_vn_mkdir+0x16/0x20 [xfs]
      [   50.869655]  [<ffffffff811becd5>] vfs_mkdir+0x95/0x130
      [   50.869689]  [<ffffffff811bf63a>] SyS_mkdirat+0xaa/0xe0
      [   50.869723]  [<ffffffff811bf689>] SyS_mkdir+0x19/0x20
      [   50.869757]  [<ffffffff8170f7dd>] system_call_fastpath+0x1a/0x1f
      [   50.869793] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 <snip>
      [   50.870003] RIP  [<ffffffffa0b55a32>] assfail+0x22/0x30 [xfs]
      [   50.870050]  RSP <ffff88002941fd60>
      [   50.879251] ---[ end trace c93a2b342341c65b ]---
      
      We're hitting the ASSERT(XFS_IS_*QUOTA_ON(mp)) in xfs_qm_vop_create_dqattach(),
      but the assertion itself is not right IMHO.  While performing quota off, we
      first clear the XFS_*QUOTA_ACTIVE bit(s) from struct xfs_mount without taking
      any special locks, see xfs_qm_scall_quotaoff().  Hence there is no guarantee
      that the desired quota is still active.
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit 37eb9706)
    • xfs: fix memory leak in xfs_dir2_node_removename · 3a8c9208
      Mark Tinguely authored
      Fix the leak of kernel memory in xfs_dir2_node_removename()
      when xfs_dir2_leafn_remove() returns an error code.
      Signed-off-by: Mark Tinguely <tinguely@sgi.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit ef701600)
  3. 13 Dec, 2013 19 commits
  4. 11 Dec, 2013 4 commits
  5. 10 Dec, 2013 3 commits
  6. 09 Dec, 2013 2 commits
    • xfs: fix infinite loop by detaching the group/project hints from user dquot · df8052e7
      Jie Liu authored
      xfs_quota(8) hangs if group/project quota is turned off before
      user quota is off. This can be reproduced 100% of the time by:
        # mount -ouquota,gquota /dev/sda7 /xfs
        # mkdir /xfs/test
        # xfs_quota -xc 'off -g' /xfs <-- hangs up
        # echo w > /proc/sysrq-trigger
        # dmesg
      
        SysRq : Show Blocked State
        task                        PC stack   pid father
        xfs_quota       D 0000000000000000     0 27574   2551 0x00000000
        [snip]
        Call Trace:
        [<ffffffff81aaa21d>] schedule+0xad/0xc0
        [<ffffffff81aa327e>] schedule_timeout+0x35e/0x3c0
        [<ffffffff8114b506>] ? mark_held_locks+0x176/0x1c0
        [<ffffffff810ad6c0>] ? call_timer_fn+0x2c0/0x2c0
        [<ffffffffa0c25380>] ? xfs_qm_shrink_count+0x30/0x30 [xfs]
        [<ffffffff81aa3306>] schedule_timeout_uninterruptible+0x26/0x30
        [<ffffffffa0c26155>] xfs_qm_dquot_walk+0x235/0x260 [xfs]
        [<ffffffffa0c059d8>] ? xfs_perag_get+0x1d8/0x2d0 [xfs]
        [<ffffffffa0c05805>] ? xfs_perag_get+0x5/0x2d0 [xfs]
        [<ffffffffa0b7707e>] ? xfs_inode_ag_iterator+0xae/0xf0 [xfs]
        [<ffffffffa0c22280>] ? xfs_trans_free_dqinfo+0x50/0x50 [xfs]
        [<ffffffffa0b7709f>] ? xfs_inode_ag_iterator+0xcf/0xf0 [xfs]
        [<ffffffffa0c261e6>] xfs_qm_dqpurge_all+0x66/0xb0 [xfs]
        [<ffffffffa0c2497a>] xfs_qm_scall_quotaoff+0x20a/0x5f0 [xfs]
        [<ffffffffa0c2b8f6>] xfs_fs_set_xstate+0x136/0x180 [xfs]
        [<ffffffff8136cf7a>] do_quotactl+0x53a/0x6b0
        [<ffffffff812fba4b>] ? iput+0x5b/0x90
        [<ffffffff8136d257>] SyS_quotactl+0x167/0x1d0
        [<ffffffff814cf2ee>] ? trace_hardirqs_on_thunk+0x3a/0x3f
        [<ffffffff81abcd19>] system_call_fastpath+0x16/0x1b
      
      It's fine if we turn user quota off first and then turn off the other
      kinds of quota if they are enabled, since the group/project dquot
      refcount drops to zero once the user quota is off. Otherwise,
      the refcount of those dquots is non-zero because the user dquots might
      refer to them as hint(s).  Hence, the above operation causes an infinite
      loop in xfs_qm_dquot_walk() while trying to purge the dquot cache.
      
      This problem has been around since Linux 3.4; it was introduced by:
        [ b84a3a96 xfs: remove the per-filesystem list of dquots ]
      
      Originally we would release the group dquot pointers that the user
      dquots may be carrying around as a hint via xfs_qm_detach_gdquots().
      However, with the above change, no such work is done before
      purging the group/project dquot cache.
      
      In order to solve this problem, this patch introduces a special routine,
      xfs_qm_dqpurge_hints(), which releases the group/project dquot
      pointers the user dquots may be carrying around as a hint, and then
      proceeds to purge the user dquot cache if requested.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: fix assertion failure at xfs_setattr_nonsize · 5a01dd54
      Jie Liu authored
      On a CRC-enabled v5 superblock, changing a file's ownership can
      trigger an ASSERT failure in xfs_setattr_nonsize() if both group and
      project quota are enabled, i.e.:
      
      [  305.337609] XFS: Assertion failed: !XFS_IS_PQUOTA_ON(mp), file: fs/xfs/xfs_iops.c, line: 621
      [  305.339250] Kernel BUG at ffffffffa0a7fa32 [verbose debug info unavailable]
      [  305.383939] Call Trace:
      [  305.385536]  [<ffffffffa0a7d95a>] xfs_setattr_nonsize+0x69a/0x720 [xfs]
      [  305.387142]  [<ffffffffa0a7dea9>] xfs_vn_setattr+0x29/0x70 [xfs]
      [  305.388727]  [<ffffffff811ca388>] notify_change+0x1a8/0x350
      [  305.390298]  [<ffffffff811ac39d>] chown_common+0xfd/0x110
      [  305.391868]  [<ffffffff811ad6bf>] SyS_fchownat+0xaf/0x110
      [  305.393440]  [<ffffffff811ad760>] SyS_lchown+0x20/0x30
      [  305.394995]  [<ffffffff8170f7dd>] system_call_fastpath+0x1a/0x1f
      [  305.399870] RIP  [<ffffffffa0a7fa32>] assfail+0x22/0x30 [xfs]
      
      This fix adjusts the assertion to check whether the superblock
      supports both quota inodes.
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
  7. 06 Dec, 2013 2 commits