1. 31 Jan, 2011 27 commits
  2. 29 Jan, 2011 5 commits
  3. 28 Jan, 2011 8 commits
    • Chuck Lever's avatar
      NFS: NFSv4 readdir loses entries · d1205f87
      Chuck Lever authored
      On recent 2.6.38-rc kernels, connectathon basic test 6 fails on
      NFSv4 mounts of OpenSolaris with something like:
      
      > ./test6: readdir
      > 	./test6: (/mnt/klimt/matisse.test) didn't read expected 'file.12' dir entry, pass 0
      > 	./test6: (/mnt/klimt/matisse.test) didn't read expected 'file.82' dir entry, pass 0
      > 	./test6: (/mnt/klimt/matisse.test) didn't read expected 'file.164' dir entry, pass 0
      > 	./test6: (/mnt/klimt/matisse.test) Test failed with 3 errors
      > basic tests failed
      > Tests failed, leaving /mnt/klimt mounted
      > [cel@matisse cthon04]$
      
      I narrowed the problem down to nfs4_decode_dirent() reporting that the
      decode buffer had overflowed while decoding the entries for those
      missing files.
      
      verify_attr_len() assumes both it's pointer arguments reside on the
      same page.  When these arguments point to locations on two different
      pages, verify_attr_len() can report false errors.  This can happen now
      that a large NFSv4 readdir result can span pages.
      
      We have reasonably good checking in nfs4_decode_dirent() anyway, so
      it should be safe to simply remove the extra checking.
      
      At a guess, this was introduced by commit 6650239a, "NFS: Don't use
      vm_map_ram() in readdir".
      
      Cc: stable@kernel.org [2.6.37]
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      d1205f87
    • Chuck Lever's avatar
      NFS: Micro-optimize nfs4_decode_dirent() · c08e76d0
      Chuck Lever authored
      Make the decoding of NFSv4 directory entries slightly more efficient
      by:
      
        1.  Avoiding unnecessary byte swapping when checking XDR booleans,
            and
      
        2.  Not bumping "p" when its value will be immediately replaced by
            xdr_inline_decode()
      
      This commit makes nfs4_decode_dirent() consistent with similar logic
      in the other two decode_dirent() functions.
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      c08e76d0
    • Trond Myklebust's avatar
      NFS: Fix an NFS client lockdep issue · e00b8a24
      Trond Myklebust authored
      There is no reason to be freeing the delegation cred in the rcu callback,
      and doing so is resulting in a lockdep complaint that rpc_credcache_lock
      is being called from both softirq and non-softirq contexts.
      Reported-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@kernel.org
      e00b8a24
    • bpm@sgi.com's avatar
      xfs: xfs_bmap_add_extent_delay_real should init br_startblock · 24446fc6
      bpm@sgi.com authored
      When filling in the middle of a previous delayed allocation in
      xfs_bmap_add_extent_delay_real, set br_startblock of the new delay
      extent to the right to nullstartblock instead of 0 before inserting
      the extent into the ifork (xfs_iext_insert), rather than setting
      br_startblock afterward.
      
      Adding the extent into the ifork with br_startblock=0 can lead to
      the extent being copied into the btree by xfs_bmap_extent_to_btree
      if we happen to convert from extents format to btree format before
      updating br_startblock with the correct value.  The unexpected
      addition of this delay extent to the btree can cause subsequent
      XFS_WANT_CORRUPTED_GOTO filesystem shutdown in several
      xfs_bmap_add_extent_delay_real cases where we are converting a delay
      extent to real and unexpectedly find an extent already inserted.
      For example:
      
      911         case BMAP_LEFT_FILLING:
      912                 /*
      913                  * Filling in the first part of a previous delayed allocation.
      914                  * The left neighbor is not contiguous.
      915                  */
      916                 trace_xfs_bmap_pre_update(ip, idx, state, _THIS_IP_);
      917                 xfs_bmbt_set_startoff(ep, new_endoff);
      918                 temp = PREV.br_blockcount - new->br_blockcount;
      919                 xfs_bmbt_set_blockcount(ep, temp);
      920                 xfs_iext_insert(ip, idx, 1, new, state);
      921                 ip->i_df.if_lastex = idx;
      922                 ip->i_d.di_nextents++;
      923                 if (cur == NULL)
      924                         rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
      925                 else {
      926                         rval = XFS_ILOG_CORE;
      927                         if ((error = xfs_bmbt_lookup_eq(cur, new->br_startoff,
      928                                         new->br_startblock, new->br_blockcount,
      929                                         &i)))
      930                                 goto done;
      931                         XFS_WANT_CORRUPTED_GOTO(i == 0, done);
      
      With the bogus extent in the btree we shutdown the filesystem at
      931.  The conversion from extents to btree format happens when the
      number of extents in the inode increases above ip->i_df.if_ext_max.
      xfs_bmap_extent_to_btree copies extents from the ifork into the
      btree, ignoring all delalloc extents which are denoted by
      br_startblock having some value of nullstartblock.
      
      SGI-PV: 1013221
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarAlex Elder <aelder@sgi.com>
      24446fc6
    • Dave Chinner's avatar
      xfs: fix dquot shaker deadlock · 0fbca4d1
      Dave Chinner authored
      Commit 368e1361 ("xfs: remove duplicate code from dquot reclaim") fails
      to unlock the dquot freelist when the number of loop restarts is
      exceeded in xfs_qm_dqreclaim_one(). This causes hangs in memory
      reclaim.
      
      Rework the loop control logic into an unwind stack that all the
      different cases jump into. This means there is only one set of code
      that processes the loop exit criteria, and simplifies the unlocking
      of all the items from different points in the loop. It also fixes a
      double increment of the restart counter from the qi_dqlist_lock
      case.
      Reported-by: default avatarMalcolm Scott <lkml@malc.org.uk>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      0fbca4d1
    • Dave Chinner's avatar
      xfs: handle CIl transaction commit failures correctly · c6f990d1
      Dave Chinner authored
      Failure to commit a transaction into the CIL is not handled
      correctly. This currently can only happen when racing with a
      shutdown and requires an explicit shutdown check, so it rare and can
      be avoided. Remove the shutdown check and make the CIL commit a void
      function to indicate it will always succeed, thereby removing the
      incorrectly handled failure case.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      c6f990d1
    • Dave Chinner's avatar
      xfs: limit extsize to size of AGs and/or MAXEXTLEN · 5315837d
      Dave Chinner authored
      The extent size hint can be set to larger than an AG. This means
      that the alignment process can push the range to be allocated
      outside the bounds of the AG, resulting in assert failures or
      corrupted bmbt records. Similarly, if the extsize is larger than the
      maximum extent size supported, the alignment process will produce
      extents that are too large to fit into the bmbt records, resulting
      in a different type of assert/corruption failure.
      
      Fix this by limiting extsize at the time іt is set firstly to be
      less than MAXEXTLEN, then to be a maximum of half the size of the
      AGs in the filesystem for non-realtime inodes. Realtime inodes do
      not allocate out of AGs, so don't have to be restricted by the size
      of AGs.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      5315837d
    • Dave Chinner's avatar
      xfs: prevent extsize alignment from exceeding maximum extent size · 4ce15989
      Dave Chinner authored
      When doing delayed allocation, if the allocation size is for a
      maximally sized extent, extent size alignment can push it over this
      limit. This results in an assert failure in xfs_bmbt_set_allf() as
      the extent length is too large to find in the extent record.
      
      Fix this by ensuring that we allow for space that extent size
      alignment requires (up to 2 * (extsize -1) blocks as we have to
      handle both head and tail alignment) when limiting the maximum size
      of the extent.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      4ce15989