1. 27 Apr, 2013 8 commits
  2. 21 Apr, 2013 8 commits
  3. 16 Apr, 2013 2 commits
  4. 05 Apr, 2013 1 commit
      xfs: don't free EFIs before the EFDs are committed · 666d644c
      Dave Chinner authored
      Filesystems are occasionally being shut down with this error:
      
      xfs_trans_ail_delete_bulk: attempting to delete a log item that is
      not in the AIL.
      
      It was diagnosed to be related to the EFI/EFD commit order when the
      EFI and EFD are in different checkpoints and the EFD is committed
      before the EFI here:
      
      http://oss.sgi.com/archives/xfs/2013-01/msg00082.html
      
      The real problem is that a single bit cannot fully describe the
      states that the EFI/EFD processing can be in. These completion
      states are:
      
      EFI			EFI in AIL	EFD		Result
      committed/unpinned	Yes		committed	OK
      committed/pinned	No		committed	Shutdown
      uncommitted		No		committed	Shutdown
      
      
      Note that the "result" field is what should happen, not what does
      happen. The current logic is broken and handles the first two cases
      correctly by luck.  That is, the code will free the EFI if the
      XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The
      inverted logic "works" because if both EFI and EFD are committed,
      then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED
      bit, and the second frees the EFI item. Hence as long as
      xfs_efi_item_committed() has been called, everything appears to be
      fine.
      
      It is the third case where the logic fails - where
      xfs_efd_item_committed() is called before xfs_efi_item_committed(),
      and that results in the EFI being freed before it has been
      committed. That is the bug that triggered the shutdown, and hence
      keeping track of whether the EFI has been committed or not is
      insufficient to correctly order the EFI/EFD operations w.r.t. the
      AIL.
      
      What we really want is this: the EFI is always placed into the
      AIL before the last reference goes away. The only way to guarantee
      that is to not free the EFI until after it has been unpinned
      *and* the EFD has been committed. That is, restructure the logic so
      that the only case that can occur is the first case.
      
      This can be done easily by replacing the XFS_EFI_COMMITTED with an
      EFI reference count. The EFI is initialised with its own count, and
      that is not released until it is unpinned. However, there is a
      complication to this method - the high level EFI/EFD code in
      xfs_bmap_finish() does not hold direct references to the EFI
      structure, and runs a transaction commit between the EFI and EFD
      processing. Hence the EFI can be freed even before the EFD is
      created using such a method.
      
      Further, log recovery uses the AIL for tracking EFI/EFDs that need
      to be recovered, but it uses the AIL *differently* to the EFI
      transaction commit. Hence log recovery never pins or unpins EFIs, so
      we can't drop the EFI reference count indirectly to free the EFI.
      
      However, this doesn't prevent us from using a reference count here.
      There is a 1:1 relationship between EFIs and EFDs, so when we
      initialise the EFI we can take a reference count for the EFD as
      well. This solves the xfs_bmap_finish() issue - the EFI will never
      be freed until the EFD is processed. In terms of log recovery,
      during the committing of the EFD we can look for the
      XFS_EFI_RECOVERED bit being set and drop the EFI reference as well,
      thereby ensuring everything works correctly there as well.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      666d644c
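The reference-counting scheme described in this commit message can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `struct efi_sketch` and every `efi_sketch_*` function are hypothetical names, C11 atomics stand in for the kernel's atomic primitives, and a `freed` flag replaces the actual free so the outcome can be observed. It is not the code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the EFI log item; not the kernel structure. */
struct efi_sketch {
    atomic_int refcount;  /* one reference for the EFI, one for its EFD */
    bool       in_ail;    /* set once the item has been unpinned into the AIL */
    bool       recovered; /* models the XFS_EFI_RECOVERED case */
    bool       freed;     /* set when the last reference is dropped */
};

/* Initialise with two references: the EFI's own and one for the EFD. */
static void efi_sketch_init(struct efi_sketch *efi, bool recovered)
{
    atomic_init(&efi->refcount, 2);
    efi->in_ail = false;
    efi->recovered = recovered;
    efi->freed = false;
}

/* Drop one reference; whoever drops the last one "frees" the item. */
static void efi_sketch_release(struct efi_sketch *efi)
{
    int old = atomic_fetch_sub(&efi->refcount, 1);

    assert(old > 0);
    if (old == 1)
        efi->freed = true;
}

/* EFI unpinned: it is now in the AIL, so drop the EFI's own reference. */
static void efi_sketch_unpin(struct efi_sketch *efi)
{
    efi->in_ail = true;
    efi_sketch_release(efi);
}

/* EFD committed: drop the reference that was taken on the EFD's behalf. */
static void efi_sketch_efd_committed(struct efi_sketch *efi)
{
    /* Recovery never unpins the EFI, so drop its own reference here too. */
    if (efi->recovered)
        efi_sketch_release(efi);
    efi_sketch_release(efi);
}
```

With two references taken up front, the order of the unpin and the EFD commit no longer matters: the item is only released once both have happened, which is the property the single XFS_EFI_COMMITTED bit could not express.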
  5. 03 Apr, 2013 1 commit
  6. 22 Mar, 2013 7 commits
  7. 14 Mar, 2013 3 commits
  8. 07 Mar, 2013 6 commits
      xfs: rearrange some code in xfs_bmap for better locality · 9e5987a7
      Dave Chinner authored
      xfs_bmap.c is a big file, and some of the related code is spread
      throughout the file, requiring function prototypes for static
      functions and jumping all through the file to follow a single call
      path. Rearrange the code so that:
      
      	a) related functionality is grouped together; and
      	b) functions are grouped in call dependency order
      
      While the diffstat is large, there are no code changes in the patch;
      it is just moving the functionality around and removing the function
      prototypes at the top of the file. The resulting layout of the code
      is as follows (top of file to bottom):
      
      	- miscellaneous helper functions
      	- extent tree block counting routines
      	- debug/sanity checking code
      	- bmap free list manipulation functions
      	- inode fork format manipulation functions
      	- internal/external extent tree search functions
      	- extent tree manipulation functions used during allocation
      	- functions used during extent read/allocate/removal
      	  operations (i.e. xfs_bmapi_write, xfs_bmapi_read,
      	  xfs_bunmapi and xfs_getbmap)
      
      This means that following logic paths through the bmapi code is much
      simpler - most of the code relevant to a specific operation is now
      clustered together rather than spread all over the file....
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      9e5987a7
      xfs: rename random32() to prandom_u32() · ecb3403d
      Akinobu Mita authored
      Use the preferred function name, which makes it clear that a
      pseudo-random number generator is used.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Acked-by: <bpm@sgi.com>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: Alex Elder <elder@kernel.org>
      Cc: xfs@oss.sgi.com
      Signed-off-by: Ben Myers <bpm@sgi.com>
      ecb3403d
      xfs: don't verify buffers after IO errors · d5929de8
      Dave Chinner authored
      When we read a buffer, we might get an error from the underlying
      block device and not the real data. Hence if we get an IO error, we
      shouldn't run the verifier but instead just pass the IO error
      straight through.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      d5929de8
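The rule stated in this commit message (skip the verifier when the IO itself failed) can be illustrated with a small user-space sketch. `struct buf_sketch`, its fields, and both functions are hypothetical and do not mirror the real buffer structure or its completion path; they only show the ordering of the two checks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical buffer; field names are illustrative only. */
struct buf_sketch {
    int   error;                              /* 0 on success, errno value on IO failure */
    void (*verify_read)(struct buf_sketch *); /* optional read verifier */
    int   verified;                           /* counts verifier invocations */
};

/* Read completion: run the verifier only when the IO itself succeeded. */
static int buf_sketch_read_done(struct buf_sketch *bp)
{
    assert(bp != NULL);

    if (bp->error)
        return bp->error;   /* contents are not real data: pass the error through */

    if (bp->verify_read)
        bp->verify_read(bp);

    return bp->error;       /* a verifier may have flagged a problem of its own */
}

/* Trivial verifier used to observe whether verification ran. */
static void buf_sketch_count_verify(struct buf_sketch *bp)
{
    bp->verified++;
}
```

Checking the error first means a failed read is reported as the IO error it is, rather than as whatever the verifier would conclude from a buffer that never received valid data.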
      xfs: fix xfs_iomap_eof_prealloc_initial_size type · e8108ced
      Mark Tinguely authored
      Fix the return type of xfs_iomap_eof_prealloc_initial_size() to
      xfs_fsblock_t to reflect the fact that the return value may be an
      unsigned 64-bit value if XFS_BIG_BLKNOS is defined.
      Signed-off-by: Mark Tinguely <tinguely@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      e8108ced
      xfs: increase prealloc size to double that of the previous extent · e114b5fc
      Brian Foster authored
      The updated speculative preallocation algorithm for handling sparse
      files can become less effective in situations with a high number of
      concurrent, sequential writers. The number of writers and amount of
      available RAM affect the writeback bandwidth slicing algorithm,
      which in turn affects the block allocation pattern of XFS. For
      example, running 32 sequential writers on a system with 32GB RAM,
      preallocs become fixed at a value of around 128MB (instead of
      steadily increasing to the 8GB maximum as sequential writes
      proceed).
      
      Update the speculative prealloc heuristic to base the size of the
      next prealloc on double the size of the preceding extent. This
      preserves the original aggressive speculative preallocation
      behavior and continues to accommodate sparse files at a slight cost
      of increasing the size of preallocated data regions following holes
      of sparse files.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      e114b5fc
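The doubling heuristic described in this commit message can be sketched as below. The function name, the block-count units, and the explicit cap parameter are assumptions made for illustration; the commit message only states that the next prealloc is based on double the preceding extent and mentions an 8GB maximum. What happens when there is no preceding extent is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size the next speculative preallocation at double
 * the previous extent, never exceeding max_blocks.
 */
static uint64_t prealloc_sketch_next(uint64_t prev_extent_blocks,
                                     uint64_t max_blocks)
{
    assert(max_blocks > 0);

    /* Comparing against max/2 caps the result and avoids overflow. */
    if (prev_extent_blocks > max_blocks / 2)
        return max_blocks;

    return prev_extent_blocks * 2;
}
```

Because each allocation feeds the size of the next, a sequential writer sees its prealloc grow geometrically until it reaches the cap, instead of settling at a fixed value.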
      xfs: fix potential infinite loop in xfs_iomap_prealloc_size() · e78c420b
      Brian Foster authored
      If freesp == 0, we could end up in an infinite loop while squashing
      the preallocation. Break the loop when we've killed the prealloc
      entirely.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      e78c420b
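A minimal sketch of the loop fix described in this commit message, assuming the squashing step shrinks the preallocation by a right shift until it fits in the free space. The function name and the shift amount of 4 are assumptions for illustration, not taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: shrink the preallocation until it is smaller than
 * the available free space, or until it has been squashed to nothing.
 */
static uint64_t prealloc_sketch_squash(uint64_t alloc_blocks, uint64_t freesp)
{
    /*
     * Without the alloc_blocks test, freesp == 0 would leave the
     * condition "0 >= 0" true forever once the shift reaches zero.
     */
    while (alloc_blocks && alloc_blocks >= freesp)
        alloc_blocks >>= 4; /* assumed shrink factor */

    /* Either the prealloc was killed entirely or it now fits. */
    assert(alloc_blocks == 0 || alloc_blocks < freesp);
    return alloc_blocks;
}
```

The extra test on `alloc_blocks` is what breaks the loop once the preallocation has been reduced to zero.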
  9. 03 Mar, 2013 4 commits
      Linux 3.9-rc1 · 6dbe51c2
      Linus Torvalds authored
      6dbe51c2
      Merge tag 'disintegrate-fbdev-20121220' of git://git.infradead.org/users/dhowells/linux-headers · ea882c2e
      Linus Torvalds authored
      Pull fbdev UAPI disintegration from David Howells:
       "You'll be glad to hear that the end is nigh for the UAPI patches.
        Only the fbdev/framebuffer piece remains now that the SCSI stuff has
        gone in.
      
        Here are the UAPI disintegration bits for the fbdev drivers.  It
        appears that Florian hasn't had time to deal with my patch, but back
        in December he did say he didn't mind if I pushed it forward."
      
      Yay.  No more uapi movement.  And hopefully no more big header file
      cleanups coming up either, it just tends to be very painful.
      
      * tag 'disintegrate-fbdev-20121220' of git://git.infradead.org/users/dhowells/linux-headers:
        UAPI: (Scripted) Disintegrate include/video
      ea882c2e
      Merge tag 'stable/for-linus-3.9-rc1-tag' of... · 8e8b180a
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
       - Update the Xen ACPI memory and CPU hotplug locking mechanism.
       - Fix PAT issues wherein various applications would not start
       - Fix handling of multiple MSI as AHCI now does it.
       - Fix ARM compile failures.
      
      * tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        xenbus: fix compile failure on ARM with Xen enabled
        xen/pci: We don't do multiple MSI's.
        xen/pat: Disable PAT using pat_enabled value.
        xen/acpi: xen cpu hotplug minor updates
        xen/acpi: xen memory hotplug minor updates
      8e8b180a
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 56a79b7b
      Linus Torvalds authored
      Pull more VFS bits from Al Viro:
       "Unfortunately, it looks like xattr series will have to wait until the
        next cycle ;-/
      
        This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
        etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
        more file_inode() work"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        constify path_get/path_put and fs_struct.c stuff
        fix nommu breakage in shmem.c
        cache the value of file_inode() in struct file
        9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
        9p: make sure ->lookup() adds fid to the right dentry
        9p: untangle ->lookup() a bit
        9p: double iput() in ->lookup() if d_materialise_unique() fails
        9p: v9fs_fid_add() can't fail now
        v9fs: get rid of v9fs_dentry
        9p: turn fid->dlist into hlist
        9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
        more file_inode() open-coded instances
        selinux: opened file can't have NULL or negative ->f_path.dentry
      
      (In the meantime, the hlist traversal macros have changed, so this
      required a semantic conflict fixup for the newly hlistified fid->dlist)
      56a79b7b