1. 18 Apr, 2008 16 commits
    • [XFS] Update c/mtime correctly on truncates · 44d814ce
      David Chinner authored
      XFS changes the c/mtime of an inode when truncating it to the same size.
      For truncate, the c/mtime is only supposed to change if the size changes.
      This is not to be confused with ftruncate, where the c/mtime is supposed
      to change even if the size does not.
      
      The Linux VFS encodes this semantic difference in the flags it sends down
      to ->setattr, which XFS currently ignores. We need to make XFS pay
      attention to the VFS flags and hence Do The Right Thing.
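
The flag-driven behaviour described above can be sketched in user-space C. This is a minimal illustration, not the XFS patch: the `SK_ATTR_*` bits, `struct sk_inode`, and `sk_setattr_size()` are hypothetical stand-ins for the VFS attribute flags and the ->setattr path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits for this sketch; they mirror the idea of the VFS
 * attribute flags but are not the kernel's definitions. */
enum {
    SK_ATTR_SIZE  = 1 << 0,  /* caller wants the size set */
    SK_ATTR_MTIME = 1 << 1,  /* caller explicitly asks for an mtime update */
    SK_ATTR_CTIME = 1 << 2,  /* caller explicitly asks for a ctime update */
};

/* Hypothetical, heavily simplified inode. */
struct sk_inode {
    uint64_t size;
    uint64_t mtime;
    uint64_t ctime;
};

/*
 * Apply a size change. Timestamps move if the size really changed, or if
 * the caller passed the explicit time flags (the ftruncate-style case).
 * Returns true if any timestamp was updated.
 */
static bool sk_setattr_size(struct sk_inode *ip, unsigned int valid,
                            uint64_t new_size, uint64_t now)
{
    assert(valid & SK_ATTR_SIZE);

    bool size_changed = (ip->size != new_size);
    bool set_mtime = size_changed || (valid & SK_ATTR_MTIME);
    bool set_ctime = size_changed || (valid & SK_ATTR_CTIME);

    ip->size = new_size;
    if (set_mtime)
        ip->mtime = now;
    if (set_ctime)
        ip->ctime = now;
    return set_mtime || set_ctime;
}
```

In this sketch a same-size call with only `SK_ATTR_SIZE` leaves both timestamps alone, while the same call with the time flags set updates them.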
      
      SGI-PV: 977547
      SGI-Modid: xfs-linux-melb:xfs-kern:30536a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] don't encode parent in nfs filehandles unless necessary · 24bd861d
      Christoph Hellwig authored
      As Dave pointed out, after the export ops changes we now always encode the
      parent into the filehandle for regular files, but it's not actually needed
      when the filesystem is exported with no_subtree_check. This one-liner fixes
      xfs_fs_encode_fh to skip encoding the parent unless necessary.
      
      SGI-PV: 976035
      SGI-Modid: xfs-linux-melb:xfs-kern:30535a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] kill xfs_rwlock/xfs_rwunlock · 126468b1
      Christoph Hellwig authored
      We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly
      bhv_vrwlock_t.
      
      SGI-PV: 976035
      SGI-Modid: xfs-linux-melb:xfs-kern:30533a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] kill xfs_get_dir_entry · 43973964
      Christoph Hellwig authored
      Instead of xfs_get_dir_entry use a macro to get the xfs_inode from the
      dentry in the callers and grab the reference manually.
      
      Only grab the reference once as it's fine to keep it over the dmapi calls.
      (And even that reference is actually superfluous in Linux but I'll leave
      that for another patch)
      
      SGI-PV: 976035
      SGI-Modid: xfs-linux-melb:xfs-kern:30531a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] vnode cleanup in xfs_fs_subr.c · a8b3acd5
      Christoph Hellwig authored
      Clean up the unneeded intermediate vnode step in the flushing helpers and
      go directly from the xfs_inode to the struct address_space.
      
      SGI-PV: 976035
      SGI-Modid: xfs-linux-melb:xfs-kern:30530a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] cleanup xfs_vn_mknod · db0bb7ba
      Christoph Hellwig authored
      - use proper goto based unwinding instead of the current mess of
        multiple conditionals
      - rename ip to inode because that's the normal convention for Linux
        inodes while ip is the convention for xfs_inodes
      - remove unlikely checks for the default_acl - branches marked unlikely
        might lead to extreme branch predictor slowdowns if taken and for some
        workloads a default acl is quite common
      - properly indent the switch statements
      - remove xfs_has_fs_struct as nfsd has a fs_struct in any semi-recent
        kernel
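
The goto-based unwinding named in the first item can be sketched as follows. This is a generic illustration of the pattern, not the xfs_vn_mknod code: `struct sk_node`, `sk_node_create()`, and the `fail_at` test hook are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-buffer object used only to illustrate the pattern. */
struct sk_node {
    char *name;
    char *acl;
};

/*
 * Build a node in three steps. fail_at (0 = never) forces step N to fail
 * so each unwinding path can be exercised. Returns NULL on failure.
 */
static struct sk_node *sk_node_create(int fail_at)
{
    struct sk_node *n;

    n = (fail_at == 1) ? NULL : malloc(sizeof(*n));
    if (!n)
        goto out;

    n->name = (fail_at == 2) ? NULL : malloc(16);
    if (!n->name)
        goto out_free_node;

    n->acl = (fail_at == 3) ? NULL : malloc(16);
    if (!n->acl)
        goto out_free_name;

    return n;

    /* Labels undo the completed steps in reverse order. */
out_free_name:
    free(n->name);
out_free_node:
    free(n);
out:
    return NULL;
}

/* Tear down a fully constructed node. */
static void sk_node_destroy(struct sk_node *n)
{
    assert(n != NULL);
    free(n->acl);
    free(n->name);
    free(n);
}
```

Each failure jumps to the label that releases exactly what has been set up so far, which avoids the nested conditionals the commit message complains about.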
      
      SGI-PV: 976035
      SGI-Modid: xfs-linux-melb:xfs-kern:30529a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Use atomics for iclog reference counting · 155cc6b7
      David Chinner authored
      Now that we update the log tail LSN less frequently on transaction
      completion, we pass the contention straight to the global log state lock
      (l_iclog_lock) during transaction completion.
      
      We currently have to take this lock to decrement the iclog reference
      count. There is a reference count on each iclog, so we need to take the
      global lock for all refcount changes.
      
      When large numbers of processes are all doing small transactions, the iclog
      reference counts will be quite high, and the state change that absolutely
      requires the l_iclog_lock is the exception rather than the norm.
      
      Change the reference counting on the iclogs to use atomic_inc/dec so that
      we can use atomic_dec_and_lock during transaction completion and avoid the
      need for grabbing the l_iclog_lock for every reference count decrement
      except the one that matters - the last.
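
The decrement-and-lock idea can be sketched in user space with C11 atomics and a pthread mutex. This is an approximation of the pattern, not the kernel's atomic_dec_and_lock or the XFS log code: `struct sk_iclog`, `sk_dec_and_lock()`, and `sk_iclog_release()` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an iclog with an atomic reference count. */
struct sk_iclog {
    atomic_int refcnt;
};

/*
 * Drop one reference. Returns true with *lock held only if this call
 * took the count to zero; otherwise returns false without the lock.
 */
static bool sk_dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
    int old = atomic_load(cnt);

    assert(old > 0);

    /* Fast path: while the count is above 1, decrement without the lock. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(cnt, &old, old - 1))
            return false;
        /* On failure, old was reloaded with the current value; retry. */
    }

    /* Slow path: this may be the last reference, so lock first. */
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(cnt, 1) == 1)
        return true;            /* count hit zero, lock stays held */
    pthread_mutex_unlock(lock);
    return false;
}

/* Release a reference; returns true if the last-reference work ran. */
static bool sk_iclog_release(struct sk_iclog *iclog,
                             pthread_mutex_t *state_lock)
{
    if (!sk_dec_and_lock(&iclog->refcnt, state_lock))
        return false;
    /* The state change that needs the lock would go here. */
    pthread_mutex_unlock(state_lock);
    return true;
}
```

Only the caller that drops the final reference pays for the lock; every other release is a lock-free compare-and-swap.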
      
      SGI-PV: 975671
      SGI-Modid: xfs-linux-melb:xfs-kern:30505a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Prevent AIL lock contention during transaction completion · b589334c
      David Chinner authored
      When hundreds of processors attempt to commit transactions at the same
      time, they can contend on the AIL lock when updating the tail LSN held in
      the in-core log structure.
      
      At the moment, the tail LSN is only needed when actually writing out an
      iclog, so it really does not need to be updated on every single
      transaction completion - only those that result in switching iclogs and
      flushing them to disk.
      
      The result is that we reduce the number of times we need to grab the AIL
      lock and the log grant lock by up to two orders of magnitude on large
      processor count machines. The problem was previously hidden by AIL lock
      contention while walking the AIL list; that contention was recently
      solved, which uncovered this issue.
      
      SGI-PV: 975671
      SGI-Modid: xfs-linux-melb:xfs-kern:30504a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Use xfs_inode_clean() in more places · 33540408
      David Chinner authored
      Remove open coded checks for whether the inode is clean and replace
      them with an inlined function.
      
      SGI-PV: 977461
      SGI-Modid: xfs-linux-melb:xfs-kern:30503a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Remove the xfs_icluster structure · bad55843
      David Chinner authored
      Remove the xfs_icluster structure and replace with a radix tree lookup.
      
      We don't need to keep a list of inodes in each cluster around anymore as
      we can look them up quickly when we need to. The only time we need to do
      this now is during inode writeback.
      
      Factor the inode cluster writeback code out of xfs_iflush and convert it
      to use radix_tree_gang_lookup() instead of walking a list of inodes built
      when we first read in the inodes.
      
      This removes 3 pointers from each xfs_inode structure and the xfs_icluster
      structure per inode cluster. Hence we reduce the cache footprint of the
      xfs_inodes by between 5% and 10% depending on cluster sparseness.
      
      To be truly efficient we need a radix_tree_gang_lookup_range() call to
      stop searching once we are past the end of the cluster instead of trying
      to find a full cluster's worth of inodes.
      
      Before (ia64):
      
      $ cat /sys/slab/xfs_inode/object_size
      536
      
      After:
      
      $ cat /sys/slab/xfs_inode/object_size
      512
      
      SGI-PV: 977460
      SGI-Modid: xfs-linux-melb:xfs-kern:30502a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Don't block pdflush when writing back inodes · a3f74ffb
      David Chinner authored
      When pdflush is writing back inodes, it can get stuck on inode cluster
      buffers that are currently under I/O. This occurs when we write data to
      multiple inodes in the same inode cluster at the same time.
      
      Effectively, delayed allocation marks the inode dirty during the data
      writeback. Hence if the inode cluster was flushed during the writeback of
      the first inode, the writeback of the second inode will block waiting for
      the inode cluster write to complete before writing it again for the newly
      dirtied inode.
      
      Basically, we want to prevent this from happening so we don't block pdflush
      and slow down all of writeback. Hence we introduce a non-blocking async
      inode flush flag that pdflush uses. If this flag is set, we use
      non-blocking operations (e.g. try locks) wherever we can to avoid
      blocking or extra I/O being issued.
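
The try-lock approach can be illustrated with a user-space sketch. It is an analogy under stated assumptions, not the XFS flush path: a pthread mutex stands in for an inode cluster buffer under I/O, and `enum sk_flush_mode`, `struct sk_cluster`, and `sk_cluster_flush()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical flush modes for this sketch, not kernel flags. */
enum sk_flush_mode {
    SK_FLUSH_SYNC,           /* may block waiting for the lock */
    SK_FLUSH_ASYNC_NOBLOCK,  /* must not block; skip if busy */
};

struct sk_cluster {
    pthread_mutex_t lock;    /* stands in for a buffer under I/O */
    int flushes;             /* number of flushes performed */
};

/* Returns 0 if flushed, EAGAIN if skipped because the lock was busy. */
static int sk_cluster_flush(struct sk_cluster *c, enum sk_flush_mode mode)
{
    assert(c != NULL);

    if (mode == SK_FLUSH_ASYNC_NOBLOCK) {
        /* Background writeback: give up rather than wait. */
        if (pthread_mutex_trylock(&c->lock) != 0)
            return EAGAIN;
    } else {
        pthread_mutex_lock(&c->lock);
    }

    c->flushes++;            /* the actual write would happen here */
    pthread_mutex_unlock(&c->lock);
    return 0;
}
```

A caller in the non-blocking mode treats `EAGAIN` as "try this cluster again later" and moves on to other work instead of stalling.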
      
      SGI-PV: 970925
      SGI-Modid: xfs-linux-melb:xfs-kern:30501a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Factor xfs_itobp() and xfs_inotobp(). · 4ae29b43
      David Chinner authored
      The only difference between the functions is one passes an inode for the
      lookup, the other passes an inode number. However, they don't do the same
      validity checking or set all the same state on the buffer that is returned,
      yet they should.
      
      Factor the functions into a common implementation.
      
      SGI-PV: 970925
      SGI-Modid: xfs-linux-melb:xfs-kern:30500a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Fix regression due to refcache removal · e9a56b7c
      Lachlan McIlroy authored
      SGI-PV: 971186
      SGI-Modid: xfs-linux-melb:xfs-kern:30490a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: Donald Douwsma <donaldd@sgi.com>
    • [XFS] Remove the xfs_refcache · 163d3686
      Donald Douwsma authored
      Remove the xfs_refcache; it was only needed while we were still
      building for 2.4 kernels.
      
      SGI-PV: 971186
      SGI-Modid: xfs-linux-melb:xfs-kern:30472a
      Signed-off-by: Donald Douwsma <donaldd@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] make inode reclaim synchronise with xfs_iflush_done() · 461aa8a2
      Lachlan McIlroy authored
      On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode.
      If the inode flush lock is not already held and there is an outstanding
      xfs_iflush_done() then we might free the inode prematurely. By acquiring
      and releasing the flush lock we will synchronise with xfs_iflush_done().
      
      SGI-PV: 909874
      SGI-Modid: xfs-linux-melb:xfs-kern:30468a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: David Chinner <dgc@sgi.com>
    • [XFS] actually check error returned by xfs_flush_pages, clean up and · e12070a5
      Niv Sardi authored
      bail out if it fails.
      
      SGI-PV: 973041
      SGI-Modid: xfs-linux-melb:xfs-kern:30462a
      Signed-off-by: Niv Sardi <xaiki@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
  2. 17 Apr, 2008 2 commits
  3. 16 Apr, 2008 22 commits