1. 15 May, 2012 11 commits
    • ext3: return 32/64-bit dir name hash according to usage type · d7dab39b
      Eric Sandeen authored
      This is based on commit d1f5273e
      ext4: return 32/64-bit dir name hash according to usage type
      by Fan Yong <yong.fan@whamcloud.com>
      
      Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
      to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
      and telldir().  However, this causes problems if there are 32-bit hash
      collisions, since the NFSv2 server can get stuck resending the same
      entries from the directory repeatedly.
      
      Allow ext3 to return a full 64-bit hash (both major and minor) for
      telldir to decrease the chance of hash collisions.
      
      This patch does implement a new ext3_dir_llseek op, because with 64-bit
      hashes, nfs will attempt to seek to a hash "offset" which is much
      larger than ext3's s_maxbytes.  So for dx dirs, we call
      generic_file_llseek_size() with the appropriate max hash value as the
      maximum seekable size.  Otherwise we just pass through to
      generic_file_llseek().
      Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Patch-updated-by: Eric Sandeen <sandeen@redhat.com>
      (blame us if something is not correct)
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
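The 32-bit versus 64-bit position described in the commit above can be illustrated with a small user-space sketch. The helper names (`dir_hash_to_pos`, `pos_to_major`, `pos_to_minor`) and the `use_64bit` flag are hypothetical and are not taken from the patch. Dropping the lowest bit of the major hash is an assumption of this sketch, chosen so the result stays non-negative when stored in a signed file offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: fold the major/minor directory hashes into one
 * seek position. With a 32-bit cookie only the major hash fits. */
static inline uint64_t dir_hash_to_pos(uint32_t major, uint32_t minor,
                                       bool use_64bit)
{
    if (!use_64bit)
        return major >> 1;                      /* 31 significant bits */
    return ((uint64_t)(major >> 1) << 32) | (uint64_t)minor;
}

/* Recover the major hash (minus its lowest bit) from a position. */
static inline uint32_t pos_to_major(uint64_t pos, bool use_64bit)
{
    if (!use_64bit)
        return (uint32_t)(pos << 1);
    return (uint32_t)((pos >> 32) << 1);
}

/* Recover the minor hash; a 32-bit position carries none. */
static inline uint32_t pos_to_minor(uint64_t pos, bool use_64bit)
{
    if (!use_64bit)
        return 0;
    return (uint32_t)pos;
}
```

In this sketch two entries whose major hashes collide still get distinct 64-bit positions as long as their minor hashes differ, which is the property the commit message relies on.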
    • quota: Get rid of nested I_MUTEX_QUOTA locking subclass · a80b12c3
      Jan Kara authored
      So far i_mutex ranked above dqonoff_mutex, while i_mutex on quota files
      was special and ranked below dqonoff_mutex (and several other locks).
      However, there's no real need for i_mutex on quota files to be special.
      IO on quota files is serialized by dqio_mutex anyway, so we don't need to
      take i_mutex when writing to quota files. Other places where we take i_mutex
      on a quota file can accommodate standard i_mutex lock ranking; we only need
      to change the lock ranking to dqonoff_mutex > i_mutex, which is a matter
      of changing documentation because no place enforces ordering in the
      other direction.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • Jan Kara's avatar
      f9ef1784
    • ext2: Remove i_mutex use from ext2_quota_write() · e2a3fde7
      Jan Kara authored
      We don't need i_mutex in ext2_quota_write() because writes to the quota file
      are serialized by dqio_mutex anyway. Changes to quota files outside of quota
      code are forbidden, which is enforced by the NOATIME and IMMUTABLE bits.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • reiserfs: Remove i_mutex use from reiserfs_quota_write() · 67f1648d
      Jan Kara authored
      We don't need i_mutex in reiserfs_quota_write() because writes to the quota file
      are serialized by dqio_mutex anyway. Changes to quota files outside of quota
      code are forbidden, which is enforced by the NOATIME and IMMUTABLE bits.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext4: Remove i_mutex use from ext4_quota_write() · 0b7f7cef
      Jan Kara authored
      We don't need i_mutex in ext4_quota_write() because writes to the quota file
      are serialized by dqio_mutex anyway. Changes to quota files outside of quota
      code are forbidden, which is enforced by the NOATIME and IMMUTABLE bits.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext3: Remove i_mutex use from ext3_quota_write() · 905c3937
      Jan Kara authored
      We don't need i_mutex in ext3_quota_write() because writes to the quota file
      are serialized by dqio_mutex anyway. Changes to quota files outside of quota
      code are forbidden, which is enforced by the NOATIME and IMMUTABLE bits.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • quota: Fix double lock in add_dquot_ref() with CONFIG_QUOTA_DEBUG · d7e97117
      Jan Kara authored
      When CONFIG_QUOTA_DEBUG is enabled, we call inode_get_rsv_space() from
      add_dquot_ref() while holding i_lock. But inode_get_rsv_space() also tries
      to take i_lock, resulting in a double lock.
      
      Fix the problem by moving the inode_get_rsv_space() call out of the
      i_lock critical section.
      Reported-and-analyzed-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
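The double-lock fix above follows a general pattern: call a helper that takes a lock before entering your own critical section on that same lock. The following is a minimal user-space sketch of that pattern, not the kernel code. All `toy_*` names are hypothetical, and the toy lock merely asserts on re-acquisition instead of deadlocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock standing in for i_lock (hypothetical). */
struct toy_lock { bool held; };

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held && "double lock");  /* a real spinlock would hang here */
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

struct toy_inode {
    struct toy_lock lock;
    long rsv_space;   /* reserved space, protected by lock */
    int refs;         /* reference count, protected by lock */
};

/* Takes the lock itself, as the commit message says
 * inode_get_rsv_space() does. */
static long toy_get_rsv_space(struct toy_inode *inode)
{
    toy_lock_acquire(&inode->lock);
    long v = inode->rsv_space;
    toy_lock_release(&inode->lock);
    return v;
}

/* Fixed pattern: query the reserved space first, then take the lock
 * for the part that really needs it. */
static long toy_add_ref(struct toy_inode *inode)
{
    long rsv = toy_get_rsv_space(inode);  /* outside the critical section */

    toy_lock_acquire(&inode->lock);
    inode->refs++;
    toy_lock_release(&inode->lock);
    return rsv;
}
```

Calling `toy_get_rsv_space()` between the acquire and release in `toy_add_ref()` would trip the assertion, which is the toy equivalent of the double lock reported in the commit.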
    • jbd: Write journal superblock with WRITE_FUA after checkpointing · fd2cbd4d
      Jan Kara authored
      If the journal superblock is written only to the disk's caches and another
      transaction starts reusing the space of the transaction cleaned from the log,
      blocks of the new transaction can reach the disk before the journal
      superblock. If a power failure happens in such a case, subsequent journal
      replay would still try to replay the old transaction, but some of its blocks
      may already be overwritten by the new transaction. For this reason we must
      use WRITE_FUA when updating the log tail, and we must first write the new log
      tail to disk and update in-memory information only after that.
      Signed-off-by: Jan Kara <jack@suse.cz>
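The ordering rule from the commit above (make the new log tail durable first, update the in-memory copy only afterwards) can be sketched in user space as follows. The `toy_*` names and the simulated failure flag are assumptions for illustration and do not correspond to jbd's actual functions or data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy journal: one tail value "on disk" and one cached in memory. */
struct toy_journal {
    uint32_t disk_tail;   /* tail recorded in the on-disk superblock */
    uint32_t mem_tail;    /* tail the running system believes in */
    bool disk_fails;      /* simulate a failed superblock write */
};

/* Stand-in for a durable (FUA-style) superblock write.
 * Returns 0 on success, -1 on failure. */
static int toy_write_tail_durable(struct toy_journal *j, uint32_t new_tail)
{
    if (j->disk_fails)
        return -1;
    j->disk_tail = new_tail;
    return 0;
}

/* Update the log tail: disk first, memory second. If the write fails,
 * memory keeps the old tail, so the space is not handed out for reuse. */
static int toy_update_log_tail(struct toy_journal *j, uint32_t new_tail)
{
    int err = toy_write_tail_durable(j, new_tail);
    if (err)
        return err;
    j->mem_tail = new_tail;
    return 0;
}
```

With this ordering the in-memory tail never runs ahead of what a replay after power loss would read from disk.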
    • jbd: protect all log tail updates with j_checkpoint_mutex · 1ce8486d
      Jan Kara authored
      There are some log tail updates that are not protected by j_checkpoint_mutex.
      Some of these are harmless because they happen during startup or shutdown but
      updates in journal_commit_transaction() and journal_flush() can really race
      with other log tail updates (e.g. someone doing journal_flush() with someone
      running cleanup_journal_tail()). So protect all log tail updates with
      j_checkpoint_mutex.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • jbd: Split updating of journal superblock and marking journal empty · 9754e39c
      Jan Kara authored
      There are three cases of updating the journal superblock. In the first case, we
      want to mark the journal as empty (setting s_sequence to 0); in the second case
      we want to update the log tail; in the third case we want to update s_errno.
      Split these cases into separate functions. This makes the code slightly more
      straightforward, and later patches will make the distinction even more important.
      Signed-off-by: Jan Kara <jack@suse.cz>
  2. 11 Apr, 2012 6 commits
    • ext2: do not register write_super within VFS · f72cf5e2
      Artem Bityutskiy authored
      Jan Kara removed the 'sb->s_dirt' VFS flag references, so we do not need to
      register the ext2 'ext2_write_super()' method in the VFS superblock operations,
      because 'sb->s_dirt' will never be set to 1 and the VFS will never call
      '->write_super()' anyway. Thus, remove the method.
      
      Tested using xfstests.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext2: Remove s_dirt handling · b838ec22
      Jan Kara authored
      Places which modify superblock feature / state fields mark the superblock
      buffer dirty so it is written out by flusher thread. Thus there's no need to
      set s_dirt there.
      
      The only other fields changing in the superblock are the numbers of free
      blocks, free inodes and s_wtime. There's no real need to write (or even
      compute) these periodically. Free blocks / inodes counters are recomputed on
      every mount from group counters anyway and value of s_wtime is only
      informational and imprecise anyway. So it should be enough to write these
      opportunistically on mount, remount, umount, and sync_fs times.
      Signed-off-by: Jan Kara <jack@suse.cz>
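The commit above notes that the free block and free inode totals are recomputed from the per-group counters on every mount, so the superblock copies need not be kept current. A rough user-space illustration of that recomputation follows. The `toy_group_desc` layout and function names are hypothetical and much simpler than ext2's real group descriptors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-group counters. */
struct toy_group_desc {
    uint16_t free_blocks;
    uint16_t free_inodes;
};

/* Sum the per-group free block counts into a filesystem-wide total. */
static uint64_t toy_count_free_blocks(const struct toy_group_desc *g, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += g[i].free_blocks;
    return total;
}

/* Same for free inodes. */
static uint64_t toy_count_free_inodes(const struct toy_group_desc *g, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += g[i].free_inodes;
    return total;
}
```

Because the totals are derivable this way, a stale value in the superblock costs nothing beyond the recomputation at mount time.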
    • ext2: write superblock only once on unmount · f2b22420
      Artem Bityutskiy authored
      Currently on unmount, if we are mounted R/W, we first write the superblock to
      the media if it is dirty, and then write it again, which is not optimal. This
      patch makes ext2 write the superblock only once on unmount.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext3: update documentation with barrier=1 default · ee65244b
      Stefan Hajnoczi authored
      Commit 00eacd66 ("ext3: make ext3 mount default to barrier=1") changed
      the default barrier mount option for ext3.  The documentation needs to
      be updated, so this patch does that.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext3: remove max_debt in find_group_orlov() · ac0dd247
      Akira Fujita authored
      max_debt and the variables and calculations involving it
      are no longer needed; clean them up.
      Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • jbd: Refine commit writeout logic · 2db938be
      Jan Kara authored
      Currently we write out all journal buffers in WRITE_SYNC mode. This improves
      performance for fsync-heavy workloads but hinders performance when writes
      are mostly asynchronous; most noticeably, it slows down readers, and users
      complain about slow desktop response.
      
      So submit writes as asynchronous in the normal case and only submit writes as
      WRITE_SYNC if we detect someone is waiting for current transaction commit.
      
      I've gathered some numbers to back this change. The first is the read latency
      test. It measures the time to read 1 MB after several seconds of sleeping in
      the presence of streaming writes.
      
      Top 10 times (out of 90) in us:
      Before		After
      2131586		697473
      1709932		557487
      1564598		535642
      1480462		347573
      1478579		323153
      1408496		222181
      1388960		181273
      1329565		181070
      1252486		172832
      1223265		172278
      
      Average:
      619377		82180
      
      So the improvement in both maximum and average latency is massive.
      
      I've measured fsync throughput by:
      fs_mark -n 100 -t 1 -s 16384 -d /mnt/fsync/ -S 1 -L 4
      
      in the presence of a streaming reader. The numbers (fsyncs/s) are:
      Before		After
      9.9		6.3
      6.8		6.0
      6.3		6.2
      5.8		6.1
      
      So fsync performance seems unharmed by this change.
      Signed-off-by: Jan Kara <jack@suse.cz>
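The writeout policy described in the commit above (asynchronous by default, synchronous only when someone is waiting for the commit) reduces to a small decision. The sketch below shows that decision in isolation. The enum, the `waiters` field, and the function name are hypothetical stand-ins, and how jbd actually detects a waiter is not shown in the commit message.

```c
/* Hypothetical stand-ins for the kernel's write modes. */
enum toy_write_op { TOY_WRITE_ASYNC, TOY_WRITE_SYNC };

/* Toy transaction: only tracks how many tasks wait for its commit. */
struct toy_transaction {
    int waiters;
};

/* Submit journal writes synchronously only if someone is blocked on
 * this commit; otherwise let them go out as ordinary async writes so
 * they do not crowd out readers. */
static enum toy_write_op
toy_commit_write_op(const struct toy_transaction *t)
{
    return t->waiters > 0 ? TOY_WRITE_SYNC : TOY_WRITE_ASYNC;
}
```

This keeps fsync-style callers on the fast path while background commits stop competing with readers, which matches the latency and fsync numbers quoted in the commit.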
  3. 10 Apr, 2012 8 commits
  4. 09 Apr, 2012 6 commits
  5. 08 Apr, 2012 1 commit
  6. 07 Apr, 2012 8 commits