Commits · 4ed5c033c11b33149d993734a6a8de1016e8f03f · nexedi / linux

20 May, 2011 1 commit

ext4: Use schedule_timeout_interruptible() for waiting in lazyinit thread · 4ed5c033

Lukas Czerner authored May 20, 2011

In order to make lazyinit eat approx. 10% of io bandwidth at max, we
are sleeping between zeroing each single inode table. For that purpose
we are using timer which wakes up thread when it expires. It is set
via add_timer() and this may cause troubles in the case that thread
has been woken up earlier and in next iteration we call add_timer() on
still running timer hence hitting BUG_ON in add_timer(). We could fix
that by using mod_timer() instead however we can use
schedule_timeout_interruptible() for waiting and hence simplifying
things a lot.

This commit exchange the old "waiting mechanism" with simple
schedule_timeout_interruptible(), setting the time to sleep. Hence we
do not longer need li_wait_daemon waiting queue and others, so get rid
of it.

Addresses-Red-Hat-Bugzilla: #699708
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

4ed5c033

18 May, 2011 3 commits

ext4: wait for writeback to complete while making pages writable · 0e499890

Darrick J. Wong authored May 18, 2011

In order to stabilize pages during disk writes, ext4_page_mkwrite must
wait for writeback operations to complete before making a page
writable.  Furthermore, the function must return locked pages, and
recheck the writeback status if the page lock is ever dropped.  The
"someone could wander in" part of this patch was suggested by Chris
Mason.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0e499890

ext4: clean up some wait_on_page_writeback calls · 7cb1a535

Darrick J. Wong authored May 18, 2011

wait_on_page_writeback already checks the writeback bit, so callers of it
needn't do that test.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7cb1a535

ext4: don't warn about mnt_count if it has been disabled · ed3ce80a

Tao Ma authored May 18, 2011

Currently, if we mkfs a new ext4 volume with s_max_mnt_count set to
zero, and mount it for the first time, we will get the warning:

	maximal mount count reached, running e2fsck is recommended

It is really misleading. So change the check so that it won't warn in
that case.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ed3ce80a

16 May, 2011 2 commits

ext4: ext4_ext_convert_to_initialized bug found in extended FSX testing · 9b940f8e

Allison Henderson authored May 16, 2011

This patch addresses bugs found while testing punch hole 
with the fsx test.  The patch corrects the number of blocks
that are zeroed out while splitting an extent, and also corrects
the return value to return the number of blocks split out, instead
of the number of blocks zeroed out.

This patch has been tested in addition to the following patches: 
[Ext4 punch hole v7]
[XFS Tests Punch Hole 1/1 v2] Add Punch Hole Testing to FSX

The test ran successfully for 24 hours.
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9b940f8e

ext4: fix oops in ext4_quota_off() · 0b268590

Amir Goldstein authored May 16, 2011

If quota is not enabled when ext4_quota_off() is called, we must not
dereference quota file inode since it is NULL.  Check properly for
this.

This fixes a bug in commit 21f97697 (ext4: remove unnecessary
[cm]time update of quota file), which was merged for 2.6.39-rc3.
Reported-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0b268590

15 May, 2011 1 commit

ext4: don't dereference null pointer when make_indexed_dir() fails · 6976a6f2

Allison Henderson authored May 15, 2011

Fix for a null pointer bug found while running punch hole tests
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6976a6f2

10 May, 2011 4 commits

ext4: remove alloc_semp · 44183d42

Amir Goldstein authored May 09, 2011

After taking care of all group init races, all that remains is to
remove alloc_semp from ext4_allocation_context and ext4_buddy structs.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

44183d42

ext4: teach ext4_mb_init_cache() to skip uptodate buddy caches · 9b8b7d35

Amir Goldstein authored May 09, 2011

After online resize which adds new groups, some of the groups
in a buddy page may be initialized and uptodate, while other
(new ones) may be uninitialized.

The indication for init of new block groups is when ext4_mb_init_cache()
is called with an uptodate buddy page. In this case, initialized groups
on that buddy page must be skipped when initializing the buddy cache.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9b8b7d35

ext4: synchronize ext4_mb_init_group() with buddy page lock · 2de8807b

Amir Goldstein authored May 09, 2011

The old routines ext4_mb_[get|put]_buddy_cache_lock(), which used
to take grp->alloc_sem for all groups on the buddy page have been
replaced with the routines ext4_mb_[get|put]_buddy_page_lock().

The new routines take both buddy and bitmap page locks to protect
against concurrent init of groups on the same buddy page.

The GROUP_NEED_INIT flag is tested again under page lock to check
if the group was initialized by another caller.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2de8807b

ext4: implement ext4_add_groupblocks() by freeing blocks · e73a347b

Amir Goldstein authored May 09, 2011

The old imlementation used to take grp->alloc_sem and set the
GROUP_NEED_INIT flag, so that the buddy cache would be reloaded.

The new implementation updates the buddy cache by freeing the added
blocks and making them available for use, so there is no need to
reload the buddy cache and there is no need to take grp->alloc_sem.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e73a347b

09 May, 2011 6 commits

ext4: remove unneeded ext4_journal_get_undo_access · 2cd05cc3

Theodore Ts'o authored May 09, 2011

The block allocation code used to use jbd2_journal_get_undo_access as
a way to make changes that wouldn't show up until the commit took
place. The new multi-block allocation code has a its own way of
preventing newly freed blocks from getting reused until the commit
takes place (it avoids updating the buddy bitmaps until the commit is
done), so we don't need to use jbd2_journal_get_undo_access(), which
has extra overhead compared to jbd2_journal_get_write_access().

There was one last vestigal use of ext4_journal_get_undo_access() in
ext4_add_groupblocks(); change it to use ext4_journal_get_write_access()
and then remove the ext4_journal_get_undo_access() support.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2cd05cc3

ext4: move ext4_add_groupblocks() to mballoc.c · 2846e820

Amir Goldstein authored May 09, 2011

In preparation for the next patch, the function ext4_add_groupblocks()
is moved to mballoc.c, where it could use some static functions.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2846e820

ext4: remove redundant #ifdef in super.c · 66bb8279

Amerigo Wang authored May 09, 2011

There is already an #ifdef CONFIG_QUOTA some lines above,
so this one is totally useless.
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

66bb8279

ext4: remove redundant check for first_not_zeroed in ext4_register_li_request · 55ff3840

Tao Ma authored May 09, 2011

We have checked first_not_zeroed == ngroups already above, so remove
this redundant check.

sbi->s_li_request = NULL above is also removed since it is NULL
already.

Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

55ff3840

ext4: use s_inodes_per_block directly in __ext4_get_inode_loc · 00d09882

Tao Ma authored May 09, 2011

In __ext4_get_inode_loc, we calculate inodes_per_block every time by
EXT4_BLOCK_SIZE(sb) / EXT4_INODE_SIZE(sb).  AFAICS, this function is a
hot path for ext4, so we'd better use s_inodes_per_block directly
instead of calculating every time.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

00d09882

ext4: use EXT4FS_DEBUG instead of EXT4_DEBUG in fsync.c · e8bbe8c4

Tao Ma authored May 09, 2011

We have EXT4FS_DEBUG for some old debug and CONFIG_EXT4_DEBUG
for the new mballoc debug, but there isn't any EXT4_DEBUG.

As CONFIG_EXT4_DEBUG seems to be only used in mballoc, use
EXT4FS_DEBUG in fsync.c.

[ It doesn't really matter; although I'm including this commit for
  consistency's sake.  The whole point of the #ifdef's is to disable
  the debugging code.  In general you're not going to want to enable
  all of the code protected by EXT4FS_DEBUG at the same time.  -- Ted ]
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e8bbe8c4

08 May, 2011 2 commits

jbd2: only print the debugging information for tid wraparound once · 1be2add6

Theodore Ts'o authored May 08, 2011

If we somehow wrap, we don't want to keep printing the warning message
over and over again.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1be2add6

jbd2: Fix forever sleeping process in do_get_write_access() · 229309ca

Jan Kara authored May 08, 2011

In do_get_write_access() we wait on BH_Unshadow bit for buffer to get
from shadow state. The waking code in journal_commit_transaction() has
a bug because it does not issue a memory barrier after the buffer is
moved from the shadow state and before wake_up_bit() is called. Thus a
waitqueue check can happen before the buffer is actually moved from
the shadow state and waiting process may never be woken. Fix the
problem by issuing proper barrier.
Reported-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

229309ca

03 May, 2011 6 commits

ext4: reimplement convert and split_unwritten · 667eff35

Yongqiang Yang authored May 03, 2011

Reimplement ext4_ext_convert_to_initialized() and
ext4_split_unwritten_extents() using ext4_split_extent()
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>

667eff35

ext4: add ext4_split_extent_at() and ext4_split_extent() · 47ea3bb5

Yongqiang Yang authored May 03, 2011

Add two functions: ext4_split_extent_at(), which splits an extent into
two extents at given logical block, and ext4_split_extent() which
splits an extent into three extents.
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>

47ea3bb5

ext4: add a function merging extents right and left · 197217a5

Yongqiang Yang authored May 03, 2011

1) Rename ext4_ext_try_to_merge() to ext4_ext_try_to_merge_right().

2) Add a new function ext4_ext_try_to_merge() which tries to merge
   an extent both left and right.

3) Use the new function in ext4_ext_convert_unwritten_endio() and
   ext4_ext_insert_extent().
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>

197217a5

ext4: fix deadlock in ext4_symlink() in ENOSPC conditions · df5e6223

Jan Kara authored May 03, 2011

ext4_symlink() cannot call __page_symlink() with transaction open.
__page_symlink() calls ext4_write_begin() which can wait for
transaction commit if we are running out of space thus causing a
deadlock. Also error recovery in ext4_truncate_failed_write() does not
count with the transaction being already started (although I'm not
aware of any particular deadlock here).

Fix the problem by stopping a transaction before calling
__page_symlink() (we have to be careful and put inode to orphan list
so that it gets deleted in case of crash) and starting another one
after __page_symlink() returns for addition of symlink into a
directory.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

df5e6223

ext4: Fix fs corruption when make_indexed_dir() fails · 7ad8e4e6

Jan Kara authored May 03, 2011

When make_indexed_dir() fails (e.g. because of ENOSPC) after it has
allocated block for index tree root, we did not properly mark all
changed buffers dirty.  This lead to only some of these buffers being
written out and thus effectively corrupting the directory.

Fix the issue by marking all changed data dirty even in the error
failure case.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7ad8e4e6

ext4: set extents flag when migrating file to use extents · 74e4e6db

Theodore Ts'o authored May 03, 2011

Fix a typo that was introduced in commit 07a03824 (in 2.6.36) which
caused the extents flag not to be set at the conclusion of converting
an inode to use extents.
Reported-by: Peter Uchno <peter.uchno@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

74e4e6db

01 May, 2011 3 commits

jbd2: fix fsync() tid wraparound bug · deeeaf13

Theodore Ts'o authored May 01, 2011

If an application program does not make any changes to the indirect
blocks or extent tree, i_datasync_tid will not get updated.  If there
are enough commits (i.e., 2**31) such that tid_geq()'s calculations
wrap, and there isn't a currently active transaction at the time of
the fdatasync() call, this can end up triggering a BUG_ON in
fs/jbd2/commit.c:

	J_ASSERT(journal->j_running_transaction != NULL);

It's pretty rare that this can happen, since it requires the use of
fdatasync() plus *very* frequent and excessive use of fsync().  But
with the right workload, it can.

We fix this by replacing the use of tid_geq() with an equality test,
since there's only one valid transaction id that we is valid for us to
wait until it is commited: namely, the currently running transaction
(if it exists).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

deeeaf13

ext4: remove obsolete mount options from ext4's documentation · 59802db0
Theodore Ts'o authored May 01, 2011
```
The block reservation code from ext3 was removed long ago...
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
```
59802db0

ext4: remove dead code in ext4_has_free_blocks() · dc2070a2

Shaohua Li authored May 01, 2011

percpu_counter_sum_positive() never returns a negative value.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

dc2070a2

30 Apr, 2011 2 commits

ext4: ignore errors when issuing discards · d9f34504

Theodore Ts'o authored Apr 30, 2011

This is an effective revert of commit a30eec2a: "ext4: stop issuing
discards if not supported by device".  The problem is that there are
some devices that may return errors in response to a discard request
some times but not others.  (One example would be a hybrid dm device
which concatenates an SSD and an HDD device).

By this logic, I also removed the error checking from ext4's FITRIM
code; so that an error from a discard will not stop the FITRIM from
trying to trim the rest of the file system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d9f34504

ext4: don't set PageUptodate in ext4_end_bio() · 39db00f1

Curt Wohlgemuth authored Apr 30, 2011

In the bio completion routine, we should not be setting
PageUptodate at all -- it's set at sys_write() time, and is
unaffected by success/failure of the write to disk.

This can cause a page corruption bug when the file system's
block size is less than the architecture's VM page size.

if we have only written a single block -- we might end up
setting the page's PageUptodate flag, indicating that page
is completely read into memory, which may not be true.
This could cause subsequent reads to get bad data.

This commit also takes the opportunity to clean up error
handling in ext4_end_bio(), and remove some extraneous code:

   - fixes ext4_end_bio() to set AS_EIO in the
     page->mapping->flags on error, which was left out by
     mistake.  This is needed so that fsync() will
     return an error if there was an I/O error.
   - remove the clear_buffer_dirty() call on unmapped
     buffers for each page.
   - consolidate page/buffer error handling in a single
     section.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Jim Meyering <jim@meyering.net>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Mingming Cao <cmm@us.ibm.com>

39db00f1

18 Apr, 2011 1 commit

ext4: check for ext[23] file system features when mounting as ext[23] · 2035e776

Theodore Ts'o authored Apr 18, 2011

Provide better emulation for ext[23] mode by enforcing that the file
system does not have any unsupported file system features as defined
by ext[23] when emulating the ext[23] file system driver when
CONFIG_EXT4_USE_FOR_EXT23 is defined.

This causes the file system type information in /proc/mounts to be
correct for the automatically mounted root file system.  This also
means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda
contains an ext3 or ext4 file system, just as one would expect if the
original ext2 file system driver were in use.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2035e776

16 Apr, 2011 1 commit

ext4: release page cache in ext4_mb_load_buddy error path · 26626f11

Yang Ruirui authored Apr 16, 2011

Add missing page_cache_release in the error path of ext4_mb_load_buddy
Signed-off-by: Yang Ruirui <ruirui.r.yang@tieto.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

26626f11

12 Apr, 2011 1 commit
- Linux 2.6.39-rc3 · a6360dd3
  Linus Torvalds authored Apr 11, 2011
  
  a6360dd3
11 Apr, 2011 7 commits

Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 1e05ff02

Linus Torvalds authored Apr 11, 2011

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: use proper interfaces for on-stack plugging
  xfs: fix xfs_debug warnings
  xfs: fix variable set but not used warnings
  xfs: convert log tail checking to a warning
  xfs: catch bad block numbers freeing extents.
  xfs: push the AIL from memory reclaim and periodic sync
  xfs: clean up code layout in xfs_trans_ail.c
  xfs: convert the xfsaild threads to a workqueue
  xfs: introduce background inode reclaim work
  xfs: convert ENOSPC inode flushing to use new syncd workqueue
  xfs: introduce a xfssyncd workqueue
  xfs: fix extent format buffer allocation size
  xfs: fix unreferenced var error in xfs_buf.c

Also, applied patch from Tony Luck that fixes ia64:
  xfs_destroy_workqueues() should not be tagged with__exit
in the branch before merging.

1e05ff02

xfs_destroy_workqueues() should not be tagged with__exit · 39411f81

Luck, Tony authored Apr 11, 2011

ia64 throws away .exit sections for the built-in CONFIG case, so routines
that are used in other circumstances should not be tagged as __exit.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

39411f81

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · a97b5202

Linus Torvalds authored Apr 11, 2011

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix data corruption regression by reverting commit 6de9843d
  ext4: Allow indirect-block file to grow the file size to max file size
  ext4: allow an active handle to be started when freezing
  ext4: sync the directory inode in ext4_sync_parent()
  ext4: init timer earlier to avoid a kernel panic in __save_error_info
  jbd2: fix potential memory leak on transaction commit
  ext4: fix a double free in ext4_register_li_request
  ext4: fix credits computing for indirect mapped files
  ext4: remove unnecessary [cm]time update of quota file
  jbd2: move bdget out of critical section

a97b5202

Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux · 18770c7c

Linus Torvalds authored Apr 11, 2011

* 'for-2.6.39' of git://linux-nfs.org/~bfields/linux:
  nfsd4: fix oops on lock failure
  nfsd: fix auth_domain reference leak on nlm operations

18770c7c

Merge branch 'spi/merge' of git://git.secretlab.ca/git/linux-2.6 · 6b98cd5a

Linus Torvalds authored Apr 11, 2011

* 'spi/merge' of git://git.secretlab.ca/git/linux-2.6:
  dt/fsldma: fix build warning caused by of_platform_device changes
  spi: Fix race condition in stop_queue()
  gpio/pch_gpio: Fix output value of pch_gpio_direction_output()
  gpio/ml_ioh_gpio: Fix output value of ioh_gpio_direction_output()
  gpio/pca953x: fix error handling path in probe() call

6b98cd5a

pci: fix PCI bus allocation alignment handling · b42282e5

Linus Torvalds authored Apr 11, 2011

In commit 13583b16 ("PCI: refactor io size calculation code") Ram
had a thinko in the refactorization of the code: the end result used the
variable 'align' for the bus alignment, but the original code used
'min_align'.

Since then, another use of that 'align' variable got introduced by
commit c8adf9a3 ("PCI: pre-allocate additional resources to devices
only after successful allocation of essential resources.")

Fix both of those uses to use 'min_align' as they should.

Daniel Hellstrom <daniel@gaisler.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b42282e5

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · c44eaf41

Linus Torvalds authored Apr 11, 2011

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits)
  net: Add support for SMSC LAN9530, LAN9730 and LAN89530
  mlx4_en: Restoring RX buffer pointer in case of failure
  mlx4: Sensing link type at device initialization
  ipv4: Fix "Set rt->rt_iif more sanely on output routes."
  MAINTAINERS: add entry for Xen network backend
  be2net: Fix suspend/resume operation
  be2net: Rename some struct members for clarity
  pppoe: drop PPPOX_ZOMBIEs in pppoe_flush_dev
  dsa/mv88e6131: add support for mv88e6085 switch
  ipv6: Enable RFS sk_rxhash tracking for ipv6 sockets (v2)
  be2net: Fix a potential crash during shutdown.
  bna: Fix for handling firmware heartbeat failure
  can: mcp251x: Allow pass IRQ flags through platform data.
  smsc911x: fix mac_lock acquision before calling smsc911x_mac_read
  iwlwifi: accept EEPROM version 0x423 for iwl6000
  rt2x00: fix cancelling uninitialized work
  rtlwifi: Fix some warnings/bugs
  p54usb: IDs for two new devices
  wl12xx: fix potential buffer overflow in testmode nvs push
  zd1211rw: reset rx idle timer from tasklet
  ...

c44eaf41