Commits · b4d7241596ffb6398ac5535ae8cf80d845b0c254 · nexedi / linux

24 Nov, 2009 4 commits

ext4: remove encountered_congestion trace · b4d72415

Wu Fengguang authored Nov 24, 2009

It is no longer set and scheduled to be removed.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b4d72415

ext4: move_extent_per_page() cleanup · ac48b0a1

Akira Fujita authored Nov 24, 2009

Integrate duplicate lines (acquire/release semaphore and invalidate
extent cache in move_extent_per_page()) into mext_replace_branches(),
to reduce source and object code size.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ac48b0a1

ext4: initialize moved_len before calling ext4_move_extents() · 446aaa6e

Kazuya Mio authored Nov 24, 2009

The move_extent.moved_len is used to pass back the number of exchanged
blocks count to user space.  Currently the caller must clear this
field; but we spend more code space checking for this requirement than
simply zeroing the field ourselves, so let's just make life easier for
everyone all around.
Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

446aaa6e

ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT · 94d7c16c

Akira Fujita authored Nov 24, 2009

At the beginning of ext4_move_extent(), we call
ext4_discard_preallocations() to discard inode PAs of orig and donor
inodes.  But in the following case, blocks can be double freed, so
move ext4_discard_preallocations() to the end of ext4_move_extents().

1. Discard inode PAs of orig and donor inodes with
   ext4_discard_preallocations() in ext4_move_extents().

   orig : [ DATA1 ]
   donor: [ DATA2 ]

2. While data blocks are exchanging between orig and donor inodes, new
   inode PAs is created to orig by other process's block allocation.
   (Since there are semaphore gaps in ext4_move_extents().)  And new
   inode PAs is used partially (2-1).

   2-1 Create new inode PAs to orig inode
   orig : [ DATA1 | used PA1 | free PA1 ]
   donor: [ DATA2 ]

3. Donor inode which has old orig inode's blocks is deleted after
   EXT4_IOC_MOVE_EXT finished (3-1, 3-2).  So the block bitmap
   corresponds to old orig inode's blocks are freed.

   3-1 After EXT4_IOC_MOVE_EXT finished
   orig : [ DATA2 |  free PA1 ]
   donor: [ DATA1 |  used PA1 ]

   3-2 Delete donor inode
   orig : [ DATA2 |  free PA1 ]
   donor: [ FREE SPACE(DATA1) | FREE SPACE(used PA1) ]

4. The double-free of blocks is occurred, when close() is called to
   orig inode.  Because ext4_discard_preallocations() for orig inode
   frees used PA1 and free PA1, though used PA1 is already freed in 3.

   4-1 Double-free of blocks is occurred
   orig : [ DATA2 |  FREE SPACE(free PA1) ]
   donor: [ FREE SPACE(DATA1) | DOUBLE FREE(used PA1) ]
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

94d7c16c

23 Nov, 2009 4 commits

ext4: use ext4_data_block_valid() in ext4_free_blocks() · 9084d471

Theodore Ts'o authored Nov 22, 2009

The block validity framework does a more comprehensive set of checks,
and it saves object code space to use the ext4_data_block_valid() than
the limited open-coded version that had been in ext4_free_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9084d471

ext4: add check for wraparound in ext4_data_block_valid() · 1585d8d8
Theodore Ts'o authored Nov 22, 2009
```
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
```
1585d8d8

ext4: print i_mode in octal in ext4 tracepoints · 6eebee62

Theodore Ts'o authored Nov 22, 2009

Inode permissions are much easier to understand if they are printed in
octal.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6eebee62

ext4: call ext4_forget() from ext4_free_blocks() · e6362609

Theodore Ts'o authored Nov 23, 2009

Add the facility for ext4_forget() to be called from
ext4_free_blocks().  This simplifies the code in a large number of
places, and centralizes most of the work of calling ext4_forget() into
a single place.

Also fix a bug in the extents migration code; it wasn't calling
ext4_forget() when releasing the indirect blocks during the
conversion.  As a result, if the system cashed during or shortly after
the extents migration, and the released indirect blocks get reused as
data blocks, the journal replay would corrupt the data blocks.  With
this new patch, fixing this bug was as simple as adding the
EXT4_FREE_BLOCKS_FORGET flags to the call to ext4_free_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

e6362609

22 Nov, 2009 1 commit

ext4: fold ext4_free_blocks() and ext4_mb_free_blocks() · 44338711

Theodore Ts'o authored Nov 22, 2009

ext4_mb_free_blocks() is only called by ext4_free_blocks(), and the
latter function doesn't really do much.  So merge the two functions
together, such that ext4_free_blocks() is now found in
fs/ext4/mballoc.c.  This saves about 200 bytes of compiled text space.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

44338711

23 Nov, 2009 1 commit

ext4: fold ext4_journal_forget() into ext4_forget() · b7e57e7c

Theodore Ts'o authored Nov 22, 2009

Convert the last two callers of ext4_journal_forget() to use
ext4_forget() instead, and then fold ext4_journal_forget() into
ext4_forget().  This reduces are code complexity and shortens our call
stack.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b7e57e7c

24 Nov, 2009 1 commit

ext4: fold ext4_journal_revoke() into ext4_forget() · e4684b3f

Theodore Ts'o authored Nov 24, 2009

The only caller of ext4_journal_revoke() is ext4_forget(), so we can
fold ext4_journal_revoke() into ext4_forget() to simplify the code and
shorten the call stack.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e4684b3f

23 Nov, 2009 1 commit

ext4: move ext4_forget() to ext4_jbd2.c · d6797d14

Theodore Ts'o authored Nov 22, 2009

The ext4_forget() function better belongs in ext4_jbd2.c.  This will
allow us to do some cleanup of the ext4_journal_revoke() and
ext4_journal_forget() functions, as well as giving us better error
reporting since we can report the caller of ext4_forget() when things
go wrong.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d6797d14

19 Nov, 2009 2 commits

ext4: make "norecovery" an alias for "noload" · e3bb52ae

Eric Sandeen authored Nov 19, 2009

Users on the linux-ext4 list recently complained about differences
across filesystems w.r.t. how to mount without a journal replay.

In the discussion it was noted that xfs's "norecovery" option is
perhaps more descriptively accurate than "noload," so let's make
that an alias for ext4.

Also show this status in /proc/mounts
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e3bb52ae

ext4: make trim/discard optional (and off by default) · 5328e635

Eric Sandeen authored Nov 19, 2009

It is anticipated that when sb_issue_discard starts doing
real work on trim-capable devices, we may see issues.  Make
this mount-time optional, and default it to off until we know
that things are working out OK.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5328e635

23 Nov, 2009 2 commits

ext4: fix error handling in ext4_ind_get_blocks() · 2bba702d

Jan Kara authored Nov 23, 2009

When an error happened in ext4_splice_branch we failed to notice that
in ext4_ind_get_blocks and mapped the buffer anyway. Fix the problem
by checking for error properly.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

2bba702d

ext4: avoid issuing unnecessary barriers · 6b17d902

Theodore Ts'o authored Nov 23, 2009

We don't to issue an I/O barrier on an error or if we force commit
because we are doing data journaling.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org

6b17d902

15 Nov, 2009 1 commit

ext4: fix block validity checks so they work correctly with meta_bg · 1032988c

Theodore Ts'o authored Nov 15, 2009

The block validity checks used by ext4_data_block_valid() wasn't
correctly written to check file systems with the meta_bg feature.  Fix
this.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

1032988c

23 Nov, 2009 2 commits

ext4: fix uninit block bitmap initialization when s_meta_first_bg is non-zero · 8dadb198

Theodore Ts'o authored Nov 23, 2009

The number of old-style block group descriptor blocks is
s_meta_first_bg when the meta_bg feature flag is set.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

8dadb198

ext4: don't update the superblock in ext4_statfs() · 3f8fb949

Theodore Ts'o authored Nov 23, 2009

commit a71ce8c6 updated ext4_statfs()
to update the on-disk superblock counters, but modified this buffer
directly without any journaling of the change.  This is one of the
accesses that was causing the crc errors in journal replay as seen in
kernel.org bugzilla #14354.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

3f8fb949

15 Nov, 2009 2 commits

ext4: journal all modifications in ext4_xattr_set_handle · 86ebfd08

Eric Sandeen authored Nov 15, 2009

ext4_xattr_set_handle() was zeroing out an inode outside
of journaling constraints; this is one of the accesses that
was causing the crc errors in journal replay as seen in
kernel.org bugzilla #14354.
Reviewed-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

86ebfd08

ext4: fix i_flags access in ext4_da_writepages_trans_blocks() · 30c6e07a

Julia Lawall authored Nov 15, 2009

We need to be testing the i_flags field in the ext4 specific portion
of the inode, instead of the (confusingly aliased) i_flags field in
the generic struct inode.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

30c6e07a

23 Nov, 2009 3 commits

ext4: make sure directory and symlink blocks are revoked · 50689696

Theodore Ts'o authored Nov 23, 2009

When an inode gets unlinked, the functions ext4_clear_blocks() and
ext4_remove_blocks() call ext4_forget() for all the buffer heads
corresponding to the deleted inode's data blocks.  If the inode is a
directory or a symlink, the is_metadata parameter must be non-zero so
ext4_forget() will revoke them via jbd2_journal_revoke().  Otherwise,
if these blocks are reused for a data file, and the system crashes
before a journal checkpoint, the journal replay could end up
corrupting these data blocks.

Thanks to Curt Wohlgemuth for pointing out potential problems in this
area.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

50689696

ext4: add tracepoint for ext4_forget() · beac2da7
Theodore Ts'o authored Nov 23, 2009
```
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
```
beac2da7

ext4: remove failed journal checksum check · cf40db13

Theodore Ts'o authored Nov 22, 2009

Now that we are checking for failed journal checksums in the jbd2
layer, we don't need to check in the ext4 mount path --- since a
checksum fail will result in ext4_load_journal() returning an error,
causing the file system to refuse to be mounted until e2fsck can deal
with the problem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

cf40db13

15 Nov, 2009 1 commit

jbd2: don't wipe the journal on a failed journal checksum · e6a47428

Theodore Ts'o authored Nov 15, 2009

If there is a failed journal checksum, don't reset the journal.  This
allows for userspace programs to decide how to recover from this
situation.  It may be that ignoring the journal checksum failure might
be a better way of recovering the file system.  Once we add per-block
checksums, we can definitely do better.  Until then, a system
administrator can try backing up the file system image (or taking a
snapshot) and and trying to determine experimentally whether ignoring
the checksum failure or aborting the journal replay results in less
data loss.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

e6a47428

14 Nov, 2009 1 commit

ext4: plug a buffer_head leak in an error path of ext4_iget() · 567f3e9a

Theodore Ts'o authored Nov 14, 2009

One of the invalid error paths in ext4_iget() forgot to brelse() the
inode buffer head.  Fix it by adding a brelse() in the common error
return path, which also simplifies function.

Thanks to Andi Kleen <ak@linux.intel.com> reporting the problem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

567f3e9a

23 Nov, 2009 6 commits

ext4: fix spelling typos in move_extent.c · 92c28159

Akira Fujita authored Nov 23, 2009

Fix a few spelling typos in move_extent.c
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.co.jp>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

92c28159

ext4: fix possible recursive locking warning in EXT4_IOC_MOVE_EXT · 49bd22bc

Akira Fujita authored Nov 23, 2009

If CONFIG_PROVE_LOCKING is enabled, the double_down_write_data_sem()
will trigger a false-positive warning of a recursive lock. Since we
take i_data_sem for the two inodes ordered by their inode numbers,
this isn't a problem. Use of down_write_nested() will notify the lock
dependency checker machinery that there is no problem here.

This problem was reported by Brian Rogers:

http://marc.info/?l=linux-ext4&m=125115356928011&w=1Reported-by: Brian Rogers <brian@xyzw.org>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

49bd22bc

ext4: fix lock order problem in ext4_move_extents() · fc04cb49

Akira Fujita authored Nov 23, 2009

ext4_move_extents() checks the logical block contiguousness
of original file with ext4_find_extent() and mext_next_extent().
Therefore the extent which ext4_ext_path structure indicates
must not be changed between above functions.

But in current implementation, there is no i_data_sem protection
between ext4_ext_find_extent() and mext_next_extent().  So the extent
which ext4_ext_path structure indicates may be overwritten by
delalloc.  As a result, ext4_move_extents() will exchange wrong blocks
between original and donor files.  I change the place where
acquire/release i_data_sem to solve this problem.

Moreover, I changed move_extent_per_page() to start transaction first,
and then acquire i_data_sem.  Without this change, there is a
possibility of the deadlock between mmap() and ext4_move_extents():

* NOTE: "A", "B" and "C" mean different processes

A-1: ext4_ext_move_extents() acquires i_data_sem of two inodes.

B:   do_page_fault() starts the transaction (T),
     and then tries to acquire i_data_sem.
     But process "A" is already holding it, so it is kept waiting.

C:   While "A" and "B" running, kjournald2 tries to commit transaction (T)
     but it is under updating, so kjournald2 waits for it.

A-2: Call ext4_journal_start with holding i_data_sem,
     but transaction (T) is locked.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fc04cb49

ext4: fix the returned block count if EXT4_IOC_MOVE_EXT fails · f868a48d

Akira Fujita authored Nov 23, 2009

If the EXT4_IOC_MOVE_EXT ioctl fails, the number of blocks that were
exchanged before the failure should be returned to the userspace
caller.  Unfortunately, currently if the block size is not the same as
the page size, the returned block count that is returned is the
page-aligned block count instead of the actual block count.  This
commit addresses this bug.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f868a48d

ext4: avoid divide by zero when trying to mount a corrupted file system · 503358ae

Theodore Ts'o authored Nov 23, 2009

If s_log_groups_per_flex is greater than 31, then groups_per_flex will
will overflow and cause a divide by zero error.  This can cause kernel
BUG if such a file system is mounted.

Thanks to Nageswara R Sastry for analyzing the failure and providing
an initial patch.

http://bugzilla.kernel.org/show_bug.cgi?id=14287Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

503358ae

ext4: fix potential buffer head leak when add_dirent_to_buf() returns ENOSPC · 2de770a4

Theodore Ts'o authored Nov 23, 2009

Previously add_dirent_to_buf() did not free its passed-in buffer head
in the case of ENOSPC, since in some cases the caller still needed it.
However, this led to potential buffer head leaks since not all callers
dealt with this correctly.  Fix this by making simplifying the freeing
convention; now add_dirent_to_buf() *never* frees the passed-in buffer
head, and leaves that to the responsibility of its caller.  This makes
things cleaner and easier to prove that the code is neither leaking
buffer heads or calling brelse() one time too many.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Curt Wohlgemuth <curtw@google.com>
Cc: stable@kernel.org

2de770a4

13 Nov, 2009 1 commit
- Linux 2.6.32-rc7 · 156171c7
  Linus Torvalds authored Nov 12, 2009
  
  156171c7
12 Nov, 2009 7 commits

Merge branch 'omap-fixes-for-linus' of... · 031fc8f3

Linus Torvalds authored Nov 12, 2009

Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6

* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
  omap3: Decrease cpufreq transition latency
  omap3: update Pandora defconfig
  omap3: 3430sdp: Enable Linux Regulator framework
  omap3: beagle: Fix USB host port power control
  omap3: pandora: Fix keypad keymap
  omap1: Amstrad Delta defconfig fixes
  omap: Fix omapfb/lcdc on OMAP1510 broken when PM set
  omap: Use resource_size
  omap: Fix race condition in omap dma driver

031fc8f3

__generic_block_fiemap(): fix for files bigger than 4GB · e04b5ef8

Mike Hommey authored Nov 11, 2009

Because of an integer overflow on start_blk, various kind of wrong results
would be returned by the generic_block_fiemap() handler, such as no
extents when there is a 4GB+ hole at the beginning of the file, or wrong
fe_logical when an extent starts after the first 4GB.
Signed-off-by: Mike Hommey <mh@glandium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Eric Sandeen <sandeen@sgi.com>
Cc: Josef Bacik <jbacik@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e04b5ef8

pps: events reporting fix up · 276b282e

Rodolfo Giometti authored Nov 11, 2009

PPS events must be recorded according to PPS's mode settings.

If a process asks for (i.e.) capture-assert events only, when the PPS
client calls the pps_event() function to save the current PPS event, we
should verify the event type and then discard unwanted ones.

Also, without this patch userland processes waiting for a specific PPS
event (assert or clear but not both) may be awakened at wrong time.
Signed-off-by: Rodolfo Giometti <giometti@linux.it>
Tested-by: William S. Brasher <billb958@door.net>
Tested-by: Reg Clemens <clemens@dwf.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

276b282e

pps: locking scheme fix up for PPS_GETPARAMS · cbf83cc5

Rodolfo Giometti authored Nov 11, 2009

Userland programs may read/write PPS parameters at same time and these
operations may corrupt PPS data.
Signed-off-by: Rodolfo Giometti <giometti@linux.it>
Tested-by: Reg Clemens <clemens@dwf.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cbf83cc5

drivers/video/msm: update to new kernel · 69fd8d24

Pavel Machek authored Nov 11, 2009

TASK_INTERRUPTIBLE and friends are now only available after including
<linux/sched.h>, so include it when needed.

bus_id is no longer available/necessary, so remove that.

Android pmem driver is not available in mainline, so remove its hooks
from drivers/video.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

69fd8d24

gpiolib: fix device_create() result check · d62668e1

Sergei Shtylyov authored Nov 11, 2009

In case of failure, device_create() returns not NULL but the error code.
The current code checks for non-NULL though which causes kernel oops in
sysfs_create_group() when device_create() fails.  Check for error using
IS_ERR() and propagate the error value using PTR_ERR() instead of fixed
-ENODEV code returned now...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d62668e1

rtc: v3020: fix v3020_mmio_read_bit() · bcb3a167

Scott Valentine authored Nov 11, 2009

v3020_mmio_read_bit() always returns 0 when left_shift > 7.

v3020_mmio_read_bit()'s return type is (unsigned char).  The code returns
a value masked by (1 << left_shift) that is casted to the return type.  If
left_shift is larger than 7, the cast will always result in a 0 return
value.  The problem was discovered with left_shift = 16, and the included
patch corrects the problem.

The bug was introduced in the last (Apr 3 2009) commit of the file, kernel
versions 2.6.30 and later.

Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Paul Gortmaker <p_gortmaker@yahoo.com>
Cc: Raphael Assenat <raph@8d.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bcb3a167