Commits · a454f7428ffa03c8e1321124d9074101b7290be6 · nexedi / linux

08 Nov, 2012 2 commits

xfs: support a tag-based inode_ag_iterator · a454f742

Brian Foster authored Nov 06, 2012

Genericize xfs_inode_ag_walk() to support an optional radix tree tag
and args argument for the execute function. Create a new wrapper
called xfs_inode_ag_iterator_tag() that performs a tag based walk
of perag's and inodes.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

a454f742

xfs: add EOFBLOCKS inode tagging/untagging · 27b52867

Brian Foster authored Nov 06, 2012

Add the XFS_ICI_EOFBLOCKS_TAG inode tag to identify inodes with
speculatively preallocated blocks beyond EOF. An inode is tagged
when speculative preallocation occurs and untagged either via
truncate down or when post-EOF blocks are freed via release or
reclaim.

The tag management is intentionally not aggressive to prefer
simplicity over the complexity of handling all the corner cases
under which post-EOF blocks could be freed (i.e., forward
truncation, fallocate, write error conditions, etc.). This means
that a tagged inode may or may not have post-EOF blocks after a
period of time. The tag is eventually cleared when the inode is
released or reclaimed.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

27b52867

07 Nov, 2012 5 commits

xfs: report projid32bit feature in geometry call · 69a58a43

Eric Sandeen authored Oct 09, 2012

When xfs gained the projid32bit feature, it was never added to
the FSGEOMETRY ioctl feature flags, so it's not queryable without
this patch.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

69a58a43

xfs: fix reading of wrapped log data · 009507b0

Dave Chinner authored Nov 02, 2012

Commit 44396476 ("xfs: reset buffer pointers before freeing them") in
3.0-rc1 introduced a regression when recovering log buffers that
wrapped around the end of log. The second part of the log buffer at
the start of the physical log was being read into the header buffer
rather than the data buffer, and hence recovery was seeing garbage
in the data buffer when it got to the region of the log buffer that
was incorrectly read.

Cc: <stable@vger.kernel.org> # 3.0.x, 3.2.x, 3.4.x 3.6.x
Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

009507b0

xfs: fix buffer shudown reference count mismatch · 137fff09

Dave Chinner authored Nov 02, 2012

When we shut down the filesystem, we have to unpin and free all the
buffers currently active in the CIL. To do this we unpin and remove
them in one operation as a result of a failed iclogbuf write. For
buffers, we do this removal via a simultated IO completion of after
marking the buffer stale.

At the time we do this, we have two references to the buffer - the
active LRU reference and the buf log item.  The LRU reference is
removed by marking the buffer stale, and the active CIL reference is
by the xfs_buf_iodone() callback that is run by
xfs_buf_do_callbacks() during ioend processing (via the bp->b_iodone
callback).

However, ioend processing requires one more reference - that of the
IO that it is completing. We don't have this reference, so we free
the buffer prematurely and use it after it is freed. For buffers
marked with XBF_ASYNC, this leads to assert failures in
xfs_buf_rele() on debug kernels because the b_hold count is zero.

Fix this by making sure we take the necessary IO reference before
starting IO completion processing on the stale buffer, and set the
XBF_ASYNC flag to ensure that IO completion processing removes all
the active references from the buffer to ensure it is fully torn
down.

Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

137fff09

xfs: don't vmap inode cluster buffers during free · b6aff29f

Dave Chinner authored Nov 02, 2012

Inode buffers do not need to be mapped as inodes are read or written
directly from/to the pages underlying the buffer. This fixes a
regression introduced by commit 611c9946 ("xfs: make XBF_MAPPED the
default behaviour").
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

b6aff29f

xfs: invalidate allocbt blocks moved to the free list · 4c05f9ad

Dave Chinner authored Nov 02, 2012

When we free a block from the alloc btree tree, we move it to the
freelist held in the AGFL and mark it busy in the busy extent tree.
This typically happens when we merge btree blocks.

Once the transaction is committed and checkpointed, the block can
remain on the free list for an indefinite amount of time.  Now, this
isn't the end of the world at this point - if the free list is
shortened, the buffer is invalidated in the transaction that moves
it back to free space. If the buffer is allocated as metadata from
the free list, then all the modifications getted logged, and we have
no issues, either. And if it gets allocated as userdata direct from
the freelist, it gets invalidated and so will never get written.

However, during the time it sits on the free list, pressure on the
log can cause the AIL to be pushed and the buffer that covers the
block gets pushed for write. IOWs, we end up writing a freed
metadata block to disk. Again, this isn't the end of the world
because we know from the above we are only writing to free space.

The problem, however, is for validation callbacks. If the block was
on old btree root block, then the level of the block is going to be
higher than the current tree root, and so will fail validation.
There may be other inconsistencies in the block as well, and
currently we don't care because the block is in free space. Shutting
down the filesystem because a freed block doesn't pass write
validation, OTOH, is rather unfriendly.

So, make sure we always invalidate buffers as they move from the
free space trees to the free list so that we guarantee they never
get written to disk while on the free list.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Phil White <pwhite@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

4c05f9ad

02 Nov, 2012 4 commits

xfs: Update mount options documentation · c99abb8f

Carlos Maiolino authored Oct 18, 2012

Once inode64 is the default allocation mode now, kernel documentation should be
updated to match this behaviour.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

c99abb8f

xfs: Update inode alloc comments · cd856db6

Carlos Maiolino authored Oct 20, 2012

I found some out of date comments while studying the inode allocation
code, so I believe it's worth to have these comments updated.

It basically rewrites the comment regarding to "call_again" variable,
which is not used anymore, but instead, callers of xfs_ialloc() decides
if it needs to be called again relying only if ialloc_context is NULL or
not.

Also did some small changes in another comment that I thought to be
pertinent to the current behaviour of these functions and some alignment
on both comments.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

cd856db6

xfs: silence uninitialised f.file warning. · 531c3bdc

Dave Chinner authored Oct 25, 2012

Uninitialised variable build warning introduced by 2903ff01 ("switch
simple cases of fget_light to fdget"), gcc is not smart enough to
work out that the variable is not used uninitialised, and the commit
removed the initialisation at declaration that the old variable had.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

531c3bdc

xfs: growfs: don't read garbage for new secondary superblocks · 1375cb65

Dave Chinner authored Oct 09, 2012

When updating new secondary superblocks in a growfs operation, the
superblock buffer is read from the newly grown region of the
underlying device. This is not guaranteed to be zero, so violates
the underlying assumption that the unused parts of superblocks are
zero filled. Get a new buffer for these secondary superblocks to
ensure that the unused regions are zero filled correctly.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

1375cb65

18 Oct, 2012 3 commits

xfs: move allocation stack switch up to xfs_bmapi_allocate · e04426b9

Dave Chinner authored Oct 05, 2012

Switching stacks are xfs_alloc_vextent can cause deadlocks when we
run out of worker threads on the allocation workqueue. This can
occur because xfs_bmap_btalloc can make multiple calls to
xfs_alloc_vextent() and even if xfs_alloc_vextent() fails it can
return with the AGF locked in the current allocation transaction.

If we then need to make another allocation, and all the allocation
worker contexts are exhausted because the are blocked waiting for
the AGF lock, holder of the AGF cannot get it's xfs-alloc_vextent
work completed to release the AGF.  Hence allocation effectively
deadlocks.

To avoid this, move the stack switch one layer up to
xfs_bmapi_allocate() so that all of the allocation attempts in a
single switched stack transaction occur in a single worker context.
This avoids the problem of an allocation being blocked waiting for
a worker thread whilst holding the AGF.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

e04426b9

xfs: introduce XFS_BMAPI_STACK_SWITCH · 2455881c

Dave Chinner authored Oct 05, 2012

Certain allocation paths through xfs_bmapi_write() are in situations
where we have limited stack available. These are almost always in
the buffered IO writeback path when convertion delayed allocation
extents to real extents.

The current stack switch occurs for userdata allocations, which
means we also do stack switches for preallocation, direct IO and
unwritten extent conversion, even those these call chains have never
been implicated in a stack overrun.

Hence, let's target just the single stack overun offended for stack
switches. To do that, introduce a XFS_BMAPI_STACK_SWITCH flag that
the caller can pass xfs_bmapi_write() to indicate it should switch
stacks if it needs to do allocation.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

2455881c

xfs: zero allocation_args on the kernel stack · a0041684

Mark Tinguely authored Sep 20, 2012

Zero the kernel stack space that makes up the xfs_alloc_arg structures.
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

a0041684

17 Oct, 2012 14 commits

xfs: only update the last_sync_lsn when a transaction completes · d35e88fa

Dave Chinner authored Oct 08, 2012

The log write code stamps each iclog with the current tail LSN in
the iclog header so that recovery knows where to find the tail of
thelog once it has found the head. Normally this is taken from the
first item on the AIL - the log item that corresponds to the oldest
active item in the log.

The problem is that when the AIL is empty, the tail lsn is dervied
from the the l_last_sync_lsn, which is the LSN of the last iclog to
be written to the log. In most cases this doesn't happen, because
the AIL is rarely empty on an active filesystem. However, when it
does, it opens up an interesting case when the transaction being
committed to the iclog spans multiple iclogs.

That is, the first iclog is stamped with the l_last_sync_lsn, and IO
is issued. Then the next iclog is setup, the changes copied into the
iclog (takes some time), and then the l_last_sync_lsn is stamped
into the header and IO is issued. This is still the same
transaction, so the tail lsn of both iclogs must be the same for log
recovery to find the entire transaction to be able to replay it.

The problem arises in that the iclog buffer IO completion updates
the l_last_sync_lsn with it's own LSN. Therefore, If the first iclog
completes it's IO before the second iclog is filled and has the tail
lsn stamped in it, it will stamp the LSN of the first iclog into
it's tail lsn field. If the system fails at this point, log recovery
will not see a complete transaction, so the transaction will no be
replayed.

The fix is simple - the l_last_sync_lsn is updated when a iclog
buffer IO completes, and this is incorrect. The l_last_sync_lsn
shoul dbe updated when a transaction is completed by a iclog buffer
IO. That is, only iclog buffers that have transaction commit
callbacks attached to them should update the l_last_sync_lsn. This
means that the last_sync_lsn will only move forward when a commit
record it written, not in the middle of a large transaction that is
rolling through multiple iclog buffers.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

d35e88fa

xfs: remove xfs_iget.c · 33479e05

Dave Chinner authored Oct 08, 2012

The inode cache functions remaining in xfs_iget.c can be moved to xfs_icache.c
along with the other inode cache functions. This removes all functionality from
xfs_iget.c, so the file can simply be removed.

This move results in various functions now only having the scope of a single
file (e.g. xfs_inode_free()), so clean up all the definitions and exported
prototypes in xfs_icache.[ch] and xfs_inode.h appropriately.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

33479e05

xfs: move inode locking functions to xfs_inode.c · fa96acad

Dave Chinner authored Oct 08, 2012

xfs_ilock() and friends really aren't related to the inode cache in
any way, so move them to xfs_inode.c with all the other inode
related functionality.

While doing this move, move the xfs_ilock() tracepoints to *before*
the lock is taken so that when a hang on a lock occurs we have
events to indicate which process and what inode we were trying to
lock when the hang occurred. This is much better than the current
silence we get on a hang...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

fa96acad

xfs: rename xfs_sync.[ch] to xfs_icache.[ch] · 6d8b79cf

Dave Chinner authored Oct 08, 2012

xfs_sync.c now only contains inode reclaim functions and inode cache
iteration functions. It is not related to sync operations anymore.
Rename to xfs_icache.c to reflect it's contents and prepare for
consolidation with the other inode cache file that exists
(xfs_iget.c).
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

6d8b79cf

xfs: xfs_quiesce_attr() should quiesce the log like unmount · c75921a7

Dave Chinner authored Oct 08, 2012

xfs_quiesce_attr() is supposed to leave the log empty with an
unmount record written. Right now it does not wait for the AIL to be
emptied before writing the unmount record, not does it wait for
metadata IO completion, either. Fix it to use the same method and
code as xfs_log_unmount().
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

c75921a7

xfs: move xfs_quiesce_attr() into xfs_super.c · c7eea6f7

Dave Chinner authored Oct 08, 2012

Both callers of xfs_quiesce_attr() are in xfs_super.c, and there's
nothing really sync-specific about this functionality so it doesn't
really matter where it lives. Move it to benext to it's callers, so
all the remount/sync_fs code is in the one place.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

c7eea6f7

xfs: xfs_sync_fsdata is redundant · 34061f5c

Dave Chinner authored Oct 08, 2012

Why do we need to write the superblock to disk once we've written
all the data?  We don't actually - the reasons for doing this are
lost in the mists of time, and go back to the way Irix used to drive
VFS flushing.

On linux, this code is only called from two contexts: remount and
.sync_fs. In the remount case, the call is followed by a metadata
sync, which unpins and writes the superblock.  In the sync_fs case,
we only need to force the log to disk to ensure that the superblock
is correctly on disk, so we don't actually need to write it. Hence
the functionality is either redundant or superfluous and thus can be
removed.

Seeing as xfs_quiesce_data is essentially now just a log force,
remove it as well and fold the code back into the two callers.
Neither of them need the log covering check, either, as that is
redundant for the remount case, and unnecessary for the .sync_fs
case.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

34061f5c

xfs: syncd workqueue is no more · 5889608d

Dave Chinner authored Oct 08, 2012

With the syncd functions moved to the log and/or removed, the syncd
workqueue is the only remaining bit left. It is used by the log
covering/ail pushing work, as well as by the inode reclaim work.

Given how cheap workqueues are these days, give the log and inode
reclaim work their own work queues and kill the syncd work queue.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

5889608d

xfs: xfs_sync_data is redundant. · 9aa05000

Dave Chinner authored Oct 08, 2012

We don't do any data writeback from XFS any more - the VFS is
completely responsible for that, including for freeze. We can
replace the remaining caller with a VFS level function that
achieves the same thing, but without conflicting with current
writeback work.

This means we can remove the flush_work and xfs_flush_inodes() - the
VFS functionality completely replaces the internal flush queue for
doing this writeback work in a separate context to avoid stack
overruns.

This does have one complication - it cannot be called with page
locks held.  Hence move the flushing of delalloc space when ENOSPC
occurs back up into xfs_file_aio_buffered_write when we don't hold
any locks that will stall writeback.

Unfortunately, writeback_inodes_sb_if_idle() is not sufficient to
trigger delalloc conversion fast enough to prevent spurious ENOSPC
whent here are hundreds of writers, thousands of small files and GBs
of free RAM.  Hence we need to use sync_sb_inodes() to block callers
while we wait for writeback like the previous xfs_flush_inodes
implementation did.

That means we have to hold the s_umount lock here, but because this
call can nest inside i_mutex (the parent directory in the create
case, held by the VFS), we have to use down_read_trylock() to avoid
potential deadlocks. In practice, this trylock will succeed on
almost every attempt as unmount/remount type operations are
exceedingly rare.

Note: we always need to pass a count of zero to
generic_file_buffered_write() as the previously written byte count.
We only do this by accident before this patch by the virtue of ret
always being zero when there are no errors. Make this explicit
rather than needing to specifically zero ret in the ENOSPC retry
case.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

9aa05000

xfs: Bring some sanity to log unmounting · cf2931db

Dave Chinner authored Oct 08, 2012

When unmounting the filesystem, there are lots of operations that
need to be done in a specific order, and they are spread across
across a couple of functions. We have to drain the AIL before we
write the unmount record, and we have to shut down the background
log work before we do either of them.

But this is all split haphazardly across xfs_unmountfs() and
xfs_log_unmount(). Move all the AIL flushing and log manipulations
to xfs_log_unmount() so that the responisbilities of each function
is clear and the operations they perform obvious.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

cf2931db

xfs: sync work is now only periodic log work · f661f1e0

Dave Chinner authored Oct 08, 2012

The only thing the periodic sync work does now is flush the AIL and
idle the log. These are really functions of the log code, so move
the work to xfs_log.c and rename it appropriately.

The only wart that this leaves behind is the xfssyncd_centisecs
sysctl, otherwise the xfssyncd is dead. Clean up any comments that
related to xfssyncd to reflect it's passing.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

f661f1e0

xfs: don't run the sync work if the filesystem is read-only · 7f7bebef

Dave Chinner authored Oct 08, 2012

If the filesystem is mounted or remounted read-only, stop the sync
worker that tries to flush or cover the log if the filesystem is
dirty. It's read-only, so it isn't dirty. Restart it on a remount,rw
as necessary. This avoids the need for RO checks in the work.

Similarly, stop the sync work when the filesystem is frozen, and
start it again when the filesysetm is thawed. This avoids the need
for special freeze checks in the work.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

7f7bebef

xfs: rationalise xfs_mount_wq users · 7e18530b

Dave Chinner authored Oct 08, 2012

Instead of starting and stopping background work on the xfs_mount_wq
all at the same time, separate them to where they really are needed
to start and stop.

The xfs_sync_worker, only needs to be started after all the mount
processing has completed successfully, while it needs to be stopped
before the log is unmounted.

The xfs_reclaim_worker is started on demand, and can be
stopped before the unmount process does it's own inode reclaim pass.

The xfs_flush_inodes work is run on demand, and so we really only
need to ensure that it has stopped running before we start
processing an unmount, freeze or remount,ro.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

7e18530b

xfs: xfs_syncd_stop must die · 33c7a2bc

Dave Chinner authored Oct 08, 2012

xfs_syncd_start and xfs_syncd_stop tie a bunch of unrelated
functionailty together that actually have different start and stop
requirements. Kill these functions and open code the start/stop
methods for each of the background functions.

Subsequent patches will move the start/stop functions around to the
correct places to avoid races and shutdown issues.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

33c7a2bc

14 Oct, 2012 6 commits

Linux 3.7-rc1 · ddffeb8c
Linus Torvalds authored Oct 14, 2012

ddffeb8c

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · a5ef3f7d

Linus Torvalds authored Oct 14, 2012

Pull MIPS update from Ralf Baechle:
 "Cleanups and fixes for breakage that occured earlier during this merge
  phase.  Also a few patches that didn't make the first pull request.
  Of those is the Alchemy work that merges code for many of the SOCs and
  evaluation boards thus among other code shrinkage, reduces the number
  of MIPS defconfigs by 5."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (22 commits)
  MIPS: SNI: Switch RM400 serial to SCCNXP driver
  MIPS: Remove unused empty_bad_pmd_table[] declaration.
  MIPS: MT: Remove kspd.
  MIPS: Malta: Fix section mismatch.
  MIPS: asm-offset.c: Delete unused irq_cpustat_t struct offsets.
  MIPS: Alchemy: Merge PB1100/1500 support into DB1000 code.
  MIPS: Alchemy: merge PB1550 support into DB1550 code
  MIPS: Alchemy: Single kernel for DB1200/1300/1550
  MIPS: Optimize TLB refill for RI/XI configurations.
  MIPS: proc: Cleanup printing of ASEs.
  MIPS: Hardwire detection of DSP ASE Rev 2 for systems, as required.
  MIPS: Add detection of DSP ASE Revision 2.
  MIPS: Optimize pgd_init and pmd_init
  MIPS: perf: Add perf functionality for BMIPS5000
  MIPS: perf: Split the Kconfig option CONFIG_MIPS_MT_SMP
  MIPS: perf: Remove unnecessary #ifdef
  MIPS: perf: Add cpu feature bit for PCI (performance counter interrupt)
  MIPS: perf: Change the "mips_perf_event" table unsupported indicator.
  MIPS: Align swapper_pg_dir to 64K for better TLB Refill code.
  vmlinux.lds.h: Allow architectures to add sections to the front of .bss
  ...

a5ef3f7d

Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · d25282d1

Linus Torvalds authored Oct 14, 2012

Pull module signing support from Rusty Russell:
 "module signing is the highlight, but it's an all-over David Howells frenzy..."

Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
  X.509: Fix indefinite length element skip error handling
  X.509: Convert some printk calls to pr_devel
  asymmetric keys: fix printk format warning
  MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
  MODSIGN: Make mrproper should remove generated files.
  MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
  MODSIGN: Use the same digest for the autogen key sig as for the module sig
  MODSIGN: Sign modules during the build process
  MODSIGN: Provide a script for generating a key ID from an X.509 cert
  MODSIGN: Implement module signature checking
  MODSIGN: Provide module signing public keys to the kernel
  MODSIGN: Automatically generate module signing keys if missing
  MODSIGN: Provide Kconfig options
  MODSIGN: Provide gitignore and make clean rules for extra files
  MODSIGN: Add FIPS policy
  module: signature checking hook
  X.509: Add a crypto key parser for binary (DER) X.509 certificates
  MPILIB: Provide a function to read raw data into an MPI
  X.509: Add an ASN.1 decoder
  X.509: Add simple ASN.1 grammar compiler
  ...

d25282d1

x86, boot: Explicitly include autoconf.h for hostprogs · b6eea87f

Matt Fleming authored Oct 12, 2012

The hostprogs need access to the CONFIG_* symbols found in
include/generated/autoconf.h.  But commit abbf1590 ("UAPI: Partition
the header include path sets and add uapi/ header directories") replaced
$(LINUXINCLUDE) with $(USERINCLUDE) which doesn't contain the necessary
include paths.

This has the undesirable effect of breaking the EFI boot stub because
the #ifdef CONFIG_EFI_STUB code in arch/x86/boot/tools/build.c is
never compiled.

It should also be noted that because $(USERINCLUDE) isn't exported by
the top-level Makefile it's actually empty in arch/x86/boot/Makefile.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b6eea87f

perf: Fix UAPI fallout · 7d380c8f

Ingo Molnar authored Oct 14, 2012

The UAPI commits forgot to test tooling builds such as tools/perf/,
and this fixes the fallout.

Manual conversion.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7d380c8f

Merge branch 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm · 3d6ee36d

Linus Torvalds authored Oct 13, 2012

Pull ARM update from Russell King:
 "This is the final round of stuff for ARM, left until the end of the
  merge window to reduce the number of conflicts.  This set contains the
  ARM part of David Howells UAPI changes, and a fix to the ordering of
  'select' statements in ARM Kconfig files (see the appropriate commit
  for why this happened - thanks to Andrew Morton for pointing out the
  problem.)

  I've left this as long as I dare for this window to avoid conflicts,
  and I regenerated the config patch yesterday, posting it to our
  mailing list for review and testing.  I have several acks which
  include successful test reports for it.

  However, today I notice we've got new conflicts with previously unseen
  code...  though that conflict should be trivial (it's my changes vs a
  one liner.)"

* 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: config: make sure that platforms are ordered by option string
  ARM: config: sort select statements alphanumerically
  UAPI: (Scripted) Disintegrate arch/arm/include/asm

Fix up fairly conflict in arch/arm/Kconfig (the select re-organization
vs recent addition of GENERIC_KERNEL_EXECVE)

3d6ee36d

13 Oct, 2012 6 commits

Merge tag 'disintegrate-main-20121013' of git://git.infradead.org/users/dhowells/linux-headers · 0b381a28

Linus Torvalds authored Oct 13, 2012

Pull UAPI disintegration for include/linux/{,byteorder/}*.h from David Howells:
"The patches contained herein do the following:

(1) Remove kernel-only stuff in linux/ppp-comp.h from the UAPI. I checked
this with Paul Mackerras before I created the patch and he suggested some
extra bits to unexport.

(2) Remove linux/blk_types.h entirely from the UAPI as none of it is userspace
applicable, and remove from the UAPI that part of linux/fs.h that was the
reason for linux/blk_types.h being exported in the first place. I
discussed this with Jens Axboe before creating the patch.

(3) The big patch of the series to disintegrate include/linux/*.h as a unit.
This could be split up, though there would be collisions in moving stuff
between the two Kbuild files when the parts are merged as that file is
sorted alphabetically rather than being grouped by subsystem.

Of this set of headers, 17 files have changed in the UAPI exported region
since the 4th and only 8 since the 9th so there isn't much change in this
area - as one might expect.

It should be pretty obvious and straightforward if it does come to fixing
up: stuff in __KERNEL__ guards stays where it is and stuff outside moves
to the same file in the include/uapi/linux/ directory.

If a new file appears then things get a bit more complicated as the
"headers +=" line has to move to include/uapi/linux/Kbuild. Only one new
file has appeared since the 9th and I judge this type of event relatively
unlikely.

(4) A patch to disintegrate include/linux/byteorder/*.h as a unit.

Signed-off-by: David Howells <dhowells@redhat.com>"

* tag 'disintegrate-main-20121013' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/linux/byteorder
UAPI: (Scripted) Disintegrate include/linux
UAPI: Unexport linux/blk_types.h
UAPI: Unexport part of linux/ppp-comp.h

0b381a28

Merge tag 'disintegrate-spi-20121009' of git://git.infradead.org/users/dhowells/linux-headers · 034b5eeb

Linus Torvalds authored Oct 13, 2012

Pull spi UAPI disintegration from David Howells:
 "This is to complete part of the Userspace API (UAPI) disintegration
  for which the preparatory patches were pulled recently.  After these
  patches, userspace headers will be segregated into:

        include/uapi/linux/.../foo.h

  for the userspace interface stuff, and:

        include/linux/.../foo.h

  for the strictly kernel internal stuff.
Signed-off-by: David Howells <dhowells@redhat.com>
  Acked-by: Grant Likely <grant.likely@secretlab.ca>"

* tag 'disintegrate-spi-20121009' of git://git.infradead.org/users/dhowells/linux-headers:
  UAPI: (Scripted) Disintegrate include/linux/spi

034b5eeb

Merge tag 'openrisc-uapi' of git://openrisc.net/jonas/linux · 7c5a4734

Linus Torvalds authored Oct 13, 2012

Pull OpenRISC uapi disintegration from Jonas Bonn:
 "OpenRISC UAPI disintegration work from David Howells"

* tag 'openrisc-uapi' of git://openrisc.net/jonas/linux:
  UAPI: (Scripted) Disintegrate arch/openrisc/include/asm

7c5a4734

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · 09a9ad6a

Linus Torvalds authored Oct 13, 2012

Pull user namespace compile fixes from Eric W Biederman:
 "This tree contains three trivial fixes.  One compiler warning, one
  thinko fix, and one build fix"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  btrfs: Fix compilation with user namespace support enabled
  userns: Fix posix_acl_file_xattr_userns gid conversion
  userns: Properly print bluetooth socket uids

09a9ad6a

Merge tag 'md-3.7' of git://neil.brown.name/md · 9db90880

Linus Torvalds authored Oct 13, 2012

Pull md updates from NeilBrown:
 - "discard" support, some dm-raid improvements and other assorted bits
   and pieces.

* tag 'md-3.7' of git://neil.brown.name/md: (29 commits)
  md: refine reporting of resync/reshape delays.
  md/raid5: be careful not to resize_stripes too big.
  md: make sure manual changes to recovery checkpoint are saved.
  md/raid10: use correct limit variable
  md: writing to sync_action should clear the read-auto state.
  Subject: [PATCH] md:change resync_mismatches to atomic64_t to avoid races
  md/raid5: make sure to_read and to_write never go negative.
  md: When RAID5 is dirty, force reconstruct-write instead of read-modify-write.
  md/raid5: protect debug message against NULL derefernce.
  md/raid5: add some missing locking in handle_failed_stripe.
  MD: raid5 avoid unnecessary zero page for trim
  MD: raid5 trim support
  md/bitmap:Don't use IS_ERR to judge alloc_page().
  md/raid1: Don't release reference to device while handling read error.
  raid: replace list_for_each_continue_rcu with new interface
  add further __init annotations to crypto/xor.c
  DM RAID: Fix for "sync" directive ineffectiveness
  DM RAID: Fix comparison of index and quantity for "rebuild" parameter
  DM RAID: Add rebuild capability for RAID10
  DM RAID: Move 'rebuild' checking code to its own function
  ...

9db90880

Merge branch 'config' into late-for-linus · 244acb1b
Russell King authored Oct 13, 2012

244acb1b