Commits · 2dbc57298d84fe37d855c8bfa3b280dfd47ded4b · Kirill Smelkov / linux

18 Oct, 2004 32 commits

[PATCH] fork() bug invalidates file descriptors · 2dbc5729

Gregory Kurz authored Oct 18, 2004

Take a process P1 that spawns a thread T (i.e. a clone with CLONE_FILES).
If P1 forks another process P2 (i.e. not a clone) while T is blocked in an
open() that should return file descriptor FD, then FD will be unusable in
P2.  This leads to strange behaviors in the context of P2: close(FD)
returns EBADF, while dup2(a_valid_fd, FD) returns EBUSY and of course FD is
never returned again by any syscall...

testcase:

#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <asm/page.h>

#define FIFO "/tmp/bug_fifo"
#define FD   0

/*
 * This program is meant to show that calling fork() while a clone spawned
 * with CLONE_FILES is blocked in open() makes a fd number unusable in the
 * child.
 *
 *
 *     Parent               Clone                Child
 *        |
 *   clone(CLONE_FILES)-

2dbc5729

[PATCH] fix the prof=schedule feature · 0105467f

Ingo Molnar authored Oct 18, 2004

Fix mismerge of the "prof=schedule" feature.  Without this patch the output
is a boring empty profile.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

0105467f

[PATCH] reiserfs: small filesystem fix · 5f50cce9

Chris Mason authored Oct 18, 2004

On small filesystems (<128M), make sure not to reference bitmap blocks that
don't exist.

Thanks to Jan Kara for finding this bug.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

5f50cce9

[PATCH] __set_page_dirty_nobuffers mappings · c76aaef0

Hugh Dickins authored Oct 18, 2004

Marcelo noticed that the BUG_ON in __set_page_dirty_nobuffers doesn't make
much sense: it lost its way in 2.6.7, amidst so many page_mappings!

It's supposed to be checking that, although page->mapping may suddenly go NULL
from truncation, and although tmpfs swizzles page_mapping(page) between tmpfs
inode address_space and swapper_space, there's sufficient stabilization while
here in __set_page_dirty_nobuffers that the mapping after we locked
mapping->tree_lock is the same as the mapping before we locked
mapping->tree_lock i.e. the lock we hold is the right one.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

c76aaef0

[PATCH] exec: fix posix-timers leak and pending signal loss · fef60c1b

Roland McGrath authored Oct 18, 2004

I've found some problems with exec and fixed them with this patch to
de_thread.

The second problem is that a multithreaded exec loses all pending signals.
This is a violation of POSIX rules. But a moment's thought will show it's
also just not desirable: if you send a process a SIGTERM while it's in the
middle of calling exec, you expect either the original program in that
process or the new program being exec'd to handle that signal or be killed
by it. As it stands now, you can try to kill a process and have that
signal just evaporate if it's multithreaded and calls exec just then. I
really don't know what the rationale was behind the de_thread code that
allocates a new signal_struct. It doesn't make any sense now. The other
code there ensures that the old signal_struct is no longer shared. Except
for posix-timers, all the state there is stuff you want to keep. So my
changes just keep the old structs when they are no longer shared, and all
the right state is retained (after clearing out posix-timers).

The final bug is that the cumulative statistics of dead threads and dead
child processes are lost in the abandoned signal_struct. This is also
fixed by holding on to it instead of replacing it.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

fef60c1b

[PATCH] show aggregate per-process counters in /proc/PID/stat 2 · 9a349eb7

Lev Makhlis authored Oct 18, 2004

Add up resource usage counters for live and dead threads to show aggregate
per-process usage in /proc/<pid>/stat.  This mirrors the new getrusage()
semantics.  /proc/<pid>/task/<tid>/stat still has the per-thread usage.

After moving the counter aggregation loop inside a task->sighand lock to
avoid nasty race conditions, it has survived stress-testing with '(while
true; do sleep 1 & done) & top -d 0.1'
Signed-off-by: Lev Makhlis <mlev@despammed.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

9a349eb7

[PATCH] distinct tgid/tid CPU usage · bf719d26

Albert Cahalan authored Oct 18, 2004

This patch adjusts /proc/*/stat to have distinct per-process and per-thread
CPU usage, faults, and wchan.
Signed-off-by: Albert Cahalan <albert@users.sf.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

bf719d26

[PATCH] add missing linux/syscalls.h includes · 09b9135c

Arnd Bergmann authored Oct 18, 2004

I found that the prototypes for sys_waitid and sys_fcntl in
<linux/syscalls.h> don't match the implementation.  In order to keep all
prototypes in sync in the future, now include the header from each file
implementing any syscall.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

09b9135c

[PATCH] softirqs: fix latency of softirq processing · 40e39ce0

Ingo Molnar authored Oct 18, 2004

The attached patch fixes a local_bh_enable() buglet: we first enabled
softirqs and only then checked local_softirq_pending() - often this is
preemptible code.  So this task could be preempted and there's no guarantee
that softirq processing will occur (except at the periodic timer tick).

The race window is small but existent.  This could result in packet
processing latencies or timer expiration latencies - hard to detect and
annoying bugs.

The fix is to invoke softirqs with softirqs enabled but preemption still
disabled.  Patch is against 2.6.9-rc2-mm1.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

40e39ce0

[PATCH] fix PTRACE_ATTACH race with real parent's wait calls · cfc4f957

Roland McGrath authored Oct 18, 2004

There is a race between PTRACE_ATTACH and the real parent calling wait.
For a moment, the task is put in PT_PTRACED but with its parent still
pointing to its real_parent. In this circumstance, if the real parent
calls wait without the WUNTRACED flag, he can see a stopped child status,
which wait should never return without WUNTRACED when the caller is not
using ptrace. Here it is not the caller that is using ptrace, but some
third party.

This patch avoids this race condition by adding the PT_ATTACHED flag to
distinguish a real parent from a ptrace_attach parent when PT_PTRACED is
set, and then having wait use this flag to confirm that things are in order
and not consider the child ptraced when its ->ptrace flags are set but its
parent links have not yet been switched. (ptrace_check_attach also uses it
similarly to rule out a possible race with a bogus ptrace call by the real
parent during ptrace_attach.)

While looking into this, I noticed that every arch's sys_execve has:

current->ptrace &= ~PT_DTRACE;

with no locking at all. So, if an exec happens in a race with
PTRACE_ATTACH, you could wind up with ->ptrace not having PT_PTRACED set
because this store clobbered it. That will cause later BUG hits because
the parent links indicate ptracedness but the flag is not set. The patch
corrects all the places I found to use task_lock around diddling ->ptrace
when it's possible to be racing with ptrace_attach. (The ptrace operation
code itself doesn't have this issue because it already excludes anyone else
being in ptrace_attach.)
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cfc4f957

[PATCH] add WCONTINUED support to wait4 syscall · 04bff088

Roland McGrath authored Oct 18, 2004

POSIX specifies the new WCONTINUED flag for waitpid, not just for waitid.
I overlooked this addition when I implemented waitid.  The real work was
already done to support waitid, but waitpid needs to report the results
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

04bff088

[PATCH] make rlimit settings per-process instead of per-thread · 31180071

Roland McGrath authored Oct 18, 2004

POSIX specifies that the limit settings provided by getrlimit/setrlimit are
shared by the whole process, not specific to individual threads. This
patch changes the behavior of those calls to comply with POSIX.

I've moved the struct rlimit array from task_struct to signal_struct, as it
has the correct sharing properties. (This reduces kernel memory usage per
thread in multithreaded processes by around 100/200 bytes for 32/64-bit
machines respectively.) I took a fairly minimal approach to the locking
issues with the newly shared struct rlimit array. It turns out that all
the code that is checking limits really just needs to look at one word at a
time (one rlim_cur field, usually). It's only the few places like
getrlimit itself (and fork), that require atomicity in accessing a whole
struct rlimit, so I just used a spin lock for them and no locking for most
of the checks. If it turns out that readers of struct rlimit need more
atomicity where they are now cheap, or less overhead where they are now
atomic (e.g. fork), then seqcount is certainly the right thing to use for
them instead of readers using the spin lock. Though it's in signal_struct,
I didn't use siglock since the access to rlimits never needs to disable
irqs and doesn't overlap with other siglock uses. Instead of adding
something new, I overloaded task_lock(task->group_leader) for this; it is
used for other things that are not likely to happen simultaneously with
limit tweaking. To me that seems preferable to adding a word, but it would
be trivial (and arguably cleaner) to add a separate lock for these users
(or e.g. just use seqlock, which adds two words but is optimal for readers).

Most of the changes here are just the trivial s/->rlim/->signal->rlim/.

I stumbled across what must be a long-standing bug, in reparent_to_init.
It does:
memcpy(current->rlim, init_task.rlim, sizeof(*(current->rlim)));
when surely it was intended to be:
memcpy(current->rlim, init_task.rlim, sizeof(current->rlim));
As rlim is an array, the * in the sizeof expression gets the size of the
first element, so this just changes the first limit (RLIMIT_CPU). This is
for kernel threads, where it's clear that resetting all the rlimits is what
you want. With that fixed, the setting of RLIMIT_FSIZE in nfsd is
superfluous since it will now already have been reset to RLIM_INFINITY.

The other subtlety is removing:
tsk->rlim[RLIMIT_CPU].rlim_cur = RLIM_INFINITY;
in exit_notify, which was to avoid a race signalling during self-reaping
exit. As the limit is now shared, a dying thread should not change it for
others. Instead, I avoid that race by checking current->state before the
RLIMIT_CPU check. (Adding one new conditional in that path is now required
one way or another, since if not for this check there would also be a new
race with self-reaping exit later on clearing current->signal that would
have to be checked for.)

The one loose end left by this patch is with process accounting.
do_acct_process temporarily resets the RLIMIT_FSIZE limit while writing the
accounting record. I left this as it was, but it is now changing a limit
that might be shared by other threads still running. I left this in a
dubious state because it seems to me that process accounting may already
be more generally in a dubious state when it comes to NPTL threads. I would
think you would want one record per process, with aggregate data about all
threads that ever lived in it, not a separate record for each thread.
I don't use process accounting myself, but if anyone is interested in
testing it out I could provide a patch to change it this way.

One final note: this does not achieve 100% POSIX compliance with regard to rlimits.
POSIX specifies that RLIMIT_CPU refers to a whole process in aggregate, not
to each individual thread. I will provide patches later on to achieve that
change, assuming this patch goes in first.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

31180071

[PATCH] i386 entry.S cleanups · cc588ba9

Ingo Molnar authored Oct 18, 2004

Remove the unused lcall7/lcall27 code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cc588ba9

[PATCH] acpi proc: error handling · 678ab4ca

Pavel Machek authored Oct 18, 2004

Propagate the software_suspend() return value.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

678ab4ca

[PATCH] swsusp: progress in percent · fa7f7d64

Pavel Machek authored Oct 18, 2004

swsusp currently has very poor progress indication.  Thanks to Erik Rigtorp
<erik@rigtorp.com>, we have percentages there, so people know how long a wait
to expect.  Please apply.

From: Erik Rigtorp <erik@rigtorp.com>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

fa7f7d64

[PATCH] parport_pc superio chip fixes · b7478cdd

Andrea Arcangeli authored Oct 18, 2004

This patch fixes some trouble that somebody reported to me with the superio
chips.

In short, rmmod parport_pc && cat /proc/iomem was good enough for crashing
the box hard on some machines (and hwscan --printer was doing just that).
The way the oops triggers is that iomem tries to vsprintf the p->name, but
the p->name was a static string in the module address (now unloaded).

The reason is that the superio chip scanning leaves up to two persistent
ranges claimed.  But the second (legacy) pass has no way to notice the
resources are already reclaimed.  Plus if the superio->io was different
than the "io" variable (the range to scan for superio chips) the "io" range
would generate a leak of the original "io" range too.

I simply make sure to always release the requested space during the superio
scan, and I make sure not to instantiate new ranges in the p->base that
would cause the later parport scan to fail too (as well as leaving leaked
resources behind).

The previous code that was returning values and was leaving garbage in
there made no sense to me.  My best guess (assuming I didn't misread it ;)
is that probably somebody added the request_region without realizing
they're pointing to the very same address that would be requested later
(and nobody does accesses on those ranges until later, so it was very safe
to claim it later).

Disclaimer: I don't have the specs of the winbond and smsc at hand, I just
guessed what they do from the code (nothing checks superio->io except
get_superio_dma and get_superio_irq, which made the thing self-explanatory
enough to fix it without specs).
Signed-off-by: Andrea Arcangeli <andrea@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

b7478cdd

[PATCH] add sys_setaltroot() · 44c4fb89

Seth Rohit authored Oct 18, 2004

Add a new system call setaltroot(2).

Currently, the altroot feature is accessible only via the
set_personality() system call.  It is accessible to user space only if there
is more than one exec domain in the system.  This patch allows using the
altroot feature on systems where there is only one exec domain.

It is possible to work around the issue by adding a dummy exec domain, but it
was rejected for not being very elegant.

If this feature is implemented in userspace, it adds a 16% overhead on a test
case which greps for a single word in the kernel source tree.
Signed-off-by: Zou Nanhai <nanhai.zou@intel.com>
Signed-off-by: Gordon Jin <gordon.jin@intel.com>
Signed-off-by: Arun Sharma <arun.sharma@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

44c4fb89

Wrap <linux/compiler.h> inside '#ifndef __ASSEMBLY__' · f5aa089a

Linus Torvalds authored Oct 18, 2004

None of the compatibility defines make sense for assembly
files, and gcc has trouble with vararg macros when using
"-traditional" (which is used for asm), to the point of
ICE'ing.

f5aa089a

Add copyright notice on ppc64 iomap files. · 0e0c5521
Linus Torvalds authored Oct 18, 2004
```
Paul cares. I think there's something in the water at IBM
that makes people sticklers ;)
```
0e0c5521

[PATCH] ppc64: Fix iSeries build (ouch !) · 02dc1467

Benjamin Herrenschmidt authored Oct 18, 2004

The move of iomap out of eeh inadvertently broke iSeries ...

Fixed like this.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

02dc1467

[PATCH] ppc32/64: FPU/vector register restore after signal · 4aab1539

Benjamin Herrenschmidt authored Oct 18, 2004

This fixes some issues with restoring the altivec and/or FPU registers
upon return from a signal or when setting a context.  It also adds a
proper stack backlink to the signal frames created for 64-bit
applications.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

4aab1539

Older gcc's ICE on missing (unused) varargs macro name. · a85f54d7
Linus Torvalds authored Oct 18, 2004

a85f54d7
Merge bk://gkernel.bkbits.net/net-drivers-2.6 · 9266734a
Linus Torvalds authored Oct 18, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.6/linux
```
9266734a
Merge bk://gkernel.bkbits.net/libata-2.6 · 6db492bc
Linus Torvalds authored Oct 18, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.6/linux
```
6db492bc
Merge pobox.com:/spare/repo/linux-2.6 · 49093583
Jeff Garzik authored Oct 18, 2004
```
into pobox.com:/spare/repo/libata-2.6
```
49093583
Add fake '__builtin_warning()' for the gcc case. · 6df3af84
Linus Torvalds authored Oct 18, 2004
```
Allows us to do compile-time sparse warnings of our own.
```
6df3af84
Merge bk://linux-scsi.bkbits.net/scsi-for-linus-2.6 · d78d2844
Linus Torvalds authored Oct 18, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.6/linux
```
d78d2844
Merge titanic.il.steeleye.com:/home/jejb/BK/scsi-target-2.6 · e270e1b2
James Bottomley authored Oct 18, 2004
```
into titanic.il.steeleye.com:/home/jejb/BK/scsi-for-linus-2.6
```
e270e1b2

aic7xxx and aic79xx: fix sleeping while holding a lock · c045ebb7

James Bottomley authored Oct 18, 2004

From: Luben Tuikov <luben_tuikov@adaptec.com>

Fix sleeping while holding a lock on host removal and on
killing the DV thread.
Signed-off-by: Luben Tuikov <luben_tuikov@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

c045ebb7

SCSI: fix Suspend I/O block/unblock path · 4b8cbbf6

James Bottomley authored Oct 18, 2004

From: James.Smart@Emulex.Com

Further testing is showing that we are having some i/o threads
prematurely die with the following message: "rejecting I/O to device
being removed"
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

4b8cbbf6

[PATCH] cciss: fixes for clustering · 2207252b

Mike Miller authored Oct 18, 2004

This patch changes our open specifically for clustering software. We must
allow root to access any volume or device with a LUN ID. We also modified
our revalidate function for this reason.
If a logical is reserved, we must register it with the OS with size=0. Then
the backup system can call BLKRRPART after breaking the reservation to
set the device to the correct size.
We also must register a controller with no logical volumes for the online
utilities to function. This is the way we've done it since the 2.2 kernel.
Which doesn't necessarily make it right, but we have legacy apps to consider.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

2207252b

Merge bk://bk.arm.linux.org.uk/linux-2.6-rmk · 0d377ebc
Linus Torvalds authored Oct 18, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.6/linux
```
0d377ebc

19 Oct, 2004 1 commit

[ARM PATCH] 2145/1: S3C2410 - GPIO ID register update · 5995d15b

Ben Dooks authored Oct 19, 2004

Patch from Ben Dooks

Update the include/asm-arm/arch-s3c2410/regs-gpio.h with
GSTATUS1 register information

Signed-off-by: Ben Dooks

5995d15b

18 Oct, 2004 7 commits

[ARM PATCH] 2144/1: S3C2410 - s3c2440 fixes and clock updates · 18250f9e

Ben Dooks authored Oct 19, 2004

Patch from Ben Dooks

Fixes the following problems and omissions:

 - added variable for base crystal rate
 - moved clock variables into clock.c
 - fixed bug in identifying s3c2440 cpus
 - added initial support for new uart registration
 - removed base blocks from include/asm/arch/hardware.h

Signed-off-by: Ben Dooks

18250f9e

[ARM PATCH] 2131/1: Add _iomem to the IO string functions · ee0f2f9e

Ben Dooks authored Oct 19, 2004

Patch from Ben Dooks

This patch stops mtd from generating problems of
casting pointers to ints, due to the memcpy_fromio
and related functions all taking `unsigned long`
for their IO addresses.

Replace `unsigned long` with `void __iomem *`

Compiled clean on arch-s3c2410

Signed-off-by: Ben Dooks

ee0f2f9e

[PATCH] sparse __iomem annotations for qla2xxx · 2f68cfe3

Christoph Hellwig authored Oct 18, 2004

This also found a real bug: qla2xxx isn't iounmapping at host removal at
all currently - and if the right cpp macro had been set it'd be
too late.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

2f68cfe3

Linux 2.6.9 · 31a37910
Linus Torvalds authored Oct 17, 2004

31a37910

[PATCH] USB: handle NAK packets in input devices. · 56150702

Greg Kroah-Hartman authored Oct 17, 2004

Andrew requested this fix go in before 2.6.9 was out, to keep people's
syslog quiet for a lot of different USB input devices.

Fixes bugzilla.kernel.org bug #3564
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

56150702

[PATCH] Duh. _Really_ unbalanced locking in MTD Intel chip driver · a4a865be

Nicolas Pitre authored Oct 17, 2004

I apparently can't copy simple obvious fixes by hand.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

a4a865be

[PATCH] unbalanced locking in MTD Intel chip driver · 9c4ffb34

Nicolas Pitre authored Oct 17, 2004

This obvious missing unlock is screwing the preemption count.
Fix was applied to MTD CVS already.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

9c4ffb34