Commits · 40e39ce0f4eceee04555c45ae9918a017cd1686c · Kirill Smelkov / linux

18 Oct, 2004 24 commits

[PATCH] softirqs: fix latency of softirq processing · 40e39ce0

Ingo Molnar authored Oct 18, 2004

The attached patch fixes a local_bh_enable() buglet: we first enabled
softirqs and only then checked local_softirq_pending(), often in preemptible
code.  So this task could be preempted and there's no guarantee that
softirq processing will occur (except on the periodic timer tick).

The race window is small but existent.  This could result in packet
processing latencies or timer expiration latencies - hard to detect and
annoying bugs.

The fix is to invoke softirqs with softirqs enabled but preemption still
disabled.  Patch is against 2.6.9-rc2-mm1.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

40e39ce0
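
The ordering problem described in the local_bh_enable() commit above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct cpu_model` and every `model_*` function are hypothetical names invented here, the counter layout is simplified, and nothing below is the kernel's actual code. The old variant drops the whole softirq-disable count before checking for pending work, so the check runs while the task is preemptible; the new variant keeps one unit of the count in place until pending softirqs have run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one counter, where 0 means "preemptible". */
#define SOFTIRQ_OFFSET 0x100u

struct cpu_model {
    unsigned int preempt_count; /* 0 => the task may be preempted */
    bool softirq_pending;       /* work waiting to be processed */
    bool raced;                 /* pending work was seen while preemptible */
};

static void model_do_softirq(struct cpu_model *c)
{
    c->softirq_pending = false;
}

static void model_check_pending(struct cpu_model *c)
{
    /* Preemptible with work pending: this is the race window. */
    if (c->preempt_count == 0 && c->softirq_pending)
        c->raced = true;
    if (c->softirq_pending)
        model_do_softirq(c);
}

static void model_bh_disable(struct cpu_model *c)
{
    c->preempt_count += SOFTIRQ_OFFSET;
}

/* Old ordering: fully enable first, then look for pending work. */
static void model_bh_enable_old(struct cpu_model *c)
{
    assert(c->preempt_count >= SOFTIRQ_OFFSET);
    c->preempt_count -= SOFTIRQ_OFFSET;
    model_check_pending(c);
}

/* New ordering: softirqs enabled, preemption still held off by 1. */
static void model_bh_enable_new(struct cpu_model *c)
{
    assert(c->preempt_count >= SOFTIRQ_OFFSET);
    c->preempt_count -= SOFTIRQ_OFFSET - 1;
    model_check_pending(c);
    c->preempt_count -= 1; /* only now may the task be preempted */
}
```

Calling `model_bh_disable()` followed by `model_bh_enable_old()` on a model with `softirq_pending` set leaves `raced` true, while the same sequence with `model_bh_enable_new()` leaves it false.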

[PATCH] fix PTRACE_ATTACH race with real parent's wait calls · cfc4f957

Roland McGrath authored Oct 18, 2004

There is a race between PTRACE_ATTACH and the real parent calling wait.
For a moment, the task is put in PT_PTRACED but with its parent still
pointing to its real_parent. In this circumstance, if the real parent
calls wait without the WUNTRACED flag, he can see a stopped child status,
which wait should never return without WUNTRACED when the caller is not
using ptrace. Here it is not the caller that is using ptrace, but some
third party.

This patch avoids this race condition by adding the PT_ATTACHED flag to
distinguish a real parent from a ptrace_attach parent when PT_PTRACED is
set, and then having wait use this flag to confirm that things are in order
and not consider the child ptraced when its ->ptrace flags are set but its
parent links have not yet been switched. (ptrace_check_attach also uses it
similarly to rule out a possible race with a bogus ptrace call by the real
parent during ptrace_attach.)

While looking into this, I noticed that every arch's sys_execve has:

current->ptrace &= ~PT_DTRACE;

with no locking at all. So, if an exec happens in a race with
PTRACE_ATTACH, you could wind up with ->ptrace not having PT_PTRACED set
because this store clobbered it. That will cause later BUG hits because
the parent links indicate ptracedness but the flag is not set. The patch
corrects all the places I found to use task_lock around diddling ->ptrace
when it's possible to be racing with ptrace_attach. (The ptrace operation
code itself doesn't have this issue because it already excludes anyone else
being in ptrace_attach.)
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cfc4f957
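
The unlocked `current->ptrace &= ~PT_DTRACE` store mentioned in the PTRACE_ATTACH commit above is a plain read-modify-write, so a concurrent update to another bit of the same word can be lost. The sketch below shows the general remedy in user space, assuming a pthread mutex as a stand-in for task_lock(); `struct task_model`, the `model_*` functions, and the flag values are illustrative assumptions, not the kernel's definitions.

```c
#include <pthread.h>

/* Illustrative flag values for this sketch only. */
#define PT_PTRACED 0x1ul
#define PT_DTRACE  0x2ul

struct task_model {
    pthread_mutex_t lock;   /* stand-in for task_lock()/task_unlock() */
    unsigned long ptrace;   /* flag word shared by both code paths */
};

/* Attach path: sets PT_PTRACED under the lock. */
static void model_ptrace_attach(struct task_model *t)
{
    pthread_mutex_lock(&t->lock);
    t->ptrace |= PT_PTRACED;
    pthread_mutex_unlock(&t->lock);
}

/* Exec path: clears PT_DTRACE under the same lock, so the
 * read-modify-write cannot overwrite a concurrent PT_PTRACED store. */
static void model_exec_clear_dtrace(struct task_model *t)
{
    pthread_mutex_lock(&t->lock);
    t->ptrace &= ~PT_DTRACE;
    pthread_mutex_unlock(&t->lock);
}
```

Because both paths take the same lock, whichever runs second sees the other's update and only changes its own bit.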

[PATCH] add WCONTINUED support to wait4 syscall · 04bff088

Roland McGrath authored Oct 18, 2004

POSIX specifies the new WCONTINUED flag for waitpid, not just for waitid.
I overlooked this addition when I implemented waitid.  The real work was
already done to support waitid, but waitpid needs to report the results
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

04bff088

[PATCH] make rlimit settings per-process instead of per-thread · 31180071

Roland McGrath authored Oct 18, 2004

POSIX specifies that the limit settings provided by getrlimit/setrlimit are
shared by the whole process, not specific to individual threads. This
patch changes the behavior of those calls to comply with POSIX.

I've moved the struct rlimit array from task_struct to signal_struct, as it
has the correct sharing properties. (This reduces kernel memory usage per
thread in multithreaded processes by around 100/200 bytes for 32/64-bit
machines respectively.) I took a fairly minimal approach to the locking
issues with the newly shared struct rlimit array. It turns out that all
the code that is checking limits really just needs to look at one word at a
time (one rlim_cur field, usually). It's only the few places like
getrlimit itself (and fork), that require atomicity in accessing a whole
struct rlimit, so I just used a spin lock for them and no locking for most
of the checks. If it turns out that readers of struct rlimit need more
atomicity where they are now cheap, or less overhead where they are now
atomic (e.g. fork), then seqcount is certainly the right thing to use for
them instead of readers using the spin lock. Though it's in signal_struct,
I didn't use siglock since the access to rlimits never needs to disable
irqs and doesn't overlap with other siglock uses. Instead of adding
something new, I overloaded task_lock(task->group_leader) for this; it is
used for other things that are not likely to happen simultaneously with
limit tweaking. To me that seems preferable to adding a word, but it would
be trivial (and arguably cleaner) to add a separate lock for these users
(or e.g. just use seqlock, which adds two words but is optimal for readers).

Most of the changes here are just the trivial s/->rlim/->signal->rlim/.

I stumbled across what must be a long-standing bug, in reparent_to_init.
It does:
memcpy(current->rlim, init_task.rlim, sizeof(*(current->rlim)));
when surely it was intended to be:
memcpy(current->rlim, init_task.rlim, sizeof(current->rlim));
As rlim is an array, the * in the sizeof expression gets the size of the
first element, so this just changes the first limit (RLIMIT_CPU). This is
for kernel threads, where it's clear that resetting all the rlimits is what
you want. With that fixed, the setting of RLIMIT_FSIZE in nfsd is
superfluous since it will now already have been reset to RLIM_INFINITY.

The other subtlety is removing:
tsk->rlim[RLIMIT_CPU].rlim_cur = RLIM_INFINITY;
in exit_notify, which was to avoid a race signalling during self-reaping
exit. As the limit is now shared, a dying thread should not change it for
others. Instead, I avoid that race by checking current->state before the
RLIMIT_CPU check. (Adding one new conditional in that path is now required
one way or another, since if not for this check there would also be a new
race with self-reaping exit later on clearing current->signal that would
have to be checked for.)

The one loose end left by this patch is with process accounting.
do_acct_process temporarily resets the RLIMIT_FSIZE limit while writing the
accounting record. I left this as it was, but it is now changing a limit
that might be shared by other threads still running. I left this in a
dubious state because it seems to me that process accounting may already
be more generally in a dubious state when it comes to NPTL threads. I would
think you would want one record per process, with aggregate data about all
threads that ever lived in it, not a separate record for each thread.
I don't use process accounting myself, but if anyone is interested in
testing it out I could provide a patch to change it this way.

One final note: this does not reach 100% POSIX compliance with regard to rlimits.
POSIX specifies that RLIMIT_CPU refers to a whole process in aggregate, not
to each individual thread. I will provide patches later on to achieve that
change, assuming this patch goes in first.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

31180071
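
The reparent_to_init sizeof slip described in the rlimit commit above is easy to reproduce outside the kernel. In the sketch below, `struct model_rlimit`, `struct model_task`, `MODEL_NLIMITS`, and the helper functions are hypothetical names chosen for illustration. Dereferencing an array inside sizeof yields the size of one element, so the buggy memcpy copies only the first limit.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_NLIMITS 4 /* arbitrary count for this sketch */

struct model_rlimit {
    unsigned long rlim_cur;
    unsigned long rlim_max;
};

struct model_task {
    struct model_rlimit rlim[MODEL_NLIMITS];
};

/* sizeof(*array) is the size of element 0 only. */
static size_t model_buggy_size(const struct model_task *t)
{
    return sizeof(*(t->rlim));
}

/* sizeof(array) is the size of the whole array. */
static size_t model_fixed_size(const struct model_task *t)
{
    return sizeof(t->rlim);
}

/* Copy limits from src to dst using either the buggy or fixed size. */
static void model_reset_limits(struct model_task *dst,
                               const struct model_task *src, int fixed)
{
    size_t n = fixed ? model_fixed_size(dst) : model_buggy_size(dst);

    memcpy(dst->rlim, src->rlim, n);
}
```

With the buggy size, only `rlim[0]` changes in the destination; with the fixed size, all `MODEL_NLIMITS` entries are copied.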

[PATCH] i386 entry.S cleanups · cc588ba9

Ingo Molnar authored Oct 18, 2004

Remove the unused lcall7/lcall27 code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cc588ba9

[PATCH] acpi proc: error handling · 678ab4ca

Pavel Machek authored Oct 18, 2004

Propagate the software_suspend() return value.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

678ab4ca

[PATCH] swsusp: progress in percent · fa7f7d64

Pavel Machek authored Oct 18, 2004

swsusp currently has very poor progress indication.  Thanks to Erik Rigtorp
<erik@rigtorp.com>, we have percentages there, so people know how long a
wait to expect.  Please apply.

From: Erik Rigtorp <erik@rigtorp.com>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

fa7f7d64

[PATCH] parport_pc superio chip fixes · b7478cdd

Andrea Arcangeli authored Oct 18, 2004

This patch fixes some trouble that somebody reported to me with the superio
chips.

In short, rmmod parport_pc && cat /proc/iomem was good enough for crashing
the box hard on some machines (and hwscan --printer was doing just that).
The way the oops triggers is that iomem tries to vsprintf the p->name, but
the p->name was a static string in the module address (now unloaded).

The reason is that the superio chip scanning leaves up to two persistent
ranges claimed.  But the second (legacy) pass has no way to notice the
resources are already reclaimed.  Plus if the superio->io was different
than the "io" variable (the range to scan for superio chips) the "io" range
would generate a leak of the original "io" range too.

I simply make sure to always release the requested space during the superio
scan, and I make sure not to instantiate new ranges in the p->base that
would cause the later parport scan to fail too (as well as leaving leaked
resources behind).

The previous code that was returning values and was leaving garbage in
there made no sense to me.  My best guess (assuming I didn't misread it ;)
is that probably somebody added the request_region without realizing
they're pointing to the very same address that would be requested later
(and nobody does accesses on those ranges until later, so it was very safe
to claim it later).

Disclaimer: I don't have the specs of the winbond and smsc at hand, I just
guessed what they do from the code (nothing checks superio->io except
get_superio_dma and get_superio_irq, which made the thing self-explanatory
enough to fix it without specs).
Signed-off-by: Andrea Arcangeli <andrea@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

b7478cdd

[PATCH] add sys_setaltroot() · 44c4fb89

Seth Rohit authored Oct 18, 2004

Add a new system call setaltroot(2).

Currently, using the altroot feature is accessible only via the
set_personality() system call.  It is accessible to user space only if there
is more than one exec domain in the system.  This patch allows using the
altroot feature on systems where there is only one exec domain.

It is possible to work around the issue by adding a dummy exec domain, but it
was rejected for not being very elegant.

If this feature is implemented in userspace, it adds a 16% overhead on a test
case which greps for a single word in the kernel source tree.
Signed-off-by: Zou Nanhai <nanhai.zou@intel.com>
Signed-off-by: Gordon Jin <gordon.jin@intel.com>
Signed-off-by: Arun Sharma <arun.sharma@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

44c4fb89

Wrap <linux/compiler.h> inside '#ifndef __ASSEMBLY__' · f5aa089a

Linus Torvalds authored Oct 18, 2004

None of the compatibility defines make sense for assembly
files, and gcc has trouble with vararg macros when using
"-traditional" (which is used for asm), to the point of
ICE'ing.

f5aa089a

Add copyright notice on ppc64 iomap files. · 0e0c5521

Linus Torvalds authored Oct 18, 2004

Paul cares. I think there's something in the water at IBM
that makes people sticklers ;)

0e0c5521

[PATCH] ppc64: Fix iSeries build (ouch !) · 02dc1467

Benjamin Herrenschmidt authored Oct 18, 2004

The move of iomap out of eeh inadvertently broke iSeries ...

Fixed like this.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

02dc1467

[PATCH] ppc32/64: FPU/vector register restore after signal · 4aab1539

Benjamin Herrenschmidt authored Oct 18, 2004

This fixes some issues with restoring the altivec and/or FPU registers
upon return from a signal or when setting a context.  It also adds a
proper stack backlink to the signal frames created for 64-bit
applications.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

4aab1539

Older gcc's ICE on missing (unused) varargs macro name. · a85f54d7

Linus Torvalds authored Oct 18, 2004

a85f54d7

Merge bk://gkernel.bkbits.net/net-drivers-2.6 · 9266734a

Linus Torvalds authored Oct 18, 2004

into ppc970.osdl.org:/home/torvalds/v2.6/linux

9266734a

Merge bk://gkernel.bkbits.net/libata-2.6 · 6db492bc

Linus Torvalds authored Oct 18, 2004

into ppc970.osdl.org:/home/torvalds/v2.6/linux

6db492bc

Merge pobox.com:/spare/repo/linux-2.6 · 49093583

Jeff Garzik authored Oct 18, 2004

into pobox.com:/spare/repo/libata-2.6

49093583

Add fake '__builtin_warning()' for the gcc case. · 6df3af84

Linus Torvalds authored Oct 18, 2004

Allows us to do compile-time sparse warnings of our own.

6df3af84

Merge bk://linux-scsi.bkbits.net/scsi-for-linus-2.6 · d78d2844

Linus Torvalds authored Oct 18, 2004

into ppc970.osdl.org:/home/torvalds/v2.6/linux

d78d2844

Merge titanic.il.steeleye.com:/home/jejb/BK/scsi-target-2.6 · e270e1b2

James Bottomley authored Oct 18, 2004

into titanic.il.steeleye.com:/home/jejb/BK/scsi-for-linus-2.6

e270e1b2

aic7xxx and aic79xx: fix sleeping while holding a lock · c045ebb7

James Bottomley authored Oct 18, 2004

From: Luben Tuikov <luben_tuikov@adaptec.com>

Fix sleeping while holding a lock on host removal and on
killing the DV thread.
Signed-off-by: Luben Tuikov <luben_tuikov@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

c045ebb7

SCSI: fix Suspend I/O block/unblock path · 4b8cbbf6

James Bottomley authored Oct 18, 2004

From: James.Smart@Emulex.Com

Further testing is showing that we are having some i/o threads
prematurely die with the following message: "rejecting I/O to device
being removed"
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

4b8cbbf6

[PATCH] cciss: fixes for clustering · 2207252b

Mike Miller authored Oct 18, 2004

This patch changes our open function specifically for clustering software.
We must allow root to access any volume or device with a LUN ID. We also
modified our revalidate function for this reason.
If a logical volume is reserved, we must register it with the OS with size=0.
Then the backup system can call BLKRRPART after breaking the reservation to
set the device to the correct size.
We also must register a controller with no logical volumes for the online
utilities to function. This is the way we've done it since the 2.2 kernel,
which doesn't necessarily make it right, but we have legacy apps to consider.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

2207252b

Merge bk://bk.arm.linux.org.uk/linux-2.6-rmk · 0d377ebc

Linus Torvalds authored Oct 18, 2004

into ppc970.osdl.org:/home/torvalds/v2.6/linux

0d377ebc

19 Oct, 2004 1 commit

[ARM PATCH] 2145/1: S3C2410 - GPIO ID register update · 5995d15b

Ben Dooks authored Oct 19, 2004

Patch from Ben Dooks

Update the include/asm-arm/arch-s3c2410/regs-gpio.h with
GSTATUS1 register information

Signed-off-by: Ben Dooks

5995d15b

18 Oct, 2004 9 commits

[ARM PATCH] 2144/1: S3C2410 - s3c2440 fixes and clock updates · 18250f9e

Ben Dooks authored Oct 19, 2004

Patch from Ben Dooks

Fixes the following problems and omissions:

 - added variable for base crystal rate
 - moved clock variables into clock.c
 - fixed bug in identifying s3c2440 cpus
 - added initial support for new uart registration
 - removed base blocks from include/asm/arch/hardware.h

Signed-off-by: Ben Dooks

18250f9e

[ARM PATCH] 2131/1: Add _iomem to the IO string functions · ee0f2f9e

Ben Dooks authored Oct 19, 2004

Patch from Ben Dooks

This patch stops mtd from generating problems of
casting pointers to ints, due to the memcpy_fromio
and related functions all taking `unsigned long`
for their IO addresses.

Replace `unsigned long` with `void __iomem *`

Compiled clean on arch-s3c2410

Signed-off-by: Ben Dooks

ee0f2f9e
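
The signature change described in the `__iomem` commit above can be sketched in user space. Everything here is an assumption made for illustration: the `model_*` functions are hypothetical, `__iomem` is defined as an empty macro so an ordinary compiler accepts the code, and a plain buffer stands in for device memory. The point is only that a pointer-typed parameter lets callers pass an address without an integer cast.

```c
#include <stddef.h>

/* For an ordinary compiler the annotation expands to nothing here. */
#ifndef __iomem
#define __iomem
#endif

/* Old style: the address is an integer, so callers must cast pointers. */
static void model_memcpy_fromio_old(void *dst, unsigned long src, size_t n)
{
    const volatile unsigned char *s = (const volatile unsigned char *)src;
    unsigned char *d = dst;

    while (n--)
        *d++ = *s++;
}

/* New style: the address is an annotated pointer, no cast at call sites. */
static void model_memcpy_fromio_new(void *dst,
                                    const volatile void __iomem *src,
                                    size_t n)
{
    const volatile unsigned char *s = src;
    unsigned char *d = dst;

    while (n--)
        *d++ = *s++;
}
```

A caller holding a pointer `p` writes `model_memcpy_fromio_new(buf, p, len)` directly, whereas the old form needs `(unsigned long)p`, which is the pointer-to-integer cast the commit message refers to.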

[PATCH] sparse __iomem annotations for qla2xxx · 2f68cfe3

Christoph Hellwig authored Oct 18, 2004

This also found a real bug: qla2xxx currently isn't iounmapping at host
removal at all, and if the right cpp macro had been set it'd be
too late.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

2f68cfe3

Linux 2.6.9 · 31a37910
Linus Torvalds authored Oct 17, 2004

31a37910

[PATCH] USB: handle NAK packets in input devices. · 56150702

Greg Kroah-Hartman authored Oct 17, 2004

Andrew requested this fix go in before 2.6.9 was out, to keep people's
syslog quiet for a lot of different USB input devices.

Fixes bugzilla.kernel.org bug #3564
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

56150702

[PATCH] Duh. _Really_ unbalanced locking in MTD Intel chip driver · a4a865be

Nicolas Pitre authored Oct 17, 2004

I apparently can't copy simple obvious fixes by hand.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

a4a865be

[PATCH] unbalanced locking in MTD Intel chip driver · 9c4ffb34

Nicolas Pitre authored Oct 17, 2004

This obvious missing unlock is screwing the preemption count.
Fix was applied to MTD CVS already.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

9c4ffb34

[PATCH] security issue in firmware system · eb51a678

Oliver Neukum authored Oct 17, 2004

The firmware loader has a security issue.  Firmware on some devices can
write to all memory through DMA.  Therefore the ability to feed firmware
to the kernel is equivalent to writing to /dev/kmem, and CAP_SYS_RAWIO
should be required to do it.

[ Editor's note: the firmware file is 0644, and owned by root, so this
  "security issue" is really only an issue for people who use
  capabilities explicitly, rather than the regular Unix permissions.
  This patch makes it do the same checks we do for /dev/mem etc.  ]
Signed-Off-By: Oliver Neukum <oliver@neukum.name>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

eb51a678

[PATCH] Fix NFS3 krb5 clients on x86-64 · d7ce417e

Mark Goodman authored Oct 17, 2004

This patch is necessary to make NFS3 krb5 clients work on x86-64.

ACK'ed by Trond
Signed-off-by: Mark Goodman <mgoodman@csua.berkeley.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d7ce417e

17 Oct, 2004 6 commits

[PATCH] cciss: SCSI API updates · a6c0c127

Mike Miller authored Oct 16, 2004

This patch updates our SCSI support to no longer use deprecated APIs.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

a6c0c127

[PATCH] ppc64: fix smp_startup_cpu for cpu hotplug · a0d194e3

Nathan Lynch authored Oct 16, 2004

This change is needed in order to allow cpus to be onlined after
boot.  This used to work but the declaration of
pseries_secondary_smp_init in this file was changed in Ben's big
cleanup patch a while back, so the cpu would start at a bad address.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

a0d194e3

[PATCH] kswapd lockup fix · 7ac62185

Nick Piggin authored Oct 16, 2004

Fix some bugs in the kswapd logic which can cause kswapd lockups.

The balance_pgdat() logic is supposed to cause kswapd to loop across all zones
in the node until each zone either

	a) has enough pages free or

	b) is deemed to be in an "all pages unreclaimable" state.

In the latter case, we just give the zone a light scan on each balance_pgdat()
scan and wait for the zone to come back to life again.

But the zone->all_unreclaimable logic is broken - if the zone has no pages on
the LRU at all, we perform no scanning of that zone (of course).  So the
zone->pages_scanned is not incremented and the expression

		if (zone->pages_scanned > zone->present_pages * 2)
			zone->all_unreclaimable = 1;

is never satisfied.

The patch changes that logic to

		if (zone->pages_scanned >= (zone->nr_active +
						zone->nr_inactive) * 4)
			zone->all_unreclaimable = 1;

so if the zone has no LRU pages it will still enter the all_unreclaimable
state.


Another problem is that if the zone has no LRU pages we will tell
shrink_slab() that we scanned zero LRU pages.  This causes shrink_slab() to
scan zero slab objects, which is obviously wrong.  So change shrink_slab() to
perform a decent chunk of slab scanning in this situation.


And put a cond_resched() into the balance_pgdat() outer loop.  Probably
unnecessary, but that's what Jeff had in place when he confirmed that this
patch fixed the lockup :(
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

7ac62185
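
The two all_unreclaimable conditions quoted in the kswapd commit above can be compared directly. The sketch below copies the two expressions from the commit message into stand-alone predicates; `struct zone_model` and the function names are hypothetical, and only the four fields the expressions use are modelled.

```c
#include <stdbool.h>

/* Only the fields referenced by the two conditions. */
struct zone_model {
    unsigned long pages_scanned;
    unsigned long present_pages;
    unsigned long nr_active;
    unsigned long nr_inactive;
};

/* Old condition: never true if nothing is ever scanned. */
static bool model_unreclaimable_old(const struct zone_model *z)
{
    return z->pages_scanned > z->present_pages * 2;
}

/* New condition: true at once when the zone has no LRU pages. */
static bool model_unreclaimable_new(const struct zone_model *z)
{
    return z->pages_scanned >= (z->nr_active + z->nr_inactive) * 4;
}
```

For a zone with present pages but an empty LRU, `pages_scanned` stays at 0: the old test compares 0 against a positive number and fails, while the new test compares 0 >= 0 and succeeds.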

[PATCH] swsusp: fix x86-64 - do not use memory in copy loop · 9ac1e4a8

Pavel Machek authored Oct 16, 2004

In assembly code, there are some problems with the "nosave" section (the
linker was doing something stupid, like duplicating the section).  We
attempted to fix it, but the fix was worse than the original problem.  This
fix is for good: we no longer use any memory in the copy loop.  (Plus it
fixes indentation and uses meaningful labels.)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

9ac1e4a8

[PATCH] tailcall prevention in sys_wait4() and sys_waitid() · c5cada67

Ingo Molnar authored Oct 16, 2004

A hack to prevent the compiler from generating tailcalls in these two
functions.

With CONFIG_REGPARM=y, the tailcalled code ends up stomping on the
syscall's argument frame which corrupts userspace's registers.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

c5cada67
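
One common way to keep a compiler from turning a final call into a tailcall is to make the result pass through an empty inline-asm statement before returning. The sketch below shows that general technique for GCC-compatible compilers; it is an assumption about how such a hack can look, not necessarily the form used in the commit above, and `model_inner` and `model_outer` are hypothetical names.

```c
/* Hypothetical callee standing in for the real worker function. */
static long model_inner(long x)
{
    return x + 1;
}

/* Without the asm statement the compiler could emit a jump to
 * model_inner() and reuse this function's argument area. */
static long model_outer(long x)
{
    long ret = model_inner(x);

    /* Empty asm that "modifies" ret: the call above is no longer the
     * last operation, so it cannot be compiled as a tailcall. */
    __asm__ __volatile__("" : "=r" (ret) : "0" (ret));
    return ret;
}
```

The asm emits no instructions; it only tells the compiler that `ret` is consumed and produced after the call returns.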

[PATCH] intel_agp: dangling devexit reference · 74a82aa9

Randy Dunlap authored Oct 16, 2004

Fix error found by 'scripts/reference_discarded.pl':
Error: ./drivers/char/agp/intel-agp.o .data refers to 00000914 R_386_32          .exit.text
Signed-off-by: Randy Dunlap <rddunlap@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

74a82aa9