Commits · be8962a15857ee7b43bc66f718d7e8987f484fc8 · Kirill Smelkov / linux

16 Nov, 2007 40 commits

Fix crypto_alloc_comp() error checking. · be8962a1

Herbert Xu authored Nov 13, 2007

[IPSEC]: Fix crypto_alloc_comp error checking

[ Upstream commit: 4999f362 ]

The function crypto_alloc_comp returns an errno instead of NULL
to indicate error.  So it needs to be tested with IS_ERR.

This is based on a patch by Vicenç Beltran Querol.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

be8962a1

Fix SET_VLAN_INGRESS_PRIORITY_CMD error return. · 9def747b

Patrick McHardy authored Nov 13, 2007

patch fffe470a in mainline.

[VLAN]: Fix SET_VLAN_INGRESS_PRIORITY_CMD ioctl

Based on report and patch by Doug Kehn <rdkehn@yahoo.com>:

vconfig returns the following error when attempting to execute the
set_ingress_map command:

vconfig: socket or ioctl error for set_ingress_map: Operation not permitted

In vlan.c, vlan_ioctl_handler for SET_VLAN_INGRESS_PRIORITY_CMD
sets err = -EPERM and calls vlan_dev_set_ingress_priority.
vlan_dev_set_ingress_priority is a void function so err remains
at -EPERM and results in the vconfig error (even though the ingress
map was set).

Fix by setting err = 0 after the vlan_dev_set_ingress_priority call.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

9def747b

Fix VLAN address syncing. · dae1e6e8

Patrick McHardy authored Nov 13, 2007

patch d932e04a in mainline.

[PATCH] [VLAN]: Don't synchronize addresses while the vlan device is down

While the VLAN device is down, the unicast addresses are not configured
on the underlying device, so we shouldn't attempt to sync them.

Noticed by Dmitry Butskoy <buc@odusz.so-cdu.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

dae1e6e8

Fix endianness bug in U32 classifier. · e809e9c0

Radu Rendec authored Nov 13, 2007

changeset 543821c6 in mainline.

[PKT_SCHED] CLS_U32: Fix endianness problem with u32 classifier hash masks.

While trying to implement u32 hashes in my shaping machine I ran into
a possible bug in the u32 hash/bucket computing algorithm
(net/sched/cls_u32.c).

The problem occurs only with hash masks that extend over the octet
boundary, on little endian machines (where htonl() actually does
something).

Let's say that I would like to use 0x3fc0 as the hash mask. This means
8 contiguous "1" bits starting at b6. With such a mask, the expected
(and logical) behavior is to hash any address in, for instance,
192.168.0.0/26 in bucket 0, then any address in 192.168.0.64/26 in
bucket 1, then 192.168.0.128/26 in bucket 2 and so on.

This is exactly what would happen on a big endian machine, but on
little endian machines, what would actually happen with current
implementation is 0x3fc0 being reversed (into 0xc03f0000) by htonl()
in the userspace tool and then applied to 192.168.x.x in the u32
classifier. When shifting right by 16 bits (rank of first "1" bit in
the reversed mask) and applying the divisor mask (0xff for divisor
256), what would actually remain is 0x3f applied on the "168" octet of
the address.

One could say is this can be easily worked around by taking endianness
into account in userspace and supplying an appropriate mask (0xfc03)
that would be turned into contiguous "1" bits when reversed
(0x03fc0000). But the actual problem is the network address (inside
the packet) not being converted to host order, but used as a
host-order value when computing the bucket.

Let's say the network address is written as n31 n30 ... n0, with n0
being the least significant bit. When used directly (without any
conversion) on a little endian machine, it becomes n7 ... n0 n8 ..n15
etc in the machine's registers. Thus bits n7 and n8 would no longer be
adjacent and 192.168.64.0/26 and 192.168.128.0/26 would no longer be
consecutive.

The fix is to apply ntohl() on the hmask before computing fshift,
and in u32_hash_fold() convert the packet data to host order before
shifting down by fshift.

With helpful feedback from Jamal Hadi Salim and Jarek Poplawski.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e809e9c0

Fix TEQL oops. · 61254a93

Evgeniy Polyakov authored Nov 13, 2007

[PKT_SCHED]: Fix OOPS when removing devices from a teql queuing discipline

[ Upstream commit: 4f9f8311 ]

tecl_reset() is called from deactivate and qdisc is set to noop already,
but subsequent teql_xmit does not know about it and dereference private
data as teql qdisc and thus oopses.
not catch it first :)
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

61254a93

Fix error returns in sys_socketpair() · c669e2ad

David Miller authored Nov 13, 2007

patch bf3c23d1 in mainline.

[NET]: Fix error reporting in sys_socketpair().

If either of the two sock_alloc_fd() calls fail, we
forget to update 'err' and thus we'll erroneously
return zero in these cases.

Based upon a report and patch from Rich Paul, and
commentary from Chuck Ebbert.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

c669e2ad

softmac: fix wext MLME request reason code endianness · eeb4e8c2

Johannes Berg authored Oct 25, 2007

patch 94e10bfb in mainline.

The MLME request reason code is host-endian and our passing
it to the low level functions is host-endian as well since
they do the swapping. I noticed that the reason code 768 was
sent (0x300) rather than 3 when wpa_supplicant terminates.
This removes the superfluous cpu_to_le16() call.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

eeb4e8c2

Fix kernel_accept() return handling. · cfebbe5f

Tony Battersby authored Oct 23, 2007

patch fa8705b0 in mainline.

[NET]: sanitize kernel_accept() error path

If kernel_accept() returns an error, it may pass back a pointer to
freed memory (which the caller should ignore).  Make it pass back NULL
instead for better safety.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

cfebbe5f

TCP: Fix size calculation in sk_stream_alloc_pskb · 9fcba471

Herbert Xu authored Nov 14, 2007

[TCP]: Fix size calculation in sk_stream_alloc_pskb

[ Upstream commit: fb93134d ]

We round up the header size in sk_stream_alloc_pskb so that
TSO packets get zero tail room.  Unfortunately this rounding
up is not coordinated with the select_size() function used by
TCP to calculate the second parameter of sk_stream_alloc_pskb.

As a result, we may allocate more than a page of data in the
non-TSO case when exactly one page is desired.

In fact, rounding up the head room is detrimental in the non-TSO
case because it makes memory that would otherwise be available to
the payload head room.  TSO doesn't need this either, all it wants
is the guarantee that there is no tail room.

So this patch fixes this by adjusting the skb_reserve call so that
exactly the requested amount (which all callers have calculated in
a precise way) is made available as tail room.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

9fcba471

Fix SKB_WITH_OVERHEAD calculations. · 8522b496

Herbert Xu authored Oct 23, 2007

patch deea84b0 in mainline.

[NET]: Fix SKB_WITH_OVERHEAD calculation

The calculation in SKB_WITH_OVERHEAD is incorrect in that it can cause
an overflow across a page boundary which is what it's meant to prevent.
In particular, the header length (X) should not be lumped together with
skb_shared_info.  The latter needs to be aligned properly while the header
has no choice but to sit in front of wherever the payload is.

Therefore the correct calculation is to take away the aligned size of
skb_shared_info, and then subtract the header length.  The resulting
quantity L satisfies the following inequality:

	SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <= PAGE_SIZE

This is the quantity used by alloc_skb to do the actual allocation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

8522b496

Fix 9P protocol build · b3540573

Ingo Molnar authored Oct 23, 2007

patch 092e9d93 in mainline.

[9P]: build fix with !CONFIG_SYSCTL

found via make randconfig build testing:

 net/built-in.o: In function `init_p9':
 mod.c:(.init.text+0x3b39): undefined reference to `p9_sysctl_register'
 net/built-in.o: In function `exit_p9':
 mod.c:(.exit.text+0x36b): undefined reference to `p9_sysctl_unregister'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b3540573

Fix advertised packet scheduler timer resolution · e20e6446

Patrick McHardy authored Oct 23, 2007

patch 3c0cfc13 in mainline

The fourth parameter of /proc/net/psched is supposed to show the timer
resultion and is used by HTB userspace to calculate the necessary
burst rate. Currently we show the clock resolution, which results in a
too low burst rate when the two differ.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e20e6446

Add get_unaligned to ieee80211_get_radiotap_len · d876cd16

Andy Green authored Oct 09, 2007

patch dfe6e81d in mainline.

ieee80211_get_radiotap_len() tries to dereference radiotap length without
taking care that it is completely unaligned and get_unaligned()
is required.
Signed-off-by: Andy Green <andy@warmcat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

d876cd16

mac80211: Improve sanity checks on injected packets · 1e3bfd14

Andy Green authored Oct 09, 2007

patch 9b8a74e3 in mainline.

Michael Wu noticed that the skb length checking is not taken care of enough when
a packet is presented on the Monitor interface for injection.

This patch improves the sanity checking and removes fake offsets placed
into the skb network and transport header.
Signed-off-by: Andy Green <andy@warmcat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

1e3bfd14

mac80211: filter locally-originated multicast frames · be3d7bec

John W. Linville authored Oct 09, 2007

patch b3316157 in mainline.

In STA mode, the AP will echo our traffic.  This includes multicast
traffic.

Receiving these frames confuses some protocols and applications,
notably IPv6 Duplicate Address Detection.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

be3d7bec

Linux 2.6.23.3 · ef7cede8
Greg Kroah-Hartman authored Nov 16, 2007

ef7cede8

revert "x86_64: allocate sparsemem memmap above 4G" · edc0636c

Linus Torvalds authored Nov 01, 2007

Reverted upstream by commit 6a22c57b

Revert this commit:

	commit 2e1c49db
	Author: Zou Nan hai <nanhai.zou@intel.com>
	Date:   Fri Jun 1 00:46:28 2007 -0700
	
	x86_64: allocate sparsemem memmap above 4G

This reverts commit 2e1c49db.

First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jobes.  So the
commit will likely be reverted in the 2.6.23 stable kernels.

Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Martin Ebourne <fedora@ebourne.me.uk>
Cc: Zou Nan hai <nanhai.zou@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

edc0636c

x86: fix TSC clock source calibration error · 963bbb3b

Dave Johnson authored Oct 23, 2007

patch edaf420f in mainline.

I ran into this problem on a system that was unable to obtain NTP sync
because the clock was running very slow (over 10000ppm slow). ntpd had
declared all of its peers 'reject' with 'peer_dist' reason.

On investigation, the tsc_khz variable was significantly incorrect
causing xtime to run slow.  After a reboot tsc_khz was correct so I
did a reboot test to see how often the problem occurred:

Test was done on a 2000 Mhz Xeon system.  Of 689 reboots, 8 of them
had unacceptable tsc_khz values (>500ppm):

 range of tsc_khz  # of boots  % of boots
 ----------------  ----------  ----------
        < 1999750           0      0.000%
1999750 - 1999800          21      3.048%
1999800 - 1999850         166     24.128%
1999850 - 1999900         241     35.029%
1999900 - 1999950         211     30.669%
1999950 - 2000000          42      6.105%
2000000 - 2000000           0      0.000%
2000050 - 2000100           0      0.000%
                   [...]
2000100 - 2015000           1      0.145%  << BAD
2015000 - 2030000           6      0.872%  << BAD
2030000 - 2045000           1      0.145%  << BAD
2045000 <                   0      0.000%

The worst boot was 2032.577 Mhz, over 1.5% off!

It appears that on rare occasions, mach_countup() is taking longer to
complete than necessary.

I suspect that this is caused by the CPU taking a periodic SMI
interrupt right at the end of the 30ms calibration loop.  This would
cause the loop to delay while the SMI BIOS hander runs. The resulting
TSC value is beyond what it actually should be resulting in a higher
tsc_khz.

The below patch makes native_calculate_cpu_khz() take the best
(shortest duration, lowest khz) run of it's 3 calibration loops.  If a
SMI goes off causing a bad result (long duration, higher khz) it will
be discarded.

With the patch applied, 300 boots of the same system produce good
results:

 range of tsc_khz  # of boots  % of boots
 ----------------  ----------  ----------
        < 1999750           0      0.000%
1999750 - 1999800          30     10.000%
1999800 - 1999850         166     55.333%
1999850 - 1999900          89     29.667%
1999900 - 1999950          15      5.000%
1999950 <                   0      0.000%

Problem was found and tested against 2.6.18.  Patch is against 2.6.22.
Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

963bbb3b

x86 setup: sizeof() is unsigned, unbreak comparisons · 2d49e888

H. Peter Anvin authored Oct 25, 2007

patch e6e1ace9 in mainline.


We use signed values for limit checking since the values can go
negative under certain circumstances.  However, sizeof() is unsigned
and forces the comparison to be unsigned, so move the comparison into
the heap_free() macros so we can ensure it is a signed comparison.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

2d49e888

x86 setup: handle boot loaders which set up the stack incorrectly · 430bb2ee

H. Peter Anvin authored Oct 25, 2007

patch 6b6815c6 in mainline.

Apparently some specific versions of LILO enter the kernel with a
stack pointer that doesn't match the rest of the segments.  Make our
best attempt at untangling the resulting mess.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

430bb2ee

x86: fix global_flush_tlb() bug · 4b69ffe3

Ingo Molnar authored Oct 19, 2007

patch 9a24d04a upstream

While we were reviewing pageattr_32/64.c for unification,
Thomas Gleixner noticed the following serious SMP bug in
global_flush_tlb():

	down_read(&init_mm.mmap_sem);
	list_replace_init(&deferred_pages, &l);
	up_read(&init_mm.mmap_sem);

this is SMP-unsafe because list_replace_init() done on two CPUs in
parallel can corrupt the list.

This bug has been introduced about a year ago in the 64-bit tree:

       commit ea7322de
       Author: Andi Kleen <ak@suse.de>
       Date:   Thu Dec 7 02:14:05 2006 +0100

       [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr

                down_read(&init_mm.mmap_sem);
        -       dpage = xchg(&deferred_pages, NULL);
        +       list_replace_init(&deferred_pages, &l);
                up_read(&init_mm.mmap_sem);

the xchg() based version was SMP-safe, but list_replace_init() is not.
So this "cleanup" introduced a nasty bug.

why this bug never become prominent is a mystery - it can probably be
explained with the (still) relative obscurity of the x86_64 architecture.

the safe fix for now is to write-lock init_mm.mmap_sem.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

4b69ffe3

xfs: eagerly remove vmap mappings to avoid upsetting Xen · ba4312eb

Jeremy Fitzhardinge authored Oct 12, 2007

patch ace2e92e in mainline.

XFS leaves stray mappings around when it vmaps memory to make it
virtually contigious.  This upsets Xen if one of those pages is being
recycled into a pagetable, since it finds an extra writable mapping of
the page.

This patch solves the problem in a brute force way, by making XFS
always eagerly unmap its mappings.

[ Stable: This works around a bug in 2.6.23.  We may come up with a
better solution for mainline, but this seems like a low-impact fix for
the stable kernel. ]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: XFS masters <xfs-masters@oss.sgi.com>
Cc: Morten =?utf-8?q?B=C3=B8geskov?= <xen-users@morten.bogeskov.dk>
Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ba4312eb

xen: fix incorrect vcpu_register_vcpu_info hypercall argument · 418db154

Jeremy Fitzhardinge authored Oct 12, 2007

patch e3d26976 in mainline.

The kernel's copy of struct vcpu_register_vcpu_info was out of date,
at best causing the hypercall to fail and the guest kernel to fall
back to the old mechanism, or worse, causing random memory corruption.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Morten =?utf-8?q?B=C3=B8geskov?= <xen-users@morten.bogeskov.dk>
Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

418db154

xen: deal with stale cr3 values when unpinning pagetables · edf06ad7

Jeremy Fitzhardinge authored Oct 12, 2007

patch 9f79991d in mainline.

When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed. However, this is only possible if there are no
stray uses of the pagetable. The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3. It is only updated once
the actual hypercall to set cr3 has been completed. Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

edf06ad7

xen: add batch completion callbacks · 4fc04833

Jeremy Fitzhardinge authored Oct 12, 2007

patch 91e0c5f3 in mainline.

This adds a mechanism to register a callback function to be called once
a batch of hypercalls has been issued.  This is typically used to unlock
things which must remain locked until the hypercall has taken place.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

4fc04833

UML - kill subprocesses on exit · 6f8a6ffc

Lepton Wu authored Nov 01, 2007

commit a24864a1

uml: definitively kill subprocesses on panic

In a stock 2.6.22.6 kernel, poweroff a user mode linux guest (2.6.22.6 running
in skas0 mode) will halt the host linux.  I think the reason is the kernel
thread abort because of a bug.  Then the sys_reboot in process of user mode
linux guest is not trapped by the user mode linux kernel and is executed by
host.  I think it is better to make sure all of our children process to quit
when user mode linux kernel abort.

[ jdike - the kernel process needs to ignore SIGTERM, plus the waitpid/kill
loop is needed to make sure that all of our children are dead before the
kernel exits ]
Signed-off-by: Lepton Wu <ytht.net@gmail.com>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

6f8a6ffc

UML - stop using libc asm/user.h · 9e6707f3

Jeff Dike authored Nov 01, 2007

commit 189872f9 in mainline.

uml: don't use glibc asm/user.h

Stop including asm/user.h from libc - it seems to be disappearing from
distros.  It's replaced with sys/user.h which defines user_fpregs_struct and
user_fpxregs_struct instead of user_i387_struct and struct user_fxsr_struct on
i386.

As a bonus, on x86_64, I get to dump some stupid typedefs which were needed in
order to get asm/user.h to compile.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

9e6707f3

UML - Fix kernel vs libc symbols clash · 63140953

Jeff Dike authored Nov 01, 2007

commit 818f6ef4 in mainline.

uml: fix an IPV6 libc vs kernel symbol clash

On some systems, with IPV6 configured, there is a clash between the kernel's
in6addr_any and the one in libc.

This is handled in the usual (gross) way of defining the kernel symbol out of
the way on the gcc command line.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

63140953

UML - Stop using libc asm/page.h · 90118e75

Jeff Dike authored Nov 01, 2007

commit 71f926f2 in mainline.

uml: stop using libc asm/page.h

Remove includes of asm/page.h from libc code.  This header seems to be
disappearing, and UML doesn't make much use of it anyway.

The one use, PAGE_SHIFT in stub.h, is handled by copying the constant from the
kernel side of the house in common_offsets.h.

[ jdike - added arch/um/kernel/skas/clone.c for -stable ]
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

90118e75

POWERPC: Make sure to of_node_get() the result of pci_device_to_OF_node() · 5361fb20

Michael Ellerman authored Sep 17, 2007

patch db220b23 in mainline.

pci_device_to_OF_node() returns the device node attached to a PCI device,
but doesn't actually grab a reference - we need to do it ourselves.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

5361fb20

POWERPC: Fix handling of stfiwx math emulation · 1d841b4f

Kumar Gala authored Oct 11, 2007

patch ba02946a in mainline

Its legal for the stfiwx instruction to have RA = 0 as part of its
effective address calculation.  This is illegal for all other XE
form instructions.

Add code to compute the proper effective address for stfiwx if
RA = 0 rather than treating it as illegal.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

1d841b4f

MIPS: R1: Fix hazard barriers to make kernels work on R2 also. · 2f51141b

Ralf Baechle authored Oct 10, 2007

patch 572afc24 in mainline.

Tested with Malta; inflates malta_defconfig by 3932 bytes.  Ideally there
should be additional configuration to allow getting rid of this overhead
but that would be too much complexity at this stage of the release cycle.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

2f51141b

MIPS: MT: Fix bug in multithreaded kernels. · e989c61a

Ralf Baechle authored Oct 10, 2007

patch a76ab5c1 in mainline.

When GDB writes a breakpoint into address area of inferior process the
kernel needs to invalidate the modified memory in the inferior which
is done by calling flush_cache_page which in turns calls
r4k_flush_cache_page and local_r4k_flush_cache_page for VSMP or SMTC
kernel via r4k_on_each_cpu().

As the VSMP and SMTC SMP kernels for 34K are running on a single shared
caches it is possible to get away without interprocessor function calls.
This optimization is implemented in r4k_on_each_cpu, so
local_r4k_flush_cache_page is only ever called on the local CPU.

This is where the following code in local_r4k_flush_cache_page() strikes:

        /*
         * If ownes no valid ASID yet, cannot possibly have gotten
         * this page into the cache.
         */
        if (cpu_context(smp_processor_id(), mm) == 0)
                return;

On VSMP and SMTC had a function of cpu_context() for each CPU(TC).

So in case another CPU than the CPU executing local_r4k_cache_flush_page
has not accessed the mm but one of the other CPUs has there may be data
to be flushed in the cache yet local_r4k_cache_flush_page will falsely
return leaving the I-cache inconsistent for the breakpoint.

While the issue was discovered with GDB it also exists in
local_r4k_flush_cache_range() and local_r4k_flush_cache().

Fixed by introducing a new function has_valid_asid which on MT kernels
returns true if a mm is active on any processor in the system.

This is relativly expensive since for memory acccesses in that loop
cache misses have to be assumed but it seems the most viable solution
for 2.6.23 and older -stable kernels.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e989c61a

Fix sparc64 MAP_FIXED handling of framebuffer mmaps · efd233a4

Chris Wright authored Oct 23, 2007

patch d58aa8c7 in mainline.

From: Chris Wright <chrisw@sous-sol.org>
Date: Tue, 23 Oct 2007 20:36:14 -0700
Subject: [PATCH] [SPARC64]: pass correct addr in get_fb_unmapped_area(MAP_FIXED)

Looks like the MAP_FIXED case is using the wrong address hint.  I'd
expect the comment "don't mess with it" means pass the request
straight on through, not change the address requested to -ENOMEM.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

efd233a4

Fix sparc64 niagara optimized RAID xor asm · 71ec6448

David Miller authored Oct 23, 2007

patch d060db63 in mainline.

[SPARC64]: Fix register usage in xor_raid_4().

Some typos led to using %i6/%i7 instead of %l6/%l7 in loads which is
really really bad because those are the frame pointer and return PC.

Based upon a raid5 crash report by Bertrand Joel.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

71ec6448

Linux 2.6.23.2 · 553e6a1a
Greg Kroah-Hartman authored Nov 16, 2007

553e6a1a

BLOCK: Fix bad sharing of tag busy list on queues with shared tag maps · 0520fb16

Jens Axboe authored Oct 30, 2007

patch 6eca9004 in mainline.

For the locking to work, only the tag map and tag bit map may be shared
(incidentally, I was just explaining this to Nick yesterday, but I
apparently didn't review the code well enough myself). But we also share
the busy list!  The busy_list must be queue private, or we need a
block_queue_tag covering lock as well.

So we have to move the busy_list to the queue. This'll work fine, and
it'll actually also fix a problem with blk_queue_invalidate_tags() which
will invalidate tags across all shared queues. This is a bit confusing,
the low level driver should call it for each queue seperately since
otherwise you cannot kill tags on just a single queue for eg a hard
drive that stops responding. Since the function has no callers
currently, it's not an issue.

This is fixed with commit 6eca9004 in
Linus' tree.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

0520fb16

fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE · bba9d994

Hugh Dickins authored Oct 29, 2007

patch 487e9bf2 in mainline.

It's possible to provoke unionfs (not yet in mainline, though in mm and
some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)).  I expect
it's possible to provoke the 2.6.23 ecryptfs in the same way (but the
2.6.24 ecryptfs no longer calls lower level's ->writepage).

This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
leak from tmpfs via write_cache_pages and unionfs to userspace.  There's
already a fix (e4230030 - writeback: don't
propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that, and it's okay so
far as it goes; but insufficient because it doesn't address the underlying
issue, that shmem_writepage expects to be called only by vmscan (relying on
backing_dev_info capabilities to prevent the normal writeback path from
ever approaching it).

That's an increasingly fragile assumption, and ramdisk_writepage (the other
source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
wbc->for_reclaim before returning it.  Make the same check in
shmem_writepage, thereby sidestepping the page_mapped BUG also.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Erez Zadok <ezk@cs.sunysb.edu>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bba9d994

Fix compat futex hangs. · 59ddd460

David Miller authored Nov 12, 2007

[FUTEX]: Fix address computation in compat code.

[ Upstream commit: 3c5fd9c7 ]

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bit clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

59ddd460

sched: keep utime/stime monotonic · e823c33c

Frans Pop authored Nov 14, 2007

sched: keep utime/stime monotonic

cpustats use utime/stime as a ratio against sum_exec_runtime, as a
consequence it can happen - when the ratio changes faster than time
accumulates - that either can be appear to go backwards.

Combined backport for 2.6.23 of the following patches from mainline:
commit 73a2bcb0
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
  sched: keep utime/stime monotonic

commit 9301899b
Author: Balbir Singh <balbir@linux.vnet.ibm.com>
  sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2
Signed-off-by: Frans Pop <elendil@planet.nl>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e823c33c