Commits · 0b749ce3802428007a37870eb51ba3c0bdf90857 · Kirill Smelkov / linux

10 Apr, 2006 6 commits

[PATCH] splice: be smarter about calling do_page_cache_readahead() · 0b749ce3

Jens Axboe authored Apr 10, 2006

We don't want to call into the read-ahead logic unless we are at the
start of a page, _or_ we have multiple pages to read.
Signed-off-by: Jens Axboe <axboe@suse.de>

0b749ce3
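
The condition stated in this commit message can be sketched as a small predicate. This is only an illustration under assumptions: `demo_want_readahead` and its parameters are hypothetical names, not the code in fs/splice.c.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the kernel's code): decide whether a read that
 * starts `offset_in_page` bytes into a page and spans `nr_pages` pages
 * should call into the read-ahead logic.
 */
static bool demo_want_readahead(size_t offset_in_page, size_t nr_pages)
{
    /* At the start of a page, _or_ more than one page to read. */
    return offset_in_page == 0 || nr_pages > 1;
}
```

A single-page read that starts mid-page is the only case that skips read-ahead under this rule.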

[PATCH] splice: optimize the splice buffer mapping · 49d0b21b

Jens Axboe authored Apr 10, 2006

We don't really need to lock down the pages, just make sure they
are uptodate.
Signed-off-by: Jens Axboe <axboe@suse.de>

49d0b21b

[PATCH] splice: cleanup __generic_file_splice_read() · 16c523dd

Jens Axboe authored Apr 10, 2006

The whole shadow/pages logic got overly complex, and this simpler
approach is actually faster in testing.
Signed-off-by: Jens Axboe <axboe@suse.de>

16c523dd

[PATCH] splice: only call wake_up_interruptible() when we really have to · c0bd1f65

Jens Axboe authored Apr 10, 2006

__wake_up_common() is pretty heavy in the kernel profiles; this brings
it down to a more acceptable level.
Signed-off-by: Jens Axboe <axboe@suse.de>

c0bd1f65

[PATCH] splice: potential !page dereference · 9aefe431

Dave Jones authored Apr 10, 2006

We can get to out: with a NULL page, which we probably
don't want to be calling page_cache_release() on.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@suse.de>

9aefe431

[PATCH] splice: mark the io page as accessed · c7f21e4f

Jens Axboe authored Apr 10, 2006

We should do that, since we do the LRU manipulation ourselves now. Suggested
by Nick Piggin.
Signed-off-by: Jens Axboe <axboe@suse.de>

c7f21e4f

09 Apr, 2006 31 commits

[SELINUX] Fix build after ipsec decap state changes. · 67644726

Dave Jones authored Apr 02, 2006

security/selinux/xfrm.c: In function 'selinux_socket_getpeer_dgram':
security/selinux/xfrm.c:284: error: 'struct sec_path' has no member named 'x'
security/selinux/xfrm.c: In function 'selinux_xfrm_sock_rcv_skb':
security/selinux/xfrm.c:317: error: 'struct sec_path' has no member named 'x'
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

67644726

Move request_standard_resources() back to before PCI probing · 66004a6c

Linus Torvalds authored Apr 09, 2006

This effectively undoes the PCI resource allocation changes done in
commit b408cbc7, but leaves the cleanups
of that commit in place.

We're going back to marking the resources reported by e820 busy _before_
doing PCI probing, so that any PCI resource that clashes with the
BIOS-reported memory map will be relocated to a non-clashing area.

The reason? Larry Finger reports that his laptop has the cardbus
controller set up by the BIOS so that it conflicts with the e820 memory
map, and needs to be relocated. See

   http://bugzilla.kernel.org/show_bug.cgi?id=6337

for more details.

We'll have to work out how to handle the fbcon problem that caused that
commit in the first place in some other way.

Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Antonino A. Daplas <adaplas@pol.net>
Cc: <bjk@luxsci.net>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

66004a6c

[PATCH] x86_64: Update 32-bit system call table · b8feb47f

Andi Kleen authored Apr 07, 2006

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

b8feb47f

[PATCH] x86_64: Eliminate IA32_NR_syscalls define · 67d53ea5

Andi Kleen authored Apr 07, 2006

Or rather compute it based on the table length automatically.

This also has the intended side effect of not warning for new system calls
anymore.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

67d53ea5
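
Deriving a count from the table itself, as this commit describes, is commonly done with a sizeof-based idiom. The sketch below is a C analogue under assumptions: `demo_table`, `DEMO_NR_SYSCALLS` and the handlers are hypothetical, and the kernel's real 32-bit syscall table may well be laid out differently.

```c
#include <stddef.h>

typedef long (*demo_syscall_fn)(void);   /* hypothetical handler type */

static long demo_sys_a(void) { return 0; }
static long demo_sys_b(void) { return 1; }
static long demo_sys_c(void) { return 2; }

/* Adding an entry here changes the count below automatically. */
static const demo_syscall_fn demo_table[] = {
    demo_sys_a,
    demo_sys_b,
    demo_sys_c,
};

/* Number of entries, computed from the table length instead of a
 * hand-maintained constant that can fall out of date. */
#define DEMO_NR_SYSCALLS (sizeof(demo_table) / sizeof(demo_table[0]))
```

Because the count follows the table, a newly added entry cannot trip a stale-constant check.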

[PATCH] x86_64: fix CONFIG_REORDER · bbd3aff8

Sam Ravnborg authored Apr 07, 2006

Fix CONFIG_REORDER.

The value of cflags-y was assigned to CFLAGS before cflags-y was assigned
the value used for CONFIG_REORDER.

Use cflags-y for all CFLAGS options in the Makefile to avoid this
happening again.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

bbd3aff8

[PATCH] x86_64: Plug GS leak in arch_prctl() · 97c2803c

John Blackwood authored Apr 07, 2006

In linux-2.6.16, we have noticed a problem where the gs base value
returned from an arch_prctl(ARCH_GET_GS, ...) call will be incorrect if:

   - the current/calling task has NOT set its own gs base yet to a
     non-zero value,

   - some other task that ran on the same processor previously set their
     own gs base to a non-zero value.

In this situation, the ARCH_GET_GS code will read and return the
MSR_KERNEL_GS_BASE msr register.

However, since the __switch_to() code does NOT load/zero the
MSR_KERNEL_GS_BASE register when the task that is switched IN has a zero
next->gs value, the caller of arch_prctl(ARCH_GET_GS, ...) will get back
some previous task's gs base value instead of 0.

    Change the arch_prctl() ARCH_GET_GS code to only read and return
    the MSR_KERNEL_GS_BASE msr register if the 'gs' register of the calling
    task is non-zero.

    Side note: Since in addition to using arch_prctl(ARCH_SET_GS, ...),
    a task can also setup a gs base value by using modify_ldt() and write
    an index value into 'gs' from user space, the patch below reads
    'gs' instead of using thread.gs, since in the modify_ldt() case,
    the thread.gs value will be 0, and an incorrect value would be returned
    (the task->thread.gs value).

    When the user has not set its own gs base value and the 'gs'
    register is zero, then the MSR_KERNEL_GS_BASE register will not be
    read and a value of zero will be returned by reading and returning
    'task->thread.gs'.

    The first patch shown below is an attempt at implementing this
    approach.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

97c2803c
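
The selection rule described in this commit message can be modelled in plain C. This is a user-space sketch under assumptions: `struct demo_task` and `demo_get_gs_base` are hypothetical, and the MSR contents are passed in as an argument because the real register can only be read in kernel mode.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the state the commit message talks about. */
struct demo_task {
    uint16_t gs_selector;   /* current value of the 'gs' segment register */
    uint64_t thread_gs;     /* saved base, playing the role of thread.gs */
};

/*
 * Return the gs base to report for ARCH_GET_GS.
 * `msr_kernel_gs_base` models the contents of MSR_KERNEL_GS_BASE, which may
 * still hold a value left behind by a task that ran earlier on this CPU.
 */
static uint64_t demo_get_gs_base(const struct demo_task *t,
                                 uint64_t msr_kernel_gs_base)
{
    if (t->gs_selector != 0)
        return msr_kernel_gs_base;  /* set up via modify_ldt(): trust the MSR */
    return t->thread_gs;            /* otherwise never consult the stale MSR */
}
```

A task that never set a gs base has both fields zero, so it gets 0 back regardless of what an earlier task left in the MSR.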

[PATCH] i386: Remove printk about reboot fixups at reboot · e48c4729

Andi Kleen authored Apr 07, 2006

Printk doesn't have any value
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e48c4729

[PATCH] x86_64: Fix drift with HPET timer enabled · b20367a6

Jordan Hargrave authored Apr 07, 2006

If the HPET timer is enabled, the clock can drift by ~3 seconds a day.
This is due to the HPET timer not being initialized with the correct
setting (still using PIT count).

If HZ changes, this drift can become even more pronounced.

HPET patch initializes tick_nsec with correct tick_nsec settings for
HPET timer.

Vojtech comments:

  "It's not entirely correct (it assumes the HPET ticks totally
   exactly), but it's significantly better than assuming the PIT error
   there."
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

b20367a6

[PATCH] i386/x86-64: Return defined error value for bad PCI config space accesses · 49c93e84

Andi Kleen authored Apr 07, 2006

Mostly to get better handling when an extended config space
access has to fall back to Type1.

Cc: gregkh@suse.de
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

49c93e84

[PATCH] i386/x86_64: Check if MCFG works for the first 16 busses · 8c30b1a7

Andi Kleen authored Apr 07, 2006

Previously only the first bus would be checked against Type 1.

Why 16? Checking all would need too much memory and we
can assume that systems with more than 16 busses have better than
average quality BIOS.

This is an additional defense against bad MCFG tables.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

8c30b1a7

[PATCH] x86_64: Fixup read_mostly section on internode cache line size for vSMP · e405d067

Ravikiran G Thirumalai authored Apr 07, 2006

Fixup the read mostly section to start at internode cacheline boundary.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e405d067

[PATCH] x86_64: Don't return error for HPET initialization in initcall · 3d34ee68

Andi Kleen authored Apr 07, 2006

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

3d34ee68

[PATCH] x86_64: Don't export strlen twice · ac04dcaf

Andi Kleen authored Apr 07, 2006

Fix

  WARNING: vmlinux: 'strlen' exported twice. Previous export was in vmlinux

Reported by Mats Johannesson
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

ac04dcaf

[PATCH] x86_64: When user could have changed RIP always force IRET · 7bf36bbc

Andi Kleen authored Apr 07, 2006

Intel EM64T CPUs handle uncanonical return addresses differently
from AMD CPUs.

The exception is reported in the SYSRET, not the next instruction.
This leads to the kernel exception handler running on the user stack
with the wrong GS because the kernel didn't expect exceptions
on this instruction.

This version of the patch fixes the teething problems that plagued an
earlier version.

This is CVE-2006-0744

Thanks to Ernie Petrides and Asit B. Mallick for analysis and initial
patches.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

7bf36bbc
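
For readers unfamiliar with the term, an "uncanonical" address on x86-64 is one whose upper bits are not a sign extension of the highest implemented address bit. The sketch below only illustrates that definition, assuming 48-bit virtual addresses; `demo_is_canonical` is a hypothetical helper and is not what the patch does (the patch forces the IRET return path instead).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumes 48-bit virtual addresses: bits 63..47 must be all zeros or
 * all ones for the address to be canonical.
 */
static bool demo_is_canonical(uint64_t addr)
{
    uint64_t upper = addr >> 47;            /* the top 17 bits */
    return upper == 0 || upper == 0x1ffff;
}
```

Under this assumption, 0x00007fffffffffff is the highest canonical user-space address and 0x0000800000000000 is the first uncanonical one.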

[PATCH] x86_64: Don't run NMI watchdog during machine checks · 553f265f

Andi Kleen authored Apr 07, 2006

Machine checks can stall the machine for a long time and
it's not good to trigger the nmi watchdog during that.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

553f265f

[PATCH] x86_64: extra NODES_SHIFT definition · be56db61

Dave Hansen authored Apr 07, 2006

The generic linux/numa.h file defines NODES_SHIFT to 0 in case
the architecture did not.

Every architecture which has a NUMA config option defines
NODES_SHIFT in its asm-$ARCH headers, but only if NUMA is
enabled, except for x86_64.

This should make it like all the rest.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

be56db61

[PATCH] x86_64: Proper null pointer check in powernow_k8_get · 4211a303

Jacob Shin authored Apr 07, 2006

This prevents crashes on dual core system when enough ticks are lost.

Replaces earlier patch by me.

Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

4211a303

[PATCH] x86_64: Revert earlier powernow-k8 change · d7fa706c

Andi Kleen authored Apr 07, 2006

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d7fa706c

[PATCH] i386: Consolidate modern APIC handling · 95d769aa

Andi Kleen authored Apr 07, 2006

AMD systems have a modern APIC that supports 8 bit IDs, but
don't have a XAPIC version number.  Add a new "modern_apic"
subfunction that handles this correctly and use it (nearly)
everywhere where XAPIC is tested for.

I removed one wart: the code specified that external APICs
would use an 8bit APIC ID. But I checked a real 82093 data sheet
and it says clearly that they only use 4bit. So I removed
this special case since it would be a bit awkward to implement now.

I removed the valid APIC tests in mptable parsing completely. On any modern
system they only check against the full field width (8bit) anyways
and are no-ops. This also fixes them doing the wrong thing
on >8 core Opterons.

This makes i386 boot again on 16 core Opterons.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

95d769aa

[PATCH] x86_64: Clear APIC feature bit when local APIC is disabled · d1530d82

Andi Kleen authored Apr 07, 2006

Needed for other checks later in ACPI.

Pointed out by Len Brown
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d1530d82

[PATCH] x86-64/i386: Don't process APICs/IO-APICs in ACPI when APIC is disabled. · d3b6a349

Andi Kleen authored Apr 07, 2006

When nolapic was passed or the local APIC was disabled
for another reason ACPI would still parse the IO-APICs
until these were explicitly disabled with noapic.

Usually this resulted in a non booting configuration unless
"nolapic noapic" was used.

I also disabled the local APIC parsing in this case, although
that's only cosmetic (suppresses a few printks)

This hopefully makes nolapic work in all cases.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d3b6a349

[PATCH] x86_64: Don't sanity check Type 1 PCI bus access on newer systems · ec0f08ee

Andi Kleen authored Apr 07, 2006

Horus systems don't have anything on bus 0 which makes
the Type 1 sanity checks fail.  Use the DMI BIOS year to
check for newer systems and always assume Type 1 works on them.
I used 2001 as a pretty arbitrary cutoff year.

Cc: gregkh@suse.de
Cc: Navin Boppuri <navin.boppuri@newisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

ec0f08ee

[PATCH] x86_64: Fix compilation with CONFIG_PCI=n / allnoconfig · fa47dd0b

Andi Kleen authored Apr 07, 2006

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

fa47dd0b

[PATCH] i386/x86-64: Check that MCFG points to an e820 reserved area · 946f2ee5

Arjan van de Ven authored Apr 07, 2006

This patch introduces a user for the e820_all_mapped function:

There have been several machines that don't have a working MMCONFIG,
often because of a buggy MCFG table in the ACPI bios. This patch adds a
simple sanity check that detects a whole bunch of these cases, and when
it detects it, linux now boots rather than crash-and-burns.

The accuracy of this detection can in principle be improved if there was
a "is this entire range in e820 with THIS attribute", but no such
function exists and the complexity needed for this is not really worth
it; this simple check already catches most cases anyway.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

946f2ee5

[PATCH] x86_64: Introduce e820_all_mapped · 95222368

Arjan van de Ven authored Apr 07, 2006

Introduce an e820_all_mapped() function which checks if the entire range
<start,end> is mapped with type.

This is done by moving the local start variable to the end of each
known-good region; if at the end of the function the start address is
still before end, there must be a part that's not of the correct type;
otherwise it's a good region.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

95222368
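
The "move start forward" scan described in this commit message can be sketched as follows. It is a minimal illustration, assuming the map is sorted by address with no overlapping entries; `struct demo_region` and `demo_all_mapped` are hypothetical names rather than the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_region {          /* hypothetical stand-in for an e820 entry */
    uint64_t addr;
    uint64_t size;
    unsigned type;
};

/*
 * Return true if every byte of [start, end) lies in regions of `type`.
 * Assumes `map` is sorted by address and its entries do not overlap.
 */
static bool demo_all_mapped(const struct demo_region *map, size_t n,
                            uint64_t start, uint64_t end, unsigned type)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t region_end = map[i].addr + map[i].size;

        if (map[i].type != type)
            continue;                   /* wrong type: cannot help */
        if (map[i].addr >= end || region_end <= start)
            continue;                   /* no overlap with the range */
        if (map[i].addr <= start)
            start = region_end;         /* known good up to here */
        if (start >= end)
            return true;                /* whole range is covered */
    }
    return false;                       /* start never reached end: a gap */
}
```

A region that begins after the current `start` leaves it unchanged, so any hole in the coverage keeps `start` below `end` and the function reports false.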

[PATCH] x86_64: Rename e820_mapped to e820_any_mapped · eee5a9fa

Arjan van de Ven authored Apr 07, 2006

Rename e820_mapped to e820_any_mapped since it tests if any part of the
range is mapped according to the type.

Later steps will introduce e820_all_mapped which will check if the
entire range is mapped with the type.  Both have their merit.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

eee5a9fa
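
The "any part of the range" semantics behind the new name can be sketched as a simple overlap test. This is an illustration under assumptions: `struct demo_entry` and `demo_any_mapped` are hypothetical and do not reproduce the kernel's function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_entry {           /* hypothetical stand-in for an e820 entry */
    uint64_t addr;
    uint64_t size;
    unsigned type;
};

/* Return true if any byte of [start, end) lies in an entry of `type`. */
static bool demo_any_mapped(const struct demo_entry *map, size_t n,
                            uint64_t start, uint64_t end, unsigned type)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].type != type)
            continue;
        /* Two half-open intervals overlap unless one ends before the
         * other begins. */
        if (map[i].addr < end && map[i].addr + map[i].size > start)
            return true;
    }
    return false;
}
```

A range that straddles two differently typed entries satisfies this check for both types, which is why a stricter whole-range check is a separate function.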

[PATCH] x86_64: Handle empty PXMs that only contain hotplug memory · a8062231

Andi Kleen authored Apr 07, 2006

The node setup code would try to allocate the node metadata in the node
itself, but that fails if there is no memory in there.

This can happen with memory hotplug when the hotplug area defines a
so-far-empty node.

Now use bootmem to try to allocate the mem_map in other nodes.

And if it fails don't panic, but just ignore the node.

To make this work I added a new __alloc_bootmem_nopanic function that
does what its name implies.

TBD should try to use nearby nodes here.  Currently we just use any.
It's hard to do it better because bootmem doesn't have proper fallback
lists yet.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

a8062231

[PATCH] x86_64: Reserve SRAT hotadd memory on x86-64 · 68a3a7fe

Andi Kleen authored Apr 07, 2006

From: Keith Mannthey, Andi Kleen

Implement memory hotadd without sparsemem. The memory in the SRAT
hotadd area is just preserved instead and can be activated later.

There are a few restrictions:
- Only one continuous hotadd area allowed per node

The main problem is dealing with the many buggy SRAT tables
that are out there. The strategy here is to reject anything
suspicious.

Originally from Keith Mannthey, with several hacks and changes by AK
and also contributions from Andrew Morton

[ TBD: Problems pointed out by KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>:

 1) Goto's rebuild_zonelist patch will not work if CONFIG_MEMORY_HOTPLUG=n.

    Rebuilding zonelist is necessary when the system has just memory <
    4G at boot, and hot add memory > 4G.  Because x86_64 has DMA32,
    ZONE_NORMAL is not included into zonelist at boot time if system
    doesn't have memory >4G at boot.

    [AK: should just force the higher zones at boot time when SRAT tells us]

 2) zone and node's spanned_pages and present_pages are not incremented.
    They should be.

    For example, our server (ia64/Fujitsu PrimeQuest) can equip memory
    from 4G to 1T (maybe 2T in future), and SRAT will *always* say we have
    possible 1T memory.  (Microsoft requires "write all possible memory
    in SRAT") When we reserve memmap for possible 1T memory, Linux will
    not work well in minimum 4G configuration ;)

    [AK: needs limiting to 5-10% of max memory]
 ]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

68a3a7fe

[PATCH] x86_64: Support memory hotadd without sparsemem · 9d99aaa3

Andi Kleen authored Apr 07, 2006

Memory hotadd doesn't need SPARSEMEM, but can be handled by just preallocating
mem_maps. This only needs some untangling of ifdefs to enable the necessary
code even without SPARSEMEM.

Originally from Keith Mannthey, hacked by AK.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

9d99aaa3

[PATCH] x86_64: Clean up execve path · 805e8c03

Andi Kleen authored Apr 07, 2006

Just call IRET always, no need for any special cases.

Needed for the next bug fix.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

805e8c03

[PATCH] x86_64: Update defconfig · 903fcc60

Andi Kleen authored Apr 07, 2006

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

903fcc60

03 Apr, 2006 1 commit

Linux v2.6.17-rc1 · 6246b612

Linus Torvalds authored Apr 02, 2006

Close of the merge window..

6246b612

02 Apr, 2006 2 commits

Update dummy snd_power_wait() function for new calling convention · 6fdb94bd

Linus Torvalds authored Apr 02, 2006

Apparently nobody had tried to compile the ALSA CVS tree without power
management enabled.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

6fdb94bd

Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block · d6963615

Linus Torvalds authored Apr 02, 2006

* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: fix page stealing LRU handling.
  [PATCH] splice: page stealing needs to wait_on_page_writeback()
  [PATCH] splice: export generic_splice_sendpage
  [PATCH] splice: add a SPLICE_F_MORE flag
  [PATCH] splice: add comments documenting more of the code
  [PATCH] splice: improve writeback and clean up page stealing
  [PATCH] splice: fix shadow[] filling logic

d6963615