Commit 1a87fc37 authored by Andi Kleen, committed by Linus Torvalds

[PATCH] New x86-64 merge

This fixes various issues in the previous update; in particular,
a kernel without CONFIG_GART_IOMMU should boot again.

The kernel now discovers PCI bus<->CPU affinity on AMD systems.
So far it is used by dma_alloc_coherent to allocate memory.
Experimental patches to add this to sysfs exist, but they're not
included yet. On systems with no memory on a CPU this information may
be wrong.

It has a new experimental CONFIG_UNORDERED_IO option. When enabled,
it uses write combining for stores to device I/O memory mappings. This
may give better performance with some device drivers, but has a slight
risk of breaking drivers (in general, if a driver works on ia64, ppc64 or
sparc64 it should also work). Based on some discussions with Grant Grundler.

It requires the driver to use memory barriers properly. I would be interested
in feedback on any performance changes you're seeing. For a production system I
would recommend keeping it turned off (although I run it on all my systems and
haven't run into any problems yet).
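
As a rough illustration of what "use memory barriers properly" means here,
the sketch below models a driver that writes a descriptor address and then
rings a doorbell. It is a userspace simulation, not kernel code: fake_writel(),
fake_wmb(), the register offsets and the operation log are all hypothetical
stand-ins for writel(), wmb() and real device registers. The assumption is
that with write-combined mappings the two stores may reach the device in
either order unless a write barrier separates them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Everything below is a hypothetical model; no real device is touched. */
enum op_kind { OP_STORE, OP_BARRIER };

struct op {
    enum op_kind kind;
    uint32_t reg;
    uint32_t val;
};

#define LOG_MAX       16
#define REG_DESC_ADDR 0x00u   /* assumed register offset */
#define REG_DOORBELL  0x04u   /* assumed register offset */

static struct op op_log[LOG_MAX];
static size_t op_count;

void reset_log(void) { op_count = 0; }
size_t logged_ops(void) { return op_count; }

static void record(enum op_kind kind, uint32_t reg, uint32_t val)
{
    assert(op_count < LOG_MAX);
    op_log[op_count].kind = kind;
    op_log[op_count].reg = reg;
    op_log[op_count].val = val;
    op_count++;
}

/* Stand-in for writel(): only records the requested store. */
void fake_writel(uint32_t val, uint32_t reg) { record(OP_STORE, reg, val); }

/* Stand-in for wmb(): only records that a barrier was issued. */
void fake_wmb(void) { record(OP_BARRIER, 0, 0); }

/* Correct pattern: the descriptor store is ordered before the doorbell. */
void submit_descriptor(uint32_t desc_addr)
{
    fake_writel(desc_addr, REG_DESC_ADDR);
    fake_wmb();
    fake_writel(1, REG_DOORBELL);
}

/* Pattern that happens to work with strictly ordered I/O stores but
 * gives no ordering guarantee once stores may be combined or reordered. */
void submit_descriptor_unordered(uint32_t desc_addr)
{
    fake_writel(desc_addr, REG_DESC_ADDR);
    fake_writel(1, REG_DOORBELL);
}

/* Returns 1 if a barrier was logged between the most recent store to
 * first_reg and the following store to second_reg, otherwise 0. */
int barrier_between(uint32_t first_reg, uint32_t second_reg)
{
    int seen_first = 0, seen_barrier = 0;

    for (size_t i = 0; i < op_count; i++) {
        const struct op *o = &op_log[i];

        if (o->kind == OP_BARRIER) {
            if (seen_first)
                seen_barrier = 1;
        } else if (o->reg == first_reg) {
            seen_first = 1;
            seen_barrier = 0;
        } else if (o->reg == second_reg && seen_first) {
            return seen_barrier;
        }
    }
    return 0;
}
```

A driver audit for this option would look for the second pattern: two
dependent stores to device memory with nothing between them.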

ACPI and Centrino speedstep are now enabled for Nocona systems.

The IOMMU code does lazy merging by default now, which should be safe
and may increase performance on block IO.  It also avoids SAC force by default
now.

The machine check code has been improved again; hopefully it is good
now. It will now log machine check events from before the last reset.

The x86-64 parts are now gcc 3.5 clean.

And various other fixes

- Update defconfig
- Reset lost ticks on lost time warning, print RIP.
- Make TASK_SIZE test for 32bit (Arjan van de Ven) 
- Work around bug in generic code that broke pcibus_to_cpumask
- Actually fix dummy iommu code
- Compile i386 acpi and speedstep-centrino cpufreq modules
- Export cpu_khz
- Fix compilation without GART_IOMMU
- Optimize find_*_bit functions for small fields
- Discover nodes near PCI busses on K8 (Travis Betak, changed by me) 
- Optimize gart tlb flush slightly
- Add experimental CONFIG_UNORDERED_IO for unordered IO stores
- Add 32bit emulation for PTRACE_GETEVENTMSG
- Fix kernel_fpu_{begin,end} for preemptive kernels (Alexander Nyberg)
- Readd proper check for biomerge (got lost) 
- Set up 32bit vsyscall page for ptrace early
- Add 32bit emulation for lookup_dcookie() for oprofile
- Export copy_page / clear_page
- Use rex prefix in save_init_fpu fxsave (Jan Beulich)
- Make it compile again
- Fix handling of hwdev == NULL (= ISA/LPC devices) in swiotlb
- Convert PCI DMA code to dma devices
- Change IOMMU code to use dummy fallback device instead of hardcoded
  NULL tests everywhere.
- Test iommu_sac_force instead of nommu for DAC supported macro
  (will cause more drivers to use DAC)
- Harden non-IOMMU dma_alloc_consistent code so that it is less likely to fail.
- Remove use of strsep in option parsers
- Remove duplicated exports (Arjan van de Ven)
- Fix EFAULT checking in ptrace (John Blackwood)
- Update defconfig
- Remove dead URL from boot/setup.S (R.J. Wysocki) 
- Use compat_sigval_t instead of sigval_t32 (Al Viro)
- Nanooptimization in 32bit ptregs calls
- Fix gcc 3.5 compilation in mtrr.h 
- Pass pt_regs as pointer to avoid illegal pass by reference (for gcc 3.5)
- Make set_bit take int not long (Harald Dunkel)
- Avoid panic on pci_map_sg and pci_alloc_consistent overflow in GART IOMMU
- Handle large lost time delays in HPET code (Suresh B. Siddha)
- Work around theoretical bugs in prefetch handling (suggested by Jamie Lokier)
- Remove mtrr_strings declaration for gcc 3.5
- Set KBUILD_IMAGE for make rpm (William Lee Irwin III)
- Add iommu=noaperture to not touch the aperture
- Clean up argument parsing for iommu= option
- Export symbols for xchgadd based rwsems (still disabled)
- Define iommu_bio_merge for !CONFIG_GART_IOMMU
- Don't use backwards rep ; movsb for memmove
- Out line bitmap search functions (saves 8k .text, from i386) 
- Convert bitmap search functions to 64bit accesses and optimize them
  a bit.
- Handle corrupted page tables in page fault handler
- Set iommu_merge (without force) to on by default again.
- Don't do bio merging by default for iommu=merge. This should make it
  safe to use again
- Add iommu=biomerge option to enable BIO merging (like old iommu=merge)
- Fix iommu=memaper=... parsing
- More MCE fixes (based on a patch by Eric Morton, heavily changed by me)
- Fix check for banks causing exceptions
- Allow to reinit MCEs later even after mce=off, fix wrong
  use of __initdata
  to disable at boot, but reenable later.
- Log left over machine checks after boot and resume
- Fix missing prototype warning with CPU_FREQ on
- Fix parsing of noexec=on (Ian Hastie)
- Fix warning in ia32_binfmt.c
- Resync time variable cpu frequency handling with i386
- Resync msr.c with i386
- Add 0x60 level 1 intel cache descriptor (from i386)
- Remove duplicated 32bit ioctls (Arnd Bergmann)
- Enable -msoft-float (from i386)
- Use faster version of FPU hang fix - handle the exception
  * a bit experimental, if you see "kernel ... math error" events
    in the log please report.
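
The list above mentions converting the bitmap search functions to 64bit
accesses. The following is only a minimal userspace sketch of that idea,
not the code from this patch: find_first_set_bit() is a hypothetical name,
and the bitmap is assumed to be an array of 64-bit words with bit 0 of
word 0 being the first bit. Whole words are skipped with a single compare,
and only the first nonzero word is scanned bit by bit.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first set bit in a bitmap of nbits bits,
 * or nbits if no bit is set. Hypothetical helper for illustration. */
size_t find_first_set_bit(const uint64_t *words, size_t nbits)
{
    size_t nwords = (nbits + 63) / 64;

    for (size_t i = 0; i < nwords; i++) {
        uint64_t w = words[i];
        size_t bit = i * 64;

        if (w == 0)
            continue;           /* skip 64 clear bits at once */

        while (!(w & 1)) {      /* locate the lowest set bit in this word */
            w >>= 1;
            bit++;
        }
        /* Bits past nbits in the last word do not count. */
        return bit < nbits ? bit : nbits;
    }
    return nbits;
}
```

For example, with words {0, 0x10} and nbits = 128 the first word is skipped
entirely and the function returns 68.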
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent b1541be9
......@@ -136,17 +136,25 @@ PCI
IOMMU
iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]][,soft]
iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]][,merge]
[,forcesac][,fullflush][,nomerge][,noaperture]
size set size of iommu (in bytes)
noagp don't initialize the AGP driver and use full aperture.
off don't use the IOMMU
leak turn on simple iommu leak tracing (only when CONFIG_IOMMU_LEAK is on)
memaper[=order] allocate an own aperture over RAM with size 32MB^order.
noforce don't force IOMMU usage. Default.
force Force IOMMU
soft Use software bounce buffering for non 32bit IO. Default on Intel
machines.
swiotlb=pages
Prereserve that many 4K pages for the software IO bounce buffering.
force Force IOMMU.
merge Do SG merging. Implies force (experimental)
nomerge Don't do SG merging.
forcesac For SAC mode for masks <40bits (experimental)
fullflush Flush IOMMU on each allocation (default)
nofullflush Don't use IOMMU fullflush
allowed overwrite iommu off workarounds for specific chipsets.
soft Use software bounce buffering (default for Intel machines)
noaperture Don't touch the aperture for AGP.
swiotlb=pages[,force]
pages Prereserve that many 128K pages for the software IO bounce buffering.
force Force all IO through the software TLB.
......@@ -347,6 +347,17 @@ config PCI_MMCONFIG
depends on PCI
select ACPI_BOOT
config UNORDERED_IO
bool "Unordered IO mapping access"
depends on EXPERIMENTAL
select UNORDERED_IO
help
Use unordered stores to access IO memory mappings in device drivers.
Still very experimental. When a driver works on IA64/ppc64/pa-risc it should
work with this option, but it makes the drivers behave differently
from i386. Requires that the driver writer used memory barriers
properly.
source "drivers/pci/Kconfig"
source "drivers/pcmcia/Kconfig"
......
......@@ -77,6 +77,7 @@ boot := arch/x86_64/boot
all: bzImage
BOOTIMAGE := arch/x86_64/boot/bzImage
KBUILD_IMAGE := $(BOOTIMAGE)
bzImage: vmlinux
$(Q)$(MAKE) $(build)=$(boot) $(BOOTIMAGE)
......
......@@ -17,7 +17,6 @@ CONFIG_GENERIC_ISA_DMA=y
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
#
# General setup
......@@ -29,12 +28,13 @@ CONFIG_POSIX_MQUEUE=y
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=18
# CONFIG_HOTPLUG is not set
CONFIG_HOTPLUG=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
......@@ -104,8 +104,8 @@ CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
CONFIG_ACPI_TOSHIBA=y
CONFIG_ACPI_DEBUG=y
# CONFIG_ACPI_TOSHIBA is not set
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
......@@ -115,7 +115,23 @@ CONFIG_ACPI_SYSTEM=y
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_PROC_INTF=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
# CONFIG_CPU_FREQ_24_API is not set
CONFIG_CPU_FREQ_TABLE=y
#
# CPUFreq processor drivers
#
CONFIG_X86_POWERNOW_K8=y
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_ACPI_CPUFREQ_PROC_INTF=y
#
# Bus options (PCI etc.)
......@@ -123,9 +139,26 @@ CONFIG_ACPI_SYSTEM=y
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_UNORDERED_IO=y
CONFIG_PCI_MSI=y
# CONFIG_PCI_LEGACY_PROC is not set
# CONFIG_PCI_NAMES is not set
#
# PCMCIA/CardBus support
#
# CONFIG_PCMCIA is not set
#
# PCI Hotplug Support
#
CONFIG_HOTPLUG_PCI=y
# CONFIG_HOTPLUG_PCI_FAKE is not set
# CONFIG_HOTPLUG_PCI_ACPI is not set
# CONFIG_HOTPLUG_PCI_CPCI is not set
# CONFIG_HOTPLUG_PCI_PCIE is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set
#
# Executable file formats / Emulations
#
......@@ -144,6 +177,9 @@ CONFIG_UID16=y
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
# CONFIG_FW_LOADER is not set
# CONFIG_DEBUG_DRIVER is not set
#
......@@ -171,7 +207,7 @@ CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_CARMEL is not set
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y
......@@ -186,10 +222,10 @@ CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_IDEDISK_STROKE is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
......@@ -234,7 +270,7 @@ CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
CONFIG_BLK_DEV_VIA82CXXX=y
# CONFIG_IDE_ARM is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
......@@ -273,16 +309,24 @@ CONFIG_BLK_DEV_SD=y
# SCSI low-level drivers
#
CONFIG_BLK_DEV_3W_XXXX_RAID=y
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_ADVANSYS is not set
CONFIG_SCSI_AIC79XX=y
CONFIG_AIC79XX_CMDS_PER_DEVICE=32
CONFIG_AIC79XX_RESET_DELAY_MS=2000
# CONFIG_AIC79XX_BUILD_FIRMWARE is not set
# CONFIG_AIC79XX_ENABLE_RD_STRM is not set
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_AIC79XX_REG_PRETTY_PRINT=y
# CONFIG_SCSI_MEGARAID is not set
CONFIG_SCSI_SATA=y
# CONFIG_SCSI_SATA_SVW is not set
CONFIG_SCSI_ATA_PIIX=y
# CONFIG_SCSI_SATA_NV is not set
# CONFIG_SCSI_SATA_PROMISE is not set
# CONFIG_SCSI_SATA_SX4 is not set
# CONFIG_SCSI_SATA_SIL is not set
......@@ -290,7 +334,6 @@ CONFIG_SCSI_ATA_PIIX=y
CONFIG_SCSI_SATA_VIA=y
# CONFIG_SCSI_SATA_VITESSE is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
......@@ -392,6 +435,7 @@ CONFIG_IPV6=y
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set
# CONFIG_NET_CLS_ROUTE is not set
#
# Network testing
......@@ -431,8 +475,7 @@ CONFIG_MII=y
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
CONFIG_AMD8111_ETH=y
# CONFIG_AMD8111E_NAPI is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
CONFIG_FORCEDETH=y
......@@ -442,16 +485,13 @@ CONFIG_FORCEDETH=y
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
CONFIG_8139CP=m
CONFIG_8139TOO=m
# CONFIG_8139TOO_PIO is not set
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_OLD_RX_RESET is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_VELOCITY is not set
#
# Ethernet (1000 Mbit)
......@@ -602,6 +642,9 @@ CONFIG_AGP_AMD64=y
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
CONFIG_RAW_DRIVER=y
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_HPET_MMAP=y
CONFIG_MAX_RAW_DEVS=256
CONFIG_HANGCHECK_TIMER=y
......@@ -610,6 +653,11 @@ CONFIG_HANGCHECK_TIMER=y
#
# CONFIG_I2C is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
......@@ -703,17 +751,8 @@ CONFIG_USB_OHCI_HCD=y
# CONFIG_USB_BLUETOOTH_TTY is not set
# CONFIG_USB_MIDI is not set
# CONFIG_USB_ACM is not set
CONFIG_USB_PRINTER=y
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_DPCM is not set
# CONFIG_USB_STORAGE_HP8200e is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_STORAGE is not set
#
# USB Human Interface Devices (HID)
......@@ -830,7 +869,8 @@ CONFIG_ISO9660_FS=y
#
# DOS/FAT/NT Filesystems
#
# CONFIG_FAT_FS is not set
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_NTFS_FS is not set
#
......@@ -856,6 +896,7 @@ CONFIG_RAMFS=y
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_JFFS2_COMPRESSION_OPTIONS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
......@@ -909,10 +950,11 @@ CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SLAB is not set
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_INIT_DEBUG is not set
CONFIG_INIT_DEBUG=y
# CONFIG_DEBUG_INFO is not set
# CONFIG_FRAME_POINTER is not set
# CONFIG_IOMMU_DEBUG is not set
CONFIG_IOMMU_DEBUG=y
# CONFIG_IOMMU_LEAK is not set
#
# Security options
......@@ -927,5 +969,6 @@ CONFIG_MAGIC_SYSRQ=y
#
# Library routines
#
# CONFIG_CRC_CCITT is not set
CONFIG_CRC32=y
# CONFIG_LIBCRC32C is not set
......@@ -301,6 +301,9 @@ MODULE_AUTHOR("Eric Youngdale, Andi Kleen");
#define elf_addr_t __u32
#undef TASK_SIZE
#define TASK_SIZE 0xffffffff
static void elf32_init(struct pt_regs *);
#include "../../../fs/binfmt_elf.c"
......
......@@ -171,23 +171,8 @@ struct ioctl_trans ioctl_start[] = {
COMPATIBLE_IOCTL(HDIO_SET_KEEPSETTINGS)
COMPATIBLE_IOCTL(HDIO_SCAN_HWIF)
COMPATIBLE_IOCTL(BLKRASET)
COMPATIBLE_IOCTL(BLKFRASET)
COMPATIBLE_IOCTL(0x4B50) /* KDGHWCLK - not in the kernel, but don't complain */
COMPATIBLE_IOCTL(0x4B51) /* KDSHWCLK - not in the kernel, but don't complain */
COMPATIBLE_IOCTL(RTC_AIE_ON)
COMPATIBLE_IOCTL(RTC_AIE_OFF)
COMPATIBLE_IOCTL(RTC_UIE_ON)
COMPATIBLE_IOCTL(RTC_UIE_OFF)
COMPATIBLE_IOCTL(RTC_PIE_ON)
COMPATIBLE_IOCTL(RTC_PIE_OFF)
COMPATIBLE_IOCTL(RTC_WIE_ON)
COMPATIBLE_IOCTL(RTC_WIE_OFF)
COMPATIBLE_IOCTL(RTC_ALM_SET)
COMPATIBLE_IOCTL(RTC_ALM_READ)
COMPATIBLE_IOCTL(RTC_RD_TIME)
COMPATIBLE_IOCTL(RTC_SET_TIME)
COMPATIBLE_IOCTL(RTC_WKALM_SET)
COMPATIBLE_IOCTL(RTC_WKALM_RD)
COMPATIBLE_IOCTL(FIOQSIZE)
/* And these ioctls need translation */
......
......@@ -115,7 +115,8 @@ int ia32_copy_siginfo_from_user(siginfo_t *to, siginfo_t32 __user *from)
}
asmlinkage long
sys32_sigsuspend(int history0, int history1, old_sigset_t mask, struct pt_regs regs)
sys32_sigsuspend(int history0, int history1, old_sigset_t mask,
struct pt_regs *regs)
{
sigset_t saveset;
......@@ -126,11 +127,11 @@ sys32_sigsuspend(int history0, int history1, old_sigset_t mask, struct pt_regs r
recalc_sigpending();
spin_unlock_irq(&current->sighand->siglock);
regs.rax = -EINTR;
regs->rax = -EINTR;
while (1) {
current->state = TASK_INTERRUPTIBLE;
schedule();
if (do_signal(&regs, &saveset))
if (do_signal(regs, &saveset))
return -EINTR;
}
}
......@@ -138,7 +139,7 @@ sys32_sigsuspend(int history0, int history1, old_sigset_t mask, struct pt_regs r
asmlinkage long
sys32_sigaltstack(const stack_ia32_t __user *uss_ptr,
stack_ia32_t __user *uoss_ptr,
struct pt_regs regs)
struct pt_regs *regs)
{
stack_t uss,uoss;
int ret;
......@@ -155,7 +156,7 @@ sys32_sigaltstack(const stack_ia32_t __user *uss_ptr,
}
seg = get_fs();
set_fs(KERNEL_DS);
ret = do_sigaltstack(uss_ptr ? &uss : NULL, &uoss, regs.rsp);
ret = do_sigaltstack(uss_ptr ? &uss : NULL, &uoss, regs->rsp);
set_fs(seg);
if (ret >= 0 && uoss_ptr) {
if (!access_ok(VERIFY_WRITE,uoss_ptr,sizeof(stack_ia32_t)) ||
......@@ -274,9 +275,9 @@ ia32_restore_sigcontext(struct pt_regs *regs, struct sigcontext_ia32 __user *sc,
return 1;
}
asmlinkage long sys32_sigreturn(struct pt_regs regs)
asmlinkage long sys32_sigreturn(struct pt_regs *regs)
{
struct sigframe __user *frame = (struct sigframe __user *)(regs.rsp-8);
struct sigframe __user *frame = (struct sigframe __user *)(regs->rsp-8);
sigset_t set;
unsigned int eax;
......@@ -294,20 +295,23 @@ asmlinkage long sys32_sigreturn(struct pt_regs regs)
recalc_sigpending();
spin_unlock_irq(&current->sighand->siglock);
if (ia32_restore_sigcontext(&regs, &frame->sc, &eax))
if (ia32_restore_sigcontext(regs, &frame->sc, &eax))
goto badframe;
return eax;
badframe:
signal_fault(&regs, frame, "32bit sigreturn");
signal_fault(regs, frame, "32bit sigreturn");
return 0;
}
asmlinkage long sys32_rt_sigreturn(struct pt_regs regs)
asmlinkage long sys32_rt_sigreturn(struct pt_regs *regs)
{
struct rt_sigframe __user *frame = (struct rt_sigframe __user *)(regs.rsp - 4);
struct rt_sigframe __user *frame;
sigset_t set;
unsigned int eax;
struct pt_regs tregs;
frame = (struct rt_sigframe __user *)(regs->rsp - 4);
if (verify_area(VERIFY_READ, frame, sizeof(*frame)))
goto badframe;
......@@ -320,16 +324,17 @@ asmlinkage long sys32_rt_sigreturn(struct pt_regs regs)
recalc_sigpending();
spin_unlock_irq(&current->sighand->siglock);
if (ia32_restore_sigcontext(&regs, &frame->uc.uc_mcontext, &eax))
if (ia32_restore_sigcontext(regs, &frame->uc.uc_mcontext, &eax))
goto badframe;
if (sys32_sigaltstack(&frame->uc.uc_stack, NULL, regs) == -EFAULT)
tregs = *regs;
if (sys32_sigaltstack(&frame->uc.uc_stack, NULL, &tregs) == -EFAULT)
goto badframe;
return eax;
badframe:
signal_fault(&regs,frame,"32bit rt sigreturn");
signal_fault(regs,frame,"32bit rt sigreturn");
return 0;
}
......
......@@ -270,35 +270,32 @@ quiet_ni_syscall:
ret
CFI_ENDPROC
.macro PTREGSCALL label, func
.macro PTREGSCALL label, func, arg
.globl \label
\label:
leaq \func(%rip),%rax
leaq -ARGOFFSET+8(%rsp),\arg /* 8 for return address */
jmp ia32_ptregs_common
.endm
PTREGSCALL stub32_rt_sigreturn, sys32_rt_sigreturn
PTREGSCALL stub32_sigreturn, sys32_sigreturn
PTREGSCALL stub32_sigaltstack, sys32_sigaltstack
PTREGSCALL stub32_sigsuspend, sys32_sigsuspend
PTREGSCALL stub32_execve, sys32_execve
PTREGSCALL stub32_fork, sys_fork
PTREGSCALL stub32_clone, sys32_clone
PTREGSCALL stub32_vfork, sys_vfork
PTREGSCALL stub32_iopl, sys_iopl
PTREGSCALL stub32_rt_sigsuspend, sys_rt_sigsuspend
PTREGSCALL stub32_rt_sigreturn, sys32_rt_sigreturn, %rdi
PTREGSCALL stub32_sigreturn, sys32_sigreturn, %rdi
PTREGSCALL stub32_sigaltstack, sys32_sigaltstack, %rdx
PTREGSCALL stub32_sigsuspend, sys32_sigsuspend, %rcx
PTREGSCALL stub32_execve, sys32_execve, %rcx
PTREGSCALL stub32_fork, sys_fork, %rdi
PTREGSCALL stub32_clone, sys32_clone, %rdx
PTREGSCALL stub32_vfork, sys_vfork, %rdi
PTREGSCALL stub32_iopl, sys_iopl, %rsi
PTREGSCALL stub32_rt_sigsuspend, sys_rt_sigsuspend, %rdx
ENTRY(ia32_ptregs_common)
CFI_STARTPROC
popq %r11
SAVE_REST
movq %r11, %r15
call *%rax
movq %r15, %r11
RESTORE_REST
leaq ia32_sysret(%rip),%r11
pushq %r11
ret
jmp ia32_sysret /* misbalances the return cache */
CFI_ENDPROC
.data
......@@ -332,7 +329,7 @@ ia32_sys_call_table:
.quad sys_getuid16
.quad sys_stime /* stime */ /* 25 */
.quad sys32_ptrace /* ptrace */
.quad sys_alarm /* XXX sign extension??? */
.quad sys_alarm
.quad sys_fstat /* (old)fstat */
.quad sys_pause
.quad compat_sys_utime /* 30 */
......@@ -558,7 +555,7 @@ ia32_sys_call_table:
.quad sys_fadvise64 /* 250 */
.quad quiet_ni_syscall /* free_huge_pages */
.quad sys_exit_group
.quad sys_lookup_dcookie
.quad sys32_lookup_dcookie
.quad sys_epoll_create
.quad sys_epoll_ctl /* 255 */
.quad sys_epoll_wait
......
......@@ -249,8 +249,8 @@ asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data)
case PTRACE_GETFPREGS:
case PTRACE_SETFPXREGS:
case PTRACE_GETFPXREGS:
case PTRACE_GETEVENTMSG:
break;
}
child = find_target(request, pid, &ret);
......@@ -363,6 +363,10 @@ asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data)
break;
}
case PTRACE_GETEVENTMSG:
ret = put_user(child->ptrace_message,(unsigned int __user *)(u64)data);
break;
default:
ret = -EINVAL;
break;
......
......@@ -1125,7 +1125,7 @@ long sys32_ustat(unsigned dev, struct ustat32 __user *u32p)
}
asmlinkage long sys32_execve(char __user *name, compat_uptr_t __user *argv,
compat_uptr_t __user *envp, struct pt_regs regs)
compat_uptr_t __user *envp, struct pt_regs *regs)
{
long error;
char * filename;
......@@ -1134,20 +1134,21 @@ asmlinkage long sys32_execve(char __user *name, compat_uptr_t __user *argv,
error = PTR_ERR(filename);
if (IS_ERR(filename))
return error;
error = compat_do_execve(filename, argv, envp, &regs);
error = compat_do_execve(filename, argv, envp, regs);
if (error == 0)
current->ptrace &= ~PT_DTRACE;
putname(filename);
return error;
}
asmlinkage long sys32_clone(unsigned int clone_flags, unsigned int newsp, struct pt_regs regs)
asmlinkage long sys32_clone(unsigned int clone_flags, unsigned int newsp,
struct pt_regs *regs)
{
void __user *parent_tid = (void __user *)regs.rdx;
void __user *child_tid = (void __user *)regs.rdi;
void __user *parent_tid = (void __user *)regs->rdx;
void __user *child_tid = (void __user *)regs->rdi;
if (!newsp)
newsp = regs.rsp;
return do_fork(clone_flags & ~CLONE_IDLETASK, newsp, &regs, 0,
newsp = regs->rsp;
return do_fork(clone_flags & ~CLONE_IDLETASK, newsp, regs, 0,
parent_tid, child_tid);
}
......@@ -1337,6 +1338,12 @@ long sys32_quotactl(void)
return -ENOSYS;
}
long sys32_lookup_dcookie(u32 addr_low, u32 addr_high,
char __user * buf, size_t len)
{
return sys_lookup_dcookie(((u64)addr_high << 32) | addr_low, buf, len);
}
cond_syscall(sys32_ipc)
static int __init ia32_init (void)
......
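
The sys32_lookup_dcookie() wrapper in the hunk above rebuilds a 64-bit
cookie from the two 32-bit halves that a 32-bit caller passes. A standalone
sketch of the same arithmetic follows; join_u32_halves() and
split_u64_halves() are hypothetical names used only for this illustration.
The cast to a 64-bit type has to happen before the shift, since shifting a
32-bit value by 32 is undefined in C.

```c
#include <stdint.h>

/* Combine two 32-bit halves into one 64-bit value.
 * The high half is widened first so the shift is done in 64 bits. */
uint64_t join_u32_halves(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Inverse operation, as a 32-bit caller would have to do it. */
void split_u64_halves(uint64_t value, uint32_t *low, uint32_t *high)
{
    *low = (uint32_t)(value & 0xffffffffu);
    *high = (uint32_t)(value >> 32);
}
```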
#
# Makefile for the linux kernel.
#
extra-y := head.o head64.o init_task.o vmlinux.lds
EXTRA_AFLAGS := -traditional
obj-y := process.o semaphore.o signal.o entry.o traps.o irq.o \
ptrace.o i8259.o ioport.o ldt.o setup.o time.o sys_x86_64.o \
x8664_ksyms.o i387.o syscall.o vsyscall.o \
setup64.o bootflag.o e820.o reboot.o warmreboot.o
obj-y += mce.o
obj-$(CONFIG_MTRR) += ../../i386/kernel/cpu/mtrr/
obj-$(CONFIG_ACPI_BOOT) += acpi/
obj-$(CONFIG_X86_MSR) += msr.o
obj-$(CONFIG_MICROCODE) += microcode.o
obj-$(CONFIG_X86_CPUID) += cpuid.o
obj-$(CONFIG_SMP) += smp.o smpboot.o trampoline.o
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o
obj-$(CONFIG_X86_IO_APIC) += io_apic.o mpparse.o
obj-$(CONFIG_PM) += suspend.o
obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend_asm.o
obj-$(CONFIG_CPU_FREQ) += cpufreq/
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-$(CONFIG_GART_IOMMU) += pci-gart.o aperture.o
obj-$(CONFIG_DUMMY_IOMMU) += pci-nommu.o pci-dma.o
obj-$(CONFIG_SWIOTLB) += swiotlb.o
obj-$(CONFIG_MODULES) += module.o
obj-y += topology.o
bootflag-y += ../../i386/kernel/bootflag.o
cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
topology-y += ../../i386/mach-default/topology.o
swiotlb-$(CONFIG_SWIOTLB) += ../../ia64/lib/swiotlb.o
microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
......@@ -31,6 +31,8 @@ int iommu_aperture_allowed __initdata = 0;
int fallback_aper_order __initdata = 1; /* 64MB */
int fallback_aper_force __initdata = 0;
int fix_aperture __initdata = 1;
/* This code runs before the PCI subsystem is initialized, so just
access the northbridge directly. */
......@@ -202,7 +204,7 @@ void __init iommu_hole_init(void)
u64 aper_base;
int valid_agp = 0;
if (iommu_aperture_disabled)
if (iommu_aperture_disabled || !fix_aperture)
return;
printk("Checking aperture...\n");
......@@ -241,20 +243,15 @@ void __init iommu_hole_init(void)
/* Got the aperture from the AGP bridge */
} else if ((!no_iommu && end_pfn >= 0xffffffff>>PAGE_SHIFT) ||
force_iommu ||
valid_agp ||
valid_agp ||
fallback_aper_force) {
/* When there is a AGP bridge in the system assume the
user wants to use the AGP driver too and needs an
aperture. However this case (AGP but no good
aperture) should only happen with a more broken than
usual BIOS, because it would even break Windows. */
printk("Your BIOS doesn't leave a aperture memory hole\n");
printk("Please enable the IOMMU option in the BIOS setup\n");
printk("This costs you %d MB of RAM\n", 32 << fallback_aper_order);
printk("Your BIOS doesn't leave a aperture memory hole\n");
printk("Please enable the IOMMU option in the BIOS setup\n");
printk("This costs you %d MB of RAM\n",
32 << fallback_aper_order);
aper_order = fallback_aper_order;
aper_alloc = allocate_aperture();
aper_alloc = allocate_aperture();
if (!aper_alloc) {
/* Could disable AGP and IOMMU here, but it's probably
not worth it. But the later users cannot deal with
......
......@@ -46,4 +46,53 @@ config X86_POWERNOW_K8_ACPI
depends on ((X86_POWERNOW_K8 = "m" && ACPI_PROCESSOR) || (X86_POWERNOW_K8 = "y" && ACPI_PROCESSOR = "y"))
default y
config X86_SPEEDSTEP_CENTRINO
tristate "Intel Enhanced SpeedStep"
depends on CPU_FREQ_TABLE
help
This adds the CPUFreq driver for Enhanced SpeedStep enabled
mobile CPUs. This means Intel Pentium M (Centrino) CPUs
or 64bit enabled Intel Xeons.
For details, take a look at <file:Documentation/cpu-freq/>.
If in doubt, say N.
config X86_SPEEDSTEP_CENTRINO_TABLE
bool
depends on X86_SPEEDSTEP_CENTRINO
default y
config X86_SPEEDSTEP_CENTRINO_ACPI
bool "Use ACPI tables to decode valid frequency/voltage pairs (EXPERIMENTAL)"
depends on EXPERIMENTAL
depends on ((X86_SPEEDSTEP_CENTRINO = "m" && ACPI_PROCESSOR) || (X86_SPEEDSTEP_CENTRINO = "y" && ACPI_PROCESSOR = "y"))
help
Use primarily the information provided in the BIOS ACPI tables
to determine valid CPU frequency and voltage pairings.
If in doubt, say Y.
config X86_ACPI_CPUFREQ
tristate "ACPI Processor P-States driver"
depends on CPU_FREQ_TABLE && ACPI_PROCESSOR
help
This driver adds a CPUFreq driver which utilizes the ACPI
Processor Performance States.
For details, take a look at <file:Documentation/cpu-freq/>.
If in doubt, say N.
config X86_ACPI_CPUFREQ_PROC_INTF
bool "/proc/acpi/processor/../performance interface (deprecated)"
depends on X86_ACPI_CPUFREQ && PROC_FS
help
This enables the deprecated /proc/acpi/processor/../performance
interface. While it is helpful for debugging, the generic,
cross-architecture cpufreq interfaces should be used.
If in doubt, say N.
endmenu
......@@ -2,6 +2,12 @@
# Reuse the i386 cpufreq drivers
#
SRCDIR := ../../../i386/kernel/cpu/cpufreq
obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o
obj-$(CONFIG_X86_SPEEDSTEP_CENTRINO) += speedstep-centrino.o
obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi.o
powernow-k8-objs := ../../../i386/kernel/cpu/cpufreq/powernow-k8.o
powernow-k8-objs := ${SRCDIR}/powernow-k8.o
speedstep-centrino-objs := ${SRCDIR}/speedstep-centrino.o
acpi-objs := ${SRCDIR}/acpi.o
......@@ -99,18 +99,17 @@ static void early_serial_write(struct console *con, const char *s, unsigned n)
#define DEFAULT_BAUD 9600
static __init void early_serial_init(char *opt)
static __init void early_serial_init(char *s)
{
unsigned char c;
unsigned divisor;
unsigned baud = DEFAULT_BAUD;
char *s, *e;
char *e;
if (*opt == ',')
++opt;
if (*s == ',')
++s;
s = strsep(&opt, ",");
if (s != NULL) {
if (*s) {
unsigned port;
if (!strncmp(s,"0x",2)) {
early_serial_base = simple_strtoul(s, &e, 16);
......@@ -124,6 +123,9 @@ static __init void early_serial_init(char *opt)
port = 0;
early_serial_base = bases[port];
}
s += strcspn(s, ",");
if (*s == ',')
s++;
}
outb(0x3, early_serial_base + LCR); /* 8n1 */
......@@ -131,8 +133,7 @@ static __init void early_serial_init(char *opt)
outb(0, early_serial_base + FCR); /* no fifo */
outb(0x3, early_serial_base + MCR); /* DTR + RTS */
s = strsep(&opt, ",");
if (s != NULL) {
if (*s) {
baud = simple_strtoul(s, &e, 0);
if (baud == 0 || s == e)
baud = DEFAULT_BAUD;
......
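
The hunks above replace strsep() with strcspn() when walking the
comma-separated early serial options. A minimal userspace sketch of that
parsing style is given below. It is not the kernel function:
parse_serial_opts() is a hypothetical name, only a hexadecimal base address
is recognized in the first field, and the default base of 0x3f8 is an
assumption made for the example. Unlike strsep(), this approach never
writes to the option string.

```c
#include <stdlib.h>
#include <string.h>

#define DEFAULT_BASE 0x3f8UL  /* assumed default for this sketch */
#define DEFAULT_BAUD 9600UL

/* Parse "[,]<0xbase>[,<baud>]" without modifying the input string.
 * Fields that are missing or unparsable keep their default values. */
void parse_serial_opts(const char *s, unsigned long *base,
                       unsigned long *baud)
{
    char *e;

    *base = DEFAULT_BASE;
    *baud = DEFAULT_BAUD;

    if (*s == ',')
        ++s;

    if (*s) {
        if (!strncmp(s, "0x", 2)) {
            unsigned long v = strtoul(s, &e, 16);

            if (e != s)
                *base = v;
        }
        /* Advance to the next field: skip to the comma, then past it. */
        s += strcspn(s, ",");
        if (*s == ',')
            s++;
    }

    if (*s) {
        unsigned long v = strtoul(s, &e, 0);

        if (v != 0 && e != s)
            *baud = v;
    }
}
```

Because strcspn() returns the length of the field rather than cutting the
string, an empty or absent field simply leaves s pointing at the terminating
NUL, and the *s tests fall through to the defaults.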
......@@ -324,19 +324,20 @@ int_restore_rest:
* Certain special system calls that need to save a complete full stack frame.
*/
.macro PTREGSCALL label,func
.macro PTREGSCALL label,func,arg
.globl \label
\label:
leaq \func(%rip),%rax
leaq -ARGOFFSET+8(%rsp),\arg /* 8 for return address */
jmp ptregscall_common
.endm
PTREGSCALL stub_clone, sys_clone
PTREGSCALL stub_fork, sys_fork
PTREGSCALL stub_vfork, sys_vfork
PTREGSCALL stub_rt_sigsuspend, sys_rt_sigsuspend
PTREGSCALL stub_sigaltstack, sys_sigaltstack
PTREGSCALL stub_iopl, sys_iopl
PTREGSCALL stub_clone, sys_clone, %r8
PTREGSCALL stub_fork, sys_fork, %rdi
PTREGSCALL stub_vfork, sys_vfork, %rdi
PTREGSCALL stub_rt_sigsuspend, sys_rt_sigsuspend, %rdx
PTREGSCALL stub_sigaltstack, sys_sigaltstack, %rdx
PTREGSCALL stub_iopl, sys_iopl, %rsi
ENTRY(ptregscall_common)
CFI_STARTPROC
......@@ -386,6 +387,7 @@ ENTRY(stub_rt_sigreturn)
CFI_STARTPROC
addq $8, %rsp
SAVE_REST
movq %rsp,%rdi
FIXUP_TOP_OF_STACK %r11
call sys_rt_sigreturn
movq %rax,RAX(%rsp) # fixme, this could be done at the higher layer
......
......@@ -83,9 +83,9 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
* code.
*/
asmlinkage long sys_iopl(unsigned int level, struct pt_regs regs)
asmlinkage long sys_iopl(unsigned int level, struct pt_regs *regs)
{
unsigned int old = (regs.eflags >> 12) & 3;
unsigned int old = (regs->eflags >> 12) & 3;
if (level > 3)
return -EINVAL;
......@@ -94,6 +94,6 @@ asmlinkage long sys_iopl(unsigned int level, struct pt_regs regs)
if (!capable(CAP_SYS_RAWIO))
return -EPERM;
}
regs.eflags = (regs.eflags &~ 0x3000UL) | (level << 12);
regs->eflags = (regs->eflags &~ 0x3000UL) | (level << 12);
return 0;
}
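
The sys_iopl() change above is one instance of the "pass pt_regs as pointer"
item in the commit message. In plain C, a struct passed by value is a private
copy, so a compiler may legitimately discard stores to it. The sketch below
shows that difference in userspace; struct fake_regs and all function names
are hypothetical, and only the eflags arithmetic is taken from the hunk.

```c
/* Hypothetical stand-in for the part of pt_regs used by sys_iopl(). */
struct fake_regs {
    unsigned long eflags;
};

/* By value: the update lands in the callee's copy and is lost. */
static void set_iopl_by_value(unsigned int level, struct fake_regs regs)
{
    regs.eflags = (regs.eflags & ~0x3000UL) | ((unsigned long)level << 12);
    (void)regs;
}

/* By pointer: the update is made to the caller's object. */
static void set_iopl_by_pointer(unsigned int level, struct fake_regs *regs)
{
    regs->eflags = (regs->eflags & ~0x3000UL) | ((unsigned long)level << 12);
}

/* Return the caller-visible eflags after each style of call. */
unsigned long eflags_after_by_value(unsigned long eflags, unsigned int level)
{
    struct fake_regs r = { eflags };

    set_iopl_by_value(level, r);
    return r.eflags;
}

unsigned long eflags_after_by_pointer(unsigned long eflags, unsigned int level)
{
    struct fake_regs r = { eflags };

    set_iopl_by_pointer(level, &r);
    return r.eflags;
}
```

Starting from eflags 0x202 and level 3, the by-value call leaves the caller
with 0x202, while the by-pointer call yields 0x3202.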
......@@ -24,7 +24,8 @@
#define MISC_MCELOG_MINOR 227
#define NR_BANKS 5
static int mce_disabled __initdata;
static int mce_dont_init;
/* 0: always panic, 1: panic if deadlock possible, 2: try to avoid panic,
3: never panic or exit (for testing only) */
static int tolerant = 1;
......@@ -113,9 +114,8 @@ static void mce_panic(char *msg, struct mce *backup, unsigned long start)
static int mce_available(struct cpuinfo_x86 *c)
{
return !mce_disabled &&
test_bit(X86_FEATURE_MCE, &c->x86_capability) &&
test_bit(X86_FEATURE_MCA, &c->x86_capability);
return test_bit(X86_FEATURE_MCE, &c->x86_capability) &&
test_bit(X86_FEATURE_MCA, &c->x86_capability);
}
/*
......@@ -127,8 +127,9 @@ void do_machine_check(struct pt_regs * regs, long error_code)
struct mce m, panicm;
int nowayout = (tolerant < 1);
int kill_it = 0;
u64 mcestart;
u64 mcestart = 0;
int i;
int panicm_found = 0;
if (regs)
notify_die(DIE_NMI, "machine check", regs, error_code, 255, SIGKILL);
......@@ -138,17 +139,11 @@ void do_machine_check(struct pt_regs * regs, long error_code)
memset(&m, 0, sizeof(struct mce));
m.cpu = hard_smp_processor_id();
rdmsrl(MSR_IA32_MCG_STATUS, m.mcgstatus);
if (!regs && (m.mcgstatus & MCG_STATUS_MCIP))
return;
if (!(m.mcgstatus & MCG_STATUS_RIPV))
kill_it = 1;
if (regs) {
m.rip = regs->rip;
m.cs = regs->cs;
}
rdtscll(mcestart);
mb();
barrier();
for (i = 0; i < banks; i++) {
if (!bank[i])
......@@ -156,52 +151,62 @@ void do_machine_check(struct pt_regs * regs, long error_code)
m.misc = 0;
m.addr = 0;
m.bank = i;
m.tsc = 0;
rdmsrl(MSR_IA32_MC0_STATUS + i*4, m.status);
if ((m.status & MCI_STATUS_VAL) == 0)
continue;
/* Should be implied by the banks check above, but
check it anyways */
if ((m.status & MCI_STATUS_EN) == 0)
continue;
/* Did this bank cause the exception? */
/* Assume that the bank with uncorrectable errors did it,
and that there is only a single one. */
if (m.status & MCI_STATUS_UC) {
panicm = m;
} else {
m.rip = 0;
m.cs = 0;
if (m.status & MCI_STATUS_EN) {
/* In theory _OVER could be a nowayout too, but
assume any overflowed errors were not fatal. */
nowayout |= !!(m.status & MCI_STATUS_PCC);
kill_it |= !!(m.status & MCI_STATUS_UC);
}
/* In theory _OVER could be a nowayout too, but
assume any overflowed errors were not fatal. */
nowayout |= !!(m.status & MCI_STATUS_PCC);
kill_it |= !!(m.status & MCI_STATUS_UC);
m.bank = i;
if (m.status & MCI_STATUS_MISCV)
rdmsrl(MSR_IA32_MC0_MISC + i*4, m.misc);
if (m.status & MCI_STATUS_ADDRV)
rdmsrl(MSR_IA32_MC0_ADDR + i*4, m.addr);
rdtscll(m.tsc);
if (regs && (m.mcgstatus & MCG_STATUS_RIPV)) {
m.rip = regs->rip;
m.cs = regs->cs;
} else {
m.rip = 0;
m.cs = 0;
}
if (error_code != -1)
rdtscll(m.tsc);
wrmsrl(MSR_IA32_MC0_STATUS + i*4, 0);
mce_log(&m);
/* Did this bank cause the exception? */
/* Assume that the bank with uncorrectable errors did it,
and that there is only a single one. */
if ((m.status & MCI_STATUS_UC) && (m.status & MCI_STATUS_EN)) {
panicm = m;
panicm_found = 1;
}
}
wrmsrl(MSR_IA32_MCG_STATUS, 0);
/* Never do anything final in the polling timer */
if (!regs)
return;
goto out;
/* If we didn't find an uncorrectable error, pick
the last one (shouldn't happen, just being safe). */
if (!panicm_found)
panicm = m;
if (nowayout)
mce_panic("Machine check", &m, mcestart);
mce_panic("Machine check", &panicm, mcestart);
if (kill_it) {
int user_space = 0;
if (m.mcgstatus & MCG_STATUS_RIPV)
user_space = m.rip && (m.cs & 3);
user_space = panicm.rip && (panicm.cs & 3);
/* When the machine was in user space and the CPU didn't get
confused it's normally not necessary to panic, unless you
......@@ -214,18 +219,15 @@ void do_machine_check(struct pt_regs * regs, long error_code)
(unsigned)current->pid <= 1)
mce_panic("Uncorrected machine check", &panicm, mcestart);
/* do_exit takes an awful lot of locks and has a slight risk
of deadlocking. If you don't want that don't set tolerant >= 2 */
/* do_exit takes an awful lot of locks and has a
slight risk of deadlocking. If you don't want that
don't set tolerant >= 2 */
if (tolerant < 3)
do_exit(SIGBUS);
}
}
static void mce_clear_all(void)
{
int i;
for (i = 0; i < banks; i++)
wrmsrl(MSR_IA32_MC0_STATUS + i*4, 0);
out:
/* Last thing done in the machine check exception to clear state. */
wrmsrl(MSR_IA32_MCG_STATUS, 0);
}
......@@ -268,22 +270,25 @@ static void mce_init(void *dummy)
int i;
rdmsrl(MSR_IA32_MCG_CAP, cap);
if (cap & MCG_CTL_P)
wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
banks = cap & 0xff;
if (banks > NR_BANKS) {
printk(KERN_INFO "MCE: warning: using only %d banks\n", banks);
banks = NR_BANKS;
}
mce_clear_all();
/* Log the machine checks left over from the previous reset.
This also clears all registers */
do_machine_check(NULL, -1);
set_in_cr4(X86_CR4_MCE);
if (cap & MCG_CTL_P)
wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
for (i = 0; i < banks; i++) {
wrmsrl(MSR_IA32_MC0_CTL+4*i, bank[i]);
wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
}
set_in_cr4(X86_CR4_MCE);
}
/* Add per CPU specific workarounds here */
......@@ -307,7 +312,9 @@ void __init mcheck_init(struct cpuinfo_x86 *c)
mce_cpu_quirks(c);
if (test_and_set_bit(smp_processor_id(), &mce_cpus) || !mce_available(c))
if (mce_dont_init ||
test_and_set_bit(smp_processor_id(), &mce_cpus) ||
!mce_available(c))
return;
mce_init(NULL);
......@@ -410,15 +417,16 @@ static struct miscdevice mce_log_device = {
static int __init mcheck_disable(char *str)
{
mce_disabled = 1;
mce_dont_init = 1;
return 0;
}
/* mce=off disable machine check */
/* mce=off disables machine check. Note you can reenable it later
using sysfs */
static int __init mcheck_enable(char *str)
{
if (!strcmp(str, "off"))
mce_disabled = 1;
mce_dont_init = 1;
else
printk("mce= argument %s ignored. Please use /sys", str);
return 0;
......@@ -434,7 +442,6 @@ __setup("mce", mcheck_enable);
/* On resume clear all MCE state. Don't want to see leftovers from the BIOS. */
static int mce_resume(struct sys_device *dev)
{
mce_clear_all();
on_each_cpu(mce_init, NULL, 1, 1);
return 0;
}
......@@ -492,7 +499,7 @@ static __init int mce_init_device(void)
if (!err)
err = sysdev_register(&device_mce);
if (!err) {
/* could create per CPU objects, but is not worth it. */
/* could create per CPU objects, but it is not worth it. */
sysdev_create_file(&device_mce, &attr_bank0ctl);
sysdev_create_file(&device_mce, &attr_bank1ctl);
sysdev_create_file(&device_mce, &attr_bank2ctl);
......
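The bank scan in the reworked do_machine_check() can be modelled in user space. The sketch below is an illustration only, based on how the new side of the hunk reads: `scan_banks`, `struct mce_result` and the sample status words are hypothetical names, and the status bit positions are the architectural IA32_MCi_STATUS bits, restated here as an assumption. It leaves out the MSR reads, the logging itself and the RIPV handling.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed IA32_MCi_STATUS bit positions (architectural, not taken from the diff). */
#define MCI_STATUS_VAL (1ULL << 63)  /* bank holds a valid error */
#define MCI_STATUS_UC  (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN  (1ULL << 60)  /* error reporting was enabled */
#define MCI_STATUS_PCC (1ULL << 57)  /* processor context corrupt */

/* Hypothetical result record for the model. */
struct mce_result {
    int nowayout;    /* must panic */
    int kill_it;     /* current task should be killed */
    int panic_bank;  /* bank blamed for the exception, -1 if none found */
    int logged;      /* number of banks that would be passed to mce_log() */
};

/* Model of the per-bank loop: every valid bank is logged, but only
 * enabled banks can escalate to kill or panic. */
static struct mce_result scan_banks(const uint64_t *status, int banks, int tolerant)
{
    struct mce_result r = { tolerant < 1, 0, -1, 0 };
    int i;

    assert(status != 0 && banks >= 0);
    for (i = 0; i < banks; i++) {
        uint64_t s = status[i];

        if (!(s & MCI_STATUS_VAL))
            continue;
        if (s & MCI_STATUS_EN) {
            r.nowayout |= !!(s & MCI_STATUS_PCC);
            r.kill_it |= !!(s & MCI_STATUS_UC);
        }
        r.logged++;
        /* Assume the enabled bank with an uncorrected error caused it. */
        if ((s & MCI_STATUS_UC) && (s & MCI_STATUS_EN))
            r.panic_bank = i;
    }
    return r;
}
```

In the kernel code, a scan that finds no such bank falls back to the last record read; the model reports -1 instead so that the case stays visible.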
......@@ -46,234 +46,229 @@
static inline int wrmsr_eio(u32 reg, u32 eax, u32 edx)
{
int err;
asm volatile(
"1: wrmsr\n"
"2:\n"
".section .fixup,\"ax\"\n"
"3: movl %4,%0\n"
" jmp 2b\n"
".previous\n"
".section __ex_table,\"a\"\n"
" .align 8\n"
" .quad 1b,3b\n"
".previous"
: "=&bDS" (err)
: "a" (eax), "d" (edx), "c" (reg), "i" (-EIO), "0" (0));
return err;
int err;
asm volatile ("1: wrmsr\n"
"2:\n"
".section .fixup,\"ax\"\n"
"3: movl %4,%0\n"
" jmp 2b\n"
".previous\n"
".section __ex_table,\"a\"\n"
" .align 8\n" " .quad 1b,3b\n" ".previous":"=&bDS" (err)
:"a"(eax), "d"(edx), "c"(reg), "i"(-EIO), "0"(0));
return err;
}
static inline int rdmsr_eio(u32 reg, u32 *eax, u32 *edx)
{
int err;
asm volatile(
"1: rdmsr\n"
"2:\n"
".section .fixup,\"ax\"\n"
"3: movl %4,%0\n"
" jmp 2b\n"
".previous\n"
".section __ex_table,\"a\"\n"
" .align 8\n"
" .quad 1b,3b\n"
".previous"
: "=&bDS" (err), "=a" (*eax), "=d" (*edx)
: "c" (reg), "i" (-EIO), "0" (0));
return err;
int err;
asm volatile ("1: rdmsr\n"
"2:\n"
".section .fixup,\"ax\"\n"
"3: movl %4,%0\n"
" jmp 2b\n"
".previous\n"
".section __ex_table,\"a\"\n"
" .align 8\n"
" .quad 1b,3b\n"
".previous":"=&bDS" (err), "=a"(*eax), "=d"(*edx)
:"c"(reg), "i"(-EIO), "0"(0));
return err;
}
#ifdef CONFIG_SMP
struct msr_command {
int cpu;
int err;
u32 reg;
u32 data[2];
int cpu;
int err;
u32 reg;
u32 data[2];
};
static void msr_smp_wrmsr(void *cmd_block)
{
struct msr_command *cmd = (struct msr_command *) cmd_block;
if ( cmd->cpu == smp_processor_id() )
cmd->err = wrmsr_eio(cmd->reg, cmd->data[0], cmd->data[1]);
struct msr_command *cmd = (struct msr_command *)cmd_block;
if (cmd->cpu == smp_processor_id())
cmd->err = wrmsr_eio(cmd->reg, cmd->data[0], cmd->data[1]);
}
static void msr_smp_rdmsr(void *cmd_block)
{
struct msr_command *cmd = (struct msr_command *) cmd_block;
if ( cmd->cpu == smp_processor_id() )
cmd->err = rdmsr_eio(cmd->reg, &cmd->data[0], &cmd->data[1]);
struct msr_command *cmd = (struct msr_command *)cmd_block;
if (cmd->cpu == smp_processor_id())
cmd->err = rdmsr_eio(cmd->reg, &cmd->data[0], &cmd->data[1]);
}
static inline int do_wrmsr(int cpu, u32 reg, u32 eax, u32 edx)
{
struct msr_command cmd;
int ret;
preempt_disable();
if ( cpu == smp_processor_id() ) {
ret = wrmsr_eio(reg, eax, edx);
} else {
cmd.cpu = cpu;
cmd.reg = reg;
cmd.data[0] = eax;
cmd.data[1] = edx;
smp_call_function(msr_smp_wrmsr, &cmd, 1, 1);
ret = cmd.err;
}
preempt_enable();
return ret;
struct msr_command cmd;
int ret;
preempt_disable();
if (cpu == smp_processor_id()) {
ret = wrmsr_eio(reg, eax, edx);
} else {
cmd.cpu = cpu;
cmd.reg = reg;
cmd.data[0] = eax;
cmd.data[1] = edx;
smp_call_function(msr_smp_wrmsr, &cmd, 1, 1);
ret = cmd.err;
}
preempt_enable();
return ret;
}
static inline int do_rdmsr(int cpu, u32 reg, u32 *eax, u32 *edx)
static inline int do_rdmsr(int cpu, u32 reg, u32 * eax, u32 * edx)
{
struct msr_command cmd;
int ret;
preempt_disable();
if ( cpu == smp_processor_id() ) {
ret = rdmsr_eio(reg, eax, edx);
} else {
cmd.cpu = cpu;
cmd.reg = reg;
smp_call_function(msr_smp_rdmsr, &cmd, 1, 1);
*eax = cmd.data[0];
*edx = cmd.data[1];
ret = cmd.err;
}
preempt_enable();
return ret;
struct msr_command cmd;
int ret;
preempt_disable();
if (cpu == smp_processor_id()) {
ret = rdmsr_eio(reg, eax, edx);
} else {
cmd.cpu = cpu;
cmd.reg = reg;
smp_call_function(msr_smp_rdmsr, &cmd, 1, 1);
*eax = cmd.data[0];
*edx = cmd.data[1];
ret = cmd.err;
}
preempt_enable();
return ret;
}
#else /* ! CONFIG_SMP */
#else /* ! CONFIG_SMP */
static inline int do_wrmsr(int cpu, u32 reg, u32 eax, u32 edx)
{
return wrmsr_eio(reg, eax, edx);
return wrmsr_eio(reg, eax, edx);
}
static inline int do_rdmsr(int cpu, u32 reg, u32 *eax, u32 *edx)
{
return rdmsr_eio(reg, eax, edx);
return rdmsr_eio(reg, eax, edx);
}
#endif /* ! CONFIG_SMP */
#endif /* ! CONFIG_SMP */
static loff_t msr_seek(struct file *file, loff_t offset, int orig)
{
loff_t ret = -EINVAL;
lock_kernel();
switch (orig) {
case 0:
file->f_pos = offset;
ret = file->f_pos;
break;
case 1:
file->f_pos += offset;
ret = file->f_pos;
}
unlock_kernel();
return ret;
loff_t ret = -EINVAL;
lock_kernel();
switch (orig) {
case 0:
file->f_pos = offset;
ret = file->f_pos;
break;
case 1:
file->f_pos += offset;
ret = file->f_pos;
}
unlock_kernel();
return ret;
}
static ssize_t msr_read(struct file * file, char __user * buf,
size_t count, loff_t *ppos)
static ssize_t msr_read(struct file *file, char __user * buf,
size_t count, loff_t * ppos)
{
char __user *tmp = buf;
u32 data[2];
size_t rv;
u32 reg = *ppos;
int cpu = iminor(file->f_dentry->d_inode);
int err;
if ( count % 8 )
return -EINVAL; /* Invalid chunk size */
for ( rv = 0 ; count ; count -= 8 ) {
err = do_rdmsr(cpu, reg, &data[0], &data[1]);
if ( err )
return err;
if ( copy_to_user(tmp,&data,8) )
return -EFAULT;
tmp += 8;
}
return tmp - buf;
u32 __user *tmp = (u32 __user *) buf;
u32 data[2];
size_t rv;
u32 reg = *ppos;
int cpu = iminor(file->f_dentry->d_inode);
int err;
if (count % 8)
return -EINVAL; /* Invalid chunk size */
for (rv = 0; count; count -= 8) {
err = do_rdmsr(cpu, reg, &data[0], &data[1]);
if (err)
return err;
if (copy_to_user(tmp, &data, 8))
return -EFAULT;
tmp += 2;
}
return ((char __user *)tmp) - buf;
}
static ssize_t msr_write(struct file * file, const char __user * buf,
static ssize_t msr_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
const char __user *tmp = buf;
u32 data[2];
size_t rv;
u32 reg = *ppos;
int cpu = iminor(file->f_dentry->d_inode);
int err;
if ( count % 8 )
return -EINVAL; /* Invalid chunk size */
for ( rv = 0 ; count ; count -= 8 ) {
if ( copy_from_user(&data,tmp,8) )
return -EFAULT;
err = do_wrmsr(cpu, reg, data[0], data[1]);
if ( err )
return err;
tmp += 8;
}
return tmp - buf;
const u32 __user *tmp = (const u32 __user *)buf;
u32 data[2];
size_t rv;
u32 reg = *ppos;
int cpu = iminor(file->f_dentry->d_inode);
int err;
if (count % 8)
return -EINVAL; /* Invalid chunk size */
for (rv = 0; count; count -= 8) {
if (copy_from_user(&data, tmp, 8))
return -EFAULT;
err = do_wrmsr(cpu, reg, data[0], data[1]);
if (err)
return err;
tmp += 2;
}
return ((char __user *)tmp) - buf;
}
static int msr_open(struct inode *inode, struct file *file)
{
int cpu = iminor(file->f_dentry->d_inode);
struct cpuinfo_x86 *c = &(cpu_data)[cpu];
if (cpu >= NR_CPUS || !cpu_online(cpu))
return -ENXIO; /* No such CPU */
if ( !cpu_has(c, X86_FEATURE_MSR) )
return -EIO; /* MSR not supported */
return 0;
unsigned int cpu = iminor(file->f_dentry->d_inode);
struct cpuinfo_x86 *c = &(cpu_data)[cpu];
if (cpu >= NR_CPUS || !cpu_online(cpu))
return -ENXIO; /* No such CPU */
if (!cpu_has(c, X86_FEATURE_MSR))
return -EIO; /* MSR not supported */
return 0;
}
/*
* File operations we support
*/
static struct file_operations msr_fops = {
.owner = THIS_MODULE,
.llseek = msr_seek,
.read = msr_read,
.write = msr_write,
.open = msr_open,
.owner = THIS_MODULE,
.llseek = msr_seek,
.read = msr_read,
.write = msr_write,
.open = msr_open,
};
int __init msr_init(void)
{
if (register_chrdev(MSR_MAJOR, "cpu/msr", &msr_fops)) {
printk(KERN_ERR "msr: unable to get major %d for msr\n",
MSR_MAJOR);
return -EBUSY;
}
return 0;
if (register_chrdev(MSR_MAJOR, "cpu/msr", &msr_fops)) {
printk(KERN_ERR "msr: unable to get major %d for msr\n",
MSR_MAJOR);
return -EBUSY;
}
return 0;
}
void __exit msr_exit(void)
{
unregister_chrdev(MSR_MAJOR, "cpu/msr");
unregister_chrdev(MSR_MAJOR, "cpu/msr");
}
module_init(msr_init);
......
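The msr_read()/msr_write() rework changes the user-buffer cursor from a `char` pointer to a `u32` pointer, so the per-iteration stride becomes `tmp += 2` instead of `tmp += 8`, and the byte count is recovered by casting back to `char *`. The following user-space sketch shows that arithmetic under stated assumptions: `fill_pairs` is a hypothetical stand-in for the read loop, `memcpy` stands in for copy_to_user(), and the caller is assumed to pass a 4-byte-aligned buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the read loop: writes count/8 (lo, hi) pairs
 * into buf and returns the number of bytes produced, or -1 where the
 * kernel code would return -EINVAL. buf must be 4-byte aligned. */
static ptrdiff_t fill_pairs(char *buf, size_t count, uint32_t lo, uint32_t hi)
{
    uint32_t *tmp = (uint32_t *)buf;
    uint32_t data[2] = { lo, hi };

    assert(buf != NULL);
    if (count % 8)
        return -1;              /* invalid chunk size */
    for (; count; count -= 8) {
        memcpy(tmp, data, 8);   /* stands in for copy_to_user() */
        tmp += 2;               /* two u32 elements, i.e. 8 bytes */
    }
    /* Cast back to a byte pointer so the difference is in bytes. */
    return (char *)tmp - buf;
}
```

Subtracting the pointers without the cast would not compile, since the two operands have different types; the cast keeps the return value in bytes as the read/write interface expects.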
/*
* Dynamic DMA mapping support. Common code
* Dynamic DMA mapping support.
*/
#include <linux/types.h>
......@@ -24,38 +24,37 @@
* Device ownership issues as mentioned above for pci_map_single are
* the same here.
*/
int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
int nents, int direction)
int dma_map_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction)
{
int i;
BUG_ON(direction == PCI_DMA_NONE);
BUG_ON(direction == DMA_NONE);
for (i = 0; i < nents; i++ ) {
struct scatterlist *s = &sg[i];
BUG_ON(!s->page);
s->dma_address = pci_map_page(hwdev, s->page, s->offset,
s->length, direction);
s->dma_address = virt_to_bus(page_address(s->page) +s->offset);
s->dma_length = s->length;
}
return nents;
}
EXPORT_SYMBOL(pci_map_sg);
EXPORT_SYMBOL(dma_map_sg);
/* Unmap a set of streaming mode DMA translations.
* Again, cpu read rules concerning calls here are the same as for
* pci_unmap_single() above.
*/
void pci_unmap_sg(struct pci_dev *dev, struct scatterlist *sg,
int nents, int dir)
void dma_unmap_sg(struct device *dev, struct scatterlist *sg,
int nents, int dir)
{
int i;
for (i = 0; i < nents; i++) {
struct scatterlist *s = &sg[i];
BUG_ON(s->page == NULL);
BUG_ON(s->dma_address == 0);
pci_unmap_single(dev, s->dma_address, s->dma_length, dir);
dma_unmap_single(dev, s->dma_address, s->dma_length, dir);
}
}
EXPORT_SYMBOL(pci_unmap_sg);
EXPORT_SYMBOL(dma_unmap_sg);
......@@ -31,12 +31,6 @@
#include <asm/cacheflush.h>
#include <asm/kdebug.h>
#ifdef CONFIG_PREEMPT
#define preempt_atomic() in_atomic()
#else
#define preempt_atomic() 1
#endif
dma_addr_t bad_dma_address;
unsigned long iommu_bus_base; /* GART remapping area (physical) */
......@@ -54,7 +48,7 @@ int force_iommu = 1;
int panic_on_overflow = 0;
int force_iommu = 0;
#endif
int iommu_merge = 0;
int iommu_merge = 1;
int iommu_sac_force = 0;
/* If this is disabled the IOMMU will use an optimized flushing strategy
......@@ -64,6 +58,10 @@ int iommu_sac_force = 0;
also seen with Qlogic at least). */
int iommu_fullflush = 1;
/* This tells the BIO block layer to assume merging. Default to off
because we cannot guarantee merging later. */
int iommu_bio_merge = 0;
#define MAX_NB 8
/* Allocation bitmap for the remapping area */
......@@ -104,8 +102,16 @@ AGPEXTERN __u32 *agp_gatt_table;
static unsigned long next_bit; /* protected by iommu_bitmap_lock */
static int need_flush; /* global flush state. set for each gart wrap */
static dma_addr_t pci_map_area(struct pci_dev *dev, unsigned long phys_mem,
size_t size, int dir);
static dma_addr_t dma_map_area(struct device *dev, unsigned long phys_mem,
size_t size, int dir, int do_panic);
/* Dummy device used for NULL arguments (normally ISA). A smaller DMA mask
would probably be better, but this is bug-to-bug compatible with i386. */
static struct device fallback_dev = {
.bus_id = "fallback device",
.coherent_dma_mask = 0xffffffff,
.dma_mask = &fallback_dev.coherent_dma_mask,
};
static unsigned long alloc_iommu(int size)
{
......@@ -146,25 +152,31 @@ static void free_iommu(unsigned long offset, int size)
/*
* Use global flush state to avoid races with multiple flushers.
*/
static void flush_gart(struct pci_dev *dev)
static void flush_gart(struct device *dev)
{
unsigned long flags;
int flushed = 0;
int i;
int i, max;
spin_lock_irqsave(&iommu_bitmap_lock, flags);
if (need_flush) {
max = 0;
for (i = 0; i < MAX_NB; i++) {
u32 w;
if (!northbridges[i])
continue;
pci_write_config_dword(northbridges[i], 0x9c,
northbridge_flush_word[i] | 1);
flushed++;
max = i;
}
for (i = 0; i <= max; i++) {
u32 w;
if (!northbridges[i])
continue;
/* Make sure the hardware actually executed the flush. */
do {
pci_read_config_dword(northbridges[i], 0x9c, &w);
} while (w & 1);
flushed++;
}
if (!flushed)
printk("nothing to flush?\n");
......@@ -173,31 +185,47 @@ static void flush_gart(struct pci_dev *dev)
spin_unlock_irqrestore(&iommu_bitmap_lock, flags);
}
/* Allocate DMA memory on node near device */
noinline
static void *dma_alloc_pages(struct device *dev, unsigned gfp, unsigned order)
{
struct page *page;
int node;
if (dev->bus == &pci_bus_type) {
cpumask_t mask;
mask = pcibus_to_cpumask(to_pci_dev(dev)->bus->number);
node = cpu_to_node(first_cpu(mask));
} else
node = numa_node_id();
page = alloc_pages_node(node, gfp, order);
return page ? page_address(page) : NULL;
}
/*
* Allocate memory for a consistent mapping.
* Allocate memory for a coherent mapping.
*/
void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
dma_addr_t *dma_handle)
void *
dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
unsigned gfp)
{
void *memory;
int gfp = preempt_atomic() ? GFP_ATOMIC : GFP_KERNEL;
unsigned long dma_mask = 0;
u64 bus;
if (hwdev)
dma_mask = hwdev->dev.coherent_dma_mask;
if (!dev)
dev = &fallback_dev;
dma_mask = dev->coherent_dma_mask;
if (dma_mask == 0)
dma_mask = 0xffffffff;
/* Kludge to make it bug-to-bug compatible with i386. i386
uses the normal dma_mask for alloc_consistent. */
if (hwdev)
dma_mask &= hwdev->dma_mask;
uses the normal dma_mask for alloc_coherent. */
dma_mask &= *dev->dma_mask;
again:
memory = (void *)__get_free_pages(gfp, get_order(size));
memory = dma_alloc_pages(dev, gfp, get_order(size));
if (memory == NULL)
return NULL;
return NULL;
{
int high, mmu;
......@@ -223,28 +251,29 @@ void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
}
}
*dma_handle = pci_map_area(hwdev, bus, size, PCI_DMA_BIDIRECTIONAL);
*dma_handle = dma_map_area(dev, bus, size, PCI_DMA_BIDIRECTIONAL, 0);
if (*dma_handle == bad_dma_address)
goto error;
flush_gart(hwdev);
flush_gart(dev);
return memory;
error:
if (panic_on_overflow)
panic("pci_alloc_consistent: overflow %lu bytes\n", size);
panic("dma_alloc_coherent: IOMMU overflow by %lu bytes\n", size);
free:
free_pages((unsigned long)memory, get_order(size));
/* XXX Could use the swiotlb pool here too */
return NULL;
}
/*
* Unmap consistent memory.
* Unmap coherent memory.
* The caller must ensure that the device has finished accessing the mapping.
*/
void pci_free_consistent(struct pci_dev *hwdev, size_t size,
void dma_free_coherent(struct device *dev, size_t size,
void *vaddr, dma_addr_t bus)
{
pci_unmap_single(hwdev, bus, size, 0);
dma_unmap_single(dev, bus, size, 0);
free_pages((unsigned long)vaddr, get_order(size));
}
......@@ -280,7 +309,7 @@ void dump_leak(void)
#define CLEAR_LEAK(x)
#endif
static void iommu_full(struct pci_dev *dev, size_t size, int dir)
static void iommu_full(struct device *dev, size_t size, int dir, int do_panic)
{
/*
* Ran out of IOMMU space for this operation. This is very bad.
......@@ -293,14 +322,14 @@ static void iommu_full(struct pci_dev *dev, size_t size, int dir)
*/
printk(KERN_ERR
"PCI-DMA: Out of IOMMU space for %lu bytes at device %s[%s]\n",
size, dev ? pci_pretty_name(dev) : "", dev ? dev->slot_name : "?");
"PCI-DMA: Out of IOMMU space for %lu bytes at device %s\n",
size, dev->bus_id);
if (size > PAGE_SIZE*EMERGENCY_PAGES) {
if (size > PAGE_SIZE*EMERGENCY_PAGES && do_panic) {
if (dir == PCI_DMA_FROMDEVICE || dir == PCI_DMA_BIDIRECTIONAL)
panic("PCI-DMA: Memory will be corrupted\n");
panic("PCI-DMA: Memory would be corrupted\n");
if (dir == PCI_DMA_TODEVICE || dir == PCI_DMA_BIDIRECTIONAL)
panic("PCI-DMA: Random memory will be DMAed\n");
panic("PCI-DMA: Random memory would be DMAed\n");
}
#ifdef CONFIG_IOMMU_LEAK
......@@ -308,9 +337,9 @@ static void iommu_full(struct pci_dev *dev, size_t size, int dir)
#endif
}
static inline int need_iommu(struct pci_dev *dev, unsigned long addr, size_t size)
static inline int need_iommu(struct device *dev, unsigned long addr, size_t size)
{
u64 mask = dev ? dev->dma_mask : 0xffffffff;
u64 mask = *dev->dma_mask;
int high = addr + size >= mask;
int mmu = high;
if (force_iommu)
......@@ -323,9 +352,9 @@ static inline int need_iommu(struct pci_dev *dev, unsigned long addr, size_t siz
return mmu;
}
static inline int nonforced_iommu(struct pci_dev *dev, unsigned long addr, size_t size)
static inline int nonforced_iommu(struct device *dev, unsigned long addr, size_t size)
{
u64 mask = dev ? dev->dma_mask : 0xffffffff;
u64 mask = *dev->dma_mask;
int high = addr + size >= mask;
int mmu = high;
if (no_iommu) {
......@@ -339,8 +368,8 @@ static inline int nonforced_iommu(struct pci_dev *dev, unsigned long addr, size_
/* Map a single continuous physical area into the IOMMU.
* Caller needs to check if the iommu is needed and flush.
*/
static dma_addr_t pci_map_area(struct pci_dev *dev, unsigned long phys_mem,
size_t size, int dir)
static dma_addr_t dma_map_area(struct device *dev, unsigned long phys_mem,
size_t size, int dir, int do_panic)
{
unsigned long npages = to_pages(phys_mem, size);
unsigned long iommu_page = alloc_iommu(npages);
......@@ -349,8 +378,8 @@ static dma_addr_t pci_map_area(struct pci_dev *dev, unsigned long phys_mem,
if (!nonforced_iommu(dev, phys_mem, size))
return phys_mem;
if (panic_on_overflow)
panic("pci_map_area overflow %lu bytes\n", size);
iommu_full(dev, size, dir);
panic("dma_map_area overflow %lu bytes\n", size);
iommu_full(dev, size, dir, do_panic);
return bad_dma_address;
}
......@@ -363,44 +392,44 @@ static dma_addr_t pci_map_area(struct pci_dev *dev, unsigned long phys_mem,
}
/* Map a single area into the IOMMU */
dma_addr_t pci_map_single(struct pci_dev *dev, void *addr, size_t size, int dir)
{
dma_addr_t dma_map_single(struct device *dev, void *addr, size_t size, int dir)
{
unsigned long phys_mem, bus;
BUG_ON(dir == PCI_DMA_NONE);
BUG_ON(dir == DMA_NONE);
#ifdef CONFIG_SWIOTLB
if (swiotlb)
return swiotlb_map_single(&dev->dev,addr,size,dir);
#endif
return swiotlb_map_single(dev,addr,size,dir);
if (!dev)
dev = &fallback_dev;
phys_mem = virt_to_phys(addr);
if (!need_iommu(dev, phys_mem, size))
return phys_mem;
bus = pci_map_area(dev, phys_mem, size, dir);
bus = dma_map_area(dev, phys_mem, size, dir, 1);
flush_gart(dev);
return bus;
}
/* Fallback for pci_map_sg in case of overflow */
static int pci_map_sg_nonforce(struct pci_dev *dev, struct scatterlist *sg,
/* Fallback for dma_map_sg in case of overflow */
static int dma_map_sg_nonforce(struct device *dev, struct scatterlist *sg,
int nents, int dir)
{
int i;
#ifdef CONFIG_IOMMU_DEBUG
printk(KERN_DEBUG "pci_map_sg overflow\n");
printk(KERN_DEBUG "dma_map_sg overflow\n");
#endif
for (i = 0; i < nents; i++ ) {
struct scatterlist *s = &sg[i];
unsigned long addr = page_to_phys(s->page) + s->offset;
if (nonforced_iommu(dev, addr, s->length)) {
addr = pci_map_area(dev, addr, s->length, dir);
addr = dma_map_area(dev, addr, s->length, dir, 0);
if (addr == bad_dma_address) {
if (i > 0)
pci_unmap_sg(dev, sg, i, dir);
dma_unmap_sg(dev, sg, i, dir);
nents = 0;
sg[0].dma_length = 0;
break;
......@@ -414,7 +443,7 @@ static int pci_map_sg_nonforce(struct pci_dev *dev, struct scatterlist *sg,
}
/* Map multiple scatterlist entries continuous into the first. */
static int __pci_map_cont(struct scatterlist *sg, int start, int stopat,
static int __dma_map_cont(struct scatterlist *sg, int start, int stopat,
struct scatterlist *sout, unsigned long pages)
{
unsigned long iommu_start = alloc_iommu(pages);
......@@ -452,7 +481,7 @@ static int __pci_map_cont(struct scatterlist *sg, int start, int stopat,
return 0;
}
static inline int pci_map_cont(struct scatterlist *sg, int start, int stopat,
static inline int dma_map_cont(struct scatterlist *sg, int start, int stopat,
struct scatterlist *sout,
unsigned long pages, int need)
{
......@@ -462,14 +491,14 @@ static inline int pci_map_cont(struct scatterlist *sg, int start, int stopat,
sout->dma_length = sg[start].length;
return 0;
}
return __pci_map_cont(sg, start, stopat, sout, pages);
return __dma_map_cont(sg, start, stopat, sout, pages);
}
/*
* DMA map all entries in a scatterlist.
* Merge chunks that have page aligned sizes into a continuous mapping.
*/
int pci_map_sg(struct pci_dev *dev, struct scatterlist *sg, int nents, int dir)
*/
int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, int dir)
{
int i;
int out;
......@@ -477,19 +506,14 @@ int pci_map_sg(struct pci_dev *dev, struct scatterlist *sg, int nents, int dir)
unsigned long pages = 0;
int need = 0, nextneed;
#ifdef CONFIG_SWIOTLB
if (swiotlb)
return swiotlb_map_sg(&dev->dev,sg,nents,dir);
#endif
BUG_ON(dir == PCI_DMA_NONE);
BUG_ON(dir == DMA_NONE);
if (nents == 0)
return 0;
#ifdef CONFIG_SWIOTLB
if (swiotlb)
return swiotlb_map_sg(&dev->dev,sg,nents,dir);
#endif
return swiotlb_map_sg(dev,sg,nents,dir);
if (!dev)
dev = &fallback_dev;
out = 0;
start = 0;
......@@ -508,19 +532,19 @@ int pci_map_sg(struct pci_dev *dev, struct scatterlist *sg, int nents, int dir)
boundary and the new one doesn't have an offset. */
if (!iommu_merge || !nextneed || !need || s->offset ||
(ps->offset + ps->length) % PAGE_SIZE) {
if (pci_map_cont(sg, start, i, sg+out, pages,
if (dma_map_cont(sg, start, i, sg+out, pages,
need) < 0)
goto error;
out++;
pages = 0;
start = i;
}
}
}
need = nextneed;
pages += to_pages(s->offset, s->length);
}
if (pci_map_cont(sg, start, i, sg+out, pages, need) < 0)
if (dma_map_cont(sg, start, i, sg+out, pages, need) < 0)
goto error;
out++;
flush_gart(dev);
......@@ -530,34 +554,32 @@ int pci_map_sg(struct pci_dev *dev, struct scatterlist *sg, int nents, int dir)
error:
flush_gart(NULL);
pci_unmap_sg(dev, sg, nents, dir);
dma_unmap_sg(dev, sg, nents, dir);
/* When it was forced try again unforced */
if (force_iommu)
return pci_map_sg_nonforce(dev, sg, nents, dir);
return dma_map_sg_nonforce(dev, sg, nents, dir);
if (panic_on_overflow)
panic("pci_map_sg: overflow on %lu pages\n", pages);
iommu_full(dev, pages << PAGE_SHIFT, dir);
panic("dma_map_sg: overflow on %lu pages\n", pages);
iommu_full(dev, pages << PAGE_SHIFT, dir, 0);
for (i = 0; i < nents; i++)
sg[i].dma_address = bad_dma_address;
return 0;
}
/*
* Free a PCI mapping.
* Free a DMA mapping.
*/
void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
void dma_unmap_single(struct device *dev, dma_addr_t dma_addr,
size_t size, int direction)
{
unsigned long iommu_page;
int npages;
int i;
#ifdef CONFIG_SWIOTLB
if (swiotlb) {
swiotlb_unmap_single(&hwdev->dev,dma_addr,size,direction);
swiotlb_unmap_single(dev,dma_addr,size,direction);
return;
}
#endif
if (dma_addr < iommu_bus_base + EMERGENCY_PAGES*PAGE_SIZE ||
dma_addr >= iommu_bus_base + iommu_size)
......@@ -574,22 +596,25 @@ void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
/*
* Wrapper for pci_unmap_single working with scatterlists.
*/
void pci_unmap_sg(struct pci_dev *dev, struct scatterlist *sg, int nents,
int dir)
void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, int dir)
{
int i;
if (swiotlb) {
swiotlb_unmap_sg(dev,sg,nents,dir);
return;
}
for (i = 0; i < nents; i++) {
struct scatterlist *s = &sg[i];
if (!s->dma_length || !s->length)
break;
pci_unmap_single(dev, s->dma_address, s->dma_length, dir);
dma_unmap_single(dev, s->dma_address, s->dma_length, dir);
}
}
int pci_dma_supported(struct pci_dev *dev, u64 mask)
int dma_supported(struct device *dev, u64 mask)
{
/* Copied from i386. Doesn't make much sense, because it will
only work for pci_alloc_consistent.
only work for dma_alloc_coherent.
The caller just has to use GFP_DMA in this case. */
if (mask < 0x00ffffff)
return 0;
......@@ -605,22 +630,31 @@ int pci_dma_supported(struct pci_dev *dev, u64 mask)
Assume all masks <= 40 bits are of this type. Normally this doesn't
make any difference, but gives more gentle handling of IOMMU overflow. */
if (iommu_sac_force && (mask >= 0xffffffffffULL)) {
printk(KERN_INFO "%s: Force SAC with mask %Lx\n", dev->slot_name,mask);
printk(KERN_INFO "%s: Force SAC with mask %Lx\n", dev->bus_id,mask);
return 0;
}
return 1;
}
EXPORT_SYMBOL(pci_unmap_sg);
EXPORT_SYMBOL(pci_map_sg);
EXPORT_SYMBOL(pci_map_single);
EXPORT_SYMBOL(pci_unmap_single);
EXPORT_SYMBOL(pci_dma_supported);
int dma_get_cache_alignment(void)
{
return boot_cpu_data.x86_clflush_size;
}
EXPORT_SYMBOL(dma_unmap_sg);
EXPORT_SYMBOL(dma_map_sg);
EXPORT_SYMBOL(dma_map_single);
EXPORT_SYMBOL(dma_unmap_single);
EXPORT_SYMBOL(dma_supported);
EXPORT_SYMBOL(no_iommu);
EXPORT_SYMBOL(force_iommu);
EXPORT_SYMBOL(bad_dma_address);
EXPORT_SYMBOL(iommu_merge);
EXPORT_SYMBOL(iommu_bio_merge);
EXPORT_SYMBOL(iommu_sac_force);
EXPORT_SYMBOL(dma_get_cache_alignment);
EXPORT_SYMBOL(dma_alloc_coherent);
EXPORT_SYMBOL(dma_free_coherent);
static __init unsigned long check_iommu_size(unsigned long aper, u64 aper_size)
{
......@@ -747,7 +781,7 @@ static int __init pci_iommu_init(void)
if (swiotlb) {
no_iommu = 1;
printk(KERN_INFO "PCI-DMA: Using software bounce buffering for IO (SWIOTLB)\n");
printk(KERN_INFO "PCI-DMA: Using software bounce buffering for IO (SWIOTLB)\n");
return -1;
}
......@@ -851,7 +885,7 @@ static int __init pci_iommu_init(void)
fs_initcall(pci_iommu_init);
/* iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]][,merge]
[,forcesac][,fullflush][,nomerge]
[,forcesac][,fullflush][,nomerge][,biomerge]
size set size of iommu (in bytes)
noagp don't initialize the AGP driver and use full aperture.
off don't use the IOMMU
......@@ -859,60 +893,73 @@ fs_initcall(pci_iommu_init);
memaper[=order] allocate its own aperture over RAM with size 32MB<<order.
noforce don't force IOMMU usage. Default.
force Force IOMMU.
merge Do SG merging. Implies force (experimental)
merge Do lazy merging. This may improve performance on some block devices.
Implies force (experimental)
biomerge Do merging at the BIO layer. This is more efficient than merge,
but should only be done with very big IOMMUs. Implies merge,force.
nomerge Don't do SG merging.
forcesac Force SAC mode for masks <40bits (experimental)
fullflush Flush IOMMU on each allocation (default)
nofullflush Don't use IOMMU fullflush
allowed override iommu off workarounds for specific chipsets.
soft Use software bounce buffering (default for Intel machines)
noaperture Don't touch the aperture for AGP.
*/
__init int iommu_setup(char *opt)
__init int iommu_setup(char *p)
{
int arg;
char *p = opt;
for (;;) {
if (!memcmp(p,"noagp", 5))
while (*p) {
if (!strncmp(p,"noagp",5))
no_agp = 1;
if (!memcmp(p,"off", 3))
if (!strncmp(p,"off",3))
no_iommu = 1;
if (!memcmp(p,"force", 5)) {
if (!strncmp(p,"force",5)) {
force_iommu = 1;
iommu_aperture_allowed = 1;
}
if (!memcmp(p,"allowed",7))
if (!strncmp(p,"allowed",7))
iommu_aperture_allowed = 1;
if (!memcmp(p,"noforce", 7)) {
if (!strncmp(p,"noforce",7)) {
iommu_merge = 0;
force_iommu = 0;
}
if (!memcmp(p, "memaper", 7)) {
if (!strncmp(p, "memaper", 7)) {
fallback_aper_force = 1;
p += 7;
if (*p == '=' && get_option(&p, &arg))
fallback_aper_order = arg;
if (*p == '=') {
++p;
if (get_option(&p, &arg))
fallback_aper_order = arg;
}
}
if (!memcmp(p, "panic", 5))
if (!strncmp(p, "biomerge",8)) {
iommu_bio_merge = 4096;
iommu_merge = 1;
force_iommu = 1;
}
if (!strncmp(p, "panic",5))
panic_on_overflow = 1;
if (!memcmp(p, "nopanic", 7))
if (!strncmp(p, "nopanic",7))
panic_on_overflow = 0;
if (!memcmp(p, "merge", 5)) {
if (!strncmp(p, "merge",5)) {
iommu_merge = 1;
force_iommu = 1;
}
if (!memcmp(p, "nomerge", 7))
if (!strncmp(p, "nomerge",7))
iommu_merge = 0;
if (!memcmp(p, "forcesac", 8))
if (!strncmp(p, "forcesac",8))
iommu_sac_force = 1;
if (!memcmp(p, "fullflush", 9))
if (!strncmp(p, "fullflush",9))
iommu_fullflush = 1;
if (!memcmp(p, "nofullflush", 11))
if (!strncmp(p, "nofullflush",11))
iommu_fullflush = 0;
if (!memcmp(p, "soft", 4))
if (!strncmp(p, "soft",4))
swiotlb = 1;
if (!strncmp(p, "noaperture",10))
fix_aperture = 0;
#ifdef CONFIG_IOMMU_LEAK
if (!memcmp(p,"leak", 4)) {
if (!strncmp(p,"leak",4)) {
leak_trace = 1;
p += 4;
if (*p == '=') ++p;
......@@ -922,10 +969,9 @@ __init int iommu_setup(char *opt)
#endif
if (isdigit(*p) && get_option(&p, &arg))
iommu_size = arg;
do {
if (*p == ' ' || *p == 0)
return 0;
} while (*p++ != ',');
p += strcspn(p, ",");
if (*p == ',')
++p;
}
return 1;
}
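The rewritten iommu_setup() loop walks a comma-separated option string, tests each token with strncmp(), and steps to the next token with strcspn(). A minimal user-space sketch of that pattern follows. It covers only a handful of the options, and `struct iommu_opts` and `parse_iommu` are hypothetical names introduced for illustration; the kernel code sets global variables instead.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the globals that the real code sets. */
struct iommu_opts {
    int no_iommu;
    int force_iommu;
    int iommu_merge;
    int iommu_bio_merge;
};

/* Parse a comma-separated option string such as "biomerge,nomerge". */
static void parse_iommu(const char *p, struct iommu_opts *o)
{
    assert(p != NULL && o != NULL);
    while (*p) {
        if (!strncmp(p, "off", 3))
            o->no_iommu = 1;
        if (!strncmp(p, "force", 5))
            o->force_iommu = 1;
        if (!strncmp(p, "noforce", 7)) {
            o->iommu_merge = 0;
            o->force_iommu = 0;
        }
        if (!strncmp(p, "biomerge", 8)) {
            o->iommu_bio_merge = 4096;
            o->iommu_merge = 1;
            o->force_iommu = 1;
        }
        if (!strncmp(p, "merge", 5)) {
            o->iommu_merge = 1;
            o->force_iommu = 1;
        }
        if (!strncmp(p, "nomerge", 7))
            o->iommu_merge = 0;
        /* Skip to the next comma, or to the terminating NUL. */
        p += strcspn(p, ",");
        if (*p == ',')
            ++p;
    }
}
```

Because the comparisons are prefix matches, a token such as "forcesac" also satisfies the "force" test. The sketch keeps that property rather than hiding it.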
/* Fallback functions when the main IOMMU code is not compiled in. This
code is roughly equivalent to i386. */
#include <linux/mm.h>
#include <linux/init.h>
#include <linux/pci.h>
#include <linux/string.h>
#include <asm/proto.h>
#include <asm/processor.h>
int iommu_merge = 0;
EXPORT_SYMBOL(iommu_merge);
......@@ -10,57 +13,80 @@ EXPORT_SYMBOL(iommu_merge);
dma_addr_t bad_dma_address;
EXPORT_SYMBOL(bad_dma_address);
int iommu_bio_merge = 0;
EXPORT_SYMBOL(iommu_bio_merge);
int iommu_sac_force = 0;
EXPORT_SYMBOL(iommu_sac_force);
/*
* Dummy IO MMU functions
*/
void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
dma_addr_t *dma_handle)
void *dma_alloc_coherent(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, unsigned gfp)
{
void *ret;
int gfp = GFP_ATOMIC;
if (hwdev == NULL ||
end_pfn > (hwdev->dma_mask>>PAGE_SHIFT) || /* XXX */
(u32)hwdev->dma_mask < 0xffffffff)
gfp |= GFP_DMA;
ret = (void *)__get_free_pages(gfp, get_order(size));
u64 mask;
int order = get_order(size);
if (ret != NULL) {
memset(ret, 0, size);
if (hwdev)
mask = hwdev->coherent_dma_mask & *hwdev->dma_mask;
else
mask = 0xffffffff;
for (;;) {
ret = (void *)__get_free_pages(gfp, order);
if (ret == NULL)
return NULL;
*dma_handle = virt_to_bus(ret);
if ((*dma_handle & ~mask) == 0)
break;
free_pages((unsigned long)ret, order);
if (gfp & GFP_DMA)
return NULL;
gfp |= GFP_DMA;
}
memset(ret, 0, size);
return ret;
}
EXPORT_SYMBOL(dma_alloc_coherent);
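The new dma_alloc_coherent() fallback tries a normal allocation first, checks the resulting bus address against the device mask, and retries once from the DMA zone if the address does not fit. A minimal user-space sketch of that retry logic follows; `alloc_fn`, `demo_alloc` and `alloc_under_mask` are hypothetical stand-ins, and the addresses returned by `demo_alloc` are made up for the example.

```c
#include <stdint.h>

/* Hypothetical allocator: returns a bus address, or 0 on failure. */
typedef uint64_t (*alloc_fn)(int low_zone);

/* Made-up addresses: high memory by default, low memory on request. */
uint64_t demo_alloc(int low_zone)
{
	return low_zone ? 0x00800000ULL : 0x180000000ULL;
}

/* An address is usable if it has no bits set outside the mask. */
int fits_mask(uint64_t bus_addr, uint64_t mask)
{
	return (bus_addr & ~mask) == 0;
}

/* Try a normal allocation, then fall back once to the low zone. */
uint64_t alloc_under_mask(alloc_fn alloc, uint64_t mask)
{
	int low_zone = 0;

	for (;;) {
		uint64_t addr = alloc(low_zone);

		if (addr == 0)
			return 0;
		if (fits_mask(addr, mask))
			return addr;
		/* The kernel code frees the unsuitable pages here. */
		if (low_zone)
			return 0;	/* already tried the low zone */
		low_zone = 1;
	}
}
```

With a 32-bit mask the first (high) address is rejected and the low-zone address is returned; with a mask tighter than the low zone can satisfy, the function gives up after the second attempt.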
void pci_free_consistent(struct pci_dev *hwdev, size_t size,
void dma_free_coherent(struct device *hwdev, size_t size,
void *vaddr, dma_addr_t dma_handle)
{
free_pages((unsigned long)vaddr, get_order(size));
}
EXPORT_SYMBOL(dma_free_coherent);
int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
int dma_supported(struct device *hwdev, u64 mask)
{
/*
* we fall back to GFP_DMA when the mask isn't all 1s,
* so we can't guarantee allocations that must be
* within a tighter range than GFP_DMA..
* RED-PEN this won't work for pci_map_single. Caller has to
* use GFP_DMA in the first place.
* RED-PEN this won't work for pci_map_single. Caller has to
* use GFP_DMA in the first place.
*/
if (mask < 0x00ffffff)
return 0;
return 1;
}
EXPORT_SYMBOL(dma_supported);
EXPORT_SYMBOL(pci_dma_supported);
int dma_get_cache_alignment(void)
{
return boot_cpu_data.x86_clflush_size;
}
EXPORT_SYMBOL(dma_get_cache_alignment);
static int __init check_ram(void)
{
if (end_pfn >= 0xffffffff>>PAGE_SHIFT) {
printk(KERN_ERR "WARNING more than 4GB of memory but no IOMMU.\n"
KERN_ERR "WARNING 32bit PCI may malfunction.\n");
printk(
KERN_ERR "WARNING more than 4GB of memory but IOMMU not compiled in.\n"
KERN_ERR "WARNING 32bit PCI may malfunction.\n");
}
return 0;
}
......@@ -546,16 +546,16 @@ void set_personality_64bit(void)
clear_thread_flag(TIF_IA32);
}
asmlinkage long sys_fork(struct pt_regs regs)
asmlinkage long sys_fork(struct pt_regs *regs)
{
return do_fork(SIGCHLD, regs.rsp, &regs, 0, NULL, NULL);
return do_fork(SIGCHLD, regs->rsp, regs, 0, NULL, NULL);
}
asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp, void __user *parent_tid, void __user *child_tid, struct pt_regs regs)
asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp, void __user *parent_tid, void __user *child_tid, struct pt_regs *regs)
{
if (!newsp)
newsp = regs.rsp;
return do_fork(clone_flags & ~CLONE_IDLETASK, newsp, &regs, 0,
newsp = regs->rsp;
return do_fork(clone_flags & ~CLONE_IDLETASK, newsp, regs, 0,
parent_tid, child_tid);
}
......@@ -569,9 +569,9 @@ asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp, void _
* do not have enough call-clobbered registers to hold all
* the information you need.
*/
asmlinkage long sys_vfork(struct pt_regs regs)
asmlinkage long sys_vfork(struct pt_regs *regs)
{
return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.rsp, &regs, 0,
return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs->rsp, regs, 0,
NULL, NULL);
}
......@@ -433,30 +433,32 @@ asmlinkage long sys_ptrace(long request, long pid, unsigned long addr, long data
break;
case PTRACE_GETREGS: { /* Get all gp regs from the child. */
if (!access_ok(VERIFY_WRITE, (unsigned __user *)data, FRAME_SIZE)) {
if (!access_ok(VERIFY_WRITE, (unsigned __user *)data,
sizeof(struct user_regs_struct))) {
ret = -EIO;
break;
}
ret = 0;
for (ui = 0; ui < sizeof(struct user_regs_struct); ui += sizeof(long)) {
__put_user(getreg(child, ui),(unsigned long __user *) data);
ret |= __put_user(getreg(child, ui),(unsigned long __user *) data);
data += sizeof(long);
}
ret = 0;
break;
}
case PTRACE_SETREGS: { /* Set all gp regs in the child. */
unsigned long tmp;
if (!access_ok(VERIFY_READ, (unsigned __user *)data, FRAME_SIZE)) {
if (!access_ok(VERIFY_READ, (unsigned __user *)data,
sizeof(struct user_regs_struct))) {
ret = -EIO;
break;
}
ret = 0;
for (ui = 0; ui < sizeof(struct user_regs_struct); ui += sizeof(long)) {
__get_user(tmp, (unsigned long __user *) data);
ret |= __get_user(tmp, (unsigned long __user *) data);
putreg(child, ui, tmp);
data += sizeof(long);
}
ret = 0;
break;
}
......@@ -79,8 +79,10 @@ unsigned long pci_mem_start = 0x10000000;
unsigned long saved_video_mode;
#ifdef CONFIG_SWIOTLB
int swiotlb;
EXPORT_SYMBOL(swiotlb);
#endif
/*
* Setup options
......@@ -765,6 +767,7 @@ static struct _cache_table cache_table[] __initdata =
{ 0x43, LVL_2, 512 },
{ 0x44, LVL_2, 1024 },
{ 0x45, LVL_2, 2048 },
{ 0x60, LVL_1_DATA, 16 },
{ 0x66, LVL_1_DATA, 8 },
{ 0x67, LVL_1_DATA, 16 },
{ 0x68, LVL_1_DATA, 32 },
......@@ -60,12 +60,12 @@ noforce (default) Don't enable by default for heap/stack/data,
*/
static int __init nonx_setup(char *str)
{
if (!strncmp(str, "on",3)) {
if (!strcmp(str, "on")) {
__supported_pte_mask |= _PAGE_NX;
do_not_nx = 0;
vm_data_default_flags &= ~VM_EXEC;
vm_stack_flags &= ~VM_EXEC;
} else if (!strncmp(str, "noforce",7) || !strncmp(str,"off",3)) {
} else if (!strcmp(str, "noforce") || !strcmp(str, "off")) {
do_not_nx = (str[0] == 'o');
if (do_not_nx)
__supported_pte_mask &= ~_PAGE_NX;
......@@ -91,26 +91,28 @@ Valid options:
compat (default) Imply PROT_EXEC for PROT_READ
*/
static int __init nonx32_setup(char *str)
static int __init nonx32_setup(char *s)
{
char *s;
while ((s = strsep(&str, ",")) != NULL) {
if (!strcmp(s, "all") || !strcmp(s,"on")) {
while (*s) {
if (!strncmp(s, "all", 3) || !strncmp(s,"on",2)) {
vm_data_default_flags32 &= ~VM_EXEC;
vm_stack_flags32 &= ~VM_EXEC;
} else if (!strcmp(s, "off")) {
} else if (!strncmp(s, "off",3)) {
vm_data_default_flags32 |= VM_EXEC;
vm_stack_flags32 |= VM_EXEC;
} else if (!strcmp(s, "stack")) {
} else if (!strncmp(s, "stack", 5)) {
vm_data_default_flags32 |= VM_EXEC;
vm_stack_flags32 &= ~VM_EXEC;
} else if (!strcmp(s, "force")) {
} else if (!strncmp(s, "force",5)) {
vm_force_exec32 = 0;
} else if (!strcmp(s, "compat")) {
} else if (!strncmp(s, "compat",6)) {
vm_force_exec32 = PROT_EXEC;
}
}
return 1;
s += strcspn(s, ",");
if (*s == ',')
++s;
}
return 1;
}
__setup("noexec32=", nonx32_setup);
......@@ -40,7 +40,7 @@ void ia32_setup_frame(int sig, struct k_sigaction *ka,
sigset_t *set, struct pt_regs * regs);
asmlinkage long
sys_rt_sigsuspend(sigset_t __user *unewset, size_t sigsetsize, struct pt_regs regs)
sys_rt_sigsuspend(sigset_t __user *unewset, size_t sigsetsize, struct pt_regs *regs)
{
sigset_t saveset, newset;
......@@ -59,21 +59,22 @@ sys_rt_sigsuspend(sigset_t __user *unewset, size_t sigsetsize, struct pt_regs re
spin_unlock_irq(&current->sighand->siglock);
#ifdef DEBUG_SIG
printk("rt_sigsuspend savset(%lx) newset(%lx) regs(%p) rip(%lx)\n",
saveset, newset, &regs, regs.rip);
saveset, newset, regs, regs->rip);
#endif
regs.rax = -EINTR;
regs->rax = -EINTR;
while (1) {
current->state = TASK_INTERRUPTIBLE;
schedule();
if (do_signal(&regs, &saveset))
if (do_signal(regs, &saveset))
return -EINTR;
}
}
asmlinkage long
sys_sigaltstack(const stack_t __user *uss, stack_t __user *uoss, struct pt_regs regs)
sys_sigaltstack(const stack_t __user *uss, stack_t __user *uoss,
struct pt_regs *regs)
{
return do_sigaltstack(uss, uoss, regs.rsp);
return do_sigaltstack(uss, uoss, regs->rsp);
}
......@@ -134,13 +135,13 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc, unsigned
return 1;
}
asmlinkage long sys_rt_sigreturn(struct pt_regs regs)
asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
{
struct rt_sigframe __user *frame;
sigset_t set;
long eax;
frame = (struct rt_sigframe __user *)(regs.rsp - 8);
frame = (struct rt_sigframe __user *)(regs->rsp - 8);
if (verify_area(VERIFY_READ, frame, sizeof(*frame))) {
goto badframe;
}
......@@ -154,7 +155,7 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs regs)
recalc_sigpending();
spin_unlock_irq(&current->sighand->siglock);
if (restore_sigcontext(&regs, &frame->uc.uc_mcontext, &eax)) {
if (restore_sigcontext(regs, &frame->uc.uc_mcontext, &eax)) {
goto badframe;
}
......@@ -162,13 +163,13 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs regs)
printk("%d sigreturn rip:%lx rsp:%lx frame:%p rax:%lx\n",current->pid,regs.rip,regs.rsp,frame,eax);
#endif
if (do_sigaltstack(&frame->uc.uc_stack, NULL, regs.rsp) == -EFAULT)
if (do_sigaltstack(&frame->uc.uc_stack, NULL, regs->rsp) == -EFAULT)
goto badframe;
return eax;
badframe:
signal_fault(&regs,frame,"sigreturn");
signal_fault(regs,frame,"sigreturn");
return 0;
}
......@@ -11,7 +11,6 @@
* Copyright (c) 2002 Vojtech Pavlik
* Copyright (c) 2003 Andi Kleen
* RTC support code taken from arch/i386/kernel/timers/time_hpet.c
*
*/
#include <linux/kernel.h>
......@@ -42,6 +41,10 @@ u64 jiffies_64 = INITIAL_JIFFIES;
EXPORT_SYMBOL(jiffies_64);
#ifdef CONFIG_CPU_FREQ
static void cpufreq_delayed_get(void);
#endif
extern int using_apic_timer;
spinlock_t rtc_lock = SPIN_LOCK_UNLOCKED;
......@@ -82,7 +85,7 @@ static inline void rdtscll_sync(unsigned long *tsc)
* timer interrupt has happened already, but vxtime.trigger wasn't updated yet.
* This is not a problem, because jiffies hasn't updated either. They are bound
* together by xtime_lock.
*/
*/
static inline unsigned int do_gettimeoffset_tsc(void)
{
......@@ -119,7 +122,7 @@ void do_gettimeofday(struct timeval *tv)
usec = xtime.tv_nsec / 1000;
/* i386 does some correction here to keep the clock
monotonus even when ntpd is fixing drift.
monotonous even when ntpd is fixing drift.
But they didn't work for me, there is a non monotonic
clock anyways with ntp.
I dropped all corrections now until a real solution can
......@@ -214,7 +217,7 @@ static void set_rtc_mmss(unsigned long nowtime)
* overflow. This avoids messing with unknown time zones but requires your RTC
* not to be off by more than 15 minutes. Since we're calling it only when
* our clock is externally synchronized using NTP, this shouldn't be a problem.
*/
*/
real_seconds = nowtime % 60;
real_minutes = nowtime / 60;
......@@ -297,15 +300,15 @@ EXPORT_SYMBOL(monotonic_clock);
static irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
static unsigned long rtc_update = 0;
unsigned long tsc, lost = 0;
int delay, offset = 0;
unsigned long tsc;
int delay, offset = 0, lost = 0;
/*
* Here we are in the timer irq handler. We have irqs locally disabled (so we
* don't need spin_lock_irqsave()) but we don't know if the timer_bh is running
* on the other CPU, so we need a lock. We also need to lock the vsyscall
* variables, because both do_timer() and us change them -arca+vojtech
*/
*/
write_seqlock(&xtime_lock);
......@@ -354,12 +357,29 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
(((long) offset << 32) / vxtime.tsc_quot) - 1;
}
if (lost) {
if (lost > 0) {
static long lost_count;
if (report_lost_ticks) {
printk(KERN_WARNING "time.c: Lost %ld timer "
printk(KERN_WARNING "time.c: Lost %d timer "
"tick(s)! ", lost);
print_symbol("rip %s)\n", regs->rip);
}
if (lost_count == 100) {
printk(KERN_WARNING
"warning: many lost ticks.\n"
KERN_WARNING "Your time source seems to be unstable or some driver is hogging interrupts\n");
print_symbol("rip %s\n", regs->rip);
lost_count = 0;
} else
lost_count++;
if ((lost_count % 25) == 0) {
#ifdef CONFIG_CPU_FREQ
cpufreq_delayed_get();
#endif
}
jiffies += lost;
}
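The lost-tick bookkeeping above warns once the counter reaches 100 and asks for a CPU frequency re-check whenever the counter is a multiple of 25. The sketch below models only that counting logic in user space; `struct lost_tick_state` and `note_lost_ticks` are hypothetical names, and the printk and cpufreq calls are replaced by counters.

```c
/* Hypothetical model of the static lost_count and its side effects. */
struct lost_tick_state {
	long lost_count;	/* mirrors the static counter */
	int warnings;		/* times the "many lost ticks" path ran */
	int freq_checks;	/* times cpufreq_delayed_get() would run */
};

/* Call once for every timer interrupt that detected lost ticks. */
void note_lost_ticks(struct lost_tick_state *s)
{
	if (s->lost_count == 100) {
		s->warnings++;
		s->lost_count = 0;
	} else {
		s->lost_count++;
	}

	/* 0 is a multiple of 25 too, so a reset also triggers a check. */
	if ((s->lost_count % 25) == 0)
		s->freq_checks++;
}
```

Under this model the warning fires on the 101st event, and the frequency check fires on events 25, 50, 75 and 100 and again right after the reset.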
......@@ -509,6 +529,34 @@ unsigned long get_cmos_time(void)
Should fix up last_tsc too. Currently gettimeofday in the
first tick after the change will be slightly wrong. */
#include <linux/workqueue.h>
static unsigned int cpufreq_delayed_issched = 0;
static unsigned int cpufreq_init = 0;
static struct work_struct cpufreq_delayed_get_work;
static void handle_cpufreq_delayed_get(void *v)
{
unsigned int cpu;
for_each_online_cpu(cpu) {
cpufreq_get(cpu);
}
cpufreq_delayed_issched = 0;
}
/* If we notice lost ticks, schedule a call to cpufreq_get(), which
* verifies that the CPU frequency the timing core assumes is still
* correct.
*/
static void cpufreq_delayed_get(void)
{
if (cpufreq_init && !cpufreq_delayed_issched) {
cpufreq_delayed_issched = 1;
printk(KERN_DEBUG "Losing some ticks... checking if CPU frequency changed.\n");
schedule_work(&cpufreq_delayed_get_work);
}
}
static unsigned int ref_freq = 0;
static unsigned long loops_per_jiffy_ref = 0;
......@@ -518,14 +566,18 @@ static int time_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
void *data)
{
struct cpufreq_freqs *freq = data;
unsigned long *lpj;
unsigned long *lpj, dummy;
lpj = &dummy;
if (!(freq->flags & CPUFREQ_CONST_LOOPS))
#ifdef CONFIG_SMP
lpj = &cpu_data[freq->cpu].loops_per_jiffy;
#else
lpj = &boot_cpu_data.loops_per_jiffy;
#endif
if (!ref_freq) {
ref_freq = freq->old;
loops_per_jiffy_ref = *lpj;
......@@ -538,7 +590,8 @@ static int time_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
cpufreq_scale(loops_per_jiffy_ref, ref_freq, freq->new);
cpu_khz = cpufreq_scale(cpu_khz_ref, ref_freq, freq->new);
vxtime.tsc_quot = (1000L << 32) / cpu_khz;
if (!(freq->flags & CPUFREQ_CONST_LOOPS))
vxtime.tsc_quot = (1000L << 32) / cpu_khz;
}
set_cyc2ns_scale(cpu_khz_ref / 1000);
......@@ -549,6 +602,18 @@ static int time_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
static struct notifier_block time_cpufreq_notifier_block = {
.notifier_call = time_cpufreq_notifier
};
static int __init cpufreq_tsc(void)
{
INIT_WORK(&cpufreq_delayed_get_work, handle_cpufreq_delayed_get, NULL);
if (!cpufreq_register_notifier(&time_cpufreq_notifier_block,
CPUFREQ_TRANSITION_NOTIFIER))
cpufreq_init = 1;
return 0;
}
core_initcall(cpufreq_tsc);
#endif
/*
......@@ -725,8 +790,8 @@ void __init time_init(void)
cpu_khz = hpet_calibrate_tsc();
timename = "HPET";
} else {
pit_init();
cpu_khz = pit_calibrate_tsc();
pit_init();
cpu_khz = pit_calibrate_tsc();
timename = "PIT";
}
......@@ -742,11 +807,6 @@ void __init time_init(void)
setup_irq(0, &irq0);
set_cyc2ns_scale(cpu_khz / 1000);
#ifdef CONFIG_CPU_FREQ
cpufreq_register_notifier(&time_cpufreq_notifier_block,
CPUFREQ_TRANSITION_NOTIFIER);
#endif
}
void __init time_init_smp(void)
......@@ -1038,6 +1098,8 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
}
#endif
static int __init nohpet_setup(char *s)
{
nohpet = 1;
......@@ -352,7 +352,7 @@ void __die(const char * str, struct pt_regs * regs, long err)
#ifdef CONFIG_DEBUG_PAGEALLOC
printk("DEBUG_PAGEALLOC");
#endif
printk("\n");
printk("\n");
notify_die(DIE_OOPS, (char *)str, regs, err, 255, SIGSEGV);
show_registers(regs);
/* Executive summary in case the oops scrolled away */
......@@ -513,7 +513,7 @@ asmlinkage void do_general_protection(struct pt_regs * regs, long error_code)
tsk->thread.error_code = error_code;
tsk->thread.trap_no = 13;
force_sig(SIGSEGV, tsk);
return;
return;
}
/* kernel gp */
......@@ -670,7 +670,6 @@ asmlinkage void *do_debug(struct pt_regs * regs, unsigned long error_code)
return regs;
clear_TF_reenable:
printk("clear_tf_reenable\n");
set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
clear_TF:
......@@ -681,16 +680,43 @@ asmlinkage void *do_debug(struct pt_regs * regs, unsigned long error_code)
return regs;
}
static int kernel_math_error(struct pt_regs *regs, char *str)
{
const struct exception_table_entry *fixup;
fixup = search_exception_tables(regs->rip);
if (fixup) {
regs->rip = fixup->fixup;
return 1;
}
notify_die(DIE_GPF, str, regs, 0, 16, SIGFPE);
#if 0
/* This should be a die, but warn only for now */
die(str, regs, 0);
#else
printk(KERN_DEBUG "%s: %s at ", current->comm, str);
printk_address(regs->rip);
printk("\n");
#endif
return 0;
}
/*
* Note that we play around with the 'TS' bit in an attempt to get
* the correct behaviour even in the presence of the asynchronous
* IRQ13 behaviour
*/
void math_error(void __user *rip)
asmlinkage void do_coprocessor_error(struct pt_regs *regs)
{
void __user *rip = (void __user *)(regs->rip);
struct task_struct * task;
siginfo_t info;
unsigned short cwd, swd;
conditional_sti(regs);
if ((regs->cs & 3) == 0 &&
kernel_math_error(regs, "kernel x87 math error"))
return;
/*
* Save the info for the exception handler and clear the error.
*/
......@@ -740,23 +766,23 @@ void math_error(void __user *rip)
force_sig_info(SIGFPE, &info, task);
}
asmlinkage void do_coprocessor_error(struct pt_regs * regs)
{
conditional_sti(regs);
math_error((void __user *)regs->rip);
}
asmlinkage void bad_intr(void)
{
printk("bad interrupt");
}
static inline void simd_math_error(void __user *rip)
asmlinkage void do_simd_coprocessor_error(struct pt_regs *regs)
{
void __user *rip = (void __user *)(regs->rip);
struct task_struct * task;
siginfo_t info;
unsigned short mxcsr;
conditional_sti(regs);
if ((regs->cs & 3) == 0 &&
kernel_math_error(regs, "simd math error"))
return;
/*
* Save the info for the exception handler and clear the error.
*/
......@@ -799,12 +825,6 @@ static inline void simd_math_error(void __user *rip)
force_sig_info(SIGFPE, &info, task);
}
asmlinkage void do_simd_coprocessor_error(struct pt_regs * regs)
{
conditional_sti(regs);
simd_math_error((void __user *)regs->rip);
}
asmlinkage void do_spurious_interrupt_bug(struct pt_regs * regs)
{
}
......@@ -108,11 +108,8 @@ EXPORT_SYMBOL(pcibios_penalize_isa_irq);
EXPORT_SYMBOL(pci_mem_start);
#endif
#ifdef CONFIG_X86_USE_3DNOW
EXPORT_SYMBOL(_mmx_memcpy);
EXPORT_SYMBOL(mmx_clear_page);
EXPORT_SYMBOL(mmx_copy_page);
#endif
EXPORT_SYMBOL(copy_page);
EXPORT_SYMBOL(clear_page);
EXPORT_SYMBOL(cpu_pda);
#ifdef CONFIG_SMP
......@@ -182,10 +179,17 @@ EXPORT_SYMBOL_NOVERS(memcpy);
EXPORT_SYMBOL_NOVERS(__memcpy);
EXPORT_SYMBOL_NOVERS(memcmp);
/* syscall export needed for misdesigned sound drivers. */
EXPORT_SYMBOL(sys_read);
EXPORT_SYMBOL(sys_lseek);
EXPORT_SYMBOL(sys_open);
#ifdef CONFIG_RWSEM_XCHGADD_ALGORITHM
/* prototypes are wrong, these are assembly with custom calling conventions */
extern void rwsem_down_read_failed_thunk(void);
extern void rwsem_wake_thunk(void);
extern void rwsem_downgrade_thunk(void);
extern void rwsem_down_write_failed_thunk(void);
EXPORT_SYMBOL(rwsem_down_read_failed_thunk);
EXPORT_SYMBOL(rwsem_wake_thunk);
EXPORT_SYMBOL(rwsem_downgrade_thunk);
EXPORT_SYMBOL(rwsem_down_write_failed_thunk);
#endif
EXPORT_SYMBOL(empty_zero_page);
......@@ -211,10 +215,9 @@ EXPORT_SYMBOL(init_level4_pgt);
extern unsigned long __supported_pte_mask;
EXPORT_SYMBOL(__supported_pte_mask);
EXPORT_SYMBOL(clear_page);
#ifdef CONFIG_SMP
EXPORT_SYMBOL(flush_tlb_page);
EXPORT_SYMBOL_GPL(flush_tlb_all);
#endif
EXPORT_SYMBOL(cpu_khz);
......@@ -8,7 +8,7 @@ obj-y := io.o
lib-y := csum-partial.o csum-copy.o csum-wrappers.o delay.o \
usercopy.o getuser.o putuser.o \
thunk.o clear_page.o copy_page.o bitstr.o
thunk.o clear_page.o copy_page.o bitstr.o bitops.o
lib-y += memcpy.o memmove.o memset.o copy_user.o
lib-$(CONFIG_HAVE_DEC_LOCK) += dec_and_lock.o
#include <linux/module.h>
#include <asm/bitops.h>
#undef find_first_zero_bit
#undef find_next_zero_bit
#undef find_first_bit
#undef find_next_bit
/**
* find_first_zero_bit - find the first zero bit in a memory region
* @addr: The address to start the search at
* @size: The maximum size to search
*
* Returns the bit-number of the first zero bit, not the number of the byte
* containing a bit.
*/
inline long find_first_zero_bit(const unsigned long * addr, unsigned long size)
{
long d0, d1, d2;
long res;
if (!size)
return 0;
asm volatile(
" repe; scasq\n"
" je 1f\n"
" xorq -8(%%rdi),%%rax\n"
" subq $8,%%rdi\n"
" bsfq %%rax,%%rdx\n"
"1: subq %[addr],%%rdi\n"
" shlq $3,%%rdi\n"
" addq %%rdi,%%rdx"
:"=d" (res), "=&c" (d0), "=&D" (d1), "=&a" (d2)
:"0" (0ULL), "1" ((size + 63) >> 6), "2" (addr), "3" (-1ULL),
[addr] "r" (addr) : "memory");
return res;
}
/**
* find_next_zero_bit - find the next zero bit in a memory region
* @addr: The address to base the search on
* @offset: The bitnumber to start searching at
* @size: The maximum size to search
*/
long find_next_zero_bit (const unsigned long * addr, long size, long offset)
{
unsigned long * p = ((unsigned long *) addr) + (offset >> 6);
unsigned long set = 0;
unsigned long res, bit = offset&63;
if (bit) {
/*
* Look for zero in first word
*/
asm("bsfq %1,%0\n\t"
"cmoveq %2,%0"
: "=r" (set)
: "r" (~(*p >> bit)), "r"(64L));
if (set < (64 - bit))
return set + offset;
set = 64 - bit;
p++;
}
/*
* No zero yet, search remaining full words for a zero
*/
res = find_first_zero_bit ((const unsigned long *)p,
size - 64 * (p - (unsigned long *) addr));
return (offset + set + res);
}
static inline long
__find_first_bit(const unsigned long * addr, unsigned long size)
{
long d0, d1;
long res;
asm volatile(
" repe; scasq\n"
" jz 1f\n"
" subq $8,%%rdi\n"
" bsfq (%%rdi),%%rax\n"
"1: subq %[addr],%%rdi\n"
" shlq $3,%%rdi\n"
" addq %%rdi,%%rax"
:"=a" (res), "=&c" (d0), "=&D" (d1)
:"0" (0ULL),
"1" ((size + 63) >> 6), "2" (addr),
[addr] "r" (addr) : "memory");
return res;
}
/**
* find_first_bit - find the first set bit in a memory region
* @addr: The address to start the search at
* @size: The maximum size to search
*
* Returns the bit-number of the first set bit, not the number of the byte
* containing a bit.
*/
long find_first_bit(const unsigned long * addr, unsigned long size)
{
return __find_first_bit(addr,size);
}
/**
* find_next_bit - find the next set bit in a memory region
* @addr: The address to base the search on
* @offset: The bitnumber to start searching at
* @size: The maximum size to search
*/
long find_next_bit(const unsigned long * addr, long size, long offset)
{
const unsigned long * p = addr + (offset >> 6);
unsigned long set = 0, bit = offset & 63, res;
if (bit) {
/*
* Look for nonzero in the first 64 bits:
*/
asm("bsfq %1,%0\n\t"
"cmoveq %2,%0\n\t"
: "=r" (set)
: "r" (*p >> bit), "r" (64L));
if (set < (64 - bit))
return set + offset;
set = 64 - bit;
p++;
}
/*
* No set bit yet, search remaining full words for a bit
*/
res = __find_first_bit (p, size - 64 * (p - addr));
return (offset + set + res);
}
EXPORT_SYMBOL(find_next_bit);
EXPORT_SYMBOL(find_first_bit);
EXPORT_SYMBOL(find_first_zero_bit);
EXPORT_SYMBOL(find_next_zero_bit);
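For comparison with the assembly above, here is a portable reference sketch of what find_next_bit() is meant to compute: the index of the first set bit at or after `offset`. `ref_find_next_bit` and `REF_BITS_PER_LONG` are hypothetical names. One difference is an assumption read off the code above rather than a documented contract: this sketch returns exactly `size` when nothing is found, while the assembly version appears to return a value rounded up to a whole word, so callers should compare with `>= size`.

```c
#include <limits.h>

#define REF_BITS_PER_LONG ((long)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Return the index of the first set bit at or after offset,
 * or size if no bit is set in [offset, size).
 */
long ref_find_next_bit(const unsigned long *addr, long size, long offset)
{
	long i;

	for (i = offset; i < size; i++) {
		unsigned long word = addr[i / REF_BITS_PER_LONG];

		if (word & (1UL << (i % REF_BITS_PER_LONG)))
			return i;
	}
	return size;
}
```

A bit-at-a-time loop like this is far slower than the scasq/bsfq version, but it is convenient as a test oracle when checking an optimized implementation.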
#include <linux/module.h>
#include <asm/bitops.h>
/* Find string of zero bits in a bitmap */
......@@ -23,3 +24,5 @@ find_next_zero_string(unsigned long *bitmap, long start, long nbits, int len)
}
return n;
}
EXPORT_SYMBOL(find_next_zero_string);
......@@ -10,18 +10,10 @@ void *memmove(void * dest,const void *src,size_t count)
if (dest < src) {
__inline_memcpy(dest,src,count);
} else {
/* Could be more clever and move longs */
unsigned long d0, d1, d2;
__asm__ __volatile__(
"std\n\t"
"rep\n\t"
"movsb\n\t"
"cld"
: "=&c" (d0), "=&S" (d1), "=&D" (d2)
:"0" (count),
"1" (count-1+(const char *)src),
"2" (count-1+(char *)dest)
:"memory");
char *p = (char *) dest + count;
char *s = (char *) src + count;
while (count--)
*--p = *--s;
}
return dest;
}
......@@ -58,16 +58,17 @@ void bust_spinlocks(int yes)
/* Sometimes the CPU reports invalid exceptions on prefetch.
Check that here and ignore.
Opcode checker based on code by Richard Brunner */
static int is_prefetch(struct pt_regs *regs, unsigned long addr)
static noinline int is_prefetch(struct pt_regs *regs, unsigned long addr,
unsigned long error_code)
{
unsigned char *instr = (unsigned char *)(regs->rip);
int scan_more = 1;
int prefetch = 0;
unsigned char *max_instr = instr + 15;
/* Avoid recursive faults for this common case */
if (regs->rip == addr)
return 0;
/* If it was an exec fault, ignore */
if (error_code & (1<<4))
return 0;
/* Code segments in LDT could have a non zero base. Don't check
when that's possible */
......@@ -218,6 +219,18 @@ int unhandled_signal(struct task_struct *tsk, int sig)
(tsk->sighand->action[sig-1].sa.sa_handler == SIG_DFL);
}
static noinline void pgtable_bad(unsigned long address, struct pt_regs *regs,
unsigned long error_code)
{
oops_begin();
printk(KERN_ALERT "%s: Corrupted page table at address %lx\n",
current->comm, address);
dump_pagetable(address);
__die("Bad pagetable", regs, error_code);
oops_end();
do_exit(SIGKILL);
}
int page_fault_trace;
int exception_trace = 1;
......@@ -268,11 +281,32 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
mm = tsk->mm;
info.si_code = SEGV_MAPERR;
/* 5 => page not present and from supervisor mode */
if (unlikely(!(error_code & 5) &&
((address >= VMALLOC_START && address <= VMALLOC_END) ||
(address >= MODULES_VADDR && address <= MODULES_END))))
goto vmalloc_fault;
/*
* We fault-in kernel-space virtual memory on-demand. The
* 'reference' page table is init_mm.pgd.
*
* NOTE! We MUST NOT take any locks for this case. We may
* be in an interrupt or a critical region, and should
* only copy the information from the master page table,
* nothing more.
*
* This verifies that the fault happens in kernel space
* (error_code & 4) == 0, and that the fault was not a
* protection error (error_code & 1) == 0.
*/
if (unlikely(address >= TASK_SIZE)) {
if (!(error_code & 5))
goto vmalloc_fault;
/*
* Don't take the mm semaphore here. If we fixup a prefetch
* fault we could otherwise deadlock.
*/
goto bad_area_nosemaphore;
}
if (unlikely(error_code & (1 << 3)))
goto page_table_corruption;
/*
* If we're in an interrupt or have no user
......@@ -351,18 +385,18 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
bad_area_nosemaphore:
#ifdef CONFIG_IA32_EMULATION
/* 32bit vsyscall. map on demand. */
if (test_thread_flag(TIF_IA32) &&
/* 32bit vsyscall. map on demand. */
if (test_thread_flag(TIF_IA32) &&
address >= 0xffffe000 && address < 0xffffe000 + PAGE_SIZE) {
if (map_syscall32(mm, address) < 0)
goto out_of_memory2;
return;
}
if (map_syscall32(mm, address) < 0)
goto out_of_memory2;
return;
}
#endif
/* User mode accesses just cause a SIGSEGV */
if (error_code & 4) {
if (is_prefetch(regs, address))
if (is_prefetch(regs, address, error_code))
return;
/* Work around K8 erratum #100 K8 in compat mode
......@@ -376,7 +410,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
return;
if (exception_trace && unhandled_signal(tsk, SIGSEGV)) {
printk(KERN_INFO
printk(KERN_INFO
"%s[%d]: segfault at %016lx rip %016lx rsp %016lx error %lx\n",
tsk->comm, tsk->pid, address, regs->rip,
regs->rsp, error_code);
......@@ -407,7 +441,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
* Hall of shame of CPU/BIOS bugs.
*/
if (is_prefetch(regs, address))
if (is_prefetch(regs, address, error_code))
return;
if (is_errata93(regs, address))
......@@ -481,10 +515,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
* is really there and when yes flush the local TLB.
*/
pgd = pgd_offset_k(address);
if (pgd != current_pgd_offset_k(address))
BUG();
if (!pgd_present(*pgd))
goto bad_area_nosemaphore;
goto bad_area_nosemaphore;
pmd = pmd_offset(pgd, address);
if (!pmd_present(*pmd))
goto bad_area_nosemaphore;
......@@ -495,4 +527,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
__flush_tlb_all();
return;
}
page_table_corruption:
pgtable_bad(address, regs, error_code);
}
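The fault.c changes above dispatch on individual bits of the x86 page-fault error code: bit 0 (protection fault), bit 2 (user mode), bit 3 (reserved bit set in a page table entry) and bit 4 (instruction fetch). The sketch below restates the early dispatch in do_page_fault() as a pure function. The `PF_*` macros, the `enum fault_path` values and `classify_fault` are hypothetical names used only for this illustration.

```c
/* Illustrative names for the x86 page-fault error code bits. */
#define PF_PROT  (1UL << 0)	/* protection fault, page was present */
#define PF_WRITE (1UL << 1)	/* write access */
#define PF_USER  (1UL << 2)	/* fault came from user mode */
#define PF_RSVD  (1UL << 3)	/* reserved bit set in a page table entry */
#define PF_INSTR (1UL << 4)	/* instruction fetch */

enum fault_path {
	PATH_VMALLOC,		/* goto vmalloc_fault */
	PATH_BAD_AREA,		/* goto bad_area_nosemaphore */
	PATH_CORRUPT,		/* goto page_table_corruption */
	PATH_NORMAL		/* continue with the regular vma lookup */
};

/* Mirror the order of the checks in the patched do_page_fault(). */
enum fault_path classify_fault(unsigned long address,
			       unsigned long error_code,
			       unsigned long task_size)
{
	if (address >= task_size) {
		/* Kernel-mode, not-present fault: error_code & 5 == 0. */
		if (!(error_code & (PF_PROT | PF_USER)))
			return PATH_VMALLOC;
		return PATH_BAD_AREA;
	}
	if (error_code & PF_RSVD)
		return PATH_CORRUPT;
	return PATH_NORMAL;
}
```

Passing `task_size` as a parameter keeps the sketch independent of the real TASK_SIZE definition.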
......@@ -41,6 +41,10 @@
#define Dprintk(x...)
#endif
#ifdef CONFIG_GART_IOMMU
extern int swiotlb;
#endif
extern char _stext[];
DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
......@@ -396,6 +400,8 @@ static inline int page_is_ram (unsigned long pagenr)
return 0;
}
extern int swiotlb_force;
static struct kcore_list kcore_mem, kcore_vmalloc, kcore_kernel, kcore_modules,
kcore_vsyscall;
......@@ -405,7 +411,10 @@ void __init mem_init(void)
int tmp;
#ifdef CONFIG_SWIOTLB
if (!iommu_aperture && end_pfn >= 0xffffffff>>PAGE_SHIFT)
if (swiotlb_force)
swiotlb = 1;
if (!iommu_aperture &&
(end_pfn >= 0xffffffff>>PAGE_SHIFT || force_iommu))
swiotlb = 1;
if (swiotlb)
swiotlb_init();
......@@ -596,7 +605,16 @@ static struct vm_area_struct gate32_vma = {
struct vm_area_struct *get_gate_vma(struct task_struct *tsk)
{
return test_tsk_thread_flag(tsk, TIF_IA32) ? &gate32_vma : &gate_vma;
#ifdef CONFIG_IA32_EMULATION
if (test_tsk_thread_flag(tsk, TIF_IA32)) {
/* lookup code assumes the pages are present. set them up
now */
if (map_syscall32(tsk->mm, 0xffffe000) < 0)
return NULL;
return &gate32_vma;
}
#endif
return &gate_vma;
}
int in_gate_area(struct task_struct *task, unsigned long addr)
......@@ -3,7 +3,7 @@
#
# Reuse the i386 PCI subsystem
#
CFLAGS += -I arch/i386/pci
CFLAGS += -Iarch/i386/pci
obj-y := i386.o
obj-$(CONFIG_PCI_DIRECT)+= direct.o
......@@ -13,6 +13,8 @@ obj-y += legacy.o irq.o common.o
# mmconfig has a 64bit special
obj-$(CONFIG_PCI_MMCONFIG) += mmconfig.o
obj-$(CONFIG_NUMA) += k8-bus.o
direct-y += ../../i386/pci/direct.o
acpi-y += ../../i386/pci/acpi.o
legacy-y += ../../i386/pci/legacy.o
#
# Makefile for X86_64 specific PCI routines
#
# Reuse the i386 PCI subsystem
#
CFLAGS += -I arch/i386/pci
obj-y := i386.o
obj-$(CONFIG_PCI_DIRECT)+= direct.o
obj-y += fixup.o
obj-$(CONFIG_ACPI_PCI) += acpi.o
obj-y += legacy.o irq.o common.o
# mmconfig has a 64bit special
obj-$(CONFIG_PCI_MMCONFIG) += mmconfig.o
direct-y += ../../i386/pci/direct.o
acpi-y += ../../i386/pci/acpi.o
legacy-y += ../../i386/pci/legacy.o
irq-y += ../../i386/pci/irq.o
common-y += ../../i386/pci/common.o
fixup-y += ../../i386/pci/fixup.o
i386-y += ../../i386/pci/i386.o
#include <linux/init.h>
#include <linux/pci.h>
#include <asm/mpspec.h>
#include <linux/cpumask.h>
/*
* This discovers the pcibus <-> node mapping on AMD K8.
*
* RED-PEN need to call this again on PCI hotplug
* RED-PEN empty cpus get reported wrong
*/
#define NODE_ID_REGISTER 0x60
#define NODE_ID(dword) (dword & 0x07)
#define LDT_BUS_NUMBER_REGISTER_0 0x94
#define LDT_BUS_NUMBER_REGISTER_1 0xB4
#define LDT_BUS_NUMBER_REGISTER_2 0xD4
#define NR_LDT_BUS_NUMBER_REGISTERS 3
#define SECONDARY_LDT_BUS_NUMBER(dword) ((dword >> 8) & 0xFF)
#define SUBORDINATE_LDT_BUS_NUMBER(dword) ((dword >> 16) & 0xFF)
#define PCI_DEVICE_ID_K8HTCONFIG 0x1100
/**
* fill_mp_bus_to_cpumask()
* fills the pci_bus_to_cpumask array according to the LDT Bus Number
* Registers found in the K8 northbridge
*/
__init static int
fill_mp_bus_to_cpumask(void)
{
struct pci_dev *nb_dev = NULL;
int i, j;
u32 ldtbus, nid;
static int lbnr[3] = {
LDT_BUS_NUMBER_REGISTER_0,
LDT_BUS_NUMBER_REGISTER_1,
LDT_BUS_NUMBER_REGISTER_2
};
while ((nb_dev = pci_get_device(PCI_VENDOR_ID_AMD,
PCI_DEVICE_ID_K8HTCONFIG, nb_dev))) {
pci_read_config_dword(nb_dev, NODE_ID_REGISTER, &nid);
for (i = 0; i < NR_LDT_BUS_NUMBER_REGISTERS; i++) {
pci_read_config_dword(nb_dev, lbnr[i], &ldtbus);
/*
* if there are no busses hanging off of the current
* ldt link then both the secondary and subordinate
* bus number fields are set to 0.
*/
if (!(SECONDARY_LDT_BUS_NUMBER(ldtbus) == 0
&& SUBORDINATE_LDT_BUS_NUMBER(ldtbus) == 0)) {
for (j = SECONDARY_LDT_BUS_NUMBER(ldtbus);
j <= SUBORDINATE_LDT_BUS_NUMBER(ldtbus);
j++)
pci_bus_to_cpumask[j] =
node_to_cpumask(NODE_ID(nid));
}
}
}
/* quick sanity check */
for (i = 0; i < 256; i++) {
if (cpus_empty(pci_bus_to_cpumask[i])) {
printk(KERN_ERR
"k8-bus.c: bus %i has empty cpu mask\n", i);
pci_bus_to_cpumask[i] = CPU_MASK_ALL;
}
}
return 0;
}
fs_initcall(fill_mp_bus_to_cpumask);
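The inner loop of fill_mp_bus_to_cpumask() extracts the secondary and subordinate bus numbers from an LDT bus number register and assigns every bus in that inclusive range to the node. The user-space sketch below shows only that decoding step. `assign_buses` and the plain `int` node table are hypothetical simplifications of the cpumask array, and the register value used in the usage note is made up.

```c
#include <stdint.h>

/* Same field layout as the macros in k8-bus.c. */
#define SECONDARY_LDT_BUS_NUMBER(dword)   (((dword) >> 8) & 0xFF)
#define SUBORDINATE_LDT_BUS_NUMBER(dword) (((dword) >> 16) & 0xFF)

/*
 * Record node for every bus behind one LDT link.
 * bus_to_node must have 256 entries; returns the number of buses set.
 */
int assign_buses(uint32_t ldtbus, int node, int bus_to_node[256])
{
	unsigned int sec = SECONDARY_LDT_BUS_NUMBER(ldtbus);
	unsigned int sub = SUBORDINATE_LDT_BUS_NUMBER(ldtbus);
	unsigned int j;
	int count = 0;

	/* Both fields zero means no buses hang off this link. */
	if (sec == 0 && sub == 0)
		return 0;

	for (j = sec; j <= sub; j++) {
		bus_to_node[j] = node;
		count++;
	}
	return count;
}
```

For a register value of 0x00050200 the secondary bus is 2 and the subordinate bus is 5, so buses 2 through 5 are assigned to the node.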
......@@ -99,6 +99,11 @@ __acpi_release_global_lock (unsigned int *lock)
:"=r"(n_hi), "=r"(n_lo) \
:"0"(n_hi), "1"(n_lo))
/*
* Refer to the Intel ACPI _PDC support document for bit definitions
*/
#define ACPI_PDC_EST_CAPABILITY_SMP 0xa
#define ACPI_PDC_EST_CAPABILITY_MSR 0x1
#ifdef CONFIG_ACPI_BOOT
extern int acpi_lapic;
......
......@@ -25,10 +25,10 @@
* Note that @nr may be almost arbitrarily large; this function is not
* restricted to acting on a single-word quantity.
*/
static __inline__ void set_bit(long nr, volatile void * addr)
static __inline__ void set_bit(int nr, volatile void * addr)
{
__asm__ __volatile__( LOCK_PREFIX
"btsq %1,%0"
"btsl %1,%0"
:"=m" (ADDR)
:"dIr" (nr) : "memory");
}
......@@ -254,128 +254,37 @@ static __inline__ int variable_test_bit(int nr, volatile const void * addr)
#undef ADDR
/**
* find_first_zero_bit - find the first zero bit in a memory region
* @addr: The address to start the search at
* @size: The maximum size to search
*
* Returns the bit-number of the first zero bit, not the number of the byte
* containing a bit.
*/
static __inline__ int find_first_zero_bit(const unsigned long * addr, unsigned size)
{
int d0, d1, d2;
int res;
if (!size)
return 0;
__asm__ __volatile__(
"movl $-1,%%eax\n\t"
"xorl %%edx,%%edx\n\t"
"repe; scasl\n\t"
"je 1f\n\t"
"xorl -4(%%rdi),%%eax\n\t"
"subq $4,%%rdi\n\t"
"bsfl %%eax,%%edx\n"
"1:\tsubq %%rbx,%%rdi\n\t"
"shlq $3,%%rdi\n\t"
"addq %%rdi,%%rdx"
:"=d" (res), "=&c" (d0), "=&D" (d1), "=&a" (d2)
:"1" ((size + 31) >> 5), "2" (addr), "b" (addr) : "memory");
return res;
}
extern long find_first_zero_bit(const unsigned long * addr, unsigned long size);
extern long find_next_zero_bit (const unsigned long * addr, long size, long offset);
extern long find_first_bit(const unsigned long * addr, unsigned long size);
extern long find_next_bit(const unsigned long * addr, long size, long offset);
/**
* find_next_zero_bit - find the first zero bit in a memory region
* @addr: The address to base the search on
* @offset: The bitnumber to start searching at
* @size: The maximum size to search
*/
static __inline__ int find_next_zero_bit (const unsigned long * addr, int size, int offset)
/* return index of first bit set in val or max when no bit is set */
static inline unsigned long __scanbit(unsigned long val, unsigned long max)
{
unsigned long * p = ((unsigned long *) addr) + (offset >> 6);
unsigned long set = 0;
unsigned long res, bit = offset&63;
if (bit) {
/*
* Look for zero in first word
*/
__asm__("bsfq %1,%0\n\t"
"cmoveq %2,%0"
: "=r" (set)
: "r" (~(*p >> bit)), "r"(64L));
if (set < (64 - bit))
return set + offset;
set = 64 - bit;
p++;
}
/*
* No zero yet, search remaining full words for a zero
*/
res = find_first_zero_bit ((const unsigned long *)p, size - 64 * (p - (unsigned long *) addr));
return (offset + set + res);
asm("bsfq %1,%0 ; cmovz %2,%0" : "=&r" (val) : "r" (val), "r" (max));
return val;
}
#define find_first_bit(addr,size) \
((__builtin_constant_p(size) && size <= BITS_PER_LONG ? \
(__scanbit(*(unsigned long *)addr,(size))) : \
find_first_bit(addr,size)))
/**
* find_first_bit - find the first set bit in a memory region
* @addr: The address to start the search at
* @size: The maximum size to search
*
* Returns the bit-number of the first set bit, not the number of the byte
* containing a bit.
*/
static __inline__ int find_first_bit(const unsigned long * addr, unsigned size)
{
int d0, d1;
int res;
/* This looks at memory. Mark it volatile to tell gcc not to move it around */
__asm__ __volatile__(
"xorl %%eax,%%eax\n\t"
"repe; scasl\n\t"
"jz 1f\n\t"
"leaq -4(%%rdi),%%rdi\n\t"
"bsfl (%%rdi),%%eax\n"
"1:\tsubq %%rbx,%%rdi\n\t"
"shll $3,%%edi\n\t"
"addl %%edi,%%eax"
:"=a" (res), "=&c" (d0), "=&D" (d1)
:"1" ((size + 31) >> 5), "2" (addr), "b" (addr) : "memory");
return res;
}
#define find_next_bit(addr,size,off) \
((__builtin_constant_p(size) && size <= BITS_PER_LONG ? \
((off) + (__scanbit((*(unsigned long *)addr) >> (off),(size)-(off)))) : \
find_next_bit(addr,size,off)))
/**
* find_next_bit - find the first set bit in a memory region
* @addr: The address to base the search on
* @offset: The bitnumber to start searching at
* @size: The maximum size to search
*/
static __inline__ int find_next_bit(const unsigned long * addr, int size, int offset)
{
const unsigned long * p = addr + (offset >> 6);
unsigned long set = 0, bit = offset & 63, res;
#define find_first_zero_bit(addr,size) \
((__builtin_constant_p(size) && size <= BITS_PER_LONG ? \
(__scanbit(~*(unsigned long *)addr,(size))) : \
find_first_zero_bit(addr,size)))
if (bit) {
/*
* Look for nonzero in the first 64 bits:
*/
__asm__("bsfq %1,%0\n\t"
"cmoveq %2,%0\n\t"
: "=r" (set)
: "r" (*p >> bit), "r" (64L));
if (set < (64 - bit))
return set + offset;
set = 64 - bit;
p++;
}
/*
* No set bit yet, search remaining full words for a bit
*/
res = find_first_bit (p, size - 64 * (p - addr));
return (offset + set + res);
}
#define find_next_zero_bit(addr,size,off) \
((__builtin_constant_p(size) && size <= BITS_PER_LONG ? \
((off)+(__scanbit(~(((*(unsigned long *)addr)) >> (off)),(size)-(off)))) : \
find_next_zero_bit(addr,size,off)))
/*
* Find string of zero bits in a bitmap. -1 when not found.
......
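The small-bitmap fast path in the find_*_bit macros above reduces to one bit scan with a fallback value. A portable sketch of the same idea follows, assuming a GCC-compatible compiler that provides `__builtin_ctzl`. The names `scanbit_portable` and `find_next_bit_small` are hypothetical; this illustrates the technique and is not the code from the patch.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Index of the lowest set bit in val, or max when val is zero. */
static unsigned long scanbit_portable(unsigned long val, unsigned long max)
{
    return val ? (unsigned long)__builtin_ctzl(val) : max;
}

/*
 * First set bit at or after off in a single-word bitmap of size bits.
 * Returns size when no bit at or after off is set in the word.
 * Like the macro it mirrors, it does not mask off bits above size.
 */
static unsigned long find_next_bit_small(unsigned long word,
                                         unsigned long size,
                                         unsigned long off)
{
    assert(size <= BITS_PER_ULONG && off < size);
    return off + scanbit_portable(word >> off, size - off);
}
```

The zero-bit variants follow by passing the complemented word, as the find_first_zero_bit and find_next_zero_bit macros do.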
#ifndef _X8664_DMA_MAPPING_H
#define _X8664_DMA_MAPPING_H 1
#include <asm-generic/dma-mapping.h>
/*
* IOMMU interface. See Documentation/DMA-mapping.txt and DMA-API.txt for
* documentation.
*/
#include <linux/config.h>
#include <linux/device.h>
#include <asm/scatterlist.h>
#include <asm/io.h>
#include <asm/swiotlb.h>
extern dma_addr_t bad_dma_address;
#define dma_mapping_error(x) \
(swiotlb ? swiotlb_dma_mapping_error(x) : ((x) == bad_dma_address))
void *dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
unsigned gfp);
void dma_free_coherent(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle);
#ifdef CONFIG_GART_IOMMU
extern dma_addr_t dma_map_single(struct device *hwdev, void *ptr, size_t size,
int direction);
extern void dma_unmap_single(struct device *dev, dma_addr_t addr,size_t size,
int direction);
#else
/* No IOMMU */
static inline dma_addr_t dma_map_single(struct device *hwdev, void *ptr,
size_t size, int direction)
{
dma_addr_t addr;
if (direction == DMA_NONE)
out_of_line_bug();
addr = virt_to_bus(ptr);
if ((addr+size) & ~*hwdev->dma_mask)
out_of_line_bug();
return addr;
}
static inline void dma_unmap_single(struct device *hwdev, dma_addr_t dma_addr,
size_t size, int direction)
{
if (direction == DMA_NONE)
out_of_line_bug();
/* Nothing to do */
}
#endif
#define dma_map_page(dev,page,offset,size,dir) \
dma_map_single((dev), page_address(page)+(offset), (size), (dir))
static inline void dma_sync_single_for_cpu(struct device *hwdev,
dma_addr_t dma_handle,
size_t size, int direction)
{
if (direction == DMA_NONE)
out_of_line_bug();
if (swiotlb)
return swiotlb_sync_single_for_cpu(hwdev,dma_handle,size,direction);
flush_write_buffers();
}
static inline void dma_sync_single_for_device(struct device *hwdev,
dma_addr_t dma_handle,
size_t size, int direction)
{
if (direction == DMA_NONE)
out_of_line_bug();
if (swiotlb)
return swiotlb_sync_single_for_device(hwdev,dma_handle,size,direction);
flush_write_buffers();
}
static inline void dma_sync_sg_for_cpu(struct device *hwdev,
struct scatterlist *sg,
int nelems, int direction)
{
if (direction == DMA_NONE)
out_of_line_bug();
if (swiotlb)
return swiotlb_sync_sg_for_cpu(hwdev,sg,nelems,direction);
flush_write_buffers();
}
static inline void dma_sync_sg_for_device(struct device *hwdev,
struct scatterlist *sg,
int nelems, int direction)
{
if (direction == DMA_NONE)
out_of_line_bug();
if (swiotlb)
return swiotlb_sync_sg_for_device(hwdev,sg,nelems,direction);
flush_write_buffers();
}
extern int dma_map_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction);
extern void dma_unmap_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction);
#define dma_unmap_page dma_unmap_single
extern int dma_supported(struct device *hwdev, u64 mask);
extern int dma_get_cache_alignment(void);
#define dma_is_consistent(h) 1
static inline int dma_set_mask(struct device *dev, u64 mask)
{
if (!dev->dma_mask || !dma_supported(dev, mask))
return -EIO;
*dev->dma_mask = mask;
return 0;
}
static inline void dma_cache_sync(void *vaddr, size_t size, enum dma_data_direction dir)
{
flush_write_buffers();
}
#endif
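The no-IOMMU dma_map_single() path above rejects a buffer when the address just past its end has bits outside the device's DMA mask. A minimal user-space sketch of that test follows, assuming 64-bit bus addresses; `fits_dma_mask` is a hypothetical name and not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True when the address one past the end of the buffer has no bits
 * outside dma_mask, mirroring the (addr+size) & ~mask test.
 */
static bool fits_dma_mask(uint64_t addr, size_t size, uint64_t dma_mask)
{
    return ((addr + size) & ~dma_mask) == 0;
}
```

Because the check uses the end address rather than the last byte, a buffer that ends exactly on the mask boundary is also rejected, so the test is slightly conservative.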
......@@ -39,16 +39,25 @@ static inline int need_signal_i387(struct task_struct *me)
* FPU lazy state save handling...
*/
#define kernel_fpu_end() stts()
#define unlazy_fpu(tsk) do { \
if ((tsk)->thread_info->status & TS_USEDFPU) \
save_init_fpu(tsk); \
} while (0)
/* Ignore delayed exceptions from user space */
static inline void tolerant_fwait(void)
{
asm volatile("1: fwait\n"
"2:\n"
" .section __ex_table,\"a\"\n"
" .align 8\n"
" .quad 1b,2b\n"
" .previous\n");
}
#define clear_fpu(tsk) do { \
if ((tsk)->thread_info->status & TS_USEDFPU) { \
asm volatile("fnclex ; fwait"); \
tolerant_fwait(); \
(tsk)->thread_info->status &= ~TS_USEDFPU; \
stts(); \
} \
......@@ -116,6 +125,7 @@ static inline int save_i387_checking(struct i387_fxsave_struct __user *fx)
static inline void kernel_fpu_begin(void)
{
struct thread_info *me = current_thread_info();
preempt_disable();
if (me->status & TS_USEDFPU) {
asm volatile("rex64 ; fxsave %0 ; fnclex"
: "=m" (me->task->thread.i387.fxsave));
......@@ -125,9 +135,15 @@ static inline void kernel_fpu_begin(void)
clts();
}
static inline void kernel_fpu_end(void)
{
stts();
preempt_enable();
}
static inline void save_init_fpu( struct task_struct *tsk )
{
asm volatile( "fxsave %0 ; fnclex"
asm volatile( "rex64 ; fxsave %0 ; fnclex"
: "=m" (tsk->thread.i387.fxsave));
tsk->thread_info->status &= ~TS_USEDFPU;
stts();
......
......@@ -78,12 +78,6 @@ struct stat64 {
unsigned long long st_ino;
} __attribute__((packed));
typedef union sigval32 {
int sival_int;
unsigned int sival_ptr;
} sigval_t32;
typedef struct siginfo32 {
int si_signo;
int si_errno;
......@@ -102,7 +96,7 @@ typedef struct siginfo32 {
struct {
int _tid; /* timer id */
int _overrun; /* overrun count */
sigval_t32 _sigval; /* same as below */
compat_sigval_t _sigval; /* same as below */
int _sys_private; /* not to be passed to user */
int _overrun_incr; /* amount to add to overrun */
} _timer;
......@@ -111,7 +105,7 @@ typedef struct siginfo32 {
struct {
unsigned int _pid; /* sender's pid */
unsigned int _uid; /* sender's uid */
sigval_t32 _sigval;
compat_sigval_t _sigval;
} _rt;
/* SIGCHLD */
......
......@@ -186,10 +186,30 @@ extern void iounmap(void *addr);
#define __raw_readl readl
#define __raw_readq readq
#define writeb(b,addr) (*(volatile unsigned char *) (addr) = (b))
#define writew(b,addr) (*(volatile unsigned short *) (addr) = (b))
#ifdef CONFIG_UNORDERED_IO
static inline void __writel(u32 val, void *addr)
{
volatile u32 *target = addr;
asm volatile("movnti %1,%0"
: "=m" (*target)
: "r" (val) : "memory");
}
static inline void __writeq(u64 val, void *addr)
{
volatile u64 *target = addr;
asm volatile("movnti %1,%0"
: "=m" (*target)
: "r" (val) : "memory");
}
#define writeq(val,addr) __writeq((val),(void *)(addr))
#define writel(val,addr) __writel((val),(void *)(addr))
#else
#define writel(b,addr) (*(volatile unsigned int *) (addr) = (b))
#define writeq(b,addr) (*(volatile unsigned long *) (addr) = (b))
#endif
#define writeb(b,addr) (*(volatile unsigned char *) (addr) = (b))
#define writew(b,addr) (*(volatile unsigned short *) (addr) = (b))
#define __raw_writeb writeb
#define __raw_writew writew
#define __raw_writel writel
......@@ -299,11 +319,8 @@ static inline int isa_check_signature(unsigned long io_addr,
#define flush_write_buffers()
/* Disable vmerge for now. Need to fix the block layer code
to check for non iommu addresses first.
When the IOMMU is forced it is safe to enable. */
extern int iommu_merge;
#define BIO_VMERGE_BOUNDARY (iommu_merge ? 4096 : 0)
extern int iommu_bio_merge;
#define BIO_VMERGE_BOUNDARY iommu_bio_merge
#endif /* __KERNEL__ */
......
......@@ -166,7 +166,6 @@ enum mp_bustype {
};
extern unsigned char mp_bus_id_to_type [MAX_MP_BUSSES];
extern int mp_bus_id_to_pci_bus [MAX_MP_BUSSES];
extern cpumask_t pci_bus_to_cpumask [256];
extern unsigned int boot_cpu_physical_apicid;
extern int smp_found_config;
......
......@@ -71,8 +71,6 @@ struct mtrr_gentry
#ifdef __KERNEL__
extern char *mtrr_strings[MTRR_NUM_TYPES];
/* The following functions are for use by other drivers */
# ifdef CONFIG_MTRR
extern int mtrr_add (unsigned long base, unsigned long size,
......
......@@ -44,81 +44,25 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
#include <asm/io.h>
#include <asm/page.h>
struct pci_dev;
extern int iommu_setup(char *opt);
extern dma_addr_t bad_dma_address;
#define pci_dma_mapping_error(x) ((x) == bad_dma_address)
/* Allocate and map kernel buffer using consistent mode DMA for a device.
* hwdev should be valid struct pci_dev pointer for PCI devices,
* NULL for PCI-like buses (ISA, EISA).
* Returns non-NULL cpu-view pointer to the buffer if successful and
* sets *dma_addrp to the pci side dma address as well, else *dma_addrp
* is undefined.
*/
extern void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
dma_addr_t *dma_handle);
/* Free and unmap a consistent DMA buffer.
* cpu_addr is what was returned from pci_alloc_consistent,
* size must be the same as what was passed into pci_alloc_consistent,
* and likewise dma_addr must be the same as what *dma_addrp was set to.
*
* References to the memory and mappings associated with cpu_addr/dma_addr
* past this call are illegal.
*/
extern void pci_free_consistent(struct pci_dev *hwdev, size_t size,
void *vaddr, dma_addr_t dma_handle);
#ifdef CONFIG_SWIOTLB
extern int swiotlb;
extern dma_addr_t swiotlb_map_single (struct device *hwdev, void *ptr, size_t size,
int dir);
extern void swiotlb_unmap_single (struct device *hwdev, dma_addr_t dev_addr,
size_t size, int dir);
extern void swiotlb_sync_single_for_cpu (struct device *hwdev,
dma_addr_t dev_addr,
size_t size, int dir);
extern void swiotlb_sync_single_for_device (struct device *hwdev,
dma_addr_t dev_addr,
size_t size, int dir);
extern void swiotlb_sync_sg_for_cpu (struct device *hwdev,
struct scatterlist *sg, int nelems,
int dir);
extern void swiotlb_sync_sg_for_device (struct device *hwdev,
struct scatterlist *sg, int nelems,
int dir);
extern int swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction);
extern void swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction);
#endif
#ifdef CONFIG_GART_IOMMU
/* Map a single buffer of the indicated size for DMA in streaming mode.
* The 32-bit bus address to use is returned.
/* The PCI address space does equal the physical memory
* address space. The networking and block device layers use
* this boolean for bounce buffer decisions
*
* Once the device is given the dma address, the device owns this memory
* until either pci_unmap_single or pci_dma_sync_single_for_cpu is performed.
* On AMD64 it mostly equals, but we set it to zero to tell some subsystems
* that an IOMMU is available.
*/
extern dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr, size_t size,
int direction);
void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t addr,
size_t size, int direction);
#define PCI_DMA_BUS_IS_PHYS (no_iommu ? 1 : 0)
/*
* pci_{map,unmap}_single_page maps a kernel page to a dma_addr_t. identical
* to pci_map_single, but takes a struct page instead of a virtual address
* x86-64 always supports DAC, but sometimes it is useful to force
* devices through the IOMMU to get automatic sg list merging.
* Optional right now.
*/
#define pci_map_page(dev,page,offset,size,dir) \
pci_map_single((dev), page_address(page)+(offset), (size), (dir))
extern int iommu_sac_force;
#define pci_dac_dma_supported(pci_dev, mask) (!iommu_sac_force)
#define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME) \
dma_addr_t ADDR_NAME;
......@@ -133,113 +77,12 @@ void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t addr,
#define pci_unmap_len_set(PTR, LEN_NAME, VAL) \
(((PTR)->LEN_NAME) = (VAL))
static inline void pci_dma_sync_single_for_cpu(struct pci_dev *hwdev,
dma_addr_t dma_handle,
size_t size, int direction)
{
BUG_ON(direction == PCI_DMA_NONE);
#ifdef CONFIG_SWIOTLB
if (swiotlb)
return swiotlb_sync_single_for_cpu(&hwdev->dev,dma_handle,size,direction);
#endif
flush_write_buffers();
}
static inline void pci_dma_sync_single_for_device(struct pci_dev *hwdev,
dma_addr_t dma_handle,
size_t size, int direction)
{
BUG_ON(direction == PCI_DMA_NONE);
#ifdef CONFIG_SWIOTLB
if (swiotlb)
return swiotlb_sync_single_for_device(&hwdev->dev,dma_handle,size,direction);
#endif
flush_write_buffers();
}
static inline void pci_dma_sync_sg_for_cpu(struct pci_dev *hwdev,
struct scatterlist *sg,
int nelems, int direction)
{
BUG_ON(direction == PCI_DMA_NONE);
#ifdef CONFIG_SWIOTLB
if (swiotlb)
return swiotlb_sync_sg_for_cpu(&hwdev->dev,sg,nelems,direction);
#endif
flush_write_buffers();
}
static inline void pci_dma_sync_sg_for_device(struct pci_dev *hwdev,
struct scatterlist *sg,
int nelems, int direction)
{
BUG_ON(direction == PCI_DMA_NONE);
#ifdef CONFIG_SWIOTLB
if (swiotlb)
return swiotlb_sync_sg_for_device(&hwdev->dev,sg,nelems,direction);
#endif
flush_write_buffers();
}
/* The PCI address space does equal the physical memory
* address space. The networking and block device layers use
* this boolean for bounce buffer decisions
*
* On AMD64 it mostly equals, but we set it to zero to tell some subsystems
* that an IOMMU is available.
*/
#define PCI_DMA_BUS_IS_PHYS (no_iommu ? 1 : 0)
/* We lie slightly when the IOMMU is forced to get the device to
use SAC instead of DAC. */
#define pci_dac_dma_supported(pci_dev, mask) (force_iommu ? 0 : 1)
#else
static inline dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr,
size_t size, int direction)
{
dma_addr_t addr;
/* No IOMMU */
if (direction == PCI_DMA_NONE)
out_of_line_bug();
addr = virt_to_bus(ptr);
/*
* This is gross, but what should I do.
* Unfortunately drivers do not test the return value of this.
*/
if ((addr+size) & ~hwdev->dma_mask)
out_of_line_bug();
return addr;
}
static inline void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
size_t size, int direction)
{
if (direction == PCI_DMA_NONE)
out_of_line_bug();
/* Nothing to do */
}
static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
unsigned long offset, size_t size, int direction)
{
dma_addr_t addr;
if (direction == PCI_DMA_NONE)
out_of_line_bug();
addr = page_to_pfn(page) * PAGE_SIZE + offset;
if ((addr+size) & ~hwdev->dma_mask)
out_of_line_bug();
return addr;
}
#define PCI_DMA_BUS_IS_PHYS 1
#define pci_dac_dma_supported(pci_dev, mask) 1
/* pci_unmap_{page,single} is a nop so... */
#define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)
#define DECLARE_PCI_UNMAP_LEN(LEN_NAME)
#define pci_unmap_addr(PTR, ADDR_NAME) (0)
......@@ -247,74 +90,9 @@ static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
#define pci_unmap_len(PTR, LEN_NAME) (0)
#define pci_unmap_len_set(PTR, LEN_NAME, VAL) do { } while (0)
/* Make physical memory consistent for a single
* streaming mode DMA translation after a transfer.
*
* If you perform a pci_map_single() but wish to interrogate the
* buffer using the cpu, yet do not wish to teardown the PCI dma
* mapping, you must call this function before doing so. At the
* next point you give the PCI dma address back to the card, you
* must first perform a pci_dma_sync_for_device, and then the
* device again owns the buffer.
*/
static inline void pci_dma_sync_single_for_cpu(struct pci_dev *hwdev,
dma_addr_t dma_handle,
size_t size, int direction)
{
if (direction == PCI_DMA_NONE)
out_of_line_bug();
}
static inline void pci_dma_sync_single_for_device(struct pci_dev *hwdev,
dma_addr_t dma_handle,
size_t size, int direction)
{
if (direction == PCI_DMA_NONE)
out_of_line_bug();
flush_write_buffers();
}
/* Make physical memory consistent for a set of streaming
* mode DMA translations after a transfer.
*
* The same as pci_dma_sync_single_* but for a scatter-gather list,
* same rules and usage.
*/
static inline void pci_dma_sync_sg_for_cpu(struct pci_dev *hwdev,
struct scatterlist *sg,
int nelems, int direction)
{
if (direction == PCI_DMA_NONE)
out_of_line_bug();
}
static inline void pci_dma_sync_sg_for_device(struct pci_dev *hwdev,
struct scatterlist *sg,
int nelems, int direction)
{
if (direction == PCI_DMA_NONE)
out_of_line_bug();
flush_write_buffers();
}
#define PCI_DMA_BUS_IS_PHYS 1
#define pci_dac_dma_supported(pci_dev, mask) 1
#endif
extern int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
int nents, int direction);
extern void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
int nents, int direction);
#define pci_unmap_page pci_unmap_single
/* Return whether the given PCI device DMA address mask can
* be supported properly. For example, if your device can
* only drive the low 24-bits during PCI bus mastering, then
* you would pass 0x00ffffff as the mask to this function.
*/
extern int pci_dma_supported(struct pci_dev *hwdev, u64 mask);
#include <asm-generic/pci-dma-compat.h>
static inline dma64_addr_t
pci_dac_page_to_dma(struct pci_dev *pdev, struct page *page, unsigned long offset, int direction)
......@@ -359,7 +137,6 @@ static inline void pcibios_add_platform_entries(struct pci_dev *dev)
/* generic pci stuff */
#ifdef CONFIG_PCI
#include <asm-generic/pci.h>
#include <linux/dma-mapping.h>
#endif
#endif /* __x8664_PCI_H */
......@@ -82,7 +82,6 @@ extern int unhandled_signal(struct task_struct *tsk, int sig);
extern void select_idle_routine(const struct cpuinfo_x86 *c);
extern void swiotlb_init(void);
extern int swiotlb;
extern unsigned long max_mapnr;
extern unsigned long end_pfn;
......@@ -103,6 +102,8 @@ extern int fallback_aper_force;
extern int iommu_aperture;
extern int iommu_aperture_disabled;
extern int iommu_aperture_allowed;
extern int fix_aperture;
extern int force_iommu;
extern void smp_local_timer_interrupt(struct pt_regs * regs);
......
#ifndef _ASM_SWIOTLB_H
#define _ASM_SWIOTLB_H 1
#include <linux/config.h>
/* SWIOTLB interface */
extern dma_addr_t swiotlb_map_single(struct device *hwdev, void *ptr, size_t size,
int dir);
extern void swiotlb_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
size_t size, int dir);
extern void swiotlb_sync_single_for_cpu(struct device *hwdev,
dma_addr_t dev_addr,
size_t size, int dir);
extern void swiotlb_sync_single_for_device(struct device *hwdev,
dma_addr_t dev_addr,
size_t size, int dir);
extern void swiotlb_sync_sg_for_cpu(struct device *hwdev,
struct scatterlist *sg, int nelems,
int dir);
extern void swiotlb_sync_sg_for_device(struct device *hwdev,
struct scatterlist *sg, int nelems,
int dir);
extern int swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction);
extern void swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction);
extern int swiotlb_dma_mapping_error(dma_addr_t dma_addr);
#ifdef CONFIG_SWIOTLB
extern int swiotlb;
#else
#define swiotlb 0
#endif
#endif
......@@ -297,11 +297,11 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
#define mb() asm volatile("mfence":::"memory")
#define rmb() asm volatile("lfence":::"memory")
/* could use SFENCE here, but it would be only needed for unordered SSE
store instructions and we always do an explicit sfence with them currently.
the ordering of normal stores is serialized enough. Just make it a compile
barrier. */
#ifdef CONFIG_UNORDERED_IO
#define wmb() asm volatile("sfence" ::: "memory")
#else
#define wmb() asm volatile("" ::: "memory")
#endif
#define read_barrier_depends() do {} while(0)
#define set_mb(var, value) do { xchg(&var, value); } while (0)
#define set_wmb(var, value) do { var = value; wmb(); } while (0)
......
......@@ -14,18 +14,23 @@ extern cpumask_t cpu_online_map;
extern unsigned char cpu_to_node[];
extern cpumask_t node_to_cpumask[];
extern cpumask_t pci_bus_to_cpumask[];
#define cpu_to_node(cpu) (cpu_to_node[cpu])
#define parent_node(node) (node)
#define node_to_first_cpu(node) (__ffs(node_to_cpumask[node]))
#define node_to_cpumask(node) (node_to_cpumask[node])
static inline cpumask_t pcibus_to_cpumask(int bus)
static inline cpumask_t __pcibus_to_cpumask(int bus)
{
cpumask_t busmask = pci_bus_to_cpumask[bus];
cpumask_t online = cpu_online_map;
cpumask_t res;
cpus_and(res, pci_bus_to_cpumask[bus], cpu_online_map);
cpus_and(res, busmask, online);
return res;
}
/* broken generic file uses #ifndef later on this */
#define pcibus_to_cpumask(bus) __pcibus_to_cpumask(bus)
#define NODE_BALANCE_RATE 30 /* CHECKME */
......