- 12 Dec, 2011 11 commits
-
-
Joerg Roedel authored
This patch adds functions necessary to set and clear the GCR3 values associated with a particular PASID in an IOMMUv2 domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
The functions added with this patch allow to manage the IOMMU and the device TLBs for all devices in an IOMMUv2 domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
This patch adds support for protection domains that implement two-level paging for devices. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
This function can be used to switch a domain into paging-mode 0. In this mode all devices can access physical system memory directly without any remapping. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
Add a notifer at which a module can attach to get informed about incoming PPR faults. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
If the device starts to use IOMMUv2 features the dma handles need to stay valid. The only sane way to do this is to use a identity mapping for the device and not translate it by the iommu. This is implemented with this patch. Since this lifts the device-isolation there is also a new kernel parameter which allows to disable that feature. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
In mixed IOMMU setups this flag inidicates whether an IOMMU supports the v2 features or not. This patch also adds a global flag together with a function to query that flag from other code. The flag shows if at least one IOMMUv2 is in the system. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
This feature needs to be enabled before IOMMUv2 DTEs can be set up. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
Allocate and enable a log buffer for peripheral page faults when the IOMMU supports this feature. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
Read the number of PASIDs supported by each IOMMU in the system and take the smallest number as the maximum value supported by the IOMMU driver. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Joerg Roedel authored
Convert the contents of 'struct dev_table_entry' to u64 to allow updating the DTE wit 64bit writes as required by the spec. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
- 09 Dec, 2011 29 commits
-
-
Linus Torvalds authored
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
* git://git.samba.org/sfrench/cifs-2.6: cifs: check for NULL last_entry before calling cifs_save_resume_key cifs: attempt to freeze while looping on a receive attempt cifs: Fix sparse warning when calling cifs_strtoUCS CIFS: Add descriptions to the brlock cache functions
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Calling __pa() with an ioremap()ed address is invalid x86, hpet: Immediately disable HPET timer 1 if rtc irq is masked x86/intel_mid: Kconfig select fix x86/intel_mid: Fix the Kconfig for MID selection
-
git://git.pengutronix.de/git/wsa/linux-2.6Linus Torvalds authored
* 'spi/for-3.2' of git://git.pengutronix.de/git/wsa/linux-2.6: spi/gpio: fix section mismatch warning spi/fsl-espi: disable CONFIG_SPI_FSL_ESPI=m build spi/nuc900: Include linux/module.h spi/ath79: fix compile error due to missing include
-
git://neil.brown.name/mdLinus Torvalds authored
* 'for-linus' of git://neil.brown.name/md: md: raid5 crash during degradation md/raid5: never wait for bad-block acks on failed device. md: ensure new badblocks are handled promptly. md: bad blocks shouldn't cause a Blocked status on a Faulty device. md: take a reference to mddev during sysfs access. md: refine interpretation of "hold_active == UNTIL_IOCTL". md/lock: ensure updates to page_attrs are properly locked.
-
git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: arch/tile: use new generic {enable,disable}_percpu_irq() routines drivers/net/ethernet/tile: use skb_frag_page() API asm-generic/unistd.h: support new process_vm_{readv,write} syscalls arch/tile: fix double-free bug in homecache_free_pages() arch/tile: add a few #includes and an EXPORT to catch up with kernel changes.
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds authored
* 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: MAINTAINERS: Update amd-iommu F: patterns iommu/amd: Fix typo in kernel-parameters.txt iommu/msm: Fix compile error in mach-msm/devices-iommu.c Fix comparison using wrong pointer variable in dma debug code
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek - Fix lost speaker volume controls ALSA: hda/realtek - Create "Bass Speaker" for two speaker pins ALSA: hda/realtek - Don't create extra controls with channel suffix ALSA: hda - Fix remaining VREF mute-LED NID check in post-3.1 changes ALSA: hda - Fix GPIO LED setup for IDT 92HD75 codecs ASoC: Provide a more complete DMA driver stub ASoC: Remove references to corgi and spitz from machine driver document ASoC: Make SND_SOC_MX27VIS_AIC32X4 depend on I2C ASoC: Fix dependency for SND_SOC_RAUMFELD and SND_PXA2XX_SOC_HX4700 ASoC: uda1380: Return proper error in uda1380_modinit failure path ASoC: kirkwood: Make SND_KIRKWOOD_SOC_OPENRD and SND_KIRKWOOD_SOC_T5325 depend on I2C ASoC: Mark WM8994 ADC muxes as virtual ALSA: hda/realtek - Fix Oops in alc_mux_select() ALSA: sis7019 - give slow codecs more time to reset
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Do no try to schedule task events if there are none lockdep, kmemcheck: Annotate ->lock in lockdep_init_map() perf header: Use event_name() to get an event name perf stat: Failure with "Operation not supported"
-
Mandeep Singh Baines authored
In order to safely dereference current->real_parent inside an rcu_read_lock, we need an rcu_dereference. Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alexandre Bounine authored
Modify initialization of PCIe capability registers in Tsi721 mport driver: - change Completion Timeout value to avoid unexpected data transfer aborts during intensive traffic. - replace hardcoded offset of PCIe capability block by making it use the common function. This patch is applicable to kernel versions starting from 3.2-rc1. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alexandre Bounine authored
Bug fix for Tsi721 RapidIO mport driver: Tsi721 supports four RapidIO mailboxes (MBOX0 - MBOX3) as defined by RapidIO specification. Mailbox resources has to be properly reported to allow use of all available mailboxes (initial version reports only MBOX0). This patch is applicable to kernel versions staring from 3.2-rc1. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alexandre Bounine authored
Replace the pair dma_alloc_coherent()+memset() with the new dma_zalloc_coherent() added by Andrew Morton for kernel version 3.2 Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
Since commit a25cac51 ("proc: Consider NO_HZ when printing idle and iowait times") we are reporting idle/io_wait time also while a CPU is tickless. We rely on get_{idle,iowait}_time functions to retrieve proper data. These functions, however, use usecs_to_cputime to translate micro seconds time to cputime64_t. This is just an alias to usecs_to_jiffies which reduces the data type from u64 to unsigned int and also checks whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET) and returns MAX_JIFFY_OFFSET in that case. When we overflow depends on CONFIG_HZ but especially for CONFIG_HZ_300 it is quite low (1431649781) so we are getting MAX_JIFFY_OFFSET for >3000s! until we overflow unsigned int. Just for reference CONFIG_HZ_100 has an overflow window around 20s, CONFIG_HZ_250 ~8s and CONFIG_HZ_1000 ~2s. This results in a bug when people saw [h]top going mad reporting 100% CPU usage even though there was basically no CPU load. The reason was simply that /proc/stat stopped reporting idle/io_wait changes (and reported MAX_JIFFY_OFFSET) and so the only change happening was for user system time. Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision to 32b type and it is much more appropriate for cumulative time values (unlike usecs_to_jiffies which intended for timeout calculations). Signed-off-by: Michal Hocko <mhocko@suse.cz> Tested-by: Artem S. Tashkinov <t.artem@mailcity.com> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mel Gorman authored
Commit f5252e00 ("mm: avoid null pointer access in vm_struct via /proc/vmallocinfo") adds newly allocated vm_structs to the vmlist after it is fully initialised. Unfortunately, it did not check that __vmalloc_area_node() successfully populated the area. In the event of allocation failure, the vmalloc area is freed but the pointer to freed memory is inserted into the vmlist leading to a a crash later in get_vmalloc_info(). This patch adds a check for ____vmalloc_area_node() failure within __vmalloc_node_range. It does not use "goto fail" as in the previous error path as a warning was already displayed by __vmalloc_area_node() before it called vfree in its failure path. Credit goes to Luciano Chavez for doing all the real work of identifying exactly where the problem was. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Luciano Chavez <lnx1138@linux.vnet.ibm.com> Tested-by: Luciano Chavez <lnx1138@linux.vnet.ibm.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [3.1.x+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
setup_zone_migrate_reserve() expects that zone->start_pfn starts at pageblock_nr_pages aligned pfn otherwise we could access beyond an existing memblock resulting in the following panic if CONFIG_HOLES_IN_ZONE is not configured and we do not check pfn_valid: IP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 *pdpt = 0000000000000000 *pde = f000ff53f000ff53 Oops: 0000 [#1] SMP Pid: 1, comm: swapper Not tainted 3.0.7-0.7-pae #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform EIP: 0060:[<c02d331d>] EFLAGS: 00010006 CPU: 0 EIP is at setup_zone_migrate_reserve+0xcd/0x180 EAX: 000c0000 EBX: f5801fc0 ECX: 000c0000 EDX: 00000000 ESI: 000c01fe EDI: 000c01fe EBP: 00140000 ESP: f2475f58 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 1, ti=f2474000 task=f2472cd0 task.ti=f2474000) Call Trace: [<c02d389c>] __setup_per_zone_wmarks+0xec/0x160 [<c02d3a1f>] setup_per_zone_wmarks+0xf/0x20 [<c08a771c>] init_per_zone_wmark_min+0x27/0x86 [<c020111b>] do_one_initcall+0x2b/0x160 [<c086639d>] kernel_init+0xbe/0x157 [<c05cae26>] kernel_thread_helper+0x6/0xd Code: a5 39 f5 89 f7 0f 46 fd 39 cf 76 40 8b 03 f6 c4 08 74 32 eb 91 90 89 c8 c1 e8 0e 0f be 80 80 2f 86 c0 8b 14 85 60 2f 86 c0 89 c8 <2b> 82 b4 12 00 00 c1 e0 05 03 82 ac 12 00 00 8b 00 f6 c4 08 0f EIP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 SS:ESP 0068:f2475f58 CR2: 00000000000012b4 We crashed in pageblock_is_reserved() when accessing pfn 0xc0000 because highstart_pfn = 0x36ffe. The issue was introduced in 3.0-rc1 by 6d3163ce ("mm: check if any page in a pageblock is reserved before marking it MIGRATE_RESERVE"). Make sure that start_pfn is always aligned to pageblock_nr_pages to ensure that pfn_valid s always called at the start of each pageblock. Architectures with holes in pageblocks will be correctly handled by pfn_valid_within in pageblock_is_reserved. Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Mel Gorman <mgorman@suse.de> Tested-by: Dang Bo <bdang@vmware.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Arve Hjnnevg <arve@android.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> [3.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Hillf Danton authored
Avoid unlocking and unlocked page if we failed to lock it. Signed-off-by: Hillf Danton <dhillf@gmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Youquan Song authored
Commit 70b50f94 ("mm: thp: tail page refcounting fix") keeps all page_tail->_count zero at all times. But the current kernel does not set page_tail->_count to zero if a 1GB page is utilized. So when an IOMMU 1GB page is used by KVM, it wil result in a kernel oops because a tail page's _count does not equal zero. kernel BUG at include/linux/mm.h:386! invalid opcode: 0000 [#1] SMP Call Trace: gup_pud_range+0xb8/0x19d get_user_pages_fast+0xcb/0x192 ? trace_hardirqs_off+0xd/0xf hva_to_pfn+0x119/0x2f2 gfn_to_pfn_memslot+0x2c/0x2e kvm_iommu_map_pages+0xfd/0x1c1 kvm_iommu_map_memslots+0x7c/0xbd kvm_iommu_map_guest+0xaa/0xbf kvm_vm_ioctl_assigned_device+0x2ef/0xa47 kvm_vm_ioctl+0x36c/0x3a2 do_vfs_ioctl+0x49e/0x4e4 sys_ioctl+0x5a/0x7c system_call_fastpath+0x16/0x1b RIP gup_huge_pud+0xf2/0x159 Signed-off-by: Youquan Song <youquan.song@intel.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Youquan Song authored
With the 3.2-rc kernel, IOMMU 2M pages in KVM works. But when I tried to use IOMMU 1GB pages in KVM, I encountered an oops and the 1GB page failed to be used. The root cause is that 1GB page allocation calls gup_huge_pud() while 2M page calls gup_huge_pmd. If compound pages are used and the page is a tail page, gup_huge_pmd() increases _mapcount to record tail page are mapped while gup_huge_pud does not do that. So when the mapped page is relesed, it will result in kernel oops because the page is not marked mapped. This patch add tail process for compound page in 1GB huge page which keeps the same process as 2M page. Reproduce like: 1. Add grub boot option: hugepagesz=1G hugepages=8 2. mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages 3. qemu-kvm -m 2048 -hda os-kvm.img -cpu kvm64 -smp 4 -mem-path /dev/hugepages -net none -device pci-assign,host=07:00.1 kernel BUG at mm/swap.c:114! invalid opcode: 0000 [#1] SMP Call Trace: put_page+0x15/0x37 kvm_release_pfn_clean+0x31/0x36 kvm_iommu_put_pages+0x94/0xb1 kvm_iommu_unmap_memslots+0x80/0xb6 kvm_assign_device+0xba/0x117 kvm_vm_ioctl_assigned_device+0x301/0xa47 kvm_vm_ioctl+0x36c/0x3a2 do_vfs_ioctl+0x49e/0x4e4 sys_ioctl+0x5a/0x7c system_call_fastpath+0x16/0x1b RIP put_compound_page+0xd4/0x168 Signed-off-by: Youquan Song <youquan.song@intel.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
Commit 4f2a8d3c ("printk: Fix console_sem vs logbuf_lock unlock race") introduced another silly bug where we would want to acquire an already held lock. Avoid this. Reported-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
KAMEZAWA Hiroyuki authored
More players joined to memory cgroup developments and Johannes' great work changed internal design of memory cgroup dramatically. And he will do more works. Michal Hokko did many bug fixes and know memory cgroup very well. Daisuke Nishimura helped us very much but he seems busy now. Thanks to his works. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jonghwan Choi authored
If an error occurs after the clock is enabled, the enable/disable state can become unbalanced. Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kees Cook authored
Small clean-up for my CREDITS entry; the GPG fingerprint was not up to date, so I fixed other details at the same time too. Signed-off-by: Kees Cook <kees@outflux.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrea Arcangeli authored
khugepaged can sometimes cause suspend to fail, requiring that the user retry the suspend operation. Use wait_event_freezable_timeout() instead of schedule_timeout_interruptible() to avoid missing freezer wakeups. A try_to_freeze() would have been needed in the khugepaged_alloc_hugepage tight loop too in case of the allocation failing repeatedly, and wait_event_freezable_timeout will provide it too. khugepaged would still freeze just fine by trying again the next minute but it's better if it freezes immediately. Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Jiri Slaby <jslaby@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Cc: "Rafael J. Wysocki" <rjw@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Claudio Scordino authored
Fix the error message "directives may not be used inside a macro argument" which appears when the kernel is compiled for the cris architecture. Signed-off-by: Claudio Scordino <claudio@evidence.eu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Konstantin Khlebnikov authored
Use atomic-long operations instead of looping around cmpxchg(). [akpm@linux-foundation.org: massage atomic.h inclusions] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Konstantin Khlebnikov authored
A shrinker function can return -1, means that it cannot do anything without a risk of deadlock. For example prune_super() does this if it cannot grab a superblock refrence, even if nr_to_scan=0. Currently we interpret this -1 as a ULONG_MAX size shrinker and evaluate `total_scan' according to this. So the next time around this shrinker can cause really big pressure. Let's skip such shrinkers instead. Also make total_scan signed, otherwise the check (total_scan < 0) below never works. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Commit 29b68415 ("x86: amd_iommu: move to drivers/iommu/") moved the files, update the patterns. CC: Ohad Ben-Cohen <ohad@wizery.com> CC: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
-
Matt Fleming authored
If we encounter an efi_memory_desc_t without EFI_MEMORY_WB set in ->attribute we currently call set_memory_uc(), which in turn calls __pa() on a potentially ioremap'd address. On CONFIG_X86_32 this is invalid, resulting in the following oops on some machines: BUG: unable to handle kernel paging request at f7f22280 IP: [<c10257b9>] reserve_ram_pages_type+0x89/0x210 [...] Call Trace: [<c104f8ca>] ? page_is_ram+0x1a/0x40 [<c1025aff>] reserve_memtype+0xdf/0x2f0 [<c1024dc9>] set_memory_uc+0x49/0xa0 [<c19334d0>] efi_enter_virtual_mode+0x1c2/0x3aa [<c19216d4>] start_kernel+0x291/0x2f2 [<c19211c7>] ? loglevel+0x1b/0x1b [<c19210bf>] i386_start_kernel+0xbf/0xc8 A better approach to this problem is to map the memory region with the correct attributes from the start, instead of modifying it after the fact. The uncached case can be handled by ioremap_nocache() and the cached by ioremap_cache(). Despite first impressions, it's not possible to use ioremap_cache() to map all cached memory regions on CONFIG_X86_64 because EFI_RUNTIME_SERVICES_DATA regions really don't like being mapped into the vmalloc space, as detailed in the following bug report, https://bugzilla.redhat.com/show_bug.cgi?id=748516 Therefore, we need to ensure that any EFI_RUNTIME_SERVICES_DATA regions are covered by the direct kernel mapping table on CONFIG_X86_64. To accomplish this we now map E820_RESERVED_EFI regions via the direct kernel mapping with the initial call to init_memory_mapping() in setup_arch(), whereas previously these regions wouldn't be mapped if they were after the last E820_RAM region until efi_ioremap() was called. Doing it this way allows us to delete efi_ioremap() completely. Signed-off-by: Matt Fleming <matt.fleming@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Huang Ying <huang.ying.caritas@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1321621751-3650-1-git-send-email-matt@console-pimps.orgSigned-off-by: Ingo Molnar <mingo@elte.hu>
-