- 20 Sep, 2011 29 commits
-
-
Benjamin Herrenschmidt authored
Add definition of OPAL interfaces along with the wrappers to call into OPAL runtime and the early device-tree parsing hook to locate the OPAL runtime firmware. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We stash it in boot_command_line which isn't in BSS and so won't be overwritten. We then use that as a default cmd_line before we walk the device-tree. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
On machines supporting the OPAL firmware version 1, the system is initially booted under pHyp. We then use a special hypercall to verify if OPAL is available and if it is, we then trigger a "takeover" which disables pHyp and loads the OPAL runtime firmware, giving control to the kernel in hypervisor mode. This patch add the necessary code to detect that the OPAL takeover capability is present when running under PowerVM (aka pHyp) and perform said takeover to get hypervisor control of the processor. To perform the takeover, we must first use RTAS (within Open Firmware runtime environment) to start all processors & threads, in order to give control to OPAL on all of them. We then call the takeover hypercall on everybody, OPAL will re-enter the kernel main entry point passing it a flat device-tree. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Unplugged CPU go into NAP mode in a loop until woken up Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We used to overwrite with CONFIG_CMDLINE if we found a chosen node but failed to get bootargs out of it or they were empty, unless CONFIG_CMDLINE_FORCE is set. Instead change that to overwrite if "data" is non empty after the bootargs check. It allows arch code to have other mechanisms to retrieve the command line prior to parsing the device-tree. Note: CONFIG_CMDLINE_FORCE case should ideally be handled elsewhere as it won't work as it-is if the device-tree has no /chosen node Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: devicetree-discuss@lists-ozlabs.org CC: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca>
-
Benjamin Herrenschmidt authored
This adds a skeletton for the new Power "Non Virtualized" platform which will be used by machines supporting running without an hypervisor, for example in order to run KVM. These machines will be using a new firmware called OPAL for which the support will be provided by later patches. The PowerNV platform is intended to be also usable under the BML environment used internally for early CPU bringup which is why the code also supports using RTAS instead of OPAL in various places. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
With OPAL, r8 and r9 will be used to pass the OPAL base and entry for debugging purposes (those informations are also in the device-tree). We don't want to clobber those registers that early. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This new function is used to properly setup the PCI Express Max Payload Size (and in some circumstances Max Read Request Size). Some systems will not operate properly if these aren't set correctly and the firmware doesn't always do it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This adds more generic support for doing CPU hotplug with a simple idle loop and no actual reset of the processors. The generic smp_generic_kick_cpu() does the hotplug bringup trick if the PACA shows that the CPU has already been started at boot and we provide an accessor for the CPU state. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
It was preventing the global early debug selection whenever KVM was enabled instead of only preventing the 440 specific one. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian King authored
The Power platform requires the partner info buffer to be page aligned otherwise it will fail the partner info hcall with H_PARAMETER. Switch from using kmalloc to allocate this buffer to __get_free_page to ensure page alignment. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
The icswx code introduced an A-B B-A deadlock: CPU0 CPU1 ---- ---- lock(&anon_vma->mutex); lock(&mm->mmap_sem); lock(&anon_vma->mutex); lock(&mm->mmap_sem); Instead of using the mmap_sem to keep mm_users constant, take the page table spinlock. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
If we echo an address the hypervisor doesn't like to /sys/devices/system/memory/probe we oops the box: # echo 0x10000000000 > /sys/devices/system/memory/probe kernel BUG at arch/powerpc/mm/hash_utils_64.c:541! The backtrace is: create_section_mapping arch_add_memory add_memory memory_probe_store sysdev_class_store sysfs_write_file vfs_write SyS_write In create_section_mapping we BUG if htab_bolt_mapping returned an error. A better approach is to return an error which will propagate back to userspace. Rerunning the test with this patch applied: # echo 0x10000000000 > /sys/devices/system/memory/probe -bash: echo: write error: Invalid argument Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
While converting code to use for_each_node_by_type I noticed a number of coding style issues. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
Use for_each_node_by_type instead of open coding it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
During memory hotplug testing, I got the following warning: ERROR: Bad of_node_put() on /memory@0 of_node_release kref_put of_node_put of_find_node_by_type hot_add_node_scn_to_nid hot_add_scn_to_nid memory_add_physaddr_to_nid ... of_find_node_by_type() loop does the of_node_put for us so we only need the handle the case where we terminate the loop early. As suggested by Stephen Rothwell we can do the of_node_put unconditionally outside of the loop since of_node_put handles a NULL argument fine. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
We have two identical definitions of RECLAIM_DISTANCE, looks like the patch got applied twice. Remove one. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
On big POWER7 boxes we see large amounts of CPU time in system processes like workqueue and watchdog kernel threads. We currently rebalance the entire machine each time a task goes idle and this is very expensive on large machines. Disable newidle balancing at the node level and rely on the scheduler tick to rebalance across nodes. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
The largest POWER7 boxes have 32 nodes. SD_NODES_PER_DOMAIN groups nodes into chunks of 16 and adds a global balancing domain (SD_ALLNODES) above it. If we bump SD_NODES_PER_DOMAIN to 32, then we avoid this extra level of balancing on our largest boxes. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
We want to override the default value of SD_NODES_PER_DOMAIN on ppc64, so move it into linux/topology.h. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
When chasing a performance issue on ppc64, I noticed tasks communicating via a pipe would often end up on different nodes. It turns out SD_WAKE_AFFINE is not set in our node defition. Commit 9fcd18c9 (sched: re-tune balancing) enabled SD_WAKE_AFFINE in the node definition for x86 and we need a similar change for ppc64. I used lmbench lat_ctx and perf bench pipe to verify this fix. Each benchmark was run 10 times and the average taken. lmbench lat_ctx: before: 66565 ops/sec after: 204700 ops/sec 3.1x faster perf bench pipe: before: 5.6570 usecs after: 1.3470 usecs 4.2x faster Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
(Merge in order to get the PCIe mps/mrss code fixes)
-
git://tesla.tglx.de/git/linux-2.6-tipLinus Torvalds authored
* 'irq-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: x86, iommu: Mark DMAR IRQ as non-threaded genirq: Make irq_shutdown() symmetric vs. irq_startup again
-
git://github.com/chrismason/linuxLinus Torvalds authored
* 'for-linus' of git://github.com/chrismason/linux: Btrfs: only clear the need lookup flag after the dentry is setup BTRFS: Fix lseek return value for error Btrfs: don't change inode flag of the dest clone file Btrfs: don't make a file partly checksummed through file clone Btrfs: fix pages truncation in btrfs_ioctl_clone() btrfs: fix d_off in the first dirent
-
Andiry Xu authored
When a xHC host is unable to handle isochronous transfer in the interval, it reports a Missed Service Error event and skips some tds. Currently xhci driver handles MSE event in the following ways: 1. When encounter a MSE event, set ep->skip flag, update event ring dequeue pointer and return. 2. When encounter the next event on this ep, the driver will run the do-while loop, fetch td from ep's td_list to find the td corresponding to this event. All tds missed are marked as short transfer(-EXDEV). The do-while loop will end in two ways: 1. If the td pointed by the event trb is found; 2. If the ep ring's td_list is empty. However, if a buggy HW reports some unpredicted event (for example, an overrun event following a MSE event while the ep ring is actually not empty), the driver will never find the td, and it will loop until the td_list is empty. Unfortunately, the spinlock is dropped when give back a urb in the do-while loop. During the spinlock released period, the class driver may still submit urbs and add tds to the td_list. This may cause disaster, since the td_list will never be empty and the loop never ends, and the system hangs. To fix this, count the number of TDs on the ep ring before skipping TDs, and quit the loop when skipped that number of tds. This guarantees the do-while loop will end after certain number of cycles, and driver will not be trapped in an infinite loop. Signed-off-by: Andiry Xu <andiry.xu@amd.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Greg KH authored
Sometimes, when a USB 3.0 device is disconnected, the Intel Panther Point xHCI host controller will report a link state change with the state set to "SS.Inactive". This causes the xHCI host controller to issue a warm port reset, which doesn't finish before the USB core times out while waiting for it to complete. When the warm port reset does complete, and the xHC gives back a port status change event, the xHCI driver kicks khubd. However, it fails to set the bit indicating there is a change event for that port because the logic in xhci-hub.c doesn't check for the warm port reset bit. After that, the warm port status change bit is never cleared by the USB core, and the xHC stops reporting port status change bits. (The xHCI spec says it shouldn't report more port events until all change bits are cleared.) This means any port changes when a new device is connected will never be reported, and the port will seem "dead" until the xHCI driver is unloaded and reloaded, or the computer is rebooted. Fix this by making the xHCI driver set the port change bit when a warm port reset change bit is set. A better solution would be to make the USB core handle warm port reset in differently, merging the current code with the standard port reset code that does an incremental backoff on the timeout, and tries to complete the port reset two more times before giving up. That more complicated fix will be merged next window, and this fix will be backported to stable. This should be backported to kernels as old as 3.0, since that was the first kernel with commit a11496eb ("xHCI: warm reset support"). Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Randy Dunlap authored
Fix build when CONFIG_ISA_DMA_API is enabled but CONFIG_COMEDI_PCI[_DRIVERS] is not enabled. Fixes these build errors: drivers/staging/comedi/drivers/ni_labpc.c: In function 'labpc_ai_cmd': drivers/staging/comedi/drivers/ni_labpc.c:1351: error: implicit declaration of function 'labpc_suggest_transfer_size' drivers/staging/comedi/drivers/ni_labpc.c: At top level: drivers/staging/comedi/drivers/ni_labpc.c:1802: error: conflicting types for 'labpc_suggest_transfer_size' drivers/staging/comedi/drivers/ni_labpc.c:1351: note: previous implicit declaration of 'labpc_suggest_transfer_size' was here Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Even with just the interface limited to admin, there really is little to reason to give byte-per-byte counts for taskstats. So round it down to something less intrusive. Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Ok, this isn't optimal, since it means that 'iotop' needs admin capabilities, and we may have to work on this some more. But at the same time it is very much not acceptable to let anybody just read anybody elses IO statistics quite at this level. Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative to checking the capabilities by hand. Reported-by: Vasiliy Kulikov <segoon@openwall.com> Cc: Johannes Berg <johannes.berg@intel.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 19 Sep, 2011 11 commits
-
-
Hector Martin authored
Add a new udbg driver for the PS3 gelic Ehthernet device. This driver shares only a few stucture and constant definitions with the gelic Ethernet device driver, so is implemented as a stand-alone driver with no dependencies on the gelic Ethernet device driver. Signed-off-by: Hector Martin <hector@marcansoft.com> Signed-off-by: Andre Heider <a.heider@gmail.com> Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Thadeu Lima de Souza Cascardo authored
Since commit 188917e1, /proc/ppc64 is a symlink to /proc/powerpc/. That means that creating /proc/ppc64/eeh will end up with a unaccessible file, that is not listed under /proc/powerpc/ and, then, not listed under /proc/ppc64/. Creating /proc/powerpc/eeh fixes that problem and maintain the compatibility intended with the ppc64 symlink. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: <stable@kernel.org> [3.x]
-
Arnaud Lacombe authored
This should fix the following warning: LD arch/powerpc/sysdev/xics/built-in.o WARNING: arch/powerpc/sysdev/xics/built-in.o(.text+0x1310): Section mismatch in reference from the function .icp_native_init() to the function .init.text:.icp_native_init_one_node() The function .icp_native_init() references the function __init .icp_native_init_one_node(). This is often because .icp_native_init lacks a __init annotation or the annotation of .icp_native_init_one_node is wrong. icp_native_init() is only referenced in `arch/powerpc/sysdev/xics/xics-common.c' by xics_init() which is itself marked with __init. = not built-tested = Reported-by: Timur Tabi <timur@freescale.com> Signed-off-by: Arnaud Lacombe <lacombar@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
During hotplug CPU add we get the following error: Unexpected Error (0) returned from configure-connector ibm,configure-connector returns 0 for configuration complete, so catch this and avoid the error. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: <stable@kernel.org>
-
Tang Yuantian authored
In SMP mode, the kernel would produce call trace when resumed from hibernation. The reason is when the function destroy_context is called to drop the resuming mm context, the mm->context.active is 1 which is wrong and should be zero. We pass the current->active_mm as previous mm context to function switch_mmu_context to decrease the context.active by 1. In UP mode, there is no effect. Signed-off-by: Tang Yuantian <b29983@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Tony Breeds authored
The various port_init_hw methods of ppc4xx_pciex_hwops should have been marked __init and when I added ppc4xx_pciex_port_reset_sdr(), which is __init. This added many section mismatch warnings like: WARNING: arch/powerpc/sysdev/built-in.o(.text+0x5c68): Section mismatch in reference from the function ppc440spe_pciex_init_port_hw() to the function .init.text:ppc4xx_pciex_port_reset_sdr() The function ppc440spe_pciex_init_port_hw() references the function __init ppc4xx_pciex_port_reset_sdr(). This is often because ppc440spe_pciex_init_port_hw lacks a __init annotation or the annotation of ppc4xx_pciex_port_reset_sdr is wrong. Trivial patch to silence those warnings. Reported-By: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Yours Tony Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Michael Ellerman authored
Based on a patch by Michael Ellerman <michael@ellerman.id.au> Patch was simply forward ported upstream. Jimi Xenidis <jimix@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Based on a patch by Benjamin Herrenschmidt <benh@kernel.crashing.org> Modernized and slightly modified to not record erros into the nvram log since we do not have that device driver just yet. Jimi Xenidis <jimix@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Jimi Xenidis authored
Some config selections were applied to the platform (reference board) when they actuall apply to the chip. Signed-off-by: Jimi Xenidis <jimix@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Julia Lawall authored
At this point, window has not been stored anywhere, so it has to be freed before leaving the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @exists@ local idexpression x; statement S,S1; expression E; identifier fl; expression *ptr != NULL; @@ x = \(kmalloc\|kzalloc\|kcalloc\)(...); ... if (x == NULL) S <... when != x when != if (...) { <+...kfree(x)...+> } when any when != true x == NULL x->fl ...> ( if (x == NULL) S1 | if (...) { ... when != x when forall ( return \(0\|<+...x...+>\|ptr\); | * return ...; ) } ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Scott Wood authored
u64 is used rather than phys_addr_t to keep things simple, as this is called from assembly code. Update callers to pass a 64-bit address in r3/r4. Other unused register assignments that were once parameters to machine_init are dropped. For FSL BookE, look up the physical address of the device tree from the effective address passed in r3 by the loader. This is required for situations where memory does not start at zero (due to AMP or IOMMU-less virtualization), and thus the IMA doesn't start at zero, and thus the device tree effective address does not equal the physical address. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-