- 09 Mar, 2012 1 commit
-
-
Benjamin Herrenschmidt authored
The current implementation of lazy interrupt handling has some issues that this tries to address.

We don't do the various workarounds we need to do when re-enabling interrupts in some cases, such as when returning from an interrupt, and thus we may still lose or get delayed decrementer or doorbell interrupts.

The current scheme also makes it much harder to handle the external "edge" interrupts provided by some BookE processors when using the EPR facility (External Proxy) and the Freescale Hypervisor.

Additionally, we tend to keep interrupts hard disabled in a number of cases, such as decrementer interrupts, external interrupts, or when a masked decrementer interrupt is pending. This is sub-optimal.

This is an attempt at fixing it all in one go by reworking the way we do the lazy interrupt disabling from the ground up.

The base idea is to replace the "hard_enabled" field with an "irq_happened" field in which we store a bit mask of which interrupts occurred while soft-disabled.

When re-enabling, either via arch_local_irq_restore() or when returning from an interrupt, we can now decide what to do by testing bits in that field.

We then implement replaying of the missed interrupts either by re-using the existing exception frame (in the exception exit case) or via the creation of a new one from an assembly trampoline (in the arch_local_irq_enable case).

This removes the need to play with the decrementer to try to create fake interrupts, among others.

In addition, this adds a few refinements:

- We no longer hard disable decrementer interrupts that occur while soft-disabled. We now simply bump the decrementer back to max (on BookS) or leave it stopped (on BookE) and continue with hard interrupts enabled, which means that we'll potentially get better sample quality from performance monitor interrupts.
- Timer, decrementer and doorbell interrupts now hard-enable shortly after removing the source of the interrupt, which means they no longer run entirely hard disabled. Again, this will improve perf sample quality.
- On Book3E 64-bit, we now make the performance monitor interrupt act as an NMI like Book3S (the necessary C code for that to work appears to already be present in the FSL perf code, notably calling nmi_enter instead of irq_enter). (This also fixes a bug where BookE perfmon interrupts could clobber r14 ... oops)
- We could make "masked" decrementer interrupts act as NMIs when doing timer-based perf sampling to improve the sample quality.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2:
- Add hard-enable to decrementer, timer and doorbells
- Fix CR clobber in masked irq handling on BookE
- Make embedded perf interrupt act as an NMI
- Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want to retrigger an interrupt without preventing hard-enable
v3:
- Fix or vs. ori bug on Book3E
- Fix enabling of interrupts for some exceptions on Book3E
v4:
- Fix resend of doorbells on return from interrupt on Book3E
v5:
- Rebased on top of my latest series, which involves some significant rework of some aspects of the patch.
v6:
- 32-bit compile fix
- more compile fixes with various .config combos
- factor out the asm code to soft-disable interrupts
- remove the C wrapper around preempt_schedule_irq
v7:
- Fix a bug with hard irq state tracking on native power7
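The bit-mask idea described in this message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the names `cpu_model`, `take_interrupt`, `irq_restore` and the `SRC_*` sources are hypothetical and do not come from the patch, and the real code replays interrupts from assembly with exception frames rather than by bumping a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt sources; the bit layout is illustrative only. */
enum irq_source { SRC_EXTERNAL, SRC_DECREMENTER, SRC_DOORBELL, SRC_COUNT };

struct cpu_model {
    bool soft_enabled;            /* lazy (soft) interrupt-enable flag */
    uint8_t irq_happened;         /* sources seen while soft-disabled */
    unsigned handled[SRC_COUNT];  /* how often each handler actually ran */
};

/* Hardware delivers an interrupt: run it now, or just remember it. */
static void take_interrupt(struct cpu_model *cpu, enum irq_source src)
{
    if (!cpu->soft_enabled) {
        cpu->irq_happened |= (uint8_t)(1u << src);
        return;
    }
    cpu->handled[src]++;
}

/* Restore the soft-enable state and replay anything that was missed. */
static void irq_restore(struct cpu_model *cpu, bool enable)
{
    cpu->soft_enabled = enable;
    if (!enable)
        return;
    for (int src = 0; src < SRC_COUNT; src++) {
        uint8_t bit = (uint8_t)(1u << src);
        if (cpu->irq_happened & bit) {
            cpu->irq_happened &= (uint8_t)~bit;  /* clear before replaying */
            cpu->handled[src]++;                 /* stands in for the replay */
        }
    }
}
```

In this model, two decrementer interrupts taken while soft-disabled collapse into a single pending bit and therefore a single replay, which is the expected behaviour for a level-style source.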
-
- 08 Mar, 2012 19 commits
-
-
Benjamin Herrenschmidt authored
On 64-bit, the mfmsr instruction can be quite slow, slower than loading a field from the cache-hot PACA, which happens to already contain the value we want in most cases. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We were using CR0.EQ after EXCEPTION_COMMON, hoping it still contained whether we came from userspace or kernel space. However, under some circumstances, EXCEPTION_COMMON will call C code and clobber volatile registers (CR0 among them), so we really need to re-load the previous MSR from the stackframe and re-test. While there, invert the condition to make the fast path more obvious and remove the BUG_OPCODE which was a debugging leftover and call .ret_from_except as we should. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
When running under a hypervisor that supports stolen time accounting, we may call C code from the macro EXCEPTION_PROLOG_COMMON in the exception entry path, which clobbers CR0. However, the FPU and vector traps rely on CR0 indicating whether we are coming from userspace or kernel to decide what to do. So we need to restore that value after the C call. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Also use local_paca instead of get_paca() to avoid getting into the smp_processor_id() debugging code from the debugger Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Other architectures such as x86 and ARM have been growing new support for features like retrying page faults after dropping the mm semaphore to break contention, or being able to return from a stuck page fault when a SIGKILL is pending. This refactors our implementation of do_page_fault() to move the error handling out of line in a way similar to x86 and adds support for those two features. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
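The retry-once behaviour described here can be modelled in userspace as follows. This is a hedged sketch, not the kernel code: `try_fault` is a hypothetical stand-in for the generic fault handler, `ALLOW_RETRY` and `FAULT_RETRY` only mimic the roles of the kernel's `FAULT_FLAG_ALLOW_RETRY` and `VM_FAULT_RETRY`, and the `fatal_signal` field stands in for a pending SIGKILL.

```c
#include <stdbool.h>

enum { ALLOW_RETRY = 1u << 0 };                 /* models FAULT_FLAG_ALLOW_RETRY */
enum fault_result { FAULT_DONE, FAULT_RETRY };  /* FAULT_RETRY models VM_FAULT_RETRY */
enum outcome { HANDLED, KILLED };

struct fault_ctx {
    int contended;      /* attempts that will hit contention */
    bool fatal_signal;  /* models a pending SIGKILL */
    int attempts;       /* attempts made so far */
};

/* Hypothetical stand-in for the generic fault handler. */
static enum fault_result try_fault(struct fault_ctx *c, unsigned flags)
{
    c->attempts++;
    if ((flags & ALLOW_RETRY) && c->contended > 0) {
        c->contended--;
        return FAULT_RETRY;  /* the semaphore was dropped; caller may retry */
    }
    return FAULT_DONE;       /* without ALLOW_RETRY we wait until done */
}

static enum outcome handle_fault(struct fault_ctx *c)
{
    unsigned flags = ALLOW_RETRY;

    for (;;) {
        if (try_fault(c, flags) == FAULT_DONE)
            return HANDLED;
        if (c->fatal_signal)
            return KILLED;       /* give up instead of staying stuck */
        flags &= ~ALLOW_RETRY;   /* retry at most once */
    }
}
```

Clearing the retry flag after the first retry keeps the loop bounded: the second attempt is not allowed to ask for another retry.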
-
Benjamin Herrenschmidt authored
If we get a floating point, altivec or vsx unavailable interrupt in kernel, we trigger a kernel error. There is no point preserving the interrupt state; in fact, that can even make debugging harder as the processor state might change (we may even preempt) between taking the exception and landing in a debugger. So just make those 3 disable interrupts unconditionally. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- v2: On BookE only disable when hitting the kernel unavailable path, otherwise it will fail to restore softe as fast_exception_return doesn't do it.
-
Benjamin Herrenschmidt authored
We currently turn interrupts back to their previous state before calling do_page_fault(). This can be annoying when debugging as a bad fault will potentially have lost some processor state before getting into the debugger. We also end up calling some generic code, such as notify_page_fault(), with interrupts enabled, which could be unexpected. This changes our code to behave more like other architectures, and makes the assembly entry code call into do_page_fault() with interrupts disabled. They are conditionally re-enabled from within do_page_fault() in the same spot x86 does it. While there, add the might_sleep() test in the case of a successful trylock of the mmap semaphore, again like x86. Also fix a bug in the existing assembly where r12 (_MSR) could get clobbered by C calls (the DTL accounting in the exception common macro and DISABLE_INTS) in some cases. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- v2: Add the r12 clobber fix
-
Benjamin Herrenschmidt authored
Some exceptions would unconditionally disable interrupts on entry, which is fine, but calling lockdep every time not only adds more overhead than strictly needed, but also means we get quite a few "redundant" disables logged, which makes it hard to spot the really bad ones. So instead, split the macro used by the exception code into a normal one and a separate one used when CONFIG_TRACE_IRQFLAGS is enabled, and make the latter skip the tracing if interrupts were already disabled. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We unconditionally hard enable interrupts. This is unnecessary as syscalls are expected to always be called with interrupts enabled. While at it, we add a WARN_ON if that is not the case and CONFIG_TRACE_IRQFLAGS is enabled (we don't want to add overhead to the fast path when this is not set though). Thus let's remove the enabling (and associated irq tracing) from the syscall entry path. Also on Book3S, replace a few mfmsr instructions with loads of PACAMSR from the PACA, which should be faster & schedule better. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This moves the inlines into system.h and changes the runlatch code to use the thread local flags (non-atomic) rather than the TIF flags (atomic) to keep track of the latch state. The code to turn it back on in an asynchronous interrupt is now simplified and partially inlined. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The perfmon interrupt is the sole user of a special variant of the interrupt prolog which differs from the one used by external and timer interrupts in that it saves the non-volatile GPRs and doesn't turn the runlatch on. The former is unnecessary and the latter is arguably incorrect, so let's clean that up by using the same prolog. While at it we rename that prolog to use the _ASYNC prefix. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This removes the various bits of assembly in the kernel entry, exception handling and SLB management code that were specific to running under the legacy iSeries hypervisor which is no longer supported. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
This cleans up vio.c after the removal of the legacy iSeries platform. It also removes some no longer referenced include files. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
The PowerPC legacy iSeries platform is being removed along with the "one looney iseries driver", so this code can now be removed as well. cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
The PowerPC legacy iSeries platform is being removed so this is no longer selectable. Cc: Alan Cox <alan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-serial@vger.kernel.org Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
The PowerPC legacy iSeries platform is being removed, so this code is no longer needed. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
The PowerPC legacy iSeries platform is being removed and this code is no longer selectable. There is more clean up that can be done, but this just gets the old code out of the way. Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Brian King <brking@linux.vnet.ibm.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
This driver is specific to the PowerPC legacy iSeries platform which is being removed. Cc: David Miller <davem@davemloft.net> Cc: <netdev@vger.kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 07 Mar, 2012 8 commits
-
-
Akinobu Mita authored
- Use memchr_inv to check if the data contains all 0xFF bytes. It is faster than looping for each byte. - Use memcmp to compare memory areas Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
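The "all bytes are 0xFF" check can be sketched in userspace like this. The kernel's memchr_inv() is not available outside the kernel, so `first_byte_not` below is a hypothetical stand-in with the same contract (pointer to the first byte that differs from `c`, or NULL); it uses a plain byte loop and so does not show the speed-up the commit mentions. `is_erased` and `same_contents` are likewise illustrative names.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for memchr_inv(): first byte in buf that is not c, or NULL. */
static const void *first_byte_not(const void *buf, int c, size_t n)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < n; i++)
        if (p[i] != (unsigned char)c)
            return p + i;
    return NULL;
}

/* True when the whole area still holds the erased pattern 0xFF. */
static int is_erased(const void *buf, size_t n)
{
    return first_byte_not(buf, 0xFF, n) == NULL;
}

/* True when two areas of the same length hold identical bytes. */
static int same_contents(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```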
-
Grant Likely authored
All IRQs on powerpc are managed via irq_domain anyway, there isn't really any advantage to turning SPARSE_IRQ off, and it's the direction we want to take the kernel design anyway. This patch makes powerpc always use SPARSE_IRQ. On pseries_defconfig, SPARSE_IRQ adds only about 0x300 bytes to the .text sections, and removes about 0x20000 from the data section for the static irq_desc table. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Ben Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nishanth Aravamudan authored
On a 16TB system (using AMS/CMO), I get: WARNING: ignoring large property [/ibm,dynamic-reconfiguration-memory] ibm,dynamic-memory length 0x000000000017ffec and significantly less memory is thus shown to the partition. As far as I can tell, the constant used is arbitrary. Ben Herrenschmidt provided additional background that > The limit was originally set because of Apple machines carrying ROM > images in the device-tree, at a time where we were much more memory > constrained than we are now. and that it is likely not very useful any longer. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Matt Fleming authored
As described in e6fa16ab ("signal: sigprocmask() should do retarget_shared_pending()") the modification of current->blocked is incorrect as we need to check whether the signal we're about to block is pending in the shared queue. Also, use the new helper function introduced in commit 5e6292c0 ("signal: add block_sigmask() for adding sigmask to current->blocked") which centralises the code for updating current->blocked after successfully delivering a signal and reduces the amount of duplicate code across architectures. In the past some architectures got this code wrong, so using this helper function should stop that from happening again. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Joe Perches authored
Emit the function name not the address when possible. __builtin_return_address() gives an address. When building a kernel with CONFIG_KALLSYMS, emit the actual function name not the address. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Jimi Xenidis authored
There is a race where a thread causes a coprocessor type to be valid in its own ACOP _and_ in the current context, but it does not propagate to the ACOP register of other threads in time for them to use it. The original code tries to solve this by sending an IPI to all threads on the system, which is heavy handed, but unfortunately still provides a window where the icswx is issued by other threads and the ACOP is not up to date. This patch detects that the ACOP DSI fault was a "false positive" and syncs the ACOP and causes the icswx to be replayed. Signed-off-by: Jimi Xenidis <jimix@pobox.com> Cc: Anton Blanchard <anton@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
Implement atomic_inc_not_zero and atomic64_inc_not_zero. At the moment we use atomic*_add_unless which requires us to put 0 and 1 constants into registers. We can also avoid a subtract by saving the original value in a second temporary.

This removes 3 instructions from fget:

- c0000000001b63c0: 39 00 00 00 li r8,0
- c0000000001b63c4: 39 40 00 01 li r10,1
...
- c0000000001b63e8: 7c 0a 00 50 subf r0,r10,r0

Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
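The semantics of "increment unless zero" can be shown with C11 atomics. This is a portable sketch, not the ppc64 implementation (which is written with load-reserve/store-conditional assembly); the function name `inc_not_zero` is an assumption made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *v unless it is zero. Returns true if the increment happened.
 * Overflow at INT_MAX is not handled in this sketch.
 */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}
```

A typical use is taking a reference on an object only if its reference count has not already dropped to zero.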
-
Anton Blanchard authored
We want to implement a ppc64 specific version of atomic_inc_not_zero so wrap it in an ifdef to allow it to be overridden. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
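The override mechanism described here is the usual "define the name as a macro, guard the fallback with #ifndef" pattern. The sketch below models it in one file with plain ints instead of atomic types; `add_unless`, `arch_inc_not_zero` and `inc_not_zero` are hypothetical names, and the two "header" sections are only marked by comments.

```c
#include <stdbool.h>

/* Generic helper: add a to *v unless *v equals u. */
static inline bool add_unless(int *v, int a, int u)
{
    if (*v == u)
        return false;
    *v += a;
    return true;
}

/* --- "architecture header": supplies its own version and says so --- */
static inline bool arch_inc_not_zero(int *v)
{
    if (*v == 0)
        return false;
    (*v)++;
    return true;
}
#define inc_not_zero(v) arch_inc_not_zero(v)

/* --- "generic header": fallback used only if nothing was supplied --- */
#ifndef inc_not_zero
#define inc_not_zero(v) add_unless((v), 1, 0)
#endif
```

Because the architecture section defines `inc_not_zero` first, the generic fallback is skipped; removing that definition would make the same callers use `add_unless`.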
-
- 27 Feb, 2012 7 commits
-
-
Ira Snyder authored
When the system is under heavy load, we occasionally saw a problem where the system would get a legitimate interrupt when they should be disabled. This was caused by the data_dma_cb() DMA callback unconditionally re-enabling FPGA interrupts even when data dumping is disabled. When data dumping was re-enabled, the irq handler would fire while a DMA was in progress. The "BUG_ON(priv->inflight != NULL);" during the second invocation of the DMA callback caused the system to crash. To fix the issue, the priv->enabled boolean is moved under the protection of the priv->lock spinlock. The DMA callback checks the boolean to know whether to re-enable FPGA interrupts before it returns. Now that it is fixed, the driver keeps FPGA interrupts disabled when it expects that they are disabled, fixing the bug. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
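The shape of the fix (the completion callback consults a flag under the same lock that guards it before re-enabling interrupts) can be sketched in userspace with a pthread mutex standing in for the spinlock. The structure and function names here (`capture_dev`, `set_enabled`, `dma_done`) are hypothetical, and `irq_enabled` merely models the FPGA interrupt-enable state.

```c
#include <pthread.h>
#include <stdbool.h>

struct capture_dev {
    pthread_mutex_t lock;  /* stands in for the driver's spinlock */
    bool enabled;          /* data dumping on/off, protected by lock */
    bool irq_enabled;      /* models the FPGA interrupt-enable state */
};

/* Turn data dumping on or off; interrupts follow the same state. */
static void set_enabled(struct capture_dev *dev, bool on)
{
    pthread_mutex_lock(&dev->lock);
    dev->enabled = on;
    dev->irq_enabled = on;
    pthread_mutex_unlock(&dev->lock);
}

/* DMA completion: re-enable interrupts only if dumping is still on. */
static void dma_done(struct capture_dev *dev)
{
    pthread_mutex_lock(&dev->lock);
    if (dev->enabled)
        dev->irq_enabled = true;
    pthread_mutex_unlock(&dev->lock);
}
```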
-
Ira Snyder authored
Lockdep occasionally complains with the message: INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected This is caused by calling videobuf_dma_unmap() under spin_lock_irq(). To fix the warning, we drop the lock before unmapping and freeing the buffer. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
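The general pattern behind this fix is to detach the buffer while holding the lock and do the expensive teardown after releasing it. The following userspace sketch assumes a pthread mutex in place of the spinlock and uses free() as a stand-in for unmapping and freeing; `buf_owner` and `release_buffer` are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

struct buf_owner {
    pthread_mutex_t lock;  /* stands in for the driver's spinlock */
    void *buf;             /* current buffer, protected by lock */
};

static void release_buffer(struct buf_owner *o)
{
    void *victim;

    /* Only the pointer hand-off happens under the lock. */
    pthread_mutex_lock(&o->lock);
    victim = o->buf;
    o->buf = NULL;
    pthread_mutex_unlock(&o->lock);

    /* Teardown runs unlocked; free(NULL) is a no-op. */
    free(victim);
}
```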
-
Masanari Iida authored
Fix typo "unsuported" to "unsupported" in drivers/macintosh/mediabay.c Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Danny Kukawka authored
arch/powerpc/platforms/powernv/setup.c: included 'asm/xics.h' twice, remove the duplicate. Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Danny Kukawka authored
arch/powerpc/kvm/book3s_hv.c: included 'linux/sched.h' twice, remove the duplicate. Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
After this, we can remove the legacy iSeries code more easily. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
When using a multi-ISU MPIC, we can have up to isu_size * MPIC_MAX_ISU interrupts, not just isu_size, so allocate the right size reverse map. Without this, the code will constantly fall back to a linear search. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
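The sizing arithmetic is simple enough to show directly. In this sketch the value 32 for `MPIC_MAX_ISU` is an assumption chosen for illustration, and `revmap_size` and `fits_in_revmap` are hypothetical helpers, not functions from the MPIC driver.

```c
#define MPIC_MAX_ISU 32u  /* assumed value, for illustration only */

/* Number of reverse-map slots needed for a multi-ISU controller. */
static unsigned int revmap_size(unsigned int isu_size)
{
    return isu_size * MPIC_MAX_ISU;
}

/* Whether a hardware source number can be looked up directly. */
static int fits_in_revmap(unsigned int hwirq, unsigned int size)
{
    return hwirq < size;
}
```

With an isu_size of 16, source number 100 does not fit in a 16-entry map (forcing the slow path) but does fit in the 512-entry map computed above.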
-
- 26 Feb, 2012 3 commits
-
-
Benjamin Herrenschmidt authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds authored
1) ICMP sockets leave err uninitialized but we try to return it for the unsupported MSG_OOB case, reported by Dave Jones.
2) Add new Zaurus device ID entries, from Dave Jones.
3) Pointer calculation in hso driver memset is wrong, from Dan Carpenter.
4) ks8851_probe() checks unsigned value as negative, fix also from Dan Carpenter.
5) Fix crashes in atl1c driver due to TX queue handling, from Eric Dumazet. I anticipate some TX side locking fixes coming in the near future for this driver as well.
6) The inline directive fix in Bluetooth which was breaking the build only with very new versions of GCC, from Johan Hedberg.
7) Fix crashes in the ATM CLIP code due to ARP cleanups this merge window, reported by Meelis Roos and fixed by Eric Dumazet.
8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng.
9) Some ip6_route_output() callers test the return value for NULL, but this never happens as the convention is to return a dst entry with dst->error set. Fixes from RonQing Li.
10) Logitech Harmony 900 should be handled by zaurus driver not cdc_ether, update white lists and black lists accordingly. From Scott Talbert.
11) Receiving from certain kinds of devices there won't be a MAC header, so there is no MAC header to fixup in the IPSEC code, and if we try to do it we'll crash. Fix from Eric Dumazet.
12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny Petrilin.
13) Fix regression in link-down handling in davinci_emac which causes all RX descriptors to be freed up and therefore RX to wedge completely, from Christian Riesch.
14) It took two attempts, but ctnetlink soft lockups seem to be cured now, from Pablo Neira Ayuso.
15) Endianness bug fix in ENIC driver, from Santosh Nayak.
16) The long ago conversion of the PPP fragmentation code over to abstracted SKB list handling wasn't perfect, once we get an out of sequence SKB we don't flush the rest of them like we should. From Ben McKeegan.
17) Fix regression of ->ip_summed initialization in sfc driver. From Ben Hutchings.
18) Bluetooth timeout mistakenly using msecs instead of jiffies, from Andrzej Kaczmarek.
19) Using _sync variant of work cancellation results in deadlocks, use the non _sync variants instead. From Andre Guedes.
20) Bluetooth rfcomm code had reference counting problems leading to crashes, fix from Octavian Purdila.
21) The conversion of netem over to classful qdisc handling added two bugs to netem_dequeue(), fixes from Eric Dumazet.
22) Missing pci_iounmap() in ATM Solos driver. Fix from Julia Lawall.
23) b44_pci_exit() should not have __exit tag since it's invoked from non-__exit code. From Nikola Pajkovsky.
24) The conversion of the neighbour hash tables over to RCU added a race, fixed here by adding the necessary reread of tbl->nht, fix from Michel Machado.
25) When we added VF (virtual function) attributes for network device dumps, this potentially bloats up the size of the dump of one network device such that the dump size is too large for the buffer allocated by properly written netlink applications. In particular, if you add 255 VFs to a network device, parts of GLIBC stop working. To fix this, we add an attribute that is used to turn on these extended portions of the network device dump. Sophisticated applications like 'ip' that want to see this stuff will be changed to set the attribute, whereas things like GLIBC that don't care about VFs simply will not, and therefore won't be busted by the mere presence of VFs on a network device. Thanks to the tireless work of Greg Rose on this fix.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
  sfc: Fix assignment of ip_summed for pre-allocated skbs
  ppp: fix 'ppp_mp_reconstruct bad seq' errors
  enic: Fix endianness bug.
  gre: fix spelling in comments
  netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
  Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
  davinci_emac: Do not free all rx dma descriptors during init
  mlx4_core: Fixing array indexes when setting port types
  phy: IC+101G and PHY_HAS_INTERRUPT flag
  netdev/phy/icplus: Correct broken phy_init code
  ipsec: be careful of non existing mac headers
  Move Logitech Harmony 900 from cdc_ether to zaurus
  hso: memsetting wrong data in hso_get_count()
  netfilter: ip6_route_output() never returns NULL.
  ethernet/broadcom: ip6_route_output() never returns NULL.
  ipv6: ip6_route_output() never returns NULL.
  jme: Fix FIFO flush issue
  atm: clip: remove clip_tbl
  ipv4: ping: Fix recvmsg MSG_OOB error handling.
  rtnetlink: Fix problem with buffer allocation
  ...
-
Linus Torvalds authored
The autofs compat handling fix caused a compile failure when CONFIG_COMPAT isn't defined. Instead of adding random #ifdef'fery in autofs, let's just make the compat helpers easier to use: without CONFIG_COMPAT, is_compat_task() just hardcodes to zero. We could probably do something similar for a number of other cases where we have #ifdef's in code, but this is the low-hanging fruit. Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
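The approach described here, a stub that always returns zero when the option is off so callers need no #ifdef of their own, can be sketched as follows. This is a standalone illustration rather than the kernel header: the declaration in the CONFIG_COMPAT branch is only a placeholder, and `user_pointer_size` is a hypothetical caller.

```c
#include <stddef.h>

#ifdef CONFIG_COMPAT
/* Placeholder: the real definition would come from the architecture. */
int is_compat_task(void);
#else
/* Without CONFIG_COMPAT there are no compat tasks at all. */
static inline int is_compat_task(void)
{
    return 0;
}
#endif

/* Hypothetical caller: no #ifdef needed at the call site. */
static size_t user_pointer_size(void)
{
    return is_compat_task() ? 4 : sizeof(void *);
}
```

Since the stub is a compile-time constant zero, a compiler can drop the compat branch entirely when the option is disabled.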
-
- 25 Feb, 2012 2 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Linus Torvalds authored
Couple of minor driver fixes.

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (max34440) Fix resetting temperature history
  hwmon: (f75375s) Fix register write order when setting fans to full speed
  hwmon: (ads1015) Fix file leak in probe function
  hwmon: (max6639) Fix PPR register initialization to set both channels
  hwmon: (max6639) Fix FAN_FROM_REG calculation
-