Commits · 751e1f5099f1568444fe2485f2485ca541d4952e · nexedi / linux

19 May, 2011 40 commits

powerpc: Remove unused/obsolete CONFIG_XICS · 751e1f50
Benjamin Herrenschmidt authored May 19, 2011
```
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
```
751e1f50

misc: Add CARMA DATA-FPGA Programmer support · 0e1d715b

Ira Snyder authored Feb 11, 2011

This adds support for programming the data processing FPGAs on the OVRO
CARMA board. These FPGAs have a special programming sequence that
requires that we program the Freescale DMA engine, which is only
available inside the kernel.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

0e1d715b

misc: Add CARMA DATA-FPGA Access Driver · c186f0e1

Ira Snyder authored Feb 11, 2011

This driver allows userspace to access the data processing FPGAs on the
OVRO CARMA board. It has two modes of operation:

1) random access

This allows users to poke any DATA-FPGA registers by using mmap to map
the address region directly into their memory map.

2) correlation dumping

When correlating, the DATA-FPGA's have special requirements for getting
the data out of their memory before the next correlation. This nominally
happens at 64Hz (every 15.625ms). If the data is not dumped before the
next correlation, data is lost.

The data dumping driver handles buffering up to 1 second worth of
correlation data from the FPGAs. This lowers the realtime scheduling
requirements for the userspace process reading the device.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

c186f0e1

powerpc: Make IRQ_NOREQUEST last to clear, first to set · 41fb5e62

Milton Miller authored May 10, 2011

When creating an irq, don't allow a concurent driver request until
we have caled map, which will likley call set_chip_and_handler to
change the irq_chip and its operations.

Similarly, when tearing down an IRQ, make sure no new uses come
along while we change the irq back to the nop chip and then reset
the descriptor to freed status.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

41fb5e62

powerpc: Remove virq_to_host · 1e8c2301

Milton Miller authored May 10, 2011

The only references to the irq_map[].host field are internal to
arch/powerpc/kernel/irq.c
Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

1e8c2301

powerpc: Add virq_is_host to reduce virq_to_host usage · 3ee62d36

Milton Miller authored May 10, 2011

Some irq_host implementations are using virq_to_host to check if
they are the irq_host for a virtual irq.  To allow us to make space
versus time tradeoffs, replace this usage with an assertive
virq_is_host that confirms or denies the irq is associated with the
given irq_host.
Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

3ee62d36

powerpc/axon_msi: Validate msi irq via chip_data · 95533614

Milton Miller authored May 10, 2011

Instead of checking for rogue msi numbers via the irq_map host field
set the chip_data to h.host_data (which is the msic struct pointer)
at map and compare it in get_irq.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

95533614

powerpc/spider-pic: Get pic from chip_data instead of irq_map · 6b0aea44

Milton Miller authored May 10, 2011

Building on Grant's efforts to remove the irq_map array, this patch
moves spider-pics use of virq_to_host() to use irq_data_get_chip_data
and sets the irq chip data in the map call, like most other interrupt
controllers in powerpc.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

6b0aea44

powerpc: Remove irq_host_ops->remap hook · da051980

Milton Miller authored May 10, 2011

It was called from irq_create_mapping if that was called for a host
and hwirq that was previously mapped, "to update the flags". But the
only implementation was in beat_interrupt and all it did was repeat a
hypervisor call without error checking that was performed with error
checking at the beginning of the map hook. In addition, the comment on
the beat remap hook says it will only called once for a given mapping,
which would apply to map not remap.

All flags should be known by the time the match hook is called, before
we call the map hook. Removing this mostly unused hook will simpify
the requirements of irq_domain concept.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

da051980

powerpc/psurge: Create a irq_host for secondary cpus · 23f73a5f

Milton Miller authored May 10, 2011

Create a dummy irq_host using the generic dummy irq chip for the secondary
cpus to use. Create a direct irq mapping for the ipi and register the
ipi action handler against it. If for some unlikely reason part of this
fails then don't detect the secondary cpus.

This removes another instance of NO_IRQ_IGNORE, records the ipi stats
for the secondary cpus, and runs the ipi on the interrupt stack.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

23f73a5f

powerpc/mpc62xx_pic: Fix get_irq handling of NO_IRQ · 67347eba

Milton Miller authored May 10, 2011

If none of irq category bits were set mpc52xx_get_irq() would pass
NO_IRQ_IGNORE (-1) to irq_linear_revmap, which does an unsigned compare
and declares the interrupt above the linear map range.  It then punts
to irq_find_mapping, which performs a linear search of all irqs,
which will likely miss and only then return NO_IRQ.

If no status bit is set, then we should return NO_IRQ directly.
The interrupt should not be suppressed from spurious counting, in fact
that is the definition of supurious.
Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

67347eba

powerpc/mpc5121_ads_cpld: Remove use of NO_IRQ_IGNORE · c42385cd

Milton Miller authored May 10, 2011

As NO_IRQ_IGNORE is only used between the static function cpld_pic_get_irq
and its caller cpld_pic_cascade, and cpld_pic_cascade only uses it to
suppress calling handle_generic_irq, we can change these uses to NO_IRQ
and remove the extra tests and pathlength in cpld_pic_cascade.
Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

c42385cd

powerpc/fsl_msi: Use chip_data not handler_data · d1921bcd

Milton Miller authored May 10, 2011

handler_data should be reserved for flow handlers on the dependent
irq, not consumed by the parent irq code that is part of the irq_chip
code. The msi_data pointer was already set in msidesc->irqhost->hostdata
and being copied to irq_data->chipdata in the msidesc->irqhost->map()
method called via create_irq_mapping, so we can obtain the pointer
from there and free the instance it in teardown_msi_irqs.

Also remove the unnecessary cast of irq_get_handler_data in the
cascade handler, which is the demux flow handler of the parent
msi interrupt. (This is the expected usage for handler_data).
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

d1921bcd

powerpc/fsl_msi: Don't abuse platform_data for driver_data · 6c4c82e2

Milton Miller authored May 10, 2011

The msi platform device driver was abusing dev.platform_data for its
platform_driver_data. Use the correct pointer for storage.

Platform_data is supposed to be for platforms to communicate to drivers
parameters that are not otherwise discoverable. Its lifetime matches
the platform_device not the platform device driver. It is generally
not needed for drivers that only support systems with device trees.
Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

6c4c82e2

powerpc: Remove i8259 irq_host_ops->unmap · 7ee342bd

Milton Miller authored May 10, 2011

It was never called because the host is always IRQ_HOST_MAP_LEGACY.

And what it purported to do was mask the interrupt (which will already
have happend if we shutdown the interrupt), then synchronise_irq and
clear the chip pointer, both of which will have been be done by the
caller were we to call unmap on a legacy irq.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

7ee342bd

powerpc: Remove trival irq_host_ops.unmap · df74e70a

Milton Miller authored May 10, 2011

These all just clear chip or chipdata fields, which will be done
by the generic code when we call irq_free_descs.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

df74e70a

powerpc: Return early if irq_host lookup type is wrong · 2d441681

Milton Miller authored May 10, 2011

If for some reason the code incrorectly calls the wrong function to
manage the revmap, not only should we warn, we should take action.
However, in the paths we expect to be taken every delivered interrupt
change to WARN_ON_ONCE.  Use the if (WARN_ON(x)) format to get the
unlikely for free.
Signed-off-by: Milton Miller <miltonm@bga.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2d441681

powerpc: Radix trees are available before init_IRQ · 3af259d1

Milton Miller authored May 10, 2011

Since the generic irq code uses a radix tree for sparse interrupts,
the initcall ordering has been changed to initialize radix trees before
irqs.   We no longer need to defer creating revmap radix trees to the
arch_initcall irq_late_init.

Also, the kmem caches are allocated so we don't need to use
zalloc_maybe_bootmem.
Signed-off-by: Milton Miller <miltonm@bga.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

3af259d1

powerpc/xics: Cleanup xics_host_map and ipi · e085255e

Milton Miller authored May 10, 2011

Since we already have a special case in map to set the ipi handler, use
the desired flow.

If we don't find an ics to handle the interrupt complain instead of
returning 0 without having set a chip or handler.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

e085255e

powerpc: Use bytes instead of bitops in smp ipi multiplexing · 71454272

Milton Miller authored May 10, 2011

Since there are only 4 messages, we can replace the atomic bit set
(which uses atomic load reserve and store conditional sequence) with
a byte stores to seperate bytes.  We still have to perform a load
reserve and store conditional sequence to avoid loosing messages on
reception but we can do that with a single call to xchg.

The do {} while and __BIG_ENDIAN specific mask testing was chosen by
looking at the generated asm code.  On gcc-4.4, the bit masking becomes
a simple bit mask and test of the register returned from xchg without
storing and loading the value to the stack like attempts with a union
of bytes and an int (or worse, loading single bit constants from the
constant pool into non-voliatle registers that had to be preseved on
the stack).  The do {} while avoids an unconditional branch to the
end of the loop to test the entry / repeat condition of a while loop
and instead optimises for the expected single iteration of the loop.

We have a full mb() at the beginning to cover ordering between send,
ipi, and receive so we can use xchg_local and forgo the further
acquire and release barriers of xchg.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

71454272

powerpc: Add kconfig for muxed smp ipi support · 1ece355b

Milton Miller authored May 10, 2011

Compile the new smp ipi mux and demux code only if a platform
will make use of it.  The new config is selected as required.

The new cause_ipi smp op is only available conditionally to point out
configs where the select is required; this makes setting the op an
immediate fail instead of a deferred unresolved symbol at link.

This also creates a new config for power surge powermac upgrade support
that can be disabled in expert mode but is default on.

I also removed the depends / default y on CONFIG_XICS since it is selected
by PSERIES.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

1ece355b

powerpc: Consolidate ipi message mux and demux · 23d72bfd

Milton Miller authored May 10, 2011

Consolidate the mux and demux of ipi messages into smp.c and call
a new smp_ops callback to actually trigger the ipi.

The powerpc architecture code is optimised for having 4 distinct
ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
single, scheduler ipi, and enter debugger). However, several interrupt
controllers only provide a single software triggered interrupt that
can be delivered to each cpu. To resolve this limitation, each smp_ops
implementation created a per-cpu variable that is manipulated with atomic
bitops. Since these lines will be contended they are optimialy marked as
shared_aligned and take a full cache line for each cpu. Distro kernels
may have 2 or 3 of these in their config, each taking per-cpu space
even though at most one will be in use.

This consolidation removes smp_message_recv and replaces the single call
actions cases with direct calls from the common message recognition loop.
The complicated debugger ipi case with its muxed crash handling code is
moved to debug_ipi_action which is now called from the demux code (instead
of the multi-message action calling smp_message_recv).

I put a call to reschedule_action to increase the likelyhood of correctly
merging the anticipated scheduler_ipi() hook coming from the scheduler
tree; that single required call can be inlined later.

The actual message decode is a copy of the old pseries xics code with its
memory barriers and cache line spacing, augmented with a per-cpu unsigned
long based on the book-e doorbell code. The optional data is set via a
callback from the implementation and is passed to the new cause-ipi hook
along with the logical cpu number. While currently only the doorbell
implemntation uses this data it should be almost zero cost to retrieve and
pass it -- it adds a single register load for the argument from the same
cache line to which we just completed a store and the register is dead
on return from the call. I extended the data element from unsigned int
to unsigned long in case some other code wanted to associate a pointer.

The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
conditioned on the CPU_DBELL feature. The ifdef guard could be relaxed
to CONFIG_SMP but I left it with BOOKE for now.

Also, the doorbell interrupt vector for book-e was not calling irq_enter
and irq_exit, which throws off cpu accounting and causes code to not
realize it is running in interrupt context. Add the missing calls.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

23d72bfd

powerpc: Move smp_ops_t from machdep.h to smp.h · 17f9c8a7

Milton Miller authored May 10, 2011

I can't see any reason these functions are needed by machdep.h
and they are all hidden by CONFIG_SMP with no UP alternative.

Also move the declarations for the fallback timebase ops, which
are used to fill in the smp ops.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

17f9c8a7

powerpc: Remove stubbed beat smp support · d4fc8fe1

Milton Miller authored May 10, 2011

I have no idea if the beat hypervisor supports multiple cpus in
a partition, but the code has not been touched since these stubs
were added in February of 2007 except to move them in April of 2008.
These are stubs: start_cpu always returns fail (which is dropped),
the message passing and reciving are empty functions, and the top
of file comment says "Incomplete".
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

d4fc8fe1

powerpc: Remove alloc_maybe_bootmem for zalloc version · a56555e5

Milton Miller authored May 10, 2011

Replace all remaining callers of alloc_maybe_bootmem with
zalloc_maybe_bootmem. The callsite in pci_dn is followed with a
memset to clear the memory, and not zeroing at the other callsites
in the celleb fake pci code could lead to following uninitialized
memory as pointers or even freeing said pointers on error paths.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

a56555e5

powerpc: Remove powermac/pic.h · 7ca8aa09

Milton Miller authored May 10, 2011

Its unused, and of the three declarations, one is duplicated in pmac.h,
the second is static and the third is renamed and static.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

7ca8aa09

powerpc/mpic: Simplify ipi cpu mask handling · 3caba98f

Milton Miller authored May 10, 2011

Now that MSG_ALL and MSG_ALL_BUT_SELF have been eliminated,
smp_mpic_mesage_pass no longer needs to lookup the cpumask just to
have mpic_send_ipi extract part of it and recode it in a NR_CPUS loop
by mpic_physmask.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

3caba98f

powerpc: Remove checks for MSG_ALL and MSG_ALL_BUT_SELF · f1072939

Milton Miller authored May 10, 2011

Now that smp_ops->smp_message_pass is always called with an (online) cpu
number for the target remove the checks for MSG_ALL and MSG_ALL_BUT_SELF.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

f1072939

powerpc: Remove call sites of MSG_ALL_BUT_SELF · e0476371

Milton Miller authored May 10, 2011

The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc,
and it only uses it to start the debugger. Both debuggers always call
smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do
anything more optimal than a loop over all online cpus, but all message
passing implementations have to code for this special delivery target.

Convert smp_send_debugger_break to take void and loop calling the smp_ops
message_pass function for each of the other cpus in the online cpumask.

Use raw_smp_processor_id() because we are either entering the debugger
or trying to start kdump and the additional warning it not useful were
it to trigger.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

e0476371

powerpc/mpic: Break cpumask abstraction earlier · 2a116f3d

Milton Miller authored May 10, 2011

mpic_set_affinity is allocating and freeing a cpumask var even though
it was breaking the cpumask abstraction when passing the mask to
mpic_physmask. It also didn't have any check for allocatin failure.

Break the cpumask abstraction earlier and use simple bitwise and of the
bits from the mask with the bits of cpu_online_mask.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2a116f3d

powerpc/mpic: Limit NR_CPUS loop to 32 bit · ebc04215

Milton Miller authored May 10, 2011

mpic_physmask was looping NR_CPUS times over a mask that was passed as
a u32. Since mpic is architecturaly limited to 32 physical cpus, clamp
the logical cpus to 32 when compiling (we could also clamp at runtime
to nr_cpu_ids).
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

ebc04215

powerpc: Call no-longer static setup_nr_cpu_ids instead of replicating it · aa79bc21

Milton Miller authored May 10, 2011

c1854e00 (powerpc: Set nr_cpu_ids early
and use it to free PACAs) copied the formerly static setup_nr_cpu_ids
from init/main.c but 34db18a0 (smp:
move smp setup functions to kernel/smp.c) moved it to kernel/smp.c
with a declaration in include/linux/smp.h, so we can call it instead of
replicating it.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

aa79bc21

powerpc: Use nr_cpu_ids in initial paca allocation · 2cd947f1

Milton Miller authored May 10, 2011

Now that we never set a cpu above nr_cpu_ids possible we can
limit our initial paca allocation to nr_cpu_ids.  We can then
clamp the number of cpus in platforms/iseries/setup.c.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2cd947f1

powerpc: Respect nr_cpu_ids when calling set_cpu_possible and set_cpu_present · 8657ae28

Milton Miller authored May 10, 2011

We should not set cpus above nr_cpu_ids to possible.  While we
will trigger a warning with CONFIG_CPUMASK_DEBUG, even then the mask
initializers will set the bits beyond what the iterators check and cause
nr_cpu_ids to increase.

Respecting nr_cpu_ids during setup will allow us to use it in our initial
paca allocation.  It can be reduced from NR_CPUS by the existing early param
nr_cpus=, which was added in 2b633e3f (smp:
Use nr_cpus= to set nr_cpu_ids early).  We already call parse_early_parms
between finding the command line and allocating the pacas.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

8657ae28

powerpc/iseries: Cleanup and fix secondary startup · 7c827337

Milton Miller authored May 10, 2011

9cb82f2f (Make iSeries spin on
__secondary_hold_spinloop, like pSeries) added a load of current_set
but this load was repeated later and we don't even have the paca yet.
It also checked __secondary_hold_spinloop with a 32 bit compare instead
of a 64 bit compare.

b6f6b98a (Don't spin on sync instruction
at boot time) missed the copy of the startup code in iseries.

1426d5a3 (Dynamically allocate pacas)
doesn't allow for pacas to be less than lppacas and recalculated the paca
location from the cpu id in r0 every time through the secondary loop.

Various revisions over time made the comments on conditional branches
confusing with respect to being a hold loop or forward progress

Mostly in-order description of the changes:

Replicate the few lines of code saved by the ugly scoped ifdef CONFIG_SMP
in the secondary loop between yielding on UP and marking time with the
hypervisor on SMP.  Always compile the iseries_secondary_yield loop and
use it if the cpu id is above nr_cpu_ids.  Change all forward progress
paths to be forward branches to the next numerical label.  Assign a
label to all loops.  Move all sync instructions from the loops to the
forward progress path.  Wait to load current_set until paca is set to go.
Move the iseries_secondary_smp_loop label to cover the whole spin loop.
Add HMT_MEDIUM when we make forward progress.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

7c827337

powerpc/kdump64: Don't reference freed memory as pacas · bd9e5eef

Milton Miller authored May 10, 2011

Starting with 1426d5a3 (powerpc:
Dynamically allocate pacas) the space for pacas beyond cpu_possible
is freed, but we failed to update the loop in crash.c.

Since c1854e00 (powerpc: Set nr_cpu_ids
early and use it to free PACAs) the number of pacas allocated is
always nr_cpu_ids.
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <stable@kernel.org> # .34.x
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

bd9e5eef

powerpc: Don't search for paca in freed memory · 768d18ad

Milton Miller authored May 10, 2011

Starting with 1426d5a3 (powerpc:
Dynamically allocate pacas) we free the memory for pacas beyond
cpu_possible, but we failed to update the loop the secondary cpus use
to find their paca.  If the system has running cpu threads for which
the kernel did not allocate a paca for they will search the memory that
was freed.  For instance this could happen when the device tree for
a kdump kernel was not updated after a cpu hotplug, or the kernel is
running with more cpus than the kernel was configured.

Since c1854e00 (powerpc: Set nr_cpu_ids
early and use it to free PACAs) we set nr_cpu_ids before telling the
cpus to advance, so use that to limit the search.

We can't reference nr_cpu_ids without CONFIG_SMP because it is defined
as 1 instead of a memory location, but any extra threads should be sent
to kexec_wait in that case anyways, so make that explicit and remove
the search loop for UP.

Note to stable: The fix also requires
c1854e00 (powerpc: Set
nr_cpu_ids early and use it to free PACAs) to function.  Also
9d07bc84 (Properly handshake CPUs going
out of boot spin loop) affects the second chunk, specifically the branch
target was 3b before and is 4b after that patch, and there was a blank
line before the #ifdef CONFIG_SMP that was removed

Cc: <stable@kernel.org> # .34.x: c1854e00 powerpc: Set nr_cpu_ids early
Cc: <stable@kernel.org> # .34.x
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

768d18ad

powerpc/kexec: Fix memory corruption from unallocated slaves · 3d2cea73

Milton Miller authored May 10, 2011

Commit 1fc711f7 (powerpc/kexec: Fix race
in kexec shutdown) moved the write to signal the cpu had exited the kernel
from before the transition to real mode in kexec_smp_wait to kexec_wait.

Unfornately it missed that kexec_wait is used both by cpus leaving the
kernel and by secondary slave cpus that were not allocated a paca for
what ever reason -- they could be beyond nr_cpus or not described in
the current device tree for whatever reason (for example, kexec-load
was not refreshed after a cpu hotplug operation). Cpus coming through
that path they will write to paca[NR_CPUS] which is beyond the space
allocated for the paca data and overwrite memory not allocated to pacas
but very likely still real mode accessable).

Move the write back to kexec_smp_wait, which is used only by cpus that
found their paca, but after the transition to real mode.
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <stable@kernel.org> # (1fc711f7 was backported to 2.6.32)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

3d2cea73

powerpc/pseries: Print corrupt r3 in FWNMI code · f0e939ae

Anton Blanchard authored May 10, 2011

I have a report of an FWNMI with an r3 value that we think is
corrupt, but since we don't print r3 we have no idea what was
wrong with it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

f0e939ae

pseries/iommu: Restore iommu table pointer when restoring iommu ops · eb0dd411

Nishanth Aravamudan authored May 09, 2011

When we swtich to direct dma ops, we set the dma data union to have the
dma offset.  When we switch back to iommu table ops because of a later
dma_set_mask, we need to restore the iommu table pointer. Without this
change, crashes have been observed on kexec where (for reasons still
being investigated) we fall back to a 32-bit dma mask on a particular
device and then panic because the table pointer is not valid.

The easiset way to find this value is to call
pci_dma_dev_setup_pSeriesLP which will search up the pci tree until it
finds the node with the table.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Milton Miller <miltonm@bga.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

eb0dd411