Commits · 13395c48c3def3e5afee7bd7a851f901374db5ec · nexedi / linux

17 Sep, 2012 19 commits

powerpc/powernv: Initialize DMA for PEs · 13395c48

Gavin Shan authored Aug 20, 2012

The patch introduces additional wrapper function to call the original
implementation so that the DMA can be configured for all existing PEs.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

13395c48

powerpc/powernv: I/O and MMIO resource assignment for PEs · 11685bec

Gavin Shan authored Aug 20, 2012

There're 2 types of PCI bus sensitive PEs: (A) The PE includes
single PCI bus. (B) The PE includes the PCI bus and all the subordinate
PCI buses, and the patch tries to assign I/O and MMIO resources
based on created PEs. Fortunately, we figured out unified scheme
to do resource assignment for all types of PCI bus based PEs according
to Ben's idea:

        - Resource assignment based on PE from top to bottom.
        - The soureces, either I/O or MMIO, of the PE are figured out
          from the assigned PCI bus.
        - The occupied resource by parent PE could possibilly be overrided
          by children PEs.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

11685bec

powerpc/powernv: PE list based on creation order · 7ebdf956

Gavin Shan authored Aug 20, 2012

The resource (I/O and MMIO) will be assigned on basis of PE from
top to bottom so that we can implement the trick here: the resource
that has been assigned to parent PE could be taken by child PE if
necessary.

The current implementation already has PE list per PHB basis, but
the list doesn't meet our requirment: tracing PE based on their
cration time from top to bottom. So the patch does rename for the
DMA based PE list and introduces the list to trace the PEs sequentially
based on their creation time.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

7ebdf956

powerpc/powernv: Create bus sensitive PEs · fb446ad0

Gavin Shan authored Aug 20, 2012

Basically, there're 2 types of PCI bus sensitive PEs: (A) The PE
includes single PCI bus. (B) The PE includes the PCI bus and all
the subordinate PCI buses. At present, we'd like to put PCI bus
originated by PCI-e link to form PE that contains single PCI bus,
and the PCIe-to-PCI bridge will form the 2nd type of PE. We don't
figure out to detect PLX bridge yet. Once we can detect PLX bridge
some day, we have to put PCI buses originated from the downstream
port of PLX bridge to the 2nd type of PE.

The patch changes the original implementation for a little bit
to support 2 types of PCI bus sensitive PEs described as above.
Also, the function used to retrieve the corresponding PE according
to the given PCI device has been changed based on that because each
PCI device should trace the directly associated PE.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

fb446ad0

powerpc/trace: Fix interrupt tracepoints vs. RCU · e72bbbab

Li Zhong authored Sep 10, 2012

There are a few tracepoints in the interrupt code path, which is before
irq_enter(), or after irq_exit(), like
trace_irq_entry()/trace_irq_exit() in do_IRQ(),
trace_timer_interrupt_entry()/trace_timer_interrupt_exit() in
timer_interrupt().

If the interrupt is from idle(), and because tracepoint contains RCU
read-side critical section, we could see following suspicious RCU usage
reported:

[  145.127743] ===============================
[  145.127747] [ INFO: suspicious RCU usage. ]
[  145.127752] 3.6.0-rc3+ #1 Not tainted
[  145.127755] -------------------------------
[  145.127759] /root/.workdir/linux/arch/powerpc/include/asm/trace.h:33
suspicious rcu_dereference_check() usage!
[  145.127765]
[  145.127765] other info that might help us debug this:
[  145.127765]
[  145.127771]
[  145.127771] RCU used illegally from idle CPU!
[  145.127771] rcu_scheduler_active = 1, debug_locks = 0
[  145.127777] RCU used illegally from extended quiescent state!
[  145.127781] no locks held by swapper/0/0.
[  145.127785]
[  145.127785] stack backtrace:
[  145.127789] Call Trace:
[  145.127796] [c00000000108b530] [c000000000013c40] .show_stack
+0x70/0x1c0 (unreliable)
[  145.127806] [c00000000108b5e0]
[c0000000000f59d8] .lockdep_rcu_suspicious+0x118/0x150
[  145.127813] [c00000000108b680] [c00000000000fc58] .do_IRQ+0x498/0x500
[  145.127820] [c00000000108b750] [c000000000003950]
hardware_interrupt_common+0x150/0x180
[  145.127828] --- Exception: 501 at .plpar_hcall_norets+0x84/0xd4
[  145.127828]     LR = .check_and_cede_processor+0x38/0x70
[  145.127836] [c00000000108bab0] [c0000000000665dc] .shared_cede_loop
+0x5c/0x100
[  145.127844] [c00000000108bb70] [c000000000588ab0] .cpuidle_enter
+0x30/0x50
[  145.127850] [c00000000108bbe0]
[c000000000588b0c] .cpuidle_enter_state+0x3c/0xb0
[  145.127857] [c00000000108bc60] [c000000000589730] .cpuidle_idle_call
+0x150/0x6c0
[  145.127863] [c00000000108bd30] [c000000000058440] .pSeries_idle
+0x10/0x40
[  145.127870] [c00000000108bda0] [c00000000001683c] .cpu_idle
+0x18c/0x2d0
[  145.127876] [c00000000108be60] [c00000000000b434] .rest_init
+0x124/0x1b0
[  145.127884] [c00000000108bef0] [c0000000009d0d28] .start_kernel
+0x568/0x588
[  145.127890] [c00000000108bf90] [c000000000009660] .start_here_common
+0x20/0x40

This is because the RCU usage in interrupt context should be used in
area marked by rcu_irq_enter()/rcu_irq_exit(), called in
irq_enter()/irq_exit() respectively.

Move them into the irq_enter()/irq_exit() area to avoid the reporting.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

e72bbbab

powerpc/mm: Make some of the PGTABLE_RANGE dependency explicit · 78f1dbde

Aneesh Kumar K.V authored Sep 10, 2012

slice array size and slice mask size depend on PGTABLE_RANGE.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

78f1dbde

powerpc/mm: Update VSID allocation documentation · f033d659

Aneesh Kumar K.V authored Sep 10, 2012

This update the proto-VSID and VSID scramble related information
to be more generic by using names instead of current values.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

f033d659

powerpc/mm: Add 64TB support · 048ee099

Aneesh Kumar K.V authored Sep 10, 2012

Increase max addressable range to 64TB. This is not tested on
real hardware yet.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

048ee099

powerpc/mm: Use 32bit array for slb cache · 735cafc3

Aneesh Kumar K.V authored Sep 10, 2012

With larger vsid we need to track more bits of ESID in slb cache
for slb invalidate.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

735cafc3

powerpc/mm: Use the required number of VSID bits in slbmte · ac8dc282

Aneesh Kumar K.V authored Sep 10, 2012

ASM_VSID_SCRAMBLE can leave non-zero bits in the high 28 bits of the result
for 256MB segment (40 bits for 1T segment). Properly mask them before using
the values in slbmte
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

ac8dc282

powerpc/mm: Increase the slice range to 64TB · 7aa0727f

Aneesh Kumar K.V authored Sep 10, 2012

This patch makes the high psizes mask as an unsigned char array
so that we can have more than 16TB. Currently we support upto
64TB
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

7aa0727f

powerpc/mm: Make KERN_VIRT_SIZE not dependend on PGTABLE_RANGE · 67550080

Aneesh Kumar K.V authored Sep 10, 2012

As we keep increasing PGTABLE_RANGE we need not increase the virual
map area for kernel.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

67550080

powerpc/mm: Convert virtual address to vpn · 5524a27d

Aneesh Kumar K.V authored Sep 10, 2012

This patch convert different functions to take virtual page number
instead of virtual address. Virtual page number is virtual address
shifted right by VPN_SHIFT (12) bits. This enable us to have an
address range of upto 76 bits.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

5524a27d

powerpc/mm: Simplify hpte_decode · dcda287a

Aneesh Kumar K.V authored Sep 10, 2012

This patch simplify hpte_decode for easy switching of virtual address to
virtual page number in the later patch
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

dcda287a

powerpc/mm: Use hpt_va to compute virtual address · f6412b74

Aneesh Kumar K.V authored Sep 10, 2012

Don't open code the same
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

f6412b74

powerpc/mm: Replace open coded CONTEXT_BITS value · f0383923

Aneesh Kumar K.V authored Sep 10, 2012

To clarify the meaning for future readers, replace the open coded
19 with CONTEXT_BITS
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

f0383923

powerpc/mm: Fix typo in PTRS_PER_PUD · f3d34445

Scott Wood authored Sep 12, 2012

PTRS_PER_PUD should be based on PUD_INDEX_SIZE, not PMD_INDEX_SIZE.  We
got away with it because PUD and PMD had the same index size, but this is
no longer true with Aneesh's patchset to support a 46-bit user effective
address space.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

f3d34445

powerpc: Add denormalisation exception handling for POWER6/7 · b92a66a6

Michael Neuling authored Sep 10, 2012

On POWER6 and POWER7 if the input operand to an instruction is a
denormalised single precision binary floating point value we can take
a denormalisation exception where it's expected that the hypervisor
(HV=1) will fix up the inputs before the instruction is run.

This adds code to handle this denormalisation exception for POWER6 and
POWER7.

It also add a CONFIG_PPC_DENORMALISATION option and sets it in
pseries/ppc64_defconfig.

This is useful on bare metal systems only.  Based on patch from Milton
Miller.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

b92a66a6

Merge remote-tracking branch 'pci/pci/gavin-window-alignment' into next · eda485f0

Benjamin Herrenschmidt authored Sep 17, 2012

Merge Gavin patches from the PCI tree as subsequent powerpc
patches are going to depend on them
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

eda485f0

13 Sep, 2012 1 commit

Merge commit 'v3.6-rc5' into pci/gavin-window-alignment · 9a5d5bd8

Bjorn Helgaas authored Sep 13, 2012

* commit 'v3.6-rc5': (1098 commits)
  Linux 3.6-rc5
  HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
  Remove user-triggerable BUG from mpol_to_str
  xen/pciback: Fix proper FLR steps.
  uml: fix compile error in deliver_alarm()
  dj: memory scribble in logi_dj
  Fix order of arguments to compat_put_time[spec|val]
  xen: Use correct masking in xen_swiotlb_alloc_coherent.
  xen: fix logical error in tlb flushing
  xen/p2m: Fix one-off error in checking the P2M tree directory.
  powerpc: Don't use __put_user() in patch_instruction
  powerpc: Make sure IPI handlers see data written by IPI senders
  powerpc: Restore correct DSCR in context switch
  powerpc: Fix DSCR inheritance in copy_thread()
  powerpc: Keep thread.dscr and thread.dscr_inherit in sync
  powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
  powerpc/powernv: Always go into nap mode when CPU is offline
  powerpc: Give hypervisor decrementer interrupts their own handler
  powerpc/vphn: Fix arch_update_cpu_topology() return value
  ARM: gemini: fix the gemini build
  ...

Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
	drivers/rapidio/devices/tsi721.c

9a5d5bd8

11 Sep, 2012 5 commits

powerpc/powernv: I/O and memory alignment for P2P bridges · 271fd03a

Gavin Shan authored Sep 11, 2012

The patch implements ppc_md.pcibios_window_alignment for powernv
platform so that the resource reassignment in PCI core will be
done according to the I/O and memory alignment returned from
powernv platform. The alignments returned from powernv platform
is closely depending on the scheme for PE segmenting. Besides,
the patch isn't useful for now, but the subsequent patches will
be working based on it.

[bhelgaas: use pci_pcie_type() since pci_dev.pcie_type was removed]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

271fd03a

powerpc/PCI: Override pcibios_window_alignment() · 4c2245bb

Gavin Shan authored Sep 11, 2012

This patch implements pcibios_window_alignment() so powerpc platforms can
force P2P bridge windows to be at larger alignments than the PCI spec
requires.

[bhelgaas: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

4c2245bb

PCI: Refactor pbus_size_mem() · c121504e

Gavin Shan authored Sep 11, 2012

The original idea comes from Ram Pai.  This patch puts the chunk of
code for calculating the minimal alignment of memory window into a
separate inline function.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

c121504e

PCI: Align P2P windows using pcibios_window_alignment() · 462d9303

Gavin Shan authored Sep 11, 2012

This patch changes pbus_size_io() and pbus_size_mem() to do window (I/O,
memory and prefetchable memory) reassignment based on the minimal
alignments for the P2P bridge, which was retrieved by window_alignment().

[bhelgaas: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

462d9303

PCI: Add weak pcibios_window_alignment() interface · ac5ad93e

Gavin Shan authored Sep 11, 2012

This patch implements a weak function to return the default I/O or memory
window alignment for a P2P bridge.  By default, I/O windows are aligned to
4KiB or 1KiB and memory windows are aligned to 4MiB.  Some platforms, e.g.,
powernv, have special alignment requirements and can override
pcibios_window_alignment().

[bhelgaas: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

ac5ad93e

10 Sep, 2012 2 commits

powerpc/mm: Match variable types to API · 6b5e7229

Joe MacDonald authored Aug 21, 2012

sys_subpage_prot() takes an unsigned long for 'addr' then does some stuff
with it and the result is stored in a signed int, i, which is eventually
used as the size parameter in a copy_from_user call.  Update 'i' to be an
unsigned long as well and since 'nw' is used in a size_t context which,
depending on whether this is 32- or 64-bit may be unsigned int or unsigned
long, switch that to a size_t and always be right.

Finally, since we're in the neighbourhood, make the same changes to
subpage_prot_clear().

Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Joe MacDonald <joe.macdonald@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

6b5e7229

powerpc/iommu: Add ppc_md.tce_get() callback for use by VFIO · 11f63d3f

Alexey Kardashevskiy authored Sep 04, 2012

The upcoming VFIO support requires a way to know which
entry in the TCE map is not empty in order to do cleanup
at QEMU exit/crash. This patch adds such functionality
to POWERNV platform code.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

11f63d3f

09 Sep, 2012 13 commits

powerpc: cleanup old DABRX #defines · 4715fed6

Michael Neuling authored Sep 06, 2012

These are no longer used so get rid of them
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

4715fed6

powerpc: Dynamically calculate the dabrx based on kernel/user/hypervisor · cd144573

Michael Neuling authored Sep 06, 2012

Currently we mark the DABRX to interrupt on all matches
(hypervisor/kernel/user and then filter in software.  We can be a lot
smarter now that we can set the DABRX dynamically.

This sets the DABRX based on the flags passed by the user.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

cd144573

powerpc: Rework set_dabr so it can take a DABRX value as well · 4474ef05

Michael Neuling authored Sep 06, 2012

Rework set_dabr to take a DABRX value as well.

Both the pseries and PS3 hypervisors do some checks on the DABRX
values that are passed in the hcall.  This patch stops bogus values
from being passed to hypervisor.  Also, in the case where we are
clearing the breakpoint, where DABR and DABRX are zero, we modify the
DABRX value to make it valid so that the hcall won't fail.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

4474ef05

powerpc/eeh: Cleanup on EEH PCI address cache · 3ab96a02