1. 28 Jun, 2009 1 commit
    • H. Peter Anvin's avatar
      Revert "x86: cap iomem_resource to addressable physical memory" · ff8a4bae
      H. Peter Anvin authored
      This reverts commit 95ee14e4.
      Mikael Petterson <mikepe@it.uu.se> reported that at least one of his
      systems will not boot as a result.  We have ruled out the detection
      algorithm malfunctioning, so it is not a matter of producing the
      incorrect bitmasks; rather, something in the application of them
      fails.
      
      Revert the commit until we can root cause and correct this problem.
      
      -stable team: this means the underlying commit should be rejected.
      Reported-and-isolated-by: default avatarMikael Petterson <mikpe@it.uu.se>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <200906261559.n5QFxJH8027336@pilspetsen.it.uu.se>
      Cc: stable@kernel.org
      Cc: Grant Grundler <grundler@parisc-linux.org>
      ff8a4bae
  2. 25 Jun, 2009 5 commits
    • Pallipadi, Venkatesh's avatar
      x86, delay: tsc based udelay should have rdtsc_barrier · e888d7fa
      Pallipadi, Venkatesh authored
      delay_tsc needs rdtsc_barrier to provide proper delay.
      
      Output from a test driver using hpet to cross check delay
      provided by udelay().
      
      Before:
      [   86.794363] Expected delay 5us actual 4679ns
      [   87.154362] Expected delay 5us actual 698ns
      [   87.514162] Expected delay 5us actual 4539ns
      [   88.653716] Expected delay 5us actual 4539ns
      [   94.664106] Expected delay 10us actual 9638ns
      [   95.049351] Expected delay 10us actual 10126ns
      [   95.416110] Expected delay 10us actual 9568ns
      [   95.799216] Expected delay 10us actual 9638ns
      [  103.624104] Expected delay 10us actual 9707ns
      [  104.020619] Expected delay 10us actual 768ns
      [  104.419951] Expected delay 10us actual 9707ns
      
      After:
      [   50.983320] Expected delay 5us actual 5587ns
      [   51.261807] Expected delay 5us actual 5587ns
      [   51.565715] Expected delay 5us actual 5657ns
      [   51.861171] Expected delay 5us actual 5587ns
      [   52.164704] Expected delay 5us actual 5726ns
      [   52.487457] Expected delay 5us actual 5657ns
      [   52.789338] Expected delay 5us actual 5726ns
      [   57.119680] Expected delay 10us actual 10755ns
      [   57.893997] Expected delay 10us actual 10615ns
      [   58.261287] Expected delay 10us actual 10755ns
      [   58.620505] Expected delay 10us actual 10825ns
      [   58.941035] Expected delay 10us actual 10755ns
      [   59.320903] Expected delay 10us actual 10615ns
      [   61.306311] Expected delay 10us actual 10755ns
      [   61.520542] Expected delay 10us actual 10615ns
      Signed-off-by: default avatarVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      e888d7fa
    • H. Peter Anvin's avatar
      x86, setup: correct include file in <asm/boot.h> · 658dbfeb
      H. Peter Anvin authored
      <asm/boot.h> needs <asm/pgtable_types.h>, not <asm/page_types.h> in
      order to resolve PMD_SHIFT.  Also, correct a +1 which really should be
      + THREAD_ORDER.
      
      This is a build error which was masked by a typoed #ifdef.
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      658dbfeb
    • Robert P. J. Day's avatar
      x86, setup: Fix typo "CONFIG_x86_64" in <asm/boot.h> · 22f4319d
      Robert P. J. Day authored
      CONFIG_X86_64 was misspelled (wrong case), which caused the x86-64
      kernel to advertise itself as more relocatable than it really is.
      This could in theory cause boot failures once bootloaders start
      support the new relocation fields.
      Signed-off-by: default avatarRobert P. J. Day <rpjday@crashcourse.ca>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      22f4319d
    • Hidetoshi Seto's avatar
      x86, mce: percpu mcheck_timer should be pinned · 5be6066a
      Hidetoshi Seto authored
      If CONFIG_NO_HZ + CONFIG_SMP, timer added via add_timer() might
      be migrated on other cpu.  Use add_timer_on() instead.
      
      Avoids the following failure:
      
      Maciej Rutecki wrote:
      > > After normal boot I try:
      > >
      > > echo 1 > /sys/devices/system/machinecheck/machinecheck0/check_interval
      > >
      > > I found this in dmesg:
      > >
      > > [  141.704025] ------------[ cut here ]------------
      > > [  141.704039] WARNING: at arch/x86/kernel/cpu/mcheck/mce.c:1102
      > > mcheck_timer+0xf5/0x100()
      Reported-by: default avatarMaciej Rutecki <maciej.rutecki@gmail.com>
      Signed-off-by: default avatarHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Tested-by: default avatarMaciej Rutecki <maciej.rutecki@gmail.com>
      Acked-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      5be6066a
    • Kurt Garloff's avatar
      x86: Add sysctl to allow panic on IOCK NMI error · 5211a242
      Kurt Garloff authored
      This patch introduces a new sysctl:
      
          /proc/sys/kernel/panic_on_io_nmi
      
      which defaults to 0 (off).
      
      When enabled, the kernel panics when the kernel receives an NMI
      caused by an IO error.
      
      The IO error triggered NMI indicates a serious system
      condition, which could result in IO data corruption. Rather
      than contiuing, panicing and dumping might be a better choice,
      so one can figure out what's causing the IO error.
      
      This could be especially important to companies running IO
      intensive applications where corruption must be avoided, e.g. a
      bank's databases.
      
      [ SuSE has been shipping it for a while, it was done at the
        request of a large database vendor, for their users. ]
      Signed-off-by: default avatarKurt Garloff <garloff@suse.de>
      Signed-off-by: default avatarRoberto Angelino <robertangelino@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      LKML-Reference: <20090624213211.GA11291@kroah.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      5211a242
  3. 24 Jun, 2009 1 commit
    • Cliff Wickman's avatar
      x86: Fix uv bau sending buffer initialization · 9c26f52b
      Cliff Wickman authored
      The initialization of the UV Broadcast Assist Unit's sending
      buffers was making an invalid assumption about the
      initialization of an MMR that defines its address.
      
      The BIOS will not be providing that MMR.  So
      uv_activation_descriptor_init() should unconditionally set it.
      
      Tested on UV simulator.
      Signed-off-by: default avatarCliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org> # for v2.6.30.x
      LKML-Reference: <E1MJTfj-0005i1-W8@eag09.americas.sgi.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      9c26f52b
  4. 23 Jun, 2009 2 commits
  5. 22 Jun, 2009 9 commits
    • Ingo Molnar's avatar
    • Tejun Heo's avatar
      x86: ensure percpu lpage doesn't consume too much vmalloc space · 0017c869
      Tejun Heo authored
      On extreme configuration (e.g. 32bit 32-way NUMA machine), lpage
      percpu first chunk allocator can consume too much of vmalloc space.
      Make it fall back to 4k allocator if the consumption goes over 20%.
      
      [ Impact: add sanity check for lpage percpu first chunk allocator ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarJan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      0017c869
    • Tejun Heo's avatar
      x86: implement percpu_alloc kernel parameter · fa8a7094
      Tejun Heo authored
      According to Andi, it isn't clear whether lpage allocator is worth the
      trouble as there are many processors where PMD TLB is far scarcer than
      PTE TLB.  The advantage or disadvantage probably depends on the actual
      size of percpu area and specific processor.  As performance
      degradation due to TLB pressure tends to be highly workload specific
      and subtle, it is difficult to decide which way to go without more
      data.
      
      This patch implements percpu_alloc kernel parameter to allow selecting
      which first chunk allocator to use to ease debugging and testing.
      
      While at it, make sure all the failure paths report why something
      failed to help determining why certain allocator isn't working.  Also,
      kill the "Great future plan" comment which had already been realized
      quite some time ago.
      
      [ Impact: allow explicit percpu first chunk allocator selection ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarJan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      fa8a7094
    • Tejun Heo's avatar
      x86: fix pageattr handling for lpage percpu allocator and re-enable it · e59a1bb2
      Tejun Heo authored
      lpage allocator aliases a PMD page for each cpu and returns whatever
      is unused to the page allocator.  When the pageattr of the recycled
      pages are changed, this makes the two aliases point to the overlapping
      regions with different attributes which isn't allowed and known to
      cause subtle data corruption in certain cases.
      
      This can be handled in simliar manner to the x86_64 highmap alias.
      pageattr code should detect if the target pages have PMD alias and
      split the PMD alias and synchronize the attributes.
      
      pcpur allocator is updated to keep the allocated PMD pages map sorted
      in ascending address order and provide pcpu_lpage_remapped() function
      which binary searches the array to determine whether the given address
      is aliased and if so to which address.  pageattr is updated to use
      pcpu_lpage_remapped() to detect the PMD alias and split it up as
      necessary from cpa_process_alias().
      
      Jan Beulich spotted the original problem and incorrect usage of vaddr
      instead of laddr for lookup.
      
      With this, lpage percpu allocator should work correctly.  Re-enable
      it.
      
      [ Impact: fix subtle lpage pageattr bug and re-enable lpage ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarJan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      e59a1bb2
    • Tejun Heo's avatar
      x86: reorganize cpa_process_alias() · 992f4c1c
      Tejun Heo authored
      Reorganize cpa_process_alias() so that new alias condition can be
      added easily.
      
      Jan Beulich spotted problem in the original cleanup thread which
      incorrectly assumed the two existing conditions were mutially
      exclusive.
      
      [ Impact: code reorganization ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      992f4c1c
    • Tejun Heo's avatar
      x86: prepare setup_pcpu_lpage() for pageattr fix · 0ff2587f
      Tejun Heo authored
      Make the following changes in preparation of coming pageattr updates.
      
      * Define and use array of struct pcpul_ent instead of array of
        pointers.  The only difference is ->cpu field which is set but
        unused yet.
      
      * Rename variables according to the above change.
      
      * Rename local variable vm to pcpul_vm and move it out of the
        function.
      
      [ Impact: no functional difference ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      0ff2587f
    • Tejun Heo's avatar
      x86: rename remap percpu first chunk allocator to lpage · 97c9bf06
      Tejun Heo authored
      The "remap" allocator remaps large pages to build the first chunk;
      however, the name isn't very good because 4k allocator remaps too and
      the whole point of the remap allocator is using large page mapping.
      The allocator will be generalized and exported outside of x86, rename
      it to lpage before that happens.
      
      percpu_alloc kernel parameter is updated to accept both "remap" and
      "lpage" for lpage allocator.
      
      [ Impact: code cleanup, kernel parameter argument updated ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      97c9bf06
    • Tejun Heo's avatar
      x86: fix duplicate free in setup_pcpu_remap() failure path · c5806df9
      Tejun Heo authored
      In the failure path, setup_pcpu_remap() tries to free the area which
      has already been freed to make holes in the large page.  Fix it.
      
      [ Impact: fix duplicate free in failure path ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      c5806df9
    • Tejun Heo's avatar
      percpu: fix too lazy vunmap cache flushing · 85ae87c1
      Tejun Heo authored
      In pcpu_unmap(), flushing virtual cache on vunmap can't be delayed as
      the page is going to be returned to the page allocator.  Only TLB
      flushing can be put off such that vmalloc code can handle it lazily.
      Fix it.
      
      [ Impact: fix subtle virtual cache flush bug ]
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      85ae87c1
  6. 21 Jun, 2009 22 commits