1. 15 May, 2010 2 commits
    • lockup_detector: Adapt CONFIG_PERF_EVENT_NMI to other archs · c01d4323
      Frederic Weisbecker authored
      CONFIG_PERF_EVENTS_NMI is something that needs to be enabled from the
      arch. This is fine on x86, where PERF_EVENTS is built in, but any other
      arch that selects it will need to handle the PERF_EVENTS dependency.
      
      Instead, handle the dependency in the generic layer:
      
      - archs need to tell what they support through HAVE_PERF_EVENTS_NMI
      - Automatically enable PERF_EVENTS_NMI if we have both PERF_EVENTS and
        HAVE_PERF_EVENTS_NMI.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
    • lockup_detector: Update some config · e16bb1d7
      Frederic Weisbecker authored
      We kept CONFIG_DETECT_SOFTLOCKUP around for compatibility with
      older configs, but it was enabled by default whenever
      CONFIG_DEBUG_KERNEL was set.
      
      So to enable CONFIG_LOCKUP_DETECTOR on configs that had
      CONFIG_DETECT_SOFTLOCKUP, all we need is the same default under
      CONFIG_DEBUG_KERNEL. We can then remove CONFIG_DETECT_SOFTLOCKUP
      directly.
      
      So tag CONFIG_LOCKUP_DETECTOR as default y. This is what we want for
      most serious kernel debugging anyway.
      
      Also forbid the lockup detector on S390, as was done for the
      previous softlockup detector, even though the true reason for that
      restriction is not documented.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
  2. 13 May, 2010 3 commits
    • x86, watchdog: Fix build error in hw_nmi.c · 5e85391b
      Ingo Molnar authored
      On some configs the following build error triggers:
      
       arch/x86/kernel/apic/hw_nmi.c:35: error: 'apic' undeclared (first use in this function)
       arch/x86/kernel/apic/hw_nmi.c:35: error: (Each undeclared identifier is reported only once
       arch/x86/kernel/apic/hw_nmi.c:35: error: for each function it appears in.)
      
      This happens because asm/apic.h was only included implicitly. Include it explicitly.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      LKML-Reference: <1273713674-8434-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • watchdog: Export touch_softlockup_watchdog · 0167c781
      Ingo Molnar authored
      There are modules that rely on it:
      
        ERROR: "touch_softlockup_watchdog" [drivers/video/nvidia/nvidiafb.ko] undefined!
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      LKML-Reference: <1273713674-8434-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • lockup_detector: Fix forgotten config conversion · 19cc36c0
      Frederic Weisbecker authored
      Fix forgotten CONFIG_DETECT_SOFTLOCKUP -> CONFIG_LOCKUP_DETECTOR
      in sched.h
      
      Fixes:
      	arch/x86/built-in.o: In function `touch_nmi_watchdog':
      	(.text+0x1bd59): undefined reference to `touch_softlockup_watchdog'
      	kernel/built-in.o: In function `show_state_filter':
      	(.text+0x10d01): undefined reference to `touch_all_softlockup_watchdogs'
      	kernel/built-in.o: In function `sched_clock_idle_wakeup_event':
      	(.text+0x362f9): undefined reference to `touch_softlockup_watchdog'
      	kernel/built-in.o: In function `timekeeping_resume':
      	timekeeping.c:(.text+0x38757): undefined reference to `touch_softlockup_watchdog'
      	kernel/built-in.o: In function `tick_nohz_handler':
      	tick-sched.c:(.text+0x3e5b9): undefined reference to `touch_softlockup_watchdog'
      	kernel/built-in.o: In function `tick_sched_timer':
      	tick-sched.c:(.text+0x3e671): undefined reference to `touch_softlockup_watchdog'
      	kernel/built-in.o: In function `tick_check_idle':
      	(.text+0x3e90b): undefined reference to `touch_softlockup_watchdog'
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
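The undefined references in the commit above are typical of a header whose guard no longer matches the option that builds the implementation: callers see real declarations, but nothing defines them. What follows is a minimal sketch of the usual stub pattern, not the actual sched.h contents. The hand-written `#define`, the `touches` counter, and the toy function bodies are assumptions made only so the example compiles outside the kernel.

```c
/*
 * Stand-in for the Kconfig-generated macro. In a real kernel build this
 * comes from the configuration, not from a hand-written #define.
 */
#define CONFIG_LOCKUP_DETECTOR 1

#ifdef CONFIG_LOCKUP_DETECTOR
/* Detector is built: declare the real functions. */
void touch_softlockup_watchdog(void);
void touch_all_softlockup_watchdogs(void);
#else
/* Detector is not built: empty stubs keep callers compiling and linking. */
static inline void touch_softlockup_watchdog(void) { }
static inline void touch_all_softlockup_watchdogs(void) { }
#endif

#ifdef CONFIG_LOCKUP_DETECTOR
/*
 * Toy definitions standing in for the real implementation file. They are
 * guarded by the SAME symbol as the declarations above; guarding the two
 * sides with different symbols is what produces "undefined reference".
 */
static int touches; /* hypothetical counter, for illustration only */

void touch_softlockup_watchdog(void)
{
	touches++;
}

void touch_all_softlockup_watchdogs(void)
{
	touches++;
}
#endif
```

Removing the `#define` at the top switches every caller to the empty stubs, so the same call sites build with or without the detector.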
  3. 12 May, 2010 9 commits
    • lockup_detector: Make BOOTPARAM_SOFTLOCKUP_PANIC depend on LOCKUP_DETECTOR · 89d7ce2a
      Frederic Weisbecker authored
      Panic on softlockups still depended on the softlockup detector,
      but the latter has now been merged into the lockup detector.
      
      Let's update this config dependency.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
    • lockup_detector: Separate touch_nmi_watchdog code path from touch_watchdog · d7c54733
      Don Zickus authored
      When I combined the nmi_watchdog (hardlockup) and softlockup code, I
      also combined the paths the touch_watchdog and touch_nmi_watchdog took.
      As Frederic W. pointed out, this may not be the best idea: the
      touch_watchdog case probably should not reset the hardlockup count.
      
      Therefore the patch below falls back to the previous idea of keeping
      touch_nmi_watchdog a superset of the touch_watchdog case.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-9-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
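The "superset" relationship described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: plain arrays indexed by a `cpu` argument stand in for the kernel's per-CPU variables, and `NR_CPUS`, the array names, and the explicit `cpu` parameter are hypothetical.

```c
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical CPU count for the sketch */

/* Stand-ins for per-CPU state; the kernel uses real per-CPU variables. */
static unsigned long watchdog_touch_ts[NR_CPUS]; /* softlockup timestamp */
static bool watchdog_nmi_touch[NR_CPUS];         /* hardlockup "touched" flag */

/* Soft path: only resets the softlockup timestamp. */
static void touch_softlockup_watchdog(int cpu)
{
	watchdog_touch_ts[cpu] = 0;
}

/*
 * Hard path: marks the hardlockup side as touched, then also does
 * everything the soft path does. The reverse is deliberately not true,
 * so a soft touch never resets the hardlockup state.
 */
static void touch_nmi_watchdog(int cpu)
{
	watchdog_nmi_touch[cpu] = true;
	touch_softlockup_watchdog(cpu);
}
```

Keeping the call one-directional means code that only wants to quiet the softlockup check cannot accidentally hide a CPU that has stopped taking interrupts.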
    • x86: Cleanup hw_nmi.c cruft · 10f90149
      Don Zickus authored
      The design of the hardlockup watchdog has changed and cruft was left
      behind in the hw_nmi.c file.  Just remove the code that isn't used
      anymore.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-7-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • x86: Move trigger_all_cpu_backtrace to its own die_notifier · 7cbb7e7f
      Don Zickus authored
      As part of the transition of the nmi watchdog to something more
      generic, the trigger_all_cpu_backtrace code is getting left behind.
      Put it in its own die_notifier so it can still be used.
      
      V2:
      - use arch_spin_locks
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-6-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • lockup_detector: Remove nmi_watchdog.c file · f69bcf60
      Don Zickus authored
      This file was migrated to kernel/watchdog.c and then combined with
      kernel/softlockup.c.  As a result, kernel/nmi_watchdog.c is no longer
      needed.  Just remove it.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-5-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • lockup_detector: Remove old softlockup code · 2508ce18
      Don Zickus authored
      Now that it is no longer compiled or used, just remove it.
      
      Also move some of the code wrapped with DETECT_SOFTLOCKUP to the
      LOCKUP_DETECTOR wrappers because that is the code that uses it now.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-4-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • lockup_detector: Touch_softlockup cleanups and softlockup_tick removal · 332fbdbc
      Don Zickus authored
      Just some code cleanup to make touch_softlockup clearer and remove the
      softlockup_tick function as it is no longer needed.
      
      Also remove the /proc softlockup_thres call as it has been changed to
      watchdog_thres.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • lockup_detector: Combine nmi_watchdog and softlockup detector · 58687acb
      Don Zickus authored
      The new nmi_watchdog (which uses the perf event subsystem) is very
      similar in structure to the softlockup detector.  Using Ingo's
      suggestion, I combined the two functionalities into one file:
      kernel/watchdog.c.
      
      Now both the nmi_watchdog (or hardlockup detector) and softlockup
      detector sit on top of the perf event subsystem, which is run every
      60 seconds or so to see if there are any lockups.
      
      To detect hardlockups (cpus not responding to interrupts), I
      implemented an hrtimer that runs 5 times for every perf event
      overflow event.  If that stops counting on a cpu, then the cpu is
      most likely in trouble.
      
      To detect softlockups (tasks not yielding to the scheduler), I used the
      previous kthread idea that now gets kicked every time the hrtimer fires.
      If the kthread isn't being scheduled, neither is anyone else, and the
      warning is printed to the console.
      
      I tested this on x86_64 and both the softlockup and hardlockup paths
      work.
      
      V2:
      - cleaned up the Kconfig and softlockup combination
      - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
      - separated out the softlockup case from perf event subsystem
      - re-arranged the enabling/disabling nmi watchdog from proc space
      - added cpumasks for hardlockup failure cases
      - removed fallback to soft events if no PMU exists for hard events
      
      V3:
      - comment cleanups
      - drop support for older softlockup code
      - per_cpu cleanups
      - completely remove software clock base hardlockup detector
      - use per_cpu masking on hard/soft lockup detection
      - #ifdef cleanups
      - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
      - documentation additions
      
      V4:
      - documentation fixes
      - convert per_cpu to __get_cpu_var
      - powerpc compile fixes
      
      V5:
      - split apart warn flags for hard and soft lockups
      
      TODO:
      - figure out how to make an arch-agnostic clock2cycles call
        (if possible) to feed into perf events as a sample period
      
      [fweisbec: merged conflict patch]
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
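The two detection paths described in the commit above can be modelled in plain C. This is a rough userspace sketch of the idea only, not the code in kernel/watchdog.c: the struct, the function names, the explicit `now` and `thresh` parameters (in seconds), and the single-CPU view are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical per-CPU bookkeeping for the sketch. */
struct cpu_watchdog {
	unsigned long hrtimer_interrupts; /* bumped on every timer fire */
	unsigned long hrtimer_saved;      /* count seen at the last NMI check */
	unsigned long touch_ts;           /* last time the kthread got to run */
};

/* The watchdog kthread ran: record that the scheduler is still working. */
static void kthread_ran(struct cpu_watchdog *wd, unsigned long now)
{
	wd->touch_ts = now;
}

/*
 * Timer callback: prove interrupts still work by counting, then check
 * whether the kthread has been starved for longer than the threshold.
 * Returns true when a soft lockup should be reported.
 */
static bool timer_fired(struct cpu_watchdog *wd, unsigned long now,
			unsigned long thresh)
{
	wd->hrtimer_interrupts++;
	return now - wd->touch_ts > thresh;
}

/*
 * Perf overflow (NMI) check: if the timer count has not moved since the
 * previous check, the CPU is not taking interrupts. Returns true when a
 * hard lockup should be reported. The timer must have fired at least
 * once before the first check, or the initial zero counts would match.
 */
static bool nmi_check(struct cpu_watchdog *wd)
{
	if (wd->hrtimer_interrupts == wd->hrtimer_saved)
		return true;
	wd->hrtimer_saved = wd->hrtimer_interrupts;
	return false;
}
```

Because the timer fires several times per NMI period, a healthy CPU always shows a changed count at each NMI check; only a CPU that has stopped servicing interrupts leaves the count unchanged.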
    • Merge commit 'v2.6.34-rc7' into perf/nmi · a9aa1d02
      Frederic Weisbecker authored
      Merge reason: catch up with latest softlockup detector changes.
  4. 10 May, 2010 3 commits
  5. 07 May, 2010 13 commits
  6. 06 May, 2010 10 commits