    NMI watchdog: fix for lockup detector breakage on resume · 45226e94
    Sameer Nanda authored
    On the suspend/resume path the boot CPU does not go through an
    offline->online transition.  This breaks the NMI detector post-resume
    since it depends on PMU state that is lost when the system gets
    suspended.
    
    Fix this by forcing a CPU offline->online transition for the lockup
    detector on the boot CPU during resume.
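
The idea can be illustrated with a small userspace toy model. This is a sketch, not the kernel code from this patch: `pmu_armed`, `watchdog_cpu_offline()`, `watchdog_cpu_online()`, `sim_suspend()` and `sim_resume()` are hypothetical names. The only assumption carried over from the description above is that suspend drops the boot CPU's PMU state without any offline/online callback running for it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS  4
#define BOOT_CPU 0

/* Toy per-CPU flag: true while the simulated PMU counter that backs
 * the NMI watchdog is programmed on that CPU. */
static bool pmu_armed[NR_CPUS];

/* Hypothetical hotplug-style callbacks for the lockup detector. */
void watchdog_cpu_offline(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    pmu_armed[cpu] = false;
}

void watchdog_cpu_online(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    pmu_armed[cpu] = true;
}

/* Initial boot: every CPU goes through the online callback. */
void boot_all(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        watchdog_cpu_online(cpu);
}

/* Suspend: secondary CPUs are taken offline through the callbacks.
 * The boot CPU gets no callback, but its PMU state is lost anyway. */
void sim_suspend(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu != BOOT_CPU)
            watchdog_cpu_offline(cpu);
    pmu_armed[BOOT_CPU] = false;   /* silent loss, no callback ran */
}

/* Resume: secondary CPUs come back through the online callback.
 * With the fix, the boot CPU is forced through offline->online too. */
void sim_resume(bool with_fix)
{
    if (with_fix) {
        watchdog_cpu_offline(BOOT_CPU);
        watchdog_cpu_online(BOOT_CPU);
    }
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu != BOOT_CPU)
            watchdog_cpu_online(cpu);
}

/* The toy watchdog can only fire on a CPU whose PMU is armed. */
bool watchdog_would_fire(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return pmu_armed[cpu];
}
```

In this model, `boot_all(); sim_suspend(); sim_resume(false);` leaves the boot CPU's watchdog unarmed while the secondary CPUs recover, whereas `sim_resume(true)` re-arms it on every CPU.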
    
    To provide more context, we enable NMI watchdog on Chrome OS.  We have
    seen several reports of systems freezing up completely which indicated
    that the NMI watchdog was not firing for some reason.
    
    Debugging further, we found a simple way of repro'ing system freezes --
    issuing the command 'taskset 1 sh -c "echo nmilockup > /proc/breakme"'
    after the system has been suspended/resumed one or more times.
    
    With this patch in place, the system freezes result in panics, as
    expected.
    
    These panics provide a nice stack trace for us to debug the actual issue
    causing the freeze.
    
    [akpm@linux-foundation.org: fiddle with code comment]
    [akpm@linux-foundation.org: make lockup_detector_bootcpu_resume() conditional on CONFIG_SUSPEND]
    [akpm@linux-foundation.org: fix section errors]
    Signed-off-by: Sameer Nanda <snanda@chromium.org>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
    Cc: Don Zickus <dzickus@redhat.com>
    Cc: Mandeep Singh Baines <msb@chromium.org>
    Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
    Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
suspend.c 7.42 KB