    oprofile/x86: fix uninitialized counter usage during cpu hotplug · 2623a1d5
    Robert Richter authored
    This fixes a NULL pointer dereference that is triggered when taking a
    cpu offline after oprofile was initialized, e.g.:
    
     $ opcontrol --init
     $ opcontrol --start-daemon
     $ opcontrol --shutdown
     $ opcontrol --deinit
     $ echo 0 > /sys/devices/system/cpu/cpu1/online
    
    See the crash dump below. Although the counter has been disabled, the
    cpu notifier is still active and tries to use counter data that has
    already been freed.
    
    This fix is for linux-stable. To fix this properly, the hotplug code
    must be rewritten. Thus, this patch leaves a WARN_ON_ONCE() message
    in place.
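
The guard described above can be illustrated with a minimal user-space sketch. This is not the patch itself: every name here (`sketch_msrs`, `sketch_warn_once`, `sketch_model_stop`, `sketch_cpu_stop`) is hypothetical, and the warn-once helper only imitates what the kernel's WARN_ON_ONCE() does. The assumption is that the stop path checks whether the counter data is still allocated before handing it to the model-specific code, and warns once instead of dereferencing a NULL pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-CPU counter data; not the kernel struct. */
struct sketch_msrs {
    unsigned long *controls;   /* NULL once the counter data has been freed */
};

static bool sketch_warned;     /* set after the first warning was printed */
static int sketch_stop_calls;  /* how often the model-specific stop ran */

/* User-space imitation of WARN_ON_ONCE(): report only the first time. */
static void sketch_warn_once(const char *what)
{
    if (!sketch_warned) {
        sketch_warned = true;
        fprintf(stderr, "WARNING: %s\n", what);
    }
}

/* Model-specific stop routine: touches the control data unconditionally,
 * so calling it with controls == NULL would crash. */
static void sketch_model_stop(const struct sketch_msrs *msrs)
{
    msrs->controls[0] = 0;     /* clear the (pretend) enable bits */
    sketch_stop_calls++;
}

/* Guarded entry point, as a hotplug callback might use it: skip the
 * model-specific code and warn once if the data is already gone. */
static void sketch_cpu_stop(struct sketch_msrs *msrs)
{
    if (!msrs->controls)
        sketch_warn_once("counter data already freed, skipping stop");
    else
        sketch_model_stop(msrs);
}
```

Calling `sketch_cpu_stop()` on a structure whose `controls` pointer is NULL prints a single warning no matter how often it is repeated, while a structure with live data is stopped as usual. The underlying ordering problem remains, which is why the message above calls for a rewrite of the hotplug code.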
    
    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [<ffffffff8132ad57>] op_amd_stop+0x2d/0x8e
    PGD 0
    Oops: 0000 [#1] SMP
    last sysfs file: /sys/devices/system/cpu/cpu1/online
    CPU 1
    Modules linked in:
    
    Pid: 0, comm: swapper Not tainted 2.6.34-rc5-oprofile-x86_64-standard-00210-g8c00f06 #16 Anaheim/Anaheim
    RIP: 0010:[<ffffffff8132ad57>]  [<ffffffff8132ad57>] op_amd_stop+0x2d/0x8e
    RSP: 0018:ffff880001843f28  EFLAGS: 00010006
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000200200
    RDX: ffff880001843f68 RSI: dead000000100100 RDI: 0000000000000000
    RBP: ffff880001843f48 R08: 0000000000000000 R09: ffff880001843f08
    R10: ffffffff8102c9a5 R11: ffff88000184ea80 R12: 0000000000000000
    R13: ffff88000184f6c0 R14: 0000000000000000 R15: 0000000000000000
    FS:  00007fec6a92e6f0(0000) GS:ffff880001840000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 000000000163b000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process swapper (pid: 0, threadinfo ffff88042fcd8000, task ffff88042fcd51d0)
    Stack:
     ffff880001843f48 0000000000000001 ffff88042e9f7d38 ffff880001843f68
    <0> ffff880001843f58 ffffffff8132a602 ffff880001843f98 ffffffff810521b3
    <0> ffff880001843f68 ffff880001843f68 ffff880001843f88 ffff88042fcd9fd8
    Call Trace:
     <IRQ>
     [<ffffffff8132a602>] nmi_cpu_stop+0x21/0x23
     [<ffffffff810521b3>] generic_smp_call_function_single_interrupt+0xdf/0x11b
     [<ffffffff8101804f>] smp_call_function_single_interrupt+0x22/0x31
     [<ffffffff810029f3>] call_function_single_interrupt+0x13/0x20
     <EOI>
     [<ffffffff8102c9a5>] ? wake_up_process+0x10/0x12
     [<ffffffff81008701>] ? default_idle+0x22/0x37
     [<ffffffff8100896d>] c1e_idle+0xdf/0xe6
     [<ffffffff813f1170>] ? atomic_notifier_call_chain+0x13/0x15
     [<ffffffff810012fb>] cpu_idle+0x4b/0x7e
     [<ffffffff813e8a4e>] start_secondary+0x1ae/0x1b2
    Code: 89 e5 41 55 49 89 fd 41 54 45 31 e4 53 31 db 48 83 ec 08 89 df e8 be f8 ff ff 48 98 48 83 3c c5 10 67 7a 81 00 74 1f 49 8b 45 08 <42> 8b 0c 20 0f 32 48 c1 e2 20 25 ff ff bf ff 48 09 d0 48 89 c2
    RIP  [<ffffffff8132ad57>] op_amd_stop+0x2d/0x8e
     RSP <ffff880001843f28>
    CR2: 0000000000000000
    ---[ end trace 679ac372d674b757 ]---
    Kernel panic - not syncing: Fatal exception in interrupt
    Pid: 0, comm: swapper Tainted: G      D    2.6.34-rc5-oprofile-x86_64-standard-00210-g8c00f06 #16
    Call Trace:
     <IRQ>  [<ffffffff813ebd6a>] panic+0x9e/0x10c
     [<ffffffff810474b0>] ? up+0x34/0x39
     [<ffffffff81031ccc>] ? kmsg_dump+0x112/0x12c
     [<ffffffff813eeff1>] oops_end+0x81/0x8e
     [<ffffffff8101efee>] no_context+0x1f3/0x202
     [<ffffffff8101f1b7>] __bad_area_nosemaphore+0x1ba/0x1e0
     [<ffffffff81028d24>] ? enqueue_task_fair+0x16d/0x17a
     [<ffffffff810264dc>] ? activate_task+0x42/0x53
     [<ffffffff8102c967>] ? try_to_wake_up+0x272/0x284
     [<ffffffff8101f1eb>] bad_area_nosemaphore+0xe/0x10
     [<ffffffff813f0f3f>] do_page_fault+0x1c8/0x37c
     [<ffffffff81028d24>] ? enqueue_task_fair+0x16d/0x17a
     [<ffffffff813ee55f>] page_fault+0x1f/0x30
     [<ffffffff8102c9a5>] ? wake_up_process+0x10/0x12
     [<ffffffff8132ad57>] ? op_amd_stop+0x2d/0x8e
     [<ffffffff8132ad46>] ? op_amd_stop+0x1c/0x8e
     [<ffffffff8132a602>] nmi_cpu_stop+0x21/0x23
     [<ffffffff810521b3>] generic_smp_call_function_single_interrupt+0xdf/0x11b
     [<ffffffff8101804f>] smp_call_function_single_interrupt+0x22/0x31
     [<ffffffff810029f3>] call_function_single_interrupt+0x13/0x20
     <EOI>  [<ffffffff8102c9a5>] ? wake_up_process+0x10/0x12
     [<ffffffff81008701>] ? default_idle+0x22/0x37
     [<ffffffff8100896d>] c1e_idle+0xdf/0xe6
     [<ffffffff813f1170>] ? atomic_notifier_call_chain+0x13/0x15
     [<ffffffff810012fb>] cpu_idle+0x4b/0x7e
     [<ffffffff813e8a4e>] start_secondary+0x1ae/0x1b2
    ------------[ cut here ]------------
    WARNING: at /local/rrichter/.source/linux/arch/x86/kernel/smp.c:118 native_smp_send_reschedule+0x27/0x53()
    Hardware name: Anaheim
    Modules linked in:
    Pid: 0, comm: swapper Tainted: G      D    2.6.34-rc5-oprofile-x86_64-standard-00210-g8c00f06 #16
    Call Trace:
     <IRQ>  [<ffffffff81017f32>] ? native_smp_send_reschedule+0x27/0x53
     [<ffffffff81030ee2>] warn_slowpath_common+0x77/0xa4
     [<ffffffff81030f1e>] warn_slowpath_null+0xf/0x11
     [<ffffffff81017f32>] native_smp_send_reschedule+0x27/0x53
     [<ffffffff8102634b>] resched_task+0x60/0x62
     [<ffffffff8102653a>] check_preempt_curr_idle+0x10/0x12
     [<ffffffff8102c8ea>] try_to_wake_up+0x1f5/0x284
     [<ffffffff8102c986>] default_wake_function+0xd/0xf
     [<ffffffff810a110d>] pollwake+0x57/0x5a
     [<ffffffff8102c979>] ? default_wake_function+0x0/0xf
     [<ffffffff81026be5>] __wake_up_common+0x46/0x75
     [<ffffffff81026ed0>] __wake_up+0x38/0x50
     [<ffffffff81031694>] printk_tick+0x39/0x3b
     [<ffffffff8103ac37>] update_process_times+0x3f/0x5c
     [<ffffffff8104dc63>] tick_periodic+0x5d/0x69
     [<ffffffff8104dc90>] tick_handle_periodic+0x21/0x71
     [<ffffffff81018fd0>] smp_apic_timer_interrupt+0x82/0x95
     [<ffffffff81002853>] apic_timer_interrupt+0x13/0x20
     [<ffffffff81030cb5>] ? panic_blink_one_second+0x0/0x7b
     [<ffffffff813ebdd6>] ? panic+0x10a/0x10c
     [<ffffffff810474b0>] ? up+0x34/0x39
     [<ffffffff81031ccc>] ? kmsg_dump+0x112/0x12c
     [<ffffffff813eeff1>] ? oops_end+0x81/0x8e
     [<ffffffff8101efee>] ? no_context+0x1f3/0x202
     [<ffffffff8101f1b7>] ? __bad_area_nosemaphore+0x1ba/0x1e0
     [<ffffffff81028d24>] ? enqueue_task_fair+0x16d/0x17a
     [<ffffffff810264dc>] ? activate_task+0x42/0x53
     [<ffffffff8102c967>] ? try_to_wake_up+0x272/0x284
     [<ffffffff8101f1eb>] ? bad_area_nosemaphore+0xe/0x10
     [<ffffffff813f0f3f>] ? do_page_fault+0x1c8/0x37c
     [<ffffffff81028d24>] ? enqueue_task_fair+0x16d/0x17a
     [<ffffffff813ee55f>] ? page_fault+0x1f/0x30
     [<ffffffff8102c9a5>] ? wake_up_process+0x10/0x12
     [<ffffffff8132ad57>] ? op_amd_stop+0x2d/0x8e
     [<ffffffff8132ad46>] ? op_amd_stop+0x1c/0x8e
     [<ffffffff8132a602>] ? nmi_cpu_stop+0x21/0x23
     [<ffffffff810521b3>] ? generic_smp_call_function_single_interrupt+0xdf/0x11b
     [<ffffffff8101804f>] ? smp_call_function_single_interrupt+0x22/0x31
     [<ffffffff810029f3>] ? call_function_single_interrupt+0x13/0x20
     <EOI>  [<ffffffff8102c9a5>] ? wake_up_process+0x10/0x12
     [<ffffffff81008701>] ? default_idle+0x22/0x37
     [<ffffffff8100896d>] ? c1e_idle+0xdf/0xe6
     [<ffffffff813f1170>] ? atomic_notifier_call_chain+0x13/0x15
     [<ffffffff810012fb>] ? cpu_idle+0x4b/0x7e
     [<ffffffff813e8a4e>] ? start_secondary+0x1ae/0x1b2
    ---[ end trace 679ac372d674b758 ]---
    
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: stable <stable@kernel.org>
    Signed-off-by: Robert Richter <robert.richter@amd.com>
nmi_int.c 15.6 KB