    printk: fix deadlock when kernel panic · 8a8109f3
    Muchun Song authored
    printk_safe_flush_on_panic() caused the following deadlock on our
    server:
    
    CPU0:                                         CPU1:
    panic                                         rcu_dump_cpu_stacks
      kdump_nmi_shootdown_cpus                      nmi_trigger_cpumask_backtrace
        register_nmi_handler(crash_nmi_callback)      printk_safe_flush
                                                        __printk_safe_flush
                                                          raw_spin_lock_irqsave(&read_lock)
        // send NMI to other processors
        apic_send_IPI_allbutself(NMI_VECTOR)
                                                            // NMI interrupt, dead loop
                                                            crash_nmi_callback
      printk_safe_flush_on_panic
        printk_safe_flush
          __printk_safe_flush
            // deadlock
            raw_spin_lock_irqsave(&read_lock)
    
    DEADLOCK: read_lock is taken on CPU1 and will never get released.
    
    It happens when panic() stops a CPU by NMI while that CPU is in
    the middle of printk_safe_flush().
    
    Handle the lock the same way as logbuf_lock. The printk_safe buffers
    are flushed only when both locks can be safely taken. This avoids
    the deadlock _in this particular case_ at the expense of losing the
    contents of the printk_safe buffers.
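
The check described above can be sketched as a small userspace model. This is only an illustration, not the kernel patch: `struct model_lock`, `lock_usable_on_panic()`, `model_flush_on_panic()` and the `online_cpus` parameter are hypothetical names, and the rule "re-initialize a held lock only when a single CPU is left online, otherwise skip the flush" is an assumption about how "safely taken" is decided.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a raw spinlock. */
struct model_lock {
	atomic_bool held;
};

static void model_lock_init(struct model_lock *l) { atomic_store(&l->held, false); }
static bool model_is_locked(struct model_lock *l) { return atomic_load(&l->held); }
static bool model_trylock(struct model_lock *l) { return !atomic_exchange(&l->held, true); }
static void model_unlock(struct model_lock *l) { atomic_store(&l->held, false); }

/* Models of logbuf_lock and read_lock; static storage starts unlocked. */
static struct model_lock logbuf_lock_model;
static struct model_lock read_lock_model;
static int flushed_count;

/* Model of the normal flush path, which takes the read lock. */
static void model_flush(void)
{
	bool ok = model_trylock(&read_lock_model);

	/* The panic path only gets here once the lock is known to be free. */
	assert(ok);
	flushed_count++;
	model_unlock(&read_lock_model);
}

/*
 * Assumed policy: a held lock may be re-initialized only when no other
 * CPU can still be running with it; otherwise it is not safe to take.
 */
static bool lock_usable_on_panic(struct model_lock *l, int online_cpus)
{
	if (!model_is_locked(l))
		return true;
	if (online_cpus > 1)
		return false;
	model_lock_init(l);
	return true;
}

/* Flush only when both locks can be safely taken; else drop the buffers. */
static bool model_flush_on_panic(int online_cpus)
{
	if (!lock_usable_on_panic(&logbuf_lock_model, online_cpus))
		return false;
	if (!lock_usable_on_panic(&read_lock_model, online_cpus))
		return false;
	model_flush();
	return true;
}
```

In this model, a CPU stopped by NMI while holding `read_lock_model` makes `model_flush_on_panic()` return early instead of spinning forever, which mirrors the trade-off stated above: no deadlock, but the buffered messages are lost.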
    
    Note: It would actually be safe to re-init the locks once all other
          CPUs have been stopped by NMI. But it would require passing this
          information from arch-specific code. It is not worth the
          complexity, especially because logbuf_lock and printk_safe
          buffers have been obsoleted by the lockless ring buffer.
    
    Fixes: cf9b1106 ("printk/nmi: flush NMI messages on the system panic")
    Signed-off-by: Muchun Song <songmuchun@bytedance.com>
    Reviewed-by: Petr Mladek <pmladek@suse.com>
    Cc: <stable@vger.kernel.org>
    Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Signed-off-by: Petr Mladek <pmladek@suse.com>
    Link: https://lore.kernel.org/r/20210210034823.64867-1-songmuchun@bytedance.com
printk_safe.c 10.7 KB