• Peter Zijlstra's avatar
    perf: Fix lockdep warning on process exit · 4a1c0f26
    Peter Zijlstra authored
    Sasha Levin reported:
    
    > While fuzzing with trinity inside a KVM tools guest running the latest -next
    > kernel I've stumbled on the following spew:
    >
    > ======================================================
    > [ INFO: possible circular locking dependency detected ]
    > 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted
    > -------------------------------------------------------
    > trinity-c578/9725 is trying to acquire lock:
    > (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346)
    >
    > but task is already holding lock:
    > (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533)
    >
    > which lock already depends on the new lock.
    
    > 1 lock held by trinity-c578/9725:
    > #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533)
    >
    >  Call Trace:
    >  dump_stack (lib/dump_stack.c:52)
    >  print_circular_bug (kernel/locking/lockdep.c:1216)
    >  __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182)
    >  lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
    >  _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
    >  __queue_work (kernel/workqueue.c:1346)
    >  queue_work_on (kernel/workqueue.c:1424)
    >  free_object (lib/debugobjects.c:209)
    >  __debug_check_no_obj_freed (lib/debugobjects.c:715)
    >  debug_check_no_obj_freed (lib/debugobjects.c:727)
    >  kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711)
    >  free_task (kernel/fork.c:221)
    >  __put_task_struct (kernel/fork.c:250)
    >  put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898)
    >  perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533)
    >  do_exit (kernel/exit.c:766)
    >  do_group_exit (kernel/exit.c:884)
    >  get_signal_to_deliver (kernel/signal.c:2347)
    >  do_signal (arch/x86/kernel/signal.c:698)
    >  do_notify_resume (arch/x86/kernel/signal.c:751)
    >  int_signal (arch/x86/kernel/entry_64.S:600)
    
    Urgh.. so the only way I can make that happen is through:
    
      perf_event_exit_task_context()
        raw_spin_lock(&child_ctx->lock);
        unclone_ctx(child_ctx)
          put_ctx(ctx->parent_ctx);
        raw_spin_unlock_irqrestore(&child_ctx->lock);
    
    And we can avoid this by doing the change below.
    
    I can't immediately see how this changed recently, but given that you
    say it's easy to reproduce, lets fix this.
    Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
    Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Dave Jones <davej@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/20140623141242.GB19860@laptop.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
    4a1c0f26
core.c 189 KB