• Ying Xue's avatar
    tipc: fix lockdep warning when intra-node messages are delivered · 1a194c2d
    Ying Xue authored
    When running tipcTC&tipcTS test suite, below lockdep unsafe locking
    scenario is reported:
    
    [ 1109.997854]
    [ 1109.997988] =================================
    [ 1109.998290] [ INFO: inconsistent lock state ]
    [ 1109.998575] 3.17.0-rc1+ #113 Not tainted
    [ 1109.998762] ---------------------------------
    [ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    [ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
    [ 1109.998762]  (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
    [ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
    [ 1109.998762]   [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
    [ 1109.998762]   [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
    [ 1109.998762]   [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
    [ 1109.998762]   [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
    [ 1109.998762]   [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
    [ 1109.998762]   [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
    [ 1109.998762]   [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
    [ 1109.998762]   [<ffffffff817676ee>] SYSC_connect+0xae/0xc0
    [ 1109.998762]   [<ffffffff81767b7e>] SyS_connect+0xe/0x10
    [ 1109.998762]   [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
    [ 1109.998762]   [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
    [ 1109.998762] irq event stamp: 241060
    [ 1109.998762] hardirqs last  enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
    [ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
    [ 1109.998762] softirqs last  enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50
    [ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0
    [ 1109.998762]
    [ 1109.998762] other info that might help us debug this:
    [ 1109.998762]  Possible unsafe locking scenario:
    [ 1109.998762]
    [ 1109.998762]        CPU0
    [ 1109.998762]        ----
    [ 1109.998762]   lock(slock-AF_TIPC);
    [ 1109.998762]   <Interrupt>
    [ 1109.998762]     lock(slock-AF_TIPC);
    [ 1109.998762]
    [ 1109.998762]  *** DEADLOCK ***
    [ 1109.998762]
    [ 1109.998762] 2 locks held by swapper/7/0:
    [ 1109.998762]  #0:  (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
    [ 1109.998762]  #1:  (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
    [ 1109.998762]
    [ 1109.998762] stack backtrace:
    [ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
    [ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    [ 1109.998762]  ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
    [ 1109.998762]  ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
    [ 1109.998762]  ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
    [ 1109.998762] Call Trace:
    [ 1109.998762]  <IRQ>  [<ffffffff81a209eb>] dump_stack+0x4e/0x68
    [ 1109.998762]  [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
    [ 1109.998762]  [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
    [ 1109.998762]  [<ffffffff810a406c>] mark_lock+0x28c/0x2f0
    [ 1109.998762]  [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
    [ 1109.998762]  [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
    [ 1109.998762]  [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
    [ 1109.998762]  [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
    [ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
    [ 1109.998762]  [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
    [ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
    [ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
    [ 1109.998762]  [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
    [ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
    [ 1109.998762]  [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
    [ 1109.998762]  [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
    [ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
    [ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
    [ 1109.998762]  [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
    [ 1109.998762]  [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
    [ 1109.998762]  [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
    [ 1109.998762]  [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
    [ 1109.998762]  [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
    [ 1109.998762]  [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
    [ 1109.998762]  [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
    [ 1109.998762]  [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70
    [ 1109.998762]  [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
    [ 1109.998762]  [<ffffffff81785518>] napi_gro_receive+0xd8/0x240
    [ 1109.998762]  [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
    [ 1109.998762]  [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0
    [ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
    [ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
    [ 1109.998762]  [<ffffffff817842b1>] net_rx_action+0x141/0x310
    [ 1109.998762]  [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
    [ 1109.998762]  [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0
    [ 1109.998762]  [<ffffffff8105a626>] irq_exit+0x96/0xc0
    [ 1109.998762]  [<ffffffff81a30d07>] do_IRQ+0x67/0x110
    [ 1109.998762]  [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
    [ 1109.998762]  <EOI>  [<ffffffff8100d2b7>] ? default_idle+0x37/0x250
    [ 1109.998762]  [<ffffffff8100d2b5>] ? default_idle+0x35/0x250
    [ 1109.998762]  [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
    [ 1109.998762]  [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
    [ 1109.998762]  [<ffffffff81034c78>] start_secondary+0x188/0x1f0
    
    When intra-node messages are delivered from one process to another
    process, tipc_link_xmit() doesn't disable BH before it directly calls
    tipc_sk_rcv() on process context to forward messages to destination
    socket. Meanwhile, if messages delivered by remote node arrive at the
    node and their destinations are also the same socket, tipc_sk_rcv()
    running on process context might be preempted by tipc_sk_rcv() running
    BH context. As a result, the latter cannot obtain the socket lock as
    the lock was obtained by the former, however, the former has no chance
    to be run as the latter is owning the CPU now, so headlock happens. To
    avoid it, BH should be always disabled in tipc_sk_rcv().
    Signed-off-by: default avatarYing Xue <ying.xue@windriver.com>
    Reviewed-by: default avatarJon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    1a194c2d
socket.c 71 KB