    fcoe: Fix deadlock between create and destroy paths · f9c4358e
    Robert Love authored
    The s_active and fcoe_config_mutex locks can deadlock if a
    port is destroyed at the same time another is being created.
    
    [ 4200.503113] ======================================================
    [ 4200.503114] [ INFO: possible circular locking dependency detected ]
    [ 4200.503116] 3.8.0-rc5+ #8 Not tainted
    [ 4200.503117] -------------------------------------------------------
    [ 4200.503118] kworker/3:2/2492 is trying to acquire lock:
    [ 4200.503119]  (s_active#292){++++.+}, at: [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
    [ 4200.503127]
    but task is already holding lock:
    [ 4200.503128]  (fcoe_config_mutex){+.+.+.}, at: [<ffffffffa02f3338>] fcoe_destroy_work+0xe8/0x120 [fcoe]
    [ 4200.503133]
    which lock already depends on the new lock.
    
    [ 4200.503135]
    the existing dependency chain (in reverse order) is:
    [ 4200.503136]
    -> #1 (fcoe_config_mutex){+.+.+.}:
    [ 4200.503139]        [<ffffffff810c7711>] lock_acquire+0xa1/0x140
    [ 4200.503143]        [<ffffffff816ca7be>] mutex_lock_nested+0x6e/0x360
    [ 4200.503146]        [<ffffffffa02f11bd>] fcoe_enable+0x1d/0xb0 [fcoe]
    [ 4200.503148]        [<ffffffffa02f127d>] fcoe_ctlr_enabled+0x2d/0x50 [fcoe]
    [ 4200.503151]        [<ffffffffa02ffbe8>] store_ctlr_enabled+0x38/0x90 [libfcoe]
    [ 4200.503154]        [<ffffffff81424878>] dev_attr_store+0x18/0x30
    [ 4200.503157]        [<ffffffff8122b750>] sysfs_write_file+0xe0/0x150
    [ 4200.503160]        [<ffffffff811b334c>] vfs_write+0xac/0x180
    [ 4200.503162]        [<ffffffff811b3692>] sys_write+0x52/0xa0
    [ 4200.503164]        [<ffffffff816d7159>] system_call_fastpath+0x16/0x1b
    [ 4200.503167]
    -> #0 (s_active#292){++++.+}:
    [ 4200.503170]        [<ffffffff810c680f>] __lock_acquire+0x135f/0x1c90
    [ 4200.503172]        [<ffffffff810c7711>] lock_acquire+0xa1/0x140
    [ 4200.503174]        [<ffffffff8122c626>] sysfs_deactivate+0x116/0x160
    [ 4200.503176]        [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
    [ 4200.503178]        [<ffffffff8122b2eb>] sysfs_hash_and_remove+0x5b/0xb0
    [ 4200.503180]        [<ffffffff8122f3d1>] sysfs_remove_group+0x61/0x100
    [ 4200.503183]        [<ffffffff814251eb>] device_remove_groups+0x3b/0x60
    [ 4200.503185]        [<ffffffff81425534>] device_remove_attrs+0x44/0x80
    [ 4200.503187]        [<ffffffff81425e97>] device_del+0x127/0x1c0
    [ 4200.503189]        [<ffffffff81425f52>] device_unregister+0x22/0x60
    [ 4200.503191]        [<ffffffffa0300970>] fcoe_ctlr_device_delete+0xe0/0xf0 [libfcoe]
    [ 4200.503194]        [<ffffffffa02f1b5c>] fcoe_interface_cleanup+0x6c/0xa0 [fcoe]
    [ 4200.503196]        [<ffffffffa02f3355>] fcoe_destroy_work+0x105/0x120 [fcoe]
    [ 4200.503198]        [<ffffffff8107ee91>] process_one_work+0x1a1/0x580
    [ 4200.503203]        [<ffffffff81080c6e>] worker_thread+0x15e/0x440
    [ 4200.503205]        [<ffffffff8108715a>] kthread+0xea/0xf0
    [ 4200.503207]        [<ffffffff816d70ac>] ret_from_fork+0x7c/0xb0
    
    [ 4200.503209]
    other info that might help us debug this:
    
    [ 4200.503211]  Possible unsafe locking scenario:
    
    [ 4200.503212]        CPU0                    CPU1
    [ 4200.503213]        ----                    ----
    [ 4200.503214]   lock(fcoe_config_mutex);
    [ 4200.503215]                                lock(s_active#292);
    [ 4200.503218]                                lock(fcoe_config_mutex);
    [ 4200.503219]   lock(s_active#292);
    [ 4200.503221]
     *** DEADLOCK ***
    
    [ 4200.503223] 3 locks held by kworker/3:2/2492:
    [ 4200.503224]  #0:  (fcoe){.+.+.+}, at: [<ffffffff8107ee2b>] process_one_work+0x13b/0x580
    [ 4200.503228]  #1:  ((&port->destroy_work)){+.+.+.}, at: [<ffffffff8107ee2b>] process_one_work+0x13b/0x580
    [ 4200.503232]  #2:  (fcoe_config_mutex){+.+.+.}, at: [<ffffffffa02f3338>] fcoe_destroy_work+0xe8/0x120 [fcoe]
    [ 4200.503236]
    stack backtrace:
    [ 4200.503238] Pid: 2492, comm: kworker/3:2 Not tainted 3.8.0-rc5+ #8
    [ 4200.503240] Call Trace:
    [ 4200.503243]  [<ffffffff816c2f09>] print_circular_bug+0x1fb/0x20c
    [ 4200.503246]  [<ffffffff810c680f>] __lock_acquire+0x135f/0x1c90
    [ 4200.503248]  [<ffffffff810c463a>] ? debug_check_no_locks_freed+0x9a/0x180
    [ 4200.503250]  [<ffffffff810c7711>] lock_acquire+0xa1/0x140
    [ 4200.503253]  [<ffffffff8122d20b>] ? sysfs_addrm_finish+0x3b/0x70
    [ 4200.503255]  [<ffffffff8122c626>] sysfs_deactivate+0x116/0x160
    [ 4200.503258]  [<ffffffff8122d20b>] ? sysfs_addrm_finish+0x3b/0x70
    [ 4200.503260]  [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
    [ 4200.503262]  [<ffffffff8122b2eb>] sysfs_hash_and_remove+0x5b/0xb0
    [ 4200.503265]  [<ffffffff8122f3d1>] sysfs_remove_group+0x61/0x100
    [ 4200.503273]  [<ffffffff814251eb>] device_remove_groups+0x3b/0x60
    [ 4200.503275]  [<ffffffff81425534>] device_remove_attrs+0x44/0x80
    [ 4200.503277]  [<ffffffff81425e97>] device_del+0x127/0x1c0
    [ 4200.503279]  [<ffffffff81425f52>] device_unregister+0x22/0x60
    [ 4200.503282]  [<ffffffffa0300970>] fcoe_ctlr_device_delete+0xe0/0xf0 [libfcoe]
    [ 4200.503285]  [<ffffffffa02f1b5c>] fcoe_interface_cleanup+0x6c/0xa0 [fcoe]
    [ 4200.503287]  [<ffffffffa02f3355>] fcoe_destroy_work+0x105/0x120 [fcoe]
    [ 4200.503290]  [<ffffffff8107ee91>] process_one_work+0x1a1/0x580
    [ 4200.503292]  [<ffffffff8107ee2b>] ? process_one_work+0x13b/0x580
    [ 4200.503295]  [<ffffffffa02f3250>] ? fcoe_if_destroy+0x230/0x230 [fcoe]
    [ 4200.503297]  [<ffffffff81080c6e>] worker_thread+0x15e/0x440
    [ 4200.503299]  [<ffffffff81080b10>] ? busy_worker_rebind_fn+0x100/0x100
    [ 4200.503301]  [<ffffffff8108715a>] kthread+0xea/0xf0
    [ 4200.503304]  [<ffffffff81087070>] ? kthread_create_on_node+0x160/0x160
    [ 4200.503306]  [<ffffffff816d70ac>] ret_from_fork+0x7c/0xb0
    [ 4200.503308]  [<ffffffff81087070>] ? kthread_create_on_node+0x160/0x160
    Signed-off-by: Robert Love <robert.w.love@intel.com>
    Tested-by: Jack Morgan <jack.morgan@intel.com>