    driver core: fix deadlock in __device_attach · b232b02b
    Zhang Wensheng authored
    In the __device_attach function, the lock-holding logic is as follows:
    ...
    __device_attach
    device_lock(dev)      // acquire the dev lock
      async_schedule_dev(__device_attach_async_helper, dev); // func
        async_schedule_node
          async_schedule_node_domain(func)
            entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
            /* If the allocation fails or the work limit is reached,
               func is executed synchronously. But
               __device_attach_async_helper also takes the dev lock,
               which leads to an A-A deadlock. */
            if (!entry || atomic_read(&entry_count) > MAX_WORK)
              func;
            else
              queue_work_node(node, system_unbound_wq, &entry->work)
      device_unlock(dev)
    
    As shown above, when async probing is allowed but async work cannot
    be scheduled, because of an out-of-memory condition or the work
    limit, func is executed synchronously instead. This leads to an A-A
    deadlock because __device_attach_async_helper also acquires the dev
    lock.
    
    To fix the deadlock, move the async_schedule_dev call outside of
    device_lock. As we can see in async_schedule_node_domain, the
    workqueue passed to queue_work_node is system_unbound_wq, so it can
    accept concurrent operations. The move therefore does not change
    the code logic and does not lead to a deadlock.
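
    The lock ordering described above can be modeled in user space. The
    following is a minimal single-threaded sketch, not the kernel code:
    struct fake_dev, fake_lock, fake_unlock, attach_helper,
    schedule_sync_fallback, attach_locked_schedule and
    attach_unlocked_schedule are all hypothetical names introduced for
    illustration. The fake lock records a self-deadlock instead of
    blocking, and the scheduler models only the synchronous fallback path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct device; only the lock state matters. */
struct fake_dev {
    bool locked;        /* models the non-recursive device lock */
    bool self_deadlock; /* set when the holder tries to lock again */
    int  attached;      /* number of times the helper did its work */
};

/* Returns false where a real non-recursive lock would block forever. */
static bool fake_lock(struct fake_dev *dev)
{
    if (dev->locked) {
        dev->self_deadlock = true;
        return false;
    }
    dev->locked = true;
    return true;
}

static void fake_unlock(struct fake_dev *dev)
{
    assert(dev->locked);
    dev->locked = false;
}

/* Models __device_attach_async_helper: it takes the device lock itself. */
static void attach_helper(struct fake_dev *dev)
{
    if (!fake_lock(dev))
        return; /* the A-A deadlock case */
    dev->attached++;
    fake_unlock(dev);
}

/* Models only the fallback path (allocation failure or work limit):
 * func runs synchronously in the caller's context. */
static void schedule_sync_fallback(void (*func)(struct fake_dev *),
                                   struct fake_dev *dev)
{
    func(dev);
}

/* Ordering described as broken: schedule while holding the lock. */
static void attach_locked_schedule(struct fake_dev *dev)
{
    fake_lock(dev);
    schedule_sync_fallback(attach_helper, dev);
    fake_unlock(dev);
}

/* Ordering described as the fix: decide under the lock, schedule after
 * the lock is dropped. */
static void attach_unlocked_schedule(struct fake_dev *dev)
{
    bool async = false;

    fake_lock(dev);
    async = true; /* assumption: async probing was requested */
    fake_unlock(dev);

    if (async)
        schedule_sync_fallback(attach_helper, dev);
}
```

    Calling attach_locked_schedule on a zero-initialized struct fake_dev
    sets self_deadlock and leaves attached at 0, while
    attach_unlocked_schedule lets the helper take the lock and complete
    once.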
    
    Fixes: 765230b5 ("driver-core: add asynchronous probing support for drivers")
    Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
    Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dd.c 34.7 KB