    workqueue: Remove cpus_read_lock() from apply_wqattrs_lock() · 19af4575
    Lai Jiangshan authored
    1726a171 ("workqueue: Put PWQ allocation and WQ enlistment in the same
    lock C.S.") led to the following possible deadlock:
    
      WARNING: possible recursive locking detected
      6.10.0-rc5-00004-g1d4c6111406c #1 Not tainted
       --------------------------------------------
       swapper/0/1 is trying to acquire lock:
       c27760f4 (cpu_hotplug_lock){++++}-{0:0}, at: alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730) 
      
       but task is already holding lock:
       c27760f4 (cpu_hotplug_lock){++++}-{0:0}, at: padata_alloc (kernel/padata.c:1007) 
       ...  
       stack backtrace:
       ...
       cpus_read_lock (include/linux/percpu-rwsem.h:53 kernel/cpu.c:488) 
       alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730) 
       padata_alloc (kernel/padata.c:1007 (discriminator 1)) 
       pcrypt_init_padata (crypto/pcrypt.c:327 (discriminator 1)) 
       pcrypt_init (crypto/pcrypt.c:353) 
       do_one_initcall (init/main.c:1267) 
       do_initcalls (init/main.c:1328 (discriminator 1) init/main.c:1345 (discriminator 1)) 
       kernel_init_freeable (init/main.c:1364) 
       kernel_init (init/main.c:1469) 
       ret_from_fork (arch/x86/kernel/process.c:153) 
       ret_from_fork_asm (arch/x86/entry/entry_32.S:737) 
       entry_INT80_32 (arch/x86/entry/entry_32.S:944) 
    
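    This is caused by pcrypt allocating a workqueue while holding
    cpus_read_lock(). alloc_workqueue() must not take cpus_read_lock() again
    in that context: if a down_write starts after the first down_read, the
    second down_read blocks behind the writer while the writer waits for the
    first reader, and the task deadlocks.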
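
The ordering hazard described above can be sketched with a toy, single-threaded model. Everything here is hypothetical and for illustration only: `toy_rwsem` and its helpers are not kernel APIs, and the model simply assumes a lock that makes new readers wait once a writer has announced itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a writer-preferring reader/writer lock (hypothetical names,
 * not the kernel's percpu_rw_semaphore). It only records who would block. */
struct toy_rwsem {
    int readers;          /* read holds currently granted */
    bool writer_pending;  /* a writer has announced itself and is waiting */
};

/* Returns true if the read hold is granted, false if the caller would block. */
static bool toy_try_down_read(struct toy_rwsem *s)
{
    if (s->writer_pending)
        return false;     /* new readers queue behind a pending writer */
    s->readers++;
    return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
    assert(s->readers > 0);
    s->readers--;
}

/* Writer announces itself; returns true only if no reader is in the way. */
static bool toy_begin_down_write(struct toy_rwsem *s)
{
    s->writer_pending = true;
    return s->readers == 0;
}

/* Replays "down_read, (optional) down_write, nested down_read" and reports
 * whether both the writer and the nested reader end up stuck. */
static bool nested_read_deadlocks(bool writer_arrives_between)
{
    struct toy_rwsem s = { 0, false };
    bool outer = toy_try_down_read(&s);   /* caller's read lock */
    bool writer_runs = true;
    bool inner, stuck;

    if (writer_arrives_between)
        writer_runs = toy_begin_down_write(&s);

    inner = toy_try_down_read(&s);        /* nested read lock in the callee */

    /* Writer waits for the outer reader, which waits for the nested read. */
    stuck = outer && !writer_runs && !inner;

    if (inner)
        toy_up_read(&s);
    if (outer)
        toy_up_read(&s);
    return stuck;
}
```

In this model, `nested_read_deadlocks(true)` reports a deadlock and `nested_read_deadlocks(false)` does not, which is why the nested read lock is only a "possible" deadlock: it depends on a writer arriving between the two read acquisitions.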
    
    The pwq creations and installations have been reworked to be based on
    wq_online_cpumask rather than cpu_online_mask, which makes
    cpus_read_lock() unnecessary during wqattrs changes. Fix the deadlock by
    removing cpus_read_lock() from apply_wqattrs_lock().
    
    tj: Updated changelog.
    Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
    Fixes: 1726a171 ("workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.")
    Link: http://lkml.kernel.org/r/202407081521.83b627c1-lkp@intel.com
    Signed-off-by: Tejun Heo <tj@kernel.org>