    workqueue: fix possible pool stall bug in wq_unbind_fn() · eb283428
    Lai Jiangshan authored
    Since multiple pools per cpu have been introduced, wq_unbind_fn() has
    a subtle bug which may theoretically stall work item processing.  The
    problem is two-fold.
    
    * wq_unbind_fn() depends on the worker executing wq_unbind_fn() itself
      to start unbound chain execution, which worked fine when there was
      only a single pool.  With multiple pools, only the pool which is
      running wq_unbind_fn() - the highpri one - is guaranteed to have
      such kick-off.  The other pool could stall when its busy workers
      block.
    
    * The current code is setting WORKER_UNBIND / POOL_DISASSOCIATED of
      the two pools in succession without initiating work execution
      in between.  Because setting the flags requires grabbing assoc_mutex
      which is held while new workers are created, this could lead to
      stalls if a pool's manager is waiting for the previous pool's work
      items to release memory.  This is almost purely theoretical though.
    
    Update wq_unbind_fn() such that it sets WORKER_UNBIND /
    POOL_DISASSOCIATED, goes over schedule() and explicitly kicks off
    execution for a pool and then moves on to the next one.
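
    The per-pool ordering described above can be sketched as a small
    userspace toy model.  This is not the kernel code: every name here
    (toy_pool, toy_mark_disassociated(), toy_yield(), toy_kick(),
    toy_unbind_all()) is hypothetical, locking is omitted, and
    toy_yield() is only a placeholder for going over schedule().  It
    illustrates just one point: each pool is flagged and kicked before
    the next pool is touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the pool flag; not the kernel definition. */
enum { TOY_POOL_DISASSOCIATED = 1 << 0 };

struct toy_pool {
	unsigned int flags;
	int flag_step;	/* global step at which the flag was set */
	int kick_step;	/* global step at which execution was kicked off */
};

static int toy_step;	/* monotonically increasing event counter */

/* Models setting the unbind/disassociated flags on one pool. */
static void toy_mark_disassociated(struct toy_pool *pool)
{
	pool->flags |= TOY_POOL_DISASSOCIATED;
	pool->flag_step = ++toy_step;
}

/* Placeholder for going over schedule(); does nothing in this model. */
static void toy_yield(void)
{
}

/* Models explicitly kicking off work execution for one pool. */
static void toy_kick(struct toy_pool *pool)
{
	pool->kick_step = ++toy_step;
}

/* Flag, yield and kick one pool completely before moving to the next. */
static void toy_unbind_all(struct toy_pool *pools, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		toy_mark_disassociated(&pools[i]);
		toy_yield();
		toy_kick(&pools[i]);
		assert(pools[i].kick_step > pools[i].flag_step);
	}
}
```

    Calling toy_unbind_all() on an array of two zero-initialized pools
    leaves both flagged and kicked, with the first pool's kick recorded
    before the second pool's flag is set.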
    
    tj: Updated comments and description.
    Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: stable@vger.kernel.org
workqueue.c 104 KB