Commit 0b9a58a7, authored by Oleg Nesterov, committed by Greg Kroah-Hartman

destroy_workqueue() can livelock

Pointed out by Michal Schmidt <mschmidt@redhat.com>.

The bug was introduced in 2.6.22 by me.

cleanup_workqueue_thread() calls flush_cpu_workqueue(cwq) in a loop until
->worklist becomes empty.  This can livelock: a re-niced caller can get the
CPU after wake_up() and insert a new barrier before the lower-priority
cwq->thread has a chance to clear ->current_work.

Change cleanup_workqueue_thread() to do flush_cpu_workqueue(cwq) only once.
We can rely on the fact that run_workqueue() won't return until it flushes
all works.  So it is safe to call kthread_stop() after that: the "should
stop" request won't be noticed until run_workqueue() returns.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Michal Schmidt <mschmidt@redhat.com>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
parent 7553b617
@@ -739,18 +739,17 @@ static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq, int cpu)
 	if (cwq->thread == NULL)
 		return;
 
+	flush_cpu_workqueue(cwq);
 	/*
-	 * If the caller is CPU_DEAD the single flush_cpu_workqueue()
-	 * is not enough, a concurrent flush_workqueue() can insert a
-	 * barrier after us.
+	 * If the caller is CPU_DEAD and cwq->worklist was not empty,
+	 * a concurrent flush_workqueue() can insert a barrier after us.
+	 * However, in that case run_workqueue() won't return and check
+	 * kthread_should_stop() until it flushes all work_struct's.
 	 * When ->worklist becomes empty it is safe to exit because no
 	 * more work_structs can be queued on this cwq: flush_workqueue
 	 * checks list_empty(), and a "normal" queue_work() can't use
 	 * a dead CPU.
 	 */
-	while (flush_cpu_workqueue(cwq))
-		;
-
 	kthread_stop(cwq->thread);
 	cwq->thread = NULL;
 }
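
The reasoning in the commit message (run_workqueue() does not return until the list is empty, so a stop request that arrives mid-drain is only noticed afterwards) can be illustrated with a small single-threaded userspace model. This is only a sketch and not kernel code: struct model_cwq, model_run_workqueue(), model_worker_thread() and model_simulate() are hypothetical names, and the "concurrent" flusher and the stop request are simulated inline rather than run on real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct cpu_workqueue_struct. */
struct model_cwq {
	int pending;        /* queued items, stands in for ->worklist */
	int executed;       /* items run so far */
	int late_barriers;  /* barriers a concurrent flusher will still queue */
	bool should_stop;   /* stands in for the kthread_stop() request */
};

/* Models run_workqueue(): returns only once the list is empty. */
static void model_run_workqueue(struct model_cwq *cwq)
{
	while (cwq->pending > 0) {
		cwq->pending--;
		cwq->executed++;

		/*
		 * Simulated concurrency (an assumption of this model): while
		 * the worker is busy, a flusher queues one more barrier and
		 * the stop request arrives.
		 */
		if (cwq->late_barriers > 0) {
			cwq->late_barriers--;
			cwq->pending++;
		}
		cwq->should_stop = true;
	}
}

/* Models the worker loop: the stop flag is only checked between drains. */
static void model_worker_thread(struct model_cwq *cwq)
{
	while (!cwq->should_stop)
		model_run_workqueue(cwq);

	/* Nothing is left behind, although stop was requested mid-drain. */
	assert(cwq->pending == 0);
}

/* Runs the model once and returns the final state. */
struct model_cwq model_simulate(int queued, int late_barriers)
{
	struct model_cwq cwq = {
		.pending = queued,
		.late_barriers = late_barriers,
	};

	/* An idle worker (empty list) sees the stop request right away. */
	if (cwq.pending == 0)
		cwq.should_stop = true;

	model_worker_thread(&cwq);
	return cwq;
}
```

Calling model_simulate(3, 1) models a worker with three queued items and one barrier inserted late by a concurrent flusher: the late barrier is still executed before the worker exits, without the caller having to flush in a loop.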