[PATCH] sched: smt fixes
while looking at HT scheduler bugreports and boot failures i discovered a bad assumption in most of the HT scheduling code: that resched_task() can be called without holding the task's runqueue. This is most definitely not valid - doing it without locking can lead to the task on that CPU exiting, and this CPU corrupting the (ex-) task_info struct. It can also lead to HT-wakeup races with task switching on that other CPU. (this_CPU marking the wrong task on that_CPU as need_resched - resulting in e.g. idle wakeups not working.) The attached patch against fixes it all up. Changes: - resched_task() needs to touch the task so the runqueue lock of that CPU must be held: resched_task() now enforces this rule. - wake_priority_sleeper() was called without holding the runqueue lock. - wake_sleeping_dependent() needs to hold the runqueue locks of all siblings (2 typically). Effects of this ripples back to schedule() as well - in the non-SMT case it gets compiled out so it's fine. - dependent_sleeper() needs the runqueue locks too - and it's slightly harder because it wants to know the 'next task' info which might change during the lock-drop/reacquire. Ripple effect on schedule() => compiled out on non-SMT so fine. - resched_task() was disabling preemption for no good reason - all paths that called this function had either a spinlock held or irqs disabled. Compiled & booted on x86 SMP and UP, with and without SMT. Booted the SMT kernel on a real SMP+HT box as well. (Unpatched kernel wouldn't even boot with the resched_task() assert in place.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Showing
Please register or sign in to comment