Commit 8c2bfc5b authored by Benjamin Herrenschmidt's avatar Benjamin Herrenschmidt Committed by Linus Torvalds

[PATCH] del_timer() vs. mod_timer() SMP race

We just spent some days fighting a rare race in one of the distro's who backported
some of timer.c from 2.6 to 2.4 (though they missed a bit).

The actual race we found didn't happen in 2.6 _but_ code inspection showed that a
similar race is still present in 2.6, explanation below:

Code removing a timer from a list (run_timers or del_timer) takes that CPU list
lock, does list_del, then timer->base = NULL.

It is mandatory that this timer->base = NULL is visible to other CPUs only after
the list_del() is complete. If not, then mod timer could see it NULL, thus take it's
own CPU list lock and not the one for the CPU the timer was beeing removed from the
list, and thus the list_add in mod_timer() could race with the list_del() from
run_timers() or del_timer().

Our race happened with run_timers(), which _DOES_ contain a proper smp_wmb() in the
right spot in 2.6, but didn't in the "backport" we were fighting with.

However, del_timer() doesn't have such a barrier, and thus is subject to this race in
2.6 as well. This patch fixes it.
Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
parent 6621ad71
......@@ -308,6 +308,8 @@ int del_timer(struct timer_list *timer)
goto repeat;
}
list_del(&timer->entry);
/* Need to make sure that anybody who sees a NULL base also sees the list ops */
smp_wmb();
timer->base = NULL;
spin_unlock_irqrestore(&base->lock, flags);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment