IPoIB: fix mcast_dev_flush/mcast_restart_task race
Our mcast_dev_flush routine and our mcast_restart_task can race against each other. In particular, they both hold the priv->lock while manipulating the rbtree and while removing mcast entries from the multicast_list and while adding entries to the remove_list, but they also both drop their locks prior to doing the actual removes. The mcast_dev_flush routine is run entirely under the rtnl lock and so has at least some locking. The actual race condition is like this: Thread 1 Thread 2 ifconfig ib0 up start multicast join for broadcast multicast join completes for broadcast start to add more multicast joins call mcast_restart_task to add new entries ifconfig ib0 down mcast_dev_flush mcast_leave(mcast A) mcast_leave(mcast A) As mcast_leave calls ib_sa_multicast_leave, and as member in core/multicast.c is ref counted, we run into an unbalanced refcount issue. To avoid stomping on each others removes, take the rtnl lock specifically when we are deleting the entries from the remove list. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Showing
Please register or sign in to comment