• Jon Maloy's avatar
    tipc: eliminate message disordering during binding table update · 988f3f16
    Jon Maloy authored
    We have seen the following race scenario:
    1) named_distribute() builds a "bulk" message, containing a PUBLISH
       item for a certain publication. This is based on the contents of
       the binding tables's 'cluster_scope' list.
    2) tipc_named_withdraw() removes the same publication from the list,
       bulds a WITHDRAW message and distributes it to all cluster nodes.
    3) tipc_named_node_up(), which was calling named_distribute(), sends
       out the bulk message built under 1)
    4) The WITHDRAW message arrives at the just detected node, finds
       no corresponding publication, and is dropped.
    5) The PUBLISH item arrives at the same node, is added to its binding
       table, and remains there forever.
    
    This arrival disordering was earlier taken care of by the backlog queue,
    originally added for a different purpose, which was removed in the
    commit referred to below, but we now need a different solution.
    In this commit, we replace the rcu lock protecting the 'cluster_scope'
    list with a regular RW lock which comprises even the sending of the
    bulk message. This both guarantees both the list integrity and the
    message sending order. We will later add a commit which cleans up
    this code further.
    
    Note that this commit needs recently added commit d3092b2e ("tipc:
    fix unsafe rcu locking when accessing publication list") to apply
    cleanly.
    
    Fixes: 37922ea4 ("tipc: permit overlapping service ranges in name table")
    Reported-by: default avatarTuong Lien Tong <tuong.t.lien@dektech.com.au>
    Acked-by: default avatarYing Xue <ying.xue@windriver.com>
    Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    988f3f16
name_table.c 26.5 KB