    net: adding memory barrier to the poll and receive callbacks · a57de0b4
    Jiri Olsa authored
    Adding a memory barrier after the poll_wait function, paired with the
    receive callbacks. Adding the functions sock_poll_wait and
    sk_has_sleeper to wrap the memory barrier.
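
As a rough userspace analogue of the two wrappers named above (not the kernel code itself), the sketch below places a full fence after the registration on the poll side and before the sleeper check on the receive side. It uses C11 atomic_thread_fence where the kernel would presumably use smp_mb(); that detail is an assumption, since the message does not name the primitive. The toy_sock type and all toy_-prefixed functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket: a waiter count and a data flag. */
struct toy_sock {
    atomic_int waiters;     /* plays the role of the wait queue */
    atomic_int data_ready;  /* plays the role of tp->rcv_nxt advancing */
};

static inline void toy_sock_init(struct toy_sock *sk)
{
    atomic_init(&sk->waiters, 0);
    atomic_init(&sk->data_ready, 0);
}

/* Poll side, in the spirit of sock_poll_wait: register, then full fence. */
static inline void toy_sock_poll_wait(struct toy_sock *sk)
{
    atomic_fetch_add_explicit(&sk->waiters, 1, memory_order_relaxed);
    /* Pairs with the fence in toy_sk_has_sleeper(). */
    atomic_thread_fence(memory_order_seq_cst);
}

/* Poll side: the data check that follows the registration. */
static inline bool toy_sock_readable(struct toy_sock *sk)
{
    return atomic_load_explicit(&sk->data_ready, memory_order_relaxed) != 0;
}

/* Receive side: publish the new data. */
static inline void toy_sock_receive(struct toy_sock *sk)
{
    atomic_store_explicit(&sk->data_ready, 1, memory_order_relaxed);
}

/* Receive side, in the spirit of sk_has_sleeper: full fence, then check. */
static inline bool toy_sk_has_sleeper(struct toy_sock *sk)
{
    /* Pairs with the fence in toy_sock_poll_wait(). */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&sk->waiters, memory_order_relaxed) != 0;
}

/* Single-threaded walk-through of the call order; it exercises the
 * helpers only and does not reproduce the race itself. */
static inline void toy_sock_demo(void)
{
    struct toy_sock sk;

    toy_sock_init(&sk);
    toy_sock_poll_wait(&sk);            /* poller registers first */
    toy_sock_receive(&sk);              /* data arrives afterwards */
    assert(toy_sk_has_sleeper(&sk));    /* receiver sees the sleeper */
    assert(toy_sock_readable(&sk));     /* poller would see the data */
}
```

The point of wrapping the fence inside the helpers is that callers cannot forget one half of the pair.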
    
    Without the memory barrier, the following race can happen.
    The race fires when the following code paths meet and the tp->rcv_nxt
    and __add_wait_queue updates stay in the CPU caches.
    
    CPU1                         CPU2
    
    sys_select                   receive packet
      ...                        ...
      __add_wait_queue           update tp->rcv_nxt
      ...                        ...
      tp->rcv_nxt check          sock_def_readable
      ...                        {
      schedule                      ...
                                    if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                            wake_up_interruptible(sk->sk_sleep)
                                    ...
                                 }
    
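    If there were no caches the code would work fine, since the two CPUs
    touch the wait queue and rcv_nxt in opposite order: CPU1 writes the
    wait queue and then reads rcv_nxt, while CPU2 writes rcv_nxt and then
    reads the wait queue.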
    
    This means that once tp->rcv_nxt is updated by CPU2, CPU1 has either
    already passed the tp->rcv_nxt check and sleeps, or will get the new
    value of tp->rcv_nxt and return with the new data mask.
    In both cases the process (CPU1) has been added to the wait queue, so
    the waitqueue_active (CPU2) call cannot miss it and will wake up CPU1.
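
The no-cache argument above can be checked by brute force. The sketch below is a toy model, not kernel code: it enumerates every interleaving of the two steps on each CPU against a single shared memory and counts the orderings in which both checks miss. The function names and the plain int variables standing in for the wait queue and tp->rcv_nxt are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts interleavings in which CPU1 misses the data AND CPU2 misses
 * the waiter. *total receives the number of interleavings examined. */
static inline int count_lost_wakeups(int *total)
{
    int lost = 0;

    *total = 0;
    /* Each 4-bit mask with exactly two bits set is one interleaving:
     * bit i set means step i is run by CPU1, clear means CPU2. */
    for (unsigned mask = 0; mask < 16; mask++) {
        int bits = 0;
        for (int i = 0; i < 4; i++)
            bits += (mask >> i) & 1;
        if (bits != 2)
            continue;
        (*total)++;

        int waiter = 0, data = 0;   /* shared memory, no caches */
        int cpu1_step = 0, cpu2_step = 0;
        bool cpu1_saw_data = false, cpu2_saw_waiter = false;

        for (int i = 0; i < 4; i++) {
            if ((mask >> i) & 1) {
                if (cpu1_step++ == 0)
                    waiter = 1;                     /* __add_wait_queue */
                else
                    cpu1_saw_data = (data != 0);    /* tp->rcv_nxt check */
            } else {
                if (cpu2_step++ == 0)
                    data = 1;                       /* update tp->rcv_nxt */
                else
                    cpu2_saw_waiter = (waiter != 0); /* waitqueue_active */
            }
        }
        if (!cpu1_saw_data && !cpu2_saw_waiter)
            lost++;
    }
    return lost;
}

/* Usage: with program order preserved and no caches, no ordering
 * loses the wakeup. */
static inline void check_no_lost_wakeup(void)
{
    int total;

    assert(count_lost_wakeups(&total) == 0);
    assert(total == 6);
}
```

For both checks to miss, each load would have to run before the other CPU's store, which contradicts each CPU running its store before its own load.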
    
    The bad case is when the __add_wait_queue changes done by CPU1 stay in its
    cache, and so does the tp->rcv_nxt update on the CPU2 side.  CPU1 will then
    end up calling schedule and sleep forever if no more data arrives on the
    socket.
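
The bad case can be illustrated with a deliberately simplified model in which each CPU holds one pending store that shared memory has not yet seen. This is a sketch under stated assumptions, not a description of real hardware or of the kernel patch: toy_cpu, toy_store, toy_load and toy_barrier are hypothetical names, and the barrier is modelled as simply draining the pending store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy CPU: one pending store not yet visible in memory. */
struct toy_cpu {
    int *pending_addr;  /* NULL when nothing is buffered */
    int pending_val;
};

/* A store is buffered locally instead of reaching memory at once. */
static inline void toy_store(struct toy_cpu *cpu, int *addr, int val)
{
    cpu->pending_addr = addr;
    cpu->pending_val = val;
}

/* Stand-in for a full barrier: drain the buffered store to memory. */
static inline void toy_barrier(struct toy_cpu *cpu)
{
    if (cpu->pending_addr) {
        *cpu->pending_addr = cpu->pending_val;
        cpu->pending_addr = NULL;
    }
}

/* A load sees the CPU's own buffered store, otherwise shared memory. */
static inline int toy_load(const struct toy_cpu *cpu, const int *addr)
{
    if (cpu->pending_addr == addr)
        return cpu->pending_val;
    return *addr;
}

/* Returns true when CPU1 would sleep and CPU2 would skip the wakeup. */
static inline bool toy_lost_wakeup(bool use_barrier)
{
    int waiter = 0, rcv_nxt = 0;        /* shared memory */
    struct toy_cpu cpu1 = { NULL, 0 }, cpu2 = { NULL, 0 };

    toy_store(&cpu1, &waiter, 1);       /* CPU1: __add_wait_queue */
    toy_store(&cpu2, &rcv_nxt, 1);      /* CPU2: update tp->rcv_nxt */
    if (use_barrier) {
        toy_barrier(&cpu1);             /* barrier after poll_wait */
        toy_barrier(&cpu2);             /* barrier before sleeper check */
    }
    bool cpu1_saw_data = toy_load(&cpu1, &rcv_nxt) != 0;
    bool cpu2_saw_waiter = toy_load(&cpu2, &waiter) != 0;

    return !cpu1_saw_data && !cpu2_saw_waiter;
}

/* Usage: the wakeup is lost without the barriers and kept with them. */
static inline void toy_lost_wakeup_demo(void)
{
    assert(toy_lost_wakeup(false));
    assert(!toy_lost_wakeup(true));
}
```

In this model the barrier on each side forces that CPU's update into shared memory before its check runs, so at least one side observes the other.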
    
    Calls to poll_wait in the following modules were omitted:
    	net/bluetooth/af_bluetooth.c
    	net/irda/af_irda.c
    	net/irda/irnet/irnet_ppp.c
    	net/mac80211/rc80211_pid_debugfs.c
    	net/phonet/socket.c
    	net/rds/af_rds.c
    	net/rfkill/core.c
    	net/sunrpc/cache.c
    	net/sunrpc/rpc_pipe.c
    	net/tipc/socket.c
    
    Signed-off-by: Jiri Olsa <jolsa@redhat.com>
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>