    cxgb4: notify uP to route ctrlq compl to rdma rspq · abef7afb
    Raju Rangoju authored
    commit dec6b331 upstream.
    
    During module initialisation there is a possible race between
    the ULD and the LLD in which neither notifies the uP about
    where to route the ctrl queue completions. The LLD skips
    notifying the uP because the rdma queues have not been created
    yet (it leaves the notification to the ULD). When the ULD comes
    up, it also skips notifying the uP because the FULL_INIT_DONE
    flag is not set yet (the ULD assumes that the interface is not
    up yet).
    
    Consequently, the uP is never told where to send the ctrl
    queue completions, which leads to iWARP RI_RES WR failures.
    
    Here is the race:
    
    CPU 0                           CPU 1

    - allocates nic rx queues
    - t4_sge_alloc_ctrl_txq()
      (if rdma rsp queues exist,
      tell uP to route ctrl queue
      compl to rdma rspq)
                                    - acquires the mutex_lock
                                    - allocates rdma response queues
                                    - if FULL_INIT_DONE set,
                                      tell uP to route ctrl queue compl
                                      to rdma rspq
                                    - relinquishes mutex_lock
    - acquires the mutex_lock
    - enable_rx()
    - set FULL_INIT_DONE
    - relinquishes mutex_lock
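
The interleaving above can be replayed with a small single-threaded model. This is only a sketch of the ordering problem, not driver code: the struct, the enum values, and both functions are hypothetical names invented for illustration, and the three steps stand in for the ctrl queue allocation, the setting of FULL_INIT_DONE, and the ULD's rdma queue setup.

```c
#include <assert.h>   /* used by the checks in the usage note below */
#include <stdbool.h>

/* Toy adapter state; every name here is hypothetical, not a driver symbol. */
struct toy_adapter {
    bool rdma_rspq_created; /* ULD has allocated the rdma response queues */
    bool full_init_done;    /* stands in for the FULL_INIT_DONE flag */
    bool up_notified;       /* uP knows where to route ctrl queue compl */
};

/* The three steps from the diagram, reduced to their effect on the state. */
enum step { LLD_ALLOC_CTRLQ, LLD_FINISH_INIT, ULD_SETUP_QUEUES };

static void run_step(struct toy_adapter *a, enum step s)
{
    switch (s) {
    case LLD_ALLOC_CTRLQ:
        /* LLD notifies the uP only if the rdma rspqs already exist. */
        if (a->rdma_rspq_created)
            a->up_notified = true;
        break;
    case LLD_FINISH_INIT:
        /* LLD finishes bring-up and sets the flag. */
        a->full_init_done = true;
        break;
    case ULD_SETUP_QUEUES:
        /* ULD creates the rspqs, but notifies only if the flag is set. */
        a->rdma_rspq_created = true;
        if (a->full_init_done)
            a->up_notified = true;
        break;
    }
}

/* Replay three steps in the given order; report whether the uP was notified. */
bool up_notified_after(enum step s1, enum step s2, enum step s3)
{
    struct toy_adapter a = { false, false, false };

    run_step(&a, s1);
    run_step(&a, s2);
    run_step(&a, s3);
    return a.up_notified;
}
```

In this model, `assert(!up_notified_after(LLD_ALLOC_CTRLQ, ULD_SETUP_QUEUES, LLD_FINISH_INIT))` holds for the interleaving shown in the diagram, while the two orderings in which the ULD step does not land between the two LLD steps both end with the uP notified.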
    
    This patch fixes the above issue.
    
    Fixes: e7519f99('cxgb4: avoid enabling napi twice to the same queue')
    Signed-off-by: Raju Rangoju <rajur@chelsio.com>
    Acked-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cxgb4_main.c 139 KB