• Steve Wise's avatar
    RDMA/cxgb4: Use completion objects for event blocking · c337374b
    Steve Wise authored
    There exists a race condition when using wait_queue_head_t objects
    that are declared on the stack.  This was being done in a few places
    where we are sending work requests to the FW and awaiting replies, but
    we don't have an endpoint structure with an embedded c4iw_wr_wait
    struct.  So the code was allocating it locally on the stack.  Bad
    design.  The race is:
    
      1) thread on cpuX declares the wait_queue_head_t on the stack, then
         posts a firmware WR with that wait object ptr as the cookie to be
         returned in the WR reply.  This thread will proceed to block in
         wait_event_timeout() but before it does:
    
      2) An interrupt runs on cpuY with the WR reply.  fw6_msg() handles
         this and calls c4iw_wake_up().  c4iw_wake_up() sets the condition
         variable in the c4iw_wr_wait object to TRUE and will call
         wake_up(), but before it calls wake_up():
    
      3) The thread on cpuX calls c4iw_wait_for_reply(), which calls
         wait_event_timeout().  The wait_event_timeout() macro checks the
         condition variable and returns immediately since it is TRUE.  So
         this thread never blocks/sleeps. The function then returns
         effectively deallocating the c4iw_wr_wait object that was on the
         stack.
    
      4) So at this point cpuY has a pointer to the c4iw_wr_wait object
         that is no longer valid.  Further its pointing to a stack frame
         that might now be in use by some other context/thread.  So cpuY
         continues execution and calls wake_up() on a ptr to a wait object
         that as been effectively deallocated.
    
    This race, when it hits, can cause a crash in wake_up(), which I've
    seen under heavy stress. It can also corrupt the referenced stack
    which can cause any number of failures.
    
    The fix:
    
    Use struct completion, which supports on-stack declarations.
    Completions use a spinlock around setting the condition to true and
    the wake up so that steps 2 and 4 above are atomic and step 3 can
    never happen in-between.
    Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
    c337374b
iw_cxgb4.h 19.8 KB