1. 15 Mar, 2011 2 commits
    • Sean Hefty's avatar
      IB/cm: Bump reference count on cm_id before invoking callback · 29963437
      Sean Hefty authored
      When processing a SIDR REQ, the ib_cm allocates a new cm_id.  The
      refcount of the cm_id is initialized to 1.  However, cm_process_work
      will decrement the refcount after invoking all callbacks.  The result
      is that the cm_id will end up with refcount set to 0 by the end of the
      sidr req handler.
      
      If a user tries to destroy the cm_id, the destruction will proceed,
      under the incorrect assumption that no other threads are referencing
      the cm_id.  This can lead to a crash when the cm callback thread tries
      to access the cm_id.
      
      This problem was noticed as part of a larger investigation with kernel
      crashes in the rdma_cm when running on a real time OS.
      Signed-off-by: default avatarSean Hefty <sean.hefty@intel.com>
      Acked-by: default avatarDoug Ledford <dledford@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      29963437
    • Sean Hefty's avatar
      RDMA/cma: Fix crash in request handlers · 25ae21a1
      Sean Hefty authored
      Doug Ledford and Red Hat reported a crash when running the rdma_cm on
      a real-time OS.  The crash has the following call trace:
      
          cm_process_work
             cma_req_handler
                cma_disable_callback
                rdma_create_id
                   kzalloc
                   init_completion
                cma_get_net_info
                cma_save_net_info
                cma_any_addr
                   cma_zero_addr
                rdma_translate_ip
                   rdma_copy_addr
                cma_acquire_dev
                   rdma_addr_get_sgid
                   ib_find_cached_gid
                   cma_attach_to_dev
                ucma_event_handler
                   kzalloc
                   ib_copy_ah_attr_to_user
                cma_comp
      
      [ preempted ]
      
          cma_write
              copy_from_user
              ucma_destroy_id
                 copy_from_user
                 _ucma_find_context
                 ucma_put_ctx
                 ucma_free_ctx
                    rdma_destroy_id
                       cma_exch
                       cma_cancel_operation
                       rdma_node_get_transport
      
              rt_mutex_slowunlock
              bad_area_nosemaphore
              oops_enter
      
      They were able to reproduce the crash multiple times with the
      following details:
      
          Crash seems to always happen on the:
                  mutex_unlock(&conn_id->handler_mutex);
          as conn_id looks to have been freed during this code path.
      
      An examination of the code shows that a race exists in the request
      handlers.  When a new connection request is received, the rdma_cm
      allocates a new connection identifier.  This identifier has a single
      reference count on it.  If a user calls rdma_destroy_id() from another
      thread after receiving a callback, rdma_destroy_id will proceed to
      destroy the id and free the associated memory.  However, the request
      handlers may still be in the process of running.  When control returns
      to the request handlers, they can attempt to access the newly created
      identifiers.
      
      Fix this by holding a reference on the newly created rdma_cm_id until
      the request handler is through accessing it.
      Signed-off-by: default avatarSean Hefty <sean.hefty@intel.com>
      Acked-by: default avatarDoug Ledford <dledford@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      25ae21a1
  2. 18 Feb, 2011 8 commits
  3. 17 Feb, 2011 10 commits
  4. 16 Feb, 2011 15 commits
  5. 15 Feb, 2011 5 commits