1. 30 Oct, 2012 10 commits
    • rbd: remove options args from rbd_add_parse_args() · f28e565a
      Alex Elder authored
      The "options" argument to rbd_add_parse_args() (and its partner
      options_size) is now only needed within the function, so there's no
      need to have the caller allocate and pass the options buffer.  Just
      allocate the options buffer within the function using dup_token().
      
      Also distinguish between failures due to failed memory allocation
      and failing because a required argument was missing.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
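The distinction this commit draws between an allocation failure and a missing required argument can be illustrated with a small user-space sketch. The name dup_token_sketch() and its err out-parameter are hypothetical and are not the kernel helper's actual interface; the sketch only assumes that a token is a whitespace-delimited word.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a token-duplicating helper (not the kernel's
 * dup_token() signature).  Skips leading whitespace in *buf, copies the
 * next whitespace-delimited token into newly allocated memory, and
 * advances *buf past it.  On failure returns NULL and sets *err to
 * -EINVAL (no token present) or -ENOMEM (allocation failed).
 */
static char *dup_token_sketch(const char **buf, int *err)
{
	const char *p = *buf;
	size_t len;
	char *dup;

	while (*p && isspace((unsigned char)*p))
		p++;

	len = strcspn(p, " \t\n\r\f\v");
	if (len == 0) {
		*err = -EINVAL;		/* required argument is missing */
		return NULL;
	}

	dup = malloc(len + 1);
	if (!dup) {
		*err = -ENOMEM;		/* memory allocation failed */
		return NULL;
	}

	memcpy(dup, p, len);
	dup[len] = '\0';
	*buf = p + len;
	*err = 0;
	return dup;
}
```

A caller can then report the two failure modes separately instead of folding both into one generic error.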
    • rbd: get rid of snap_name_len · e5c35534
      Alex Elder authored
      The value returned in the "snap_name_len" argument to
      rbd_add_parse_args() is never actually used, so get rid of it.
      
      The snap_name_len recorded in rbd_dev_v2_snap_name() is not
      useful either, so get rid of that too.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: do all argument parsing in one place · 0ddebc0c
      Alex Elder authored
      This patch makes rbd_add_parse_args() be the single place all
      argument parsing occurs for an image map request:
          - Move the ceph_parse_options() call into that function
          - Use local variables rather than parameters to hold the list
            of monitor addresses supplied
          - Rather than returning it, pass the snapshot name (and its
            length) back via parameters
          - Have the function return a ceph_options structure pointer
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: move ceph_parse_options() call up · 78cea76e
      Alex Elder authored
      Move option parsing out of rbd_get_client() and into its caller.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: rename snap_exists field · daba5fdb
      Alex Elder authored
      A Boolean field "snap_exists" in an rbd mapping is used to indicate
      whether a mapped snapshot has been removed from an image's snapshot
      context, to stop sending requests for that snapshot as soon as we
      know it's gone.
      
      Generalize the interpretation of this field so it applies to
      non-snapshot (i.e. "head") mappings.  That is, define its value
      to be false until the mapping has been set, and then define it to be
      true for both snapshot mappings and head mappings.
      
      Rename the field "exists" to reflect the broader interpretation.
      The rbd_mapping structure is on its way out, so move the field
      back into the rbd_device structure.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: move snap info out of rbd_mapping struct · 971f839a
      Alex Elder authored
      Moving the snap_id and snap_name fields into the separate
      rbd_mapping structure was misguided.  (And in time, perhaps
      we'll do away with that structure altogether...)
      
      Move these fields back into struct rbd_device.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: make pool_id a 64 bit value · 86992098
      Alex Elder authored
      If a format 2 image has a parent, its pool id will be specified
      using a 64-bit value.  Change the pool id we save for an image to
      match that.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: remove snapshots on error in rbd_add() · 41f38c2b
      Alex Elder authored
      If rbd_dev_snaps_update() has ever been called for an rbd device
      structure there could be snapshot structures on its snaps list.
      In rbd_add(), this function is called but a subsequent error
      path neglected to clean up any of these snapshots.
      
      Add a call to rbd_remove_all_snaps() in the appropriate spot to
      remedy this.  Change a couple of error labels to be a little
      clearer while there.
      
      Drop the leading underscores from that function's name; there's
      nothing special about the function that they might signify.  As suggested
      in review, the leading underscores in __rbd_remove_snap_dev() have
      been removed as well.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: simplify rbd_rq_fn() · f7760dad
      Alex Elder authored
      When processing a request, rbd_rq_fn() makes clones of the bio's in
      the request's bio chain and submits the results to osd's to be
      satisfied.  If a request bio straddles the boundary between objects
      backing the rbd image, it must be represented by two cloned bio's,
      one for the first part (at the end of one object) and one for the
      second (at the beginning of the next object).
      
      This has been handled by a function bio_chain_clone(), which
      includes an interface only a mother could love, and which has
      been found to have other problems.
      
      This patch defines two new fairly generic bio functions (one which
      replaces bio_chain_clone()) to help out the situation, and then
      revises rbd_rq_fn() to make use of them.
      
      First, bio_clone_range() clones a portion of a single bio, starting
      at a given offset within the bio and including only as many bytes
      as requested.  As a convenience, a request to clone the entire bio
      is passed directly to bio_clone().
      
      Second, bio_chain_clone_range() performs a similar function,
      producing a chain of cloned bio's covering a sub-range of the
      source chain.  No bio_pair structures are used, and if successful
      the result will represent exactly the specified range.
      
      Using bio_chain_clone_range() makes rbd_rq_fn() a little easier
      to understand, because it avoids the need to pass very much
      state information between consecutive calls.  By avoiding the need
      to track a bio_pair structure, it also eliminates the problem
      described here:  http://tracker.newdream.net/issues/2933
      
      Note that the size of a block request (and therefore the complete
      length of a bio chain processed in rbd_rq_fn()) is an unsigned int,
      while the result of rbd_segment_length() is u64.  This change makes
      this range truncation explicit, and trips a bug if the
      segment boundary is too far off.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
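The boundary handling described in this commit can be sketched in user space. The following is only an illustration, assuming objects of (1 << obj_order) bytes; the function names are hypothetical, and assert() stands in for the kernel's bug check. It computes how many bytes of a request fall within the object containing a given offset, then narrows that u64 result to an unsigned int explicitly.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Number of bytes of the range [offset, offset + length) that lie in
 * the object containing offset, for objects of (1 << obj_order) bytes.
 * Assumes obj_order < 64.
 */
static uint64_t segment_length_sketch(uint64_t offset, uint64_t length,
				      unsigned int obj_order)
{
	uint64_t obj_size = (uint64_t)1 << obj_order;
	uint64_t room = obj_size - (offset & (obj_size - 1));

	return length < room ? length : room;
}

/*
 * Size of the next chunk to clone.  The segment length is 64 bits wide
 * but a request length is an unsigned int, so the narrowing is made
 * explicit and checked (assert() models tripping a bug).
 */
static unsigned int clone_chunk_sketch(uint64_t offset, unsigned int remaining,
				       unsigned int obj_order)
{
	uint64_t seg = segment_length_sketch(offset, remaining, obj_order);

	assert(seg <= UINT_MAX);
	return (unsigned int)seg;
}
```

With 4 MiB objects (obj_order 22), a 4096-byte request starting 512 bytes before an object boundary is split into a 512-byte chunk followed by a 3584-byte chunk, which corresponds to the two cloned ranges the commit message describes.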
    • libceph: fix osdmap decode error paths · 0ed7285e
      Sage Weil authored
      Ensure that we set the err value correctly so that we do not pass a 0
      value to ERR_PTR and confuse the calling code.  (In particular,
      osd_client.c handle_map() will BUG(!newmap)).
      Signed-off-by: Sage Weil <sage@inktank.com>
      Reviewed-by: Alex Elder <elder@inktank.com>
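Why a zero passed to ERR_PTR confuses callers can be shown with a user-space model of the error-pointer convention. The helpers below (err_ptr(), is_err(), decode_sketch()) are hypothetical re-creations for illustration, not the kernel macros or the osdmap decoder; the set_err_on_failure flag exists only to contrast the buggy and fixed behaviour.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the top MAX_ERRNO_SKETCH addresses. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct map_sketch {
	int epoch;
};

/*
 * Hypothetical decoder: returns the decoded map, or an error pointer.
 * With set_err_on_failure == 0 it models the bug, where the failure
 * path is reached with err still 0.
 */
static void *decode_sketch(struct map_sketch *storage, int input_ok,
			   int set_err_on_failure)
{
	int err = 0;

	if (!input_ok) {
		if (set_err_on_failure)
			err = -EINVAL;
		goto bad;
	}
	storage->epoch = 1;
	return storage;
bad:
	return err_ptr(err);	/* err == 0 yields NULL, not an error pointer */
}
```

In the buggy case the caller receives NULL, which is_err() does not flag, so code that only checks for an error pointer proceeds with a null map.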
  2. 26 Oct, 2012 13 commits
  3. 10 Oct, 2012 7 commits
    • rbd: activate v2 image support · 35152979
      Alex Elder authored
      Now that v2 image support is fully implemented, have
      rbd_dev_v2_probe() return 0 to indicate a successful probe.
      
      (Note that an image that implements layering will fail
      the probe early because of the feature check.)
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: implement feature checks · d889140c
      Alex Elder authored
      Version 2 images have two sets of feature bit fields.  The first
      indicates features possibly used by the image.  The second indicates
      features that the client *must* support in order to use the image.
      
      When an image (or snapshot) is first examined, we need to make sure
      that the local implementation supports the image's required
      features.  If not, fail the probe for the image.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
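The required-feature test described above amounts to a bit-mask comparison. A minimal sketch follows, assuming one bit per feature; the macro name, the function name, and the choice of -ENXIO as the error are illustrative assumptions rather than the driver's actual definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bit, for illustration only. */
#define FEATURE_LAYERING_SKETCH	((uint64_t)1 << 0)

/*
 * incompat:  features the client *must* support to use the image
 * supported: features this client implements
 * Returns 0 if every required feature is supported, otherwise an error
 * so that the probe can fail (the error code is an assumption).
 */
static int check_features_sketch(uint64_t incompat, uint64_t supported)
{
	if (incompat & ~supported)
		return -ENXIO;
	return 0;
}
```

A client that supports no optional features passes 0 for supported, so any image with a required feature bit set, such as layering in this sketch, is rejected.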
    • rbd: define rbd_dev_v2_refresh() · 117973fb
      Alex Elder authored
      Define a new function rbd_dev_v2_refresh() to update/refresh the
      snapshot context for a format version 2 rbd image.  This function
      will update anything that is not fixed for the life of an rbd
      image--at the moment this is mainly the snapshot context and (for
      a base mapping) the size.
      
      Update rbd_refresh_header() so it selects which function to use
      based on the image format.
      
      Rename __rbd_refresh_header() to be rbd_dev_v1_refresh()
      to be consistent with the naming of its version 2 counterpart.
      Similarly rename rbd_refresh_header() to be rbd_dev_refresh().
      
      Unrelated--we use rbd_image_format_valid() here.  Delete the other
      use of it, which was primarily put in place to ensure that function
      was referenced at the time it was defined.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
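Selecting a refresh routine by image format can be pictured as a simple dispatch. The sketch below is a user-space illustration with hypothetical names and a counter in place of real refresh work; returning -EINVAL for an unknown format is an assumption made for the example, not necessarily what the driver does.

```c
#include <errno.h>

/* Hypothetical device state; counters stand in for real refresh work. */
struct dev_sketch {
	int image_format;
	int v1_refreshes;
	int v2_refreshes;
};

static int image_format_valid_sketch(int format)
{
	return format == 1 || format == 2;
}

static int dev_v1_refresh_sketch(struct dev_sketch *dev)
{
	dev->v1_refreshes++;
	return 0;
}

static int dev_v2_refresh_sketch(struct dev_sketch *dev)
{
	dev->v2_refreshes++;
	return 0;
}

/* Pick the format-specific refresh routine for this device. */
static int dev_refresh_sketch(struct dev_sketch *dev)
{
	if (!image_format_valid_sketch(dev->image_format))
		return -EINVAL;		/* assumption: reject unknown formats */
	if (dev->image_format == 1)
		return dev_v1_refresh_sketch(dev);
	return dev_v2_refresh_sketch(dev);
}
```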
    • rbd: define rbd_update_mapping_size() · 9478554a
      Alex Elder authored
      Encapsulate the code that handles updating the size of a mapping
      after an rbd image has been refreshed.  This is done in anticipation
      of the next patch, which will make this common code for format 1 and
      2 images.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: define common queue_con_delay() · 802c6d96
      Alex Elder authored
      This patch defines a single function, queue_con_delay(), to call
      queue_delayed_work() for a connection.  It basically generalizes
      what was previously queue_con() by adding the delay argument.
      queue_con() is now a simple helper that passes 0 for its delay.
      queue_con_delay() returns 0 if it queued work or an errno if it
      did not for some reason.
      
      If con_work() finds the BACKOFF flag set for a connection, it now
      calls queue_con_delay() to handle arranging to start again after a
      delay.
      
      Note about connection reference counts:  con_work() only ever gets
      called as a work item function.  At the time that work is scheduled,
      a reference to the connection is acquired, and the corresponding
      con_work() call is then responsible for dropping that reference
      before it returns.
      
      Previously, the backoff handling inside con_work() silently handed
      off its reference to delayed work it scheduled.  Now that
      queue_con_delay() is used, a new reference is acquired for the
      newly-scheduled work, and the original reference is dropped by the
      con->ops->put() call at the end of the function.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
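The reference-count rule in this commit message (each queued work item owns one connection reference) can be modelled in user space. Everything below is a hypothetical sketch: a boolean replaces the real work queue, a plain counter replaces the connection's get/put operations, and -EBUSY is an assumed errno for the "already queued" case.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical connection model: a counter and a one-slot work queue. */
struct con_sketch {
	int refs;
	bool work_queued;
	unsigned long queued_delay;
};

static void con_get(struct con_sketch *con) { con->refs++; }
static void con_put(struct con_sketch *con) { con->refs--; }

/* Stand-in for queueing delayed work: fails if work is already queued. */
static bool queue_delayed_work_sketch(struct con_sketch *con,
				      unsigned long delay)
{
	if (con->work_queued)
		return false;
	con->work_queued = true;
	con->queued_delay = delay;
	return true;
}

/* Queue work after a delay; returns 0 if queued, an errno otherwise. */
static int queue_con_delay_sketch(struct con_sketch *con, unsigned long delay)
{
	con_get(con);			/* reference owned by the work item */
	if (!queue_delayed_work_sketch(con, delay)) {
		con_put(con);		/* nothing queued: give it back */
		return -EBUSY;		/* assumed error code */
	}
	return 0;
}

/* The no-delay helper simply passes 0 for the delay. */
static int queue_con_sketch(struct con_sketch *con)
{
	return queue_con_delay_sketch(con, 0);
}
```

Because the queueing helper takes its own reference, a work function that schedules follow-up work can still drop the reference it was entered with, and the count stays balanced.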
    • rbd: let con_work() handle backoff · 8618e30b
      Alex Elder authored
      Both ceph_fault() and con_work() include handling for imposing a
      delay before doing further processing on a faulted connection.
      The latter is used only if ceph_fault() is unable to.
      
      Instead, just let con_work() always be responsible for implementing
      the delay.  After setting up the delay value, set the BACKOFF flag
      on the connection unconditionally and call queue_con() to ensure
      con_work() will get called to handle it.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
    • rbd: reset BACKOFF if unable to re-queue · 588377d6
      Alex Elder authored
      If ceph_fault() is unable to queue work after a delay, it sets the
      BACKOFF connection flag so con_work() will attempt to do so.
      
      In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
      result in newly-queued work, it simply ignores this condition and
      proceeds as if no backoff delay were desired.  There are two
      problems with this--one of which is a bug.
      
      The first problem is simply that the intended behavior is to back
      off, and if we aren't able to queue the work item to run after a delay
      we're not doing that.
      
      The only reason queue_delayed_work() won't queue work is if the
      provided work item is already queued.  In the messenger, this
      means that con_work() is already scheduled to be run again.  So
      if we simply set the BACKOFF flag again when this occurs, we know
      the next con_work() call will again attempt to hold off activity
      on the connection until after the delay.
      
      The second problem--the bug--is a leak of a reference count.  If
      queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
      the connection reference held on entry to con_work().  However,
      processing is (was) allowed to continue, and at the end of the
      function a second con->ops->put() is called.
      
      This patch fixes both problems.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
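The control flow this commit message describes can be approximated with a small user-space model. The struct, the function names, and the io_passes counter are hypothetical; a boolean stands in for the work queue. The sketch shows one invocation of the work function: when backoff is requested and the delayed work is queued, the entry reference is handed to that work; when the work is already queued, the backoff flag is set again and the reference is dropped exactly once.

```c
#include <stdbool.h>

/* Hypothetical connection model for one work-function invocation. */
struct conn_sketch {
	int refs;		/* connection reference count */
	bool backoff;		/* models the BACKOFF flag */
	bool work_queued;	/* true if the work item is already queued */
	int io_passes;		/* counts normal processing passes */
};

/* Stand-in for queueing delayed work: fails if already queued. */
static bool try_queue_delayed_sketch(struct conn_sketch *con)
{
	if (con->work_queued)
		return false;
	con->work_queued = true;
	return true;
}

/* Entered holding one reference, as a work item function would be. */
static void con_work_sketch(struct conn_sketch *con)
{
	if (con->backoff) {
		con->backoff = false;
		if (try_queue_delayed_sketch(con))
			return;		/* reference handed to delayed work */
		con->backoff = true;	/* already queued: retry next run */
		goto done;		/* still back off: no processing now */
	}
	con->io_passes++;		/* normal processing would go here */
done:
	con->refs--;			/* the only put for the entry reference */
}
```

In neither backoff path does processing continue past the check, so the double put described as the bug cannot occur in this model.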
  4. 03 Oct, 2012 2 commits
  5. 01 Oct, 2012 8 commits