1. 14 Jan, 2014 1 commit
  2. 31 Dec, 2013 38 commits
  3. 13 Dec, 2013 1 commit
    • Josh Durgin's avatar
      libceph: resend all writes after the osdmap loses the full flag · 9a1ea2db
      Josh Durgin authored
      With the current full handling, there is a race between osds and
      clients getting the first map marked full. If the osd wins, it will
      return -ENOSPC to any writes, but the client may already have writes
      in flight. This results in the client getting the error and
      propagating it up the stack. For rbd, the block layer turns this into
      EIO, which can cause corruption in filesystems above it.
      
      To avoid this race, osds are being changed to drop writes that came
      from clients with an osdmap older than the last osdmap marked full.
      In order for this to work, clients must resend all writes after they
      encounter a full -> not full transition in the osdmap. osds will wait
      for an updated map instead of processing a request from a client with
      a newer map, so resent writes will not be dropped by the osd unless
      there is another not full -> full transition.
      
      This approach requires both osds and clients to be fixed to avoid the
      race. Old clients talking to osds with this fix may hang instead of
      returning EIO and potentially corrupting an fs. New clients talking to
      old osds have the same behavior as before if they encounter this race.
      
      Fixes: http://tracker.ceph.com/issues/6938Reviewed-by: default avatarSage Weil <sage@inktank.com>
      Signed-off-by: default avatarJosh Durgin <josh.durgin@inktank.com>
      9a1ea2db