    rbd: prefix rbd writes with CEPH_OSD_OP_SETALLOCHINT osd op · 0ccd5926
    Ilya Dryomov authored
    In an effort to reduce fragmentation, prefix every rbd write with
    a CEPH_OSD_OP_SETALLOCHINT osd op with an expected_write_size value set
    to the object size (1 << order).  Backwards compatibility is taken care
    of on the libceph/osd side.
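
The size calculation described above can be sketched in isolation. This is an illustrative helper only: `example_obj_bytes` is a hypothetical name, not a function from rbd.c, and the sketch assumes the order is a small unsigned value below 64.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from rbd.c): size in bytes of one rbd
 * object for a given object order, i.e. 1 << order.
 */
static uint64_t example_obj_bytes(uint8_t order)
{
	assert(order < 64);          /* shifting by 64 or more is undefined */
	return (uint64_t)1 << order; /* widen first so the shift is 64-bit */
}
```

With an order of 22, for example, the helper returns 4194304 (4 MiB), which is the value that would be passed as expected_write_size. Widening to 64 bits before shifting avoids overflow of a plain int for orders of 31 and above.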
    
    "The CEPH_OSD_OP_SETALLOCHINT hint is durable, in that it's enough to
    do it once.  The reason every rbd write is prefixed is that rbd doesn't
    explicitly create objects and relies on writes creating them
    implicitly, so there is no place to stick a single hint op into.  To
    get around that we decided to prefix every rbd write with a hint (just
    like write and setattr ops, hint op will create an object implicitly if
    it doesn't exist)."
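
The op ordering described in the quote can be sketched as a two-op request in which the hint always precedes the write. Everything here is a simplified stand-in: the `example_*` types, the enum values, and `example_build_write` are hypothetical and do not reproduce the libceph request structures or the real CEPH_OSD_OP_* opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative op kinds; these are not the real CEPH_OSD_OP_* values. */
enum example_op_kind { EXAMPLE_OP_SETALLOCHINT, EXAMPLE_OP_WRITE };

/* Simplified stand-in for one op slot in an OSD request. */
struct example_op {
	enum example_op_kind kind;
	uint64_t offset;              /* used by the write op only */
	uint64_t length;              /* used by the write op only */
	uint64_t expected_write_size; /* used by the hint op only */
};

#define EXAMPLE_WRITE_NUM_OPS 2 /* hint + write */

/*
 * Fill ops[] for a single object write: slot 0 carries the allocation
 * hint, slot 1 the write itself. Returns the number of ops filled in.
 */
static size_t example_build_write(struct example_op ops[EXAMPLE_WRITE_NUM_OPS],
				  uint8_t order, uint64_t offset,
				  uint64_t length)
{
	size_t which = 0;
	uint64_t obj_bytes;

	assert(order < 64);
	obj_bytes = (uint64_t)1 << order;
	/* The write must fit inside one object. */
	assert(offset <= obj_bytes && length <= obj_bytes - offset);

	ops[which++] = (struct example_op){
		.kind = EXAMPLE_OP_SETALLOCHINT,
		.expected_write_size = obj_bytes,
	};
	ops[which++] = (struct example_op){
		.kind = EXAMPLE_OP_WRITE,
		.offset = offset,
		.length = length,
	};
	return which;
}
```

Because the hint travels in the same request as the write, no separate object-creation step is needed: whichever write reaches a not-yet-existing object first also delivers the hint.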
    Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
    Reviewed-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@linaro.org>
rbd.c 138 KB