Thanks for the input, Greg. We've submitted the patch to the Ceph GitHub repo: https://github.com/ceph/ceph/pull/21222

Kevin

On 04/02/2018 01:10 PM, Gregory Farnum wrote:
On Mon, Apr 2, 2018 at 8:21 AM Kevin Hrpcek <kevin.hrp...@ssec.wisc.edu> wrote:

    Hello,

    We use the Python librados bindings for object operations on our
    cluster. For a long time we've been using two EC pools with k=4 m=1
    and a fixed 4MB read/write size with the Python bindings. While
    preparing to migrate all of our data to a k=6 m=2 pool, we
    discovered that the EC pool alignment size is dynamic, and that the
    librados bindings for Python and Go fail to write objects because
    they are not aware of the pool alignment size and therefore cannot
    adjust the write block size to be a multiple of it. The EC pool
    alignment size seems to be (k value * 4K) on new pools, but is only
    4K on old pools from the Hammer days. We haven't been able to find
    much useful documentation for this pool alignment setting other
    than the librados docs
    (http://docs.ceph.com/docs/master/rados/api/librados):
    rados_ioctx_pool_requires_alignment,
    rados_ioctx_pool_requires_alignment2,
    rados_ioctx_pool_required_alignment,
    rados_ioctx_pool_required_alignment2. After going through the
    source of the rados binary, we found that it rounds the write op
    size for an EC pool up to a multiple of the pool alignment size
    (around line 1945:
    https://github.com/ceph/ceph/blob/master/src/tools/rados/rados.cc#L1945).
    The minimum write op size can be determined by writing to an EC
    pool with something like `rados -b 1k -p $pool put ...`, which gets
    the binary to round the size up and print it out. The support for
    being alignment aware clearly exists in librados, but it simply
    isn't exposed in the bindings (we've only tested Python and Go).
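The rounding the rados binary does can be sketched in a few lines of Python (a minimal illustration of the rados.cc logic, not the actual binding API):

```python
def align_op_size(op_size, alignment):
    # Round a requested write size up to the nearest multiple of the
    # pool's required alignment (mirrors the rounding the rados binary
    # does around rados.cc line 1945).
    if alignment == 0:
        return op_size
    return ((op_size + alignment - 1) // alignment) * alignment

# `rados -b 1k` against a pool with 4K alignment rounds up to 4096;
# a 4MB op size on a k=6 pool (6 * 4K = 24576 alignment) becomes 4202496.
```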

    We've gone ahead and submitted a patch and pull request to the
    pycradox project, which appears to be what was merged into the Ceph
    project for the Python bindings:
    https://github.com/sileht/pycradox/pull/4. It exposes the pool's
    alignment size in the Python bindings so that we can calculate the
    proper op sizes for writing to a pool.
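With the alignment exposed, a chunked write helper might look something like this (a sketch only; the `pool_requires_alignment()` / `pool_required_alignment()` method names are illustrative, so check the PR for the exact API):

```python
def write_aligned(ioctx, oid, data, op_size=4 * 1024 * 1024):
    # Write `data` to object `oid` in chunks whose size is a multiple
    # of the pool's required alignment; only the final chunk may be
    # shorter. Method names on `ioctx` are hypothetical.
    if ioctx.pool_requires_alignment():
        align = ioctx.pool_required_alignment()
        # Round the op size up to a multiple of the alignment.
        op_size = ((op_size + align - 1) // align) * align
    for off in range(0, len(data), op_size):
        ioctx.write(oid, data[off:off + op_size], off)
```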

    We find it hard to believe that we're the only ones to have run
    into this problem when using the bindings. Have we missed
    something obvious in our cluster configuration? Or maybe we're
    just doing things differently compared to most users... Any
    insight would be appreciated, as we'd prefer a long-term official
    solution rather than our bindings fix.


It's not impossible you're the only user both using the python bindings and targeting EC pools. Even now with overwrites they're limited in terms of object class and omap support, and I think all the direct-access users I've heard about required at least one of omap or overwrites.

Just submit the patch to the Ceph GitHub repo and it'll get fixed up! :)
-Greg


    Tested on Luminous 12.2.2 and 12.2.4.

    Thanks,
    Kevin

    --
    Kevin Hrpcek
    Linux Systems Administrator
    NASA SNPP Atmospheric SIPS
    Space Science & Engineering Center
    University of Wisconsin-Madison

    _______________________________________________
    ceph-users mailing list
    ceph-users@lists.ceph.com
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

