Greg, Paul,

Thank you for the feedback. This has been very enlightening. One last
question (for now at least): are there any expected performance impacts
from doing I/O to multiple pools from the same client? (Given how RGW and
CephFS store metadata, I would hope not, but I thought I'd ask.) Based on
everything that has been described, it makes sense to keep metadata-heavy
objects (i.e., objects with a large fraction of kv data) in a
replicated pool while putting the large blobs in an EC pool.
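
To make the split concrete, here is a rough Python sketch of the routing
rule I have in mind. It does not touch librados at all; the pool names,
the ObjectProfile type, and both thresholds are hypothetical placeholders
that would need tuning against a real workload.

```python
from dataclasses import dataclass

# Hypothetical pool names, for illustration only.
REPLICATED_POOL = "meta-replicated"
EC_POOL = "blobs-ec"


@dataclass
class ObjectProfile:
    data_bytes: int  # size of the object's byte payload
    kv_bytes: int    # total size of its key-value metadata


def choose_pool(profile, kv_fraction_threshold=0.25,
                small_object_bytes=64 * 1024):
    """Pick a pool for an object (thresholds are assumed, not measured).

    Metadata-heavy or small objects go to the replicated pool, where omap
    is available; large blobs go to the EC pool.
    """
    total = profile.data_bytes + profile.kv_bytes
    if total == 0:
        return REPLICATED_POOL
    if profile.kv_bytes / total >= kv_fraction_threshold:
        return REPLICATED_POOL
    if total < small_object_bytes:
        return REPLICATED_POOL
    return EC_POOL


if __name__ == "__main__":
    print(choose_pool(ObjectProfile(data_bytes=8 * 1024 * 1024, kv_bytes=512)))
    print(choose_pool(ObjectProfile(data_bytes=1024, kv_bytes=4096)))
```

The client would then keep one I/O context open per pool and write each
object through whichever one the rule selects.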

Thanks again,
Ben

On Wed, Sep 12, 2018 at 1:05 PM Gregory Farnum <gfar...@redhat.com> wrote:

> On Tue, Sep 11, 2018 at 5:32 PM Benjamin Cherian <
> benjamin.cher...@gmail.com> wrote:
>
>> Ok, that’s good to know. I was planning on using an EC pool. Maybe I'll
>> store some of the larger kv pairs in their own objects or move the metadata
>> into its own replicated pool entirely. If the storage mechanism is the
>> same, is there a reason xattrs are supported and omap is not? (Or is there
>> some hidden cost to storing kv pairs in an EC pool I’m unaware of, e.g.,
>> does the kv data get replicated across all OSDs being used for a PG or
>> something?)
>>
>
> Yeah, if you're on an EC pool there isn't a good way to erasure-code
> key-value data. So we willingly replicate xattrs across all the nodes
> (since they are presumed to be small and limited in number — I think we
> actually have total limits, but not sure?) but don't support omap at all
> (as it's presumed to be a lot of data).
>
> Do note that if small objects are a large proportion of your data you
> might prefer to put them in a replicated pool — in an EC pool you'd need
> very small chunk sizes to get any non-replication happening, and for
> something in the 10KB range at a reasonable k+m you'd be dominated by
> metadata size anyway.
> -Greg
>
>
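
A back-of-the-envelope sketch of the small-object point above, in Python.
This is a deliberately simplified model, not how Ceph actually accounts
for space: alloc_unit (a per-shard allocation granularity) and
per_shard_meta (a fixed per-shard metadata cost) are assumed parameters
with made-up values.

```python
def round_up(n, unit):
    """Round n up to the next multiple of unit (integer arithmetic)."""
    return (n + unit - 1) // unit * unit


def replicated_raw_bytes(obj_size, replicas, alloc_unit, per_shard_meta):
    # Every replica stores the whole object plus its own metadata.
    return replicas * (round_up(obj_size, alloc_unit) + per_shard_meta)


def ec_raw_bytes(obj_size, k, m, alloc_unit, per_shard_meta):
    # Each of the k+m shards stores one chunk of ceil(obj_size / k) bytes,
    # rounded up to the allocation unit, plus its own metadata.
    chunk = (obj_size + k - 1) // k
    return (k + m) * (round_up(chunk, alloc_unit) + per_shard_meta)


if __name__ == "__main__":
    ALLOC_UNIT = 4096      # assumed allocation granularity, in bytes
    PER_SHARD_META = 1024  # assumed fixed metadata cost per shard, in bytes
    for size in (10 * 1024, 4 * 1024 * 1024):
        rep = replicated_raw_bytes(size, 3, ALLOC_UNIT, PER_SHARD_META)
        ec = ec_raw_bytes(size, 4, 2, ALLOC_UNIT, PER_SHARD_META)
        print(f"{size:>8} B object: 3x replicated = {rep / size:.2f}x raw, "
              f"EC 4+2 = {ec / size:.2f}x raw")
```

Under these assumptions a 10KB object on EC 4+2 costs about as much raw
space as nominal 3x replication, while a 4MB object gets close to the
nominal 1.5x. The exact numbers depend entirely on the two assumed
parameters.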
>>
>> Thanks,
>> Ben
>>
>> On Tue, Sep 11, 2018 at 1:46 PM Patrick Donnelly <pdonn...@redhat.com>
>> wrote:
>>
>>> On Tue, Sep 11, 2018 at 12:43 PM, Benjamin Cherian
>>> <benjamin.cher...@gmail.com> wrote:
>>> > On Tue, Sep 11, 2018 at 10:44 AM Gregory Farnum <gfar...@redhat.com> wrote:
>>> >>
>>> >> <snip>
>>> >> In general, if the key-value storage is of unpredictable or non-trivial
>>> >> size, you should use omap.
>>> >>
>>> >> At the bottom layer where the data is actually stored, they're likely to
>>> >> be in the same places (if using BlueStore, they are the same; in FileStore,
>>> >> a rados xattr *might* be in the local FS xattrs, or it might not). It is
>>> >> somewhat more likely that something stored in an xattr will get pulled into
>>> >> memory at the same time as the object's internal metadata, but that only
>>> >> happens if it's quite small (think the xfs or ext4 xattr rules).
>>> >
>>> >
>>> > Based on this description, if I'm planning on using BlueStore, there is no
>>> > particular reason to ever prefer using xattrs over omap (outside of ease of
>>> > use in the API), correct?
>>>
>>> You may prefer xattrs on BlueStore if the metadata is small and you
>>> may need to store the xattrs on an EC pool. omap is not supported on
>>> EC pools.
>>>
>>> --
>>> Patrick Donnelly
>>>
>>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
