On 9/23/19 9:38 AM, Robert LeBlanc wrote:
> On Wed, Sep 18, 2019 at 11:47 AM Shawn A Kwang <kwa...@uwm.edu> wrote:
>>
>> We are planning our ceph architecture and I have a question:
>>
>> How should the NVMe drives be used, given that our spinning storage
>> devices use BlueStore:
>>
>> 1. block WAL and DB partitions
>> (https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/)
>> 2. Cache tier
>> (https://docs.ceph.com/docs/nautilus/rados/operations/cache-tiering/)
>> 3. Something else?
>>
>> Hardware- Each node has:
>> 3x 8 TB HDD
>> 1x 450 GB NVMe drive
>> 192 GB RAM
>> 2x Xeon CPUs (24 cores total)
>>
>> I plan to have three OSD daemons running on each node. There are 95 nodes
>> total with the same hardware.
>>
>> Use Case:
>>
>> The plan is to create a CephFS filesystem and use it to store people's home
>> directories and data. I anticipate more read operations than writes.
>>
>> Regarding cache tiering: the online documentation says cache tiering
>> will often degrade performance, but when I read various threads on this
>> ML there do seem to be people using it with success. I can see that it
>> is heavily dependent upon one's use case. As of 2019, are there any
>> updated recommendations on whether to use cache tiering?
>>
>> If people have other suggestions, I would be interested in hearing
>> them. Thanks in advance.
> 
> I've had good success when I've been able to hold all the 'hot' data
> for 24 hours in a cache tier. That reduces the amount of data being
> evicted from and added to the tier, and with it the penalty of those
> operations. You can adjust the config (hit rate, etc.) to help reduce
> promotions for rarely accessed objects. At that size, the NVMe drive
> may be best suited to hold a WAL for each OSD (I highly recommend
> that for any HDD install); then carve out the rest as an SSD pool
> that you can put the CephFS metadata pool on. I don't think you would
> have a good experience with a cache tier at that size.
> However, you know your access patterns far better than I do and it may
> be a good fit.

Robert,

I like your idea of partitioning each node's NVMe drive for BlueStore's
DB [1] and then using the leftover space for the CephFS metadata pool.
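
Concretely, the per-node layout I'm sketching out looks roughly like
this. It is untested, and the device names, the ~60 GB block.db size,
and the pool/rule names are just placeholders I'm assuming for planning
purposes:

  # 450 GB NVMe: three ~60 GB block.db partitions (~180 GB), leaving
  # roughly 270 GB as a fourth partition for a small SSD OSD.
  parted -s /dev/nvme0n1 mklabel gpt
  parted -s /dev/nvme0n1 mkpart osd-db-0 1MiB 60GiB
  parted -s /dev/nvme0n1 mkpart osd-db-1 60GiB 120GiB
  parted -s /dev/nvme0n1 mkpart osd-db-2 120GiB 180GiB
  parted -s /dev/nvme0n1 mkpart osd-ssd 180GiB 100%

  # One OSD per HDD, with its DB on an NVMe partition. (My reading of
  # the docs is that the WAL lives on the DB device when no separate
  # --block.wal is given.)
  ceph-volume lvm create --bluestore --data /dev/sda --block.db /dev/nvme0n1p1
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p2
  ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p3

  # The leftover partition becomes a plain SSD OSD.
  ceph-volume lvm create --bluestore --data /dev/nvme0n1p4

  # Pin the CephFS metadata pool to the ssd device class.
  ceph osd crush rule create-replicated metadata-ssd default host ssd
  ceph osd pool set cephfs_metadata crush_rule metadata-ssd

The last two commands assume the OSDs come up with the expected device
classes (hdd vs. ssd), which ceph-volume normally detects on its own.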

[1] Question: You wrote 'WAL', but did you mean block.wal or block.db?
Or both?
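
Separately, on the cache-tiering part of my original question: if we do
end up experimenting with a small hot tier for some of the data, my
understanding of the tuning Robert describes (reducing promotions for
rarely accessed objects) is that it comes down to per-pool settings
along these lines (the pool name is just a placeholder):

  # Track object hits in 24 one-hour bloom-filter hit sets, roughly
  # matching the 24-hour "hot data" window Robert mentions.
  ceph osd pool set hot-tier hit_set_type bloom
  ceph osd pool set hot-tier hit_set_count 24
  ceph osd pool set hot-tier hit_set_period 3600

  # Only promote objects seen in at least two recent hit sets, so
  # rarely accessed objects stay on the HDDs.
  ceph osd pool set hot-tier min_read_recency_for_promote 2
  ceph osd pool set hot-tier min_write_recency_for_promote 2

Given that the 450 GB drive is mostly spoken for by the DB partitions,
though, the metadata-on-SSD approach above seems like the safer
starting point for us.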

Sincerely,
Shawn

-- 
Associate Scientist
Center for Gravitation, Cosmology, and Astrophysics
University of Wisconsin-Milwaukee
office: +1 414 229 4960
kwa...@uwm.edu

