On 9/23/19 9:38 AM, Robert LeBlanc wrote:
> On Wed, Sep 18, 2019 at 11:47 AM Shawn A Kwang <kwa...@uwm.edu> wrote:
>>
>> We are planning our ceph architecture and I have a question:
>>
>> How should NVMe drives be used when our spinning storage devices use
>> bluestore?
>>
>> 1. block WAL and DB partitions
>>    (https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/)
>> 2. Cache tier
>>    (https://docs.ceph.com/docs/nautilus/rados/operations/cache-tiering/)
>> 3. Something else?
>>
>> Hardware - each node has:
>> 3x 8 TB HDD
>> 1x 450 GB NVMe drive
>> 192 GB RAM
>> 2x Xeon CPUs (24 cores total)
>>
>> I plan to run three OSD daemons on each node. There are 95 nodes in
>> total with the same hardware.
>>
>> Use case:
>>
>> The plan is to create a CephFS filesystem and use it to store people's
>> home directories and data. I anticipate more read operations than writes.
>>
>> Regarding cache tiering: the online documentation says cache tiering
>> will often degrade performance, but in various threads on this ML there
>> do seem to be people using cache tiering with success. I see that it is
>> heavily dependent on one's use case. As of 2019, are there any updated
>> recommendations on whether to use cache tiering?
>>
>> If people have a third suggestion, I would be interested in hearing it.
>> Thanks in advance.
>
> I've had good success when I've been able to hold all the 'hot' data
> for 24 hours in a cache tier. That reduces the amount of data being
> evicted from and added to the tier, so you reduce the penalty of those
> operations. You can adjust the config (hit rate, etc.) to help reduce
> promotions of rarely accessed objects. An NVMe drive of that size may be
> best suited for WAL (I highly recommend that for any HDD install) for
> each OSD; then carve out the rest as an SSD pool that you can put the
> CephFS metadata pool on. I don't think you would have a good experience
> with a cache tier at that size. However, you know your access patterns
> far better than I do, and it may be a good fit.
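For readers wondering what "adjust the config (hit rate, etc.)" looks like in
practice, here is a minimal sketch of the cache-tiering commands, assuming
hypothetical pool names cephfs_data and cephfs_cache; the numbers are
placeholders, not tuned values:

    # Attach a hypothetical NVMe-backed pool (cephfs_cache) as a writeback
    # cache tier over the CephFS data pool (cephfs_data).
    ceph osd tier add cephfs_data cephfs_cache
    ceph osd tier cache-mode cephfs_cache writeback
    ceph osd tier set-overlay cephfs_data cephfs_cache

    # Track hits with bloom-filter hit sets: 24 one-hour hit sets roughly
    # matches the idea of holding the hot working set for 24 hours.
    ceph osd pool set cephfs_cache hit_set_type bloom
    ceph osd pool set cephfs_cache hit_set_count 24
    ceph osd pool set cephfs_cache hit_set_period 3600

    # Require an object to appear in at least two recent hit sets before it
    # is promoted, which reduces promotions of rarely accessed objects.
    ceph osd pool set cephfs_cache min_read_recency_for_promote 2
    ceph osd pool set cephfs_cache min_write_recency_for_promote 2

    # Bound the tier so flushing and eviction stay predictable.
    ceph osd pool set cephfs_cache target_max_bytes 300000000000
    ceph osd pool set cephfs_cache cache_target_dirty_ratio 0.4
    ceph osd pool set cephfs_cache cache_target_full_ratio 0.8

Raising the recency settings means an object must show up in several
consecutive hit sets before it is copied into the tier, which is what keeps
rarely accessed objects from churning the cache.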
Robert,

I like your idea of partitioning each SSD for bluestore's DB [1], and then
using the extra space for the CephFS metadata pool.

[1] Question: you wrote 'WAL', but did you mean block.wal or block.db? Or both?

Sincerely,
Shawn

--
Associate Scientist
Center for Gravitation, Cosmology, and Astrophysics
University of Wisconsin-Milwaukee
office: +1 414 229 4960
kwa...@uwm.edu
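On the block.wal vs. block.db question: with ceph-volume, a device given as
--block.db holds both the RocksDB data and the WAL, so a separate --block.wal
only makes sense when there is a third, even faster device. A rough sketch of
one node's layout along the lines discussed above, using hypothetical device,
partition, and pool names:

    # Hypothetical layout for one node: three DB partitions on the NVMe drive
    # plus a fourth partition left over for an NVMe-backed OSD (device names
    # and the partition split are placeholders, not recommendations).
    #
    # Giving only --block.db co-locates the WAL on the same partition, so a
    # separate --block.wal is unnecessary here.
    ceph-volume lvm create --bluestore --data /dev/sda --block.db /dev/nvme0n1p1
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p2
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p3

    # The remaining NVMe space becomes its own OSD (normally classed as "ssd").
    ceph-volume lvm create --bluestore --data /dev/nvme0n1p4

    # Steer the CephFS metadata pool (assumed to be named cephfs_metadata)
    # onto the fast OSDs with a device-class CRUSH rule.
    ceph osd crush rule create-replicated fast-meta default host ssd
    ceph osd pool set cephfs_metadata crush_rule fast-meta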