The problem with lots of OSDs per node is that it usually means you
have too few nodes. Running 60 OSDs per node is perfectly fine if you
have a total of 1000 OSDs or so.
But I've seen too many setups with 3-5 nodes where each node runs 60
OSDs, which makes no sense (and usually isn't even cheaper than more
nodes, especially once you consider the lost opportunity to run
erasure coding).

The typical backup cluster we see is in the single-digit petabyte
range, with about 12 to 24 disks per server running ~8+3 erasure
coding.
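
For illustration, an 8+3 profile with failure domain "host" (which
needs at least 11 OSD hosts) could be created roughly like this; the
profile name, pool name, and PG count below are placeholders, not a
recommendation:

    # hypothetical names and PG count, adjust for your cluster
    ceph osd erasure-code-profile set ec-8-3 k=8 m=3 crush-failure-domain=host
    ceph osd pool create backup-data 1024 1024 erasure ec-8-3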

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Wed, Oct 2, 2019 at 12:53 AM Darrell Enns <darre...@knowledge.ca> wrote:
>
> Thanks Paul. I was speaking more about total OSDs and RAM, rather than a 
> single node. However, I am considering building a cluster with a large 
> OSD/node count. This would be for archival use, with reduced performance and 
> availability requirements. What issues would you anticipate with a large 
> OSD/node count? Is the concern just the large rebalance if a node fails and 
> takes out a large portion of the OSDs at once?
>
> -----Original Message-----
> From: Paul Emmerich <paul.emmer...@croit.io>
> Sent: Tuesday, October 01, 2019 3:00 PM
> To: Darrell Enns <darre...@knowledge.ca>
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] RAM recommendation with large OSDs?
>
> On Tue, Oct 1, 2019 at 6:12 PM Darrell Enns <darre...@knowledge.ca> wrote:
> >
> > The standard advice is “1GB RAM per 1TB of OSD”. Does this actually still 
> > hold with large OSDs on bluestore?
>
> No
>
> > Can it be reasonably reduced with tuning?
>
> Yes
>
>
> > From the docs, it looks like bluestore should target the
> > “osd_memory_target” value by default. This is a fixed value (4GB by
> > default), which does not depend on OSD size. So shouldn’t the advice
> > really be “4GB per OSD”, rather than “1GB per TB”? Would it also be
> > reasonable to reduce osd_memory_target for further RAM savings?
>
> Yes
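>
> For example, a 2 GB target could be set like this (value in bytes;
> illustrative, not a recommendation):
>
>     ceph config set osd osd_memory_target 2147483648
>
> or, equivalently, in ceph.conf:
>
>     [osd]
>     osd_memory_target = 2147483648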
>
> > For example, suppose we have 90 12TB OSD drives:
>
> Please don't put 90 drives in one node; that's not a good idea in
> 99.9% of the use cases.
>
> >
> > “1GB per TB” rule: 1080GB RAM
> > “4GB per OSD” rule: 360GB RAM
> > “2GB per OSD” (osd_memory_target reduced to 2GB): 180GB RAM
> >
> >
> >
> > Those are some massively different RAM values. Perhaps the old advice was 
> > for filestore? Or there is something to consider beyond the bluestore 
> > memory target? What about when using very dense nodes (for example, 60 12TB 
> > OSDs on a single node)?
>
> Keep in mind that it's only a target value; the OSD will use more
> during recovery if you set a low value.
> We usually set a target of 3 GB per OSD and recommend 4 GB of RAM per OSD.
>
> RAM saving trick: use fewer PGs than recommended.
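>
> To see where an OSD's memory actually goes, you can dump its mempools
> via the admin socket (assuming osd.0 runs on the local node); the
> osd_pglog pool is the part that shrinks with fewer PGs:
>
>     ceph daemon osd.0 dump_mempools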
>
>
> Paul
>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
