On 6 November 2013 14:08, Andrey Korolyov <and...@xdel.ru> wrote:

> > We are looking at building high density nodes for small scale 'starter'
> > deployments for our customers (maybe 4 or 5 nodes).  High density in this
> > case could mean a 2u chassis with 2x external 45 disk JBOD containers
> > attached.  That's 90 3TB disks/OSDs to be managed by a single node.
> > That's about 243TB of potential usable space, and so (assuming up to 75%
> > fillage) maybe 182TB of potential data 'loss' in the event of a node
> > failure.  On an uncongested, unused, 10Gbps network, my
> > back-of-a-beer-mat calculations say that would take about 45 hours to get
> > the cluster back into an undegraded state - that is, the requisite number
> > of copies of all objects.
> >
>
> For such a large number of disks you should consider that cache
> amortization will not happen even if you are using 1GB controller(s) -
> only a tiered cache can be an option. Recovery will also take much more
> time, even if you leave room for client I/O in the calculations, because
> raw disks have very limited IOPS capacity: recovery will either take much
> longer than such estimates suggest at a glance, or it will affect regular
> operations. For S3/Swift that may be acceptable, but for VM images it is
> not.


Sure, but my argument was that you are never actually likely to let that
entire recovery operation complete - you're going to replace the hardware,
plug the disks back in, and let them catch up by log replay/backfill. That
assumes you don't ever really expect to lose all the data on 90 disks in
one go...
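
For what it's worth, here is the beer-mat arithmetic as a quick Python
sketch. The ~10% usable-space deduction, the number of surviving disks, and
the per-disk spare recovery rate are my own guesses rather than measured
figures, so treat the output as an order-of-magnitude illustration of why
spindle limits can matter as much as the network:

    # Back-of-a-beer-mat recovery estimate (rough sketch; the 0.9 usable
    # factor, surviving disk count and per-disk spare rate are guesses).
    TB = 10**12
    GB = 10**9

    disks      = 90                          # OSDs behind the failed node
    disk_size  = 3 * TB                      # 3TB spindles
    usable     = disks * disk_size * 0.9     # ~243TB after ~10% overhead (guess)
    data_lost  = usable * 0.75               # ~182TB at 75% fillage

    # Network-bound: one uncongested 10Gbps link at wire speed.
    net_hours = data_lost / (10 * GB / 8) / 3600

    # Spindle-bound: surviving disks can only spare so much throughput for
    # backfill while still serving clients; recovery I/O is largely random,
    # so the spare rate per disk may be only a few MB/s (illustrative).
    surviving_disks   = 3 * 90               # e.g. three similar nodes left
    recovery_per_disk = 4 * 10**6            # 4 MB/s spare per disk (guess)
    spindle_hours = data_lost / (surviving_disks * recovery_per_disk) / 3600

    print(f"data to re-replicate : {data_lost / TB:.0f} TB")
    print(f"network-bound        : {net_hours:.1f} h")   # ~40 h at wire speed
    print(f"spindle-bound        : {spindle_hours:.1f} h")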

By tiered caching, do you mean using something like flashcache or bcache?
