We are in the (very) early stages of considering whether to test Ceph, rather 
than HDFS, as the backing store for Hadoop.  I've seen a few vague references 
to doing that, but haven't found any concrete info (architecture, configuration 
recommendations, gotchas, lessons learned, etc.).  I did find the ceph.com docs 
page [1], which discusses using CephFS to back Hadoop, but that would be 
foolish for production clusters given that CephFS isn't yet considered 
production grade.

Does anyone in the ceph-users community have experience with this that they'd 
be willing to share?  Preferably using Ceph directly rather than through 
CephFS, but I am interested in any CephFS-related experiences too.

If we were to do this, and Ceph proved out as a backing store for Hadoop, we 
could end up building a fairly large, multi-petabyte (100s ??) class Ceph 
backing store.  We do a very large amount of analytics on a lot of data sets 
for security trending correlations, etc.

Our current Ceph experience is limited to a few small clusters (90 x 4TB OSD 
size), which we are working towards putting into production to back 
Glance/Cinder and to provide block storage for various platforms with large 
storage needs (e.g. software and package repos/mirrors).

Thanks in advance for any input, thoughts, or pointers ...

~~shane

[1] http://ceph.com/docs/master/cephfs/hadoop/


