Re: Struggling to understand CFS and its use.

2014-01-09 Thread Ben Coverston
+1 to what Ed said. CFS is a good facilitator for running MR jobs on Cassandra to fill the HDFS requirement (you just want to run MR, but you don't want the whole Hadoop stack). The source data for your MR jobs should be in Cassandra KS/CFs. On Mon, Nov 18, 2013 at 3:21 PM, Edward Capriolo wrote

Re: Struggling to understand CFS and its use.

2013-11-18 Thread Edward Capriolo
CFS was written so that Brisk (now defunct) did not need a separate Hadoop HDFS stack (NN + DataNodes) to do MapReduce work. It is better used as an alternative to HDFS, not as a general-purpose distributed file system. On Mon, Nov 18, 2013 at 2:02 PM, Robert Coli wrote: > On Sat, Nov 16, 201

Re: Struggling to understand CFS and its use.

2013-11-18 Thread Robert Coli
On Sat, Nov 16, 2013 at 9:10 PM, Willie Slepecki wrote: > The last issue I'm dealing with before starting to write code is random > file storage. The application will have the ability to upload whatever, > images, PDF, etc., and I need to put them somewhere. (for the record, > Amazon S3 is not a

Re: Struggling to understand CFS and its use.

2013-11-17 Thread Jon Haddad
Having used (and moved off of) Titan, I do not recommend it as a primary database. Until it overcomes its extremely unoptimized graph traversals, it will increase the load on your database by several orders of magnitude. As a secondary analytics database, it might do fine. Just don’t rely on

Struggling to understand CFS and its use.

2013-11-16 Thread Willie Slepecki
Hi all. I'm in the bar napkin phase of coming up with a big app. The application is going to be a large graph app, so I was drawn to Cassandra because of Titan and because Cassandra's replication is far superior to that of Neo4j and other open source systems I have looked at. The last issue I'm dealing with