Having used (and moved off of) Titan, I do not recommend it as a primary 
database.  Until it overcomes its extremely unoptimized graph traversals, it 
will increase the load on your database by several orders of magnitude.  

As a secondary analytics database, it might do fine.  Just don’t rely on it for 
anything time-sensitive.  

Jon


On Nov 16, 2013, at 9:10 PM, Willie Slepecki <scpha...@gmail.com> wrote:

> Hi all.  I'm in the bar napkin phase of coming up with a big app.  The 
> application is going to be a large graph app, so I was drawn to Cassandra 
> because of Titan, and because Cassandra's replication is far superior to that 
> of Neo4j and the other open source systems I have looked at.
> 
> The last issue I'm dealing with before starting to write code is random file 
> storage.  The application will have the ability to upload whatever (images, 
> PDFs, etc.), and I need to put them somewhere.  (For the record, Amazon S3 is 
> not an option; long story.)  So I'm looking at a hugely expensive RAID array 
> or an insanely complex distributed file system.  Given the budget I'm dealing 
> with, most likely a distributed file system.
> 
> Now, in the past hour or so, I stumbled on CFS.  I think I know what it 
> is, and that it's not going to work for me, but I just wanted to make sure.  
> 
> From what I can tell, it is a file system that does not like small files (15k 
> images and such) because for each file you upload, it's going to allocate a 2 
> MB block.  
> 
> Second, it looks like it's similar to HDFS in that the "FS" is misleading, 
> and it should probably have been named CDS (Cassandra Data Store).  I 
> mean that in the sense that it wasn't designed to map a drive to and drop files 
> in with Explorer, but is intended more as a convenient way to upload large files 
> of structured data to your analytics engine (MapReduce or whatever) and 
> have back-end processes rip them apart and tell you cool things you didn't know.  
> Or for us really old guys, think of it as an easy way to dump a butt load of 
> data into your data warehouse without having to write an ETL, and instead you 
> write the ETL when you want to do something with it.
> 
> Third, it looks like it's commercial, from that stax something company.  
> 
> Am I wrong about any of this?
> 
> Thanks
> 
> -- 
> You want it fast, cheap, or right.  Pick two!!
