Re: Cassandra - Spark - Flume: best architecture for log analytics.

2015-07-23 Thread Edward Ribeiro
Disclaimer: I have worked for DataStax. Cassandra is fairly good for log analytics and has been used in many places for that ( https://www.usenix.org/conference/lisa14/conference-program/presentation/josephsen ). Of course, requirements vary from place to place, but it has been a good fit. Spark and

Re: Cassandra - Spark - Flume: best architecture for log analytics.

2015-07-23 Thread Ipremyadav
Though DSE Cassandra comes with Hadoop integration, this is clearly a use case for Hadoop. Any reason why Cassandra is your first choice? > On 23 Jul 2015, at 6:12 a.m., Pierre Devops wrote: > > Cassandra is not very good at massive read/bulk read if you need to retrieve > and compute a la

Re: Cassandra - Spark - Flume: best architecture for log analytics.

2015-07-22 Thread Pierre Devops
Cassandra is not very good at massive read/bulk read if you need to retrieve and compute a large amount of data on multiple machines using something like spark or hadoop (or you'll need to hack and process the sstable directly, something which is not "natively" supported, you'll have to hack your w

Cassandra - Spark - Flume: best architecture for log analytics.

2015-07-22 Thread Renato Perini
Problem: Log analytics. Solutions: 1) Aggregating logs using Flume and storing the aggregations in Cassandra. Spark reads data from Cassandra, makes some computations and writes the results to distinct tables, still in Cassandra. 2) Aggregating logs using Flume to a sink, streamin