Disclaimer: I have worked for DataStax.
Cassandra is fairly good for log analytics and has been used in many places
for that (
https://www.usenix.org/conference/lisa14/conference-program/presentation/josephsen
). Of course, requirements vary from place to place, but it has been a good
fit. Spark and
Though DSE Cassandra comes with Hadoop integration, this is clearly a use case
for Hadoop.
Any reason why Cassandra is your first choice?
> On 23 Jul 2015, at 6:12 a.m., Pierre Devops wrote:
>
> Cassandra is not very good at massive read/bulk read if you need to retrieve
> and compute a la
Cassandra is not very good at massive read/bulk read if you need to
retrieve and compute a large amount of data on multiple machines using
something like spark or hadoop (or you'll need to hack and process the
sstable directly, something which is not "natively" supported, you'll have
to hack your w
Problem: Log analytics.
Solutions:
1) Aggregating logs using Flume and storing the aggregations
into Cassandra. Spark reads data from Cassandra, makes some computations,
and writes the results to distinct tables, still in Cassandra.
2) Aggregating logs using Flume to a sink, streamin