Re: Cassandra & MapReduce/Storm/ etc

Jack Krupansky Fri, 16 May 2014 09:58:43 -0700

Here’s a meetup talk on analytics using Cassandra, Storm, and Kafka:
http://www.slideshare.net/aih1013/building-largescale-analytics-platform-with-storm-kafka-and-cassandra-nyc-storm-user-group-meetup-21st-nov-2013

-- Jack Krupansky

From: Manoj Khangaonkar 
Sent: Thursday, May 8, 2014 5:43 PM
To: user@cassandra.apache.org 
Subject: Cassandra & MapReduce/Storm/ etc

Hi,

Searching for Cassandra with MapReduce, I am finding that the search results 
are really dated -- from version 0.7 & 2010/2011.

Is there a good blog/article that describes how to use MapReduce on a Cassandra 
table?

From my naive understanding, Cassandra is all about partitioning. Querying is 
based on partition key + clustering column(s).

The input to MapReduce is a sequence of key/value pairs. For Storm it is a 
stream of tuples.

If a database table is the input source for MapReduce or Storm, then in the 
simple case this translates to a full table scan of the input table, which 
can time out and is generally not a recommended access pattern in Cassandra.
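
To make the full-scan concern concrete, here is a minimal sketch of how a scan 
can be broken into bounded token-range queries instead of one unbounded SELECT. 
It assumes the Murmur3 partitioner (token range -2^63 to 2^63-1) and uses a 
hypothetical table "events" with a hypothetical partition key "id"; it only 
builds the CQL strings and does not talk to a cluster. As far as I know, the 
Hadoop input formats that ship with Cassandra split work along token ranges in 
a broadly similar way, but this is an illustration, not their implementation.

```python
# Token bounds for Murmur3Partitioner (assumed here; other partitioners differ).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def token_ranges(splits):
    """Divide the whole token ring into `splits` contiguous (start, end) ranges."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (span * i) // splits for i in range(splits)]
    bounds.append(MAX_TOKEN)
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(table, partition_key, splits):
    """Build one CQL query per token range; table and key names are hypothetical."""
    queries = []
    for i, (start, end) in enumerate(token_ranges(splits)):
        # The first range includes its lower bound so no token is skipped;
        # later ranges exclude it because the previous range already covers it.
        lower = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE token({partition_key}) {lower} {start} "
            f"AND token({partition_key}) <= {end}"
        )
    return queries


if __name__ == "__main__":
    for q in range_queries("events", "id", 4):
        print(q)
```

Each query touches only a slice of the ring, so the slices can be handed to 
separate mappers or spout tasks and retried individually if one times out.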

My initial reaction is that if I need to process data with MapReduce or Storm, 
reading it from Cassandra might not be the optimal way. Storing the output to 
Cassandra however does make sense.

If anyone has links to blogs or personal experience in this area, I would 
appreciate it if you could share them.

regards
