Start looking at the Spark/Cassandra connector here (in Scala):
https://github.com/datastax/spark-cassandra-connector/tree/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector

Data locality is provided by this method:
https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L329-L336

Start digging from there and work your way down through the code.
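
To make the locality idea concrete: Spark asks each RDD partition for its preferred hosts (in the RDD API this hook is getPreferredLocations), and the scheduler then tries to run the task on one of those hosts. The toy Python model below only illustrates that idea and is not the connector's code. TokenRangePartition, preferred_locations and assign_tasks are hypothetical names, and the host addresses are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TokenRangePartition:
    """Hypothetical stand-in for an RDD partition covering one Cassandra token range."""
    index: int
    start_token: int
    end_token: int
    replicas: Tuple[str, ...]  # hosts assumed to hold replicas of this token range


def preferred_locations(partition: TokenRangePartition) -> List[str]:
    """Plays the role of getPreferredLocations: report the hosts that own the data."""
    return list(partition.replicas)


def assign_tasks(partitions: List[TokenRangePartition],
                 executors: List[str]) -> Dict[int, str]:
    """Toy scheduler: prefer an executor on a replica host, else fall back to any executor."""
    assignment = {}
    for p in partitions:
        local = [host for host in preferred_locations(p) if host in executors]
        # Fallback is a simple round-robin pick; a real scheduler is far more involved.
        assignment[p.index] = local[0] if local else executors[p.index % len(executors)]
    return assignment


partitions = [
    TokenRangePartition(0, -100, 0, ("10.0.0.1", "10.0.0.2")),
    TokenRangePartition(1, 0, 100, ("10.0.0.3",)),
]
executors = ["10.0.0.2", "10.0.0.4"]
assignment = assign_tasks(partitions, executors)
```

Here partition 0 lands on 10.0.0.2 because that executor sits on a replica host, while partition 1 has no local executor and has to read its data over the network.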

As for Stratio Deep, I can't tell how they did the integration with Spark.
Take some time to dig into their code to understand the logic.



On Wed, Feb 11, 2015 at 2:25 PM, Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net> wrote:

> Since Spark was being discussed in another thread, I decided to start a
> new one, as I am interested in using Spark + Cassandra in the future.
>
> About 3 years ago, Spark was not yet an option and we tried to use
> Hadoop to process Cassandra data. My experience was horrible and we reached
> the conclusion it was faster to develop an internal tool than to insist on
> Hadoop _for our specific case_.
>
> Now I can see Spark is starting to be known as a "better Hadoop", and it
> seems the market is going this way. I can also see I have many more options
> for integrating Cassandra using the Spark RDD concept than using
> the ColumnFamilyInputFormat.
>
> I have found this Java driver made by DataStax:
> https://github.com/datastax/spark-cassandra-connector
>
> I have also found Python Cassandra support in Spark's repo, but it still
> seems experimental:
> https://github.com/apache/spark/tree/master/examples/src/main/python
>
> Finally, I have found Stratio Deep: https://github.com/Stratio/deep-spark
> It seems the Stratio guys have also forked Cassandra; I am still a little
> confused about it.
>
> Question: which driver should I use if I want to use Java? And which if I
> want to use Python?
> I think the way Spark can integrate with Cassandra makes all the difference
> in the world, from my past experience, so I would like to know more about
> it, but I don't even know which source code I should start looking at...
> I would like to integrate using Python and/or C++, but I wonder whether it
> would pay off to use the Java driver instead.
>
> Thanks in advance
>
>
>
>
