For SQL queries on Cassandra I used to use Presto: https://prestodb.io/

It's a nice tool from Facebook and seems to work well with Cassandra. You can 
use their JDBC driver with your favourite Java SQL tool. 

Inside my apps, I never needed to use SQL queries.

[]s
From: pavel.velik...@gmail.com 
Subject: Re: best supported spark connector for Cassandra

Hi Marcelo,

  Were you able to use the Spark SQL features of the Cassandra connector? I 
couldn't build a .jar that didn't conflict with the native Spark SQL .jar, 
so I ended up using only the basic features and cannot use SQL queries.


On Feb 13, 2015, at 7:49 PM, Paulo Ricardo Motta Gomes 
<paulo.mo...@chaordicsystems.com> wrote:
I used to use Calliope, which was really awesome before DataStax's native 
integration with Spark. Now I'm quite happy with the official DataStax Spark 
connector; it's very straightforward to use.

I have never tried these drivers with Java, though. I'd suggest using them 
with Scala, which is the best option for writing Spark jobs.

On Fri, Feb 13, 2015 at 12:12 PM, Carlos Rolo <r...@pythian.com> wrote:

Not for sure ;)

If you need Cassandra support I can forward you to someone to talk to at 
Pythian.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
 
Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Fri, Feb 13, 2015 at 3:05 PM, Marcelo Valle (BLOOMBERG/ LONDON) 
<mvallemil...@bloomberg.net> wrote:

Actually, I am not the one looking for support, but I thank you a lot anyway.
But from your message I guess the answer is yes, Datastax is not the only 
Cassandra vendor offering support and changing official Cassandra source at 
this moment, is this right?

From: user@cassandra.apache.org 
Subject: Re: best supported spark connector for Cassandra

Of course, Stratio Deep and Stratio Cassandra are licensed under Apache 2.0.

Regarding Cassandra support, I can introduce you to someone at Stratio who 
can help you. 

2015-02-12 15:05 GMT+01:00 Marcelo Valle (BLOOMBERG/ LONDON) 
<mvallemil...@bloomberg.net>:

Thanks for the hint Gaspar. 
Do you know if Stratio Deep / Stratio Cassandra are also licensed Apache 2.0?

I was interested in knowing more about Stratio when I was working at a startup. 
Now, at a blue chip, it seems one of the hardest obstacles to using Cassandra in 
a project is the need for a team to support it, and people seem especially 
concerned about how many vendors provide support for an open source solution. 

This seems to be something of an advantage for HBase, as there are many vendors 
supporting it, but I wonder if Stratio can be considered an alternative to 
Datastax regarding Cassandra support?

It's not my call to decide anything here, but as part of the community it helps 
to have this business scenario clear. Cassandra could be the best technical fit 
for some projects, but sometimes non-technical factors come into play, like 
this need for having more than one vendor available...


From: gmu...@stratio.com 
Subject: Re: best supported spark connector for Cassandra

My suggestion is to use Java or Scala instead of Python. For Java/Scala, both 
the Datastax and Stratio drivers are valid and similar options. As far as I 
know, they both take care of data locality and are not based on the Hadoop 
interface. The advantage of Stratio Deep is that it allows you to integrate 
Spark not only with Cassandra but with MongoDB, Elasticsearch, Aerospike and 
others as well. 
Stratio has forked Cassandra to include some additional features, such as 
Lucene-based secondary indexes. The Stratio driver works fine with Apache 
Cassandra and also with their fork.

You can find some examples of using Deep here: 
https://github.com/Stratio/deep-examples  If you need help with Stratio Deep, 
please do not hesitate to contact us.


2015-02-11 17:18 GMT+01:00 shahab <shahab.mok...@gmail.com>:

I am using the Calliope cassandra-spark connector 
(http://tuplejump.github.io/calliope/), which is quite handy and easy to use!
The only problem is that it is a bit outdated: it works with Spark 1.1.0. 
Hopefully a new version comes soon.

best,
/Shahab

On Wed, Feb 11, 2015 at 2:51 PM, Marcelo Valle (BLOOMBERG/ LONDON) 
<mvallemil...@bloomberg.net> wrote:

I just finished a scala course, nice exercise to check what I learned :D

Thanks for the answer!

From: user@cassandra.apache.org 
Subject: Re: best supported spark connector for Cassandra

Start looking at the Spark/Cassandra connector here (in Scala): 
https://github.com/datastax/spark-cassandra-connector/tree/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector

Data locality is provided by this method: 
https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L329-L336

Start digging from this all the way down the code.
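
To make the idea of data locality concrete before reading the real code, here 
is a toy sketch in plain Python. It is not the connector's implementation: the 
ring layout, host addresses and function names are all made up for 
illustration, and it ignores token-ring wrap-around. The only thing it borrows 
from Spark is the concept that each RDD partition reports preferred locations 
(Spark's RDD API calls this getPreferredLocations), which the scheduler uses to 
place tasks next to the data.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Hypothetical ring: (end_token, replica hosts owning the range ending there).
RING: List[Tuple[int, List[str]]] = [
    (-3000, ["10.0.0.1", "10.0.0.2"]),
    (0, ["10.0.0.2", "10.0.0.3"]),
    (3000, ["10.0.0.3", "10.0.0.1"]),
]


@dataclass
class Partition:
    index: int
    start_token: int  # exclusive lower bound
    end_token: int    # inclusive upper bound
    replicas: List[str]


def build_partitions(ring, min_token: int) -> List[Partition]:
    """Turn each token range into one partition that remembers its replicas."""
    partitions = []
    start = min_token
    for i, (end, replicas) in enumerate(ring):
        partitions.append(Partition(i, start, end, list(replicas)))
        start = end
    return partitions


def preferred_locations(partition: Partition) -> List[str]:
    """Hosts a scheduler should prefer when running a task for this partition."""
    return partition.replicas


def assign(partitions: List[Partition], executors: Set[str]) -> Dict[int, str]:
    """Place each partition on an executor co-located with a replica if possible."""
    placement = {}
    for p in partitions:
        local = [h for h in preferred_locations(p) if h in executors]
        # Fall back to any executor (a non-local read) when no replica host is free.
        placement[p.index] = local[0] if local else sorted(executors)[0]
    return placement


if __name__ == "__main__":
    parts = build_partitions(RING, min_token=-9000)
    print(assign(parts, {"10.0.0.1", "10.0.0.2"}))
```

With executors running on the same machines as the Cassandra nodes, every 
partition in this toy example is read from a local replica; with executors 
elsewhere, reads fall back to going over the network.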

As for Stratio Deep, I can't tell how they did the integration with Spark. Take 
some time to dig into their code to understand the logic. 


On Wed, Feb 11, 2015 at 2:25 PM, Marcelo Valle (BLOOMBERG/ LONDON) 
<mvallemil...@bloomberg.net> wrote:

Since Spark was being discussed in another thread, I decided to start a new 
one, as I am interested in using Spark + Cassandra in the future.

About 3 years ago, Spark was not yet an option and we tried to use Hadoop 
to process Cassandra data. My experience was horrible and we reached the 
conclusion it was faster to develop an internal tool than to insist on Hadoop 
_for our specific case_. 

As I see it, Spark is starting to be known as a "better Hadoop" and it seems 
the market is going this way now. I can also see I have many more options for 
integrating Cassandra using the Spark RDD concept than using the 
ColumnFamilyInputFormat. 

I have found this java driver made by Datastax: 
https://github.com/datastax/spark-cassandra-connector

I have also found Python Cassandra support in Spark's repo, but it still seems 
experimental: 
https://github.com/apache/spark/tree/master/examples/src/main/python

Finally, I have found Stratio Deep: https://github.com/Stratio/deep-spark
It seems the Stratio guys have also forked Cassandra; I am still a little 
confused about it.

Question: which driver should I use if I want to use Java? And which if I want 
to use Python? 
From my past experience, I think the way Spark integrates with Cassandra makes 
all the difference in the world, so I would like to know more about it, but 
I don't even know which source code I should start looking at...
I would like to integrate using Python and/or C++, but I wonder whether it 
would pay off to use the Java driver instead.

Thanks in advance


-- 

Gaspar Muñoz 
@gmunozsoria

Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 352 59 42 // @stratiobd


-- 
Paulo Motta

Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200

