Hi Marcelo,

Were you able to use the Spark SQL features of the Cassandra connector? I couldn’t build a .jar that didn’t conflict with the native Spark SQL .jar, so I ended up using only the basic features and cannot run SQL queries.

> On Feb 13, 2015, at 7:49 PM, Paulo Ricardo Motta Gomes
> <paulo.mo...@chaordicsystems.com> wrote:
>
> I used to use calliope, which was really awesome before DataStax native
> integration with Spark. Now I'm quite happy with the official DataStax spark
> connector, it's very straightforward to use.
>
> I never tried to use these drivers with Java though, I'd suggest you to use
> them with Scala, which is the best option to write spark jobs.
>
> On Fri, Feb 13, 2015 at 12:12 PM, Carlos Rolo <r...@pythian.com> wrote:
> Not for sure ;)
>
> If you need Cassandra support I can forward you to someone to talk to at
> Pythian.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
> Tel: 1649
> www.pythian.com
>
> On Fri, Feb 13, 2015 at 3:05 PM, Marcelo Valle (BLOOMBERG/ LONDON)
> <mvallemil...@bloomberg.net> wrote:
> Actually, I am not the one looking for support, but I thank you a lot anyway.
> But from your message I guess the answer is yes, Datastax is not the only
> Cassandra vendor offering support and changing official Cassandra source at
> this moment, is this right?
>
> From: user@cassandra.apache.org
> Subject: Re: best supported spark connector for Cassandra
> Of course, Stratio Deep and Stratio Cassandra are licensed Apache 2.0.
>
> Regarding the Cassandra support, I can introduce you to someone in Stratio
> that can help you.
>
> 2015-02-12 15:05 GMT+01:00 Marcelo Valle (BLOOMBERG/ LONDON)
> <mvallemil...@bloomberg.net>:
> Thanks for the hint Gaspar.
> Do you know if Stratio Deep / Stratio Cassandra are also licensed Apache 2.0?
>
> I had interest in knowing more about Stratio when I was working on a start up.
> Now, on a blue chip, it seems one of the hardest obstacles to using
> Cassandra in a project is the need for an area supporting it, and it seems
> people are especially concerned about how many vendors an open source
> solution has to provide support.
>
> This seems to be kind of an advantage of HBase, as there are many vendors
> supporting it, but I wonder if Stratio can be considered an alternative to
> Datastax regarding Cassandra support?
>
> It's not my call here to decide anything, but as part of the community it
> helps to have this business scenario clear. I could say Cassandra could be
> the best fit technical solution for some projects but sometimes non-technical
> factors are in the game, like this need for having more than one vendor
> available...
>
> From: gmu...@stratio.com
> Subject: Re: best supported spark connector for Cassandra
> My suggestion is to use Java or Scala instead of Python. For Java/Scala both
> the Datastax and Stratio drivers are valid and similar options. As far as I
> know they both take care of data locality and are not based on the Hadoop
> interface. The advantage of Stratio Deep is that it allows you to integrate
> Spark not only with Cassandra but with MongoDB, Elasticsearch, Aerospike and
> others as well.
> Stratio has a forked Cassandra for including some additional features such as
> Lucene based secondary indexes. So the Stratio driver works fine with Apache
> Cassandra and also with their fork.
>
> You can find some examples of using Deep here:
> https://github.com/Stratio/deep-examples
> If you need some help with Stratio Deep, please do not hesitate to contact us.
>
> 2015-02-11 17:18 GMT+01:00 shahab <shahab.mok...@gmail.com>:
> I am using the Calliope cassandra-spark connector
> (http://tuplejump.github.io/calliope/), which is quite handy and easy to use!
> The only problem is that it is a bit outdated: it works with Spark 1.1.0.
> Hopefully a new version comes soon.
>
> best,
> /Shahab
>
> On Wed, Feb 11, 2015 at 2:51 PM, Marcelo Valle (BLOOMBERG/ LONDON)
> <mvallemil...@bloomberg.net> wrote:
> I just finished a scala course, nice exercise to check what I learned :D
>
> Thanks for the answer!
>
> From: user@cassandra.apache.org
> Subject: Re: best supported spark connector for Cassandra
> Start looking at the Spark/Cassandra connector here (in Scala):
> https://github.com/datastax/spark-cassandra-connector/tree/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector
>
> Data locality is provided by this method:
> https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L329-L336
>
> Start digging from this all the way down the code.
>
> As for Stratio Deep, I can't tell how they did the integration with Spark.
> Take some time to dig down their code to understand the logic.
>
> On Wed, Feb 11, 2015 at 2:25 PM, Marcelo Valle (BLOOMBERG/ LONDON)
> <mvallemil...@bloomberg.net> wrote:
> Taking the opportunity that Spark was being discussed in another thread, I
> decided to start a new one as I have interest in using Spark + Cassandra in
> the future.
>
> About 3 years ago, Spark was not an existing option and we tried to use
> hadoop to process Cassandra data.
> My experience was horrible and we reached the conclusion it was faster to
> develop an internal tool than insist on Hadoop _for our specific case_.
>
> From what I can see, Spark is starting to be known as a "better hadoop" and
> it seems the market is going this way now. I can also see I have many more
> options to decide how to integrate Cassandra using the Spark RDD concept than
> using the ColumnFamilyInputFormat.
>
> I have found this java driver made by Datastax:
> https://github.com/datastax/spark-cassandra-connector
>
> I have also found python Cassandra support in spark's repo, but it still
> seems experimental:
> https://github.com/apache/spark/tree/master/examples/src/main/python
>
> Finally I have found stratio deep: https://github.com/Stratio/deep-spark
> It seems the Stratio guys have also forked Cassandra; I am still a little
> confused about it.
>
> Question: which driver should I use, if I want to use Java? And which if I
> want to use python?
> I think the way Spark can integrate with Cassandra makes all the difference
> in the world, from my past experience, so I would like to know more about it,
> but I don't even know which source code I should start looking at...
> I would like to integrate using python and/or C++, but I wonder whether it
> would pay off to use the java driver instead.
>
> Thanks in advance
>
> --
> Gaspar Muñoz
> @gmunozsoria
>
> http://www.stratio.com/
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 352 59 42 // @stratiobd
>
> --
> Paulo Motta
>
> Chaordic | Platform
> www.chaordic.com.br
> +55 48 3232.3200