Thanks for the info. Could you please point me to where I can read about Mesos integration for HA, and about running in standalone mode?
Thanks
Oleg.

On Thu, Sep 11, 2014 at 12:13 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

> Hello Oleg
>
> Question 2: yes. The official spark cassandra connector can be found here:
> https://github.com/datastax/spark-cassandra-connector
>
> There are docs in the doc/ folder. You can read & write directly from/to
> Cassandra without EVER using HDFS. You still need a resource manager like
> Apache Mesos though to have high availability of your Spark cluster, or run
> in standalone mode and manage failover yourself; the choice is yours.
>
> Question 3: yes, you can save a massive amount of data into Cassandra.
>
> Question 4: I've played a little bit with it, it's quite smart: data
> locality is guaranteed by creating Spark RDD partitions that map directly to
> the Cassandra node holding the primary partition range. I have not yet used
> it in production though, so I can't say anything about stability.
>
> Maybe other guys on the list can give their thoughts about it?
>
> Regards
>
> Duy Hai DOAN
>
> On 10 Sep 2014 17:35, "Oleg Ruchovets" <oruchov...@gmail.com> wrote:
>
>> Hi,
>> I am trying to evaluate different options for spark + cassandra and I have
>> a couple of questions.
>> My aim is to use cassandra+spark without hadoop:
>>
>> 1) Is it possible to use only cassandra as the input/output for
>> PySpark?
>> 2) In case I use Spark (java, scala), is it possible to use only
>> cassandra as input/output, without hadoop?
>> 3) I know there are a couple of strategies for storage level. In case my
>> data set is quite big and I don't have enough memory to process it, can I use
>> the DISK_ONLY option without hadoop (having only cassandra)?
>> 4) Please share your experience: how stable is the cassandra + spark integration?
>>
>> Thanks
>> Oleg
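
For reference on the Mesos vs. standalone question at the top of this message: the Spark documentation covers both, in "Running Spark on Mesos" and in the "High Availability" section of "Spark Standalone Mode". From the application's side the two choices mostly differ in the master URL. Below is a minimal sketch of the URL formats only; the helper names and the host names (m1, m2, zk1, zk2) are made up for illustration and are not part of Spark or of the connector.

```python
# Sketch: build the master URL an application would pass to Spark.
# Helper names and host names are hypothetical; only the URL formats
# (spark://..., mesos://..., mesos://zk://...) come from the Spark docs.

def standalone_master_url(masters):
    """Standalone mode. Listing several masters lets the application
    find the current leader when standby masters are set up for failover."""
    if not masters:
        raise ValueError("at least one master host:port is required")
    return "spark://" + ",".join(masters)


def mesos_master_url(zk_hosts=None, mesos_master=None, zk_path="/mesos"):
    """Mesos mode. Prefer the ZooKeeper form when Mesos itself runs
    several masters; otherwise point at a single Mesos master."""
    if zk_hosts:
        return "mesos://zk://" + ",".join(zk_hosts) + zk_path
    if mesos_master:
        return "mesos://" + mesos_master
    raise ValueError("give either zk_hosts or mesos_master")


if __name__ == "__main__":
    print(standalone_master_url(["m1:7077", "m2:7077"]))
    print(mesos_master_url(zk_hosts=["zk1:2181", "zk2:2181"]))
    print(mesos_master_url(mesos_master="m1:5050"))
```

The resulting string is what goes into the master setting of the Spark configuration (for example via `SparkConf().setMaster(...)` in PySpark); everything else in the job, including reading from and writing to Cassandra, stays the same in both modes.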