Re: Spark Cassandra clusters

2016-01-24 Thread vivek.meghanathan
mazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/> From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: Friday, January 22, 2016 6:37 PM To: vivek.meghanat...@wipro.com Cc: user Subject: Re: Spark Cassandra clusters I am not a Cassandra developer :-) Can you use http://search-hadoop.co

RE: Spark Cassandra clusters

2016-01-22 Thread Mohammed Guller
Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/> From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: Friday, January 22, 2016 6:37 PM To: vivek.meghanat...@wipro.com Cc: user Subject: Re: Spark Cassandra clusters I am not a Cassandra developer :-) Can you us

Re: Spark Cassandra clusters

2016-01-22 Thread Ted Yu
Vivek: I searched for 'cassandra gc pause' and found a few hits, e.g.: http://search-hadoop.com/m/qZFqM1c5nrn1Ihwf6&subj=Re+GC+pauses+affecting+entire+cluster+ Keep in mind the effect of GC on shared nodes. FYI On Fri, Jan 22, 2016 at 7:09 PM, Mohammed Guller wrote: > For data locality, it is

RE: Spark Cassandra clusters

2016-01-22 Thread Mohammed Guller
For data locality, it is recommended to run the Spark workers and Cassandra on the same nodes. Mohammed Author: Big Data Analytics with Spark From: vivek.meghanat...@wipro.com [mailto:vivek.meghanat...@wipro.com] Sent:

Re: Spark Cassandra clusters

2016-01-22 Thread Ted Yu
I am not a Cassandra developer :-) Can you use http://search-hadoop.com/ or ask on the Cassandra mailing list? Cheers On Fri, Jan 22, 2016 at 6:35 PM, wrote: > Thanks Ted, also what is the suggested memory setting for Cassandra > process? > > Regards > Vivek > On Sat, Jan 23, 2016 at 7:57 am, Ted Yu

Re: Spark Cassandra clusters

2016-01-22 Thread vivek.meghanathan
Thanks Ted, also what is the suggested memory setting for the Cassandra process? Regards Vivek On Sat, Jan 23, 2016 at 7:57 am, Ted Yu <yuzhih...@gmail.com> wrote: From your description, putting Cassandra daemon on Spark cluster should be feasible. One aspect to be measured is how much l

Re: Spark Cassandra clusters

2016-01-22 Thread Ted Yu
From your description, putting the Cassandra daemon on the Spark cluster should be feasible. One aspect to be measured is how much locality can be achieved in this setup - Cassandra is a distributed NoSQL store. Cheers On Fri, Jan 22, 2016 at 6:13 PM, wrote: > + spark standalone cluster > On Sat, Jan 2

Re: Spark Cassandra clusters

2016-01-22 Thread vivek.meghanathan
+ Spark standalone cluster On Sat, Jan 23, 2016 at 7:33 am, Vivek Meghanathan (WT01 - NEP) <vivek.meghanat...@wipro.com> wrote: We have the setup on Google Cloud Platform. Each node has 8 CPUs + 30 GB memory. 10 nodes for Spark and another 9 nodes for Cassandra. We are using Spark 1.3.0 and Da

Re: Spark Cassandra clusters

2016-01-22 Thread vivek.meghanathan
Thanks. We are using the Spark Cassandra connector version aligned with Spark 1.3. Regards Vivek On Sat, Jan 23, 2016 at 7:27 am, Durgesh Verma <dv21...@gmail.com> wrote: This may be useful, you can try connectors. https://academy.datastax.com/demos/getting-started-apache-spark-and-cassandra https

Re: Spark Cassandra clusters

2016-01-22 Thread vivek.meghanathan
We have the setup on Google Cloud Platform. Each node has 8 CPUs + 30 GB memory. 10 nodes for Spark and another 9 nodes for Cassandra. We are using Spark 1.3.0 and DataStax bundle 4.5.9 (which has Cassandra 2.0.x). The Spark master and worker daemons use Xmx & Xms of 4G. We have not changed the default setting
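
For anyone sizing a co-located node from the figures in this message, here is a rough back-of-the-envelope sketch, not a recommendation from the thread. It assumes the Cassandra heap is left at the default computed in cassandra-env.sh, which, as far as I recall for the 2.0.x line, is max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). The 4 GB worker daemon heap comes from the message; the OS/page-cache reserve and the function names are purely hypothetical.

```python
def default_cassandra_heap_mb(system_memory_mb: int) -> int:
    """Default max heap as I recall it from cassandra-env.sh (assumption):
    max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))."""
    half_capped = min(system_memory_mb // 2, 1024)
    quarter_capped = min(system_memory_mb // 4, 8192)
    return max(half_capped, quarter_capped)


def executor_budget_mb(total_mb: int, worker_daemon_heap_mb: int, os_reserve_mb: int) -> int:
    """Memory left for Spark executors on a node that also runs Cassandra.

    os_reserve_mb is a hypothetical allowance for the OS and page cache;
    it is not a figure from the thread.
    """
    cassandra_heap = default_cassandra_heap_mb(total_mb)
    return total_mb - cassandra_heap - worker_daemon_heap_mb - os_reserve_mb


if __name__ == "__main__":
    node_mb = 30 * 1024          # 30 GB per node, from the message
    worker_heap_mb = 4 * 1024    # Xmx/Xms 4G for the worker daemon, from the message
    reserve_mb = 4 * 1024        # hypothetical OS/page-cache reserve
    print("Cassandra default heap (MB):", default_cassandra_heap_mb(node_mb))
    print("Left for executors (MB):", executor_budget_mb(node_mb, worker_heap_mb, reserve_mb))
```

Under these assumptions, a 30 GB node would give Cassandra about 7.5 GB of heap by default and leave roughly 14.5 GB for executors; the actual split should be measured, particularly given the GC effects on shared nodes mentioned elsewhere in the thread.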

Re: Spark Cassandra clusters

2016-01-22 Thread Durgesh Verma
This may be useful; you can try the connectors. https://academy.datastax.com/demos/getting-started-apache-spark-and-cassandra https://spark-summit.org/2015/events/cassandra-and-spark-optimizing-for-data-locality/ Thanks, -Durgesh > On Jan 22, 2016, at 8:37 PM, > wrote: > > Hi All, > What is the

Re: Spark Cassandra clusters

2016-01-22 Thread Ted Yu
Can you give us a bit more information? How much memory does each node have? What's the current heap allocation for the Cassandra process and the executor? Which Spark / Cassandra releases are you using? Thanks On Fri, Jan 22, 2016 at 5:37 PM, wrote: > Hi All, > What is the right spark Cassandra cluster se