Re: cassandra + spark / pyspark

2014-09-11 Thread abhinav chowdary
Adding to the conversation... there are 3 great open source options available: 1. Calliope http://tuplejump.github.io/calliope/ This is the first library that was out, some time late last year (as I recall), and I have been using it for a while; mostly very stable, uses Hadoop i/o in Cassandra

Re: cassandra + spark / pyspark

2014-09-11 Thread DuyHai Doan
2. "still uses thrift for minor stuff" --> I think that the only call using thrift is "describe_ring" to get an estimate of ratio of partition keys within the token range 3. Stratio has a talk today at the SF Summit, presenting Stratio META. For the folks not attending the conference, video should

Re: cassandra + spark / pyspark

2014-09-11 Thread Oleg Ruchovets
OK. DataStax and Stratio require Mesos, Hadoop YARN, or another third party to get Spark cluster HA. What about Calliope? Is it sufficient to have Cassandra + Calliope + Spark to be able to process aggregations? In my case we have quite a lot of data, so doing aggregation only in memory - impossi

Re: cassandra + spark / pyspark

2014-09-11 Thread Rohit Rai
Hi Oleg, I am the creator of Calliope. Calliope doesn't force any deployment model... that means you can run it with Mesos or Hadoop or Standalone. To be fair I don't think the other libs mentioned here should work too. The Spark cluster HA can be provided using ZooKeeper even in the standalone d

[RELEASE] Apache Cassandra 2.1.0

2014-09-11 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of the final version of Apache Cassandra 2.1.0. Cassandra 2.1.0 brings a number of new features and improvements including (but not limited to): - Improved support of Windows. - A new incremental repair option[4, 5] - A better row cache that

Re: Mutation Stage does not finish

2014-09-11 Thread Eduardo Cusa
Hello, The jstack output can be seen at: http://pastebin.com/LXnNyY3U. I ran tpstats today and always get the same output: Pool Name Active Pending Completed Blocked All time blocked ReadStage 0 0 0 0

Re: Quickly loading C* dataset into memory (row cache)

2014-09-11 Thread Danny Chan
What are you referring to when you say memory store? RAM disk? memcached? Thanks, Danny On Wed, Sep 10, 2014 at 1:11 AM, DuyHai Doan wrote: > Rob Coli strikes again, you're Doing It Wrong, and he's right :D > > Using Cassandra as a distributed cache is a bad idea, seriously. Putting > 6GB int

Re: [RELEASE] Apache Cassandra 2.1.0

2014-09-11 Thread Alain RODRIGUEZ
Thanks for this new version that seems to bring a lot of interesting new features and improvements! Definitely interested in trying the new counters and incremental repairs. Congrats. PS: I am also quite curious to know what is still inside the heap :D. Maybe the key cache? So what is recommended heap

Re: [RELEASE] Apache Cassandra 2.1.0

2014-09-11 Thread Tony Anecito
Congrats team, I know you worked hard on it!! One question. Where can users get a Java DataStax driver to support this version? If so is it released? Best Regards, -Tony Anecito Founder/President MyUniPortal LLC http://www.myuniportal.com On Thursday, September 11, 2014 9:05 AM, Sylvain Leb

Re: [RELEASE] Apache Cassandra 2.1.0

2014-09-11 Thread abhinav chowdary
Yes, it was released: Java driver 2.1 On Sep 11, 2014 8:33 AM, "Tony Anecito" wrote: > Congrats team, I know you worked hard on it!! > > One question. Where can users get a Java DataStax driver to support this > version? If so is it released? > > Best Regards, > -Tony Anecito > Founder/President >

Re: cassandra + spark / pyspark

2014-09-11 Thread Oleg Ruchovets
Thank you Rohit. I sent the email to you. Thanks Oleg. On Thu, Sep 11, 2014 at 10:51 PM, Rohit Rai wrote: > Hi Oleg, > > I am the creator of Calliope. Calliope doesn't force any deployment > model... that means you can run it with Mesos or Hadoop or Standalone. To > be fair I don't think the

Re: Quickly loading C* dataset into memory (row cache)

2014-09-11 Thread Robert Coli
On Thu, Sep 11, 2014 at 8:30 AM, Danny Chan wrote: > What are you referring to when you say memory store? > > RAM disk? memcached? > In 2014, probably Redis? =Rob

Detecting bitrot with incremental repair

2014-09-11 Thread John Sumsion
jbellis talked about incremental repair, which is great, but as I understood, repair was also somewhat responsible for detecting and repairing bitrot on long-lived sstables. If repair doesn't do it, what will? Thanks, John...

Re: Detecting bitrot with incremental repair

2014-09-11 Thread Robert Coli
On Thu, Sep 11, 2014 at 9:44 AM, John Sumsion wrote: > jbellis talked about incremental repair, which is great, but as I > understood, repair was also somewhat responsible for detecting and > repairing bitrot on long-lived sstables. > SSTable checksums, and the checksums on individual compressed

Re: Mutation Stage does not finish

2014-09-11 Thread Eduardo Cusa
Robert/Elliot. I deleted the commit logs, restarted Cassandra, and finally the node is up. Thanks for the help! Regards. Eduardo On Thu, Sep 11, 2014 at 12:08 PM, Eduardo Cusa < eduardo.c...@usmediaconsulting.com> wrote: > Hello, > > The jstack output can be seen in : http://pastebin.com/LXnN

Re: Mutation Stage does not finish

2014-09-11 Thread Robert Coli
On Thu, Sep 11, 2014 at 10:34 AM, Eduardo Cusa < eduardo.c...@usmediaconsulting.com> wrote: > I deleted commit logs, restarted cassandra and finally the node is up. > Do you have some crazy workload where you do a huge number of deletes or something? Replaying a commitlog should not take longer th

Re: [RELEASE] Apache Cassandra 2.1.0

2014-09-11 Thread Tony Anecito
OK, is it part of the release, or does it need to be downloaded from DataStax somewhere? I am wondering about the Java driver. Thanks! -Tony On Thursday, September 11, 2014 9:47 AM, abhinav chowdary wrote: Yes, it was released: Java driver 2.1 On Sep 11, 2014 8:33 AM, "Tony Anecito" wrote: Congr

Is it possible to bootstrap the 1st node of a new DC?

2014-09-11 Thread Tom van den Berge
When setting up a new (additional) data center, the documentation tells us to use "nodetool rebuild -- " to fill up the node(s) in the new dc, and to disable auto_bootstrap. I'm wondering if it is possible to fill the node with "auto_bootstrap=true" instead of a nodetool rebuild command. If so, ho

Re: Is it possible to bootstrap the 1st node of a new DC?

2014-09-11 Thread Robert Coli
On Thu, Sep 11, 2014 at 1:18 PM, Tom van den Berge wrote: > When setting up a new (additional) data center, the documentation tells us > to use "nodetool rebuild -- " to fill up the node(s) in the new dc, > and to disable auto_bootstrap. > > I'm wondering if it is possible to fill the node with >

Re: Is it possible to bootstrap the 1st node of a new DC?

2014-09-11 Thread Tom van den Berge
Thanks, Rob. I actually tried using LOCAL_ONE instead of ONE, but I still saw this problem. Maybe I missed some queries when updating to LOCAL_ONE. Anyway, it's good to know that this is supposed to work. Tom On Thu, Sep 11, 2014 at 10:28 PM, Robert Coli wrote: > On Thu, Sep 11, 2014 at 1:18 PM

Re: Mutation Stage does not finish

2014-09-11 Thread Eduardo Cusa
Yes, we have a huge number of inserts that can be repeated; we are now working on a new data model. On Thu, Sep 11, 2014 at 2:54 PM, Robert Coli wrote: > On Thu, Sep 11, 2014 at 10:34 AM, Eduardo Cusa < > eduardo.c...@usmediaconsulting.com> wrote: > >> I deleted commit logs, restarted cassandra and fi