Re: Cassandra driver performance question...

2013-06-24 Thread Jabbar Azam
Hello Tony,

I couldn't reply earlier because I've been decorating over the weekend, so I
have been a bit busy.

Let me know what happens.

Out of curiosity, why are you using JDBC and not a CQL3 native driver?

Thanks

Jabbar Azam
On 24 Jun 2013 00:32, "Tony Anecito"  wrote:

> Hi Jabbar,
>
>  I was able to get the performance issue resolved by reusing the
> connection object. It will be interesting to see what happens when I use a
> connection pool from an app server.
>
> I still think it would be a good idea to have a minimal mode for metadata.
> It is rare that I use metadata.
>
> Regards,
> -Tony
>
>   *From:* Tony Anecito 
> *To:* "user@cassandra.apache.org" ; Tony
> Anecito 
> *Sent:* Friday, June 21, 2013 9:33 PM
> *Subject:* Re: Cassandra driver performance question...
>
>   Hi Jabbar,
>
> I think I know what is going on. I happened across a change mentioned by
> the JDBC driver developers regarding metadata caching. It seems the metadata
> caching was moved from the connection object to the PreparedStatement
> object. So I am wondering whether the time difference I am seeing on the second
> PreparedStatement object is because the metadata is cached by then.
>
> So my question is: how do I test this theory? Is there a way to stop the
> metadata from coming across from Cassandra? A 20x performance improvement
> would be nice to have.
>
> Thanks,
> -Tony
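One way to picture the caching theory above is a toy model in plain Java (no Cassandra involved; the class and all names here are invented purely for illustration, not the driver's actual implementation): the first prepare of a query pays a simulated metadata round-trip, and repeat prepares of the same query hit a cache.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of metadata caching on a reused connection/statement:
// the first prepare() of a query pays a simulated server round-trip;
// repeat prepares of the same query are served from the cache.
public class PrepareCache {
    private final Map<String, String> cache = new HashMap<>();
    public int roundTrips = 0; // counts simulated metadata fetches

    public String prepare(String cql) {
        return cache.computeIfAbsent(cql, q -> {
            roundTrips++;                 // expensive only on first use
            return "metadata-for:" + q;
        });
    }

    public static void main(String[] args) {
        PrepareCache conn = new PrepareCache();
        conn.prepare("SELECT * FROM users WHERE id = ?"); // slow path
        conn.prepare("SELECT * FROM users WHERE id = ?"); // cached
        System.out.println(conn.roundTrips); // prints 1
    }
}
```

If the driver really did move this cache from the connection to the PreparedStatement, then recreating either object per request would discard the cache and explain the 20x gap.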
>
>   *From:* Tony Anecito 
> *To:* "user@cassandra.apache.org" 
> *Sent:* Friday, June 21, 2013 8:56 PM
> *Subject:* Re: Cassandra driver performance question...
>
>   Thanks Jabbar,
>
> I ran nodetool as suggested and it showed 0 latency for the row count I have.
>
> I also ran the cli list command for the table hit by my JDBC preparedStatement,
> and it was slow: about 121 ms the first time I ran it and 40 ms the second
> time, versus the JDBC call of 38 ms to start with, unless I also run it
> twice and get 1.5-2.5 ms for executeQuery the second time the
> preparedStatement is called.
>
> I ran describe from the cli for the table and it said caching is "ALL",
> which is correct.
>
> A real mystery and I need to understand better what is going on.
>
> Regards,
> -Tony
>
>   *From:* Jabbar Azam 
> *To:* user@cassandra.apache.org; Tony Anecito 
> *Sent:* Friday, June 21, 2013 3:32 PM
> *Subject:* Re: Cassandra driver performance question...
>
>  Hello Tony,
>
> I would guess that the first query's data is put into the row cache and
> the filesystem cache. The second query gets the data from the row cache
> and/or the filesystem cache, so it'll be faster.
>
> If you want to make it consistently faster, having a key cache will
> definitely help. The following advice from Aaron Morton will also help:
>
> "You can also see what it looks like from the server side.
>
> nodetool proxyhistograms will show you full request latency recorded by the 
> coordinator.
> nodetool cfhistograms will show you the local read latency, this is just the 
> time it takes
> to read data on a replica and does not include network or wait times.
>
> If the proxyhistograms is showing most requests running faster than your app 
> says it's your
> app."
>
>
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201301.mbox/%3ce3741956-c47c-4b43-ad99-dad8afc3a...@thelastpickle.com%3E
>
>
>
>  Thanks
>
> Jabbar Azam
>
>
> On 21 June 2013 21:29, Tony Anecito  wrote:
>
>Hi All,
> I am using the JDBC driver and noticed that if I run the same query twice,
> the second time it is much faster.
> I set up the row cache and column family cache and it does not seem to make
> a difference.
>
> I am wondering how to set up Cassandra such that the first query is always
> as fast as the second one. The second one was 1.8 ms and the first 28 ms
> for the same exact parameters. I am using PreparedStatement.
>
> Thanks!
>


Re: Updated sstable size for LCS, ran upgradesstables, file sizes didn't change

2013-06-24 Thread Hiller, Dean
We would be very, very interested in your results.  We currently run 10MB but
have heard of 256MB sizes as well.

Please let us know what you find out.
Thanks,
Dean

From: Andrew Bialecki 
mailto:andrew.biale...@gmail.com>>
Reply-To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Date: Friday, June 21, 2013 5:40 PM
To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Subject: Updated sstable size for LCS, ran upgradesstables, file sizes didn't 
change

We're potentially considering increasing the size of our sstables for some 
column families from 10MB to something larger.

In testing, we've been trying to verify that the sstable file sizes change and
then do a bit of benchmarking. However, when we alter the column family
and then run "nodetool upgradesstables -a keyspace columnfamily," the files in
the data directory are re-written, but the file sizes are the same.
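For reference, the column-family change being described is typically made like this in CQL3 (the table name and the 256 are illustrative):

```sql
ALTER TABLE keyspace.columnfamily
WITH compaction = {'class': 'LeveledCompactionStrategy',
                   'sstable_size_in_mb': 256};
```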

Is this the expected behavior? If not, what's the right way to upgrade them? If
this is expected, how can we benchmark the read/write performance with varying
sstable sizes?

Thanks in advance!

Andrew


AssertionError: Unknown keyspace?

2013-06-24 Thread Hiller, Dean
I haven't seen this error in a long time.  We just received the below error in
production when rebuilding a node… any ideas on how to get around this?  We had
already rebuilt 3 other nodes, I think (we have been swapping hardware).

ERROR 06:32:21,474 Exception in thread Thread[ReadStage:1,5,main]
java.lang.AssertionError: Unknown keyspace databus5
at org.apache.cassandra.db.Table.(Table.java:263)
at org.apache.cassandra.db.Table.open(Table.java:110)
at org.apache.cassandra.db.Table.open(Table.java:88)
at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)

Thanks for any insight,
Dean


Re: AssertionError: Unknown keyspace?

2013-06-24 Thread Hiller, Dean
Ah, so digging deeper, it is not bootstrapping.  How do I force the node
to bootstrap?  (This is version 1.2.2, and the other nodes somehow knew to
bootstrap automatically, but this one I need to force for some reason.)  I
remember there was a property for this.

NOTE: I enabled some debug logs and auto bootstrap is true according to
this log
DEBUG 06:53:03,411 setting auto_bootstrap to true

Or better yet, can someone point me to the code where bootstrapping is
decided, so I can see why it decides not to bootstrap?

Thanks,
Dean

On 6/24/13 6:42 AM, "Hiller, Dean"  wrote:

>I haven't seen this error in a long time.  We just received the below
>error in production when rebuilding a node… any ideas on how to get around
>this?  We had already rebuilt 3 other nodes, I think (we have been swapping
>hardware)
>
>ERROR 06:32:21,474 Exception in thread Thread[ReadStage:1,5,main]
>java.lang.AssertionError: Unknown keyspace databus5
>at org.apache.cassandra.db.Table.(Table.java:263)
>at org.apache.cassandra.db.Table.open(Table.java:110)
>at org.apache.cassandra.db.Table.open(Table.java:88)
>at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
>at 
>org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:
>56)
>at 
>java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.
>java:895)
>at 
>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
>:918)
>at java.lang.Thread.run(Thread.java:662)
>
>Thanks for any insight,
>Dean



Re: AssertionError: Unknown keyspace?

2013-06-24 Thread Hiller, Dean
Oh shoot, this is a seed node.  Is there documentation on how to bootstrap
a seed node?  If I have seeds of A, B, C for every machine on the ring and
I am bootstrapping node B, do I just modify cassandra.yaml and remove node
B from the yaml file temporarily and boot it up?  (Note: I still received
the unknown keyspace errors :( but it is bootstrapping now.)  I assume I
can add node B back once all the data is in there.

Thanks,
Dean

On 6/24/13 6:55 AM, "Hiller, Dean"  wrote:

>Ah, so digging deeper, it is not bootstrapping.  How do I force the node
>to bootstrap?  (this is version 1.2.2 and the other nodes somehow knew to
>bootstrap automatically but this one I need to force for some reason).  I
>remember there was a property for this.
>
>NOTE: I enabled some debug logs and auto bootstrap is true according to
>this log
>DEBUG 06:53:03,411 setting auto_bootstrap to true
>
>OR better yet, if someone can point me to the code on where bootstrap is
>decided so I can see why it decides not to bootstrap?
>
>Thanks,
>Dean



quick question on seed nodes configuration

2013-06-24 Thread Hiller, Dean
For ease of use, we actually had a single cassandra.yaml deployed to every
machine and a script that swapped out the token and listen address.  I had seed
nodes ip1, ip2, ip3 as the seeds, but what I didn't realize was that these
nodes then had themselves as seeds.  I am assuming that should never be done;
is that correct?  I really should deploy ip1, ip2, ip3 on all nodes and then
for nodes 1, 2, and 3 do something like

ip1 will have ip2, ip3, ip4
ip2 will have ip1, ip3, ip4
Etc. etc.

QUESTION: Would it be OK if I just configured every node to use the 3 IPs
after it, like this instead:
ip1 would have ip2, ip3, ip4
ip2 would have ip3, ip4, ip5
ip3 would have ip4, ip5, ip6
ip4 would have ip5, ip6, ip1

Is this okay for seed node configuration?

Thanks,
Dean



Re: Cassandra driver performance question...

2013-06-24 Thread Tony Anecito
Hi Jabbar,
 
I am using the JDBC driver because almost no examples exist about what you
mention. Even most of the JDBC examples I find do not work because they are
incomplete or out of date. If you have a good reference for what you mentioned,
I can try it.

As I mentioned, I got selects to work; now I am trying to get inserts to work
via JDBC. I am running into issues there also, but I will work at it till I get
them to work.
 
Regards,
-Tony


Re: quick question on seed nodes configuration

2013-06-24 Thread julien Campan
Hi,

Seeds are only used when a node joins the cluster; at that moment it contacts
a seed (in the same DC) to learn about the ring.

So the safest approach is to list all of your other nodes as seeds, but in
practice only one of them needs to be up. If you think you will never have
three nodes down at the same time, you can list just three nodes.
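The rotated scheme from Dean's question can be generated mechanically. A small sketch in plain Java (illustrative only, not a Cassandra API): each node lists the next k nodes after it in the ring and never itself, since a node in its own seed list will not bootstrap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Build per-node seed lists: each node lists the next k nodes in the
// ring, wrapping around, and never includes itself.
public class SeedLists {
    static List<String> seedsFor(List<String> nodes, int self, int k) {
        List<String> seeds = new ArrayList<>();
        for (int i = 1; i <= k; i++) {
            seeds.add(nodes.get((self + i) % nodes.size()));
        }
        return seeds;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("ip1", "ip2", "ip3", "ip4", "ip5", "ip6");
        for (int n = 0; n < ring.size(); n++) {
            System.out.println(ring.get(n) + " -> " + seedsFor(ring, n, 3));
        }
        // e.g. ip1 -> [ip2, ip3, ip4], ip2 -> [ip3, ip4, ip5], ip4 -> [ip5, ip6, ip1]
    }
}
```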


Julien Campan


2013/6/24 Hiller, Dean 

> For ease of use, we actually had a single cassandra.yaml deployed to every
> machine and a script that swapped out the token and listen address.  I had
> seed nodes ip1, ip2, ip3 as the seeds, but what I didn't realize was that
> these nodes then had themselves as seeds.  I am assuming that should never
> be done; is that correct?  I really should deploy ip1, ip2, ip3 on all nodes
> and then for nodes 1, 2, and 3 do something like
>
> ip1 will have ip2, ip3, ip4
> ip2 will have ip1, ip3, ip4
> Etc. etc.
>
> QUESTION: Would it be OK if I just configured every node to use the 3 IPs
> after it, like this instead:
> ip1 would have ip2, ip3, ip4
> ip2 would have ip3, ip4, ip5
> ip3 would have ip4, ip5, ip6
> ip4 would have ip5, ip6, ip1
>
> Is this okay for seed node configuration?
>
> Thanks,
> Dean
>
>


Re: Cassandra driver performance question...

2013-06-24 Thread Jabbar Azam
Hello Tony,

This came out recently

http://www.datastax.com/doc-source/developer/java-driver/index.html

I can't vouch for its performance, but the documentation is OK and it works. I'm
using it on a side project myself.

There is also Astyanax by Netflix, which also supports CQL 3:
https://github.com/Netflix/astyanax/wiki/Getting-Started


Thanks

Jabbar Azam


On 24 June 2013 15:34, Tony Anecito  wrote:

> Hi Jabbar,
>
> I am using the JDBC driver because almost no examples exist about what you
> mention. Even most of the JDBC examples I find do not work because they are
> incomplete or out of date. If you have a good reference for what you
> mentioned, I can try it.
>
> As I mentioned, I got selects to work; now I am trying to get inserts to
> work via JDBC. I am running into issues there also, but I will work at it
> till I get them to work.
>
> Regards,
> -Tony


Hadoop/Cassandra 1.2 timeouts

2013-06-24 Thread Brian Jeltema
I'm having problems with Hadoop job failures on a Cassandra 1.2 cluster due to:

Caused by: TimedOutException()
2013-06-24 11:29:11,953  INFO  Driver  -at 
org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12932)

This is running on a 6-node cluster with RF=3. If I run the job with CL=ONE, it
usually runs pretty well, with an occasional timeout. But
if I run at CL=QUORUM, the number of timeouts is often enough to kill the job.
The table being read is effectively read-only while this job runs.
It has from 5 to 10 million rows, with each row having no more than 256
columns. Each column typically holds only a few hundred bytes of data at most.

I've fiddled with the batch-range size and with increasing the timeout, without
much luck. I see some evidence of GC activity in the Cassandra logs, but
it's hard to see a clear correlation with the timeouts.

I could use some suggestions on an approach to pinning down the root cause.
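For anyone digging into the same problem, the knobs usually involved are roughly these (names from memory of Cassandra 1.2's config and Hadoop integration; verify them against your version before relying on them):

```yaml
# Server side (cassandra.yaml): 1.2 split the old rpc_timeout
# into per-operation timeouts.
read_request_timeout_in_ms: 10000     # single-row reads
range_request_timeout_in_ms: 10000    # range scans, which Hadoop jobs issue

# Job side (Hadoop job configuration properties read via ConfigHelper):
# cassandra.range.batch.size - rows fetched per get_range_slices call
# cassandra.input.split.size - rows per input split
```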

TIA

Brian

Re: Upgrade from 1.1.10 to 1.2.4

2013-06-24 Thread Robert Coli
On Sun, Jun 23, 2013 at 2:31 AM, Ananth Gundabattula
 wrote:
> Looks like the cause of the error was because of not specifying num_tokens
> in the cassandra.yaml file. I was under the impression that setting a value
> of num_tokens will override the initial_token value . Looks like we need to
> set num_tokens to 1 to get around this error. Not specifying anything causes
> the above error.

My understanding is that the 1.2.x behavior here is :

1) initial_token set, num_tokens set = cassandra picks the num_tokens
value, ignores initial_token
2) initial_token unset, num_tokens unset = cassandra (until 2.0) picks
a single token via range bisection
3) initial_token unset, num_tokens set = cassandra uses num_tokens
number of vnodes
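For case 3 (vnodes), the cassandra.yaml fragment is just the following; 256 is the commonly used value, but it is a choice, not a requirement:

```yaml
# Enable vnodes: Cassandra picks this many tokens for the node itself.
num_tokens: 256

# Leave initial_token unset / commented out:
# initial_token:
```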

Are you saying this is not the behavior you saw?

=Rob


Re: AssertionError: Unknown keyspace?

2013-06-24 Thread Robert Coli
On Mon, Jun 24, 2013 at 6:04 AM, Hiller, Dean  wrote:
> Oh shoot, this is a seed node.  Is there documentation on how to bootstrap
> a seed node?  If I have seeds of A, B, C for every machine on the ring and
> I am bootstrapping node B, do I just modify cassandra.yaml and remove node
> B from the yaml file temporarily and boot it up

Yes. The only thing that makes a node fail that check is being in its
own seed list. But if the node is in other nodes' seed lists, those
nodes will contact it anyway. This strongly implies that the
"contains()" check there is the wrong test, but I've never nailed that
down and/or filed a ticket on it. Conversation at the summit suggests
I should, making a note to do so...

=Rob


Re: Upgrade from 1.1.10 to 1.2.4

2013-06-24 Thread Ananth Gundabattula
Hello Rob,

I ran into the stack trace when the situation was:

num_tokens unset (by this I mean not specifying anything) and
initial_token set to some value.

I was initially under the impression that specifying num_tokens would
override the initial_token value, and hence left num_tokens blank. I was able
to get past that exception only when num_tokens was specified with a value
of 1.

Regards,
Ananth







On 6/25/13 3:27 AM, "Robert Coli"  wrote:

>On Sun, Jun 23, 2013 at 2:31 AM, Ananth Gundabattula
> wrote:
>> Looks like the cause of the error was because of not specifying
>>num_tokens
>> in the cassandra.yaml file. I was under the impression that setting a
>>value
>> of num_tokens will override the initial_token value . Looks like we
>>need to
>> set num_tokens to 1 to get around this error. Not specifying anything
>>causes
>> the above error.
>
>My understanding is that the 1.2.x behavior here is :
>
>1) initial_token set, num_tokens set = cassandra picks the num_tokens
>value, ignores initial_token
>2) initial_token unset, num_tokens unset = cassandra (until 2.0) picks
>a single token via range bisection
>3) initial_token unset, num_tokens set = cassandra uses num_tokens
>number of vnodes
>
>Are you saying this is not the behavior you saw?
>
>=Rob



Re: AssertionError: Unknown keyspace?

2013-06-24 Thread Wei Zhu
I have been bitten by it once. At least there should be a message saying there
is no streaming data since it's a seed node.
I searched the source code; the message was there, but it got removed at some
version.

-Wei 




 From: Robert Coli 
To: user@cassandra.apache.org 
Sent: Monday, June 24, 2013 10:34 AM
Subject: Re: AssertionError: Unknown keyspace?
 

On Mon, Jun 24, 2013 at 6:04 AM, Hiller, Dean  wrote:
> Oh shoot, this is a seed node.  Is there documentation on how to bootstrap
> a seed node?  If I have seeds of A, B, C for every machine on the ring and
> I am bootstrapping node B, do I just modify cassandra.yaml and remove node
> B from the yaml file temporarily and boot it up

Yes. The only thing that makes a node fail that check is being in its
own seed list. But if the node is in other nodes' seed lists, those
nodes will contact it anyway. This strongly implies that the
"contains()" check there is the wrong test, but I've never nailed that
down and/or filed a ticket on it. Conversation at the summit suggests
I should, making a note to do so...

=Rob

sorting columns by time

2013-06-24 Thread Bill Hastings
Hi All

I have a requirement where I need to have my columns sorted by creation
time. However, I would like to use my own naming scheme for the columns and
not use TimeUUID as column names. Please advise on how I can achieve
this in Cassandra, as this has been pretty confusing to me.


Re: AssertionError: Unknown keyspace?

2013-06-24 Thread Hiller, Dean
Yes, it would be nice if at startup it just said "don't list your seed node as
this node" and then failed out; we would have known this a long, long time ago ;).
Dean

From: Wei Zhu mailto:wz1...@yahoo.com>>
Reply-To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>, Wei Zhu 
mailto:wz1...@yahoo.com>>
Date: Monday, June 24, 2013 12:36 PM
To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Subject: Re: AssertionError: Unknown keyspace?

I have got bitten by it once. At least there should be a message saying, there 
is no streaming data since it's a seed node.
I searched the source code, the message was there and it got removed at certain 
version.

-Wei






Re: sorting columns by time

2013-06-24 Thread Hiller, Dean
Send the naming scheme you desire.  Is a long (time since epoch) OK?  Or a
composite name of time since epoch + (something else)?
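A sketch of that composite idea in plain Java (all names invented for illustration): zero-pad the epoch-millis part to a fixed width so that, under a string comparator such as UTF8Type, lexicographic order of the column names matches chronological order.

```java
// Composite column name "epochMillis:suffix" where the millis are
// zero-padded to a fixed 13 digits (enough until roughly year 2286)
// so lexicographic order equals chronological order.
public class TimeName {
    static String columnName(long epochMillis, String suffix) {
        return String.format("%013d:%s", epochMillis, suffix);
    }

    public static void main(String[] args) {
        String a = columnName(1371859200000L, "event-a"); // earlier
        String b = columnName(1372032000000L, "event-b"); // later
        System.out.println(a.compareTo(b) < 0); // prints true: sorts by time
    }
}
```

Without the padding, "999..." would sort after "1000...", which is why a fixed width (or a big-endian binary encoding with a LongType-based comparator) matters.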

Dean

From: Bill Hastings mailto:bllhasti...@gmail.com>>
Reply-To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Date: Monday, June 24, 2013 12:55 PM
To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Subject: sorting columns by time

Hi All

I have a requirement where I need to have my columns sorted by creation
time. However, I would like to use my own naming scheme for the columns and not
use TimeUUID as column names. Please advise on how I can achieve this in
Cassandra, as this has been pretty confusing to me.


Re: AssertionError: Unknown keyspace?

2013-06-24 Thread Wei Zhu
Here is the line in the source code for 1.1.0: 

https://github.com/apache/cassandra/blob/cassandra-1.1.0/src/java/org/apache/cassandra/service/StorageService.java#L549
 

And it was later refactored to this, where the message was removed:

https://github.com/apache/cassandra/blob/cassandra-1.2.0/src/java/org/apache/cassandra/service/StorageService.java#L549
 

-Wei 

- Original Message -

From: "Dean Hiller"  
To: user@cassandra.apache.org, "Wei Zhu"  
Sent: Monday, June 24, 2013 12:04:10 PM 
Subject: Re: AssertionError: Unknown keyspace? 

Yes, it would be nice at startup just to say don't list your seed node as this 
node and then fail out and we would have known this a long long time ago ;). 
Dean 






Re: [Cassandra] Running node tool cleanup

2013-06-24 Thread Emalayan Vairavanathan
Thank you, Robert and others, for answering my questions.

I started to play with nodetool and I have a few more questions.

Does nodetool cleanup run synchronously or asynchronously?

If it runs asynchronously, is there any way to monitor its progress?

Thank you
Emalayan



 From: Robert Coli 
To: user@cassandra.apache.org; Emalayan Vairavanathan  
Sent: Thursday, 20 June 2013 10:03 AM
Subject: Re: [Cassandra] Running node tool cleanup
 

On Thu, Jun 20, 2013 at 12:01 AM, Emalayan Vairavanathan
 wrote:
> 1) What will happen if I run nodetool cleanup immediately after bringing a
> new node up (i.e. before the key migration process is completed)? 
>         Will it cause some race conditions? Or will it result in some part 
> of the space never being reclaimed? 

As I understand it, the new node isn't responsible for the range until
the migration process is complete, so I presume cleanup will do
nothing in this case. This is so the old node can continue to serve
the range during the bootstrap, and in case of bootstrap failure.

> 2) After adding a new machine, how can I make sure that the key migration is
> completed ? Should I run nodetool netstats on all the nodes ? Is there any
> better way ?

nodetool ring/netstats and/or grepping the log for the
completed-bootstrap message.

=Rob

Re: CAS and long lived locks

2013-06-24 Thread sankalp kohli
Assuming that database migration is a one-time and rare operation, why
don't you try to grab a lock for a short time? If you are able to grab it,
then you can renew it for a longer time. This will make sure that in case
of collision, all contenders won't be locked out for a long time.
You can use the Netflix client recipe for locks.


On Sat, Jun 22, 2013 at 3:09 PM, Blair Zajac  wrote:

> Looking at the Cassandra 13 keynote [1], slide 56 regarding hinted writes
> causing the lock to be taken even though the client thinks the lock attempt
> failed, which the new CAS support fixes.
>
> I have some database migrations to run on Cassandra, so I still need a
> long lived lock somewhere to prevent two or more migrations running
> concurrently, so CAS doesn't directly solve this problem.
>
> It sounds like I could have a BOOLEAN column named "lock" but use CAS to
> update it from a false or NULL value to true, and this avoids the problem of
> hinted updates.  The finally block would reset it to false or NULL.
>  This would be a simpler implementation than using the wait chain algorithm.
>
> Any problems with this?
>
> Blair
>
> [1] 
> http://www.slideshare.net/jbellis/cassandra-summit-2013-keynote
> [2] http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf
>
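The grab-short-then-renew scheme suggested above can be sketched as a toy in-memory model (plain Python, not the Netflix/Astyanax recipe itself; all names and the TTL mechanics are illustrative):

```python
class TtlLockTable:
    """Toy in-memory stand-in for a lock column with a TTL.

    Models the pattern: acquire with a short TTL so a crashed
    contender only blocks others briefly, then renew with a longer
    TTL once the lock is safely held.
    """
    def __init__(self):
        self._locks = {}  # name -> (owner, expires_at)

    def try_acquire(self, name, owner, ttl, now):
        entry = self._locks.get(name)
        if entry is None or entry[1] <= now:  # free or expired
            self._locks[name] = (owner, now + ttl)
            return True
        return False  # still held by another contender

    def renew(self, name, owner, ttl, now):
        entry = self._locks.get(name)
        if entry and entry[0] == owner and entry[1] > now:
            self._locks[name] = (owner, now + ttl)
            return True
        return False

table = TtlLockTable()
# Contender A grabs the lock with a short 5-second TTL...
assert table.try_acquire("migrations", "A", ttl=5, now=0)
# ...and, having won it, renews it for the long-running migration.
assert table.renew("migrations", "A", ttl=600, now=1)
# Contender B stays locked out while A holds the lock; had A crashed
# before renewing, B would only have waited out the short TTL.
assert not table.try_acquire("migrations", "B", ttl=5, now=2)
```

In a real implementation, Cassandra column TTLs (or a CAS update) would play the role of `expires_at`.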


Re: CAS and long lived locks

2013-06-24 Thread sankalp kohli
Also, CAS is in 2.0, which is not production ready, so I am not sure how you
will use it.


On Mon, Jun 24, 2013 at 4:35 PM, sankalp kohli wrote:

> Assuming that database migration is a one time and rare operation, why
> don't you try to grab a lock for a short time. If you are able to grab it,
> then you can renew it for a longer time. This will make sure that in case
> of collision, all contenders won't be locked out for a long time.
> You can use Netflix client recipe for locks.
>
>
> On Sat, Jun 22, 2013 at 3:09 PM, Blair Zajac  wrote:
>
>> Looking at the Cassandra 13 keynote [1], slide 56 regarding hinted writes
>> causing the lock to be taken even though the client thinks the lock attempt
>> failed, which the new CAS support fixes.
>>
>> I have some database migrations to run on Cassandra, so I still need a
>> long lived lock somewhere to prevent two or more migrations running
>> concurrently, so CAS doesn't directly solve this problem.
>>
>> It sounds like I could have a BOOLEAN column named "lock" but use CAS to
>> update it from a false or NULL value to true, and this avoids the problem of
>> hinted updates.  The finally block would reset it to false or NULL.
>>  This would be a simpler implementation than using the wait chain algorithm.
>>
>> Any problems with this?
>>
>> Blair
>>
>> [1] 
>> http://www.slideshare.net/jbellis/cassandra-summit-2013-keynote
>> [2] http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf
>>
>
>


Re: Date range queries

2013-06-24 Thread Christopher J. Bottaro
Yes, that makes sense and that article helped a lot, but I still have a few
questions...

The created_at in our answers table is basically used as a version id.
 When a user updates his answer, we don't overwrite the old answer, but
rather insert a new answer with a more recent timestamp (the version).

answers
---
user_id | created_at | question_id | result
---
  1 | 2013-01-01 | 1   | yes
  1 | 2013-01-01 | 2   | blah
  1 | 2013-01-02 | 1   | no

So the queries we really want to run are "find me all the answers for a
given user at a given time."  So given the date of 2013-01-02 and user_id
1, we would want rows 2 and 3 returned (since row 3 obsoletes row 1).  Is
it possible to do this with CQL given the current schema?

As an aside, we can do this in Postgresql using window functions, not
standard SQL, but pretty neat.
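For illustration, that window-function-style pick — the newest answer per question with created_at on or before the target date — can also be done client-side after a plain range read; a rough Python sketch over bare tuples, no driver involved:

```python
def answers_as_of(rows, user_id, as_of):
    """Newest answer per question_id with created_at <= as_of.

    rows are (user_id, created_at, question_id, result) tuples,
    mirroring the versioned answers table in this thread; ISO dates
    compare correctly as strings.
    """
    latest = {}  # question_id -> (created_at, result)
    for uid, created_at, question_id, result in rows:
        if uid != user_id or created_at > as_of:
            continue  # other user, or a version newer than the cutoff
        if question_id not in latest or created_at > latest[question_id][0]:
            latest[question_id] = (created_at, result)
    return {q: r for q, (_, r) in latest.items()}

rows = [
    (1, "2013-01-01", 1, "yes"),
    (1, "2013-01-01", 2, "blah"),
    (1, "2013-01-02", 1, "no"),
]
# As of 2013-01-02, row 3 obsoletes row 1:
assert answers_as_of(rows, 1, "2013-01-02") == {1: "no", 2: "blah"}
```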

We can alter our schema like so...

answers
---
user_id | start_at | end_at | question_id | result

Where the start_at and end_at denote when an answer is active.  So the
example above would become:

answers
---
user_id | start_at   | end_at | question_id | result

  1 | 2013-01-01 | 2013-01-02 | 1   | yes
  1 | 2013-01-01 | null   | 2   | blah
  1 | 2013-01-02 | null   | 1   | no

Now we can query "SELECT * FROM answers WHERE user_id = 1 AND start_at <=
'2013-01-02' AND (end_at > '2013-01-02' OR end_at IS NULL)".

How would one define the partitioning key and cluster columns in CQL to
accomplish this?  Is it as simple as PRIMARY KEY (user_id, start_at,
end_at, question_id) (remembering that we sometimes want to limit by
question_id)?

Also, we are a bit worried about race conditions.  Consider two separate
processes updating an answer for a given user_id / question_id.  There will
be a race condition between the two to update the correct row's end_at
field.  Does that make sense?  I can draw it out with ASCII tables, but I
feel like this email is already too long... :P

Thanks for the help.



On Wed, Jun 19, 2013 at 2:28 PM, David McNelis  wrote:

> So, if you want to grab by the created_at and occasionally limit by
> question id, that is why you'd use created_at.
>
> The way primary keys work is that the first part of the primary key is the
> partition key; that field essentially identifies the single Cassandra row.
>  The second key is the order-preserving (clustering) key, so you can sort by that key.
>  If you have a third piece, then that is the secondary order-preserving key.
>
> The reason you'd want to do (user_id, created_at, question_id) is because
> when you do a query on the keys, you MUST use the preceding pieces of
> the primary key.  So in your case, you could not do a query with just
> user_id and question_id with the user-created-question key.  Alternatively
> if you went with (user_id, question_id, created_at), you would not be able
> to include a range of created_at unless you were also filtering on the
> question_id.
>
> Does that make sense?
>
> As for the large rows, 10k is unlikely to cause you too many issues
> (unless the answer is potentially a big blob of text).  Newer versions of
> cassandra deal with a lot of things in far, far, superior ways to < 1.0.
>
> For a really good primer on keys in CQL and how to potentially avoid hot
> rows, a really good article to read is this one:
> http://thelastpickle.com/2013/01/11/primary-keys-in-cql/  Aaron did a
> great job of laying out the subtleties of primary keys in CQL.
>
>
> On Wed, Jun 19, 2013 at 2:21 PM, Christopher J. Bottaro <
> cjbott...@academicworks.com> wrote:
>
>> Interesting, thank you for the reply.
>>
>> Two questions though...
>>
>> Why should created_at come before question_id in the primary key?  In
>> other words, why (user_id, created_at, question_id) instead of (user_id,
>> question_id, created_at)?
>>
>> Given this setup, all a user's answers (all 10k) will be stored in a
>> single C* (internal, not cql) row?  I thought having "fat" or "big" rows
>> was bad.  I worked with Cassandra 0.6 at my previous job and given the
>> nature of our work, we would sometimes generate these "fat" rows... at
>> which point Cassandra would basically shit the bed.
>>
>> Thanks for the help.
>>
>>
>> On Wed, Jun 19, 2013 at 12:26 PM, David McNelis wrote:
>>
>>> I think you'd just be better served with just a little different primary
>>> key.
>>>
>>> If your primary key was (user_id, created_at)  or (user_id, created_at,
>>> question_id), then you'd be able to run the above query without a problem.
>>>
>>> This will mean that the entire pantheon of a specific user_id will be
>>> stored as a 'row' (in the old style C* vernacular), and then the
>>> information would be ordered by the 2nd piece of the primary key (or 2nd,
>>> then 3rd if you included question_id).
>>>
>>> You would certainly want to include any field that makes a record un
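David's rule upthread — a query may only restrict a primary-key column if it also restricts every column that precedes it — can be sketched as a toy validity check (simplified: it ignores the equality-vs-range distinction that real CQL also enforces; purely illustrative):

```python
def query_allowed(primary_key, restricted):
    """Return True if a WHERE clause restricting exactly the columns in
    `restricted` is valid for the given primary-key column order."""
    for i, col in enumerate(primary_key):
        if col in restricted:
            continue
        # Once a primary-key column is skipped, no later one may appear.
        return not any(c in restricted for c in primary_key[i + 1:])
    return True

pk = ("user_id", "created_at", "question_id")
assert query_allowed(pk, {"user_id"})
assert query_allowed(pk, {"user_id", "created_at"})
# Cannot skip created_at and still filter on question_id:
assert not query_allowed(pk, {"user_id", "question_id"})
```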

Re: Cassandra terminates with OutOfMemory (OOM) error

2013-06-24 Thread Mohammed Guller
No deletes. In my test, I am just writing and reading data.

There is a lot of GC, but only on the younger generation. Cassandra terminates 
before the GC for old generation kicks in.

I know that our queries are reading an unusual amount of data. However, I 
expected it to throw a timeout exception instead of crashing. Also, I don't 
understand why the 1.8 GB heap is getting full when the total data stored in the 
entire Cassandra cluster is less than 55 MB.

Mohammed
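(The auto-paginate suggestion quoted below boils down to bounding how many columns are in memory at once instead of materializing a whole wide row; a driver-agnostic sketch of the idea — `fetch_page` is an invented stand-in, not a real Astyanax API:)

```python
def paged_read(fetch_page, page_size=1000):
    """Yield a wide row's columns one page at a time, so at most
    `page_size` columns are fetched and held per round trip."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # short page means we reached the end of the row
        start += len(page)

# Simulate a wide row of 260,000 columns, as in the cluster described here:
row = list(range(260000))
def fetch_page(start, limit):
    return row[start:start + limit]

cols = list(paged_read(fetch_page, page_size=1000))
assert cols == row  # same data, never more than 1,000 columns per fetch
```

A real driver pages by column name rather than numeric offset, but the memory bound is the point.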

On Jun 21, 2013, at 7:30 PM, "sankalp kohli" <kohlisank...@gmail.com> wrote:

Looks like you are putting a lot of pressure on the heap by doing a slice query 
on a large row.
Do you have a lot of deletes/tombstones on the rows? That might be causing a 
problem.
Also, why are you returning so many columns at once? You can use the auto-paginate 
feature in Astyanax.

Also do you see lot of GC happening?


On Fri, Jun 21, 2013 at 1:13 PM, Jabbar Azam <aja...@gmail.com> wrote:
Hello Mohammed,

You should increase the heap space. You should also tune the garbage collection 
so young generation objects are collected faster, relieving pressure on the heap. 
We have been using JDK 7 and it uses G1 as the default collector. It does a better 
job than me trying to optimise the JDK 6 GC collectors.

Bear in mind though that the OS will need memory, as will the row cache and the 
file system cache, although memory usage will depend on the workload of your system.

I'm sure you'll also get good advice from other members of the mailing list.

Thanks

Jabbar Azam


On 21 June 2013 18:49, Mohammed Guller <moham...@glassbeam.com> wrote:
We have a 3-node cassandra cluster on AWS. These nodes are running cassandra 
1.2.2 and have 8GB memory. We didn't change any of the default heap or GC 
settings. So each node is allocating 1.8GB of heap space. The rows are wide; 
each row stores around 260,000 columns. We are reading the data using Astyanax. 
If our application tries to read 80,000 columns each from 10 or more rows at 
the same time, some of the nodes run out of heap space and terminate with OOM 
error. Here is the error message:

java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:50)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.split(AbstractCompositeType.java:126)
at 
org.apache.cassandra.db.filter.ColumnCounter$GroupByPrefix.count(ColumnCounter.java:96)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:164)
at 
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
at 
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1363)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1220)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1132)
at org.apache.cassandra.db.Table.getRow(Table.java:355)
at 
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
   at 
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

ERROR 02:14:05,351 Exception in thread Thread[Thrift:6,5,main]
java.lang.OutOfMemoryError: Java heap space
at java.lang.Long.toString(Long.java:269)
at java.lang.Long.toString(Long.java:764)
at 
org.apache.cassandra.dht.Murmur3Partitioner$1.toString(Murmur3Partitioner.java:171)
at 
org.apache.cassandra.service.StorageService.describeRing(StorageService.java:1068)
at 
org.apache.cassandra.thrift.CassandraServer.describe_ring(CassandraServer.java:1192)
at 
org.apache.cassandra.thrift.Cassandra$Processor$describe_ring.getResult(Cassandra.java:3766)
at 
org.apache.cassandra.thrift.Cassandra$Processor$describe_ring.getResult(Cassandra.java:3754)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(Cus

Counter value becomes incorrect after several dozen reads & writes

2013-06-24 Thread Josh Dzielak
I have a loop that reads a counter, increments it by some integer, then goes 
off and does about 500ms of other work. After about 10 iterations of this loop, 
the counter value *sometimes* appears to be corrupted.

Looking at the logs, a sequence that just happened is:

Read counter - 15000
Increase counter by - 353
Read counter - 15353
Increase counter by - 1067
Read counter - 286079 (the new counter value is *very* different than what the 
increase should have produced, but usually, suspiciously, around 280k)
Increase counter by - 875
Read counter - 286079  (the counter stops changing at a certain point)


There is only 1 thread running this sequence, and consistency levels are set to 
ALL. The behavior is fairly repeatable - the unexpected mutation will happen 
at least 10% of the time I run this program, but at different points. When it 
does not go awry, I can run this loop many thousands of times and keep the 
counter exact. But if it starts happening to a specific counter, the counter 
will never "recover" and will continue to maintain its incorrect value even 
after successful subsequent writes.

I'm using the latest Astyanax driver on Cassandra 1.2.3 in a 3-node test 
cluster. It's also happened in development. Has anyone seen something like 
this? It feels almost too strange to be an actual bug but I'm stumped and have 
been looking at it too long :)

Thanks,
Josh

--
Josh Dzielak 
VP Engineering • Keen IO
Twitter • @dzello (https://twitter.com/dzello)
Mobile • 773-540-5264
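One counter property worth ruling out here (whether or not it explains the jump above): counter increments are not idempotent, so a retry of a timed-out increment that actually landed is applied again rather than deduplicated. A toy illustration:

```python
def apply_increment(counter, delta, extra_deliveries=0):
    """Counter increments carry a delta, not a target value, so every
    delivery of the same mutation adds the delta again."""
    return counter + delta * (1 + extra_deliveries)

c = apply_increment(15000, 353)  # clean write
assert c == 15353
# The same +1067 delivered twice (e.g. a client retry after a timeout
# on a write that actually succeeded) silently over-counts:
assert apply_increment(c, 1067, extra_deliveries=1) == 15353 + 2 * 1067
```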



Re: CAS and long lived locks

2013-06-24 Thread Blair Zajac
I normally have migrations run at server startup and, depending upon the 
complexity, they could run for a while if they need to do per-row data 
transformations.  I don't get the point regarding collisions; somebody 
is going to be locked out for a while, so getting the lock for a short 
period and renewing it is the same???


I used the Astyanax client for a bit, but the locking recipes didn't work 
with CQL3 non-compact tables when I last tried, 2 months ago.


The other advantage of using CAS compared with Astyanax's lock is that 
the write/read operations that Astyanax does are all done server side in 
CAS, plus it avoids the issue of the hinted writes that can cause "lost" 
locks.


Blair

On 06/24/2013 01:35 PM, sankalp kohli wrote:

Assuming that database migration is a one time and rare operation, why
don't you try to grab a lock for a short time. If you are able to grab
it, then you can renew it for a longer time. This will make sure that in
case of collision, all contenders won't be locked out for a long time.
You can use Netflix client recipe for locks.


On Sat, Jun 22, 2013 at 3:09 PM, Blair Zajac <bl...@orcaware.com> wrote:

Looking at the Cassandra 13 keynote [1], slide 56 regarding hinted
writes causing the lock to be taken even though the client thinks
the lock attempt failed, which the new CAS support fixes.

I have some database migrations to run on Cassandra, so I still need
a long lived lock somewhere to prevent two or more migrations
running concurrently, so CAS doesn't directly solve this problem.

It sounds like I could have a BOOLEAN column named "lock" but use
CAS to update it from a false or NULL value to true, and this avoids
the problem of hinted updates.  The finally block would
reset it to false or NULL.  This would be a simpler implementation
than using the wait chain algorithm.

Any problems with this?

Blair

[1]
http://www.slideshare.net/jbellis/cassandra-summit-2013-keynote

[2]
http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf







Re: CAS and long lived locks

2013-06-24 Thread Blair Zajac
Our product is in development now, so we don't plan on going into 
production until 2.0.0 is out.


Blair

On 06/24/2013 01:36 PM, sankalp kohli wrote:

Also CAS is in 2.0 which is not production ready so I am not sure how
you will use it.


On Mon, Jun 24, 2013 at 4:35 PM, sankalp kohli <kohlisank...@gmail.com> wrote:

Assuming that database migration is a one time and rare operation,
why don't you try to grab a lock for a short time. If you are able
to grab it, then you can renew it for a longer time. This will make
sure that in case of collision, all contenders won't be locked out
for a long time.
You can use Netflix client recipe for locks.


On Sat, Jun 22, 2013 at 3:09 PM, Blair Zajac <bl...@orcaware.com> wrote:

Looking at the Cassandra 13 keynote [1], slide 56 regarding
hinted writes causing the lock to be taken even though the
client thinks the lock attempt failed, which the new CAS support
fixes.

I have some database migrations to run on Cassandra, so I still
need a long lived lock somewhere to prevent two or more
migrations running concurrently, so CAS doesn't directly solve
this problem.

It sounds like I could have a BOOLEAN column named "lock" but
use CAS to update it from a false or NULL value to true, and this
avoids the problem of hinted updates.  The finally block
would reset it to false or NULL.  This would be a simpler
implementation than using the wait chain algorithm.

Any problems with this?

Blair

[1]
http://www.slideshare.net/jbellis/cassandra-summit-2013-keynote

[2]

http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf









Re: Counter value becomes incorrect after several dozen reads & writes

2013-06-24 Thread Arthur Zubarev
Hi Josh,

are you looking at the read counter produced by cfstats?

If so, it is not for a CF but for the entire KS, and it is not tied to a specific 
operation but rather accumulates over the entire lifetime of the JVM.

Just in case, some supporting info: 
http://stackoverflow.com/questions/9431590/cassandra-cfstats-and-meaning-of-read-write-latency

/Arthur

From: Josh Dzielak 
Sent: Monday, June 24, 2013 9:42 PM
To: user@cassandra.apache.org 
Subject: Counter value becomes incorrect after several dozen reads & writes

I have a loop that reads a counter, increments it by some integer, then goes 
off and does about 500ms of other work. After about 10 iterations of this loop, 
the counter value *sometimes* appears to be corrupted.

Looking at the logs, a sequence that just happened is:

Read counter - 15000
Increase counter by - 353
Read counter - 15353
Increase counter by - 1067
Read counter - 286079 (the new counter value is *very* different than what the 
increase should have produced, but usually, suspiciously, around 280k)
Increase counter by - 875
Read counter - 286079  (the counter stops changing at a certain point)

There is only 1 thread running this sequence, and consistency levels are set to 
ALL. The behavior is fairly repeatable - the unexpected mutation will happen 
at least 10% of the time I run this program, but at different points. When it 
does not go awry, I can run this loop many thousands of times and keep the 
counter exact. But if it starts happening to a specific counter, the counter 
will never "recover" and will continue to maintain its incorrect value even 
after successful subsequent writes.

I'm using the latest Astyanax driver on Cassandra 1.2.3 in a 3-node test 
cluster. It's also happened in development. Has anyone seen something like 
this? It feels almost too strange to be an actual bug but I'm stumped and have 
been looking at it too long :)

Thanks,
Josh

--
Josh Dzielak
VP Engineering • Keen IO
Twitter • @dzello
Mobile • 773-540-5264


How to do a CAS UPDATE on single column CF?

2013-06-24 Thread Blair Zajac

How does one do an atomic update in a column family with a single column?

I have a this CF

  CREATE TABLE schema_migrations (
version TEXT PRIMARY KEY,
  ) WITH COMPACTION = {'class': 'LeveledCompactionStrategy'};

that records which database migrations have been applied.  I want to do 
a CAS UPDATE to add a dummy lock token to prevent multiple migrations 
from running, but these three attempts fail (using the Python cql client):



>>> cursor.execute("UPDATE schema_migrations SET version = 'locked' 
WHERE version = 'locked' IF NOT EXISTS")
cql.apivalues.ProgrammingError: Bad Request: PRIMARY KEY part version 
found in SET part



>>> cursor.execute("UPDATE schema_migrations SET WHERE version = 
'locked' IF NOT EXISTS")
cql.apivalues.ProgrammingError: Bad Request: line 1:29 no viable 
alternative at input 'WHERE'




>>> cursor.execute("INSERT INTO schema_migrations (version) VALUES 
('locked') IF NOT EXISTS")

cql.apivalues.ProgrammingError: Bad Request: line 1:58 missing EOF at 'IF'

Thanks,
Blair


Mixing CAS UPDATE and non-CAS DELETE

2013-06-24 Thread Blair Zajac

Looking at the CAS unit tests [1], if one does a CAS UPDATE to create a ROW:

  UPDATE test SET v1 = 2, v2 = 'foo' WHERE k = 0 IF NOT EXISTS

there isn't a CAS DELETE FROM that only uses the partition key.  You can 
do this to delete the row using CAS:


  DELETE FROM test WHERE k = 0 IF v1 = null

But if I want to delete it regardless of v1, then this doesn't work:

  DELETE FROM test WHERE k = 0 IF EXISTS

So one is left to

  DELETE FROM test WHERE k = 0

How does this non-CAS DELETE mix with a CAS UPDATE for the same 
partition key?  Will they properly not step over each other?


Thanks,
Blair

[1] 
https://github.com/riptano/cassandra-dtest/blob/master/cql_tests.py#L3044


Re: How to do a CAS UPDATE on single column CF?

2013-06-24 Thread Arthur Zubarev

On 06/24/2013 11:23 PM, Blair Zajac wrote:

CAS UPDATE

Since when does C* have IF NOT EXISTS in the DML part of CQL?

--

Regards,

Arthur



copy data between clusters

2013-06-24 Thread S C
I have a scenario here. I have a cluster A and a cluster B, both running Cassandra 
1.1. I need to copy data from Cluster A to Cluster B. Cluster A has a few 
keyspaces that I need to copy over to Cluster B. What are my options?

Thanks,
SC

Re: copy data between clusters

2013-06-24 Thread Arthur Zubarev

On 06/24/2013 11:35 PM, S C wrote:
I have a scenario here. I have a cluster A and cluster B running on 
cassandra 1.1. I need to copy data from Cluster A to Cluster B. 
Cluster A has few keyspaces that I need to copy over to Cluster B. 
What are my options?


Thanks,
SC
I am thinking of sstableloader (bulk loading), which would stream your KS 
sstables to the target cluster nodes.


--

Regards,

Arthur



Re: NREL has released open source Databus on github for time series data

2013-06-24 Thread aaron morton
Hi Dean, 
Does this handle rollup aggregates along with the time series data? 
I had a quick look at the links and could not see anything. 

Cheers
Aaron
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 22/06/2013, at 2:51 AM, "Hiller, Dean"  wrote:

> NREL has released their open source databus.  They spin it as energy data 
> (and a system for campus energy/building energy) but it is very general right 
> now and probably will stay pretty general.  More information can be found here
> 
> http://www.nrel.gov/analysis/databus/
> 
> The source code can be found here
> https://github.com/deanhiller/databus
> 
> Star the project if you like the idea.  NREL just did a big press release and 
> is developing a community around the project.  It is in its early stages but 
> there are users using it and I am helping HP set an instance up this month.  
> If you want to become a committer on the project, let me know as well.
> 
> Later,
> Dean
> 



Re: [Cassandra] Replacing a cassandra node with one of the same IP

2013-06-24 Thread aaron morton
> so I am just wondering if this means the hinted handoffs are also updated to 
> reflect the new Cassandra node uuid. 
Without checking the code I would guess not, because it would involve a 
potentially large read / write / delete to create a new row with the same data, 
and Hinted Handoff is an optimisation. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 22/06/2013, at 9:52 AM, "Mahony, Robin"  wrote:

> Please note that I am currently using version 1.2.2 of Cassandra.  Also we 
> are using virtual nodes.
>  
> My question mainly stems from the fact that the nodes appear to be aware that 
> the node uuid changes for the IP (from reading the logs), so I am just 
> wondering if this means the hinted handoffs are also updated to reflect the 
> new Cassandra node uuid. If that was the case, I would not think a nodetool 
> cleanup would be necessary.
>  
> - Forwarded Message -
> From: Robert Coli 
> To: user@cassandra.apache.org; Emalayan Vairavanathan  
> Sent: Thursday, 20 June 2013 11:40 AM
> Subject: Re: [Cassandra] Replacing a cassandra node
> 
> On Thu, Jun 20, 2013 at 10:40 AM, Emalayan Vairavanathan
>  wrote:
> > In the case where replace a cassandra node (call it node A) with another one
> > that has the exact same IP (ie. during a node failure), what exactly should
> > we do?  Currently I understand that we should at least run "nodetool
> > repair".
> 
> If you lost the data from the node, then what you want is "replace_token."
> 
> If you didn't lose the data from the node (and can tolerate stale
> reads until the repair completes) you want to start the node with
> auto_bootstrap set to false and then repair.
> 
> =Rob



Re: Counter value becomes incorrect after several dozen reads & writes

2013-06-24 Thread Josh Dzielak
Hi Arthur,  

This is actually for a column in a counter column family, i.e. 
CounterColumnType. Will check out that thread though, thanks.

Best,
Josh

--
Josh Dzielak 
VP Engineering • Keen IO
Twitter • @dzello (https://twitter.com/dzello)
Mobile • 773-540-5264


On Monday, June 24, 2013 at 8:01 PM, Arthur Zubarev wrote:

> Hi Josh,
>   
> are you looking at the read counter produced by cfstats?
>   
> If so, it is not for a CF but for the entire KS, and it is not tied to a specific 
> operation but rather accumulates over the entire lifetime of the JVM.
>   
> Just in case, some supporting info: 
> http://stackoverflow.com/questions/9431590/cassandra-cfstats-and-meaning-of-read-write-latency
>   
> /Arthur
>   
> From: Josh Dzielak (mailto:j...@keen.io)  
> Sent: Monday, June 24, 2013 9:42 PM
> To: user@cassandra.apache.org (mailto:user@cassandra.apache.org)  
> Subject: Counter value becomes incorrect after several dozen reads & writes
>  
>  
>   
>  
> I have a loop that reads a counter, increments it by some integer, then goes 
> off and does about 500ms of other work. After about 10 iterations of this 
> loop, the counter value *sometimes* appears to be corrupted.
>   
> Looking at the logs, a sequence that just happened is:
>   
> Read counter - 15000
> Increase counter by - 353
> Read counter - 15353
> Increase counter by - 1067
> Read counter - 286079 (the new counter value is *very* different than what 
> the increase should have produced, but usually, suspiciously, around 280k)
> Increase counter by - 875
> Read counter - 286079  (the counter stops changing at a certain point)
>  
>   
> There is only 1 thread running this sequence, and consistency levels are set 
> to ALL. The behavior is fairly repeatable - the unexpected mutation will 
> happen at least 10% of the time I run this program, but at different points. 
> When it does not go awry, I can run this loop many thousands of times and 
> keep the counter exact. But if it starts happening to a specific counter, the 
> counter will never "recover" and will continue to maintain its incorrect 
> value even after successful subsequent writes.
>   
> I'm using the latest Astyanax driver on Cassandra 1.2.3 in a 3-node test 
> cluster. It's also happened in development. Has anyone seen something like 
> this? It feels almost too strange to be an actual bug but I'm stumped and 
> have been looking at it too long :)
>   
> Thanks,
> Josh
>   
> --
> Josh Dzielak 
> VP Engineering • Keen IO
> Twitter • @dzello (https://twitter.com/dzello)
> Mobile • 773-540-5264
>   
>  
>  
>  
>  
>  
>  




Problems with node rejoining cluster

2013-06-24 Thread Arindam Barua

We need to do a rolling upgrade of our Cassandra cluster in production, since 
we are upgrading Cassandra on solaris to Cassandra on CentOS.
(We went with solaris initially since most of our other hosts in production are 
solaris, but were running into some lockup issues during perf tests, and 
decided to switch to linux)

Here are the steps we are following to take the node out of service and get it 
back. Can someone comment if we are missing anything? (E.g., is it recommended to 
specify tokens in cassandra.yaml, or to do something different with the seed hosts 
than mentioned below?)

1.   nodetool decommission - wait for the data to be streamed out.

2.   Re-image (everything is wiped off the disks) the host to CentOS, with 
the same Cassandra version

3.   Get Cassandra back up.

Other details:

-  Using Cassandra 1.1.5

-  We do not specify any tokens in cassandra.yaml relying on bootstrap 
assigning the tokens automatically.

-  We are testing with a 4 node cluster, with only one seed host. The 
seed host is specified in the cassandra.yaml of each node and is not changed at 
any point.

While testing the Solaris-to-Linux upgrade path, things seem to work smoothly. 
The data streams out fine, and streams back in when the node comes back up. 
However, when testing the Linux-to-Solaris path (in case we need to roll back), we 
are facing some issues with the nodes rejoining the ring. nodetool indicates 
that the node has joined the ring, but no data streams in, the node 
doesn't know about the keyspaces/column families, etc. We see some errors in 
the logs of the newly added nodes, pasted below.

[17/06/2013:14:10:17 PDT] MutationStage:1: ERROR RowMutationVerbHandler.java 
(line 61) Error in row mutation
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=1020
at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)
at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)
at 
org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

Thanks,
Arindam


Re: How to do a CAS UPDATE on single column CF?

2013-06-24 Thread Blair Zajac

On 06/24/2013 08:35 PM, Arthur Zubarev wrote:

On 06/24/2013 11:23 PM, Blair Zajac wrote:

CAS UPDATE

Since when C* has IF NOT EXISTS in DML part of CQL?


It's new in 2.0.

https://issues.apache.org/jira/browse/CASSANDRA-5062
https://github.com/riptano/cassandra-dtest/blob/master/cql_tests.py#L3044

Blair