Re: EC2 node adding trouble

2011-05-25 Thread Sasha Dolgy
All of the nodes are 0.8.x and no longer 0.7.2 ? It is peculiar. As another person noted, maybe reviewing the results of netstat -an on each node could help. I've had issues in the past with JMX -- thinking I had configured it for IP x.x.x.x and went crazy only to see that it was never configure

How to programmatically index an existed column?

2011-05-25 Thread Dikang Gu
I want to build a secondary index on an existing column. How can I do this programmatically using the Hector API? Thanks. -- Dikang Gu 0086 - 18611140205

PHP CQL Driver

2011-05-25 Thread Kwasi Gyasi - Agyei
Hi, I have managed to generate the Thrift interface for PHP, along with implementing auto-loading of both the Cassandra and Thrift core classes. However, during my testing the only query that works as expected is the create keyspace CQL query... all other queries don't do anything or return any results, nor do they thr

Re: Re: nodetool move trying to stream data to node no longer in cluster

2011-05-25 Thread jonathan . colby
Seems like it had something to do with stale endpoint information. I did a rolling restart of the whole cluster and that seemed to trigger the nodes to remove the node that was decommissioned. On , aaron morton wrote: Is it showing progress ? It may just be a problem with the information p

Re: "range query" vs "slice range query"

2011-05-25 Thread david lee
so, that was actually simpler than i thought ay? cheers guys~ On 26 May 2011 05:38, Roland Gude wrote: > That is correct. Random partitioner orders rows according to the MD5 sum. > > On 25.05.2011 at 16:11, "Robert Jackson" wrote: > > Also, it is my understanding that if you are not using >

Re: How to make use of Cassandra raw row keys?

2011-05-25 Thread Robert Jackson
If you are using 0.10 or 0.11 of the cassandra gem you will only get rows back that have values (columns). This is due to the way Cassandra handles deleted rows by adding a tombstone. So if you delete a row (or delete all the columns in a row) the gem will remove that particular row from the hash

Re: Measure Latency

2011-05-25 Thread aaron morton
I can get down and dirty in the SQL Server world :) Best analogy is that cassandra is doing "high safety mode synchronous mirroring" http://msdn.microsoft.com/en-us/library/ms189852.aspx The replicating is effectively inside the Transaction commit, cassandra does not have transactions but you g

Re: How to make use of Cassandra raw row keys?

2011-05-25 Thread aaron morton
Hard to say exactly what the issue is. Are they connected to the same node and using the same Consistency Level? Try turning the logging up to DEBUG to see whether they are issuing the same query. Hope that helps. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thel

Re: EC2 node adding trouble

2011-05-25 Thread aaron morton
I've seen discussion of using the EIP but I do not have direct experience. Others may be able to provide more help. Previous discussion http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Does-anybody-have-experience-with-running-Cassandra-in-Amazon-s-Virtual-Private-Cloud-VPC-td6

Re: nodetool move trying to stream data to node no longer in cluster

2011-05-25 Thread aaron morton
Is it showing progress ? It may just be a problem with the information printed out. Can you check from the other nodes in the cluster to see if they are receiving the stream ? cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 26

Re: "range query" vs "slice range query"

2011-05-25 Thread Roland Gude
That is correct. Random partitioner orders rows according to the MD5 sum. On 25.05.2011 at 16:11, "Robert Jackson" <robe...@promedicalinc.com> wrote: Also, it is my understanding that if you are not using OrderPreservingPartitioner a get_range_slices may not return what you would expec
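
The ordering described above can be illustrated with a small sketch. It is a simplified model, not Cassandra's actual token code: the helper names `md5_token` and `partitioner_order` are hypothetical, and the token here is just the MD5 digest read as an unsigned integer. The point is only that hash order has no relation to the lexical order of the row keys.

```python
import hashlib


def md5_token(row_key: bytes) -> int:
    """Illustrative token: the MD5 digest of the key read as a big-endian integer.

    Assumption: this is a simplification for the example, not the exact
    derivation used inside Cassandra's RandomPartitioner.
    """
    return int.from_bytes(hashlib.md5(row_key).digest(), "big")


def partitioner_order(row_keys):
    """Return the keys in the order a hash-ordered range scan would visit them."""
    return sorted(row_keys, key=md5_token)


keys = [b"alice", b"bob", b"carol", b"dave"]
hash_order = partitioner_order(keys)

for key in hash_order:
    print(key.decode(), hex(md5_token(key)))
```

A range scan over hash-ordered rows therefore returns every row exactly once, but a key range such as "a" to "c" does not select the keys that sort between "a" and "c".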

Re: Priority queue in a single row - performance falls over time

2011-05-25 Thread Dan Kuebrich
It sounds like the problem is that the row is getting filled up with tombstones and becoming enormous? Another idea then, which might not be worth the added complexity, is to progressively use new rows. Depending on volume, this could mean having 5-minute-window rows, or 1 minute, or whatever wor
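
The time-window idea above might be sketched as follows. This is a minimal illustration under stated assumptions: the key format `queue:window_start` and the helper names `window_row_key` and `window_keys_between` are hypothetical, and no Cassandra client calls are shown. Each item is written to the row for its time window, so an old row and its tombstones can be dropped as a whole once its window has been fully processed.

```python
def window_row_key(queue_name: str, timestamp_s: float, window_s: int = 300) -> str:
    """Return the row key for the time window that contains timestamp_s.

    window_s is the window length in seconds (300 = 5-minute rows).
    """
    window_start = int(timestamp_s // window_s) * window_s
    return f"{queue_name}:{window_start}"


def window_keys_between(queue_name: str, start_s: float, end_s: float,
                        window_s: int = 300) -> list:
    """Return the row keys of every window that overlaps [start_s, end_s]."""
    first = int(start_s // window_s) * window_s
    last = int(end_s // window_s) * window_s
    return [f"{queue_name}:{w}" for w in range(first, last + window_s, window_s)]


print(window_row_key("jobs", 1000.0))              # row for the window starting at 900
print(window_keys_between("jobs", 1000.0, 1600.0))  # rows a consumer would read
```

A consumer reads the windows in time order and moves on once a window is empty, so reads stop touching rows that consist mostly of deleted columns.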

Re: Priority queue in a single row - performance falls over time

2011-05-25 Thread Jonathan Ellis
You're basically intentionally inflicting the worst case scenario on the Cassandra storage engine: http://wiki.apache.org/cassandra/DistributedDeletes You could play around with reducing gc_grace_seconds but a PQ with "millions" of items is something you should probably just do in memory these day

RE: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Or Offer
Hi everyone, we are looking for help with upgrading our Cassandra from 0.6 to 0.7.2 here in Israel. If there is anyone here who can help out with consulting, please email me. Thanks, Or Offer SimilarGroup or.of...@similargroup.com

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Daniel Doubleday
Firstly any ideas for a quick fix because this is giving me big production problems. Write/read with QUORUM is reportedly producing unpredictable results (people have called support regarding monsters in my MMO appearing and disappearing magically) and many operations are just failing with

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Dominic Williams
Links for issue causing this: http://issues.apache.org/jira/browse/CASSANDRA-2670 http://issues.apache.org/jira/browse/CASSANDRA-2280 For anyone in this boat, my advice is:- 1. Do a rolling restart immediately, starting with the node you were running repair on. If you don't do this, the other nod

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Dominic Williams
Jeepers creepers that's it Jeeves!!! Argh. Basically once my repair hit a big column family db size exploded until the node ran out of disk space.. Firstly any ideas for a quick fix because this is giving me big production problems. Write/read with QUORUM is reportedly producing unpredict

Re: Measure Latency

2011-05-25 Thread Stephan P
Thanks for the reply. I agree about the eventual consistency feature, but in "addition" I'm trying to get an overview of whether there is slowness between nodes in a cluster, meaning sometimes "some" network slowness won't translate into CA slowness and sometimes it will. I guess I'm looking for a process

Re: Inconsistent results using secondary indexes between two DC

2011-05-25 Thread Wojciech Pietrzok
2011/5/23 Jonathan Ellis : >> It was installed as 0.7.2 and upgraded with each new official release. > > I bet that's the problem, then. > https://issues.apache.org/jira/browse/CASSANDRA-2244 could cause > indexes to not be updated for releases < 0.7.4.  You'll want to > rebuild the index. > >> By

Re: EC2 node adding trouble

2011-05-25 Thread mcasandra
Can you post the output of "netstat -anp|grep "LISTEN"|grep java" from all 3 nodes? Also compare the second node's yaml with the new node's yaml and see what differences you find, if any. Another thing: try telnet tests from the seed node to the new node. -- View this message in context: http://cassandra-user-

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Daniel Doubleday
We are having problems with repair too. It sounds like yours are the same. From today: http://permalink.gmane.org/gmane.comp.db.cassandra.user/16619 On May 25, 2011, at 4:52 PM, Dominic Williams wrote: > Hi, > > I've got a strange problem, where the database on a node has inflated 10X > after

Priority queue in a single row - performance falls over time

2011-05-25 Thread dnallsopp
Hi all, I'm trying to implement a priority queue for holding a large number (millions) of items that need to be processed in time order. My solution works - but gets slower and slower until performance is unacceptable - even with a small number of items. Each item essentially needs to be popped

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread jonathan . colby
I'm not sure if this is the absolute best advice, but perhaps running "clean" on the data will help cleanup any data that isn't assigned to this token - in case you've moved the cluster around before. Any exceptions in the logs, eg EOF ? I experienced this and it caused the repairs to trip

Re: Recommandation on how to organize CF

2011-05-25 Thread openvictor Open
Thanks Aaron, Sorry I didn't see your message sooner. So the CF Messages using UTF8Type holds the information such as : who has the right to read/ is it possible to answer to this list etc... There are two "kinds" of keys. The keys which begin by : "message:uuid" and the "messagelist:uuid". A co

Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Dominic Williams
Hi, I've got a strange problem, where the database on a node has inflated 10X after running repair. This is not the result of receiving missed data. I didn't perform repair within my usual 10 day cycle, so followed recommended practice: http://wiki.apache.org/cassandra/Operations#Dealing_with_the

Re: How to make use of Cassandra raw row keys?

2011-05-25 Thread Suan Aik Yeo
Thanks, that definitely helped. Any idea why my client is showing far fewer existing rows than cassandra-cli though? I'm using the Ruby Cassandra client, and when I get all the rows for the "Sessions" cf, I get 8 rows returned. However, when I do "list Sessions" in the cassandra-cli I get 40 rows r

Re: "range query" vs "slice range query"

2011-05-25 Thread Robert Jackson
Also, it is my understanding that if you are not using OrderPreservingPartitioner a get_range_slices may not return what you would expect. With the RandomPartitioner you can iterate over the complete list by using the last row key as the start for subsequent requests, but if you are using a
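
The paging pattern mentioned above (reuse the last row key as the start of the next request) can be sketched against an in-memory stand-in. Everything here is hypothetical illustration: `FakeStore`, `range_page` and `iterate_all` are not part of any Cassandra client, and the model assumes the start key is inclusive, which is why each later page drops its first row.

```python
import hashlib


def token(key: str) -> int:
    """Illustrative hash token used to order the rows of the fake store."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")


class FakeStore:
    """In-memory stand-in for a column family whose rows are kept in hash order."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._keys = sorted(self._rows, key=token)

    def range_page(self, start_key: str, count: int):
        """Return up to count (key, columns) pairs, starting at start_key inclusive.

        An empty start_key means "from the beginning of the ring".
        """
        start = 0 if start_key == "" else self._keys.index(start_key)
        return [(k, self._rows[k]) for k in self._keys[start:start + count]]


def iterate_all(store, page_size: int = 3):
    """Yield every row once by paging, using the last key seen as the next start."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2 because the start key is repeated")
    start, first_page = "", True
    while True:
        page = store.range_page(start, page_size)
        # Later pages begin with the row already returned; skip it.
        for item in (page if first_page else page[1:]):
            yield item
        if len(page) < page_size:
            return
        start, first_page = page[-1][0], False


rows = {f"user{i}": {"n": i} for i in range(10)}
all_rows = list(iterate_all(FakeStore(rows), page_size=3))
print(len(all_rows))
```

The rows come back in hash order rather than key order, but every row is visited exactly once.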

Re: "range query" vs "slice range query"

2011-05-25 Thread Roland Gude
I cannot display the book page you are referring to, but your general understanding is correct. A range refers to several rows, a slice refers to several columns. A RangeSlice is a combination of both: from all rows in a range, get a specific slice of columns. On 25.05.2011 at 10:43, "dav

Re: "range query" vs "slice range query"

2011-05-25 Thread Jonathan Ellis
get_range_slices is the api to get a slice (of columns) from each of a range (of rows) On Wed, May 25, 2011 at 3:42 AM, david lee wrote: > hi guys, > i'm reading up on the book "Cassandra - Definitive guide" > and i don't seem to understand what it says about "ranges and slices" > my understandin
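
As a rough model of the definition above, the sketch below applies a column slice to each row in a row range. It is plain Python over dictionaries, not the Thrift API: `column_slice` and `range_slice_model` are hypothetical names, both bounds are treated as inclusive, and the row order is passed in explicitly because it depends on the partitioner.

```python
def column_slice(columns: dict, col_start: str, col_finish: str) -> dict:
    """Slice: the columns of ONE row whose names fall in [col_start, col_finish]."""
    return {name: value
            for name, value in sorted(columns.items())
            if col_start <= name <= col_finish}


def range_slice_model(rows: dict, ordered_keys: list,
                      key_start: str, key_end: str,
                      col_start: str, col_finish: str) -> dict:
    """Range slice: the same column slice taken from each row in a range of rows.

    ordered_keys is the row order defined by the partitioner (an assumption
    of this model; it is supplied by the caller).
    """
    first = ordered_keys.index(key_start)
    last = ordered_keys.index(key_end)
    return {key: column_slice(rows[key], col_start, col_finish)
            for key in ordered_keys[first:last + 1]}


rows = {
    "a": {"c1": 1, "c2": 2, "c3": 3},
    "b": {"c2": 20, "c4": 40},
    "c": {"c1": 100, "c3": 300},
}
result = range_slice_model(rows, ["a", "b", "c"], "a", "b", "c2", "c3")
print(result)
```

Row "c" is outside the row range, and columns "c1" and "c4" are outside the column slice, so neither appears in the result.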

nodetool move trying to stream data to node no longer in cluster

2011-05-25 Thread Jonathan Colby
I recently removed a node (with decommission) from our cluster. I added a couple new nodes and am now trying to rebalance the cluster using nodetool move. However, netstats shows that the node being "moved" is trying to stream data to the node that I already decommissioned yesterday. The remo

Re: Quorum + Token range confusion

2011-05-25 Thread Timo Nentwig
On 5/25/11 14:08, Timo Nentwig wrote: On 5/25/11 13:45, Watanabe Maki wrote: I think I don't get your situation yet, but if you use RF=2, CL=QUORUM is identical with CL=ALL. Does it explain your experience? If it was CL=ALL, it would explain it, however it does not explain why it works when I

Re: Quorum + Token range confusion

2011-05-25 Thread Timo Nentwig
On 5/25/11 13:45, Watanabe Maki wrote: I think I don't get your situation yet, but if you use RF=2, CL=QUORUM is identical with CL=ALL. Does it explain your experience? If it was CL=ALL, it would explain it, however it does not explain why it works when I decommission one node. RF=2 means that

Re: Quorum + Token range confusion

2011-05-25 Thread Watanabe Maki
I think I don't get your situation yet, but if you use RF=2, CL=QUORUM is identical with CL=ALL. Does it explain your experience? maki On 2011/05/25, at 19:39, Timo Nentwig wrote: > Hi! > > 5 nodes, replication factor of 2, fifth node down. > > As long as I write a single column with hector
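
The arithmetic behind the RF=2 remark can be written out in a few lines. This is a small sketch with hypothetical helper names; it uses the usual quorum rule of a strict majority of replicas, floor(RF / 2) + 1.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at CL=QUORUM: a strict majority."""
    return replication_factor // 2 + 1


def tolerated_down_replicas(replication_factor: int) -> int:
    """How many replicas of a row may be down while QUORUM can still be met."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated down={tolerated_down_replicas(rf)}")
```

With RF=2 the quorum is 2, which is every replica, so a QUORUM request for any row that has a replica on the down node cannot succeed. With RF=3 the quorum is still 2, which leaves room for one replica to be down.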

Quorum + Token range confusion

2011-05-25 Thread Timo Nentwig
Hi! 5 nodes, replication factor of 2, fifth node down. As long as I write a single column with hector or pelops, it works. With 2 columns it fails because there are supposedly too few servers to reach quorum. Confusing. If I decommission the fifth node with nodetool, quorum works again and I can s

"range query" vs "slice range query"

2011-05-25 Thread david lee
hi guys, i'm reading up on the book "Cassandra - Definitive guide" and i don't seem to understand what it says about "ranges and slices" my understanding is a range as in "a mathematical range to define a subset from an ordered set of elements", in cassandra typically means a range of rows wherea

Re: repair question

2011-05-25 Thread Daniel Doubleday
Ok - obviously these haven't been my brightest days. The stream request sent to the neighbors doesn't contain the CF for which the ranges have been determined to mismatch. So every diff in every CF will result in getting that range from every CF of the neighbor. That explains everything. So I