Re: RF=1

2011-08-17 Thread Patrik Modesto
On Wed, Aug 17, 2011 at 17:08, Jonathan Ellis wrote: > See https://issues.apache.org/jira/browse/CASSANDRA-2388 Ok, thanks for the JIRA ticket. I've found that very same problem during my work on ignoring unavailable ranges. But there is another problem with Hadoop-Cassandra, if there is no

Re: Could Not connect to cassandra-cli on windows

2011-08-17 Thread Alaa Zubaidi
Hi Aaron, Thanks for the reply. I am running 0.7.4 and NO client. The error was reported by the application where it fails to connect and it happens that 2 threads are trying to connect at the same time. and when I checked the cassandra log I found these errors?? Thanks Alaa On 8/17/2011 4:29

Re: node restart taking too long

2011-08-17 Thread Boris Yen
Because the file only preserves the "key" of records, not the whole record. Records for those saved keys will be loaded into cassandra during the startup of cassandra. On Wed, Aug 17, 2011 at 5:52 PM, Yan Chunlu wrote: > but the data size in the saved_cache are relatively small: > > will that caus

Re: Repairs are both ways ?

2011-08-17 Thread Jonathan Ellis
Because they are occurring in parallel. On Wed, Aug 17, 2011 at 5:32 PM, Philippe wrote: >> Almost, but not quite: if you have nodes A,B,C and repair A, it will >> transfer A<->B, A<->C, but not B<->C. > > But on a 3 node cluster once you do A<->B & A<->C, why don't you > transitively get B<->C ?

Re: One hot node slows down whole cluster

2011-08-17 Thread Hefeng Yuan
Thanks Aaron for the response. We're not doing drain on the node, and there's no such message in the log. We used LOCAL_QUORUM CL, endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch dynamic_snitch: false dynamic_snitch_badness_threshold: 0.0 Because we have another 3 nodes DC for Brisk

Re: Reg row key sorting

2011-08-17 Thread aaron morton
Rows are sorted according to the order of the partitioner, see http://wiki.apache.org/cassandra/FAQ#range_rp Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 18/08/2011, at 8:53 AM, Thamizh wrote: > Hi All, > > Thanks a lot. > >

Re: Could Not connect to cassandra-cli on windows

2011-08-17 Thread aaron morton
What client, what version, what version of cassandra are you using ? Looks like you are connecting with an old version of thrift, like the message says. Check the client you are using was made for cassandra 0.8. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton

Re: One hot node slows down whole cluster

2011-08-17 Thread aaron morton
wrt the Exception something has shut down the Mutation thread pool. The only thing I can see in the code to do this is nodetool drain and running the Embedded server. If it was drain you should see an INFO level message "Node is drained" somewhere. Could either of these things be happening ? w

Re: dropping secondary indexes

2011-08-17 Thread aaron morton
ooo, didn't know there was a drop index statement. I got the same result, the Antlr grammar seems to say it's a valid identifier (not that I have much Antlr foo)… Identifier : (Letter | Alnum) (Alnum | '_' | '-' )* fragment Letter : 'a'..'z' | 'A'..'Z' ; fragment Digit :

Re: Cassandra in Multiple Datacenters Active - Standby configuration

2011-08-17 Thread aaron morton
See the bug report, implementations of IPartitioner.describeOwnership() Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 18/08/2011, at 2:58 AM, Oleg Tsvinev wrote: > Aaron, > > Can you point to a line in Cassandra sources where y

Re: Bulk loading into live data

2011-08-17 Thread aaron morton
That is my understanding. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 18/08/2011, at 12:36 AM, Philippe wrote: >> What if the column is a counter ? Does it overwrite or increment ? Ie if the >> SST I am loading has the exact

Re: Partitioning, tokens, and sequential keys

2011-08-17 Thread aaron morton
> One question on nodetool ring, the "owns" refers to how many of the possible > keys each node owns, not the actual node size correct? yes > So you could technically have a load of 15gb, 60gb, and 15gb on a three node > cluster, but if you have the tokens set correctly each would own 33.33%. Ye

Re: Repairs are both ways ?

2011-08-17 Thread Philippe
> > Almost, but not quite: if you have nodes A,B,C and repair A, it will > transfer A<->B, A<->C, but not B<->C. > But on a 3 node cluster once you do A<->B & A<->C, why don't you transitively get B<->C ? Thanks
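The pairing described in this thread can be sketched in a few lines of Python. This is only a toy model of the statement "repair A transfers A<->B, A<->C, but not B<->C"; the function names are hypothetical and nothing here reflects Cassandra's actual repair code.

```python
from itertools import combinations


def repair_pairs(nodes, repaired):
    """Pairs that exchange data when `repaired` is repaired (toy model)."""
    return {frozenset((repaired, other)) for other in nodes if other != repaired}


def all_pairs(nodes):
    """Every replica pair that would need to be compared for a full sync."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}


nodes = ["A", "B", "C"]
synced = repair_pairs(nodes, "A")
missing = all_pairs(nodes) - synced

# The pair left out by a single repair of node A.
print(sorted("".join(sorted(pair)) for pair in missing))
```

Under this model the only pair not covered by repairing A is B and C, which is the gap CASSANDRA-2610 (referenced in the thread) was opened to address.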

Re: nodetool repair caused high disk space usage

2011-08-17 Thread Philippe
Huy, Have you tried repairing one keyspace at a time and then giving it some breathing time to compact? My current observation is that the streams of repairs are triggering massive compactions which are filling up my disks too. Another idea I'd like to try is to limit the number of concurrent comp

Re: ColumnFamilyOutputFormat problem

2011-08-17 Thread Jian Fang
Thanks. I turned on the log for Cassandra and the batch mutation was not called at all. Seems I have to dig into the API code myself. On Tue, Aug 16, 2011 at 7:25 PM, aaron morton wrote: > I suggested turning up the logging to see if the server processed a > batch_mutate call. This is done from the

Re: Repairs are both ways ?

2011-08-17 Thread Jonathan Ellis
Almost, but not quite: if you have nodes A,B,C and repair A, it will transfer A<->B, A<->C, but not B<->C. https://issues.apache.org/jira/browse/CASSANDRA-2610 is open to make this match your intuition. On Wed, Aug 17, 2011 at 3:45 PM, Philippe wrote: > Looking at the logs, I see that repairs st

Re: Unable to repair a node

2011-08-17 Thread Philippe
I have a smallish keyspace on my 3 node, RF=3 cluster. My cluster has no read/write traffic while I am testing repairs. I am running 0.8.4 of debian packages on ubuntu. I've now run 7 repairs in a row on this keyspace and every single one has finished successfully but performed streams between al

Re: nodetool repair caused high disk space usage

2011-08-17 Thread Huy Le
I restarted the cluster and kicked off repair on the same node again. It only made matters worse. It filled up the 830GB partition, and cassandra crashed on the node the repair ran on. I restarted it, and now I am running compaction to reduce disk usage. Repair after upgrading to 0.8.4 is still

Re: Reg row key sorting

2011-08-17 Thread Thamizh
Hi All, Thanks a lot. How to sort row key in CF using CLI / API ? Does it boost the search performance ? As of now, I am inserting row key( as ByteBuffer ) from Hadoop Map/Reduce. It looks, by default Cassandra does not sort the Row keys. Below is the cli command, I have used to create CF. cr

Repairs are both ways ?

2011-08-17 Thread Philippe
Looking at the logs, I see that repairs stream data TO and FROM a node to its replicas. So on a 3-node RF=3 cluster, one only needs to launch repairs on a single node right ? Thanks

Re: Where is sstableloader?

2011-08-17 Thread Jonathan Ellis
On Wed, Aug 17, 2011 at 2:55 PM, Christopher Bottaro wrote: > I installed deb Cassandra 0.8.4 packages from the apt repo a > la: http://wiki.apache.org/cassandra/DebianPackaging > Where is sstableloader? Looks like it isn't included in the deb yet. Can you open a ticket to add it? >  Also, wher

Where is sstableloader?

2011-08-17 Thread Christopher Bottaro
Hello, I installed deb Cassandra 0.8.4 packages from the apt repo a la: http://wiki.apache.org/cassandra/DebianPackaging Where is sstableloader? Also, where is the tool to convert old storage-schema.xml files to the new yaml (I think it's called config-converter)? Thanks for the help.

Re: Error String index out of range and old client?

2011-08-17 Thread Alaa Zubaidi
I forgot to mention that we are on 0.7.4 now... On 8/17/2011 12:31 PM, Alaa Zubaidi wrote: Hi, I see this error while the application tries to connect to cassandra at the same time from 2 different threads: any clues: ERROR [pool-1-thread-13] 2011-07-29 06:46:45,718 CustomTThreadPoolServer.java

Error String index out of range and old client?

2011-08-17 Thread Alaa Zubaidi
Hi, I see this error while the application tries to connect to cassandra at the same time from 2 different threads: any clues: ERROR [pool-1-thread-13] 2011-07-29 06:46:45,718 CustomTThreadPoolServer.java (line 222) Error occurred during processing of message. java.lang.StringIndexOutOfBoundsExce

Re: Could Not connect to cassandra-cli on windows

2011-08-17 Thread Alaa Zubaidi
Hi, I see this error while the application tries to connect to cassandra at the same time from 2 different threads: any clues: ERROR [pool-1-thread-13] 2011-07-29 06:46:45,718 CustomTThreadPoolServer.java (line 222) Error occurred during processing of message. java.lang.StringIndexOutOfBounds

Re: One hot node slows down whole cluster

2011-08-17 Thread Hefeng Yuan
Just wondering, would it help if we shorten the rpc_timeout_in_ms (currently using 30,000), so that when one node gets hot and responding slowly, others will just take it as down and move forward? On Aug 17, 2011, at 11:35 AM, Hefeng Yuan wrote: > Sorry, correction, we're using 0.8.1. > > On A

Re: One hot node slows down whole cluster

2011-08-17 Thread Hefeng Yuan
Sorry, correction, we're using 0.8.1. On Aug 17, 2011, at 11:24 AM, Hefeng Yuan wrote: > Hi, > > We're noticing that when one node gets hot (very high cpu usage) because of > 'nodetool repair', the whole cluster's performance becomes really bad. > > We're using 0.8.1 with random partition. We

One hot node slows down whole cluster

2011-08-17 Thread Hefeng Yuan
Hi, We're noticing that when one node gets hot (very high cpu usage) because of 'nodetool repair', the whole cluster's performance becomes really bad. We're using 0.8.0 with random partition. We have 6 nodes with RF 5. Our repair is scheduled to run once a week, spread across whole cluster. I d

Re: apply deserializer to "list" cmd in cli?

2011-08-17 Thread Yang
thanks a lot, this works great On Wed, Aug 17, 2011 at 1:34 AM, aaron morton wrote: > check the help for the assume statement in the cli. > > If the CF has the key_validation_class  set to AsciiType the CLI should do > the right thing. > > Cheers > > - > Aaron Morton > Freelance

Re: dropping secondary indexes

2011-08-17 Thread Dan Kuebrich
Thanks, Aaron! In terms of dropping stuff from the CLI, I tried to re-drop the remaining built column index and get the following error message. I wonder if there's some sort of parser bug related to numeric vs alpha tokens. The column name below is col2 from the show keyspace dump earlier in th

Re: nodetool repair caused high disk space usage

2011-08-17 Thread Huy Le
Sorry for the duplicate thread. I saw the issue being referenced to https://issues.apache.org/jira/browse/CASSANDRA-2280. However, I am running version 0.8.4. I saw your comment in one of the threads that the issue is not reproducible, but multiple users have the same issue. Is there anything

Re: RF=1

2011-08-17 Thread Jonathan Ellis
See https://issues.apache.org/jira/browse/CASSANDRA-2388 On Wed, Aug 17, 2011 at 6:28 AM, Patrik Modesto wrote: > Hi, > > while I was investigating this issue, I've found that hadoop+cassandra > don't work if you stop even just one node in the cluster. It doesn't > depend on RF. ColumnFamilyRecor

Re: nodetool repair caused high disk space usage

2011-08-17 Thread Philippe
Look at my last two or three threads. I've encountered the same thing and got some pointers/answers. On Aug 17, 2011 4:03 PM, "Huy Le" wrote: > Hi, > > After upgrading to cass 0.8.4 from cass 0.6.11. I ran scrub. That worked > fine. Then I ran nodetool repair on one of the nodes. The disk usage on

Re: Cassandra in Multiple Datacenters Active - Standby configuration

2011-08-17 Thread Oleg Tsvinev
Aaron, Can you point to a line in Cassandra sources where you believe it does not understand the "multi ring" approach? I'm not sure about Cassandra team but Hector team likes pull requests with patches. Anyways, I believe I should run a test to see if data is indeed replicated between datacenters

nodetool repair caused high disk space usage

2011-08-17 Thread Huy Le
Hi, After upgrading to cass 0.8.4 from cass 0.6.11. I ran scrub. That worked fine. Then I ran nodetool repair on one of the nodes. The disk usage on data directory increased from 40GB to 480GB, and it's still growing. The cluster has 4 nodes with replica factor 3. The ring shows: Address

Re: Reg row limit & sorting

2011-08-17 Thread Konstantin Naryshkin
1. The 100 row limit is for listing (i.e. how many rows that the list command will print). You can give list another limit: list User limit 1000; This limit has nothing to do with any internal Cassandra limitation. I am not aware of any limitation on the number of rows that you can have. 2. I be

Re: HOW TO select a column or all columns that start with X

2011-08-17 Thread Alvin UW
Thanks. It helps. 2011/8/17 Boris Yen > Each compositeType consists of a few components. Use ("bob", 1982) as an > example, it contains two components, I assume it is (utf8, integer). So when > you want to use a slice query, you need to build the start and end columns by adding > components to them. That i

Re: Bulk loading into live data

2011-08-17 Thread Philippe
> > What if the column is a counter ? Does it overwrite or increment ? Ie if > the SST I am loading has the exact same setup but value 2, will my value > change to 3 ? > > Counter columns only know how to increment (assuming no deletes), so you > will get 3. See > https://github.com/apache/cassandr
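The difference between the two answers in this thread (timestamps decide for regular columns, counters increment) can be illustrated with a small Python sketch. It is a deliberately simplified model with hypothetical function names, not Cassandra's reconciliation code; in particular, the tie-breaking rule for equal timestamps is an assumption made to keep the example short.

```python
def merge_regular(existing, incoming):
    """Regular column: each side is (value, timestamp); the newer timestamp wins.

    Assumption for this sketch: on equal timestamps the existing value is kept.
    """
    return incoming if incoming[1] > existing[1] else existing


def merge_counter(existing, incoming):
    """Counter column (no deletes): the loaded value is added to the existing one."""
    return existing + incoming


# Regular column holding "1", loaded SSTable carries "2" with a newer timestamp.
print(merge_regular(("1", 10), ("2", 20)))

# Counter holding 1, loaded SSTable carries 2.
print(merge_counter(1, 2))
```

With these inputs the regular column ends up with the newer value, while the counter ends up at 3, matching the answer quoted above.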

Re: RF=1

2011-08-17 Thread Patrik Modesto
And one more patch: http://pastebin.com/zfNPjtQz This one handles a case where there are no nodes available for a slice. For example where there is a keyspace with RF=1 and a node is shut down. Its range of keys gets ignored. Regards, P. On Wed, Aug 17, 2011 at 13:28, Patrik Modesto wrote: > Hi, >

Re: Partitioning, tokens, and sequential keys

2011-08-17 Thread David McNelis
Well, I think what happened was that we had three tokens generated, 0, 567x, and 1134x... but the way that we read the comments in the yaml file, we just set the second two nodes with the initial token and left the token for the seed node blank. Then we started the seed node, started the other
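The three tokens mentioned here (0, 567x, 1134x) are consistent with evenly spaced tokens on a ring of size 2**127, which is the range used by RandomPartitioner. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the result would still need to be set as initial_token on each node by hand.

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Evenly spaced tokens for `node_count` nodes on a ring of `ring_size`."""
    return [i * ring_size // node_count for i in range(node_count)]


tokens = balanced_tokens(3)
for index, token in enumerate(tokens):
    # One token per node, in ring order.
    print(index, token)
```

With tokens spaced this way each node owns one third of the key range, regardless of how much data (load) each node currently holds.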

Re: RF=1

2011-08-17 Thread Patrik Modesto
Hi, while I was investigating this issue, I've found that hadoop+cassandra don't work if you stop even just one node in the cluster. It doesn't depend on RF. ColumnFamilyRecordReader gets list of nodes (according to the RF) but chooses just the local host and if there is no cassandra running locally it

Re: Cassandra London: failure modes and HBase

2011-08-17 Thread Eldad Yamin
Hi Dave, unfortunately, I and some other people who are very interested won't be able to get all the way to London. Can you please consider using a video streaming service? I recommend using Watchitoo.com (I used to work there). At the moment it's free. Thanks! On Tue, Aug 16, 2011 at 12:47 PM, Dave

Re: node restart taking too long

2011-08-17 Thread Yan Chunlu
but the data size in the saved_cache are relatively small: will that cause the load problem? ls -lh /cassandra/saved_caches/ total 32M -rw-r--r-- 1 cass cass 2.9M 2011-08-12 19:53 cass-CommentSortsCache-KeyCache -rw-r--r-- 1 cass cass 2.9M 2011-08-17 04:29 cass-CommentSortsCache-RowCache -rw-r

Reg row limit & sorting

2011-08-17 Thread Thamizh
Hi All, I have two questions on Cassandra, 1. Is there any limit(s) on total no. of rows in a single column family? I am using Cassandra-0.8.4 version. [default@tutorials] list User; Using default limit of 100 It looks, here the default limit is 100. How shall I increas

Re: Cassandra in Multiple Datacenters Active - Standby configuration

2011-08-17 Thread aaron morton
The calculation for ownership does not understand the "multi ring" approach to assigning tokens. I've created https://issues.apache.org/jira/browse/CASSANDRA-3047 for you. Otherwise your tokens look good to me. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton

Re: Bulk loading into live data

2011-08-17 Thread aaron morton
> If I SSTLoad data into that KS & CF that has the same key, it will rely on > timestamps stored in the SSTable to overwrite value "1" or not, right ? yes. > What if the column is a counter ? Does it overwrite or increment ? Ie if the > SST I am loading has the exact same setup but value 2, will

Re: apply deserializer to "list" cmd in cli?

2011-08-17 Thread aaron morton
check the help for the assume statement in the cli. If the CF has the key_validation_class set to AsciiType the CLI should do the right thing. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 17/08/2011, at 6:02 PM, Yang wrote:

Re: node restart taking too long

2011-08-17 Thread aaron morton
If you have a node that cannot start up due to issues loading the saved cache delete the files in the saved_cache directory before starting it. The settings to save the row and key cache are per CF. You can change them with an update column family statement via the CLI when attached to any node

Re: HOW TO select a column or all columns that start with X

2011-08-17 Thread Boris Yen
Each compositeType consists of a few components. Use ("bob", 1982) as an example, it contains two components, I assume it is (utf8, integer). So when you want to use a slice query, you need to build the start and end columns by adding components to them. That is what start.addCompo and end.addCompo... mea
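The idea of building start and end columns from components can be illustrated with plain Python tuples. This is not the client API discussed in the thread; it only models composite column names as (utf8, integer) tuples, and the sample column names and the helper function are made up for the example.

```python
from bisect import bisect_left

# Composite column names as (utf8, integer) tuples, kept in sorted order.
columns = sorted([
    ("alice", 1990),
    ("bob", 1975),
    ("bob", 1982),
    ("carol", 1960),
])


def slice_by_first_component(cols, first):
    """Return all columns whose first component equals `first`."""
    start = (first,)               # sorts before every (first, n)
    end = (first, float("inf"))    # sorts after every (first, n)
    return cols[bisect_left(cols, start):bisect_left(cols, end)]


print(slice_by_first_component(columns, "bob"))
```

The start bound carries only the first component and the end bound carries the first component plus a value larger than any second component, so the slice covers every column that starts with "bob".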