Re: Poor performance; PHP & Thrift to blame

2010-08-20 Thread sasha
Julian Simon jules.com.au> writes: > > Hi, > > I've been trying to benchmark Cassandra for our use case and have been > seeing poor performance on both writes and (extremely) poor > performance on reads. > > Using Cassandra 0.51 stable & thrift-0.2.0. > > It turns out all the CPU time is goin

SV: Poor performance; PHP & Thrift to blame

2010-08-20 Thread Thorvaldsson Justus
Search the mailing list http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/ This has already been addressed and is a PHP-only issue. The 5 sec is a timeout because, if I remember correctly, the packet size is too small or something like it. You can configure it so it stops being a problem but

SV: SV: Help with getting Key range with some column limitations

2010-08-20 Thread Thorvaldsson Justus
I think you should try to do it some other way than iterating; it sounds super suboptimal to me. Also, the plugin option he was thinking of is, I think, changing the Cassandra source code, which is kind of hard when Cassandra is changing so fast, but very possible. I think you should look at http://blip.tv/file/401

Re: SV: SV: Help with getting Key range with some column limitations

2010-08-20 Thread Jone Lura
Thanks! Read your blog a few times, but it's hard to get rid of SQL thinking. So if I create a new standard ColumnFamily with a rowId and geohash a lat/lon into a UTF8Type, I could geohash the bounding box, and query for all matching columns. Or do I always need to know the rowId to do a slic
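A minimal sketch of the geohash idea described above, in plain Python with no Cassandra client involved. The function names `geohash` and `prefix_range` are hypothetical and only illustrate why geohashed strings stored under a UTF8Type comparator can be sliced by prefix; how the slice is actually issued depends on the client library and is not shown.

```python
# Standard geohash alphabet (base32 without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, precision=6):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bit_count = 0
    current = 0
    use_lon = True  # geohash interleaves bits, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            current = (current << 1) | 1
            rng[0] = mid
        else:
            current = current << 1
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[current])
            bit_count = 0
            current = 0
    return "".join(chars)


def prefix_range(prefix):
    """Return (start, end) bounds covering every string that begins with `prefix`.

    Assumption: names are compared as plain ASCII/UTF-8 strings, so "~"
    sorts after every character in the geohash alphabet.
    """
    return prefix, prefix + "~"
```

Nearby points share a geohash prefix, so a coarse hash of the bounding box gives a start and end bound for a column slice. A box that straddles a cell boundary needs more than one prefix.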

SV: SV: SV: Help with getting Key range with some column limitations

2010-08-20 Thread Thorvaldsson Justus
If you only want to check the last 5 min, make time a part of your key, make a customized sort, and sort by the time. Remember that sorting is done when inserting data. http://www.sodeso.nl/?p=421 Or make a range check that understands the time limit; that should work, I think, off the top of my head. But y
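A small sketch of the "make time a part of your key" suggestion, using an in-memory sorted list as a stand-in for time-sorted storage. The helper names `time_key` and `last_window` and the zero-padded key layout are assumptions made for illustration; they are not part of any Cassandra API.

```python
from bisect import bisect_left


def time_key(ts_seconds, suffix):
    """Build a key whose string order matches chronological order.

    Zero-padding the timestamp to a fixed width is what makes plain
    lexicographic comparison agree with numeric comparison.
    """
    return f"{ts_seconds:010d}:{suffix}"


def last_window(sorted_keys, now, window=300):
    """Return the keys from the last `window` seconds (default 5 minutes).

    `sorted_keys` must already be sorted, mirroring data that is kept
    in order at insert time.
    """
    start = f"{now - window:010d}:"
    return sorted_keys[bisect_left(sorted_keys, start):]
```

For example, with keys for timestamps 1000, 1200 and 1290 and `now=1400`, only the last two fall inside the 5 minute window. Because the order is established at insert time, the read is a single range lookup rather than a scan.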

Re: SV: SV: SV: Help with getting Key range with some column limitations

2010-08-20 Thread Jone Lura
Thank you for your effort. I'm pretty sure I will make it work. Have a nice weekend! On 20/08/2010 10:48, Thorvaldsson Justus wrote: If you only want to check the last 5 min, make time a part of your key And make a customized sort and sort by the time. Remember sort is made when inserting d

Re: Errors with Cassandra 0.7

2010-08-20 Thread Gary Dusbabek
On Thu, Aug 19, 2010 at 16:30, Alaa Zubaidi wrote: > Hi, > > I am trying to run Cassandra 0.7 and I am getting different errors: First it > was while calling client.insert and now while calling set_keyspace (see > below). > > Note: I get the following when I start Cassandra: > *10/08/19 12:58:26 I

Re: Replication factor and other schema changes in >= 0.7

2010-08-20 Thread Gary Dusbabek
It is coming. In fact, I started working on this ticket yesterday. Most of the settings that you could change before will be modifiable. Unfortunately, you must still manually perform the repair operations, etc., afterward. https://issues.apache.org/jira/browse/CASSANDRA-1285 Gary. On Thu, Aug

Re: Cassandra and Pig

2010-08-20 Thread Christian Decker
Hm, that was my conclusion too, but somehow I don't get what I'm doing wrong. I checked that the thrift library is in CLASSPATH and the PIG_CLASSPATH and as shown in the script above I'm using register to add the library to the dependencies. Am I missing something else? Regards, Chris -- Christian

Re: SV: SV: Help with getting Key range with some column limitations

2010-08-20 Thread Mark
On 8/20/10 1:05 AM, Thorvaldsson Justus wrote: I think you should try to do it some other way than iterate, it sounds super suboptimal to me. Also the plugin option he was thinking of I think is changing Cassandra sourcecode, kind of hard when Cassandra is changing so fast but very possible.

Re: Replication factor and other schema changes in >= 0.7

2010-08-20 Thread Andres March
Cool, thanks. I suspected the same, including the repair. On 08/20/2010 06:05 AM, Gary Dusbabek wrote: It is coming. In fact, I started working on this ticket yesterday. Most of the settings that you could change before will be modifiable. Unfortunately, you must still manually perform the re

Re: Node OOM Problems

2010-08-20 Thread Wayne
I turned off the creation of the secondary indexes which had the large rows and all seemed good. Thank you for the help. I was getting 60k+ writes/second on the 6 node cluster. Unfortunately, three hours later a node went down again. I cannot even look at the logs when it started since they are go

Re: Node OOM Problems

2010-08-20 Thread Edward Capriolo
On Fri, Aug 20, 2010 at 1:17 PM, Wayne wrote: > I turned off the creation of the secondary indexes which had the large rows > and all seemed good. Thank you for the help. I was getting > 60k+/writes/second on the 6 node cluster. > > Unfortunately again three hours later a node went down. I can not

Re: Node OOM Problems

2010-08-20 Thread Wayne
I deleted ALL data and reset the nodes from scratch. There are no more large rows in there. 8-9megs MAX across all nodes. This appears to be a new problem. I restarted the node in question and it seems to be running fine, but I had to run repair on it as it appears to be missing a lot of data. On

Re: Node OOM Problems

2010-08-20 Thread Jonathan Ellis
These warnings mean you have more requests queued up than you are able to handle. That request queue is what is using up most of your heap memory. On Fri, Aug 20, 2010 at 12:17 PM, Wayne wrote: > I turned off the creation of the secondary indexes which had the large rows > and all seemed good. T

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-20 Thread Julie
Robert Coli digg.com> writes: > Check the size of the Hinted Handoff CF? If your nodes are flapping > under sustained write, they could be storing a non-trivial number of > hinted handoff rows? Probably not 5x usage though.. > > http://wiki.apache.org/cassandra/Operations > " > The reason why yo

Re: "MessageDeserializationTask.java (line 47) dropping message" errors

2010-08-20 Thread Ronald Park
How about this message: WARN [DroppedMessagesLogger] 2010-08-20 08:02:27,668 MessagingService.java (line 512) Dropped 46469 messages in the last 1000ms For this application, I also have a 3-node cluster but I'm running a version of Cassandra built off the trunk (we wanted the 'time to live' fea

Re: "MessageDeserializationTask.java (line 47) dropping message" errors

2010-08-20 Thread Jonathan Ellis
Updates being dropped on the floor is exactly what that message means (the batch_mutate calls you send are decomposed under the hood into individual rows for replication, which is why the number is greater than your 3000) On Fri, Aug 20, 2010 at 3:50 PM, Ronald Park wrote: > How about this message: > >

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-20 Thread Julie
Julie nextcentury.com> writes: Please see previous post but is hinted handoff a factor if the CL is set to ALL?

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-20 Thread Rob Coli
On 8/20/10 1:58 PM, Julie wrote: Julie nextcentury.com> writes: Please see previous post but is hinted handoff a factor if the CL is set to ALL? Your previous post looks like a flush or compaction is causing the node to mark its neighbors down. Do you see correlation between memtable flush

Re: Node OOM Problems

2010-08-20 Thread Bill de hÓra
On Fri, 2010-08-20 at 19:17 +0200, Wayne wrote: > WARN [MESSAGE-DESERIALIZER-POOL:1] 2010-08-20 16:57:02,602 > MessageDeserializationTask.java (line 47) dropping message > (1,078,378ms past timeout) > WARN [MESSAGE-DESERIALIZER-POOL:1] 2010-08-20 16:57:02,602 > MessageDeserializationTask.java (l

Re: "MessageDeserializationTask.java (line 47) dropping message" errors

2010-08-20 Thread Ronald Park
Well this is bad news... Why is this logged only as a WARN? Seems quite SEVERE to me. Fortunately for my app, I can go back to my source, recreate the data and reload it at a slower rate (and cross my fingers that this rate is 'slow enough')... unfortunately, it took me 20 hours to load the data

Re: "MessageDeserializationTask.java (line 47) dropping message" errors

2010-08-20 Thread Jonathan Ellis
On Fri, Aug 20, 2010 at 5:02 PM, Ronald Park wrote: > Well this is bad news... Why is this logged only as a WARN?  Seems quite > SEVERE to me. That's as high as log4j goes, without going all the way to ERROR which we reserve for bugs. (High load is not a bug.) > Fortunately for my app, I can go

Re: "MessageDeserializationTask.java (line 47) dropping message" errors

2010-08-20 Thread Ronald Park
Oh, excellent. OK, my code does some retrying (including some sleep-then-retry logic) when I get a TimedOut, so I think I'm actually just fine. Whew. :) I had just noticed the dropped message in Cassandra's log today but had seen the timeouts in my log while it was running. I just didn't conne
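A rough sketch of the sleep-then-retry approach mentioned above. This is not the poster's code: `TimedOutError` is a hypothetical stand-in for whatever timeout exception the client library raises, and the exponential backoff and default values are assumptions.

```python
import time


class TimedOutError(Exception):
    """Hypothetical stand-in for the client's timeout exception."""


def with_retry(op, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call `op()`, retrying on TimedOutError with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries and
    re-raises the timeout once `attempts` tries have failed.
    """
    for attempt in range(attempts):
        try:
            return op()
        except TimedOutError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * (2 ** attempt))
```

Backing off between retries gives an overloaded cluster time to drain its queue; retrying immediately would add to the load that caused the timeout in the first place.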