Re: unrepairable sstable data rows

2011-04-11 Thread Jonathan Colby
Thanks for the answer Aaron. There are Data, Index, Filter, and Statistics files associated with SSTables. What files must be physically moved/deleted? I tried just moving the Data file and Cassandra would not start. I see this exception: WARN [WrapperSimpleAppMain] 2011-04-11 12:04:23,23

Re: unrepairable sstable data rows

2011-04-11 Thread Sylvain Lebresne
Remove main-f-5-{Index|Filter|Statistics}.db files. They make no sense without a Data file, and Cassandra always makes sure it removes those before the Data file (which is why it gets confused if it finds one of those files without a Data file). Note that your error was with the sstable main-f-232-Data
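
A rough way to check a data directory for this situation before moving or deleting anything. This is only a sketch: it assumes the component file naming seen in this thread (for example main-f-5-Index.db), and the function name find_orphaned_components is made up for illustration. It only lists candidates and does not delete anything.

```python
import os
import re
import tempfile

# Assumed naming, taken from the file names quoted in this thread:
# <columnfamily>-<version>-<generation>-<Component>.db, e.g. "main-f-5-Index.db".
COMPONENT_RE = re.compile(
    r"^(?P<prefix>.+-\d+)-(?P<component>Data|Index|Filter|Statistics)\.db$"
)


def find_orphaned_components(data_dir):
    """Return sorted names of component files whose sstable has no Data file."""
    by_prefix = {}
    for name in os.listdir(data_dir):
        match = COMPONENT_RE.match(name)
        if match:
            components = by_prefix.setdefault(match.group("prefix"), {})
            components[match.group("component")] = name

    orphans = []
    for components in by_prefix.values():
        # An Index/Filter/Statistics file without its Data file is an orphan.
        if "Data" not in components:
            orphans.extend(components.values())
    return sorted(orphans)


if __name__ == "__main__":
    # Demo on a throwaway directory, not on a real Cassandra data directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("main-f-5-Index.db", "main-f-5-Filter.db",
                     "main-f-232-Data.db", "main-f-232-Index.db"):
            open(os.path.join(demo_dir, name), "w").close()
        print(find_orphaned_components(demo_dir))
```

With the node stopped, the listed files would be the ones to move out of the way together with the missing Data file's siblings.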

Cassandra constantly nodes which doens allredy exists

2011-04-11 Thread ruslan usifov
Hello, I use Cassandra 0.7.4. After reconfiguring the cluster, on one node I constantly see the following log: INFO [GossipStage:1] 2011-04-11 17:14:13,514 StorageService.java (line 865) Removing token 56713727820156410577229101238628035242 for /10.32.59.202 INFO [ScheduledTasks:1] 2011-04-11 17:14:13,514 Hint

Read time get worse during dynamic snitch reset

2011-04-11 Thread shimi
I finally upgraded 0.6.x to 0.7.4. The nodes have been running the new version for several days across 2 data centers. I noticed that the read time on some of the nodes increases by 50-60x every ten minutes. There was no indication in the logs of anything happening at the same time. The only thi

exceptions during bootstrap 0.7.4

2011-04-11 Thread Jonathan Colby
Seeing these exceptions on a node during the bootstrap phase of a move. Cassandra 0.7.4. Anyone able to shed more light on what may be causing this? btw - the move was done to assign a new token, and the decommission phase seemed to have gone ok. Bootstrapping is still in progress (I hope) INFO [

Cassandra Database Modeling

2011-04-11 Thread Shalom
I would like to save statistics on 10,000,000 (ten million) pairs of particles, how they relate to one another in any given space in time. So suppose that within a total experiment time of T1..T1000 (assume that T1 is when the experiment starts, and T1000 is the time when the experiment ends) I w

Analysing hotspot gc logs

2011-04-11 Thread Chris Burroughs
To avoid taking my own thread [1] off on a tangent: does anyone have a recommendation for a tool for graphical analysis (i.e. making useful graphs) of HotSpot GC logs? Google searches have turned up several results along the lines of "go try this zip file" [2]. [1] http://www.mail-archive.com/us

Re: Analysing hotspot gc logs

2011-04-11 Thread Ryan King
On Mon, Apr 11, 2011 at 10:35 AM, Chris Burroughs wrote: > To avoid taking my own thread [1] off on a tangent: does anyone have a > recommendation for a tool for graphical analysis (i.e. making useful graphs) > of HotSpot GC logs? Google searches have turned up several results > along the lines

problems getting started with Cassandra & Ruby

2011-04-11 Thread Mark Lilback
I'm trying to connect to Cassandra from a Ruby script. I'm using rvm, and made a clean install of Ruby 1.9.2 and then did "gem install cassandra". When I run a script that just contains "require 'cassandra/0.7'", I get the output below. Any suggestion on what I need to do to get rid of these war

Timeout during stress test

2011-04-11 Thread mcasandra
I am running a stress test using Hector. In the client logs I see: me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException() at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:32) at me.prettyprint.cassandra.service

Remove call vs. delete mutation

2011-04-11 Thread Josep Blanquer
All, From a thrift client perspective using Cassandra, there are currently 2 options for deleting keys/columns/subcolumns: 1- One can use the "remove" call: which only takes a column path so you can only delete 'one thing' at a time (an entire key, an entire supercolumn, a column or a subcolumn)

help! seed node needs to be replaced

2011-04-11 Thread Jonathan Colby
My seed node (1 of 4) having the wraparound range (token 0) needs to be replaced. Should I bootstrap the node with a new IP, then add it back as a seed? Should I run remove token on another node to take over the range?

Re: help! seed node needs to be replaced

2011-04-11 Thread Jonathan Colby
I shut down Cassandra, deleted (with a backup) the contents of the data directory and did a "nodetool move 0". It seems to be populating the node with its range of data. Hope that was a good idea. On Apr 11, 2011, at 10:38 PM, Jonathan Colby wrote: > > My seed node (1 of 4) having the wr

Re: unrepairable sstable data rows

2011-04-11 Thread aaron morton
FYI, I was chatting with Dominic Williams on IRC yesterday; he had a 0.7.4 install with the same problem, see the error stack here http://pastebin.com/YasPtEYj He has not run nodetool scrub, but I think the 0.7.4 install had been there a while, so the data file may have been fresh. Aaron On 1

Re: Cassandra constantly nodes which doens allredy exists

2011-04-11 Thread aaron morton
In JConsole go to o.a.c.db.HintedHandoffManager and try the deleteHintsForEndpoint operation. This is also called when a token is removed from the ring, or when a node is decommissioned. What process did you use to reconfigure the cluster? Aaron On 12 Apr 2011, at 01:15, ruslan usifov w

Re: Read time get worse during dynamic snitch reset

2011-04-11 Thread aaron morton
The reset interval clears the latency tracked for each node so a bad node will be read from again. The scores for each node are then updated every 100ms (default) using the last 100 responses from a node. How long does the bad performance last for? What CL are you reading at ? At Quorum with R
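
To make the effect of the reset concrete, here is a toy model of the behaviour described above. It is not Cassandra's dynamic snitch code: the class name LatencyScores, the use of a plain mean as the score, and treating a node with no history as score 0 are all assumptions made for the illustration. Only the window of 100 responses and the idea of a periodic reset come from the message.

```python
from collections import deque


class LatencyScores:
    """Toy per-node latency tracker (hypothetical, for illustration only)."""

    def __init__(self, window=100):
        self.window = window   # keep only the last `window` responses per node
        self.samples = {}      # node -> deque of recent latencies in ms

    def record(self, node, latency_ms):
        history = self.samples.setdefault(node, deque(maxlen=self.window))
        history.append(latency_ms)

    def score(self, node):
        """Mean of the retained samples, or None if the node has no history."""
        history = self.samples.get(node)
        if not history:
            return None
        return sum(history) / len(history)

    def reset(self):
        """Forget all tracked latency, as the reset interval does."""
        self.samples.clear()

    def preferred(self, nodes):
        """Order nodes by score; nodes without history count as 0 (assumption)."""
        return sorted(nodes, key=lambda node: self.score(node) or 0.0)


if __name__ == "__main__":
    scores = LatencyScores()
    scores.record("fast", 1.0)
    scores.record("slow", 50.0)
    print(scores.preferred(["slow", "fast"]))  # slow node is ranked last
    scores.reset()
    print(scores.preferred(["slow", "fast"]))  # history gone, slow node is tried again
```

In this model, right after a reset the slow node is indistinguishable from the others until enough new responses have been recorded, which is one possible reading of reads getting worse around the reset.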

Re: help! seed node needs to be replaced

2011-04-11 Thread aaron morton
Is this the node that had the earlier EOF error during bootstrap? Aaron On 12 Apr 2011, at 08:42, Jonathan Colby wrote: > I shut down Cassandra, deleted (with a backup) the contents of the data > directory and did a "nodetool move 0". It seems to be populating the node > with its range of

Re: help! seed node needs to be replaced

2011-04-11 Thread Jonathan Colby
Yes. This node has repeatedly given problems while reading various sstables. So I decided to start with a fresh data dir, relying on the fact that with an RF=3, the data will be able to be retrieved from the cluster. Since this is a seed node, I am a little unsure how to proceed. From everyt

Re: Timeout during stress test

2011-04-11 Thread mcasandra
I see this occurring often when all Cassandra nodes suddenly show a CPU spike. All reads fail for about 2 minutes. GC.log and system.log don't reveal much. The only thing I notice is that when I restart nodes there are tons of files that get deleted. cfstats from one of the nodes looks like this:

Re: Cassandra Database Modeling

2011-04-11 Thread aaron morton
The tricky part here is the level of flexibility you want for the querying. In general you will want to denormalise to support the read queries. If your queries are not interactive you may be able to use Hadoop / Pig / Hive e.g. http://www.datastax.com/products/brisk In which case you can prob

Re: Remove call vs. delete mutation

2011-04-11 Thread aaron morton
AFAIK both follow the same path internally. Aaron On 12 Apr 2011, at 06:47, Josep Blanquer wrote: > All, > > From a thrift client perspective using Cassandra, there are currently > 2 options for deleting keys/columns/subcolumns: > > 1- One can use the "remove" call: which only takes a column

Re: Timeout during stress test

2011-04-11 Thread aaron morton
TimedOutException means the cluster could not perform the request within rpc_timeout. The client should retry as the problem may be transitory. In this case read performance may have slowed down due to the number of sstables (286). It is hard to tell without knowing what the workload is. Aaron On
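
A minimal sketch of the "client should retry" advice, written as generic Python rather than against Hector or any real client library. TimedOutError and call_with_retry are hypothetical names standing in for whatever timeout exception and request call the client actually exposes; the exponential backoff between attempts is an added assumption, not something the message prescribes.

```python
import time


class TimedOutError(Exception):
    """Hypothetical stand-in for a client-side timeout exception."""


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on TimedOutError with exponential backoff.

    Re-raises the last TimedOutError once `attempts` tries are used up.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TimedOutError:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_read():
        # Simulated request: times out twice, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise TimedOutError()
        return "row data"

    print(call_with_retry(flaky_read))
```

If every attempt times out, as reported later in this thread, retrying alone will not help and the overload itself has to be investigated.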

Re: Timeout during stress test

2011-04-11 Thread mcasandra
It looks like hector did retry on all the nodes and failed. Does this then mean cassandra is down for clients in this scenario? That would be bad. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Timeout-during-stress-test-tp6262430p6263270.html Se

Lot of pending tasks for writes

2011-04-11 Thread mcasandra
I am running a stress test and on one of the nodes I see: [root@dsdb5 ~]# nodetool -h `hostname` tpstats
Pool Name               Active   Pending   Completed
ReadStage                    0         0        2495
RequestResponseStage         0         0      242202
MutationStag

Re: Timeout during stress test

2011-04-11 Thread aaron morton
It means the cluster is currently overloaded and unable to complete requests in time at the CL specified. Aaron On 12 Apr 2011, at 11:18, mcasandra wrote: > It looks like hector did retry on all the nodes and failed. Does this then > mean cassandra is down for clients in this scenario? That wo

Re: Timeout during stress test

2011-04-11 Thread mcasandra
But I don't understand the reason for the overload. It was doing simple reads with 12 threads, reading 5 rows. Avg CPU was only 20%, and there were no GC issues that I can see. I would expect Cassandra to be able to process more with 6 nodes, 12 cores, 96 GB RAM and 4 GB heap. -- View this message in context: http://cass

Re: Timeout during stress test

2011-04-11 Thread aaron morton
You'll need to provide more information, from the TP stats the read stage could not keep up. If the node is not CPU bound then it is probably IO bound. What sort of read? How many columns was it asking for ? How many columns do the rows have ? Was the test asking for different rows ? How many

unsubscribe

2011-04-11 Thread Denis Kirpichenkov

Re: unsubscribe

2011-04-11 Thread daryl smith
>

Re: unsubscribe

2011-04-11 Thread aaron morton
http://wiki.apache.org/cassandra/FAQ#unsubscribe On 12 Apr 2011, at 14:43, Denis Kirpichenkov wrote: > >

Re: Cassandra Database Modeling

2011-04-11 Thread csharpplusproject
Hi Aaron, Yes, of course it helps, I am starting to get a flavor of Cassandra -- thank you very much! First of all, by 'interactive' queries, are you referring to 'real-time' queries? (meaning, where experiments data is 'streaming', data needs to be stored and following that, the query needs to b

Re: Timeout during stress test

2011-04-11 Thread Terje Marthinussen
I notice you have pending hinted handoffs? Look for errors related to that. We have seen occasional corruptions in the hinted handoff sstables. If you are stressing the system to its limits, you may also consider playing more with the number of read/write threads (concurrent_reads/writes)

Re: Timeout during stress test

2011-04-11 Thread mcasandra
aaron morton wrote: > > You'll need to provide more information, from the TP stats the read stage > could not keep up. If the node is not CPU bound then it is probably IO > bound. > > > What sort of read? > How many columns was it asking for ? > How many columns do the rows have ? > Was the t

Re: Read time get worse during dynamic snitch reset

2011-04-11 Thread shimi
On Tue, Apr 12, 2011 at 12:26 AM, aaron morton wrote: > The reset interval clears the latency tracked for each node so a bad node > will be read from again. The scores for each node are then updated every > 100ms (default) using the last 100 responses from a node. > > How long does the bad perform