Change one thing at a time and work out what metric it is you want to improve.
I would start with reducing compaction_throughput_mb_per_sec. Have a look in
your logs for the "Enqueuing flush of Memtable…" messages, count up how many
serialised bytes you are flushing and then check it against th
Did you sort this out? The #cassandra IRC room is a good place to get help as
I tried to build it first using
mvn compile and got this different error
"[ERROR] Failed to execute goal on project cassandra-tutorial: Could not
resolve dependencies for project
> We've been having issues where as soon as we start doing heavy writes (via
> hadoop) recently, it really hammers 4 nodes out of 20. We're using random
> partitioner and we've set the initial tokens for our 20 nodes according to
> the general spacing formula, except for a few token offsets as
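For reference, the "general spacing formula" for RandomPartitioner is commonly given as token_i = i * 2**127 / N. Below is a minimal sketch, assuming that is the formula meant above and that nodes are numbered 0 to N-1. The function name is made up for illustration.

```python
# Evenly spaced initial tokens for RandomPartitioner (token range 0 .. 2**127).
# Assumption: this is the "general spacing formula" the message refers to.

def initial_tokens(node_count):
    """Return one evenly spaced initial_token per node."""
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]

tokens = initial_tokens(20)
for node, token in enumerate(tokens):
    print(node, token)
```

Any manual offsets, like the ones mentioned in the message, move individual nodes away from these values and so change how much of the ring each node owns.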
I'm running low on ideas for this one. Anyone else?
If the phantom node is not listed in the ring, other nodes should not be
storing hints for it. You can see what nodes they are storing hints for via
You can try a rolling restart passing the JVM opt
I ran into this. I also tried log_ring_state=false which also did not help.
The way I got through this was to stop the entire cluster and start the nodes
I realize this is not a practical solution for everyone, but if you can afford
to stop the cluster for a few minutes, it's
On Aug 23, 2011, at 2:25 AM, Peter Schuller wrote:
>> We've been having issues where as soon as we start doing heavy writes (via
>> hadoop) recently, it really hammers 4 nodes out of 20. We're using random
>> partitioner and we've set the initial tokens for our 20 nodes according to
>> the ge
Is there a way to preload an entire CF into cache with sequential access when the server
I think that the standard cache preloader uses random access, and because
of that it's so slow that we can't use it.
Dropped messages in ReadRepair is odd. Are you also dropping mutations?
There are two tasks performed on the ReadRepair stage. The digests are compared
on this stage, and secondly the repair happens on the stage. Comparing digests
is quick. Doing the repair could take a bit longer, all the cf'
I normally link to the DataStax article to avoid having to actually write
those words :)
Aaron Morton
Freelance Cassandra Developer
On 23/
Hi All,
This is a question about multi-node cluster configuration.
I have configured a 3-node cluster using Cassandra-0.8.4 and am getting an error
when I run a Map/Reduce job that uploads records from HDFS to Cassandra.
Here is the config file (cassandra.yaml) for my 3-node Cassandra cluster:
On Sat, 20-08-2011 at 01:22 +0200, Peter Schuller wrote:
> > Is there any chance that the entire file from the source node got streamed to
> > the destination node even though only a small amount of data in the file from
> > the source node is supposed to be streamed to the destination node?
> Yes, but the thi
On Aug 23, 2011, at 3:43 AM, aaron morton wrote:
> Dropped messages in ReadRepair is odd. Are you also dropping mutations?
> There are two tasks performed on the ReadRepair stage. The digests are
> compared on this stage, and secondly the repair happens on the stage.
> Comparing digests is
On 21 August 2011 12:34, Yan Chunlu wrote:
> Since "nodetool cleanup" could remove hinted handoff, will it cause
> data loss?
Hi Yan,
Hints are not guaranteed to be delivered and "nodetool cleanup" is one of
the reasons for that. This will only cause data-loss if you are writing at
Hi, we're running some performance tests against some clusters and I'm
curious about some of the numbers I see.
I'm running the stress test against two identically configured clusters, but
after I run a stress test, I get different Load values across the
The difference between the two
Your solution fixed the problem.
But my output looks like this.
I don't understand the meaning of "Error stacktraces are turned on."
And I would like only the results to be output, not the INFO lines.
mvn -e exec:java -Dexec.args="get"
Have you run repair on the nodes? Maybe some data was lost and not repaired
yet?
2011/8/23 Chris Marino
> Hi, we're running some performance tests against some clusters and I'm
> curious about some of the numbers I see.
> I'm running the stress test against two identically configur
As mentioned by Ed Anuff in his blog and slides, one way to build a customized
secondary index is:
We use one CF, with each row representing a secondary index and the secondary
index name as the row key.
For example,
Indexes = {
"User_Keys_By_Last_Name" : {
"adams" : "e5d61f2b-…",
"alden" : "e80a17
On Tue, Aug 23, 2011 at 11:56 AM, Sam Overton wrote:
> On 21 August 2011 12:34, Yan Chunlu wrote:
>> Since "nodetool cleanup" could remove hinted handoff, will it cause
>> data loss?
> Hi Yan,
> Hints are not guaranteed to be delivered and "nodetool cleanup" is one of
> the reasons
Unsubscribe, please.
From: SebWajam
Sent: Monday, August 22, 2011 7:29 AM
Subject: Re: Cassandra Cluster Admin - phpMyAdmin for Cassandra
And thank you for your feedback! :)
On Tue, Aug 23, 2011 at 2:26 AM, aaron morton wrote:
> I'm running low on ideas for this one. Anyone else?
> If the phantom node is not listed in the ring, other nodes should not be
> storing hints for it. You can see what nodes they are storing hints for via
> JConsole.
I think I found it i
Thanks Jonathan, and thanks Peter.
How do you guys use the mailing list? I'm using a mail client and this e-mail didn't
group up until I found it today...
On Aug 19, 2011, at 12:27 PM, Jonathan Ellis wrote:
> I think this is what you want:
We are getting an error in our Solandra search when the search string
contains a space. Is anyone else seeing this?
*Net::HTTPFatalError*: 500 "null java.lang.ArrayIndexOutOfBoundsException null
java.lang.ArrayIndexOutOfBoundsException request:
Taking the cluster down completely did remove the phantom node. The
hintscolumnfamily is causing a lot of commit logs to back up, which threatens
to fill the commit log drive. A manual flush of that column family
always clears out the files though.
-Original Message-
From: Br
Are checksum errors detected in Cassandra and if so how are they resolved?
I'm running a 16-node cassandra cluster, with a reasonably large
amount of data per node (~1TB). Nodes have 16G ram, but heap is set to 8G.
The nodes keep stopping with this output in the log. Any ideas?
ERROR [Thread-85] 2011-08-23 21:00:38,723
(line 113)
Hi Aaron,
We are using Thrift 5..
TSocket _tr = new TSocket(server.Host,
server.Port);//"localhost", 9160);
_transport = new TFramedTransport(_tr);
_protocol = new TBinaryProtocol(_transport);
_client = new Cassandra.Client(_protocol);
Do you have
2011/8/23 Ernst D Schoen-René :
> Hi,
> I'm running a 16-node cassandra cluster, with a reasonably large amount of
> data per node (~1TB). Nodes have 16G ram, but heap is set to 8G.
> The nodes keep stopping with this output in the log. Any ideas?
> ERROR [Thread-85] 2011-08-23 21:00:38,723
We had already been running Cassandra with a larger heap size, but
it meant that Java took way too long between garbage collections. The
advice I'd found was to set the heap size to the 8G we're running at. It
was ok for a while, but now some nodes crash. It's definitely our
I applied the CASSANDRA-2530 patch to this version and tested it on our
finance-related use case. It greatly reduced disk consumption, using only 20% of
the original space for financing-related data storage. The performance is better
than MySQL and also it consumes only 1x more than My
INFO [769787724@qtp-311722089-9825] 2011-08-23 22:07:53,750
(line 1370) [users] webapp=/solandra path=/select
I'm looking for advice for running cassandra 8.+ on a single node. Would love
to hear stories about how much RAM you succeeded with, etc.
Currently we are running with a 4GB heap size. Hardware is 4 cores and 8GB
physical memory. We're not opposed to going to 16GB of memory or even 32GB.
Thanks for the info, I'll try to reproduce.
On Aug 23, 2011, at 9:28 PM, Ashley Martens wrote:
> INFO [769787724@qtp-311722089-9825] 2011-08-23 22:07:53,750
> (line 1370) [users] webapp=/solandra path=/select
> params={fl=*,score&start=0&q=+(+(first_name:hatice^1.2)+(first_name:hatic
I had a thread going the other day about vector clock memory usage and that
it is a series of (clock id, clock):ts and the ability to prune old entries
… I'm specifically curious here how often old entries are pruned.
If you're storing small columns within cassandra. Say just an integer. The