Thanks Michael.
I will run a benchmark using Hadoop Map/Reduce (example...) in our cluster,
and I will let you know if I find anything valuable. :)
On Wed, Jan 30, 2013 at 2:39 PM, Michael Kjellman
> And finally to make wide rows with C* and Hadoop even better, these
> problems have
As per the Datastax Cassandra Documentation 1.2,
"for single data center deployments, tokens are calculated by dividing
the hash range by the number of nodes in the cluster", does it mean we
have to recalculate the tokens of keys when nodes come and go?
"for multiple data center depl
I'll admit that this part of the DataStax documentation is a bit confusing (and
I'll reach out to the doc writers to make sure this is improved).
The partitioner (be it RandomPartitioner, Murmur3Partitioner or
OrderPreservingPartitioner) is pretty much only a hash function that defines
how to compu
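As a rough illustration of the documentation sentence quoted above, evenly spaced node tokens can be computed by dividing the partitioner's hash range by the number of nodes. This is only a sketch: it assumes a RandomPartitioner range of 0 to 2**127 and a Murmur3Partitioner range of -2**63 to 2**63 - 1, and the function names are made up for this example. It computes tokens for nodes only; the token of a key comes from hashing the key and does not depend on the node count.

```python
def random_partitioner_tokens(node_count):
    """Evenly spaced node tokens, assuming a ring of size 2**127 starting at 0."""
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]


def murmur3_partitioner_tokens(node_count):
    """Evenly spaced node tokens, assuming a ring from -2**63 to 2**63 - 1."""
    ring_size = 2 ** 64
    return [-(2 ** 63) + i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    # One token per node for a hypothetical 4-node cluster.
    for token in random_partitioner_tokens(4):
        print(token)
```

When a node is added or removed, only these node tokens would need to be rebalanced; the keys keep the tokens their hash gives them.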
Well this is getting stranger, for me with this simple table definition,
select key,gender from users
is also failing with a null pointer exception
On 29 Jan 2013, at 13:50, Andy Cobley wrote:
> When connecting to Cassandra 1.2.0 from CQLSH the table was created with:
On Wed 30 Jan 2013 05:47:59 PM CST, Sylvain Lebresne wrote:
I'll admit that this part of the DataStax documentation is a bit
confusing (and
I'll reach out to the doc writers to make sure this is improved).
The partitioner (be it RandomPartitioner, Murmur3Partitioner or
I am running a 3-node Cassandra cluster with replication factor 2 in one
DC. Now I need to run a multi-data-center cluster with Cassandra, and
I have the following questions:
1. I want to replicate all of the data to another DC, so that afterwards
the nodes of both DCs hold the complete data. In which topol
Hi Brian,
Which version of cassandra are you using? And are you using the BOF to write to
Kind regards,
-Original Message-
From: Brian Jeltema []
Sent: Wednesday, 30 January 2013 13:20
Subject: cryptic e
1. I want to replicate whole data on another DC and after that both DC's
nodes should have complete Data. In which topology is it possible?
I think NetworkTopology is best suited for such a configuration. You may
want to use nodetool to generate tokens accordingly.
2. If I need backup, what's the
Cassandra 1.1.5, using BulkOutputFormat
On Jan 30, 2013, at 7:39 AM, Pieter Callewaert wrote:
> Hi Brian,
> Which version of cassandra are you using? And are you using the BOF to write
> to Cassandra?
> Kind regards,
> Pieter
> -Original Message-
> From: Brian Jeltema [ma
This was unexpected fallout from the change to the Murmur partitioner. A JIRA is
open, but if you need MapReduce, Murmur is currently out of the question.
On Wednesday, January 30, 2013, Tejas Patil
> While reading data from Cassandra in map-reduce, I am getting
The fix is simply to switch to the random partitioner.
On Wednesday, January 30, 2013, Edward Capriolo
> This was unexpected fallout from the change to the Murmur partitioner. A JIRA
> is open, but if you need MapReduce, Murmur is currently out of the question.
> On Wednesday, January 30, 2013, Tejas Pati
You really can't mix CQL2 and CQL3. CQL2 does not understand CQL3's sparse
tables. Technically it barfs all over the place. CQL2 is only good for
contact tables.
On Wednesday, January 30, 2013, Andy Cobley
> Well this is getting stranger, for me with this simple table
> sele
I have the same issue (but with sstableloaders).
Should be fixed in 1.2 release
Kind regards,
-Original Message-
From: Brian Jeltema []
Sent: Wednesday, 30 January 2013 13:58
To: user@cassa
Darn autocorrect: CQL2 is only good for compact tables. Make sure you are
setting your CQL version. Or frankly just switch to Hector / Thrift and use
things that have been known to work for years now.
On Wednesday, January 30, 2013, Edward Capriolo
> You really can't mix cql2 and cql3. Cql2 does
Any query is going to fail with QUORUM + RF 3 + 2 nodes down.
One thing about secondary indexes (both user defined and built in) is that finding
an answer using them requires more nodes to be up than just a single get or
On Monday, January 28, 2013, Mike Sample wrote:
> Thanks Aaron. So basically it
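The quorum arithmetic behind that first sentence can be sketched as follows. This assumes the usual definition of quorum as a majority of the replicas (replication factor divided by two, rounded down, plus one) and that both down nodes are replicas for the requested row, which is always the case in a 3-node cluster with RF 3. The function names are made up for this example.

```python
def quorum(replication_factor):
    """Number of replicas that must respond for a QUORUM request."""
    return replication_factor // 2 + 1


def can_satisfy_quorum(replication_factor, replicas_down):
    """True if enough replicas are still alive to reach quorum."""
    replicas_alive = replication_factor - replicas_down
    return replicas_alive >= quorum(replication_factor)


if __name__ == "__main__":
    # RF 3 needs 2 replicas; with 2 replicas down only 1 is left.
    print(quorum(3))
    print(can_satisfy_quorum(3, 2))
    print(can_satisfy_quorum(3, 1))
```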
I recall someone doing some work in Astyanax (I don't know if it made it
back in) where Astyanax would retry at a lower CL when 2 nodes were down
so things could continue to work, which was a VERY VERY cool feature. You may
want to look into that. I know at some point I plan to.
I'm not sure this is the same problem. I'm getting these even when using a
single reducer
for the entire job.
On Jan 30, 2013, at 9:26 AM, Pieter Callewaert wrote:
> I have the same issue (but with sstableloaders).
> Should be fixed in 1.2 release
> (
You should not use the row cache and the key cache on the same CF. If
that is what you are doing, it explains your numbers. Some docs suggest you
can use them together, but in practice I have seen that when this is done the key
cache hit rate drops to near 0.
On Tuesday, January 29, 2013, Keith wrote:
> H
Hector has this feature because Hector is awesome sauce, but Astyanax is
new, sexy, and blogged about by Netflix.
So the new Cassandra trend to force everyone to use less functional new
stuff is at work here, making you wish for something that already exists
On Wednesday, January 30, 20
I'd also point out, Hector has better support for CQL3 features than
Astyanax. I contributed some stuff to hector back in December, but I
don't have time to apply those changes to astyanax.
I have other contributions in mind for hector, which I hope to work on
later this year.
On Wed, Jan 30, 201
I migrated my test environment from 1.2.0 to 1.2.1 (DataStax Community) and
nodetool cannot connect to port 7199, even though it is listening. On one node
I get
Failed to connect to 'cassandra4:7199': Connection refused
On another node I get a timeout.
Did I do anything wrong when upgrading?
A good portion of the traffic on this list is questions about:
1) Astyanax
2) cassandra-jdbc
3) cassandra native client
4) pythondra / whatever
With the exception of the native transport, which is only halfway part of
Cassandra, none of these other client issues have much to do with cor
On Wed 30 Jan 2013 02:29:27 AM CST, Zhong Li wrote:
One more question: can I add a virtual node manually without rebooting
and rebuilding a host's data?
I checked the nodetool command, and there is no option to add a node.
On Jan 29, 2013, at 11:09 AM, Zhong Li wrote:
I was misunderstood thi
I totally agree.
On Wed, Jan 30, 2013 at 8:51 PM, Edward Capriolo wrote:
> A good portion of the traffic on this list is questions about:
> 1) Astyanax
> 2) cassandra-jdbc
> 3) cassandra native client
> 4) pythondra / whatever
> With the exception of the native transport which i
Are you using execute_cql3_query() ?
On Jan 30, 2013, at 7:31 AM, "Oleksandr Petrov"
> Hi,
> I'm creating a table via cql3 query like:
> CREATE TABLE posts (
> userid text,
> blog_name text,
> entry_title text,
> posted_at text,
> PRIMARY KEY (userid, blog_name)
> )
Yes, execute_cql3_query, exactly.
On Wed, Jan 30, 2013 at 4:37 PM, Michael Kjellman
> Are you using execute_cql3_query() ?
> On Jan 30, 2013, at 7:31 AM, "Oleksandr Petrov" wrote:
> > Hi,
> >
> > I'm creating a table via cql3 query like:
> >
The high CPU node got replaced and now I'm not getting abnormally high CPU
from one node. They all are evenly balanced now.
On 29 January 2013 16:29, Jabbar wrote:
> Hello,
> I've been testing a cluster of four identical Cassandra 1.2 nodes for a number
> of days. I have written a c# client using
Did you pack the composite correctly? This exception normally shows up when the
composite bytes are malformed.
On Jan 30, 2013, at 7:45 AM, "Oleksandr Petrov" wrote:
Yes, execute_cql3_query, exactly.
On Wed, Jan 30, 2013 at 4:37 PM, Michael Kjellman
From src/java/org/apache/cassandra/db/marshal/
* The encoding of a CompositeType column name should be:
*   <component><component><component> ...
* where <component> is:
*   <length of value><value><'end-of-component' byte>
* where <length of value> is a 2 bytes unsigned short and the
* 'end-of-component' byte should always be 0 for actual column nam
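A minimal sketch of packing and unpacking a composite along the lines of that comment, reading it as: each component is a 2-byte unsigned length, then the value bytes, then one end-of-component byte that is 0 for actual column names. The big-endian byte order is an assumption (it is the default for Java's ByteBuffer), and the function names are made up for this example; this is not the client library's API.

```python
import struct


def pack_composite(components, end_of_component=0):
    """Pack a list of byte strings as length + value + end-of-component byte."""
    out = bytearray()
    for value in components:
        out += struct.pack(">H", len(value))  # 2-byte unsigned short, big-endian
        out += value
        out.append(end_of_component)          # 0 for actual column names
    return bytes(out)


def unpack_composite(data):
    """Split packed composite bytes back into components; raise if malformed."""
    components = []
    offset = 0
    while offset < len(data):
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        value = data[offset:offset + length]
        # The value must be complete and followed by an end-of-component byte.
        if len(value) != length or offset + length >= len(data):
            raise ValueError("malformed composite bytes")
        components.append(value)
        offset += length + 1
    return components


if __name__ == "__main__":
    packed = pack_composite([b"ab", b"c"])
    print(packed.hex())
    print(unpack_composite(packed))
```

Passing a raw, unpacked string where the server expects this layout is one way to end up with malformed composite bytes.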
I recently open sourced a WIP java library for handling timestamped data.
I am looking for feedback/criticism and also interest. It was made
primarily to process lots of small numeric values, without having to load
the entire set into memory.
Anyways, thoughts and feedback appreciated.
I'm sure it helps if I link the thing:
On Wed, Jan 30, 2013 at 8:39 AM, Dan Simpson wrote:
> Hello,
> I recently open sourced a WIP java library for handling timestamped data.
> I am looking for feedback/criticism and also interest. It was made
> primar
> You add a physical node and that in turn adds num_token tokens to the ring.
No, I am talking about virtual nodes with the order preserving partitioner, for an
existing host with multiple tokens set as a list in cassandra.initial_token.
After initial bootstrapping, the host will not be aware of changes of
At what level will the NY talks be? I had been planning on attending
Datastax's big summer conference and I might not be able to get approval
for both, so I'd like to hear more about this one.
On Wed, Jan 30, 2013 at 12:40 PM, Jonathan Ellis wrote:
> ApacheCon North America (Portland, Feb 26
On Wed, Jan 30, 2013 at 7:21 AM, Edward Capriolo wrote:
> My suggestion: At minimum we should re-route these questions to client-dev
> or simply say, "If it is not part of core Cassandra, you are looking in the
> wrong place for support"
+1, I find myself scanning past all those questions in orde
I am using DseDelegateSnitch
Subject: Re: cluster issues
Date: Tue, 29 Jan 2013 20:15:45 +1300
We can always be proactive in keeping the time in sync. But is there any way to
recover from a time drift (in a reactive manner)? Sinc
Are there tickets/documents explaining how data is replicated on virtual nodes? If
there are multiple tokens on one physical host, is there a chance that two or more
tokens chosen by the replication strategy are located on the same host? If I
move/remove/add a token manually, does the Cassandra engine validate this case?
My guess is that those one or two nodes with the GC pressure also have more
rows in your big CF. More rows could be due to imbalanced distribution if
you're not using a random partitioner, or to those nodes not yet having removed
deleted rows, which other nodes may have done.
JVM heap space is used for
What's the output of nodetool cfstats for those 2 column families on
cassNode2 and cassNode3? And what is the replication factor for this
Per the previous reply, nodetool ring should show each of your nodes
with ~16.7% of the data if well balanced.
Also, the auto-detection for memory siz
I had the same problem with 1.2.0. The problem went away after readline was
Yen-Fen Hsu
This is what a row of your table will look like internally…
RowKey: id-value
=> (column=date-value:request-value:, value=, timestamp=1359586739456000)
=> (column=date-value:request-value:data1, value=64617461312d76616c7565,
=> (column=date-value:req
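The cell value in that listing is shown as hex. A small sketch for reading it back, assuming the column holds UTF-8 text (the helper name is made up for this example):

```python
def decode_cli_value(hex_value, encoding="utf-8"):
    """Decode a hex-printed cell value into text."""
    return bytes.fromhex(hex_value).decode(encoding)


if __name__ == "__main__":
    # Value of the data1 cell from the row listing.
    print(decode_cli_value("64617461312d76616c7565"))  # data1-value
```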
erg, that error means it's not really part of the ring.
I would try to restart the joining.
Shut down the node, and delete everything in /var/lib/data/system. You can
leave the data that's already there if you want or delete it.
Then try joining again.
Aaron Morton
Your latencies and distribution look fine.
How big/what types of queries are you issuing? Are you issuing a lot
of large multigets?
Also, do either of these column families have secondary indexes?
On Wed, Jan 30, 2013 at 2:59 PM, Guillermo Barbero
> Iep,
> I missed the attachment...
On Wed, Jan 30, 2013 at 2:44 PM, Guillermo Barbero wrote:
> WARN [MemoryMeter:1] 2013-01-30 21:37:48,079 (line 202)
> setting live ratio to maximum of 64.0 instead of 751.6512549537648
This looks interesting. Doesn't this mean that the ratio of
That looks like a bug, can you create a ticket on
Please include the C* version, the table and insert statements, and whether you
can repro it using CQL 3.
Aaron Morton
Freelance Cassandra Developer
New Zealand
> I think a row mutation is isolated now, but is it across column families?
Correct they are isolated, but only for an individual CF.
> By the way, the wiki page really needs updating.
You can update if you would like to.
Aaron Morton
Freelance Cassandra Developer
On Thu 31 Jan 2013 08:55:40 AM CST, aaron morton wrote:
I think a row mutation is isolated now, but is it across column families?
Correct they are isolated, but only for an individual CF.
By the way, the wiki page really needs updating.
You can update if you would like to.
That should not bother you.
For example, if you're doing an HBase scan that crosses two column families,
that could end up being two (disk) seeks.
Having an API that hides the seeks from you does not give you better
performance, it only helps you when you're debating with people that do not
Some updates:
Since we still have not fully turned on the system, we did something crazy
today: we tried to treat the node as a dead one (my boss wants us to practice
replacing a dead node before going to full production) and bootstrap it. Here
is what we did:
* drain the node
* che
Hi all,
We have a situation where the CPU load on some of the nodes in our cluster has
spiked occasionally since last November, triggered by requests
for rows that reside on two specific sstables.
We confirmed the following (when spiked):
version: 1.0.7(current) <- 0.8.6 <- 0.8.5 <- 0.7.8
Hi All,
I have created a column family as follows. (With secondary indexes.)
create column family users with comparator=UTF8Type and
key_validation_class = 'UTF8Type' and default_validation_class = 'UTF8Type'
and column_metadata=[{column_name: full_name, validation_class: UTF8Type},