Re: Indexes on Columns & SubColumns Clarification

2010-11-04 Thread cbert...@libero.it
In each family, both CF and SCF, data are grouped by rows. Just to give an idea ... Super Column Family Name{ Row 1 { SuperColumn1 { Column1 Key: Column1 Value ... ColumnN Key: ColumnN Value} SuperColumn2 { Column1 Key: Column1 Value, ColumnN K

Re: SSD vs. HDD

2010-11-04 Thread Alaa Zubaidi
Thanks for the advice... We are running on Windows, and I just added more memory to my system (16G); I will run the test again with an 8G heap. The load is continuous; however, CPU usage is around 40% with a max of 70%. As for cache, I am not using cache, because I am under the impression that cach

Re: SSD vs. HDD

2010-11-04 Thread Alaa Zubaidi
It's a little bit different from what most people use it for, and that's why we are trying to test it, to see if we can benefit from the speed of writing/reading, scalability when and if we need it, and also the cost. Part of the testing we are doing is trying to see how many nodes do we ne

Re: Bringing a new node to replace a dead node - streaming is very slow

2010-11-04 Thread Adi
> Hello, > Environment - Cassandra 0.6.5 CentOS55 amazon ec2 image small. > > I am bringing in a new node to replace a dead node. Using new IP and > deadNodeToken -1 as the InitialToken for the new node. The node has started > bootstrapping and I can see the seed node is streaming to the

Indexes on Columns & SubColumns Clarification

2010-11-04 Thread Jim Morrison
Hi, I've been doing a lot of reading and there is one thing I'm not entirely clear on - could someone clarify? Q: Exactly at what point does indexing stop? I'm trying to use Cassandra to store log information that is both user & time sensitive. So I've a basic model like this: detailed_log: { // s

Re: getRange - Cassandra

2010-11-04 Thread Nate McCall
It is the row key for one of your results. See the toString method on RowImpl in the hector source. Also, please feel free to direct any hector specific questions to: hector-us...@googlegroups.com On Thu, Nov 4, 2010 at 9:03 AM, Massimo Carro wrote: > Hi at all, > > I try to make a getRange with

Bringing a new node to replace a dead node - streaming is very slow

2010-11-04 Thread Adi
Hello, Environment - Cassandra 0.6.5 CentOS55 amazon ec2 image small. I am bringing in a new node to replace a dead node. Using new IP and deadNodeToken -1 as the InitialToken for the new node. The node has started bootstrapping and I can see the seed node is streaming to the new node.
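A minimal sketch of the "deadNodeToken - 1" arithmetic described above, in Python. It assumes the cluster uses RandomPartitioner with a token space of 0 to 2**127; the function name and the wrap-around handling are illustrative assumptions, not something stated in the thread.

```python
# Assumption: RandomPartitioner, whose tokens are integers in 0 .. 2**127.
TOKEN_SPACE = 2**127


def replacement_token(dead_node_token: int) -> int:
    """Return the token immediately before the dead node's token.

    The modulo keeps the result inside the assumed token space if the
    dead node's token happens to be 0 (hypothetical edge case).
    """
    return (dead_node_token - 1) % TOKEN_SPACE


if __name__ == "__main__":
    # Hypothetical token value, used only to show the calculation.
    dead = 85070591730234615865843651857942052864
    print(replacement_token(dead))
```

The printed value is what would be entered as the InitialToken for the new node under these assumptions.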

getRange - Cassandra

2010-11-04 Thread Massimo Carro
Hi all, I am running a getRange with Hector and it works. But I don't understand why, at the end of the read, I find a column like [Row(eeg,ColumnSlice([]))]. Does anyone know what "eeg" is? Thanks a lot! Massimo Carro www.liquida.it - www.liquida.com

Re: SSD vs. HDD

2010-11-04 Thread Juho Mäkinen
Do you really need Cassandra to store just 80 GB of data for just four hours? It might be just me, but this sounds quite far removed from normal Cassandra usage. Cassandra isn't happy unless you run enough nodes to cover one or two nodes doing compaction (which hurts node performance). Are you

Re: SSD vs. HDD

2010-11-04 Thread Nick Telford
If you're bottlenecking on read I/O, making proper use of Cassandra's key cache and row cache will improve things dramatically. A little maths using the numbers you've provided tells me that you have about 80GB of "hot" data (data valid in a 4 hour period). That's obviously too much to directly cac

Re: cassandra data spreading across the cluster

2010-11-04 Thread Juho Mäkinen
The load contains duplicate data which is created due to compaction. Run the 'cleanup' command with nodetool on those big nodes and you should see the load drop to the actual usage. - Garo On Thu, Nov 4, 2010 at 11:08 AM, Mark Zitnik wrote: > Hi All, > > I'm having a problem in spreading data acros

cassandra data spreading across the cluster

2010-11-04 Thread Mark Zitnik
Hi All, I'm having a problem spreading data across the cluster. My replication factor is 3; please advise why there is a big difference between 10.11.40.239 and 10.11.40.161. Thanks Address Status Load Range Ring 10.11.40.173 Up 220.58

Re: SSD vs. HDD

2010-11-04 Thread Peter Schuller
> I am having time out errors while reading. > I have 5 CFs but two CFs with high write/read. > The data is organized in time series rows, in CF1 the new rows are read > every 10 seconds and then the whole rows are deleted, While in CF2 the rows > are read in different time range slices and eventua