Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Radim Kolar
From my point of view, vector clocks are too much overhead. If you sync clocks in your cluster using NTP (which you should do anyway) you will get clock precision < 1/1000s, which is good enough. All my machines running NTP have an offset < 1/1000s. They are FreeBSD; Linux is not that precise in clock syncin

Re: preloading entire CF with SEQ access on startup

2011-08-24 Thread aaron morton
Nothing automatic, you can do it by using range slices that request 0 columns. Once you have a hot cache it will be automatically saved and re-loaded at startup if you have enabled row_cache_save_period or key_cache_save_period for the CF. Cheers - Aaron Morton Freelance Cassand

Re: multi-node cassandra config doubt

2011-08-24 Thread aaron morton
Did you get this sorted ? At a guess I would say there are no nodes listed in the Hadoop JobConf. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 23/08/2011, at 9:51 PM, Thamizh wrote: > Hi All, > > This is regarding multi-node

Re: run Cassandra tutorial example

2011-08-24 Thread aaron morton
HColumn(city=Austin) Is the data you are after. Have a look in src/main/resources/log4j.properties if you want to change the logging settings. Have fun. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at 4:06 AM, Alvin UW

Re: Customized Secondary Index Schema

2011-08-24 Thread aaron morton
IMHO it's only a scalability problem if those nodes have trouble handling the throughput. The load will go to all replicas, not one, unless you turn off Read Repair. If it is a problem then you could manually partition the index into multiple rows, bit of a pain though. I'd wait and see, or
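One way to read "manually partition the index into multiple rows" is to spread a single index row over a fixed number of bucket rows, picking the bucket by hashing the indexed entry. The sketch below is plain Python and only illustrates the key layout; the function names, the `index_name:bucket` row-key format, and the bucket count of 16 are assumptions made for illustration, not anything defined by Cassandra or stated in this thread.

```python
import zlib

NUM_BUCKETS = 16  # assumption: a fixed bucket count chosen up front


def index_row_key(index_name, entry_key, num_buckets=NUM_BUCKETS):
    """Return the row key of the index partition that should hold entry_key."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(entry_key.encode("utf-8")) % num_buckets
    return "%s:%d" % (index_name, bucket)


def all_index_row_keys(index_name, num_buckets=NUM_BUCKETS):
    """Return every partition row key; a full index read has to visit all of them."""
    return ["%s:%d" % (index_name, b) for b in range(num_buckets)]
```

The trade-off is the "bit of a pain" mentioned above: writes touch one bucket row, but reading the whole index means fetching every bucket row and merging the results on the client.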

Re: checksumming

2011-08-24 Thread aaron morton
At the file level see https://issues.apache.org/jira/browse/CASSANDRA-674 At the higher level there is node tool repair http://wiki.apache.org/cassandra/AntiEntropy. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at

Re: Could Not connect to cassandra-cli on windows

2011-08-24 Thread aaron morton
Not off the top of my head. Can you get 0.7.8 running with a pre-packaged client ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at 12:16 PM, Alaa Zubaidi wrote: > Hi Aaron, > We are using Thrift 5.. >

Re: cassandra unexpected shutdown

2011-08-24 Thread aaron morton
First thing is are you on 0.8 ? It has some automagical memory management that is both automatic and magical http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/ Secondly if you are OOM'ing you need to look at how much memory your schema is taking. See the link above, or just use 0.8

Cassandra-CLI does not allow list 1105115; with Syntax error

2011-08-24 Thread Renato Bacelar da Silveira
Hi All Good day, A question concerning Cassandra-Cli. I have a Column Family named 11001500. I have inserted the CF with Hector, and it did not throw any exception concerning the name of the column. If I am issuing the command list 1105115; I incur the following error: [default@unknown] l

Cassandra-CLI does not allow list 1105115; with Syntax error

2011-08-24 Thread Renato Bacelar da Silveira
Just some information about the Column family in question: ColumnFamily: 1105100 Key Validation Class: org.apache.cassandra.db.marshal.BytesType Default column value validator: org.apache.cassandra.db.marshal.BytesType Columns sorted by: org.apache.cassandra.db.marshal.Asc

nodetool repair does not return...

2011-08-24 Thread Boris Yen
Hi, In our testing environment, we got two nodes with RF=2 running 0.8.4. We tried to test the repair functions of cassandra, however, every once in a while, the "nodetool repair" never returns. We have checked the system.log, nothing seems to be out of ordinary, no errors, no exceptions. The data is

Re: multi-node cassandra config doubt

2011-08-24 Thread Thamizh
Hi Aaron, This is yet to be resolved. I have set up Cassandra multi-node clustering and am facing issues in pushing HDFS data to Cassandra. When I ran the "MapReduce" program I got an UnknownHostException. In hadoop(0.20.1), I have configured node01-as master and node01, node02 & node03 as slav

Re: help creating data model

2011-08-24 Thread Helder Oliveira
Thanks Indranath Ghosh for your tip! I will continue the question here. Aaron, I have read your suggestion and tried to design it, and I have one question regarding it. Let's forget for now the Requests and Events! Just keep the Visitants and the Sessions. My goal is when having a

Re: Customized Secondary Index Schema

2011-08-24 Thread Alvin UW
Thanks. 2011/8/24 aaron morton > IMHO it's only a scalability problem if those nodes have trouble handling > the throughput. The load will go to all replicas, not one, unless you turn > off Read Repair. > > If it is a problem then you could manually partition the index into > multiple rows, bit

Re: checksumming

2011-08-24 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-1717 added block level checksums. On Wed, Aug 24, 2011 at 4:28 AM, aaron morton wrote: > At the file level see https://issues.apache.org/jira/browse/CASSANDRA-674 > At the higher level there is node tool > repair http://wiki.apache.org/cassandra/Ant

Re: Could Not connect to cassandra-cli on windows

2011-08-24 Thread Alaa Zubaidi
Hi Aaron, I cannot at this point of time.. Thanks for your help.. Alaa On 8/24/2011 2:30 AM, aaron morton wrote: Not off the top of my head. Can you get 0.7.8 running with a pre-packaged client ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thel

Re: Commit log fills up in less than a minute

2011-08-24 Thread Anand Somani
So, I restarted the cluster (not rolling), but it is still maintaining hints for the IP's that are no longer part of the ring. nodetool ring shows things correctly (as only 3 nodes). When I check thru the jmx hintedhandoff manager, it shows it is maintaining the hints for those non existent IP's. S

Re: Commit log fills up in less than a minute

2011-08-24 Thread Anand Somani
So I have looked at the cluster from - Cassandra-client - describe cluster => shows correctly - 3 nodes - used the StorageService - JMX bean =>UnreachableNodes - shows 0 If all these show the correct ring state, why are hints being maintained, looks like that is the only way to find out ab

Re: cassandra unexpected shutdown

2011-08-24 Thread Ernst D Schoen-René
If by magical, you mean magically shuts down randomly, then yes, that is magical. We're on 8, but we discovered that 8 has an undocumented feature where turning off the commitlog doesn't work, so we're upgrading to 8.1 or whatever is current. It doesn't seem to be tied to high or low load, r

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Ryan King
On Tue, Aug 23, 2011 at 7:58 PM, Kevin Burton wrote: > I had a thread going the other day about vector clock memory usage and that > it is a series of (clock id, clock):ts and the ability to prune old entries > … I'm specifically curious here how often old entries are pruned. > > If you're storin

Cassandra-cli not able to find CF after fresh CF insert.

2011-08-24 Thread Renato Bacelar da Silveira
Hi All Good day, I have again come across a situation where the CF is not being found by the list command... it would be too painful at this stage to restart the node just to be able to query the CF... *ColumnFamily: a1307* Key Validation Class: org.apache.cassandra.db.marshal.BytesTyp

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Kevin Burton
This is really interesting… I can track it down but there are a number of references to Cassandra HAVING vector clocks … which would make sense that I can't find out how much memory they are using :-P "Cassandra: The Definitive Guide" … which I was reading the other night says that they were intro

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Jeremy Hanna
At the point that book was written (about a year ago it was finalized), vector clocks were planned. In August or September of last year, they were removed. 0.7 was released in January. The ticket for vector clocks is here and you can see the reasoning for not using them at the bottom. https

how to migrate?

2011-08-24 Thread William Oberman
I was hoping to transition my "simple" cassandra cluster (where each node is a cassandra + hadoop tasktracker) to a cluster with two virtual datacenters (vanilla cassandra vs. cassandra + hadoop tasktracker), based on this: http://wiki.apache.org/cassandra/HadoopSupport#ClusterConfig The problem

Re: cassandra unexpected shutdown

2011-08-24 Thread Ernst D Schoen-René
So, we're on 8, so I don't think there's a key cache setting. Am I wrong? here's my newest crash log: ERROR [Thread-210] 2011-08-24 06:29:53,247 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-210,5,main] java.util.concurrent.RejectedExecutionException: ThreadPo

question about cassandra.in.sh

2011-08-24 Thread Koert Kuipers
i have an existing cassandra instance on my machine, it came with brisk and lives in /usr/share/brisk/cassandra. it also created /usr/share/cassandra/cassandra.in.sh now i wanted to run another instance of cassandra (i needed a 0.7 version for compatibility reasons), so i downloaded it from apach

Cassandra Node Requirements

2011-08-24 Thread Jacob, Arun
I'm trying to determine a node configuration for Cassandra. From what I've been able to determine from reading around: 1. we need to cap data size at 50% of total node storage capacity for compaction 2. with RF=3, that means that I need to effectively assume that I have 1/6th of total stor
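The "1/6th" figure in the question follows from combining the two rules of thumb it lists: half the disk is kept free for compaction, and every row is stored RF times. A small Python sketch of that arithmetic is below; the function name, parameter names, and the 6-node, 1000 GB example are hypothetical, and the 50% headroom is the poster's assumption (a later reply in this thread notes that it comes with fine print).

```python
def usable_capacity_gb(raw_gb_per_node, node_count,
                       replication_factor=3, compaction_headroom=0.5):
    """Estimate how much unique data a cluster can hold.

    compaction_headroom is the fraction of each disk kept free for
    compaction (0.5 is the rule of thumb quoted in the question).
    """
    cluster_raw_gb = raw_gb_per_node * node_count
    # Space left after reserving headroom, divided among the replicas.
    return cluster_raw_gb * (1.0 - compaction_headroom) / replication_factor


# Hypothetical cluster: 6 nodes with 1000 GB each, RF=3.
# 6000 GB raw * 0.5 / 3 = 1000 GB of unique data, i.e. 1/6th of raw capacity.
example_gb = usable_capacity_gb(1000, 6)
```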

Re: Cassandra-CLI does not allow list 1105115; with Syntax error

2011-08-24 Thread aaron morton
Similar to https://issues.apache.org/jira/browse/CASSANDRA-3054 can you create a new ticket and link to that one. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at 10:33 PM, Renato Bacelar da Silveira wrote: > Just some inf

Re: Cassandra Node Requirements

2011-08-24 Thread Edward Capriolo
On Wed, Aug 24, 2011 at 2:54 PM, Jacob, Arun wrote: > I'm trying to determine a node configuration for Cassandra. From what I've > been able to determine from reading around: > > >1. we need to cap data size at 50% of total node storage capacity for >compaction >2. with RF=3, that mea

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Ryan King
We did have a Clock construct for a while, but it never made it into a released version (afaik). We thought about using them for counters. Timestamps are endemic to the data model and therefore can never be pruned. Cassandra basically trades memory for availability here. -ryan On Wed, Aug 24, 2011

Re: multi-node cassandra config doubt

2011-08-24 Thread aaron morton
Jump on the machine that raised the error and see if you can ssh to node01, or try using IP addresses to see if they work. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at 11:34 PM, Thamizh wrote: > Hi Aaron, > > T
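The same check can be scripted rather than done by hand over ssh. The sketch below is a minimal Python example that reports which hostnames fail to resolve on the machine where it runs; the helper name `unresolved_hosts` is hypothetical, and the node names are the ones mentioned earlier in this thread. It only tests name resolution, which is the usual cause of an UnknownHostException, not whether Cassandra or Hadoop is reachable on those hosts.

```python
import socket


def unresolved_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the hostnames that the resolver cannot turn into an address."""
    failed = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Run this on the node that raised the error.
    print(unresolved_hosts(["node01", "node02", "node03"]))
```

Any name that shows up in the output needs an entry in DNS or /etc/hosts on that machine, or the job configuration can use IP addresses instead.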

Re: help creating data model

2011-08-24 Thread aaron morton
I normally suggest trying a model with Standard CF's first as there are some downsides to super CF's. If you know there will only be a few sub columns they are probably OK (see http://wiki.apache.org/cassandra/CassandraLimitations). Your alternative design is fine. Test it out and see what wo

Re: run Cassandra tutorial example

2011-08-24 Thread Thairu
"Error stacktraces" is output from maven. mvn -e option turns on Error reporting. From: aaron morton mailto:aa...@thelastpickle.com>> Reply-To: "user@cassandra.apache.org" mailto:user@cassandra.apache.org>> Date: Wed, 24 Aug 2011 02:14:12 -0700 To: "user@cassandr

Atomic or Non-Atomic Counters

2011-08-24 Thread Sal Fuentes
The design document that is referenced on the Cassandra wiki page ( http://wiki.apache.org/cassandra/Counters) describes the Counters in Cassandra as non-atomic ( https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf). However, the DataStax post on counters ( ht

Re: Customized Secondary Index Schema

2011-08-24 Thread Ryan King
On Tue, Aug 23, 2011 at 10:03 AM, Alvin UW wrote: > Hello, > > As mentioned by Ed Anuff in his blog and slides, one way to build customized > secondary index is: > We use one CF, each row to represent a secondary index, with the secondary > index name as row key. > For example, > > Indexes = { > "

Re: Cassandra Node Requirements

2011-08-24 Thread Jacob, Arun
Thanks for the links and the answers. The vagueness of my initial questions reflects the fact that I'm trying to configure for a general case — I will clarify below: I need to account for a variety of use cases. (1) they will be both read and write heavy. I was assuming that SSDs would be real

Re: question about cassandra.in.sh

2011-08-24 Thread Eric Evans
On Wed, Aug 24, 2011 at 1:28 PM, Koert Kuipers wrote: > my problem is that the scripts for my cassandra 0.7 instance don't work > properly. the problem lies in the code snippets below. when i run the > scripts they source /usr/share/cassandra/cassandra.in.sh, which has the > wrong settings (it now

Re: Atomic or Non-Atomic Counters

2011-08-24 Thread Jonathan Ellis
They are atomic in the sense that if you increment from N to M, readers will never see any intermediate values, just N or M itself. On Wed, Aug 24, 2011 at 6:50 PM, Sal Fuentes wrote: > The design document that is referenced on the Cassandra wiki page > (http://wiki.apache.org/cassandra/Counters)

Re: Cassandra Node Requirements

2011-08-24 Thread Jonathan Ellis
On Wed, Aug 24, 2011 at 1:54 PM, Jacob, Arun wrote: > we need to cap data size at 50% of total node storage capacity for > compaction Sort of. There's some fine print, such as the 50% number is only if you're manually forcing major compactions, which is not recommended, but a bigger thing to kno

Re: nodetool repair does not return...

2011-08-24 Thread Boris Yen
Would Cassandra-2433 cause this? On Wed, Aug 24, 2011 at 7:23 PM, Boris Yen wrote: > Hi, > > In our testing environment, we got two nodes with RF=2 running 0.8.4. We > tried to test the repair functions of cassandra, however, every once in a > while, the "nodetool repair" never returns. We have che

Re: how to know if nodetool cleanup is safe?

2011-08-24 Thread Yan Chunlu
got it! thanks a lot for the explanation! On Wed, Aug 24, 2011 at 1:06 AM, Edward Capriolo wrote: > > On Tue, Aug 23, 2011 at 11:56 AM, Sam Overton wrote: > >> On 21 August 2011 12:34, Yan Chunlu wrote: >> >>> since "nodetool cleanup" could remove hinted handoff, will it cause the >>> data los

For multi-tenant, is it good to have a key space for each tenant?

2011-08-24 Thread Guofeng Zhang
I wonder if it is a good practice to create a key space for each tenant. Any advice is appreciated. Thanks

Re: For multi-tenant, is it good to have a key space for each tenant?

2011-08-24 Thread Himanshi Sharma
I am working on a similar sort of thing. As far as I know, creating a keyspace for each tenant would impose a lot of memory constraints. Using a shared keyspace and shared column families would be a better approach, and each row in a CF could be referred to by tenant_id as the row key. And again it depe
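A minimal sketch of the shared-column-family idea, in plain Python: every row key carries the tenant_id, so rows from different tenants can live in the same CF without colliding. The `tenant_id:entity_id` layout, the separator, and the function names are assumptions made for illustration; the message above only says that rows are referred to by tenant_id.

```python
SEPARATOR = ":"  # assumption: tenant ids never contain this character


def tenant_row_key(tenant_id, entity_id):
    """Build a row key that is unique per tenant within a shared CF."""
    if SEPARATOR in tenant_id:
        raise ValueError("tenant_id must not contain %r" % SEPARATOR)
    return tenant_id + SEPARATOR + entity_id


def split_tenant_row_key(row_key):
    """Recover (tenant_id, entity_id) from a key built by tenant_row_key."""
    tenant_id, entity_id = row_key.split(SEPARATOR, 1)
    return tenant_id, entity_id
```

With this layout, isolation between tenants depends entirely on the application always building keys through the helper, which is the main cost compared with a keyspace per tenant.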