Re: how large can a cluster over the WAN be?

2011-03-08 Thread Robert Coli
On Mon, Mar 7, 2011 at 11:32 AM, John Lewis wrote: > When you say decent latency and throughput what numbers do you consider > decent? I know throughput would be highly dependent on the quantity of kb > shoved through the pipe so I would expect throughput needs would be highly > dependent on th

setting consistency level

2011-03-08 Thread Sagar Kohli
Hi, Can we define the consistency level in the yaml file (or at the time of designing the Cassandra data model)? My question may sound stupid since I'm still in the process of understanding Cassandra :)... Thanks and regards, Sagar

Re: Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-08 Thread Norman Maurer
Yeah, this makes sense as far as I can tell. Bye, Norman 2011/3/8 Aditya Narayan > > My application displays list of several blogs' overview data (like > blogTitle/ nameOfBlogger/ shortDescrption for each blog) on 1st page (in > very much similar manner like Digg's newsfeed) and when the user

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread aaron morton
It looks like the node is sending out its application state and waiting the required time, after which it expects to know about all other nodes in the cluster. > INFO [main] 2011-03-07 17:04:06,660 StorageService.java (line 399) Joining: > sleeping 3 ms for pending range setup For some reaso

auto_bootstrap setting after bootstrapping

2011-03-08 Thread Maki Watanabe
Hello, According to the Wiki/StorageConfiguration page, auto_bootstrap is described as below: auto_bootstrap Set to 'true' to make new [non-seed] nodes automatically migrate the right data to themselves. (If no InitialToken is specified, they will pick one such that they will get half the rang

Re: TException: Error: TSocket read 0 bytes

2011-03-08 Thread aaron morton
Just checking the version of Thrift: you said 0.7.2, but the latest stable is 0.6. Unfortunately, for Cassandra 0.6 you need to match a specific SVN release for Thrift; see http://wiki.apache.org/cassandra/InstallThrift For Cassandra 0.6.12 it's r917130. Is there a reason you are using cassandra 0.6.12

Re: setting consistency level

2011-03-08 Thread aaron morton
Consistency is set by the client for each read or write request. You define the Replication Factor when creating the Keyspace, either in cassandra.yaml or as part of the create keyspace statement using the cassandra-cli. For background... Check the docs if any for the high level client you are

Re: Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-08 Thread aaron morton
You could duplicate the data from CF1 in CF2 as well (use a batch_mutate call through whatever client you have). So when serving the second page you only need to read one row from CF2. Aaron On 8/03/2011, at 8:13 PM, Norman Maurer wrote: > Yeah, this makes sense as far as I can tell. > > > Bye,

Re: auto_bootstrap setting after bootstrapping

2011-03-08 Thread aaron morton
AFAIK yes. The node marks itself as bootstrapped whenever it starts, and will not re-bootstrap once that is set. More info here http://wiki.apache.org/cassandra/Operations#Bootstrap Hope that helps. Aaron On 8/03/2011, at 9:35 PM, Maki Watanabe wrote: > Hello, > According to the Wiki/Stora

Re: changing ip's ...

2011-03-08 Thread Sasha Dolgy
One of the issues with EC2 is that after a reboot the internal IP changes. This caused a big problem for me yesterday. On Mar 8, 2011 2:29 AM, "aaron morton" wrote: > Not sure this fits your problem, but if you pass -Dcassandra.load_ring_state=false as a JVM option it will stop the node from loading

Re: Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-08 Thread Aditya Narayan
Yes Aaron I thought about that but that doesnt seem to be just a small amount of data either (contains text), but yes we can consider to do so later as we find the need for it.. Thank you both! On Tue, Mar 8, 2011 at 2:25 PM, aaron morton wrote: > You could duplicate the data from CF1 in CF2 a

Re: when do snapshots go away?

2011-03-08 Thread Sylvain Lebresne
On Tue, Mar 8, 2011 at 1:53 AM, Jeffrey Wang wrote: > Hi all, > > > > When I drop a column family, it creates a snapshot. When does the snapshot > go away and free up the disk space? I was able to run nodetool clearsnapshot > to get rid of them, but will they go away themselves? (Also, is there a

Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-08 Thread Jean-Christophe Sirot
On 03/07/2011 10:08 PM, Aaron Morton wrote: You can fill your boots. So long as your boots have a capacity of 2 billion. Background ... http://wiki.apache.org/cassandra/LargeDataSetConsiderations http://wiki.apache.org/cassandra/CassandraLimitations http://www.pcworld.idg.com.au/article/37348

Re: recommended way to grow a cluster?

2011-03-08 Thread aaron morton
I do not know of any articles I could send your way, and others may have some tales from running production systems. But here are a few thoughts, others please correct me if I am wrong: - the replication factor is not intended to be changed on a running system. It can be, but it will be a heav

Re: auto_bootstrap setting after bootstrapping

2011-03-08 Thread Maki Watanabe
Thx! 2011/3/8 aaron morton : > AFAIK yes. The node marks itself as bootstrapped whenever it starts, and > will not re-bootstrap once that is set. > More info here > http://wiki.apache.org/cassandra/Operations#Bootstrap > Hope that helps. > Aaron > On 8/03/2011, at 9:35 PM, Maki Watanabe wrote: > >

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Chris Goffinet > How large are your SSTables on disk? My thought was because you have so > many on disk, we have to store the bloom filter + every 128 keys from index > in memory. > > 0.5GB But as I understand store in memory happens only when read happens, i do only inserts. And i thin

London meetup on Hadoop Integration

2011-03-08 Thread Dave Gardner
Hi all, This month's London user group will be on the topic of Hadoop integration. If anyone is interested in sharing knowledge about how they use Hadoop with Cassandra then please get in touch, there are some speaker slots available. If you'd like to learn more then please come along! http://www

Re: Nodes frozen in GC

2011-03-08 Thread David Boxenhorn
If RF=2 and CL= QUORUM, you're getting no benefit from replication. When a node is in GC it stops everything. Set RF=3, so when one node is busy the cluster will still work. On Tue, Mar 8, 2011 at 11:46 AM, ruslan usifov wrote: > > > 2011/3/8 Chris Goffinet > >> How large are your SSTables on di
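The arithmetic behind this advice can be sketched in a few lines. This is only an illustration, assuming the usual quorum definition of floor(RF / 2) + 1 replicas; the function names `quorum` and `tolerable_failures` are made up for the example and are not part of any Cassandra client.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must answer a QUORUM request: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerable_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable (e.g. paused in GC) while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"replicas that may be down={tolerable_failures(rf)}")
```

With RF=2 the quorum is 2, so every replica has to answer and a single paused node blocks the request; with RF=3 the quorum is still 2, which leaves room for one unavailable replica.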

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Paul Pak > Hi Ruslan, > > It looks like Jonathan and Stu have already been working to reduce garbage > collection on v.8 The ticket is at > https://issues.apache.org/jira/browse/CASSANDRA-2252 > > Jonathan, is there any way to apply the patch to .73 and have ruslan test > it to see if

0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
I have 1000's of these in the log is this normal? java.io.IOError: java.io.EOFException: bloom filter claims to be longer than entire row size at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:117) at org.apache.cassandra.db.CompactionMan

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
> So, are you saying this is normal and expected from Cassandra?  So, > under load, we can expect java garbage collection to stop the Cassandra > process on that server from time to time, essentially taking out the > node for short periods of time while it does garbage collection? This thread is g

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
> (1) I cannot stress this one enough: Run with -XX:+PrintGC > -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output. Actually, I wonder if it's worth someone getting this enabled by default, with the obvious problems associated with getting the log output placed appropriately and rota

Re: setting consistency level

2011-03-08 Thread Mayank Mishra
Sagar, Consistency level defines how your reads and writes should work. You can defer it according to your needs, defines what are your expectations when you are reading/writing data. Hence, they are not static to Keyspace/CF metadata. With regards, Mayank On 08-03-2011 13:15, Sagar Kohli w

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
Also: * What is the frequency of the pauses? Are we talking every few seconds, minutes, hours, days * If you say decrease the load down to 25%. Are you seeing the same effect but at 1/4th the frequency, or does it remain unchanged, or does the problem go away completely? -- / Peter Schuller

Re: changing ip's ...

2011-03-08 Thread David McNelis
I've run into this issue as well when running a test instance on my laptop. In the office (where I set it up) I have no issues, go outside the office on a different network, different story. I'll try your suggestion, Aaron. On Tue, Mar 8, 2011 at 12:43 AM, Sasha Dolgy wrote: > One of the issue

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Peter Schuller > > (1) I cannot stress this one enough: Run with -XX:+PrintGC > -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output. > (2) Attach to your process with jconsole or some similar tool. > (3) Observe the behavior of the heap over time. Preferably post > screensh

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
> JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime" > JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log" Add: JVM_OPTS="$JVM_OPTS -XX:+PrintGC" JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails" JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps" And you will see significantly more detail in the GC log.

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
> Add: > > JVM_OPTS="$JVM_OPTS -XX:+PrintGC" > JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails" > JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps" > > And you will see significantly more detail in the GC log. Maybe you want to add -XX:+PrintGCApplicationConcurrentTime while you're at it. But the key is to se

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
>                 $client->batch_mutate($mutations, > cassandra_ConsistencyLevel::QUORUM); Btw, what are the mutations? Are you doing something like inserting both very small values and very large ones? In any case: My main reason to butt back into this thread is that under normal circumstances y

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
> Also, why is there so much garbage collection to begin with?  Memcache > uses a slab allocator to reuse blocks to prevent allocation/deallocation > of blocks from consuming all the cpu time.  Are there any plans to reuse > blocks so the garbage collector doesn't have to work so hard? And to addr

problem with bootstrap

2011-03-08 Thread Patrik Modesto
Hi, I've small test cluster, 2 servers, both running successfully cassandra 0.7.3. I've three keyspaces, two with RF1, one with RF3. Now when I try to bootstrap 3rd server (empty initial_token, auto_bootstrap: true), I get this exception on the new server. INFO 23:13:43,229 Joining: getting boot

nodetool repair hung in 0.7.3

2011-03-08 Thread Karl Hiramoto
I never saw this before upgrading to 0.7.3 but now I do nodetool repair and it sits there for hours. Previously it took about 20 minutes per node (about 10GB of data per node). I had some OOM crashes, but haven't seen them since I increased the heap size and decreased the key cache. In the

Re: nodetool repair hung in 0.7.3

2011-03-08 Thread Sylvain Lebresne
I just saw repair hang here too, it's actually very easy to reproduce. I'm looking at it right now. -- Sylvain On Tue, Mar 8, 2011 at 4:30 PM, Karl Hiramoto wrote: > I never saw this before upgrading to 0.7.3 but now I do nodetool repair and > it sits there for hours. Previously it took about

Re: nodetool repair hung in 0.7.3

2011-03-08 Thread Karl Hiramoto
On 08/03/2011 16:34, Sylvain Lebresne wrote: I just saw repair hang here too, it's actually very easy to reproduce. I'm looking at it right now. -- Thanks. Should i bump GCGraceSeconds since i can no longer repair? I tried repair on 3 nodes of a 6 node cluster and they all hang. Woul

Re: recommended way to grow a cluster?

2011-03-08 Thread Peter Schuller
> - When adding nodes to a cluster it's mode efficient if you can change the > range to existing nodes to be a sub set of what they were responsible for > previously. So the node only has to stream out data, rather than stream out > and stream in data. Say you have this contrived example (where val

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Jonathan Ellis
Is he trying to bootstrap? What does that have to do with failure recovery? Doesn't make sense to me. On Tue, Mar 8, 2011 at 2:33 AM, aaron morton wrote: > It looks like the node is sending out its application state and waiting the > required time after which it expects to know about all other

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Jonathan Ellis
No. What is the history of your cluster? On Tue, Mar 8, 2011 at 5:34 AM, Karl Hiramoto wrote: > I have 1000's of these in the log  is this normal? > > java.io.IOError: java.io.EOFException: bloom filter claims to be longer than > entire row size >        at > org.apache.cassandra.io.sstable.SSTa

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
On 08/03/2011 17:09, Jonathan Ellis wrote: No. What is the history of your cluster? It started out as 0.7.0 - RC3 And I've upgraded 0.7.0, 0.7.1, 0.7.2, 0.7.3 within a few days after each was released. I have 6 nodes about 10GB of data each RF=2. Only one CF every row/column has a T

Several 'TimedOutException' in stress.py

2011-03-08 Thread A J
Trying out stress.py on AWS EC2 environment (4 Large instances. Each of 2-cores and 7.5GB RAM. All in the same region/zone.) python stress.py -o insert -d 10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t 10 -n 500 -S 100 -k (I want to try with column size of about 1MB. I

Re: when do snapshots go away?

2011-03-08 Thread Robert Coli
On Tue, Mar 8, 2011 at 1:25 AM, Sylvain Lebresne wrote: > And it's far easier for you to know what to do with the snapshot > (whether that is deleting it or archiving it somewhere) than for the > application. Snapshots also have the neat property of not being the full size of your corpus unless y

Re: how large can a cluster over the WAN be?

2011-03-08 Thread John Lewis
Thanks for the reply, I realize my question was rather nebulous as I consider this proposed deployment to be rather nebulous as well. Any bit of information and a direction on which sections of documentation are relevant helps this challenge become less nebulous over time. I will do some reading

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread mcasandra
I turned the auto_bootstrap off and it worked fine. I don't think it's a connectivity issue or network issue at all. I am very confused about what's going on here. Can you please let me know if this is a bug that I am facing? Also, what are the disadvantages of turning off auto bootstrap? Do I need to

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Peter Schuller
> Also, what are the disadvantage of turning off auto bootstrap? Do I need to > do anything after the fact? Inserting a new node into a ring without auto_bootstrap implies that it will join the ring, but will not contain any data for which it is supposedly responsible. A 'nodetool repair' should c

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Peter Schuller
> 2) When I brought 2 nodes down (out of 3), I was able to start one node > (with 66 % load below) even though auto_bootstrap is set to true. Shouldn't > it have failed for the same reason? This is a good point/question. As far as I can tell, a node being bootstrapped would need to receive data fr

Re: Alternative to repair

2011-03-08 Thread Daniel Doubleday
Thanks for the reply! > Not really: > > - range scans do not perform read repair Ok I obviously overlooked that RangeSliceResponseResolver does not repair rows on nodes that never saw a write for a given key at all. But that's not a big problem for us since we are mainly interested in fixing m

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread ruslan usifov
2011/3/8 A J > Trying out stress.py on AWS EC2 environment (4 Large instances. Each > of 2-cores and 7.5GB RAM. All in the same region/zone.) > > python stress.py -o insert -d > 10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t > 10 -n 500 -S 100 -k > > (I want to try wit

Re: Nodes frozen in GC

2011-03-08 Thread Paul Pak
Hi Ruslan, Is it possible for you to tell us the details on what you have done which measurably helped your situation, so we can start a "best practices" doc on growing cassandra systems? So far, I see that under load, cassandra is rarely "ready" to take heavy load in its default configuration

Cassandra Meetup in Austin, TX

2011-03-08 Thread Nate McCall
http://www.meetup.com/Cassandra-Austin/ Waiting on a few more folks to join before we start discussing a date, so please signup if you are in the area.

Re: Cassandra Meetup in Austin, TX

2011-03-08 Thread Jake Luciani
There is also a newly formed NYC area Cassandra User Group http://www.meetup.com/NYC-Cassandra-User-Group On Tue, Mar 8, 2011 at 1:46 PM, Nate McCall wrote: > http://www.meetup.com/Cassandra-Austin/ > > Waiting on a few more folks to join before we start discussing a date, > so please signup i

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread mcasandra
I am as clear as mud with what is happening here :) But with some suggestions I can try to start my test from scratch and post results in that order. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Exception-when-bringing-up-nodes-during-failure-

cassandra + zabbix

2011-03-08 Thread pob
Hello, I'm using Cassandra with mx4j. I was googling for half a day but can't find anything usable to connect it with Zabbix. I just found Zapcat, but I don't want to make any changes to the code, and then Munin with plugins https://github.com/jamesgolick/cassandra-munin-plugins... I need something to being easy co

Re: cassandra + zabbix

2011-03-08 Thread ruslan usifov
You can simply write your own Java agent (this doesn't require a change to the source code). You can begin from http://blogs.sun.com/jmxetc/entry/connecting_through_firewall_using_jmx, http://download.oracle.com/javase/6/docs/api/java/lang/instrument/package-summary.html, In agent you can start HTTP

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Paul Pak > Hi Ruslan, > > Is it possible for you to tell us the details on what you have done which > measurably helped your situation, so we can start a "best practices" doc on > growing cassandra systems? > > So far, I see that under load, cassandra is rarely "ready" to take heavy > l

RE: Cassandra Meetup in Austin, TX

2011-03-08 Thread Sanchez, Carlos
Anything in Dallas? From: Jake Luciani [mailto:jak...@gmail.com] Sent: Tuesday, March 08, 2011 12:53 PM To: user@cassandra.apache.org Subject: Re: Cassandra Meetup in Austin, TX There is also a newly formed NYC area Cassandra User Group http://www.meetup.com/NYC-Cassandra-User-Group On Tue, Mar

Re: cassandra + zabbix

2011-03-08 Thread pob
Hello, that was the way I was thinking about; actually it's written: https://gist.github.com/744761 But any hint how to get that data from the HTTP server into Zabbix? Thanks 2011/3/8 ruslan usifov > You can simply write your own Java agent (this doesn't require a change to the > source code), you can simp

Re: nodetool repair hung in 0.7.3

2011-03-08 Thread Sylvain Lebresne
I suspect you are in the case of https://issues.apache.org/jira/browse/CASSANDRA-2290. That is, some neighbor node died or was unable to perform its part of the repair. You can always retry, making sure all nodes are and stay alive, to see if it is the former one. But seeing the other exception in you

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Peter Schuller > > $client->batch_mutate($mutations, > > cassandra_ConsistencyLevel::QUORUM); > > Btw, what are the mutations? Are you doing something like inserting > both very small values and very large ones? > > I have big xml file (5 GB) (mysql dump in xml format) an

Re: problem with bootstrap

2011-03-08 Thread aaron morton
I've seen this around a couple of times now. One reason to fail if there are not enough nodes to meet the replication factor is that CL.ALL requests cannot be processed. You could make the argument that we can get into that state at any time if a node is down. But this error is their never been

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread aaron morton
Is this a client side time out or a server side one? What does the error stack look like ? Also check the server side logs for errors. The thrift API will raise a timeout when fewer nodes than the CL requires return within rpc_timeout. Good luck Aaron On 9/03/2011, at 7:37 AM, ruslan usifov wrote: >

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Sylvain Lebresne
Did you run scrub as soon as you updated to 0.7.3 ? And did you have problems/exceptions before running scrub ? If yes, did you have problems with only 0.7.3 or also with 0.7.2 ? If the problems started with running scrub, since it takes a snapshot before running, can you try restarting a test clus

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Terje Marthinussen
I had similar errors in late 0.7.3 releases related to testing I did for the mails with subject "Argh: Data Corruption (LOST DATA) (0.7.0)". I do not see these corruptions or the above error anymore with 0.7.3 release as long as the dataset is created from scratch. The patch (2104) mentioned in th

Re: problem with bootstrap

2011-03-08 Thread mcasandra
I think this is not the right functionality, and it is really odd that you can't successfully bring it online without turning off bootstrap BUT you can bring it online by turning auto_bootstrap off and then run nodetool repair afterwards. Also, if that's the case then when one node goes down, say out o

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
On 03/08/11 21:45, Sylvain Lebresne wrote: > Did you run scrub as soon as you updated to 0.7.3 ? > Yes, within a few minutes of starting up 0.7.3 on the node > And did you have problems/exceptions before running scrub ? Not sure. > If yes, did you have problems with only 0.7.3 or also with 0.7.2 ?

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread A J
Client side (it is just a 5th instance in the same EC2 zone, having stress.py installed on it) gives the following error: Process Inserter-4: Traceback (most recent call last): File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap self.run() File "stress.py", line

RE: cassandra + zabbix

2011-03-08 Thread Prasanna Jayapalan
Hi Peter, If you would like to monitor Cassandra, do explore Evident ClearStone’s Cassandra Managementpack. All it takes is 5 minutes to setup and get the data using JMX interface. Prasanna *Prasanna Jayapalan Senior Solution Consultant Evident Software *

Re: Cassandra Meetup in Austin, TX

2011-03-08 Thread Sasha Dolgy
And there are three people here in Zurich if anyone else is lurking ... not organized beer + discussion yet. On Tue, Mar 8, 2011 at 7:52 PM, Jake Luciani wrote: > There is also a newly formed NYC area Cassandra User Group > > http://www.meetup.com/NYC-Cassandra-User-Group > > > On Tue, Mar 8, 2011 a

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread aaron morton
Cool, so it's a server side because - in the client side stack the thrift code is raising the error - server side log has this DEBUG 22:29:10,318 ... timed out The TimedOutException is raised when the number of replicas required by your CL have not returned inside the timespan specified by rpc_
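The rule described here can be modelled roughly as below. This is a toy sketch, not Cassandra or Thrift code: `wait_for_replicas`, its parameters, and the local `TimedOutException` class are hypothetical stand-ins, and `None` is used to mean "this replica never answered".

```python
class TimedOutException(Exception):
    """Stand-in for the Thrift exception of the same name (not the real class)."""


def wait_for_replicas(reply_times_ms, required_replies, rpc_timeout_ms):
    """Return how many replicas answered in time, or raise if too few did.

    reply_times_ms: one entry per replica; None means no reply at all.
    required_replies: number of replies the consistency level demands.
    """
    on_time = [t for t in reply_times_ms
               if t is not None and t <= rpc_timeout_ms]
    if len(on_time) < required_replies:
        raise TimedOutException(
            f"{len(on_time)} of {required_replies} required replies "
            f"arrived within {rpc_timeout_ms} ms")
    return len(on_time)


if __name__ == "__main__":
    # Two replicas, both required (as with CL ALL), one of them too slow.
    try:
        wait_for_replicas([40, 15000], required_replies=2, rpc_timeout_ms=10000)
    except TimedOutException as exc:
        print("timed out:", exc)
```

In this model a stricter consistency level raises `required_replies`, so a single slow replica is enough to turn a write into a timeout.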

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Jonathan Ellis
alienth on irc is reporting the same error. His path was 0.6.8 to 0.7.1 to 0.7.3. It's probably a bug in scrub. If we can get an sstable exhibiting the problem posted here or on Jira that would help troubleshoot. On Tue, Mar 8, 2011 at 10:31 AM, Karl Hiramoto wrote: > On 08/03/2011 17:09, Jona

Re: Cassandra Meetup in Austin, TX

2011-03-08 Thread Christopher St John
On Tue, Mar 8, 2011 at 1:56 PM, Sanchez, Carlos wrote: > Anything in Dallas? > Funny you should ask, on March 22nd there's: http://dbdmh.eventbrite.com Informal get-together more than a "real" event, but Cassandra has come up as a topic and I suspect it would be a good place to find other peop

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Jonathan Ellis
Turn on debug logging and see if the output looks like what I posted to https://issues.apache.org/jira/browse/CASSANDRA-2296 It *may* be harmless depending on where those zero-length rows are coming from. I've added asserts to 0.7 branch that fire if we attempt to write a zero-length row, so if t

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Jonathan Ellis
Looks like it is harmless -- Scrub would write a zero-length row when tombstones expire and there is nothing left, instead of writing no row at all. Fix attached to the jira ticket. On Tue, Mar 8, 2011 at 8:58 PM, Jonathan Ellis wrote: > It *may* be harmless depending on where those zero-length r

Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Aditya Narayan
Do the overwrites of newly written columns (that are present in the memtable) *replace the old column*, or is it just a simple append? I am trying to understand whether, if I update these columns very frequently (while they are in the memtable), the read performance of these columns gets affected, since

Re: Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Narendra Sharma
Multiple writes for the same key and column will result in overwriting of the column in a memtable. Basically, multiple updates for the same (key, column) are reconciled based on the column's timestamp. This happens per memtable. So if a memtable is flushed to an sstable, this rule will be valid for the next mem
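A toy model of this timestamp-based reconciliation might look like the following. It is only a sketch of the idea, not how Cassandra's memtable is implemented: the `Memtable` class and its methods are invented for illustration, and the handling of equal timestamps (the existing value is kept) is a simplifying assumption.

```python
class Memtable:
    """Toy in-memory table holding one (value, timestamp) per (row key, column name)."""

    def __init__(self):
        self._columns = {}

    def write(self, key, column, value, timestamp):
        """Keep the new value only if its timestamp is newer than the stored one."""
        current = self._columns.get((key, column))
        if current is None or timestamp > current[1]:
            self._columns[(key, column)] = (value, timestamp)

    def read(self, key, column):
        """Return the surviving value for the column, or None if it was never written."""
        entry = self._columns.get((key, column))
        return None if entry is None else entry[0]

    def column_count(self):
        """Number of distinct (key, column) pairs currently held."""
        return len(self._columns)


if __name__ == "__main__":
    table = Memtable()
    table.write("blog42", "title", "first draft", timestamp=1)
    table.write("blog42", "title", "second draft", timestamp=2)
    table.write("blog42", "title", "late arrival", timestamp=1)  # older, ignored
    print(table.read("blog42", "title"), table.column_count())
```

However often the column is rewritten, this model stores a single entry for it, so a read from the memtable does not have to walk through older versions.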

Re: Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Aditya Narayan
So this means that in the memtable only the most recent version of a column will reside!? For this implementation, while writing "to memtable" Cassandra will see if there are other versions and will overwrite them (reconciliation while writing)!? I know that different SSTables may have different ver