Re: singular or plural column family names

2010-07-27 Thread Aaron Morton
For RDBMS I *always* used singular for table names. And was prepared to back up this position by force if necessary :) Anyway, nowadays it's all about how it makes you feel inside. And I feel it should still be singular. It tends to work better when there are multiple CFs related to the same logi

SV: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-27 Thread Thorvaldsson Justus
AFAIK you could use more nodes and read in parallel from them, making your read rate go up. Also, not writing to and reading from the same disk may help some. It's not so much about "Cassandra's" read rate but what your hardware can manage. /Justus From: Dathan Pattishall [mailto:datha...@gmail.com] Skic

SV: Help! Cassandra Data Loader threads are getting stuck

2010-07-27 Thread Thorvaldsson Justus
I made a program doing just this in Java. Basically, I read with one thread from the file into an array, stopping when its size reaches 20k, wait until it is below 20k, and then continue reading the data file. (This is the raw data I want to move.) I have n threads, each with one batch of their own an
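The loader described above (one reader that pauses at 20k buffered rows, plus n writer threads that each build their own batch) can be sketched roughly as follows. This is only an illustration in Python, not the Java program from the post: `load`, `write_batch`, `BATCH_SIZE`, and the sentinel are hypothetical names, and the actual Cassandra write is left to whatever callable is passed in.

```python
import queue
import threading

BUFFER_LIMIT = 20_000   # reader pauses while this many rows are waiting (the "20k" above)
BATCH_SIZE = 100        # assumed batch size per writer; not stated in the post
_DONE = object()        # sentinel telling a writer thread to stop


def load(rows, write_batch, workers=4):
    """Feed rows to write_batch() through a bounded buffer."""
    buf = queue.Queue(maxsize=BUFFER_LIMIT)

    def worker():
        batch = []                      # each writer owns its own batch
        while True:
            item = buf.get()
            if item is _DONE:
                break
            batch.append(item)
            if len(batch) >= BATCH_SIZE:
                write_batch(batch)
                batch = []
        if batch:                       # flush the final partial batch
            write_batch(batch)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for row in rows:                    # put() blocks while the buffer is full
        buf.put(row)
    for _ in threads:                   # one sentinel per writer
        buf.put(_DONE)
    for t in threads:
        t.join()
```

The bounded queue does the "wait until it is below 20k" part: the reader simply blocks in `put()` until the writers have drained some rows.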

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-27 Thread Dathan Pattishall
Ah, the weird thing is I/O is assumed to be the limiting factor, but iops on the box was very low. Service time and atime very low, and the data access was only 6MB a second. With all of this, I'm tending to believe that the problem may be someplace else. Maybe there is a preferred Java version fo

Re: Key Caching

2010-07-27 Thread Peter Schuller
> @Todd, I noticed some new ops in your cassandra.in.sh. Is there any > documentation on what these ops are, and what they do? > > For instance AggressiveOpts, etc. A fairly complete list is here: http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp -- / Peter Schuller

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-27 Thread Peter Schuller
> Ah, the weird thing is I/O is assumed to be the limiting factor, but iops on > the box was very low. Service time and atime very low, and the data access > was only 6MB a second. With all of this, I'm tending to believe that the > problem may be someplace else. Your vmstat output shows idle and w

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-27 Thread Peter Schuller
> average queue size column too. But given the vmstat output I doubt > this is the case since you should either be seeing a lot more wait > time or a lot less idle time. Hmm, another thing: you mention 16 i7 cores. I presume that's 16 in total, counting hyper-threading? Because that means 8 thread

SV: Key Caching

2010-07-27 Thread Thorvaldsson Justus
I can test on 3 servers and I can test using up to 86gb on each, is there anything specific you want to test in this case? I am using Cassandra 6.3 and running a much smaller amount of RAM but if you think it is interesting I will add it to my ToDo list. I don’t know if I will have more servers

Re: Key Caching

2010-07-27 Thread Dathan Pattishall
woot thnx, lots of knobs to play with! On Tue, Jul 27, 2010 at 12:16 AM, Peter Schuller < peter.schul...@infidyne.com> wrote: > > @Todd, I noticed some new ops in your cassandra.in.sh. Is there any > > documentation on what these ops are, and what they do? > > > > For instance AggressiveOpts, etc

Re: Cassandra to store 1 billion small 64KB Blobs

2010-07-27 Thread aaron morton
> Some possibilities open up when using OPP, especially with aggregate > keys. This is more of an option when RF==cluster size, but not > necessarily a good reason to make RF=cluster size if you haven't > already. This use of the OPP sounds like the way Lucandra stores data, they want to have ran

non blocking Cassandra with Tornado

2010-07-27 Thread aaron morton
Today I worked out how to make non blocking calls to Cassandra inside of the non blocking Tornado web server (http://www.tornadoweb.org/) using Python. I thought I'd share it here and see if anyone thinks I'm abusing Thrift too much and inviting trouble. It's a bit mucky and I have not tested i

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-27 Thread Jonathan Ellis
On Fri, Jul 23, 2010 at 8:57 AM, Julie wrote: > But in my focused testing today I see that if I run nodetool "cleanup" on the > nodes taking up way more space than I expect, I see multiple SS Tables being > combined into 1 or 2 and the live disk usage going way down, down to what I > know > the r

Re: non blocking Cassandra with Tornado

2010-07-27 Thread Sandeep Kalidindi at PaGaLGuY.com
@aaron - thanks a lot. i will test it. This is very much needed. Cheers, Deepu. On Tue, Jul 27, 2010 at 6:03 PM, aaron morton wrote: > Today I worked out how to make non blocking calls to Cassandra inside of > the non blocking Tornado web server (http://www.tornadoweb.org/) using > Python. I t

Quick Poll: Server names

2010-07-27 Thread uncle mantis
I will be naming my servers after insect family names. What do you all use for yours? If this is something that is too off topic please contact a moderator. Regards, Michael

RE: Quick Poll: Server names

2010-07-27 Thread John Hogan
Star Trek ship names. JH From: uncle mantis [mailto:uncleman...@gmail.com] Sent: Tuesday, July 27, 2010 9:55 AM To: cassandra-u...@incubator.apache.org Subject: Quick Poll: Server names I will be naming my servers after insect family names. What do you all use for yours? If this is something t

Cassandra vs MongoDB

2010-07-27 Thread Mark
Can someone quickly explain the differences between the two? Other than the fact that MongoDB supports ad-hoc querying I don't know what's different. It also appears (using Google Trends) that MongoDB seems to be growing while Cassandra is dying off. Is this the case? Thanks for the help

Can't find the storageproxy using jconsole

2010-07-27 Thread Mingfan Lu
I am using Jconsole to access JMX and find out that I can't find storageproxy under mbean tab while I can get information of storageservice. It is very interesting that I find the storageproxy has been registered in source code. private StorageProxy() {} static { MBeanServer mbs =

Re: Quick Poll: Server names

2010-07-27 Thread Michael Widmann
Stargate Series Names: ONeil Asgard Jumper ZPM1 - till ZPMx Chevron1 till Chevron9 2010/7/27 John Hogan > Star Trek ship names. > > > > JH > > > > *From:* uncle mantis [mailto:uncleman...@gmail.com] > *Sent:* Tuesday, July 27, 2010 9:55 AM > *To:* cassandra-u...@incubator.apache.org > *Subj

Re: Quick Poll: Server names

2010-07-27 Thread Dave Viner
I've seen & used several... names of children of employees of the company names of streets near office names of diseases (lead to very hard to spell names after a while, but was quite educational for most developers) names of characters from famous books (e.g., lord of the rings, asimov novels, et

Re: non blocking Cassandra with Tornado

2010-07-27 Thread Peter Schuller
> The idea is rather than calling a cassandra client function like > get_slice(), call the send_get_slice() then have a non blocking wait on the > socket thrift is using, then call recv_get_slice(). (disclaimer: I've never used tornado) Without looking at the generated thrift code, this sounds da
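The concern raised here can be illustrated without Thrift at all. The sketch below uses a plain local socket pair as a stand-in for the Thrift connection; `send_request`, `recv_response`, and `demo` are hypothetical names, not generated Thrift code. It shows that a socket becoming readable only means that some bytes have arrived, so a receive that needs the whole response may still have to wait for the rest.

```python
import selectors
import socket


def send_request(sock, payload):
    """Stand-in for a send_*() call; sendall() can block if the send buffer is full."""
    sock.sendall(payload)


def recv_response(sock, size):
    """Stand-in for a recv_*() call; loops until the whole response has arrived."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)    # blocks until more bytes arrive
        if not chunk:
            raise ConnectionError("peer closed early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo():
    client, server = socket.socketpair()
    sel = selectors.DefaultSelector()
    sel.register(client, selectors.EVENT_READ)
    try:
        send_request(client, b"ping")
        request = recv_response(server, 4)
        server.sendall(b"po")               # first half of the response
        ready = sel.select(timeout=1)       # client is readable after only half
        server.sendall(b"ng")               # second half arrives later
        return request, bool(ready), recv_response(client, 4)
    finally:
        sel.close()
        client.close()
        server.close()
```

In an event loop, the readable callback would fire after the first half, and a blocking receive started at that point would stall the loop until the second half shows up.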

Re: Quick Poll: Server names

2010-07-27 Thread Brett Thomas
I like names of colleges On Tue, Jul 27, 2010 at 11:40 AM, Dave Viner wrote: > I've seen & used several... > > names of children of employees of the company > names of streets near office > names of diseases (lead to very hard to spell names after a while, but was > quite educational for most de

Re: Quick Poll: Server names

2010-07-27 Thread uncle mantis
Ah S**T! The Pooh server is down again! =) What does one do if they run out of themed names? Regards, Michael On Tue, Jul 27, 2010 at 10:46 AM, Brett Thomas wrote: > I like names of colleges > > > On Tue, Jul 27, 2010 at 11:40 AM, Dave Viner wrote: > >> I've seen & used several... >> >> n

Re: non blocking Cassandra with Tornado

2010-07-27 Thread Dave Viner
FWIW - I think this is actually more of a question about Thrift than about Cassandra. If I understand you correctly, you're looking for a async client. Cassandra "lives" on the other side of the thrift service. So, you need a client that can speak Thrift asynchronously. You might check out the

Re: Quick Poll: Server names

2010-07-27 Thread Nick Jones
Counties in Texas is a significant list: http://en.wikipedia.org/wiki/List_of_counties_in_Texas Nick

Re: Quick Poll: Server names

2010-07-27 Thread Edward Capriolo
On Tue, Jul 27, 2010 at 11:49 AM, uncle mantis wrote: > Ah S**T! The Pooh server is is down again! =) > > What does one do if they run out of themed names? > > Regards, > > Michael > > > On Tue, Jul 27, 2010 at 10:46 AM, Brett Thomas > wrote: >> >> I like names of colleges >> >> On Tue, Jul 27, 2

Re: Quick Poll: Server names

2010-07-27 Thread Benjamin Black
[role][sequence].[airport code][sequence].[domain].[tld]
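As a rough illustration of that scheme, a small helper could assemble such names. The function name, the two-digit zero padding, and the lower-casing of the airport code are assumptions made for this sketch; the post only gives the pattern itself.

```python
def hostname(role, role_seq, airport, site_seq, domain, tld):
    """Build a name of the form [role][sequence].[airport code][sequence].[domain].[tld]."""
    # Two-digit sequences and a lower-case airport code are assumptions of this sketch.
    return f"{role}{role_seq:02d}.{airport.lower()}{site_seq:02d}.{domain}.{tld}"
```

For example, calling it with role `cassandra`, sequence 1, airport code `SFO`, site sequence 2, and the placeholder domain `example.com` yields a name that states both the function and the location of the machine.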

Re: Quick Poll: Server names

2010-07-27 Thread Colin Vipurs
+1 for this > I know this is a fun thread, and I hate being a "Debbie Downer" > but...In my opinion, naming servers after anything other than their function > is not a great idea. Let's look at some positives and negatives: > > System1: > cassandra01 > cassandra02 > cassandra03 > > VS > > System2: > tom >

Re: Quick Poll: Server names

2010-07-27 Thread Michael Widmann
Hmm, I would never want anyone other than my own team to reboot an instance or server of mine. That is, if I don't have the possibility to remotely "terminate" the task, or to do an IP-based remote power reboot, then the datacenter isn't my datacenter ;-) Just my 2 cents - my names (chevron etc) are already on the l

Re: Quick Poll: Server names

2010-07-27 Thread uncle mantis
+1. Quick and simple. Regards, Michael On Tue, Jul 27, 2010 at 10:54 AM, Benjamin Black wrote: > [role][sequence].[airport code][sequence].[domain].[tld] >

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-27 Thread Peter Schuller
> a) cleanup is a superset of compaction, so if you've been doing > overwrites at all then it will reduce space used for that reason I had failed to consider over-writes as a possible culprit (since removals were stated not to be done). However thinking about it I believe the effect of this should

Re: Cassandra behaviour

2010-07-27 Thread Peter Schuller
> So userspace throttling is probably the answer? I believe so. >  Is the normal way of > doing this to go through the JMX interface from a userspace program, > and hold off on inserts until the values fall below a given threshold? >  If so, that's going to be a pain, since most of my system is >
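The "hold off on inserts until the values fall below a given threshold" idea could look roughly like this on the client side. The sketch is deliberately generic: `wait_until_below` and `read_metric` are hypothetical names, and how the metric is actually obtained (JMX or otherwise) is not shown.

```python
import time


def wait_until_below(read_metric, threshold, poll_seconds=0.5,
                     timeout=None, sleep=time.sleep):
    """Block until read_metric() drops below threshold.

    Returns True once the metric is below the threshold, or False if the
    optional timeout (in seconds of accumulated sleep) is reached first.
    """
    waited = 0.0
    while read_metric() >= threshold:
        if timeout is not None and waited >= timeout:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True
```

A writer would call this before each batch of inserts and proceed only when it returns True.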

cassandra summit, making videos?

2010-07-27 Thread S Ahmed
Will there be videos of the session at the Cassandra Summit in SF? I am really interested in the Cassandra codebase/internals seminar.

Re: cassandra summit, making videos?

2010-07-27 Thread uncle mantis
Why is everything always in California or Las Vegas? :-( Regards, Michael On Tue, Jul 27, 2010 at 11:49 AM, S Ahmed wrote: > Will there be videos of the session at the Cassandra Summit in SF? > > I am really interested in the Cassandra codebase/internals seminar. > > >

Re: Can't find the storageproxy using jconsole

2010-07-27 Thread Jonathan Ellis
I have also seen StorageProxy missing from the mbeans tab -- I'm not sure if it is being removed after being registered, or somehow never being registered at all, or possibly even a jconsole bug where querying the object manually (or, say, with jmxterm) would still work. I haven't spent any time t

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-27 Thread Jonathan Ellis
On Tue, Jul 27, 2010 at 9:26 AM, Peter Schuller wrote: > I had failed to consider over-writes as a possible culprit (since > removals were stated not to be done). However thinking about it I > believe the effect of this should be limited to roughly a doubling of > disk space in the absolute worst

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-27 Thread Peter Schuller
> Minor compactions (see > http://wiki.apache.org/cassandra/MemtableSSTable) will try to keep the > growth in check but it is by no means limited to 2x. Sorry I was being unclear. I was rather thinking along the lines of a doubling of data triggering an implicit major compaction. However I was wro

Re: Cassandra vs MongoDB

2010-07-27 Thread Drew Dahlke
There's a good post on stackoverflow comparing the two http://stackoverflow.com/questions/2892729/mongodb-vs-cassandra It seems to me that both projects have pretty vibrant communities behind them. On Tue, Jul 27, 2010 at 11:14 AM, Mark wrote: > Can someone quickly explain the differences betwee

Re: about cassandra compression

2010-07-27 Thread Jeremy Davis
I've been wondering about this question as well, but from a different angle. More along the lines of should I bother to compress myself? Specifically in cases where I might want to take several small columns and compress into 1 more compact column. Each column by itself is pretty spartan and won't

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-27 Thread Julie
Peter Schuller infidyne.com> writes: > > a) cleanup is a superset of compaction, so if you've been doing > > overwrites at all then it will reduce space used for that reason > Hi Peter and Jonathan, In my test, I write 80,000 rows (100KB each row) to an 8 node cluster. The 80,000 rows all hav

Re: UnavailableException on QUORUM write

2010-07-27 Thread Per Olesen
On Jul 27, 2010, at 12:23 AM, Jonathan Ellis wrote: > Can you turn on debug logging and try this patch? Yes, but..I am on vacation now, so it will be about 3 weeks from now.

Upgrading to Cassandra 0.7 Thrift Erlang

2010-07-27 Thread J T
Hi, I just tried upgrading a perfectly working Cassandra 0.6.3 to Cassandra 0.7 and am finding that even after re-generating the erlang thrift bindings that I am unable to perform any operation. I can get a connection but if I try to login or set the keyspace I get a report from the erlang binding

NoServer Available

2010-07-27 Thread Daniel Bernstein
I've set up a 2 node cluster and I'm trying to connect using pycassa. My thrift address is set to the default: localhost:9160 I've verified that the port is open and I'm able to connect to it via telnet. My keyspace "Ananda" is defined as is the column family "URL" in storage.xml Running the f

Re: Key Caching

2010-07-27 Thread B. Todd Burruss
AggressiveOpts, if i remember correctly, uses options that are not documented but will probably make into a future release of the JVM. cassandra used it once upon a time. probably should take it out, but things work just fine for me now ;) On Tue, 2010-07-27 at 01:48 -0700, Dathan Pattishall wr

Re: Cassandra vs MongoDB

2010-07-27 Thread Jonathan Shook
Also, google trends is only a measure of what terms people are searching for. To equate this directly to growth would be misleading. Tue, Jul 27, 2010 at 12:27 PM, Drew Dahlke wrote: > There's a good post on stackoverflow comparing the two > http://stackoverflow.com/questions/2892729/mongodb-vs-

Re: Cassandra vs MongoDB

2010-07-27 Thread Dave Gardner
There are quite a few differences. Ultimately it depends on your use case! For example Mongo has a limit on the maximum "document" size of 4MB, whereas with Cassandra you are not really limited to the volume of data/columns per-row (I think there may be a limit of 2GB perhaps; basically none) Anoth

Re: Quick Poll: Server names

2010-07-27 Thread Benoit Perroud
We use names of (European) cities for "logical" functionalities: - berlin01, berlin02, berlin03 are part of the MySQL cluster, - zurich1 and zurich2 are AD, - roma01, roma02, and so on are the Cassandra cluster for the Roma project - and so on. We found this way a good tradeoff. Regards, Benoit. 2010/7/

Re: Cassandra vs MongoDB

2010-07-27 Thread Mark
On 7/27/10 12:42 PM, Dave Gardner wrote: There are quite a few differences. Ultimately it depends on your use case! For example Mongo has a limit on the maximum "document" size of 4MB, whereas with Cassandra you are not really limited to the volume of data/columns per-row (I think there maybe a l

repair failed or stopped after 7-8 hours?

2010-07-27 Thread Michael Andreasen
I've started repair on 6 nodes some 7-8 hours ago. The nodes still have a load of 2-3 (normally 0.5), and if I grep AE in system.log I get lines like this on most of the nodes: Performing streaming repair of 30 ranges to /172.19.0.32 for Load is 400-500gb on the nodes. Any word of advice

Re: Quick Poll: Server names

2010-07-27 Thread Daniel Jue
Names of Transformers Blurr, Megatron, Sideswipe, Unicron, Arcee etc On Tue, Jul 27, 2010 at 3:57 PM, Benoit Perroud wrote: > We use name of (european) cities for "logical" functionnalities : > > - berlin01, berlin02, berlin03 part are mysql cluster, > - zurich1 and zurich2 are AD, > - roma01, r

Re: Upgrading to Cassandra 0.7 Thrift Erlang

2010-07-27 Thread Jonathan Ellis
trunk is using framed thrift connections by default now (was unframed) On Tue, Jul 27, 2010 at 11:33 AM, J T wrote: > Hi, > I just tried upgrading a perfectly working Cassandra 0.6.3 to Cassandra 0.7 > and am finding that even after re-generating the erlang thrift bindings that > I am unable to p
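For readers unfamiliar with the distinction: as far as I understand Thrift's framed transport, each message is sent with a 4-byte big-endian length prefix, whereas an unframed (buffered) transport writes the message bytes directly. A client using one cannot talk to a server expecting the other. The sketch below only illustrates that framing idea; `frame` and `unframe` are hypothetical helpers, not part of any Thrift binding.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!i", len(payload)) + payload


def unframe(data: bytes) -> bytes:
    """Read one length-prefixed message from the start of data."""
    (length,) = struct.unpack("!i", data[:4])
    body = data[4:4 + length]
    if len(body) != length:
        raise ValueError("incomplete frame")
    return body
```

A server expecting frames would interpret the first four bytes of an unframed message as a length, which is consistent with a client that can connect but then fails on its first call.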

Re: Help! Cassandra Data Loader threads are getting stuck

2010-07-27 Thread Rana Aich
Thanks for your offer...there was some problem in reading the *.gz files in System.in. I've rectified my code.. On Tue, Jul 27, 2010 at 12:09 AM, Thorvaldsson Justus < justus.thorvalds...@svenskaspel.se> wrote: > I made one program doing just this with Java > > Basically > > I read with one thr

Re: non blocking Cassandra with Tornado

2010-07-27 Thread Aaron Morton
Without looking at the generated thrift code, this sounds dangerous. What happens if send_get_slice() blocks? What happens if recv_get_slice() has to block because you didn't happen to receive the response in one packet? get_slice() has two lines in it, a call to send_get_slice() and one to recv_

Re: non blocking Cassandra with Tornado

2010-07-27 Thread Aaron Morton
Thanks for the link. It is more of a thrift thing, perhaps I need to do some tests where the web handler sends the get_slice to cassandra but never calls recv to see what could happen. I'll take a look at the Java binding and see what it would take to offer a patch to Thrift. Most people coding P

Re: Cassandra to store 1 billion small 64KB Blobs

2010-07-27 Thread Bryan Whitehead
Just a warning about ZFS. If the plan is to use JBOD w/RAID-Z, don't. 3, 4, 5, ... or N disks in a RAID-Z array (using ZFS) will result in read performance equivalent to only 1 disk. Check out this blog entry: http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance The secon