Re: Why is cassandra named cassandra?

2010-07-09 Thread Daniel Jue
It's in a FAQ somewhere. Based on this: http://en.wikipedia.org/wiki/Cassandra An oracle might also be called a prophet. On Thu, Jul 8, 2010 at 9:43 PM, ChingShen wrote: > Hi, > >   Why is cassandra named cassandra? > > Thanks. > > Shen >

Re: Why is cassandra named cassandra?

2010-07-09 Thread Pieter Maes
It is explained in the video on the site ;) On 9/07/10 03:43, ChingShen wrote: > Hi, > > Why is cassandra named cassandra? > > Thanks. > > Shen

UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
Hi, I am a bit confused about getting an UnavailableException when doing a QUORUM write. I have a 3 node cluster, with RF=3. When all 3 nodes are up, the QUORUM write succeeds. When 1 of the 3 nodes is down, the QUORUM write fails with UnavailableException. Shouldn't it be enough with 2 nodes
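
The expectation in this question can be checked with a little arithmetic. The sketch below assumes the usual majority rule for QUORUM (floor(RF / 2) + 1 replicas); the function names are made up for illustration and are not part of any Cassandra client API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM operation (majority rule)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def quorum_possible(replication_factor: int, live_replicas: int) -> bool:
    """True when enough replicas are alive to satisfy QUORUM."""
    return live_replicas >= quorum(replication_factor)


# The situation from the message: RF=3, with one of the three nodes down.
print(quorum(3))              # 2
print(quorum_possible(3, 2))  # True
```

Under that assumption, two live replicas out of three should be enough, which matches the poster's expectation that the write ought to succeed.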

total disk space used on a node for a CF is too large than expected

2010-07-09 Thread Sagar Agrawal
Row size is 10 KB and the write count on a node for a CF is 1054451, so ideally the total disk space used on that node by that CF should be around 10 GB, but it's showing 23 GB. What else might be taking up so much space? Thanks
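
The "around 10 GB" figure can be reproduced as follows. This is only the back-of-the-envelope calculation from the message, assuming 1 KB = 1024 bytes and that every counted write is a distinct 10 KB row; it does not model any storage overhead.

```python
ROW_SIZE_BYTES = 10 * 1024   # 10 KB per row, as stated in the message
WRITE_COUNT = 1054451        # write count reported for the CF on that node
OBSERVED_GIB = 23            # disk usage reported in the message

expected_bytes = ROW_SIZE_BYTES * WRITE_COUNT
expected_gib = expected_bytes / 2**30

print(f"expected: {expected_gib:.2f} GiB")
print(f"observed / expected: {OBSERVED_GIB / expected_gib:.2f}x")
```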

Re: UnavailableException on QUORUM write

2010-07-09 Thread ChingShen
Which client library do you use? Shen On Fri, Jul 9, 2010 at 4:53 PM, Per Olesen wrote: > Hi, > > I am a bit confused about getting an UnavailableException when doing a > QUORUM write. > > I have a 3 node cluster, with RF=3. When all 3 nodes are up, the QUORUM > write succeeds. When 1 of the 3

Re: UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
On Jul 9, 2010, at 11:11 AM, ChingShen wrote: > Which client library do you use? Direct on thrift api using thrift.jar, in version 917130.

new node can't find seed node

2010-07-09 Thread Boris Spasojevic
Hi, I am attempting to add a new node to a single node already running. I have set the first node not to bootstrap, and the second node to bootstrap with the first node as its seed. The IP configuration is OK, the machines can ping each other, the seed machine (or should I say cassandra runnin

Re: new node can't find seed node

2010-07-09 Thread Boris Spasojevic
Solved it! Sorry to spam your inbox! BoriS On Fri, 2010-07-09 at 11:50 +0200, Boris Spasojevic wrote: > Hi, > > I am attempting to add a new node to a single node already running. > I have set the first node not to bootstrap, and the second node to > bootstrap with the first node as its seede

Re: total disk space used on a node for a CF is too large than expected

2010-07-09 Thread Sagar Agrawal
What does WriteCount actually signify? It should also include replica writes, right? It is the total number of writes on that node for that CF so far, right? On Fri, Jul 9, 2010 at 2:39 PM, Sagar Agrawal wrote: > row size is 10 KB and write count on a node for a CF is 1054451, > so ideally

Re: new node can't find seed node

2010-07-09 Thread Dimitry Lvovsky
Sounds like maybe you're not binding port 7000 to the correct interface; maybe you have it set to localhost rather than the IP address. If you want to confirm, try prompt> telnet [machine ip] 7000 If you get a connection refused, then the above is true. Hope this helps. Dimitry Lvovsky Dir
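
Where telnet is not installed, the same check can be sketched with Python's standard socket module. The helper name `port_is_open` is made up here, and the host in the usage comment is a placeholder; port 7000 is the one named in the message.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


# Example (replace the placeholder address with the seed node's IP):
# print(port_is_open("192.0.2.10", 7000))
```

A False result from another machine, combined with True when run on the node itself against 127.0.0.1, would point to the listener being bound to localhost only.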

Iterate all keys - doing it as the faq fails for me :(

2010-07-09 Thread Per Olesen
Hi, I was reading http://wiki.apache.org/cassandra/FAQ#iter_world and decided to implement the get_range_slices method for listing all keys of a CF. Only thing is, it doesn't work that well for me :-) I do as it says (I think), and take KeyRanges of size N and use the key of the last call as s
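
The paging pattern described here (fetch N keys, then start the next call at the last key returned) can be sketched without a cluster. In the sketch below `fetch_page` is a hypothetical stand-in for a get_range_slices call, backed by an in-memory sorted list; it assumes the start key is inclusive, so every page after the first repeats the previous page's last key and that duplicate has to be dropped. Real results come back in partitioner order, which is key order only with an order-preserving partitioner.

```python
from typing import Callable, Iterable, Iterator, List


def make_fake_fetch(keys: Iterable[str]) -> Callable[[str, int], List[str]]:
    """In-memory stand-in for a range query: keys >= start, at most count."""
    ordered = sorted(keys)

    def fetch_page(start: str, count: int) -> List[str]:
        return [k for k in ordered if k >= start][:count]

    return fetch_page


def iterate_all_keys(fetch_page: Callable[[str, int], List[str]],
                     page_size: int) -> Iterator[str]:
    """Yield every key once, paging with an inclusive start key."""
    if page_size < 2:
        # With page_size 1 each page would only repeat the previous key.
        raise ValueError("page_size must be at least 2")
    start = ""
    first_page = True
    while True:
        page = fetch_page(start, page_size)
        # Skip the key already yielded at the end of the previous page.
        fresh = page[1:] if (not first_page and page and page[0] == start) else page
        yield from fresh
        if len(page) < page_size:
            return  # a short page means the range is exhausted
        start = page[-1]
        first_page = False


keys = [f"k{i:02d}" for i in range(7)]
print(list(iterate_all_keys(make_fake_fetch(keys), page_size=3)))
```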

manual InitialToken assignemnt

2010-07-09 Thread Sagar Agrawal
I have a 2 node cluster node1 - 5 node2 - 9 If I insert a row with key="a", which node should it go to, and why? It is going to node1, but I think it should go to node2, since that node's token value is closer to "a" (using the Java String compareTo method). Could someone please clarify? Thanks

Re: manual InitialToken assignemnt

2010-07-09 Thread Jonathan Ellis
see the beginning of http://wiki.apache.org/cassandra/Operations On Fri, Jul 9, 2010 at 7:16 AM, Sagar Agrawal wrote: > I have a 2 node cluster > node1 - 5 > node2 - 9 > > If I insert a row with key="a", which node should it go and why? > > It is going to node1, but I think it should go to node2,

Re: manual InitialToken assignemnt

2010-07-09 Thread Per Olesen
Are you using OrderPreservingPartitioner or RandomPartitioner? Cause if you are using RandomPartitioner, a hash is calculated from "a" and that hash is used to determine where the data for "a" key goes, not "a". On Jul 9, 2010, at 2:16 PM, Sagar Agrawal wrote: > I have a 2 node cluster > node1
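
A small sketch of the placement rule may help with this thread. It assumes the ring rule described on the Operations wiki page linked above: each node owns the range from its predecessor's token (exclusive) up to its own token (inclusive), wrapping around at the end. The MD5-based token is an approximation of what RandomPartitioner does, as far as I understand it; the function names are invented for this example.

```python
import bisect
import hashlib
from typing import Sequence


def owner_index(sorted_tokens: Sequence, key_token) -> int:
    """Index of the node whose range (previous token, own token] holds key_token."""
    i = bisect.bisect_left(sorted_tokens, key_token)
    return i % len(sorted_tokens)  # past the last token: wrap to the first node


def md5_token(key: str) -> int:
    """Assumed RandomPartitioner-style token: absolute value of the MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


# Order-preserving case from the thread: tokens "5" (node1) and "9" (node2).
tokens = ["5", "9"]
print(owner_index(tokens, "a"))  # 0 -> node1, because "a" sorts after "9" and wraps
print(owner_index(tokens, "7"))  # 1 -> node2

# Hash-based case: the key's hash is placed on the ring, not the key itself.
print(md5_token("a"))
```

Under this rule it is not the "closest" token that wins: a key that sorts after the highest token wraps around to the node with the lowest one, which is consistent with "a" landing on node1.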

Re: Digg 4 Preview on TWiT

2010-07-09 Thread Terje Marthinussen
http://twitter.com/nk/status/17903187277 Another "not using" joke?

Re: NYC Cassandra training

2010-07-09 Thread S Ahmed
My previous reply seems to have bounced. Will there be a training day before/after the Cassandra Summit? (in SF on the 10th) On Fri, Jul 2, 2010 at 2:08 PM, Jonathan Ellis wrote: > Riptano's one day Cassandra training is coming to NYC in August, our > first public session on the East coast: > h

Re: manual InitialToken assignemnt

2010-07-09 Thread Sagar Agrawal
got it, thanks On Fri, Jul 9, 2010 at 6:21 PM, Per Olesen wrote: > Are you using OrderPreservingPartitioner or RandomPartitioner? > > Cause if you are using RandomPartitioner, a hash is calculated from "a" and > that hash is used to determine where the data for "a" key goes, not "a". > > > On Ju

Re: NYC Cassandra training

2010-07-09 Thread Jeremy Dunck
On Fri, Jul 2, 2010 at 1:08 PM, Jonathan Ellis wrote: > Riptano's one day Cassandra training is coming to NYC in August, our > first public session on the East coast: > http://www.eventbrite.com/event/749518831 Is there a calendar where you're listing this stuff, or is it just tweets and mail mes

Help! Cassandra disk space utilization WAY higher than I would expect

2010-07-09 Thread Julie
Hi guys, I am on the hook to explain why 30GB of data is filling up 106GB of disk space since this is concerning information for my project. We are very excited about the possibility of using Cassandra but need to understand this anomaly in order to feel confident. Does anyone know why this cou

Last day to submit your Surge 2010 CFP!

2010-07-09 Thread Jason Dixon
Today is your last chance to submit a CFP abstract for the 2010 Surge Scalability Conference. The event is taking place on Sept 30 and Oct 1, 2010 in Baltimore, MD. Surge focuses on case studies that address production failures and the re-engineering efforts that led to victory in Web Application

RE: Help! Cassandra disk space utilization WAY higher than I would expect

2010-07-09 Thread Stu Hood
Cassandra has a very high constant per-row overhead at the moment of around 40 bytes. Additionally, there is around 12 bytes of overhead per column. Finally, column names are repeated for each row. CASSANDRA-674 and CASSANDRA-1207 will help with these overheads, but they will not be fixed until
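
The per-row and per-column figures quoted above can be turned into a rough size estimate. This is only a sketch: the 40 and 12 byte constants are the approximate numbers from this message, column names are counted once per row as the message describes, and anything else (row keys, indexes, replication, sstables not yet compacted) is deliberately left out.

```python
ROW_OVERHEAD_BYTES = 40     # "around 40 bytes" per row, per the message
COLUMN_OVERHEAD_BYTES = 12  # "around 12 bytes" per column, per the message


def estimate_on_disk_bytes(rows: int, columns_per_row: int,
                           avg_name_bytes: int, avg_value_bytes: int) -> int:
    """Rough on-disk size: values plus per-row, per-column, and name overhead."""
    per_column = COLUMN_OVERHEAD_BYTES + avg_name_bytes + avg_value_bytes
    per_row = ROW_OVERHEAD_BYTES + columns_per_row * per_column
    return rows * per_row


def overhead_ratio(rows: int, columns_per_row: int,
                   avg_name_bytes: int, avg_value_bytes: int) -> float:
    """Estimated on-disk bytes divided by raw value bytes."""
    payload = rows * columns_per_row * avg_value_bytes
    return estimate_on_disk_bytes(rows, columns_per_row,
                                  avg_name_bytes, avg_value_bytes) / payload


# Hypothetical workload: many small values with 8-byte column names.
print(overhead_ratio(rows=1_000_000, columns_per_row=10,
                     avg_name_bytes=8, avg_value_bytes=16))
```

The smaller the values are relative to the column names and fixed overheads, the larger the ratio becomes, which is the effect the message attributes to the current storage format.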

Re: get_range_slices

2010-07-09 Thread Jonathan Shook
FYI: https://issues.apache.org/jira/browse/CASSANDRA-1145 Yes, it's a bug. CL.ONE is a reasonable work around. On Thu, Jul 8, 2010 at 11:04 PM, Mike Malone wrote: > I think the answer to your question is no, you shouldn't. > I'm feeling far too lazy to do even light research on the topic, but I >

Re: NYC Cassandra training

2010-07-09 Thread Dave Gardner
Do you have a rough estimate as to when there might be a training day in London (UK)? I'm currently weighing up whether I should be making a journey across the pond for one of the US-based events. Thanks Dave On 9 July 2010 15:36, Jeremy Dunck wrote: > On Fri, Jul 2, 2010 at 1:08 PM, Jonathan

RE: Running Cassandra as a Windows Service

2010-07-09 Thread Kochheiser,Todd W - TOK-DITT-1
I've submitted a contrib. (windows.zip) to the JIRA issue/ticket https://issues.apache.org/jira/browse/CASSANDRA-292. The zip contains everything needed to run Cassandra as a Windows Service. It should be unzipped under the contrib directory. It includes an Ant build script and unit test. I'v

RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread maneela a
Are there any known performance issues when a Cassandra cluster is launched with RackAwareStrategy? I see a huge performance difference between RackAwareStrategy and RackUnAwareStrategy. Here are the details: we have a cluster setup with 4 EC2 X large nodes, 3 of them are running in East region an

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread Dave Viner
Hi, Can you post the stress test code and storage.conf used? I have a cluster in EC2 using RackAware. However, I am in 1 region (us-east-1) but 2 Availability Zones. Amazon helps to ensure that AZ's are isolated from each other creating a fail-resistant cluster. But, staying in the same region

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread Joe Stump
We had similar issues when we started running Cassandra on EC2 between multiple AZ's (not regions; we're working up to that shortly). We ended up building a rack aware strategy specific to AWS, which is posted somewhere in JIRA. Basically it uses the AWS API to ensure that replicants are stored

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread Bill de hÓra
east: A B C west: D Perhaps you are blocking on a write to D - what's your quorum/rf set up as? Bill On Fri, 2010-07-09 at 10:36 -0700, maneela a wrote: > Are there any known performance issues if cassandra cluster > launched with RackAwareStrategy because I see huge perfo

InitialToken assignemnt

2010-07-09 Thread Claire Chang
my keys are sequential integers and i use random partitioner in a multi-node cluster. In this case, do I still have to specify initialToken? thanks, claire

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-09 Thread Jonathan Ellis
Then obsolete sstables are not your culprit. On Thu, Jul 8, 2010 at 8:32 AM, Julie wrote: > Jonathan Ellis writes: > >> "SSTables that are obsoleted by a compaction are deleted >> asynchronously when the JVM performs a GC. You can force a GC from >> jconsole if necessary, but Cassandra

Re: How to stop Cassandra running in embeded mode

2010-07-09 Thread Jonathan Ellis
there's some support for this in 0.7 (see http://issues.apache.org/jira/browse/CASSANDRA-1018) but fundamentally it's not really designed to be started and stopped multiple times within the same process. On Thu, Jul 8, 2010 at 3:44 AM, Andriy Kopachevsky wrote: > Hi, we are trying to set up inter

Re: Understanding atomicity in Cassandra

2010-07-09 Thread Jonathan Ellis
typically you will update both as part of a batch_mutate, and if it fails, retry the operation. re-writing any part that succeeded will be harmless. On Thu, Jul 8, 2010 at 11:13 AM, Stuart Langridge wrote: > Hi, Cassandra people! > > We're looking at Cassandra as a possible replacement for some
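
The retry approach suggested here can be sketched as follows. `apply_batch` is a hypothetical callable standing in for whatever performs the batch_mutate call in a given client; the sketch relies on the point made in the message that re-writing a part that already succeeded is harmless, and it assumes the same mutations (including timestamps) are resent on each attempt.

```python
import time
from typing import Any, Callable


def write_with_retry(apply_batch: Callable[[Any], None], mutations: Any,
                     attempts: int = 3, delay_seconds: float = 0.0) -> int:
    """Apply the whole batch, retrying all of it on failure.

    Returns the number of attempts used; re-raises the last error if every
    attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            apply_batch(mutations)  # resend everything, including parts that succeeded
            return attempt
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")


# Simulated store that fails partway through the first call only.
store = {}
calls = {"n": 0}


def flaky_apply(mutations):
    calls["n"] += 1
    for i, (key, value) in enumerate(mutations.items()):
        if calls["n"] == 1 and i == 1:
            raise RuntimeError("simulated failure after a partial write")
        store[key] = value


print(write_with_retry(flaky_apply, {"row_a": 1, "row_b": 2}))  # 2
print(store)
```

This gives eventual completion of both updates rather than atomicity: between the failed attempt and the successful retry, readers may see only part of the batch.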

Re: UnavailableException on QUORUM write

2010-07-09 Thread Jonathan Ellis
this sounds like a bug, although if you've attempted any node movement or bootstrapping, that could cause the required quorum to be larger than just the number of nodes. On Fri, Jul 9, 2010 at 3:53 AM, Per Olesen wrote: > Hi, > > I am a bit confused about getting an UnavailableException when doin

Re: total disk space used on a node for a CF is too large than expected

2010-07-09 Thread Jonathan Ellis
you should read the "cassandra disk space utilization" thread. On Fri, Jul 9, 2010 at 4:09 AM, Sagar Agrawal wrote: >  row size is 10 KB and write count on a node for a CF is 1054451, > so ideally the total disk space used on that node by that CF should be > around 10 GB > but it's showing  23 GB

Re: InitialToken assignemnt

2010-07-09 Thread Jonathan Ellis
Short answer: yes. Longer answer: http://wiki.apache.org/cassandra/Operations On Fri, Jul 9, 2010 at 1:19 PM, Claire Chang wrote: > my keys are sequential integers and i use random partitioner in a multi-node > cluster. In this case, do I still have to specify  initialToken? > > thanks, > claire
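
For the longer answer, evenly spaced tokens are easy to compute. The sketch assumes RandomPartitioner's token space runs from 0 to 2**127 and uses the i * 2**127 / N spacing that, as far as I recall, the Operations page recommends; `balanced_tokens` is a name invented for this example.

```python
from typing import List

TOKEN_SPACE = 2**127  # assumed size of RandomPartitioner's token ring


def balanced_tokens(node_count: int) -> List[int]:
    """Evenly spaced InitialToken values for node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * TOKEN_SPACE // node_count for i in range(node_count)]


for node, token in enumerate(balanced_tokens(4)):
    print(f"node {node}: InitialToken = {token}")
```

Sequential integer keys do not change this: with RandomPartitioner the keys are hashed before placement, so balance depends on how the node tokens divide the ring rather than on the keys themselves.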

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread maneela a
Thanks for your quick reply, Joe. I forgot to mention that we are using PropertyFileEndPointSnitch to tell Cassandra about our network topology; below is the property file used by that class: cat rack.properties 10.9.0.6=east:r1b 10.9.0.18=east:r1c 10.9.0.14=east:r1d 10.9.0.10=west:r1a default=east

Re: How to stop Cassandra running in embeded mode

2010-07-09 Thread Ran Tavory
The workaround I use is to always fork: each test brings up its own JVM. On Jul 9, 2010 9:51 PM, "Jonathan Ellis" wrote: there's some support for this in 0.7 (see http://issues.apache.org/jira/browse/CASSANDRA-1018) but fundamentally it's not really designed to be started and stopped multiple times w

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread maneela a
ConsistencyLevel.ONE is the default option in stress.py, so I am using the default. --- On Fri, 7/9/10, Bill de hÓra wrote: From: Bill de hÓra Subject: Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud To: user@cassandra.apache.org Date: Friday, July 9, 2010, 2:12 PM east:

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread Joe Stump
On Jul 9, 2010, at 1:16 PM, maneela a wrote: > Is there any way to mark cassandra node to keep it as just for replication > purpose and not to be as Primary for any data range in the ring? I believe there is. This is what we're doing, but we do all of our writes via a queue. Derek or Mike fro

TechCrunch article on Twitter and Cassandra

2010-07-09 Thread Kochheiser,Todd W - TOK-DITT-1
A good read. http://techcrunch.com/2010/07/09/twitter-analytics-mysql/ Todd