Re: Ring management and load balance

2010-03-26 Thread Roland Hänel
But this On 26.03.2010 at 22:29, "Rob Coli" wrote: On 3/26/10 1:36 PM, Roland Hänel wrote: > > If I was going to write such a tool: do you think the th... The JMX interface exposes an attribute that seems appropriate for this use. It is called "TotalDiskSpaceUsed" and is available on a per-column

FW: Re: Is ReplicationFactor (eventually) guaranteed ?

2010-03-26 Thread Stu Hood
Ack... very sorry. I read the original message too quickly. The fact that neither read-repair nor anti-entropy is working is suspicious though. Do you think you could paste your config somewhere? -Original Message- From: "Stu Hood" Sent: Friday, March 26, 2010 11:57pm To: user@cassandr

Re: Is ReplicationFactor (eventually) guaranteed?

2010-03-26 Thread Stu Hood
replication factor == 1 means that there is only one copy of the data. And you deleted it. Repair depends on the replication factor being greater than 1. -Original Message- From: "Jianing Hu" Sent: Friday, March 26, 2010 9:33pm To: user@cassandra.apache.org Subject: Re: Is ReplicationFac

Re: Is ReplicationFactor (eventually) guaranteed?

2010-03-26 Thread Jianing Hu
That's not what I saw in my test. I'm probably making some noob mistakes. Can someone enlighten me? Here's what I did: 1) Bring up a cluster with three servers cs1,2,3, with their initial token set to 'foo3', 'foo6', and 'foo9', respectively. ReplicationFactor is set to 2 on all 3. 2) Insert 9 colu
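The setup described above (three nodes with tokens 'foo3', 'foo6', 'foo9' and ReplicationFactor 2) can be illustrated with a minimal sketch of ring placement. It assumes an order-preserving partitioner that compares keys as plain strings and a simple placement strategy that puts each extra replica on the next node around the ring. The function name `replicas_for` and the sample keys are hypothetical, since the message is truncated before it lists the inserted data.

```python
from bisect import bisect_left

# Ring from the message: (initial token, node), sorted by token.
RING = [("foo3", "cs1"), ("foo6", "cs2"), ("foo9", "cs3")]


def replicas_for(key, ring, replication_factor):
    """Return the nodes that would hold `key` under the stated assumptions.

    A node owns every key up to and including its own token; keys past the
    last token wrap around to the first node. Additional replicas go to the
    following nodes on the ring.
    """
    tokens = [token for token, _node in ring]
    start = bisect_left(tokens, key) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    for key in ("foo1", "foo4", "foo7", "foo95"):  # hypothetical keys
        print(key, replicas_for(key, RING, 2))
```

With ReplicationFactor 1 the same function returns a single node per key, which is why deleting that node's data leaves nothing for repair to copy from.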

Re: How to get all the columns under one column family ?

2010-03-26 Thread Jeff Zhang
I have found the API. On Sat, Mar 27, 2010 at 10:13 AM, Jeff Zhang wrote: > Hi all, > > It seems the API of Cassandra is a little different from HBase. I am > looking for the API to list all the columns under one column family. Is > there any way to do this? Thanks. > > > > -- > Best Regard

How to get all the columns under one column family ?

2010-03-26 Thread Jeff Zhang
Hi all, It seems the API of Cassandra is a little different from HBase. I am looking for the API to list all the columns under one column family. Is there any way to do this? Thanks. -- Best Regards Jeff Zhang

Does Cassandra has size limitation on value ?

2010-03-26 Thread Jeff Zhang
Hi all, I'd like to use Cassandra to store small files, so I wonder whether Cassandra has a size limit on values? -- Best Regards Jeff Zhang

Re: How to check whether one key exist ?

2010-03-26 Thread Jonathan Ellis
Keys don't exist without columns in Cassandra. So the right answer is, use get or get_slice to check for the column(s) that should be in the key's row. On Fri, Mar 26, 2010 at 7:56 PM, Jeff Zhang wrote: > Hi all, > > I'd like to check whether one key exist, currently my solution is to let the >
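The point above, that a key only "exists" if it has columns, can be sketched with a plain dictionary standing in for the store. This is not the Thrift client: the `get_slice` below only borrows the name, and its signature and the `store` layout are hypothetical stand-ins for illustration.

```python
def get_slice(store, key, count=1):
    """Stand-in for a slice read: return up to `count` (name, value) pairs.

    `store` maps row key -> {column name: value}. A missing key and a key
    with no columns both yield an empty list, mirroring the idea that keys
    do not exist without columns.
    """
    columns = store.get(key, {})
    return sorted(columns.items())[:count]


def key_exists(store, key):
    """Treat a key as existing only if at least one column comes back."""
    return len(get_slice(store, key, count=1)) > 0


if __name__ == "__main__":
    store = {"user1": {"name": "Jeff"}, "user2": {}}
    for key in ("user1", "user2", "user3"):
        print(key, key_exists(store, key))
```

Asking for a single column keeps the check cheap, since only presence matters and not the row's full contents.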

Re: Is ReplicationFactor (eventually) guaranteed?

2010-03-26 Thread Rob Coli
On 3/26/10 5:57 PM, Jianing Hu wrote: In a cluster with ReplicationFactor> 1, if one server goes down, will new replicas be created on other servers to satisfy the set ReplicationFactor? Yes, via Anti-Entropy. http://wiki.apache.org/cassandra/AntiEntropy http://wiki.apache.org/cassandra/Archi

Is ReplicationFactor (eventually) guaranteed?

2010-03-26 Thread Jianing Hu
In a cluster with ReplicationFactor > 1, if one server goes down, will new replicas be created on other servers to satisfy the set ReplicationFactor? I ran some tests that seem to suggest no new replicas were created. Is that the expected behavior? If so, is there a way to guarantee that any data a

How to check whether one key exist ?

2010-03-26 Thread Jeff Zhang
Hi all, I'd like to check whether a key exists. Currently my solution is to have Cassandra use the OrderPreservingPartitioner and to use the get_key_range() API to check for the key's existence. I wonder whether there are other ways to do this? Thanks. -- Best Regards Jeff Zhang

Re: ArrayIndexOutOfBoundsException on 0.6-betarc3

2010-03-26 Thread Jonathan Ellis
The quickest solution is definitely going to be just blowing away the Hint files from the system keyspace data directories. On Fri, Mar 26, 2010 at 5:36 PM, Scott White wrote: > Nope it's always been random. > > On Fri, Mar 26, 2010 at 2:13 PM, Jonathan Ellis wrote: >> >> Did you switch partitio

Re: ArrayIndexOutOfBoundsException on 0.6-betarc3

2010-03-26 Thread Scott White
Nope it's always been random. On Fri, Mar 26, 2010 at 2:13 PM, Jonathan Ellis wrote: > Did you switch partitioner types at some point? > > On Fri, Mar 26, 2010 at 2:53 PM, Scott White wrote: > > I don't know if this is from switching from 0.5 to 0.6-betarc3 just > recently > > or from doing a s

Re: Ring management and load balance

2010-03-26 Thread Mike Malone
2010/3/26 Roland Hänel > Jonathan, > > I agree with your idea about a tool that could 'propose' good token choices > for optimal load-balancing. > > If I was going to write such a tool: do you think the thrift API provides > the necessary information? I think with the RandomPartitioner you cannot

Re: Ring management and load balance

2010-03-26 Thread Rob Coli
On 3/26/10 1:36 PM, Roland Hänel wrote: If I was going to write such a tool: do you think the thrift API provides the necessary information? I think with the RandomPartitioner you cannot scan all your rows to actually find out how big certain ranges of rows are. And even with the OPP (that is the

Re: ArrayIndexOutOfBoundsException on 0.6-betarc3

2010-03-26 Thread Jonathan Ellis
Did you switch partitioner types at some point? On Fri, Mar 26, 2010 at 2:53 PM, Scott White wrote: > I don't know if this is from switching from 0.5 to 0.6-betarc3 just recently > or from doing a series of bootstrap and removeToken operations but I > recently started getting ArrayIndexOutOfBound

Re: Ring management and load balance

2010-03-26 Thread Roland Hänel
Jonathan, I agree with your idea about a tool that could 'propose' good token choices for optimal load-balancing. If I was going to write such a tool: do you think the thrift API provides the necessary information? I think with the RandomPartitioner you cannot scan all your rows to actually find

Re: Newbie Performance Question

2010-03-26 Thread Scott White
Right, that's what I meant, thanks for the correction. On Fri, Mar 26, 2010 at 1:11 PM, Brandon Williams wrote: > On Fri, Mar 26, 2010 at 3:08 PM, Scott White wrote: > >> Yep I believe those are inserts per second. Take the last line: >> >> "811653,1666,250" >> >> I believe that's telling you t

Re: Newbie Performance Question

2010-03-26 Thread Brandon Williams
On Fri, Mar 26, 2010 at 3:08 PM, Scott White wrote: > Yep I believe those are inserts per second. Take the last line: > > "811653,1666,250" > > I believe that's telling you that during that 10 second interval you did > 1666 inserts but your overall insert rate is 811653/250 = 3246.612 > inserts/s

Re: Newbie Performance Question

2010-03-26 Thread Scott White
Yep I believe those are inserts per second. Take the last line: "811653,1666,250" I believe that's telling you that during that 10 second interval you did 1666 inserts but your overall insert rate is 811653/250 = 3246.612 inserts/sec. Timeouts may be due to your machine(s) being fully saturated?
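The arithmetic above can be reproduced with a short sketch. It assumes the reading given in this message: the first field is the running total of inserts and the last field is the elapsed time in seconds. The middle field is parsed but not used, because its exact meaning is what the replies in this thread go on to correct. The function name `overall_rate` is hypothetical.

```python
def overall_rate(line):
    """Return total operations divided by elapsed seconds for one output line.

    Assumes a comma-separated line of the form "total,interval,elapsed".
    """
    total, _interval, elapsed = (int(field) for field in line.split(","))
    return total / elapsed


if __name__ == "__main__":
    rate = overall_rate("811653,1666,250")
    print(f"{rate:.3f} inserts/sec")  # 3246.612 inserts/sec
```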

Re: Newbie Performance Question

2010-03-26 Thread malcolm smith
Ok I ran the stress test with out-of-the-box settings -- 50 threads and 1M row inserts. It seems to get as high as 4400 ops per second and as low as 968. Am I reading these correctly as inserts per second? These are results below. But it also generates timeouts and failures in the python code like

ArrayIndexOutOfBoundsException on 0.6-betarc3

2010-03-26 Thread Scott White
I don't know if this is from switching from 0.5 to 0.6-betarc3 just recently or from doing a series of bootstrap and removeToken operations but I recently started getting ArrayIndexOutOfBoundsException exceptions (centered around reading UTF from SSTableSliceIterator) on one of the machines in my c

Re: Gsoc2010 proposal

2010-03-26 Thread Ben Standefer
Priyanka, I think our listserv might be dropping your attachment. Please send it directly to me at benstande...@gmail.com. -Ben On Fri, Mar 26, 2010 at 10:41 AM, Priyanka Sharma wrote: > Hi > > I am Priyanka Sharma, master student at Vrije University, Amsterdam. My > major is "parallel and d

Re: Newbie Performance Question

2010-03-26 Thread Brandon Williams
On Fri, Mar 26, 2010 at 10:45 AM, malcolm smith < malsm...@treehousesystems.com> wrote: > I've been getting a feel for the performance elements of Cassandra using > version 0.51. I've done similar tests on HBase before, but Cassandra has > some very appealing aspects that I would like to pursue.

Re: memory question

2010-03-26 Thread Jonathan Ellis
No. The reason we're using mmap in the first place is that it's much better at "allowing the OS to do the caching." You just have too much data for the OS to cache effectively; making Cassandra set that aside to cache key locations can help because it's much more ram-efficient. On Fri, Mar 26, 2

RE: memory question

2010-03-26 Thread Todd Burruss
so just to close this out ... before mmap files, i would allow the OS to do the caching using its I/O cache, but now since mmap files take up a majority of my RAM, i need to cache more to maintain performance. is that a fair statement? From: Jonathan Ell

Re: Load balancing and Failover

2010-03-26 Thread Jonathan Ellis
nodetool ring http://wiki.apache.org/cassandra/NodeProbe On Fri, Mar 26, 2010 at 10:37 AM, Y Aw wrote: > Yes it does... > > Is there an easy way to know if a node is down or cannot reply to queries (a > simple telnet command) ? > > > > > > 2010/3/25 Jeremy Dunck >> >> On Thu, Mar 25, 2010 at 1:

Newbie Performance Question

2010-03-26 Thread malcolm smith
I've been getting a feel for the performance elements of Cassandra using version 0.51. I've done similar tests on HBase before, but Cassandra has some very appealing aspects that I would like to pursue. However I'm not seeing what seems like the common level of performance others are seeing.

Load balancing and Failover

2010-03-26 Thread Y Aw
Yes it does... Is there an easy way to know if a node is down or cannot reply to queries (a simple telnet command) ? 2010/3/25 Jeremy Dunck > On Thu, Mar 25, 2010 at 1:20 PM, Y Aw wrote: > > Hi all, > > I have a question about load-balancing. > > http://wiki.apache.org/cassandra/FAQ#node_c

Re: Range scan performance in 0.6.0 beta2

2010-03-26 Thread Jonathan Ellis
On Fri, Mar 26, 2010 at 7:40 AM, Henrik Schröder wrote: > For each indexvalue we insert a row where the key is indexid + ":" + > indexvalue encoded as hex string, and the row contains only one column, > where the name is the object key encoded as a bytearray, and the value is > empty. It's a uniq

Re: Range scan performance in 0.6.0 beta2

2010-03-26 Thread Henrik Schröder
> > So all the values for an entire index will be in one row? That > doesn't sound good. > > You really want to put each index [and each table] in its own CF, but > until we can do that dynamically (0.7) you could at least make the > index row keys a tuple of (indexid, indexvalue) and the column n

Re: Cassandra cluster can not been installed in different subnet ?

2010-03-26 Thread Jonathan Ellis
Different subnet isn't a problem from Cassandra's point of view, but it might be if your network is doing something funky. Did you check the logs on one of the machines that isn't in the ring? On Fri, Mar 26, 2010 at 4:17 AM, Jeff Zhang wrote: > Hi all, > > I have 6 different machines, 4 are in

Re: cassandra 0.6.0 release date

2010-03-26 Thread roger schildmeijer
On Fri, Mar 26, 2010 at 11:12 AM, ROGER PUIG GANZA wrote: > Hi everybody. > > When will Cassandra 0.6.0 be released approximately? > Within a couple of weeks (hopefully) > Is it fine developing with the current beta release and putting it into > production when the final stable version is out? > Ab

Re: NullPointerException in DatabaseDescriptor.getComparator

2010-03-26 Thread Oleg Mürk
Thanks! This has solved the problem! On Wed, Mar 24, 2010 at 5:50 PM, gabriele renzi wrote: > On Wed, Mar 24, 2010 at 3:36 PM, Oleg Mürk wrote: > > Hi Jonathan, > > > > On Wed, Mar 24, 2010 at 4:32 PM, Jonathan Ellis > wrote: > >> > >> probably 0.5.1 is allowing an invalid query and erroring o

cassandra 0.6.0 release date

2010-03-26 Thread ROGER PUIG GANZA
Hi everybody. When will Cassandra 0.6.0 be released approximately? Is it fine developing with the current beta release and putting it into production when the final stable version is out? Thanks Roger Puig Ganza

Cassandra cluster can not been installed in different subnet ?

2010-03-26 Thread Jeff Zhang
Hi all, I have 6 different machines, 4 are in one subnet and the other two are in another subnet. The following are the IP addresses of the 6 machines. 10.148.219.12 10.148.219.15 10.148.219.11 10.148.219.71 10.148.224.199 10.148.224.194 I made the same configuration on each machine, and finally fo
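The grouping of the six addresses above into two subnets can be checked with a small sketch. The message does not state the netmask, so the /24 prefix length used here is an assumption. With a shorter prefix such as /16, all six addresses would fall into a single network. The function name `group_by_subnet` is hypothetical.

```python
import ipaddress
from collections import defaultdict

# Addresses as listed in the message.
ADDRESSES = [
    "10.148.219.12", "10.148.219.15", "10.148.219.11",
    "10.148.219.71", "10.148.224.199", "10.148.224.194",
]


def group_by_subnet(addresses, prefix_len=24):
    """Group addresses by the network they belong to at `prefix_len`.

    prefix_len=24 is an assumption; the original message gives no netmask.
    """
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False lets us pass a host address and get its network back.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(addr)
    return dict(groups)


if __name__ == "__main__":
    for net, hosts in group_by_subnet(ADDRESSES).items():
        print(net, hosts)
```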