setcachecapacity is forgotten

2010-05-24 Thread Ran Tavory
I use nodetool to set cache capacity on a certain node but the settings are "forgotten" after a few minutes. I run: $ nodetool -h localhost -p 9004 setcachecapacity outbrain_kvdb KvImpressions 1000 100 And then run nodetool cfstats immediately after and the settings are effective, I see t

Re: setcachecapacity is forgotten

2010-05-24 Thread Jonathan Ellis
fixed for 0.6.2 in https://issues.apache.org/jira/browse/CASSANDRA-1079 On Mon, May 24, 2010 at 7:14 AM, Ran Tavory wrote: > I use nodetool to set cache capacity on a certain node but the settings are > "forgotten" after a few minutes. > I run: > $ nodetool -h localhost -p 9004 setcachecapacity o

RE: Key cache capacity: 1 when using KeysCached="50%"

2010-05-24 Thread Stu Hood
50% of 0 will be rounded up to 1. -Original Message- From: "Ran Tavory" Sent: Monday, May 24, 2010 12:34am To: user@cassandra.apache.org Subject: Key cache capacity: 1 when using KeysCached="50%" I've noticed that when defining KeysCached="50%" (or KeysCached="100%" and I didn't test ot

Re: Key cache capacity: 1 when using KeysCached="50%"

2010-05-24 Thread Jonathan Ellis
If you really want a cache capacity of 0 then you need to use 0 explicitly, otherwise the % versions will give you at least 1. On Mon, May 24, 2010 at 12:34 AM, Ran Tavory wrote: > I've noticed that when defining KeysCached="50%" (or KeysCached="100%" and I > didn't test other values with %) then
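
The rule described in the two replies above (a percentage setting never yields less than 1, while an explicit 0 stays 0) can be sketched as follows. This is only an illustration: `key_cache_capacity` and `estimated_keys` are hypothetical names, and rounding up for non-zero key counts is an assumption, not something the thread confirms about Cassandra's own code.

```python
import math


def key_cache_capacity(setting: str, estimated_keys: int) -> int:
    """Hypothetical helper: turn a KeysCached-style setting into a capacity.

    Percentage settings are clamped to a minimum of 1, as the thread
    describes; absolute settings are used as-is, so "0" really means 0.
    """
    setting = setting.strip()
    if setting.endswith("%"):
        fraction = float(setting[:-1]) / 100.0
        # Assumption: round up, then enforce the floor of 1 from the thread.
        return max(1, math.ceil(fraction * estimated_keys))
    return int(setting)


if __name__ == "__main__":
    print(key_cache_capacity("50%", 0))   # percentage of an empty column family
    print(key_cache_capacity("0", 0))     # explicit zero disables the cache
```

Under this sketch, a column family that holds no keys yet reports a capacity of 1 for any percentage setting, which matches what the original poster observed.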

Re: Problem accessing Cassandra wiki top page with browser locale other than english

2010-05-24 Thread Jonathan Ellis
Can anyone with Moin experience tell us what our options are here? On Sun, May 23, 2010 at 10:09 PM, Yuki Morishita wrote: > Hi all, > > I'm currently working on translating cassandra wiki to Japanese. > Cassandra is gaining attention in Japan, too. :) > > I noticed that for those who have browse

Cassandra data loss

2010-05-24 Thread Steve Lihn
I am evaluating Cassandra as a candidate for our next-gen database. One of my colleagues told me that "it's not recommended to use it as your system of record because it CAN lose data". Can someone who understands the architecture shed some light on the circumstances under which a Cassandra cluster can eith

Re: Cassandra data loss

2010-05-24 Thread Joe Stump
This is largely FUD. Cassandra lets you choose how consistent you want writes to be. The more consistency you choose, the slower the writes, but it's very unlikely with high consistency that you'll lose data. That being said, if you write with a consistency level of 0 then, yes, you could lose

Re: Cassandra data loss

2010-05-24 Thread Steve Lihn
So if I set it up to be strongly consistent, I should have the same level of consistency as a traditional relational DB? On the other hand, what will happen if I set it up as eventually consistent? Will the data become inconsistent after a crash/reboot, similar to the case of asynchronous replication

Re: Cassandra data loss

2010-05-24 Thread Joe Stump
On May 24, 2010, at 10:01 AM, Steve Lihn wrote: > So if I set it up to be strongly consistent, I should have the same level of > consistency as a traditional relational DB? If you do, say, QUORUM on the consistency level it will ensure at least 2 out of the 3 replicas have responded back that
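
The "2 out of 3" figure follows from the usual majority rule for a quorum. A minimal sketch, assuming a replication factor of 3 as in the reply above; the function names are made up for illustration and are not part of any Cassandra client API:

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must overlap every write set.

    Standard overlap condition: W + R > N.
    """
    return write_acks + read_acks > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(q)
    print(reads_see_latest_write(rf, q, q))
    print(reads_see_latest_write(rf, 1, 1))
```

With three replicas, writing and reading at QUORUM means any read contacts at least one replica that acknowledged the latest write, whereas single-replica reads and writes give no such overlap.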

Re: Cassandra data loss

2010-05-24 Thread Mark Greene
Ryan King actually has a very nice, short and sweet explanation that cuts through the FUD: http://theryanking.com/entries/2010/04/29/potential-consistency/ On Mon, May 24, 2010 at 12:01 PM, Steve Lihn wrote: > So if I set it up to be strongly consistent, I should have the same level > of consis

Re: Key cache capacity: 1 when using KeysCached="50%"

2010-05-24 Thread Ran Tavory
I'd like to have 100% keys cached. Sorry if my example of Super2 wasn't correct, but I do think there's a problem. Here's with my own data: When using actual numbers (in this case for RowsCached) it works as expected, however when specifying KeysCached="100%" I get only 1.

Re: Cassandra data loss

2010-05-24 Thread Jonathan Ellis
You also need to set CommitLogSync to batch instead of periodic if you Absolutely Cannot Lose Data. On Mon, May 24, 2010 at 10:51 AM, Joe Stump wrote: > This is largely FUD. Cassandra lets you choose how consistent you want > writes to be. The more consistency you choose, the slower the writes,

Re: Ideal configuration for given hardware

2010-05-24 Thread Jonathan Ellis
I can think of at least 2 clusters running 32GB boxes with single Cassandra processes on each. (16 seems to be more common.) At 64 I would seriously consider multiple processes per machine. You'd want to configure a Snitch such that same-machine boxes were considered the same rack, there is no s

Re: Ideal configuration for given hardware

2010-05-24 Thread Ian Soboroff
My data disks on two of my nodes are RAID-5, just because of circumstances. My other nodes are JBOD. I don't notice any real difference, but I haven't strongly benched it. Ian On Mon, May 24, 2010 at 2:45 PM, Jonathan Ellis wrote: > I can think of at least 2 clusters running 32GB boxes with sin

get() with TTL update?

2010-05-24 Thread Omer van der Horst Jansen
We have an application that stores session data in Cassandra. The session data needs to be deleted after, say, one hour of inactivity. The CASSANDRA-699 TTL update in 0.7 looks like it will work very well for that. However, we have a few scenarios where some session data will be retrieved frequent

Re: Ideal configuration for given hardware

2010-05-24 Thread Aaron McCurry
Thanks a lot! So for RAID 10, is the thought that the node can survive a single disk failure and keep going until a normal maintenance cycle? Also, are you saying that you would configure a single RAID 10 for the whole box? OS included? I have 8 x 500 GB drives, so that would leave me with 2T per

Re: get() with TTL update?

2010-05-24 Thread Jonathan Ellis
(a) cassandra does not use update-in-place storage so doing the update as part of the get call isn't much of an efficiency gain (b) I don't think it's a common enough use case to warrant special treatment On Mon, May 24, 2010 at 2:19 PM, Omer van der Horst Jansen wrote: > We have an application t

Re: Cassandra thrift question

2010-05-24 Thread Carlos Alvarez
I have the same issue in my cluster: 0.5% of requests are extremely slow because of the time it takes to read the data from the socket. However, in my case it is not related to the load. Actually, the percentage of anomalies drops as the load increases. On the other hand, the nio is actually slower than b

[ANN] Cassandra Tutorial @ OSCON

2010-05-24 Thread Eric Evans
For those interested in Cassandra training, I'll be giving a 3-hour tutorial[1] at OSCON this year entitled Hands-on Cassandra. [1]: http://www.oscon.com/oscon2010/public/schedule/detail/14283 The tutorial will cover setup, configuration, and management of clusters, and will include some Python

Re: Cassandra thrift question

2010-05-24 Thread Jonathan Ellis
With C# you need to be sure to tell thrift to use client-side buffering. http://wiki.apache.org/cassandra/ThriftExamples#C.23 shows this (but didn't until recently). On Mon, May 24, 2010 at 4:09 PM, Carlos Alvarez wrote: > I have the same issue in my cluster: 0.5% of requests are extremely > slow

Re: Cassandra thrift question

2010-05-24 Thread Carlos Alvarez
On Mon, May 24, 2010 at 7:43 PM, Jonathan Ellis wrote: > with C# you need to be sure to tell thrift to use client-side > buffering. http://wiki.apache.org/cassandra/ThriftExamples#C.23 shows > this (but didn't until recently) Yes, I am using TBufferedTransport. However, the high times continue.

Different KeySpaces for different nodes in the same ring

2010-05-24 Thread Mishail
Greetings, Is it possible to spread a particular keyspace over only part of the ring? For example:

Node  | Keyspaces
node1 | KS1
node2 | KS2
node3 | KS1, KS2

Each KeySpace has ReplicationFactor == 2, so node3 would store data from all keyspaces.

Re: Different KeySpaces for different nodes in the same ring

2010-05-24 Thread Jonathan Ellis
DatacenterShardSnitch in 0.7 allows something like this:
# DatacenterShardStrategy is a generalization of RackAwareStrategy.
# For each datacenter, you can specify (in `datacenter.properties`)
# how many replicas you want on a per-keyspace basis. Replicas are
# placed on different racks w

Re: Different KeySpaces for different nodes in the same ring

2010-05-24 Thread Mishail
Hi Jonathan, Will it be possible to have datacenter replication factor == 0? (do not replicate keyspace to that DC) Jonathan Ellis wrote: > DatacenterShardSnitch in 0.7 allows something like this: > > # DatacenterShardStrategy is a generalization of RackAwareStrategy. > # For each datacenter,

Hector vs cassandra-java-client

2010-05-24 Thread Peter Hsu
Hi All, This may have been answered already, but I did a [quick] Google search and didn't find much. Which is the better Java client to use? Hector or cassandra-java-client or neither? It seems Hector is more fully featured and more active as a project in general. What are user experiences w

Re: Hector vs cassandra-java-client

2010-05-24 Thread Jeff Zhang
I think Hector is better, and it seems the author of cassandra-java-client is no longer working on it. On Tue, May 25, 2010 at 10:21 AM, Peter Hsu wrote: > Hi All, > > This may have been answered already, but I did a [quick] Google search and > didn't find much. Which is the better Java client

Re: Different KeySpaces for different nodes in the same ring

2010-05-24 Thread Jonathan Ellis
Yes. On Mon, May 24, 2010 at 8:56 PM, Mishail wrote: > Hi Jonathan, > > Will it be possible to have datacenter replication factor == 0? (do not > replicate keyspace to that DC) > > Jonathan Ellis wrote: >> DatacenterShardSnitch in 0.7 allows something like this: >> >> # DatacenterShardStrategy i

Is there a way to turn HH off?

2010-05-24 Thread Ran Tavory
For small clusters Hinted Handoff cost is not negligible. I'd like to test its effect. Is there a way to turn it off for my cluster?

Re: Is there a way to turn HH off?

2010-05-24 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-894 On Mon, May 24, 2010 at 11:30 PM, Ran Tavory wrote: > For small clusters Hinted Handoff cost is not negligible. I'd like to test > its effect. > Is there a way to turn it off for my cluster? -- Jonathan Ellis Project Chair, Apache Cassandra

Re: Ideal configuration for given hardware

2010-05-24 Thread Jonathan Ellis
Yes, I would do RAID 1 on 2 commitlog disks and RAID 10 on the 6 remaining for OS + data. On Mon, May 24, 2010 at 2:27 PM, Aaron McCurry wrote: > Thanks a lot! So for RAID 10, is the thought that the node can survive a > single disk failure and keep going until a normal maintenance cycle? Also are >
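
To compare the two layouts discussed in this thread, the usable capacity can be worked out with a small sketch. It assumes two-way mirrors, the 8 x 500 GB drives mentioned earlier, and no filesystem or controller overhead; the helper names are hypothetical.

```python
def raid1_usable_gb(disks: int, size_gb: int) -> int:
    """RAID 1: every disk mirrors the same data, so capacity is one disk."""
    if disks < 2:
        raise ValueError("RAID 1 needs at least 2 disks")
    return size_gb


def raid10_usable_gb(disks: int, size_gb: int) -> int:
    """RAID 10 with two-way mirrors: half of the disks hold unique data."""
    if disks < 4 or disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return disks // 2 * size_gb


if __name__ == "__main__":
    # Single RAID 10 across all 8 drives.
    print(raid10_usable_gb(8, 500))
    # Split layout: RAID 1 for the commitlog, RAID 10 for OS + data.
    print(raid1_usable_gb(2, 500), raid10_usable_gb(6, 500))
```

Under these assumptions the split layout gives up about 500 GB of data capacity compared with one large RAID 10, in exchange for a dedicated mirrored pair for the commitlog.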

Why Cassandra is "space inefficient" compared to MySQL?

2010-05-24 Thread sharanabasava raddi
Hi all, I am running "Cassandra" on a single-node Windows XP machine. I inserted about "10 million" records into "Cassandra"; the insert took around 90 minutes and used 8 GB of space. For the same number of records MySQL takes "3 GB" of space. Could you please tell me why? And please

Re: Why Cassandra is "space inefficient" compared to MySQL?

2010-05-24 Thread casablinca126.com
Hi Sharan, what replication factor are you using? Regards, Cao Jiguang 2010-05-25 casablinca126.com From: sharanabasava raddi Sent: 2010-05-25 13:46:38 To: user@cassandra.apache.org Cc: Subject: Why Cassandra is "space inefficient" compared to MySQL? Hi all, Am running "Cassan

Re: Why Cassandra is "space inefficient" compared to MySQL?

2010-05-24 Thread Jeff Zhang
I think one reason may be that Cassandra also logs each operation to log files, and the log contains the records. 2010/5/25 casablinca126.com : > Hi Sharan, > what replication factor are you using? > > Regards, > Cao Jiguang > > > 2010-05-25 > > cas

Re: Is there a way to turn HH off?

2010-05-24 Thread Ran Tavory
ah, 0.6.2, worth waiting for... On Tue, May 25, 2010 at 7:52 AM, Jonathan Ellis wrote: > https://issues.apache.org/jira/browse/CASSANDRA-894 > > On Mon, May 24, 2010 at 11:30 PM, Ran Tavory wrote: > > For small clusters Hinted Handoff cost is not negligible. I'd like to > test > > its effect. >

Re: Hector vs cassandra-java-client

2010-05-24 Thread Ran Tavory
cassandra-java-client is up to cassandra's 0.4.2 version, so you probably can't use it out of the box. Hector is active and up to the latest 0.6.1 release with a bunch of committers, contributors and users. See http://wiki.github.com/rantav/hector/ and http://groups.google.com/group/hector-users O