Write consistency

2010-04-08 Thread Paul Prescod
In this¹ debate, there seemed to be consensus on the following fact: "In Cassandra, say you use N=3, W=3 & R=1. Let’s say you managed to only write to replicas A & B, but not C. In this case Cassandra will return an error to the application saying the write failed, which is acceptable given that W

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Philip Jackson
At Wed, 07 Apr 2010 13:19:26 -0700, Mike Gallamore wrote: > > I have writes to cassandra that are failing, or at least a read shortly > after a write is still getting an old value. I realize Cassandra is > "eventually consistent" but this system is a single CPU single node with > consistency le

Starting Cassandra Fauna

2010-04-08 Thread Nirmala Agadgar
Hi, I am a newbie. I downloaded Fauna and followed the instructions to run it on CentOS, but when I try to run cassandra_helper cassandra, I get the following error. Can anyone help me solve this? I have installed: 1) Java: java version "1.6.0" OpenJDK Runtime Environment (build 1.6.0-b09) OpenJDK

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Ted Zlatanov
On Thu, 08 Apr 2010 12:53:48 +0100 Philip Jackson wrote: PJ> At Wed, 07 Apr 2010 13:19:26 -0700, PJ> Mike Gallamore wrote: >> >> I have writes to cassandra that are failing, or at least a read shortly >> after a write is still getting an old value. I realize Cassandra is >> "eventually consi

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Ted Zlatanov
On Wed, 07 Apr 2010 13:19:26 -0700 Mike Gallamore wrote: MG> As an aside I modified some other code to use Net::Cassandra instead MG> of Net::Cassandra::Easy and noticed that it seems to run 3-4X MG> slower. Both aren't stunningly fast. The test clients are running on MG> the same machine as Ca

Re: Write consistency

2010-04-08 Thread Gary Dusbabek
On Thu, Apr 8, 2010 at 02:55, Paul Prescod wrote: > In this¹ debate, there seemed to be consensus on the following fact: > > "In Cassandra, say you use N=3, W=3 & R=1. Let’s say you managed to > only write to replicas A & B, but not C. In this case Cassandra will > return an error to the applicati

Re: OrderPreservingPartitioner limits and workarounds

2010-04-08 Thread Mark Robson
On 7 April 2010 19:13, Jonathan Ellis wrote: > One thing you can do is manually "randomize" keys for any CFs that > don't need the OP by pre-pending their md5 to the key you send > Cassandra. (This is all RP is doing under the hood anyway.) > Another possibility is to prepend some hash of somet
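The key "randomizing" trick quoted above (prepending the md5 of a key to the key itself) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the ":" separator are hypothetical and are not part of Cassandra or any client library.

```python
import hashlib


def randomized_key(key: str) -> str:
    """Prefix the key with its md5 hex digest so keys sort by hash, not by value."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # The ":" separator is an arbitrary choice for this sketch.
    return f"{digest}:{key}"


def original_key(stored: str) -> str:
    """Recover the application key by dropping the digest prefix."""
    # The hex digest never contains ":", so the first ":" ends the prefix.
    return stored.split(":", 1)[1]


if __name__ == "__main__":
    for k in ["user:1", "user:2", "user:3"]:
        print(randomized_key(k))
```

Because the digest prefix dominates the sort order, adjacent application keys such as "user:1" and "user:2" no longer land next to each other under an order-preserving partitioner, while the original key stays recoverable from the stored one.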

Some insight into the slow read speed. Where to go from here? RC1 MESSAGE-DESERIALIZER-POOL

2010-04-08 Thread Mark Jones
I don't see any way to increase the # of active Deserializers in storage-conf.xml

Tpstats more than 8 hours after insert/read stop

Pool Name                    Active   Pending      Completed
FILEUTILS-DELETE-POOL             0         0            227
STREAM-STAGE                      0

Re: Some insight into the slow read speed. Where to go from here? RC1 MESSAGE-DESERIALIZER-POOL

2010-04-08 Thread Jonathan Ellis
Have you checked iostat -x ? On Thu, Apr 8, 2010 at 9:45 AM, Mark Jones wrote: > I don't see any way to increase the # of active Deserializers in > storage-conf.xml > > Tpstats more than 8 hours after insert/read stop > > Pool Name                    Active   Pending      Completed > FILEUTILS-D

Re: Some insight into the slow read speed. Where to go from here? RC1 MESSAGE-DESERIALIZER-POOL

2010-04-08 Thread Jonathan Ellis
The sawtooth ram graph is typical of normal GC activity, btw -- the GC doesn't bother with a major collection until it reaches some percent of the total available.

Re: Some insight into the slow read speed. Where to go from here? RC1 MESSAGE-DESERIALIZER-POOL

2010-04-08 Thread Avinash Lakshman
The tooth wave in memory utilization could be memtable dumps. I/O wait in TCP happens when you are overwhelming the server with requests. Could you run sar and find out how many bytes/sec you are receiving/transmitting? Cheers Avinash On Thu, Apr 8, 2010 at 7:45 AM, Mark Jones wrote: > I don't

Re: Sorting and ordering in Cassandra

2010-04-08 Thread Jonathan Ellis
Re "Why couldn't you walk though random-layout hash-keyed data by token the same way? The hashes still have an order." You can in 0.6. On Wed, Apr 7, 2010 at 8:45 AM, Paul Prescod wrote: > I'm working on a blog post that combines all of the information and > ideas I can find relative to managing

Re: Inconsistency when unit testing

2010-04-08 Thread Jonathan Ellis
Your first step should be upgrading to 0.6. On Wed, Apr 7, 2010 at 10:38 AM, Philip Jackson wrote: > At Wed, 7 Apr 2010 17:29:49 +0200, > Sylvain Lebresne wrote: >> >> Use ConsistencyLevel.QUORUM when you write *and* when you read. > > I already do (plus, I only test with one node). > > BTW, I'm

Re: Write consistency

2010-04-08 Thread Jeremy Dunck
On Thu, Apr 8, 2010 at 7:16 AM, Gary Dusbabek wrote: > On Thu, Apr 8, 2010 at 02:55, Paul Prescod wrote: >> In this¹ debate, there seemed to be consensus on the following fact: >> >> "In Cassandra, say you use N=3, W=3 & R=1. Let’s say you managed to >> only write to replicas A & B, but not C. In

Re: Write consistency

2010-04-08 Thread Avinash Lakshman
What you're describing is a distributed transaction? Generally strong consistency is always associated with doing transactional writes where you never see the results of a failed write on a subsequent read no matter what happens. Cassandra has no notion of rollback. That is why no combination will gi

RE: Some insight into the slow read speed. Where to go from here? RC1 MESSAGE-DESERIALIZER-POOL

2010-04-08 Thread Mark Jones
I restarted Cassandra on that node to clear out that queue, reduced the available memory to java to 4GB, and now I'm able to read with 8 concurrent threads, about 110/second Running iostat -x I see a large amount of time in await, and a small amount of time in svctm indicating the device is res

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Mike Gallamore
On 04/08/2010 04:53 AM, Philip Jackson wrote: At Wed, 07 Apr 2010 13:19:26 -0700, Mike Gallamore wrote: I have writes to cassandra that are failing, or at least a read shortly after a write is still getting an old value. I realize Cassandra is "eventually consistent" but this system is a sin

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Jonathan Ellis
is N:C:E possibly ignoring thrift exceptions? On Thu, Apr 8, 2010 at 10:45 AM, Mike Gallamore wrote: > On 04/08/2010 04:53 AM, Philip Jackson wrote: >> >> At Wed, 07 Apr 2010 13:19:26 -0700, >> Mike Gallamore wrote: >> >>> >>> I have writes to cassandra that are failing, or at least a read shortl

Re: Heap sudden jump during import

2010-04-08 Thread Tatu Saloranta
On Wed, Apr 7, 2010 at 1:51 PM, Eric Evans wrote: > On Tue, 2010-04-06 at 10:55 -0700, Tatu Saloranta wrote: >> On Tue, Apr 6, 2010 at 12:15 AM, JKnight JKnight >> wrote: >> > When import, all data in json file will load in memory. So that, you >> can not >> > import large data. >> > You need to

SAR results don't seem overwhelming

2010-04-08 Thread Mark Jones
I stopped writing to the cluster more than 8 hours ago, at worst case, I could only be getting a periodic memtable dump (I think) Running 16 QUORUM read threads getting 600 records/second Sar for all 3 nodes (collected almost simultaneously: Average:CPU %user %nice %system %i

Re: SAR results don't seem overwhelming

2010-04-08 Thread Jonathan Ellis
Can you keep this to one thread please? It is hard to follow when the subject keeps changing.

Re: Write consistency

2010-04-08 Thread Benjamin Black
On Thu, Apr 8, 2010 at 12:55 AM, Paul Prescod wrote: > > ¹ http://jsensarma.com/blog/2009/11/dynamo-part-i-a-followup-and-re-rebuttals/ > Pay no attention to this disingenuous troll. b

Re: Write consistency

2010-04-08 Thread Mark Greene
So unless you re-try the write, the previous stale write stays on the other two nodes? Would a read repair fix this eventually? On Thu, Apr 8, 2010 at 11:36 AM, Avinash Lakshman < avinash.laksh...@gmail.com> wrote: > What your describing is a distributed transaction? Generally strong > consistenc

Re: Write consistency

2010-04-08 Thread Benjamin Black
Yes. Or you would retry the write. Either way, the system achieves consistency eventually, hence the name. On Thu, Apr 8, 2010 at 9:36 AM, Mark Greene wrote: > So unless you re-try the write, the previous stale write stays on the other > two nodes? Would a read repair fix this eventually? >

Re: Write consistency

2010-04-08 Thread Benjamin Black
His arguments consistently (hah!) boil down to this: if you misconfigure things for your intended application, you get undesirable behavior. For example, the correct approach to the situation cited is to use quorum reads and writes. W=3/R=1/N=3 might be appropriate for situations in which you wan
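The N/W/R settings discussed in this thread come down to simple arithmetic. The following is a minimal sketch in Python, assuming a toy model in which a write is reported as successful once at least W replicas acknowledge it. The function names are hypothetical and do not correspond to any Cassandra API.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def write_succeeds(acks: int, w: int) -> bool:
    """A write is reported as successful once at least w replicas acknowledge it."""
    return acks >= w


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when every read set of size r must intersect every write set of size w."""
    return r + w > n


if __name__ == "__main__":
    n = 3
    acks = 2  # the scenario from the thread: replicas A and B took the write, C did not
    print("W=3 reports success:", write_succeeds(acks, w=3))
    print("W=quorum reports success:", write_succeeds(acks, w=quorum(n)))
    print("quorum reads overlap quorum writes:",
          read_overlaps_write(n, quorum(n), quorum(n)))
```

In this model the write that reached only A and B is reported as failed at W=3 but succeeds at quorum (W=2), and a quorum read (R=2) is guaranteed to touch at least one replica that holds it.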

Re: Write consistency

2010-04-08 Thread David Strauss
A read repair will fix it immediately after the first read of the row. On 2010-04-08 16:36, Mark Greene wrote: > So unless you re-try the write, the previous stale write stays on the > other two nodes? Would a read repair fix this eventually? > > On Thu, Apr 8, 2010 at 11:36 AM, Avinash Lakshman

Re: Starting Cassandra Fauna

2010-04-08 Thread Jonathan Ellis
cassandra_helper does a bunch of magic to set things up. looks like the "extract a private copy of cassandra 0.6 beta2" part of the magic is failing. you'll probably need to manually attempt the un-tar to figure out why it is bailing. On Thu, Apr 8, 2010 at 7:30 AM, Nirmala Agadgar wrote: > Hi,

Re: Starting Cassandra Fauna

2010-04-08 Thread Paul Prescod
On Thu, Apr 8, 2010 at 9:49 AM, Jonathan Ellis wrote: > cassandra_helper does a bunch of magic to set things up.  looks like > the "extract a private copy of cassandra 0.6 beta2" part of the magic > is failing.  you'll probably need to manually attempt the un-tar to > figure out why it is bailing.

Re: Write consistency

2010-04-08 Thread Avinash Lakshman
Retry is the best option. Because the read repair will fix it on a subsequent read and it will actually fix it with a value that was actually deemed a failed write to the client. Avinash On Thu, Apr 8, 2010 at 9:47 AM, David Strauss wrote: > A read repair will fix it immediately after the first

Re: Starting Cassandra Fauna

2010-04-08 Thread Jonathan Ellis
Sounds like it's worth reporting on the github project then. On Thu, Apr 8, 2010 at 11:53 AM, Paul Prescod wrote: > On Thu, Apr 8, 2010 at 9:49 AM, Jonathan Ellis wrote: >> cassandra_helper does a bunch of magic to set things up.  looks like >> the "extract a private copy of cassandra 0.6 beta2"

Re: Starting Cassandra Fauna

2010-04-08 Thread Ryan King
Yeah, this is a known issue, we're working on it today. -ryan On Thu, Apr 8, 2010 at 10:31 AM, Jonathan Ellis wrote: > Sounds like it's worth reporting on the github project then. > > On Thu, Apr 8, 2010 at 11:53 AM, Paul Prescod wrote: >> On Thu, Apr 8, 2010 at 9:49 AM, Jonathan Ellis wrote:

Re: Is this sentence slightly inaccurate

2010-04-08 Thread Peter Chang
Sorry, I've read through the docs (although not too recently) and have followed this mailing list for a bit. But I haven't seen how it's possible to iterate with RP? Could you kindly point out to me where it shows how to do this? TIA. On Wed, Apr 7, 2010 at 4:37 PM, Benjamin Black wrote: > It wa

Re: Is this sentence slightly inaccurate

2010-04-08 Thread Brandon Williams
On Thu, Apr 8, 2010 at 1:26 PM, Peter Chang wrote: > Sorry, I've read through the docs (although not too recently) and have > followed this mailing list for a bit. But I haven't seen how it's possible > to iterate with RP? Could you kindly point out to me where it shows how to > do this? TIA. T

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Mike Gallamore
On 04/07/2010 01:31 PM, Eric Evans wrote: On Wed, 2010-04-07 at 13:19 -0700, Mike Gallamore wrote: I have writes to cassandra that are failing, or at least a read shortly after a write is still getting an old value. I realize Cassandra is "eventually consistent" but this system is a single C

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Mike Gallamore
On 04/08/2010 04:53 AM, Philip Jackson wrote: At Wed, 07 Apr 2010 13:19:26 -0700, Mike Gallamore wrote: I have writes to cassandra that are failing, or at least a read shortly after a write is still getting an old value. I realize Cassandra is "eventually consistent" but this system is a sin

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
I have two boxes. One is a windows box running Cassandra .6, and the other is an ubuntu box from which I'm trying to run the word count program as in the readme. The windows box seed is set to 127.0.0.1, and listen address to localhost. The ubuntu box seed & listen is point to IP of the windows

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Jeremy Dunck
On Thu, Apr 8, 2010 at 12:41 PM, Mike Gallamore wrote: > Hello. If you are doing exactly the same thing as N::C::Easy (ie a join on > the gettimeofday). Then you should have the same problem I found a fix for. > The problem is that the microseconds value isn't zero padded. So if you are > at say 2
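The zero-padding problem described above can be reproduced outside Perl. The sketch below is in Python and assumes the timestamp is built by concatenating a seconds value and a microseconds value, as the message describes. The function names and sample values are hypothetical.

```python
def joined_timestamp(seconds: int, microseconds: int) -> int:
    """Buggy variant: plain concatenation, microseconds not zero padded."""
    return int(f"{seconds}{microseconds}")


def padded_timestamp(seconds: int, microseconds: int) -> int:
    """Fixed variant: microseconds always occupy exactly six digits."""
    return int(f"{seconds}{microseconds:06d}")


if __name__ == "__main__":
    earlier = (1270750000, 999999)  # late in one second
    later = (1270750001, 5)         # early in the next second
    print("joined:", joined_timestamp(*earlier), joined_timestamp(*later))
    print("padded:", padded_timestamp(*earlier), padded_timestamp(*later))
```

With plain concatenation the later write gets a numerically smaller timestamp, so a store that resolves conflicts by highest timestamp would keep the older value. With six-digit padding the ordering matches wall-clock order.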

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Mike Gallamore
I'll work on making a benchmark sometime later. But I don't think that my changes would be batched. My rows only have one column and for this test each row is only accessed once (when it is written), I pretty much directly mapped over from a key value store that was using memcache before. It

Re: Iterate through entire data set

2010-04-08 Thread Benjamin Black
Strange setup, but, ok. What is your setting on the Windows machine? On Thu, Apr 8, 2010 at 11:44 AM, Sonny Heer wrote: > I have two boxes.  One is a windows box running Cassandra .6, and the > other is an ubuntu box from which I'm trying to run the word count > program as in the readme. > > Th

Re: Iterate through entire data set

2010-04-08 Thread Benjamin Black
Are you actually trying to make the Ubuntu system another node in the ring? While the first node is only listening on localhost? There's your problem. On Thu, Apr 8, 2010 at 11:44 AM, Sonny Heer wrote: > I have two boxes.  One is a windows box running Cassandra .6, and the > other is an ubuntu

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
Yeah I agree it's strange, Windows is just my local box, and I'm testing before setting up actual boxes :) the thrift address is 'localhost' on the windows box. On Thu, Apr 8, 2010 at 11:53 AM, Benjamin Black wrote: > Strange setup, but, ok.  What is your setting on the > Windows machine? > > O

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
Single node cluster (the windows box). the Ubuntu box is only used to run the word count On Thu, Apr 8, 2010 at 11:54 AM, Benjamin Black wrote: > Are you actually trying to make the Ubuntu system another node in the > ring?  While the first node is only listening on localhost?  There's > your pr

Re: Iterate through entire data set

2010-04-08 Thread Benjamin Black
You are telling Windows to only listen on localhost, which is the loopback, which is only accessible on the system itself, not from external machines. On Thu, Apr 8, 2010 at 11:56 AM, Sonny Heer wrote: > Yeah I agree it's strange, Windows is just my local box, and I'm > testing before setting up

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Mike Gallamore
On 04/08/2010 05:53 AM, Ted Zlatanov wrote: On Thu, 08 Apr 2010 12:53:48 +0100 Philip Jackson wrote: PJ> At Wed, 07 Apr 2010 13:19:26 -0700, PJ> Mike Gallamore wrote: I have writes to cassandra that are failing, or at least a read shortly after a write is still getting an old value. I r

Re: Iterate through entire data set

2010-04-08 Thread Benjamin Black
"The ubuntu box seed & listen is point to IP of the windows box." If you are setting storage-conf.xml parameters, you are trying to run Cassandra on the Ubuntu system. Regardless, your earlier mail saying you are setting ThriftAddress to localhost on the Windows machine precludes anything connect

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
Right. The word_count program has a storage-conf.xml file which I'm assuming it reads in order to discover the cluster. I've changed the thrift listen address on the windows box to be the IP instead. but still the same result. what is proper setup for easily testing this? On Thu, Apr 8, 2010 a

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Mike Gallamore
On 04/08/2010 11:46 AM, Jeremy Dunck wrote: On Thu, Apr 8, 2010 at 12:41 PM, Mike Gallamore wrote: Hello. If you are doing exactly the same thing as N::C::Easy (ie a join on the gettimeofday). Then you should have the same problem I found a fix for. The problem is that the microseconds val

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
Okay I moved everything to the ubuntu box: ~/dev/cassandra-0.6.0-rc1/contrib/word_count$ bin/word_count_setup 10/04/08 11:15:10 INFO config.DatabaseDescriptor: Auto DiskAccessMode determined to be standard 10/04/08 11:15:10 WARN config.DatabaseDescriptor: KeysCachedFraction is deprecated: use Keys

Re: Sorting and ordering in Cassandra

2010-04-08 Thread Peter Chang
Eagerly reading this post. One line here doesn't make sense to me. "Out of the box, Cassandra does not support TimeUUIDs for sorting an OrderPreservingPartitioner." Does this mean you can't use Time UUIDs when using OPP? Or that the keys will not have their order preserved? If it's the latter, pe

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
Is there other documentation on how to setup all the pieces? Currently I'm simply trying to test the example word_count, but will likely need to write other map/reduce programs over the cassandra data set. For this test I have one box (ubuntu) where i have moved cass .6 rc1 binary , and started

Re: Is this sentence slightly inaccurate

2010-04-08 Thread Peter Chang
I thought OPP was required for get_range_slices. Is this no longer the case for 0.6? On Thu, Apr 8, 2010 at 11:29 AM, Brandon Williams wrote: > On Thu, Apr 8, 2010 at 1:26 PM, Peter Chang wrote: > >> Sorry, I've read through the docs (although not too recently) and have >> followed this mailing

Re: Is this sentence slightly inaccurate

2010-04-08 Thread Brandon Williams
On Thu, Apr 8, 2010 at 2:38 PM, Peter Chang wrote: > I thought OPP was required for get_range_slices. Is this no longer the case > for 0.6? Right, get_range_slices works with either partitioner. -Brandon

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
Missing the commons logging and commons httpclient jars. Must be using the wrong jdk? On Thu, Apr 8, 2010 at 12:38 PM, Sonny Heer wrote: > Is there other documentation on how to setup all the pieces? > > Currently I'm simply trying to test the example word_count, but will > likely need to wr

Cassandra .6 map/reduce

2010-04-08 Thread Sonny Heer
Running the word_count example the hadoop job appears to be run internally. If I have a Cassandra cluster of 10 nodes, how does the Hadoop cluster get configured?

Re: Iterate through entire data set

2010-04-08 Thread Jonathan Ellis
those aren't shipped with Cassandra. On Thu, Apr 8, 2010 at 3:00 PM, Sonny Heer wrote: > Missing the commons logging and commons httpclient jars.  Must be > using the wrong jdk? > > On Thu, Apr 8, 2010 at 12:38 PM, Sonny Heer wrote: >> Is there other documentation on how to setup all the pie

RE: Cassandra .6 map/reduce

2010-04-08 Thread Stu Hood
Code that uses Hadoop will look for mapred-site.xml, core-site.xml, hdfs-site.xml etc on your CLASSPATH. If you add your Hadoop config directory to CLASSPATH before running the script, Hadoop will use that configuration to connect to your cluster. -Original Message- From: "Sonny Heer"

Re: Is this sentence slightly inaccurate

2010-04-08 Thread David Strauss
On 2010-04-08 18:29, Brandon Williams wrote: > The method is the same for both partitioners: get_range_slices. The key > difference is, with RP you don't really have a useful order, it's based > on the hash of the row key. That's fine; I don't care about the order. I just want the ability to walk
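The point that hash order is still an order, and is therefore enough to walk every row, can be illustrated with a toy in-memory model in Python. This is not the get_range_slices API: the md5-based token and the paging helper are assumptions made for the sketch, and all names are hypothetical.

```python
import hashlib
from typing import Iterable, Iterator, List


def token(key: str) -> int:
    """Hash-derived position of a key (md5 is an assumption of this sketch)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def walk_in_token_order(keys: Iterable[str], page_size: int) -> Iterator[List[str]]:
    """Yield pages of keys in hash order; the order is arbitrary but stable."""
    ordered = sorted(keys, key=token)
    for start in range(0, len(ordered), page_size):
        yield ordered[start:start + page_size]


if __name__ == "__main__":
    keys = [f"row{i}" for i in range(10)]
    for page in walk_in_token_order(keys, page_size=4):
        print(page)
```

The pages come back in an order that means nothing to the application, but each key appears exactly once, which is all a full scan needs.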

Very new user needs some troubleshooting pointers

2010-04-08 Thread Heath Oderman
Hi All, I'm brand new to Cassandra and know absolutely nothing, so please forgive me in advance. A friend and I have each setup a few Cassandra stand alone nodes, completely default. His: Mac OSX Snow Leopard Mac Book Pro Intel Duo Core 4GB Ram 5400 rpm disk Mine: debian 5.x

Re: Cassandra .6 map/reduce

2010-04-08 Thread Sonny Heer
Where is it configured in the word_count example? On Thu, Apr 8, 2010 at 1:29 PM, Stu Hood wrote: > Code that uses Hadoop will look for mapred-site.xml, core-site.xml, > hdfs-site.xml etc on your CLASSPATH. If you add your Hadoop config directory > to CLASSPATH before running the script, Hadoop

Re: Iterate through entire data set

2010-04-08 Thread Sonny Heer
Yeah I realized that shortly... :) I'm still not able to point the word_count to a live cluster. If i have a single node cluster and the thrift address is the IP of that box, and it has seed value as the IP of itself as well. How do i run the word_count remotely then? Sorry I must be missing so

Re: writes to Cassandra failing occasionally

2010-04-08 Thread Philip Jackson
At Thu, 08 Apr 2010 11:41:30 -0700, Mike Gallamore wrote: > > On 04/08/2010 04:53 AM, Philip Jackson wrote: > > At Wed, 07 Apr 2010 13:19:26 -0700, > > Mike Gallamore wrote: > > > >> I have writes to cassandra that are failing, or at least a read shortly > >> after a write is still get

Re: Starting Cassandra Fauna

2010-04-08 Thread Jeff Hodges
While I wasn't able to reproduce the error, we did have another pop up. I think I may have actually fixed your problem the other day. Pull the latest master from fauna/cassandra and you should be good to go. -- Jeff On Thu, Apr 8, 2010 at 10:51 AM, Ryan King wrote: > Yeah, this is a known issue,

Re: Sorting and ordering in Cassandra

2010-04-08 Thread Paul Prescod
On Thu, Apr 8, 2010 at 12:34 PM, Peter Chang wrote: > Eagerly reading this post. One line here doesn't make sense to me. > "Out of the box, Cassandra does not support TimeUUIDs for sorting an > OrderPreservingPartitioner." > Does this mean you can't use Time UUIDs when using OPP? Or that the keys

Worst case #iops to read a row

2010-04-08 Thread Scott Shealy
Not knowing anything about the physical layout of the data on disk or how it is accessed when it is read... Could someone who does help estimate the worst case scenario (no caching at any level) for the number of iops to read a row of modest size and modest number of columns in a large col