Re: help needed interpreting Read/Write latency in cfstats and cfhistograms output

2011-10-04 Thread aaron morton
ms is for microseconds. To get a handle on what is happening I would run them both first to reset the recent counts. Then run them and see if they make sense. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 4/10/2011, at 9:57 AM,

Re: Running on Windows

2011-10-04 Thread aaron morton
I'm guessing things here without checking, but some issues may be: * pretty sure JNA mlockall() to lock the cassandra memory from swapping is not available https://issues.apache.org/jira/browse/CASSANDRA-1214 * Creating hard links for the snapshots will be different, not sure exactly how differ

[RELEASE CANDIDATE] Apache Cassandra 1.0.0-rc2 released

2011-10-04 Thread Sylvain Lebresne
The first release candidate happened to have some important regressions, so we decided to release a new release candidate. So here it is: Apache Cassandra 1.0.0-rc2. This is still *not* the final release and is thus not yet considered ready for production. As always, your help in testing this rel

Re: nodetool cfstats on 1.0.0-rc1 throws an exception

2011-10-04 Thread aaron morton
That row has a size of 819 petabytes, so something is odd there. The error is a result of that value being so huge. When you ran the same script on 0.8.6, what was the max size of the Migrations CF? As Jonathan says, it's unlikely anyone would have tested creating 5000 CFs. Most people only cr

Re: Weird problem with empty CF

2011-10-04 Thread aaron morton
Yes that's the slice query skipping past the tombstone columns. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 4/10/2011, at 4:24 PM, Daning Wang wrote: > Lots of SliceQueryFilter in the log, is that handling tombstone? > > DEB

RE: invalid column name length 0

2011-10-04 Thread Desimpel, Ignace
I run the application with the JVM -ea option, so assertions are enabled. I insert records using the StorageProxy.mutate function. The elements are created as specified below. Below : The arForwardFuncValueBytes and arReverseFuncValueBytes are tested for null or length = 0 by my code. The oTok

Does anybody know why Twitter stop integrate Cassandra as Twitter store?

2011-10-04 Thread ruslan usifov
http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html As said in this post, Twitter stopped working on using Cassandra as a store for Tweets, but nothing is said about why they made this decision. Does anybody have more information?

How to speed up "Waiting for schema agreement" for a single node Cassandra cluster?

2011-10-04 Thread Joseph Norton
Hello. For unit test purposes, I have a single node Cassandra cluster. I need to drop and re-create several keyspaces between each test iteration. This process takes approximately 10 seconds for a single node installation. Can you recommend any tricks or recipes to reduce the time required f

Re: How to speed up "Waiting for schema agreement" for a single node Cassandra cluster?

2011-10-04 Thread Jonathan Ellis
Truncate is faster than drop + recreate. On Tue, Oct 4, 2011 at 9:15 AM, Joseph Norton wrote: > > Hello. > > For unit test purposes, I have a single node Cassandra cluster.  I need to > drop and re-create several keyspaces between each test iteration.  This > process takes approximately 10 seco

Re: How to speed up "Waiting for schema agreement" for a single node Cassandra cluster?

2011-10-04 Thread Joseph Norton
I didn't consider using truncate because a set of potentially random Column Families are created dynamically during the test. Are there any configuration knobs that could be adjusted for drop + recreate? thanks in advance, - Joe N Joseph Norton nor...@alum.mit.edu On Oct 4, 2011, at 11:19

Re: How to speed up "Waiting for schema agreement" for a single node Cassandra cluster?

2011-10-04 Thread Jonathan Ellis
Hmm... Maybe disable compaction, since that can block schema changes. Otherwise the big win will be in https://issues.apache.org/jira/browse/CASSANDRA-1391. On Tue, Oct 4, 2011 at 9:33 AM, Joseph Norton wrote: > > I didn't consider using truncate because a set of potentially random Column > Fa

Re: Weird problem with empty CF

2011-10-04 Thread Daning
Thanks Aaron. How about I set gc_grace_seconds to 0, or something like 2 hours? I'd like to clean up tombstones sooner; I don't care about losing some data, and all my columns have a TTL. If one node is down longer than gc_grace_seconds and the tombstones have been removed, then once the node is up, from my understanding d

Re: Does anybody know why Twitter stop integrate Cassandra as Twitter store?

2011-10-04 Thread Paul Loy
Did you read the article you posted? "*We believe that this isn't the time to make large scale migration to a new technology*. We will focus our Cassandra work on new projects that we wouldn't be able to ship without a large-scale data store. "*We're investing in Cassandra every day. It'll be wit

Re: Does anybody know why Twitter stop integrate Cassandra as Twitter store?

2011-10-04 Thread ruslan usifov
Hello 2011/10/4 Paul Loy > Did you read the article you posted? > Yes > "*We believe that this isn't the time to make large scale migration to a > new technology*. We will focus our Cassandra work on new projects that we > wouldn't be able to ship without a large-scale data store. There was

Re: help needed interpreting Read/Write latency in cfstats and cfhistograms output

2011-10-04 Thread Brandon Williams
On Mon, Oct 3, 2011 at 3:57 PM, Ramesh Natarajan wrote: > Thanks Aaron. Is the ms in the latency microseconds or milliseconds? > I ran the 2 commands at the same time. I was expecting the values to be > somewhat similar, but from my output earlier, you can see the median > in read late

dedicated gossip lan

2011-10-04 Thread Sorin Julean
Hi, Has anyone used dedicated interfaces and a LAN / VLAN for gossip traffic? Are there any benefits to such an approach? Cheers, Sorin

Re: Does anybody know why Twitter stop integrate Cassandra as Twitter store?

2011-10-04 Thread Paul Loy
yup, and again it gives a perfectly adequate reason: "Twitter is busy fighting other fires and they don't have the time to retrofit something that is (more or less) working, namely their MySQL based tweet storage, with a completely

ApacheCon meetup?

2011-10-04 Thread Chris Burroughs
ApacheCon NA is coming up next month. I suspect there will be at least a few Cassandra users there (yeah new release!). Would anyone be interested in getting together and sharing some stories? This could either be an "official" [1] meetup, or just grabbing food together sometime. [1] http://wiki.apa

Re: dedicated gossip lan

2011-10-04 Thread Brandon Williams
On Tue, Oct 4, 2011 at 2:00 PM, Sorin Julean wrote: > Hi, > >  Did anyone used a dedicated interfaces and LAN / VLAN for gossip traffic ? > >  Any benefits in such approach ? I don't think there is any substantial benefit to doing this, but also it's impossible: gossip is not separate from the st

Re: dedicated gossip lan

2011-10-04 Thread Sorin Julean
Sorry for not being clear. Indeed I mean a separate LAN and interfaces for "listen_address". - sorin On Tue, Oct 4, 2011 at 10:49 PM, Brandon Williams wrote: > On Tue, Oct 4, 2011 at 2:00 PM, Sorin Julean > wrote: > > Hi, > > > > Did anyone used a dedicated interfaces and LAN / VLAN for gossi

Re: Weird problem with empty CF

2011-10-04 Thread aaron morton
I would not set gc_grace_seconds to 0; set it to something small. gc_grace_seconds or ttl is only the minimum amount of time the column will stay in the data files. The columns are only purged when compaction runs some time after that timespan has ended. If you are seeing issues where a heavy
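
The point about gc_grace_seconds being a minimum can be sketched in a few lines of Python. This is a simplified model for illustration only, not Cassandra's actual code: the function names, the strict comparison, and the idea of passing in a single compaction time are all assumptions made here.

```python
from datetime import datetime, timedelta


def earliest_purge_time(deleted_at: datetime, gc_grace_seconds: int) -> datetime:
    """Earliest moment a tombstone becomes eligible for purging (hypothetical helper)."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


def is_purgeable(deleted_at: datetime, gc_grace_seconds: int,
                 compaction_time: datetime) -> bool:
    """True if a compaction running at compaction_time may drop the tombstone.

    Assumption: the grace period must have fully elapsed (strict comparison).
    Nothing is removed until a compaction actually runs after that point.
    """
    return compaction_time > earliest_purge_time(deleted_at, gc_grace_seconds)


deleted_at = datetime(2011, 10, 4, 12, 0, 0)
two_hours = 2 * 60 * 60

# A compaction one hour after the delete must keep the tombstone.
early = is_purgeable(deleted_at, two_hours, deleted_at + timedelta(hours=1))
# A compaction three hours after the delete may drop it.
late = is_purgeable(deleted_at, two_hours, deleted_at + timedelta(hours=3))
print(early, late)
```

In this model a small gc_grace_seconds only moves the eligibility time earlier; the data still stays on disk until a later compaction includes it.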

Re: Does anybody know why Twitter stop integrate Cassandra as Twitter store?

2011-10-04 Thread aaron morton
If you want to see just how much Twitter uses Cassandra, watch Chris Goffinet's awesome presentation at this year's Cassandra SF meeting http://www.datastax.com/events/cassandrasf2011/presentations Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thel

EC2 raid0 disks ?

2011-10-04 Thread Yang
It seems that the number of virtual disks you can have is fixed: on m2.4xlarge you have 2 disks, while on m2.2xlarge you have only 1, so I can't set up a RAID0 on m2.2xlarge. Am I correct? Thanks Yang

Re: Does anybody know why Twitter stop integrate Cassandra as Twitter store?

2011-10-04 Thread Chris Goffinet
At the time of that project, there weren't enough resources or a dedicated team. Since then we have changed that (based on the presentation I gave). We decided to focus on other areas, and newer projects. We spent a lot of time with the community improving failure conditions, performance, etc. We chose to

Shrinking cluster with counters ...

2011-10-04 Thread Ian Danforth
All, If I have a 3 node cluster storing counters and RF3, is it possible to shrink back down to a single node cluster? If so should I change replication factor, disable a node, wait for streaming to complete, and repeat for the other node? Should I assume that the cluster will be unavailable duri

Re: How to speed up "Waiting for schema agreement" for a single node Cassandra cluster?

2011-10-04 Thread Joseph Norton
Thanks for the pointers. For this type of functional unit testing, I suppose what I really want is a mock Cassandra (or Thrift Server) node for quickly running lots of tests for an application's logic. thanks, - Joe N. Joseph Norton nor...@alum.mit.edu On Oct 5, 2011, at 12:01 AM, Jonath

Re: Weird problem with empty CF

2011-10-04 Thread Daning
Thanks. Do you have plans to improve this? I think tombstones should be separated from live data since they serve a different purpose, either built into a separate SSTable or indexed differently. It is pretty costly to do filtering while reading. Daning On 10/04/2011 01:34 PM, aaron morton wrote: I would n

Re: EC2 raid0 disks ?

2011-10-04 Thread Yi Yang
AFAIK it's around 450G per ephemeral disk. BTW, at random you can get high-performance EBS drives as well. Their performance is good for a DB, but the IOPS are unpredictable. --Original Message-- From: Yang To: user@cassandra.apache.org ReplyTo: user@cassandra.apache.org Subject: EC2 raid0 disks ? Sent: Oct

Re: EC2 raid0 disks ?

2011-10-04 Thread Joaquin Casares
Correct. Not with ephemeral storage. Here's a complete list of the drives that are attached: http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/index.html?InstanceStorage.html#StorageOnInstanceTypes Joaquin Casares DataStax Software Engineer/Support On Tue, Oct 4, 2011 at 4:01 PM, Yang

Re: EC2 raid0 disks ?

2011-10-04 Thread Joaquin Casares
Hello again, Also, EBS volumes can be attached, but their performance issues cause other problems when trying to run a healthy cluster. From experience, clusters running on EBS volumes bring their own set of unique problems and are harder to debug. Here's a quick link that provides a bit more background inf

Re: EC2 raid0 disks ?

2011-10-04 Thread Yang
Thanks guys. btw, what is the performance difference between doing a raid0 on the multiple ephemeral drives available, and then assign it to cassandra data directory, vs creating a mount on each of these drives, and then specify all of these to cassandra's data directory list? since these drives

Re: EC2 raid0 disks ?

2011-10-04 Thread Joaquin Casares
Not a problem! We ran a test a few months back and know it's better to use RAID0 vs. just one mount directory. It slips my mind whether the test was also run for multiple directories vs. RAID0 as well. We chose the RAID0 method for the AMI so as to avoid confusion and allow for all the sstables to b

Re: How to speed up "Waiting for schema agreement" for a single node Cassandra cluster?

2011-10-04 Thread Jeremiah Jordan
But truncate is still slow, especially if it can't use JNA (windows) as it snapshots. Depending on how much data you are inserting during your unit tests, just paging through all the keys and then deleting them is the fastest way, though if you use timestamps besides "now" this won't work, as t

Re: How to speed up "Waiting for schema agreement" for a single node Cassandra cluster?

2011-10-04 Thread Roshan Dawrani
On Wed, Oct 5, 2011 at 7:42 AM, Jeremiah Jordan < jeremiah.jor...@morningstar.com> wrote: > But truncate is still slow, especially if it can't use JNA (windows) as it > snapshots. Depending on how much data you are inserting during your unit > tests, just paging through all the keys and then dele

ByteOrderedPartitioner token generation

2011-10-04 Thread Masoud Moshref Javadi
I need to insert a large amount of data into a Cassandra cluster in a short time, so I want the interaction among Cassandra servers to be minimal. I think that the best way to do this is to use ByteOrderedPartitioner, generate the IDs of new data based on the InitialToken of the servers, and send data to the