Cassandra and EC2 performance testing

2010-10-07 Thread Corey Hulen
I recently posted a blog article about Cassandra and EC2 performance testing for small vs large, EBS vs ephemeral storage, compared to real HW with and without an SSD. Hope people find it interesting. http://www.coreyhulen.org/?p=326 Highlights: - The variance in test results from run to run

Re: Possible bug in Cassandra MapReduce

2010-06-18 Thread Corey Hulen
3: https://issues.apache.org/jira/browse/CASSANDRA-1042 > > On Fri, Jun 18, 2010 at 2:49 PM, Corey Hulen wrote: > > > > We are using MapReduce to periodically verify and rebuild our secondary > > indexes along with counting total records. We started to notice double > > cou

Re: Possible bug in Cassandra MapReduce

2010-06-18 Thread Corey Hulen
r key count you will get double counts. -Corey On Fri, Jun 18, 2010 at 3:15 PM, Corey Hulen wrote: > > I thought the same thing, but using the supplied contrib example I just > delete the /var/lib/data dirs and commit log. > > -Corey > > > > > On Fri, Jun 18, 2

Re: Possible bug in Cassandra MapReduce

2010-06-18 Thread Corey Hulen
feature (of sorts). When this happens I > increment a formally non-zero portion of the timestamp (the last digit of > precision which was always zero) and use this as a counter to track how many > times a key/col was updated (max 9 for my purposes). > > -phil > > On Jun 18, 2010, a
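
The quoted trick of reusing the last digit of timestamp precision as an update counter can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, timestamps are assumed to be non-negative longs whose last decimal digit is otherwise always zero, and the cap of 9 follows the "max 9" mentioned in the message. It is not the poster's actual code.

```java
// Hypothetical sketch: the last decimal digit of a timestamp doubles as an update counter.
class TimestampCounter {
    // Assumption taken from the message: at most 9 updates can be tracked in one digit.
    static final int MAX_UPDATES = 9;

    /** Number of updates recorded in the last digit of the timestamp. */
    static int updateCount(long timestamp) {
        return (int) (timestamp % 10);
    }

    /** The timestamp with the counter digit cleared back to zero. */
    static long baseTimestamp(long timestamp) {
        return timestamp - (timestamp % 10);
    }

    /** Returns the timestamp with the counter digit incremented by one. */
    static long bump(long timestamp) {
        if (updateCount(timestamp) >= MAX_UPDATES) {
            throw new IllegalStateException("counter digit is already at " + MAX_UPDATES);
        }
        return timestamp + 1;
    }

    public static void main(String[] args) {
        long original = 1276900000000000L; // made-up value ending in zero
        long updated = bump(bump(original));
        System.out.println(updateCount(updated));   // prints 2
        System.out.println(baseTimestamp(updated)); // prints 1276900000000000
    }
}
```

Because the bumped value is still larger than the original, it keeps winning timestamp-based conflict resolution, while the original time can be recovered by clearing the last digit.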

Possible bug in Cassandra MapReduce

2010-06-18 Thread Corey Hulen
We are using MapReduce to periodically verify and rebuild our secondary indexes along with counting total records. We started to notice double counting of unique keys on single-machine standalone tests. We were finally able to reproduce the problem using the apache-cassandra-0.6.2-src/contrib/word_

paging row keys

2010-05-12 Thread Corey Hulen
Can someone point me to a Thrift sample (preferably Java) that lists all the rows in a ColumnFamily on my Cassandra server? I noticed some examples using SlicePredicate and SliceRange to perform a similar query against the columns with paging, but I was looking for something similar for rows with pa
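
One common way to page over row keys is to request a fixed-size range, then restart the next request at the last key received and skip that repeated key. The sketch below shows only that loop. It deliberately does not call the Thrift client: `KeyRangeSource` is a hypothetical stand-in for whatever range query the client exposes, assumed to return up to `count` keys starting at `startKey` inclusive, in the store's own key order, with an empty start key meaning the beginning of the range. The in-memory source exists only so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of row-key paging; no real Cassandra client is used here.
class RowKeyPager {
    /** Stand-in for a range query: up to count keys, starting at startKey inclusive. */
    interface KeyRangeSource {
        List<String> fetch(String startKey, int count);
    }

    /** Collects every key by fetching pages of pageSize and restarting at the last key seen. */
    static List<String> listAllKeys(KeyRangeSource source, int pageSize) {
        if (pageSize < 2) {
            // With a page size of 1, every page after the first would hold only the repeated key.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> all = new ArrayList<>();
        String start = "";      // assumed to mean "from the beginning"
        boolean firstPage = true;
        while (true) {
            List<String> page = source.fetch(start, pageSize);
            // The start key is inclusive, so later pages repeat the previous page's last key.
            int from = firstPage ? 0 : 1;
            for (int i = from; i < page.size(); i++) {
                all.add(page.get(i));
            }
            if (page.size() < pageSize) {
                break;          // a short page means the range is exhausted
            }
            start = page.get(page.size() - 1);
            firstPage = false;
        }
        return all;
    }

    /** In-memory source used only to make the sketch runnable. */
    static KeyRangeSource inMemory(TreeSet<String> keys) {
        return (startKey, count) -> {
            List<String> out = new ArrayList<>();
            for (String key : keys.tailSet(startKey, true)) {
                if (out.size() == count) {
                    break;
                }
                out.add(key);
            }
            return out;
        };
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>(Arrays.asList("a", "b", "c", "d", "e", "f", "g"));
        System.out.println(listAllKeys(inMemory(keys), 3)); // prints [a, b, c, d, e, f, g]
    }
}
```

Forgetting to skip the repeated first key of each later page would count boundary keys twice, so the `from` offset is the part worth checking when adapting this to a real client.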