Hi,

I have several questions, and I hope some of you can share your experience with any or all of the following. I am especially curious about Twitter's and Digg's experience, as they might be processing
1. Eventual consistency: Given a volume of 5K writes/sec, of which roughly 1,500 per second are updates and the rest are inserts, what kind of latency can be expected for eventual consistency?
2. Performance: Are there any benchmarks on how many writes/sec and reads/sec Cassandra supports on an "n node" cluster? A node can be of variable size, so I would like to know the hardware/software details of the cluster as well.
3. EC2: Has anyone implemented Cassandra on EC2? What kind of transaction volume are they using it for, and how has their experience with Cassandra on EC2 been?
4. Overhead and issues: What are typical nightmare scenarios one could face when using Cassandra for heavy write/read-intensive systems?
5. Backups: For a 4 or 5 TB Cassandra cluster, what backup scenarios would you recommend?

Also, does Cassandra support counters? Digg's article said they are going to contribute their work to open source; any idea when that would be?

Thanks in advance for sharing your experience.

Lenin

On Fri, Mar 19, 2010 at 1:03 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Jeff Hodsdon edited the new link in:
> http://about.digg.com/blog/looking-future-cassandra
>
> On Fri, Mar 19, 2010 at 2:49 PM, Nathan McCall <n...@vervewireless.com> wrote:
> > Gary,
> > Did you see this article linked from the Cassandra wiki?
> > http://about.digg.com/node/564
> >
> > See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
> > examples like the above. In general, you structure your data according
> > to how it will be queried. This can lead to duplication, but that is
> > one of the trade-offs for performance and scale.
> >
> > Digg folks - the "Looking to the Future with Cassandra" linked on the
> > wiki is no longer available. I found that article quite helpful
> > originally. Is there a chance this could be re-posted?
> >
> > Cheers,
> > -Nate
> >
> > On Fri, Mar 19, 2010 at 12:16 PM, Gary <daxia...@gmail.com> wrote:
> >> I am a newbie to the Bigtable-like model and have a question as follows.
> >> Take Digg as an example: I want to find a list of users who dug a URL and
> >> also want to find a list of URLs a user dug. What should the data model
> >> look like for the queries to be efficient? If I use the username and the
> >> URL for two rows, when a user digs a URL, I will have to update two rows,
> >> so I need a transaction to keep the data consistent.
> >> Any thoughts?
> >> Thanks,
> >> Gary
> >
>

--
twitter: leningali
skype: galilenin
Cell:513.382.3371
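For reference, here is a minimal sketch of the duplication Nate describes in the quoted reply, applied to Gary's two queries. It is not Digg's actual schema and does not use any Cassandra client: the two dictionaries are hypothetical in-memory stand-ins for two column families, and the names `diggs_by_url`, `diggs_by_user`, and `record_digg` are made up for illustration.

```python
# Hypothetical stand-ins for two column families, each keyed by the
# thing we want to look up: row key -> {column name -> timestamp}.
diggs_by_url = {}   # URL  -> {user: timestamp}
diggs_by_user = {}  # user -> {URL: timestamp}


def record_digg(user, url, timestamp):
    """Store the same fact twice, once per query pattern."""
    diggs_by_url.setdefault(url, {})[user] = timestamp
    diggs_by_user.setdefault(user, {})[url] = timestamp


def users_who_dug(url):
    """Answer 'who dug this URL?' with a single row lookup."""
    return sorted(diggs_by_url.get(url, {}))


def urls_dug_by(user):
    """Answer 'what did this user dig?' with a single row lookup."""
    return sorted(diggs_by_user.get(user, {}))


record_digg("gary", "http://example.com/a", 1)
record_digg("nate", "http://example.com/a", 2)
record_digg("gary", "http://example.com/b", 3)
# Repeating a write leaves both rows unchanged, so a failed pair of
# writes can simply be retried.
record_digg("gary", "http://example.com/b", 3)
```

Because each write in this sketch is idempotent, retrying both writes after a failure brings the two rows back in line, which is one way to live without a transaction across them. Whether that is acceptable depends on how long the application can tolerate the two views disagreeing.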