[Q] MapReduce behavior and Cassandra's scalability for petabytes of data

2010-10-21 Thread Takayuki Tsunakawa
chical extensions to Dynamo. Also, note that this problem is actively addressed by O(1) DHT systems (e.g., [14]). -- Regards, Takayuki Tsunakawa

Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data

2010-10-22 Thread Takayuki Tsunakawa
d. Regards, Takayuki Tsunakawa - Original Message - From: aaron morton To: user@cassandra.apache.org Sent: Friday, October 22, 2010 4:05 PM Subject: Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data For plain old log analysis the Cloudera Hadoop distribution

Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data

2010-10-24 Thread Takayuki Tsunakawa
TB of disks and the replication factor is 3, the simple calculation shows 4 TB * 400 / 3 = 533 TB (ignoring commit log, OS areas, etc). [Q3-2] Based on the current architecture, what is the limit on the number of nodes, and approximately how much data is the practical limit? Regards, Takayuki Tsunakawa
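
The capacity estimate in this message can be checked with a short sketch. It assumes the truncated opening describes 400 nodes with 4 TB of disk each, which is what the formula implies; the function name and its parameters are hypothetical, and commit log and OS areas are ignored as in the original.

```python
def usable_capacity_tb(nodes, disk_tb_per_node, replication_factor):
    """Raw disk across the cluster divided by the number of replicas kept.

    Hypothetical helper for illustration; ignores commit log, OS areas,
    and compaction headroom, as the original estimate does.
    """
    return nodes * disk_tb_per_node / replication_factor


# Assumed inputs: 400 nodes, 4 TB per node, replication factor 3.
capacity = usable_capacity_tb(nodes=400, disk_tb_per_node=4, replication_factor=3)
print(f"{capacity:.0f} TB")  # prints "533 TB"
```

At this rate, holding 1 PB (1000 TB) of unique data with the same disks and replication factor would need about 750 nodes, since 1000 * 3 / 4 = 750.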

Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data

2010-10-25 Thread Takayuki Tsunakawa
if you could give me your thoughts if you remember some technical challenges that could cause difficulties in a cluster which has petabytes of data and thousands of nodes. Regards, Takayuki Tsunakawa

Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data

2010-10-25 Thread Takayuki Tsunakawa
e, so I'm interested in the elasticity. Yahoo!'s YCSB report makes me worry about adding nodes. Regards, Takayuki Tsunakawa From: "Edward Capriolo" [Q3] There are some challenges with very large disk nodes. Caveats: I will use words like "long", "slow", and

Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data

2010-10-25 Thread Takayuki Tsunakawa
If our project starts with Cassandra and encounters any issues or interesting things, I'll report them here. Regards, Takayuki Tsunakawa From: Mike Malone Hey Takayuki, I don't think you're going to find anyone willing to promise that Cassandra will fit your petabyte scale data ana