hierarchical extensions to Dynamo. Also, note that this problem is actively addressed by O(1) DHT systems (e.g., [14]).
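As an aside, here is a minimal sketch of what that full-membership, one-hop routing looks like (the names FullMembershipRing, token, and owner are hypothetical, not Dynamo or Cassandra code): every node keeps the complete token ring, so any key can be routed to its owner with a single local lookup, and the scaling concern in the quoted passage is the cost of gossiping and maintaining that complete map as the cluster grows very large.

from bisect import bisect_right
from hashlib import md5

def token(name):
    # Hash a key or node name onto the ring (illustrative hash choice).
    return int(md5(name.encode()).hexdigest(), 16)

class FullMembershipRing:
    def __init__(self, nodes):
        # Every member stores the entire sorted (token, node) map.
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def owner(self, key):
        # First node clockwise from the key's token: one local lookup, no extra hops.
        i = bisect_right(self.tokens, token(key)) % len(self.ring)
        return self.ring[i][1]

ring = FullMembershipRing("node%d" % i for i in range(400))
print(ring.owner("user:12345"))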
--
Regards,
Takayuki Tsunakawa
----- Original Message -----
From: aaron morton
To: user@cassandra.apache.org
Sent: Friday, October 22, 2010 4:05 PM
Subject: Re: [Q] MapReduce behavior and Cassandra's scalability for
petabytes of data
For plain old log analysis the Cloudera Hadoop distribution
If each node has 4 TB of disks and the replication factor is 3, the
simple calculation shows 4 TB * 400 nodes / 3 = 533 TB (ignoring commit log,
OS areas, etc.).
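For what it's worth, here is the same back-of-the-envelope arithmetic as a tiny script (400 nodes, 4 TB per node, and replication factor 3 are just the assumed figures from the question, not measured limits):

def usable_capacity_tb(nodes, disk_tb_per_node, replication_factor):
    # Raw disk across the cluster divided by the replication factor.
    return nodes * disk_tb_per_node / replication_factor

print(usable_capacity_tb(nodes=400, disk_tb_per_node=4, replication_factor=3))
# ~533.3 TB, before subtracting commit log, OS areas, compaction headroom, etc.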
[Q3-2]
With the current architecture, what is the practical limit on the number of
nodes, and approximately how much data can a cluster handle?
Regards,
Takayuki Tsunakawa
if you could give me your thoughts, in case you remember any
technical challenges that could cause difficulties in a cluster with
petabytes of data and thousands of nodes.
Regards,
Takayuki Tsunakawa
e, so I'm interested in the elasticity.
Yahoo!'s YCSB report makes me worry about adding nodes.
Regards,
Takayuki Tsunakawa
From: "Edward Capriolo"
[Q3]
There are some challenges with very large disk nodes.
Caveats:
I will use words like "long", "slow", and
If our project starts with
Cassandra and we encounter any issues or interesting things, I'll report them here.
Regards,
Takayuki Tsunakawa
From: Mike Malone
Hey Takayuki,
I don't think you're going to find anyone willing to promise that Cassandra
will fit your petabyte-scale data analysis