On Sun, Apr 18, 2010 at 8:00 AM, Mason Hale <ma...@onespot.com> wrote:
> This is a statement I wish I had run across sooner. Our first
> implementation (which we're changing now) included some very big rows. We
> ran into trouble with compaction and during hinted hand-off operations
> (which also deals with data a full row at a time) because these rows would
> not fit into available memory.
>
> I think until there are not these lurking gotcha spots like compaction and
> hinted hand-off, where a full row must fit in memory, we should not be
> making misleading statements like "Cassandra has the advantage of a more
> advanced datamodel, allowing for a single row to contain billions of
> column/value pairs: enough to fill a machine." (from:
> http://gigaom.com/2010/03/11/digg-cassandara/ ,
> http://spyced.blogspot.com/2010/03/cassandra-in-action.html). A statement
> like that should have some caveats, otherwise it reads as an endorsement, a
> suggestion even, to build a data model with massively wide rows. In
> practice, it is not feasible to have billions of columns in a single row
> because it will lead to problems with compaction and hinted hand-off, maybe
> elsewhere.
>
> Mason

http://wiki.apache.org/cassandra/CassandraLimitations

We aren't hiding anything from the user who wishes to educate themselves.

-Brandon