On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:

> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn <da...@lookin2.com> wrote:
>
>> The advantage would be to enable secondary indexes on supercolumn
>> families.
>>
>
> Then I suggest opening a ticket for adding secondary indexes to supercolumn
> families and voting on it. This will be 1 or 2 orders of magnitude less work
> than getting rid of super columns internally, and probably a much better
> solution anyway.
>

I realize that this is largely subjective, and on such matters code speaks
louder than words, but I don't think I agree with you on the issue of which
alternative is less work, or even which is a better solution.

If the goal is to have a hierarchical model, limiting the depth to two seems
arbitrary. Why not go all the way and allow an arbitrarily deep hierarchy?

If a more sophisticated hierarchical model is deemed unnecessary, or
impractical, allowing a depth of two seems inconsistent and
unnecessary. It's pretty trivial to overlay a hierarchical model on top of
the map-of-sorted-maps model that Cassandra implements. Ed Anuff has
implemented a custom comparator that does the job [1]. Google's Megastore
has a similar architecture and goes even further [2].
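
To make the overlay idea concrete, here is a minimal sketch in Python. It is
not Ed Anuff's comparator and not Cassandra code; the SortedRow class and its
method names are hypothetical, made up for illustration. It assumes column
names are tuples of strings compared component by component, so that every
subtree of the hierarchy is a contiguous range within one sorted row.

```python
import bisect


class SortedRow:
    """Hypothetical stand-in for one row: columns kept sorted by name.

    Column names are tuples of strings (a path in the hierarchy).
    Python compares tuples component by component, which plays the
    role of a composite comparator in this sketch.
    """

    def __init__(self):
        self._names = []    # sorted list of column names (tuples)
        self._values = {}   # column name -> value

    def insert(self, path, value):
        # Keep the name list sorted; overwrite the value if the name exists.
        if path not in self._values:
            bisect.insort(self._names, path)
        self._values[path] = value

    def slice(self, prefix):
        """Return (path, value) pairs for every column under `prefix`."""
        # A prefix sorts before all of its extensions, so the subtree
        # starts at the first name >= prefix and runs contiguously.
        start = bisect.bisect_left(self._names, prefix)
        result = []
        for name in self._names[start:]:
            if name[:len(prefix)] != prefix:
                break
            result.append((name, self._values[name]))
        return result


row = SortedRow()
row.insert(("inbox", "msg1", "subject"), "hi")
row.insert(("inbox", "msg1", "body"), "text")
row.insert(("inbox", "msg2", "subject"), "yo")
row.insert(("sent", "msg9", "subject"), "re")

# Depth is not fixed: slice at any level of the hierarchy.
one_message = row.slice(("inbox", "msg1"))
whole_inbox = row.slice(("inbox",))
```

Nothing in the sketch treats depth two differently from depth three or ten:
a subtree at any level is just a range scan over sorted column names.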

It seems to me that super columns are a historical artifact from Cassandra's
early life as Facebook's inbox storage system. They needed posting lists of
messages, sharded by user, so that's what they built. In my dealings with
the Cassandra code, super columns make a mess all over the place: algorithms
have to be special-cased and branch on the column/supercolumn distinction.

I won't even mention what it does to the Thrift interface.

Mike

[1] http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html
[2] http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
