I agree, that is the way to go. Then each piece of new functionality will not have to be implemented twice.
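To make "compound column names" concrete, here is a rough sketch in plain Python. It is not Cassandra code, and the function names `flatten_row` and `slice_super_column` are made up for illustration. It assumes a super column row can be modelled as a map of maps, and shows that if each (super column name, subcolumn name) pair becomes one compound name in a single sorted row, then reading one super column is just a range scan over a name prefix.

```python
from bisect import bisect_left


def flatten_row(super_row):
    """Flatten {super_name: {sub_name: value}} into a list of
    ((super_name, sub_name), value) pairs sorted by compound name."""
    return sorted(
        ((super_name, sub_name), value)
        for super_name, subcolumns in super_row.items()
        for sub_name, value in subcolumns.items()
    )


def slice_super_column(flat_row, super_name):
    """Return {sub_name: value} for one super column using a prefix scan."""
    names = [name for name, _ in flat_row]
    # (super_name,) sorts just before every (super_name, sub_name) pair.
    start = bisect_left(names, (super_name,))
    result = {}
    for (sup, sub), value in flat_row[start:]:
        if sup != super_name:
            break  # walked past the end of this super column's range
        result[sub] = value
    return result


# Hypothetical data: two "super columns", each holding two subcolumns.
row = {
    "msg2": {"body": "hi", "from": "bob"},
    "msg1": {"body": "yo", "from": "al"},
}
flat = flatten_row(row)
print(slice_super_column(flat, "msg2"))
```

Because tuples compare element by element, all subcolumns of one super column stay contiguous in the sorted row, which is the property a compound comparator would have to provide.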

On Sat, Feb 12, 2011 at 9:41 AM, Stu Hood <stuh...@gmail.com> wrote:

> I would like to continue to support super columns, but to slowly convert
> them into "compound column names", since that is really all they are.
>
>
> On Thu, Feb 10, 2011 at 10:16 AM, Frank LoVecchio <fr...@isidorey.com> wrote:
>
>> I've found super column families quite useful when using
>> RandomOrderedPartioner on a low-maintenance cluster (as opposed to
>> Byte/Ordered), e.g. returning ordered data from a TimeUUID comparator type;
>> try doing that with one regular column family and secondary indexes (you
>> could obviously sort on the client side, but that is tedious and not logical
>> for older data).
>>
>> On Thu, Feb 10, 2011 at 12:32 AM, David Boxenhorn <da...@lookin2.com> wrote:
>>
>>> Mike, my problem is that I have a database and codebase that already
>>> use supercolumns. If I had to do it over, it wouldn't use them, for the
>>> reasons you point out. In fact, I have a feeling that over time supercolumns
>>> will become deprecated de facto, if not de jure. That's why I would like to
>>> see them represented internally as regular columns, with an upgrade path for
>>> backward compatibility.
>>>
>>> I would love to do it myself! (I haven't looked at the code base, but I
>>> don't understand why it should be so hard.) But my employer has other
>>> ideas...
>>>
>>>
>>> On Wed, Feb 9, 2011 at 8:14 PM, Mike Malone <m...@simplegeo.com> wrote:
>>>
>>>> On Tue, Feb 8, 2011 at 2:03 AM, David Boxenhorn <da...@lookin2.com> wrote:
>>>>
>>>>> Shaun, I agree with you, but marking them as deprecated is not good
>>>>> enough for me. I can't easily stop using supercolumns. I need an upgrade
>>>>> path.
>>>>>
>>>>
>>>> David,
>>>>
>>>> Cassandra is open source and community developed. The right thing to do
>>>> is what's best for the community, which sometimes conflicts with what's
>>>> best for individual users. Such strife should be minimized; it will never
>>>> be eliminated.
>>>> Luckily, because this is an open source, liberally licensed project, if
>>>> you feel strongly about something you should feel free to add whatever
>>>> features you want yourself. I'm sure other people in your situation will
>>>> thank you for it.
>>>>
>>>> At a minimum I think it would behoove you to re-read some of the
>>>> comments here re: why super columns aren't really needed and take another
>>>> look at your data model and code. I would actually be quite surprised to
>>>> find a use of super columns that could not be trivially converted to normal
>>>> columns. In fact, it should be possible to do at the framework/client
>>>> library layer - you probably wouldn't even need to change any application
>>>> code.
>>>>
>>>> Mike
>>>>
>>>> On Tue, Feb 8, 2011 at 3:53 AM, Shaun Cutts <sh...@cuttshome.net> wrote:
>>>>>
>>>>>>
>>>>>> I'm a newbie here, but, with apologies for my presumptuousness, I
>>>>>> think you should deprecate SuperColumns. They are already distracting
>>>>>> you, and as the years go by the cost of supporting them as you add more
>>>>>> and more functionality is only likely to get worse. It would be better
>>>>>> to concentrate on making the "core" column families better (and I'm sure
>>>>>> we can all think of lots of things we'd like).
>>>>>>
>>>>>> Just dropping SuperColumns would be bad for your reputation -- and for
>>>>>> users like David who are currently using them. But if you mark them
>>>>>> clearly as deprecated and explain why and what to do instead (perhaps
>>>>>> putting a bit of effort into migration tools... or even a "virtual"
>>>>>> layer supporting arbitrary hierarchical data), then you can drop them in
>>>>>> a few years (when you get to 1.0, say), without people feeling betrayed.
>>>>>>
>>>>>> -- Shaun
>>>>>>
>>>>>> On Feb 6, 2011, at 3:48 AM, David Boxenhorn wrote:
>>>>>>
>>>>>> "My main point was to say that I think it is better to create
>>>>>> tickets for what you want, rather than for something else completely
>>>>>> different that would, as a by-product, give you what you want."
>>>>>>
>>>>>> Then let me say what I want: I want supercolumn families to have any
>>>>>> feature that regular column families have.
>>>>>>
>>>>>> My data model is full of supercolumns. I used them, even though I knew
>>>>>> I didn't *have to*, "because they were there", which implied to me that
>>>>>> I was supposed to use them for some good reason. Now I suspect that they
>>>>>> will gradually become less and less functional, as features are added to
>>>>>> regular column families and not supported for supercolumn families.
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 4, 2011 at 10:58 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
>>>>>>
>>>>>>> On Fri, Feb 4, 2011 at 12:35 AM, Mike Malone <m...@simplegeo.com> wrote:
>>>>>>>
>>>>>>>> On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>>>>>>>>
>>>>>>>>>> The advantage would be to enable secondary indexes on supercolumn
>>>>>>>>>> families.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Then I suggest opening a ticket for adding secondary indexes to
>>>>>>>>> supercolumn families and voting on it. This will be one or two orders
>>>>>>>>> of magnitude less work than getting rid of super columns internally,
>>>>>>>>> and probably a much better solution anyway.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I realize that this is largely subjective, and on such matters code
>>>>>>>> speaks louder than words, but I don't think I agree with you on the
>>>>>>>> issue of which alternative is less work, or even which is a better
>>>>>>>> solution.
>>>>>>>>
>>>>>>>
>>>>>>> You are right, I probably put too much emphasis on that sentence. My
>>>>>>> main point was to say that I think it is better to create tickets for
>>>>>>> what you want, rather than for something else completely different that
>>>>>>> would, as a by-product, give you what you want.
>>>>>>> Then I suspect that *if* the only goal is to get secondary indexes on
>>>>>>> super columns, then there is a good chance this would be less work than
>>>>>>> getting rid of super columns. But to be fair, secondary indexes on super
>>>>>>> columns may not make too much sense without #598, which itself would
>>>>>>> require quite some work, so clearly I spoke a bit quickly.
>>>>>>>
>>>>>>>
>>>>>>>> If the goal is to have a hierarchical model, limiting the depth to
>>>>>>>> two seems arbitrary. Why not go all the way and allow an arbitrarily
>>>>>>>> deep hierarchy?
>>>>>>>>
>>>>>>>> If a more sophisticated hierarchical model is deemed unnecessary, or
>>>>>>>> impractical, allowing a depth of two seems inconsistent and
>>>>>>>> unnecessary. It's pretty trivial to overlay a hierarchical model on
>>>>>>>> top of the map-of-sorted-maps model that Cassandra implements. Ed
>>>>>>>> Anuff has implemented a custom comparator that does the job [1].
>>>>>>>> Google's Megastore has a similar architecture and goes even further [2].
>>>>>>>>
>>>>>>>> It seems to me that super columns are a historical artifact from
>>>>>>>> Cassandra's early life as Facebook's inbox storage system. They needed
>>>>>>>> posting lists of messages, sharded by user. So that's what they built.
>>>>>>>> In my dealings with the Cassandra code, super columns end up making a
>>>>>>>> mess all over the place when algorithms need to be special cased and
>>>>>>>> branch based on the column/supercolumn distinction.
>>>>>>>>
>>>>>>>> I won't even mention what it does to the Thrift interface.
>>>>>>>>
>>>>>>>
>>>>>>> Actually, I agree with you, more than you know. If I were to start
>>>>>>> coding Cassandra now, I wouldn't include super columns (and I would
>>>>>>> probably not go for an unlimited-depth hierarchical model either). But
>>>>>>> they are there, and I'm not sure getting rid of them fully (meaning,
>>>>>>> including in Thrift) is an option (it would be a big compatibility
>>>>>>> breakage). And (even though I have certainly thought about this more
>>>>>>> than once :)) I'm slightly less enthusiastic about keeping them in
>>>>>>> Thrift but encoding them in regular column families internally: it
>>>>>>> would still be a lot of work, and we would probably still end up with
>>>>>>> nasty tricks to stick to the Thrift API.
>>>>>>>
>>>>>>> --
>>>>>>> Sylvain
>>>>>>>
>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>> [1] http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html
>>>>>>>> [2] http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Frank LoVecchio
>> Senior Software Engineer | Isidorey, LLC
>> Google Voice +1.720.295.9179
>> isidorey.com | facebook.com/franklovecchio | franklovecchio.com |
>> rodsandricers.com
>>
>>
>