On Mon, May 10, 2010 at 9:44 AM, Stu Hood <stu.h...@rackspace.com> wrote:
> I think that it is 100% ideal: it's what I've been working on implementing
> in #674, #847 and #998. I'm hoping to post a large patchset and docs this
> week, and I'm aiming to get it committed for 0.8.
>
> The work I've been doing doesn't touch the user interface: it only deals
> with the internal changes necessary to make this type of storage possible.
>

Yeah, Stu, I've been looking at your github changes. I think we both have a
lot of the same ideas. I'd love to chat more about this stuff sometime.

>
> -----Original Message-----
> From: "Mike Malone" <m...@simplegeo.com>
> Sent: Monday, May 10, 2010 11:37am
> To: user@cassandra.apache.org
> Subject: Re: Is SuperColumn necessary?
>
> Maybe... but honestly, it doesn't affect the architecture or interface at
> all. I'm more interested in thinking about how the system should work than
> what things are called. Naming things is important, but that can happen
> later.
>
> Does anyone have any thoughts or comments on the architecture I suggested
> earlier?
>
> Mike
>
> On Mon, May 10, 2010 at 8:36 AM, Schubert Zhang <zson...@gmail.com> wrote:
>
> > Yes, the "column" here is not appropriate.
> > Maybe we need not create new terms; in Google's Bigtable, the term
> > "qualifier" is a good one.
> >
> > On Thu, May 6, 2010 at 3:04 PM, David Boxenhorn <da...@lookin2.com> wrote:
> >
> >> That would be a good time to get rid of the confusing "column" term,
> >> which incorrectly suggests a two-dimensional tabular structure.
> >>
> >> Suggestions:
> >>
> >> 1. A hypercube (or hypocube, if only two dimensions): replace "key" and
> >> "column" with "1st dimension", "2nd dimension", etc.
> >>
> >> 2. A file system: replace "key" and "column" with "directory" and
> >> "subdirectory"
> >>
> >> 3. A tuple tree: "Column family" replaced by top-level tuple, whose
> >> value is the set of keys, whose value is the set of supercolumns of the
> >> key, whose value is the set of columns for the supercolumn, etc.
> >>
> >> 4. Etc.
> >>
> >> On Thu, May 6, 2010 at 2:28 AM, Mike Malone <m...@simplegeo.com> wrote:
> >>
> >>> Nice, Ed, we're doing something very similar but less generic.
> >>>
> >>> Now replace all of the various methods for querying with a simple query
> >>> interface that takes a Predicate, allow the user to specify (in
> >>> storage-conf) which levels of the nested Columns should be indexed, and
> >>> completely remove Comparators and have people subclass Column /
> >>> implement IColumn and we'd really be on to something ;).
> >>>
> >>> Mock storage-conf.xml:
> >>> <Column Name="ThingThatsNowKey" Indexed="True"
> >>>         ClusterPartitioned="True" Type="UTF8">
> >>>   <Column Name="ThingThatsNowColumnFamily" DiskPartitioned="True"
> >>>           Type="UTF8">
> >>>     <Column Name="ThingThatsNowSuperColumnName" Type="Long">
> >>>       <Column Name="ThingThatsNowColumnName" Indexed="True"
> >>>               Type="ASCII">
> >>>         <Column Name="ThingThatCantCurrentlyBeRepresented"/>
> >>>       </Column>
> >>>     </Column>
> >>>   </Column>
> >>> </Column>
> >>>
> >>> Thrift:
> >>> struct NamePredicate {
> >>>   1: required list<binary> column_names,
> >>> }
> >>> struct SlicePredicate {
> >>>   1: required binary start,
> >>>   2: required binary end,
> >>> }
> >>> struct CountPredicate {
> >>>   1: required struct predicate,
> >>>   2: required i32 count=100,
> >>> }
> >>> struct AndPredicate {
> >>>   1: required Predicate left,
> >>>   2: required Predicate right,
> >>> }
> >>> struct SubColumnsPredicate {
> >>>   1: required Predicate columns,
> >>>   2: required Predicate subcolumns,
> >>> }
> >>> ... OrPredicate, OtherUsefulPredicates ...
> >>> query(predicate, count, consistency_level) # Count here would be total
> >>> count of leaf values returned, whereas CountPredicate specifies a column
> >>> count for a particular sub-slice.
> >>>
> >>> Not fully baked... but I think this could really simplify stuff and make
> >>> it more flexible. Downside is it may give people enough rope to hang
> >>> themselves, but at least the predicate stuff is easily distributable.
> >>>
> >>> I'm thinking I'll play around with implementing some of this stuff
> >>> myself if I have any free time in the near future.
> >>>
> >>> Mike
> >>>
> >>>
> >>> On Wed, May 5, 2010 at 2:04 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >>>
> >>>> Very interesting, thanks!
> >>>>
> >>>> On Wed, May 5, 2010 at 1:31 PM, Ed Anuff <e...@anuff.com> wrote:
> >>>> > Follow-up from last week's discussion, I've been playing around with
> >>>> > a simple column comparator for composite column names that I put up
> >>>> > on github. I'd be interested to hear what people think of this
> >>>> > approach.
> >>>> >
> >>>> > http://github.com/edanuff/CassandraCompositeType
> >>>> >
> >>>> > Ed
> >>>> >
> >>>> > On Wed, Apr 28, 2010 at 12:52 PM, Ed Anuff <e...@anuff.com> wrote:
> >>>> >>
> >>>> >> It might make sense to create a CompositeType subclass of
> >>>> >> AbstractType for the purpose of constructing and comparing these
> >>>> >> types of "composite" column names so that you could more easily do
> >>>> >> that sort of thing rather than having to concatenate into one big
> >>>> >> string.
> >>>> >>
> >>>> >> On Wed, Apr 28, 2010 at 10:25 AM, Mike Malone <m...@simplegeo.com> wrote:
> >>>> >>>
> >>>> >>> The only thing SuperColumns appear to buy you (as someone pointed
> >>>> >>> out to me at the Cassandra meetup - I think it was Eric Florenzano)
> >>>> >>> is that you can use different comparator types for the
> >>>> >>> Super/SubColumns, I guess..? But you should be able to do the same
> >>>> >>> thing by creating your own Column comparator. I guess my point is
> >>>> >>> that SuperColumns are mostly a convenience mechanism, as far as I
> >>>> >>> can tell.
> >>>> >>>
> >>>> >>> Mike
> >>>> >
> >>>> >
> >>>>
> >>>>
> >>>> --
> >>>> Jonathan Ellis
> >>>> Project Chair, Apache Cassandra
> >>>> co-founder of Riptano, the source for professional Cassandra support
> >>>> http://riptano.com
> >>>>
> >>>
> >>
> >
>
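
For anyone skimming the quoted thread: the point about composite column names versus "one big string" can be shown with a rough Python sketch. This is only an illustration under my own assumptions. It is not Cassandra's AbstractType API and not the code in Ed's repository; the function names and the ":" separator are made up.

```python
# Hypothetical illustration only: not Cassandra's AbstractType API and not
# the code in the CassandraCompositeType repository.

def concat_name(parts):
    """Join components into one string (the 'one big string' approach)."""
    return ":".join(str(p) for p in parts)


def composite_key(parts):
    """Keep components separate so each one compares by its own type."""
    return tuple(parts)


names = [("user", 10), ("user", 9), ("user", 100)]

# Sorting by the concatenated string compares the numeric part
# character by character, so 100 sorts before 9.
by_string = sorted(names, key=concat_name)

# Sorting by the tuple compares the first component as text and the
# second as an integer, so 9 < 10 < 100.
by_component = sorted(names, key=composite_key)

print(by_string)
print(by_component)
```

The two orderings differ, which is roughly why a per-component comparator (or separate comparator types for Super/SubColumns) matters once a name has more than one typed part.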