Re: Is SuperColumn necessary?

Mike Malone Mon, 10 May 2010 10:03:04 -0700

On Mon, May 10, 2010 at 9:44 AM, Stu Hood <stu.h...@rackspace.com> wrote:


> I think that it is 100% ideal: it's what I've been working on implementing
> in #674, #847 and #998. I'm hoping to post a large patchset and docs this
> week, and I'm aiming to get it committed for 0.8.
>
> The work I've been doing doesn't touch the user interface: it only deals
> with the internal changes necessary to make this type of storage possible.
>

Yea, Stu, I've been looking at your github changes. I think we both have a
lot of the same ideas. I'd love to chat more about this stuff sometime.


>
>
> -----Original Message-----
> From: "Mike Malone" <m...@simplegeo.com>
> Sent: Monday, May 10, 2010 11:37am
> To: user@cassandra.apache.org
> Subject: Re: Is SuperColumn necessary?
>
> Maybe... but honestly, it doesn't affect the architecture or interface at
> all. I'm more interested in thinking about how the system should work than
> what things are called. Naming things is important, but that can happen
> later.
>
> Does anyone have any thoughts or comments on the architecture I suggested
> earlier?
>
> Mike
>
> On Mon, May 10, 2010 at 8:36 AM, Schubert Zhang <zson...@gmail.com> wrote:
>
> > Yes, the "column" here is not appropriate.
> > Maybe we need not create new terms; in Google's Bigtable, the term
> > "qualifier" is a good one.
> >
> >
> > On Thu, May 6, 2010 at 3:04 PM, David Boxenhorn <da...@lookin2.com> wrote:
> >
> >> That would be a good time to get rid of the confusing "column" term, which
> >> incorrectly suggests a two-dimensional tabular structure.
> >>
> >> Suggestions:
> >>
> >> 1. A hypercube (or hypocube, if only two dimensions): replace "key" and
> >> "column" with "1st dimension", "2nd dimension", etc.
> >>
> >> 2. A file system: replace "key" and "column" with "directory" and
> >> "subdirectory"
> >>
> >> 3. A tuple tree: "Column family" replaced by top-level tuple, whose value
> >> is the set of keys, whose value is the set of supercolumns of the key, whose
> >> value is the set of columns for the supercolumn, etc.
> >>
> >> 4. Etc.
> >>
> >> On Thu, May 6, 2010 at 2:28 AM, Mike Malone <m...@simplegeo.com> wrote:
> >>
> >>> Nice, Ed, we're doing something very similar but less generic.
> >>>
> >>> Now replace all of the various methods for querying with a simple query
> >>> interface that takes a Predicate, allow the user to specify (in
> >>> storage-conf) which levels of the nested Columns should be indexed, and
> >>> completely remove Comparators and have people subclass Column / implement
> >>> IColumn and we'd really be on to something ;).
> >>>
> >>> Mock storage-conf.xml:
> >>>   <Column Name="ThingThatsNowKey" Indexed="True"
> >>> ClusterPartitioned="True" Type="UTF8">
> >>>     <Column Name="ThingThatsNowColumnFamily" DiskPartitioned="True"
> >>> Type="UTF8">
> >>>       <Column Name="ThingThatsNowSuperColumnName" Type="Long">
> >>>         <Column Name="ThingThatsNowColumnName" Indexed="True"
> >>> Type="ASCII">
> >>>           <Column Name="ThingThatCantCurrentlyBeRepresented"/>
> >>>         </Column>
> >>>       </Column>
> >>>     </Column>
> >>>   </Column>
> >>>
> >>> Thrift:
> >>>   struct NamePredicate {
> >>>     1: required list<binary> column_names,
> >>>   }
> >>>   struct SlicePredicate {
> >>>     1: required binary start,
> >>>     2: required binary end,
> >>>   }
> >>>   struct CountPredicate {
> >>>     1: required Predicate predicate,
> >>>     2: required i32 count=100,
> >>>   }
> >>>   struct AndPredicate {
> >>>     1: required Predicate left,
> >>>     2: required Predicate right,
> >>>   }
> >>>   struct SubColumnsPredicate {
> >>>     1: required Predicate columns,
> >>>     2: required Predicate subcolumns,
> >>>   }
> >>>   ... OrPredicate, OtherUsefulPredicates ...
> >>>   query(predicate, count, consistency_level) # Count here would be total
> >>> count of leaf values returned, whereas CountPredicate specifies a column
> >>> count for a particular sub-slice.
> >>>
> >>> Not fully baked... but I think this could really simplify stuff and make
> >>> it more flexible. Downside is it may give people enough rope to hang
> >>> themselves, but at least the predicate stuff is easily distributable.
> >>>
> >>> I'm thinking I'll play around with implementing some of this stuff myself
> >>> if I have any free time in the near future.
> >>>
> >>> Mike
> >>>
> >>>
> >>> On Wed, May 5, 2010 at 2:04 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >>>
> >>>> Very interesting, thanks!
> >>>>
> >>>> On Wed, May 5, 2010 at 1:31 PM, Ed Anuff <e...@anuff.com> wrote:
> >>>> > Follow-up from last week's discussion, I've been playing around with a simple
> >>>> > column comparator for composite column names that I put up on github. I'd
> >>>> > be interested to hear what people think of this approach.
> >>>> >
> >>>> > http://github.com/edanuff/CassandraCompositeType
> >>>> >
> >>>> > Ed
> >>>> >
> >>>> > On Wed, Apr 28, 2010 at 12:52 PM, Ed Anuff <e...@anuff.com> wrote:
> >>>> >>
> >>>> >> It might make sense to create a CompositeType subclass of AbstractType for
> >>>> >> the purpose of constructing and comparing these types of "composite" column
> >>>> >> names so that you could more easily do that sort of thing rather than
> >>>> >> having to concatenate into one big string.
> >>>> >>
> >>>> >> On Wed, Apr 28, 2010 at 10:25 AM, Mike Malone <m...@simplegeo.com> wrote:
> >>>> >>>
> >>>> >>> The only thing SuperColumns appear to buy you (as someone pointed out to
> >>>> >>> me at the Cassandra meetup - I think it was Eric Florenzano) is that you can
> >>>> >>> use different comparator types for the Super/SubColumns, I guess..? But you
> >>>> >>> should be able to do the same thing by creating your own Column comparator.
> >>>> >>> I guess my point is that SuperColumns are mostly a convenience mechanism, as
> >>>> >>> far as I can tell.
> >>>> >>> Mike
> >>>> >
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Jonathan Ellis
> >>>> Project Chair, Apache Cassandra
> >>>> co-founder of Riptano, the source for professional Cassandra support
> >>>> http://riptano.com
> >>>>
> >>>
> >>>
> >>
> >
>
>
>
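
For anyone trying to picture the predicate interface from the quoted mock
Thrift, here is a rough Python sketch of how the pieces might compose. It is
only an illustration under assumptions, not Cassandra code: nested dicts stand
in for nested Columns, the select() method is a hypothetical name, slices are
assumed to be inclusive at both ends, and keys are compared as plain strings
rather than through a comparator.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical in-memory model: a nested dict stands in for nested Columns.
# Class names mirror the mock Thrift structs; none of this is Cassandra's API.


@dataclass
class NamePredicate:
    column_names: List[str]

    def select(self, columns: Dict[str, Any]) -> Dict[str, Any]:
        # Keep only the named columns that actually exist.
        return {k: columns[k] for k in self.column_names if k in columns}


@dataclass
class SlicePredicate:
    start: str
    end: str

    def select(self, columns: Dict[str, Any]) -> Dict[str, Any]:
        # Assumption: both bounds are inclusive, keys sort as plain strings.
        return {k: v for k, v in sorted(columns.items())
                if self.start <= k <= self.end}


@dataclass
class CountPredicate:
    predicate: Any
    count: int = 100

    def select(self, columns: Dict[str, Any]) -> Dict[str, Any]:
        # Limit the wrapped predicate's result to the first `count` columns.
        selected = self.predicate.select(columns)
        return dict(list(selected.items())[:self.count])


@dataclass
class SubColumnsPredicate:
    columns: Any
    subcolumns: Any

    def select(self, columns: Dict[str, Any]) -> Dict[str, Any]:
        # Apply one predicate at this level, another to each nested level.
        return {k: self.subcolumns.select(v)
                for k, v in self.columns.select(columns).items()
                if isinstance(v, dict)}


# Made-up sample data: key -> column name -> value.
row = {
    "alice": {"2010-05-01": "a", "2010-05-02": "b", "2010-05-03": "c"},
    "bob": {"2010-05-01": "d"},
    "carol": {"2010-05-02": "e"},
}

pred = SubColumnsPredicate(
    columns=NamePredicate(["alice", "bob"]),
    subcolumns=CountPredicate(SlicePredicate("2010-05-02", "2010-05-31"), count=1),
)
result = pred.select(row)
print(result)
```

Here the outer NamePredicate picks two keys, and the CountPredicate caps each
nested slice at one column, which is the per-sub-slice count described in the
quoted message (as opposed to a total count of leaf values).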
