I have to disagree about the naming of things. The name of something isn't just a literal identifier. It affects the way people think about it. For new users, the whole naming thing has been a persistent barrier.
As for your suggestions, I'm all for simplifying the "how it works" part down to a more general set of operations. I'm not sure it's a good idea to require users to think in terms of building up a fluffy query structure just to thread it through the needle of an API, even for the simplest of queries. At some point, the level of generic boilerplate takes away from the semantic hand rails that developers like. So I guess I'm suggesting that "how it works" and "how we use it" are not always exactly the same. At least they should both hinge on a common conceptual model, which is where the naming becomes an important anchoring point.

Jonathan

On Mon, May 10, 2010 at 11:37 AM, Mike Malone <m...@simplegeo.com> wrote:
> Maybe... but honestly, it doesn't affect the architecture or interface at
> all. I'm more interested in thinking about how the system should work than
> what things are called. Naming things are important, but that can happen
> later.
> Does anyone have any thoughts or comments on the architecture I suggested
> earlier?
>
> Mike
>
> On Mon, May 10, 2010 at 8:36 AM, Schubert Zhang <zson...@gmail.com> wrote:
>>
>> Yes, the "column" here is not appropriate.
>> Maybe we need not to create new terms, in Google's Bigtable, the term
>> "qualifier" is a good one.
>>
>> On Thu, May 6, 2010 at 3:04 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>>
>>> That would be a good time to get rid of the confusing "column" term,
>>> which incorrectly suggests a two-dimensional tabular structure.
>>>
>>> Suggestions:
>>>
>>> 1. A hypercube (or hypocube, if only two dimensions): replace "key" and
>>> "column" with "1st dimension", "2nd dimension", etc.
>>>
>>> 2. A file system: replace "key" and "column" with "directory" and
>>> "subdirectory"
>>>
>>> 3. A tuple tree: "Column family" replaced by top-level tuple, whose value
>>> is the set of keys, whose value is the set of supercolumns of the key, whose
>>> value is the set of columns for the supercolumn, etc.
>>>
>>> 4. Etc.
>>>
>>> On Thu, May 6, 2010 at 2:28 AM, Mike Malone <m...@simplegeo.com> wrote:
>>>>
>>>> Nice, Ed, we're doing something very similar but less generic.
>>>> Now replace all of the various methods for querying with a simple query
>>>> interface that takes a Predicate, allow the user to specify (in
>>>> storage-conf) which levels of the nested Columns should be indexed, and
>>>> completely remove Comparators and have people subclass Column / implement
>>>> IColumn and we'd really be on to something ;).
>>>> Mock storage-conf.xml:
>>>> <Column Name="ThingThatsNowKey" Indexed="True"
>>>> ClusterPartitioned="True" Type="UTF8">
>>>>   <Column Name="ThingThatsNowColumnFamily" DiskPartitioned="True"
>>>>   Type="UTF8">
>>>>     <Column Name="ThingThatsNowSuperColumnName" Type="Long">
>>>>       <Column Name="ThingThatsNowColumnName" Indexed="True"
>>>>       Type="ASCII">
>>>>         <Column Name="ThingThatCantCurrentlyBeRepresented"/>
>>>>       </Column>
>>>>     </Column>
>>>>   </Column>
>>>> </Column>
>>>> Thrift:
>>>> struct NamePredicate {
>>>>   1: required list<binary> column_names,
>>>> }
>>>> struct SlicePredicate {
>>>>   1: required binary start,
>>>>   2: required binary end,
>>>> }
>>>> struct CountPredicate {
>>>>   1: required struct predicate,
>>>>   2: required i32 count=100,
>>>> }
>>>> struct AndPredicate {
>>>>   1: required Predicate left,
>>>>   2: required Predicate right,
>>>> }
>>>> struct SubColumnsPredicate {
>>>>   1: required Predicate columns,
>>>>   2: required Predicate subcolumns,
>>>> }
>>>> ... OrPredicate, OtherUsefulPredicates ...
>>>> query(predicate, count, consistency_level) # Count here would be total
>>>> count of leaf values returned, whereas CountPredicate specifies a column
>>>> count for a particular sub-slice.
>>>> Not fully baked... but I think this could really simplify stuff and make
>>>> it more flexible. Downside is it may give people enough rope to hang
>>>> themselves, but at least the predicate stuff is easily distributable.
>>>> I'm thinking I'll play around with implementing some of this stuff
>>>> myself if I have any free time in the near future.
>>>> Mike
>>>>
>>>> On Wed, May 5, 2010 at 2:04 PM, Jonathan Ellis <jbel...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Very interesting, thanks!
>>>>>
>>>>> On Wed, May 5, 2010 at 1:31 PM, Ed Anuff <e...@anuff.com> wrote:
>>>>> > Follow-up from last weeks discussion, I've been playing around with a
>>>>> > simple
>>>>> > column comparator for composite column names that I put up on
>>>>> > github. I'd
>>>>> > be interested to hear what people think of this approach.
>>>>> >
>>>>> > http://github.com/edanuff/CassandraCompositeType
>>>>> >
>>>>> > Ed
>>>>> >
>>>>> > On Wed, Apr 28, 2010 at 12:52 PM, Ed Anuff <e...@anuff.com> wrote:
>>>>> >>
>>>>> >> It might make sense to create a CompositeType subclass of
>>>>> >> AbstractType for
>>>>> >> the purpose of constructing and comparing these types of "composite"
>>>>> >> column
>>>>> >> names so that if you could more easily do that sort of thing rather
>>>>> >> than
>>>>> >> having to concatenate into one big string.
>>>>> >>
>>>>> >> On Wed, Apr 28, 2010 at 10:25 AM, Mike Malone <m...@simplegeo.com>
>>>>> >> wrote:
>>>>> >>>
>>>>> >>> The only thing SuperColumns appear to buy you (as someone pointed
>>>>> >>> out to
>>>>> >>> me at the Cassandra meetup - I think it was Eric Florenzano) is
>>>>> >>> that you can
>>>>> >>> use different comparator types for the Super/SubColumns, I guess..?
>>>>> >>> But you
>>>>> >>> should be able to do the same thing by creating your own Column
>>>>> >>> comparator.
>>>>> >>> I guess my point is that SuperColumns are mostly a convenience
>>>>> >>> mechanism, as
>>>>> >>> far as I can tell.
>>>>> >>> Mike
>>>>> >
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of Riptano, the source for professional Cassandra support
>>>>> http://riptano.com
>>>>
>>>
>>
>
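
To make the "how it works" versus "how we use it" distinction a bit more concrete, here is a rough Python sketch. It is not Cassandra's API and not anyone's actual proposal: the class names NamePredicate, SlicePredicate and AndPredicate are borrowed from the Thrift sketch quoted in this thread, while query() over a plain dict, get_columns() and get_slice() are hypothetical names made up for illustration. The idea is only that one generic predicate entry point can sit underneath thin, well-named helpers, so simple reads don't need hand-built predicate trees.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class NamePredicate:
    """Match columns whose name appears in an explicit list."""
    column_names: List[str]

    def matches(self, name: str) -> bool:
        return name in self.column_names


@dataclass
class SlicePredicate:
    """Match columns whose name lies in [start, end], both ends inclusive."""
    start: str
    end: str

    def matches(self, name: str) -> bool:
        return self.start <= name <= self.end


@dataclass
class AndPredicate:
    """Match columns accepted by both sub-predicates."""
    left: "Predicate"
    right: "Predicate"

    def matches(self, name: str) -> bool:
        return self.left.matches(name) and self.right.matches(name)


Predicate = Union[NamePredicate, SlicePredicate, AndPredicate]


def query(row: Dict[str, str], predicate: Predicate) -> Dict[str, str]:
    """'How it works': a single generic entry point (hypothetical).

    The row is modelled as a plain dict of column name -> value; columns
    are visited in sorted name order and kept if the predicate matches.
    """
    return {name: row[name] for name in sorted(row) if predicate.matches(name)}


def get_columns(row: Dict[str, str], *names: str) -> Dict[str, str]:
    """'How we use it': named helper for the common fetch-by-name case."""
    return query(row, NamePredicate(list(names)))


def get_slice(row: Dict[str, str], start: str, end: str) -> Dict[str, str]:
    """'How we use it': named helper for the common range-read case."""
    return query(row, SlicePredicate(start, end))


if __name__ == "__main__":
    row = {"age": "30", "email": "a@example.com", "name": "Ann"}
    print(get_columns(row, "name"))
    print(get_slice(row, "a", "f"))
```

Both helpers are one-liners over query(), so there is still a single conceptual model underneath; the names just give people something to hold on to for the everyday cases, and anything more exotic can still drop down to query() with a composed predicate.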