On Mon, May 10, 2010 at 9:52 AM, Jonathan Shook <jsh...@gmail.com> wrote:
> I have to disagree about the naming of things. The name of something
> isn't just a literal identifier. It affects the way people think about
> it. For new users, the whole naming thing has been a persistent
> barrier.
>

I'm saying we shouldn't be worried too much about coming up with names and
analogies until we've decided what it is we're naming.

> As for your suggestions, I'm all for simplifying or generalizing the
> "how it works" part down to a more generalized set of operations. I'm
> not sure it's a good idea to require users to think in terms building
> up a fluffy query structure just to thread it through a needle of an
> API, even for the simplest of queries. At some point, the level of
> generic boilerplate takes away from the semantic hand rails that
> developers like. So I guess I'm suggesting that "how it works" and
> "how we use it" are not always exactly the same. At least they should
> both hinge on a common conceptual model, which is where the naming
> becomes an important anchoring point.
>

If things are done properly, client libraries could expose simplified query
interfaces without much effort. Most ORMs these days work by building a
propositional directed acyclic graph that's serialized to SQL. This would
work the same way, but it wouldn't be converted into a 4GL.

Mike

>
> Jonathan
>
> On Mon, May 10, 2010 at 11:37 AM, Mike Malone <m...@simplegeo.com> wrote:
> > Maybe... but honestly, it doesn't affect the architecture or interface at
> > all. I'm more interested in thinking about how the system should work than
> > what things are called. Naming things are important, but that can happen
> > later.
> > Does anyone have any thoughts or comments on the architecture I suggested
> > earlier?
> >
> > Mike
> >
> > On Mon, May 10, 2010 at 8:36 AM, Schubert Zhang <zson...@gmail.com> wrote:
> >>
> >> Yes, the "column" here is not appropriate.
> >> Maybe we need not to create new terms, in Google's Bigtable, the term
> >> "qualifier" is a good one.
> >>
> >> On Thu, May 6, 2010 at 3:04 PM, David Boxenhorn <da...@lookin2.com> wrote:
> >>>
> >>> That would be a good time to get rid of the confusing "column" term,
> >>> which incorrectly suggests a two-dimensional tabular structure.
> >>>
> >>> Suggestions:
> >>>
> >>> 1. A hypercube (or hypocube, if only two dimensions): replace "key" and
> >>> "column" with "1st dimension", "2nd dimension", etc.
> >>>
> >>> 2. A file system: replace "key" and "column" with "directory" and
> >>> "subdirectory"
> >>>
> >>> 3. A tuple tree: "Column family" replaced by top-level tuple, whose value
> >>> is the set of keys, whose value is the set of supercolumns of the key,
> >>> whose value is the set of columns for the supercolumn, etc.
> >>>
> >>> 4. Etc.
> >>>
> >>> On Thu, May 6, 2010 at 2:28 AM, Mike Malone <m...@simplegeo.com> wrote:
> >>>>
> >>>> Nice, Ed, we're doing something very similar but less generic.
> >>>> Now replace all of the various methods for querying with a simple query
> >>>> interface that takes a Predicate, allow the user to specify (in
> >>>> storage-conf) which levels of the nested Columns should be indexed, and
> >>>> completely remove Comparators and have people subclass Column / implement
> >>>> IColumn and we'd really be on to something ;).
> >>>> Mock storage-conf.xml:
> >>>>   <Column Name="ThingThatsNowKey" Indexed="True"
> >>>>           ClusterPartitioned="True" Type="UTF8">
> >>>>     <Column Name="ThingThatsNowColumnFamily" DiskPartitioned="True"
> >>>>             Type="UTF8">
> >>>>       <Column Name="ThingThatsNowSuperColumnName" Type="Long">
> >>>>         <Column Name="ThingThatsNowColumnName" Indexed="True"
> >>>>                 Type="ASCII">
> >>>>           <Column Name="ThingThatCantCurrentlyBeRepresented"/>
> >>>>         </Column>
> >>>>       </Column>
> >>>>     </Column>
> >>>>   </Column>
> >>>> Thrift:
> >>>>   struct NamePredicate {
> >>>>     1: required list<binary> column_names,
> >>>>   }
> >>>>   struct SlicePredicate {
> >>>>     1: required binary start,
> >>>>     2: required binary end,
> >>>>   }
> >>>>   struct CountPredicate {
> >>>>     1: required struct predicate,
> >>>>     2: required i32 count=100,
> >>>>   }
> >>>>   struct AndPredicate {
> >>>>     1: required Predicate left,
> >>>>     2: required Predicate right,
> >>>>   }
> >>>>   struct SubColumnsPredicate {
> >>>>     1: required Predicate columns,
> >>>>     2: required Predicate subcolumns,
> >>>>   }
> >>>>   ... OrPredicate, OtherUsefulPredicates ...
> >>>>   query(predicate, count, consistency_level) # Count here would be total
> >>>>   count of leaf values returned, whereas CountPredicate specifies a column
> >>>>   count for a particular sub-slice.
> >>>> Not fully baked... but I think this could really simplify stuff and make
> >>>> it more flexible. Downside is it may give people enough rope to hang
> >>>> themselves, but at least the predicate stuff is easily distributable.
> >>>> I'm thinking I'll play around with implementing some of this stuff
> >>>> myself if I have any free time in the near future.
> >>>> Mike
> >>>>
> >>>> On Wed, May 5, 2010 at 2:04 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >>>>>
> >>>>> Very interesting, thanks!
> >>>>>
> >>>>> On Wed, May 5, 2010 at 1:31 PM, Ed Anuff <e...@anuff.com> wrote:
> >>>>> > Follow-up from last weeks discussion, I've been playing around with a
> >>>>> > simple column comparator for composite column names that I put up on
> >>>>> > github. I'd be interested to hear what people think of this approach.
> >>>>> >
> >>>>> > http://github.com/edanuff/CassandraCompositeType
> >>>>> >
> >>>>> > Ed
> >>>>> >
> >>>>> > On Wed, Apr 28, 2010 at 12:52 PM, Ed Anuff <e...@anuff.com> wrote:
> >>>>> >>
> >>>>> >> It might make sense to create a CompositeType subclass of
> >>>>> >> AbstractType for the purpose of constructing and comparing these
> >>>>> >> types of "composite" column names so that if you could more easily
> >>>>> >> do that sort of thing rather than having to concatenate into one
> >>>>> >> big string.
> >>>>> >>
> >>>>> >> On Wed, Apr 28, 2010 at 10:25 AM, Mike Malone <m...@simplegeo.com> wrote:
> >>>>> >>>
> >>>>> >>> The only thing SuperColumns appear to buy you (as someone pointed
> >>>>> >>> out to me at the Cassandra meetup - I think it was Eric Florenzano)
> >>>>> >>> is that you can use different comparator types for the
> >>>>> >>> Super/SubColumns, I guess..? But you should be able to do the same
> >>>>> >>> thing by creating your own Column comparator.
> >>>>> >>> I guess my point is that SuperColumns are mostly a convenience
> >>>>> >>> mechanism, as far as I can tell.
> >>>>> >>> Mike
> >>>>> >
> >>>>>
> >>>>> --
> >>>>> Jonathan Ellis
> >>>>> Project Chair, Apache Cassandra
> >>>>> co-founder of Riptano, the source for professional Cassandra support
> >>>>> http://riptano.com
> >>>>
> >>>
> >>
> >
>
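
For anyone who wants to poke at the predicate idea quoted above, here is a
rough Python sketch of how composable predicates might be evaluated against
a single row. It is only an illustration: the class names mirror the mock
Thrift structs from the thread, but the matches() and query() helpers, the
inclusive slice bounds, and the dict-based row are all assumptions made up
for this example, not anything that exists in Cassandra.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class NamePredicate:
    """Matches an explicit list of column names."""
    column_names: List[bytes]


@dataclass(frozen=True)
class SlicePredicate:
    """Matches a contiguous range of column names (bounds assumed inclusive)."""
    start: bytes
    end: bytes


@dataclass(frozen=True)
class AndPredicate:
    """Matches names accepted by both child predicates."""
    left: "Predicate"
    right: "Predicate"


Predicate = Union[NamePredicate, SlicePredicate, AndPredicate]


def matches(pred: Predicate, name: bytes) -> bool:
    """Hypothetical helper: walk the predicate tree for one column name."""
    if isinstance(pred, NamePredicate):
        return name in pred.column_names
    if isinstance(pred, SlicePredicate):
        return pred.start <= name <= pred.end
    if isinstance(pred, AndPredicate):
        return matches(pred.left, name) and matches(pred.right, name)
    raise TypeError(f"unsupported predicate: {pred!r}")


def query(pred: Predicate, row: Dict[bytes, int],
          count: int = 100) -> List[Tuple[bytes, int]]:
    """Hypothetical query(): return up to `count` matching columns in
    byte order. The row is a plain dict standing in for stored columns."""
    results: List[Tuple[bytes, int]] = []
    for name in sorted(row):
        if matches(pred, name):
            results.append((name, row[name]))
            if len(results) >= count:
                break
    return results


if __name__ == "__main__":
    row = {b"a": 1, b"b": 2, b"c": 3, b"d": 4}
    pred = AndPredicate(SlicePredicate(b"b", b"d"),
                        NamePredicate([b"a", b"b", b"d"]))
    print(query(pred, row))
```

Because the predicate is plain data (a small tree of structs), a client
library could build it behind a simpler interface and ship it to the server
as-is, which is the point Mike makes about ORMs building a query graph.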