Re: Cassandra has moved to Git
On Thu, Dec 29, 2011 at 12:08 AM, Dave Brosius wrote:
> On 12/28/2011 02:55 PM, Eric Evans wrote:
>>
>> While this is something we had talked about for ages, the actual
>> switch-over happened rather abruptly, and Cassandra's canonical
>> repository is now hosted in Git.
>>
>> For instructions on getting started, see
>> https://git-wip-us.apache.org. We've also started putting random
>> administrivia in the wiki at
>> http://wiki.apache.org/cassandra/GitTransition.
>>
>> The Github mirror (http://github.com/apache/cassandra) hasn't been
>> seeing updates since the move, but that will be fixed at some point.
>> The important thing is that they share identical histories, so new (or
>> existing) forks are forward-compatible.
>>
>> There are a few outstanding items being worked on (CI systems for
>> example), but if you notice something that's been missed don't
>> hesitate to speak up. The website will be updated as soon as SVN is
>> unlocked.
>>
>> There are also some matters of work-flow or process that we need to
>> hash out. For example, how do we handle reviews now? Do we
>> continue to mandate/recommend/allow rebasing?
>>
>> Thoughts?
>>
> doing
>
> git clone http://git-wip-us.apache.org/repos/asf/cassandra.git cassandra
>
> proceeded as a normal clone until the end when i received
>
> warning: remote HEAD refers to nonexistent ref, unable to checkout.
>
> any ideas what i'm doing wrong?

That's because HEAD points to master by default. Just checkout trunk
and you should be OK. See:
http://wiki.apache.org/cassandra/GitTransition

We'll get this fixed.

--
Eric Evans
Acunu | http://www.acunu.com | @acunu
Re: Cassandra has moved to Git
On 12/28/2011 02:55 PM, Eric Evans wrote:
> While this is something we had talked about for ages, the actual
> switch-over happened rather abruptly, and Cassandra's canonical
> repository is now hosted in Git.
>
> For instructions on getting started, see
> https://git-wip-us.apache.org. We've also started putting random
> administrivia in the wiki at
> http://wiki.apache.org/cassandra/GitTransition.
>
> The Github mirror (http://github.com/apache/cassandra) hasn't been
> seeing updates since the move, but that will be fixed at some point.
> The important thing is that they share identical histories, so new (or
> existing) forks are forward-compatible.
>
> There are a few outstanding items being worked on (CI systems for
> example), but if you notice something that's been missed don't
> hesitate to speak up. The website will be updated as soon as SVN is
> unlocked.
>
> There are also some matters of work-flow or process that we need to
> hash out. For example, how do we handle reviews now? Do we
> continue to mandate/recommend/allow rebasing?
>
> Thoughts?

doing

git clone http://git-wip-us.apache.org/repos/asf/cassandra.git cassandra

proceeded as a normal clone until the end when i received

warning: remote HEAD refers to nonexistent ref, unable to checkout.

any ideas what i'm doing wrong?
Re: Cassandra has moved to Git
git://git.apache.org/cassandra.git

this still works?
Re: CQL support for compound columns
I've updated the wiki page at
http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
Background section that hopefully clears up where I'm going with this
sparse/dense business.

Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
implicitly using the first element of PRIMARY KEY as the row key. We
could make it explicit with another WITH option to the TRANSPOSED
clause:

{{{
CREATE TABLE timeline (
    user_id int,
    posted_at uuid,
    column string,
    value blob,
    PRIMARY KEY(user_id, posted_at)
) TRANSPOSED WITH ROW KEY(user_id)
}}}

This makes things more verbose (this would be a required clause) but
I'm okay with that if consensus is that being explicit here is better.
Re: Cassandra has moved to Git
On Thu, Dec 29, 2011 at 9:18 AM, Eric Evans wrote:
> On Thu, Dec 29, 2011 at 12:08 AM, Dave Brosius wrote:
>> doing
>>
>> git clone http://git-wip-us.apache.org/repos/asf/cassandra.git cassandra
>>
>> proceeded as a normal clone until the end when i received
>>
>> warning: remote HEAD refers to nonexistent ref, unable to checkout.
>>
>> any ideas what i'm doing wrong?
>
> That's because HEAD points to master by default. Just checkout trunk
> and you should be OK. See:
> http://wiki.apache.org/cassandra/GitTransition
>
> We'll get this fixed.

FYI: https://issues.apache.org/jira/browse/INFRA-4258

--
Eric Evans
Acunu | http://www.acunu.com | @acunu
Re: Cassandra has moved to Git
On Thu, Dec 29, 2011 at 11:56 AM, Radim Kolar wrote:
> git://git.apache.org/cassandra.git
>
> this still works?

I'm not sure what the status of this is, or what the future holds for
it. I would stick with http://git-wip-us.apache.org to be on the
safe side.

--
Eric Evans
Acunu | http://www.acunu.com | @acunu
Re: CQL support for compound columns
On Thu, Dec 29, 2011 at 12:04 PM, Jonathan Ellis wrote:
> I've updated the wiki page at
> http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
> Background section that hopefully clears up where I'm going with this
> sparse/dense business.
>
> Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
> implicitly using the first element of PRIMARY KEY as the row key. We
> could make it explicit with another WITH option to the TRANSPOSED
> clause:
>
> {{{
> CREATE TABLE timeline (
>     user_id int,
>     posted_at uuid,
>     column string,
>     value blob,
>     PRIMARY KEY(user_id, posted_at)
> ) TRANSPOSED WITH ROW KEY(user_id)
> }}}
>
> This makes things more verbose (this would be a required clause) but
> I'm okay with that if consensus is that being explicit here is better.

I think that was a reaction to an earlier iteration. Assuming that the
only place where order matters is in that primary key definition, then
I think it makes sense without the "... WITH ROW KEY..." bit.

--
Eric Evans
Acunu | http://www.acunu.com | @acunu
Re: CQL support for compound columns
On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis wrote:
> Gamma proposal update:
>
> The more I think about it the less happy I am with omitting support
> for sparse columns. Remember that dense composites may only be
> inserted and deleted, not updated, since they are just a tuple of
> values with "column names" determined by schema and/or convention.
>
> I think we can support sparse columns well in a way that improves the
> conceptual integrity for the dense composites as well:
>
> {code}
> -- "column" and "value" are sparse; a transposed row will be stored as
> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
> -- posted_at, 'value': blob)
> CREATE TABLE timeline (
>     user_id int,
>     posted_at uuid,
>     column string,
>     value blob,
>     PRIMARY KEY(user_id, posted_at)
> ) TRANSPOSED;
>
> -- entire transposed row is stored as a single dense composite column
> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []). Note that the
> -- composite column's value is unused in this case.
> CREATE TABLE events (
>     series text,
>     ts1 int,
>     cat text,
>     subcat text,
>     "1337" uuid,
>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>     PRIMARY KEY(series, ts1, cat, subcat, "1337",
>                 "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
> {code}

Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
(or link to a previous description if I missed it)?

> Thus, columns included in the (transposed) primary key will be
> "dense," and not updateable, which conforms to our existing practice
> that keys are not updateable. Remaining columns will be updateable
> since they will each map to a separate physical column.

--
Eric Evans
Acunu | http://www.acunu.com | @acunu
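
To make the sparse/dense distinction in the quoted proposal concrete,
here is a rough Python sketch of how a transposed row might map onto
physical columns. It only illustrates the mapping described in the
comments of the {code} block, assuming the first PRIMARY KEY element is
the row key. The function names (to_sparse_columns, to_dense_column) and
the dict/tuple representation are invented for this sketch and are not
Cassandra APIs.

```python
def to_sparse_columns(row, primary_key):
    """Sparse case: one physical column per non-key field.

    Assumption: the first primary-key element is the row key, and the
    remaining key elements prefix every composite column name.
    """
    row_key = row[primary_key[0]]
    prefix = tuple(row[k] for k in primary_key[1:])
    columns = {
        prefix + (name,): value
        for name, value in row.items()
        if name not in primary_key
    }
    return row_key, columns


def to_dense_column(row, primary_key):
    """Dense case: every field is part of the primary key, so the whole
    row becomes a single composite column name with an unused value."""
    row_key = row[primary_key[0]]
    name = tuple(row[k] for k in primary_key[1:])
    return row_key, {name: b""}


# Hypothetical sample data, loosely following the "timeline" table.
timeline_row = {"user_id": 42, "posted_at": "t1",
                "column": "subject", "value": b"hi"}
sparse_key, sparse_cols = to_sparse_columns(
    timeline_row, ["user_id", "posted_at"])

# Hypothetical sample data, loosely following the "events" table.
events_row = {"series": "s", "ts1": 1, "cat": "c", "subcat": "sc"}
dense_key, dense_cols = to_dense_column(
    events_row, ["series", "ts1", "cat", "subcat"])
```

In the sparse case each non-key field lands in its own physical column,
so it can be overwritten independently. In the dense case all the data
lives in the column name, so changing any field produces a different
column, which is consistent with the proposal's point that dense
composites can only be inserted and deleted.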
Re: CQL support for compound columns
That's to allow defining column names that are not text/utf8. So you
could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
actual 128-bit uuid binary value internally, not its string
representation. Put another way, this would affect the CqlMetadata
name_types map.

However, we already have the "column names are always strings"
limitation with existing CQL DDL so it probably makes more sense to
consider it separately from transposition.

On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans wrote:
> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis wrote:
>> Gamma proposal update:
>>
>> The more I think about it the less happy I am with omitting support
>> for sparse columns. Remember that dense composites may only be
>> inserted and deleted, not updated, since they are just a tuple of
>> values with "column names" determined by schema and/or convention.
>>
>> I think we can support sparse columns well in a way that improves the
>> conceptual integrity for the dense composites as well:
>>
>> {code}
>> -- "column" and "value" are sparse; a transposed row will be stored as
>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>> -- posted_at, 'value': blob)
>> CREATE TABLE timeline (
>>     user_id int,
>>     posted_at uuid,
>>     column string,
>>     value blob,
>>     PRIMARY KEY(user_id, posted_at)
>> ) TRANSPOSED;
>>
>> -- entire transposed row is stored as a single dense composite column
>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []). Note that the
>> -- composite column's value is unused in this case.
>> CREATE TABLE events (
>>     series text,
>>     ts1 int,
>>     cat text,
>>     subcat text,
>>     "1337" uuid,
>>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>     PRIMARY KEY(series, ts1, cat, subcat, "1337",
>>                 "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>> {code}
>
> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
> (or link to a previous description if I missed it)?
>
>> Thus, columns included in the (transposed) primary key will be
>> "dense," and not updateable, which conforms to our existing practice
>> that keys are not updateable. Remaining columns will be updateable
>> since they will each map to a separate physical column.
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
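
The difference between a column name held as a UUID and one held as its
string form can be seen with Python's standard uuid module. This is only
an illustration of the two representations; it does not touch Cassandra
or the CqlMetadata name_types map, and the variable names are made up
for the example.

```python
import uuid

name_text = "92d21d0a-d6cb-437c-9d3f-b67aa733a19f"

# As text/utf8, the column name is the 36-character string itself.
as_utf8 = name_text.encode("utf-8")

# As a uuid-typed name, it is the 16-byte (128-bit) binary value.
as_uuid = uuid.UUID(name_text).bytes

print(len(as_utf8))  # 36
print(len(as_uuid))  # 16
```

The two byte strings are different values, so a comparator that treats
names as UUIDs would presumably see a different column than one that
treats the same characters as text.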
Re: CQL support for compound columns
On Thu, Dec 29, 2011 at 3:44 PM, Jonathan Ellis wrote:
> That's to allow defining column names that are not text/utf8. So you
> could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
> actual 128-bit uuid binary value internally, not its string
> representation. Put another way, this would affect the CqlMetadata
> name_types map.
>
> However, we already have the "column names are always strings"
> limitation with existing CQL DDL so it probably makes more sense to
> consider it separately from transposition.

Right, and to get a jump on that bikeshedding I'd propose that look
something like:

CREATE TABLE test (
    int(10) text,
    uuid(92d21d0a-d6cb-437c-9d3f-b67aa733a19f) bigint
)

or...

CREATE TABLE test (
    (int)10 text,
    (uuid)92d21d0a-d6cb-437c-9d3f-b67aa733a19f bigint
)

But I digress, that's probably best left for another issue and another
time. :)

> On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans wrote:
>> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis wrote:
>>> Gamma proposal update:
>>>
>>> The more I think about it the less happy I am with omitting support
>>> for sparse columns. Remember that dense composites may only be
>>> inserted and deleted, not updated, since they are just a tuple of
>>> values with "column names" determined by schema and/or convention.
>>>
>>> I think we can support sparse columns well in a way that improves the
>>> conceptual integrity for the dense composites as well:
>>>
>>> {code}
>>> -- "column" and "value" are sparse; a transposed row will be stored as
>>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>>> -- posted_at, 'value': blob)
>>> CREATE TABLE timeline (
>>>     user_id int,
>>>     posted_at uuid,
>>>     column string,
>>>     value blob,
>>>     PRIMARY KEY(user_id, posted_at)
>>> ) TRANSPOSED;
>>>
>>> -- entire transposed row is stored as a single dense composite column
>>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []). Note that the
>>> -- composite column's value is unused in this case.
>>> CREATE TABLE events (
>>>     series text,
>>>     ts1 int,
>>>     cat text,
>>>     subcat text,
>>>     "1337" uuid,
>>>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>>     PRIMARY KEY(series, ts1, cat, subcat, "1337",
>>>                 "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>>>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>>> {code}
>>
>> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
>> (or link to a previous description if I missed it)?

--
Eric Evans
Acunu | http://www.acunu.com | @acunu
Re: CQL support for compound columns
https://issues.apache.org/jira/browse/CASSANDRA-3685

On Thu, Dec 29, 2011 at 6:34 PM, Eric Evans wrote:
> On Thu, Dec 29, 2011 at 3:44 PM, Jonathan Ellis wrote:
>> That's to allow defining column names that are not text/utf8. So you
>> could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
>> actual 128-bit uuid binary value internally, not its string
>> representation. Put another way, this would affect the CqlMetadata
>> name_types map.
>>
>> However, we already have the "column names are always strings"
>> limitation with existing CQL DDL so it probably makes more sense to
>> consider it separately from transposition.
>
> Right, and to get a jump on that bikeshedding I'd propose that look
> something like:
>
> CREATE TABLE test (
>     int(10) text,
>     uuid(92d21d0a-d6cb-437c-9d3f-b67aa733a19f) bigint
> )
>
> or...
>
> CREATE TABLE test (
>     (int)10 text,
>     (uuid)92d21d0a-d6cb-437c-9d3f-b67aa733a19f bigint
> )
>
> But I digress, that's probably best left for another issue and another
> time. :)
>
>> On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans wrote:
>>> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis wrote:
>>>> Gamma proposal update:
>>>>
>>>> The more I think about it the less happy I am with omitting support
>>>> for sparse columns. Remember that dense composites may only be
>>>> inserted and deleted, not updated, since they are just a tuple of
>>>> values with "column names" determined by schema and/or convention.
>>>>
>>>> I think we can support sparse columns well in a way that improves the
>>>> conceptual integrity for the dense composites as well:
>>>>
>>>> {code}
>>>> -- "column" and "value" are sparse; a transposed row will be stored as
>>>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>>>> -- posted_at, 'value': blob)
>>>> CREATE TABLE timeline (
>>>>     user_id int,
>>>>     posted_at uuid,
>>>>     column string,
>>>>     value blob,
>>>>     PRIMARY KEY(user_id, posted_at)
>>>> ) TRANSPOSED;
>>>>
>>>> -- entire transposed row is stored as a single dense composite column
>>>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []). Note that the
>>>> -- composite column's value is unused in this case.
>>>> CREATE TABLE events (
>>>>     series text,
>>>>     ts1 int,
>>>>     cat text,
>>>>     subcat text,
>>>>     "1337" uuid,
>>>>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>>>     PRIMARY KEY(series, ts1, cat, subcat, "1337",
>>>>                 "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>>>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>>>>     "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>>>> {code}
>>>
>>> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
>>> (or link to a previous description if I missed it)?
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com