Re: Cassandra has moved to Git

2011-12-29 Thread Eric Evans
On Thu, Dec 29, 2011 at 12:08 AM, Dave Brosius  wrote:
> On 12/28/2011 02:55 PM, Eric Evans wrote:
>>
>> While this is something we had talked about for ages, the actual
>> switch-over happened rather abruptly, and Cassandra's canonical
>> repository is now hosted in Git.
>>
>> For instructions on getting started, see
>> https://git-wip-us.apache.org.  We've also started putting random
>> administrivia in the wiki at
>> http://wiki.apache.org/cassandra/GitTransition.
>>
>> The GitHub mirror (http://github.com/apache/cassandra) hasn't been
>> seeing updates since the move, but that will be fixed at some point.
>> The important thing is that they share identical histories, so new (or
>> existing) forks are forward-compatible.
>>
>> There are a few outstanding items being worked on (CI systems for
>> example), but if you notice something that's been missed don't
>> hesitate to speak up.  The website will be updated as soon as SVN is
>> unlocked.
>>
>> There are also some matters of work-flow or process that we need to
>> hash out.  For example, how do we handle reviews now?  Do we
>> continue to mandate/recommend/allow rebasing?
>>
>> Thoughts?
>>
> doing
>
> git clone http://git-wip-us.apache.org/repos/asf/cassandra.git cassandra
>
> proceeded as a normal clone until the end when i received
>
> warning: remote HEAD refers to nonexistent ref, unable to checkout.
>
> any ideas what i'm doing wrong?

That's because HEAD points to master by default, and this repository
has no master branch.  Just check out trunk and you should be OK.  See:
http://wiki.apache.org/cassandra/GitTransition

We'll get this fixed.
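
For anyone hitting the same warning, here is a minimal sketch of the
workaround.  It reproduces the situation against a throwaway local
repository rather than the Apache server, so the scratch paths and the
seed commit are made up for illustration; only the final
"git checkout trunk" is the step described above.

```shell
#!/bin/sh
set -e

# Scratch area; every path below is hypothetical and exists only for this demo.
work=$(mktemp -d)

# Stand-in for the server: a bare repository whose HEAD names "master"...
git init -q --bare "$work/remote.git"
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/master

# ...but whose only branch is "trunk".
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/trunk
git -C "$work/seed" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "seed commit"
git -C "$work/seed" push -q "$work/remote.git" trunk

# The clone itself succeeds, but nothing is checked out because the remote
# HEAD refers to a branch that does not exist.
git clone -q "$work/remote.git" "$work/cassandra"

# The workaround: check out trunk explicitly.
git -C "$work/cassandra" checkout -q trunk
branch=$(git -C "$work/cassandra" rev-parse --abbrev-ref HEAD)
echo "$branch"
```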

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: Cassandra has moved to Git

2011-12-29 Thread Dave Brosius

On 12/28/2011 02:55 PM, Eric Evans wrote:

> While this is something we had talked about for ages, the actual
> switch-over happened rather abruptly, and Cassandra's canonical
> repository is now hosted in Git.
>
> For instructions on getting started, see
> https://git-wip-us.apache.org.  We've also started putting random
> administrivia in the wiki at
> http://wiki.apache.org/cassandra/GitTransition.
>
> The GitHub mirror (http://github.com/apache/cassandra) hasn't been
> seeing updates since the move, but that will be fixed at some point.
> The important thing is that they share identical histories, so new (or
> existing) forks are forward-compatible.
>
> There are a few outstanding items being worked on (CI systems for
> example), but if you notice something that's been missed don't
> hesitate to speak up.  The website will be updated as soon as SVN is
> unlocked.
>
> There are also some matters of work-flow or process that we need to
> hash out.  For example, how do we handle reviews now?  Do we
> continue to mandate/recommend/allow rebasing?
>
> Thoughts?


doing

git clone http://git-wip-us.apache.org/repos/asf/cassandra.git cassandra

proceeded as a normal clone until the end when i received

warning: remote HEAD refers to nonexistent ref, unable to checkout.

any ideas what i'm doing wrong?




Re: Cassandra has moved to Git

2011-12-29 Thread Radim Kolar

git://git.apache.org/cassandra.git

Does this still work?


Re: CQL support for compound columns

2011-12-29 Thread Jonathan Ellis
I've updated the wiki page at
http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
Background section that hopefully clears up where I'm going with this
sparse/dense business.

Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
implicitly using the first element of PRIMARY KEY as the row key.  We
could make it explicit with another WITH option to the TRANSPOSED
clause:

{{{
CREATE TABLE timeline (
   user_id int,
   posted_at uuid,
   column string,
   value blob,
   PRIMARY KEY(user_id, posted_at)
) TRANSPOSED WITH ROW KEY(user_id)
}}}

This makes things more verbose (this would be a required clause) but
I'm okay with that if consensus is that being explicit here is better.


Re: Cassandra has moved to Git

2011-12-29 Thread Eric Evans
On Thu, Dec 29, 2011 at 9:18 AM, Eric Evans  wrote:
> On Thu, Dec 29, 2011 at 12:08 AM, Dave Brosius  wrote:
>> doing
>>
>> git clone http://git-wip-us.apache.org/repos/asf/cassandra.git cassandra
>>
>> proceeded as a normal clone until the end when i received
>>
>> warning: remote HEAD refers to nonexistent ref, unable to checkout.
>>
>> any ideas what i'm doing wrong?
>
> That's because HEAD points to master by default, and this repository
> has no master branch.  Just check out trunk and you should be OK.  See:
> http://wiki.apache.org/cassandra/GitTransition
>
> We'll get this fixed.

FYI: https://issues.apache.org/jira/browse/INFRA-4258

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: Cassandra has moved to Git

2011-12-29 Thread Eric Evans
On Thu, Dec 29, 2011 at 11:56 AM, Radim Kolar  wrote:
> git://git.apache.org/cassandra.git
>
> Does this still work?

I'm not sure what the status of this is, or what the future holds for
it.  I would stick with http://git-wip-us.apache.org to be on the
safe side.
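
If you already have a clone pointed at the git.apache.org URL, one way
to follow that advice without re-cloning might be to repoint origin.
A rough sketch, run on a throwaway local repository: the scratch path
is hypothetical, and the /repos/asf/cassandra.git path is assumed from
the clone command quoted earlier in this thread.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for an existing clone (hypothetical path).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" remote add origin git://git.apache.org/cassandra.git

# Repoint origin at the canonical repository.
git -C "$repo" remote set-url origin \
    http://git-wip-us.apache.org/repos/asf/cassandra.git

# Confirm which URL later fetches will use.
origin_url=$(git -C "$repo" config --get remote.origin.url)
echo "$origin_url"
```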

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: CQL support for compound columns

2011-12-29 Thread Eric Evans
On Thu, Dec 29, 2011 at 12:04 PM, Jonathan Ellis  wrote:
> I've updated the wiki page at
> http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
> Background section that hopefully clears up where I'm going with this
> sparse/dense business.
>
> Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
> implicitly using the first element of PRIMARY KEY as the row key.  We
> could make it explicit with another WITH option to the TRANSPOSED
> clause:
>
> {{{
> CREATE TABLE timeline (
>    user_id int,
>    posted_at uuid,
>    column string,
>    value blob,
>    PRIMARY KEY(user_id, posted_at)
> ) TRANSPOSED WITH ROW KEY(user_id)
> }}}
>
> This makes things more verbose (this would be a required clause) but
> I'm okay with that if consensus is that being explicit here is better.

I think that was a reaction to an earlier iteration.  Assuming that
the only place where order matters is in that primary key definition,
then I think it makes sense without the "... WITH ROW KEY..." bit.



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: CQL support for compound columns

2011-12-29 Thread Eric Evans
On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis  wrote:
> Gamma proposal update:
>
> The more I think about it the less happy I am with omitting support
> for sparse columns.  Remember that dense composites may only be
> inserted and deleted, not updated, since they are just a tuple of
> values with "column names" determined by schema and/or convention.
>
> I think we can support sparse columns well in a way that improves the
> conceptual integrity for the dense composites as well:
>
> {code}
> -- "column" and "value" are sparse; a transposed row will be stored as
> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
> -- posted_at, 'value': blob)
> CREATE TABLE timeline (
>   user_id int,
>   posted_at uuid,
>   column string,
>   value blob,
>   PRIMARY KEY(user_id, posted_at)
> ) TRANSPOSED;
>
> -- entire transposed row is stored as a single dense composite column
> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
> -- composite column's value is unused in this case.
> CREATE TABLE events (
>   series text,
>   ts1 int,
>   cat text,
>   subcat text,
>   "1337" uuid,
>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
> {code}

Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
(or link to a previous description if I missed it)?

> Thus, columns included in the (transposed) primary key will be
> "dense," and not updateable, which conforms to our existing practice
> that keys are not updateable.  Remaining columns will be updateable
> since they will each map to a separate physical column.



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: CQL support for compound columns

2011-12-29 Thread Jonathan Ellis
That's to allow defining column names that are not text/utf8.  So you
could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
actual 128-bit uuid binary value internally, not its string
representation.  Put another way, this would affect the CqlMetadata
name_types map.

However, we already have the "column names are always strings"
limitations with existing CQL DDL so it probably makes more sense to
consider it separately from transposition.

On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans  wrote:
> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis  wrote:
>> Gamma proposal update:
>>
>> The more I think about it the less happy I am with omitting support
>> for sparse columns.  Remember that dense composites may only be
>> inserted and deleted, not updated, since they are just a tuple of
>> values with "column names" determined by schema and/or convention.
>>
>> I think we can support sparse columns well in a way that improves the
>> conceptual integrity for the dense composites as well:
>>
>> {code}
>> -- "column" and "value" are sparse; a transposed row will be stored as
>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>> -- posted_at, 'value': blob)
>> CREATE TABLE timeline (
>>   user_id int,
>>   posted_at uuid,
>>   column string,
>>   value blob,
>>   PRIMARY KEY(user_id, posted_at)
>> ) TRANSPOSED;
>>
>> -- entire transposed row is stored as a single dense composite column
>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
>> -- composite column's value is unused in this case.
>> CREATE TABLE events (
>>   series text,
>>   ts1 int,
>>   cat text,
>>   subcat text,
>>   "1337" uuid,
>>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>> {code}
>
> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
> (or link to a previous description if I missed it)?
>
>> Thus, columns included in the (transposed) primary key will be
>> "dense," and not updateable, which conforms to our existing practice
>> that keys are not updateable.  Remaining columns will be updateable
>> since they will each map to a separate physical column.
>
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: CQL support for compound columns

2011-12-29 Thread Eric Evans
On Thu, Dec 29, 2011 at 3:44 PM, Jonathan Ellis  wrote:
> That's to allow defining column names that are not text/utf8.  So you
> could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
> actual 128-bit uuid binary value internally, not its string
> representation.  Put another way, this would affect the CqlMetadata
> name_types map.
>
> However, we already have the "column names are always strings"
> limitations with existing CQL DDL so it probably makes more sense to
> consider it separately from transposition.

Right, and to get a jump on that bikeshedding I'd propose that it look
something like:

CREATE TABLE test (
   int(10) text,
   uuid(92d21d0a-d6cb-437c-9d3f-b67aa733a19f) bigint
)

or...

CREATE TABLE test (
   (int)10 text,
   (uuid)92d21d0a-d6cb-437c-9d3f-b67aa733a19f bigint
)

But I digress, that's probably best left for another issue and another time. :)


> On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans  wrote:
>> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis  wrote:
>>> Gamma proposal update:
>>>
>>> The more I think about it the less happy I am with omitting support
>>> for sparse columns.  Remember that dense composites may only be
>>> inserted and deleted, not updated, since they are just a tuple of
>>> values with "column names" determined by schema and/or convention.
>>>
>>> I think we can support sparse columns well in a way that improves the
>>> conceptual integrity for the dense composites as well:
>>>
>>> {code}
>>> -- "column" and "value" are sparse; a transposed row will be stored as
>>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>>> -- posted_at, 'value': blob)
>>> CREATE TABLE timeline (
>>>   user_id int,
>>>   posted_at uuid,
>>>   column string,
>>>   value blob,
>>>   PRIMARY KEY(user_id, posted_at)
>>> ) TRANSPOSED;
>>>
>>> -- entire transposed row is stored as a single dense composite column
>>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
>>> -- composite column's value is unused in this case.
>>> CREATE TABLE events (
>>>   series text,
>>>   ts1 int,
>>>   cat text,
>>>   subcat text,
>>>   "1337" uuid,
>>>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>>> {code}
>>
>> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
>> (or link to a previous description if I missed it)?

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: CQL support for compound columns

2011-12-29 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-3685

On Thu, Dec 29, 2011 at 6:34 PM, Eric Evans  wrote:
> On Thu, Dec 29, 2011 at 3:44 PM, Jonathan Ellis  wrote:
>> That's to allow defining column names that are not text/utf8.  So you
>> could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
>> actual 128-bit uuid binary value internally, not its string
>> representation.  Put another way, this would affect the CqlMetadata
>> name_types map.
>>
>> However, we already have the "column names are always strings"
>> limitations with existing CQL DDL so it probably makes more sense to
>> consider it separately from transposition.
>
> Right, and to get a jump on that bikeshedding I'd propose that it look
> something like:
>
> CREATE TABLE test (
>   int(10) text,
>   uuid(92d21d0a-d6cb-437c-9d3f-b67aa733a19f) bigint
> )
>
> or...
>
> CREATE TABLE test (
>   (int)10 text,
>   (uuid)92d21d0a-d6cb-437c-9d3f-b67aa733a19f bigint
> )
>
> But I digress, that's probably best left for another issue and another time. :)
>
>
>> On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans  wrote:
>>> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis  wrote:
>>>> Gamma proposal update:
>>>>
>>>> The more I think about it the less happy I am with omitting support
>>>> for sparse columns.  Remember that dense composites may only be
>>>> inserted and deleted, not updated, since they are just a tuple of
>>>> values with "column names" determined by schema and/or convention.
>>>>
>>>> I think we can support sparse columns well in a way that improves the
>>>> conceptual integrity for the dense composites as well:
>>>>
>>>> {code}
>>>> -- "column" and "value" are sparse; a transposed row will be stored as
>>>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>>>> -- posted_at, 'value': blob)
>>>> CREATE TABLE timeline (
>>>>   user_id int,
>>>>   posted_at uuid,
>>>>   column string,
>>>>   value blob,
>>>>   PRIMARY KEY(user_id, posted_at)
>>>> ) TRANSPOSED;
>>>>
>>>> -- entire transposed row is stored as a single dense composite column
>>>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
>>>> -- composite column's value is unused in this case.
>>>> CREATE TABLE events (
>>>>   series text,
>>>>   ts1 int,
>>>>   cat text,
>>>>   subcat text,
>>>>   "1337" uuid,
>>>>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>>>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
>>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>>>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>>>> {code}
>>>
>>> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
>>> (or link to a previous description if I missed it)?
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com