Re: CQL support for compound columns

Sylvain Lebresne Tue, 03 Jan 2012 10:46:44 -0800

I had complains that my preceding mail was unreadable (thanks gmailfor
fucking my formatting up), so I've posted the same thing with nice
formatting on the JIRA ticket.


--
Sylvain

On Tue, Jan 3, 2012 at 7:08 PM, Sylvain Lebresne <sylv...@datastax.com> wrote:
> Ok, I think I'm warming up to what we're getting at. I would change
> thesyntax of the VALUE() thing however. Instead of:CREATE TABLE
> timeline (   userid int,   posted_at uuid,   body string,   PRIMARY
> KEY(user_id, posted_at),   VALUE(body))I would prefer:CREATE COMPACT
> TABLE timeline (   userid int,   posted_at uuid,   body string,
> PRIMARY KEY(user_id, posted_at),)
> The reasons being that it really influences the implementation layout
> of theCF in C*. Namely, the non-compact CF defined by CREATE TABLE
> timeline (   userid int,   posted_at uuid,   body string,   PRIMARY
> KEY(user_id, posted_at),)would look in C* like:<userid> : {
> <posted_at>:'body' -> <value>}while the COMPACT variant would
> be:<userid> : {    <posted_at> -> <value>}which is using the fact that
> there is only 1 field not part of the key to"optimize" the layout. And
> I believe making the COMPACT part of the CREATEemphasis better that
> it's a property of the definition itself (that cannot bechanged)
> rather than of that specific 'body' field. It also make the rule
> forCOMPACT table rather simple: "a compact table should have only one
> field notpart of the primary key"; you don't have to deal with errors
> like someonedefining two VALUE() for instance.
>
> That being said, I'd like to try to resume where we're at (including
> theCOMPACT change above) and add a few random ideas along the way.
> Please correctme if I've got something wrong.
> I think we have 4 different cases, 2 for 'standard' CF without
> composites:- static CFs (the only case CQL handle really well today)-
> dynamic CFs (wide rows, time series if you prefer)and 2 for CF with
> composite column names:- 'dense' composite (typically time series but
> where the key is naturally  multi-parts)- 'sparse' composite (aka
> super columns)
> Let me try to take an example for which, with how it would
> translateinternally and example queries.
>
> Standard "static" CF--------------------
> "For each user, holds his infos"
> CREATE TABLE users (    userid uuid PRIMARY KEY,    firstname text,
> lastname text,    age int)
> In C*:<userid> : {    'firstname' -> <value>    'lastname' -> <value>
>   'age' -> <value>}
> Query:SELECT firstname, lastname FROM users WHERE userid = '...';
> Standard "dynamic" CF---------------------
> "For each user, keep each url he clicked on with the date of last click"
> CREATE COMPACT TABLE clicks (    userid uuid,    url text,
> timestamp date    PRIMARY KEY (userid, url))
> In C*:<userid> : {    <url> -> <timestamp>}
> Query:SELECT url, timestamp FROM clicks WHERE userid = '...';SELECT
> timestamp FROM clicks WHERE userid = '...' and url = 'http://...';
> 'dense' composite-----------------
> "For each user, keep ip and port from where he connected with the date
> of lastconnection"
> CREATE COMPACT TABLE connections (    userid uuid,    ip binary,
> port int,    timestamp date    PRIMARY KEY (userid, ip, port))
> In C*:<userid> : {    <ip>:<port> -> <timestamp>}
> Query:SELECT ip, port, timestamp FROM connections WHERE userid = '...';
> 'sparse' composite------------------
> "User timeline"
> CREATE TABLE timeline (    userid uuid,    posted_at date,    body
> text,    posted_by text,    PRIMARY KEY (user_id, posted_at),);
> In C*:<userid> : {    <posted_at>:'body' -> <value>
> <posted_at>:'posted_by' -> <value>}
> Query:SELECT body, posted_by FROM timeline WHERE userid = '...' and
> posted_at = '2 janvier 2010'
> Note: I think we really should also be able to do queries like:SELECT
> posted_ad, body, posted_by FROM timeline WHERE userid = '...' and
> posted_at > '2 janvier 2010'but that's more akin to the modification
> of the syntax for slices.
>
> Random other ideas------------------
> 1) We could allow something like:    CONSTRAINT key PRIMARY KEY
> (userid, ip, port)  which would then allow to write    SELECT
> timestamp FROM users WHERE key = ('...', 192.168.0.1, 80);  (I believe
> this is the 'standard' notation to name a 'composite' key in SQL)
> - Above we're ony handling the use of composites for column names, but
> they  can be useful for value (and row keys) and it could be nice to
> have an easy  notation for that (clearly a following ticket however).
> What about:
> CREATE COMPACT TABLE timeline (    userid_part1 text,    userid_part2
> int,    posted_at date,    posted_by uuid,    body text    header text
>    GROUP (userid_part1, userid_part2) AS userid,    PRIMARY KEY
> (userid, posted_at, posted_by)    GROUP (header, body))
> In C*:<userid_part1>:<userid_part2> : {    <posted_at>:<posted_by> ->
> <header>:<body>}
> Query:SELECT posted_at, posted_by, body, header FROM timeline WHERE
> userid = ('john', 32)
>
> --
> Sylvain
> On Mon, Jan 2, 2012 at 8:29 PM, Eric Evans <eev...@acunu.com> wrote:
>> On Mon, Jan 2, 2012 at 12:55 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>> On Mon, Jan 2, 2012 at 10:53 AM, Eric Evans <eev...@acunu.com> wrote:
>>>> In SQL, PRIMARY KEY is a modifier to a column spec, and here PRIMARY
>>>> KEY(user_id, posted_at, posted_by) reads like a PRIMARY modifier
>>>> applied to a KEY() function.  It's also a little strange the way it
>>>> appears in the grouping of column specs, when it's actually defining a
>>>> grouping or relationship of them (maybe this is what you meant about
>>>> using TRANSPOSED WITH <options> to emphasize the non-standard).
>>>
>>> Fear not, I can set your mind at ease. :)
>>>
>>> Personally I think the syntax works reasonably well in its own right,
>>> but my main reason for the proposed syntax is that it is actually
>>> standard SQL for composite primary keys at least as far back as SQL
>>> 92, as a subcategory of table constraints.  The SQL standard is not
>>> freely linkable, but see
>>> http://www.postgresql.org/docs/9.1/static/sql-createtable.html for a
>>> real-world example.
>>
>> OK, I stand corrected (and my mind is at ease :) ).
>>
>>
>> --
>> Eric Evans
>> Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Reply via email to