Re: Composite keys and range queries

John Laban Wed, 14 Mar 2012 14:23:53 -0700

Ahhh, ok, I thought that CQL was just being brought up to date with
the functionality already built into composite keys, but I guess I was
mistaken there.


But I guess it's just providing a convenient abstraction, using composite
column names under the hood.  That's where I was confused, thanks.

So, in terms of composite column names vs supercolumns:  is the only
advantage to composite column names that you can do column slicing on
subsets of the "subcolumns"? I.e. if I don't mind loading all of the
subcolumns for a given supercolumn name in memory at once (since I need
them all anyway), is there any disadvantage to using supercolumns here?
 They seem a little cleaner and more straightforward for my use case, since
I don't have the advantage of the CQL composite key thing.

Thanks,
John
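
For readers of the archive, the layout Jeremiah describes in the quoted reply (row key = UUID, composite column name with the priority as its first part) can be modelled with plain Scala collections. This is only a sketch of the idea, not Hector or Cassandra code. The names `CompositeSliceSketch`, `insert` and `slice` are hypothetical, a String stands in for the UUID, and the composite name is assumed to be (Int, String) so that priorities sort numerically.

```scala
import scala.collection.immutable.TreeMap

object CompositeSliceSketch {
  // Composite column name: (priority, label). Tuple ordering compares the
  // priority first and the label second, like a two-part composite comparator.
  type ColumnName = (Int, String)
  type Row = TreeMap[ColumnName, String]

  // One wide row per ID; columns inside the row stay sorted by composite name.
  def insert(cf: Map[String, Row], id: String, priority: Int,
             label: String, value: String): Map[String, Row] = {
    val row = cf.getOrElse(id, TreeMap.empty[ColumnName, String])
    cf.updated(id, row.updated((priority, label), value))
  }

  // Column slice inside a single row: every column whose first component
  // (the priority) lies in [from, to]. No row-key range is involved.
  def slice(cf: Map[String, Row], id: String,
            from: Int, to: Int): List[(ColumnName, String)] =
    cf.get(id).toList.flatMap { row =>
      row.iteratorFrom((from, "")).takeWhile { case ((p, _), _) => p <= to }.toList
    }
}
```

Because the slice only walks the sorted columns of one row, it does not depend on how the partitioner orders row keys.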


On Wed, Mar 14, 2012 at 12:53 PM, Jeremiah Jordan <
jeremiah.jor...@morningstar.com> wrote:

>  Right, so until the new CQL stuff exists to actually query with
> something smart enough to know about "composite keys", you have to define
> and query on your own.
>
> Row Key = UUID
> Column = CompositeColumn(string, string)
>
> You then want to use COLUMN slicing, not row ranges, to query the data,
> where you slice on priority as the first part of a Composite Column Name.
>
> See the "Under the hood and historical notes" section of the blog post.
> You want to lay out your data per the "Physical representation of the
> denormalized timeline rows" diagram.
> Where your UUID is the "user_id" from the example, and your priority is
> the "tweet_id"
>
> -Jeremiah
>
>
>  ------------------------------
> *From:* John Laban [j...@pagerduty.com]
> *Sent:* Wednesday, March 14, 2012 12:37 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Composite keys and range queries
>
>   Hmm, now I'm really confused.
>
>  > This may be of use to you
> http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
>
>  This article is what I actually used to come up with my schema here.  In
> the "Clustering, composite keys, and more" section they're using a schema
> very similar to how I'm trying to use it.  They define a composite key
> with two parts, expecting the first part to be used as the partition key
> and the second part to be used for ordering.
>
>  > The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2)
> may be 1 .
>
>  Why?  Shouldn't only "uuid-1" be used as the partition key?  (So
> shouldn't those two hash to the same location?)
>
>  I'm thinking of using supercolumns for this instead as I know they'll
> work (where the row key is the uuid and the supercolumn name is the
> priority), but aren't composite row keys supposed to essentially replace
> the need for supercolumns?
>
>  Thanks, and sorry if I'm getting this all wrong,
> John
>
>
>
> On Wed, Mar 14, 2012 at 12:52 AM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> You are seeing this http://wiki.apache.org/cassandra/FAQ#range_rp
>>
>> The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2) may
>> be 1 .
>>
>> You cannot do what you want to. Even if you passed a start of
>> (uuid1,<empty>) and no finish, you would not only get rows where the key
>> starts with uuid1.
>>
>> This may be of use to you
>> http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
>>
>> Or you can store all the priorities that are valid for an ID in another
>> row.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 14/03/2012, at 1:05 PM, John Laban wrote:
>>
>> > Forwarding to the Cassandra mailing list as well, in case this is more
>> of an issue on how I'm using Cassandra.
>> >
>> > Am I correct to assume that I can use range queries on composite row
>> keys, even when using a RandomPartitioner, if I make sure that the first
>> part of the composite key is fixed?
>> >
>> > Any help would be appreciated,
>> > John
>> >
>> >
>> >
>> > On Tue, Mar 13, 2012 at 12:15 PM, John Laban <j...@pagerduty.com>
>> wrote:
>> > Hi,
>> >
>> > I have a column family that uses a composite key:
>> >
>> > (ID, priority) -> ...
>> >
>> > Where the ID is a UUID and the priority is an integer.
>> >
>> > I'm trying to perform a range query now:  I want all the rows where the
>> ID matches some fixed UUID, but within a range of priorities.  This is
>> supported even if I'm using a RandomPartitioner, right?  (Because the first
>> key in the composite key is the partition key, and the second part of the
>> composite key is automatically ordered?)
>> >
>> > So I perform a range slices query:
>> >
>> > val rangeQuery = HFactory.createRangeSlicesQuery(
>> >   keyspace, new CompositeSerializer, StringSerializer.get, BytesArraySerializer.get)
>> > rangeQuery.setColumnFamily(RouteColumnFamilyName).
>> >             setKeys(new Composite(id, priorityStart), new Composite(id, priorityEnd)).
>> >             setRange(null, null, false, Int.MaxValue)
>> >
>> >
>> > But I get this error:
>> >
>> > me.prettyprint.hector.api.exceptions.HInvalidRequestException:
>> InvalidRequestException(why:start key's md5 sorts after end key's md5.
>>  this is not allowed; you probably should not specify end key at all, under
>> RandomPartitioner)
>> >
>> > Shouldn't they have the same md5, since they have the same partition
>> key?
>> >
>> > Am I using the wrong query here, or does Hector not support composite
>> range queries, or am I making some mistake in how I think Cassandra's
>> composite keys work?
>> >
>> > Thanks,
>> > John
>> >
>> >
>>
>>
>
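
On the md5 error in the original message: under RandomPartitioner the token is derived from the whole row key, so two composite row keys that share the same first component still hash independently. The sketch below illustrates this with the JDK's MD5 implementation. It is a simplified stand-in, not Cassandra's actual token code: `token` and `compositeKey` are hypothetical helpers, and the "id:priority" byte encoding is an assumption made only for illustration.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest

object Md5TokenSketch {
  // Simplified token: MD5 of the complete row key, read as a non-negative integer.
  def token(rowKey: Array[Byte]): BigInt =
    BigInt(1, MessageDigest.getInstance("MD5").digest(rowKey))

  // Assumed flat encoding of an (id, priority) composite row key.
  // This is NOT the real CompositeType byte layout.
  def compositeKey(id: String, priority: Int): Array[Byte] =
    s"$id:$priority".getBytes(UTF_8)

  def main(args: Array[String]): Unit = {
    val id = "uuid-1"
    val t1 = token(compositeKey(id, 1))
    val t2 = token(compositeKey(id, 2))
    // The two tokens are unrelated, and either one may sort first.
    println(s"token(uuid-1, 1) = $t1")
    println(s"token(uuid-1, 2) = $t2")
    println(s"start token sorts after end token: ${t1 > t2}")
  }
}
```

Since the hash covers every byte of the key, fixing the first component does not keep the rows adjacent in token order, which is why a row range over (id, priorityStart) to (id, priorityEnd) is rejected or returns unrelated rows.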
