OK ... I think I understand these.  So the idea is that you would use the time 
as the column key?

So where I might currently have something like this:

<key1> | time=2013/01/03 08:19:01 | user=john | site=Chicago
<key2> | time=2013/01/05 01:55:34 | user=john | site=Chicago
<key3> | time=2013/01/09 16:21:42 | user=john | site=New York
<key4> | time=2013/01/09 17:27:41 | user=susan | site=Boston
<key5> | time=2013/01/09 17:27:41 | user=asok | site=Dallas

Instead it would be better to do something like this:

<key1> | 2013/01/03 08:19:01={user=john, site=Chicago} | 2013/01/05 01:55:34={user=john, site=Chicago} | 2013/01/09 16:21:42={user=john, site=New York}
<key2> | 2013/01/09 17:27:41={user=susan, site=Boston}
<key3> | 2013/01/09 17:27:41={user=asok, site=Dallas}

Am I understanding this correctly?  This seems to have the HUGE disadvantage 
that I am no longer going to be able to create secondary indexes on user and 
site.  Is that right?

This seems like an impossible solution for my requirements.

Steve

From: Tyler Hobbs [mailto:ty...@datastax.com]
Sent: Wednesday, January 09, 2013 2:21 PM
To: user@cassandra.apache.org
Subject: Re: Date Index?

If you're going to be looking data up by date ranges frequently, I strongly 
suggest you go with a typical time-series pattern (what Aaron described as 
hand-rolled indexes):

http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra

If you're just running these date-based queries occasionally and the result set 
won't be huge, then using secondary indexes as you described is a convenient 
but not terribly efficient way to do that.
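
As a rough illustration of the time-series pattern (not code from the linked 
posts), here is a minimal in-memory sketch in Python. It assumes one row per 
month bucket with columns sorted by timestamp, so a date range becomes a slice 
of one or more rows. The names rows, record and query_range are hypothetical, 
and a plain dict stands in for the column family; no Cassandra client is used.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# Hypothetical in-memory stand-in for a column family:
# row key = month bucket "YYYYMM"; each row holds columns sorted by timestamp.
rows = {}

def record(ts, user, site):
    """Store one event as a column in the row for its month bucket."""
    names, values = rows.setdefault(ts.strftime("%Y%m"), ([], []))
    i = bisect_right(names, ts)  # keep column names (timestamps) sorted
    names.insert(i, ts)
    values.insert(i, {"user": user, "site": site})

def query_range(start, end):
    """Return events with start <= timestamp <= end, in time order."""
    out = []
    year, month = start.year, start.month
    # Visit every month bucket the range touches and slice each one.
    while (year, month) <= (end.year, end.month):
        names, values = rows.get("%04d%02d" % (year, month), ([], []))
        lo = bisect_left(names, start)
        hi = bisect_right(names, end)
        out.extend(zip(names[lo:hi], values[lo:hi]))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return out

record(datetime(2013, 1, 3, 8, 19, 1), "john", "Chicago")
record(datetime(2013, 1, 5, 1, 55, 34), "john", "Chicago")
record(datetime(2013, 1, 9, 16, 21, 42), "john", "New York")
record(datetime(2012, 12, 30, 9, 0, 0), "susan", "Boston")

hits = query_range(datetime(2012, 12, 29), datetime(2013, 1, 5, 23, 59, 59))
```

The bucket size (a month here) is an assumption; a smaller bucket keeps rows 
shorter at the cost of touching more rows per query.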

On Wed, Jan 9, 2013 at 10:04 AM, Michael Kjellman <mkjell...@barracuda.com> 
wrote:
ElasticSearch is a nice option for ordered lists. In 2.0, triggers should make 
pushing updates to ElasticSearch much easier; right now it's up to your 
application logic to detect changes and update the index.

On Jan 9, 2013, at 7:55 AM, stephen.m.thomp...@wellsfargo.com wrote:
Thanks Aaron, that helps.  So is there anything approaching a "consensus" on 
how to do something like this?

You mention a custom index ... is there a good document on creating a custom 
index?  Google doesn't show me much.

Steve

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, January 08, 2013 9:35 PM
To: user@cassandra.apache.org
Subject: Re: Date Index?

There has to be one equality clause in there, and that's the thing Cassandra 
uses to select off disk. The others are in-memory filters.

So if you have an index on the year+month, you can use a simple equality clause 
on it, which limits the amount of data that has to be read.
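
A small Python sketch of that idea, assuming records that carry the month and 
day columns proposed earlier in the thread. The dict month_index is only a 
stand-in for a secondary index and select is a hypothetical helper; this 
models the behavior described above, not Cassandra's actual implementation.

```python
from collections import defaultdict

# Hypothetical records using the month/day columns proposed in this thread.
records = [
    {"key": "k1", "month": "201301", "day": 3, "user": "john"},
    {"key": "k2", "month": "201301", "day": 5, "user": "john"},
    {"key": "k3", "month": "201301", "day": 9, "user": "susan"},
    {"key": "k4", "month": "201212", "day": 9, "user": "asok"},
]

# Stand-in for a secondary index on "month": indexed value -> matching records.
month_index = defaultdict(list)
for r in records:
    month_index[r["month"]].append(r)

def select(month, min_day, max_day):
    # Equality clause: the index narrows what has to be read at all.
    candidates = month_index.get(month, [])
    # Remaining clauses: applied as filters over the candidates in memory.
    return [r for r in candidates if min_day <= r["day"] <= max_day]

matches = select("201301", 5, 9)
```

Every record in the matching month is still read before the day filter runs, 
which is why the size of each month matters.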

If you have many tens to hundreds of millions of things in the same month, you 
may want to do some performance testing. There can still be times when you want 
to support common read paths by using custom / hand-rolled indexes.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 9/01/2013, at 6:05 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks -

Question about secondary indexes.  How are people doing date indexes?  I have 
a date column in my RDBMS tables that we query frequently, for example to look 
at all records recorded in the last month.  What is the best practice for being 
able to do such a query?  It seems like there could be an advantage to adding a 
couple of columns like this:

                {timestamp=2013/01/08 12:32:01 -0500}
                {month=201301}
                {day=08}

And then I could put a secondary index on the month and day columns?  Would 
that be the best way to do something like this?  Is there any accepted "best 
practice" on this yet?
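
For illustration, the month and day values above could be derived from the 
timestamp along these lines in Python. The helper name index_columns is 
hypothetical, and the format string assumes timestamps shaped exactly like the 
example.

```python
from datetime import datetime

def index_columns(raw):
    """Derive the extra month and day columns from a timestamp string."""
    ts = datetime.strptime(raw, "%Y/%m/%d %H:%M:%S %z")
    return {
        "timestamp": raw,
        "month": ts.strftime("%Y%m"),  # year+month, e.g. 201301
        "day": ts.strftime("%d"),      # zero-padded day of month
    }

cols = index_columns("2013/01/08 12:32:01 -0500")
```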

Thanks!
Steve


--
Tyler Hobbs
DataStax <http://datastax.com/>
