> I am wondering how to index on the most recent hour as well. (ie show me top
> 5 URLs type query)..

AFAIK that's not a great application for counters. You would need range support in the secondary indexes so you could get the first X rows ordered by a column value.

To be honest, depending on scale, I'd consider a sorted set in Redis for that.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 11 Jun 2011, at 00:36, Ian Holsman wrote:

>
> On Jun 9, 2011, at 10:04 PM, aaron morton wrote:
>
>> I may be missing something but could you use a column for each of the last
>> 48 hours all in the same row for a url ?
>>
>> e.g.
>> {
>>   "/url.com/hourly" : {
>>     "20110609T01:00:00" : 456,
>>     "20110609T02:00:00" : 4567,
>>   }
>> }
>
> yes.. that would work better... I was storing all the different times in the
> same row.
> {
>   "/url.com" : {
>     "H-20110609T01:00:00" : 456,
>     "H-0110609T02:00:00" : 4567,
>     "D-0110609" : 5678,
>   }
> }
>
> I am wondering how to index on the most recent hour as well. (ie show me top
> 5 URLs type query)..
>
>>
>> Increment the current hour only. Delete the older columns either when a read
>> detects there are old values or as a maintenance job. Or as part of writing
>> values for the first 5 minutes of any hour.
>
> yes.. I thought of that. The problem with doing it on read is there may be a
> case where a old URL never gets read.. so it will just sit there taking up
> space.. the maintenance job is the route I went down.
>
>>
>> The row will get spread out over a lot of sstables which may reduce read
>> speed. If this is a problem consider a separate CF with more aggressive GC
>> and compaction settings.
>
> Thanks!
>>
>> Cheers
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 10 Jun 2011, at 09:28, Ian Holsman wrote:
>>
>>> So would doing something like storing it in reverse (so I know what to
>>> delete) work? Or is storing a million columns in a supercolumn impossible.
>>>
>>> I could always use a logfile and run the archiver off that as a worst case
>>> I guess.
>>> Would doing so many deletes screw up the db/cause other problems?
>>>
>>> ---
>>> Ian Holsman - 703 879-3128
>>>
>>> I saw the angel in the marble and carved until I set him free --
>>> Michelangelo
>>>
>>> On 09/06/2011, at 4:22 PM, Ryan King <r...@twitter.com> wrote:
>>>
>>>> On Thu, Jun 9, 2011 at 1:06 PM, Ian Holsman <had...@holsman.net> wrote:
>>>>> Hi Ryan.
>>>>> you wouldn't have your version of cassandra up on github would you??
>>>>
>>>> No, and the patch isn't in our version yet either. We're still working on
>>>> it.
>>>>
>>>> -ryan
>>
>
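
For anyone reading this thread later, the "sorted set per hour" idea from the reply at the top could be sketched roughly as below. This is an in-memory Python stand-in, not a Redis client and not anything posted in the thread: the class and method names (HourlyTopUrls, hit, top, drop_older_than) are hypothetical. Against a real Redis server the increment would presumably map to ZINCRBY on a per-hour key and the top-N read to ZREVRANGE with scores. The bucket name reuses the hourly column format quoted above.

```python
from collections import defaultdict
from datetime import datetime
import heapq


def hour_bucket(when: datetime) -> str:
    """Format a timestamp as the hourly column name used in the thread."""
    return when.strftime("%Y%m%dT%H:00:00")


class HourlyTopUrls:
    """In-memory stand-in for one sorted set per hour (hypothetical helper)."""

    def __init__(self) -> None:
        # bucket name -> {url: hit count}; plays the role of key -> sorted set
        self._buckets = defaultdict(lambda: defaultdict(int))

    def hit(self, url: str, when: datetime, count: int = 1) -> int:
        """Increment a URL's score in its hour bucket (the ZINCRBY step)."""
        bucket = self._buckets[hour_bucket(when)]
        bucket[url] += count
        return bucket[url]

    def top(self, when: datetime, n: int = 5) -> list:
        """Return the n highest-scoring (url, count) pairs for that hour."""
        bucket = self._buckets.get(hour_bucket(when), {})
        return heapq.nlargest(n, bucket.items(), key=lambda item: item[1])

    def drop_older_than(self, cutoff: datetime) -> None:
        """Maintenance job: delete whole hour buckets before the cutoff hour."""
        # Fixed-width bucket names sort lexicographically in time order.
        limit = hour_bucket(cutoff)
        for name in [b for b in self._buckets if b < limit]:
            del self._buckets[name]


tracker = HourlyTopUrls()
now = datetime(2011, 6, 9, 1, 30)
tracker.hit("/url.com", now, 3)
tracker.hit("/other.com", now)
print(tracker.top(now, 5))
```

Keeping one bucket per hour means old data can be dropped a whole bucket at a time from the maintenance job, rather than relying on reads to notice stale columns.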