> I am wondering how to index on the most recent hour as well (i.e. a "show me
> the top 5 URLs" type of query).

AFAIK that's not a great application for counters. You would need range support 
in the secondary indexes so you could get the first X rows ordered by a column 
value. 

To be honest, depending on scale, I'd consider a sorted set in Redis for that. 
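
A rough sketch of the sorted set idea, for illustration only. It assumes a
client exposing redis-py style zincrby(name, amount, value) and
zrevrange(name, start, end, withscores=...) calls. The "hits:<hour>" key
naming, the helper names and the InMemorySortedSets stand-in are all made up
here so the example runs without a server; they are not from this thread.

```python
from collections import defaultdict
from datetime import datetime


class InMemorySortedSets:
    """Hypothetical stand-in mimicking the two sorted set calls used below.

    It only exists so the sketch runs without a Redis server.
    """

    def __init__(self):
        self._sets = defaultdict(dict)

    def zincrby(self, name, amount, value):
        # Add `amount` to the member's score, creating it at 0 if missing.
        scores = self._sets[name]
        scores[value] = scores.get(value, 0.0) + amount
        return scores[value]

    def zrevrange(self, name, start, end, withscores=False):
        # Highest score first; `end` is inclusive and may be negative.
        # Only non-negative `start` values are handled in this stand-in.
        ranked = sorted(
            self._sets[name].items(),
            key=lambda kv: (kv[1], kv[0]),
            reverse=True,
        )
        stop = len(ranked) + end + 1 if end < 0 else end + 1
        chunk = ranked[start:stop]
        return chunk if withscores else [member for member, _ in chunk]


def hour_key(when):
    # One sorted set per hour bucket, e.g. "hits:20110609T01" (assumed naming).
    return "hits:" + when.strftime("%Y%m%dT%H")


def record_hit(client, url, when, count=1):
    # The URL is the member, its hit count for that hour is the score.
    client.zincrby(hour_key(when), count, url)


def top_urls(client, when, n=5):
    # The n highest-scoring URLs for the hour, as (url, score) pairs.
    return client.zrevrange(hour_key(when), 0, n - 1, withscores=True)


if __name__ == "__main__":
    client = InMemorySortedSets()
    hour = datetime(2011, 6, 9, 1, 30)
    record_hit(client, "/url.com", hour, 456)
    record_hit(client, "/other.com", hour, 4567)
    print(top_urls(client, hour))
```

With a real server you would pass a redis-py client instead of the stand-in
(it returns bytes for members unless created with decode_responses=True).
Old hourly keys could be given an expiry so they clean themselves up.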

Hope that helps. 
  
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 11 Jun 2011, at 00:36, Ian Holsman wrote:

> 
> On Jun 9, 2011, at 10:04 PM, aaron morton wrote:
> 
>> I may be missing something but could you use a column for each of the last 
>> 48 hours all in the same row for a url ?
>> 
>> e.g. 
>> {
>>      "/url.com/hourly" : {
>>              "20110609T01:00:00" : 456,
>>              "20110609T02:00:00" : 4567,
>>      }
>> }
> 
> yes.. that would work better... I was storing all the different times in the 
> same row.
> {
>       "/url.com" : {
>              "H-20110609T01:00:00" : 456,
>              "H-20110609T02:00:00" : 4567,
>              "D-20110609" : 5678,
>       }
> }
> 
> I am wondering how to index on the most recent hour as well (i.e. a "show me
> the top 5 URLs" type of query).
> 
>> 
>> Increment the current hour only. Delete the older columns either when a read 
>> detects there are old values or as a maintenance job. Or as part of writing 
>> values for the first 5 minutes of any hour. 
> 
> Yes, I thought of that. The problem with doing it on read is there may be a 
> case where an old URL never gets read, so it will just sit there taking up 
> space. The maintenance job is the route I went down.
> 
>> 
>> The row will get spread out over a lot of sstables which may reduce read 
>> speed. If this is a problem consider a separate CF with more aggressive GC 
>> and compaction settings. 
> 
> Thanks!
>> 
>> Cheers
>> 
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 10 Jun 2011, at 09:28, Ian Holsman wrote:
>> 
>>> So would doing something like storing it in reverse (so I know what to 
>>> delete) work? Or is storing a million columns in a supercolumn impossible?
>>> 
>>> I could always use a logfile and run the archiver off that as a worst case 
>>> I guess. 
>>> Would doing so many deletes screw up the db/cause other problems?
>>> 
>>> ---
>>> Ian Holsman - 703 879-3128
>>> 
>>> I saw the angel in the marble and carved until I set him free -- 
>>> Michelangelo
>>> 
>>> On 09/06/2011, at 4:22 PM, Ryan King <r...@twitter.com> wrote:
>>> 
>>>> On Thu, Jun 9, 2011 at 1:06 PM, Ian Holsman <had...@holsman.net> wrote:
>>>>> Hi Ryan.
>>>>> You wouldn't have your version of Cassandra up on GitHub, would you?
>>>> 
>>>> No, and the patch isn't in our version yet either. We're still working on 
>>>> it.
>>>> 
>>>> -ryan
>> 
> 
