On Jun 9, 2011, at 10:04 PM, aaron morton wrote:

> I may be missing something but could you use a column for each of the last 48 
> hours all in the same row for a url ?
> 
> e.g. 
> {
>       "/url.com/hourly" : {
>               "20110609T01:00:00" : 456,
>               "20110609T02:00:00" : 4567,
>       }
> }

Yes, that would work better. I was storing all the different time 
granularities (hourly and daily) in the same row:
{
        "/url.com" : {
                "H-20110609T01:00:00" : 456,
                "H-20110609T02:00:00" : 4567,
                "D-20110609" : 5678,
        }
}
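
A minimal sketch of how those column names could be generated, assuming Python 
and a plain in-memory dict standing in for the column family. The helper names 
(hourly_column, daily_column, record_hit) are hypothetical and are not part of 
any Cassandra client API.

```python
from datetime import datetime


def hourly_column(ts: datetime) -> str:
    """Column name for the hour bucket containing ts (minutes/seconds zeroed)."""
    return ts.strftime("H-%Y%m%dT%H:00:00")


def daily_column(ts: datetime) -> str:
    """Column name for the day bucket containing ts."""
    return ts.strftime("D-%Y%m%d")


def record_hit(rows: dict, url: str, ts: datetime, count: int = 1) -> None:
    """Increment both the hourly and the daily counter for url.

    rows is a dict of dicts used here as a stand-in for the row/column
    layout shown above; a real deployment would issue counter increments.
    """
    row = rows.setdefault(url, {})
    for col in (hourly_column(ts), daily_column(ts)):
        row[col] = row.get(col, 0) + count


rows = {}
record_hit(rows, "/url.com", datetime(2011, 6, 9, 1, 30), 456)
```

The "H-" and "D-" prefixes keep the two granularities apart within one row, and 
the fixed-width timestamps sort chronologically within each prefix.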

I am also wondering how to index on the most recent hour (i.e. a "show me the 
top 5 URLs" type of query).
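
To make the query concrete, here is a rough sketch of what "top 5 URLs for a 
given hour" means over the layout above. It assumes Python and an in-memory 
dict of rows, and top_urls is a hypothetical helper. It scans every row, so it 
only illustrates the result I am after, not an index design.

```python
import heapq


def top_urls(rows, hour_column, n=5):
    """Return up to n (url, count) pairs with the highest count in hour_column.

    rows maps url -> {column name -> count}. Rows without the requested
    hour column are skipped.
    """
    counts = (
        (row[hour_column], url)
        for url, row in rows.items()
        if hour_column in row
    )
    # nlargest orders by count first; equal counts fall back to url order.
    return [(url, count) for count, url in heapq.nlargest(n, counts)]


sample = {
    "/a": {"H-20110609T01:00:00": 10},
    "/b": {"H-20110609T01:00:00": 30},
    "/c": {"H-20110609T02:00:00": 99},
}
top = top_urls(sample, "H-20110609T01:00:00")
```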

> 
> Increment the current hour only. Delete the older columns either when a read 
> detects there are old values or as a maintenance job. Or as part of writing 
> values for the first 5 minutes of any hour. 

Yes, I thought of that. The problem with doing it on read is that an old URL 
may never get read, so it would just sit there taking up space. The 
maintenance job is the route I went down.
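
The pruning logic of such a maintenance job could look roughly like this. It is 
a sketch that assumes Python, an in-memory dict of rows, the "H-" column naming 
from earlier, and a 48 hour retention window. prune_old_hours is a hypothetical 
name, and a real job would issue column deletes against the cluster instead.

```python
from datetime import datetime, timedelta


def prune_old_hours(rows, now, keep_hours=48):
    """Delete hourly columns older than keep_hours; return how many were removed.

    Column names use a fixed-width timestamp, so plain string comparison
    against the cutoff name matches chronological order. Daily ("D-")
    columns are left alone. Rows that end up empty are dropped.
    """
    cutoff = (now - timedelta(hours=keep_hours)).strftime("H-%Y%m%dT%H:00:00")
    removed = 0
    for url in list(rows):  # copy the keys so rows can be deleted while looping
        row = rows[url]
        stale = [c for c in row if c.startswith("H-") and c < cutoff]
        for col in stale:
            del row[col]
        removed += len(stale)
        if not row:
            del rows[url]
    return removed


data = {
    "/url.com": {
        "H-20110609T01:00:00": 456,
        "H-20110609T02:00:00": 4567,
        "D-20110609": 5678,
    },
    "/old.com": {"H-20110601T05:00:00": 1},
}
removed = prune_old_hours(data, datetime(2011, 6, 11, 2, 15))
```

Because the job walks every row, a URL that is never read again still gets 
cleaned up.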

> 
> The row will get spread out over a lot of sstables which may reduce read 
> speed. If this is a problem consider a separate CF with more aggressive GC 
> and compaction settings. 

Thanks!
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 10 Jun 2011, at 09:28, Ian Holsman wrote:
> 
>> So would doing something like storing it in reverse (so I know what to 
>> delete) work? Or is storing a million columns in a supercolumn impossible? 
>> 
>> I could always use a logfile and run the archiver off that as a worst case I 
>> guess. 
>> Would doing so many deletes screw up the db/cause other problems?
>> 
>> ---
>> Ian Holsman - 703 879-3128
>> 
>> I saw the angel in the marble and carved until I set him free -- Michelangelo
>> 
>> On 09/06/2011, at 4:22 PM, Ryan King <r...@twitter.com> wrote:
>> 
>>> On Thu, Jun 9, 2011 at 1:06 PM, Ian Holsman <had...@holsman.net> wrote:
>>>> Hi Ryan.
>>>> you wouldn't have your version of cassandra up on github would you??
>>> 
>>> No, and the patch isn't in our version yet either. We're still working on 
>>> it.
>>> 
>>> -ryan
> 
