Yeah, I intentionally didn't mention the expected data set size, hoping I could 
find a more elegant solution that would work in both the small-N and large-N 
cases. In any case, I appreciate the recommendations.

When I get some time I am interested in looking at the source and figuring out 
whether or not getting a "Most/least recently updated" ordering for columns 
would be doable.
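
For reference, the serialized history Ed describes below could be sketched 
roughly like this in Python. The names (PAIR, record_seen, max_pairs) and the 
exact layout (an 8-byte big-endian timestamp followed by the 16 raw UUID bytes) 
are assumptions for illustration; the thread only says 24 bytes per pair. 
Re-seeing a uuid moves it to the front, which gives "most recently seen" order 
without any server-side support.

```python
import struct
import uuid

# Assumed layout: 8-byte signed big-endian timestamp + 16 raw UUID bytes = 24 bytes.
PAIR = struct.Struct(">q16s")


def unpack_history(blob):
    """Decode the serialized history into a list of (timestamp, UUID) pairs."""
    return [(ts, uuid.UUID(bytes=raw)) for ts, raw in PAIR.iter_unpack(blob)]


def pack_history(pairs):
    """Encode (timestamp, UUID) pairs back into a single byte string."""
    return b"".join(PAIR.pack(ts, u.bytes) for ts, u in pairs)


def record_seen(blob, seen_uuid, seen_ts, max_pairs=40):
    """Return a new history blob with seen_uuid marked as seen at seen_ts.

    Any older entry for the same uuid is dropped, the list is kept newest
    first, and only the max_pairs most recently seen entries survive.
    """
    pairs = [(ts, u) for ts, u in unpack_history(blob) if u != seen_uuid]
    pairs.append((seen_ts, seen_uuid))
    pairs.sort(key=lambda p: p[0], reverse=True)
    return pack_history(pairs[:max_pairs])
```

The whole read-modify-write happens on the client, so two concurrent updates 
to the same row could overwrite each other; that trade-off is not addressed 
in this sketch.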


On May 8, 2010, at 4:12 PM, Ed Anuff wrote:

> I was thinking it was going to be a lot more than that. You might want to 
> consider just storing them all as a single serialized array of timestamps and 
> uuids.  By my math, you could fit up to 40 uuid/timestamp pairs in under 1K. 
> Then you'd just store something like this:
> 
> // Row key is userId
> 12345 : {
>   last_seen : 387587235233, // timestamp of last visit
>   last_uuid : '256fb890-5a4b-11df-a08a-0800200c9a66',
>   // serialized array of N timestamp/uuid pairs (24 bytes per pair)
>   history : 0x000.....,
> }
> 
> On Sat, May 8, 2010 at 3:54 PM, William Ashley <wash...@gmail.com> wrote:
> That is a good question, because realistically I see N being under 10, and 
> there are no current plans to make use of a large historical record. I could 
> have the update process pull all columns and issue deletes as necessary such 
> that only M (M >= N) are kept.
> 
> Thanks for the inspiration.
> 
> 
> On May 8, 2010, at 3:42 PM, Ed Anuff wrote:
> 
>> Sorry, missed that.  I'm not sure if there's a cleaner way than using the 
>> approaches you've looked at, hopefully someone else has an answer.  How big 
>> is N and do you need to keep more than N around?
>> 
>> On Sat, May 8, 2010 at 10:26 AM, William Ashley <wash...@gmail.com> wrote:
>> This would be a solution if I wanted to get the N most recently CREATED 
>> guids, but I'm interested in the most recently SEEN guids.
>> 
>> 
> 
> 
