Yeah, I intentionally didn't mention the expected data set size, hoping I could
find a more elegant solution that would work both in the small N and large N
cases. In any case, I appreciate the recommendations.

When I get some time, I'd like to look at the source and figure out whether
a "most/least recently updated" ordering for columns would be doable.

On May 8, 2010, at 4:12 PM, Ed Anuff wrote:
> I was thinking it was going to be a lot more than that. You might want to
> consider just storing them all as a single serialized array of timestamps and
> uuids. By my math, you could fit up to 40 uuid/timestamp pairs for under 1K.
> Then you'd just store something like this:
>
> // Row key is userId
> 12345 : {
> last_seen : 387587235233, // timestamp of last visit
> last_uuid: '256fb890-5a4b-11df-a08a-0800200c9a66',
> history : 0x000....., // serialized array of N timestamp/uuid pairs (24 bytes per pair)
> }
>
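A rough sketch of how the packed history value above might be built and read back. The byte layout is an assumption (an 8-byte big-endian timestamp followed by the 16 raw UUID bytes, which gives the 24 bytes per pair mentioned above), and the names `pack_history`, `unpack_history`, and `record_visit` are made up for illustration. Nothing here touches Cassandra; it only produces the bytes that would go in the history column.

```python
import struct
import uuid

# Assumed layout (not specified in the thread): 8-byte big-endian signed
# timestamp, then the 16 raw UUID bytes. No padding, so 24 bytes per pair.
PAIR = struct.Struct(">q16s")


def pack_history(pairs):
    """Serialize an iterable of (timestamp, uuid.UUID) into one byte string."""
    return b"".join(PAIR.pack(ts, u.bytes) for ts, u in pairs)


def unpack_history(blob):
    """Inverse of pack_history: return a list of (timestamp, uuid.UUID)."""
    return [(ts, uuid.UUID(bytes=raw)) for ts, raw in PAIR.iter_unpack(blob)]


def record_visit(blob, ts, u, limit=40):
    """Put u at the front with its new timestamp, keeping at most `limit` pairs.

    Assumes the stored pairs are ordered newest first and that ts is newer
    than anything already stored.
    """
    pairs = [(t, x) for t, x in unpack_history(blob) if x != u]
    pairs.insert(0, (ts, u))
    return pack_history(pairs[:limit])
```

With this layout, 40 pairs come to 40 * 24 = 960 bytes, which matches the "under 1K" estimate. Each visit is a read-modify-write of the whole value, so two concurrent updates for the same user could overwrite each other.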
> On Sat, May 8, 2010 at 3:54 PM, William Ashley <[email protected]> wrote:
> That is a good question, because realistically I see N being under 10, and
> there are no current plans to make use of a large historical record. I could
> have the update process pull all columns and issue deletes as necessary such
> that only M (M >= N) are kept.
>
> Thanks for the inspiration.
>
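The pruning step described above (pull all columns, delete until only M remain) could look roughly like this. It assumes the columns have already been read into a dict mapping uuid to last-seen timestamp; `columns_to_delete` is a hypothetical helper, and the actual Cassandra read and delete calls are left out.

```python
def columns_to_delete(last_seen, keep):
    """Return the uuids to delete so only the `keep` most recently seen remain.

    last_seen maps uuid -> last-seen timestamp (assumed already fetched).
    """
    newest_first = sorted(last_seen, key=last_seen.get, reverse=True)
    return newest_first[keep:]
```

For example, with `{"a": 3, "b": 1, "c": 2}` and `keep=2`, only `"b"` would be deleted.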
>
> On May 8, 2010, at 3:42 PM, Ed Anuff wrote:
>
>> Sorry, missed that. I'm not sure if there's a cleaner way than using the
>> approaches you've looked at; hopefully someone else has an answer. How big
>> is N and do you need to keep more than N around?
>>
>> On Sat, May 8, 2010 at 10:26 AM, William Ashley <[email protected]> wrote:
>> This would be a solution if I wanted to get the N most recently CREATED
>> guids, but I'm interested in the most recently SEEN guids.
>>