Re: How to store data

2012-07-26 Thread Erik Søe Sørensen
On 26-07-2012 10:23, Andrew Kondratovich wrote: 1) Identifiers are not random. They are collected in groups, but the number of these groups is large and the same identifier can occur in different groups. 2) It's a fixed amount of time - it can be changed, but usually it's one day. But a request can be

Re: How to store data

2012-07-26 Thread Andrew Kondratovich
1) Identifiers are not random. They are collected in groups, but the number of these groups is large and the same identifier can occur in different groups. 2) It's a fixed amount of time - it can be changed, but usually it's one day. But a request can be "get items from the day before", not only for today. O
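The day-plus-identifier layout Andrew describes suggests a common Riak pattern: embed the 'from' identifier and the day in the object key, so one sender's items for a given day share a prefix. This is only a sketch; the key format and the names `make_key`, `from_id`, and `item_id` are assumptions, not something stated in the thread.

```python
from datetime import date

def make_key(from_id: str, day: date, item_id: str) -> str:
    """Compose a key embedding the 'from' identifier and the day,
    so all of one sender's items for a day share a key prefix."""
    return f"{from_id}:{day.isoformat()}:{item_id}"

# Example: one item stored for sender "user42" on 2012-07-26.
key = make_key("user42", date(2012, 7, 26), "msg001")
```

With keys shaped like this, "get items from the day before" becomes a prefix match rather than a scan over every item.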

Re: How to store data

2012-07-25 Thread Yousuf Fauzan
Using a key filter on a big bucket could cause performance problems. On Jul 25, 2012 9:53 PM, "Andrew Kondratovich" < andrew.kondratov...@gmail.com> wrote: > Yeap.. half a thousand requests to Riak isn't cool =( I'm looking for some > strategy of storing data so that I could fetch all items with 1 request

Re: How to store data

2012-07-25 Thread Andrew Kondratovich
Yeap.. half a thousand requests to Riak isn't cool =( I'm looking for some strategy of storing data so that I could fetch all items with 1 request. I could use an index MR each time and filter results at the map phase. I could use special keys with 'from' data and use key filters (with time filtering at the map phase)
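The "special keys with key filters plus map-phase time filtering" idea can be illustrated client-side. This is a minimal simulation of the logic, not Riak's actual key-filter API: it assumes keys shaped `from:timestamp:id` and keeps only keys whose 'from' part is wanted and whose embedded timestamp is recent enough, which is what the key filter and map phase would split between them.

```python
def filter_keys(keys, wanted_from, since_ts):
    """Sketch of a key filter (match on the 'from' segment) combined
    with a map-phase check (embedded timestamp >= since_ts).
    Assumes keys of the form 'from:timestamp:item_id'."""
    matched = []
    for k in keys:
        frm, ts, _item_id = k.split(":", 2)
        if frm in wanted_from and int(ts) >= since_ts:
            matched.append(k)
    return matched
```

In a real deployment the 'from' match would run as a Riak key filter and the timestamp check in the map function, so unmatched objects are never fetched from disk; Yousuf's warning above is that the key-filter step still walks the full key list of the bucket.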

Re: How to store data

2012-07-25 Thread Andres Jaan Tack
Is that a realistic strategy for low-latency requirements? Imagine this were some web service, and people generated this query at some reasonable frequency. (Not that I know what Andrew is looking for, exactly.) 2012/7/25 Yousuf Fauzan > Since 500 is not that big a number, I think you can run tha

Re: How to store data

2012-07-25 Thread Yousuf Fauzan
Since 500 is not that big a number, I think you can run that many M/Rs, with each emitting only records having "time" greater than the specified value. The input would be {index, <<"bucket">>, <<"from_bin">>, <<"from_field_value">>} If you decide to split the data into separate buckets based on the "from" field, inp
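Yousuf's suggestion is one MapReduce job per 'from' value, each fed by a secondary-index (2i) input spec like the Erlang tuple he shows. A small sketch of building that list of input specs, expressed as Python tuples mirroring the Erlang shape (the bucket name and helper name here are assumptions):

```python
def mr_inputs(bucket, from_values):
    """Build one 2i MapReduce input spec per 'from' value, mirroring
    the Erlang tuple {index, Bucket, <<"from_bin">>, Value}."""
    return [("index", bucket, "from_bin", v) for v in from_values]

# 500 'from' identifiers -> 500 input specs, one per M/R run.
specs = mr_inputs("messages", ["user42", "user99"])
```

Each spec asks Riak for exactly the objects indexed under that 'from' value, so the map phase only has to apply the "time" cutoff.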

Re: How to store data

2012-07-25 Thread Andrew Kondratovich
Hello, Yousuf. Thanks for your reply. We have several million items. There are about 10,000 unique 'from' fields (about 1,000 items for each). Usually, we need to get items for about 500 'from' identifiers with a 'time' limit (about 5% of items correspond). On Wed, Jul 25, 2012 at 1:02 PM,
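The rough numbers in this message fix the scale of the problem. A quick worked calculation (using only the figures Andrew gives) shows why a single bulk query matters here:

```python
total_from = 10_000       # unique 'from' identifiers
items_per_from = 1_000    # items per identifier (approx.)
queried_from = 500        # identifiers per typical query

# Candidate items touched before the 'time' filter is applied.
candidate_items = queried_from * items_per_from   # 500,000

# Fraction of all identifiers (and hence roughly of all items)
# that a typical query covers.
fraction = queried_from / total_from              # 0.05, i.e. 5%
```

So one query covers about 5% of the dataset (matching Andrew's "about 5% of items correspond"), and issuing 500 separate requests per query is what Andrew calls "not cool" above.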