Hi,

What's the best approach to processing a batch of events N seconds after
the latest event in a group happens? Events are grouped by key.

I am thinking about the following scheme:

1) Events are recorded in a way that every write creates a new sibling,
to avoid multiple read/write cycles per event.
2) With every write, a new secondary index is created with value =
"sweep_at_$current_time + N".
3) Every second, a process queries Riak for secondary index values <=
"sweep_at_$current_time".
4) For every item returned, it reads all of its siblings:
 - if there are siblings, merge them into 1 record, then calculate and
write a new secondary index "sweep_at_$latest_sibling_time + N". Go to
the next substep if the newly calculated timeout value is <= current time.
 - if there are no siblings, process the record and remove the key from Riak.
Therefore, for every batch of N events, on average (given that 99% of
event batches span less than N seconds) there will be:
N+1 writes, 2 secondary index seeks, and 2 reads.
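
One way to read that tally, written out as a small Python function. The
function name and the breakdown in the comments are my own assumptions
about where each term comes from (merge on one sweep, processing on the
next, final delete not counted); it is not a measurement.

```python
def ops_per_batch(n_events):
    """Estimated Riak operations for one batch under the scheme (assumed breakdown)."""
    return {
        # One sibling-creating write per event, plus one write of the merged record.
        "writes": n_events + 1,
        # One seek finds the key with siblings, one finds the merged record.
        "index_seeks": 2,
        # One read to fetch the siblings, one read to fetch the merged record.
        "reads": 2,
    }
```

For a batch of 10 events this gives 11 writes, 2 index seeks and 2 reads.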

Is this the correct approach for Riak? It could be improved further by
carefully setting the secondary index at step 2 so that the merge of all
siblings is immediately followed by processing of the event batch,
but right now I am more interested in whether it fits nicely with Riak.

Thank you.

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
