This sounds like you have some per-key state to keep track of, so the
'correct' way to do it would be to keyBy the guid. I believe that if you
run your job using the RocksDB state backend you will not OOM regardless
of the number of GUIDs that are eventually tracked, since that backend
keeps keyed state on local disk rather than on the JVM heap. Whether
Flink/stream processing is the most effective way to achieve your goal, I
can't say, but I am fairly confident that this particular aspect is not a
problem.
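
To make the per-key idea concrete, here is a rough sketch in plain Java
(no Flink dependency) of the logic each key would run. The class and
method names are made up for illustration, and the HashMap only stands in
for Flink's keyed state; in a real job that state would be scoped to the
guid by keyBy and held by the state backend, not in a map on the heap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: remembers the last version id per guid.
public class VersionLinker {
    // Stand-in for keyed state: one "last seen version id" per guid.
    private final Map<String, String> lastVersionByGuid = new HashMap<>();

    /**
     * Records versionId as the latest version for guid and returns the id of
     * the previously recorded version, or null if this guid is new.
     * Versions are linked in arrival order; reordering is not handled here.
     */
    public String link(String guid, String versionId) {
        return lastVersionByGuid.put(guid, versionId);
    }

    public static void main(String[] args) {
        VersionLinker linker = new VersionLinker();
        System.out.println(linker.link("guid-a", "v1")); // first version, no predecessor
        System.out.println(linker.link("guid-a", "v2")); // predecessor is v1
        System.out.println(linker.link("guid-b", "v1")); // different guid, independent state
    }
}
```

The point is only that each guid needs a single small value (the last
version id), so what you store grows with the number of guids, not with
the number of versions.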

On Sat, Apr 23, 2016 at 1:13 AM, Chen Bekor <chen.be...@gmail.com> wrote:

> hi all,
>
> I have a stream of incoming object versions (objects change over time) and
> a requirement to fetch from a datastore the last known object version in
> order to link it with the id of the new version,  so that I end up with a
> linked list of object versions.
>
> all object versions contain the same guid, so I was thinking about using
> flink streaming in order to assure ordering and avoid concurrency / race
> conditions in the linkage process (object version might arrive unordered or
> may arrive at spikes)
>
> if I use the object guid as a key for a keyed stream I am concerned I will
> end up with millions of windowed streams hence causing OOM.
>
> what do you think should be the right approach? do you think flink is the
> right technology for this task?
>
