Hi, from what I read, I get the impression that you are trying to implement your own "keyed state" with a hash map. Why not use the keyed state that Flink already provides, which gives you efficient rescaling etc. out of the box? Please see [1] for the details.
[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/state.html#using-managed-keyed-state

> On 20.02.2018 at 13:44, gerardg <ger...@talaia.io> wrote:
>
> Hello,
>
> To improve performance we have "keyed state" in the operator's memory,
> basically we keep a Map which contains the state per each of the keys. The
> problem comes when we want to restore the state after a failure or after
> rescaling the operator. What we are doing is sending the concatenation of
> all the state to every operator using an union redistribution and then we
> restore the "in memory state" every time we see a new key. Then, after a
> while, we just clear the redistributed state. This is somewhat complex and
> prone to errors so we would like to find an alternative way of doing this.
>
> As far as I know Flink knows which keys belong to each operator
> (distributing key groups) so I guess it would be possible to somehow
> calculate the key id from each of the stored keys and restore the in memory
> state at once if we could access to the key groups mapping. Is that
> possible? We could patch Flink if necessary to access that information.
>
> Thanks,
>
> Gerard
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
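
On the key-group question in the quoted message: the idea can be sketched in a few lines of plain Java. This is only an illustration of the arithmetic, not Flink code. As far as I know, the real logic lives in the internal class org.apache.flink.runtime.state.KeyGroupRangeAssignment and uses Flink's own murmur hash; the mix() function below is a stand-in, so the numbers it produces should not be expected to match what Flink actually assigns. The class name KeyGroupSketch and all method names are made up for this example.

```java
// Hypothetical sketch of how a key maps to a key group and then to a parallel
// subtask. Not Flink code: mix() is a stand-in for Flink's internal hash.
class KeyGroupSketch {

    // Stand-in bit mixer (assumption): spreads hashCode() bits before the modulo.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // A key always lands in the same key group, independent of the parallelism.
    public static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(mix(key.hashCode()), maxParallelism);
    }

    // Key groups are handed out to subtasks in contiguous ranges.
    public static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    // True if the given subtask would be responsible for this key.
    public static boolean ownsKey(Object key, int maxParallelism, int parallelism, int subtaskIndex) {
        int keyGroup = keyGroupFor(key, maxParallelism);
        return operatorIndexFor(keyGroup, maxParallelism, parallelism) == subtaskIndex;
    }

    public static void main(String[] args) {
        int maxParallelism = 128;
        int parallelism = 4;
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            int keyGroup = keyGroupFor(key, maxParallelism);
            int subtask = operatorIndexFor(keyGroup, maxParallelism, parallelism);
            System.out.println(key + " -> key group " + keyGroup + ", subtask " + subtask);
        }
    }
}
```

With a check like ownsKey(), a restored union of all entries could in principle be filtered down to the keys a subtask is responsible for. Since this relies on internals, the managed keyed state from [1] remains the simpler route.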