Hi all, We are currently running Flink in production and use keyBy to distribute a CPU-intensive computation. The computation does a cache lookup for a set of keys, and since keyBy cannot guarantee that the data for those keys is sent to a single node, we are effectively replicating the cache on every node. This is causing memory problems for us, and we would like to explore options to work around this limitation.
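To make the idea concrete, here is a rough sketch of the kind of routing we have in mind: instead of keying by the raw key, each key is first mapped to a cache group, and the stream is keyed by that group id, so a node would only need the cache entries for the groups it owns. This is plain Java with no Flink dependency; `CacheGroupRouter`, `groupOf`, and the group count are hypothetical names and values for illustration, and it assumes the cache itself can be partitioned by the same function.

```java
// Minimal sketch (hypothetical names): map each raw key to one of a fixed
// number of cache groups, so the group id could be used as the key instead
// of the raw key.
final class CacheGroupRouter {

    private CacheGroupRouter() {}

    /**
     * Returns a group id in [0, numGroups) for the given key.
     * Math.floorMod keeps the result non-negative even when hashCode() is negative.
     */
    static int groupOf(String key, int numGroups) {
        if (numGroups <= 0) {
            throw new IllegalArgumentException("numGroups must be positive");
        }
        return Math.floorMod(key.hashCode(), numGroups);
    }

    public static void main(String[] args) {
        int numGroups = 8; // assumption: chosen by us, e.g. a multiple of the parallelism
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            // The same key always lands in the same group.
            System.out.println(key + " -> group " + groupOf(key, numGroups));
        }
    }
}
```

In a Flink job the group id would presumably be returned from the `KeySelector` passed to `keyBy`, e.g. `stream.keyBy(e -> CacheGroupRouter.groupOf(e.getKey(), 8))`, where `getKey()` is a placeholder for however the event exposes its key. As far as we understand, Flink still hashes the group id internally to pick a subtask, so groups would not necessarily spread evenly across nodes, and this does nothing for the adaptive load balancing part.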
Is there a way to group a set of keys and send them to a specific set of nodes, so that we don't have to replicate the cache data on all nodes? Has anyone tried implementing hashing with adaptive load balancing, so that if a node is busy processing, the data can be routed to other nodes that are free? Any suggestions are greatly appreciated. Thanks.
