Hi All,

We have a Flink streaming job processing around 200k events per second. The job requires a lot of infrequently changing data (essentially static, but with some changes over time, say a 5% change once per day or so). There are about 12 caches, some containing approximately 20k entries and a few with about 2 million entries.
In the current implementation we are using an in-memory, lazily loaded static cache to populate the data, with initialization happening in the open() function. We chose this approach because we have allocated around 4GB of extra memory per TM for these caches, and if a TM has 6 slots the cache can be shared across them. The issue with this approach is that every time a container is restarted or a new job is deployed, the cache has to be populated again. This lazy loading sometimes takes a while and causes back pressure as well.

We were thinking of moving this logic to a broadcast stream, but since the data would have to be stored per slot it would increase memory consumption by a lot. Another option we considered is replacing the current near/far cache, which uses a REST API to load the data, with a Redis-based near/far cache. This would definitely reduce the overall loading time, but it is still not a perfect solution.

Are there any recommendations on how this can be achieved effectively? Also, how is everyone else overcoming this problem?

Thanks,
Navneeth
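P.S. For reference, the per-TM sharing pattern we rely on is roughly the following: a static, lazily initialized map that all slots (threads) in one TaskManager JVM reuse, so only the first task to call it pays the load cost. This is a minimal, self-contained sketch with hypothetical names; in the real job loadAll() goes through a REST API and there are 12 such caches.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-JVM (i.e. per-TaskManager) lazily loaded cache.
// All slots in the TM see the same static map, so 6 slots share one copy
// instead of holding 6 copies, at the price of reloading on every restart.
public final class ReferenceDataCache {
    private static volatile Map<String, String> cache;

    private ReferenceDataCache() {}

    // Called from each operator's open(); double-checked locking ensures
    // only the first caller triggers the (slow) bulk load.
    public static Map<String, String> get() {
        Map<String, String> local = cache;
        if (local == null) {
            synchronized (ReferenceDataCache.class) {
                if (cache == null) {
                    cache = loadAll();
                }
                local = cache;
            }
        }
        return local;
    }

    // Placeholder for the real bulk load (REST API / Redis in the real job).
    private static Map<String, String> loadAll() {
        Map<String, String> m = new ConcurrentHashMap<>();
        m.put("key-1", "value-1");
        return m;
    }
}
```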