Hi All, I have a couple of questions regarding state maintenance in Flink.
- I have a connected stream, then a keyBy operator followed by a flatMap function. I use MapState; keys get added by data from stream1 and removed by messages from stream2. Stream2 acts as a control stream in my pipeline. My question is: when the keys are removed, will the state in RocksDB also be removed? How does RocksDB get the most recent state?
- Can I use a Guava cache in MapState, like MapState<String, Cache<String, String>>? Do I have to write a serializer to persist data from the Guava cache?
- One of my downstream operators requires keyed state because I need to query the state value, but it also has two huge state values that are basically the same across all parallel operator instances. Initially I used operator state and checkpointed it only in the operator instance at index 0; the other instances would not checkpoint the same data. How can I achieve this with keyed state? Each operator will have around 10GB of the same data. I am not sure whether this will be a problem in the future.

Thanks,
Navneeth