Hi, I'm trying to implement custom state logic with KeyedProcessFunction, but I'm not sure how to get the correct watermark behavior when late data is involved.
According to this Stack Overflow answer: https://stackoverflow.com/questions/59468154/how-to-sort-an-out-of-order-event-time-stream-using-flink , there should be a state that buffers all events until the watermark passes the expected time, and an event-time trigger that then fetches the events from the state and does the calculation. The buffer type should be Map<T, List<E>>, where T is the timestamp and E is the event type. However, the interface Flink currently provides is only MapState<K, V>. If V is a List<E> holding all buffered events, then every time an event arrives Flink has to deserialize and re-serialize the list, which could be very expensive when throughput is high. I checked the built-in window implementation, which implements this watermark buffering. It seems that WindowOperator is built on some InternalStates whose signature uses the window as the namespace or key, if I understand correctly. But internal states are not available to Flink users. So my question is: is there an efficient way for Flink users to simulate watermark buffering with a process function? Thanks.
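P.S. To make the buffering semantics concrete, here is a minimal plain-Java sketch of the Map<T, List<E>> behavior I have in mind. It uses no Flink APIs at all; the class name WatermarkBuffer and its methods add and advanceWatermark are hypothetical names I made up for illustration, and treating an event as late when its timestamp is at or below the current watermark is my assumption about the desired behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of the buffering pattern (hypothetical names, no Flink APIs). */
class WatermarkBuffer<E> {
    // Events grouped by event timestamp; sorted so that flushing is a prefix scan.
    private final TreeMap<Long, List<E>> buffer = new TreeMap<>();
    private long currentWatermark = Long.MIN_VALUE;

    /** Buffers an event; returns false if it is late (timestamp <= current watermark). */
    boolean add(long timestamp, E event) {
        if (timestamp <= currentWatermark) {
            return false; // assumption: late events are rejected, not buffered
        }
        buffer.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(event);
        return true;
    }

    /** Advances the watermark and returns, in timestamp order, every event at or below it. */
    List<E> advanceWatermark(long watermark) {
        currentWatermark = Math.max(currentWatermark, watermark);
        List<E> ready = new ArrayList<>();
        // headMap(..., true) is a live view of all entries with key <= watermark.
        Map<Long, List<E>> due = buffer.headMap(currentWatermark, true);
        for (List<E> events : due.values()) {
            ready.addAll(events);
        }
        due.clear(); // clearing the view also removes these entries from the buffer
        return ready;
    }
}
```

In this model every add touches the whole list for its timestamp, which is exactly the cost I am worried about once the list lives in serialized keyed state.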