Hi folks, I’m trying to write a Flink job that computes a number of counters, which seems to require custom triggers, and I was trying to figure out the best way to express that.
The query looks something like this:

    SELECT
      userId,
      customUDAF(...) AS counter1,
      customUDAF(...) AS counter2,
      ...
    FROM (SELECT * FROM my_kafka_stream)
    GROUP BY
      userId,
      HOP(`timestamp`, INTERVAL '6' HOUR, INTERVAL '7' DAY)

We sink the results to a KV store (memcache / Couchbase) for persistence. Some of these counters span a fairly wide time window (the longest is 7 days), and to keep the state tractable we need a fairly large slide interval (6 hours or greater). However, some of our users require counters to be updated much more frequently (e.g. every minute), so we were looking at how to achieve that with the Table / SQL API.

I see that this is possible using the custom triggers <https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/windows.html#triggers> support if we were to use the DataStream API, but I’m wondering if this is possible using the Table / SQL APIs. I did see another thread where Fabian brought up this design doc <https://docs.google.com/document/d/1wrla8mF_mmq-NW9sdJHYVgMyZsgCmHumJJ5f5WUzTiM/edit?ts=59816a8b#heading=h.1zg4jlmqwzlr>, which lists what support for emit triggers would look like in various streaming platforms. Is this something that is being actively worked on? If not, any suggestions on how we could get the ball rolling on this? (Google doc design + JIRA?)

Thanks,
-- Piyush
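
P.S. For concreteness, here is roughly what we imagine the DataStream version with early firings would look like. This is a minimal sketch, not our actual job: the Event POJO, the bounded sample source, and the CounterAggregate stand-in for our custom UDAF are all illustrative names, not anything from the real pipeline.

import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger;

public class EarlyFiringCountersJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        // Stand-in source; the real job reads from Kafka.
        DataStream<Event> events = env
                .fromElements(new Event("u1", 1_000L), new Event("u1", 2_000L))
                .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Event>() {
                    @Override
                    public long extractAscendingTimestamp(Event e) {
                        return e.timestamp;
                    }
                });

        events
                .keyBy(e -> e.userId)
                // 7-day window sliding every 6 hours, same as the HOP() above.
                .window(SlidingEventTimeWindows.of(Time.days(7), Time.hours(6)))
                // Fire an early result every minute of event time. The trigger
                // FIREs without purging, so each firing emits an updated partial
                // aggregate that can be upserted into the KV store.
                .trigger(ContinuousEventTimeTrigger.of(Time.minutes(1)))
                .aggregate(new CounterAggregate())
                .print();

        env.execute("early-firing-counters");
    }

    // Illustrative event type.
    public static class Event {
        public String userId;
        public long timestamp;

        public Event() {}

        public Event(String userId, long timestamp) {
            this.userId = userId;
            this.timestamp = timestamp;
        }
    }

    // Stand-in for the custom UDAF: a plain per-user event count.
    public static class CounterAggregate implements AggregateFunction<Event, Long, Long> {
        @Override public Long createAccumulator() { return 0L; }
        @Override public Long add(Event e, Long acc) { return acc + 1; }
        @Override public Long getResult(Long acc) { return acc; }
        @Override public Long merge(Long a, Long b) { return a + b; }
    }
}

(If the once-a-minute updates should follow wall-clock time rather than the watermark, ContinuousProcessingTimeTrigger would be the processing-time counterpart.) What we'd like is the equivalent of this early-firing behavior without leaving the Table / SQL API.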