Hi Li,

Thanks for the positive feedback on the FLIP! I appreciate you taking the time. Glad you see the potential, especially for autonomous operations - it's a key driver.
Your question about the Kafka reporter is a great one; it will be a vital implementation. V1's core goal is to establish the right foundation so that reporters like this can be built safely later. Here is how V1 enables that:

1. EventsReporter interface: the main extension point. A Kafka reporter implements open, reportEvent, and close.

2. Async dispatch: a local EventDispatchService decouples reporters from the JobManager via its own queue and thread. Flink hands events off with a non-blocking offer(), so a slow reporter cannot block the JobManager. If the queue is full, V1 logs and drops the event, prioritizing stability.

3. Config and data: reporters follow the standard flink-conf.yaml configuration pattern, and V1 defines the FlinkOperationalEvent (JSON) payload for consistency.

In short, V1 provides the stable framework so that future contributors can focus on the Kafka-specific logic. This keeps additions safe and supports our AIOps goals. Building a production-grade Kafka reporter will still involve real design choices (producer settings, backpressure handling, schemas); I've added a couple of rough, illustrative sketches in a P.S. below to make the discussion more concrete.

With that in mind, I'd be interested to hear the community's thoughts: what are the critical features and challenges for a KafkaEventsReporter? For example, how should we trade off delivery guarantees against stability? Getting these perspectives now is valuable, and based on the feedback I'm happy to update the FLIP to capture these considerations for V2+.

Thanks again for the discussion!

Best regards,
Kartikey Pant
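
P.S. For concreteness, here is a rough, untested sketch of what a Kafka reporter built on the V1 extension point could look like. The EventsReporter shape below is a simplified stand-in for the FLIP's actual interface, and all class names, configuration keys, and producer settings are illustrative assumptions rather than part of the proposal:

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

/** Simplified stand-in for the V1 extension point described above. */
interface EventsReporter {
    void open(Properties config);
    void reportEvent(String eventJson); // FlinkOperationalEvent serialized as JSON
    void close();
}

/** Hypothetical Kafka-backed reporter; fire-and-forget to stay non-blocking. */
class KafkaEventsReporter implements EventsReporter {
    private KafkaProducer<String, String> producer;
    private String topic;

    @Override
    public void open(Properties config) {
        // Illustrative config keys; the real key names would be defined in V2+.
        this.topic = config.getProperty("topic", "flink-operational-events");

        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                  config.getProperty("bootstrap.servers", "localhost:9092"));
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Favor stability over strict delivery: bound how long send() may block on
        // metadata/buffer space so a broker outage cannot stall the dispatch thread.
        props.put(ProducerConfig.ACKS_CONFIG, "1");
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "1000");
        this.producer = new KafkaProducer<>(props);
    }

    @Override
    public void reportEvent(String eventJson) {
        try {
            // Async send; the callback only logs failures, so a slow or unavailable
            // broker never propagates an error back to the caller.
            producer.send(new ProducerRecord<>(topic, eventJson), (metadata, error) -> {
                if (error != null) {
                    System.err.println("Dropping event, Kafka send failed: " + error.getMessage());
                }
            });
        } catch (Exception e) {
            System.err.println("Dropping event, send() rejected: " + e.getMessage());
        }
    }

    @Override
    public void close() {
        if (producer != null) {
            producer.close(Duration.ofSeconds(5));
        }
    }
}

This deliberately picks "drop on failure" over "retry until delivered"; that is exactly the delivery-guarantees-vs-stability trade-off I'd like community input on.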
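
And a second sketch of the drop-when-full dispatch behavior described in point 2, reusing the EventsReporter stand-in from the sketch above. Again, names and queue sizes are made up for illustration, not taken from the FLIP:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical local dispatch: bounded queue, single drain thread, non-blocking publish. */
class EventDispatchService implements AutoCloseable {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
    private final EventsReporter reporter;
    private final Thread worker;
    private volatile boolean running = true;

    EventDispatchService(EventsReporter reporter) {
        this.reporter = reporter;
        this.worker = new Thread(this::drainLoop, "events-dispatch");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Called from JobManager-side code paths; never blocks the caller. */
    void publish(String eventJson) {
        if (!queue.offer(eventJson)) {
            // Queue full: log and drop rather than back-pressuring the JobManager.
            System.err.println("Event queue full, dropping event");
        }
    }

    private void drainLoop() {
        try {
            while (running || !queue.isEmpty()) {
                String event = queue.poll(100, TimeUnit.MILLISECONDS);
                if (event != null) {
                    reporter.reportEvent(event);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() throws InterruptedException {
        running = false;
        worker.join(1000);
        reporter.close();
    }
}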