Hi Piotr,

thanks for the proposal! I think this would be a very valuable addition to Flink, as it would simplify operations a lot (disclaimer: we already use it in our internal Flink version).

I have a couple of remarks regarding the FLIP:

1. Should it list the events that would be emitted in the first version? If not as part of the proposed change, then maybe as examples.

2. Interface Event - Scope
Its scope is explicitly tied to a Class, which limits flexibility. I can imagine alternative usages of scope, for example scope="job-lifecycle" or scope="task-manager". It also adds a barrier to renaming or moving classes. So I'd remove the mention of Class.

3. Interface Event - Body
Can you explain the purpose of this property? In my opinion, most information about an event can be stored in its attributes, except when it is used by Flink itself or is very common.

Regards,
Roman

On Thu, Nov 7, 2024 at 2:35 PM Piotr Nowojski <piotr.nowoj...@gmail.com> wrote:

> Hi all!
>
> I would like to open up for the discussion a new FLIP [1]
>
> Motivation
>
> Currently, Flink observability has support for Metrics and Traces. We
> suggest enhancing the observability capabilities by adding support for
> Events. This can be used to track the most important events that happen in
> Flink, e.g. completed checkpoints or changes to the job’s state. Storing
> such information, e.g. in an event log or a database, would allow us to
> create a history of those events which can be used for audit purposes or to
> derive important information like, for example, job uptime/downtime or
> violations of required checkpoint completion times.
>
> For more information please look into the FLIP [1].
>
> I'm looking forward to your thoughts on this.
>
> Best,
> Piotrek
>
> [1] https://cwiki.apache.org/confluence/x/3IyMEw