Hi Averell,

> On Sep 27, 2018, at 3:09 PM, Averell <lvhu...@gmail.com> wrote:
>
> Hi Kostas,
>
> Yes, I want them as metrics, as they are purely for monitoring purpose.
> There's no need of fault tolerance.
>
> If I use side-output, for example for that metric no.1, I would need a
> tumbling AllWindowFunction, which, as I understand, would introduce some
> delay to both the normal processing flow, and to the checkpoint process.
>
Side-output may introduce all that, but you can always do something like:

    myMainStream = …
    myMainStream.myMainComputation…
    myMainStream.windowAll().myMonitoringComputation…

The monitoring window is a separate branch off the same stream, so it does not affect the main path of your computation (if this is your only concern).

> I already tried to follow the referencing web page that you sent. However, I
> could not know how to have what I want.
> For example, with metrics no.1 - meter: org.apache.flink.metrics.Meter only
> provides markEvent(), which marks an event to that Meter. There is no option
> to provide the event_time, and processing_time is always used. So my graph
> is spread over time like the one below.
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1586/Meter2.png>
>

Event time is a notion of Flink and a property of your data (the timestamp). Metric systems like Prometheus take whatever you expose as a metric and attach a timestamp based on the current wall-clock time; for them, the time an event occurred is the time at which they received that metric. So, if you want event-time computations, those should be done in Flink.

> For metrics no.2 - histogram: What I can see at Prometheus is the calculated
> percentile values (0.5, 0.75, 0.9, 0.99, 0.999), which tells me, for
> example: 99% the total number of records had ts1-ts2 <= 350s (which looks
> more like a rolling average). But it doesn't tell me roughly how many % of
> record have diff of 250ms, how many of 260ms, etc...
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1586/Histo2.png>
>

For this case, the two notions of histogram (yours and Prometheus’) are not aligned. What you could do is keep each “bucket” of your histogram as a separate metric and expose it as such to Prometheus. Essentially, you are the one creating the histogram; metric systems do not perform any (complex) computation for you.
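To make the "one metric per bucket" idea concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: the class name `BucketedHistogram`, the fixed 10 ms bucket width, the overflow bucket, and the `diff_ms_…` metric names are all hypothetical choices, not something Flink or Prometheus prescribes. It deliberately has no Flink dependency so that it runs on its own.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical helper: one counter per fixed-width bucket, so that each
// bucket can be exposed to the metric system as its own metric.
final class BucketedHistogram {
    private final long bucketWidthMs;      // assumed fixed bucket width
    private final AtomicLongArray counts;  // last bucket catches all overflow

    BucketedHistogram(long bucketWidthMs, int numBuckets) {
        if (bucketWidthMs <= 0 || numBuckets <= 0) {
            throw new IllegalArgumentException("width and bucket count must be positive");
        }
        this.bucketWidthMs = bucketWidthMs;
        this.counts = new AtomicLongArray(numBuckets);
    }

    // Record one observed difference (e.g. ts1 - ts2) in milliseconds.
    // Negative values are clamped into the first bucket.
    void record(long diffMs) {
        long idx = Math.max(0L, diffMs) / bucketWidthMs;
        int bucket = (int) Math.min(idx, counts.length() - 1);
        counts.incrementAndGet(bucket);
    }

    long count(int bucket) {
        return counts.get(bucket);
    }

    int numBuckets() {
        return counts.length();
    }

    // Illustrative metric name for a bucket; the last bucket is open-ended.
    String metricName(int bucket) {
        long lower = bucket * bucketWidthMs;
        if (bucket == counts.length() - 1) {
            return "diff_ms_" + lower + "_inf";
        }
        return "diff_ms_" + lower + "_" + (lower + bucketWidthMs);
    }

    public static void main(String[] args) {
        BucketedHistogram h = new BucketedHistogram(10, 50);
        for (long d : new long[] {251, 255, 263, 9999}) {
            h.record(d);
        }
        // Print only the non-empty buckets, one "metric" per line.
        for (int b = 0; b < h.numBuckets(); b++) {
            if (h.count(b) > 0) {
                System.out.println(h.metricName(b) + " = " + h.count(b));
            }
        }
    }
}
```

Inside a Flink job, one possible wiring would be to create such an object in the `open()` method of a rich function and register each bucket as a gauge through `getRuntimeContext().getMetricGroup().gauge(name, gauge)`, calling `record(...)` per element. Note that these counts are still reported against wall-clock time; bucketing by event time would have to happen in a window inside Flink.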
Once again, I would say that these exploratory metrics about your stream fall more under the category of analytics about your input data, rather than “metrics”, but of course feel free to disagree :)

Cheers,
Kostas

> Thanks and regards,
> Averell
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/