Thanks Stephan, this should work for now :) You are right, latency is quite tricky. I don't have any better ideas either, but I will definitely let you know if I come up with any.
Gyula

Stephan Ewen <se...@apache.org> wrote (on Sat, Nov 7, 2015, 21:58):

> You probably need to calculate the throughput yourself at this point, from
> the accumulated number of records. You can periodically poll the following
> URLs via HTTP GET:
>
> - /jobs/<jobid> : This gives you the aggregate number of records / bytes
>   per JobVertex
> - /jobs/<jobid>/vertices/<vertexid> : This gives you accumulated records /
>   bytes for subtasks
>
> There is no latency metric right now. Latency is quite tricky to assess,
> in general. It needs timestamps attached at the sources and measured at
> the sinks. So far, no problem, but this assumes that source and sink
> clocks are closely in sync. If they are off by a few milliseconds, then
> measurements of low latencies are already quite inaccurate. We may decide
> to accept that inaccuracy, or try to correct it a bit by letting the
> JobManager broadcast its clock offsets and TaskManagers offset theirs.
>
> For experiments, we wrote special jobs where we could sample the records
> that return to the same JVM after two re-partitionings, so we would not
> have clock misalignment. Still thinking about good ways to have a
> general-purpose latency measurement mechanism.
>
> If you have any ideas there, let me know!
>
> Greetings,
> Stephan
>
>
> On Sat, Nov 7, 2015 at 7:39 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hey guys,
> >
> > I am trying to look at the throughput of my Flink Streaming job over time.
> > Is there any way to extract this information from the dashboard, or is it
> > only possible to view the cumulative statistics at given time points?
> >
> > Also, I am wondering whether there is any info about the latency in the
> > metrics somewhere.
> >
> > Cheers,
> > Gyula
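
For reference, the polling approach Stephan describes (derive throughput from two readings of the accumulated record count) could be sketched roughly like this. This is only a minimal sketch under assumptions: the thread does not show the layout of the JSON response, so `extract_count` is a hypothetical caller-supplied function that pulls the accumulated record count out of the parsed response, and the host and port in the usage comment are placeholders.

```python
import json
import time
import urllib.request
from typing import Callable, Iterator, List, Tuple

# A sample is (timestamp in seconds, accumulated number of records).
Sample = Tuple[float, int]


def throughput(prev: Sample, curr: Sample) -> float:
    """Records per second between two accumulated-count samples."""
    (t0, n0), (t1, n1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (n1 - n0) / (t1 - t0)


def rates(samples: List[Sample]) -> List[float]:
    """Throughput for each consecutive pair of samples."""
    return [throughput(a, b) for a, b in zip(samples, samples[1:])]


def fetch_json(url: str) -> dict:
    """HTTP GET the URL and parse the body as JSON."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def poll_throughput(
    url: str,
    extract_count: Callable[[dict], int],  # hypothetical: depends on the JSON layout
    interval_s: float = 5.0,
) -> Iterator[float]:
    """Poll the URL forever, yielding records/second between successive polls."""
    prev = None
    while True:
        sample = (time.monotonic(), extract_count(fetch_json(url)))
        if prev is not None:
            yield throughput(prev, sample)
        prev = sample
        time.sleep(interval_s)


# Usage (placeholder host, port and job id; my_extract_count is hypothetical):
# for r in poll_throughput("http://localhost:8081/jobs/<jobid>", my_extract_count):
#     print(f"{r:.1f} records/s")
```

The rate computation is kept separate from the HTTP polling so it can also be applied to counts that were logged earlier. Note that each value is an average over one polling interval, so a shorter interval gives a finer but noisier curve.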