Thanks Stephan, this should work for now :)

You are right, latency is quite tricky. I don't have any better ideas
either, but I will definitely let you know if I come up with any.

Gyula

On Sat, Nov 7, 2015 at 9:58 PM, Stephan Ewen <se...@apache.org> wrote:

> You probably need to calculate the throughput yourself at this point, from
> the accumulated number of records. You can periodically poll the following
> URLs via HTTP GET (a rough sketch of the polling follows below the list):
>
>  - /jobs/<jobid> : This gives you the aggregate number of records / bytes
> per JobVertex
>  - /jobs/<jobid>/vertices/<vertexid> : This gives you accumulated records /
> bytes for subtasks
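>
> Something along these lines could work for the polling (rough, untested
> sketch; the class and method names are made up, and extractRecordsRead()
> is just a placeholder, because the exact JSON layout depends on the Flink
> version, so check the actual response first):
>
> import java.io.BufferedReader;
> import java.io.InputStreamReader;
> import java.net.HttpURLConnection;
> import java.net.URL;
>
> public class ThroughputPoller {
>
>     private final String url;   // e.g. "http://<jobmanager>:8081/jobs/<jobid>"
>     private long lastCount = -1;
>     private long lastTimeMillis = -1;
>
>     public ThroughputPoller(String url) {
>         this.url = url;
>     }
>
>     /** Returns records/sec since the previous poll, or -1 on the first call. */
>     public double poll() throws Exception {
>         String json = httpGet(url);
>         long count = extractRecordsRead(json);   // placeholder, see below
>         long now = System.currentTimeMillis();
>
>         double rate = -1;
>         if (lastCount >= 0) {
>             rate = (count - lastCount) * 1000.0 / (now - lastTimeMillis);
>         }
>         lastCount = count;
>         lastTimeMillis = now;
>         return rate;
>     }
>
>     private static String httpGet(String urlString) throws Exception {
>         HttpURLConnection conn =
>                 (HttpURLConnection) new URL(urlString).openConnection();
>         conn.setRequestMethod("GET");
>         StringBuilder sb = new StringBuilder();
>         try (BufferedReader in = new BufferedReader(
>                 new InputStreamReader(conn.getInputStream()))) {
>             String line;
>             while ((line = in.readLine()) != null) {
>                 sb.append(line);
>             }
>         }
>         return sb.toString();
>     }
>
>     private static long extractRecordsRead(String json) {
>         // Pull the accumulated record count out of the JSON with whatever
>         // JSON library you like; the field layout is version-dependent.
>         throw new UnsupportedOperationException("fill in JSON parsing");
>     }
> }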
>
> There is no latency metric right now. Latency is quite tricky to assess in
> general: it needs timestamps attached to records at the sources and compared
> against the clock at the sinks. So far, no problem, but this assumes that
> the source and sink clocks are closely in sync. If they are off by a few
> milliseconds, the low latencies are already quite distorted. We may decide
> to accept that inaccuracy, or try to correct it a bit by letting the
> JobManager broadcast its clock offsets and having the TaskManagers offset
> theirs accordingly.
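>
> If you want a rough number anyway, a minimal sketch of the idea (the
> Timestamped wrapper is hypothetical, not something that exists in Flink)
> would be to carry the source's wall-clock time with each record and compare
> it against the sink's clock, with the caveat above about clock skew:
>
> // Wrap each record with the wall-clock time at which the source emitted it.
> class Timestamped<T> implements java.io.Serializable {
>     public T value;
>     public long emittedAtMillis;   // System.currentTimeMillis() at the source
> }
>
> // At the source: wrap the record and set emittedAtMillis = System.currentTimeMillis()
> // At the sink:   latencyMillis = System.currentTimeMillis() - record.emittedAtMillis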
>
> For experiments, we wrote special jobs where we could sample records that,
> after two re-partitionings, return to the same JVM, so that we would not
> have clock misalignment. We are still thinking about good ways to build a
> general-purpose latency measurement mechanism.
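>
> Very roughly, the trick looks like this (not the actual experiment code,
> just an illustrative sketch with made-up class names): stamp each record
> with the emitting subtask's index and clock, send it through the
> re-partitionings, and only measure latency for records that come back to a
> subtask with the same index, so that both timestamps come from the same
> clock:
>
> import org.apache.flink.api.common.functions.RichMapFunction;
> import org.apache.flink.configuration.Configuration;
>
> // The Timestamped wrapper from above, plus the index of the emitting subtask.
> class Stamped<T> implements java.io.Serializable {
>     public T value;
>     public int originSubtask;
>     public long emittedAtMillis;
> }
>
> // At the front of the pipeline: remember which subtask emitted the record and when.
> class Stamper<T> extends RichMapFunction<T, Stamped<T>> {
>
>     private transient int subtaskIndex;
>
>     @Override
>     public void open(Configuration parameters) {
>         subtaskIndex = getRuntimeContext().getIndexOfThisSubtask();
>     }
>
>     @Override
>     public Stamped<T> map(T value) {
>         Stamped<T> s = new Stamped<T>();
>         s.value = value;
>         s.originSubtask = subtaskIndex;
>         s.emittedAtMillis = System.currentTimeMillis();
>         return s;
>     }
> }
>
> // At the end of the pipeline (after the re-partitionings): only measure
> // latency for records whose origin subtask index matches the current one,
> // so both timestamps were taken on the same clock.
> class SameClockSampler<T> extends RichMapFunction<Stamped<T>, Stamped<T>> {
>
>     private transient int subtaskIndex;
>
>     @Override
>     public void open(Configuration parameters) {
>         subtaskIndex = getRuntimeContext().getIndexOfThisSubtask();
>     }
>
>     @Override
>     public Stamped<T> map(Stamped<T> record) {
>         if (record.originSubtask == subtaskIndex) {
>             long latencyMillis = System.currentTimeMillis() - record.emittedAtMillis;
>             // report latencyMillis somewhere, e.g. log it or add it to an accumulator
>         }
>         return record;
>     }
> }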
>
> If you have any ideas there, let me know!
>
> Greetings,
> Stephan
>
>
> On Sat, Nov 7, 2015 at 7:39 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hey guys,
> >
> > I am trying to look at the throughput of my Flink Streaming job over
> > time. Is there any way to extract this information from the dashboard, or
> > is it only possible to view the cumulative statistics at given points in
> > time?
> >
> > Also I am wondering whether there is any info about the latency in the
> > metrics somewhere.
> >
> > Cheers,
> > Gyula
> >
>
