Hi Xiuming,
> However, I am not sure if this option meets my need - is it possible to
> obtain only the total time spent between the source and the sink, without
> the detailed time spent on each operator?
Within the framework, it is not possible to skip operators on the path. Just
like a `watermark`, a `LatencyMarker` is emitted at the source and flows
through every operator down to the sink.
However, if you only need the source-to-sink latency, you can read just the
metric whose `operator_id` in the metric path
(<operator_id>.<operator_subtask_index>.latency) is the sink's operator id.
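
In case it helps, here is a minimal sketch of turning latency tracking on from
code. It assumes Flink 1.13 and the documented configuration keys
`metrics.latency.interval` and `metrics.latency.granularity`; on a real
cluster you would usually set these in flink-conf.yaml instead:

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LatencyTrackingSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Emit a LatencyMarker from each source every 1000 ms;
        // the feature is disabled (interval 0) by default.
        conf.setString("metrics.latency.interval", "1000");
        // One histogram per source/operator pair; the alternatives are
        // "single" (cheapest) and "subtask" (most detailed, most expensive).
        conf.setString("metrics.latency.granularity", "operator");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);

        // ... build the Kafka -> ... -> Redis pipeline here ...

        env.execute("latency-tracking-sketch");
    }
}
```

The histogram whose operator_id matches the sink then gives the source-to-sink
distribution described above.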


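If you instead go with the custom source/sink metric you proposed earlier, the
sink side could look roughly like the sketch below. The event type `MyEvent`
and its `eventTimeMs` field are placeholders of mine, and the
DropwizardHistogramWrapper needs the flink-metrics-dropwizard dependency:

```java
import com.codahale.metrics.SlidingWindowReservoir;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper;
import org.apache.flink.metrics.Histogram;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

// Sketch of a sink that reports end-to-end latency as a histogram.
// MyEvent is a hypothetical event type carrying the source-side timestamp.
public class LatencyReportingSink extends RichSinkFunction<MyEvent> {

    private transient Histogram latencyMs;

    @Override
    public void open(Configuration parameters) {
        latencyMs = getRuntimeContext()
                .getMetricGroup()
                .histogram("endToEndLatencyMs",
                        new DropwizardHistogramWrapper(
                                new com.codahale.metrics.Histogram(
                                        new SlidingWindowReservoir(500))));
    }

    @Override
    public void invoke(MyEvent event, Context context) {
        // Step 3 of your proposal: end_time - start_time. This assumes the
        // clocks of the Kafka producers and the Flink cluster are in sync.
        latencyMs.update(System.currentTimeMillis() - event.eventTimeMs);
        // ... write the event to Redis here ...
    }
}
```

With a Prometheus reporter configured on the cluster, this histogram is
exposed for scraping, so no explicit push from the job is needed.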

On Fri, Jul 2, 2021 at 3:52 PM xm lian <lian...@gmail.com> wrote:

> Yes, as you suggested, the granularity of these distributions can be
> controlled in the Flink configuration - metrics.latency.granularity
>
> However, I am not sure if this option meets my need - is it possible to
> obtain only the total time spent between the source and the sink, without
> the detailed time spent on each operator?
>
> On Fri, Jul 2, 2021 at 11:04 AM JING ZHANG <beyond1...@gmail.com> wrote:
>
>> Hi Xiuming,
>> +1 on your idea.
>> BTW, Flink also provides a debug tool to track the latency of records
>> travelling through the system [1]. But you should note the following
>> issues if you enable latency tracking:
>> (1) It is intended for debugging purposes, because enabling latency
>> metrics can significantly impact performance (even more so with subtask
>> granularity).
>> (2) Latency tracking metrics only partially reflect the true latency;
>> please see the detailed explanation in the documentation [1].
>> (3) This feature is disabled by default.
>>
>> So your proposal to add a metric in the source and the sink sounds better.
>>
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#latency-tracking
>>
>> On Thu, Jul 1, 2021 at 2:06 PM xm lian <lian...@gmail.com> wrote:
>>
>>> Hello community,
>>>
>>> I would like to know how long it takes for an event to flow through the
>>> whole Flink pipeline, which consumes from Kafka and sinks to Redis.
>>>
>>> My current idea is, for each event:
>>>
>>> 1. capture a start_time in the source (the timestamp field of the
>>> [metadata](
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/kafka.html#available-metadata)
>>> or Flink PROCTIME)
>>> 2. capture an end_time in the sink (System.currentTimeMillis or Flink
>>> PROCTIME)
>>> 3. push (end_time - start_time) to Prometheus
>>>
>>> I wonder whether Flink provides a better, more native way to calculate
>>> the time an event spends inside Flink?
>>>
>>> Thanks!
>>>
>>> Best,
>>> Xiuming
>>>
>>
