Hi Felipe,

Please find the answers to your questions below.

> Each "operator_subtask_index" means each instance of the parallel
> physical operator, doesn't it?
Yes.
> How can I set a fixed ID for the "operator_id" in my code so I can
> identify quickly which operator I am measuring?
You are using the correct API: uid(...).
> What is the hash function used so I can identify my operator?
Flink uses Guava's murmur3_128 hash function:
https://guava.dev/releases/18.0/api/docs/com/google/common/hash/Hashing.html#murmur3_128(int)
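
A minimal sketch of how the hashed ID might be reproduced offline, so that a
uid string can be matched to the "operator_id" label. It rests on assumptions
not stated in this thread: that the UTF-8 bytes of the uid are hashed with
seed 0, and that the 16 hash bytes are reported as lowercase hex. The class
OperatorIdSketch and the method operatorIdFromUid are hypothetical names, and
the Murmur3 x64 128-bit routine is written out by hand so the snippet has no
dependencies. Compare its output with the ID shown by the REST API before
relying on it.

```java
import java.nio.charset.StandardCharsets;

/** Sketch only: 128-bit Murmur3 (x64 variant) of a uid string, printed as hex. */
final class OperatorIdSketch {
    private static final long C1 = 0x87c37b91114253d5L;
    private static final long C2 = 0x4cf5ad432745937fL;

    /** Assumption: UTF-8 bytes, seed 0, hash bytes rendered as lowercase hex. */
    static String operatorIdFromUid(String uid) {
        byte[] data = uid.getBytes(StandardCharsets.UTF_8);
        long h1 = 0L;                       // seed 0 (assumed)
        long h2 = 0L;
        int blocks = data.length / 16;
        for (int i = 0; i < blocks; i++) {  // full 16-byte blocks
            long k1 = readLongLE(data, i * 16, 8);
            long k2 = readLongLE(data, i * 16 + 8, 8);
            h1 ^= mixK1(k1);
            h1 = Long.rotateLeft(h1, 27) + h2;
            h1 = h1 * 5 + 0x52dce729L;
            h2 ^= mixK2(k2);
            h2 = Long.rotateLeft(h2, 31) + h1;
            h2 = h2 * 5 + 0x38495ab5L;
        }
        int tail = blocks * 16;             // remaining 0..15 bytes
        int rem = data.length - tail;
        if (rem > 8) {
            h2 ^= mixK2(readLongLE(data, tail + 8, rem - 8));
        }
        if (rem > 0) {
            h1 ^= mixK1(readLongLE(data, tail, Math.min(rem, 8)));
        }
        h1 ^= data.length;                  // finalization
        h2 ^= data.length;
        h1 += h2;
        h2 += h1;
        h1 = fmix64(h1);
        h2 = fmix64(h2);
        h1 += h2;
        h2 += h1;
        return toHexLE(h1) + toHexLE(h2);   // h1 then h2, little-endian bytes
    }

    private static long mixK1(long k) {
        k *= C1;
        k = Long.rotateLeft(k, 31);
        return k * C2;
    }

    private static long mixK2(long k) {
        k *= C2;
        k = Long.rotateLeft(k, 33);
        return k * C1;
    }

    private static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    /** Reads n (1..8) bytes starting at off as a little-endian long. */
    private static long readLongLE(byte[] data, int off, int n) {
        long v = 0L;
        for (int i = 0; i < n; i++) {
            v |= (data[off + i] & 0xffL) << (8 * i);
        }
        return v;
    }

    /** Prints the 8 bytes of v, least significant byte first, as hex. */
    private static String toHexLE(long v) {
        StringBuilder sb = new StringBuilder(16);
        for (int i = 0; i < 8; i++) {
            sb.append(String.format("%02x", (v >>> (8 * i)) & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the candidate operator_id for the uid used in the question.
        System.out.println(operatorIdFromUid("my-operator-ID"));
    }
}
```

If the printed value matches the "operator_id" seen in Prometheus for the
operator with uid "my-operator-ID", the same routine can be used to build a
lookup table from uid strings to metric labels.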

Regards,
Roman


On Thu, Mar 5, 2020 at 12:45 PM Felipe Gutierrez <
[email protected]> wrote:

> Hi community,
>
> I am tracking the latency of operators in Flink according to this
> reference [1]. When I am using Prometheus+Grafana I can issue a query using
> "flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency"
> and I can check the percentiles of each "operator_id" and each
> "operator_subtask_index". Each "operator_subtask_index" means each instance
> of the parallel physical operator, doesn't it?
>
> How can I set a fixed ID for the "operator_id" in my code so I can
> identify quickly which operator I am measuring? I used "map(new
> MyMapUDF()).uid('my-operator-ID')" but it seems that there is a hash
> function that converts the string to a hash value. What is the hash
> function used so I can identify my operator? I know that I can use the REST
> API [2] and if I name my operator it will always have the same hash when I
> restart the job, but I would like to set its name.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#latency-tracking
> [2]
> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#rest-api-integration
> --
> Felipe Gutierrez
> skype: felipe.o.gutierrez
> https://felipeogutierrez.blogspot.com
>
