Hi Felipe,

Please find the answers to your questions below.

> Each "operator_subtask_index" means each instance of the parallel physical operator, doesn't it?

Yes.

> How can I set a fixed ID for the "operator_id" in my code so I can identify quickly which operator I am measuring?

You are using the correct API (uid(...)).

> What is the hash function used so I can identify my operator?

Flink uses https://guava.dev/releases/18.0/api/docs/com/google/common/hash/Hashing.html#murmur3_128(int)

Regards,
Roman

On Thu, Mar 5, 2020 at 12:45 PM Felipe Gutierrez <[email protected]> wrote:

> Hi community,
>
> I am tracking the latency of operators in Flink according to this
> reference [1]. When I am using Prometheus+Grafana I can issue a query using
> "flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency"
> and I can check the percentiles of each "operator_id" and each
> "operator_subtask_index". Each "operator_subtask_index" means each instance
> of the parallel physical operator, doesn't it?
>
> How can I set a fixed ID for the "operator_id" in my code so I can
> identify quickly which operator I am measuring? I used "map(new
> MyMapUDF()).uid('my-operator-ID')" but it seems that there is a hash
> function that converts the string to a hash value. What is the hash
> function used so I can identify my operator? I know that I can use the Rest
> API [2] and if I name my operator it will have always the same hash when I
> restart the job, but I would like to set its name.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#latency-tracking
> [2]
> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#rest-api-integration
> *-*
> *- Felipe Gutierrez*
>
> *- skype: felipe.o.gutierrez*
> *- https://felipeogutierrez.blogspot.com*
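
A note on the hash question: the sketch below shows one way to predict the "operator_id" for a given uid offline. It assumes that Flink hashes the UTF-8 bytes of the uid string with murmur3_128 and seed 0, and that the ID shown in the metrics is the hex rendering of the 16 result bytes. The seed, the byte order, and the names OperatorIdSketch and operatorIdFor are assumptions made for illustration and are not taken from this thread, so please compare the output with the ID in your own metrics before relying on it. The hash is reimplemented in plain Java so the example runs without Guava on the classpath.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: predicts the hex ID for a uid, assuming murmur3_128 with seed 0.
class OperatorIdSketch {
    private static final long C1 = 0x87c37b91114253d5L;
    private static final long C2 = 0x4cf5ad432745937fL;

    // Final avalanche step of MurmurHash3 (64-bit variant).
    private static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    private static long mixK1(long k1) {
        return Long.rotateLeft(k1 * C1, 31) * C2;
    }

    private static long mixK2(long k2) {
        return Long.rotateLeft(k2 * C2, 33) * C1;
    }

    // Reads len (0..8) bytes starting at off as a little-endian long.
    private static long readLE(byte[] d, int off, int len) {
        long v = 0;
        for (int i = 0; i < len; i++) {
            v |= (d[off + i] & 0xffL) << (8 * i);
        }
        return v;
    }

    // Appends the 8 bytes of v in little-endian order as lowercase hex.
    private static void appendHexLE(StringBuilder sb, long v) {
        for (int i = 0; i < 8; i++) {
            sb.append(String.format("%02x", (v >>> (8 * i)) & 0xffL));
        }
    }

    // MurmurHash3 x64 128-bit with seed 0, rendered as 32 hex characters.
    static String operatorIdFor(String uid) {
        byte[] data = uid.getBytes(StandardCharsets.UTF_8);
        long h1 = 0L; // seed 0 (assumption)
        long h2 = 0L;
        int nblocks = data.length / 16;
        for (int i = 0; i < nblocks; i++) {
            long k1 = readLE(data, i * 16, 8);
            long k2 = readLE(data, i * 16 + 8, 8);
            h1 ^= mixK1(k1);
            h1 = Long.rotateLeft(h1, 27) + h2;
            h1 = h1 * 5 + 0x52dce729;
            h2 ^= mixK2(k2);
            h2 = Long.rotateLeft(h2, 31) + h1;
            h2 = h2 * 5 + 0x38495ab5;
        }
        // Tail: the remaining 0..15 bytes; mixing a zero word is a no-op.
        int tail = nblocks * 16;
        int rem = data.length - tail;
        h1 ^= mixK1(readLE(data, tail, Math.min(rem, 8)));
        h2 ^= mixK2(rem > 8 ? readLE(data, tail + 8, rem - 8) : 0L);
        // Finalization.
        h1 ^= data.length;
        h2 ^= data.length;
        h1 += h2;
        h2 += h1;
        h1 = fmix64(h1);
        h2 = fmix64(h2);
        h1 += h2;
        h2 += h1;
        StringBuilder sb = new StringBuilder(32);
        appendHexLE(sb, h1);
        appendHexLE(sb, h2);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same uid string as in map(new MyMapUDF()).uid("my-operator-ID").
        System.out.println(operatorIdFor("my-operator-ID"));
    }
}
```

If Guava is available, Hashing.murmur3_128(0).hashString(uid, StandardCharsets.UTF_8).toString() should produce the same string, which is a quick way to cross-check the sketch.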
