Well, the issue is the following:
the metric system assumes the following naming scheme for tasks based on
the DataSet API and simple streaming jobs: [CHAIN] operatorName1 [=>
operatorName2 [ ...]]
To retrieve the operator name the above is split by "=>", giving us a
String[] of all operator names in a task, from which we then select the
correct one based on the position in the chain.
However, the Streaming API has some fancy chaining stuff going on, where
multiple operations can be chained to a single one, which results in a
name like this: operatorName1 => (operatorName2, operatorName3)
For both op2 and op3 the chain index is identical (since for a tree
structure the index is the depth), resulting in both picking
(operatorName2, operatorName3) as their name, which is obviously wrong.
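
To make the failure concrete, here is a minimal sketch of the split-and-index
lookup described above. It is not the actual metric system code: the class and
method names are made up for illustration, and it assumes the optional prefix
is the literal text "CHAIN " followed by the operator names.

```java
// Hypothetical illustration of the lookup; not the actual Flink code.
final class OperatorNameSketch {

    /** Splits a task name on "=>" and trims each part. */
    static String[] splitTaskName(String taskName) {
        // Assumption: the optional prefix is the literal "CHAIN ".
        String stripped = taskName.startsWith("CHAIN ")
                ? taskName.substring("CHAIN ".length())
                : taskName;
        String[] parts = stripped.split("=>");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    /** Selects the operator name by its position in the chain. */
    static String operatorNameAt(String taskName, int chainIndex) {
        return splitTaskName(taskName)[chainIndex];
    }

    public static void main(String[] args) {
        // Linear chain: index 1 yields the second operator, as intended.
        String linear = "CHAIN operatorName1 => operatorName2";
        System.out.println(operatorNameAt(linear, 1));
        // prints "operatorName2"

        // Branching chain: op2 and op3 both have index 1, so both
        // end up with the whole parenthesized group as their name.
        String branching = "operatorName1 => (operatorName2, operatorName3)";
        System.out.println(operatorNameAt(branching, 1));
        // prints "(operatorName2, operatorName3)"
    }
}
```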
The solution (which I already implemented, sorry for that) is to simply
stop inferring the operator names from the task (it was hacky to begin
with) and just encode them in the configuration for the operator.
This can be seen here:
https://github.com/zentol/flink/commit/7f439525a26504e98b72f2d39b987ac878464419
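
Roughly, the idea looks like the sketch below. This is not the code from the
commit: a plain Map stands in for the operator configuration, and the key
"operatorName" and all class and method names are assumptions made for
illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; a Map stands in for the operator configuration.
final class OperatorConfigSketch {

    // Assumed key name; the key used in the actual commit may differ.
    static final String OPERATOR_NAME_KEY = "operatorName";

    /** Stores the operator's own name in its configuration. */
    static Map<String, String> configFor(String operatorName) {
        Map<String, String> config = new HashMap<>();
        config.put(OPERATOR_NAME_KEY, operatorName);
        return config;
    }

    /** Reads the name back, with no dependence on the task name or chain index. */
    static String operatorName(Map<String, String> config) {
        return config.get(OPERATOR_NAME_KEY);
    }

    public static void main(String[] args) {
        // Each operator carries its own name, even when both sit at the same depth.
        Map<String, String> op2 = configFor("operatorName2");
        Map<String, String> op3 = configFor("operatorName3");
        System.out.println(operatorName(op2)); // prints "operatorName2"
        System.out.println(operatorName(op3)); // prints "operatorName3"
    }
}
```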
Regards,
Chesnay
On 20.10.2016 14:21, Philipp Bussche wrote:
Thanks Chesnay,
I am happy to share more around my environment and do additional testing for
this.
Also I would be happy to help fixing if we see there might be an issue in
the code somewhere.
In fact, I am still trying to get a Hacktoberfest T-shirt and I am still
short on pull requests ;)