[ https://issues.apache.org/jira/browse/BEAM-11578?focusedWorklogId=773667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773667 ]
ASF GitHub Bot logged work on BEAM-11578:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/May/22 18:55
            Start Date: 23/May/22 18:55
    Worklog Time Spent: 10m
      Work Description: y1chi merged PR #17706:
URL: https://github.com/apache/beam/pull/17706

Issue Time Tracking
-------------------

    Worklog Id:     (was: 773667)
    Time Spent: 1h 40m  (was: 1.5h)

> `dataflow_metrics` (python) fails with TypeError (when int overflowing?)
> ------------------------------------------------------------------------
>
>                 Key: BEAM-11578
>                 URL: https://issues.apache.org/jira/browse/BEAM-11578
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.25.0
>            Reporter: Romain Yon
>            Assignee: Yi Hu
>            Priority: P1
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Hi all,
> It seems like the python beam job I'm running is failing because of a bug in
> beam's metrics.
> The logic of the job appears to work and the final output is successfully
> being written on GCS, but the dataflow job throws an error and has a failed
> status:
> ```
> Traceback (most recent call last):
>   File "path/to/my/code.py", line 11, in <module>
>     MyJob().run()
>   File "/path/to/my/lib.py", line 173, in run
>     for c in result.metrics().query()["counters"]
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_metrics.py", line 261, in query
>     self._populate_metrics(response, metric_results, user_metrics=True)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_metrics.py", line 188, in _populate_metrics
>     attempted = self._get_metric_value(metric['tentative'])
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_metrics.py", line 224, in _get_metric_value
>     lambda x: x.key == 'sum').value.double_value)
> TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
> ```
> Note that prior to this stacktrace, there is a logging entry:
> ```
> {"severity": "INFO", "message": "Distribution metric sum value seems to have overflowed integer_value range, the correctness of sum or mean value may not be guaranteed: <JsonValue\\n object_value: <JsonObject\\n properties: [<Property\\n key: \'count\'\\n value: <JsonValue\\n integer_value: 96>>, <Property\\n key: \'mean\'\\n value: <JsonValue\\n integer_value: 0>>, <Property\\n key: \'max\'\\n value: <JsonValue\\n integer_value: 0>>, <Property\\n key: \'min\'\\n value: <JsonValue\\n integer_value: 0>>, <Property\\n key: \'sum\'\\n value: <JsonValue\\n integer_value: 0>>]>>"}
> ```
> I guess there seems to be an issue while casting the overflowing int to a
> double.
> (Note: We don't really have control over the number of events being fired
> since the metrics are emitted by `tensorflow_transform.beam.TransformDataset`)

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
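
For readers following the traceback: the logging entry shows the `sum` property carrying `integer_value: 0`, while the failing line passes `double_value` to `int()`. One plausible reading, which the message above does not confirm, is that a legitimate sum of zero is treated as "missing" by a truthiness check, so the code falls through to a `double_value` that is `None`. The sketch below only reproduces that pattern in isolation. `FakeJsonValue` and both helper functions are hypothetical stand-ins, not Beam's actual classes or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeJsonValue:
    """Hypothetical stand-in for the JsonValue printed in the log entry."""
    integer_value: Optional[int] = None
    double_value: Optional[float] = None


def sum_with_truthiness_check(value: FakeJsonValue) -> int:
    # Assumed shape of the failing logic: any falsy integer_value,
    # including a real 0, is treated as "overflowed".
    result = value.integer_value
    if not result:
        # int(None) raises TypeError when double_value was never set.
        result = int(value.double_value)
    return result


def sum_with_none_check(value: FakeJsonValue) -> int:
    # Defensive variant: fall back to double_value only when
    # integer_value is genuinely absent.
    if value.integer_value is not None:
        return value.integer_value
    if value.double_value is not None:
        return int(value.double_value)
    raise ValueError("neither integer_value nor double_value is set")


# Mirrors the 'sum' property from the log entry: integer_value is 0.
zero_sum = FakeJsonValue(integer_value=0)

try:
    sum_with_truthiness_check(zero_sum)
    raised_type_error = False
except TypeError:
    raised_type_error = True

print("truthiness check raised TypeError:", raised_type_error)
print("None check result:", sum_with_none_check(zero_sum))
```

If this reading is right, the failure would not need an actual integer overflow: a distribution whose sum is exactly zero would be enough to trigger it.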