[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296731#comment-15296731 ]
ASF GitHub Bot commented on FLINK-3586:
---------------------------------------

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/2024

    [FLINK-3586] Fix potential overflow of Long AVG aggregation.

    Fixes a potential overflow of Long `AVG` aggregates in the Table API (the intermediate sum is computed using `BigInteger` instead of `Long`).

    Aggregates are refactored to specify their intermediate types as `TypeInformation` instead of SQL types. Intermediate results are internal to Flink and not exposed to Calcite, so SQL types are not required and would have to be converted into `TypeInformation` in any case.

    Adds unit tests for `MIN`, `MAX`, `COUNT`, `SUM`, and `AVG` aggregates.

    - [X] General
    - [X] Documentation
      - No functionality added
      - Some ScalaDocs extended
    - [X] Tests & Build
      - Unit tests for existing Aggregates added

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink tableLongAvgOverflow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2024.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2024

----
commit a887d1d7edb2b1b96652ca5021beec123011e03a
Author: Fabian Hueske <fhue...@apache.org>
Date:   2016-05-22T14:46:43Z

    [FLINK-3586] Fix potential overflow of Long AVG aggregation.

    - Add unit tests for Aggregates.

----

> Risk of data overflow while use sum/count to calculate AVG value
> ----------------------------------------------------------------
>
>                 Key: FLINK-3586
>                 URL: https://issues.apache.org/jira/browse/FLINK-3586
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API
>            Reporter: Chengxiang Li
>            Assignee: Fabian Hueske
>            Priority: Minor
>
> Currently, we use {{(sum: Long, count: Long)}} to store AVG partial aggregate data,
> which risks overflow. We should use an unbounded data type (such as
> BigInteger) to store them for the data types that need it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
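
For readers who want to see the overflow described in the issue, here is a minimal sketch in plain Java. It is not the code from the pull request: the class `LongAvgSketch` and the methods `naiveAvg` and `safeAvg` are hypothetical names chosen for illustration, and the sketch assumes a non-empty input array. It only contrasts a `long` intermediate sum with a `BigInteger` intermediate sum, which is the idea the pull request description states.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not the Flink Table API implementation.
final class LongAvgSketch {

    // Naive AVG: the intermediate sum is a long, so it silently wraps
    // around when the running total exceeds Long.MAX_VALUE.
    static long naiveAvg(long[] values) {
        long sum = 0L;
        for (long v : values) {
            sum += v; // may overflow without any error
        }
        return sum / values.length; // assumes values is non-empty
    }

    // Overflow-safe AVG: the intermediate sum is a BigInteger. The average
    // of long values always fits into a long, so only the final result is
    // narrowed back.
    static long safeAvg(long[] values) {
        BigInteger sum = BigInteger.ZERO;
        for (long v : values) {
            sum = sum.add(BigInteger.valueOf(v));
        }
        // divide() truncates toward zero, like long division.
        return sum.divide(BigInteger.valueOf(values.length)).longValueExact();
    }

    public static void main(String[] args) {
        long[] values = {Long.MAX_VALUE, Long.MAX_VALUE};
        System.out.println("naive: " + naiveAvg(values)); // prints "naive: -1"
        System.out.println("safe:  " + safeAvg(values));  // prints "safe:  9223372036854775807"
    }
}
```

With two inputs of `Long.MAX_VALUE`, the `long` sum wraps to -2 and the naive average becomes -1, while the `BigInteger` variant returns `Long.MAX_VALUE` as expected. The cost is extra object allocation per addition, which is presumably why the issue suggests an unbounded type only for the data types that need it.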