[ 
https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296731#comment-15296731
 ] 

ASF GitHub Bot commented on FLINK-3586:
---------------------------------------

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/2024

    [FLINK-3586] Fix potential overflow of Long AVG aggregation.

    Fixes a potential overflow of Long `AVG` aggregates in the Table API 
(intermediate sum is computed using `BigInteger` instead of `Long`).
    
    Aggregates are refactored to specify their intermediate types as 
`TypeInformation` instead of SQL types. Intermediate results are internal to 
Flink and are not exposed to Calcite, so SQL types are not required there and 
would have to be converted into `TypeInformation` in any case.
    
    Adds unit tests for `MIN`, `MAX`, `COUNT`, `SUM`, and `AVG` aggregates.
    
    - [X] General
    - [X] Documentation
      - No functionality added
      - Some ScalaDocs extended
    
    - [X] Tests & Build
      - Unit tests for existing Aggregates added

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink tableLongAvgOverflow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2024.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2024
    
----
commit a887d1d7edb2b1b96652ca5021beec123011e03a
Author: Fabian Hueske <fhue...@apache.org>
Date:   2016-05-22T14:46:43Z

    [FLINK-3586] Fix potential overflow of Long AVG aggregation.
    
    - Add unit tests for Aggregates.

----


> Risk of data overflow while use sum/count to calculate AVG value
> ----------------------------------------------------------------
>
>                 Key: FLINK-3586
>                 URL: https://issues.apache.org/jira/browse/FLINK-3586
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API
>            Reporter: Chengxiang Li
>            Assignee: Fabian Hueske
>            Priority: Minor
>
> Currently, we use {{(sum: Long, count: Long)}} to store AVG partial aggregate 
> data, which risks overflowing the sum. We should use an unbounded data type 
> (such as BigInteger) to store it for the data types that need it.
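
To illustrate the overflow described above, here is a minimal standalone sketch in plain Java. It is not the Flink implementation: the class name `LongAvgSketch` and both method names are hypothetical, and the only assumption taken from the issue is that the intermediate sum is held either in a `long` or in a `BigInteger`. Both methods assume a non-empty input.

```java
import java.math.BigInteger;

public class LongAvgSketch {

    // Hypothetical naive variant: the intermediate sum is a long,
    // so it wraps around silently once it exceeds Long.MAX_VALUE.
    public static long avgWithLongSum(long[] values) {
        long sum = 0L;
        for (long v : values) {
            sum += v; // may overflow without any error
        }
        return sum / values.length;
    }

    // Hypothetical safe variant: the intermediate sum is a BigInteger,
    // which cannot overflow. The average of long values always lies
    // within the long range, so converting the result back is safe.
    public static long avgWithBigIntegerSum(long[] values) {
        BigInteger sum = BigInteger.ZERO;
        for (long v : values) {
            sum = sum.add(BigInteger.valueOf(v));
        }
        return sum.divide(BigInteger.valueOf(values.length)).longValueExact();
    }

    public static void main(String[] args) {
        long[] values = {Long.MAX_VALUE, Long.MAX_VALUE};
        System.out.println(avgWithLongSum(values));       // wrapped, incorrect result
        System.out.println(avgWithBigIntegerSum(values)); // Long.MAX_VALUE
    }
}
```

With two inputs of `Long.MAX_VALUE`, the `long` sum wraps to a negative value and the computed average is wrong, while the `BigInteger` sum yields the expected average. The count is left as a plain `long` here, since it grows by one per record and is far less likely to overflow than the sum.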



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
