[ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462512#comment-17462512
 ] 

Ashish Sharma edited comment on HIVE-25653 at 12/20/21, 11:28 AM:
------------------------------------------------------------------

[~zabetak] 

I have corrected the column data type from int to decimal in the example. 

*Code Snippet* 

public class MyClass {
    public static void main(String[] args) {
      // Adding the same double literal three times exposes the rounding error.
      System.out.println(10230.72 + 10230.72 + 10230.72);
    }
}

*Output* - 

30692.159999999996

*Expected* - 

30692.16
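
For comparison, a minimal sketch of how the expected value can be obtained in Java with exact decimal arithmetic. The class name DecimalSum and the helper sumThree are made up for illustration and are not part of Hive.

```java
import java.math.BigDecimal;

public class DecimalSum {
    // Hypothetical helper: adds the same decimal value three times using
    // BigDecimal, which stores 10230.72 exactly (unscaled value 1023072, scale 2).
    static BigDecimal sumThree(String value) {
        BigDecimal v = new BigDecimal(value);
        return v.add(v).add(v);
    }

    public static void main(String[] args) {
        // double arithmetic: binary rounding error shows up in the last digits
        System.out.println(10230.72 + 10230.72 + 10230.72);
        // BigDecimal arithmetic: prints 30692.16
        System.out.println(sumThree("10230.72"));
    }
}
```

The value is parsed from a String on purpose: new BigDecimal(10230.72) would take the double as input and carry its binary approximation into the result.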


Because of the limited accuracy of double-precision floating-point arithmetic 
in Java, the output of UDFs and UDAFs such as STDDEV is affected, so we 
get (*0.0000000000005940794514955821*) instead of (*0.0*).
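
To illustrate how such a residue can arise, here is a rough sketch of a two-pass population standard deviation computed once on doubles and once on BigDecimal. This is not Hive's actual UDAF implementation; the class StddevSketch and its methods are assumptions made only for this example, and the exact double result depends on the formula and evaluation order used.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.Collections;
import java.util.List;

public class StddevSketch {
    // Two-pass population standard deviation on doubles. Rounding in the
    // running sum can leave a mean that differs slightly from the inputs,
    // so the result may be a tiny non-zero value for identical inputs.
    static double stddevPopDouble(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.size();
        double squares = 0.0;
        for (double v : values) {
            double d = v - mean;
            squares += d * d;
        }
        return Math.sqrt(squares / values.size());
    }

    // Same two-pass computation of the population variance in BigDecimal.
    // Division uses DECIMAL128 (34 significant digits) to keep it finite.
    static BigDecimal variancePopDecimal(List<BigDecimal> values) {
        BigDecimal n = BigDecimal.valueOf(values.size());
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
        }
        BigDecimal mean = sum.divide(n, MathContext.DECIMAL128);
        BigDecimal squares = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            BigDecimal d = v.subtract(mean);
            squares = squares.add(d.multiply(d));
        }
        return squares.divide(n, MathContext.DECIMAL128);
    }

    public static void main(String[] args) {
        // Seven identical rows, as in the script from the issue description.
        System.out.println(stddevPopDouble(Collections.nCopies(7, 10230.72)));
        System.out.println(variancePopDecimal(
                Collections.nCopies(7, new BigDecimal("10230.72"))));
    }
}
```

For seven copies of 10230.72 the BigDecimal variance is exactly zero, because 71615.04 / 7 is exactly 10230.72. The double version is only guaranteed to be close to zero.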

However, other engines such as MySQL and PostgreSQL return similar values, so 
we can ignore the arithmetic accuracy issue in UDFs for now.

I have created a revert request - https://github.com/apache/hive/pull/2897


was (Author: ashish-kumar-sharma):
[~zabetak] 

*Code Snippet* 

public class MyClass {
    public static void main(String args[]) {
      System.out.println(10230.72+10230.72+10230.72);
    }
}

*Output* - 

30692.159999999996

*Expected* - 

30692.16


Because of the limited accuracy of double-precision floating-point arithmetic 
in Java, the output of UDFs and UDAFs such as STDDEV is affected, so we 
get (*0.0000000000005940794514955821*) instead of (*0.0*).

However, other engines such as MySQL return the same value, so we can 
ignore the arithmetic accuracy issue in UDFs for now.

I have created a revert request - https://github.com/apache/hive/pull/2897

> Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating 
> point data types.
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25653
>                 URL: https://issues.apache.org/jira/browse/HIVE-25653
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>    Affects Versions: 3.1.0, 3.1.2
>            Reporter: Ashish Sharma
>            Assignee: Ashish Sharma
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Description
> *Script*- 
> create table test ( col1 decimal );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP              STDDEV                  STDDEV_POP
> 5.940794514955821E-13    5.42317860890711E-13    5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP              STDDEV                  STDDEV_POP
> 0                        0                       0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
