Mich Talebzadeh created HIVE-13574:
--------------------------------------
Summary: Standard Deviation in Hive 2 is still incorrect
Key: HIVE-13574
URL: https://issues.apache.org/jira/browse/HIVE-13574
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Affects Versions: 2.0.0
Environment: All
Reporter: Mich Talebzadeh
Assignee: Mich Talebzadeh
Priority: Minor
Fix For: 0.10.1
I reported back in April 2015 that what Hive calls Standard Deviation Function
STDDEV is a pointer to STDDEV_POP. This is incorrect and has not been rectified
in Hive 2
Both Oracle and Sybase point STDDEV to STDDEV_SAMP not STDDEV_POP. Also I did
tests with Spark 1.6 as well and Spark correctly points STTDEV to STDDEV_SAMP.
The following query was used
SELECT
SQRT((SUM(POWER(AMOUNT_SOLD,2))-(COUNT(1)*POWER(AVG(AMOUNT_SOLD),2)))/(COUNT(1)-1))
AS MYSTDDEV,
STDDEV(amount_sold) AS STDDEV,
STDDEV_SAMP(amount_sold) AS STDDEV_SAMP,
STDDEV_POP(amount_sold) AS STDDEV_POP
from sales;
The following is from running the above query on Hive where STDDEV -->
STDDEV_POP which is incorrect
+--------------------+---------------------+--------------------+---------------------+--+
| mystddev | stddev | stddev_samp |
stddev_pop |
+--------------------+---------------------+--------------------+---------------------+--+
| 260.7270919450411 | 260.72704617040444 | 260.7270722861465 |
260.72704617040444 |
+--------------------+---------------------+--------------------+---------------------+--+
The following is from Spark-sql where STDDEV --> STDDEV_SAMP which is correct
+--------------------+---------------------+--------------------+---------------------+--+
| mystddev | stddev | stddev_samp |
stddev_pop |
+--------------------+---------------------+--------------------+---------------------+--+
| 260.7270919450411 | 260.7270722861637 | 260.7270722861637 |
260.72704617042166 |
+--------------------+---------------------+--------------------+---------------------+--+
Hopefully The Hive one will be corrected.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)