HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574>

Created and assigned to myself

Thanks

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com



On 20 April 2016 at 06:54, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Will do thanks
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 19 April 2016 at 23:33, Alan Gates <alanfga...@gmail.com> wrote:
>
>> Have you filed a JIRA ticket for this?  If not, please do so we can track
>> it and fix it.  Patches are welcomed as well. :)
>>
>> Alan.
>>
>> > On Apr 4, 2016, at 15:27, Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>> >
>> >
>> > Hi,
>> >
>> > I reported back in April 2015 that what Hive calls Standard Deviation
>> Function  STDDEV is a pointer to STDDEV_POP. This is incorrect and has not
>> been rectified in Hive 2
>> >
>> > Both Oracle and Sybase point STDDEV to STDDEV_SAMP not STDDEV_POP. Also
>> I did tests with Spark 1.6 as well and Spark correctly points STTDEV to
>> STDDEV_SAMP.
>> >
>> > The following query was used
>> >
>> > SELECT
>> >
>>  
>> SQRT((SUM(POWER(AMOUNT_SOLD,2))-(COUNT(1)*POWER(AVG(AMOUNT_SOLD),2)))/(COUNT(1)-1))
>> AS MYSTDDEV,
>> >         STDDEV(amount_sold) AS STDDEV,
>> >         STDDEV_SAMP(amount_sold) AS STDDEV_SAMP,
>> >         STDDEV_POP(amount_sold) AS STDDEV_POP
>> > from    sales;
>> >
>> > The following is from running the above query on Hive where STDDEV -->
>> STDDEV_POP which is incorrect
>> >
>> >
>> >
>> +--------------------+---------------------+--------------------+---------------------+--+
>> > |      mystddev      |       stddev        |    stddev_samp     |
>>  stddev_pop      |
>> >
>> +--------------------+---------------------+--------------------+---------------------+--+
>> > | 260.7270919450411  | 260.72704617040444  | 260.7270722861465  |
>> 260.72704617040444  |
>> >
>> +--------------------+---------------------+--------------------+---------------------+--+
>> >
>> > The following is from Spark-sql where STDDEV -->  STDDEV_SAMP which is
>> correct
>> >
>> >
>> +--------------------+---------------------+--------------------+---------------------+--+
>> > |      mystddev      |       stddev        |    stddev_samp     |
>>  stddev_pop      |
>> >
>> +--------------------+---------------------+--------------------+---------------------+--+
>> > | 260.7270919450411  | 260.7270722861637   | 260.7270722861637  |
>> 260.72704617042166  |
>> >
>> +--------------------+---------------------+--------------------+---------------------+--+
>> >
>> > Hopefully The Hive one will be corrected.
>> >
>> > Thanks
>> >
>> >
>> > Dr Mich Talebzadeh
>> >
>> > LinkedIn
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> >
>> > http://talebzadehmich.wordpress.com
>> >
>>
>>
>

Reply via email to