[ 
https://issues.apache.org/jira/browse/HIVE-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199629#comment-14199629
 ] 

Jason Dere commented on HIVE-8745:
----------------------------------

Looking into this, it's not just HiveDecimal.equals() - BinarySortableSerde is 
serializing decimals in such a way that 1.1 is not the same as 1.10. This is 
why we're seeing the difference in the join behavior.

It looks like this difference in behavior is due to HIVE-7373. Before that 
change, HiveDecimal automatically trimmed trailing zeros, so 1.1 and 1.10 
would both be represented as 1.1. Now that they have different internal 
representations, there seem to be some unexpected differences in behavior, 
like the one we're seeing with BinarySortableSerde. We may want to consider 
backing out the changes from HIVE-7373.
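
To illustrate the kind of representation difference described above, here is 
a minimal sketch using plain java.math.BigDecimal. It assumes HiveDecimal's 
scale handling mirrors BigDecimal's on this point, and it is not Hive code; 
the class and method names are made up for the example.

```java
import java.math.BigDecimal;

// Hypothetical demo class, not part of Hive.
final class DecimalScaleDemo {
    // equals() compares both value and scale, so trailing zeros matter.
    static boolean equalsWithScale(String a, String b) {
        return new BigDecimal(a).equals(new BigDecimal(b));
    }

    // compareTo() compares numeric value only and ignores scale.
    static boolean equalsNumerically(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }

    // Drops trailing zeros, which is roughly the "trimmed" representation.
    static String trimmed(String a) {
        return new BigDecimal(a).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(equalsWithScale("1.1", "1.10"));   // false
        System.out.println(equalsNumerically("1.1", "1.10")); // true
        System.out.println(trimmed("1.10"));                  // 1.1
    }
}
```

Any code that keys on the untrimmed representation (equals, hashing, or a 
byte-wise comparison of a serialized form) can therefore treat 1.1 and 1.10 
as distinct values.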

If we were to try to fix this issue without reverting HIVE-7373, we would still 
have to trim trailing zeros within BinarySortableSerde so that 1.1 == 1.10. 
Doing so would bring back the trimmed behavior that was deemed undesirable in 
HIVE-7373, but only when BinarySortableSerde is used (joins), which seems a 
bit odd.
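
As a rough sketch of what trimming at serialization time could look like: 
the byte layout below (a 4-byte scale followed by the unscaled value) is a 
made-up stand-in, not the actual BinarySortableSerde format, and the class 
and method names are hypothetical. It only shows that normalizing before 
encoding makes the key bytes for 1.1 and 1.10 identical.

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch, not the real BinarySortableSerde encoding.
final class TrimmedKeySketch {
    // Made-up layout: 4-byte scale, then the unscaled value's bytes.
    static byte[] encode(BigDecimal d) {
        byte[] unscaled = d.unscaledValue().toByteArray();
        return ByteBuffer.allocate(4 + unscaled.length)
                .putInt(d.scale())
                .put(unscaled)
                .array();
    }

    // Normalize first so that numerically equal values share one encoding.
    static byte[] encodeTrimmed(BigDecimal d) {
        // Map every zero (0, 0.0, 0.00, ...) to a single canonical form.
        BigDecimal normalized =
                d.signum() == 0 ? BigDecimal.ZERO : d.stripTrailingZeros();
        return encode(normalized);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.1");
        BigDecimal b = new BigDecimal("1.10");
        System.out.println(Arrays.equals(encode(a), encode(b)));               // false
        System.out.println(Arrays.equals(encodeTrimmed(a), encodeTrimmed(b))); // true
    }
}
```

The trade-off is the one noted above: values read back from such keys would 
carry the trimmed scale, even though the rest of the system now preserves 
trailing zeros.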

Thoughts?

> Joins on decimal keys return different results whether they are run as reduce 
> join or map join
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8745
>                 URL: https://issues.apache.org/jira/browse/HIVE-8745
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Gunther Hagleitner
>            Assignee: Jason Dere
>            Priority: Critical
>         Attachments: join_test.q
>
>
> See attached .q file to reproduce. The difference seems to be whether 
> trailing 0s are considered the same value or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
