[ https://issues.apache.org/jira/browse/HIVE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754386#comment-13754386 ]

Gunther Hagleitner commented on HIVE-3976:
------------------------------------------

This is cool. Looking forward to having a complete decimal data type. I think 
it'd be neat to have some sort of spec for this, though. Some thoughts I had 
when doing the initial decimal work (hope they're helpful):

- The current decimal type is of "mixed scale" and by default has the full 
maximum precision (38). Are you going to change this? A smaller default has 
benefits, and so does a fixed default scale. For performance reasons it would 
also be good to keep the default precision below 19; that way a value fits in 
a single long (and we can switch to a faster implementation in the future). 
Fixed scale means the basic operations can be computed more efficiently (and 
it's saner than mixed scale). [~ehans] was the one who pointed that out to me.
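
To illustrate the "below 19" point: 18 decimal digits always fit in a signed 
64-bit long, so a fixed-scale type could be backed by plain long arithmetic. 
A minimal, hypothetical sketch:

{code}
// Hypothetical sketch, not anything in Hive today: a fixed-scale decimal
// backed by one long. Valid while precision <= 18, since the largest
// 18-digit number (10^18 - 1) still fits below Long.MAX_VALUE.
public class FixedDecimal {
  private final long unscaled; // all digits, without the decimal point
  private final int scale;     // digits to the right of the point

  public FixedDecimal(long unscaled, int scale) {
    this.unscaled = unscaled;
    this.scale = scale;
  }

  // With one shared fixed scale, addition is plain long arithmetic:
  // no rescaling, no BigDecimal allocation.
  public FixedDecimal add(FixedDecimal other) {
    assert scale == other.scale;
    return new FixedDecimal(unscaled + other.unscaled, scale);
  }
}
{code}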

- HIVE-5022 is related to this and it'd be good to keep any breaking changes 
in a single release.

- Have you thought about the rules for precision + scale in arithmetic 
operations? Here are some sensible and compliant rules from Microsoft: 
http://technet.microsoft.com/en-us/library/ms190476.aspx
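
For reference, the derivation rules on that page reduce to small formulas; a 
sketch of two of them (the page adds capping rules when the result precision 
would exceed 38):

{code}
// Result type of decimal arithmetic per the Microsoft rules linked above
// (before the extra capping rules that kick in when precision exceeds 38).
public class DecimalTypeRules {
  // addition/subtraction: scale = max(s1, s2),
  // precision = max(s1, s2) + max(p1 - s1, p2 - s2) + 1
  static int[] addType(int p1, int s1, int p2, int s2) {
    int scale = Math.max(s1, s2);
    int precision = scale + Math.max(p1 - s1, p2 - s2) + 1;
    return new int[] { Math.min(precision, 38), scale };
  }

  // multiplication: precision = p1 + p2 + 1, scale = s1 + s2,
  // e.g. DECIMAL(10,2) * DECIMAL(10,2) -> DECIMAL(21,4)
  static int[] multiplyType(int p1, int s1, int p2, int s2) {
    return new int[] { Math.min(p1 + p2 + 1, 38), s1 + s2 };
  }
}
{code}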

- Currently, when we read data and the value has a precision greater than the 
max, we turn it into null. Have you thought about what to do when you read a 
decimal that doesn't fit or doesn't conform to the column specification? 
Round, truncate, error, or null?
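
To make the options concrete, an illustrative sketch (not existing Hive code) 
of how the policies could differ:

{code}
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalEnforcement {
  // Reading 123.456 into DECIMAL(5, 2): round -> 123.46, truncate -> 123.45.
  static BigDecimal enforce(BigDecimal v, int precision, int scale) {
    BigDecimal adjusted = v.setScale(scale, RoundingMode.HALF_UP); // round
    // truncating instead would be: v.setScale(scale, RoundingMode.DOWN)
    if (adjusted.precision() > precision) {
      return null; // null policy; an error policy would throw here instead
    }
    return adjusted;
  }
}
{code}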

- Propagating precision and scale changes throughout arithmetic expressions 
seems difficult. Jason's patch might lay some groundwork, but numeric 
operations are different from char/varchar. I think you need to keep track of 
them though, so you can return them to the user (e.g. through JDBC/ODBC) and 
probably also for CTAS.
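
For instance, a JDBC client reads the declared type straight from the result 
set metadata (standard java.sql API), so whatever the planner derives is what 
users will see:

{code}
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

public class ShowDecimalTypes {
  // Prints the decimal type a JDBC client sees for each result column;
  // these values only exist if the planner tracked precision/scale.
  static void printColumnTypes(ResultSet rs) throws SQLException {
    ResultSetMetaData md = rs.getMetaData();
    for (int i = 1; i <= md.getColumnCount(); i++) {
      System.out.printf("%s DECIMAL(%d,%d)%n",
          md.getColumnName(i), md.getPrecision(i), md.getScale(i));
    }
  }
}
{code}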

- Same thing for UDFs. That might have some wrinkles too, i.e.: how do you 
know what precision/scale a UDF is going to return?

- I'm not sure whether you need to worry about "insert into" - maybe you can 
just write whatever and handle the data when you read it again.

- If you switch everything to fixed scale, I think BinarySortableSerde can be 
greatly simplified for decimals.
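
A sketch of the kind of simplification I mean, assuming a single fixed scale 
and precision <= 18 so the unscaled value fits in a long (this is not the 
current serde code):

{code}
import java.nio.ByteBuffer;

public class SortableDecimal {
  // Flipping the sign bit and writing big-endian makes the raw bytes
  // compare in the same order as the signed numeric values (memcmp-able).
  static byte[] toSortableBytes(long unscaled) {
    return ByteBuffer.allocate(8).putLong(unscaled ^ Long.MIN_VALUE).array();
  }

  static long fromSortableBytes(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
  }
}
{code}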

> Support specifying scale and precision with Hive decimal type
> -------------------------------------------------------------
>
>                 Key: HIVE-3976
>                 URL: https://issues.apache.org/jira/browse/HIVE-3976
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor, Types
>            Reporter: Mark Grover
>            Assignee: Xuefu Zhang
>         Attachments: remove_prec_scale.diff
>
>
> HIVE-2693 introduced support for Decimal datatype in Hive. However, the 
> current implementation has unlimited precision and provides no way to specify 
> precision and scale when creating the table.
> For example, MySQL allows users to specify scale and precision of the decimal 
> datatype when creating the table:
> {code}
> CREATE TABLE numbers (a DECIMAL(20,2));
> {code}
> Hive should support something similar too.
