[ https://issues.apache.org/jira/browse/HIVE-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453811#comment-15453811 ]
Xuefu Zhang commented on HIVE-14568:
------------------------------------

I think this is mostly by design. You have two columns: decimal(p1, s1) and decimal(p2, s2). We need to statically derive the type for the product of the two columns based on s = s1 + s2 and p = p1 + p2 + 1. Since s1 = 28 and s2 = 10 in your case, s = 38. Similarly, p1 + p2 + 1 = 77 exceeds the maximum precision, so p is capped at 38. Thus, the result column has type decimal(38, 38), which means the result cannot have any integer part.

On the other hand, if the result type were set to (38, 18), I could certainly construct example data for which the product of the two columns loses the scale that I was expecting.

I understand that NULL may be surprising to people. However, I wonder why a column defined as decimal(38,28) is being used to store data like 1.2, 1.44, etc. Would a smaller precision/scale be reasonable? This sounds like a data modeling issue: the metadata needs to closely describe the data.

It's a good point that an ERROR here might be better, so that NULL doesn't slip in unnoticed. I believe MySQL has a strict mode which, when on, generates an error in this case. We don't have such a mode defined in Hive, but it may make sense to introduce one.

> Hive Decimal Returns NULL
> -------------------------
>
>                 Key: HIVE-14568
>                 URL: https://issues.apache.org/jira/browse/HIVE-14568
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.0.0, 1.2.0
>         Environment: Centos 6.7, Hadoop 2.7.2,hive 1.0.0,2.0
>            Reporter: gurmukh singh
>            Assignee: Xuefu Zhang
>
> Hi
> I was under the impression that the bug:
> https://issues.apache.org/jira/browse/HIVE-5022 got fixed. But, I see the
> same issue in Hive 1.0 and hive 1.2 as well.
> hive> desc mul_table;
> OK
> prc                 	decimal(38,28)
> vol                 	decimal(38,10)
> Time taken: 0.068 seconds, Fetched: 2 row(s)
> hive> select prc, vol, prc*vol as cost from mul_table;
> OK
> 1.2	200	NULL
> 1.44	200	NULL
> 2.14	100	NULL
> 3.004	50	NULL
> 1.2	200	NULL
> Time taken: 0.048 seconds, Fetched: 5 row(s)
> Rather than returning NULL, it should give error or round off.
> I understand that, I can use Double instead of decimal or can cast it, but
> still returning "Null" will make many things go unnoticed.
> hive> desc mul_table2;
> OK
> prc                 	double
> vol                 	decimal(14,10)
> Time taken: 0.049 seconds, Fetched: 2 row(s)
> hive> select * from mul_table2;
> OK
> 1.4	200
> 1.34	200
> 7.34	100
> 7454533.354544	100
> Time taken: 0.028 seconds, Fetched: 4 row(s)
> hive> select prc, vol, prc*vol as cost from mul_table3;
> OK
> 7.34	100	734.0
> 7.34	1000	7340.0
> 1.0004	1000	1000.4
> 7454533.354544	100	7.454533354544E8   <----- Wrong result
> 7454533.354544	1000	7.454533354544E9   <----- Wrong result
> Time taken: 0.025 seconds, Fetched: 5 row(s)
> Casting:
> hive> select prc, vol, cast(prc*vol as decimal(38,38)) as cost from
> mul_table3;
> OK
> 7.34	100	NULL
> 7.34	1000	NULL
> 1.0004	1000	NULL
> 7454533.354544	100	NULL
> 7454533.354544	1000	NULL
> Time taken: 0.033 seconds, Fetched: 5 row(s)
> hive> select prc, vol, cast(prc*vol as decimal(38,10)) as cost from
> mul_table3;
> OK
> 7.34	100	734
> 7.34	1000	7340
> 1.0004	1000	1000.4
> 7454533.354544	100	745453335.4544
> 7454533.354544	1000	7454533354.544
> Time taken: 0.026 seconds, Fetched: 5 row(s)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
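
For readers following the arithmetic in the comment above, the type derivation can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the comment (s = s1 + s2, p = p1 + p2 + 1, precision capped at 38), not Hive's actual code. The function names, the assumption that scale is capped at the same maximum as precision, and the simplified `fits` check (which looks only at integer digits and ignores rounding of the fractional part) are all hypothetical.

```python
from decimal import Decimal

MAX_PRECISION = 38  # maximum precision, as stated in the comment
MAX_SCALE = 38      # assumption: scale is capped at the same maximum


def derive_product_type(p1, s1, p2, s2):
    """Return (precision, scale) for decimal(p1, s1) * decimal(p2, s2)."""
    precision = min(p1 + p2 + 1, MAX_PRECISION)
    scale = min(s1 + s2, MAX_SCALE)
    return precision, scale


def integer_digits(precision, scale):
    """Number of digits the type leaves for the integer part."""
    return precision - scale


def fits(value, precision, scale):
    """Simplified check: does the integer part of value fit the type?"""
    int_part = abs(int(value))
    needed = 0 if int_part == 0 else len(str(int_part))
    return needed <= integer_digits(precision, scale)


# Columns from the report: prc decimal(38,28), vol decimal(38,10)
p, s = derive_product_type(38, 28, 38, 10)
print(p, s, integer_digits(p, s))   # 38 38 0

cost = Decimal("1.2") * Decimal("200")   # 240.0, needs 3 integer digits
print(fits(cost, p, s))     # False: no room for an integer part
print(fits(cost, 38, 10))   # True: decimal(38,10) leaves 28 integer digits
```

Under these assumptions, the first row of the report (1.2 * 200) cannot be represented in decimal(38, 38), which is consistent with the NULL results, while casting to decimal(38, 10) leaves room for the integer part.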