[ 
https://issues.apache.org/jira/browse/HIVE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808625#comment-13808625
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-5436:
---------------------------------------------------------

[~xuefu] Reasons why I wanted to fix the consistency first:
1. I wanted to see how intermediate results are handled for numeric types. 
For example, for tinyint, (1228+1228)/20 yields an in-range result, whereas 
the intermediate result 1228+1228 is out of range for tinyint. This scenario 
will be very common with exponential notation.
2. HIVE-5382 will need a baseline for comparing string cast results with 
non-string cast results. My plan was to use test cases like this:
select cast('-1.5e2' as int)-cast(-1.5e2 as int) from tmp
and verify that the result is always 0. This will ensure consistency across 
casts from string to numeric types (and will expose any existing bug that 
later gets fixed for only one of the cast paths, since the non-string cast 
and the string cast are handled separately).
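
To make the two points above concrete, here is a minimal Python sketch. It 
only checks values against the tinyint range and builds the text of the 
baseline queries; it does not run anything against Hive. The helper names 
(fits_tinyint, consistency_query) and the literals other than '-1.5e2' are 
hypothetical and chosen for illustration.

```python
# Signed 8-bit range used by tinyint.
TINYINT_MIN, TINYINT_MAX = -128, 127


def fits_tinyint(value: float) -> bool:
    """Return True if value lies inside the tinyint range."""
    return TINYINT_MIN <= value <= TINYINT_MAX


def consistency_query(literal: str, type_name: str, table: str = "tmp") -> str:
    """Build a query whose result should be 0 if the string cast and the
    non-string cast of the same literal agree (hypothetical helper)."""
    return (f"select cast('{literal}' as {type_name})"
            f"-cast({literal} as {type_name}) from {table}")


# Point 1: the intermediate sum is out of range, the final quotient is not.
intermediate = 1228 + 1228      # 2456
final = intermediate / 20       # 122.8
print(fits_tinyint(intermediate), fits_tinyint(final))  # False True

# Point 2: one baseline query per literal; the extra literals are illustrative.
for lit in ("-1.5e2", "1.5e2", "1e3"):
    print(consistency_query(lit, "int"))
```

Generating the queries from a list of literals keeps the string and 
non-string sides of each test textually identical, so any non-zero result 
points at the cast paths rather than at a typo in the test.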

> Hive's casting behavior needs to be consistent
> ----------------------------------------------
>
>                 Key: HIVE-5436
>                 URL: https://issues.apache.org/jira/browse/HIVE-5436
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>
> Hive's casting behavior is inconsistent, and the behavior of casting from 
> one type to another is currently undocumented when the cast value is out of 
> range. For example, casting out-of-range values from one type to another 
> can produce incorrect results.
> Eg: 
> 1. select cast('1000' as tinyint) from t1;
> NULL
> 2. select 1000Y from t1;
> FAILED: SemanticException [Error 10029]: Line 1:7 Invalid numerical constant 
> '1000Y'
> 3. select cast(1000 as tinyint) from t1;
> -24
> 4. select cast(1.1e3-1000/0 as tinyint) from t1;
> 0
> 5. select cast(10/0 as tinyint) from pw18;
> -1
> A Hive user can accidentally try to cast an out-of-range value. For 
> example, in examples 4 and 5, even though the final result is NaN, Hive 
> casts it to an arbitrary value. Either we should document that the end user 
> is responsible for handling overflow, underflow, division by 0, etc., or we 
> should return NULL when the final result is out of range.
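
The outputs in examples 3 to 5 are consistent with Java-style narrowing 
conversions (double to int, then int to byte). The Python sketch below 
emulates those rules under that assumption; the issue does not confirm that 
this is how Hive implements the cast. It also assumes that 10/0 and 1000/0 
evaluate to a double Infinity. The function names are hypothetical.

```python
import math

INT_MIN, INT_MAX = -2**31, 2**31 - 1


def double_to_int(x: float) -> int:
    """Java-style double -> int: NaN becomes 0, out-of-range values
    saturate, everything else is truncated toward zero."""
    if math.isnan(x):
        return 0
    if x >= INT_MAX:
        return INT_MAX
    if x <= INT_MIN:
        return INT_MIN
    return math.trunc(x)


def int_to_byte(n: int) -> int:
    """Java-style int -> byte: keep the low 8 bits, reinterpret as signed."""
    low = n & 0xFF
    return low - 256 if low >= 128 else low


def cast_to_tinyint(x: float) -> int:
    """Assumed model of a non-string cast to tinyint (not confirmed for Hive)."""
    return int_to_byte(double_to_int(x))


inf = float("inf")                    # assumed value of 10/0 and 1000/0
print(cast_to_tinyint(1000))          # -24  (example 3)
print(cast_to_tinyint(1.1e3 - inf))   # 0    (example 4)
print(cast_to_tinyint(inf))           # -1   (example 5)
print(cast_to_tinyint(math.nan))      # 0
```

Under this model both NaN and -Infinity map to 0, so example 4 produces 0 
whichever of the two the expression actually evaluates to.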



--
This message was sent by Atlassian JIRA
(v6.1#6144)
