[ https://issues.apache.org/jira/browse/HIVE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808625#comment-13808625 ]
Hari Sankar Sivarama Subramaniyan commented on HIVE-5436:
---------------------------------------------------------

[~xuefu] Reasons why I thought of fixing the consistency first:

1. I wanted to see how intermediate results are handled for numeric types. For example, for tinyint, (1228+1228)/20 yields an in-range result, whereas the intermediate result 1228+1228 does not fit in a tinyint. This scenario will be very common with exponential notation.

2. HIVE-5382 will need a baseline to compare the string cast results with the non-string cast results. My plan was to use test cases like this:

    select cast('-1.5e2' as int)-cast(-1.5e2 as int) from tmp

and verify that the result is always 0. This will ensure consistency across casts from string to numeric types (and will expose any existing bug that later gets fixed for only one of the cast types, since the non-string cast and the string cast are handled separately).

> Hive's casting behavior needs to be consistent
> ----------------------------------------------
>
>                 Key: HIVE-5436
>                 URL: https://issues.apache.org/jira/browse/HIVE-5436
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>
> Hive's casting behavior is inconsistent, and the behavior of casting from one
> type to another is currently undocumented when the cast value is out of range.
> For example, casting out-of-range values from one type to another can produce
> incorrect results.
> Eg:
> 1. select cast('1000' as tinyint) from t1;
> NULL
> 2. select 1000Y from t1;
> FAILED: SemanticException [Error 10029]: Line 1:7 Invalid numerical constant
> '1000Y'
> 3. select cast(1000 as tinyint) from t1;
> -24
> 4. select cast(1.1e3-1000/0 as tinyint) from t1;
> 0
> 5. select cast(10/0 as tinyint) from pw18;
> -1
> A Hive user can accidentally try to typecast an out-of-range value. For
> example, in cases 4 and 5, even though the final result is NaN, Hive can
> typecast it to a seemingly random result. Either we should document that the
> end user must take care of overflow, underflow, division by 0, etc.
> himself/herself, or we should return NULLs when the final result is out of
> range.
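
For illustration only: the values reported in examples 3 to 5 are what plain Java narrowing conversion produces, assuming the numeric cast is evaluated as a Java primitive cast on a double (an assumption, not something confirmed from the Hive source). Under IEEE 754 double arithmetic, 10/0 is Infinity and 1.1e3-1000/0 is -Infinity. The class and method names below are hypothetical, and `castToTinyintOrNull` is only a sketch of the "return NULL when out of range" option, with Java `null` standing in for SQL NULL.

```java
final class TinyintCastSketch {
    // Plain Java narrowing: double -> int (saturating, NaN -> 0),
    // then int -> byte (keeps only the low 8 bits).
    static byte narrowingCast(double value) {
        return (byte) value;
    }

    // Hypothetical range-checked alternative: returns null (standing in
    // for SQL NULL) when the value is NaN or outside [-128, 127].
    static Byte castToTinyintOrNull(double value) {
        if (Double.isNaN(value) || value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            return null;
        }
        return (byte) value;
    }

    public static void main(String[] args) {
        System.out.println(narrowingCast(1000));                        // -24, as in example 3
        System.out.println(narrowingCast(1.1e3 - 1000 / 0.0));          // 0, as in example 4
        System.out.println(narrowingCast(10 / 0.0));                    // -1, as in example 5
        System.out.println(castToTinyintOrNull(1000));                  // null
        System.out.println(castToTinyintOrNull(10 / 0.0));              // null
        System.out.println(castToTinyintOrNull((1228 + 1228) / 20.0));  // 122
    }
}
```

The last call mirrors the (1228+1228)/20 case: only the final value is range-checked, so an out-of-range intermediate sum does not force a NULL.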