[ https://issues.apache.org/jira/browse/HIVE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842163#comment-13842163 ]
Gopal V commented on HIVE-5979:
-------------------------------

(Pasted from an email)

The nanosecond SQL timestamp handling in Java is horribly broken for usability.

https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldLong.java#L52

Read my comments there on how it handles negative timestamps and sub-second timings.

Because of the way integer division works in Java, you can end up with rounding towards zero, which leaves a negative remainder for negative inputs - this causes hell with the restriction that the value passed to setNanos() always has to be non-negative (0 to 999999999).

On top of that, it always uses 1 integer and 1 long to store the time (unix-epoch seconds + nanos) - the millisecond fraction is stored in the nanos field, so setNanos() always overwrites the millisecond fraction of the time, which is why getNanos() is added to it.

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/sql/Timestamp.java#Timestamp.setTime%28long%29

That makes sense, until you realize that a negative millisecond timing is stored as a -1 second + positive nanosecond time (for example, -500 ms becomes -1 s plus 500000000 ns).

So when you mix that with the negative modulo in Java, you end up with a fairly ugly kludge which needs to take care of several edge cases related to the java.sql.Timestamp implementation.

> Failure in cast to timestamps.
> ------------------------------
>
> Key: HIVE-5979
> URL: https://issues.apache.org/jira/browse/HIVE-5979
> Project: Hive
> Issue Type: Sub-task
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
>
> Query ran:
> {code}
> select cast(t as timestamp), cast(si as timestamp),
> cast(i as timestamp), cast(b as timestamp),
> cast(f as string), cast(d as timestamp),
> cast(bo as timestamp), cast(b * 0 as timestamp),
> cast(ts as timestamp), cast(s as timestamp),
> cast(substr(s, 1, 1) as timestamp)
> from Table1;
> {code}
> Running this query with hive.vectorized.execution.enabled=true fails with the
> following exception:
> {noformat}
> 13/12/05 07:56:36 ERROR tez.TezJobMonitor: Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1386227234886_0482_1_00,
> diagnostics=[Task failed, taskId=task_1386227234886_0482_1_00_000000,
> diagnostics=[AttemptID:attempt_1386227234886_0482_1_00_000000_0 Info:Error:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.processRow(MapRecordProcessor.java:205)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:171)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:112)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:201)
>   at org.apache.hadoop.mapred.YarnTezDagChild$4.run(YarnTezDagChild.java:484)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:474)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.processRow(MapRecordProcessor.java:193)
>   ... 8 more
> Caused by: java.lang.IllegalArgumentException: nanos > 999999999 or < 0
>   at java.sql.Timestamp.setNanos(Timestamp.java:383)
>   at org.apache.hadoop.hive.ql.exec.vector.TimestampUtils.assignTimeInNanoSec(TimestampUtils.java:27)
>   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$1.writeValue(VectorExpressionWriterFactory.java:412)
>   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:162)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:152)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:85)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:786)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:786)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:93)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:786)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
>   ... 9 more
> {noformat}
> Full log is attached.
> Schema for the table is as follows:
> {code}
> hive> desc Table1;
> OK
> t     tinyint     from deserializer
> si    smallint    from deserializer
> i     int         from deserializer
> b     bigint      from deserializer
> f     float       from deserializer
> d     double      from deserializer
> bo    boolean     from deserializer
> s     string      from deserializer
> s2    string      from deserializer
> ts    timestamp   from deserializer
> Time taken: 0.521 seconds, Fetched: 10 row(s)
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
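
The rounding problem described in the comment can be reproduced in a few lines. The sketch below is only an illustration, not the code in Hive's TimestampUtils: the class name NegativeTimestampSketch and the helper fromEpochNanos are hypothetical, and it assumes Java 8 or later for Math.floorDiv and Math.floorMod. It shows that `/` and `%` on a negative nanosecond count give a negative remainder, which setNanos() rejects, while floor division keeps the nanos field in range.

```java
import java.sql.Timestamp;

public class NegativeTimestampSketch {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    /**
     * Hypothetical helper (not Hive's implementation): converts nanoseconds
     * since the epoch into a java.sql.Timestamp.
     */
    static Timestamp fromEpochNanos(long epochNanos) {
        // Floor division rounds toward negative infinity, so the remainder
        // from floorMod is always in the range [0, 999_999_999].
        long seconds = Math.floorDiv(epochNanos, NANOS_PER_SECOND);
        int nanos = (int) Math.floorMod(epochNanos, NANOS_PER_SECOND);

        // Construct from whole seconds only, then set the sub-second part.
        // setNanos() replaces the entire fraction, milliseconds included.
        Timestamp ts = new Timestamp(seconds * 1000L);
        ts.setNanos(nanos);
        return ts;
    }

    public static void main(String[] args) {
        long t = -1_500_000_000L; // 1.5 seconds before the epoch

        // Truncating division: quotient rounds toward zero, remainder is negative.
        System.out.println(t / NANOS_PER_SECOND); // -1
        System.out.println(t % NANOS_PER_SECOND); // -500000000

        // Passing that negative remainder to setNanos() is rejected.
        try {
            new Timestamp(0L).setNanos((int) (t % NANOS_PER_SECOND));
        } catch (IllegalArgumentException e) {
            System.out.println("setNanos rejected: " + e.getMessage());
        }

        // Floor-based conversion: -2 seconds plus 500000000 nanoseconds.
        Timestamp ts = fromEpochNanos(t);
        System.out.println(ts.getTime());  // -1500
        System.out.println(ts.getNanos()); // 500000000
    }
}
```

Printing getTime() and getNanos() rather than the Timestamp itself keeps the output independent of the JVM's default time zone.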