Matt McCline created HIVE-7269:
----------------------------------

             Summary: First query in ptf.q (Partition Table Function test) fails when input table is changed to ORC format
                 Key: HIVE-7269
                 URL: https://issues.apache.org/jira/browse/HIVE-7269
             Project: Hive
          Issue Type: Bug
            Reporter: Matt McCline
            Assignee: Harish Butani

This fails:

{noformat}
CREATE TABLE partorc(
    p_partkey INT,
    p_name STRING,
    p_mfgr STRING,
    p_brand STRING,
    p_type STRING,
    p_size INT,
    p_container STRING,
    p_retailprice DOUBLE,
    p_comment STRING
)
STORED AS ORC;

LOAD DATA LOCAL INPATH '/Users/mmccline/hive_ptf/data/files/part_tiny.txt' overwrite into table partorc;

select p_mfgr, p_name, p_size,
rank() over (partition by p_mfgr order by p_name) as r,
dense_rank() over (partition by p_mfgr order by p_name) as dr,
sum(p_retailprice) over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1
from noop(on part
  partition by p_mfgr
  order by p_name
  );
{noformat}

The same query works when the STORED AS ORC clause is removed.

With set hive.execution.engine=tez, the ORC table produces the following failure stack trace:

{noformat}
14/06/20 15:05:33 [main]: ERROR tez.TezJobMonitor: Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1403230487252_0002_1_02, diagnostics=[Task failed, taskId=task_1403230487252_0002_1_02_000000, diagnostics=[AttemptID:attempt_1403230487252_0002_1_02_000000_0 Info:Error: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
	at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:394)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:174)
	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:113)
	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:79)
	at org.apache.tez.mapreduce.input.MRInput.setupOldRecordReader(MRInput.java:250)
	at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:400)
	at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:379)
	at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:110)
	at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:79)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
	... 6 more
Caused by: java.io.IOException: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:171)
	... 14 more
Caused by: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.ensureOrcFooter(ReaderImpl.java:226)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:336)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:292)
	at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:201)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1010)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:241)
	... 15 more
{noformat}
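
A possible reading of the trace, not confirmed anywhere in this report: the file the ORC reader rejects is part_tiny.txt itself, sitting in the partorc warehouse directory. LOAD DATA only copies or moves files into the table location and does not convert them to the table's storage format, so a text file loaded into an ORC table would be expected to fail the ORC footer check in this way. The sketch below shows one way to populate partorc with real ORC data through a text staging table. The table name part_staging is hypothetical, and the sketch assumes part_tiny.txt uses Hive's default text field delimiter; a ROW FORMAT clause would be needed otherwise.

{noformat}
-- Hypothetical staging table, stored as plain text (assumes default delimiters).
CREATE TABLE part_staging(
    p_partkey INT,
    p_name STRING,
    p_mfgr STRING,
    p_brand STRING,
    p_type STRING,
    p_size INT,
    p_container STRING,
    p_retailprice DOUBLE,
    p_comment STRING
)
STORED AS TEXTFILE;

-- LOAD DATA copies the text file as-is, which matches the staging table's format.
LOAD DATA LOCAL INPATH '/Users/mmccline/hive_ptf/data/files/part_tiny.txt' OVERWRITE INTO TABLE part_staging;

-- INSERT ... SELECT runs a job that rewrites the rows through the ORC writer.
INSERT OVERWRITE TABLE partorc SELECT * FROM part_staging;
{noformat}

If the original query succeeds against partorc after this, the failure would point at the load step rather than at the PTF or windowing code.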