Matt McCline created HIVE-7269:
----------------------------------

             Summary: First query in ptf.q (Partition Table Function test) fails when input table is changed to ORC format
                 Key: HIVE-7269
                 URL: https://issues.apache.org/jira/browse/HIVE-7269
             Project: Hive
          Issue Type: Bug
            Reporter: Matt McCline
            Assignee: Harish Butani



This fails:

{noformat}
CREATE TABLE partorc( 
    p_partkey INT,
    p_name STRING,
    p_mfgr STRING,
    p_brand STRING,
    p_type STRING,
    p_size INT,
    p_container STRING,
    p_retailprice DOUBLE,
    p_comment STRING
) STORED AS ORC;

LOAD DATA LOCAL INPATH '/Users/mmccline/hive_ptf/data/files/part_tiny.txt' overwrite into table partorc;

select 
  p_mfgr, 
  p_name, 
  p_size,
  rank() 
    over (partition by p_mfgr order by p_name) as r,
  dense_rank() 
    over (partition by p_mfgr order by p_name) as dr,
  sum(p_retailprice) 
    over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1
from noop(on partorc 
  partition by p_mfgr
  order by p_name
  );

{noformat}


The same query works when the STORED AS ORC clause is removed.

With set hive.execution.engine=tez, the query against the ORC table fails with the following stack trace:

{noformat}
14/06/20 15:05:33 [main]: ERROR tez.TezJobMonitor: Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1403230487252_0002_1_02, diagnostics=[Task failed, taskId=task_1403230487252_0002_1_02_000000, diagnostics=[AttemptID:attempt_1403230487252_0002_1_02_000000_0 Info:Error: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
        at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:394)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
        at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:174)
        at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:113)
        at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:79)
        at org.apache.tez.mapreduce.input.MRInput.setupOldRecordReader(MRInput.java:250)
        at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:400)
        at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:379)
        at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:110)
        at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:79)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
        ... 6 more
Caused by: java.io.IOException: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
        at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:171)
        ... 14 more
Caused by: java.io.IOException: Malformed ORC file hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid postscript.
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.ensureOrcFooter(ReaderImpl.java:226)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:336)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:292)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:201)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1010)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:241)
        ... 15 more
{noformat}
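
A possible reading of the trace, offered as a hypothesis and not a confirmed diagnosis: the ORC reader is opening part_tiny.txt itself from the partorc warehouse directory, which is consistent with LOAD DATA placing the text file in the table location without converting it to ORC. Under that assumption, the test data could be loaded through a text staging table instead. The sketch below uses a hypothetical staging table named part_text and assumes part_tiny.txt uses Hive's default text delimiters.

{noformat}
-- Hypothetical staging table; part_text is an illustrative name, not from ptf.q.
-- Assumes part_tiny.txt uses Hive's default field delimiters.
CREATE TABLE part_text( 
    p_partkey INT,
    p_name STRING,
    p_mfgr STRING,
    p_brand STRING,
    p_type STRING,
    p_size INT,
    p_container STRING,
    p_retailprice DOUBLE,
    p_comment STRING
) STORED AS TEXTFILE;

-- Load the raw text file into the text-format table.
LOAD DATA LOCAL INPATH '/Users/mmccline/hive_ptf/data/files/part_tiny.txt' overwrite into table part_text;

-- Rewrite the rows through the ORC writer so partorc contains real ORC files.
INSERT OVERWRITE TABLE partorc SELECT * FROM part_text;
{noformat}

If the PTF query succeeds against partorc after this, the failure would point at the load step and not at the PTF operator.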



--
This message was sent by Atlassian JIRA
(v6.2#6252)
