
Illya Yalovyy commented on HIVE-13185:

OrcInputFormat.validateInput(...) only checks that the file length is not 0, but a 
valid ORC file must be considerably larger than that. Can we determine the size of 
the smallest valid ORC file? For instance, would it be correct to replace "if 
(file.getLen() == 0)" with "if (file.getLen() < OrcFile.MAGIC.length() + 1)"?
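
To make the proposed check concrete, here is a minimal standalone sketch. It is not the Hive code: the class and method names are hypothetical, MAGIC is assumed to match OrcFile.MAGIC ("ORC"), and the 2-byte length assumes the text file from the reproduction steps contains just "1" and a newline. The "+ 1" is taken directly from the proposal above and is only a lower bound; a real ORC file is larger.

```java
import java.nio.charset.StandardCharsets;

public class MinOrcLengthSketch {
    // Assumption: same value as OrcFile.MAGIC in Hive.
    static final String MAGIC = "ORC";

    // Proposed lower bound from the comment: magic length plus one more byte.
    static final long MIN_LEN = MAGIC.length() + 1;

    // Current behaviour as described above: only empty files are rejected.
    static boolean rejectedByCurrentCheck(long fileLen) {
        return fileLen == 0;
    }

    // Proposed behaviour: reject anything shorter than MIN_LEN.
    static boolean tooSmallToBeOrc(long fileLen) {
        return fileLen < MIN_LEN;
    }

    public static void main(String[] args) {
        // Assumed content of the one-row text file: "1\n" (2 bytes).
        long textFileLen = "1\n".getBytes(StandardCharsets.UTF_8).length;

        System.out.println("length = " + textFileLen);
        System.out.println("rejected by current check:  "
                + rejectedByCurrentCheck(textFileLen));
        System.out.println("rejected by proposed check: "
                + tooSmallToBeOrc(textFileLen));
    }
}
```

Under those assumptions the 2-byte file passes the current check but is rejected by the proposed one, so it would never reach the footer parsing. Whether a tighter bound than MAGIC.length() + 1 is appropriate is exactly the open question.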

> orc.ReaderImp.ensureOrcFooter() method fails on small text files with 
> IndexOutOfBoundsException
> -----------------------------------------------------------------------------------------------
>                 Key: HIVE-13185
>                 URL: https://issues.apache.org/jira/browse/HIVE-13185
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.1.0
>            Reporter: Illya Yalovyy
> Steps to reproduce:
> 1. Create a Text source table with one line of data:
> {code}
> create table src (id int);
> insert overwrite table src values (1);
> {code}
> 2. Create a target table:
> {code}
> create table trg (id int);
> {code}
> 3. Try to load the small text file into the target table:
> {code}
> load data inpath 'user/hive/warehouse/src/000000_0' into table trg;
> {code}
> *Error message:*
> {quote}
> FAILED: SemanticException Unable to load data to destination table. Error: 
> java.lang.IndexOutOfBoundsException
> {quote}
> *Stack trace:*
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: Unable to load data to destination table. Error: java.lang.IndexOutOfBoundsException
>       at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.ensureFileFormatsMatch(LoadSemanticAnalyzer.java:340)
>       at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:224)
>       at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:242)
>       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:481)
>       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>       at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1190)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1285)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1104)
> ...
> {noformat}
