[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riju Trivedi updated HIVE-11977:
--------------------------------
    Description: 
If a zero-length file is in the top-level directory housing an external Avro 
table, all Hive queries on the table fail.

The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
creates a new org.apache.avro.file.DataFileReader, and DataFileReader throws an 
exception when trying to read an empty file (because the empty file lacks the 
magic number marking it as Avro).
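
To illustrate why an empty file trips this check, here is a minimal sketch 
using only the JDK. It is not the Avro or Hive code: the class name 
AvroMagicCheck and the method hasAvroMagic are hypothetical. It relies on the 
fact that an Avro object container file starts with the four bytes 
'O', 'b', 'j', 1; a zero-length file cannot supply them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Avro or Hive.
final class AvroMagicCheck {
    // Avro object container files begin with these four bytes.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Returns true only if the file starts with the four-byte Avro magic. */
    static boolean hasAvroMagic(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[MAGIC.length];
            int off = 0;
            while (off < header.length) {
                int n = in.read(header, off, header.length - off);
                if (n < 0) {
                    return false; // EOF before four bytes: empty or truncated file
                }
                off += n;
            }
            return Arrays.equals(header, MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("empty", ".avro");
        try {
            // A zero-length file has no header bytes at all.
            System.out.println(hasAvroMagic(empty));
        } finally {
            Files.delete(empty);
        }
    }
}
```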

AvroGenericRecordReader should detect an empty file and then behave reasonably.

Caused by: java.io.IOException: Not a data file.
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
... 25 more
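
One possible reading of "behave reasonably" is to treat a zero-length file as 
a source with no records instead of handing it to the reader that throws. The 
sketch below shows that guard with JDK types only. It is an assumption about 
the shape of a fix, not the attached patch: EmptyFileGuard, Opener, and 
openOrEmpty are hypothetical names, and Opener stands in for whatever actually 
constructs the DataFileReader.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical guard for illustration; not the Hive implementation.
final class EmptyFileGuard {
    /** Stand-in for the code that opens a real reader and may throw on bad input. */
    interface Opener<T> {
        Iterator<T> open(Path file) throws IOException;
    }

    /** Returns an empty iterator for a zero-length file; otherwise delegates to the opener. */
    static <T> Iterator<T> openOrEmpty(Path file, Opener<T> opener) throws IOException {
        if (Files.size(file) == 0) {
            // Nothing to read, so never invoke the opener that would fail.
            return Collections.emptyIterator();
        }
        return opener.open(file);
    }
}
```

With this guard, an opener that fails with "Not a data file." on empty input is 
never called for a zero-length file, so a query over the directory would see 
that file as contributing zero rows.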


> Hive should handle an external avro table with zero length files present
> ------------------------------------------------------------------------
>
>                 Key: HIVE-11977
>                 URL: https://issues.apache.org/jira/browse/HIVE-11977
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>            Reporter: Aaron Dossett
>            Assignee: Aaron Dossett
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11977.2.patch, HIVE-11977.patch
>
>



