[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941548#comment-14941548 ]
Ashutosh Chauhan commented on HIVE-11977:
-----------------------------------------

Thanks for the patch, [~dossett].

A 0-length file is an invalid Avro file: Avro's {{DataFileWriter}} always writes the MAGIC header that marks the version. That's the reason {{DataFileReader}} expects it and throws when it doesn't get one. It seems these 0-length files got there because of some faulty generator process. Isn't it better to just not generate those 0-length files? Or, alternatively, to delete these faulty files?

> Hive should handle an external avro table with zero length files present
> ------------------------------------------------------------------------
>
>                 Key: HIVE-11977
>                 URL: https://issues.apache.org/jira/browse/HIVE-11977
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Aaron Dossett
>            Assignee: Aaron Dossett
>         Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero length file is in the top level directory housing an external avro
> table, all hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader
> creates a new org.apache.avro.file.DataFileReader and DataFileReader throws
> an exception when trying to read an empty file (because the empty file lacks
> the magic number marking it as avro).
> AvroGenericRecordReader should detect an empty file and then behave
> reasonably.
> Caused by: java.io.IOException: Not a data file.
>       at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
>       at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
>       at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
>       at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
>       at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
>       ... 25 more

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
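
For illustration only, the empty-file condition discussed in this issue could be detected before a {{DataFileReader}} is ever constructed. The sketch below is not the attached patch and uses no Hive or Avro classes. The class name {{AvroMagicCheck}} and the method {{startsWithAvroMagic}} are hypothetical, and it assumes the file is on a local filesystem reachable through java.nio rather than HDFS. It relies only on the fact that an Avro container file begins with the four bytes 'O', 'b', 'j', 1.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Hive or Avro.
public class AvroMagicCheck {
    // Avro object container files begin with the bytes 'O', 'b', 'j', 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Returns true only if the file is long enough to hold the magic bytes and starts with them. */
    public static boolean startsWithAvroMagic(Path file) throws IOException {
        if (Files.size(file) < MAGIC.length) {
            return false; // zero-length or truncated: cannot contain the header
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            byte[] head = new byte[MAGIC.length];
            in.readFully(head); // safe: the size check above guarantees enough bytes
            return Arrays.equals(head, MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("empty", ".avro");
        Path withHeader = Files.createTempFile("header", ".avro");
        Files.write(withHeader, MAGIC);
        try {
            System.out.println(startsWithAvroMagic(empty));      // prints false
            System.out.println(startsWithAvroMagic(withHeader)); // prints true
        } finally {
            Files.delete(empty);
            Files.delete(withHeader);
        }
    }
}
```

A reader could use a check like this to skip a zero-length file, or a generator-side cleanup job could use it to find the faulty files for deletion. Which of the two is appropriate is the question under discussion above.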