[ https://issues.apache.org/jira/browse/HIVE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111436#comment-14111436 ]
Venkata Puneet Ravuri commented on HIVE-7886:
---------------------------------------------

The same issue occurs in Hive 0.12, but there the query worked once column pruning was disabled by setting the property 'hive.optimize.cp' to false. In Hive 0.13 this property was removed as part of [HIVE-4113|https://issues.apache.org/jira/browse/HIVE-4113].

> Aggregation queries fail with RCFile based Hive tables with S3 storage
> ----------------------------------------------------------------------
>
>                 Key: HIVE-7886
>                 URL: https://issues.apache.org/jira/browse/HIVE-7886
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.13.1
>            Reporter: Venkata Puneet Ravuri
>
> Aggregation queries on Hive tables which use RCFile format and S3 storage are failing.
> My setup is Hadoop 2.5.0 and Hive 0.13.1.
> I create a table with following schema:-
> CREATE EXTERNAL TABLE `testtable`(
>   `col1` string,
>   `col2` tinyint,
>   `col3` int,
>   `col4` float,
>   `col5` boolean,
>   `col6` smallint)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
> WITH SERDEPROPERTIES (
>   'serialization.format'='\t',
>   'line.delim'='\n',
>   'field.delim'='\t'
> )
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
> LOCATION
>   's3n://<testbucket>/testtable';
> When I run 'select count(*) from testtable', it gives the following exception stack:-
> Error: java.io.IOException: java.io.IOException: java.io.EOFException: Attempted to seek or read past the end of the file
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.io.IOException: java.io.EOFException: Attempted to seek or read past the end of the file
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
>     ... 11 more
> Caused by: java.io.EOFException: Attempted to seek or read past the end of the file
>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:462)
>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:234)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at org.apache.hadoop.fs.s3native.$Proxy17.retrieve(Unknown Source)
>     at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:205)
>     at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:96)
>     at org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:67)
>     at java.io.DataInputStream.skipBytes(DataInputStream.java:220)
>     at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.readFields(RCFile.java:739)
>     at org.apache.hadoop.hive.ql.io.RCFile$Reader.currentValueBuffer(RCFile.java:1720)
>     at org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1898)
>     at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:149)
>     at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:44)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
>     ... 15 more

--
This message was sent by Atlassian JIRA
(v6.2#6252)