Thanks for the reply, Alan. What do you mean by the Hortonworks list? Is it a mailing list like this one? I can't find the address. Or is it some kind of HDP support feature? (Sadly, we haven't purchased a support subscription yet.)
On Fri, Mar 11, 2016 at 9:02 PM, Alan Gates <alanfga...@gmail.com> wrote:
> I believe this is an issue in the Storm Hive bolt. I don't have an Apache JIRA on it,
> but if you ask on the Hortonworks lists we can connect you with the fix for the storm bolt.
>
> Alan.
>
> > On Mar 10, 2016, at 04:02, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> > Hello, I'm using Hortonworks Data Platform 2.3.4, which includes Apache Hive 1.2.1 and Apache Storm 0.10.
> > I've built a Storm topology using the Hive Bolt, which in turn uses the Hive Streaming API to stream data into a Hive table.
> > In Hive I've created a transactional table:
> >
> > • CREATE EXTERNAL TABLE cdr1 (
> > •   ........
> > • )
> > • PARTITIONED BY (dt INT)
> > • CLUSTERED BY (telcoId) INTO 5 buckets
> > • STORED AS ORC
> > • LOCATION '/data/sorm3/cdr/cdr1'
> > • TBLPROPERTIES ("transactional"="true")
> >
> > Hive settings:
> >
> > • hive.support.concurrency=true
> > • hive.enforce.bucketing=true
> > • hive.exec.dynamic.partition.mode=nonstrict
> > • hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
> > • hive.compactor.initiator.on=true
> > • hive.compactor.worker.threads=1
> >
> > When I run my Storm topology it fails with an OutOfMemoryException. The Storm exception doesn't bother me, it was just a test. But after the topology fails, my Hive table is not consistent.
> > A simple select from the table leads to an exception:
> >
> > SELECT COUNT(*) FROM cdr1
> > ERROR : Status: Failed
> > ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1453891518300_0098_1_00, diagnostics=[Task failed, taskId=task_1453891518300_0098_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.EOFException
> > ....
> > Caused by: java.io.IOException: java.io.EOFException
> >     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> >     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> >     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
> >     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
> >     ... 19 more
> > Caused by: java.io.EOFException
> >     at java.io.DataInputStream.readFully(DataInputStream.java:197)
> >     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:370)
> >     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:317)
> >     at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:238)
> >     at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:460)
> >     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1269)
> >     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1151)
> >     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
> >     ... 20 more
> > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1453891518300_0098_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
> > ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1453891518300_0098_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1453891518300_0098_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
> > ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
> >
> > Compaction fails with the same exception:
> >
> > 2016-03-10 13:20:54,550 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.EOFException: Cannot seek after EOF
> >     at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1488)
> >     at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62)
> >     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:368)
> >     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:317)
> >     at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:238)
> >     at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:460)
> >     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:1362)
> >     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:565)
> >     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:544)
> >     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> >     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:422)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> >     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> >
> > Looking through the files that were created by streaming, I've found several zero-sized ORC files. Probably these files lead to the exception.
> >
> > Is this normal for a Hive transactional table? How can I prevent such behavior?
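
For context, the topology feeds the table through the storm-hive HiveBolt, which drives the Hive Streaming API. The following is a minimal sketch of how such a bolt is typically wired, assuming the storm-hive 0.10 API; it is not the real topology code, and the metastore URI, spout id, class name, batch sizes and column list are placeholders:

import backtype.storm.tuple.Fields;
import org.apache.storm.hive.bolt.HiveBolt;
import org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper;
import org.apache.storm.hive.common.HiveOptions;

public class HiveBoltSketch {
    public static HiveBolt buildHiveBolt() {
        // Map tuple fields to table columns and to the dynamic partition column "dt".
        DelimitedRecordHiveMapper mapper = new DelimitedRecordHiveMapper()
                .withColumnFields(new Fields("telcoId" /* , other cdr1 columns */))
                .withPartitionFields(new Fields("dt"));

        // Each bolt task opens Hive Streaming connections via the metastore and
        // writes ORC delta files under the table/partition location.
        HiveOptions options = new HiveOptions(
                "thrift://metastore-host:9083",   // placeholder metastore URI
                "default", "cdr1", mapper)
                .withTxnsPerBatch(10)
                .withBatchSize(1000)
                .withIdleTimeout(60);

        return new HiveBolt(options);
        // In the topology, roughly:
        //   builder.setBolt("hiveBolt", buildHiveBolt(), 5)
        //          .fieldsGrouping("cdrSpout", new Fields("telcoId"));
    }
}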
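The zero-sized files can be confirmed directly under the table location. Here is a small sketch using the standard Hadoop FileSystem API to list empty bucket files; the path is the table LOCATION from the DDL above, and the class name is only illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class FindEmptyOrcFiles {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Recurse into the partition and delta_* directories under the table location.
        RemoteIterator<LocatedFileStatus> it =
                fs.listFiles(new Path("/data/sorm3/cdr/cdr1"), true);
        while (it.hasNext()) {
            LocatedFileStatus status = it.next();
            // A zero-byte bucket file has no ORC footer, which is exactly what
            // extractMetaInfoFromFooter() trips over in the stack traces above.
            if (status.isFile() && status.getLen() == 0) {
                System.out.println("empty file: " + status.getPath());
            }
        }
    }
}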