Please file a bug for this with the details provided in your email.
On Tue, Aug 26, 2014 at 9:44 AM, Gary Malouf <malouf.g...@gmail.com> wrote:

> +1 I've seen this same issue.
>
> On Tue, Aug 26, 2014 at 12:33 PM, Andrew O'Neill <aone...@paytronix.com> wrote:
>
>> Hello all,
>>
>> My setup:
>> - Flume 1.4
>> - CDH 4.2.2 (2.0.0-cdh4.2.2)
>>
>> I am testing a simple Flume setup with a Sequence Generator Source, a
>> File Channel, and an HDFS Sink (see my flume.conf below). This
>> configuration works as expected until I reboot the cluster’s NameNode or
>> restart the HDFS service on the cluster. At that point, the Flume Agent
>> appears unable to reconnect to HDFS and must be restarted manually.
>> Since this is not an uncommon occurrence in our production cluster, it
>> is important that Flume is able to reconnect gracefully without any
>> manual intervention.
>>
>> So, how do we fix this HDFS reconnection issue?
>>
>> Here is our flume.conf:
>>
>> appserver.sources = rawtext
>> appserver.channels = testchannel
>> appserver.sinks = test_sink
>>
>> appserver.sources.rawtext.type = seq
>> appserver.sources.rawtext.channels = testchannel
>>
>> appserver.channels.testchannel.type = file
>> appserver.channels.testchannel.capacity = 10000000
>> appserver.channels.testchannel.minimumRequiredSpace = 214748364800
>> appserver.channels.testchannel.checkpointDir = /Users/aoneill/Desktop/testchannel/checkpoint
>> appserver.channels.testchannel.dataDirs = /Users/aoneill/Desktop/testchannel/data
>> appserver.channels.testchannel.maxFileSize = 20000000
>>
>> appserver.sinks.test_sink.type = hdfs
>> appserver.sinks.test_sink.channel = testchannel
>> appserver.sinks.test_sink.hdfs.path = hdfs://cluster01:8020/user/aoneill/flumetest
>> appserver.sinks.test_sink.hdfs.closeTries = 3
>> appserver.sinks.test_sink.hdfs.filePrefix = events-
>> appserver.sinks.test_sink.hdfs.fileSuffix = .avro
>> appserver.sinks.test_sink.hdfs.fileType = DataStream
>> appserver.sinks.test_sink.hdfs.writeFormat = Text
>> appserver.sinks.test_sink.hdfs.inUsePrefix = inuse-
>> appserver.sinks.test_sink.hdfs.inUseSuffix = .avro
>> appserver.sinks.test_sink.hdfs.rollCount = 100000
>> appserver.sinks.test_sink.hdfs.rollInterval = 30
>> appserver.sinks.test_sink.hdfs.rollSize = 10485760
>>
>> These are the two error messages that the Flume Agent outputs constantly
>> after the restart:
>>
>> 2014-08-26 10:47:24,572 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:96)] Unexpected error while checking replication factor
>> java.lang.reflect.InvocationTargetException
>>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
>>     at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
>>     at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
>>     at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
>>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
>>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>>     at java.lang.Thread.run(Thread.java:744)
>> Caused by: java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:525)
>>     at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1253)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:891)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:881)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:982)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:779)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
>>
>> and
>>
>> 2014-08-26 10:47:29,592 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:418)] HDFS IO error
>> java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:525)
>>     at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1253)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:891)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:881)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:982)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:779)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
>>
>> I can provide additional information if needed. Thank you very much for
>> any insight you are able to provide into this problem.
>>
>> Best,
>>
>> Andrew