I have attached the zipped log file at https://issues.apache.org/jira/browse/FLUME-1928
________________________________ From: Hari Shreedharan <[email protected]> To: [email protected]; Rahul Ravindran <[email protected]> Sent: Monday, February 25, 2013 1:30 PM Subject: Re: File Channel error stops flume Can you send your full logs? I suspect the channel did a full replay because it was restarted during a restart. (If it did, the logs would show a BadCheckpointException). Hari -- Hari Shreedharan On Monday, February 25, 2013 at 1:20 PM, Rahul Ravindran wrote: Thanks Hari. I had waited for 20 minutes and this did not move change. Now, after more than an hour, I see it working > > > >________________________________ > From: Hari Shreedharan <[email protected]> >To: [email protected]; Rahul Ravindran <[email protected]> >Sent: Monday, February 25, 2013 12:46 PM >Subject: Re: File Channel error stops flume > > >Rahul, > > >Those messages actually just suggest that your channel is replaying. The >channel will complete the replay and the agent will start the sinks once the >channel is ready. It might take a few minutes based on how many events you >have in the channel. > > > > >Hari > > >-- >Hari Shreedharan > > >On Monday, February 25, 2013 at 12:07 PM, Rahul Ravindran wrote: >Hi, >> I modified a parameter to the HDFS sink on a flume config (added an >>idleInterval) on 2 machines. Things worked fine on one, and not on the other. >>I tried restarting flume a couple of times and I continue seeing the same log >>statement (bolded below) with no writes to HDFS >> >> >>25 Feb 2013 08:27:00,174 INFO [Log-BackgroundWorker-ch2] >>(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:109) - >>Start checkpoint for /flume2/checkpoint/checkpoint, elements to sync = 8506 >>:% >>25 Feb 2013 19:55:51,577 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume2/data/log-17 >>25 Feb 2013 19:55:51,585 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume1/data/log-17 >>25 Feb 2013 19:55:51,588 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize:113) - >>Unable to get maxDirectMemory from VM: NoSuchMethodException: >>sun.misc.VM.maxDirectMemory(null) >>25 Feb 2013 19:55:51,592 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.tools.DirectMemoryUtils.allocate:47) - Direct Memory >>Allocation: Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = >>268435456, Remaining = 268435456 >>25 Feb 2013 19:55:51,634 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 1622812128 >>25 Feb 2013 19:55:51,634 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 1622720601 >>25 Feb 2013 19:55:51,654 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume2/data/log-18 >>25 Feb 2013 19:55:51,655 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 1622821593 >>25 Feb 2013 19:55:51,655 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume2/data/log-19 >>25 Feb 2013 19:55:51,656 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 1622678590 >>25 Feb 2013 19:55:51,656 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume2/data/log-20 >>25 Feb 2013 19:55:51,657 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 244707334 >>25 Feb 2013 19:55:51,657 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume2/data/log-21 >>25 Feb 2013 19:55:51,657 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 530601497 >>25 Feb 2013 19:55:51,658 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 530601497 in /flume2/data/log-21 >>25 Feb 2013 19:55:51,658 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume2/data/log-22 >>25 Feb 2013 19:55:51,658 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume1/data/log-18 >>25 Feb 2013 19:55:51,658 WARN [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:470) >> - Checkpoint for file(/flume2/data/log-22) is: 1361844516782, which is >>beyond the requested checkpoint time: 1361844516783 and position 0 >>25 Feb 2013 19:55:51,659 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 1622674426 >>25 Feb 2013 19:55:51,659 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume2/data/log-23 >>25 Feb 2013 19:55:51,659 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume1/data/log-19 >>25 Feb 2013 19:55:51,659 WARN [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:470) >> - Checkpoint for file(/flume2/data/log-23) is: 1361844516783, which is >>beyond the requested checkpoint time: 1361844516783 and position 0 >>25 Feb 2013 19:55:51,660 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 1622239091 >>25 Feb 2013 19:55:51,660 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume1/data/log-20 >>25 Feb 2013 19:55:51,661 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 221490603 >>25 Feb 2013 19:55:51,661 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:236) - Replaying >>/flume1/data/log-21 >>25 Feb 2013 19:55:51,661 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466) >> - fast-forward to checkpoint position: 532696754 >>25 Feb 2013 19:55:52,048 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 1623195597 in /flume1/data/log-17 >>25 Feb 2013 19:55:52,103 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 1623195583 in /flume2/data/log-17 >>25 Feb 2013 19:55:52,308 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 1623195536 in /flume2/data/log-18 >>25 Feb 2013 19:55:52,319 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 1623195584 in /flume1/data/log-18 >>25 Feb 2013 19:55:52,418 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 1623195600 in /flume2/data/log-19 >>25 Feb 2013 19:55:52,439 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 244816205 in /flume2/data/log-20 >>25 Feb 2013 19:55:52,440 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.ReplayHandler.replayLog:320) - read: 12348, >>put: 0, take: 0, rollback: 0, commit: 0, skip: 12348, eventCount:0 >>25 Feb 2013 19:55:52,441 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.Log.replay:399) - Rolling /flume2/data >>25 Feb 2013 19:55:52,441 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.Log.roll:811) - Roll start /flume2/data >>25 Feb 2013 19:55:52,443 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$Writer.<init>:171) - Opened >>/flume2/data/log-24 >>25 Feb 2013 19:55:52,449 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.Log.roll:826) - Roll end >>25 Feb 2013 19:55:52,453 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:109) - >>Start checkpoint for /flume2/checkpoint/checkpoint, elements to sync = 0 >>25 Feb 2013 19:55:52,455 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:117) - >>Updating checkpoint metadata: logWriteOrderID: 1361844516784, queueSize: >>34525000, queueHead: 40625267 >>25 Feb 2013 19:55:52,489 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - >>Updating log-24.meta currentPosition = 0, logWriteOrderID = 1361844516784 >>25 Feb 2013 19:55:52,491 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.Log.writeCheckpoint:886) - Updated checkpoint >>for file: /flume2/data/log-24 position: 0 logWriteOrderID: 1361844516784 >>25 Feb 2013 19:55:52,491 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$RandomReader.close:356) - Closing >>RandomReader /flume2/data/log-17 >>25 Feb 2013 19:55:52,497 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - >>Updating log-17.meta currentPosition = 1622720601, logWriteOrderID = >>1361844516784 >>25 Feb 2013 19:55:52,499 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$RandomReader.close:356) - Closing >>RandomReader /flume2/data/log-18 >>25 Feb 2013 19:55:52,500 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 1623195593 in /flume1/data/log-19 >>25 Feb 2013 19:55:52,505 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - >>Updating log-18.meta currentPosition = 1622821593, logWriteOrderID = >>1361844516784 >>25 Feb 2013 19:55:52,507 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$RandomReader.close:356) - Closing >>RandomReader /flume2/data/log-19 >>25 Feb 2013 19:55:52,513 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - >>Updating log-19.meta currentPosition = 1622678590, logWriteOrderID = >>1361844516784 >>25 Feb 2013 19:55:52,514 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$RandomReader.close:356) - Closing >>RandomReader /flume2/data/log-20 >>25 Feb 2013 19:55:52,520 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - >>Updating log-20.meta currentPosition = 244707334, logWriteOrderID = >>1361844516784 >>25 Feb 2013 19:55:52,521 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFile$RandomReader.close:356) - Closing >>RandomReader /flume2/data/log-21 >>25 Feb 2013 19:55:52,527 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - >>Updating log-21.meta currentPosition = 530601497, logWriteOrderID = >>1361844516784 >>25 Feb 2013 19:55:52,529 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.channel.file.FileChannel.start:309) - Queue Size after >>replay: 34525000 [channel=ch2] >>25 Feb 2013 19:55:52,594 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.instrumentation.MonitoredCounterGroup.register:89) - >>Monitoried counter group for type: CHANNEL, name: ch2, registered >>successfully. >>25 Feb 2013 19:55:52,594 INFO [lifecycleSupervisor-1-0] >>(org.apache.flume.instrumentation.MonitoredCounterGroup.start:73) - >>Component type: CHANNEL, name: ch2 started >>25 Feb 2013 19:55:52,619 INFO [lifecycleSupervisor-1-1] >>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - >>Encountered EOF at 222290119 in /flume1/data/log-20 > > > >
