I have attached the zipped log file at
https://issues.apache.org/jira/browse/FLUME-1928



________________________________
 From: Hari Shreedharan <[email protected]>
To: [email protected]; Rahul Ravindran <[email protected]> 
Sent: Monday, February 25, 2013 1:30 PM
Subject: Re: File Channel error stops flume
 

Can you send your full logs? I suspect the channel did a full replay because it 
was restarted during a restart. (If it did, the logs would show a 
BadCheckpointException). 


Hari


-- 
Hari Shreedharan

On Monday, February 25, 2013 at 1:20 PM, Rahul Ravindran wrote:
Thanks Hari. I had waited for 20 minutes and this did not move change. Now, 
after more than an hour, I see it working
>
>
>
>________________________________
> From: Hari Shreedharan <[email protected]>
>To: [email protected]; Rahul Ravindran <[email protected]> 
>Sent: Monday, February 25, 2013 12:46 PM
>Subject: Re: File Channel error stops flume
> 
>
>Rahul, 
>
>
>Those messages actually just suggest that your channel is replaying. The 
>channel will complete the replay and the agent will start the sinks once the 
>channel is ready. It might take a few minutes based on how many events you 
>have in the channel.
>
>
>
>
>Hari
>
>
>-- 
>Hari Shreedharan
>
>
>On Monday, February 25, 2013 at 12:07 PM, Rahul Ravindran wrote:
>Hi,
>>   I modified a parameter to the HDFS sink on a flume config (added an 
>>idleInterval) on 2 machines. Things worked fine on one, and not on the other. 
>>I tried restarting flume a couple of times and I continue seeing the same log 
>>statement (bolded below) with no writes to HDFS
>>
>>
>>25 Feb 2013 08:27:00,174 INFO  [Log-BackgroundWorker-ch2] 
>>(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:109)  - 
>>Start checkpoint for /flume2/checkpoint/checkpoint, elements to sync = 8506
>>:%
>>25 Feb 2013 19:55:51,577 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume2/data/log-17
>>25 Feb 2013 19:55:51,585 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume1/data/log-17
>>25 Feb 2013 19:55:51,588 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize:113)  - 
>>Unable to get maxDirectMemory from VM: NoSuchMethodException: 
>>sun.misc.VM.maxDirectMemory(null)
>>25 Feb 2013 19:55:51,592 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.tools.DirectMemoryUtils.allocate:47)  - Direct Memory 
>>Allocation:  Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 
>>268435456, Remaining = 268435456
>>25 Feb 2013 19:55:51,634 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 1622812128
>>25 Feb 2013 19:55:51,634 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 1622720601
>>25 Feb 2013 19:55:51,654 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume2/data/log-18
>>25 Feb 2013 19:55:51,655 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 1622821593
>>25 Feb 2013 19:55:51,655 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume2/data/log-19
>>25 Feb 2013 19:55:51,656 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 1622678590
>>25 Feb 2013 19:55:51,656 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume2/data/log-20
>>25 Feb 2013 19:55:51,657 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 244707334
>>25 Feb 2013 19:55:51,657 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume2/data/log-21
>>25 Feb 2013 19:55:51,657 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 530601497
>>25 Feb 2013 19:55:51,658 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 530601497 in /flume2/data/log-21
>>25 Feb 2013 19:55:51,658 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume2/data/log-22
>>25 Feb 2013 19:55:51,658 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume1/data/log-18
>>25 Feb 2013 19:55:51,658 WARN  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:470)
>>  - Checkpoint for file(/flume2/data/log-22) is: 1361844516782, which is 
>>beyond the requested checkpoint time: 1361844516783 and position 0
>>25 Feb 2013 19:55:51,659 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 1622674426
>>25 Feb 2013 19:55:51,659 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume2/data/log-23
>>25 Feb 2013 19:55:51,659 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume1/data/log-19
>>25 Feb 2013 19:55:51,659 WARN  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:470)
>>  - Checkpoint for file(/flume2/data/log-23) is: 1361844516783, which is 
>>beyond the requested checkpoint time: 1361844516783 and position 0
>>25 Feb 2013 19:55:51,660 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 1622239091
>>25 Feb 2013 19:55:51,660 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume1/data/log-20
>>25 Feb 2013 19:55:51,661 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 221490603
>>25 Feb 2013 19:55:51,661 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
>>/flume1/data/log-21
>>25 Feb 2013 19:55:51,661 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
>>  - fast-forward to checkpoint position: 532696754
>>25 Feb 2013 19:55:52,048 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 1623195597 in /flume1/data/log-17
>>25 Feb 2013 19:55:52,103 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 1623195583 in /flume2/data/log-17
>>25 Feb 2013 19:55:52,308 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 1623195536 in /flume2/data/log-18
>>25 Feb 2013 19:55:52,319 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 1623195584 in /flume1/data/log-18
>>25 Feb 2013 19:55:52,418 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 1623195600 in /flume2/data/log-19
>>25 Feb 2013 19:55:52,439 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 244816205 in /flume2/data/log-20
>>25 Feb 2013 19:55:52,440 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.ReplayHandler.replayLog:320)  - read: 12348, 
>>put: 0, take: 0, rollback: 0, commit: 0, skip: 12348, eventCount:0
>>25 Feb 2013 19:55:52,441 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.Log.replay:399)  - Rolling /flume2/data
>>25 Feb 2013 19:55:52,441 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.Log.roll:811)  - Roll start /flume2/data
>>25 Feb 2013 19:55:52,443 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$Writer.<init>:171)  - Opened 
>>/flume2/data/log-24
>>25 Feb 2013 19:55:52,449 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.Log.roll:826)  - Roll end
>>25 Feb 2013 19:55:52,453 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:109)  - 
>>Start checkpoint for /flume2/checkpoint/checkpoint, elements to sync = 0
>>25 Feb 2013 19:55:52,455 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:117)  - 
>>Updating checkpoint metadata: logWriteOrderID: 1361844516784, queueSize: 
>>34525000, queueHead: 40625267
>>25 Feb 2013 19:55:52,489 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
>>Updating log-24.meta currentPosition = 0, logWriteOrderID = 1361844516784
>>25 Feb 2013 19:55:52,491 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.Log.writeCheckpoint:886)  - Updated checkpoint 
>>for file: /flume2/data/log-24 position: 0 logWriteOrderID: 1361844516784
>>25 Feb 2013 19:55:52,491 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
>>RandomReader /flume2/data/log-17
>>25 Feb 2013 19:55:52,497 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
>>Updating log-17.meta currentPosition = 1622720601, logWriteOrderID = 
>>1361844516784
>>25 Feb 2013 19:55:52,499 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
>>RandomReader /flume2/data/log-18
>>25 Feb 2013 19:55:52,500 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 1623195593 in /flume1/data/log-19
>>25 Feb 2013 19:55:52,505 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
>>Updating log-18.meta currentPosition = 1622821593, logWriteOrderID = 
>>1361844516784
>>25 Feb 2013 19:55:52,507 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
>>RandomReader /flume2/data/log-19
>>25 Feb 2013 19:55:52,513 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
>>Updating log-19.meta currentPosition = 1622678590, logWriteOrderID = 
>>1361844516784
>>25 Feb 2013 19:55:52,514 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
>>RandomReader /flume2/data/log-20
>>25 Feb 2013 19:55:52,520 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
>>Updating log-20.meta currentPosition = 244707334, logWriteOrderID = 
>>1361844516784
>>25 Feb 2013 19:55:52,521 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
>>RandomReader /flume2/data/log-21
>>25 Feb 2013 19:55:52,527 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
>>Updating log-21.meta currentPosition = 530601497, logWriteOrderID = 
>>1361844516784
>>25 Feb 2013 19:55:52,529 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.channel.file.FileChannel.start:309)  - Queue Size after 
>>replay: 34525000 [channel=ch2]
>>25 Feb 2013 19:55:52,594 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.instrumentation.MonitoredCounterGroup.register:89)  - 
>>Monitoried counter group for type: CHANNEL, name: ch2, registered 
>>successfully.
>>25 Feb 2013 19:55:52,594 INFO  [lifecycleSupervisor-1-0] 
>>(org.apache.flume.instrumentation.MonitoredCounterGroup.start:73)  - 
>>Component type: CHANNEL, name: ch2 started
>>25 Feb 2013 19:55:52,619 INFO  [lifecycleSupervisor-1-1] 
>>(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
>>Encountered EOF at 222290119 in /flume1/data/log-20 
>
>
>
> 

Reply via email to