Mike,

The events from the file channel are consumed by the sink and sent to another Flume agent.
I have verified the number through jconsole on the agent and the collector. But the data is still in the data directory: log-1 and log-2 are 1.6 and 1.6G respectively.

Madhu Munagala
(214)679-2872

On Apr 11, 2013, at 3:08 PM, Mike Keane <[email protected]> wrote:

> Are you sure all your events were taken off the channel by the sink?
> Did you verify all the data you sent landed at the final destination? I
> have had my file channel back up like this when sinking to a slow source
> but eventually the file channel empties to a few MB provided I'm not
> adding data faster than the sink can remove it.
>
> I have only seen a similar problem once while evaluating Flume but was
> unable to reproduce it. I had 4 parallel flows. I killed the agents in
> the storage/filter tier (http://blogs.apache.org/flume/) and let logs
> back up in the collector tier. I watched the file channels on the
> collector tier grow to tens of GB each before restarting the
> storage/filter tier agents. 3 of the 4 file channels backing the 4
> parallel flows drained to a few MB each. The 4th however did not. Even
> after I stopped putting data on the flows and verified all data
> successfully landed in the final sink location the 4th channel was still
> 50+ GB. I stopped and restarted the agent and the agent iterated
> through all the data/checkpoint files. Ultimately it sent a couple more
> batches of events but the channel emptied.
>
> So yes, I have seen your problem, however it was either explainable or
> not reproducible. Explainable in the case where data is added to the
> channel faster than the sink can remove it, and not reproducible the one
> time, but Flume fixed itself on a restart.
>
> Because of the one time I witnessed the channel not clearing I will be
> monitoring the file channel size outside of Flume as a precaution when
> we move Flume to production.
>
> Regards,
>
> Mike
>
> On 04/11/2013 02:37 PM, Madhu Gmail wrote:
>> Hello,
>>
>> I have not heard from anyone, so I just want to make sure I have
>> explained the issue correctly.
>>
>> I think this is a common problem for everyone who uses Flume.
>>
>> When the Flume sink consumes a log event from the file channel, what
>> happens to the data that was committed to local disk under the data
>> directory?
>>
>> Will it grow indefinitely, like log-1, log-2, log-3 and so on?
>>
>> Do I have to write a script to remove the data from the data directory?
>>
>> Madhu Munagala
>> (214)679-2872
>>
>> On Apr 11, 2013, at 11:52 AM, Madhu Gmail <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> How do I clean up the data in the file channel data folder? After the
>>> log events are processed by the sink, I still see log-1 and log-2 at
>>> 1.6GB and 1.2GB.
>>>
>>> Once the log events are processed by the sink, shouldn't the data
>>> directory under file-channel be empty?
>>>
>>> Madhu Munagala
>>> (214)679-2872
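
For anyone who wants to watch the file channel size outside of Flume, as Mike mentions above, here is a minimal sketch in Python. It is only an illustration, not anything shipped with Flume: it assumes the data files sit directly under the data directory and are named log-1, log-2, and so on, as in this thread. The function names and the threshold are hypothetical, and it only reports sizes; it does not delete anything.

```python
import os
import re

# Assumption: file channel data files are named log-<number>, as seen in
# this thread (log-1, log-2). Other files in the directory are ignored.
LOG_FILE = re.compile(r"^log-\d+$")


def channel_log_sizes(data_dir):
    """Return {file name: size in bytes} for log-N files directly under data_dir."""
    sizes = {}
    for entry in os.scandir(data_dir):
        if entry.is_file() and LOG_FILE.match(entry.name):
            sizes[entry.name] = entry.stat().st_size
    return sizes


def over_threshold(sizes, max_total_bytes):
    """Return True when the combined size of the log files exceeds the limit."""
    return sum(sizes.values()) > max_total_bytes


def report(data_dir, max_total_bytes):
    """Print each log file's size and a warning if the total is above the limit."""
    sizes = channel_log_sizes(data_dir)
    for name in sorted(sizes):
        print(f"{name}: {sizes[name]} bytes")
    if over_threshold(sizes, max_total_bytes):
        print(f"WARNING: total exceeds {max_total_bytes} bytes")
```

Calling report() from cron with the channel's data directory and a limit of your choosing (for example 2 * 1024**3 for 2 GB) would flag a channel that is not draining.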
