Add the -XX:+HeapDumpOnOutOfMemoryError parameter as well; if your process hits an OutOfMemoryError, it will generate a heap dump. Allocate heap based on the number of events you need to keep in the channel. Try with 1 GB, but calculate it according to the channel size as (average event size * number of events), plus object overhead.

Please note this is just a rough calculation; actual memory usage will be higher.
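As a back-of-the-envelope sketch of that calculation (the average event size below is a made-up figure, so plug in your own; the 10,000-event capacity comes from the memory-channel config quoted later in this thread):

    # Rough sizing: ~10 KB average event (hypothetical) * 10,000 events = ~100 MB
    # of raw event data; allow roughly 2x for object and header overhead = ~200 MB,
    # so -Xmx1g leaves comfortable headroom. One common place to set the JVM flags
    # is conf/flume-env.sh, which flume-ng picks up from the -c conf directory:
    export JAVA_OPTS="-Xmx1g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"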
On Thu, Jul 17, 2014 at 11:21 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
> Okay, thanks. So for 128 GB of RAM, I will allocate 1 GB as heap memory for the flume agent.
>
> But I am surprised there was no error registered for this memory issue in the log file (flume.log).
>
> Do I need to check any other logs?
>
> On 16 July 2014 21:55, Jonathan Natkins <na...@streamsets.com> wrote:
>> That's definitely your problem. 20 MB is way too low for this. Depending on the other processes you're running on your system, the amount of memory you'll need will vary, but I'd recommend at least 1 GB. You should define it exactly where it's defined right now, so instead of the current command, you can run:
>>
>> "/cv/jvendor/bin/java -Xmx1g -Dflume.root.logger=DEBUG,LOGFILE......"
>>
>> On Wed, Jul 16, 2014 at 3:03 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>> I guess I am using default values; from the running flume process I could see these lines: "/cv/jvendor/bin/java -Xmx20m -Dflume.root.logger=DEBUG,LOGFILE......"
>>>
>>> So I guess it takes 20 MB as the agent's heap. My RAM is 128 GB, so please suggest how much I can assign as heap memory and where to define it.
>>>
>>> On 16 July 2014 15:05, Jonathan Natkins <na...@streamsets.com> wrote:
>>>> Hey Saravana,
>>>>
>>>> I'm attempting to reproduce this, but do you happen to know what the Java heap size is for your Flume agent? This information leads me to believe that you don't have enough memory allocated to the agent, which you may need to do with the -Xmx parameter when you start up your agent. That aside, you can set the byteCapacity parameter on the memory channel to specify how much memory it is allowed to use. It should default to 80% of the Java heap size, but if your heap is too small, this might be a cause of errors.
>>>>
>>>> Does anything get written to the log when you try to pass in an event of this size?
>>>>
>>>> Thanks,
>>>> Natty
>>>>
>>>> On Wed, Jul 16, 2014 at 1:46 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>> Hi Natty,
>>>>>
>>>>> While looking further, I could see the memory channel stops if a line comes in greater than 2 MB. Let me know which parameter lets us define a maximum event size of about 3 MB.
>>>>>
>>>>> On 16 July 2014 12:46, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>> I am asking point 1 because in some cases I could see a line in the logfile of around 2 MB. So I need to know the maximum event size. How do I measure it?
>>>>>>
>>>>>> On 16 July 2014 10:18, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>> Hi Natty,
>>>>>>>
>>>>>>> Please help me to get answers to the queries below.
>>>>>>>
>>>>>>> 1. In the case of an exec source (tail -F <logfile>), is each line in the file considered to be a single event? If a line is considered to be an event, what is the maximum event size supported by flume? I mean, what is the maximum number of characters in a line supported?
>>>>>>> 2. When events stop processing, I am not seeing the "tail -F" command running in the background. I have used options like "a1.sources.r1.restart = true" and "a1.sources.r1.logStdErr = true". Will this config not send any errors to flume.log if there are issues with tail? Will this config not try to restart the "tail -F" if it is not running in the background?
>>>>>>> 3. Does flume support all formats of data in the logfile, or does it have predefined data formats?
>>>>>>>
>>>>>>> Please help me with these to understand better.
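On the restart question just above: the exec source can re-run the command when it exits (restart = true) and log the command's stderr (logStdErr = true). A minimal sketch reusing the source from the config further down the thread, with restartThrottle added as the delay (in milliseconds) before the command is re-run; the value shown is only illustrative:

    # Exec source sketch -- same tail command as the config later in this thread:
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /data/logs/test_log
    a1.sources.r1.restart = true           # re-run the command if it exits
    a1.sources.r1.restartThrottle = 10000  # wait 10s before re-running it
    a1.sources.r1.logStdErr = true         # stderr from tail goes to the flume log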
>>>>>>> On 16 July 2014 00:56, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>>>>> Saravana,
>>>>>>>>
>>>>>>>> Everything here looks pretty sane. Do you have a record of the events that came in leading up to the agent stopping collection? If you can provide the last file created by the agent, and ideally whatever events had come in but not been written out to your HDFS sink, it might be possible for me to reproduce this issue. Would it be possible to get some sample data from you?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Natty
>>>>>>>>
>>>>>>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>>> Hi Natty,
>>>>>>>>>
>>>>>>>>> Just to understand: at present my setting is "flume.root.logger=INFO,LOGFILE" in log4j.properties. Do you want me to change it to "flume.root.logger=DEBUG,LOGFILE" and restart the agent?
>>>>>>>>>
>>>>>>>>> But when I start the agent, I am already starting it with the command below, so I guess I am using DEBUG already, just on the command line while starting the agent rather than in the config file.
>>>>>>>>>
>>>>>>>>> ../bin/flume-ng agent -c /d0/flume/conf -f /d0/flume/conf/flume-conf.properties -n a1 -Dflume.root.logger=DEBUG,LOGFILE
>>>>>>>>>
>>>>>>>>> If I make some changes in the config "flume-conf.properties" or restart the agent, it works again and starts collecting the data.
>>>>>>>>>
>>>>>>>>> Currently all my logs go to flume.log, and I don't see any exception.
>>>>>>>>>
>>>>>>>>> cat flume.log | grep "Exception" doesn't show any.
>>>>>>>>>
>>>>>>>>> On 15 July 2014 22:24, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>>>>>>> Hi Saravana,
>>>>>>>>>>
>>>>>>>>>> Our best bet on figuring out what's going on here may be to turn on debug logging. What I would recommend is stopping your agents, modifying the log4j properties to turn on DEBUG logging for the root logger, and then restarting the agents. Once the agent stops producing new events, send out the logs and I'll be happy to take a look over them.
>>>>>>>>>>
>>>>>>>>>> Does the system begin working again if you restart the agents? Have you noticed any other events correlated with the agent stopping collecting events? Maybe a spike in events or something like that? And for my own peace of mind, if you run `cat /var/log/flume-ng/* | grep "Exception"`, does it bring anything back?
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> Natty
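On the logging question above, the two approaches should end up equivalent: the stock conf/log4j.properties wires the root logger to the flume.root.logger property, and a -Dflume.root.logger=... flag on the command line overrides the value in the file. A sketch of both (file contents abbreviated):

    # conf/log4j.properties -- change the default level in the file:
    flume.root.logger=DEBUG,LOGFILE
    log4j.rootLogger=${flume.root.logger}

    # ...or leave the file at INFO and override it per run, as already done above:
    bin/flume-ng agent -c /d0/flume/conf -f /d0/flume/conf/flume-conf.properties \
        -n a1 -Dflume.root.logger=DEBUG,LOGFILE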
>>>>>>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>>>>> Hi Natty,
>>>>>>>>>>>
>>>>>>>>>>> This is my entire config file.
>>>>>>>>>>>
>>>>>>>>>>> # Name the components on this agent
>>>>>>>>>>> a1.sources = r1
>>>>>>>>>>> a1.sinks = k1
>>>>>>>>>>> a1.channels = c1
>>>>>>>>>>>
>>>>>>>>>>> # Describe/configure the source
>>>>>>>>>>> a1.sources.r1.type = exec
>>>>>>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>>>>>>> a1.sources.r1.restart = true
>>>>>>>>>>> a1.sources.r1.logStdErr = true
>>>>>>>>>>>
>>>>>>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>>>>>>
>>>>>>>>>>> a1.sources.r1.interceptors = i1
>>>>>>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>>>>>>
>>>>>>>>>>> #a1.sources.r1.interceptors = i2
>>>>>>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>>>>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>>>>>>
>>>>>>>>>>> # Describe the sink
>>>>>>>>>>> a1.sinks.k1.type = hdfs
>>>>>>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>>>>>>> ## need to run hive queries randomly to check the long-running process, so we need to commit events to hdfs files regularly
>>>>>>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>>>>>>
>>>>>>>>>>> # Use a channel which buffers events in memory
>>>>>>>>>>> a1.channels.c1.type = memory
>>>>>>>>>>> a1.channels.c1.capacity = 10000
>>>>>>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>>>>>>
>>>>>>>>>>> # Bind the source and sink to the channel
>>>>>>>>>>> a1.sources.r1.channels = c1
>>>>>>>>>>> a1.sinks.k1.channel = c1
>>>>>>>>>>>
>>>>>>>>>>> On 14 July 2014 22:54, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>>>>>>>>> Hi Saravana,
>>>>>>>>>>>>
>>>>>>>>>>>> What does your sink configuration look like?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Natty
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>>>>>>> Assuming each line in the logfile is considered an event for flume:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Do we have any maximum size of event defined for the memory/file channel, like a maximum number of characters in a line?
>>>>>>>>>>>>> 2. Does flume support all formats of data to be processed as events, or do we have any limitation?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am still trying to understand why flume stops processing events after some time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can someone please help me out here?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> saravana
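On question 1 just above: as far as I know the memory channel has no per-event size limit, only aggregate caps. Besides the event-count capacity, it can be capped by total bytes through the byteCapacity setting Natty mentions earlier in the thread (it defaults to 80% of the JVM heap). A sketch with illustrative numbers:

    # Memory channel sketch -- the byte figures are illustrative, not a recommendation:
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 10000                   # max number of buffered events
    a1.channels.c1.transactionCapacity = 10000        # max events per transaction
    a1.channels.c1.byteCapacity = 200000000           # ~200 MB cap on buffered event bodies
    a1.channels.c1.byteCapacityBufferPercentage = 20  # headroom reserved for event headers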
>>>>>>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am new to flume and using Apache Flume 1.5.0. Quick setup explanation here.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Source: exec, tail -F command on a logfile.
>>>>>>>>>>>>>> Channel: tried with both memory & file channel.
>>>>>>>>>>>>>> Sink: HDFS
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When flume starts, events are processed properly and moved to hdfs without any issues. But after some time flume suddenly stops sending events to HDFS.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am not seeing any errors in the logfile flume.log either. Please let me know if I am missing any configuration here.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Below is the channel configuration I defined; I left the rest at default values.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Saravana

--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
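For the file-channel variant quoted above: when only capacity and transactionCapacity are set, the checkpoint and data files go to the channel's defaults under ~/.flume/file-channel. A sketch with the directories made explicit (paths are illustrative):

    # File channel sketch -- paths are illustrative; pick disks with room to grow:
    a1.channels.c1.type = FILE
    a1.channels.c1.checkpointDir = /d0/flume/file-channel/checkpoint
    a1.channels.c1.dataDirs = /d0/flume/file-channel/data
    a1.channels.c1.capacity = 10000000
    a1.channels.c1.transactionCapacity = 100000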