Okay, thanks. So with 128 GB of RAM, I will allocate 1 GB as heap memory for the Flume agent.
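If I understand correctly, another place to set this is flume-env.sh rather than editing the java command line directly (a minimal sketch, assuming bin/flume-ng sources the flume-env.sh found in the conf directory passed with -c):

    # /d0/flume/conf/flume-env.sh -- read by bin/flume-ng when the agent is started with -c /d0/flume/conf
    export JAVA_OPTS="-Xmx1g"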
But I am surprised that no error was registered for this memory issue in the log file (flume.log). Do I need to check any other logs?

On 16 July 2014 21:55, Jonathan Natkins <na...@streamsets.com> wrote:

> That's definitely your problem. 20MB is way too low for this. Depending on
> the other processes you're running on your system, the amount of memory
> you'll need will vary, but I'd recommend at least 1GB. You should define it
> exactly where it's defined right now, so instead of the current command,
> you can run:
>
> "/cv/jvendor/bin/java -Xmx1g -Dflume.root.logger=DEBUG,LOGFILE......"
>
> On Wed, Jul 16, 2014 at 3:03 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>
>> I guess I am using default values; from running Flume I could see the
>> line "/cv/jvendor/bin/java -Xmx20m -Dflume.root.logger=DEBUG,LOGFILE......",
>> so I guess it takes 20 MB as the Flume agent memory.
>> My RAM is 128 GB, so please suggest how much I can assign as heap memory
>> and where to define it.
>>
>> On 16 July 2014 15:05, Jonathan Natkins <na...@streamsets.com> wrote:
>>
>>> Hey Saravana,
>>>
>>> I'm attempting to reproduce this, but do you happen to know what the
>>> Java heap size is for your Flume agent? This information leads me to
>>> believe that you don't have enough memory allocated to the agent, which
>>> you may need to do with the -Xmx parameter when you start up your agent.
>>> That aside, you can set the byteCapacity parameter on the memory channel
>>> to specify how much memory it is allowed to use. It should default to
>>> 80% of the Java heap size, but if your heap is too small, this might be
>>> a cause of errors.
>>>
>>> Does anything get written to the log when you try to pass in an event
>>> of this size?
>>>
>>> Thanks,
>>> Natty
>>>
>>> On Wed, Jul 16, 2014 at 1:46 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>
>>>> Hi Natty,
>>>>
>>>> While looking further, I could see that the memory channel stops if a
>>>> line comes in greater than 2 MB. Let me know which parameter helps us
>>>> define a maximum event size of about 3 MB.
>>>>
>>>> On 16 July 2014 12:46, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>
>>>>> I am asking point 1 because in some cases I could see a line in the
>>>>> logfile of around 2 MB, so I need to know the maximum event size. How
>>>>> do I measure it?
>>>>>
>>>>> On 16 July 2014 10:18, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>
>>>>>> Hi Natty,
>>>>>>
>>>>>> Please help me to get the answers for the below queries.
>>>>>>
>>>>>> 1. In the case of an exec source (tail -F <logfile>), is each line
>>>>>> in the file considered to be a single event?
>>>>>> If a line is considered to be an event, what is the maximum event
>>>>>> size supported by Flume? I mean, what is the maximum number of
>>>>>> characters in a line that is supported?
>>>>>> 2. When events stop processing, I am not seeing the "tail -F"
>>>>>> command running in the background.
>>>>>> I have used the options "a1.sources.r1.restart = true" and
>>>>>> "a1.sources.r1.logStdErr = true".
>>>>>> Will this config not send any errors to flume.log if there are
>>>>>> issues with tail?
>>>>>> Will this config not try to restart the "tail -F" if it is not
>>>>>> running in the background?
>>>>>> 3. Does Flume support all formats of data in the logfile, or does it
>>>>>> have predefined data formats?
>>>>>>
>>>>>> Please help me with these to understand better.
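[On my own question above about measuring the event size: a rough check of the longest line in the tailed logfile should be enough, for example (using the logfile path from my config):

    # print the length, in characters, of the longest line in the tailed logfile
    awk 'length > max { max = length } END { print max }' /data/logs/test_log
]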
>>>>>>
>>>>>> On 16 July 2014 00:56, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>>>
>>>>>>> Saravana,
>>>>>>>
>>>>>>> Everything here looks pretty sane. Do you have a record of the
>>>>>>> events that came in leading up to the agent stopping collection? If
>>>>>>> you can provide the last file created by the agent, and ideally
>>>>>>> whatever events had come in but not been written out to your HDFS
>>>>>>> sink, it might be possible for me to reproduce this issue. Would it
>>>>>>> be possible to get some sample data from you?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Natty
>>>>>>>
>>>>>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Natty,
>>>>>>>>
>>>>>>>> Just to understand: at present my setting is
>>>>>>>> "flume.root.logger=INFO,LOGFILE" in log4j.properties. Do you want
>>>>>>>> me to change it to "flume.root.logger=DEBUG,LOGFILE" and restart
>>>>>>>> the agent?
>>>>>>>>
>>>>>>>> But when I start the agent, I am already starting it with the
>>>>>>>> command below, so I guess I am already using DEBUG, just on the
>>>>>>>> command line rather than in the config file:
>>>>>>>>
>>>>>>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>>>>>>> /d0/flume/conf/flume-conf.properties -n a1
>>>>>>>> -Dflume.root.logger=DEBUG,LOGFILE
>>>>>>>>
>>>>>>>> If I make some changes in the config "flume-conf.properties" or
>>>>>>>> restart the agent, it works again and starts collecting the data.
>>>>>>>>
>>>>>>>> Currently all my logs go to flume.log, and I don't see any
>>>>>>>> exception:
>>>>>>>>
>>>>>>>> cat flume.log | grep "Exception" doesn't show any.
>>>>>>>>
>>>>>>>> On 15 July 2014 22:24, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Saravana,
>>>>>>>>>
>>>>>>>>> Our best bet on figuring out what's going on here may be to turn
>>>>>>>>> on the debug logging. What I would recommend is stopping your
>>>>>>>>> agents, modifying the log4j properties to turn on DEBUG logging
>>>>>>>>> for the root logger, and then restarting the agents. Once the
>>>>>>>>> agent stops producing new events, send out the logs and I'll be
>>>>>>>>> happy to take a look over them.
>>>>>>>>>
>>>>>>>>> Does the system begin working again if you restart the agents?
>>>>>>>>> Have you noticed any other events correlated with the agent
>>>>>>>>> stopping collecting events? Maybe a spike in events or something
>>>>>>>>> like that? And for my own peace of mind, if you run
>>>>>>>>> `cat /var/log/flume-ng/* | grep "Exception"`, does it bring
>>>>>>>>> anything back?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> Natty
>>>>>>>>>
>>>>>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Natty,
>>>>>>>>>>
>>>>>>>>>> This is my entire config file.
>>>>>>>>>>
>>>>>>>>>> # Name the components on this agent
>>>>>>>>>> a1.sources = r1
>>>>>>>>>> a1.sinks = k1
>>>>>>>>>> a1.channels = c1
>>>>>>>>>>
>>>>>>>>>> # Describe/configure the source
>>>>>>>>>> a1.sources.r1.type = exec
>>>>>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>>>>>> a1.sources.r1.restart = true
>>>>>>>>>> a1.sources.r1.logStdErr = true
>>>>>>>>>>
>>>>>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>>>>>
>>>>>>>>>> a1.sources.r1.interceptors = i1
>>>>>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>>>>>
>>>>>>>>>> #a1.sources.r1.interceptors = i2
>>>>>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>>>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>>>>>
>>>>>>>>>> # Describe the sink
>>>>>>>>>> a1.sinks.k1.type = hdfs
>>>>>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>>>>>> ## need to run hive queries at random times to check the long-running
>>>>>>>>>> ## process, so we need to commit events to the hdfs files regularly
>>>>>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>>>>>
>>>>>>>>>> # Use a channel which buffers events in memory
>>>>>>>>>> a1.channels.c1.type = memory
>>>>>>>>>> a1.channels.c1.capacity = 10000
>>>>>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>>>>>
>>>>>>>>>> # Bind the source and sink to the channel
>>>>>>>>>> a1.sources.r1.channels = c1
>>>>>>>>>> a1.sinks.k1.channel = c1
>>>>>>>>>>
>>>>>>>>>> On 14 July 2014 22:54, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Saravana,
>>>>>>>>>>>
>>>>>>>>>>> What does your sink configuration look like?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Natty
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Assuming each line in the logfile is considered to be an event
>>>>>>>>>>>> for Flume:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Do we have any maximum event size defined for the memory/file
>>>>>>>>>>>> channel, like a maximum number of characters in a line?
>>>>>>>>>>>> 2. Does Flume support all formats of data to be processed as
>>>>>>>>>>>> events, or do we have any limitations?
>>>>>>>>>>>>
>>>>>>>>>>>> I am still trying to understand why Flume stops processing
>>>>>>>>>>>> events after some time.
>>>>>>>>>>>>
>>>>>>>>>>>> Can someone please help me out here?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Saravana
>>>>>>>>>>>>
>>>>>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am new to Flume and am using Apache Flume 1.5.0. A quick
>>>>>>>>>>>>> setup explanation here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Source: exec, a tail -F command on a logfile.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Channel: tried with both memory & file channel
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sink: HDFS
>>>>>>>>>>>>>
>>>>>>>>>>>>> When Flume starts, events are processed properly and moved to
>>>>>>>>>>>>> HDFS without any issues.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But after some time, Flume suddenly stops sending events to
>>>>>>>>>>>>> HDFS.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not seeing any errors in the logfile flume.log either.
>>>>>>>>>>>>> Please let me know if I am missing any configuration here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Below is the channel configuration I defined; I left the
>>>>>>>>>>>>> remaining settings at their default values.
>>>>>>>>>>>>>
>>>>>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Saravana