Hi Natty,

While looking into this further, I can see that the memory channel stops if a line larger than about 2 MB comes in. Please let me know which parameter lets us define a maximum event size of about 3 MB.
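If I understand the memory channel settings correctly, the limits that matter for large events are the channel's byte accounting (byteCapacity / byteCapacityBufferPercentage) and the agent's JVM heap, rather than a dedicated per-event size parameter. This is a rough sketch of what I plan to try, with purely illustrative values; please correct me if a different parameter is the right one.

# flume-conf.properties (sketch): give the memory channel explicit byte headroom
# so that a single event body of about 3 MB fits comfortably
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
# maximum total bytes of event bodies the channel may hold at one time
a1.channels.c1.byteCapacity = 100000000
# share of byteCapacity kept aside as a buffer for event headers
a1.channels.c1.byteCapacityBufferPercentage = 20

# conf/flume-env.sh (sketch): make sure the agent JVM has enough heap to hold
# the buffered events (values are only an example)
export JAVA_OPTS="-Xms256m -Xmx1024m"

Does that look like the right direction, or is there a separate setting that caps the size of an individual event?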
On 16 July 2014 12:46, SaravanaKumar TR <saran0081...@gmail.com> wrote:

> I am asking about point 1 because in some cases I could see a line in the
> logfile of around 2 MB. So I need to know the maximum event size, and how
> to measure it.
>
> On 16 July 2014 10:18, SaravanaKumar TR <saran0081...@gmail.com> wrote:
>
>> Hi Natty,
>>
>> Please help me get answers to the queries below.
>>
>> 1. In the case of the exec source (tail -F <logfile>), is each line in
>> the file considered to be a single event? If a line is considered to be
>> an event, what is the maximum event size supported by Flume, i.e. the
>> maximum number of characters in a line?
>> 2. When events stop processing, I am not seeing the "tail -F" command
>> running in the background. I have used the options "a1.sources.r1.restart
>> = true" and "a1.sources.r1.logStdErr = true". Will this config not send
>> any errors to flume.log if there is an issue with tail? And will it not
>> try to restart "tail -F" if it is not running in the background?
>> 3. Does Flume support all formats of data in the logfile, or does it
>> have any predefined data formats?
>>
>> Please help me with these to understand better.
>>
>> On 16 July 2014 00:56, Jonathan Natkins <na...@streamsets.com> wrote:
>>
>>> Saravana,
>>>
>>> Everything here looks pretty sane. Do you have a record of the events
>>> that came in leading up to the agent stopping collection? If you can
>>> provide the last file created by the agent, and ideally whatever events
>>> had come in but not been written out to your HDFS sink, it might be
>>> possible for me to reproduce this issue. Would it be possible to get
>>> some sample data from you?
>>>
>>> Thanks,
>>> Natty
>>>
>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <
>>> saran0081...@gmail.com> wrote:
>>>
>>>> Hi Natty,
>>>>
>>>> Just to understand: at present my setting is
>>>> "flume.root.logger=INFO,LOGFILE" in log4j.properties. Do you want me to
>>>> change it to "flume.root.logger=DEBUG,LOGFILE" and restart the agent?
>>>>
>>>> But when I start the agent, I am already starting it with the command
>>>> below, so I guess I am already using DEBUG at startup, just not in the
>>>> config file.
>>>>
>>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>>> /d0/flume/conf/flume-conf.properties -n a1
>>>> -Dflume.root.logger=DEBUG,LOGFILE
>>>>
>>>> If I make some changes in the config "flume-conf.properties" or restart
>>>> the agent, it works again and starts collecting the data.
>>>>
>>>> Currently all my logs go to flume.log, and I don't see any exception.
>>>>
>>>> cat flume.log | grep "Exception" doesn't show any.
>>>>
>>>> On 15 July 2014 22:24, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>
>>>>> Hi Saravana,
>>>>>
>>>>> Our best bet on figuring out what's going on here may be to turn on
>>>>> the debug logging. What I would recommend is stopping your agents,
>>>>> modifying the log4j properties to turn on DEBUG logging for the root
>>>>> logger, and then restarting the agents. Once the agent stops producing
>>>>> new events, send out the logs and I'll be happy to take a look over
>>>>> them.
>>>>>
>>>>> Does the system begin working again if you restart the agents? Have
>>>>> you noticed any other events correlated with the agent stopping
>>>>> collecting events? Maybe a spike in events or something like that? And
>>>>> for my own peace of mind, if you run `cat /var/log/flume-ng/* | grep
>>>>> "Exception"`, does it bring anything back?
>>>>>
>>>>> Thanks!
>>>>> Natty
>>>>>
>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <
>>>>> saran0081...@gmail.com> wrote:
>>>>>
>>>>>> Hi Natty,
>>>>>>
>>>>>> This is my entire config file.
>>>>>>
>>>>>> # Name the components on this agent
>>>>>> a1.sources = r1
>>>>>> a1.sinks = k1
>>>>>> a1.channels = c1
>>>>>>
>>>>>> # Describe/configure the source
>>>>>> a1.sources.r1.type = exec
>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>> a1.sources.r1.restart = true
>>>>>> a1.sources.r1.logStdErr = true
>>>>>>
>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>
>>>>>> a1.sources.r1.interceptors = i1
>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>
>>>>>> #a1.sources.r1.interceptors = i2
>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>
>>>>>> # Describe the sink
>>>>>> a1.sinks.k1.type = hdfs
>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>> ## need to run hive queries randomly to check the long-running process,
>>>>>> ## so we need to commit events to hdfs files regularly
>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>
>>>>>> # Use a channel which buffers events in memory
>>>>>> a1.channels.c1.type = memory
>>>>>> a1.channels.c1.capacity = 10000
>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>
>>>>>> # Bind the source and sink to the channel
>>>>>> a1.sources.r1.channels = c1
>>>>>> a1.sinks.k1.channel = c1
>>>>>>
>>>>>> On 14 July 2014 22:54, Jonathan Natkins <na...@streamsets.com> wrote:
>>>>>>
>>>>>>> Hi Saravana,
>>>>>>>
>>>>>>> What does your sink configuration look like?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Natty
>>>>>>>
>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <
>>>>>>> saran0081...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Assuming each line in the logfile is considered an event for Flume:
>>>>>>>>
>>>>>>>> 1. Do we have any maximum event size defined for the memory/file
>>>>>>>> channel, like a maximum number of characters in a line?
>>>>>>>> 2. Does Flume support all formats of data to be processed as
>>>>>>>> events, or do we have any limitations?
>>>>>>>>
>>>>>>>> I am still trying to understand why Flume stops processing events
>>>>>>>> after some time.
>>>>>>>>
>>>>>>>> Can someone please help me out here?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> saravana
>>>>>>>>
>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <saran0081...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am new to Flume and using Apache Flume 1.5.0. Quick setup
>>>>>>>>> explanation here.
>>>>>>>>>
>>>>>>>>> Source: exec, tail -F command for a logfile.
>>>>>>>>>
>>>>>>>>> Channel: tried with both memory & file channel
>>>>>>>>>
>>>>>>>>> Sink: HDFS
>>>>>>>>>
>>>>>>>>> When Flume starts, event processing happens properly and events
>>>>>>>>> are moved to HDFS without any issues.
>>>>>>>>>
>>>>>>>>> But after some time Flume suddenly stops sending events to HDFS.
>>>>>>>>>
>>>>>>>>> I am not seeing any errors in the logfile flume.log either. Please
>>>>>>>>> let me know if I am missing any configuration here.
>>>>>>>>>
>>>>>>>>> Below is the channel configuration I defined; I left the remaining
>>>>>>>>> settings at their default values.
>>>>>>>>>
>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Saravana
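As a follow-up to the DEBUG logging suggestion in the thread above: before the next run I will switch the root logger in conf/log4j.properties itself instead of relying only on the -D override, restart the agent, and grep the collected flume.log once it stalls again. For reference, this is the exact change I intend to make, assuming the stock log4j.properties layout:

# conf/log4j.properties: change the root logger from INFO to DEBUG
flume.root.logger=DEBUG,LOGFILE

# after the agent stalls, check the collected log for exceptions again
cat flume.log | grep "Exception"

I will send the resulting logs across once I have them.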