Iain,

I am using a file channel. The source is spoolDir and the sinks are Solr and HDFS.
Please find my configuration below:

#Flume Configuration Starts

agent.sources = SpoolDirSrc
agent.channels = Channel1 Channel2
agent.sinks = SolrSink HDFSsink

# Configure Source

agent.sources.SpoolDirSrc.channels = Channel1 Channel2
agent.sources.SpoolDirSrc.type = spooldir
#agent.sources.SpoolDirSrc.spoolDir = /app/home/solr/sources_tmp2
#agent.sources.SpoolDirSrc.spoolDir = /app/home/eventsvc/source/processed_emails/
agent.sources.SpoolDirSrc.spoolDir = /app/home/eventsvc/source/processed_emails2/
agent.sources.SpoolDirSrc.basenameHeader = true
agent.sources.SpoolDirSrc.selector.type = replicating
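# replicating (the default) copies every event to both channels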
#agent.sources.SpoolDirSrc.batchSize = 100000

agent.sources.SpoolDirSrc.fileHeader = true
#agent.sources.src1.fileSuffix = .COMPLETED
agent.sources.SpoolDirSrc.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
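# Note: BlobDeserializer buffers each spooled file in RAM as a single event,
# so heap usage scales with the largest file in the spool directory.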


# Use channels that buffer events on disk (file channel)
agent.channels.Channel1.type = file
agent.channels.Channel2.type = file
agent.channels.Channel1.capacity = 5000
agent.channels.Channel2.capacity = 5000
agent.channels.Channel1.transactionCapacity = 5000
agent.channels.Channel2.transactionCapacity = 5000
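# Each sink's batch size must stay at or below its channel's transactionCapacity
# (5000 covers the Solr batchSize of 10 and the HDFS batchSize of 1000 below).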
agent.channels.Channel1.checkpointDir = /app/home/flume/.flume/file-channel/checkpoint1
agent.channels.Channel2.checkpointDir = /app/home/flume/.flume/file-channel/checkpoint2
agent.channels.Channel1.dataDirs = /app/home/flume/.flume/file-channel/data1
agent.channels.Channel2.dataDirs = /app/home/flume/.flume/file-channel/data2


#agent.channels.Channel.transactionCapacity = 10000


# Configure Solr Sink

agent.sinks.SolrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.SolrSink.morphlineFile = /etc/flume/conf/morphline.conf
agent.sinks.SolrSink.batchSize = 10
agent.sinks.SolrSink.batchDurationMillis = 10
agent.sinks.SolrSink.channel = Channel1
agent.sinks.SolrSink.morphlineId = morphline1
agent.sinks.SolrSink.tika.config = tikaConfig.xml
#agent.sinks.SolrSink.fileType = DataStream
#agent.sinks.SolrSink.hdfs.batchsize = 5
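# Note: the rollCount/rollInterval/idleTimeout settings below are HDFS sink
# properties and are ignored by MorphlineSolrSink.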
agent.sinks.SolrSink.rollCount = 0
agent.sinks.SolrSink.rollInterval = 0
#agent.sinks.SolrSink.rollsize = 100000000
agent.sinks.SolrSink.idleTimeout = 0
#agent.sinks.SolrSink.txnEventMax = 5000

# Configure HDFS Sink

agent.sinks.HDFSsink.channel = Channel2
agent.sinks.HDFSsink.type = hdfs
#agent.sinks.HDFSsink.hdfs.path = hdfs://codehdplak-po-r10p.sys.comcast.net:8020/user/solr/emails
agent.sinks.HDFSsink.hdfs.path = hdfs://codehann/user/solr/emails
#agent.sinks.HDFSsink.hdfs.fileType = DataStream
agent.sinks.HDFSsink.hdfs.fileType = CompressedStream
agent.sinks.HDFSsink.hdfs.batchSize = 1000
agent.sinks.HDFSsink.hdfs.rollCount = 0
agent.sinks.HDFSsink.hdfs.rollInterval = 0
agent.sinks.HDFSsink.hdfs.rollSize = 10485760
agent.sinks.HDFSsink.hdfs.idleTimeout = 0
agent.sinks.HDFSsink.hdfs.maxOpenFiles = 1
agent.sinks.HDFSsink.hdfs.filePrefix = %{basename}
agent.sinks.HDFSsink.hdfs.codeC = gzip



Morphline configuration:


solrLocator: {

collection : esearch

#zkHost : "127.0.0.1:9983"

#zkHost : 
"codesolr-as-r1p.sys.comcast.net:2181,codesolr-as-r2p.sys.comcast.net:2182"
#zkHost : "codesolr-as-r2p:2181"
zkHost : 
"codesolr-wc-r1p.sys.comcast.net:2181,codesolr-wc-r2p.sys.comcast.net:2181,codesolr-wc-r3p.sys.comcast.net:2181"

}

morphlines :
[

  {

    id : morphline1

    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]

    commands :
    [

      { detectMimeType { includeDefaultMimeTypes : true } }

      {

        solrCell {

          solrLocator : ${solrLocator}

          captureAttr : true

          lowernames : true

          capture : [_attachment_body, _attachment_mimetype, basename, content, content_encoding, content_type, file, meta, text]

          parsers : [
                      # { parser : org.apache.tika.parser.txt.TXTParser }
                      # { parser : org.apache.tika.parser.AutoDetectParser }
                      #{ parser : org.apache.tika.parser.asm.ClassParser }
                      #{ parser : org.gagravarr.tika.FlacParser }
                      #{ parser : org.apache.tika.parser.executable.ExecutableParser }
                      #{ parser : org.apache.tika.parser.font.TrueTypeParser }
                      #{ parser : org.apache.tika.parser.xml.XMLParser }
                      #{ parser : org.apache.tika.parser.html.HtmlParser }
                      #{ parser : org.apache.tika.parser.image.TiffParser }
                      # { parser : org.apache.tika.parser.mail.RFC822Parser }
                      #{ parser : org.apache.tika.parser.mbox.MboxParser, additionalSupportedMimeTypes : [message/x-emlx] }
                      #{ parser : org.apache.tika.parser.microsoft.OfficeParser }
                      #{ parser : org.apache.tika.parser.hdf.HDFParser }
                      #{ parser : org.apache.tika.parser.odf.OpenDocumentParser }
                      #{ parser : org.apache.tika.parser.pdf.PDFParser }
                      #{ parser : org.apache.tika.parser.rtf.RTFParser }
                      { parser : org.apache.tika.parser.txt.TXTParser }
                      #{ parser : org.apache.tika.parser.chm.ChmParser }
                    ]

          fmap : { content : text }
        }

      }
      { generateUUID { field : id } }

      { sanitizeUnknownSolrFields { solrLocator : ${solrLocator} } }


      { logDebug { format : "output record: {}", args : ["@{}"] } }

      { loadSolr: { solrLocator : ${solrLocator} } }

    ]

  }

]

I am not sure how I can get the Flume metrics.
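(For reference, one way to expose them is Flume's built-in JSON reporting, enabled with the -Dflume.monitoring flags; the port below is an arbitrary choice:)

/usr/hdp/current/flume-server/bin/flume-ng agent -c /etc/flume/conf -f /etc/flume/conf/flumeSolr.conf -n agent -Dflume.monitoring.type=http -Dflume.monitoring.port=34545

# While the agent runs, poll the JSON counters (ChannelSize,
# ChannelFillPercentage, EventDrainSuccessCount, ...):
curl http://localhost:34545/metrics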
Thank you for looking into it

Regards,
~Sri

From: iain wright [mailto:iainw...@gmail.com]
Sent: Wednesday, July 26, 2017 2:37 PM
To: user@flume.apache.org
Subject: Re: Flume consumes all memory - { OutOfMemoryError: GC overhead limit exceeded }

Hi Sri,

Are you using a memory channel? What source/sink?

Can you please paste/link your obfuscated config?

What does the metrics endpoint say in terms of channel size, sink drain success, etc., for the period leading up to the OOM?

Best,
Iain

Sent from my iPhone

On Jul 26, 2017, at 8:00 AM, Anantharaman, Srinatha (Contractor)
<srinatha_ananthara...@comcast.com> wrote:
Hi All,

Even though I have set the -Xms and -Xmx values, Flume is consuming all available memory and eventually failing.

I have tried adding the above parameters on the command line, as below:


a. /usr/hdp/current/flume-server/bin/flume-ng agent -c /etc/flume/conf -f /etc/flume/conf/flumeSolr.conf -n agent -Dproperty="-Xms1024m -Xmx4048m"

b. /usr/hdp/current/flume-server/bin/flume-ng agent -c /etc/flume/conf -f /etc/flume/conf/flumeSolr.conf -n agent -Xms1024m -Xmx4048m

And also via the flume-env.sh file, as below:

export JAVA_OPTS="-Xms2048m -Xmx4048m -Dcom.sun.management.jmxremote -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
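(Side note, in case it helps: as far as I can tell the flume-ng wrapper only picks up JAVA_OPTS from a flume-env.sh located in the directory passed to -c, and a -D option such as -Dproperty=... only sets a Java system property rather than JVM heap flags. So it is worth confirming which -Xms/-Xmx values actually reached the running process, e.g.:)

# Show the heap flags the agent JVM was really started with:
ps -ef | grep [f]lume | grep -o -- '-Xm[sx][0-9]*[mMgG]'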

I am using HDP 2.5 and Flume 1.5.2.2.5.

Kindly let me know how to resolve this issue

Regards,
~Sri
