Iain, I am using a file channel. The source is spoolDir and the sinks are Solr and HDFS. Please find my config below:
# Flume configuration starts
agent.sources = SpoolDirSrc
agent.channels = Channel1 Channel2
agent.sinks = SolrSink HDFSsink

# Configure source
agent.sources.SpoolDirSrc.channels = Channel1 Channel2
agent.sources.SpoolDirSrc.type = spooldir
#agent.sources.SpoolDirSrc.spoolDir = /app/home/solr/sources_tmp2
#agent.sources.SpoolDirSrc.spoolDir = /app/home/eventsvc/source/processed_emails/
agent.sources.SpoolDirSrc.spoolDir = /app/home/eventsvc/source/processed_emails2/
agent.sources.SpoolDirSrc.basenameHeader = true
agent.sources.SpoolDirSrc.selector.type = replicating
#agent.sources.SpoolDirSrc.batchSize = 100000
agent.sources.SpoolDirSrc.fileHeader = true
#agent.sources.src1.fileSuffix = .COMPLETED
agent.sources.SpoolDirSrc.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder

# Use a channel that buffers events in file
agent.channels.Channel1.type = file
agent.channels.Channel2.type = file
agent.channels.Channel1.capacity = 5000
agent.channels.Channel2.capacity = 5000
agent.channels.Channel1.transactionCapacity = 5000
agent.channels.Channel2.transactionCapacity = 5000
agent.channels.Channel1.checkpointDir = /app/home/flume/.flume/file-channel/checkpoint1
agent.channels.Channel2.checkpointDir = /app/home/flume/.flume/file-channel/checkpoint2
agent.channels.Channel1.dataDirs = /app/home/flume/.flume/file-channel/data1
agent.channels.Channel2.dataDirs = /app/home/flume/.flume/file-channel/data2
#agent.channels.Channel.transactionCapacity = 10000

# Configure Solr sink
agent.sinks.SolrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.SolrSink.morphlineFile = /etc/flume/conf/morphline.conf
agent.sinks.SolrSink.batchsize = 10
agent.sinks.SolrSink.batchDurationMillis = 10
agent.sinks.SolrSink.channel = Channel1
agent.sinks.SolrSink.morphlineId = morphline1
agent.sinks.SolrSink.tika.config = tikaConfig.xml
#agent.sinks.SolrSink.fileType = DataStream
#agent.sinks.SolrSink.hdfs.batchsize = 5
agent.sinks.SolrSink.rollCount = 0
agent.sinks.SolrSink.rollInterval = 0
#agent.sinks.SolrSink.rollsize = 100000000
agent.sinks.SolrSink.idleTimeout = 0
#agent.sinks.SolrSink.txnEventMax = 5000

# Configure HDFS sink
agent.sinks.HDFSsink.channel = Channel2
agent.sinks.HDFSsink.type = hdfs
#agent.sinks.HDFSsink.hdfs.path = hdfs://codehdplak-po-r10p.sys.comcast.net:8020/user/solr/emails
agent.sinks.HDFSsink.hdfs.path = hdfs://codehann/user/solr/emails
#agent.sinks.HDFSsink.hdfs.fileType = DataStream
agent.sinks.HDFSsink.hdfs.fileType = CompressedStream
agent.sinks.HDFSsink.hdfs.batchsize = 1000
agent.sinks.HDFSsink.hdfs.rollCount = 0
agent.sinks.HDFSsink.hdfs.rollInterval = 0
agent.sinks.HDFSsink.hdfs.rollsize = 10485760
agent.sinks.HDFSsink.hdfs.idleTimeout = 0
agent.sinks.HDFSsink.hdfs.maxOpenFiles = 1
agent.sinks.HDFSsink.hdfs.filePrefix = %{basename}
agent.sinks.HDFSsink.hdfs.codeC = gzip

agent.sources.SpoolDirSrc.channels = Channel1 Channel2
agent.sinks.SolrSink.channel = Channel1
agent.sinks.HDFSsink.channel = Channel2

Morphline config:

solrLocator: {
  collection : esearch
  #zkHost : "127.0.0.1:9983"
  #zkHost : "codesolr-as-r1p.sys.comcast.net:2181,codesolr-as-r2p.sys.comcast.net:2182"
  #zkHost : "codesolr-as-r2p:2181"
  zkHost : "codesolr-wc-r1p.sys.comcast.net:2181,codesolr-wc-r2p.sys.comcast.net:2181,codesolr-wc-r3p.sys.comcast.net:2181"
}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      { detectMimeType { includeDefaultMimeTypes : true } }
      {
        solrCell {
          solrLocator : ${solrLocator}
          captureAttr : true
          lowernames : true
          capture : [_attachment_body, _attachment_mimetype, basename, content, content_encoding, content_type, file, meta, text]
          parsers : [
            #{ parser : org.apache.tika.parser.txt.TXTParser }
            #{ parser : org.apache.tika.parser.AutoDetectParser }
            #{ parser : org.apache.tika.parser.asm.ClassParser }
            #{ parser : org.gagravarr.tika.FlacParser }
            #{ parser : org.apache.tika.parser.executable.ExecutableParser }
            #{ parser : org.apache.tika.parser.font.TrueTypeParser }
            #{ parser : org.apache.tika.parser.xml.XMLParser }
            #{ parser : org.apache.tika.parser.html.HtmlParser }
            #{ parser : org.apache.tika.parser.image.TiffParser }
            #{ parser : org.apache.tika.parser.mail.RFC822Parser }
            #{ parser : org.apache.tika.parser.mbox.MboxParser, additionalSupportedMimeTypes : [message/x-emlx] }
            #{ parser : org.apache.tika.parser.microsoft.OfficeParser }
            #{ parser : org.apache.tika.parser.hdf.HDFParser }
            #{ parser : org.apache.tika.parser.odf.OpenDocumentParser }
            #{ parser : org.apache.tika.parser.pdf.PDFParser }
            #{ parser : org.apache.tika.parser.rtf.RTFParser }
            { parser : org.apache.tika.parser.txt.TXTParser }
            #{ parser : org.apache.tika.parser.chm.ChmParser }
          ]
          fmap : { content : text }
        }
      }
      { generateUUID { field : id } }
      { sanitizeUnknownSolrFields { solrLocator : ${solrLocator} } }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
      { loadSolr : { solrLocator : ${solrLocator} } }
    ]
  }
]

I am not sure how I can get the Flume metrics.
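My understanding from the Flume User Guide is that the agent can report channel and sink counters as JSON over HTTP when it is started with the flume.monitoring.* properties. This is only a sketch using my usual startup command; the port is an arbitrary choice on my part and I have not tried it yet:

/usr/hdp/current/flume-server/bin/flume-ng agent -c /etc/flume/conf \
    -f /etc/flume/conf/flumeSolr.conf -n agent \
    -Dflume.monitoring.type=http \
    -Dflume.monitoring.port=41414

# Then poll the JSON counters (ChannelSize, ChannelFillPercentage,
# EventDrainSuccessCount, ...) while the agent is running:
curl http://localhost:41414/metrics

Is that the endpoint you meant?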
Thank you for looking into it.

Regards,
~Sri

From: iain wright [mailto:iainw...@gmail.com]
Sent: Wednesday, July 26, 2017 2:37 PM
To: user@flume.apache.org
Subject: Re: Flume consumes all memory - { OutOfMemoryError: GC overhead limit exceeded }

Hi Sri,

Are you using a memory channel? What source/sink? Can you please paste/link your obfuscated config?

What does the metrics endpoint say in terms of channel size, sink drain success, etc., for the period leading up to the OOM?

Best,
Iain

Sent from my iPhone

On Jul 26, 2017, at 8:00 AM, Anantharaman, Srinatha (Contractor) <srinatha_ananthara...@comcast.com> wrote:

Hi All,

Though I have set the -Xms and -Xmx values, Flume consumes all available memory and fails in the end.

I have tried adding the above parameters on the command line, as below:

a. /usr/hdp/current/flume-server/bin/flume-ng agent -c /etc/flume/conf -f /etc/flume/conf/flumeSolr.conf -n agent -Dproperty="-Xms1024m -Xmx4048m"
b. /usr/hdp/current/flume-server/bin/flume-ng agent -c /etc/flume/conf -f /etc/flume/conf/flumeSolr.conf -n agent -Xms1024m -Xmx4048m

And also via the flume-env.sh file, as below:

export JAVA_OPTS="-Xms2048m -Xmx4048m -Dcom.sun.management.jmxremote -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"

I am using HDP 2.5 and Flume 1.5.2.2.5.

Kindly let me know how to resolve this issue.

Regards,
~Sri
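PS: Two quick checks I still plan to run on my side, in case the heap flags are not actually reaching the agent JVM (this assumes the JDK's jps tool is on the PATH; as I understand it, flume-ng should source flume-env.sh from the directory given with -c):

# Confirm which JVM options the running agent actually received:
jps -lvm | grep -i flume

# Confirm the env file being sourced carries the heap settings:
grep JAVA_OPTS /etc/flume/conf/flume-env.sh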