Config seems sane; I'm not familiar with the rounding values.
transactionCapacity seems a bit high: you're pulling up to 10k events from the
source at a time (I've only ever used 100, though perhaps 10k is normal).

How big is each event?
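
Reason I ask: a rough upper bound on what one transaction can pin in memory is
event size x transactionCapacity. A back-of-envelope sketch, assuming a
hypothetical 1 KiB average event (your actual size may differ a lot):

```shell
# Rough heap held by one full 10k-event transaction,
# assuming a hypothetical 1 KiB average event size
EVENT_BYTES=1024
TXN_CAPACITY=10000
echo "$(( EVENT_BYTES * TXN_CAPACITY / 1024 / 1024 )) MiB"
```

If your events are closer to 100 KiB each, the same math lands near 1 GiB per
transaction, which would explain the OOM on its own.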

Could you also paste flume-env.sh and the output of "sudo jps -v" or
"ps auxww | grep -i flume" after starting Flume?

Are events flowing all the way through and getting flushed to files every 60s
or 20k events? Do you see queueing in the channel? Does the OOM happen right
away, or only after some time?
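
One way to check for queueing in the channel (assuming you can restart the
agent) is Flume's built-in JSON metrics reporting; the port below is
arbitrary, and the agent/config names are from your paste:

```shell
# Start the agent with HTTP metrics reporting enabled (port is arbitrary)
flume-ng agent -n myagent -c conf -f myagent.conf \
  -Dflume.monitoring.type=http -Dflume.monitoring.port=34545 &

# CHANNEL.c1 ChannelFillPercentage climbing toward 100 means the
# sink can't keep up with the source
curl -s http://localhost:34545/metrics | python -m json.tool
```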

JAVA_OPTS="-Xms1024m -Xmx3072m" previously worked for me in flume-env.sh
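
For reference, a minimal conf/flume-env.sh along those lines; the GC-logging
flags are an optional extra I'd add to confirm whether the heap is actually
exhausted (paths and sizes are just a sketch, tune -Xmx to your box):

```shell
# conf/flume-env.sh -- heap sizes that previously worked for me
export JAVA_OPTS="-Xms1024m -Xmx3072m"

# Optional: log GC activity to diagnose "GC overhead limit exceeded"
export JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/flume/gc.log"
```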



-- 
Iain Wright

This email message is confidential, intended only for the recipient(s)
named above and may contain information that is privileged, exempt from
disclosure under applicable law. If you are not the intended recipient, do
not disclose or disseminate the message to anyone except the intended
recipient. If you have received this message in error, or are not the named
recipient(s), please immediately notify the sender by return email, and
delete all copies of this message.

On Wed, Mar 22, 2017 at 11:22 AM, Suresh V <verdi...@gmail.com> wrote:

> Here it is:
>
> # Name the components on this agent
> myagent.sources = r1
> myagent.sinks = k1
> myagent.channels = c1
>
> myagent.sources.r1.type = com.aweber.flume.source.rabbitmq.RabbitMQSource
> myagent.sources.r1.host = xxx.yyy.com
> myagent.sources.r1.port = 5671
> myagent.sources.r1.username = xxx
> myagent.sources.r1.password = xxxx
> myagent.sources.r1.queue = QUEUENAME
> myagent.sources.r1.virtual-host = VH
> myagent.sources.r1.prefetchCount = 10
> myagent.sources.r1.ssl = true
>
>
> # Describe the sink
> myagent.sinks.k1.type = hdfs
> myagent.sinks.k1.hdfs.path = /hdfs/path/to/folder/
> myagent.sinks.k1.hdfs.filePrefix = filename_%Y%m%d.%H%M%S
> myagent.sinks.k1.hdfs.round = true
> myagent.sinks.k1.hdfs.roundUnit = second
> myagent.sinks.k1.hdfs.roundValue = 30
> myagent.sinks.k1.hdfs.useLocalTimeStamp = true
> myagent.sinks.k1.hdfs.timeZone = America/Chicago
> myagent.sinks.k1.hdfs.writeFormat = Text
> myagent.sinks.k1.hdfs.fileType = DataStream
> myagent.sinks.k1.hdfs.batchSize = 20000
> myagent.sinks.k1.hdfs.fileSuffix = .txt
> myagent.sinks.k1.hdfs.rollCount = 0
> myagent.sinks.k1.hdfs.rollSize = 0
> myagent.sinks.k1.hdfs.rollInterval = 60
>
> # Use a file channel which buffers events on disk
>
> myagent.channels.c1.type = file
> myagent.channels.c1.capacity = 10000
> myagent.channels.c1.transactionCapacity = 10000
> myagent.channels.c1.dataDirs = /local/path
> myagent.channels.c1.checkpointDir = /local/path/checkpoint/
>
>
> # Bind the source and sink to the channel
> myagent.sources.r1.channels = c1
> myagent.sinks.k1.channel = c1
>
> Thank you
>
>
> On Wed, Mar 22, 2017 at 12:56 PM, iain wright <iainw...@gmail.com> wrote:
>
>> Can you please drop your config in a reply or pastebin (omitting any
>> sensitive info)
>>
>> --
>> Iain Wright
>>
>>
>> On Wed, Mar 22, 2017 at 10:54 AM, Suresh V <verdi...@gmail.com> wrote:
>>
>>> Hello Flume users,
>>>
>>> I'm getting this error when starting the agent. The source is a RabbitMQ
>>> queue with millions of messages, the channel is file, and the sink is HDFS.
>>>
>>> Exception: java.lang.OutOfMemoryError thrown from the
>>> UncaughtExceptionHandler in thread "RabbitMQ Consumer #0"
>>> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor"
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>> Exception in thread "pool-8-thread-1" java.lang.OutOfMemoryError: GC
>>> overhead limit exceeded
>>> ^CException in thread "Thread-0" java.lang.OutOfMemoryError: GC overhead
>>> limit exceeded
>>> Exception in thread "agent-shutdown-hook" java.lang.OutOfMemoryError: GC
>>> overhead limit exceeded
>>>
>>> I have tried increasing the JAVA_OPTS min and max in flume-env.sh but
>>> that has not helped.
>>>
>>> Any help appreciated.
>>>
>>> Thank you
>>> Suresh.
>>>
>>>
>>
>
