What about having more than one Flume agent?
You could have two agents that read the small messages and sink to HDFS, or
two agents that read the messages, serialize them, and send them to a third
agent that sinks them into HDFS.
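As a rough sketch of that second layout (agent, channel, host, and port names
are just placeholders, and you would swap in whatever source actually reads
your messages):

# First-tier agent (one of the two readers); forwards events over Avro
tier1.sources = msg-source
tier1.channels = mem-channel
tier1.sinks = avro-forward
tier1.channels.mem-channel.type = memory
# tier1.sources.msg-source.type = <whatever reads your messages>
tier1.sources.msg-source.channels = mem-channel
tier1.sinks.avro-forward.type = avro
tier1.sinks.avro-forward.hostname = collector-host
tier1.sinks.avro-forward.port = 4545
tier1.sinks.avro-forward.channel = mem-channel

# Third agent: receives the Avro events and sinks them into HDFS
collector.sources = avro-in
collector.channels = mem-channel
collector.sinks = hdfs-out
collector.channels.mem-channel.type = memory
collector.sources.avro-in.type = avro
collector.sources.avro-in.bind = 0.0.0.0
collector.sources.avro-in.port = 4545
collector.sources.avro-in.channels = mem-channel
collector.sinks.hdfs-out.type = hdfs
collector.sinks.hdfs-out.hdfs.path = hdfs://namenode/flume/events
collector.sinks.hdfs-out.channel = mem-channel

The second first-tier agent would look the same, just pointed at the same
collector (or at a second collector if you want failover).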
On Thu, Mar 27, 2014 at 9:43 AM, Chris Schneider <ch...@christop
I can't understand what this error is trying to tell me. Can anyone help?
Caused by: org.apache.flume.ChannelException: Put queue for
MemoryTransaction of byteCapacity 1832743000 bytes cannot add an event of
size 598876 bytes because 299200 bytes are already used. Try consider
comitting more freq
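For reference, the numbers in that exception map to the memory channel's
byte-capacity settings; the relevant knobs (agent and channel names
abbreviated here, values other than byteCapacity just illustrative) are along
these lines:

agent.channels.mem-channel.type = memory
agent.channels.mem-channel.capacity = 10000
agent.channels.mem-channel.transactionCapacity = 1000
# byteCapacity from the error above; bounds total event-body bytes held in the channel
agent.channels.mem-channel.byteCapacity = 1832743000
# percentage of byteCapacity reserved for event headers (illustrative default)
agent.channels.mem-channel.byteCapacityBufferPercentage = 20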
What about using a workflow tool like Oozie, Azkaban, or Amazon Data
Pipeline? Set it to be triggered as soon as the data is available in the S3
bucket and then execute the ALTER TABLE command.
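If the table is partitioned by date, the statement the workflow fires could be
something along these lines (table name, partition column, and S3 path are
just examples):

-- hypothetical table and path; run once the day's data has landed in S3
ALTER TABLE events ADD IF NOT EXISTS PARTITION (dt='2014-07-31')
LOCATION 's3n://my-bucket/events/dt=2014-07-31/';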
On Thursday, July 31, 2014, Viral Bajaria wrote:
> Any suggestions on this ? Still trying to figure out how do I get
One way to avoid managing so many sources would be to have an aggregation point
between the data generators and the Flume sources. For example, maybe you could
have the data generators put events into a message queue (or queues), then have
Flume consume from there?
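As a rough sketch, if the queue were something JMS-based like ActiveMQ, a
Flume JMS source could pull from it (broker URL, queue name, and agent/channel
names here are just assumptions):

agent.sources = jms-in
agent.channels = mem-channel
agent.channels.mem-channel.type = memory
agent.sources.jms-in.type = jms
agent.sources.jms-in.channels = mem-channel
agent.sources.jms-in.initialContextFactory = org.apache.activemq.jndi.ActiveMQInitialContextFactory
agent.sources.jms-in.providerURL = tcp://mq-broker:61616
agent.sources.jms-in.destinationType = QUEUE
agent.sources.jms-in.destinationName = EVENT_QUEUE
# sink side stays whatever you already have downstream (HDFS, S3, etc.)

That way the generators only ever talk to the queue, and you can scale the
number of Flume agents reading from it independently.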
Andrew
On Thu, 04 Sep 2014 08:29:04
What about adding in the data from MySQL as a small batch job after Flume sinks
to S3? You could then delete the raw data that Flume wrote. I would worry that
the database connection would be relatively slow and unreliable and might slow
down Flume's throughput.
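As a sketch of what that batch step could be, assuming the MySQL side gets
pulled into Hive separately (say via a nightly Sqoop import) and with all table
and column names made up:

-- join the raw events Flume landed in S3 with the MySQL-derived lookup table
INSERT OVERWRITE TABLE enriched_events PARTITION (dt='2014-09-04')
SELECT e.user_id, e.event_type, e.ts, u.country
FROM raw_events e
JOIN mysql_users u ON e.user_id = u.id
WHERE e.dt = '2014-09-04';

Once that finishes, the matching raw partition could be dropped.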
Andrew
On Sep 4, 2014, at 7:53 PM, K