Using Flume to process data

Sid Ray Wed, 03 Sep 2014 13:27:02 -0700

Can you guys please let me know if the following scenario is supported?
I have a system with several Tomcat machines, each producing small JSON
files of about 2 KB each. The goal is to take those files, convert them to
CSV format, and upload them to S3. From S3 they are then loaded in parallel
into Redshift.


My idea of the architecture was that:

TomcatServer1 ------+
                    |
                    +----> Flume ----> S3
                    |
TomcatServer2 ------+


Is it possible to do the conversion from JSON files to CSV files in Flume?
The idea is that we need to take the contents of each JSON file, do a
database lookup to fetch an id, and then create the CSV file from the
result. Can this processing be done inside Flume?
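
To make the processing step concrete, here is a minimal sketch in Python of
the per-file transformation, independent of Flume. The field names (user,
event, ts) and the in-memory ID_LOOKUP table are assumptions made up for
illustration; in the real system the lookup would be a database query.

```python
import csv
import io
import json

# Hypothetical stand-in for the database lookup: maps a user name to its id.
ID_LOOKUP = {"alice": 101, "bob": 102}


def json_to_csv_line(json_text, lookup=ID_LOOKUP):
    """Convert one small JSON document into a single CSV line."""
    record = json.loads(json_text)
    # Swap the user name for the id from the lookup table.
    # An unknown name yields None, which csv writes as an empty field.
    user_id = lookup.get(record["user"])
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(
        [user_id, record["event"], record["ts"]]
    )
    return buf.getvalue()
```

Each JSON file would produce one such line, and the lines would be batched
into larger CSV files before the upload to S3.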

Also, what would an HA architecture for Flume look like? Any links would be
appreciated.

Thanks,
Sid
