Re: Using Flume to process data

Joey Echeverria Wed, 03 Sep 2014 14:22:29 -0700

You should be able to accomplish this with the Morphline
interceptor[1]. It will let you build a configuration file that
converts from JSON to CSV. There's a similar example, though the
target is Avro rather than CSV, in the Kite project[2]. The full docs
for Morphlines will also be helpful[3].
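
To make the target transformation concrete, here is a minimal sketch in
plain Python (not Morphlines syntax) of what each event would need to
undergo: parse the JSON body, look up an id, and emit one CSV line. The
field names "user" and "action" and the in-memory ID_LOOKUP table are
assumptions made up for illustration; in a real setup the lookup would
hit your database and the logic would live in the morphline itself.

```python
import csv
import io
import json

# Hypothetical stand-in for the database lookup described in the question.
ID_LOOKUP = {"alice": 101, "bob": 102}


def json_event_to_csv(body, lookup=ID_LOOKUP):
    """Convert one JSON event body (bytes) into a single CSV line (str)."""
    record = json.loads(body.decode("utf-8"))

    # Resolve the id for this record; an empty string marks a failed lookup.
    user_id = lookup.get(record["user"], "")

    # csv.writer handles quoting of fields that contain commas or quotes.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([user_id, record["user"], record["action"]])
    return out.getvalue()
```

Calling json_event_to_csv(b'{"user": "alice", "action": "login"}')
yields one CSV line with the looked-up id as the first column.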


-Joey

[1] http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
[2] https://github.com/kite-sdk/kite-examples/tree/master/json
[3] http://kitesdk.org/docs/current/kite-morphlines/index.html

On Wed, Sep 3, 2014 at 4:26 PM, Sid Ray <s...@fractalsciences.com> wrote:
> Can you guys please let me know if the following scenario is supported:
> I have a system in which there are Tomcat machines which have small JSON
> files of 2K size each. The goal is to take those files, convert them to CSV
> format and upload them to S3. Then from S3 they are loaded in parallel to
> Redshift.
>
> My idea of the architecture was that:
>
> TomcatServer1 ----+
>                   |
> TomcatServer2 ----+----> Flume ----> S3
>
>
> Is it possible to do the conversion from JSON files to CSV files in
> Flume? The idea is that we need to take the contents of the JSON file, do
> a database lookup, fetch the id, and then create the CSV file from that.
> Is it possible to do this processing in Flume?
>
> Also, what would an HA architecture for Flume look like? Any links, etc.?
>
> Thanks,
> Sid



-- 
Joey Echeverria
