Hi, are your two sources (the spooling source and the Avro source fed by Sink2) running in two different JVMs/machines?
thx

On Mon, Aug 18, 2014 at 9:53 AM, Guillermo Ortiz <konstt2...@gmail.com> wrote:
> Hi,
>
> I have built a flow with Flume and I don't know whether this is the right
> way to do it, or whether there is something better. I am spooling a
> directory and need that data in three different paths in HDFS with
> different formats, so I have created two interceptors.
>
> Source (Spooling) + Replicating selector + Interceptor1 --> C1 and C2
> C1 --> Sink1 to HDFS Path1 (acts as a historical archive)
> C2 --> Sink2 to Avro --> Avro Source + Multiplexing selector + Interceptor2 --> C3 and C4
> C3 --> Sink3 to HDFS Path2
> C4 --> Sink4 to HDFS Path3
>
> Interceptor1 doesn't do much with the data; it just stores the events
> as they are, to keep a history of the original data.
>
> Interceptor2 processes the data and sets a header that the multiplexing
> selector uses to route each event to Sink3 or Sink4. But this interceptor
> changes the original data.
>
> I tried to do the whole process without replicating data, but I couldn't.
> Now it seems like too many steps just because I want to store the original
> data in HDFS as a historical archive.
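For reference, the two-agent topology described above could be sketched as a Flume agent configuration like the one below. All names here (agents `a1`/`a2`, channels, sinks, the spool directory, HDFS paths, the interceptor classes, the `routeKey` header, and its `path2`/`path3` values) are hypothetical placeholders, not taken from the original post:

```properties
# Agent 1: spooling directory source, replicated to two channels
a1.sources = spool
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.spool.type = spooldir
a1.sources.spool.spoolDir = /var/spool/flume
a1.sources.spool.interceptors = i1
a1.sources.spool.interceptors.i1.type = com.example.Interceptor1$Builder
a1.sources.spool.selector.type = replicating
a1.sources.spool.channels = c1 c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# Sink1: historical archive of the original data
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/path1
a1.sinks.k1.channel = c1

# Sink2: forward to the second agent over Avro RPC
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = agent2-host
a1.sinks.k2.port = 4141
a1.sinks.k2.channel = c2

# Agent 2: Avro source, multiplexing on a header set by Interceptor2
a2.sources = avroSrc
a2.channels = c3 c4
a2.sinks = k3 k4

a2.sources.avroSrc.type = avro
a2.sources.avroSrc.bind = 0.0.0.0
a2.sources.avroSrc.port = 4141
a2.sources.avroSrc.interceptors = i2
a2.sources.avroSrc.interceptors.i2.type = com.example.Interceptor2$Builder
a2.sources.avroSrc.selector.type = multiplexing
a2.sources.avroSrc.selector.header = routeKey
a2.sources.avroSrc.selector.mapping.path2 = c3
a2.sources.avroSrc.selector.mapping.path3 = c4
a2.sources.avroSrc.channels = c3 c4

a2.channels.c3.type = memory
a2.channels.c4.type = memory

a2.sinks.k3.type = hdfs
a2.sinks.k3.hdfs.path = hdfs://namenode/path2
a2.sinks.k3.channel = c3

a2.sinks.k4.type = hdfs
a2.sinks.k4.hdfs.path = hdfs://namenode/path3
a2.sinks.k4.channel = c4
```

With this layout the replication happens once on agent 1 (so the untouched events reach Path1), and the multiplexing selector on agent 2 routes each event to C3 or C4 based on the header value Interceptor2 assigns.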