Well, I am actually doing something similar. I also need to feed that data
to different sinks: one gets just the raw data, and the others are HBase
sinks fed through a multiplexer.


agent 1 src -> replicate -> channel 1 -> sink 1 (raw event sink)
                         -> channel 2 -> sink 2 -> agent 2 src -> multiplexer
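
A minimal sketch of that topology as Flume agent properties (the agent,
channel, and sink names, the spool directory, and the host/port are all
placeholders I made up; I'm writing sink 1 as an HDFS sink just as an
example, swap in your real sinks):

    # agent 1: spool a directory and replicate each event to both channels
    agent1.sources = src1
    agent1.channels = c1 c2
    agent1.sinks = rawSink avroSink

    agent1.sources.src1.type = spooldir
    agent1.sources.src1.spoolDir = /data/incoming
    agent1.sources.src1.selector.type = replicating
    agent1.sources.src1.channels = c1 c2

    agent1.channels.c1.type = memory
    agent1.channels.c2.type = memory

    # sink 1: write the raw events as-is
    agent1.sinks.rawSink.type = hdfs
    agent1.sinks.rawSink.hdfs.path = /flume/raw
    agent1.sinks.rawSink.channel = c1

    # sink 2: forward to agent 2 over Avro RPC
    agent1.sinks.avroSink.type = avro
    agent1.sinks.avroSink.hostname = agent2-host
    agent1.sinks.avroSink.port = 4141
    agent1.sinks.avroSink.channel = c2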




On Mon, Aug 18, 2014 at 1:35 PM, Guillermo Ortiz <konstt2...@gmail.com>
wrote:

> On my test, everything is in the same VM. Later, I'll have another flow
> which just spools or tails a file and sends the data through Avro to
> another source on my system.
>
> Do I really need that replicating step? I think I have too many channels,
> and that means too many resources and too much configuration.
>
>
> 2014-08-18 19:51 GMT+02:00 terrey shih <terreys...@gmail.com>:
>
>> Hi,
>>
>> Are your two sources, the spooling one and the Avro one (fed from sink 2),
>> in two different JVMs/machines?
>>
>> thx
>>
>>
>> On Mon, Aug 18, 2014 at 9:53 AM, Guillermo Ortiz <konstt2...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have built a flow with Flume and I don't know if this is the right way
>>> to do it, or if there is something better. I am spooling a directory and
>>> need that data in three different paths in HDFS with different formats,
>>> so I have created two interceptors.
>>>
>>> Source (Spooling) + Replication + Interceptor1 -> C1 and C2
>>> C1 -> Sink1 -> HDFS Path1 (it's like a historic archive)
>>> C2 -> Sink2 -> Avro -> Source Avro + Multiplexing + Interceptor2 -> C3 and C4
>>> C3 -> Sink3 -> HDFS Path2
>>> C4 -> Sink4 -> HDFS Path3
>>>
>>> Interceptor1 doesn't do much with the data; it just saves the events as
>>> they are, so it's like storing a history of the original data.
>>>
>>> Interceptor2 sets a header that a selector reads. It processes the data,
>>> and the selector redirects each event to Sink3 or Sink4 based on that
>>> header. But this interceptor changes the original data.
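>>>
>>> As a rough sketch of what that multiplexing side could look like (the
>>> header name "dest", its values, and the interceptor class are
>>> hypothetical; the selector keys themselves are standard Flume properties):
>>>
>>>     agent2.sources = avroSrc
>>>     agent2.sources.avroSrc.type = avro
>>>     agent2.sources.avroSrc.bind = 0.0.0.0
>>>     agent2.sources.avroSrc.port = 4141
>>>     agent2.sources.avroSrc.channels = c3 c4
>>>
>>>     # Interceptor2 (hypothetical class) sets the "dest" header per event
>>>     agent2.sources.avroSrc.interceptors = i2
>>>     agent2.sources.avroSrc.interceptors.i2.type = com.example.Interceptor2$Builder
>>>
>>>     # route on the header value: one mapping per channel, plus a default
>>>     agent2.sources.avroSrc.selector.type = multiplexing
>>>     agent2.sources.avroSrc.selector.header = dest
>>>     agent2.sources.avroSrc.selector.mapping.path2 = c3
>>>     agent2.sources.avroSrc.selector.mapping.path3 = c4
>>>     agent2.sources.avroSrc.selector.default = c3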
>>>
>>> I tried to do the whole process without replicating the data, but I could
>>> not. As it is, it seems like too many steps just because I want to store
>>> the original data in HDFS as a historic archive.
>>>
>>
>>
>
