Gotcha. In that case, what I think you'd want to do is have the client
sources send to an AvroSink that is directed to forward to an AvroSource in
your data center, and attach an interceptor to the AvroSource on your end.
Your interceptor should be able to unwrap the Avro event and transform it
as you need to for the HDFS/HBase sinks. Does that sound reasonable to you?
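
To make that concrete, here is a rough sketch of what the two agent configs might look like. It is only an illustration under assumptions: the agent names (client, collector), the host, the port, the paths, the table and column family, and the interceptor class com.example.MyTransformInterceptor are all placeholders I made up, and the client-side source is left out since it is whatever you already run there.

```properties
# --- Client-side agent (hypothetical name: client) ---
client.sources = src1
client.channels = ch1
client.sinks = avroSink
# src1 definition omitted: keep whatever source the clients use today
client.sources.src1.channels = ch1
client.channels.ch1.type = memory
# Standard Avro sink that forwards to the data center (placeholder host/port)
client.sinks.avroSink.type = avro
client.sinks.avroSink.channel = ch1
client.sinks.avroSink.hostname = collector.example.com
client.sinks.avroSink.port = 4545

# --- Data-center agent (hypothetical name: collector) ---
collector.sources = avroSrc
collector.channels = ch2 ch3
collector.sinks = hdfsSink hbaseSink
# Avro source that receives from the client-side Avro sinks
collector.sources.avroSrc.type = avro
collector.sources.avroSrc.bind = 0.0.0.0
collector.sources.avroSrc.port = 4545
# Replicate every event to both channels (this is the default selector)
collector.sources.avroSrc.selector.type = replicating
collector.sources.avroSrc.channels = ch2 ch3
# Custom interceptor does the transformation (placeholder class name)
collector.sources.avroSrc.interceptors = xform
collector.sources.avroSrc.interceptors.xform.type = com.example.MyTransformInterceptor$Builder
collector.channels.ch2.type = memory
collector.channels.ch3.type = memory
# Standard HDFS sink (placeholder path)
collector.sinks.hdfsSink.type = hdfs
collector.sinks.hdfsSink.channel = ch2
collector.sinks.hdfsSink.hdfs.path = /flume/events
# HBase sink (placeholder table and column family)
collector.sinks.hbaseSink.type = hbase
collector.sinks.hbaseSink.channel = ch3
collector.sinks.hbaseSink.table = events
collector.sinks.hbaseSink.columnFamily = cf
```

With this layout the client machines only run stock components, and all of the custom code lives in the interceptor on your side.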


On Mon, Jul 28, 2014 at 11:25 PM, Guillermo Ortiz <konstt2...@gmail.com>
wrote:

> Yes, the reason is that the sources are on the client computers and the
> sinks are installed on my systems. It doesn't seem polite to overload the
> client systems with those transformations.
>
> How about the connection between Sink1 and Source2? Should it be of Avro
> type, or is that not necessary? Anyway, I'm going to think about doing the
> transformations in the source, although I think it's not possible.
>
>
> 2014-07-29 1:26 GMT+02:00 Jonathan Natkins <na...@streamsets.com>:
>
>> Hi Guillermo,
>>
>> It might actually be easier to do the special transformation in a custom
>> interceptor that's attached to Source1. It depends a little bit on what
>> your transformation actually is, but generally, I'd say that it's going to
>> be *much* easier to implement a custom interceptor than it is to
>> implement a custom sink. This would also give the benefit of not requiring
>> you to forward to a second source, so you'd end up with a simpler pipeline
>> in the end. Is there some reason that you need to perform this
>> transformation in a custom sink?
>>
>> Thanks,
>> Natty
>>
>>
>> On Mon, Jul 28, 2014 at 4:13 AM, Guillermo Ortiz <konstt2...@gmail.com>
>> wrote:
>>
>>> I want to create a topology for Flume. What I want to get is:
>>>
>>> Data---> Source1-->Channel1-->MySink1 --->Source2 --> Channel2/Channel3
>>>
>>> Channel2 --> SinkHDFS
>>> Channel3 --> MySinkHBase
>>>
>>> I'd need to code MySink1 and do a special transformation on my data;
>>> the output would be the input for Source2.
>>> Finally, this data should be stored in HDFS with the standard Flume sink
>>> and in HBase, where I should create a new serializer for HBase or
>>> something like that.
>>>
>>> I can't see how to make the connection between MySink1 and Source2.
>>> Should Source2 be of Avro type? I think that if I want to connect several
>>> flows inside Flume, they have to be Avro. Since I want a specific
>>> behavior, I should create a new implementation which extends
>>> AbstractRpcAvro or something like that... Am I right?
>>>
>>>
>>
>
