Would it be possible to link the interceptors to the channels? I didn't find anything about it in the documentation, so I guess not.
I guess that another possibility is to execute the interceptors in the sink; if I'm right, does that mean implementing custom sinks, or is there another way to do it?
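(As far as I can tell from the Flume user guide, interceptors can only be declared on a source, never on a channel or a sink, so binding an interceptor to a single channel isn't directly configurable. The closest built-in mechanism is a multiplexing channel selector keyed on a header that an interceptor sets, and the selector is likewise a source-level setting. A minimal sketch of the relevant properties, with all agent and component names made up:

    # Interceptors hang off the source definition; there is no
    # a1.channels.c1.interceptors key in the configuration.
    a1.sources.r1.channels = c1 c2
    a1.sources.r1.interceptors = i1
    a1.sources.r1.interceptors.i1.type = timestamp

    # Routing events to a specific channel is done by a channel
    # selector, which is also configured on the source:
    a1.sources.r1.selector.type = multiplexing
    a1.sources.r1.selector.header = route
    a1.sources.r1.selector.mapping.raw = c1
    a1.sources.r1.selector.default = c2

If the processing really has to happen after the channel, that does seem to mean a custom sink, or possibly a custom event serializer on the existing sink for light transformations, since the stock sinks don't run interceptors.)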
2014-08-19 9:11 GMT+02:00 Guillermo Ortiz <konstt2...@gmail.com>:

> Yeah, I think that's what I'm doing. How about:
>
>                               channel1 -> sink1 (hdfs raw data)
> Agent1 src --> replicate + Interceptor1
>                               channel2 -> sink2 avro -> agent2 src avro + multiplexing + interceptor2 --> sink3
>                                                                                                       --> sink4
>
> Could it be possible to apply interceptor1 just to channel1? I know that
> interceptors apply at the source level. Interceptor1 doesn't modify the
> data much; I could feed channel2 with those little transformations, but
> ideally I'd rather not. So, if I want to do it, it looks like I'd have to
> create another level with more channels, etc. Something like this:
>
>                               channel1 -> sink1 avro -> src1 avro + interceptor1 -> channel -> sink1 (hdfs raw data)
> Agent1 src --> replicate
>                               channel2 -> sink2 avro -> agent2 src avro + multiplexing + interceptor2 --> sink3
>                                                                                                       --> sink4
>
> The point is that at sink4 my flow continues, and I have another structure
> similar to all of the above, so that means 8 channels in total. I don't
> know if it's possible to simplify this.
>
>
> 2014-08-19 0:09 GMT+02:00 terrey shih <terreys...@gmail.com>:
>
>> something like this
>>
>>                channel 1 -> sink 1 (raw event sink)
>> agent 1 src -> replicate
>>                channel 2 -> sink 2 -> agent 2 src -> multiplexer -> sink 3
>>                                                                 -> sink 4
>>
>> In fact, I tried not having agent 2 and directly connecting sink 2 to
>> src 2, but I was not able to, due to an RPCClient exception.
>>
>> I am just going to try to have 2 agents.
>>
>> terrey
>>
>>
>> On Mon, Aug 18, 2014 at 3:06 PM, terrey shih <terreys...@gmail.com> wrote:
>>
>>> Well, I am actually doing similar things to you. I also need to feed
>>> that data to different sinks: one just raw data, and the other ones are
>>> HBase sinks using the multiplexer.
>>>
>>>                channel 1 -> sink 1 (raw event sink)
>>> agent 1 src -> replicate
>>>                channel 2 -> sink 2 -> agent 2 src -> multiplexer
>>>
>>>
>>> On Mon, Aug 18, 2014 at 1:35 PM, Guillermo Ortiz <konstt2...@gmail.com> wrote:
>>>
>>>> In my test, everything is in the same VM. Later, I'll have another
>>>> flow which just spools or tails a file and sends it through Avro to
>>>> another source in my system.
>>>>
>>>> Do I really need that replicating step? I think I have too many
>>>> channels, and that means too many resources and too much configuration.
>>>>
>>>>
>>>> 2014-08-18 19:51 GMT+02:00 terrey shih <terreys...@gmail.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> Are your two sources (spooling) and the Avro source (from sink 2) in
>>>>> two different JVMs/machines?
>>>>>
>>>>> thx
>>>>>
>>>>>
>>>>> On Mon, Aug 18, 2014 at 9:53 AM, Guillermo Ortiz <konstt2...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have built a flow with Flume and I don't know if it's the right
>>>>>> way to do it, or whether there is something better. I am spooling a
>>>>>> directory and need those data in three different paths in HDFS with
>>>>>> different formats, so I have created two interceptors.
>>>>>>
>>>>>> Source (spooling) + replication + Interceptor1 --> to C1 and C2
>>>>>> C1 --> Sink1 to HDFS Path1 (it's like a historic archive)
>>>>>> C2 --> Sink2 to Avro --> Source Avro + multiplexing + Interceptor2 --> C3 and C4
>>>>>> C3 --> Sink3 to HDFS Path2
>>>>>> C4 --> Sink4 to HDFS Path3
>>>>>>
>>>>>> Interceptor1 doesn't do much with the data; it just stores the
>>>>>> events as they are, like keeping a history of the original data.
>>>>>>
>>>>>> Interceptor2 configures a selector and a header. It processes the
>>>>>> data and sets the selector to redirect to Sink3 or Sink4. But this
>>>>>> interceptor changes the original data.
>>>>>>
>>>>>> I tried to do the whole process without replicating data, but I
>>>>>> could not. Now it seems like too many steps just because I want to
>>>>>> store the original data in HDFS as a historic archive.
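(For reference, the flow described in this last message maps onto two agent configurations roughly as below. This is only a sketch, not a tested configuration: every name, host, port, and path is a placeholder, and the two interceptor classes are assumed to be custom implementations, with Interceptor2 setting a "route" header for the multiplexing selector.

    # --- agent1: spool a directory, replicate to raw HDFS and to an Avro hop ---
    agent1.sources = src1
    agent1.channels = c1 c2
    agent1.sinks = sink1 sink2

    agent1.sources.src1.type = spooldir
    # placeholder directory
    agent1.sources.src1.spoolDir = /data/incoming
    agent1.sources.src1.channels = c1 c2
    agent1.sources.src1.selector.type = replicating
    agent1.sources.src1.interceptors = i1
    # Interceptor1: hypothetical custom class
    agent1.sources.src1.interceptors.i1.type = com.example.Interceptor1$Builder

    agent1.channels.c1.type = memory
    agent1.channels.c2.type = memory

    # Sink1: the raw/historic copy
    agent1.sinks.sink1.type = hdfs
    agent1.sinks.sink1.hdfs.path = hdfs://namenode/path1
    agent1.sinks.sink1.channel = c1

    # Sink2: Avro hop to agent2 (host and port are placeholders)
    agent1.sinks.sink2.type = avro
    agent1.sinks.sink2.hostname = localhost
    agent1.sinks.sink2.port = 4141
    agent1.sinks.sink2.channel = c2

    # --- agent2: Avro source, multiplexing on a header set by Interceptor2 ---
    agent2.sources = src2
    agent2.channels = c3 c4
    agent2.sinks = sink3 sink4

    agent2.sources.src2.type = avro
    agent2.sources.src2.bind = 0.0.0.0
    agent2.sources.src2.port = 4141
    agent2.sources.src2.channels = c3 c4
    agent2.sources.src2.interceptors = i2
    # Interceptor2: hypothetical custom class that transforms the event
    # and sets the "route" header used by the selector below
    agent2.sources.src2.interceptors.i2.type = com.example.Interceptor2$Builder
    agent2.sources.src2.selector.type = multiplexing
    agent2.sources.src2.selector.header = route
    agent2.sources.src2.selector.mapping.path2 = c3
    agent2.sources.src2.selector.mapping.path3 = c4
    agent2.sources.src2.selector.default = c3

    agent2.channels.c3.type = memory
    agent2.channels.c4.type = memory

    agent2.sinks.sink3.type = hdfs
    agent2.sinks.sink3.hdfs.path = hdfs://namenode/path2
    agent2.sinks.sink3.channel = c3

    agent2.sinks.sink4.type = hdfs
    agent2.sinks.sink4.hdfs.path = hdfs://namenode/path3
    agent2.sinks.sink4.channel = c4

The Avro sink -> Avro source pair is the supported way to chain two agents, which also matches terrey's observation above that wiring sink 2 straight into src 2 fails with an RPCClient error: a sink can only drain a channel, not feed another source directly.)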