Heh, I forgot to link the multiplexing channel selector documentation. Here it is: <http://flume.apache.org/FlumeUserGuide.html#multiplexing-channel-selector>
- Connor

On Mon, Apr 22, 2013 at 11:52 PM, Connor Woodson <[email protected]> wrote:

> Some more thoughts on this:
>
> The way Interceptors are currently set to work is that they apply to an
> event as it is received. There are good uses for this - for instance, it
> allows easily configuring a single Timestamp interceptor that gives all
> events a source receives a timestamp, so even if you have multiple
> sinks/channels responding to an event, you only have that one interceptor.
> Interceptors in this sense serve to add data to event headers, and as such
> it makes sense to have them applied only once by the source instead of
> letting the channels change header data.
>
> If you wish to use an interceptor in the above way, to modify header data,
> and still want that interceptor to apply for a single channel, then if you
> don't mind could you elaborate on what you are trying to do? I haven't been
> able to come up with a situation like that. The solution here would be to
> do as Jeff suggested and use a serializer; if you want more in-depth
> instructions on how to build it, please ask; I have a set of directions
> lying around somewhere that I'll find for you.
>
> However, given the way Interceptors work, I have myself faced a situation
> where I would like the interceptors to apply per channel only. This use
> case is when I want to use an Interceptor to filter events; I want to send
> an event to some subset of channels based on the contents of its data.
> Here is how you can do this in the current setup (where Interceptors are
> applied at the source instead of per-channel):
>
> Using the Multiplexing Channel Selector you are able to choose which
> channels an event is written to based on the value of a specified
> header (documentation in that link). There are some more features to the
> selector that aren't documented, called Optional Channels or something, but
> I don't know very much about them - just figured I would point out that
> they exist; digging through the source should provide some more insight.
>
> So here is how you want to set your system up. Create an Interceptor that
> will define a certain header value based on the event's contents. For
> instance, if you want all events containing exactly 1 character to be sent
> to a channel, you could create an Interceptor that counts the characters in
> the event. Then that Interceptor will set a certain header value to
> "SINGLE" if there is just one character, or "MULTIPLE" if there are more.
>
> Then you can create your channel selector like this (modified from the
> documentation example):
>
> a1.sources = r1
> a1.channels = all_events single_events multiple_events
> a1.sources.r1.interceptors = your_interceptor
> a1.sources.r1.interceptors.your_interceptor.header = header
> a1.sources.r1.selector.type = multiplexing
> a1.sources.r1.selector.header = header
> a1.sources.r1.selector.mapping.SINGLE = all_events single_events
> a1.sources.r1.selector.mapping.MULTIPLE = all_events multiple_events
> a1.sources.r1.selector.default = all_events
>
> The result is that now you have created a way to filter which channels a
> certain event is sent to. Note that a channel can appear in more than one
> mapping - for instance, all_events will get all events. And so the trick is
> to just define the right interceptor (which is much simpler to code than a
> serializer (which itself is fairly easy)).
>
> Hopefully that was clear. Feel free to ask more questions,
>
> - Connor
>
> On Fri, Apr 19, 2013 at 11:14 AM, Jeff Lord <[email protected]> wrote:
>
>> Jagadish,
>>
>> Here is an example of how to write a custom serializer.
>>
>> https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MyCustomSerializer.java
>>
>> -Jeff
>>
>> On Fri, Apr 19, 2013 at 9:34 AM, Jeff Lord <[email protected]> wrote:
>>
>>> Hi Jagadish,
>>>
>>> Have you considered using a custom event serializer to modify your event?
>>> It's possible to replicate your flow using two channels and then have one
>>> sink that implements a custom serializer to modify the event.
>>>
>>> -Jeff
>>>
>>> On Tue, Apr 16, 2013 at 11:12 PM, Jagadish Bihani <
>>> [email protected]> wrote:
>>>
>>>> Hi
>>>>
>>>> If anybody has any input on this, that will surely help.
>>>>
>>>> Regards,
>>>> Jagadish
>>>>
>>>> On 04/16/2013 12:06 PM, Jagadish Bihani wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> We have a use case in which:
>>>>> 1. A spooling source reads data.
>>>>> 2. It needs to write events into multiple channels. It should apply
>>>>> the interceptor only when putting into one channel and should put
>>>>> the event as it is while putting into another channel.
>>>>>
>>>>> Possible approach we have thought of:
>>>>>
>>>>> 1. Create 2 different sources and then apply the interceptor on one
>>>>> and don't apply it on the other. But that duplicates reads and
>>>>> increases IO.
>>>>>
>>>>> Is there any better way of achieving this use case?
>>>>>
>>>>> Regards,
>>>>> Jagadish
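
For reference, the SINGLE/MULTIPLE tagging idea from Connor's message can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the thread or from Flume: the class name LengthTagger and its methods are hypothetical, the event body is assumed to be UTF-8 text, and an empty body is assumed to leave the header unset so that selector.default applies. It uses a plain Map in place of Flume's event headers so that it runs on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/**
 * Standalone sketch of the header-tagging logic described in the thread.
 * Hypothetical helper; it is not part of Flume and does not depend on it.
 */
final class LengthTagger {
    // Matches "selector.header = header" in the quoted configuration.
    static final String HEADER_NAME = "header";

    private LengthTagger() {}

    /** Returns "SINGLE", "MULTIPLE", or null for an empty body (assumption). */
    static String classify(byte[] body) {
        // Assumption: bodies are UTF-8 text; count code points, not bytes.
        String text = new String(body, StandardCharsets.UTF_8);
        int chars = text.codePointCount(0, text.length());
        if (chars == 0) {
            return null; // no header set, so the selector default would apply
        }
        return chars == 1 ? "SINGLE" : "MULTIPLE";
    }

    /** Sets the routing header on the given header map and returns the map. */
    static Map<String, String> tag(Map<String, String> headers, byte[] body) {
        String value = classify(body);
        if (value != null) {
            headers.put(HEADER_NAME, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        tag(headers, "x".getBytes(StandardCharsets.UTF_8));
        System.out.println(headers); // prints {header=SINGLE}
    }
}
```

In an actual Flume interceptor, logic like this would presumably sit inside the intercept(Event) method of a class implementing org.apache.flume.interceptor.Interceptor, reading event.getBody() and writing to event.getHeaders(); the multiplexing selector then routes on that header as in the quoted configuration.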
