+1. Using a sink group with the load-balancing sink processor is the solution.
Backoff is optional (enable it only if you want failed sinks to be skipped for
a while).


Hari 

-- 
Hari Shreedharan


On Thursday, January 10, 2013 at 12:10 AM, Connor Woodson wrote:

> I forgot about sink processors; yes, that will work.
> 
> The trick with this method is that you use a different sink for each
> endpoint, whereas the RpcClient (once exposed) would handle it all by itself.
> Your configuration will need to look something like this:
> 
> -----------------
> 
> <sources>
> 
> a1.channels = c1
> <channel setup>
> 
> a1.sinks = k1 k2
> 
> a1.sinks.k1.type = AVRO
> < set up centralFlumeE connection >
> a1.sinks.k1.channel = c1
> 
> a1.sinks.k2.type = AVRO
> < set up centralFlumeF connection >
> a1.sinks.k2.channel = c1
> 
> a1.sinkgroups = g1
> a1.sinkgroups.g1.sinks = k1 k2
> a1.sinkgroups.g1.processor.type = load_balance
> a1.sinkgroups.g1.processor.backoff = true
> a1.sinkgroups.g1.processor.selector = round_robin 
> 
> -----------------
> 
> Here is the relevant link for the load balancing processor:
> http://flume.apache.org/FlumeUserGuide.html#load-balancing-sink-processor
> 
> Remember that all sinks in a sink group must share the same channel. This is
> load balancing, which is what you are seeking in your scenario; the load
> balancer is not meant for failover (a setup with primary and backup servers),
> although there is a FailoverSinkProcessor if that is needed.
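> 
> As a rough sketch of that failover alternative, the same sink group could be
> switched to the failover processor. The priority and maxpenalty values below
> are illustrative assumptions, not recommendations; the sink with the larger
> priority value is preferred:
> 
> -----------------
> 
> a1.sinkgroups = g1
> a1.sinkgroups.g1.sinks = k1 k2
> # assumption: k1 is the primary sink and k2 is the backup
> a1.sinkgroups.g1.processor.type = failover
> a1.sinkgroups.g1.processor.priority.k1 = 10
> a1.sinkgroups.g1.processor.priority.k2 = 5
> # assumed upper limit (in milliseconds) on the backoff for a failed sink
> a1.sinkgroups.g1.processor.maxpenalty = 10000
> 
> -----------------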
> 
> - Connor
> 
> 
> On Wed, Jan 9, 2013 at 11:55 PM, Denny Ye wrote:
> > Hi Hari,
> >     I am not sure whether the approach you suggested fits my situation, so
> > let me explain my case and ask for your comments. Thanks a lot!
> >     What I need is load balancing during event transfer. Assume that I
> > have a single local Flume server (co-located with the application) named
> > 'localFlumeA', configured with a single AvroSink and Channel, and two
> > central Flume servers (collectors) named 'centralFlumeE' and
> > 'centralFlumeF'. In this case, I would like to configure load balancing
> > between 'centralFlumeE' and 'centralFlumeF' for events coming from
> > 'localFlumeA', so that the load is distributed evenly across the two
> > central Flume servers.
> >     Do you think this can be configured with LoadBalancingSinkProcessor? I
> > would appreciate your advice.
> > 
> > -Regards
> > Denny Ye
> > 
> > 
> > 
> > 2013/1/10 Hari Shreedharan
> > > Load-balancing capability similar to that of the LoadBalancingRpcClient
> > > can be configured for multiple Avro Sinks using a
> > > LoadBalancingSinkProcessor, if that is the functionality you are looking
> > > for.
> > > 
> > > 
> > > Hari 
> > > 
> > > -- 
> > > Hari Shreedharan
> > > 
> > > 
> > > On Wednesday, January 9, 2013 at 11:05 PM, Connor Woodson wrote:
> > > 
> > > > Short answer: there is no way in the current AvroSink to configure the 
> > > > RpcClient, limiting you to just a single host connection (I'm not sure 
> > > > how well it recovers if that host goes down).
> > > > 
> > > > The AvroSink is greatly simplified compared with what the RpcClient
> > > > can do and exposes none of the underlying functionality. Right now,
> > > > the only way around that is to create a custom sink based on the
> > > > AvroSink source code that, instead of setting up the RpcClient the way
> > > > it currently does, passes a set of user-supplied properties into
> > > > RpcClientFactory.getInstance(). Implementing this in an unsafe way
> > > > (not checking any of the user's values) would take only a couple of
> > > > lines of code, I believe. It is a workaround, but it will enable all
> > > > of the various RpcClient capabilities, such as failover or
> > > > load-balancing mode, and allow it to connect to multiple hosts.
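> > > > 
> > > > For reference, the Flume client SDK documents a properties format for
> > > > its load-balancing client, which is what
> > > > RpcClientFactory.getInstance(Properties) accepts. A custom sink along
> > > > these lines might pass through something like the sketch below. The
> > > > aliases h1 and h2, the host names, and port 41414 are assumptions made
> > > > up for illustration:
> > > > 
> > > > -----------------
> > > > 
> > > > # select the load-balancing client instead of the default one
> > > > client.type = default_loadbalance
> > > > # hypothetical aliases and addresses for the two collectors
> > > > hosts = h1 h2
> > > > hosts.h1 = host1.example.org:41414
> > > > hosts.h2 = host2.example.org:41414
> > > > # rotate through the hosts; back off temporarily from failed ones
> > > > host-selector = round_robin
> > > > backoff = true
> > > > 
> > > > -----------------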
> > > > 
> > > > This is something that (I think) there is a JIRA filed for; but if
> > > > not, it would be very helpful for this to be implemented in the actual
> > > > AvroSink (and something that should be linked to that is
> > > > RpcClientFactory.getInstance accepting a Context object, simply for
> > > > ease of use).
> > > > 
> > > > - Connor
> > > > 
> > > > 
> > > > On Wed, Jan 9, 2013 at 10:55 PM, Denny Ye wrote:
> > > > > Hi all,
> > > > >     I could not find how AvroSink relates to the other types of
> > > > > RpcClient, such as LoadBalancingRpcClient. I expected that a user
> > > > > could select the RpcClient type from the AvroSink configuration,
> > > > > along with its strategies and host selectors, but I could not find
> > > > > anything about this in the source code or the user guide. Did I miss
> > > > > something?
> > > > >     I would appreciate any help, thanks!
> > > > > 
> > > > > -Regards
> > > > > Denny Ye
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > 
> 
