I'm not sure without looking at the exact use case, but maybe you can use something like HAProxy in front of the flume nodes?
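
To make the suggestion concrete, here is a rough sketch of what an HAProxy setup in TCP mode might look like. The node names, addresses, and port 5140 are made up for illustration, and it assumes rsyslog forwards over TCP to a syslog TCP source on each flume node. Note that leastconn balances on open connections, not on CPU utilization, and the plain "check" keyword only tests that the port accepts connections; it takes a failed node out of rotation but does not measure load.

```
# Sketch only: names, IPs and ports below are hypothetical.
defaults
    mode tcp
    timeout connect 5s
    timeout client  1m
    timeout server  1m

frontend syslog_in
    bind *:5140
    default_backend flume_nodes

backend flume_nodes
    # Send each new connection to the node with the fewest open connections.
    balance leastconn
    # "check" enables a periodic TCP health check; dead nodes are skipped.
    server flume1 10.0.0.11:5140 check
    server flume2 10.0.0.12:5140 check
    server flume3 10.0.0.13:5140 check
```

Since rsyslog tends to keep its TCP connection open, the balancing here happens per connection and not per message, so this may or may not spread the load evenly enough for your case.
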
--
Sharninder

On Thu, Aug 14, 2014 at 4:08 PM, Mohit Durgapal <durgapalmo...@gmail.com> wrote:

> Hi Sharninder,
>
> Thanks for the response. The load balancing is not based on header. To
> simplify, let's say I have one web server generating logs and three flume
> nodes receiving those logs. I want the load to be balanced on those three
> flume nodes based on cpu utilization and load.
>
> On Thu, Aug 14, 2014 at 4:01 PM, Sharninder <sharnin...@gmail.com> wrote:
>
>> To add headers to the events, you can either send proper avro formatted
>> packets (which have a header) to an avro source, or implement a custom
>> interceptor to add headers after they're received by the syslog source.
>> There is a static interceptor bundled with flume that you can use. The
>> problem with that is that you can only add a single header (key->value) at
>> a time, as far as I know. But it's a good starting point to do what you
>> want to do.
>>
>> I didn't really understand your load balancing requirement but if it's
>> based on the headers, you'll have to write your own interceptors.
>>
>> On Thu, Aug 14, 2014 at 12:55 PM, Mohit Durgapal <durgapalmo...@gmail.com> wrote:
>>
>>> I have a requirement where I need to feed push traffic (comma separated
>>> logs) at a very high rate to flume.
>>> I have three concerns:
>>>
>>> 1. I am using php to send events to flume through rsyslog. The code
>>> I am using is:
>>>
>>>     openlog("mylogs", LOG_NDELAY, LOG_LOCAL2);
>>>     syslog(LOG_INFO, "aaid,bid,cid,info1,info2,....");
>>>     closelog();
>>>
>>> I want to add some fields as headers in the above event log
>>> "aaid,bid,cid,info1,info2,....". I don't see any function in php
>>> where I could add headers for some fields so that I can take some action
>>> based on just the headers without opening the complete msg.
>>>
>>> 2. How to load balance the traffic. I want the logger to forward
>>> the logs to the load balancer and then the load balancer to choose a
>>> flume node (based on various factors like current load, cpu utilization)
>>> and also handle failures (divert traffic if a flume node goes down).
>>>
>>> I looked at the flume based load balancer but it provides just two
>>> options: Round Robin and Random load balancing. Any ideas as to how I
>>> could do this load balancing with failure detection and handling would
>>> be very helpful.
>>>
>>> 3. I want to update a cache in real-time from flume (using
>>> interceptor). I want a hashing based approach to divert certain
>>> traffic (based on a field or header in log) to certain nodes, so that one
>>> node is responsible for updating rows with keys under same hash bucket.
>>> This is to avoid row level locking.
>>>
>>> I hope I have explained my requirements well enough for everyone to
>>> understand. But if it's not as clear as I think, please let me know.
>>>
>>> Regards
>>> Mohit
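
On point 3 in the quoted message, the hashing idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the node names are hypothetical, the routing key is assumed to be the first comma-separated field (aaid in the example log line), and CRC32 modulo the node count stands in for whatever hash you would actually choose. A stable hash is used instead of Python's built-in hash(), which is randomized per process for strings.

```python
import zlib

# Hypothetical node names; replace with the real flume agents.
FLUME_NODES = ["flume1", "flume2", "flume3"]


def extract_key(line, field_index=0):
    """Return the comma-separated field used as the routing key."""
    return line.split(",")[field_index]


def pick_node(key, nodes=FLUME_NODES):
    """Map a key to a node with a stable hash (CRC32 modulo node count)."""
    bucket = zlib.crc32(key.encode("utf-8")) % len(nodes)
    return nodes[bucket]


def route(line):
    """Choose the node for one log line based on its first field."""
    return pick_node(extract_key(line))
```

Every line with the same key lands on the same node, so a single node owns all rows in a given bucket. The trade-off is that plain modulo hashing remaps most keys whenever the number of nodes changes. Inside flume itself, a similar effect might be obtained by having an interceptor write the bucket into a header and routing on that header with the multiplexing channel selector.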