To add headers to the events, you can either send properly Avro-formatted packets (which carry headers) to an Avro source, or implement a custom interceptor that adds headers after the events are received by the syslog source. There is a static interceptor bundled with Flume that you can use. The limitation there is that each static interceptor instance can only add a single header (key -> value), as far as I know, so you have to chain several of them. Still, it's a good starting point for what you want to do.
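For example, a config roughly like this chains two static interceptors on a syslog source, each contributing one header (the agent/source names, port, and header keys/values here are just placeholders for illustration):

agent1.sources = syslog-src
agent1.channels = mem-ch
agent1.sources.syslog-src.type = syslogtcp
agent1.sources.syslog-src.port = 5140
agent1.sources.syslog-src.channels = mem-ch
agent1.sources.syslog-src.interceptors = i1 i2
# each static interceptor sets exactly one key/value pair
agent1.sources.syslog-src.interceptors.i1.type = static
agent1.sources.syslog-src.interceptors.i1.key = datacenter
agent1.sources.syslog-src.interceptors.i1.value = dc1
agent1.sources.syslog-src.interceptors.i2.type = static
agent1.sources.syslog-src.interceptors.i2.key = environment
agent1.sources.syslog-src.interceptors.i2.value = prod

Keep in mind that static interceptors only attach fixed values from the config; if you want to promote fields from the log line itself into headers, you'll need a custom interceptor (or possibly the regex extractor interceptor).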
I didn't really understand your load-balancing requirement, but if it's based on the headers, you'll have to write your own interceptor; there's a rough sketch below the quoted message.

On Thu, Aug 14, 2014 at 12:55 PM, Mohit Durgapal <durgapalmo...@gmail.com> wrote:
> I have a requirement where I need to feed push traffic (comma-separated
> logs) at a very high rate to Flume.
> I have three concerns:
>
> 1. I am using PHP to send events to Flume through rsyslog. The code I
> am using is:
>
> openlog("mylogs", LOG_NDELAY, LOG_LOCAL2);
> syslog(LOG_INFO, "aaid,bid,cid,info1,info2,....");
> closelog();
>
> I want to add some fields as headers in the above event log
> "aaid,bid,cid,info1,info2,....". I don't see any function in PHP
> where I could add headers for some fields so that I can take some action
> based on just the headers without opening the complete msg.
>
> 2. How to load balance the traffic. I want the logger to forward the
> logs to the load balancer and then the load balancer to choose a Flume
> node (based on various factors like current load, CPU utilization) and also
> handle failures (divert traffic if a Flume node goes down).
>
> I looked at the Flume-based load balancer but it provides just two
> options: round-robin and random load balancing. Any ideas as to how I could
> do this load balancing with failure detection and handling would be very
> helpful.
>
> 3. I want to update a cache in real time from Flume (using an
> interceptor). I want a hashing-based approach to divert certain
> traffic (based on a field or header in the log) to certain nodes, so that one
> node is responsible for updating rows with keys under the same hash bucket.
> This is to avoid row-level locking.
>
> I hope I have explained my requirements well enough for everyone to
> understand. But if it's not as clear as I think, please let me know.
>
> Regards
> Mohit
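For reference, a custom interceptor for the comma-separated format above might look roughly like this. It's an untested sketch: the class/package names, the field names it promotes, and the bucket count are all placeholders you'd adapt to your data.

package org.example.flume; // hypothetical package

import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

public class CsvHeaderInterceptor implements Interceptor {

  @Override
  public void initialize() {
    // no setup needed
  }

  @Override
  public Event intercept(Event event) {
    // parse the comma-separated body and promote selected fields to headers
    String body = new String(event.getBody(), StandardCharsets.UTF_8);
    String[] fields = body.split(",", -1);
    Map<String, String> headers = event.getHeaders();
    if (fields.length >= 2) {
      headers.put("aaid", fields[0]);
      headers.put("bid", fields[1]);
      // hash-bucket header so downstream routing can pin a key range to one node
      int bucket = Math.abs(fields[0].hashCode() % 10);
      headers.put("bucket", Integer.toString(bucket));
    }
    return event;
  }

  @Override
  public List<Event> intercept(List<Event> events) {
    for (Event e : events) {
      intercept(e);
    }
    return events;
  }

  @Override
  public void close() {
    // nothing to clean up
  }

  public static class Builder implements Interceptor.Builder {
    @Override
    public Interceptor build() {
      return new CsvHeaderInterceptor();
    }

    @Override
    public void configure(Context context) {
      // field names / bucket count could be read from the agent config here
    }
  }
}

You'd register it on the source with something like
agent1.sources.syslog-src.interceptors.i3.type = org.example.flume.CsvHeaderInterceptor$Builder
and then a multiplexing channel selector (or your own sink-side logic) could route on the "bucket" header, which gets you the hash-based partitioning you described in point 3.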