Check out my logstash-kafka project: https://github.com/joekiller/logstash-kafka
I believe the plugin will be merged into Logstash itself soon, but for now you can build it yourself. I would suggest emitting your Apache access log as JSON directly from your Apache config, then streaming the data through the Logstash Kafka output (producer) and parsing it on the other side with the Logstash Kafka input (consumer).

Try something like:

LogFormat "{\"@timestamp\":\"%{%Y-%m-%dT%H:%M:%S%z}t\",\"mod_proxy\":{\"x-forwarded-for\":\"%{X-Forwarded-For}i\"},\"mod_headers\":{\"referer\":\"%{Referer}i\",\"user-agent\":\"%{User-Agent}i\",\"host\":\"%{Host}i\"},\"mod_log\":{\"server_name\":\"%V\",\"remote_logname\":\"%l\",\"remote_user\":\"%u\",\"first_request\":\"%r\",\"last_request_status\":\"%>s\",\"response_size_bytes\":%B,\"duration_usec\":%D},\"@version\":1}" logstash_json
CustomLog "|rotatelogs /var/log/httpd/access_log_json-%s 3600" logstash_json

________________________________________
From: Philip O'Toole <philip.oto...@yahoo.com.INVALID>
Sent: Thursday, August 07, 2014 3:01 PM
To: users@kafka.apache.org
Subject: Re: Apache webserver access logs + Kafka producer

Fluentd might work, or simply configure rsyslog or syslog-ng on the box to watch the Apache log files and send them to a suitable producer. (For example, I wrote something that accepts messages from a syslog client and streams them to Kafka: https://github.com/otoolep/syslog-gollector)

More ideas here: https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem

Philip

-----------------------------------------
http://www.philipotoole.com

On Tuesday, August 5, 2014 2:48 PM, Florian Dambrine <flor...@gumgum.com> wrote:

You might be interested in something like Logstash (http://logstash.org) for log and event processing.

Regards,
Florian

On 5 August 2014 at 23:17, "Jonathan Weeks" <jonathanbwe...@gmail.com> wrote:

> You can look at something like:
>
> https://github.com/harelba/tail2kafka
>
> (although I don’t know what the effort would be to update it, as it
> doesn’t look like it has been updated in a couple of years)
>
> We are using Flume to gather logs, and then sending them to a Kafka
> cluster via a Flume Kafka sink, e.g.:
>
> https://github.com/thilinamb/flume-ng-kafka-sink
>
> -Jonathan
>
>
> On Aug 5, 2014, at 1:40 PM, mvs.s...@gmail.com wrote:
>
> > Hi,
> >
> > I want to collect Apache web server logs in real time and send them to a
> > Kafka server. Is there an existing producer available to do this? If
> > not, can you please suggest a way to implement it?
> >
> > Regards,
> > Sree.
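
Going back to the JSON LogFormat at the top of this message: before wiring it into Kafka, it may be worth checking that each line Apache writes decodes as a single JSON object, since that is what the consumer side will have to parse. Below is a minimal Python sketch, assuming one event per line. The names parse_access_line and SAMPLE_LINE are hypothetical, and the sample values are made up for illustration rather than taken from a real log.

```python
import json

# Hypothetical sample line shaped like the logstash_json format above.
# Every value is invented for illustration; it is not real Apache output.
SAMPLE_LINE = (
    '{"@timestamp":"2014-08-07T15:01:00-0400",'
    '"mod_proxy":{"x-forwarded-for":"-"},'
    '"mod_headers":{"referer":"-","user-agent":"curl/7.30.0","host":"example.com"},'
    '"mod_log":{"server_name":"example.com","remote_logname":"-","remote_user":"-",'
    '"first_request":"GET / HTTP/1.1","last_request_status":"200",'
    '"response_size_bytes":512,"duration_usec":1834},'
    '"@version":1}'
)


def parse_access_line(line):
    """Return the decoded event as a dict, or None if the line is not a JSON object."""
    try:
        event = json.loads(line)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    # A bare list, number, or string is valid JSON but not a usable event.
    return event if isinstance(event, dict) else None


if __name__ == "__main__":
    event = parse_access_line(SAMPLE_LINE)
    if event is None:
        print("line is not a valid JSON event")
    else:
        print(event["mod_log"]["last_request_status"])
```

Running a handful of real lines from the access log through a check like this should catch an unbalanced brace or an unquoted value in the format string before the events reach the consumer.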