P.S. You can also try feeding the logs straight into a Kafka console producer:

TransferLog "| /opt/kafka/bin/kafka-console-producer.sh --topic apache --broker-list broker-1:9092"
ErrorLog "| /opt/kafka/bin/kafka-console-producer.sh --topic apache-errors --broker-list broker-1:9092"

You can pipe a custom log into it as well :)

________________________________________
From: Joseph Lawson <jlaw...@roomkey.com>
Sent: Thursday, August 07, 2014 5:35 PM
To: users@kafka.apache.org; Philip O'Toole
Subject: RE: Apache webserver access logs + Kafka producer

Check out my logstash-kafka project: https://github.com/joekiller/logstash-kafka

I believe the plugin will be merged into logstash itself soon, but for now you can build it yourself. I would suggest defining your Apache log format as JSON in your Apache config, then streaming the data through the logstash kafka output (producer) and parsing it on the other side with the logstash kafka input (consumer).

Try something like:

LogFormat "{\"@timestamp\":\"%{%Y-%m-%dT%H:%M:%S%z}t\",\"mod_proxy\":{\"x-forwarded-for\":\"%{X-Forwarded-For}i\"},\"mod_headers\":{\"referer\":\"%{Referer}i\",\"user-agent\":\"%{User-Agent}i\",\"host\":\"%{Host}i\"},\"mod_log\":{\"server_name\":\"%V\",\"remote_logname\":\"%l\",\"remote_user\":\"%u\",\"first_request\":\"%r\",\"last_request_status\":\"%>s\",\"response_size_bytes\":%B,\"duration_usec\":%D},\"@version\":1}" logstash_json
CustomLog "|rotatelogs /var/log/httpd/access_log_json-%s 3600" logstash_json

________________________________________
From: Philip O'Toole <philip.oto...@yahoo.com.INVALID>
Sent: Thursday, August 07, 2014 3:01 PM
To: users@kafka.apache.org
Subject: Re: Apache webserver access logs + Kafka producer

Fluentd might work, or simply configure rsyslog or syslog-ng on the box to watch the Apache log files and send them to a suitable producer (for example, I wrote something that will accept messages from a syslog client and stream them to Kafka:
https://github.com/otoolep/syslog-gollector)

More ideas here: https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem

Philip

-----------------------------------------
http://www.philipotoole.com

On Tuesday, August 5, 2014 2:48 PM, Florian Dambrine <flor...@gumgum.com> wrote:

You might be interested in something like Logstash (http://logstash.org) for log and event processing.

Regards,
Florian

On 5 August 2014 at 23:17, "Jonathan Weeks" <jonathanbwe...@gmail.com> wrote:

> You can look at something like:
>
> https://github.com/harelba/tail2kafka
>
> (although I don’t know what the effort would be to update it, as it
> doesn’t look like it has been updated in a couple of years)
>
> We are using flume to gather logs, and then sending them to a kafka
> cluster via a flume kafka sink, e.g.:
>
> https://github.com/thilinamb/flume-ng-kafka-sink
>
> -Jonathan
>
>
> On Aug 5, 2014, at 1:40 PM, mvs.s...@gmail.com wrote:
>
> > Hi,
> >
> > I want to collect Apache web server logs in real time and send them
> > to a Kafka server. Is there an existing producer available to do
> > this? If not, can you please suggest a way to implement it?
> >
> > Regards,
> > Sree.
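
For anyone who would rather write a small piped-log program than use the console producer, here is a minimal Python sketch of the idea discussed in this thread: read one JSON log line at a time, check that it parses, and hand it to a producer. Everything named here is an assumption for illustration only: the function forward_json_lines, the sample lines, and the outbox list are made up, and the send callable stands in for whatever producer call your Kafka client provides (no real Kafka client is used).

```python
import json
from typing import Callable, Iterable, Tuple


def forward_json_lines(
    lines: Iterable[str], send: Callable[[bytes], None]
) -> Tuple[int, int]:
    """Send each valid JSON log line via `send`; return (sent, skipped)."""
    sent = skipped = 0
    for line in lines:
        record = line.strip()
        if not record:
            continue  # ignore blank lines
        try:
            json.loads(record)  # validate only; the original text is forwarded
        except ValueError:
            skipped += 1  # malformed line, e.g. a broken LogFormat
            continue
        send(record.encode("utf-8"))
        sent += 1
    return sent, skipped


# Hypothetical sample lines standing in for Apache's piped log output.
sample = [
    '{"@timestamp":"2014-08-07T17:35:00-0400",'
    '"mod_log":{"last_request_status":"200"},"@version":1}\n',
    "not json\n",
    "\n",
]

outbox = []  # stand-in for a Kafka producer's send call (assumption)
counts = forward_json_lines(sample, outbox.append)
print(counts, len(outbox))
```

To try it as an Apache piped logger you would pass sys.stdin as the lines argument, since Apache writes one log entry per line to the program's standard input. Validating before sending is just one possible design choice; it keeps a malformed LogFormat from pushing unparseable messages to the consumer side.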