I have been experimenting with Kafka for the last hour or so, and it seems that piping a custom *tail* command into the producer does send the newly written log lines to Kafka.
However, there is no clear separation between the messages received by the ConsoleConsumer, so I can't tell whether lines are sent whole or cut in the middle. (Even if a message contains 10 log lines, that is fine, because the messages will go through some processing later. I just need to make sure that a line is never split across 2 messages.) I will be testing this sometime next week.

Is there any simple consumer available that shows messages separated?

Thanks a lot!

Ron

On Fri, Jan 11, 2013 at 7:18 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> Ron,
>
> The best way of doing this would be to use the ConsoleProducer. Basically,
> it reads data from the console and parses it using the message "reader",
> which by default is the LineReader. In this case, you can either write your
> own SquidMessageReader that understands the Squid access format [1] and
> sends JSON data to Kafka, or use the inbuilt LineReader, although that
> wouldn't attach any structure to your data.
>
> If you do end up writing a squid message reader, would you mind putting it
> up on github? I can see it being useful to the community -
> https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem
>
> Thanks,
> Neha
>
> 1. http://wiki.squid-cache.org/Features/LogFormat
>
>
> On Thu, Jan 10, 2013 at 10:24 AM, Jun Rao <jun...@gmail.com> wrote:
>
> > The following wiki describes the operational part of Kafka.
> > https://cwiki.apache.org/confluence/display/KAFKA/Operations
> >
> > To get your log into Kafka, if this is log4j data, you may consider
> > adding a KafkaLog4jAppender. Otherwise, you can probably use
> > ConsoleProducer. You will still need to deal with things like log
> > rolling yourself though.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Jan 10, 2013 at 9:59 AM, Ron Tsoref <chie...@gmail.com> wrote:
> >
> > > Hi.
> > >
> > > I currently have a couple of servers running a reverse proxy software
> > > that creates access logs in the squid log format (Here is a
> > > screenshot <http://www.arnut.com/pics/itpro/LogFile22.jpg> showing the
> > > file's formation).
> > >
> > > As I understand it, Kafka is a good solution for handling the
> > > connection between the proxies (that actually create the logs) and
> > > Storm (in order to analyze them in real-time).
> > >
> > > Right now, I'm looking for a way to gather the logs from each server
> > > with Kafka and how to configure these Kafka instances.
> > >
> > > I would appreciate any recommendation on how to do this, or any other
> > > source regarding this kind of setup.
> > >
> > > Is there any production-ready producer that can handle this log
> > > aggregating task? Basically, the ideal solution for me would be to
> > > generate a message for each one of the lines in the logs for each
> > > server, and then analyzing with Storm shouldn't be a big problem.
> > >
> > > Thanks,
> > >
> > > Ron
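
On the message-boundary question at the top of this thread, here is a minimal sketch in Python of one way to make boundaries visible and to check whether any log line was split across two messages. It does not call any Kafka API. It assumes the message payloads have already been fetched by some consumer and are available as a list of byte strings, and that the original log lines are at hand for comparison. The function names format_messages and find_partial_lines are hypothetical, chosen only for this illustration.

```python
def format_messages(messages):
    """Render each payload under its own numbered header line.

    `messages` is assumed to be a list of byte strings, one per
    consumed message (how they are fetched is out of scope here).
    """
    out = []
    for i, payload in enumerate(messages):
        text = payload.decode("utf-8", errors="replace")
        out.append(f"--- message {i} ({len(payload)} bytes) ---")
        out.append(text)
    return "\n".join(out)


def find_partial_lines(messages, source_lines):
    """Return (message index, fragment) pairs for every line inside a
    message that does not match a complete line of the source log."""
    known = set(source_lines)
    partial = []
    for i, payload in enumerate(messages):
        text = payload.decode("utf-8", errors="replace")
        for line in text.splitlines():
            # Empty strings come from blank lines and are ignored.
            if line and line not in known:
                partial.append((i, line))
    return partial


if __name__ == "__main__":
    # Made-up stand-ins for log lines, not real squid output.
    log = ["first log line", "second log line"]
    received = [b"first log line\nsecond log", b" line"]
    print(format_messages(received))
    print(find_partial_lines(received, log))
```

An empty list from find_partial_lines means every line in every message matched a complete source line. A message holding several whole lines passes the check, which matches the requirement stated above that only mid-line splits are a problem.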