Hi Gwen,

I have gone through this link while trying to set up my Logstash Kafka
handler:

https://github.com/joekiller/logstash-kafka

I could achieve what I was looking for, but throughput degrades badly
when writing a large file of several GBs.
I guess there should be some way to parallelise the existing
process.
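
The direction I'm considering is to split the big file and run one
Logstash shipper per chunk, roughly like the sketch below (the paths,
chunk count and broker address are placeholders, and I haven't verified
how multiple instances behave with their sincedb state):

# Split the 30GB file into 4 chunks on line boundaries (GNU coreutils)
split -n l/4 big.log /data/chunks/part_

# Run one instance per chunk, e.g. bin/logstash -f shipper_aa.conf,
# where each shipper_XX.conf reads its own chunk:
input {
  file {
    path => "/data/chunks/part_aa"
    start_position => "beginning"
  }
}
output {
  kafka {
    broker_list => "broker1:9092"
    topic_id => "logs"
  }
}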

Thanks!

On Sun, Feb 8, 2015 at 8:06 PM, Gwen Shapira <gshap...@cloudera.com> wrote:

> I'm wondering how much of the time is spent by Logstash reading and
> processing the log vs. time spent sending data to Kafka. Also, I'm not
> familiar with Logstash internals; perhaps it can be tuned to send the data
> to Kafka in larger batches?
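>
> I don't know whether the joekiller plugin exposes the standard Kafka 0.8
> producer options, so treat the option names and values below as guesses
> to verify against the plugin docs, but batching would look roughly like:
>
> output {
>   kafka {
>     broker_list => "broker1:9092"
>     topic_id => "logs"
>     producer_type => "async"         # buffer events client-side
>     batch_num_messages => 2000       # events per async batch
>     queue_buffering_max_ms => 5000   # max time to buffer before a send
>     compression_codec => "snappy"    # fewer bytes over the wire
>   }
> }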
>
> At the moment it's difficult to tell where the slowdown is. More
> information about the breakdown of time would help.
>
> You can also try Flume's SpoolingDirectory source with a Kafka channel or
> sink, and see if you get better performance out of other tools.
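>
> Off the top of my head (double-check the property names against the
> Flume docs; the directory, broker and topic values are placeholders), a
> minimal agent would look something like:
>
> a1.sources  = spool
> a1.channels = c1
> a1.sinks    = k1
>
> # Watch a directory for completed log files
> a1.sources.spool.type     = spooldir
> a1.sources.spool.spoolDir = /data/logs/spool
> a1.sources.spool.channels = c1
>
> a1.channels.c1.type     = memory
> a1.channels.c1.capacity = 100000
>
> # Ship events to Kafka in batches
> a1.sinks.k1.type       = org.apache.flume.sink.kafka.KafkaSink
> a1.sinks.k1.brokerList = broker1:9092
> a1.sinks.k1.topic      = logs
> a1.sinks.k1.batchSize  = 1000
> a1.sinks.k1.channel    = c1
>
> (Swap the memory channel for the Kafka channel if you want the buffering
> in Kafka itself.)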
>
>
> Gwen
>
> On Sun, Feb 8, 2015 at 12:06 AM, Vineet Mishra <clearmido...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I have log files of around 30GB, and I am trying to process these logs
> > as events by pushing them to Kafka. I can clearly see that the
> > throughput achieved while publishing these events to Kafka is quite slow.
> >
> > As mentioned, for the single 30GB log file, Logstash has been emitting
> > to Kafka continuously for more than 2 days and has still processed only
> > about 60% of the log data. I am looking for a way to publish events to
> > Kafka more efficiently, since at this rate of ingestion I don't think
> > moving ahead with this approach is a good option.
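> >
> > To put a number on that: 60% of 30GB is about 18GB, and 18GB over 2
> > days (roughly 172,800 seconds) works out to only about 100 KB/s of
> > sustained throughput into Kafka.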
> >
> > Looking for ways to improve the performance here.
> >
> > Expert advice required!
> >
> > Thanks!
> >
>
