Jun,

Let me see if I can fix it first, and then I will submit it back.

Daniel,

I was looking at the code some more and was thinking this might work:

https://github.com/apache/kafka/blob/0.8.1/perf/src/main/scala/kafka/perf/ProducerPerformance.scala

On line 246, instead of looping to create messages, I could open a sample
file and add the rows as messages line by line until I hit the configured
message cap. If I hit the end of the file, I would start again at the top.
I think I can figure this out.
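
Something along these lines (an untested sketch against the 0.8.1 producer
API; sampleFile, messageCap, and the broker/topic values are placeholders
for what would come from the tool's existing options):

import java.util.Properties
import scala.io.Source

import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

object FileBackedPerf {
  def main(args: Array[String]): Unit = {
    // Placeholders for what the perf tool's CLI options would supply.
    val sampleFile = "sample.txt"
    val messageCap = 100000L
    val topic = "perf-test"

    val props = new Properties()
    props.put("metadata.broker.list", "localhost:9092")
    props.put("serializer.class", "kafka.serializer.StringEncoder")
    val producer = new Producer[String, String](new ProducerConfig(props))

    // Read the sample file once, then cycle through the rows until the
    // configured message cap is reached (wrapping around at end of file).
    val rows = Source.fromFile(sampleFile).getLines().toArray
    var sent = 0L
    while (sent < messageCap) {
      val row = rows((sent % rows.length).toInt)
      producer.send(new KeyedMessage[String, String](topic, row))
      sent += 1
    }
    producer.close()
  }
}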

Bert


On Mon, Jun 30, 2014 at 1:46 PM, Daniel Compton <d...@danielcompton.net>
wrote:

> Hi Bert
>
> What you are describing could be done partially with the console producer.
> It will read from a file and send each line to the Kafka broker. You could
> make a really big file or alter that code to repeat a certain number of
> times. The source is pretty readable; I think that might be an easier route
> to take.
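>
> For example, something like this (assuming the 0.8.x script and flag
> names; the topic and file names are just placeholders):
>
>   bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test < sample.txt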
>
> Daniel.
>
> > On 1/07/2014, at 2:07 am, Bert Corderman <bertc...@gmail.com> wrote:
> >
> > Daniel,
> >
> >
> >
> > We have the same question. We noticed that the compression tests we ran
> > using the built-in performance tester were not realistic; I think on-disk
> > compression was 200:1 (yes, that is two hundred to one). I had planned to
> > try to edit the producer performance tester source and do the following:
> >
> >
> >
> > 1. Add an option to read sample data from a provided text file. (The
> > thought would be to add a file with 1-5000 rows, whatever I thought my
> > batch size might be.)
> >
> > 2. Load the sample file into an array.
> >
> > 3. Change the code that creates a message to pull a random row from the
> > array (roughly as sketched below).
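> >
> > For steps 2 and 3, something like this (an untested sketch; loadedRows
> > and the file name are placeholders):
> >
> > // Step 2: load the sample file into an array
> > val loadedRows: Array[String] =
> >   scala.io.Source.fromFile("sample.txt").getLines().toArray
> >
> > // Step 3: pull a random row each time a message payload is created
> > val rand = new java.util.Random()
> > def nextPayload(): String = loadedRows(rand.nextInt(loadedRows.length))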
> >
> >
> >
> > I am also not a Scala developer, so it would take me a little bit to
> > figure this out. This is on hold right now, as I am looking at options
> > for compressing the message before sending it to Kafka. We had originally
> > not wanted to do this because we assumed we would not get efficient
> > compression ratios on a single message; however, we are also talking
> > about sending multiple messages from our application as a single Kafka
> > message. Our concern with using Kafka compression is the overhead of
> > decompression on the broker to assign IDs. Here is a good article that
> > describes this:
> >
> > http://geekmantra.wordpress.com/2013/03/28/compression-in-kafka-gzip-or-snappy/
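> >
> > For the application-side option, a minimal sketch of the kind of thing I
> > mean (plain JDK gzip, batching several of our messages into one Kafka
> > message; untested):
> >
> > import java.io.ByteArrayOutputStream
> > import java.util.zip.GZIPOutputStream
> >
> > // Gzip a batch of rows into a single payload to send as one Kafka
> > // message, so the broker never has to decompress it itself.
> > def compressBatch(rows: Seq[String]): Array[Byte] = {
> >   val bytes = new ByteArrayOutputStream()
> >   val gzip = new GZIPOutputStream(bytes)
> >   gzip.write(rows.mkString("\n").getBytes("UTF-8"))
> >   gzip.close()
> >   bytes.toByteArray
> > }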
> >
> >
> >
> > But again, we haven’t decided just yet. We would like to test and evaluate.
> >
> >
> >
> > Bert
> >
> >
> > On Mon, Jun 30, 2014 at 2:24 AM, Daniel Compton <d...@danielcompton.net>
> > wrote:
> >
> >> Hi folks
> >>
> >> I was doing some performance testing using the built-in Kafka
> >> performance tester, and it seems like it sends messages of size n bytes,
> >> but with all bytes having the value 0x0. Is that correct? Reading the
> >> source seemed to indicate that too, but I'm not a Scala developer, so I
> >> could be wrong.
> >>
> >> Would this affect the performance compared to a real-world scenario?
> >> Obviously you will get very efficient compression rates, but apart from
> >> that, are there likely to be optimisations carried out anywhere between
> >> the JVM and the network card that won't hold for messages with non-zero
> >> entropy?
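> >>
> >> A quick way to see the compression effect: new Array[Byte](n) is
> >> zero-initialised on the JVM, and gzip collapses it to almost nothing
> >> (an untested Scala snippet, just to illustrate):
> >>
> >> import java.io.ByteArrayOutputStream
> >> import java.util.zip.GZIPOutputStream
> >>
> >> val payload = new Array[Byte](1000) // every byte is 0x0
> >> val out = new ByteArrayOutputStream()
> >> val gzip = new GZIPOutputStream(out)
> >> gzip.write(payload)
> >> gzip.close()
> >> println(s"${payload.length} bytes -> ${out.size()} bytes gzipped")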
> >>
> >> We're going to test this against our production workload, so it's not a
> >> big deal for us, but I wondered if this could give others skewed results?
> >>
> >> ---
> >> Daniel
>
