FYI, there's an outstanding patch for getting some JMH benchmarking set up:
https://github.com/apache/kafka/pull/1712
I haven't found time to review it (and don't really know JMH well anyway),
but it might be worth getting that landed so we can use it for this as well.

-Ewen
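
For readers following the thread, the warm-up concern discussed below can be illustrated without JMH. The following is only a rough sketch under stated assumptions: it is neither the KIP's benchmark nor a JMH benchmark, it uses only the JDK's java.util.zip GZIP stream (not Kafka's codecs), and the class name, payload generator, and iteration counts are all made up for illustration. A hand-rolled loop like this still misses most of what JMH handles (forking, dead-code elimination, statistics), so its numbers should not be trusted for the KIP.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

// Hypothetical class name; not part of Kafka or of the KIP's benchmark code.
class GzipBenchSketch {

    // Written on every iteration so the JIT cannot discard the compression work.
    static volatile long sink;

    // Compress the payload with the JDK's GZIP stream and return the compressed bytes.
    static byte[] gzip(byte[] payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(payload);
        }
        return out.toByteArray();
    }

    // Build a deterministic, compressible payload (assumed test data, not real Kafka messages).
    static byte[] samplePayload(int size) {
        Random random = new Random(42L);
        byte[] payload = new byte[size];
        for (int i = 0; i < size; i++) {
            payload[i] = (byte) ('a' + random.nextInt(8));
        }
        return payload;
    }

    // Run untimed warm-up iterations first, then return the mean time per call in nanoseconds.
    static double averageNanos(byte[] payload, int warmupIterations, int measuredIterations)
            throws IOException {
        for (int i = 0; i < warmupIterations; i++) {
            sink += gzip(payload).length;
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredIterations; i++) {
            sink += gzip(payload).length;
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / measuredIterations;
    }

    public static void main(String[] args) throws IOException {
        // Roughly three 1 KB messages; the exact size here is an assumption.
        byte[] payload = samplePayload(3 * 1024);
        System.out.println("input bytes: " + payload.length);
        System.out.println("gzip bytes:  " + gzip(payload).length);
        System.out.println("avg ns/op:   " + averageNanos(payload, 2000, 1000));
    }
}
```

Running it with and without the warm-up iterations (for example, passing 0 instead of 2000) should show how much the average shifts once the JIT has compiled the hot path, which is one reason a plain average over 20 trials is hard to interpret.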

On Wed, Jan 11, 2017 at 6:35 AM, Dongjin Lee <dong...@apache.org> wrote:

> Hi Ismael,
>
> 1. In the case of compression output, yes, lz4 produces smaller output
> than gzip. In fact, my benchmark was inspired by the
> MessageCompressionTest#testCompressSize unit test, and the result is the
> same: 396 bytes for gzip and 387 bytes for lz4.
> 2. I agree that my (former) approach can produce unreliable output.
> However, I am having difficulty working out how to acquire benchmark
> metrics from Kafka. As for JMH, which you recommended, I have only just
> started reading about it. If possible, could you give an example of how to
> use JMH against Kafka? That would be a great help.
>
> Regards,
> Dongjin
>
>                 _____________________________
> From: Ismael Juma <ism...@juma.me.uk>
> Sent: Wednesday, January 11, 2017 7:33 PM
> Subject: Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression
> To:  <dev@kafka.apache.org>
>
>
> Thanks Dongjin. I highly recommend using JMH for the benchmark; the
> existing one has a few problems that could result in unreliable results.
> Also, it's a bit surprising that LZ4 is producing smaller output than gzip.
> Is that right?
>
> Ismael
>
> On Wed, Jan 11, 2017 at 10:20 AM, Dongjin Lee <dong...@apache.org> wrote:
>
> > Ismael,
> >
> > I pushed the benchmark code I used, with some updates (iterations: 20 ->
> > 1000). I also updated the KIP page with the updated benchmark results.
> > Please take a look when you are free. The attached screenshot shows how
> > to run the benchmark.
> >
> > Thanks,
> > Dongjin
> >
> > On Tue, Jan 10, 2017 at 8:03 PM, Dongjin Lee <dong...@apache.org> wrote:
> >
> >> Ismael,
> >>
> >> I see. Then, I will share the benchmark code I used by tomorrow. Thanks
> >> for your guidance.
> >>
> >> Best,
> >> Dongjin
> >>
> >> -----
> >>
> >> Dongjin Lee
> >>
> >> Software developer in Line+.
> >> So interested in massive-scale machine learning.
> >>
> >> facebook: www.facebook.com/dongjin.lee.kr
> >> linkedin: kr.linkedin.com/in/dongjinleekr
> >> github: github.com/dongjinleekr
> >> twitter: www.twitter.com/dongjinleekr
> >>
> >>
> >>
> >>
> >> On Tue, Jan 10, 2017 at 7:24 PM +0900, "Ismael Juma" <ism...@juma.me.uk> wrote:
> >>
> >> Dongjin,
> >>>
> >>> The KIP states:
> >>>
> >>> "I compared the compressed size and compression time of 3 1kb-sized
> >>> messages (3102 bytes in total), with the Draft-implementation of
> >>> ZStandard Compression Codec and all currently available
> >>> CompressionCodecs. All elapsed times are the average of 20 trials."
> >>>
> >>> But it doesn't give any details of how this was implemented. Is the
> >>> source code available somewhere? Micro-benchmarking in the JVM is
> >>> pretty tricky, so it needs verification before the numbers can be
> >>> trusted. A performance test with kafka-producer-perf-test.sh would be
> >>> nice to have as well, if possible.
> >>>
> >>> Thanks,
> >>> Ismael
> >>>
> >>> On Tue, Jan 10, 2017 at 7:44 AM, Dongjin Lee  wrote:
> >>>
> >>> > Ismael,
> >>> >
> >>> > 1. Is the benchmark in the KIP page not enough? Do you mean we need a
> >>> > whole performance test using kafka-producer-perf-test.sh?
> >>> >
> >>> > 2. It seems that no major project relies on it currently. However,
> >>> > after reviewing the code, I concluded that the project at least has
> >>> > good test coverage. As for upstream tracking: although there has been
> >>> > no significant ZStandard update by which to judge this, it does not
> >>> > look bad. If required, I can take responsibility for tracking this
> >>> > library.
> >>> >
> >>> > Thanks,
> >>> > Dongjin
> >>> >
> >>> > On Tue, Jan 10, 2017 at 7:09 AM, Ismael Juma  wrote:
> >>> >
> >>> > > Thanks for posting the KIP, ZStandard looks like a nice improvement
> >>> > > over the existing compression algorithms. A couple of questions:
> >>> > >
> >>> > > 1. Can you please elaborate on the details of the benchmark?
> >>> > > 2. About https://github.com/luben/zstd-jni, can we rely on it? A few
> >>> > > things to consider: are there other projects using it, does it have
> >>> > > good test coverage, are there performance tests, does it track
> >>> > > upstream closely?
> >>> > >
> >>> > > Thanks,
> >>> > > Ismael
> >>> > >
> >>> > > On Fri, Jan 6, 2017 at 2:40 AM, Dongjin Lee  wrote:
> >>> > >
> >>> > > > Hi all,
> >>> > > >
> >>> > > > I've just posted a new KIP "KIP-110: Add Codec for ZStandard
> >>> > > > Compression" for discussion:
> >>> > > >
> >>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression
> >>> > > >
> >>> > > > Please have a look when you are free.
> >>> > > >
> >>> > > > Best,
> >>> > > > Dongjin
> >>> > > >
> >>> > > >
> >>> > >
> >>> >
> >>> >
> >>> >
> >>> >
> >>>
> >>>
> >
> >
> >
>
>
>
>
>
