FYI, there's an outstanding patch for getting some JMH benchmarking set up:

https://github.com/apache/kafka/pull/1712

I haven't found time to review it (and don't really know JMH well anyway), but it might be worth getting that landed so we can use it for this as well.
-Ewen

On Wed, Jan 11, 2017 at 6:35 AM, Dongjin Lee <dong...@apache.org> wrote:

> Hi Ismael,
>
> 1. In the case of compression output, yes, lz4 is producing the smaller
> output than gzip. In fact, my benchmark was inspired by
> MessageCompressionTest#testCompressSize unit test and the result is
> same - 396 bytes for gzip and 387 bytes for lz4.
> 2. I agree that my (former) approach can result in unreliable output.
> However, I am experiencing difficulties on how to acquire the benchmark
> metrics from Kafka. For you recommended JMH, I just started to google for
> it. If possible, could you give any example on how to use JMH against
> Kafka? If it is the case, it will be a great help.
>
> Regards,
> Dongjin
>
> _____________________________
> From: Ismael Juma <ism...@juma.me.uk>
> Sent: Wednesday, January 11, 2017 7:33 PM
> Subject: Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression
> To: <dev@kafka.apache.org>
>
> Thanks Dongjin. I highly recommend using JMH for the benchmark, the
> existing one has a few problems that could result in unreliable results.
> Also, it's a bit surprising that LZ4 is producing smaller output than gzip.
> Is that right?
>
> Ismael
>
> On Wed, Jan 11, 2017 at 10:20 AM, Dongjin Lee <dong...@apache.org> wrote:
>
> > Ismael,
> >
> > I pushed the benchmark code I used, with some updates (iteration: 20 ->
> > 1000). I also updated the KIP page with the updated benchmark results.
> > Please take a review when you are free. The attached screenshot shows how
> > to run the benchmarker.
> >
> > Thanks,
> > Dongjin
> >
> > On Tue, Jan 10, 2017 at 8:03 PM, Dongjin Lee <dong...@apache.org> wrote:
> >
> >> Ismael,
> >>
> >> I see. Then, I will share the benchmark code I used by tomorrow. Thanks
> >> for your guidance.
> >>
> >> Best,
> >> Dongjin
> >>
> >> -----
> >>
> >> Dongjin Lee
> >>
> >> Software developer in Line+.
> >> So interested in massive-scale machine learning.
> >>
> >> facebook: www.facebook.com/dongjin.lee.kr
> >> linkedin: kr.linkedin.com/in/dongjinleekr
> >> github: github.com/dongjinleekr
> >> twitter: www.twitter.com/dongjinleekr
> >>
> >> On Tue, Jan 10, 2017 at 7:24 PM +0900, "Ismael Juma" <ism...@juma.me.uk>
> >> wrote:
> >>
> >>> Dongjin,
> >>>
> >>> The KIP states:
> >>>
> >>> "I compared the compressed size and compression time of 3 1kb-sized
> >>> messages (3102 bytes in total), with the Draft-implementation of
> >>> ZStandard Compression Codec and all currently available
> >>> CompressionCodecs. All elapsed times are the average of 20 trials."
> >>>
> >>> But doesn't give any details of how this was implemented. Is the source
> >>> code available somewhere? Micro-benchmarking in the JVM is pretty
> >>> tricky so it needs verification before numbers can be trusted. A
> >>> performance test with kafka-producer-perf-test.sh would be nice to have
> >>> as well, if possible.
> >>>
> >>> Thanks,
> >>> Ismael
> >>>
> >>> On Tue, Jan 10, 2017 at 7:44 AM, Dongjin Lee wrote:
> >>>
> >>> > Ismael,
> >>> >
> >>> > 1. Is the benchmark in the KIP page not enough? You mean we need a
> >>> > whole performance test using kafka-producer-perf-test.sh?
> >>> >
> >>> > 2. It seems like no major project is relying on it currently.
> >>> > However, after reviewing the code, I concluded that at least this
> >>> > project has a good test coverage. And for the problem of upstream
> >>> > tracking - although there is no significant update on ZStandard to
> >>> > judge this problem, it seems not bad. If required, I can take
> >>> > responsibility of the tracking for this library.
> >>> >
> >>> > Thanks,
> >>> > Dongjin
> >>> >
> >>> > On Tue, Jan 10, 2017 at 7:09 AM, Ismael Juma wrote:
> >>> >
> >>> > > Thanks for posting the KIP, ZStandard looks like a nice
> >>> > > improvement over the existing compression algorithms. A couple of
> >>> > > questions:
> >>> > >
> >>> > > 1. Can you please elaborate on the details of the benchmark?
> >>> > > 2. About https://github.com/luben/zstd-jni, can we rely on it? A
> >>> > > few things to consider: are there other projects using it, does it
> >>> > > have good test coverage, are there performance tests, does it
> >>> > > track upstream closely?
> >>> > >
> >>> > > Thanks,
> >>> > > Ismael
> >>> > >
> >>> > > On Fri, Jan 6, 2017 at 2:40 AM, Dongjin Lee wrote:
> >>> > >
> >>> > > > Hi all,
> >>> > > >
> >>> > > > I've just posted a new KIP "KIP-110: Add Codec for ZStandard
> >>> > > > Compression" for discussion:
> >>> > > >
> >>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression
> >>> > > >
> >>> > > > Please have a look when you are free.
> >>> > > >
> >>> > > > Best,
> >>> > > > Dongjin