Thanks! We could move that to a "Future Work" section instead of "Rejected Alternatives" if the results are inconclusive.

Ismael

On Wed, Jun 9, 2021 at 8:09 AM Dongjin Lee <dong...@apache.org> wrote:

> I am OK with doing compression level first, but I don't want to rule out
> the buffer size change without understanding better.
>
> I see. I am now retrying buffer size configuration & benchmark. As soon as
> I get a promising result, I will update the KIP.
>
> Thanks,
> Dongjin
>
> On Wed, Jun 9, 2021 at 12:36 AM Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Btw, I am OK with doing compression level first, but I don't want to rule
> > out the buffer size change without understanding better.
> >
> > Ismael
> >
> > On Tue, Jun 8, 2021 at 8:33 AM Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Hi Dongjin,
> > >
> > > I was thinking of a simple test: Snappy with 1 KB block size vs 32 KB
> > > block size. If the compression rate is similar for both, then it seems
> > > very wasteful to use 32 KB. I suspect you will see a significant
> > > difference though.
> > >
> > > Ismael
> > >
> > > On Tue, Jun 8, 2021 at 8:27 AM Dongjin Lee <dong...@apache.org> wrote:
> > >
> > >> Hi Ismael,
> > >>
> > >> I added the linear write benchmark result to the proposal. Like the
> > >> producer benchmark, the least compression level showed the best MB/sec
> > >> for any case. I tested several configurations, but the result was
> > >> almost the same.
> > >>
> > >> If you have any proposals for the benchmark, don't hesitate to give me
> > >> a suggestion. I am a newbie to run the linear write benchmark.
> > >>
> > >> Best,
> > >> Dongjin
> > >>
> > >> On Sun, Jun 6, 2021 at 8:20 AM Dongjin Lee <dong...@apache.org> wrote:
> > >>
> > >> > Hi Ismael,
> > >> >
> > >> > Thanks for the reply.
> > >> >
> > >> > > So you're saying that reducing the buffer size didn't reduce the
> > >> > > compression rate for codecs like lz4?
> > >> >
> > >> > Of course, there were some improvements in compressed size when I
> > >> > tried the 'buffer.size' option, but the gain was not significant. I
> > >> > tried several datasets, but the result was the same. It made me so
> > >> > skeptical about adding this option, which seemed to make the
> > >> > configuration option complex only.
> > >> >
> > >> > In contrast, 'compression.level' showed its effectiveness
> > >> > immediately. It is why I decided to focus on the 'compression.level'
> > >> > in this rework.
> > >> >
> > >> > As you can see in the update KIP with the benchmark, IMHO, the true
> > >> > value of supporting the compression option may not be the compressed
> > >> > size or rate, but speed. By tweaking the compression level slightly,
> > >> > it showed great produce performance gain.
> > >> >
> > >> > Thanks,
> > >> > Dongjin
> > >> >
> > >> > On Sun, Jun 6, 2021 at 6:48 AM Ismael Juma <ism...@juma.me.uk> wrote:
> > >> >
> > >> >> Thanks Dongjin. So you're saying that reducing the buffer size
> > >> >> didn't reduce the compression rate for codecs like lz4? If so, that
> > >> >> would suggest reducing the default value, but that seems odd.
> > >> >>
> > >> >> Ismael
> > >> >>
> > >> >> On Sat, Jun 5, 2021, 9:25 AM Dongjin Lee <dong...@apache.org> wrote:
> > >> >>
> > >> >> > Hello Kafka dev,
> > >> >> >
> > >> >> > I hope to reboot the discussion of KIP-390: Support Compression Level
> > >> >> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level>.
> > >> >> > It proposes to add a new option, 'compression.level', that controls
> > >> >> > the compression level.
> > >> >> >
> > >> >> > This KIP has been submitted more than one year ago, but had been
> > >> >> > neglected for a long time. Recently I reworked it from scratch with
> > >> >> > the following differences:
> > >> >> >
> > >> >> > 1. Tested how it works with a real-world dataset. As you can see in
> > >> >> > the updated KIP, *this feature can improve the producer's
> > >> >> > message/second rate by more than 50%*, such a significant enhancement.
> > >> >> > 2. Dropped 'compression.buffer.size' option that was in the initial
> > >> >> > work. With the repeated benchmarks, I could not find any evidence
> > >> >> > this option results in meaningful differences. So I removed it.
> > >> >> >
> > >> >> > All feedback will be highly appreciated.
> > >> >> >
> > >> >> > Best,
> > >> >> > Dongjin
> > >> >> >
> > >> >> > --
> > >> >> > *Dongjin Lee*
> > >> >> >
> > >> >> > *A hitchhiker in the mathematical world.*
> > >> >> >
> > >> >> > github: github.com/dongjinleekr <https://github.com/dongjinleekr>
> > >> >> > keybase: https://keybase.io/dongjinleekr
> > >> >> > linkedin: kr.linkedin.com/in/dongjinleekr <https://kr.linkedin.com/in/dongjinleekr>
> > >> >> > speakerdeck: speakerdeck.com/dongjin <https://speakerdeck.com/dongjin>
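
For reference, the block-size comparison suggested in the thread (1 KB vs 32 KB) could be approximated along these lines. This is only a rough sketch under stated assumptions: it uses zlib from the Python standard library as a stand-in for Snappy, a synthetic JSON-like payload instead of a real-world dataset, and hypothetical helper names (`compressed_ratio`, `make_sample`) that are not part of Kafka or the KIP. It measures compressed size only, not throughput.

```python
import zlib


def compressed_ratio(data: bytes, block_size: int, level: int = 6) -> float:
    """Compress `data` in independent blocks; return compressed/original size."""
    if not data:
        raise ValueError("data must be non-empty")
    compressed = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Each block is compressed on its own, so matches cannot cross
        # block boundaries. zlib stands in for Snappy here (assumption).
        compressed += len(zlib.compress(block, level))
    return compressed / len(data)


def make_sample(records: int = 2000) -> bytes:
    """Build a synthetic, JSON-like payload (an assumption, not the KIP's data)."""
    lines = [
        f'{{"id": {i}, "user": "user-{i % 50}", "status": "ok"}}\n'
        for i in range(records)
    ]
    return "".join(lines).encode("utf-8")


if __name__ == "__main__":
    sample = make_sample()
    for block_size in (1024, 32 * 1024):
        ratio = compressed_ratio(sample, block_size)
        print(f"block={block_size:>6} B  ratio={ratio:.3f}")
```

A lower ratio means better compression. If the two block sizes give nearly the same ratio on a representative dataset, the larger buffer would be hard to justify; a clear gap would support keeping it.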