Hi Ismael,

I added the linear write benchmark results to the proposal. As with the producer benchmark, the lowest compression level showed the best MB/sec in every case. I tested several configurations, but the results were almost the same.
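For anyone who wants a rough feel for the level-versus-speed trade-off without a Kafka cluster, here is a minimal sketch. It is not the benchmark from the KIP: it uses Python's standard zlib module (the algorithm behind gzip) on a made-up, repetitive payload, and the names `measure` and `make_payload` are hypothetical helpers introduced only for this illustration.

```python
import time
import zlib


def measure(data: bytes, level: int) -> tuple[float, float]:
    """Return (throughput in MB/s, compressed/original size ratio) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip check so that a faster level is never a broken one.
    assert zlib.decompress(compressed) == data
    mb_per_sec = len(data) / (1024 * 1024) / max(elapsed, 1e-9)
    return mb_per_sec, len(compressed) / len(data)


def make_payload(size: int = 1 << 20) -> bytes:
    """Assumed payload: repetitive JSON-like records, not the KIP's dataset."""
    record = b'{"user": "alice", "action": "click", "ts": 1622937600}\n'
    return (record * (size // len(record) + 1))[:size]


if __name__ == "__main__":
    payload = make_payload()
    # Compare a low, a default-like, and the highest zlib level.
    for level in (1, 6, 9):
        speed, ratio = measure(payload, level)
        print(f"level={level} {speed:8.1f} MB/s ratio={ratio:.3f}")
```

The absolute numbers depend heavily on the machine and the data, so this only hints at the direction of the effect; it says nothing about lz4 or zstd, or about broker-side behavior.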
If you have any suggestions for the benchmark, please don't hesitate to share them. I am new to running the linear write benchmark.

Best,
Dongjin

On Sun, Jun 6, 2021 at 8:20 AM Dongjin Lee <dong...@apache.org> wrote:

> Hi Ismael,
>
> Thanks for the reply.
>
> > So you're saying that reducing the buffer size didn't reduce the
> > compression rate for codecs like lz4?
>
> Of course, there were some improvements in compressed size when I tried
> the 'buffer.size' option, but the gain was not significant. I tried several
> datasets, but the result was the same. It made me so skeptical about adding
> this option, which seemed to make the configuration option complex only.
>
> In contrast, 'compression.level' showed its effectiveness immediately. It
> is why I decided to focus on the 'compression.level' in this rework.
>
> As you can see in the update KIP with the benchmark, IMHO, the true value
> of supporting the compression option may not be the compressed size or
> rate, but speed. By tweaking the compression level slightly, it showed
> great produce performance gain.
>
> Thanks,
> Dongjin
>
> On Sun, Jun 6, 2021 at 6:48 AM Ismael Juma <ism...@juma.me.uk> wrote:
>
>> Thanks Dongjin. So you're saying that reducing the buffer size didn't
>> reduce the compression rate for codecs like lz4? If so, that would suggest
>> reducing the default value, but that seems odd.
>>
>> Ismael
>>
>> On Sat, Jun 5, 2021, 9:25 AM Dongjin Lee <dong...@apache.org> wrote:
>>
>> > Hello Kafka dev,
>> >
>> > I hope to reboot the discussion of KIP-390: Support Compression Level
>> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level>.
>> > It proposes to add a new option, 'compression.level', that controls the
>> > compression level.
>> >
>> > This KIP has been submitted more than one year ago, but had been neglected
>> > for a long time. Recently I reworked it from scratch with the following
>> > differences:
>> >
>> > 1. Tested how it works with a real-world dataset. As you can see in the
>> > updated KIP, *this feature can improve the producer's message/second rate
>> > by more than 50%*, such a significant enhancement.
>> > 2. Dropped 'compression.buffer.size' option that was in the initial work.
>> > With the repeated benchmarks, I could not find any evidence this option
>> > results in meaningful differences. So I removed it.
>> >
>> > All feedback will be highly appreciated.
>> >
>> > Best,
>> > Dongjin

--
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github: github.com/dongjinleekr <https://github.com/dongjinleekr>
keybase: https://keybase.io/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr <https://kr.linkedin.com/in/dongjinleekr>
speakerdeck: speakerdeck.com/dongjin <https://speakerdeck.com/dongjin>*