Hi Ismael,

Thanks for the reply.

> So you're saying that reducing the buffer size didn't reduce the
> compression rate for codecs like lz4?

There was some improvement in compressed size when I tried the
'buffer.size' option, but the gain was not significant. I tried several
datasets and the result was the same. That made me skeptical about adding
this option, which seemed only to make the configuration more complex.

In contrast, 'compression.level' showed its effectiveness immediately. That
is why I decided to focus on 'compression.level' in this rework.

As you can see in the updated KIP with the benchmark, IMHO, the true value
of supporting the compression option may not be the compressed size or
rate, but speed. Tweaking the compression level slightly produced a large
gain in produce performance.
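
As a rough illustration of the size/speed trade-off described above, here is
a minimal sketch. It is not the KIP benchmark and does not use Kafka: it
relies only on Python's standard zlib module, and the measure_levels name and
the sample payload are hypothetical, chosen for illustration. Real numbers
depend heavily on the dataset and the codec.

```python
import time
import zlib


def measure_levels(payload: bytes, levels=(1, 6, 9)):
    """Compress payload at each zlib level.

    Returns {level: (compressed_size_in_bytes, elapsed_seconds)}.
    """
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        # Sanity check: the round trip must reproduce the input.
        assert zlib.decompress(compressed) == payload
        results[level] = (len(compressed), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample payload; substitute a real dataset for meaningful numbers.
    sample = b'{"user": "alice", "action": "click", "ts": 1622900000}\n' * 20000
    for level, (size, seconds) in measure_levels(sample).items():
        print(f"level={level} size={size} bytes time={seconds * 1000:.2f} ms")
```

Running it on a representative dataset shows how much time each level costs
for how many bytes saved, which is the comparison that matters when choosing
a level.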

Thanks,
Dongjin


On Sun, Jun 6, 2021 at 6:48 AM Ismael Juma <ism...@juma.me.uk> wrote:

> Thanks Dongjin. So you're saying that reducing the buffer size didn't
> reduce the compression rate for codecs like lz4? If so, that would suggest
> reducing the default value, but that seems odd.
>
> Ismael
>
> On Sat, Jun 5, 2021, 9:25 AM Dongjin Lee <dong...@apache.org> wrote:
>
> > Hello Kafka dev,
> >
> > I hope to reboot the discussion of KIP-390: Support Compression Level
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level>.
> > It proposes to add a new option, 'compression.level', that controls the
> > compression level.
> >
> > This KIP was submitted more than a year ago but had been neglected
> > for a long time. Recently I reworked it from scratch with the following
> > differences:
> >
> > 1. Tested how it works with a real-world dataset. As you can see in the
> > updated KIP, *this feature can improve the producer's message/second rate
> > by more than 50%*, which is a significant enhancement.
> > 2. Dropped the 'compression.buffer.size' option that was in the initial
> > work. In repeated benchmarks, I could not find any evidence that this
> > option makes a meaningful difference, so I removed it.
> >
> > All feedback will be highly appreciated.
> >
> > Best,
> > Dongjin
>


-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*



github: https://github.com/dongjinleekr
keybase: https://keybase.io/dongjinleekr
linkedin: https://kr.linkedin.com/in/dongjinleekr
speakerdeck: https://speakerdeck.com/dongjin
