Btw, I am OK with doing compression level first, but I don't want to rule
out the buffer size change without understanding it better.

Ismael
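
A minimal sketch of the block-size comparison suggested in the quoted message
below (1 KB vs 32 KB). This is an illustration, not the benchmark from the KIP.
It assumes zlib as a stand-in because Snappy is not in the Python standard
library, and it assumes a synthetic, repetitive dataset. The helper name
`compressed_ratio` is hypothetical, and the numbers it prints should not be
read as Snappy or Kafka results.

```python
import zlib


def compressed_ratio(data: bytes, block_size: int, level: int = 6) -> float:
    """Compress `data` in independent blocks; return compressed/original size."""
    total = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Each block is compressed on its own, so matches cannot span blocks.
        total += len(zlib.compress(block, level))
    return total / len(data)


# Synthetic JSON-like records (an assumption; real topic data will differ).
record = b'{"user_id": %d, "event": "page_view", "path": "/products/%d"}\n'
data = b"".join(record % (i, i % 50) for i in range(20000))

for block_size in (1024, 32 * 1024):
    ratio = compressed_ratio(data, block_size)
    print(f"block size {block_size:>6} B: compressed/original = {ratio:.3f}")
```

If the two ratios come out close on a representative dataset, the larger block
buys little. If they differ a lot, the block size matters for that data.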

On Tue, Jun 8, 2021 at 8:33 AM Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Dongjin,
>
> I was thinking of a simple test: Snappy with 1 KB block size vs 32 KB
> block size. If the compression rate is similar for both, then it seems very
> wasteful to use 32 KB. I suspect you will see a significant difference
> though.
>
> Ismael
>
> On Tue, Jun 8, 2021 at 8:27 AM Dongjin Lee <dong...@apache.org> wrote:
>
>> Hi Ismael,
>>
>> I added the linear write benchmark result to the proposal. As in the
>> producer benchmark, the lowest compression level showed the best MB/sec in
>> every case. I tested several configurations, but the results were almost
>> the same.
>>
>> If you have any suggestions for the benchmark, don't hesitate to share
>> them. I am new to running the linear write benchmark.
>>
>> Best,
>> Dongjin
>>
>> On Sun, Jun 6, 2021 at 8:20 AM Dongjin Lee <dong...@apache.org> wrote:
>>
>> > Hi Ismael,
>> >
>> > Thanks for the reply.
>> >
>> > > So you're saying that reducing the buffer size didn't reduce the
>> > > compression rate for codecs like lz4?
>> >
>> > Of course, there were some improvements in compressed size when I tried
>> > the 'buffer.size' option, but the gain was not significant. I tried
>> > several datasets, but the result was the same. It made me skeptical about
>> > adding this option, which seemed only to make the configuration more
>> > complex.
>> >
>> > In contrast, 'compression.level' showed its effectiveness immediately.
>> > That is why I decided to focus on 'compression.level' in this rework.
>> >
>> > As you can see in the updated KIP with the benchmark, IMHO, the true
>> > value of supporting the compression option may not be the compressed size
>> > or rate, but speed. A slight change in the compression level yielded a
>> > large gain in produce performance.
>> >
>> > Thanks,
>> > Dongjin
>> >
>> >
>> > On Sun, Jun 6, 2021 at 6:48 AM Ismael Juma <ism...@juma.me.uk> wrote:
>> >
>> >> Thanks Dongjin. So you're saying that reducing the buffer size didn't
>> >> reduce the compression rate for codecs like lz4? If so, that would
>> >> suggest reducing the default value, but that seems odd.
>> >>
>> >> Ismael
>> >>
>> >> On Sat, Jun 5, 2021, 9:25 AM Dongjin Lee <dong...@apache.org> wrote:
>> >>
>> >> > Hello Kafka dev,
>> >> >
>> >> > I hope to reboot the discussion of KIP-390: Support Compression Level
>> >> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level>.
>> >> > It proposes to add a new option, 'compression.level', that controls
>> >> > the compression level.
>> >> >
>> >> > This KIP was submitted more than a year ago but has been neglected
>> >> > for a long time. Recently I reworked it from scratch, with the
>> >> > following differences:
>> >> >
>> >> > 1. Tested how it works with a real-world dataset. As you can see in
>> >> > the updated KIP, *this feature can improve the producer's
>> >> > message/second rate by more than 50%*, a significant enhancement.
>> >> > 2. Dropped the 'compression.buffer.size' option that was in the
>> >> > initial work. In repeated benchmarks, I could not find any evidence
>> >> > that this option results in meaningful differences, so I removed it.
>> >> >
>> >> > All feedback will be highly appreciated.
>> >> >
>> >> > Best,
>> >> > Dongjin
>> >> >
>> >> >
>> >> > --
>> >> > *Dongjin Lee*
>> >> >
>> >> > *A hitchhiker in the mathematical world.*
>> >> >
>> >> > github: <https://github.com/dongjinleekr>
>> >> > keybase: <https://keybase.io/dongjinleekr>
>> >> > linkedin: <https://kr.linkedin.com/in/dongjinleekr>
>> >> > speakerdeck: <https://speakerdeck.com/dongjin>
>> >> >
>> >>
>> >
>> >
>> >
>>
>>
>>
>
