Thanks!

Ismael

On Mon, Jan 21, 2019 at 6:02 AM Dongjin Lee <dong...@apache.org> wrote:

> Hi Ismael,
>
> After reviewing `LogValidator#validateMessagesAndAssignOffsetsCompressed`,
> yes, you are right. If source codec and target codec is identical and the
> magic is above 0, the broker can do an in-place assignment, without
> recompressing. Sorry for my misunderstanding.
>
> Since we don't need `compression.[gzip,lz4,zstd].level` and
> `compression.[gzip,snappy,lz4].buffer.size` anymore, I will revert those
> changes and update the KIP with the recent discussions. I will complete it
> in 48 hours from now.
>
> Thanks,
> Dongjin
>
> On Mon, Jan 21, 2019 at 4:18 PM Dongjin Lee <dong...@apache.org> wrote:
>
> > I see. Let me have a check. If not needed, of course, we don't have to
> > waste on configuration options.
> >
> > Since the KIP deadline is imminent, I just opened the voting thread.
> Let's
> > continue the discussion here.
> >
> > Best,
> > Dongjin
> >
> > On Mon, Jan 21, 2019 at 1:30 AM Ismael Juma <ism...@juma.me.uk> wrote:
> >
> >> Hi Dongjin,
> >>
> >> When the compression type is "producer", then the broker doesn't
> >> recompress
> >> though. Thinking about it some more, there are some uncommon cases where
> >> recompression does happen (the old (and hopefully hardly used by now)
> >> message format == 0 and some edge cases), so it is a good point you
> >> raised.
> >>
> >> It's a bit unfortunate to add so many topic configs for cases that
> >> probably
> >> don't matter. That is, if you are using "producer" compression, you
> >> probably don't need to configure these settings and can live with the
> >> defaults. Perhaps we should only support the topic config for the cases
> >> where you are actually recompressing in the broker.
> >>
> >> What do you think? I'd be interested in other people's thoughts too.
> >>
> >> Ismael
> >>
> >> On Sun, Jan 20, 2019 at 2:14 AM Dongjin Lee <dong...@apache.org> wrote:
> >>
> >> > Hi Ismael,
> >> >
> >> > It seems like it needs more explanation. Here is the detailed
> reasoning.
> >> >
> >> > You know, topic and broker's 'compression.type' allows 'uncompressed',
> >> > 'producer' with standard codecs (i.e., gzip, snappy, lz4, zstd.) And
> >> this
> >> > configuration is used by the broker in the re-compressing process
> after
> >> > offset assignment. After this feature, the new configs,
> >> 'compression.level'
> >> > and 'compression.buffer.size', also will be used in this process.
> >> >
> >> > The problem arises when given topic's compression type (whether it was
> >> > inherited from broker's configuration or explicitly set) is
> 'producer.'
> >> > With this setting, the compression codec to be used is decided by the
> >> > producer client. Since there is no way to restore the compression
> level
> >> and
> >> > buffer size from the message, we can take the following strategies:
> >> >
> >> > 1. Just use given 'compression.level' and 'compression.buffer.size'
> >> > settings.
> >> >
> >> > It will cause so many errors. Let's imagine the case of topic's
> >> > configuration is { compression.type=producer, compression.level=10,
> >> > compression.buffer.size=8192 }. In this case, all producers with gzip
> or
> >> > lz4 compressed messages will result in an error. (gzip doesn't allow
> >> > compression level 10, and lz4 also for a buffer size of 8192.)
> >> >
> >> > 2. Extend the message format to include compression configurations.
> >> >
> >> > With this strategy, we need to change the message format - it's a too
> >> big
> >> > change.
> >> >
> >> > 3. If topic's compression.type is 'producer', use the default
> >> configuration
> >> > for the given codec.
> >> >
> >> > With this strategy, allowing fine-grained compression configuration is
> >> > meaningless.
> >> >
> >> > For the above reasons, I think the only alternative is providing
> options
> >> > that can be used when the topic's 'compression.type' is 'producer.' In
> >> > other words, adding compression.[gzip, lz4, zstd].level and
> >> > compression.[gzip.snappy.lz4].buffer.size options - and it is what I
> >> did in
> >> > the last modification.
> >> >
> >> > (wait, the reasoning above should be included in the KIP in the
> rejected
> >> > alternatives section, isn't it?)
> >> >
> >> > Thanks,
> >> > Dongjin
> >> >
> >> > On Sun, Jan 20, 2019 at 2:33 AM Ismael Juma <ism...@juma.me.uk>
> wrote:
> >> >
> >> > > Hi Dongjin,
> >> > >
> >> > > For topic level, you can only have a single compression type so the
> >> way
> >> > it
> >> > > was before was fine, right? The point you raise is how to set broker
> >> > > defaults that vary depending on the compression type, correct?
> >> > >
> >> > > Ismael
> >> > >
> >> > > On Mon, Jan 14, 2019 at 10:18 AM Dongjin Lee <dong...@apache.org>
> >> wrote:
> >> > >
> >> > > > I just realized that there was a missing hole in the KIP, so I
> fixed
> >> > it.
> >> > > > The draft implementation will be updated soon.
> >> > > >
> >> > > > In short, the proposed change did not regard the case of the topic
> >> or
> >> > > > broker's 'compression.type' is 'producer'; in this case, the
> broker
> >> has
> >> > > to
> >> > > > handle all kinds of the supported codec. So I added additional
> >> options
> >> > > > (compression.[gzip,snappy,lz4, zstd].level,
> >> > compression.[gzip,snappy,lz4,
> >> > > > zstd].buffer.size) with handling routines.
> >> > > >
> >> > > > Please have a look when you are free.
> >> > > >
> >> > > > Thanks,
> >> > > > Dongjin
> >> > > >
> >> > > > On Mon, Jan 7, 2019 at 6:23 AM Dongjin Lee <dong...@apache.org>
> >> wrote:
> >> > > >
> >> > > > > Thanks for pointing out Ismael. It's now updated.
> >> > > > >
> >> > > > > Best,
> >> > > > > Dongjin
> >> > > > >
> >> > > > > On Mon, Jan 7, 2019 at 4:36 AM Ismael Juma <isma...@gmail.com>
> >> > wrote:
> >> > > > >
> >> > > > >> Thanks Dongjin. One minor suggestion: we should mention that
> the
> >> > > broker
> >> > > > >> side configs are also topic configs (i.e. can be set for a
> given
> >> > > topic).
> >> > > > >>
> >> > > > >> Ismael
> >> > > > >>
> >> > > > >> On Sun, Jan 6, 2019, 10:37 AM Dongjin Lee <dong...@apache.org
> >> > wrote:
> >> > > > >>
> >> > > > >> > Happy new year.
> >> > > > >> >
> >> > > > >> > I just updated the title and contents of KIP and Jira issue,
> >> with
> >> > > > >> updated
> >> > > > >> > draft implementation. Now both of compression level and
> buffer
> >> > size
> >> > > > >> options
> >> > > > >> > are available to producer and broker configuration. You can
> >> check
> >> > > the
> >> > > > >> > updated KIP from modified URL:
> >> > > > >> >
> >> > > > >> >
> >> > > > >> >
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Allow+fine-grained+configuration+for+compression
> >> > > > >> >
> >> > > > >> > Please have a look when you are free.
> >> > > > >> >
> >> > > > >> > Thanks,
> >> > > > >> > Dongjin
> >> > > > >> >
> >> > > > >> > On Mon, Dec 3, 2018 at 12:50 AM Ismael Juma <
> isma...@gmail.com
> >> >
> >> > > > wrote:
> >> > > > >> >
> >> > > > >> > > The updated title sounds fine to me.
> >> > > > >> > >
> >> > > > >> > > Ismael
> >> > > > >> > >
> >> > > > >> > > On Sun, Dec 2, 2018, 5:25 AM Dongjin Lee <
> dong...@apache.org
> >> > > wrote:
> >> > > > >> > >
> >> > > > >> > > > Hi Ismael,
> >> > > > >> > > >
> >> > > > >> > > > Got it. Your direction is perfectly reasonable. I am now
> >> > > updating
> >> > > > >> the
> >> > > > >> > KIP
> >> > > > >> > > > document and the implementation.
> >> > > > >> > > >
> >> > > > >> > > > By allowing the buffer/block size to be configurable, it
> >> would
> >> > > be
> >> > > > >> > better
> >> > > > >> > > to
> >> > > > >> > > > update the title of the KIP like 'Allow fine-grained
> >> > > configuration
> >> > > > >> for
> >> > > > >> > > > compression'. Is that right?
> >> > > > >> > > >
> >> > > > >> > > > @Other committers:
> >> > > > >> > > >
> >> > > > >> > > > Is there any other opinion on allowing the buffer/block
> >> size
> >> > to
> >> > > be
> >> > > > >> > > > configurable?
> >> > > > >> > > >
> >> > > > >> > > > Thanks,
> >> > > > >> > > > Dongjin
> >> > > > >> > > >
> >> > > > >> > > > On Thu, Nov 29, 2018 at 1:45 AM Ismael Juma <
> >> > ism...@juma.me.uk>
> >> > > > >> wrote:
> >> > > > >> > > >
> >> > > > >> > > > > Hi Dongjin,
> >> > > > >> > > > >
> >> > > > >> > > > > To clarify, I mean a broker topic config with regards
> to
> >> > point
> >> > > > 1.
> >> > > > >> As
> >> > > > >> > > you
> >> > > > >> > > > > know, compression can be done by the producer and/or by
> >> the
> >> > > > >> broker.
> >> > > > >> > The
> >> > > > >> > > > > default is for the broker to just use whatever
> >> compression
> >> > was
> >> > > > >> used
> >> > > > >> > by
> >> > > > >> > > > the
> >> > > > >> > > > > producer, but this can be changed by the user on a per
> >> topic
> >> > > > >> basis.
> >> > > > >> > It
> >> > > > >> > > > > seems like it would make sense for the configs to be .
> >> > > > consistent
> >> > > > >> > > between
> >> > > > >> > > > > producer and broker.
> >> > > > >> > > > >
> >> > > > >> > > > > For point 2, I haven't looked at the implementation,
> but
> >> we
> >> > > > could
> >> > > > >> do
> >> > > > >> > it
> >> > > > >> > > > in
> >> > > > >> > > > > the `CompressionType` enum by invoking the right
> >> constructor
> >> > > or
> >> > > > >> > > > retrieving
> >> > > > >> > > > > the default value via a constant (if defined). That's
> an
> >> > > > >> > implementation
> >> > > > >> > > > > detail and can be discussed in the PR. The more general
> >> > point
> >> > > is
> >> > > > >> to
> >> > > > >> > > rely
> >> > > > >> > > > on
> >> > > > >> > > > > the library defaults instead of choosing one ourselves.
> >> > > > >> > > > >
> >> > > > >> > > > > For point 3, I'm in favour of doing that in this KIP.
> >> > > > >> > > > >
> >> > > > >> > > > > Ismael
> >> > > > >> > > > >
> >> > > > >> > > > > On Wed, Nov 28, 2018 at 7:01 AM Dongjin Lee <
> >> > > dong...@apache.org
> >> > > > >
> >> > > > >> > > wrote:
> >> > > > >> > > > >
> >> > > > >> > > > > > Thank you Ismael, here are the answers:
> >> > > > >> > > > > >
> >> > > > >> > > > > > *1. About topic config*
> >> > > > >> > > > > >
> >> > > > >> > > > > > After some consideration, I concluded that topic
> config
> >> > > > doesn't
> >> > > > >> > need
> >> > > > >> > > to
> >> > > > >> > > > > > support compression.level. Here is why: since the
> >> > > compression
> >> > > > is
> >> > > > >> > > > > conducted
> >> > > > >> > > > > > by the client, the one who can select the best
> >> compression
> >> > > > >> level is
> >> > > > >> > > the
> >> > > > >> > > > > > client itself. Let us assume that the compression
> >> level is
> >> > > set
> >> > > > >> at
> >> > > > >> > the
> >> > > > >> > > > > topic
> >> > > > >> > > > > > config level. In that case, there is a possibility
> that
> >> > the
> >> > > > >> > > compression
> >> > > > >> > > > > > level is not optimal for some producers. Actually,
> >> Kafka's
> >> > > go
> >> > > > >> > client
> >> > > > >> > > > also
> >> > > > >> > > > > > supports compression level functionality for the
> >> producer
> >> > > > config
> >> > > > >> > > only.
> >> > > > >> > > > > > <
> >> https://github.com/Shopify/sarama/blob/master/config.go>
> >> > > > >> (wait,
> >> > > > >> > do
> >> > > > >> > > we
> >> > > > >> > > > > > need
> >> > > > >> > > > > > to add this reasoning in the KIP, rejected
> alternatives
> >> > > > >> section?)
> >> > > > >> > > > > >
> >> > > > >> > > > > > *2. About default level*
> >> > > > >> > > > > >
> >> > > > >> > > > > > As of current draft implementation, the default
> >> > compression
> >> > > is
> >> > > > >> set
> >> > > > >> > on
> >> > > > >> > > > the
> >> > > > >> > > > > > CompressionType enum. Of course, changing this
> strategy
> >> > into
> >> > > > >> > relying
> >> > > > >> > > > on a
> >> > > > >> > > > > > method from the library to pick the default
> compression
> >> > > level
> >> > > > >> seems
> >> > > > >> > > > > > possible, like `GZIPBlockOutputStream` does. In this
> >> case,
> >> > > we
> >> > > > >> need
> >> > > > >> > to
> >> > > > >> > > > add
> >> > > > >> > > > > > similar wrapper class for zstd and modify lz4 the
> >> wrapper
> >> > > > also.
> >> > > > >> Add
> >> > > > >> > > to
> >> > > > >> > > > > > this, it seems like we need to explicitly state that
> we
> >> > > follow
> >> > > > >> the
> >> > > > >> > > > > default
> >> > > > >> > > > > > compression level of the codec in the documentation.
> Is
> >> > this
> >> > > > >> what
> >> > > > >> > you
> >> > > > >> > > > > > intended?
> >> > > > >> > > > > >
> >> > > > >> > > > > > *3. Whether to allow the buffer/block size to be
> >> > > configurable*
> >> > > > >> > > > > >
> >> > > > >> > > > > > Well, As of current draft implementation, the lz4
> >> level is
> >> > > > >> > > implemented
> >> > > > >> > > > as
> >> > > > >> > > > > > block size; this is caused by my misunderstanding on
> >> lz4.
> >> > > > After
> >> > > > >> > > > reviewing
> >> > > > >> > > > > > lz4 today, I found that it also supports compression
> >> level
> >> > > of
> >> > > > >> 1~16
> >> > > > >> > > > > > (default: 1), not block size. I will fix it in this
> >> > weekend
> >> > > by
> >> > > > >> > > updating
> >> > > > >> > > > > the
> >> > > > >> > > > > > wrapper class.
> >> > > > >> > > > > >
> >> > > > >> > > > > > For the problem of the buffer/block size, I have no
> >> strong
> >> > > > >> opinion.
> >> > > > >> > > If
> >> > > > >> > > > > the
> >> > > > >> > > > > > community needs it, I will do it all together. How do
> >> you
> >> > > > think?
> >> > > > >> > > > > >
> >> > > > >> > > > > > In short, it seems like I need to update the KIP
> >> document
> >> > > for
> >> > > > >> issue
> >> > > > >> > > #1
> >> > > > >> > > > > and
> >> > > > >> > > > > > update the compression wrapper for issue #2, #3. Is
> >> this
> >> > > okay?
> >> > > > >> > > > > >
> >> > > > >> > > > > > Thanks,
> >> > > > >> > > > > > Dongjin
> >> > > > >> > > > > >
> >> > > > >> > > > > > On Wed, Nov 28, 2018 at 12:34 AM Ismael Juma <
> >> > > > isma...@gmail.com
> >> > > > >> >
> >> > > > >> > > > wrote:
> >> > > > >> > > > > >
> >> > > > >> > > > > > >  Thanks for the KIP, this is helpful. A few
> >> questions:
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > 1. Have we considered whether we want to allow a
> >> similar
> >> > > > topic
> >> > > > >> > > > config?
> >> > > > >> > > > > > > 2. Can we rely on a method from the library to pick
> >> the
> >> > > > >> default
> >> > > > >> > > > > > compression
> >> > > > >> > > > > > > level if compression.level is not set? We do it for
> >> gzip
> >> > > and
> >> > > > >> it
> >> > > > >> > > would
> >> > > > >> > > > > > seem
> >> > > > >> > > > > > > reasonable to do something similar for the other
> >> > > compression
> >> > > > >> > > > libraries.
> >> > > > >> > > > > > > 3. Do we want to allow the buffer/block size to be
> >> > > > >> configurable?
> >> > > > >> > > This
> >> > > > >> > > > > has
> >> > > > >> > > > > > > an impact on memory usage and people may want to
> >> trade
> >> > > > >> > compression
> >> > > > >> > > > for
> >> > > > >> > > > > > > less/more memory in some cases. For example, the
> >> default
> >> > > for
> >> > > > >> LZ4
> >> > > > >> > is
> >> > > > >> > > > > 64KB
> >> > > > >> > > > > > > which is a bit high.
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > Ismael
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > On Sun, Nov 18, 2018, 2:07 PM Dongjin Lee <
> >> > > > dong...@apache.org
> >> > > > >> > > wrote:
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > > Hello dev,
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > I hope to initiate the discussion of KIP-390: Add
> >> > > producer
> >> > > > >> > option
> >> > > > >> > > > to
> >> > > > >> > > > > > > adjust
> >> > > > >> > > > > > > > compression level
> >> > > > >> > > > > > > > <
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > >
> >> > > > >> > > > > >
> >> > > > >> > > > >
> >> > > > >> > > >
> >> > > > >> > >
> >> > > > >> >
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Add+producer+option+to+adjust+compression+level
> >> > > > >> > > > > > > > >.
> >> > > > >> > > > > > > > All feedbacks will be highly appreciated.
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > Best,
> >> > > > >> > > > > > > > Dongjin
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > --
> >> > > > >> > > > > > > > *Dongjin Lee*
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > *A hitchhiker in the mathematical world.*
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > *github:  <http://goog_969573159/>
> >> > > github.com/dongjinleekr
> >> > > > >> > > > > > > > <http://github.com/dongjinleekr>linkedin:
> >> > > > >> > > > > > > kr.linkedin.com/in/dongjinleekr
> >> > > > >> > > > > > > > <http://kr.linkedin.com/in/dongjinleekr
> >> >slideshare:
> >> > > > >> > > > > > > > www.slideshare.net/dongjinleekr
> >> > > > >> > > > > > > > <http://www.slideshare.net/dongjinleekr>*
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > >
> >> > > > >> > > > > >
> >> > > > >> > > > > >
> >> > > > >> > > > > > --
> >> > > > >> > > > > > *Dongjin Lee*
> >> > > > >> > > > > >
> >> > > > >> > > > > > *A hitchhiker in the mathematical world.*
> >> > > > >> > > > > > *github:  <http://goog_969573159/>
> >> github.com/dongjinleekr
> >> > > > >> > > > > > <https://github.com/dongjinleekr>linkedin:
> >> > > > >> > > > > kr.linkedin.com/in/dongjinleekr
> >> > > > >> > > > > > <https://kr.linkedin.com/in/dongjinleekr
> >speakerdeck:
> >> > > > >> > > > > > speakerdeck.com/dongjin
> >> > > > >> > > > > > <https://speakerdeck.com/dongjin>*
> >> > > > >> > > > > >
> >> > > > >> > > > >
> >> > > > >> > > >
> >> > > > >> > > >
> >> > > > >> > > > --
> >> > > > >> > > > *Dongjin Lee*
> >> > > > >> > > >
> >> > > > >> > > > *A hitchhiker in the mathematical world.*
> >> > > > >> > > > *github:  <http://goog_969573159/>
> github.com/dongjinleekr
> >> > > > >> > > > <https://github.com/dongjinleekr>linkedin:
> >> > > > >> > > kr.linkedin.com/in/dongjinleekr
> >> > > > >> > > > <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
> >> > > > >> > > > speakerdeck.com/dongjin
> >> > > > >> > > > <https://speakerdeck.com/dongjin>*
> >> > > > >> > > >
> >> > > > >> > >
> >> > > > >> >
> >> > > > >> >
> >> > > > >> > --
> >> > > > >> > *Dongjin Lee*
> >> > > > >> >
> >> > > > >> > *A hitchhiker in the mathematical world.*
> >> > > > >> > *github:  <http://goog_969573159/>github.com/dongjinleekr
> >> > > > >> > <https://github.com/dongjinleekr>linkedin:
> >> > > > >> kr.linkedin.com/in/dongjinleekr
> >> > > > >> > <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
> >> > > > >> > speakerdeck.com/dongjin
> >> > > > >> > <https://speakerdeck.com/dongjin>*
> >> > > > >> >
> >> > > > >>
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > *Dongjin Lee*
> >> > > > >
> >> > > > > *A hitchhiker in the mathematical world.*
> >> > > > > *github:  <http://goog_969573159/>github.com/dongjinleekr
> >> > > > > <https://github.com/dongjinleekr>linkedin:
> >> > > > kr.linkedin.com/in/dongjinleekr
> >> > > > > <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
> >> > > > speakerdeck.com/dongjin
> >> > > > > <https://speakerdeck.com/dongjin>*
> >> > > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > *Dongjin Lee*
> >> > > >
> >> > > > *A hitchhiker in the mathematical world.*
> >> > > > *github:  <http://goog_969573159/>github.com/dongjinleekr
> >> > > > <https://github.com/dongjinleekr>linkedin:
> >> > > kr.linkedin.com/in/dongjinleekr
> >> > > > <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
> >> > > > speakerdeck.com/dongjin
> >> > > > <https://speakerdeck.com/dongjin>*
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> > *Dongjin Lee*
> >> >
> >> > *A hitchhiker in the mathematical world.*
> >> > *github:  <http://goog_969573159/>github.com/dongjinleekr
> >> > <https://github.com/dongjinleekr>linkedin:
> >> kr.linkedin.com/in/dongjinleekr
> >> > <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
> >> > speakerdeck.com/dongjin
> >> > <https://speakerdeck.com/dongjin>*
> >> >
> >>
> >
> >
> > --
> > *Dongjin Lee*
> >
> > *A hitchhiker in the mathematical world.*
> > *github:  <http://goog_969573159/>github.com/dongjinleekr
> > <https://github.com/dongjinleekr>linkedin:
> kr.linkedin.com/in/dongjinleekr
> > <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
> speakerdeck.com/dongjin
> > <https://speakerdeck.com/dongjin>*
> >
>
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
> *github:  <http://goog_969573159/>github.com/dongjinleekr
> <https://github.com/dongjinleekr>linkedin: kr.linkedin.com/in/dongjinleekr
> <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
> speakerdeck.com/dongjin
> <https://speakerdeck.com/dongjin>*
>

Reply via email to