You're rolling a new segment if the condition is met right? So I'm
struggling to understand the relevance of segment.ms here. Maybe an example
would help my understanding:

segment.ms=9999999
*min.cleanable.dirty.ratio=1*
max.compaction.lag.ms=1

When a duplicate message comes in, after 1ms the topic should be eligible
for compaction when the log compaction thread gets around to evaluating the
topic.

if we have
segment.ms=9999999
*min.cleanable.dirty.ratio=1*
max.compaction.lag.ms=0

When a duplicate message comes in, after 0ms the topic should be eligible
for compaction when the log compaction thread gets around to evaluating the
topic.

In both of those cases the change would mean a new segment is rolled so the
new message would be part of the compaction task. 0 and 1 are practically
the same meaning since neither is providing an actual guarantee at such low
MS settings, but effectively tying it to both the frequency of the log
cleaner running and the priority of the given topic being the highest
priority of all topics that are evaluated for cleaning on the next cycle.
You've captured that nuance with careful "skipped" wording in the KIP
here "controls
the max time interval a message/segment can be skipped for log compaction".

How is 0 different to 1, practically? And how is it relating to segment.ms?
Is it that you're proposing to have 0 mean "use segment.ms instead?" as a
kind of third option?



On Thu, Sep 6, 2018 at 11:34 AM xiongqi wu <xiongq...@gmail.com> wrote:

> To make it clear,
> I don't against using -1 as disabled, but we need to come up with the
> meaning of "0".
> If "0" means immediate compaction, but the actual compaction lag will be
> segment.ms.
> It has longer lag than setting the value to be half of segment.ms.
> We cannot provide "0" as max compaction lag.
>
> Here are two options.
> Option 1:
> Keep 0 as disabled
> Option 2:
> -1 (disabled), 0 (max compaction lag = segment.ms), and others.
>
>
>
> Xiongqi (Wesley) Wu
>
>
> On Wed, Sep 5, 2018 at 5:49 PM Brett Rann <br...@zendesk.com.invalid>
> wrote:
>
> > -1 is consistent as "special" with these settings for example:
> >
> > log.retention.bytes
> > socket.received.buffer.bytes
> > socket.send.buffer.bytes
> > queued.max.request.bytes
> > retention.bytes
> > retention.ms
> >
> > and acks.
> >
> > Where it may mean no limit, use OS defaults, max (acks), etc. I don't see
> > much convention of 0 meaning those things.
> >
> > There are some NULLs but it seems convetion there is NULL is used where
> > there's another setting in the hierarchy that would be used instead.
> >
> >
> >
> >
> > On Thu, Sep 6, 2018 at 10:42 AM Brett Rann <br...@zendesk.com> wrote:
> >
> > > If segment.ms can't be set to 0, then we're not being consistent
> > > by using 0 for this new setting? I throw out -1 for consideration
> > > again :)
> > >
> > > On Thu, Sep 6, 2018 at 10:03 AM xiongqi wu <xiongq...@gmail.com>
> wrote:
> > >
> > >> Thanks. I will document after PR is merged.
> > >>
> > >> BTW, Kafka enforce the minimum of "segment.ms" to 1, we cannot set "
> > >> segment.ms" to 0.
> > >>
> > >> I also updated the title of this KIP.
> > >>
> > >> Xiongqi (Wesley) Wu
> > >>
> > >>
> > >> On Wed, Sep 5, 2018 at 4:34 PM Brett Rann <br...@zendesk.com.invalid>
> > >> wrote:
> > >>
> > >> > I withdraw my comments on -1 since i'm in the minority. :) Can we
> > >> > make sure 0 gets documented as meaning disabled here:
> > >> > https://kafka.apache.org/documentation/#brokerconfigs
> <https://kafka.apache.org/documentation/#brokerconfigs>
> > >> <https://kafka.apache.org/documentation/#brokerconfigs
> <https://kafka.apache.org/documentation/#brokerconfigs>> ?
> > >> > And while there it would be good if segment.ms is documented
> > >> > that 0 is disabled too. (there's some hierarchy of configs for that
> > too
> > >> > if its not set and null for others means disabled!)
> > >> >
> > >> >
> > >> > On Thu, Sep 6, 2018 at 4:44 AM xiongqi wu <xiongq...@gmail.com>
> > wrote:
> > >> >
> > >> > > If we use 0 to indicate immediate compaction, the compaction lag
> is
> > >> > > determined by segment.ms in worst case. If segment.ms is 24
> hours,
> > >> > > "immediate compaction" is a weaker guarantee than setting any
> value
> > >> less
> > >> > > than 24 hours. By the definition of "max compaction lag", we
> cannot
> > >> have
> > >> > > zero lag. So I use 0 to indicate "disable".
> > >> > >
> > >> > >
> > >> > >
> > >> > > Xiongqi (Wesley) Wu
> > >> > >
> > >> > >
> > >> > > On Wed, Sep 5, 2018 at 8:34 AM Colin McCabe <cmcc...@apache.org>
> > >> wrote:
> > >> > >
> > >> > > > On Tue, Sep 4, 2018, at 22:11, Brett Rann wrote:
> > >> > > > > > That's a fair point. We should make 0 = disable, to be
> > >> consistent
> > >> > > with
> > >> > > > > the other settings.
> > >> > > > >
> > >> > > > > -1 is used elsewhere for disable and when seeing it in a
> config
> > >> it's
> > >> > > > clear
> > >> > > > > that it's a special meaning. 0 doesn't have to mean instant,
> it
> > >> just
> > >> > > > means
> > >> > > > > as quickly as possible. I don't think 0 is intuitive for
> > disabled
> > >> and
> > >> > > it
> > >> > > > > will be confusing. I wasn't aware segment.ms=0 == disabled,
> > but I
> > >> > > think
> > >> > > > > that is also unintuitive.
> > >> > > >
> > >> > > > I think there is an argument for keeping these two
> configurations
> > >> > > > consistent, since they are so similar. I agree that 0 was an
> > >> > unfortunate
> > >> > > > choice.,
> > >> > > >
> > >> > > > best,
> > >> > > > Colin
> > >> > > >
> > >> > > > >
> > >> > > > > On Wed, Sep 5, 2018 at 11:38 AM Colin McCabe <
> > cmcc...@apache.org>
> > >> > > wrote:
> > >> > > > >
> > >> > > > > > On Tue, Sep 4, 2018, at 17:47, xiongqi wu wrote:
> > >> > > > > > > Colin,
> > >> > > > > > > Thank you for comments.
> > >> > > > > > > see my inline reply below.
> > >> > > > > > >
> > >> > > > > > > Xiongqi (Wesley) Wu
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > On Tue, Sep 4, 2018 at 5:24 PM Colin McCabe <
> > >> cmcc...@apache.org>
> > >> > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi Xiongqi,
> > >> > > > > > > >
> > >> > > > > > > > Thanks for this KIP.
> > >> > > > > > > >
> > >> > > > > > > > The name seems a bit ambiguous. Our compaction policies
> > are
> > >> > > already
> > >> > > > > > > > time-based, after all. It seems like this change is
> > focused
> > >> > > around
> > >> > > > > > adding
> > >> > > > > > > > a “max.compaction.lag.ms." Perhaps the KIP title should
> > be
> > >> > > > something
> > >> > > > > > > > like "add maximum compaction lag time"?
> > >> > > > > > > >
> > >> > > > > > > > ==========> sure. I will change the title.
> > >> > > > > > >
> > >> > > > > > > > The active segment is forced to roll when either "
> > >> > > > > > max.compaction.lag.ms"
> > >> > > > > > > > > or "segment.ms" (log.roll.ms and log.roll.hours) has
> > >> > reached.
> > >> > > > > > > >
> > >> > > > > > > > If the max.compaction.lag.ms is low, it seems like
> > segments
> > >> > will
> > >> > > > be
> > >> > > > > > > > rolled very frequently. This can be a source of problems
> > in
> > >> the
> > >> > > > > > cluster,
> > >> > > > > > > > since creating many different small log segments
> consumes
> > a
> > >> > huge
> > >> > > > > > amount of
> > >> > > > > > > > cluster resources. Therefore, I would suggest adding a
> > >> > > broker-level
> > >> > > > > > > > configuration which allows us to set a minimum value for
> > >> > > > > > > > max.compaction.lag.ms. If we let users set it on a
> > >> per-topic
> > >> > > > basis,
> > >> > > > > > > > someone could set a value of 1 ms or something, and
> cause
> > >> > chaos.
> > >> > > > > > > >
> > >> > > > > > > > =========> this applies to segment.ms as well. Today
> > users
> > >> can
> > >> > > > set "
> > >> > > > > > > segment.ms" to a very low value, and cause a frequent
> > >> rolling of
> > >> > > > active
> > >> > > > > > > segments.
> > >> > > > > >
> > >> > > > > > Hi Xiongqi,
> > >> > > > > >
> > >> > > > > > I agree that this is an existing problem with segment.ms.
> > >> However,
> > >> > > > that
> > >> > > > > > doesn't mean that we shouldn't fix it. As you noted, there
> > will
> > >> be
> > >> > > more
> > >> > > > > > interest in these topic-level retention settings as a result
> > of
> > >> > GDPR.
> > >> > > > It
> > >> > > > > > seems likely that pre-existing problems will cause more
> > trouble.
> > >> > > > > >
> > >> > > > > > The fix seems relatively straightforward here -- add a
> > >> broker-level
> > >> > > > > > minimum segment.ms that overrides per-topic minimums. We
> can
> > >> also
> > >> > > fail
> > >> > > > > > with a helpful error message when someone attempts to set an
> > >> > invalid
> > >> > > > > > configuration.
> > >> > > > > >
> > >> > > > > > > In my option, the minimum of "max.compaction.lag.ms"
> should
> > >> be
> > >> > > > > > > based on the minimum of "segment.ms". Since today the
> > >> minimum of
> > >> > > > > > segment.ms
> > >> > > > > > > is 1, "max.compaction.lag.ms" also starts with 1. "0"
> means
> > >> > > > disable. I
> > >> > > > > > > can use -1 as disable, but it is hard to define the
> meaning
> > >> of 0
> > >> > > > because
> > >> > > > > > we
> > >> > > > > > > cannot just roll the active segment immediately.
> > >> > > > > >
> > >> > > > > > That's a fair point. We should make 0 = disable, to be
> > >> consistent
> > >> > > with
> > >> > > > the
> > >> > > > > > other settings.
> > >> > > > > >
> > >> > > > > > best,
> > >> > > > > > Colin
> > >> > > > > >
> > >> > > > > > >
> > >> > > > > > > > -- Note that an alternative configuration is to use -1
> as
> > >> > > > "disabled"
> > >> > > > > > and
> > >> > > > > > > > 0
> > >> > > > > > > > > as "immediate compaction". Because compaction lag is
> > still
> > >> > > > determined
> > >> > > > > > > > > based on min.compaction.lag and how long to roll an
> > active
> > >> > > > segment,
> > >> > > > > > > > the
> > >> > > > > > > > > actual lag for compaction is undetermined if we use
> "0".
> > >> On
> > >> > the
> > >> > > > other
> > >> > > > > > > > > hand, we can already set "min.cleanable.dirty.ratio"
> to
> > >> > achieve
> > >> > > > the
> > >> > > > > > > > same
> > >> > > > > > > > > goal. So here we choose "0" as "disabled".
> > >> > > > > > > >
> > >> > > > > > > > I would prefer -1 to be the invalid setting. Treating 0
> > >> > > differently
> > >> > > > > > than
> > >> > > > > > > > 1 seems strange to me.
> > >> > > > > > > >
> > >> > > > > > > > =====> see my previous comment, I am not strongly
> against,
> > >> but
> > >> > 0
> > >> > > is
> > >> > > > > > not a
> > >> > > > > > > valid configuration in my option. So I use "0" as disabled
> > >> state.
> > >> > > > > > >
> > >> > > > > > > best,
> > >> > > > > > > > Colin
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > On Tue, Sep 4, 2018, at 15:04, xiongqi wu wrote:
> > >> > > > > > > > > Let's VOTE for this KIP.
> > >> > > > > > > > > KIP:
> > >> > > > > > > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>
> > >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>>
> > >> > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>
> > >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>>>
> > >> > > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>
> > >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>>
> > >> > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>
> > >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-354>>>>
> > >> > > > > > > > > %3A+Time-based+log+compaction+policy
> > >> > > > > > > > >
> > >> > > > > > > > > Implementation:
> > >> > > > > > > > >
> > >> > > > > > > > > https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>
> > >> <https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>>
> > >> > > <https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>
> > >> <https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>>>
> > >> > > > > > <https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>
> > >> <https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>>
> > >> > > <https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>
> > >> <https://github.com/apache/kafka/pull/5611
> <https://github.com/apache/kafka/pull/5611>>>>
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > > Xiongqi (Wesley) Wu
> > >> > > > > > > >
> > >> > > > > >
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> > --
> > >> >
> > >> > Brett Rann
> > >> >
> > >> > Senior DevOps Engineer
> > >> >
> > >> >
> > >> > Zendesk International Ltd
> > >> >
> > >> > 395 Collins Street, Melbourne VIC 3000 Australia
> > >> >
> > >> > Mobile: +61 (0) 418 826 017
> > >> >
> > >>
> > >
> > >
> > > --
> > >
> > > Brett Rann
> > >
> > > Senior DevOps Engineer
> > >
> > >
> > > Zendesk International Ltd
> > >
> > > 395 Collins Street, Melbourne VIC 3000 Australia
> > >
> > > Mobile: +61 (0) 418 826 017
> > >
> >
> >
> > --
> >
> > Brett Rann
> >
> > Senior DevOps Engineer
> >
> >
> > Zendesk International Ltd
> >
> > 395 Collins Street, Melbourne VIC 3000 Australia
> >
> > Mobile: +61 (0) 418 826 017
> >
>

Reply via email to