Re: [VOTE] KIP-354 Time-based log compaction policy

2018-12-06 Thread xiongqi wu
Thanks Colin, Dong, and Joel for voting. I have masked this KIP as accepted. The pull request is up: https://github.com/apache/kafka/pull/6009 Xiongqi (Wesley) Wu On Thu, Dec 6, 2018 at 10:47 AM Joel Koshy wrote: > +1 on the updated KIP. > > On Wed, Dec 5, 2018 at 11:56 AM Dong Lin wrote:

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-12-06 Thread Joel Koshy
+1 on the updated KIP. On Wed, Dec 5, 2018 at 11:56 AM Dong Lin wrote: > Thanks for the update. +1 (binding) > > On Wed, Dec 5, 2018 at 8:19 AM Colin McCabe wrote: > > > Thanks, Xiongqi Wu. +1 (binding) > > > > regards, > > Colin > > > > > > On Tue, Dec 4, 2018, at 20:58, xiongqi (wesley) wu w

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-12-05 Thread Dong Lin
Thanks for the update. +1 (binding) On Wed, Dec 5, 2018 at 8:19 AM Colin McCabe wrote: > Thanks, Xiongqi Wu. +1 (binding) > > regards, > Colin > > > On Tue, Dec 4, 2018, at 20:58, xiongqi (wesley) wu wrote: > > Colin, > > > > Thanks for comments. > > Out of ordered message timestamp is a very g

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-12-05 Thread Colin McCabe
Thanks, Xiongqi Wu. +1 (binding) regards, Colin On Tue, Dec 4, 2018, at 20:58, xiongqi (wesley) wu wrote: > Colin, > > Thanks for comments. > Out of ordered message timestamp is a very good point. > We can combine max.compaction.lag.ms with > log.message.timestamp.difference.max.ms to achieve

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-12-04 Thread xiongqi (wesley) wu
Colin, Thanks for comments. Out of ordered message timestamp is a very good point. We can combine max.compaction.lag.ms with log.message.timestamp.difference.max.ms to achieve what we want in an environment that message timestamp can be shifted a lot. There are similar discussions regarding log.

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-12-04 Thread xiongqi wu
Colin, Thanks for comments. Out of ordered message timestamp is a very good point. We can combine max.compaction.lag.ms with log.message.timestamp.difference.max.ms to achieve what we want in an environment that message timestamp can be shifted a lot. There are similar discussions regarding log.

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-12-04 Thread Colin McCabe
Hi Xiongqi, Thinking about this a little bit more, it seems like we don't have any guarantees just by looking at the timestamp of the first message in a log segment. Similarly, we don't have any guarantees just by looking at the maxTimestamp of the previous log segment. Old data could appear

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-26 Thread xiongqi wu
Thanks for binding and non-binding votes. Can I get one more binding vote? Thanks in advance! Xiongqi (Wesley) Wu On Wed, Nov 14, 2018 at 7:29 PM Matt Farmer wrote: > I'm a +1 (non-binding) — This looks like it would have saved us a lot of > pain in an issue we had to debug recently. I can't

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-14 Thread Matt Farmer
I'm a +1 (non-binding) — This looks like it would have saved us a lot of pain in an issue we had to debug recently. I can't go into details, but figuring out how to achieve this effect gave me quite a headache. :) On Mon, Nov 12, 2018 at 1:00 PM xiongqi wu wrote: > Hi all, > > Can I have one mor

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-12 Thread xiongqi wu
Hi all, Can I have one more vote on this KIP? Any comment is appreciated. https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A+Add+a+Maximum+Log+Compaction+Lag Xiongqi (Wesley) Wu On Fri, Nov 9, 2018 at 7:56 PM xiongqi wu wrote: > Thanks Dong. > I have updated the KIP. > > Xiongqi (

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-09 Thread xiongqi wu
Thanks Dong. I have updated the KIP. Xiongqi (Wesley) Wu On Fri, Nov 9, 2018 at 5:31 PM Dong Lin wrote: > Thanks for the KIP Xiongqi. LGTM. +1 (binding) > > One minor comment: it may be a bit better to clarify in the public > interface section that the value of the newly added metric is determ

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-09 Thread Dong Lin
Thanks for the KIP Xiongqi. LGTM. +1 (binding) One minor comment: it may be a bit better to clarify in the public interface section that the value of the newly added metric is determined based by applying that formula across all compactable segments. For example: The maximum value of Math.max(now

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-09 Thread xiongqi wu
Thanks Joel. Tracking the delay at second granularity makes sense I have updated KIP. Xiongqi (Wesley) Wu On Fri, Nov 9, 2018 at 5:05 PM Joel Koshy wrote: > +1 with one suggestion on the proposed metric. You should probably include > the unit. So for e.g., max-compaction-delay-secs. > > Joel >

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-09 Thread Joel Koshy
+1 with one suggestion on the proposed metric. You should probably include the unit. So for e.g., max-compaction-delay-secs. Joel On Tue, Nov 6, 2018 at 5:30 PM xiongqi wu wrote: > bump > Xiongqi (Wesley) Wu > > > On Thu, Sep 27, 2018 at 4:20 PM xiongqi wu wrote: > > > > > Thanks Eno, Brett, D

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-06 Thread xiongqi wu
bump Xiongqi (Wesley) Wu On Thu, Sep 27, 2018 at 4:20 PM xiongqi wu wrote: > > Thanks Eno, Brett, Dong, Guozhang, Colin, and Xiaohe for feedback. > Can I have more feedback or VOTE on this KIP? > > > Xiongqi (Wesley) Wu > > > On Wed, Sep 19, 2018 at 10:52 AM xiongqi wu wrote: > >> Any other v

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-27 Thread xiongqi wu
Thanks Eno, Brett, Dong, Guozhang, Colin, and Xiaohe for feedback. Can I have more feedback or VOTE on this KIP? Xiongqi (Wesley) Wu On Wed, Sep 19, 2018 at 10:52 AM xiongqi wu wrote: > Any other votes or comments? > > Xiongqi (Wesley) Wu > > > On Tue, Sep 11, 2018 at 11:45 AM xiongqi wu wr

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-19 Thread xiongqi wu
Any other votes or comments? Xiongqi (Wesley) Wu On Tue, Sep 11, 2018 at 11:45 AM xiongqi wu wrote: > Yes, more votes and code review. > > Xiongqi (Wesley) Wu > > > On Mon, Sep 10, 2018 at 11:37 PM Brett Rann > wrote: > >> +1 (non binding) from on 0 then, and on the KIP. >> >> Where do we go

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-11 Thread xiongqi wu
Yes, more votes and code review. Xiongqi (Wesley) Wu On Mon, Sep 10, 2018 at 11:37 PM Brett Rann wrote: > +1 (non binding) from on 0 then, and on the KIP. > > Where do we go from here? More votes? > > On Tue, Sep 11, 2018 at 5:34 AM Colin McCabe wrote: > > > On Mon, Sep 10, 2018, at 11:44, xi

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-10 Thread Brett Rann
+1 (non binding) from on 0 then, and on the KIP. Where do we go from here? More votes? On Tue, Sep 11, 2018 at 5:34 AM Colin McCabe wrote: > On Mon, Sep 10, 2018, at 11:44, xiongqi wu wrote: > > Thank you for comments. I will use '0' for now. > > > > If we create topics through admin client in

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-10 Thread Colin McCabe
On Mon, Sep 10, 2018, at 11:44, xiongqi wu wrote: > Thank you for comments. I will use '0' for now. > > If we create topics through admin client in the future, we might perform > some useful checks. (but the assumption is all brokers in the same cluster > have the same default configurations valu

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-10 Thread xiongqi wu
Thank you for comments. I will use '0' for now. If we create topics through admin client in the future, we might perform some useful checks. (but the assumption is all brokers in the same cluster have the same default configurations value, otherwise,it might still be tricky to do such cross valid

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-10 Thread Colin McCabe
I don't have a strong opinion. But I think we should probably be consistent with how segment.ms works, and just use 0. best, Colin On Wed, Sep 5, 2018, at 21:19, Brett Rann wrote: > OK thanks for that clarification. I see why you're uncomfortable with 0 now. > > I'm not really fussed. I just

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-10 Thread Colin McCabe
On Wed, Sep 5, 2018, at 10:26, xiongqi wu wrote: > Colin, > > When a user creates a topic, the user doesn't know broker's configures. If > there is 100 brokers, the user can't contact all the brokers to find out > the minimum allowed value for segment.ms. Today, the topic creation is > done via z

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread Brett Rann
OK thanks for that clarification. I see why you're uncomfortable with 0 now. I'm not really fussed. I just prefer consistency in configuration options. Personally I lean towards treating 0 and 1 similarly in that scenario, because it favours the person thinking about setting the configurations, a

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread xiongqi wu
I want to honor the minimum value of segment.ms (which is 1ms) to force roll an active segment. So if we set "max.compaction.lag.ms" any value > 0, the minimum of max.compaction.lag.ms and segment.ms will be used to seal an active segment. If we set max.compaction.lag.ms to 0, the current implem

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread Brett Rann
You're rolling a new segment if the condition is met right? So I'm struggling to understand the relevance of segment.ms here. Maybe an example would help my understanding: segment.ms=999 *min.cleanable.dirty.ratio=1* max.compaction.lag.ms=1 When a duplicate message comes in, after 1ms the top

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread xiongqi wu
To make it clear, I don't against using -1 as disabled, but we need to come up with the meaning of "0". If "0" means immediate compaction, but the actual compaction lag will be segment.ms. It has longer lag than setting the value to be half of segment.ms. We cannot provide "0" as max compaction lag

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread Brett Rann
-1 is consistent as "special" with these settings for example: log.retention.bytes socket.received.buffer.bytes socket.send.buffer.bytes queued.max.request.bytes retention.bytes retention.ms and acks. Where it may mean no limit, use OS defaults, max (acks), etc. I don't see much convention of 0

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread Brett Rann
If segment.ms can't be set to 0, then we're not being consistent by using 0 for this new setting? I throw out -1 for consideration again :) On Thu, Sep 6, 2018 at 10:03 AM xiongqi wu wrote: > Thanks. I will document after PR is merged. > > BTW, Kafka enforce the minimum of "segment.ms" to 1, we

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread xiongqi wu
Thanks. I will document after PR is merged. BTW, Kafka enforce the minimum of "segment.ms" to 1, we cannot set " segment.ms" to 0. I also updated the title of this KIP. Xiongqi (Wesley) Wu On Wed, Sep 5, 2018 at 4:34 PM Brett Rann wrote: > I withdraw my comments on -1 since i'm in the minor

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread Brett Rann
I withdraw my comments on -1 since i'm in the minority. :) Can we make sure 0 gets documented as meaning disabled here: https://kafka.apache.org/documentation/#brokerconfigs ? And while there it would be good if segment.ms is documented that 0 is disabled too. (there's some hierarchy of configs

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread xiongqi wu
If we use 0 to indicate immediate compaction, the compaction lag is determined by segment.ms in worst case. If segment.ms is 24 hours, "immediate compaction" is a weaker guarantee than setting any value less than 24 hours. By the definition of "max compaction lag", we cannot have zero lag. So

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread xiongqi wu
Colin, When a user creates a topic, the user doesn't know broker's configures. If there is 100 brokers, the user can't contact all the brokers to find out the minimum allowed value for segment.ms. Today, the topic creation is done via zookeeper, at this stage, the user doesn't have right view of

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread Colin McCabe
On Tue, Sep 4, 2018, at 22:11, Brett Rann wrote: > > That's a fair point. We should make 0 = disable, to be consistent with > the other settings. > > -1 is used elsewhere for disable and when seeing it in a config it's clear > that it's a special meaning. 0 doesn't have to mean instant, it just me

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-05 Thread Colin McCabe
On Tue, Sep 4, 2018, at 20:25, xiongqi wu wrote: > Thanks for comments. > > Today, when creating topic, client only does simple local validation > and doesn't check against broker's configurations. > > We cannot just let users to create a configuration in zookeeper and > dishonor the user's choic

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread Brett Rann
> That's a fair point. We should make 0 = disable, to be consistent with the other settings. -1 is used elsewhere for disable and when seeing it in a config it's clear that it's a special meaning. 0 doesn't have to mean instant, it just means as quickly as possible. I don't think 0 is intuitive fo

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread xiongqi wu
Thanks for comments. Today, when creating topic, client only does simple local validation and doesn't check against broker's configurations. We cannot just let users to create a configuration in zookeeper and dishonor the user's choice in broker side. I agree we need a better way to enforce the

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread Colin McCabe
On Tue, Sep 4, 2018, at 17:47, xiongqi wu wrote: > Colin, > Thank you for comments. > see my inline reply below. > > Xiongqi (Wesley) Wu > > > On Tue, Sep 4, 2018 at 5:24 PM Colin McCabe wrote: > > > Hi Xiongqi, > > > > Thanks for this KIP. > > > > The name seems a bit ambiguous. Our compacti

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread Colin McCabe
On Tue, Sep 4, 2018, at 18:09, xiongqi wu wrote: > Colin, > > I will keep the title for now, since all the previous discussions and links > are tied to this title. > > I can change the title at the end or add a clarification note in the doc. Hi Xiongqi, It's fine to keep the email thread titles

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread xiongqi wu
Colin, I will keep the title for now, since all the previous discussions and links are tied to this title. I can change the title at the end or add a clarification note in the doc. Xiongqi (Wesley) Wu On Tue, Sep 4, 2018 at 5:47 PM xiongqi wu wrote: > Colin, > Thank you for comments. > see m

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread xiongqi wu
Colin, Thank you for comments. see my inline reply below. Xiongqi (Wesley) Wu On Tue, Sep 4, 2018 at 5:24 PM Colin McCabe wrote: > Hi Xiongqi, > > Thanks for this KIP. > > The name seems a bit ambiguous. Our compaction policies are already > time-based, after all. It seems like this change i

Re: [VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread Colin McCabe
Hi Xiongqi, Thanks for this KIP. The name seems a bit ambiguous. Our compaction policies are already time-based, after all. It seems like this change is focused around adding a “max.compaction.lag.ms." Perhaps the KIP title should be something like "add maximum compaction lag time"? > The

[VOTE] KIP-354 Time-based log compaction policy

2018-09-04 Thread xiongqi wu
Let's VOTE for this KIP. KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-354 %3A+Time-based+log+compaction+policy Implementation: https://github.com/apache/kafka/pull/5611 Xiongqi (Wesley) Wu