Hi again, all,

Casey Green pointed out that we overlooked the scala api when we added
Suppress. He was kind enough to send
https://github.com/apache/kafka/pull/6314 to correct this, and we also
updated the KIP. Since it's essentially just copying the existing Java API
over to Scala, we didn't create a new KIP.

Note that we don't plan to treat this as a bug, and therefore don't
currently plan to backport the Scala Suppress API to 2.1 or 2.2.

Thanks,
-John

On Fri, Nov 16, 2018 at 3:54 PM Bill Bejeck <bbej...@gmail.com> wrote:

> Hi John,
>
> Thanks for the update, I'm +1 on changes and my +1 vote stands.
>
> -Bill
>
> On Fri, Nov 16, 2018 at 4:19 PM John Roesler <j...@confluent.io> wrote:
>
> > Hi all, sorry to do this again, but during review of the code to add the
> > metrics proposed in this KIP, the reviewers and I noticed some
> > inconsistencies and drawbacks of the metrics I proposed in the KIP.
> >
> > Here's the diff:
> >
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=24&selectedPageVersions=23
> >
> > * The proposed metrics were all INFO level, but they would be updated on
> > every record, creating a performance concern. If we can refactor the
> > metrics framework in the future to resolve this concern, we may move the
> > metrics back to INFO level.
> > * having separate metrics for memory and disk buffers is unnecessarily
> > complex. The main utility is determining how close the buffer is to the
> > configured limit, which makes a single metric more useful. I've combined
> > them into one "suppression-buffer-size-*" metric.
> > * The "intermediate-result-suppression-*" metric would be equivalent to
> the
> > "process-*" metric which is already available on the ProcessorNode. I've
> > removed it from the KIP.
> > * The "suppression-mem-buffer-evict-*" metric had been proposed as a
> buffer
> > metric, but it makes more sense as a processor node metric, since its
> > counterpart is the "process-*" metric. I've replaced it with a processor
> > node metric, "suppression-emit-*"
> >
> > Let me know if you want to recast votes in response to this change.
> >
> > -John
> >
> > On Thu, Oct 4, 2018 at 11:26 AM John Roesler <j...@confluent.io> wrote:
> >
> > > Update: Here's a link to the documented eviction behavior:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-BufferEvictionBehavior(akaSuppressEmitBehavior)
> > >
> > > On Thu, Oct 4, 2018 at 11:12 AM John Roesler <j...@confluent.io>
> wrote:
> > >
> > >> Hello again, all,
> > >>
> > >> During review, we realized that there is a relationship between this
> > >> (KIP-328) and KIP-372.
> > >>
> > >> KIP-372 proposed to allow naming *all* internal topics, and KIP-328
> adds
> > >> a new internal topic (the changelog for the suppression buffer).
> > >>
> > >> However, we didn't consider this relationship in either KIP
> discussion,
> > >> possibly since they were discussed and accepted concurrently.
> > >>
> > >> I have updated KIP-328 to effectively "merge" the two KIPs by adding a
> > >> `withName` builder to Suppressed in the style of the other builders
> > added
> > >> in KIP-372:
> > >>
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19
> > >> .
> > >>
> > >> I think this should be uncontroversial, but as always, let me know of
> > any
> > >> objections you may have.
> > >>
> > >>
> > >> Also, note that I'll be updating the KIP to document the exact buffer
> > >> eviction behavior. I previously treated this as an internal
> > implementation
> > >> detail, but after consideration, I think users would want to know the
> > >> eviction semantics, especially if they are debugging their
> applications
> > and
> > >> scrutinizing the sequence of emitted records.
> > >>
> > >> Thanks,
> > >> -John
> > >>
> > >> On Thu, Sep 20, 2018 at 5:34 PM John Roesler <j...@confluent.io>
> wrote:
> > >>
> > >>> Hello all,
> > >>>
> > >>> During review of https://github.com/apache/kafka/pull/5567 for
> > KIP-328,
> > >>> the reviewers raised many good suggestions for the API.
> > >>>
> > >>> The basic design of the suppress operation remains the same, but the
> > >>> config object is (in my opinion) far more ergonomic with their
> > >>> suggestions.
> > >>>
> > >>> I have updated the KIP to reflect the new config (
> > >>>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator
> > >>> )
> > >>>
> > >>> Please let me know if anyone wishes to change their vote, and we call
> > >>> for a recast.
> > >>>
> > >>> Thanks,
> > >>> -John
> > >>>
> > >>> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax <
> > matth...@confluent.io>
> > >>> wrote:
> > >>>
> > >>>> It seems nobody has any objections against the change.
> > >>>>
> > >>>> That's for the KIP improvement. I'll go ahead and merge the PR.
> > >>>>
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>> On 8/21/18 2:44 PM, John Roesler wrote:
> > >>>> > Hello again, all,
> > >>>> >
> > >>>> > I belatedly had a better idea for adding grace period to the
> Windows
> > >>>> class
> > >>>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of
> > >>>> > providing the grace-setter in the abstract class and having to
> > >>>> retract it
> > >>>> > in UnlimitedWindows, I've made the getter abstract method in
> Windows
> > >>>> and
> > >>>> > only added setters to Time and Join windows.
> > >>>> >
> > >>>> > This should not only improve the ergonomics of grace period, but
> > make
> > >>>> the
> > >>>> > whole class hierarchy more maintainable.
> > >>>> >
> > >>>> > See the PR for more details:
> > >>>> https://github.com/apache/kafka/pull/5536
> > >>>> >
> > >>>> > I've updated the KIP accordingly. Here's the diff:
> > >>>> >
> > >>>>
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9
> > >>>> >
> > >>>> > Please let me know if this changes your vote.
> > >>>> >
> > >>>> > Thanks,
> > >>>> > -John
> > >>>> >
> > >>>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <j...@confluent.io>
> > >>>> wrote:
> > >>>> >
> > >>>> >> Hey all,
> > >>>> >>
> > >>>> >> I just wanted to let you know that a few small issues surfaced
> > during
> > >>>> >> implementation and review. I've updated the KIP. Here's the diff:
> > >>>> >>
> > >>>>
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8
> > >>>> >>
> > >>>> >> Basically:
> > >>>> >> * the metrics named "*-event-*" are inconsistent with existing
> > >>>> >> nomenclature, and will be "*-record-*" instead (late records
> > instead
> > >>>> of
> > >>>> >> late events, for example)
> > >>>> >> * the apis taking and returning Duration will use long millis
> > >>>> instead. We
> > >>>> >> do want to transition to Duration in the future, but we shouldn't
> > do
> > >>>> it
> > >>>> >> piecemeal.
> > >>>> >>
> > >>>> >> Thanks,
> > >>>> >> -John
> > >>>> >>
> > >>>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <j...@confluent.io>
> > >>>> wrote:
> > >>>> >>
> > >>>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes
> > (Guozhang,
> > >>>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me).
> > >>>> >>>
> > >>>> >>> Thanks for your time,
> > >>>> >>> -John
> > >>>> >>>
> > >>>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax <
> > >>>> matth...@confluent.io>
> > >>>> >>> wrote:
> > >>>> >>>
> > >>>> >>>> +1 (binding)
> > >>>> >>>>
> > >>>> >>>> Thanks for the KIP.
> > >>>> >>>>
> > >>>> >>>>
> > >>>> >>>> -Matthias
> > >>>> >>>>
> > >>>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote:
> > >>>> >>>>> Thanks John! +1
> > >>>> >>>>>
> > >>>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <
> wangg...@gmail.com>
> > >>>> wrote:
> > >>>> >>>>>
> > >>>> >>>>>> Yes, the addendum lgtm as well. Thanks!
> > >>>> >>>>>>
> > >>>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler <
> > j...@confluent.io
> > >>>> >
> > >>>> >>>> wrote:
> > >>>> >>>>>>
> > >>>> >>>>>>> Another thing that came up after I started working on an
> > >>>> >>>> implementation
> > >>>> >>>>>> is
> > >>>> >>>>>>> that in addition to deprecating "retention" from the Windows
> > >>>> >>>> interface,
> > >>>> >>>>>> we
> > >>>> >>>>>>> also need to deprecate "segmentInterval", for the same
> > reasons.
> > >>>> I
> > >>>> >>>> simply
> > >>>> >>>>>>> overlooked it previously. I've updated the KIP accordingly.
> > >>>> >>>>>>>
> > >>>> >>>>>>> Hopefully, this doesn't change anyone's vote.
> > >>>> >>>>>>>
> > >>>> >>>>>>> Thanks,
> > >>>> >>>>>>> -John
> > >>>> >>>>>>>
> > >>>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler <
> > j...@confluent.io
> > >>>> >
> > >>>> >>>> wrote:
> > >>>> >>>>>>>
> > >>>> >>>>>>>> Thanks Guozhang,
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> Thanks for that catch. to clarify, currently, events are
> > >>>> "late" only
> > >>>> >>>>>> when
> > >>>> >>>>>>>> they are older than the retention period. Currently, we
> > detect
> > >>>> this
> > >>>> >>>> in
> > >>>> >>>>>>> the
> > >>>> >>>>>>>> processor and record it as a "skipped-record". We then do
> not
> > >>>> >>>> attempt
> > >>>> >>>>>> to
> > >>>> >>>>>>>> store the event in the window store. If a user provided a
> > >>>> >>>>>> pre-configured
> > >>>> >>>>>>>> window store with a retention period smaller than the one
> > they
> > >>>> >>>> specify
> > >>>> >>>>>>> via
> > >>>> >>>>>>>> Windows#until, the segmented store will drop the update
> with
> > no
> > >>>> >>>> metric
> > >>>> >>>>>>> and
> > >>>> >>>>>>>> record a debug-level log.
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> With KIP-328, with the introduction of "grace period" and
> > >>>> moving
> > >>>> >>>>>>> retention
> > >>>> >>>>>>>> fully into the state store, we need to have metrics for
> both
> > >>>> "late
> > >>>> >>>>>>> events"
> > >>>> >>>>>>>> (new records older than the grace period) and "expired
> window
> > >>>> >>>> events"
> > >>>> >>>>>>> (new
> > >>>> >>>>>>>> records for windows that are no longer retained in the
> state
> > >>>> >>>> store). I
> > >>>> >>>>>>>> already proposed metrics for the late events, and I've just
> > >>>> updated
> > >>>> >>>> the
> > >>>> >>>>>>> KIP
> > >>>> >>>>>>>> with metrics for the expired window events. I also updated
> > the
> > >>>> KIP
> > >>>> >>>> to
> > >>>> >>>>>>> make
> > >>>> >>>>>>>> it clear that neither late nor expired events will count as
> > >>>> >>>>>>>> "skipped-records" any more.
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> -John
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang <
> > >>>> wangg...@gmail.com>
> > >>>> >>>>>>> wrote:
> > >>>> >>>>>>>>
> > >>>> >>>>>>>>> Hi John,
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor
> > >>>> suggestion:
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Following your suggestion of the differentiation of
> > >>>> >>>> `skipped-records`
> > >>>> >>>>>>> v.s.
> > >>>> >>>>>>>>> `late-event-drop`, we should probably consider moving the
> > >>>> scenarios
> > >>>> >>>>>>> where
> > >>>> >>>>>>>>> records got ignored due the window not being available any
> > >>>> more in
> > >>>> >>>>>>>>> windowed
> > >>>> >>>>>>>>> aggregation operators from the `skipped-records` metrics
> > >>>> recording
> > >>>> >>>> to
> > >>>> >>>>>>> the
> > >>>> >>>>>>>>> `late-event-drop` metrics recording.
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Guozhang
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck <
> > >>>> bbej...@gmail.com>
> > >>>> >>>>>> wrote:
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>> Thanks for the KIP!
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>> +1
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>> -Bill
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu <
> > yuzhih...@gmail.com>
> > >>>> >>>> wrote:
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>>> +1
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler <
> > >>>> j...@confluent.io
> > >>>> >>>>>
> > >>>> >>>>>>>>> wrote:
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>>> Hello devs,
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no
> new
> > >>>> >>>>>> comments,
> > >>>> >>>>>>>>> so I
> > >>>> >>>>>>>>>>> am
> > >>>> >>>>>>>>>>>> calling for a vote!
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> Here's the KIP:
> > >>>> https://cwiki.apache.org/confluence/x/sQU0BQ
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> The basic idea is to provide:
> > >>>> >>>>>>>>>>>> * more usable control over update rate (vs the current
> > >>>> state
> > >>>> >>>>>> store
> > >>>> >>>>>>>>>>> caches)
> > >>>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature
> > which
> > >>>> >>>>>>> several
> > >>>> >>>>>>>>>> people
> > >>>> >>>>>>>>>>>> have requested
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> Thanks,
> > >>>> >>>>>>>>>>>> -John
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> --
> > >>>> >>>>>>>>> -- Guozhang
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>
> > >>>> >>>>>>>
> > >>>> >>>>>>
> > >>>> >>>>>>
> > >>>> >>>>>>
> > >>>> >>>>>> --
> > >>>> >>>>>> -- Guozhang
> > >>>> >>>>>>
> > >>>> >>>>>
> > >>>> >>>>
> > >>>> >>>>
> > >>>> >
> > >>>>
> > >>>>
> >
>

Reply via email to