Sounds promising to me too.

I'll update the KIP with this as the primary proposal but leave the other
alternatives there as under consideration for now.
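
A minimal sketch of the single-API variant, combining the schedule/punctuate shape Thomas proposed with the PunctuationType naming Matthias suggested. Everything in it (PunctuationSketch, Scheduler, Punctuator, advance() and the nowMs parameters) is hypothetical and only illustrates the idea. It is not the existing Kafka Streams API and not necessarily what the KIP will end up with.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are the real Kafka Streams API.
class PunctuationSketch {

    // Which notion of time drives a schedule.
    enum PunctuationType { EVENT_TIME, SYSTEM_TIME }

    // Single callback that is told which schedule fired.
    interface Punctuator {
        void punctuate(PunctuationType type, long timestamp);
    }

    // Stand-in for the scheduling half of ProcessorContext: one interval per type.
    static class Scheduler {
        private final Map<PunctuationType, Long> intervals = new EnumMap<>(PunctuationType.class);
        private final Map<PunctuationType, Long> nextFire = new EnumMap<>(PunctuationType.class);
        private final Punctuator punctuator;

        Scheduler(Punctuator punctuator) {
            this.punctuator = punctuator;
        }

        // May be called once per type, which lets a processor mix both approaches.
        void schedule(PunctuationType type, long intervalMs, long nowMs) {
            if (intervalMs <= 0) {
                throw new IllegalArgumentException("interval must be positive");
            }
            intervals.put(type, intervalMs);
            nextFire.put(type, nowMs + intervalMs);
        }

        // Assumed to be called by the framework whenever the given clock advances.
        void advance(PunctuationType type, long nowMs) {
            Long scheduled = nextFire.get(type);
            if (scheduled == null) {
                return; // nothing scheduled for this type
            }
            long next = scheduled;
            long interval = intervals.get(type);
            while (nowMs >= next) {
                punctuator.punctuate(type, next);
                next += interval;
            }
            nextFire.put(type, next);
        }
    }

    // Runs a small scenario and returns a log of the punctuations that fired.
    static List<String> demo() {
        List<String> fired = new ArrayList<>();
        Scheduler s = new Scheduler((type, ts) -> fired.add(type + "@" + ts));
        s.schedule(PunctuationType.EVENT_TIME, 10, 0);
        s.schedule(PunctuationType.SYSTEM_TIME, 25, 0);
        s.advance(PunctuationType.EVENT_TIME, 20);  // fires for 10 and 20
        s.advance(PunctuationType.SYSTEM_TIME, 30); // fires for 25
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Passing the type into the callback is what lets one punctuate method serve both schedules. A separate callback per type would avoid the parameter at the cost of a wider interface.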

Cheers,
Michal

On 4 April 2017 at 19:10, Matthias J. Sax <matth...@confluent.io> wrote:

> That sounds promising.
>
> I am just wondering if `Time` is the best name. Maybe we want to add
> other, non-time-based punctuations at some point later. I would suggest
>
> enum PunctuationType {
>   EVENT_TIME,
>   SYSTEM_TIME,
> }
>
> or similar. Just to keep the door open -- it's easier to add new stuff
> if the name is more generic.
>
>
> -Matthias
>
>
> On 4/4/17 5:30 AM, Thomas Becker wrote:
> > I agree that the framework providing and managing the notion of stream
> > time is valuable and not something we would want to delegate to the
> > tasks. I'm not entirely convinced that a separate callback (option C)
> > is that messy (it could just be a default method with an empty
> > implementation), but if we wanted a single API to handle both cases,
> > how about something like the following?
> >
> > enum Time {
> >    STREAM,
> >    CLOCK
> > }
> >
> > Then on ProcessorContext:
> > context.schedule(Time time, long interval)
> > // We could allow this to be called once for each value of time, to
> > // mix approaches.
> >
> > Then the Processor API becomes:
> > punctuate(Time time)
> > // time here denotes which schedule resulted in this call.
> >
> > Thoughts?
> >
> >
> > On Mon, 2017-04-03 at 22:44 -0700, Matthias J. Sax wrote:
> >> Thanks a lot for the KIP Michal,
> >>
> >> I thought about the four options you proposed in more detail, and
> >> these are my thoughts:
> >>
> >> (A) You argue that users can still "punctuate" on event-time via
> >> process(), but I am not sure this is possible. Note that users only
> >> get record timestamps via context.timestamp(). Thus, users would need
> >> to track the time progress per partition (based on the partitions
> >> they observe via context.partition()). This alone puts a huge burden
> >> on the user. However, users are not notified at startup which
> >> partitions are assigned, and users are not notified when partitions
> >> get revoked. Because this information is not available, it's not
> >> possible to "manually advance" stream-time, and thus event-time
> >> punctuation within process() does not seem to be possible -- or do
> >> you see a way to get it done? And even if there is one, it might
> >> still be too clumsy to use.
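
To illustrate the bookkeeping that point (A) describes, here is a rough sketch of what a user would have to maintain by hand inside process(). ManualStreamTime and its methods are hypothetical, and taking stream-time as the minimum of the per-partition times is an assumption made only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper, not part of Kafka Streams.
class ManualStreamTime {
    // Highest record timestamp seen so far, per partition observed in process().
    private final Map<Integer, Long> partitionTime = new HashMap<>();

    // Would be fed from context.partition() and context.timestamp() in process().
    void observe(int partition, long timestamp) {
        partitionTime.merge(partition, timestamp, Long::max);
    }

    // Assumption: stream-time is the minimum over the per-partition times.
    // Only partitions that have already delivered a record are visible here.
    OptionalLong estimate() {
        return partitionTime.values().stream().mapToLong(Long::longValue).min();
    }

    public static void main(String[] args) {
        ManualStreamTime tracker = new ManualStreamTime();
        tracker.observe(0, 100L);
        tracker.observe(1, 40L);
        tracker.observe(0, 200L);
        // Partition 1 holds the estimate back, even if it was revoked meanwhile.
        System.out.println(tracker.estimate());
    }
}
```

The sketch shows both gaps. An assigned partition that has not yet delivered a record is missing from the map, so the estimate can run ahead. A revoked partition is never removed, so the estimate can get stuck.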
> >>
> >> (B) This does not allow mixing both approaches, thus limiting what
> >> users can do.
> >>
> >> (C) This should give us all the flexibility we need. However, just
> >> adding one more method seems to be too simple a solution (cf. my
> >> comments below).
> >>
> >> (D) This might be hard to use. Also, I am not sure how a user could
> >> enable system-time and event-time punctuation in parallel.
> >>
> >>
> >>
> >> Overall, option (C) seems to be the most promising approach to me.
> >> Because I also favor a clean API, we might keep the current
> >> punctuate() as-is but deprecate it, so we can remove it at some
> >> later point once people use the "new punctuate API".
> >>
> >>
> >> A couple of follow-up questions:
> >>
> >> - I am wondering if we should have two callback methods or just one
> >> (i.e., a unified one for system-time and event-time punctuation, or
> >> one for each?).
> >>
> >> - If we have one, how can the user figure out which condition
> >> triggered it?
> >>
> >> - What would the API look like for registering different punctuate
> >> schedules? The "type" must be defined somehow?
> >>
> >> - We might want to add "complex" schedules later on (like, punctuate
> >> every 10 seconds of event-time or 60 seconds of system-time,
> >> whichever comes first). I'm not saying we should add this right
> >> away, but we might want to define the API in a way that allows
> >> extensions like this later on without redesigning it (i.e., the API
> >> should be designed to be extensible).
> >>
> >> - Did you ever consider count-based punctuation?
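
As a rough illustration of the "complex" schedule mentioned in the questions above, here is one possible reading of "10 seconds event-time or 60 seconds system-time, whichever comes first". WhicheverFirstSchedule and poll() are hypothetical names, and resetting both baselines whenever the schedule fires is an assumption, not something settled in this thread.

```java
// Hypothetical sketch; the names and the reset behaviour are illustrative only.
class WhicheverFirstSchedule {
    private final long eventIntervalMs;
    private final long systemIntervalMs;
    private long lastEventFire;
    private long lastSystemFire;

    WhicheverFirstSchedule(long eventIntervalMs, long systemIntervalMs,
                           long eventNowMs, long systemNowMs) {
        this.eventIntervalMs = eventIntervalMs;
        this.systemIntervalMs = systemIntervalMs;
        this.lastEventFire = eventNowMs;
        this.lastSystemFire = systemNowMs;
    }

    // Returns true if a punctuation is due on either clock.
    // Assumption: firing resets the baseline of both clocks.
    boolean poll(long eventNowMs, long systemNowMs) {
        boolean due = eventNowMs - lastEventFire >= eventIntervalMs
                || systemNowMs - lastSystemFire >= systemIntervalMs;
        if (due) {
            lastEventFire = eventNowMs;
            lastSystemFire = systemNowMs;
        }
        return due;
    }

    public static void main(String[] args) {
        WhicheverFirstSchedule s = new WhicheverFirstSchedule(10_000, 60_000, 0, 0);
        System.out.println(s.poll(5_000, 30_000));  // neither interval has elapsed
        System.out.println(s.poll(10_000, 31_000)); // the event-time interval has elapsed
    }
}
```

A count-based trigger could be added as a third condition in poll() in the same way. Whatever registration API is chosen would need to be able to express such combinations without a redesign.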
> >>
> >>
> >> I understand that you would like to solve a simple problem, but we
> >> learned from the past that just "adding some API" quickly leads to a
> >> poorly defined API that needs time-consuming cleanup later on via
> >> other KIPs. Thus, I would prefer to get a holistic punctuation KIP
> >> from the beginning, to avoid a painful redesign later.
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >>
> >> On 4/3/17 11:58 AM, Michal Borowiecki wrote:
> >>>
> >>> Thanks Thomas,
> >>>
> >>> I'm also wary of changing the existing semantics of punctuate, for
> >>> backward compatibility reasons, although I like the conceptual
> >>> simplicity of that option.
> >>>
> >>> Adding a new method feels safer to me but, in a way, uglier. I have
> >>> now added this to the KIP as option (C).
> >>>
> >>> The TimestampExtractor mechanism is actually more flexible, as it
> >>> allows you to return any value; you're not limited to event time or
> >>> system time (although I don't see an actual use case where you might
> >>> need anything other than those two). Hence I also proposed the option
> >>> to allow users to, effectively, decide what "stream time" is for
> >>> them given the presence or absence of messages, much like they can
> >>> decide what message time means for them using the TimestampExtractor.
> >>> What do you think about that? This is probably the most flexible
> >>> option but also the most complicated.
> >>>
> >>> All comments appreciated.
> >>>
> >>> Cheers,
> >>>
> >>> Michal
> >>>
> >>>
> >>> On 03/04/17 19:23, Thomas Becker wrote:
> >>>>
> >>>> Although I fully agree we need a way to trigger periodic processing
> >>>> that is independent of whether and when messages arrive, I'm not
> >>>> sure I like the idea of changing the existing semantics across the
> >>>> board. What if we added an additional callback to Processor that can
> >>>> be scheduled similarly to punctuate() but is always called at fixed,
> >>>> wall-clock-based intervals? This way you wouldn't have to give up
> >>>> the notion of stream time to be able to do periodic processing.
> >>>>
> >>>> On Mon, 2017-04-03 at 10:34 +0100, Michal Borowiecki wrote:
> >>>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I have created a draft for KIP-138: Change punctuate semantics
> >>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-138%3A+Change+punctuate+semantics>.
> >>>>>
> >>>>> Appreciating there can be different views on system-time vs
> >>>>> event-time semantics for punctuation depending on use-case and the
> >>>>> importance of backwards compatibility of any such change, I've left
> >>>>> it quite open and hope to fill in more info as the discussion
> >>>>> progresses.
> >>>>>
> >>>>> Thanks,
> >>>>> Michal

