>> Would a dashboard need perfect regularity? Wouldn't an upper bound suffice?

If you go with stream-time and don't have any input records for all
partitions, punctuate would not be called at all, and thus your
dashboard would "freeze".

>> I thought about cron-type things, but aren't they better triggered by an
>> external scheduler (they're more flexible anyway), which then feeds
>> "commands" into the topology?

I guess it depends what kind of periodic action you want to trigger. The
"cron job" was just an analogy. Maybe it's just a heartbeat to some
other service, that signals that your Streams app is still running.

-Matthias

On 4/24/17 10:02 AM, Michal Borowiecki wrote:
> Thanks!
>
> Would a dashboard need perfect regularity? Wouldn't an upper bound suffice?
>
> Unless too frequent messages on replay could overpower it?
>
>
> I thought about cron-type things, but aren't they better triggered by an
> external scheduler (they're more flexible anyway), which then feeds
> "commands" into the topology?
>
> Just my 2c.
>
> Cheers,
>
> Michal
>
>
> On 24/04/17 17:57, Matthias J. Sax wrote:
>> A simple example would be some dashboard app, that needs to get
>> "current" status in regular time intervals (i.e., a real-time app).
>>
>> Or something like a "scheduler" -- think "cron job" application.
>>
>>
>> -Matthias
>>
>> On 4/24/17 2:23 AM, Michal Borowiecki wrote:
>>> Hi Matthias,
>>>
>>> I agree it's difficult to reason about the hybrid approach, I certainly
>>> found it hard and I'm totally on board with the mantra.
>>>
>>> I'd be happy to limit the scope of this KIP to add system-time
>>> punctuation semantics (in addition to existing stream-time semantics)
>>> and leave more complex schemes for users to implement on top of that.
>>>
>>> Further additional PunctuationTypes, could then be added by future KIPs,
>>> possibly including the hybrid approach once it has been given more thought.
>>>
>>>> There are real-time applications, that want to get
>>>> callbacks in regular system-time intervals (completely independent from
>>>> stream-time).
>>> Can you please describe what they are, so that I can put them on the
>>> wiki for later reference?
>>>
>>> Thanks,
>>>
>>> Michal
>>>
>>>
>>> On 23/04/17 21:27, Matthias J. Sax wrote:
>>>> Hi,
>>>>
>>>> I do like Damian's API proposal about the punctuation callback function.
>>>>
>>>> I also did reread the KIP and thought about the semantics we want to
>>>> provide.
>>>>
>>>>> Given the above, I don't see a reason any more for a separate system-time
>>>>> based punctuation.
>>>> I disagree here. There are real-time applications, that want to get
>>>> callbacks in regular system-time intervals (completely independent from
>>>> stream-time). Thus we should allow this -- if we really follow the
>>>> "hybrid" approach, this could be configured with stream-time interval
>>>> infinite and delay whatever system-time punctuation interval you want to
>>>> have. However, I would like to add a proper API for this and do this
>>>> configuration under the hood (that would allow one implementation within
>>>> all kinds of branching for different cases).
>>>>
>>>> Thus, we definitely should have PunctuationType#StreamTime and
>>>> #SystemTime -- and additionally, we _could_ have #Hybrid. Thus, I am not
>>>> a fan of your latest API proposal.
>>>>
>>>>
>>>> About the hybrid approach in general. On the one hand I like it, on the
>>>> other hand, it seems to be rather (1) complicated (not necessarily from
>>>> an implementation point of view, but for people to understand it) and
>>>> (2) mixes two semantics together in a "weird" way. Thus, I disagree with:
>>>>
>>>>> It may appear complicated at first but I do think these semantics will
>>>>> still be more understandable to users than having 2 separate punctuation
>>>>> schedules/callbacks with different PunctuationTypes.
>>>> This statement only holds if you apply strong assumptions that I don't
>>>> believe hold in general -- see (2) for details -- and I think it is
>>>> harder than you assume to reason about the hybrid approach in general.
>>>> IMHO, the hybrid approach is a "false friend" that seems to be easy to
>>>> reason about...
>>>>
>>>>
>>>> (1) Streams always embraced "easy to use" and we should really be
>>>> careful to keep it this way. On the other hand, as we are talking about
>>>> changes to PAPI, it won't affect DSL users (DSL does not use punctuation
>>>> at all at the moment), and thus, the "easy to use" mantra might not be
>>>> affected, while it will allow advanced users to express more complex stuff.
>>>>
>>>> I like the mantra: "make simple things easy and complex things possible".
>>>>
>>>> (2) IMHO the major disadvantage (issue?) of the hybrid approach is the
>>>> implicit assumption that event-time progresses at the same "speed" as
>>>> system-time during regular processing. This implies the assumption that
>>>> a slower progress in stream-time indicates the absence of input events
>>>> (and that later arriving input events will have a larger event-time with
>>>> high probability). Even if this might be true for some use cases, I
>>>> doubt it holds in general. Assume that you get a spike in traffic and
>>>> for some reason stream-time does advance slowly because you have more
>>>> records to process. This might trigger a system-time based punctuation
>>>> call even if this seems not to be intended. I strongly believe that it
>>>> is not easy to reason about the semantics of the hybrid approach (even
>>>> if the intentional semantics would be super useful -- but I doubt that
>>>> we get what we ask for).
>>>>
>>>> Thus, I also believe that one might need different "configuration"
>>>> values for the hybrid approach if you run the same code for different
>>>> scenarios: regular processing, re-processing, catching up scenario. And
>>>> as the term "configuration" implies, we might be better off to not mix
>>>> configuration with business logic that is expressed via code.
>>>>
>>>>
>>>> One more comment: I also don't think that the hybrid approach is
>>>> deterministic as claimed in the use-case subpage. I understand the
>>>> reasoning and agree, that it is deterministic if certain assumptions
>>>> hold -- compare above -- and if configured correctly. But strictly
>>>> speaking it's not because there is a dependency on system-time (and
>>>> IMHO, if system-time is involved it cannot be deterministic by definition).
>>>>
>>>>
>>>>> I see how in theory this could be implemented on top of the 2 punctuate
>>>>> callbacks with the 2 different PunctuationTypes (one stream-time based,
>>>>> the other system-time based) but it would be a much more complicated
>>>>> scheme and I don't want to suggest that.
>>>> I agree that expressing the intended hybrid semantics is harder if we
>>>> offer only #StreamTime and #SystemTime punctuation. However, I also
>>>> believe that the hybrid approach is a "false friend" with regard to
>>>> reasoning about the semantics (it suggests that it is easier than it is
>>>> in reality). Therefore, we might be better off to not offer the hybrid
>>>> approach and make it clear to a developer, that it is hard to mix
>>>> #StreamTime and #SystemTime in a semantically sound way.
>>>>
>>>>
>>>> Looking forward to your feedback. :)
>>>>
>>>> -Matthias
>>>>
>>>>
>>>>
>>>>
>>>> On 4/22/17 11:43 AM, Michal Borowiecki wrote:
>>>>> Hi all,
>>>>>
>>>>> Looking for feedback on the functional interface approach Damian
>>>>> proposed. What do people think?
>>>>>
>>>>> Further on the semantics of triggering punctuate though:
>>>>>
>>>>> I ran through the 2 use cases that Arun had kindly put on the wiki
>>>>> (https://cwiki.apache.org/confluence/display/KAFKA/Punctuate+Use+Cases)
>>>>> in my head and on a whiteboard and I can't find a better solution than
>>>>> the "hybrid" approach he had proposed.
>>>>>
>>>>> I see how in theory this could be implemented on top of the 2 punctuate
>>>>> callbacks with the 2 different PunctuationTypes (one stream-time based,
>>>>> the other system-time based) but it would be a much more complicated
>>>>> scheme and I don't want to suggest that.
>>>>>
>>>>> However, to add to the hybrid algorithm proposed, I suggest one
>>>>> parameter to that: a tolerance period, expressed in milliseconds
>>>>> system-time, after which the punctuation will be invoked in case the
>>>>> stream-time advance hasn't triggered it within the requested interval
>>>>> since the last invocation of punctuate (as recorded in system-time).
>>>>>
>>>>> This would allow a user-defined tolerance for late arriving events. The
>>>>> trade off would be left for the user to decide: regular punctuation in
>>>>> the case of absence of events vs allowing for records arriving late or
>>>>> some build-up due to processing not catching up with the event rate.
>>>>> In the one extreme, this tolerance could be set to infinity, turning
>>>>> hybrid into simply stream-time based punctuate, like we have now. In the
>>>>> other extreme, the tolerance could be set to 0, resulting in a
>>>>> system-time upper bound on the effective punctuation interval.
>>>>>
>>>>> Given the above, I don't see a reason any more for a separate
>>>>> system-time based punctuation. The "hybrid" approach with 0ms tolerance
>>>>> would under normal operation trigger at regular intervals wrt the
>>>>> system-time, except in cases of re-play/catch-up, where the stream time
>>>>> advances faster than system time. In these cases punctuate would happen
>>>>> more often than the specified interval wrt system time. However, the
>>>>> use-cases that need system-time punctuations (that I've seen at least)
>>>>> really only have a need for an upper bound on punctuation delay but
>>>>> don't need a lower bound.
>>>>>
>>>>> To that effect I'd propose the api to be as follows, on ProcessorContext:
>>>>>
>>>>> schedule(Punctuator callback, long interval, long toleranceInterval); //
>>>>> schedules punctuate at stream-time intervals with a system-time upper
>>>>> bound of (interval+toleranceInterval)
>>>>>
>>>>> schedule(Punctuator callback, long interval); // schedules punctuate at
>>>>> stream-time intervals without a system-time upper bound - this is
>>>>> equivalent to current stream-time based punctuate
>>>>>
>>>>> Punctuation is triggered when either:
>>>>> - the stream time advances past the (stream time of the previous
>>>>> punctuation) + interval;
>>>>> - or (iff the toleranceInterval is set) when the system time advances
>>>>> past the (system time of the previous punctuation) + interval +
>>>>> toleranceInterval
>>>>>
>>>>> In either case:
>>>>> - we trigger punctuate passing as the argument the stream time at which
>>>>> the current punctuation was meant to happen
>>>>> - next punctuate is scheduled at (stream time at which the current
>>>>> punctuation was meant to happen) + interval
>>>>>
>>>>> It may appear complicated at first but I do think these semantics will
>>>>> still be more understandable to users than having 2 separate punctuation
>>>>> schedules/callbacks with different PunctuationTypes.
>>>>>
>>>>>
>>>>>
>>>>> PS. Having re-read this, maybe the following alternative would be easier
>>>>> to understand (WDYT?):
>>>>>
>>>>> schedule(Punctuator callback, long streamTimeInterval, long
>>>>> systemTimeUpperBound); // schedules punctuate at stream-time intervals
>>>>> with a system-time upper bound - systemTimeUpperBound must be no less
>>>>> than streamTimeInterval
>>>>>
>>>>> schedule(Punctuator callback, long streamTimeInterval); // schedules
>>>>> punctuate at stream-time intervals without a system-time upper bound -
>>>>> this is equivalent to current stream-time based punctuate
>>>>>
>>>>> Punctuation is triggered when either:
>>>>> - the stream time advances past the (stream time of the previous
>>>>> punctuation) + streamTimeInterval;
>>>>> - or (iff systemTimeUpperBound is set) when the system time advances
>>>>> past the (system time of the previous punctuation) + systemTimeUpperBound
>>>>>
>>>>> Awaiting comments.
>>>>>
>>>>> Thanks,
>>>>> Michal
>>>>>
>>>>> On 21/04/17 16:56, Michal Borowiecki wrote:
>>>>>> Yes, that's what I meant. Just wanted to highlight we'd deprecate it
>>>>>> in favour of something that doesn't return a record. Not a problem
>>>>>> though.
>>>>>>
>>>>>>
>>>>>> On 21/04/17 16:32, Damian Guy wrote:
>>>>>>> Thanks Michal,
>>>>>>> I agree Transformer.punctuate should also be void, but we can deprecate
>>>>>>> that too in favor of the new interface.
>>>>>>>
>>>>>>> Thanks for the javadoc PR!
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Damian
>>>>>>>
>>>>>>> On Fri, 21 Apr 2017 at 09:31 Michal Borowiecki <
>>>>>>> michal.borowie...@openbet.com> wrote:
>>>>>>>
>>>>>>>> Yes, that looks better to me.
>>>>>>>>
>>>>>>>> Note that punctuate on Transformer is currently returning a record, but I
>>>>>>>> think it's ok to have all output records be sent via
>>>>>>>> ProcessorContext.forward, which has to be used anyway if you want to send
>>>>>>>> multiple records from one invocation of punctuate.
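
For reference, the hybrid trigger rule from the 22/04 proposal quoted above (a stream-time interval with a system-time upper bound) can be sketched in plain Java. This is only an illustration under assumptions: `HybridPunctuationSchedule` and `maybePunctuate` are hypothetical names, not Kafka Streams API, and "advances past" is read here as "reaches or exceeds".

```java
import java.util.function.LongConsumer;

/** Hypothetical helper (not part of Kafka Streams): models only the hybrid trigger rule. */
final class HybridPunctuationSchedule {
    private final long streamTimeInterval;
    private final long systemTimeUpperBound; // Long.MAX_VALUE effectively disables the bound
    private final LongConsumer punctuator;   // receives the stream time the punctuation was meant for
    private long nextStreamTime;             // stream time at which the next punctuation is due
    private long lastSystemTime;             // system time of the previous punctuation

    HybridPunctuationSchedule(long streamTimeInterval, long systemTimeUpperBound,
                              long startStreamTime, long startSystemTime,
                              LongConsumer punctuator) {
        if (systemTimeUpperBound < streamTimeInterval) {
            throw new IllegalArgumentException(
                "systemTimeUpperBound must be no less than streamTimeInterval");
        }
        this.streamTimeInterval = streamTimeInterval;
        this.systemTimeUpperBound = systemTimeUpperBound;
        this.punctuator = punctuator;
        this.nextStreamTime = startStreamTime + streamTimeInterval;
        this.lastSystemTime = startSystemTime;
    }

    /** Fires at most once per call; returns true if the punctuator was invoked. */
    boolean maybePunctuate(long streamTime, long systemTime) {
        boolean streamDue = streamTime >= nextStreamTime;
        boolean systemDue = systemTime - lastSystemTime >= systemTimeUpperBound;
        if (!streamDue && !systemDue) {
            return false;
        }
        long scheduled = nextStreamTime;                 // time the punctuation was meant to happen
        punctuator.accept(scheduled);
        nextStreamTime = scheduled + streamTimeInterval; // next one is relative to the scheduled time
        lastSystemTime = systemTime;
        return true;
    }
}
```

A caller would invoke `maybePunctuate` whenever stream time advances and also on a wall-clock timer; during catch-up it would have to call it in a loop, since this sketch fires only once per call.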
>>>>>>>> >>>>>>>> This way it's consistent between Processor and Transformer. >>>>>>>> >>>>>>>> >>>>>>>> BTW, looking at this I found a glitch in the javadoc and put a comment >>>>>>>> there: >>>>>>>> >>>>>>>> https://github.com/apache/kafka/pull/2413/files#r112634612 >>>>>>>> >>>>>>>> and PR: https://github.com/apache/kafka/pull/2884 >>>>>>>> >>>>>>>> Cheers, >>>>>>>> >>>>>>>> Michal >>>>>>>> On 20/04/17 18:55, Damian Guy wrote: >>>>>>>> >>>>>>>> Hi Michal, >>>>>>>> >>>>>>>> Thanks for the KIP. I'd like to propose a bit more of a radical change >>>>>>>> to >>>>>>>> the API. >>>>>>>> 1. deprecate the punctuate method on Processor >>>>>>>> 2. create a new Functional Interface just for Punctuation, something >>>>>>>> like: >>>>>>>> interface Punctuator { >>>>>>>> void punctuate(long timestamp) >>>>>>>> } >>>>>>>> 3. add a new schedule function to ProcessorContext: schedule(long >>>>>>>> interval, PunctuationType type, Punctuator callback) >>>>>>>> 4. deprecate the existing schedule function >>>>>>>> >>>>>>>> Thoughts? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Damian >>>>>>>> >>>>>>>> On Sun, 16 Apr 2017 at 21:55 Michal Borowiecki < >>>>>>>> michal.borowie...@openbet.com> wrote: >>>>>>>> >>>>>>>>> Hi Thomas, >>>>>>>>> >>>>>>>>> I would say our use cases fall in the same category as yours. >>>>>>>>> >>>>>>>>> 1) One is expiry of old records, it's virtually identical to yours. >>>>>>>>> >>>>>>>>> 2) Second one is somewhat more convoluted but boils down to the same >>>>>>>>> type >>>>>>>>> of design: >>>>>>>>> >>>>>>>>> Incoming messages carry a number of fields, including a timestamp. >>>>>>>>> >>>>>>>>> Outgoing messages contain derived fields, one of them (X) is depended >>>>>>>>> on >>>>>>>>> by the timestamp input field (Y) and some other input field (Z). >>>>>>>>> >>>>>>>>> Since the output field X is derived in some non-trivial way, we don't >>>>>>>>> want to force the logic onto downstream apps. 
Instead we want to >>>>>>>>> calculate >>>>>>>>> it in the Kafka Streams app, which means we re-calculate X as soon as >>>>>>>>> the >>>>>>>>> timestamp in Y is reached (wall clock time) and send a message if it >>>>>>>>> changed (I say "if" because the derived field (X) is also conditional >>>>>>>>> on >>>>>>>>> another input field Z). >>>>>>>>> >>>>>>>>> So we have kv stores with the records and an additional kv store with >>>>>>>>> timestamp->id mapping which act like an index where we periodically >>>>>>>>> do a >>>>>>>>> ranged query. >>>>>>>>> >>>>>>>>> Initially we naively tried doing it in punctuate which of course >>>>>>>>> didn't >>>>>>>>> work when there were no regular msgs on the input topic. >>>>>>>>> Since this was before 0.10.1 and state stores weren't query-able from >>>>>>>>> outside we created a "ticker" that produced msgs once per second onto >>>>>>>>> another topic and fed it into the same topology to trigger punctuate. >>>>>>>>> This didn't work either, which was much more surprising to us at the >>>>>>>>> time, because it was not obvious at all that punctuate is only >>>>>>>>> triggered if >>>>>>>>> *all* input partitions receive messages regularly. >>>>>>>>> In the end we had to break this into 2 separate Kafka Streams. Main >>>>>>>>> transformer doesn't use punctuate but sends values of timestamp field >>>>>>>>> Y and >>>>>>>>> the id to a "scheduler" topic where also the periodic ticks are sent. >>>>>>>>> This >>>>>>>>> is consumed by the second topology and is its only input topic. >>>>>>>>> There's a >>>>>>>>> transformer on that topic which populates and updates the time-based >>>>>>>>> indexes and polls them from punctuate. If the time in the timestamp >>>>>>>>> elapsed, the record id is sent to the main transformer, which >>>>>>>>> updates/deletes the record from the main kv store and forwards the >>>>>>>>> transformed record to the output topic. >>>>>>>>> >>>>>>>>> To me this setup feels horrendously complicated for what it does. 
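
The timestamp->id "index" with a periodic ranged query, as described in the use case above, might look roughly like the following. This is a sketch under assumptions: `DueTimeIndex` is a hypothetical in-memory stand-in built on `TreeMap`; a real topology would keep this in a Kafka Streams key-value state store and run the poll from punctuate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for the timestamp->id index store. */
final class DueTimeIndex {
    // Sorted by due time so that "everything due up to now" is a single range query.
    private final TreeMap<Long, List<String>> idsByDueTime = new TreeMap<>();

    void put(long dueTimeMs, String id) {
        idsByDueTime.computeIfAbsent(dueTimeMs, t -> new ArrayList<>()).add(id);
    }

    /** Removes and returns all ids whose due time is <= nowMs, in due-time order. */
    List<String> pollDue(long nowMs) {
        Map<Long, List<String>> due = idsByDueTime.headMap(nowMs, true);
        List<String> result = new ArrayList<>();
        for (List<String> ids : due.values()) {
            result.addAll(ids);
        }
        due.clear(); // headMap is a live view, so this removes the entries from the index
        return result;
    }
}
```

Each returned id would then be looked up in the main store, and X re-calculated and forwarded if it changed.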
>>>>>>>>> >>>>>>>>> We could incrementally improve on this since 0.10.1 to poll the >>>>>>>>> timestamp->id "index" stores from some code outside the KafkaStreams >>>>>>>>> topology so that at least we wouldn't need the extra topic for >>>>>>>>> "ticks". >>>>>>>>> However, the ticks don't feel so hacky when you realise they give you >>>>>>>>> some hypothetical benefits in predictability. You can reprocess the >>>>>>>>> messages in a reproducible manner, since the topologies use >>>>>>>>> event-time, >>>>>>>>> just that the event time is simply the wall-clock time fed into a >>>>>>>>> topic by >>>>>>>>> the ticks. (NB in our use case we haven't yet found a need for this >>>>>>>>> kind of >>>>>>>>> reprocessing). >>>>>>>>> To make that work though, we would have to have the stream time >>>>>>>>> advance >>>>>>>>> based on the presence of msgs on the "tick" topic, regardless of the >>>>>>>>> presence of messages on the other input topic. >>>>>>>>> >>>>>>>>> Same as in the expiry use case, both the wall-clock triggered >>>>>>>>> punctuate >>>>>>>>> and the hybrid would work to simplify this a lot. >>>>>>>>> >>>>>>>>> 3) Finally, I have a 3rd use case in the making but I'm still looking >>>>>>>>> if >>>>>>>>> we can achieve it using session windows instead. I'll keep you posted >>>>>>>>> if we >>>>>>>>> have to go with punctuate there too. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Michal >>>>>>>>> >>>>>>>>> >>>>>>>>> On 11/04/17 20:52, Thomas Becker wrote: >>>>>>>>> >>>>>>>>> Here's an example that we currently have. We have a streams processor >>>>>>>>> that does a transform from one topic into another. One of the fields >>>>>>>>> in >>>>>>>>> the source topic record is an expiration time, and one of the >>>>>>>>> functions >>>>>>>>> of the processor is to ensure that expired records get deleted >>>>>>>>> promptly >>>>>>>>> after that time passes (typically days or weeks after the message was >>>>>>>>> originally produced). 
To do that, the processor keeps a state store of >>>>>>>>> keys and expiration times, iterates that store in punctuate(), and >>>>>>>>> emits delete (null) records for expired items. This needs to happen at >>>>>>>>> some minimum interval regardless of the incoming message rate of the >>>>>>>>> source topic. >>>>>>>>> >>>>>>>>> In this scenario, the expiration of records is the primary function of >>>>>>>>> punctuate, and therefore the key requirement is that the wall-clock >>>>>>>>> measured time between punctuate calls have some upper-bound. So a pure >>>>>>>>> wall-clock based schedule would be fine for our needs. But the >>>>>>>>> proposed >>>>>>>>> "hybrid" system would also be acceptable if that satisfies a broader >>>>>>>>> range of use-cases. >>>>>>>>> >>>>>>>>> On Tue, 2017-04-11 at 14:41 +0200, Michael Noll wrote: >>>>>>>>> >>>>>>>>> I apologize for the longer email below. To my defense, it started >>>>>>>>> out much >>>>>>>>> shorter. :-) Also, to be super-clear, I am intentionally playing >>>>>>>>> devil's >>>>>>>>> advocate for a number of arguments brought forth in order to help >>>>>>>>> improve >>>>>>>>> this KIP -- I am not implying I necessarily disagree with the >>>>>>>>> arguments. >>>>>>>>> >>>>>>>>> That aside, here are some further thoughts. >>>>>>>>> >>>>>>>>> First, there are (at least?) two categories for actions/behavior you >>>>>>>>> invoke >>>>>>>>> via punctuate(): >>>>>>>>> >>>>>>>>> 1. For internal housekeeping of your Processor or Transformer (e.g., >>>>>>>>> to >>>>>>>>> periodically commit to a custom store, to do metrics/logging). Here, >>>>>>>>> the >>>>>>>>> impact of punctuate is typically not observable by other processing >>>>>>>>> nodes >>>>>>>>> in the topology. >>>>>>>>> 2. For controlling the emit frequency of downstream records. Here, >>>>>>>>> the >>>>>>>>> punctuate is all about being observable by downstream processing >>>>>>>>> nodes. 
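
The expiry use case quoted above (a store of keys and expiration times, scanned in punctuate(), emitting null-valued delete records) can be illustrated without any Kafka dependency. The names below are assumptions made for the sketch: `ExpiryTracker` is hypothetical, and the `BiConsumer` stands in for forwarding a key/value pair downstream.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical sketch of the expiry scan; not an actual Kafka Streams Processor. */
final class ExpiryTracker {
    private final Map<String, Long> expirationByKey = new HashMap<>();
    private final BiConsumer<String, String> forward; // stand-in for sending a record downstream

    ExpiryTracker(BiConsumer<String, String> forward) {
        this.forward = forward;
    }

    void track(String key, long expirationMs) {
        expirationByKey.put(key, expirationMs);
    }

    /** Emits a delete (null-valued) record for every expired key; returns how many were emitted. */
    int punctuate(long nowMs) {
        int emitted = 0;
        Iterator<Map.Entry<String, Long>> it = expirationByKey.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMs) {
                forward.accept(entry.getKey(), null); // null value marks the key as deleted
                it.remove();
                emitted++;
            }
        }
        return emitted;
    }
}
```

Because the expiration times are wall-clock deadlines, `nowMs` only makes sense here if punctuate is driven by system time, which is the requirement stated in the use case.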
>>>>>>>>> >>>>>>>>> A few releases back, we introduced record caches (DSL) and state >>>>>>>>> store >>>>>>>>> caches (Processor API) in KIP-63. Here, we addressed a concern >>>>>>>>> relating to >>>>>>>>> (2) where some users needed to control -- here: limit -- the >>>>>>>>> downstream >>>>>>>>> output rate of Kafka Streams because the downstream systems/apps >>>>>>>>> would not >>>>>>>>> be able to keep up with the upstream output rate (Kafka scalability > >>>>>>>>> their >>>>>>>>> scalability). The argument for KIP-63, which notably did not >>>>>>>>> introduce a >>>>>>>>> "trigger" API, was that such an interaction with downstream systems >>>>>>>>> is an >>>>>>>>> operational concern; it should not impact the processing *logic* of >>>>>>>>> your >>>>>>>>> application, and thus we didn't want to complicate the Kafka Streams >>>>>>>>> API, >>>>>>>>> especially not the declarative DSL, with such operational concerns. >>>>>>>>> >>>>>>>>> This KIP's discussion on `punctuate()` takes us back in time (<-- >>>>>>>>> sorry, I >>>>>>>>> couldn't resist to not make this pun :-P). As a meta-comment, I am >>>>>>>>> observing that our conversation is moving more and more into the >>>>>>>>> direction >>>>>>>>> of explicit "triggers" because, so far, I have seen only motivations >>>>>>>>> for >>>>>>>>> use cases in category (2), but none yet for (1)? For example, some >>>>>>>>> comments voiced here are about sth like "IF stream-time didn't >>>>>>>>> trigger >>>>>>>>> punctuate, THEN trigger punctuate based on processing-time". Do we >>>>>>>>> want >>>>>>>>> this, and if so, for which use cases and benefits? Also, on a >>>>>>>>> related >>>>>>>>> note, whatever we are discussing here will impact state store caches >>>>>>>>> (Processor API) and perhaps also impact record caches (DSL), thus we >>>>>>>>> should >>>>>>>>> clarify any such impact here. >>>>>>>>> >>>>>>>>> Switching topics slightly. 
>>>>>>>>> >>>>>>>>> Jay wrote: >>>>>>>>> >>>>>>>>> One thing I've always found super important for this kind of design >>>>>>>>> work >>>>>>>>> is to do a really good job of cataloging the landscape of use cases >>>>>>>>> and >>>>>>>>> how prevalent each one is. >>>>>>>>> >>>>>>>>> +1 to this, as others have already said. >>>>>>>>> >>>>>>>>> Here, let me highlight -- just in case -- that when we talked about >>>>>>>>> windowing use cases in the recent emails, the Processor API (where >>>>>>>>> `punctuate` resides) does not have any notion of windowing at >>>>>>>>> all. If you >>>>>>>>> want to do windowing *in the Processor API*, you must do so manually >>>>>>>>> in >>>>>>>>> combination with window stores. For this reason I'd suggest to >>>>>>>>> discuss use >>>>>>>>> cases not just in general, but also in view of how you'd do so in the >>>>>>>>> Processor API vs. in the DSL. Right now, changing/improving >>>>>>>>> `punctuate` >>>>>>>>> does not impact the DSL at all, unless we add new functionality to >>>>>>>>> it. >>>>>>>>> >>>>>>>>> Jay wrote in his strawman example: >>>>>>>>> >>>>>>>>> You aggregate click and impression data for a reddit like site. >>>>>>>>> Every ten >>>>>>>>> minutes you want to output a ranked list of the top 10 articles >>>>>>>>> ranked by >>>>>>>>> clicks/impressions for each geographical area. I want to be able >>>>>>>>> run this >>>>>>>>> in steady state as well as rerun to regenerate results (or catch up >>>>>>>>> if it >>>>>>>>> crashes). 
>>>>>>>>> >>>>>>>>> This is a good example for more than the obvious reason: In KIP-63, >>>>>>>>> we >>>>>>>>> argued that the reason for saying "every ten minutes" above is not >>>>>>>>> necessarily about because you want to output data *exactly* after ten >>>>>>>>> minutes, but that you want to perform an aggregation based on 10- >>>>>>>>> minute >>>>>>>>> windows of input data; i.e., the point is about specifying the input >>>>>>>>> for >>>>>>>>> your aggregation, not or less about when the results of the >>>>>>>>> aggregation >>>>>>>>> should be send downstream. To take an extreme example, you could >>>>>>>>> disable >>>>>>>>> record caches and let your app output a downstream update for every >>>>>>>>> incoming input record. If the last input record was from at minute 7 >>>>>>>>> of 10 >>>>>>>>> (for a 10-min window), then what your app would output at minute 10 >>>>>>>>> would >>>>>>>>> be identical to what it had already emitted at minute 7 earlier >>>>>>>>> anyways. >>>>>>>>> This is particularly true when we take late-arriving data into >>>>>>>>> account: if >>>>>>>>> a late record arrived at minute 13, your app would (by default) send >>>>>>>>> a new >>>>>>>>> update downstream, even though the "original" 10 minutes have already >>>>>>>>> passed. >>>>>>>>> >>>>>>>>> Jay wrote...: >>>>>>>>> >>>>>>>>> There are a couple of tricky things that seem to make this hard >>>>>>>>> with >>>>>>>>> >>>>>>>>> either >>>>>>>>> >>>>>>>>> of the options proposed: >>>>>>>>> 1. If I emit this data using event time I have the problem >>>>>>>>> described where >>>>>>>>> a geographical region with no new clicks or impressions will fail >>>>>>>>> to >>>>>>>>> >>>>>>>>> output >>>>>>>>> >>>>>>>>> results. >>>>>>>>> >>>>>>>>> ...and Arun Mathew wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> We window by the event time, but trigger punctuate in <punctuate >>>>>>>>> interval> >>>>>>>>> duration of system time, in the absence of an event crossing the >>>>>>>>> punctuate >>>>>>>>> event time. 
>>>>>>>>> >>>>>>>>> So, given what I wrote above about the status quo and what you can >>>>>>>>> already >>>>>>>>> do with it, is the concern that the state store cache doesn't give >>>>>>>>> you >>>>>>>>> *direct* control over "forcing an output after no later than X >>>>>>>>> seconds [of >>>>>>>>> processing-time]" but only indirect control through a cache >>>>>>>>> size? (Note >>>>>>>>> that I am not dismissing the claims why this might be helpful.) >>>>>>>>> >>>>>>>>> Arun Mathew wrote: >>>>>>>>> >>>>>>>>> We are using Kafka Stream for our Audit Trail, where we need to >>>>>>>>> output the >>>>>>>>> event counts on each topic on each cluster aggregated over a 1 >>>>>>>>> minute >>>>>>>>> window. We have to use event time to be able to cross check the >>>>>>>>> counts. >>>>>>>>> >>>>>>>>> But >>>>>>>>> >>>>>>>>> we need to trigger punctuate [aggregate event pushes] by system >>>>>>>>> time in >>>>>>>>> >>>>>>>>> the >>>>>>>>> >>>>>>>>> absence of events. Otherwise the event counts for unexpired windows >>>>>>>>> would >>>>>>>>> be 0 which is bad. >>>>>>>>> >>>>>>>>> Isn't the latter -- "count would be 0" -- the problem between the >>>>>>>>> absence >>>>>>>>> of output vs. an output of 0, similar to the use of `Option[T]` in >>>>>>>>> Scala >>>>>>>>> and the difference between `None` and `Some(0)`? That is, isn't the >>>>>>>>> root >>>>>>>>> cause that the downstream system interprets the absence of output in >>>>>>>>> a >>>>>>>>> particular way ("No output after 1 minute = I consider the output to >>>>>>>>> be >>>>>>>>> 0.")? Arguably, you could also adapt the downstream system (if >>>>>>>>> possible) >>>>>>>>> to correctly handle the difference between absence of output vs. >>>>>>>>> output of >>>>>>>>> 0. I am not implying that we shouldn't care about such a use case, >>>>>>>>> but >>>>>>>>> want to understand the motivation better. 
:-) >>>>>>>>> >>>>>>>>> Also, to add some perspective, in some related discussions we talked >>>>>>>>> about >>>>>>>>> how a Kafka Streams application should not worry or not be coupled >>>>>>>>> unnecessarily with such interpretation specifics in a downstream >>>>>>>>> system's >>>>>>>>> behavior. After all, tomorrow your app's output might be consumed by >>>>>>>>> more >>>>>>>>> than just this one downstream system. Arguably, Kafka Connect rather >>>>>>>>> than >>>>>>>>> Kafka Streams might be the best tool to link the universes of Kafka >>>>>>>>> and >>>>>>>>> downstream systems, including helping to reconcile the differences in >>>>>>>>> how >>>>>>>>> these systems interpret changes, updates, late-arriving data, >>>>>>>>> etc. Kafka >>>>>>>>> Connect would allow you to decouple the Kafka Streams app's logical >>>>>>>>> processing from the specifics of downstream systems, thanks to >>>>>>>>> specific >>>>>>>>> sink connectors (e.g. for Elastic, Cassandra, MySQL, S3, etc). Would >>>>>>>>> this >>>>>>>>> decoupling with Kafka Connect help here? (And if the answer is "Yes, >>>>>>>>> but >>>>>>>>> it's currently awkward to use Connect for this", this might be a >>>>>>>>> problem we >>>>>>>>> can solve, too.) >>>>>>>>> >>>>>>>>> Switching topics slightly again. >>>>>>>>> >>>>>>>>> Thomas wrote: >>>>>>>>> >>>>>>>>> I'm not entirely convinced that a separate callback (option C) >>>>>>>>> is that messy (it could just be a default method with an empty >>>>>>>>> implementation), but if we wanted a single API to handle both >>>>>>>>> cases, >>>>>>>>> how about something like the following? >>>>>>>>> >>>>>>>>> enum Time { >>>>>>>>> STREAM, >>>>>>>>> CLOCK >>>>>>>>> } >>>>>>>>> >>>>>>>>> Yeah, I am on the fence here, too. 
If we use the 1-method approach, >>>>>>>>> then >>>>>>>>> whatever the user is doing inside this method is a black box to Kafka >>>>>>>>> Streams (similar to how we have no idea what the user does inside a >>>>>>>>> `foreach` -- if the function passed to `foreach` writes to external >>>>>>>>> systems, then Kafka Streams is totally unaware of the fact). We >>>>>>>>> won't >>>>>>>>> know, for example, if the stream-time action has a smaller "trigger" >>>>>>>>> frequency than the processing-time action. Or, we won't know whether >>>>>>>>> the >>>>>>>>> user custom-codes a "not later than" trigger logic ("Do X every 1- >>>>>>>>> minute of >>>>>>>>> stream-time or 1-minute of processing-time, whichever comes >>>>>>>>> first"). That >>>>>>>>> said, I am not certain yet whether we would need such knowledge >>>>>>>>> because, >>>>>>>>> when using the Processor API, most of the work and decisions must be >>>>>>>>> done >>>>>>>>> by the user anyways. It would matter though if the concept of >>>>>>>>> "triggers" >>>>>>>>> were to bubble up into the DSL because in the DSL the management of >>>>>>>>> windowing, window stores, etc. must be done automatically by Kafka >>>>>>>>> Streams. >>>>>>>>> >>>>>>>>> [In any case, btw, we have the corner case where the user configured >>>>>>>>> the >>>>>>>>> stream-time to be processing-time (e.g. via wall-clock timestamp >>>>>>>>> extractor), at which point both punctuate variants are based on the >>>>>>>>> same >>>>>>>>> time semantics / timeline.] >>>>>>>>> >>>>>>>>> Again, I apologize for the wall of text. Congratulations if you made >>>>>>>>> it >>>>>>>>> this far. :-) >>>>>>>>> >>>>>>>>> More than happy to hear your thoughts! >>>>>>>>> Michael >>>>>>>>> >>>>>>>>> On Tue, Apr 11, 2017 at 1:12 AM, Arun Mathew <arunmathe...@gmail.com> >>>>>>>>> <arunmathe...@gmail.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks Matthias. >>>>>>>>> Sure, will correct it right away. >>>>>>>>> >>>>>>>>> On 11-Apr-2017 8:07 AM, "Matthias J. 
Sax" <matth...@confluent.io> >>>>>>>>> <matth...@confluent.io> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Thanks for preparing this page! >>>>>>>>> >>>>>>>>> About terminology: >>>>>>>>> >>>>>>>>> You introduce the term "event time" -- but we should call this >>>>>>>>> "stream >>>>>>>>> time" -- "stream time" is whatever TimestampExtractor returns and >>>>>>>>> this >>>>>>>>> could be event time, ingestion time, or processing/wall-clock time. >>>>>>>>> >>>>>>>>> Does this make sense to you? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> On 4/10/17 4:58 AM, Arun Mathew wrote: >>>>>>>>> >>>>>>>>> Thanks Ewen. >>>>>>>>> >>>>>>>>> @Michal, @all, I have created a child page to start the Use Cases >>>>>>>>> >>>>>>>>> discussion [https://cwiki.apache.org/confluence/display/KAFKA/ >>>>>>>>> Punctuate+Use+Cases]. Please go through it and give your comments. >>>>>>>>> >>>>>>>>> >>>>>>>>> @Tianji, Sorry for the delay. I am trying to make the patch >>>>>>>>> public. >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Arun Mathew >>>>>>>>> >>>>>>>>> On 4/8/17, 02:00, "Ewen Cheslack-Postava" <e...@confluent.io> >>>>>>>>> <e...@confluent.io> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Arun, >>>>>>>>> >>>>>>>>> I've given you permission to edit the wiki. Let me know if >>>>>>>>> you run >>>>>>>>> >>>>>>>>> into any >>>>>>>>> >>>>>>>>> issues. >>>>>>>>> >>>>>>>>> -Ewen >>>>>>>>> >>>>>>>>> On Fri, Apr 7, 2017 at 1:21 AM, Arun Mathew <amathew@yahoo-co >>>>>>>>> rp.jp> <amat...@yahoo-corp.jp> >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> > Thanks Michal. I don’t have the access yet [arunmathew88]. >>>>>>>>> Should I >>>>>>>>> >>>>>>>>> be >>>>>>>>> >>>>>>>>> > sending a separate mail for this? >>>>>>>>> > >>>>>>>>> > I thought one of the person following this thread would be >>>>>>>>> able to >>>>>>>>> >>>>>>>>> give me >>>>>>>>> >>>>>>>>> > access. 
>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > *From: *Michal Borowiecki <michal.borowie...@openbet.com> >>>>>>>>> <michal.borowie...@openbet.com> >>>>>>>>> > *Reply-To: *"dev@kafka.apache.org" <dev@kafka.apache.org> >>>>>>>>> <dev@kafka.apache.org> <dev@kafka.apache.org> >>>>>>>>> > *Date: *Friday, April 7, 2017 at 17:16 >>>>>>>>> > *To: *"dev@kafka.apache.org" <dev@kafka.apache.org> >>>>>>>>> <dev@kafka.apache.org> <dev@kafka.apache.org> >>>>>>>>> > *Subject: *Re: [DISCUSS] KIP-138: Change punctuate >>>>>>>>> semantics >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Hi Arun, >>>>>>>>> > >>>>>>>>> > I was thinking along the same lines as you, listing the use >>>>>>>>> cases >>>>>>>>> >>>>>>>>> on the >>>>>>>>> >>>>>>>>> > wiki, but didn't find time to get around doing that yet. >>>>>>>>> > Don't mind if you do it if you have access now. >>>>>>>>> > I was thinking it would be nice if, once we have the use >>>>>>>>> cases >>>>>>>>> >>>>>>>>> listed, >>>>>>>>> >>>>>>>>> > people could use likes to up-vote the use cases similar to >>>>>>>>> what >>>>>>>>> >>>>>>>>> they're >>>>>>>>> >>>>>>>>> > working on. >>>>>>>>> > >>>>>>>>> > I should have a bit more time to action this in the next >>>>>>>>> few days, >>>>>>>>> >>>>>>>>> but >>>>>>>>> >>>>>>>>> > happy for you to do it if you can beat me to it ;-) >>>>>>>>> > >>>>>>>>> > Cheers, >>>>>>>>> > Michal >>>>>>>>> > >>>>>>>>> > On 07/04/17 04:39, Arun Mathew wrote: >>>>>>>>> > >>>>>>>>> > Sure, Thanks Matthias. My id is [arunmathew88]. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Of course. I was thinking of a subpage where people can >>>>>>>>> >>>>>>>>> collaborate. >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Will do as per Michael’s suggestion. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Regards, >>>>>>>>> > >>>>>>>>> > Arun Mathew >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > On 4/7/17, 12:30, "Matthias J. 
Sax" <matth...@confluent.io> >>>>>>>>> <matth...@confluent.io> >>>>>>>>> < >>>>>>>>> >>>>>>>>> matth...@confluent.io> wrote: >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Please share your Wiki-ID and a committer can give you >>>>>>>>> write >>>>>>>>> >>>>>>>>> access. >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Btw: as you did not initiate the KIP, you should not >>>>>>>>> change the >>>>>>>>> >>>>>>>>> KIP >>>>>>>>> >>>>>>>>> > >>>>>>>>> > without the permission of the original author -- in >>>>>>>>> this case >>>>>>>>> >>>>>>>>> Michael. >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > So you might also just share your thought over the >>>>>>>>> mailing list >>>>>>>>> >>>>>>>>> and >>>>>>>>> >>>>>>>>> > >>>>>>>>> > Michael can update the KIP page. Or, as an alternative, >>>>>>>>> just >>>>>>>>> >>>>>>>>> create a >>>>>>>>> >>>>>>>>> > >>>>>>>>> > subpage for the KIP page. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > @Michael: WDYT? >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > -Matthias >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > On 4/6/17 8:05 PM, Arun Mathew wrote: >>>>>>>>> > >>>>>>>>> > > Hi Jay, >>>>>>>>> > >>>>>>>>> > > Thanks for the advise, I would like to list >>>>>>>>> down >>>>>>>>> >>>>>>>>> the use cases as >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > per your suggestion. But it seems I don't have write >>>>>>>>> >>>>>>>>> permission to the >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > Apache Kafka Confluent Space. Whom shall I request >>>>>>>>> for it? >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > > Regarding your last question. We are using a patch in >>>>>>>>> our >>>>>>>>> >>>>>>>>> production system >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > which does exactly this. 
>>>>>>>>> > >>>>>>>>> > > We window by the event time, but trigger punctuate in >>>>>>>>> >>>>>>>>> <punctuate interval> >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > duration of system time, in the absence of an event >>>>>>>>> crossing >>>>>>>>> >>>>>>>>> the punctuate >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > event time. >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > > We are using Kafka Stream for our Audit Trail, where >>>>>>>>> we need >>>>>>>>> >>>>>>>>> to output the >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > event counts on each topic on each cluster aggregated >>>>>>>>> over a >>>>>>>>> >>>>>>>>> 1 minute >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > window. We have to use event time to be able to cross >>>>>>>>> check >>>>>>>>> >>>>>>>>> the counts. But >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > we need to trigger punctuate [aggregate event pushes] >>>>>>>>> by >>>>>>>>> >>>>>>>>> system time in the >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > absence of events. Otherwise the event counts for >>>>>>>>> unexpired >>>>>>>>> >>>>>>>>> windows would >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > be 0 which is bad. >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > > "Maybe a hybrid solution works: I window by event >>>>>>>>> time but >>>>>>>>> >>>>>>>>> trigger results >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > by system time for windows that have updated? Not >>>>>>>>> really sure >>>>>>>>> >>>>>>>>> the details >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > of making that work. Does that work? Are there >>>>>>>>> concrete >>>>>>>>> >>>>>>>>> examples where you >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > actually want the current behavior?" >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > > -- >>>>>>>>> > >>>>>>>>> > > With Regards, >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > > Arun Mathew >>>>>>>>> > >>>>>>>>> > > Yahoo! 
JAPAN Corporation >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > > On Wed, Apr 5, 2017 at 8:48 PM, Tianji Li < >>>>>>>>> >>>>>>>>> skyah...@gmail.com><skyah...@gmail.com> <skyah...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > >> Hi Jay, >>>>>>>>> > >>>>>>>>> > >> >>>>>>>>> > >>>>>>>>> > >> The hybrid solution is exactly what I expect and >>>>>>>>> need for >>>>>>>>> >>>>>>>>> our use cases >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> when dealing with telecom data. >>>>>>>>> > >>>>>>>>> > >> >>>>>>>>> > >>>>>>>>> > >> Thanks >>>>>>>>> > >>>>>>>>> > >> Tianji >>>>>>>>> > >>>>>>>>> > >> >>>>>>>>> > >>>>>>>>> > >> On Wed, Apr 5, 2017 at 12:01 AM, Jay Kreps < >>>>>>>>> >>>>>>>>> j...@confluent.io><j...@confluent.io> <j...@confluent.io> wrote: >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> >>>>>>>>> > >>>>>>>>> > >>> Hey guys, >>>>>>>>> > >>>>>>>>> > >>> >>>>>>>>> > >>>>>>>>> > >>> One thing I've always found super important for >>>>>>>>> this kind >>>>>>>>> >>>>>>>>> of design work >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> is >>>>>>>>> > >>>>>>>>> > >>> to do a really good job of cataloging the landscape >>>>>>>>> of use >>>>>>>>> >>>>>>>>> cases and how >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> prevalent each one is. By that I mean not just >>>>>>>>> listing lots >>>>>>>>> >>>>>>>>> of uses, but >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> also grouping them into categories that >>>>>>>>> functionally need >>>>>>>>> >>>>>>>>> the same thing. >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> In the absence of this it is very hard to reason >>>>>>>>> about >>>>>>>>> >>>>>>>>> design proposals. >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> From the proposals so far I think we have a lot of >>>>>>>>> >>>>>>>>> discussion around >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> possible apis, but less around what the user needs >>>>>>>>> for >>>>>>>>> >>>>>>>>> different use >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> cases >>>>>>>>> > >>>>>>>>> > >>> and how they would implement that using the api. 
>>>>>>>>> > >>>>>>>>> > >>> >>>>>>>>> > >>>>>>>>> > >>> Here is an example: >>>>>>>>> > >>>>>>>>> > >>> You aggregate click and impression data for a >>>>>>>>> reddit like >>>>>>>>> >>>>>>>>> site. Every ten >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> minutes you want to output a ranked list of the top >>>>>>>>> 10 >>>>>>>>> >>>>>>>>> articles ranked by >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> clicks/impressions for each geographical area. I >>>>>>>>> want to be >>>>>>>>> >>>>>>>>> able run this >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> in steady state as well as rerun to regenerate >>>>>>>>> results (or >>>>>>>>> >>>>>>>>> catch up if it >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> crashes). >>>>>>>>> > >>>>>>>>> > >>> >>>>>>>>> > >>>>>>>>> > >>> There are a couple of tricky things that seem to >>>>>>>>> make this >>>>>>>>> >>>>>>>>> hard with >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> either >>>>>>>>> > >>>>>>>>> > >>> of the options proposed: >>>>>>>>> > >>>>>>>>> > >>> 1. If I emit this data using event time I have the >>>>>>>>> problem >>>>>>>>> >>>>>>>>> described >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> where >>>>>>>>> > >>>>>>>>> > >>> a geographical region with no new clicks or >>>>>>>>> impressions >>>>>>>>> >>>>>>>>> will fail to >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> output >>>>>>>>> > >>>>>>>>> > >>> results. >>>>>>>>> > >>>>>>>>> > >>> 2. If I emit this data using system time I have the >>>>>>>>> problem >>>>>>>>> >>>>>>>>> that when >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> reprocessing data my window may not be ten minutes >>>>>>>>> but 10 >>>>>>>>> >>>>>>>>> hours if my >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> processing is very fast so it dramatically changes >>>>>>>>> the >>>>>>>>> >>>>>>>>> output. >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> >>>>>>>>> > >>>>>>>>> > >>> Maybe a hybrid solution works: I window by event >>>>>>>>> time but >>>>>>>>> >>>>>>>>> trigger results >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>> by system time for windows that have updated? 
>>>>>>>>> > >>> Not really sure of the details of making that work. Does that work?
>>>>>>>>> > >>> Are there concrete examples where you actually want the current
>>>>>>>>> > >>> behavior?
>>>>>>>>> > >>>
>>>>>>>>> > >>> -Jay
>>>>>>>>> > >>>
>>>>>>>>> > >>>
>>>>>>>>> > >>> On Tue, Apr 4, 2017 at 8:32 PM, Arun Mathew <arunmathe...@gmail.com>
>>>>>>>>> > >>> wrote:
>>>>>>>>> > >>>
>>>>>>>>> > >>>> Hi All,
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> Thanks for the KIP. We were also in need of a mechanism to trigger
>>>>>>>>> > >>>> punctuate in the absence of events.
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> As I described in [
>>>>>>>>> > >>>> https://issues.apache.org/jira/browse/KAFKA-3514?focusedCommentId=15926036&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15926036
>>>>>>>>> > >>>> ],
>>>>>>>>> > >>>>
>>>>>>>>> > >>>>    - Our approach involved using the event time by default.
>>>>>>>>> > >>>>    - The method to check if there is any punctuate ready in the
>>>>>>>>> > >>>>    PunctuationQueue is triggered by any event received by the stream
>>>>>>>>> > >>>>    thread, or at the polling intervals in the absence of any events.
>>>>>>>>> > >>>>    - When we create Punctuate objects (which contain the next event
>>>>>>>>> > >>>>    time for punctuation and the interval), we also record the
>>>>>>>>> > >>>>    creation time (system time).
>>>>>>>>> > >>>>    - While checking for maturity of the Punctuate Schedule in the
>>>>>>>>> > >>>>    mayBePunctuate method, we also check whether the punctuate
>>>>>>>>> > >>>>    interval has elapsed on the system clock since the schedule
>>>>>>>>> > >>>>    creation time.
>>>>>>>>> > >>>>    - In the absence of any event, or in the absence of any event for
>>>>>>>>> > >>>>    one topic in the partition group assigned to the stream task, the
>>>>>>>>> > >>>>    system time will pass the interval and we trigger a punctuate
>>>>>>>>> > >>>>    using the expected punctuation event time.
>>>>>>>>> > >>>>    - We then create the next punctuation schedule as punctuation
>>>>>>>>> > >>>>    event time + punctuation interval [again recording the system
>>>>>>>>> > >>>>    time of creation of the schedule].
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> We call this a Hybrid Punctuate. Of course, this approach has pros
>>>>>>>>> > >>>> and cons.
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> Pros
>>>>>>>>> > >>>>
>>>>>>>>> > >>>>    - Punctuates will happen at most <punctuate interval> apart in
>>>>>>>>> > >>>>    terms of system time.
>>>>>>>>> > >>>>    - The semantics as a whole continue to revolve around event time.
>>>>>>>>> > >>>>    - We can use the old data [old timestamps] to rerun any
>>>>>>>>> > >>>>    experiments or tests.
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> Cons
>>>>>>>>> > >>>>
>>>>>>>>> > >>>>    - In case the <punctuate interval> is not a time duration [say
>>>>>>>>> > >>>>    logical time/event count], then the approach might not be
>>>>>>>>> > >>>>    meaningful.
>>>>>>>>> > >>>>    - In case we have to wait for an actual event from a low event
>>>>>>>>> > >>>>    rate partition in the partition group, this approach will jump
>>>>>>>>> > >>>>    the gun.
>>>>>>>>> > >>>>    - In case the event processing cannot catch up with the event
>>>>>>>>> > >>>>    rate and the events with the expected timestamps get queued for
>>>>>>>>> > >>>>    a long time, this approach might jump the gun.
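
The hybrid schedule listed above can be sketched in a few lines of Java. This is only an illustration of the bullet points, not the patch referred to in the thread and not Kafka Streams code: the class name HybridPunctuationSchedule, its constructor arguments, and the use of OptionalLong as a return type are all assumptions made for the example. The schedule matures when either stream time reaches the expected event time or a full interval of system time has passed since the schedule was created, and in both cases it fires with the expected event time.

```java
import java.util.OptionalLong;

/**
 * Illustrative sketch of the "hybrid punctuate" schedule described above.
 * This is not Kafka Streams code; every name here is hypothetical.
 */
final class HybridPunctuationSchedule {
    private final long intervalMs;
    private long nextEventTimeMs;    // expected event time of the next punctuation
    private long createdAtSystemMs;  // system time when this schedule was (re)created

    HybridPunctuationSchedule(long firstEventTimeMs, long intervalMs, long nowSystemMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        this.intervalMs = intervalMs;
        this.nextEventTimeMs = firstEventTimeMs;
        this.createdAtSystemMs = nowSystemMs;
    }

    /**
     * Returns the timestamp to punctuate with, or empty if the schedule has
     * not matured on either clock.
     */
    OptionalLong mayPunctuate(long streamTimeMs, long nowSystemMs) {
        boolean eventTimeReached = streamTimeMs >= nextEventTimeMs;
        boolean systemTimeElapsed = nowSystemMs - createdAtSystemMs >= intervalMs;
        if (!eventTimeReached && !systemTimeElapsed) {
            return OptionalLong.empty();
        }
        // Always punctuate with the expected event time, even when the
        // system clock was the trigger.
        long punctuationTimeMs = nextEventTimeMs;
        // Next schedule = punctuation event time + interval, and the
        // system time of (re)creation is recorded again.
        nextEventTimeMs += intervalMs;
        createdAtSystemMs = nowSystemMs;
        return OptionalLong.of(punctuationTimeMs);
    }
}

final class HybridPunctuateDemo {
    public static void main(String[] args) {
        HybridPunctuationSchedule schedule =
                new HybridPunctuationSchedule(60_000L, 60_000L, 0L);
        // Neither clock has matured yet.
        System.out.println(schedule.mayPunctuate(30_000L, 10_000L));
        // No new events, but a full interval of system time has passed.
        System.out.println(schedule.mayPunctuate(30_000L, 60_000L));
        // An event has pushed stream time past the next expected event time.
        System.out.println(schedule.mayPunctuate(125_000L, 61_000L));
    }
}
```

The second call in the demo shows the "jump the gun" case from the cons list: the punctuation for event time 60000 fires although stream time has only reached 30000.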
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> I believe the above approach and discussion come close to approach A.
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> -----------
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> I like the idea of having an event count based punctuate.
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> -----------
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> I agree with the discussion around approach C, that we should
>>>>>>>>> > >>>> provide the user with the option to choose system time or event
>>>>>>>>> > >>>> time based punctuates. But I believe that the user predominantly
>>>>>>>>> > >>>> wants to use event time while not missing out on regular punctuates
>>>>>>>>> > >>>> due to event delays or event absences. Hence a complex punctuate
>>>>>>>>> > >>>> option as Matthias mentioned (quoted below) would be most apt.
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> "- We might want to add "complex" schedules later on (like,
>>>>>>>>> > >>>> punctuate on every 10 seconds event-time or 60 seconds system-time
>>>>>>>>> > >>>> whatever comes first)."
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> -----------
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> I think I read somewhere that Kafka Streams started with System
>>>>>>>>> > >>>> Time as the punctuation standard, but was later changed to Event
>>>>>>>>> > >>>> Time. I guess there would be some good reason behind it. As Kafka
>>>>>>>>> > >>>> Streams wants to evolve more on the Stream Processing front, I
>>>>>>>>> > >>>> believe the emphasis on event time would remain quite strong.
>>>>>>>>> > >>>>
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> With Regards,
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> Arun Mathew
>>>>>>>>> > >>>> Yahoo! JAPAN Corporation, Tokyo
>>>>>>>>> > >>>>
>>>>>>>>> > >>>>
>>>>>>>>> > >>>> On Wed, Apr 5, 2017 at 3:53 AM, Thomas Becker <tobec...@tivo.com>
>>>>>>>>> > >>>> wrote:
>>>>>>>>> > >>>>
>>>>>>>>> > >>>>> Yeah I like PunctuationType much better; I just threw Time out
>>>>>>>>> > >>>>> there more as a strawman than an actual suggestion ;) I still
>>>>>>>>> > >>>>> think it's worth considering what this buys us over an additional
>>>>>>>>> > >>>>> callback. I foresee a number of punctuate implementations
>>>>>>>>> > >>>>> following this pattern:
>>>>>>>>> > >>>>>
>>>>>>>>> > >>>>> public void punctuate(PunctuationType type) {
>>>>>>>>> > >>>>>     switch (type) {
>>>>>>>>> > >>>>>         case EVENT_TIME:
>>>>>>>>> > >>>>>             methodA();
>>>>>>>>> > >>>>>             break;
>>>>>>>>> > >>>>>         case SYSTEM_TIME:
>>>>>>>>> > >>>>>             methodB();
>>>>>>>>> > >>>>>             break;
>>>>>>>>> > >>>>>     }
>>>>>>>>> > >>>>> }
>>>>>>>>> > >>>>>
>>>>>>>>> > >>>>> I guess one advantage of this approach is we could add additional
>>>>>>>>> > >>>>> punctuation types later in a backwards compatible way (like event
>>>>>>>>> > >>>>> count as you mentioned).
>>>>>>>>> > >>>>>
>>>>>>>>> > >>>>> -Tommy
>>>>>>>>> > >>>>>
>>>>>>>>> > >>>>>
>>>>>>>>> > >>>>> On Tue, 2017-04-04 at 11:10 -0700, Matthias J. Sax wrote:
>>>>>>>>> > >>>>>> That sounds promising.
>>>>>>>>> > >>>>>>
>>>>>>>>> > >>>>>> I am just wondering if `Time` is the best name. Maybe we want to
>>>>>>>>> > >>>>>> add other non-time based punctuations at some point later. I
>>>>>>>>> > >>>>>> would suggest
>>>>>>>>> > >>>>>>
>>>>>>>>> > >>>>>> enum PunctuationType {
>>>>>>>>> > >>>>>>   EVENT_TIME,
>>>>>>>>> > >>>>>>   SYSTEM_TIME,
>>>>>>>>> > >>>>>> }
>>>>>>>>> > >>>>>>
>>>>>>>>> > >>>>>> or similar.
>>>>>>>>> > >>>>>> Just to keep the door open -- it's easier to add new stuff if
>>>>>>>>> > >>>>>> the name is more generic.
>>>>>>>>> > >>>>>>
>>>>>>>>> > >>>>>>
>>>>>>>>> > >>>>>> -Matthias
>>>>>>>>> > >>>>>>
>>>>>>>>> > >>>>>>
>>>>>>>>> > >>>>>> On 4/4/17 5:30 AM, Thomas Becker wrote:
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>> I agree that the framework providing and managing the notion of
>>>>>>>>> > >>>>>>> stream time is valuable and not something we would want to
>>>>>>>>> > >>>>>>> delegate to the tasks. I'm not entirely convinced that a
>>>>>>>>> > >>>>>>> separate callback (option C) is that messy (it could just be a
>>>>>>>>> > >>>>>>> default method with an empty implementation), but if we wanted
>>>>>>>>> > >>>>>>> a single API to handle both cases, how about something like the
>>>>>>>>> > >>>>>>> following?
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>> enum Time {
>>>>>>>>> > >>>>>>>    STREAM,
>>>>>>>>> > >>>>>>>    CLOCK
>>>>>>>>> > >>>>>>> }
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>> Then on ProcessorContext:
>>>>>>>>> > >>>>>>> context.schedule(Time time, long interval)  // We could allow
>>>>>>>>> > >>>>>>> this to be called once for each value of time to mix approaches.
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>> Then the Processor API becomes:
>>>>>>>>> > >>>>>>> punctuate(Time time) // time here denotes which schedule
>>>>>>>>> > >>>>>>> resulted in this call.
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>> Thoughts?
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>> On Mon, 2017-04-03 at 22:44 -0700, Matthias J. Sax wrote:
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> Thanks a lot for the KIP Michal,
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> I was thinking about the four options you proposed in more
>>>>>>>>> > >>>>>>>> detail and these are my thoughts:
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> (A) You argue, that users can still "punctuate" on event-time
>>>>>>>>> > >>>>>>>> via process(), but I am not sure if this is possible. Note,
>>>>>>>>> > >>>>>>>> that users only get record timestamps via context.timestamp().
>>>>>>>>> > >>>>>>>> Thus, users would need to track the time progress per
>>>>>>>>> > >>>>>>>> partition (based on the partitions they observe via
>>>>>>>>> > >>>>>>>> context.partition()). (This alone puts a huge burden on the
>>>>>>>>> > >>>>>>>> user by itself.) However, users are not notified at startup
>>>>>>>>> > >>>>>>>> what partitions are assigned, and users are not notified when
>>>>>>>>> > >>>>>>>> partitions get revoked. Because this information is not
>>>>>>>>> > >>>>>>>> available, it's not possible to "manually advance"
>>>>>>>>> > >>>>>>>> stream-time, and thus event-time punctuation within process()
>>>>>>>>> > >>>>>>>> seems not to be possible -- or do you see a way to get it
>>>>>>>>> > >>>>>>>> done? And even if it is possible, it might still be too clumsy
>>>>>>>>> > >>>>>>>> to use.
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> (B) This does not allow mixing both approaches, thus limiting
>>>>>>>>> > >>>>>>>> what users can do.
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> (C) This should give all the flexibility we need. However,
>>>>>>>>> > >>>>>>>> just adding one more method seems to be a solution that is too
>>>>>>>>> > >>>>>>>> simple (cf my comments below).
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> (D) This might be hard to use. Also, I am not sure how a user
>>>>>>>>> > >>>>>>>> could enable system-time and event-time punctuation in
>>>>>>>>> > >>>>>>>> parallel.
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> Overall, option (C) seems to be the most promising approach to
>>>>>>>>> > >>>>>>>> me. Because I also favor a clean API, we might keep the
>>>>>>>>> > >>>>>>>> current punctuate() as-is, but deprecate it -- so we can
>>>>>>>>> > >>>>>>>> remove it at some later point when people use the "new
>>>>>>>>> > >>>>>>>> punctuate API".
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> Couple of follow up questions:
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> - I am wondering if we should have two callback methods or
>>>>>>>>> > >>>>>>>> just one (ie, a unified one for system and event time
>>>>>>>>> > >>>>>>>> punctuation, or one for each?).
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> - If we have one, how can the user figure out which condition
>>>>>>>>> > >>>>>>>> did trigger?
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> - What would the API look like for registering different
>>>>>>>>> > >>>>>>>> punctuate schedules? The "type" must be somehow defined?
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> - We might want to add "complex" schedules later on (like,
>>>>>>>>> > >>>>>>>> punctuate on every 10 seconds event-time or 60 seconds
>>>>>>>>> > >>>>>>>> system-time whatever comes first). I don't say we should add
>>>>>>>>> > >>>>>>>> this right away, but we might want to define the API in a way
>>>>>>>>> > >>>>>>>> that allows extensions like this later on, without
>>>>>>>>> > >>>>>>>> redesigning the API (ie, the API should be designed to be
>>>>>>>>> > >>>>>>>> extensible)
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> - Did you ever consider count-based punctuation?
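
One way to read the follow-up questions above is as a single callback that is told which schedule fired, with schedules registered per type. The sketch below is a hedged illustration of that reading, not the API that was eventually adopted. PunctuationType and its two constants are the names floated in this thread; Punctuator, SketchScheduler, and the advance() method are hypothetical names invented for the example, and the scheduler is an in-memory stand-in rather than ProcessorContext.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Which notion of time drives a schedule; further constants could be added later. */
enum PunctuationType { EVENT_TIME, SYSTEM_TIME }

/** Single callback (hypothetical); the type argument says which schedule fired. */
interface Punctuator {
    void punctuate(PunctuationType type, long timestampMs);
}

/** Hypothetical in-memory stand-in for the scheduling side of ProcessorContext. */
final class SketchScheduler {
    // One schedule per type, stored as {intervalMs, nextFireMs}.
    private final Map<PunctuationType, long[]> schedules =
            new EnumMap<>(PunctuationType.class);
    private final Punctuator punctuator;

    SketchScheduler(Punctuator punctuator) {
        this.punctuator = punctuator;
    }

    /** Registers (or replaces) the schedule for the given type. */
    void schedule(PunctuationType type, long intervalMs, long startMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        schedules.put(type, new long[] {intervalMs, startMs + intervalMs});
    }

    /** Reports that the clock behind the given type has advanced to nowMs. */
    void advance(PunctuationType type, long nowMs) {
        long[] schedule = schedules.get(type);
        if (schedule == null) {
            return; // nothing registered for this type
        }
        // Fire once per interval boundary that has been crossed.
        while (nowMs >= schedule[1]) {
            punctuator.punctuate(type, schedule[1]);
            schedule[1] += schedule[0];
        }
    }
}

final class PunctuationTypeDemo {
    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        SketchScheduler scheduler =
                new SketchScheduler((type, ts) -> calls.add(type + "@" + ts));
        scheduler.schedule(PunctuationType.EVENT_TIME, 10_000L, 0L);
        scheduler.schedule(PunctuationType.SYSTEM_TIME, 60_000L, 0L);
        scheduler.advance(PunctuationType.EVENT_TIME, 25_000L);
        scheduler.advance(PunctuationType.SYSTEM_TIME, 60_000L);
        System.out.println(calls);
    }
}
```

In this shape, a count-based or "whichever comes first" schedule would be a new enum constant plus a new way of advancing it, leaving the callback signature unchanged.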
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> I understand that you would like to solve a simple problem,
>>>>>>>>> > >>>>>>>> but we learned from the past that just "adding some API"
>>>>>>>>> > >>>>>>>> quickly leads to a not very well defined API that needs time
>>>>>>>>> > >>>>>>>> consuming clean up later on via other KIPs. Thus, I would
>>>>>>>>> > >>>>>>>> prefer to get a holistic punctuation KIP with this from the
>>>>>>>>> > >>>>>>>> beginning on to avoid later painful redesign.
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> -Matthias
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>>
>>>>>>>>> > >>>>>>>> On 4/3/17 11:58 AM, Michal Borowiecki wrote:
>>>>>>>>> > >>>>>>>>>
>>>>>>>>> > >>>>>>>>> Thanks Thomas,
>>>>>>>>> > >>>>>>>>>
>>>>>>>>> > >>>>>>>>> I'm also wary of changing the existing semantics of
>>>>>>>>> > >>>>>>>>> punctuate, for backward compatibility reasons, although I
>>>>>>>>> > >>>>>>>>> like the conceptual simplicity of that option.
>>>>>>>>> > >>>>>>>>>
>>>>>>>>> > >>>>>>>>> Adding a new method to me feels safer but, in a way, uglier.
>>>>>>>>> > >>>>>>>>> I added this to the KIP now as option (C).
>>>>>>>>> > >>>>>>>>>
>>>>>>>>> > >>>>>>>>> The TimestampExtractor mechanism is actually more flexible,
>>>>>>>>> > >>>>>>>>> as it allows you to return any value; you're not limited to
>>>>>>>>> > >>>>>>>>> event time or system time (although I don't see an actual use
>>>>>>>>> > >>>>>>>>> case where you might need anything else than those two).
>>>>>>>>> > >>>>>>>>> Hence I also proposed the option to allow users to,
>>>>>>>>> > >>>>>>>>> effectively, decide what "stream time" is for them given the
>>>>>>>>> > >>>>>>>>> presence or absence of messages, much like they can decide
>>>>>>>>> > >>>>>>>>> what msg time means for them using the TimestampExtractor.
>>>>>>>>> > >>>>>>>>> What do you think about that? This is probably most flexible
>>>>>>>>> > >>>>>>>>> but also most complicated.
>>>>>>>>> > >>>>>>>>>
>>>>>>>>> > >>>>>>>>> All comments appreciated.
>>>>>>>>> > >>>>>>>>> > >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Cheers, >>>>>>>>> > >>>>>>>>> > >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Michal >>>>>>>>> > >>>>>>>>> > >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 03/04/17 19:23, Thomas Becker wrote: >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> Although I fully agree we need a way to >>>>>>>>> trigger >>>>>>>>> >>>>>>>>> periodic >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> processing >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> that is independent from whether and when >>>>>>>>> messages >>>>>>>>> >>>>>>>>> arrive, >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> I'm >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> not sure >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> I like the idea of changing the existing >>>>>>>>> semantics >>>>>>>>> >>>>>>>>> across >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> the >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> board. >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> What if we added an additional callback to >>>>>>>>> Processor >>>>>>>>> >>>>>>>>> that >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> can >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> be >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> scheduled similarly to punctuate() but was >>>>>>>>> always >>>>>>>>> >>>>>>>>> called at >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> fixed, wall >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> clock based intervals? This way you wouldn't >>>>>>>>> have to >>>>>>>>> >>>>>>>>> give >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> up >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> the >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> notion >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> of stream time to be able to do periodic >>>>>>>>> processing. 
>>>>>>>>> > >>>>>>>>> > >>>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> On Mon, 2017-04-03 at 10:34 +0100, Michal >>>>>>>>> Borowiecki >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> Hi all, >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> I have created a draft for KIP-138: Change >>>>>>>>> >>>>>>>>> punctuate >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> semantics >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> <https://cwiki.apache.org/ >>>>>>>>> >>>>>>>>> confluence/display/KAFKA/KIP- <https://cwiki.apache.org/ >>>>>>>>> confluence/display/KAFKA/KIP-> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-> >>>>>>>>> >>>>>>>>> > >>>>>>>>> > > <https://cwiki.apache.org/confluence/display/KAFKA/KI P-> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP->>> >>>>>>>>> >>>>>>>>> 138% >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> 3A+C >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> hange+ >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> punctuate+semantics> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> . >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> Appreciating there can be different views >>>>>>>>> on >>>>>>>>> >>>>>>>>> system-time >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >> vs >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> event- >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> time >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> semantics for punctuation depending on use- >>>>>>>>> case and >>>>>>>>> >>>>>>>>> the >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> importance of >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> backwards compatibility of any such change, >>>>>>>>> I've >>>>>>>>> >>>>>>>>> left it >>>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> quite >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> open >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> and >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>> hope to fill in more info as the discussion >>>>>>>>> >>>>>>>>> progresses. 
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Michal
>>>>>>> --
>>>>>>> Tommy Becker
>>>>>>> Senior Software Engineer
>>>>>>> O +1 919.460.4747
>>>>>>> tivo.com
>>>>>>>
>>>>>>> ________________________________
>>>>>>>
>>>>>>> This email and any attachments may contain confidential and
>>>>>>> privileged material for the sole use of the intended recipient.
>>>>>>> Any review, copying, or distribution of this email (or any
>>>>>>> attachments) by others is prohibited. If you are not the
>>>>>>> intended recipient, please contact the sender immediately and
>>>>>>> permanently delete this email and any attachments. No employee
>>>>>>> or agent of TiVo Inc. is authorized to conclude any binding
>>>>>>> agreement on behalf of TiVo Inc. by email. Binding agreements
>>>>>>> with TiVo Inc. may only be made by a signed written agreement.
> --
> <http://www.openbet.com/>
> Michal Borowiecki
> Senior Software Engineer L4
> T: +44 208 742 1600
>    +44 203 249 8448
> E: michal.borowie...@openbet.com
> W: www.openbet.com
>
> OpenBet Ltd
> Chiswick Park Building 9
> 566 Chiswick High Rd
> London
> W4 5XT
> UK
>
> <https://www.openbet.com/email_promo>
>
> This message is confidential and intended only for the addressee. If
> you have received this message in error, please immediately notify the
> postmas...@openbet.com and delete it from your system as well as any
> copies. The content of e-mails as well as traffic data may be
> monitored by OpenBet for employment and security purposes. To protect
> the environment please do not print this e-mail unless necessary.
> OpenBet Ltd. Registered Office: Chiswick Park Building 9, 566 Chiswick
> High Road, London, W4 5XT, United Kingdom. A company registered in
> England and Wales. Registered no. 3134634. VAT no. GB927523612