I agree with this. We would need to allow processor level configuration. And I also agree, that the global caching config is not optimal...
-Matthias On 4/24/17 3:55 AM, Michal Borowiecki wrote: > Further to this, on your point about configuration: > >> Thus, I also believe that one might need different "configuration" >> values for the hybrid approach if you run the same code for different >> scenarios: regular processing, re-processing, catching up scenario. And >> as the term "configuration" implies, we might be better off to not mix >> configuration with business logic that is expressed via code. > I'm not sure I understand what you are suggesting here. > > Configuration is global to a KafkaStreams instance and users might want > to have different tolerance in different parts of the topology. They > shouldn't be locked into one value set via global config. > > To illustrate this point: Lately I have discovered the cache config > introduced in KIP-63 > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-63%3A+Unify+store+and+downstream+caching+in+streams> > and found it quite annoying that it's controlled by a config item. IMO, > I should be able to control flushing per processor, not be forced to use > one global value defined in configs. > > It's easy enough for users to source a user-defined config and provided > it as a parameter to a /given /processor as needed. > > In principal I agree that configuration and business logic are better > not mixed together but then the configuration mechanism should allow > users to target specific processors and not be global to the > KafkaStreams instance. > > Thanks, > > Michal > > On 24/04/17 10:23, Michal Borowiecki wrote: >> >> Hi Matthias, >> >> I agree it's difficult to reason about the hybrid approach, I >> certainly found it hard and I'm totally on board with the mantra. >> >> I'd be happy to limit the scope of this KIP to add system-time >> punctuation semantics (in addition to existing stream-time semantics) >> and leave more complex schemes for users to implement on top of that. >> >> Further additional PunctuationTypes, could then be added by future >> KIPs, possibly including the hybrid approach once it has been given >> more thought. >> >>> There are real-time applications, that want to get >>> callbacks in regular system-time intervals (completely independent from >>> stream-time). >> Can you please describe what they are, so that I can put them on the >> wiki for later reference? >> >> Thanks, >> >> Michal >> >> >> On 23/04/17 21:27, Matthias J. Sax wrote: >>> Hi, >>> >>> I do like Damian's API proposal about the punctuation callback function. >>> >>> I also did reread the KIP and thought about the semantics we want to >>> provide. >>> >>>> Given the above, I don't see a reason any more for a separate system-time >>>> based punctuation. >>> I disagree here. There are real-time applications, that want to get >>> callbacks in regular system-time intervals (completely independent from >>> stream-time). Thus we should allow this -- if we really follow the >>> "hybrid" approach, this could be configured with stream-time interval >>> infinite and delay whatever system-time punctuation interval you want to >>> have. However, I would like to add a proper API for this and do this >>> configuration under the hood (that would allow one implementation within >>> all kind of branching for different cases). >>> >>> Thus, we definitely should have PunctutionType#StreamTime and >>> #SystemTime -- and additionally, we _could_ have #Hybrid. Thus, I am not >>> a fan of your latest API proposal. >>> >>> >>> About the hybrid approach in general. On the one hand I like it, on the >>> other hand, it seems to be rather (1) complicated (not necessarily from >>> an implementation point of view, but for people to understand it) and >>> (2) mixes two semantics together in a "weird" way". Thus, I disagree with: >>> >>>> It may appear complicated at first but I do think these semantics will >>>> still be more understandable to users than having 2 separate punctuation >>>> schedules/callbacks with different PunctuationTypes. >>> This statement only holds if you apply strong assumptions that I don't >>> believe hold in general -- see (2) for details -- and I think it is >>> harder than you assume to reason about the hybrid approach in general. >>> IMHO, the hybrid approach is a "false friend" that seems to be easy to >>> reason about... >>> >>> >>> (1) Streams always embraced "easy to use" and we should really be >>> careful to keep it this way. On the other hand, as we are talking about >>> changes to PAPI, it won't affect DSL users (DSL does not use punctuation >>> at all at the moment), and thus, the "easy to use" mantra might not be >>> affected, while it will allow advanced users to express more complex stuff. >>> >>> I like the mantra: "make simple thing easy and complex things possible". >>> >>> (2) IMHO the major disadvantage (issue?) of the hybrid approach is the >>> implicit assumption that even-time progresses at the same "speed" as >>> system-time during regular processing. This implies the assumption that >>> a slower progress in stream-time indicates the absence of input events >>> (and that later arriving input events will have a larger event-time with >>> high probability). Even if this might be true for some use cases, I >>> doubt it holds in general. Assume that you get a spike in traffic and >>> for some reason stream-time does advance slowly because you have more >>> records to process. This might trigger a system-time based punctuation >>> call even if this seems not to be intended. I strongly believe that it >>> is not easy to reason about the semantics of the hybrid approach (even >>> if the intentional semantics would be super useful -- but I doubt that >>> we get want we ask for). >>> >>> Thus, I also believe that one might need different "configuration" >>> values for the hybrid approach if you run the same code for different >>> scenarios: regular processing, re-processing, catching up scenario. And >>> as the term "configuration" implies, we might be better off to not mix >>> configuration with business logic that is expressed via code. >>> >>> >>> One more comment: I also don't think that the hybrid approach is >>> deterministic as claimed in the use-case subpage. I understand the >>> reasoning and agree, that it is deterministic if certain assumptions >>> hold -- compare above -- and if configured correctly. But strictly >>> speaking it's not because there is a dependency on system-time (and >>> IMHO, if system-time is involved it cannot be deterministic by definition). >>> >>> >>>> I see how in theory this could be implemented on top of the 2 punctuate >>>> callbacks with the 2 different PunctuationTypes (one stream-time based, >>>> the other system-time based) but it would be a much more complicated >>>> scheme and I don't want to suggest that. >>> I agree that expressing the intended hybrid semantics is harder if we >>> offer only #StreamTime and #SystemTime punctuation. However, I also >>> believe that the hybrid approach is a "false friend" with regard to >>> reasoning about the semantics (it indicates that it more easy as it is >>> in reality). Therefore, we might be better off to not offer the hybrid >>> approach and make it clear to a developed, that it is hard to mix >>> #StreamTime and #SystemTime in a semantically sound way. >>> >>> >>> Looking forward to your feedback. :) >>> >>> -Matthias >>> >>> >>> >>> >>> On 4/22/17 11:43 AM, Michal Borowiecki wrote: >>>> Hi all, >>>> >>>> Looking for feedback on the functional interface approach Damian >>>> proposed. What do people think? >>>> >>>> Further on the semantics of triggering punctuate though: >>>> >>>> I ran through the 2 use cases that Arun had kindly put on the wiki >>>> (https://cwiki.apache.org/confluence/display/KAFKA/Punctuate+Use+Cases) >>>> in my head and on a whiteboard and I can't find a better solution than >>>> the "hybrid" approach he had proposed. >>>> >>>> I see how in theory this could be implemented on top of the 2 punctuate >>>> callbacks with the 2 different PunctuationTypes (one stream-time based, >>>> the other system-time based) but it would be a much more complicated >>>> scheme and I don't want to suggest that. >>>> >>>> However, to add to the hybrid algorithm proposed, I suggest one >>>> parameter to that: a tolerance period, expressed in milliseconds >>>> system-time, after which the punctuation will be invoked in case the >>>> stream-time advance hasn't triggered it within the requested interval >>>> since the last invocation of punctuate (as recorded in system-time) >>>> >>>> This would allow a user-defined tolerance for late arriving events. The >>>> trade off would be left for the user to decide: regular punctuation in >>>> the case of absence of events vs allowing for records arriving late or >>>> some build-up due to processing not catching up with the event rate. >>>> In the one extreme, this tolerance could be set to infinity, turning >>>> hybrid into simply stream-time based punctuate, like we have now. In the >>>> other extreme, the tolerance could be set to 0, resulting in a >>>> system-time upper bound on the effective punctuation interval. >>>> >>>> Given the above, I don't see a reason any more for a separate >>>> system-time based punctuation. The "hybrid" approach with 0ms tolerance >>>> would under normal operation trigger at regular intervals wrt the >>>> system-time, except in cases of re-play/catch-up, where the stream time >>>> advances faster than system time. In these cases punctuate would happen >>>> more often than the specified interval wrt system time. However, the >>>> use-cases that need system-time punctuations (that I've seen at least) >>>> really only have a need for an upper bound on punctuation delay but >>>> don't need a lower bound. >>>> >>>> To that effect I'd propose the api to be as follows, on ProcessorContext: >>>> >>>> schedule(Punctuator callback, long interval, long toleranceIterval); // >>>> schedules punctuate at stream-time intervals with a system-time upper >>>> bound of (interval+toleranceInterval) >>>> >>>> schedule(Punctuator callback, long interval); // schedules punctuate at >>>> stream-time intervals without an system-time upper bound - this is >>>> equivalent to current stream-time based punctuate >>>> >>>> Punctuation is triggered when either: >>>> - the stream time advances past the (stream time of the previous >>>> punctuation) + interval; >>>> - or (iff the toleranceInterval is set) when the system time advances >>>> past the (system time of the previous punctuation) + interval + >>>> toleranceInterval >>>> >>>> In either case: >>>> - we trigger punctuate passing as the argument the stream time at which >>>> the current punctuation was meant to happen >>>> - next punctuate is scheduled at (stream time at which the current >>>> punctuation was meant to happen) + interval >>>> >>>> It may appear complicated at first but I do think these semantics will >>>> still be more understandable to users than having 2 separate punctuation >>>> schedules/callbacks with different PunctuationTypes. >>>> >>>> >>>> >>>> PS. Having re-read this, maybe the following alternative would be easier >>>> to understand (WDYT?): >>>> >>>> schedule(Punctuator callback, long streamTimeInterval, long >>>> systemTimeUpperBound); // schedules punctuate at stream-time intervals >>>> with a system-time upper bound - systemTimeUpperBound must be no less than >>>> streamTimeInterval >>>> >>>> schedule(Punctuator callback, long streamTimeInterval); // schedules >>>> punctuate at stream-time intervals without a system-time upper bound - >>>> this is equivalent to current stream-time based punctuate >>>> >>>> Punctuation is triggered when either: >>>> - the stream time advances past the (stream time of the previous >>>> punctuation) + streamTimeInterval; >>>> - or (iff systemTimeUpperBound is set) when the system time advances >>>> past the (system time of the previous punctuation) + systemTimeUpperBound >>>> >>>> Awaiting comments. >>>> >>>> Thanks, >>>> Michal >>>> >>>> On 21/04/17 16:56, Michal Borowiecki wrote: >>>>> Yes, that's what I meant. Just wanted to highlight we'd deprecate it >>>>> in favour of something that doesn't return a record. Not a problem though. >>>>> >>>>> >>>>> On 21/04/17 16:32, Damian Guy wrote: >>>>>> Thanks Michal, >>>>>> I agree Transformer.punctuate should also be void, but we can deprecate >>>>>> that too in favor of the new interface. >>>>>> >>>>>> Thanks for the javadoc PR! >>>>>> >>>>>> Cheers, >>>>>> Damian >>>>>> >>>>>> On Fri, 21 Apr 2017 at 09:31 Michal Borowiecki < >>>>>> michal.borowie...@openbet.com> wrote: >>>>>> >>>>>>> Yes, that looks better to me. >>>>>>> >>>>>>> Note that punctuate on Transformer is currently returning a record, but >>>>>>> I >>>>>>> think it's ok to have all output records be sent via >>>>>>> ProcessorContext.forward, which has to be used anyway if you want to >>>>>>> send >>>>>>> multiple records from one invocation of punctuate. >>>>>>> >>>>>>> This way it's consistent between Processor and Transformer. >>>>>>> >>>>>>> >>>>>>> BTW, looking at this I found a glitch in the javadoc and put a comment >>>>>>> there: >>>>>>> >>>>>>> https://github.com/apache/kafka/pull/2413/files#r112634612 >>>>>>> >>>>>>> and PR: https://github.com/apache/kafka/pull/2884 >>>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> Michal >>>>>>> On 20/04/17 18:55, Damian Guy wrote: >>>>>>> >>>>>>> Hi Michal, >>>>>>> >>>>>>> Thanks for the KIP. I'd like to propose a bit more of a radical change >>>>>>> to >>>>>>> the API. >>>>>>> 1. deprecate the punctuate method on Processor >>>>>>> 2. create a new Functional Interface just for Punctuation, something >>>>>>> like: >>>>>>> interface Punctuator { >>>>>>> void punctuate(long timestamp) >>>>>>> } >>>>>>> 3. add a new schedule function to ProcessorContext: schedule(long >>>>>>> interval, PunctuationType type, Punctuator callback) >>>>>>> 4. deprecate the existing schedule function >>>>>>> >>>>>>> Thoughts? >>>>>>> >>>>>>> Thanks, >>>>>>> Damian >>>>>>> >>>>>>> On Sun, 16 Apr 2017 at 21:55 Michal Borowiecki < >>>>>>> michal.borowie...@openbet.com> wrote: >>>>>>> >>>>>>>> Hi Thomas, >>>>>>>> >>>>>>>> I would say our use cases fall in the same category as yours. >>>>>>>> >>>>>>>> 1) One is expiry of old records, it's virtually identical to yours. >>>>>>>> >>>>>>>> 2) Second one is somewhat more convoluted but boils down to the same >>>>>>>> type >>>>>>>> of design: >>>>>>>> >>>>>>>> Incoming messages carry a number of fields, including a timestamp. >>>>>>>> >>>>>>>> Outgoing messages contain derived fields, one of them (X) is depended >>>>>>>> on >>>>>>>> by the timestamp input field (Y) and some other input field (Z). >>>>>>>> >>>>>>>> Since the output field X is derived in some non-trivial way, we don't >>>>>>>> want to force the logic onto downstream apps. Instead we want to >>>>>>>> calculate >>>>>>>> it in the Kafka Streams app, which means we re-calculate X as soon as >>>>>>>> the >>>>>>>> timestamp in Y is reached (wall clock time) and send a message if it >>>>>>>> changed (I say "if" because the derived field (X) is also conditional >>>>>>>> on >>>>>>>> another input field Z). >>>>>>>> >>>>>>>> So we have kv stores with the records and an additional kv store with >>>>>>>> timestamp->id mapping which act like an index where we periodically do >>>>>>>> a >>>>>>>> ranged query. >>>>>>>> >>>>>>>> Initially we naively tried doing it in punctuate which of course didn't >>>>>>>> work when there were no regular msgs on the input topic. >>>>>>>> Since this was before 0.10.1 and state stores weren't query-able from >>>>>>>> outside we created a "ticker" that produced msgs once per second onto >>>>>>>> another topic and fed it into the same topology to trigger punctuate. >>>>>>>> This didn't work either, which was much more surprising to us at the >>>>>>>> time, because it was not obvious at all that punctuate is only >>>>>>>> triggered if >>>>>>>> *all* input partitions receive messages regularly. >>>>>>>> In the end we had to break this into 2 separate Kafka Streams. Main >>>>>>>> transformer doesn't use punctuate but sends values of timestamp field >>>>>>>> Y and >>>>>>>> the id to a "scheduler" topic where also the periodic ticks are sent. >>>>>>>> This >>>>>>>> is consumed by the second topology and is its only input topic. >>>>>>>> There's a >>>>>>>> transformer on that topic which populates and updates the time-based >>>>>>>> indexes and polls them from punctuate. If the time in the timestamp >>>>>>>> elapsed, the record id is sent to the main transformer, which >>>>>>>> updates/deletes the record from the main kv store and forwards the >>>>>>>> transformed record to the output topic. >>>>>>>> >>>>>>>> To me this setup feels horrendously complicated for what it does. >>>>>>>> >>>>>>>> We could incrementally improve on this since 0.10.1 to poll the >>>>>>>> timestamp->id "index" stores from some code outside the KafkaStreams >>>>>>>> topology so that at least we wouldn't need the extra topic for "ticks". >>>>>>>> However, the ticks don't feel so hacky when you realise they give you >>>>>>>> some hypothetical benefits in predictability. You can reprocess the >>>>>>>> messages in a reproducible manner, since the topologies use event-time, >>>>>>>> just that the event time is simply the wall-clock time fed into a >>>>>>>> topic by >>>>>>>> the ticks. (NB in our use case we haven't yet found a need for this >>>>>>>> kind of >>>>>>>> reprocessing). >>>>>>>> To make that work though, we would have to have the stream time advance >>>>>>>> based on the presence of msgs on the "tick" topic, regardless of the >>>>>>>> presence of messages on the other input topic. >>>>>>>> >>>>>>>> Same as in the expiry use case, both the wall-clock triggered punctuate >>>>>>>> and the hybrid would work to simplify this a lot. >>>>>>>> >>>>>>>> 3) Finally, I have a 3rd use case in the making but I'm still looking >>>>>>>> if >>>>>>>> we can achieve it using session windows instead. I'll keep you posted >>>>>>>> if we >>>>>>>> have to go with punctuate there too. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Michal >>>>>>>> >>>>>>>> >>>>>>>> On 11/04/17 20:52, Thomas Becker wrote: >>>>>>>> >>>>>>>> Here's an example that we currently have. We have a streams processor >>>>>>>> that does a transform from one topic into another. One of the fields in >>>>>>>> the source topic record is an expiration time, and one of the functions >>>>>>>> of the processor is to ensure that expired records get deleted promptly >>>>>>>> after that time passes (typically days or weeks after the message was >>>>>>>> originally produced). To do that, the processor keeps a state store of >>>>>>>> keys and expiration times, iterates that store in punctuate(), and >>>>>>>> emits delete (null) records for expired items. This needs to happen at >>>>>>>> some minimum interval regardless of the incoming message rate of the >>>>>>>> source topic. >>>>>>>> >>>>>>>> In this scenario, the expiration of records is the primary function of >>>>>>>> punctuate, and therefore the key requirement is that the wall-clock >>>>>>>> measured time between punctuate calls have some upper-bound. So a pure >>>>>>>> wall-clock based schedule would be fine for our needs. But the proposed >>>>>>>> "hybrid" system would also be acceptable if that satisfies a broader >>>>>>>> range of use-cases. >>>>>>>> >>>>>>>> On Tue, 2017-04-11 at 14:41 +0200, Michael Noll wrote: >>>>>>>> >>>>>>>> I apologize for the longer email below. To my defense, it started >>>>>>>> out much >>>>>>>> shorter. :-) Also, to be super-clear, I am intentionally playing >>>>>>>> devil's >>>>>>>> advocate for a number of arguments brought forth in order to help >>>>>>>> improve >>>>>>>> this KIP -- I am not implying I necessarily disagree with the >>>>>>>> arguments. >>>>>>>> >>>>>>>> That aside, here are some further thoughts. >>>>>>>> >>>>>>>> First, there are (at least?) two categories for actions/behavior you >>>>>>>> invoke >>>>>>>> via punctuate(): >>>>>>>> >>>>>>>> 1. For internal housekeeping of your Processor or Transformer (e.g., >>>>>>>> to >>>>>>>> periodically commit to a custom store, to do metrics/logging). Here, >>>>>>>> the >>>>>>>> impact of punctuate is typically not observable by other processing >>>>>>>> nodes >>>>>>>> in the topology. >>>>>>>> 2. For controlling the emit frequency of downstream records. Here, >>>>>>>> the >>>>>>>> punctuate is all about being observable by downstream processing >>>>>>>> nodes. >>>>>>>> >>>>>>>> A few releases back, we introduced record caches (DSL) and state >>>>>>>> store >>>>>>>> caches (Processor API) in KIP-63. Here, we addressed a concern >>>>>>>> relating to >>>>>>>> (2) where some users needed to control -- here: limit -- the >>>>>>>> downstream >>>>>>>> output rate of Kafka Streams because the downstream systems/apps >>>>>>>> would not >>>>>>>> be able to keep up with the upstream output rate (Kafka scalability > >>>>>>>> their >>>>>>>> scalability). The argument for KIP-63, which notably did not >>>>>>>> introduce a >>>>>>>> "trigger" API, was that such an interaction with downstream systems >>>>>>>> is an >>>>>>>> operational concern; it should not impact the processing *logic* of >>>>>>>> your >>>>>>>> application, and thus we didn't want to complicate the Kafka Streams >>>>>>>> API, >>>>>>>> especially not the declarative DSL, with such operational concerns. >>>>>>>> >>>>>>>> This KIP's discussion on `punctuate()` takes us back in time (<-- >>>>>>>> sorry, I >>>>>>>> couldn't resist to not make this pun :-P). As a meta-comment, I am >>>>>>>> observing that our conversation is moving more and more into the >>>>>>>> direction >>>>>>>> of explicit "triggers" because, so far, I have seen only motivations >>>>>>>> for >>>>>>>> use cases in category (2), but none yet for (1)? For example, some >>>>>>>> comments voiced here are about sth like "IF stream-time didn't >>>>>>>> trigger >>>>>>>> punctuate, THEN trigger punctuate based on processing-time". Do we >>>>>>>> want >>>>>>>> this, and if so, for which use cases and benefits? Also, on a >>>>>>>> related >>>>>>>> note, whatever we are discussing here will impact state store caches >>>>>>>> (Processor API) and perhaps also impact record caches (DSL), thus we >>>>>>>> should >>>>>>>> clarify any such impact here. >>>>>>>> >>>>>>>> Switching topics slightly. >>>>>>>> >>>>>>>> Jay wrote: >>>>>>>> >>>>>>>> One thing I've always found super important for this kind of design >>>>>>>> work >>>>>>>> is to do a really good job of cataloging the landscape of use cases >>>>>>>> and >>>>>>>> how prevalent each one is. >>>>>>>> >>>>>>>> +1 to this, as others have already said. >>>>>>>> >>>>>>>> Here, let me highlight -- just in case -- that when we talked about >>>>>>>> windowing use cases in the recent emails, the Processor API (where >>>>>>>> `punctuate` resides) does not have any notion of windowing at >>>>>>>> all. If you >>>>>>>> want to do windowing *in the Processor API*, you must do so manually >>>>>>>> in >>>>>>>> combination with window stores. For this reason I'd suggest to >>>>>>>> discuss use >>>>>>>> cases not just in general, but also in view of how you'd do so in the >>>>>>>> Processor API vs. in the DSL. Right now, changing/improving >>>>>>>> `punctuate` >>>>>>>> does not impact the DSL at all, unless we add new functionality to >>>>>>>> it. >>>>>>>> >>>>>>>> Jay wrote in his strawman example: >>>>>>>> >>>>>>>> You aggregate click and impression data for a reddit like site. >>>>>>>> Every ten >>>>>>>> minutes you want to output a ranked list of the top 10 articles >>>>>>>> ranked by >>>>>>>> clicks/impressions for each geographical area. I want to be able >>>>>>>> run this >>>>>>>> in steady state as well as rerun to regenerate results (or catch up >>>>>>>> if it >>>>>>>> crashes). >>>>>>>> >>>>>>>> This is a good example for more than the obvious reason: In KIP-63, >>>>>>>> we >>>>>>>> argued that the reason for saying "every ten minutes" above is not >>>>>>>> necessarily about because you want to output data *exactly* after ten >>>>>>>> minutes, but that you want to perform an aggregation based on 10- >>>>>>>> minute >>>>>>>> windows of input data; i.e., the point is about specifying the input >>>>>>>> for >>>>>>>> your aggregation, not or less about when the results of the >>>>>>>> aggregation >>>>>>>> should be send downstream. To take an extreme example, you could >>>>>>>> disable >>>>>>>> record caches and let your app output a downstream update for every >>>>>>>> incoming input record. If the last input record was from at minute 7 >>>>>>>> of 10 >>>>>>>> (for a 10-min window), then what your app would output at minute 10 >>>>>>>> would >>>>>>>> be identical to what it had already emitted at minute 7 earlier >>>>>>>> anyways. >>>>>>>> This is particularly true when we take late-arriving data into >>>>>>>> account: if >>>>>>>> a late record arrived at minute 13, your app would (by default) send >>>>>>>> a new >>>>>>>> update downstream, even though the "original" 10 minutes have already >>>>>>>> passed. >>>>>>>> >>>>>>>> Jay wrote...: >>>>>>>> >>>>>>>> There are a couple of tricky things that seem to make this hard >>>>>>>> with >>>>>>>> >>>>>>>> either >>>>>>>> >>>>>>>> of the options proposed: >>>>>>>> 1. If I emit this data using event time I have the problem >>>>>>>> described where >>>>>>>> a geographical region with no new clicks or impressions will fail >>>>>>>> to >>>>>>>> >>>>>>>> output >>>>>>>> >>>>>>>> results. >>>>>>>> >>>>>>>> ...and Arun Mathew wrote: >>>>>>>> >>>>>>>> >>>>>>>> We window by the event time, but trigger punctuate in <punctuate >>>>>>>> interval> >>>>>>>> duration of system time, in the absence of an event crossing the >>>>>>>> punctuate >>>>>>>> event time. >>>>>>>> >>>>>>>> So, given what I wrote above about the status quo and what you can >>>>>>>> already >>>>>>>> do with it, is the concern that the state store cache doesn't give >>>>>>>> you >>>>>>>> *direct* control over "forcing an output after no later than X >>>>>>>> seconds [of >>>>>>>> processing-time]" but only indirect control through a cache >>>>>>>> size? (Note >>>>>>>> that I am not dismissing the claims why this might be helpful.) >>>>>>>> >>>>>>>> Arun Mathew wrote: >>>>>>>> >>>>>>>> We are using Kafka Stream for our Audit Trail, where we need to >>>>>>>> output the >>>>>>>> event counts on each topic on each cluster aggregated over a 1 >>>>>>>> minute >>>>>>>> window. We have to use event time to be able to cross check the >>>>>>>> counts. >>>>>>>> >>>>>>>> But >>>>>>>> >>>>>>>> we need to trigger punctuate [aggregate event pushes] by system >>>>>>>> time in >>>>>>>> >>>>>>>> the >>>>>>>> >>>>>>>> absence of events. Otherwise the event counts for unexpired windows >>>>>>>> would >>>>>>>> be 0 which is bad. >>>>>>>> >>>>>>>> Isn't the latter -- "count would be 0" -- the problem between the >>>>>>>> absence >>>>>>>> of output vs. an output of 0, similar to the use of `Option[T]` in >>>>>>>> Scala >>>>>>>> and the difference between `None` and `Some(0)`? That is, isn't the >>>>>>>> root >>>>>>>> cause that the downstream system interprets the absence of output in >>>>>>>> a >>>>>>>> particular way ("No output after 1 minute = I consider the output to >>>>>>>> be >>>>>>>> 0.")? Arguably, you could also adapt the downstream system (if >>>>>>>> possible) >>>>>>>> to correctly handle the difference between absence of output vs. >>>>>>>> output of >>>>>>>> 0. I am not implying that we shouldn't care about such a use case, >>>>>>>> but >>>>>>>> want to understand the motivation better. :-) >>>>>>>> >>>>>>>> Also, to add some perspective, in some related discussions we talked >>>>>>>> about >>>>>>>> how a Kafka Streams application should not worry or not be coupled >>>>>>>> unnecessarily with such interpretation specifics in a downstream >>>>>>>> system's >>>>>>>> behavior. After all, tomorrow your app's output might be consumed by >>>>>>>> more >>>>>>>> than just this one downstream system. Arguably, Kafka Connect rather >>>>>>>> than >>>>>>>> Kafka Streams might be the best tool to link the universes of Kafka >>>>>>>> and >>>>>>>> downstream systems, including helping to reconcile the differences in >>>>>>>> how >>>>>>>> these systems interpret changes, updates, late-arriving data, >>>>>>>> etc. Kafka >>>>>>>> Connect would allow you to decouple the Kafka Streams app's logical >>>>>>>> processing from the specifics of downstream systems, thanks to >>>>>>>> specific >>>>>>>> sink connectors (e.g. for Elastic, Cassandra, MySQL, S3, etc). Would >>>>>>>> this >>>>>>>> decoupling with Kafka Connect help here? (And if the answer is "Yes, >>>>>>>> but >>>>>>>> it's currently awkward to use Connect for this", this might be a >>>>>>>> problem we >>>>>>>> can solve, too.) >>>>>>>> >>>>>>>> Switching topics slightly again. >>>>>>>> >>>>>>>> Thomas wrote: >>>>>>>> >>>>>>>> I'm not entirely convinced that a separate callback (option C) >>>>>>>> is that messy (it could just be a default method with an empty >>>>>>>> implementation), but if we wanted a single API to handle both >>>>>>>> cases, >>>>>>>> how about something like the following? >>>>>>>> >>>>>>>> enum Time { >>>>>>>> STREAM, >>>>>>>> CLOCK >>>>>>>> } >>>>>>>> >>>>>>>> Yeah, I am on the fence here, too. If we use the 1-method approach, >>>>>>>> then >>>>>>>> whatever the user is doing inside this method is a black box to Kafka >>>>>>>> Streams (similar to how we have no idea what the user does inside a >>>>>>>> `foreach` -- if the function passed to `foreach` writes to external >>>>>>>> systems, then Kafka Streams is totally unaware of the fact). We >>>>>>>> won't >>>>>>>> know, for example, if the stream-time action has a smaller "trigger" >>>>>>>> frequency than the processing-time action. Or, we won't know whether >>>>>>>> the >>>>>>>> user custom-codes a "not later than" trigger logic ("Do X every 1- >>>>>>>> minute of >>>>>>>> stream-time or 1-minute of processing-time, whichever comes >>>>>>>> first"). That >>>>>>>> said, I am not certain yet whether we would need such knowledge >>>>>>>> because, >>>>>>>> when using the Processor API, most of the work and decisions must be >>>>>>>> done >>>>>>>> by the user anyways. It would matter though if the concept of >>>>>>>> "triggers" >>>>>>>> were to bubble up into the DSL because in the DSL the management of >>>>>>>> windowing, window stores, etc. must be done automatically by Kafka >>>>>>>> Streams. >>>>>>>> >>>>>>>> [In any case, btw, we have the corner case where the user configured >>>>>>>> the >>>>>>>> stream-time to be processing-time (e.g. via wall-clock timestamp >>>>>>>> extractor), at which point both punctuate variants are based on the >>>>>>>> same >>>>>>>> time semantics / timeline.] >>>>>>>> >>>>>>>> Again, I apologize for the wall of text. Congratulations if you made >>>>>>>> it >>>>>>>> this far. :-) >>>>>>>> >>>>>>>> More than happy to hear your thoughts! >>>>>>>> Michael >>>>>>>> >>>>>>>> On Tue, Apr 11, 2017 at 1:12 AM, Arun Mathew <arunmathe...@gmail.com> >>>>>>>> <arunmathe...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Thanks Matthias. >>>>>>>> Sure, will correct it right away. >>>>>>>> >>>>>>>> On 11-Apr-2017 8:07 AM, "Matthias J. Sax" <matth...@confluent.io> >>>>>>>> <matth...@confluent.io> >>>>>>>> wrote: >>>>>>>> >>>>>>>> Thanks for preparing this page! >>>>>>>> >>>>>>>> About terminology: >>>>>>>> >>>>>>>> You introduce the term "event time" -- but we should call this >>>>>>>> "stream >>>>>>>> time" -- "stream time" is whatever TimestampExtractor returns and >>>>>>>> this >>>>>>>> could be event time, ingestion time, or processing/wall-clock time. >>>>>>>> >>>>>>>> Does this make sense to you? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -Matthias >>>>>>>> >>>>>>>> >>>>>>>> On 4/10/17 4:58 AM, Arun Mathew wrote: >>>>>>>> >>>>>>>> Thanks Ewen. >>>>>>>> >>>>>>>> @Michal, @all, I have created a child page to start the Use Cases >>>>>>>> >>>>>>>> discussion [https://cwiki.apache.org/confluence/display/KAFKA/ >>>>>>>> Punctuate+Use+Cases]. Please go through it and give your comments. >>>>>>>> >>>>>>>> >>>>>>>> @Tianji, Sorry for the delay. I am trying to make the patch >>>>>>>> public. >>>>>>>> >>>>>>>> -- >>>>>>>> Arun Mathew >>>>>>>> >>>>>>>> On 4/8/17, 02:00, "Ewen Cheslack-Postava" <e...@confluent.io> >>>>>>>> <e...@confluent.io> >>>>>>>> wrote: >>>>>>>> >>>>>>>> Arun, >>>>>>>> >>>>>>>> I've given you permission to edit the wiki. Let me know if >>>>>>>> you run >>>>>>>> >>>>>>>> into any >>>>>>>> >>>>>>>> issues. >>>>>>>> >>>>>>>> -Ewen >>>>>>>> >>>>>>>> On Fri, Apr 7, 2017 at 1:21 AM, Arun Mathew <amathew@yahoo-co >>>>>>>> rp.jp> <amat...@yahoo-corp.jp> >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> > Thanks Michal. I don’t have the access yet [arunmathew88]. >>>>>>>> Should I >>>>>>>> >>>>>>>> be >>>>>>>> >>>>>>>> > sending a separate mail for this? >>>>>>>> > >>>>>>>> > I thought one of the person following this thread would be >>>>>>>> able to >>>>>>>> >>>>>>>> give me >>>>>>>> >>>>>>>> > access. >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > *From: *Michal Borowiecki <michal.borowie...@openbet.com> >>>>>>>> <michal.borowie...@openbet.com> >>>>>>>> > *Reply-To: *"dev@kafka.apache.org" <dev@kafka.apache.org> >>>>>>>> <dev@kafka.apache.org> <dev@kafka.apache.org> >>>>>>>> > *Date: *Friday, April 7, 2017 at 17:16 >>>>>>>> > *To: *"dev@kafka.apache.org" <dev@kafka.apache.org> >>>>>>>> <dev@kafka.apache.org> <dev@kafka.apache.org> >>>>>>>> > *Subject: *Re: [DISCUSS] KIP-138: Change punctuate >>>>>>>> semantics >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > Hi Arun, >>>>>>>> > >>>>>>>> > I was thinking along the same lines as you, listing the use >>>>>>>> cases >>>>>>>> >>>>>>>> on the >>>>>>>> >>>>>>>> > wiki, but didn't find time to get around doing that yet. >>>>>>>> > Don't mind if you do it if you have access now. >>>>>>>> > I was thinking it would be nice if, once we have the use >>>>>>>> cases >>>>>>>> >>>>>>>> listed, >>>>>>>> >>>>>>>> > people could use likes to up-vote the use cases similar to >>>>>>>> what >>>>>>>> >>>>>>>> they're >>>>>>>> >>>>>>>> > working on. >>>>>>>> > >>>>>>>> > I should have a bit more time to action this in the next >>>>>>>> few days, >>>>>>>> >>>>>>>> but >>>>>>>> >>>>>>>> > happy for you to do it if you can beat me to it ;-) >>>>>>>> > >>>>>>>> > Cheers, >>>>>>>> > Michal >>>>>>>> > >>>>>>>> > On 07/04/17 04:39, Arun Mathew wrote: >>>>>>>> > >>>>>>>> > Sure, Thanks Matthias. My id is [arunmathew88]. >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > Of course. I was thinking of a subpage where people can >>>>>>>> >>>>>>>> collaborate. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > Will do as per Michael’s suggestion. >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > Regards, >>>>>>>> > >>>>>>>> > Arun Mathew >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > On 4/7/17, 12:30, "Matthias J. Sax" <matth...@confluent.io> >>>>>>>> <matth...@confluent.io> >>>>>>>> < >>>>>>>> >>>>>>>> matth...@confluent.io> wrote: >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > Please share your Wiki-ID and a committer can give you >>>>>>>> write >>>>>>>> >>>>>>>> access. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > Btw: as you did not initiate the KIP, you should not >>>>>>>> change the >>>>>>>> >>>>>>>> KIP >>>>>>>> >>>>>>>> > >>>>>>>> > without the permission of the original author -- in >>>>>>>> this case >>>>>>>> >>>>>>>> Michael. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > So you might also just share your thought over the >>>>>>>> mailing list >>>>>>>> >>>>>>>> and >>>>>>>> >>>>>>>> > >>>>>>>> > Michael can update the KIP page. Or, as an alternative, >>>>>>>> just >>>>>>>> >>>>>>>> create a >>>>>>>> >>>>>>>> > >>>>>>>> > subpage for the KIP page. >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > @Michael: WDYT? >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > -Matthias >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > On 4/6/17 8:05 PM, Arun Mathew wrote: >>>>>>>> > >>>>>>>> > > Hi Jay, >>>>>>>> > >>>>>>>> > > Thanks for the advise, I would like to list >>>>>>>> down >>>>>>>> >>>>>>>> the use cases as >>>>>>>> >>>>>>>> > >>>>>>>> > > per your suggestion. But it seems I don't have write >>>>>>>> >>>>>>>> permission to the >>>>>>>> >>>>>>>> > >>>>>>>> > > Apache Kafka Confluent Space. Whom shall I request >>>>>>>> for it? >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > > Regarding your last question. We are using a patch in >>>>>>>> our >>>>>>>> >>>>>>>> production system >>>>>>>> >>>>>>>> > >>>>>>>> > > which does exactly this. >>>>>>>> > >>>>>>>> > > We window by the event time, but trigger punctuate in >>>>>>>> >>>>>>>> <punctuate interval> >>>>>>>> >>>>>>>> > >>>>>>>> > > duration of system time, in the absence of an event >>>>>>>> crossing >>>>>>>> >>>>>>>> the punctuate >>>>>>>> >>>>>>>> > >>>>>>>> > > event time. >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > > We are using Kafka Stream for our Audit Trail, where >>>>>>>> we need >>>>>>>> >>>>>>>> to output the >>>>>>>> >>>>>>>> > >>>>>>>> > > event counts on each topic on each cluster aggregated >>>>>>>> over a >>>>>>>> >>>>>>>> 1 minute >>>>>>>> >>>>>>>> > >>>>>>>> > > window. We have to use event time to be able to cross >>>>>>>> check >>>>>>>> >>>>>>>> the counts. But >>>>>>>> >>>>>>>> > >>>>>>>> > > we need to trigger punctuate [aggregate event pushes] >>>>>>>> by >>>>>>>> >>>>>>>> system time in the >>>>>>>> >>>>>>>> > >>>>>>>> > > absence of events. Otherwise the event counts for >>>>>>>> unexpired >>>>>>>> >>>>>>>> windows would >>>>>>>> >>>>>>>> > >>>>>>>> > > be 0 which is bad. >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > > "Maybe a hybrid solution works: I window by event >>>>>>>> time but >>>>>>>> >>>>>>>> trigger results >>>>>>>> >>>>>>>> > >>>>>>>> > > by system time for windows that have updated? Not >>>>>>>> really sure >>>>>>>> >>>>>>>> the details >>>>>>>> >>>>>>>> > >>>>>>>> > > of making that work. Does that work? Are there >>>>>>>> concrete >>>>>>>> >>>>>>>> examples where you >>>>>>>> >>>>>>>> > >>>>>>>> > > actually want the current behavior?" >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > > -- >>>>>>>> > >>>>>>>> > > With Regards, >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > > Arun Mathew >>>>>>>> > >>>>>>>> > > Yahoo! JAPAN Corporation >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > > On Wed, Apr 5, 2017 at 8:48 PM, Tianji Li < >>>>>>>> >>>>>>>> skyah...@gmail.com><skyah...@gmail.com> <skyah...@gmail.com> wrote: >>>>>>>> >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > >> Hi Jay, >>>>>>>> > >>>>>>>> > >> >>>>>>>> > >>>>>>>> > >> The hybrid solution is exactly what I expect and >>>>>>>> need for >>>>>>>> >>>>>>>> our use cases >>>>>>>> >>>>>>>> > >>>>>>>> > >> when dealing with telecom data. >>>>>>>> > >>>>>>>> > >> >>>>>>>> > >>>>>>>> > >> Thanks >>>>>>>> > >>>>>>>> > >> Tianji >>>>>>>> > >>>>>>>> > >> >>>>>>>> > >>>>>>>> > >> On Wed, Apr 5, 2017 at 12:01 AM, Jay Kreps < >>>>>>>> >>>>>>>> j...@confluent.io><j...@confluent.io> <j...@confluent.io> wrote: >>>>>>>> >>>>>>>> > >>>>>>>> > >> >>>>>>>> > >>>>>>>> > >>> Hey guys, >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>> One thing I've always found super important for >>>>>>>> this kind >>>>>>>> >>>>>>>> of design work >>>>>>>> >>>>>>>> > >>>>>>>> > >> is >>>>>>>> > >>>>>>>> > >>> to do a really good job of cataloging the landscape >>>>>>>> of use >>>>>>>> >>>>>>>> cases and how >>>>>>>> >>>>>>>> > >>>>>>>> > >>> prevalent each one is. By that I mean not just >>>>>>>> listing lots >>>>>>>> >>>>>>>> of uses, but >>>>>>>> >>>>>>>> > >>>>>>>> > >>> also grouping them into categories that >>>>>>>> functionally need >>>>>>>> >>>>>>>> the same thing. >>>>>>>> >>>>>>>> > >>>>>>>> > >>> In the absence of this it is very hard to reason >>>>>>>> about >>>>>>>> >>>>>>>> design proposals. >>>>>>>> >>>>>>>> > >>>>>>>> > >>> From the proposals so far I think we have a lot of >>>>>>>> >>>>>>>> discussion around >>>>>>>> >>>>>>>> > >>>>>>>> > >>> possible apis, but less around what the user needs >>>>>>>> for >>>>>>>> >>>>>>>> different use >>>>>>>> >>>>>>>> > >>>>>>>> > >> cases >>>>>>>> > >>>>>>>> > >>> and how they would implement that using the api. >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>> Here is an example: >>>>>>>> > >>>>>>>> > >>> You aggregate click and impression data for a >>>>>>>> reddit like >>>>>>>> >>>>>>>> site. Every ten >>>>>>>> >>>>>>>> > >>>>>>>> > >>> minutes you want to output a ranked list of the top >>>>>>>> 10 >>>>>>>> >>>>>>>> articles ranked by >>>>>>>> >>>>>>>> > >>>>>>>> > >>> clicks/impressions for each geographical area. I >>>>>>>> want to be >>>>>>>> >>>>>>>> able run this >>>>>>>> >>>>>>>> > >>>>>>>> > >>> in steady state as well as rerun to regenerate >>>>>>>> results (or >>>>>>>> >>>>>>>> catch up if it >>>>>>>> >>>>>>>> > >>>>>>>> > >>> crashes). >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>> There are a couple of tricky things that seem to >>>>>>>> make this >>>>>>>> >>>>>>>> hard with >>>>>>>> >>>>>>>> > >>>>>>>> > >> either >>>>>>>> > >>>>>>>> > >>> of the options proposed: >>>>>>>> > >>>>>>>> > >>> 1. If I emit this data using event time I have the >>>>>>>> problem >>>>>>>> >>>>>>>> described >>>>>>>> >>>>>>>> > >>>>>>>> > >> where >>>>>>>> > >>>>>>>> > >>> a geographical region with no new clicks or >>>>>>>> impressions >>>>>>>> >>>>>>>> will fail to >>>>>>>> >>>>>>>> > >>>>>>>> > >> output >>>>>>>> > >>>>>>>> > >>> results. >>>>>>>> > >>>>>>>> > >>> 2. If I emit this data using system time I have the >>>>>>>> problem >>>>>>>> >>>>>>>> that when >>>>>>>> >>>>>>>> > >>>>>>>> > >>> reprocessing data my window may not be ten minutes >>>>>>>> but 10 >>>>>>>> >>>>>>>> hours if my >>>>>>>> >>>>>>>> > >>>>>>>> > >>> processing is very fast so it dramatically changes >>>>>>>> the >>>>>>>> >>>>>>>> output. >>>>>>>> >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>> Maybe a hybrid solution works: I window by event >>>>>>>> time but >>>>>>>> >>>>>>>> trigger results >>>>>>>> >>>>>>>> > >>>>>>>> > >>> by system time for windows that have updated? Not >>>>>>>> really >>>>>>>> >>>>>>>> sure the details >>>>>>>> >>>>>>>> > >>>>>>>> > >>> of making that work. Does that work? Are there >>>>>>>> concrete >>>>>>>> >>>>>>>> examples where >>>>>>>> >>>>>>>> > >>>>>>>> > >> you >>>>>>>> > >>>>>>>> > >>> actually want the current behavior? >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>> -Jay >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>> On Tue, Apr 4, 2017 at 8:32 PM, Arun Mathew < >>>>>>>> >>>>>>>> arunmathe...@gmail.com> <arunmathe...@gmail.com> >>>>>>>> <arunmathe...@gmail.com> >>>>>>>> >>>>>>>> > >>>>>>>> > >>> wrote: >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >>>> Hi All, >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> Thanks for the KIP. We were also in need of a >>>>>>>> mechanism to >>>>>>>> >>>>>>>> trigger >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> punctuate in the absence of events. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> As I described in [ >>>>>>>> > >>>>>>>> > >>>> https://issues.apache.org/jira/browse/KAFKA-3514? >>>>>>>> > >>>>>>>> > >>>> focusedCommentId=15926036&page=com.atlassian.jira. >>>>>>>> > >>>>>>>> > >>>> plugin.system.issuetabpanels:comment- >>>>>>>> tabpanel#comment- >>>>>>>> >>>>>>>> 15926036 >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> ], >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> - Our approached involved using the event time >>>>>>>> by >>>>>>>> >>>>>>>> default. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> - The method to check if there is any punctuate >>>>>>>> ready >>>>>>>> >>>>>>>> in the >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> PunctuationQueue is triggered via the any event >>>>>>>> >>>>>>>> received by the >>>>>>>> >>>>>>>> > >>>>>>>> > >> stream >>>>>>>> > >>>>>>>> > >>>> tread, or at the polling intervals in the >>>>>>>> absence of >>>>>>>> >>>>>>>> any events. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> - When we create Punctuate objects (which >>>>>>>> contains the >>>>>>>> >>>>>>>> next event >>>>>>>> >>>>>>>> > >>>>>>>> > >> time >>>>>>>> > >>>>>>>> > >>>> for punctuation and interval), we also record >>>>>>>> the >>>>>>>> >>>>>>>> creation time >>>>>>>> >>>>>>>> > >>>>>>>> > >>> (system >>>>>>>> > >>>>>>>> > >>>> time). >>>>>>>> > >>>>>>>> > >>>> - While checking for maturity of Punctuate >>>>>>>> Schedule by >>>>>>>> > >>>>>>>> > >> mayBePunctuate >>>>>>>> > >>>>>>>> > >>>> method, we also check if the system clock has >>>>>>>> elapsed >>>>>>>> >>>>>>>> the punctuate >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> interval since the schedule creation time. >>>>>>>> > >>>>>>>> > >>>> - In the absence of any event, or in the >>>>>>>> absence of any >>>>>>>> >>>>>>>> event for >>>>>>>> >>>>>>>> > >>>>>>>> > >> one >>>>>>>> > >>>>>>>> > >>>> topic in the partition group assigned to the >>>>>>>> stream >>>>>>>> >>>>>>>> task, the system >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> time >>>>>>>> > >>>>>>>> > >>>> will elapse the interval and we trigger a >>>>>>>> punctuate >>>>>>>> >>>>>>>> using the >>>>>>>> >>>>>>>> > >>>>>>>> > >> expected >>>>>>>> > >>>>>>>> > >>>> punctuation event time. >>>>>>>> > >>>>>>>> > >>>> - we then create the next punctuation schedule >>>>>>>> as >>>>>>>> >>>>>>>> punctuation event >>>>>>>> >>>>>>>> > >>>>>>>> > >>> time >>>>>>>> > >>>>>>>> > >>>> + punctuation interval, [again recording the >>>>>>>> system >>>>>>>> >>>>>>>> time of creation >>>>>>>> >>>>>>>> > >>>>>>>> > >>> of >>>>>>>> > >>>>>>>> > >>>> the >>>>>>>> > >>>>>>>> > >>>> schedule]. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> We call this a Hybrid Punctuate. Of course, this >>>>>>>> approach >>>>>>>> >>>>>>>> has pros and >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> cons. >>>>>>>> > >>>>>>>> > >>>> Pros >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> - Punctuates will happen in <punctuate >>>>>>>> interval> time >>>>>>>> >>>>>>>> duration at >>>>>>>> >>>>>>>> > >>>>>>>> > >> max >>>>>>>> > >>>>>>>> > >>> in >>>>>>>> > >>>>>>>> > >>>> terms of system time. >>>>>>>> > >>>>>>>> > >>>> - The semantics as a whole continues to revolve >>>>>>>> around >>>>>>>> >>>>>>>> event time. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> - We can use the old data [old timestamps] to >>>>>>>> rerun any >>>>>>>> >>>>>>>> experiments >>>>>>>> >>>>>>>> > >>>>>>>> > >> or >>>>>>>> > >>>>>>>> > >>>> tests. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> Cons >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> - In case the <punctuate interval> is not a >>>>>>>> time >>>>>>>> >>>>>>>> duration [say >>>>>>>> >>>>>>>> > >>>>>>>> > >>> logical >>>>>>>> > >>>>>>>> > >>>> time/event count], then the approach might not >>>>>>>> be >>>>>>>> >>>>>>>> meaningful. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> - In case there is a case where we have to wait >>>>>>>> for an >>>>>>>> >>>>>>>> actual event >>>>>>>> >>>>>>>> > >>>>>>>> > >>> from >>>>>>>> > >>>>>>>> > >>>> a low event rate partition in the partition >>>>>>>> group, this >>>>>>>> >>>>>>>> approach >>>>>>>> >>>>>>>> > >>>>>>>> > >> will >>>>>>>> > >>>>>>>> > >>>> jump >>>>>>>> > >>>>>>>> > >>>> the gun. >>>>>>>> > >>>>>>>> > >>>> - in case the event processing cannot catch up >>>>>>>> with the >>>>>>>> >>>>>>>> event rate >>>>>>>> >>>>>>>> > >>>>>>>> > >> and >>>>>>>> > >>>>>>>> > >>>> the expected timestamp events gets queued for >>>>>>>> long >>>>>>>> >>>>>>>> time, this >>>>>>>> >>>>>>>> > >>>>>>>> > >> approach >>>>>>>> > >>>>>>>> > >>>> might jump the gun. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> I believe the above approach and discussion goes >>>>>>>> close to >>>>>>>> >>>>>>>> the approach >>>>>>>> >>>>>>>> > >>>>>>>> > >> A. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> ----------- >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> I like the idea of having an even count based >>>>>>>> punctuate. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> ----------- >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> I agree with the discussion around approach C, >>>>>>>> that we >>>>>>>> >>>>>>>> should provide >>>>>>>> >>>>>>>> > >>>>>>>> > >> the >>>>>>>> > >>>>>>>> > >>>> user with the option to choose system time or >>>>>>>> event time >>>>>>>> >>>>>>>> based >>>>>>>> >>>>>>>> > >>>>>>>> > >>> punctuates. >>>>>>>> > >>>>>>>> > >>>> But I believe that the user predominantly wants to >>>>>>>> use >>>>>>>> >>>>>>>> event time while >>>>>>>> >>>>>>>> > >>>>>>>> > >>> not >>>>>>>> > >>>>>>>> > >>>> missing out on regular punctuates due to event >>>>>>>> delays or >>>>>>>> >>>>>>>> event >>>>>>>> >>>>>>>> > >>>>>>>> > >> absences. >>>>>>>> > >>>>>>>> > >>>> Hence a complex punctuate option as Matthias >>>>>>>> mentioned >>>>>>>> >>>>>>>> (quoted below) >>>>>>>> >>>>>>>> > >>>>>>>> > >>> would >>>>>>>> > >>>>>>>> > >>>> be most apt. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> "- We might want to add "complex" schedules later >>>>>>>> on >>>>>>>> >>>>>>>> (like, punctuate >>>>>>>> >>>>>>>> > >>>>>>>> > >> on >>>>>>>> > >>>>>>>> > >>>> every 10 seconds event-time or 60 seconds system- >>>>>>>> time >>>>>>>> >>>>>>>> whatever comes >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> first)." >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> ----------- >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> I think I read somewhere that Kafka Streams >>>>>>>> started with >>>>>>>> >>>>>>>> System Time as >>>>>>>> >>>>>>>> > >>>>>>>> > >>> the >>>>>>>> > >>>>>>>> > >>>> punctuation standard, but was later changed to >>>>>>>> Event Time. >>>>>>>> >>>>>>>> I guess >>>>>>>> >>>>>>>> > >>>>>>>> > >> there >>>>>>>> > >>>>>>>> > >>>> would be some good reason behind it. As Kafka >>>>>>>> Streams want >>>>>>>> >>>>>>>> to evolve >>>>>>>> >>>>>>>> > >>>>>>>> > >> more >>>>>>>> > >>>>>>>> > >>>> on the Stream Processing front, I believe the >>>>>>>> emphasis on >>>>>>>> >>>>>>>> event time >>>>>>>> >>>>>>>> > >>>>>>>> > >>> would >>>>>>>> > >>>>>>>> > >>>> remain quite strong. >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> With Regards, >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> Arun Mathew >>>>>>>> > >>>>>>>> > >>>> Yahoo! JAPAN Corporation, Tokyo >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>> On Wed, Apr 5, 2017 at 3:53 AM, Thomas Becker < >>>>>>>> >>>>>>>> tobec...@tivo.com> <tobec...@tivo.com> <tobec...@tivo.com> >>>>>>>> >>>>>>>> > >>>>>>>> > >> wrote: >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>>>> Yeah I like PuncutationType much better; I just >>>>>>>> threw >>>>>>>> >>>>>>>> Time out there >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> more as a strawman than an actual suggestion ;) I >>>>>>>> still >>>>>>>> >>>>>>>> think it's >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> worth considering what this buys us over an >>>>>>>> additional >>>>>>>> >>>>>>>> callback. I >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> foresee a number of punctuate implementations >>>>>>>> following >>>>>>>> >>>>>>>> this pattern: >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> public void punctuate(PunctuationType type) { >>>>>>>> > >>>>>>>> > >>>>> switch (type) { >>>>>>>> > >>>>>>>> > >>>>> case EVENT_TIME: >>>>>>>> > >>>>>>>> > >>>>> methodA(); >>>>>>>> > >>>>>>>> > >>>>> break; >>>>>>>> > >>>>>>>> > >>>>> case SYSTEM_TIME: >>>>>>>> > >>>>>>>> > >>>>> methodB(); >>>>>>>> > >>>>>>>> > >>>>> break; >>>>>>>> > >>>>>>>> > >>>>> } >>>>>>>> > >>>>>>>> > >>>>> } >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> I guess one advantage of this approach is we >>>>>>>> could add >>>>>>>> >>>>>>>> additional >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> punctuation types later in a backwards compatible >>>>>>>> way >>>>>>>> >>>>>>>> (like event >>>>>>>> >>>>>>>> > >>>>>>>> > >> count >>>>>>>> > >>>>>>>> > >>>>> as you mentioned). >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> -Tommy >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> On Tue, 2017-04-04 at 11:10 -0700, Matthias J. >>>>>>>> Sax wrote: >>>>>>>> > >>>>>>>> > >>>>>> That sounds promising. >>>>>>>> > >>>>>>>> > >>>>>> >>>>>>>> > >>>>>>>> > >>>>>> I am just wondering if `Time` is the best name. >>>>>>>> Maybe we >>>>>>>> >>>>>>>> want to >>>>>>>> >>>>>>>> > >>>>>>>> > >> add >>>>>>>> > >>>>>>>> > >>>>>> other non-time based punctuations at some point >>>>>>>> later. I >>>>>>>> >>>>>>>> would >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>> suggest >>>>>>>> > >>>>>>>> > >>>>>> >>>>>>>> > >>>>>>>> > >>>>>> enum PunctuationType { >>>>>>>> > >>>>>>>> > >>>>>> EVENT_TIME, >>>>>>>> > >>>>>>>> > >>>>>> SYSTEM_TIME, >>>>>>>> > >>>>>>>> > >>>>>> } >>>>>>>> > >>>>>>>> > >>>>>> >>>>>>>> > >>>>>>>> > >>>>>> or similar. Just to keep the door open -- it's >>>>>>>> easier to >>>>>>>> >>>>>>>> add new >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>> stuff >>>>>>>> > >>>>>>>> > >>>>>> if the name is more generic. >>>>>>>> > >>>>>>>> > >>>>>> >>>>>>>> > >>>>>>>> > >>>>>> >>>>>>>> > >>>>>>>> > >>>>>> -Matthias >>>>>>>> > >>>>>>>> > >>>>>> >>>>>>>> > >>>>>>>> > >>>>>> >>>>>>>> > >>>>>>>> > >>>>>> On 4/4/17 5:30 AM, Thomas Becker wrote: >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> I agree that the framework providing and >>>>>>>> managing the >>>>>>>> >>>>>>>> notion of >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> stream >>>>>>>> > >>>>>>>> > >>>>>>> time is valuable and not something we would >>>>>>>> want to >>>>>>>> >>>>>>>> delegate to >>>>>>>> >>>>>>>> > >>>>>>>> > >> the >>>>>>>> > >>>>>>>> > >>>>>>> tasks. I'm not entirely convinced that a >>>>>>>> separate >>>>>>>> >>>>>>>> callback >>>>>>>> >>>>>>>> > >>>>>>>> > >> (option >>>>>>>> > >>>>>>>> > >>>>>>> C) >>>>>>>> > >>>>>>>> > >>>>>>> is that messy (it could just be a default >>>>>>>> method with >>>>>>>> >>>>>>>> an empty >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> implementation), but if we wanted a single API >>>>>>>> to >>>>>>>> >>>>>>>> handle both >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> cases, >>>>>>>> > >>>>>>>> > >>>>>>> how about something like the following? >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> enum Time { >>>>>>>> > >>>>>>>> > >>>>>>> STREAM, >>>>>>>> > >>>>>>>> > >>>>>>> CLOCK >>>>>>>> > >>>>>>>> > >>>>>>> } >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> Then on ProcessorContext: >>>>>>>> > >>>>>>>> > >>>>>>> context.schedule(Time time, long interval) // >>>>>>>> We could >>>>>>>> >>>>>>>> allow >>>>>>>> >>>>>>>> > >>>>>>>> > >> this >>>>>>>> > >>>>>>>> > >>>>>>> to >>>>>>>> > >>>>>>>> > >>>>>>> be called once for each value of time to mix >>>>>>>> >>>>>>>> approaches. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> Then the Processor API becomes: >>>>>>>> > >>>>>>>> > >>>>>>> punctuate(Time time) // time here denotes which >>>>>>>> >>>>>>>> schedule resulted >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> in >>>>>>>> > >>>>>>>> > >>>>>>> this call. >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> Thoughts? >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> On Mon, 2017-04-03 at 22:44 -0700, Matthias J. >>>>>>>> Sax >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> Thanks a lot for the KIP Michal, >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> I was thinking about the four options you >>>>>>>> proposed in >>>>>>>> >>>>>>>> more >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> details >>>>>>>> > >>>>>>>> > >>>>>>>> and >>>>>>>> > >>>>>>>> > >>>>>>>> this are my thoughts: >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> (A) You argue, that users can still >>>>>>>> "punctuate" on >>>>>>>> >>>>>>>> event-time >>>>>>>> >>>>>>>> > >>>>>>>> > >> via >>>>>>>> > >>>>>>>> > >>>>>>>> process(), but I am not sure if this is >>>>>>>> possible. >>>>>>>> >>>>>>>> Note, that >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> users >>>>>>>> > >>>>>>>> > >>>>>>>> only >>>>>>>> > >>>>>>>> > >>>>>>>> get record timestamps via context.timestamp(). >>>>>>>> Thus, >>>>>>>> >>>>>>>> users >>>>>>>> >>>>>>>> > >>>>>>>> > >> would >>>>>>>> > >>>>>>>> > >>>>>>>> need >>>>>>>> > >>>>>>>> > >>>>>>>> to >>>>>>>> > >>>>>>>> > >>>>>>>> track the time progress per partition (based >>>>>>>> on the >>>>>>>> >>>>>>>> partitions >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> they >>>>>>>> > >>>>>>>> > >>>>>>>> obverse via context.partition(). (This alone >>>>>>>> puts a >>>>>>>> >>>>>>>> huge burden >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> on >>>>>>>> > >>>>>>>> > >>>>>>>> the >>>>>>>> > >>>>>>>> > >>>>>>>> user by itself.) However, users are not >>>>>>>> notified at >>>>>>>> >>>>>>>> startup >>>>>>>> >>>>>>>> > >>>>>>>> > >> what >>>>>>>> > >>>>>>>> > >>>>>>>> partitions are assigned, and user are not >>>>>>>> notified >>>>>>>> >>>>>>>> when >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> partitions >>>>>>>> > >>>>>>>> > >>>>>>>> get >>>>>>>> > >>>>>>>> > >>>>>>>> revoked. Because this information is not >>>>>>>> available, >>>>>>>> >>>>>>>> it's not >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> possible >>>>>>>> > >>>>>>>> > >>>>>>>> to >>>>>>>> > >>>>>>>> > >>>>>>>> "manually advance" stream-time, and thus >>>>>>>> event-time >>>>>>>> >>>>>>>> punctuation >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> within >>>>>>>> > >>>>>>>> > >>>>>>>> process() seems not to be possible -- or do >>>>>>>> you see a >>>>>>>> >>>>>>>> way to >>>>>>>> >>>>>>>> > >>>>>>>> > >> get >>>>>>>> > >>>>>>>> > >>>>>>>> it >>>>>>>> > >>>>>>>> > >>>>>>>> done? And even if, it might still be too >>>>>>>> clumsy to >>>>>>>> >>>>>>>> use. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> (B) This does not allow to mix both >>>>>>>> approaches, thus >>>>>>>> >>>>>>>> limiting >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> what >>>>>>>> > >>>>>>>> > >>>>>>>> users >>>>>>>> > >>>>>>>> > >>>>>>>> can do. >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> (C) This should give all flexibility we need. >>>>>>>> However, >>>>>>>> >>>>>>>> just >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> adding >>>>>>>> > >>>>>>>> > >>>>>>>> one >>>>>>>> > >>>>>>>> > >>>>>>>> more method seems to be a solution that is too >>>>>>>> simple >>>>>>>> >>>>>>>> (cf my >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> comments >>>>>>>> > >>>>>>>> > >>>>>>>> below). >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> (D) This might be hard to use. Also, I am not >>>>>>>> sure how >>>>>>>> >>>>>>>> a user >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> could >>>>>>>> > >>>>>>>> > >>>>>>>> enable system-time and event-time punctuation >>>>>>>> in >>>>>>>> >>>>>>>> parallel. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> Overall options (C) seems to be the most >>>>>>>> promising >>>>>>>> >>>>>>>> approach to >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> me. >>>>>>>> > >>>>>>>> > >>>>>>>> Because I also favor a clean API, we might >>>>>>>> keep >>>>>>>> >>>>>>>> current >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> punctuate() >>>>>>>> > >>>>>>>> > >>>>>>>> as-is, but deprecate it -- so we can remove it >>>>>>>> at some >>>>>>>> >>>>>>>> later >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> point >>>>>>>> > >>>>>>>> > >>>>>>>> when >>>>>>>> > >>>>>>>> > >>>>>>>> people use the "new punctuate API". >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> Couple of follow up questions: >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> - I am wondering, if we should have two >>>>>>>> callback >>>>>>>> >>>>>>>> methods or >>>>>>>> >>>>>>>> > >>>>>>>> > >> just >>>>>>>> > >>>>>>>> > >>>>>>>> one >>>>>>>> > >>>>>>>> > >>>>>>>> (ie, a unified for system and event time >>>>>>>> punctuation >>>>>>>> >>>>>>>> or one for >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> each?). >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> - If we have one, how can the user figure out, >>>>>>>> which >>>>>>>> >>>>>>>> condition >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> did >>>>>>>> > >>>>>>>> > >>>>>>>> trigger? >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> - How would the API look like, for registering >>>>>>>> >>>>>>>> different >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> punctuate >>>>>>>> > >>>>>>>> > >>>>>>>> schedules? The "type" must be somehow defined? >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> - We might want to add "complex" schedules >>>>>>>> later on >>>>>>>> >>>>>>>> (like, >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> punctuate >>>>>>>> > >>>>>>>> > >>>>>>>> on >>>>>>>> > >>>>>>>> > >>>>>>>> every 10 seconds event-time or 60 seconds >>>>>>>> system-time >>>>>>>> >>>>>>>> whatever >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> comes >>>>>>>> > >>>>>>>> > >>>>>>>> first). I don't say we should add this right >>>>>>>> away, but >>>>>>>> >>>>>>>> we might >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> want >>>>>>>> > >>>>>>>> > >>>>>>>> to >>>>>>>> > >>>>>>>> > >>>>>>>> define the API in a way, that it allows >>>>>>>> extensions >>>>>>>> >>>>>>>> like this >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> later >>>>>>>> > >>>>>>>> > >>>>>>>> on, >>>>>>>> > >>>>>>>> > >>>>>>>> without redesigning the API (ie, the API >>>>>>>> should be >>>>>>>> >>>>>>>> designed >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> extensible) >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> - Did you ever consider count-based >>>>>>>> punctuation? >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> I understand, that you would like to solve a >>>>>>>> simple >>>>>>>> >>>>>>>> problem, >>>>>>>> >>>>>>>> > >>>>>>>> > >> but >>>>>>>> > >>>>>>>> > >>>>>>>> we >>>>>>>> > >>>>>>>> > >>>>>>>> learned from the past, that just "adding some >>>>>>>> API" >>>>>>>> >>>>>>>> quickly >>>>>>>> >>>>>>>> > >>>>>>>> > >> leads >>>>>>>> > >>>>>>>> > >>>>>>>> to a >>>>>>>> > >>>>>>>> > >>>>>>>> not very well defined API that needs time >>>>>>>> consuming >>>>>>>> >>>>>>>> clean up >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> later on >>>>>>>> > >>>>>>>> > >>>>>>>> via other KIPs. Thus, I would prefer to get a >>>>>>>> holistic >>>>>>>> > >>>>>>>> > >>>>>>>> punctuation >>>>>>>> > >>>>>>>> > >>>>>>>> KIP >>>>>>>> > >>>>>>>> > >>>>>>>> with this from the beginning on to avoid later >>>>>>>> painful >>>>>>>> > >>>>>>>> > >> redesign. >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> -Matthias >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> On 4/3/17 11:58 AM, Michal Borowiecki wrote: >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> Thanks Thomas, >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> I'm also wary of changing the existing >>>>>>>> semantics of >>>>>>>> > >>>>>>>> > >> punctuate, >>>>>>>> > >>>>>>>> > >>>>>>>>> for >>>>>>>> > >>>>>>>> > >>>>>>>>> backward compatibility reasons, although I >>>>>>>> like the >>>>>>>> > >>>>>>>> > >> conceptual >>>>>>>> > >>>>>>>> > >>>>>>>>> simplicity of that option. >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> Adding a new method to me feels safer but, in >>>>>>>> a way, >>>>>>>> >>>>>>>> uglier. >>>>>>>> >>>>>>>> > >>>>>>>> > >> I >>>>>>>> > >>>>>>>> > >>>>>>>>> added >>>>>>>> > >>>>>>>> > >>>>>>>>> this to the KIP now as option (C). >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> The TimestampExtractor mechanism is actually >>>>>>>> more >>>>>>>> >>>>>>>> flexible, >>>>>>>> >>>>>>>> > >>>>>>>> > >> as >>>>>>>> > >>>>>>>> > >>>>>>>>> it >>>>>>>> > >>>>>>>> > >>>>>>>>> allows >>>>>>>> > >>>>>>>> > >>>>>>>>> you to return any value, you're not limited >>>>>>>> to event >>>>>>>> >>>>>>>> time or >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> system >>>>>>>> > >>>>>>>> > >>>>>>>>> time >>>>>>>> > >>>>>>>> > >>>>>>>>> (although I don't see an actual use case >>>>>>>> where you >>>>>>>> >>>>>>>> might need >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> anything >>>>>>>> > >>>>>>>> > >>>>>>>>> else then those two). Hence I also proposed >>>>>>>> the >>>>>>>> >>>>>>>> option to >>>>>>>> >>>>>>>> > >>>>>>>> > >> allow >>>>>>>> > >>>>>>>> > >>>>>>>>> users >>>>>>>> > >>>>>>>> > >>>>>>>>> to, effectively, decide what "stream time" is >>>>>>>> for >>>>>>>> >>>>>>>> them given >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> the >>>>>>>> > >>>>>>>> > >>>>>>>>> presence or absence of messages, much like >>>>>>>> they can >>>>>>>> >>>>>>>> decide >>>>>>>> >>>>>>>> > >>>>>>>> > >> what >>>>>>>> > >>>>>>>> > >>>>>>>>> msg >>>>>>>> > >>>>>>>> > >>>>>>>>> time >>>>>>>> > >>>>>>>> > >>>>>>>>> means for them using the TimestampExtractor. >>>>>>>> What do >>>>>>>> >>>>>>>> you >>>>>>>> >>>>>>>> > >>>>>>>> > >> think >>>>>>>> > >>>>>>>> > >>>>>>>>> about >>>>>>>> > >>>>>>>> > >>>>>>>>> that? This is probably most flexible but also >>>>>>>> most >>>>>>>> > >>>>>>>> > >> complicated. >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> All comments appreciated. >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> Cheers, >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> Michal >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>> On 03/04/17 19:23, Thomas Becker wrote: >>>>>>>> > >>>>>>>> > >>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>> Although I fully agree we need a way to >>>>>>>> trigger >>>>>>>> >>>>>>>> periodic >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>> processing >>>>>>>> > >>>>>>>> > >>>>>>>>>> that is independent from whether and when >>>>>>>> messages >>>>>>>> >>>>>>>> arrive, >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>> I'm >>>>>>>> > >>>>>>>> > >>>>>>>>>> not sure >>>>>>>> > >>>>>>>> > >>>>>>>>>> I like the idea of changing the existing >>>>>>>> semantics >>>>>>>> >>>>>>>> across >>>>>>>> >>>>>>>> > >>>>>>>> > >> the >>>>>>>> > >>>>>>>> > >>>>>>>>>> board. >>>>>>>> > >>>>>>>> > >>>>>>>>>> What if we added an additional callback to >>>>>>>> Processor >>>>>>>> >>>>>>>> that >>>>>>>> >>>>>>>> > >>>>>>>> > >> can >>>>>>>> > >>>>>>>> > >>>>>>>>>> be >>>>>>>> > >>>>>>>> > >>>>>>>>>> scheduled similarly to punctuate() but was >>>>>>>> always >>>>>>>> >>>>>>>> called at >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>> fixed, wall >>>>>>>> > >>>>>>>> > >>>>>>>>>> clock based intervals? This way you wouldn't >>>>>>>> have to >>>>>>>> >>>>>>>> give >>>>>>>> >>>>>>>> > >>>>>>>> > >> up >>>>>>>> > >>>>>>>> > >>>>>>>>>> the >>>>>>>> > >>>>>>>> > >>>>>>>>>> notion >>>>>>>> > >>>>>>>> > >>>>>>>>>> of stream time to be able to do periodic >>>>>>>> processing. >>>>>>>> > >>>>>>>> > >>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>> On Mon, 2017-04-03 at 10:34 +0100, Michal >>>>>>>> Borowiecki >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> Hi all, >>>>>>>> > >>>>>>>> > >>>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> I have created a draft for KIP-138: Change >>>>>>>> >>>>>>>> punctuate >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> semantics >>>>>>>> > >>>>>>>> > >>>>>>>>>>> <https://cwiki.apache.org/ >>>>>>>> >>>>>>>> confluence/display/KAFKA/KIP- <https://cwiki.apache.org/ >>>>>>>> confluence/display/KAFKA/KIP-> >>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-> >>>>>>>> >>>>>>>> > >>>>>>>> > > <https://cwiki.apache.org/confluence/display/KAFKA/KI P-> >>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP->>> >>>>>>>> >>>>>>>> 138% >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> 3A+C >>>>>>>> > >>>>>>>> > >>>>>>>>>>> hange+ >>>>>>>> > >>>>>>>> > >>>>>>>>>>> punctuate+semantics> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> . >>>>>>>> > >>>>>>>> > >>>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> Appreciating there can be different views >>>>>>>> on >>>>>>>> >>>>>>>> system-time >>>>>>>> >>>>>>>> > >>>>>>>> > >> vs >>>>>>>> > >>>>>>>> > >>>>>>>>>>> event- >>>>>>>> > >>>>>>>> > >>>>>>>>>>> time >>>>>>>> > >>>>>>>> > >>>>>>>>>>> semantics for punctuation depending on use- >>>>>>>> case and >>>>>>>> >>>>>>>> the >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> importance of >>>>>>>> > >>>>>>>> > >>>>>>>>>>> backwards compatibility of any such change, >>>>>>>> I've >>>>>>>> >>>>>>>> left it >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> quite >>>>>>>> > >>>>>>>> > >>>>>>>>>>> open >>>>>>>> > >>>>>>>> > >>>>>>>>>>> and >>>>>>>> > >>>>>>>> > >>>>>>>>>>> hope to fill in more info as the discussion >>>>>>>> >>>>>>>> progresses. >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>>>>> Thanks, >>>>>>>> > >>>>>>>> > >>>>>>>>>>> Michal >>>>>>>> > >>>>>>>> > >>>>>>> -- >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> Tommy Becker >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> Senior Software Engineer >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> O +1 919.460.4747 <%28919%29%20460-4747> >>>>>>>> <(919)%20460-4747> >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> tivo.com >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> ________________________________ >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> This email and any attachments may contain >>>>>>>> confidential >>>>>>>> >>>>>>>> and >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> privileged material for the sole use of the >>>>>>>> intended >>>>>>>> >>>>>>>> recipient. >>>>>>>> >>>>>>>> > >>>>>>>> > >> Any >>>>>>>> > >>>>>>>> > >>>>>>> review, copying, or distribution of this email >>>>>>>> (or any >>>>>>>> > >>>>>>>> > >> attachments) >>>>>>>> > >>>>>>>> > >>>>>>> by others is prohibited. If you are not the >>>>>>>> intended >>>>>>>> >>>>>>>> recipient, >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> please contact the sender immediately and >>>>>>>> permanently >>>>>>>> >>>>>>>> delete this >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> email and any attachments. No employee or agent >>>>>>>> of TiVo >>>>>>>> >>>>>>>> Inc. is >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> authorized to conclude any binding agreement on >>>>>>>> behalf >>>>>>>> >>>>>>>> of TiVo >>>>>>>> >>>>>>>> > >>>>>>>> > >> Inc. >>>>>>>> > >>>>>>>> > >>>>>>> by email. Binding agreements with TiVo Inc. may >>>>>>>> only be >>>>>>>> >>>>>>>> made by a >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>> signed written agreement. >>>>>>>> > >>>>>>>> > >>>>>>> >>>>>>>> > >>>>>>>> > >>>>> -- >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> Tommy Becker >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> Senior Software Engineer >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> O +1 919.460.4747 <%28919%29%20460-4747> >>>>>>>> <(919)%20460-4747> >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> tivo.com >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> ________________________________ >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>>> This email and any attachments may contain >>>>>>>> confidential >>>>>>>> >>>>>>>> and >>>>>>>> >>>>>>>> > >>>>>>>> > >> privileged >>>>>>>> > >>>>>>>> > >>>>> material for the sole use of the intended >>>>>>>> recipient. Any >>>>>>>> >>>>>>>> review, >>>>>>>> >>>>>>>> > >>>>>>>> > >>> copying, >>>>>>>> > >>>>>>>> > >>>>> or distribution of this email (or any >>>>>>>> attachments) by >>>>>>>> >>>>>>>> others is >>>>>>>> >>>>>>>> > >>>>>>>> > >>>> prohibited. >>>>>>>> > >>>>>>>> > >>>>> If you are not the intended recipient, please >>>>>>>> contact the >>>>>>>> >>>>>>>> sender >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> immediately and permanently delete this email and >>>>>>>> any >>>>>>>> >>>>>>>> attachments. No >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> employee or agent of TiVo Inc. is authorized to >>>>>>>> conclude >>>>>>>> >>>>>>>> any binding >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>> agreement on behalf of TiVo Inc. by email. >>>>>>>> Binding >>>>>>>> >>>>>>>> agreements with >>>>>>>> >>>>>>>> > >>>>>>>> > >> TiVo >>>>>>>> > >>>>>>>> > >>>>> Inc. may only be made by a signed written >>>>>>>> agreement. >>>>>>>> > >>>>>>>> > >>>>> >>>>>>>> > >>>>>>>> > >>>> >>>>>>>> > >>>>>>>> > >>> >>>>>>>> > >>>>>>>> > >> >>>>>>>> > >>>>>>>> > > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > -- >>>>>>>> > >>>>>>>> > <http://www.openbet.com/> <http://www.openbet.com/> >>>>>>>> >>>>>>>> > >>>>>>>> > *Michal Borowiecki* >>>>>>>> > >>>>>>>> > *Senior Software Engineer L4* >>>>>>>> > >>>>>>>> > *T: * >>>>>>>> > >>>>>>>> > +44 208 742 1600 <+44%2020%208742%201600> >>>>>>>> <+44%2020%208742%201600> >>>>>>>> > >>>>>>>> > +44 203 249 8448 <+44%2020%203249%208448> >>>>>>>> <+44%2020%203249%208448> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > *E: * >>>>>>>> > >>>>>>>> > michal.borowie...@openbet.com >>>>>>>> > >>>>>>>> > *W: * >>>>>>>> > >>>>>>>> > www.openbet.com >>>>>>>> > >>>>>>>> > *OpenBet Ltd* >>>>>>>> > >>>>>>>> > Chiswick Park Building 9 >>>>>>>> > >>>>>>>> > 566 Chiswick High Rd >>>>>>>> > >>>>>>>> > London >>>>>>>> > >>>>>>>> > W4 5XT >>>>>>>> > >>>>>>>> > UK >>>>>>>> > >>>>>>>> > <https://www.openbet.com/email_promo> >>>>>>>> <https://www.openbet.com/email_promo> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > This message is confidential and intended only for the >>>>>>>> addressee. >>>>>>>> >>>>>>>> If you >>>>>>>> >>>>>>>> > have received this message in error, please immediately >>>>>>>> notify the >>>>>>>> > postmas...@openbet.com and delete it from your system as >>>>>>>> well as >>>>>>>> >>>>>>>> any >>>>>>>> >>>>>>>> > copies. The content of e-mails as well as traffic data may >>>>>>>> be >>>>>>>> >>>>>>>> monitored by >>>>>>>> >>>>>>>> > OpenBet for employment and security purposes. To protect >>>>>>>> the >>>>>>>> >>>>>>>> environment >>>>>>>> >>>>>>>> > please do not print this e-mail unless necessary. OpenBet >>>>>>>> Ltd. >>>>>>>> >>>>>>>> Registered >>>>>>>> >>>>>>>> > Office: Chiswick Park Building 9, 566 Chiswick High Road, >>>>>>>> London, >>>>>>>> >>>>>>>> W4 5XT, >>>>>>>> >>>>>>>> > United Kingdom. A company registered in England and Wales. >>>>>>>> >>>>>>>> Registered no. >>>>>>>> >>>>>>>> > 3134634. VAT no. GB927523612 >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>>>> Tommy Becker >>>>>>>> >>>>>>>> Senior Software Engineer >>>>>>>> >>>>>>>> O +1 919.460.4747 <%28919%29%20460-4747> >>>>>>>> >>>>>>>> >>>>>>>> tivo.com >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> ________________________________ >>>>>>>> >>>>>>>> This email and any attachments may contain confidential and privileged >>>>>>>> material for the sole use of the intended recipient. Any review, >>>>>>>> copying, or distribution of this email (or any attachments) by others >>>>>>>> is prohibited. If you are not the intended recipient, please contact >>>>>>>> the sender immediately and permanently delete this email and any >>>>>>>> attachments. No employee or agent of TiVo Inc. is authorized to >>>>>>>> conclude any binding agreement on behalf of TiVo Inc. by email. >>>>>>>> Binding agreements with TiVo Inc. may only be made by a signed written >>>>>>>> agreement. >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> <http://www.openbet.com/> Michal Borowiecki >>>>>>>> Senior Software Engineer L4 >>>>>>>> T: +44 208 742 1600 <+44%2020%208742%201600> >>>>>>>> >>>>>>>> >>>>>>>> >>>>> -- >>>>> Signature >>>>> <http://www.openbet.com/> Michal Borowiecki >>>>> Senior Software Engineer L4 >>>>> T: +44 208 742 1600 >>>>> >>>>> >>>>> +44 203 249 8448 >>>>> >>>>> >>>>> >>>>> E: michal.borowie...@openbet.com >>>>> W: www.openbet.com <http://www.openbet.com/> >>>>> >>>>> >>>>> OpenBet Ltd >>>>> >>>>> Chiswick Park Building 9 >>>>> >>>>> 566 Chiswick High Rd >>>>> >>>>> London >>>>> >>>>> W4 5XT >>>>> >>>>> UK >>>>> >>>>> >>>>> <https://www.openbet.com/email_promo> >>>>> >>>>> This message is confidential and intended only for the addressee. If >>>>> you have received this message in error, please immediately notify the >>>>> postmas...@openbet.com <mailto:postmas...@openbet.com> and delete it >>>>> from your system as well as any copies. The content of e-mails as well >>>>> as traffic data may be monitored by OpenBet for employment and >>>>> security purposes. To protect the environment please do not print this >>>>> e-mail unless necessary. OpenBet Ltd. Registered Office: Chiswick Park >>>>> Building 9, 566 Chiswick High Road, London, W4 5XT, United Kingdom. A >>>>> company registered in England and Wales. Registered no. 3134634. VAT >>>>> no. GB927523612 >>>>> >>>> -- >>>> Signature >>>> <http://www.openbet.com/> Michal Borowiecki >>>> Senior Software Engineer L4 >>>> T: +44 208 742 1600 >>>> >>>> >>>> +44 203 249 8448 >>>> >>>> >>>> >>>> E: michal.borowie...@openbet.com >>>> W: www.openbet.com <http://www.openbet.com/> >>>> >>>> >>>> OpenBet Ltd >>>> >>>> Chiswick Park Building 9 >>>> >>>> 566 Chiswick High Rd >>>> >>>> London >>>> >>>> W4 5XT >>>> >>>> UK >>>> >>>> >>>> <https://www.openbet.com/email_promo> >>>> >>>> This message is confidential and intended only for the addressee. If you >>>> have received this message in error, please immediately notify the >>>> postmas...@openbet.com <mailto:postmas...@openbet.com> and delete it >>>> from your system as well as any copies. The content of e-mails as well >>>> as traffic data may be monitored by OpenBet for employment and security >>>> purposes. To protect the environment please do not print this e-mail >>>> unless necessary. OpenBet Ltd. Registered Office: Chiswick Park Building >>>> 9, 566 Chiswick High Road, London, W4 5XT, United Kingdom. A company >>>> registered in England and Wales. Registered no. 3134634. VAT no. >>>> GB927523612 >>>> >> >> -- >> Signature >> <http://www.openbet.com/> Michal Borowiecki >> Senior Software Engineer L4 >> T: +44 208 742 1600 >> >> >> +44 203 249 8448 >> >> >> >> E: michal.borowie...@openbet.com >> W: www.openbet.com <http://www.openbet.com/> >> >> >> OpenBet Ltd >> >> Chiswick Park Building 9 >> >> 566 Chiswick High Rd >> >> London >> >> W4 5XT >> >> UK >> >> >> <https://www.openbet.com/email_promo> >> >> This message is confidential and intended only for the addressee. If >> you have received this message in error, please immediately notify the >> postmas...@openbet.com <mailto:postmas...@openbet.com> and delete it >> from your system as well as any copies. The content of e-mails as well >> as traffic data may be monitored by OpenBet for employment and >> security purposes. To protect the environment please do not print this >> e-mail unless necessary. OpenBet Ltd. Registered Office: Chiswick Park >> Building 9, 566 Chiswick High Road, London, W4 5XT, United Kingdom. A >> company registered in England and Wales. Registered no. 3134634. VAT >> no. GB927523612 >> > > -- > Signature > <http://www.openbet.com/> Michal Borowiecki > Senior Software Engineer L4 > T: +44 208 742 1600 > > > +44 203 249 8448 > > > > E: michal.borowie...@openbet.com > W: www.openbet.com <http://www.openbet.com/> > > > OpenBet Ltd > > Chiswick Park Building 9 > > 566 Chiswick High Rd > > London > > W4 5XT > > UK > > > <https://www.openbet.com/email_promo> > > This message is confidential and intended only for the addressee. If you > have received this message in error, please immediately notify the > postmas...@openbet.com <mailto:postmas...@openbet.com> and delete it > from your system as well as any copies. The content of e-mails as well > as traffic data may be monitored by OpenBet for employment and security > purposes. To protect the environment please do not print this e-mail > unless necessary. OpenBet Ltd. Registered Office: Chiswick Park Building > 9, 566 Chiswick High Road, London, W4 5XT, United Kingdom. A company > registered in England and Wales. Registered no. 3134634. VAT no. > GB927523612 >
signature.asc
Description: OpenPGP digital signature