I agree with this. We would need to allow processor level configuration.

I also agree that the global caching config is not optimal...


-Matthias

On 4/24/17 3:55 AM, Michal Borowiecki wrote:
> Further to this, on your point about configuration:
> 
>> Thus, I also believe that one might need different "configuration"
>> values for the hybrid approach if you run the same code for different
>> scenarios: regular processing, re-processing, catching up scenario. And
>> as the term "configuration" implies, we might be better off to not mix
>> configuration with business logic that is expressed via code.
> I'm not sure I understand what you are suggesting here.
> 
> Configuration is global to a KafkaStreams instance and users might want
> to have different tolerance in different parts of the topology. They
> shouldn't be locked into one value set via global config.
> 
> To illustrate this point: Lately I have discovered the cache config
> introduced in KIP-63
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-63%3A+Unify+store+and+downstream+caching+in+streams>
> and found it quite annoying that it's controlled by a config item. IMO,
> I should be able to control flushing per processor, not be forced to use
> one global value defined in configs.
> 
> It's easy enough for users to source a user-defined config and provide
> it as a parameter to a /given/ processor as needed.
> 
> In principle I agree that configuration and business logic are better
> not mixed together but then the configuration mechanism should allow
> users to target specific processors and not be global to the
> KafkaStreams instance.
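A minimal sketch of the idea above: source a user-defined value and hand it to one specific processor. The property naming scheme (`<processorName>.flush.interval.ms`) and both class names are assumptions made for illustration; they are not part of Kafka Streams.

```java
import java.util.Properties;

// Hypothetical stand-in for a processor; NOT the Kafka Streams Processor interface.
final class ExpiryProcessor {
    private final long flushIntervalMs;

    ExpiryProcessor(long flushIntervalMs) {
        this.flushIntervalMs = flushIntervalMs;
    }

    long flushIntervalMs() {
        return flushIntervalMs;
    }
}

// Hypothetical helper that resolves a per-processor value from user-defined config.
final class PerProcessorConfig {
    // Looks up "<processorName>.flush.interval.ms" and falls back to defaultMs if unset.
    static long flushIntervalFor(Properties props, String processorName, long defaultMs) {
        String value = props.getProperty(processorName + ".flush.interval.ms");
        return value == null ? defaultMs : Long.parseLong(value);
    }
}
```

Each processor would then be constructed with its own value, e.g. `new ExpiryProcessor(PerProcessorConfig.flushIntervalFor(props, "expiry", 1000L))`, while processors without an entry keep the default.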
> 
> Thanks,
> 
> Michal
> 
> On 24/04/17 10:23, Michal Borowiecki wrote:
>>
>> Hi Matthias,
>>
>> I agree it's difficult to reason about the hybrid approach, I
>> certainly found it hard and I'm totally on board with the mantra.
>>
>> I'd be happy to limit the scope of this KIP to add system-time
>> punctuation semantics (in addition to existing stream-time semantics)
>> and leave more complex schemes for users to implement on top of that.
>>
>> Additional PunctuationTypes could then be added by future
>> KIPs, possibly including the hybrid approach once it has been given
>> more thought.
>>
>>> There are real-time applications, that want to get
>>> callbacks in regular system-time intervals (completely independent from
>>> stream-time).
>> Can you please describe what they are, so that I can put them on the
>> wiki for later reference?
>>
>> Thanks,
>>
>> Michal
>>
>>
>> On 23/04/17 21:27, Matthias J. Sax wrote:
>>> Hi,
>>>
>>> I do like Damian's API proposal about the punctuation callback function.
>>>
>>> I also did reread the KIP and thought about the semantics we want to
>>> provide.
>>>
>>>> Given the above, I don't see a reason any more for a separate system-time 
>>>> based punctuation.
>>> I disagree here. There are real-time applications, that want to get
>>> callbacks in regular system-time intervals (completely independent from
>>> stream-time). Thus we should allow this -- if we really follow the
>>> "hybrid" approach, this could be configured with stream-time interval
>>> infinite and delay whatever system-time punctuation interval you want to
>>> have. However, I would like to add a proper API for this and do this
>>> configuration under the hood (that would allow one implementation within
>>> all kind of branching for different cases).
>>>
>>> Thus, we definitely should have PunctuationType#StreamTime and
>>> #SystemTime -- and additionally, we _could_ have #Hybrid. Thus, I am not
>>> a fan of your latest API proposal.
>>>
>>>
>>> About the hybrid approach in general. On the one hand I like it, on the
>>> other hand, it seems to be rather (1) complicated (not necessarily from
>>> an implementation point of view, but for people to understand it) and
>>> (2) mixes two semantics together in a "weird" way. Thus, I disagree with:
>>>
>>>> It may appear complicated at first but I do think these semantics will
>>>> still be more understandable to users than having 2 separate punctuation
>>>> schedules/callbacks with different PunctuationTypes.
>>> This statement only holds if you apply strong assumptions that I don't
>>> believe hold in general -- see (2) for details -- and I think it is
>>> harder than you assume to reason about the hybrid approach in general.
>>> IMHO, the hybrid approach is a "false friend" that seems to be easy to
>>> reason about...
>>>
>>>
>>> (1) Streams always embraced "easy to use" and we should really be
>>> careful to keep it this way. On the other hand, as we are talking about
>>> changes to PAPI, it won't affect DSL users (DSL does not use punctuation
>>> at all at the moment), and thus, the "easy to use" mantra might not be
>>> affected, while it will allow advanced users to express more complex stuff.
>>>
>>> I like the mantra: "make simple thing easy and complex things possible".
>>>
>>> (2) IMHO the major disadvantage (issue?) of the hybrid approach is the
>>> implicit assumption that event-time progresses at the same "speed" as
>>> system-time during regular processing. This implies the assumption that
>>> a slower progress in stream-time indicates the absence of input events
>>> (and that later arriving input events will have a larger event-time with
>>> high probability). Even if this might be true for some use cases, I
>>> doubt it holds in general. Assume that you get a spike in traffic and
>>> for some reason stream-time does advance slowly because you have more
>>> records to process. This might trigger a system-time based punctuation
>>> call even if this seems not to be intended. I strongly believe that it
>>> is not easy to reason about the semantics of the hybrid approach (even
>>> if the intentional semantics would be super useful -- but I doubt that
>>> we get what we ask for).
>>>
>>> Thus, I also believe that one might need different "configuration"
>>> values for the hybrid approach if you run the same code for different
>>> scenarios: regular processing, re-processing, catching up scenario. And
>>> as the term "configuration" implies, we might be better off to not mix
>>> configuration with business logic that is expressed via code.
>>>
>>>
>>> One more comment: I also don't think that the hybrid approach is
>>> deterministic as claimed in the use-case subpage. I understand the
>>> reasoning and agree, that it is deterministic if certain assumptions
>>> hold -- compare above -- and if configured correctly. But strictly
>>> speaking it's not because there is a dependency on system-time (and
>>> IMHO, if system-time is involved it cannot be deterministic by definition).
>>>
>>>
>>>> I see how in theory this could be implemented on top of the 2 punctuate
>>>> callbacks with the 2 different PunctuationTypes (one stream-time based,
>>>> the other system-time based) but it would be a much more complicated
>>>> scheme and I don't want to suggest that.
>>> I agree that expressing the intended hybrid semantics is harder if we
>>> offer only #StreamTime and #SystemTime punctuation. However, I also
>>> believe that the hybrid approach is a "false friend" with regard to
>>> reasoning about the semantics (it suggests that it is easier than it is
>>> in reality). Therefore, we might be better off to not offer the hybrid
>>> approach and make it clear to a developer that it is hard to mix
>>> #StreamTime and #SystemTime in a semantically sound way.
>>>
>>>
>>> Looking forward to your feedback. :)
>>>
>>> -Matthias
>>>
>>>
>>>
>>>
>>> On 4/22/17 11:43 AM, Michal Borowiecki wrote:
>>>> Hi all,
>>>>
>>>> Looking for feedback on the functional interface approach Damian
>>>> proposed. What do people think?
>>>>
>>>> Further on the semantics of triggering punctuate though:
>>>>
>>>> I ran through the 2 use cases that Arun had kindly put on the wiki
>>>> (https://cwiki.apache.org/confluence/display/KAFKA/Punctuate+Use+Cases)
>>>> in my head and on a whiteboard and I can't find a better solution than
>>>> the "hybrid" approach he had proposed.
>>>>
>>>> I see how in theory this could be implemented on top of the 2 punctuate
>>>> callbacks with the 2 different PunctuationTypes (one stream-time based,
>>>> the other system-time based) but it would be a much more complicated
>>>> scheme and I don't want to suggest that.
>>>>
>>>> However, to add to the hybrid algorithm proposed, I suggest one
>>>> parameter to that: a tolerance period, expressed in milliseconds
>>>> system-time, after which the punctuation will be invoked in case the
>>>> stream-time advance hasn't triggered it within the requested interval
>>>> since the last invocation of punctuate (as recorded in system-time)
>>>>
>>>> This would allow a user-defined tolerance for late arriving events. The
>>>> trade off would be left for the user to decide: regular punctuation in
>>>> the case of absence of events vs allowing for records arriving late or
>>>> some build-up due to processing not catching up with the event rate.
>>>> In the one extreme, this tolerance could be set to infinity, turning
>>>> hybrid into simply stream-time based punctuate, like we have now. In the
>>>> other extreme, the tolerance could be set to 0, resulting in a
>>>> system-time upper bound on the effective punctuation interval.
>>>>
>>>> Given the above, I don't see a reason any more for a separate
>>>> system-time based punctuation. The "hybrid" approach with 0ms tolerance
>>>> would under normal operation trigger at regular intervals wrt the
>>>> system-time, except in cases of re-play/catch-up, where the stream time
>>>> advances faster than system time. In these cases punctuate would happen
>>>> more often than the specified interval wrt system time. However, the
>>>> use-cases that need system-time punctuations (that I've seen at least)
>>>> really only have a need for an upper bound on punctuation delay but
>>>> don't need a lower bound.
>>>>
>>>> To that effect I'd propose the api to be as follows, on ProcessorContext:
>>>>
>>>> schedule(Punctuator callback, long interval, long toleranceInterval); // 
>>>> schedules punctuate at stream-time intervals with a system-time upper 
>>>> bound of (interval+toleranceInterval)
>>>>
>>>> schedule(Punctuator callback, long interval); // schedules punctuate at 
>>>> stream-time intervals without a system-time upper bound - this is
>>>> equivalent to current stream-time based punctuate
>>>>
>>>> Punctuation is triggered when either:
>>>> - the stream time advances past the (stream time of the previous
>>>> punctuation) + interval;
>>>> - or (iff the toleranceInterval is set) when the system time advances
>>>> past the (system time of the previous punctuation) + interval +
>>>> toleranceInterval
>>>>
>>>> In either case:
>>>> - we trigger punctuate passing as the argument the stream time at which
>>>> the current punctuation was meant to happen
>>>> - next punctuate is scheduled at (stream time at which the current
>>>> punctuation was meant to happen) + interval
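The trigger rule described above can be sketched as a small simulation. This is only an illustration of the proposed semantics under stated assumptions, not Kafka Streams code: the class name `HybridSchedule` and the method `advance` are hypothetical, "advances past" is read as "reaches or exceeds", and at most one punctuation fires per call.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation only; this is not Kafka Streams code.
final class HybridSchedule {
    private final long interval;          // stream-time interval, in ms
    private final long toleranceInterval; // extra system-time slack, in ms
    private long nextStreamTime;          // stream time at which the next punctuation is due
    private long lastSystemTime;          // system time of the previous punctuation

    // Stream-time arguments that would have been passed to punctuate, in order.
    final List<Long> fired = new ArrayList<>();

    HybridSchedule(long interval, long toleranceInterval,
                   long startStreamTime, long startSystemTime) {
        this.interval = interval;
        this.toleranceInterval = toleranceInterval;
        this.nextStreamTime = startStreamTime + interval;
        this.lastSystemTime = startSystemTime;
    }

    // Observes both clocks; fires at most one punctuation and reports whether it did.
    boolean advance(long streamTime, long systemTime) {
        boolean streamDue = streamTime >= nextStreamTime;
        boolean systemDue = systemTime >= lastSystemTime + interval + toleranceInterval;
        if (!streamDue && !systemDue) {
            return false;
        }
        fired.add(nextStreamTime);  // pass the stream time the punctuation was meant for
        nextStreamTime += interval; // schedule the next one relative to that stream time
        lastSystemTime = systemTime;
        return true;
    }
}
```

With an interval of 1000 ms and a tolerance of 200 ms, a stalled stream time still yields a punctuation once 1200 ms of system time have elapsed since the previous one, and the callback argument remains the stream time that punctuation was scheduled for.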
>>>>
>>>> It may appear complicated at first but I do think these semantics will
>>>> still be more understandable to users than having 2 separate punctuation
>>>> schedules/callbacks with different PunctuationTypes.
>>>>
>>>>
>>>>
>>>> PS. Having re-read this, maybe the following alternative would be easier
>>>> to understand (WDYT?):
>>>>
>>>> schedule(Punctuator callback, long streamTimeInterval, long 
>>>> systemTimeUpperBound); // schedules punctuate at stream-time intervals 
>>>> with a system-time upper bound - systemTimeUpperBound must be no less than 
>>>> streamTimeInterval
>>>>
>>>> schedule(Punctuator callback, long streamTimeInterval); // schedules 
>>>> punctuate at stream-time intervals without a system-time upper bound - 
>>>> this is equivalent to current stream-time based punctuate
>>>>
>>>> Punctuation is triggered when either:
>>>> - the stream time advances past the (stream time of the previous
>>>> punctuation) + streamTimeInterval;
>>>> - or (iff systemTimeUpperBound is set) when the system time advances
>>>> past the (system time of the previous punctuation) + systemTimeUpperBound
>>>>
>>>> Awaiting comments.
>>>>
>>>> Thanks,
>>>> Michal
>>>>
>>>> On 21/04/17 16:56, Michal Borowiecki wrote:
>>>>> Yes, that's what I meant. Just wanted to highlight we'd deprecate it
>>>>> in favour of something that doesn't return a record. Not a problem though.
>>>>>
>>>>>
>>>>> On 21/04/17 16:32, Damian Guy wrote:
>>>>>> Thanks Michal,
>>>>>> I agree Transformer.punctuate should also be void, but we can deprecate
>>>>>> that too in favor of the new interface.
>>>>>>
>>>>>> Thanks for the javadoc PR!
>>>>>>
>>>>>> Cheers,
>>>>>> Damian
>>>>>>
>>>>>> On Fri, 21 Apr 2017 at 09:31 Michal Borowiecki <
>>>>>> michal.borowie...@openbet.com> wrote:
>>>>>>
>>>>>>> Yes, that looks better to me.
>>>>>>>
>>>>>>> Note that punctuate on Transformer is currently returning a record, but
>>>>>>> I think it's ok to have all output records be sent via
>>>>>>> ProcessorContext.forward, which has to be used anyway if you want to
>>>>>>> send multiple records from one invocation of punctuate.
>>>>>>>
>>>>>>> This way it's consistent between Processor and Transformer.
>>>>>>>
>>>>>>>
>>>>>>> BTW, looking at this I found a glitch in the javadoc and put a comment
>>>>>>> there:
>>>>>>>
>>>>>>> https://github.com/apache/kafka/pull/2413/files#r112634612
>>>>>>>
>>>>>>> and PR: https://github.com/apache/kafka/pull/2884
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Michal
>>>>>>> On 20/04/17 18:55, Damian Guy wrote:
>>>>>>>
>>>>>>> Hi Michal,
>>>>>>>
>>>>>>> Thanks for the KIP. I'd like to propose a bit more of a radical change
>>>>>>> to the API.
>>>>>>> 1. deprecate the punctuate method on Processor
>>>>>>> 2. create a new Functional Interface just for Punctuation, something 
>>>>>>> like:
>>>>>>> interface Punctuator {
>>>>>>>     void punctuate(long timestamp);
>>>>>>> }
>>>>>>> 3. add a new schedule function to ProcessorContext: schedule(long
>>>>>>> interval, PunctuationType type, Punctuator callback)
>>>>>>> 4. deprecate the existing schedule function
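A rough sketch of how points 2 and 3 could fit together. The enum constant names and the `SketchContext` class are assumptions for illustration; `SketchContext` only stands in for the relevant slice of ProcessorContext and merely records registrations instead of driving them from a clock.

```java
import java.util.ArrayList;
import java.util.List;

// Constant names are an assumption; the thread only mentions a PunctuationType.
enum PunctuationType { STREAM_TIME, SYSTEM_TIME }

@FunctionalInterface
interface Punctuator {
    void punctuate(long timestamp);
}

// Hypothetical stand-in for the scheduling part of ProcessorContext.
final class SketchContext {
    static final class Registration {
        final long interval;
        final PunctuationType type;
        final Punctuator callback;

        Registration(long interval, PunctuationType type, Punctuator callback) {
            this.interval = interval;
            this.type = type;
            this.callback = callback;
        }
    }

    final List<Registration> registrations = new ArrayList<>();

    // Mirrors the proposed signature: schedule(long interval, PunctuationType type, Punctuator callback).
    void schedule(long interval, PunctuationType type, Punctuator callback) {
        registrations.add(new Registration(interval, type, callback));
    }
}
```

Because `Punctuator` has a single method, a caller could register a lambda, e.g. `context.schedule(1000L, PunctuationType.SYSTEM_TIME, ts -> flush(ts))`, where `flush` is whatever the processor wants to run, and one processor could hold several registrations with different types.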
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Damian
>>>>>>>
>>>>>>> On Sun, 16 Apr 2017 at 21:55 Michal Borowiecki <
>>>>>>> michal.borowie...@openbet.com> wrote:
>>>>>>>
>>>>>>>> Hi Thomas,
>>>>>>>>
>>>>>>>> I would say our use cases fall in the same category as yours.
>>>>>>>>
>>>>>>>> 1) One is expiry of old records, it's virtually identical to yours.
>>>>>>>>
>>>>>>>> 2) Second one is somewhat more convoluted but boils down to the same
>>>>>>>> type of design:
>>>>>>>>
>>>>>>>> Incoming messages carry a number of fields, including a timestamp.
>>>>>>>>
>>>>>>>> Outgoing messages contain derived fields, one of which (X) depends on
>>>>>>>> the timestamp input field (Y) and some other input field (Z).
>>>>>>>>
>>>>>>>> Since the output field X is derived in some non-trivial way, we don't
>>>>>>>> want to force the logic onto downstream apps. Instead we want to
>>>>>>>> calculate it in the Kafka Streams app, which means we re-calculate X as
>>>>>>>> soon as the timestamp in Y is reached (wall clock time) and send a
>>>>>>>> message if it changed (I say "if" because the derived field (X) is also
>>>>>>>> conditional on another input field Z).
>>>>>>>>
>>>>>>>> So we have kv stores with the records and an additional kv store with
>>>>>>>> a timestamp->id mapping, which acts like an index on which we
>>>>>>>> periodically do a ranged query.
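To make the timestamp->id index concrete, here is a minimal in-memory sketch. It uses a `TreeMap` in place of a Kafka Streams state store, and the class and method names (`TimestampIndex`, `pollDue`) are hypothetical; it only illustrates the "ranged query up to now" pattern described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for the timestamp->id "index" store; not a Kafka Streams store.
final class TimestampIndex {
    private final NavigableMap<Long, List<String>> idsByTimestamp = new TreeMap<>();

    // Registers that the record with this id must be revisited at the given timestamp.
    void put(long timestamp, String id) {
        idsByTimestamp.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(id);
    }

    // Ranged query: removes and returns all ids whose timestamp is <= now, in timestamp order.
    List<String> pollDue(long now) {
        NavigableMap<Long, List<String>> head = idsByTimestamp.headMap(now, true);
        List<String> due = new ArrayList<>();
        for (List<String> ids : head.values()) {
            due.addAll(ids);
        }
        head.clear(); // headMap is a view, so this also removes the entries from the index
        return due;
    }

    int size() {
        return idsByTimestamp.size();
    }
}
```

A periodic callback would call `pollDue(currentTime)` and re-calculate the derived field for each returned id.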
>>>>>>>>
>>>>>>>> Initially we naively tried doing it in punctuate which of course didn't
>>>>>>>> work when there were no regular msgs on the input topic.
>>>>>>>> Since this was before 0.10.1 and state stores weren't query-able from
>>>>>>>> outside we created a "ticker" that produced msgs once per second onto
>>>>>>>> another topic and fed it into the same topology to trigger punctuate.
>>>>>>>> This didn't work either, which was much more surprising to us at the
>>>>>>>> time, because it was not obvious at all that punctuate is only
>>>>>>>> triggered if *all* input partitions receive messages regularly.
>>>>>>>> In the end we had to break this into 2 separate Kafka Streams. Main
>>>>>>>> transformer doesn't use punctuate but sends values of timestamp field Y
>>>>>>>> and the id to a "scheduler" topic where also the periodic ticks are
>>>>>>>> sent. This is consumed by the second topology and is its only input
>>>>>>>> topic. There's a transformer on that topic which populates and updates
>>>>>>>> the time-based indexes and polls them from punctuate. If the time in the
>>>>>>>> timestamp elapsed, the record id is sent to the main transformer, which
>>>>>>>> updates/deletes the record from the main kv store and forwards the
>>>>>>>> transformed record to the output topic.
>>>>>>>>
>>>>>>>> To me this setup feels horrendously complicated for what it does.
>>>>>>>>
>>>>>>>> We could incrementally improve on this since 0.10.1 to poll the
>>>>>>>> timestamp->id "index" stores from some code outside the KafkaStreams
>>>>>>>> topology so that at least we wouldn't need the extra topic for "ticks".
>>>>>>>> However, the ticks don't feel so hacky when you realise they give you
>>>>>>>> some hypothetical benefits in predictability. You can reprocess the
>>>>>>>> messages in a reproducible manner, since the topologies use event-time,
>>>>>>>> just that the event time is simply the wall-clock time fed into a topic
>>>>>>>> by the ticks. (NB in our use case we haven't yet found a need for this
>>>>>>>> kind of reprocessing).
>>>>>>>> To make that work though, we would have to have the stream time advance
>>>>>>>> based on the presence of msgs on the "tick" topic, regardless of the
>>>>>>>> presence of messages on the other input topic.
>>>>>>>>
>>>>>>>> Same as in the expiry use case, both the wall-clock triggered punctuate
>>>>>>>> and the hybrid would work to simplify this a lot.
>>>>>>>>
>>>>>>>> 3) Finally, I have a 3rd use case in the making but I'm still looking
>>>>>>>> if we can achieve it using session windows instead. I'll keep you posted
>>>>>>>> if we have to go with punctuate there too.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Michal
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/04/17 20:52, Thomas Becker wrote:
>>>>>>>>
>>>>>>>> Here's an example that we currently have.  We have a streams processor
>>>>>>>> that does a transform from one topic into another. One of the fields in
>>>>>>>> the source topic record is an expiration time, and one of the functions
>>>>>>>> of the processor is to ensure that expired records get deleted promptly
>>>>>>>> after that time passes (typically days or weeks after the message was
>>>>>>>> originally produced). To do that, the processor keeps a state store of
>>>>>>>> keys and expiration times, iterates that store in punctuate(), and
>>>>>>>> emits delete (null) records for expired items. This needs to happen at
>>>>>>>> some minimum interval regardless of the incoming message rate of the
>>>>>>>> source topic.
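The expiry pattern described above might look roughly like this. It is a sketch under assumptions: a `HashMap` stands in for the state store, a list of keys stands in for forwarding delete (null) records downstream, and the names `ExpirySweeper` and `track` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Illustrative only; the map and list stand in for a state store and for forwarded tombstones.
final class ExpirySweeper {
    // key -> expiration time in ms (stand-in for the state store of keys and expiration times)
    private final Map<String, Long> expirations = new HashMap<>();

    // Keys for which a delete (null-valued) record would have been emitted.
    final List<String> tombstonedKeys = new ArrayList<>();

    void track(String key, long expiresAtMs) {
        expirations.put(key, expiresAtMs);
    }

    // Meant to be driven by punctuate with the current wall-clock time.
    void punctuate(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = expirations.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMs) {
                tombstonedKeys.add(entry.getKey()); // would be a (key, null) record downstream
                it.remove();                        // forget the key once it has expired
            }
        }
    }
}
```

The sweep only does its job if `punctuate` is called at some bounded wall-clock interval, which is exactly the requirement stated for this use case.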
>>>>>>>>
>>>>>>>> In this scenario, the expiration of records is the primary function of
>>>>>>>> punctuate, and therefore the key requirement is that the wall-clock
>>>>>>>> measured time between punctuate calls have some upper-bound. So a pure
>>>>>>>> wall-clock based schedule would be fine for our needs. But the proposed
>>>>>>>> "hybrid" system would also be acceptable if that satisfies a broader
>>>>>>>> range of use-cases.
>>>>>>>>
>>>>>>>> On Tue, 2017-04-11 at 14:41 +0200, Michael Noll wrote:
>>>>>>>>
>>>>>>>> I apologize for the longer email below.  In my defense, it started out
>>>>>>>> much shorter. :-)  Also, to be super-clear, I am intentionally playing
>>>>>>>> devil's advocate for a number of arguments brought forth in order to
>>>>>>>> help improve this KIP -- I am not implying I necessarily disagree with
>>>>>>>> the arguments.
>>>>>>>>
>>>>>>>> That aside, here are some further thoughts.
>>>>>>>>
>>>>>>>> First, there are (at least?) two categories for actions/behavior you
>>>>>>>> invoke via punctuate():
>>>>>>>>
>>>>>>>> 1. For internal housekeeping of your Processor or Transformer (e.g., to
>>>>>>>> periodically commit to a custom store, to do metrics/logging).  Here,
>>>>>>>> the impact of punctuate is typically not observable by other processing
>>>>>>>> nodes in the topology.
>>>>>>>> 2. For controlling the emit frequency of downstream records.  Here, the
>>>>>>>> punctuate is all about being observable by downstream processing nodes.
>>>>>>>>
>>>>>>>> A few releases back, we introduced record caches (DSL) and state store
>>>>>>>> caches (Processor API) in KIP-63.  Here, we addressed a concern relating
>>>>>>>> to (2) where some users needed to control -- here: limit -- the
>>>>>>>> downstream output rate of Kafka Streams because the downstream
>>>>>>>> systems/apps would not be able to keep up with the upstream output rate
>>>>>>>> (Kafka scalability > their scalability).  The argument for KIP-63, which
>>>>>>>> notably did not introduce a "trigger" API, was that such an interaction
>>>>>>>> with downstream systems is an operational concern;  it should not impact
>>>>>>>> the processing *logic* of your application, and thus we didn't want to
>>>>>>>> complicate the Kafka Streams API, especially not the declarative DSL,
>>>>>>>> with such operational concerns.
>>>>>>>>
>>>>>>>> This KIP's discussion on `punctuate()` takes us back in time (<-- sorry,
>>>>>>>> I couldn't resist making this pun :-P).  As a meta-comment, I am
>>>>>>>> observing that our conversation is moving more and more into the
>>>>>>>> direction of explicit "triggers" because, so far, I have seen only
>>>>>>>> motivations for use cases in category (2), but none yet for (1)?  For
>>>>>>>> example, some comments voiced here are about something like "IF
>>>>>>>> stream-time didn't trigger punctuate, THEN trigger punctuate based on
>>>>>>>> processing-time".  Do we want this, and if so, for which use cases and
>>>>>>>> benefits?  Also, on a related note, whatever we are discussing here will
>>>>>>>> impact state store caches (Processor API) and perhaps also impact record
>>>>>>>> caches (DSL), thus we should clarify any such impact here.
>>>>>>>>
>>>>>>>> Switching topics slightly.
>>>>>>>>
>>>>>>>> Jay wrote:
>>>>>>>>
>>>>>>>> One thing I've always found super important for this kind of design work
>>>>>>>> is to do a really good job of cataloging the landscape of use cases and
>>>>>>>> how prevalent each one is.
>>>>>>>>
>>>>>>>> +1 to this, as others have already said.
>>>>>>>>
>>>>>>>> Here, let me highlight -- just in case -- that when we talked about
>>>>>>>> windowing use cases in the recent emails, the Processor API (where
>>>>>>>> `punctuate` resides) does not have any notion of windowing at all.  If
>>>>>>>> you want to do windowing *in the Processor API*, you must do so manually
>>>>>>>> in combination with window stores.  For this reason I'd suggest
>>>>>>>> discussing use cases not just in general, but also in view of how you'd
>>>>>>>> do so in the Processor API vs. in the DSL.  Right now, changing/improving
>>>>>>>> `punctuate` does not impact the DSL at all, unless we add new
>>>>>>>> functionality to it.
>>>>>>>>
>>>>>>>> Jay wrote in his strawman example:
>>>>>>>>
>>>>>>>> You aggregate click and impression data for a reddit like site. Every
>>>>>>>> ten minutes you want to output a ranked list of the top 10 articles
>>>>>>>> ranked by clicks/impressions for each geographical area. I want to be
>>>>>>>> able to run this in steady state as well as rerun to regenerate results
>>>>>>>> (or catch up if it crashes).
>>>>>>>>
>>>>>>>> This is a good example for more than the obvious reason:  In KIP-63, we
>>>>>>>> argued that the reason for saying "every ten minutes" above is not
>>>>>>>> necessarily that you want to output data *exactly* after ten minutes,
>>>>>>>> but that you want to perform an aggregation based on 10-minute windows
>>>>>>>> of input data; i.e., the point is about specifying the input for your
>>>>>>>> aggregation, not (or less) about when the results of the aggregation
>>>>>>>> should be sent downstream.  To take an extreme example, you could
>>>>>>>> disable record caches and let your app output a downstream update for
>>>>>>>> every incoming input record.  If the last input record was from minute 7
>>>>>>>> of 10 (for a 10-min window), then what your app would output at minute
>>>>>>>> 10 would be identical to what it had already emitted at minute 7 earlier
>>>>>>>> anyways.  This is particularly true when we take late-arriving data into
>>>>>>>> account:  if a late record arrived at minute 13, your app would (by
>>>>>>>> default) send a new update downstream, even though the "original" 10
>>>>>>>> minutes have already passed.
>>>>>>>>
>>>>>>>> Jay wrote...:
>>>>>>>>
>>>>>>>> There are a couple of tricky things that seem to make this hard with
>>>>>>>> either of the options proposed:
>>>>>>>> 1. If I emit this data using event time I have the problem described
>>>>>>>> where a geographical region with no new clicks or impressions will fail
>>>>>>>> to output results.
>>>>>>>>
>>>>>>>> ...and Arun Mathew wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> We window by the event time, but trigger punctuate in
>>>>>>>> <punctuate interval> duration of system time, in the absence of an event
>>>>>>>> crossing the punctuate event time.
>>>>>>>>
>>>>>>>> So, given what I wrote above about the status quo and what you can
>>>>>>>> already do with it, is the concern that the state store cache doesn't
>>>>>>>> give you *direct* control over "forcing an output after no later than X
>>>>>>>> seconds [of processing-time]" but only indirect control through a cache
>>>>>>>> size?  (Note that I am not dismissing the claims why this might be
>>>>>>>> helpful.)
>>>>>>>>
>>>>>>>> Arun Mathew wrote:
>>>>>>>>
>>>>>>>> We are using Kafka Streams for our Audit Trail, where we need to output
>>>>>>>> the event counts on each topic on each cluster aggregated over a 1
>>>>>>>> minute window. We have to use event time to be able to cross check the
>>>>>>>> counts. But we need to trigger punctuate [aggregate event pushes] by
>>>>>>>> system time in the absence of events. Otherwise the event counts for
>>>>>>>> unexpired windows would be 0 which is bad.
>>>>>>>>
>>>>>>>> Isn't the latter -- "count would be 0" -- the problem between the
>>>>>>>> absence of output vs. an output of 0, similar to the use of `Option[T]`
>>>>>>>> in Scala and the difference between `None` and `Some(0)`?  That is,
>>>>>>>> isn't the root cause that the downstream system interprets the absence
>>>>>>>> of output in a particular way ("No output after 1 minute = I consider
>>>>>>>> the output to be 0.")?  Arguably, you could also adapt the downstream
>>>>>>>> system (if possible) to correctly handle the difference between absence
>>>>>>>> of output vs. output of 0.  I am not implying that we shouldn't care
>>>>>>>> about such a use case, but want to understand the motivation better. :-)
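The `None` vs. `Some(0)` distinction can be shown with Java's `OptionalLong` in place of Scala's `Option`. This is only a sketch of how a downstream consumer might keep "no output yet" apart from "a reported count of 0"; the class `WindowCounts` and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical downstream view of per-window counts.
final class WindowCounts {
    private final Map<Long, Long> countsByWindowStart = new HashMap<>();

    // Records a count that was actually received for the window.
    void report(long windowStartMs, long count) {
        countsByWindowStart.put(windowStartMs, count);
    }

    // Empty means "nothing received for this window", which differs from a received 0.
    OptionalLong countFor(long windowStartMs) {
        Long count = countsByWindowStart.get(windowStartMs);
        return count == null ? OptionalLong.empty() : OptionalLong.of(count);
    }
}
```

A consumer written this way would treat `OptionalLong.empty()` as "unknown so far" rather than silently reading it as zero.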
>>>>>>>>
>>>>>>>> Also, to add some perspective, in some related discussions we talked
>>>>>>>> about how a Kafka Streams application should not worry or not be coupled
>>>>>>>> unnecessarily with such interpretation specifics in a downstream
>>>>>>>> system's behavior.  After all, tomorrow your app's output might be
>>>>>>>> consumed by more than just this one downstream system.  Arguably, Kafka
>>>>>>>> Connect rather than Kafka Streams might be the best tool to link the
>>>>>>>> universes of Kafka and downstream systems, including helping to
>>>>>>>> reconcile the differences in how these systems interpret changes,
>>>>>>>> updates, late-arriving data, etc.  Kafka Connect would allow you to
>>>>>>>> decouple the Kafka Streams app's logical processing from the specifics
>>>>>>>> of downstream systems, thanks to specific sink connectors (e.g. for
>>>>>>>> Elastic, Cassandra, MySQL, S3, etc).  Would this decoupling with Kafka
>>>>>>>> Connect help here?  (And if the answer is "Yes, but it's currently
>>>>>>>> awkward to use Connect for this", this might be a problem we can solve,
>>>>>>>> too.)
>>>>>>>>
>>>>>>>> Switching topics slightly again.
>>>>>>>>
>>>>>>>> Thomas wrote:
>>>>>>>>
>>>>>>>> I'm not entirely convinced that a separate callback (option C)
>>>>>>>> is that messy (it could just be a default method with an empty
>>>>>>>> implementation), but if we wanted a single API to handle both cases,
>>>>>>>> how about something like the following?
>>>>>>>>
>>>>>>>> enum Time {
>>>>>>>>    STREAM,
>>>>>>>>    CLOCK
>>>>>>>> }
>>>>>>>>
>>>>>>>> Yeah, I am on the fence here, too.  If we use the 1-method approach,
>>>>>>>> then whatever the user is doing inside this method is a black box to
>>>>>>>> Kafka Streams (similar to how we have no idea what the user does inside
>>>>>>>> a `foreach` -- if the function passed to `foreach` writes to external
>>>>>>>> systems, then Kafka Streams is totally unaware of the fact).  We won't
>>>>>>>> know, for example, if the stream-time action has a smaller "trigger"
>>>>>>>> frequency than the processing-time action.  Or, we won't know whether
>>>>>>>> the user custom-codes a "not later than" trigger logic ("Do X every
>>>>>>>> 1-minute of stream-time or 1-minute of processing-time, whichever comes
>>>>>>>> first").  That said, I am not certain yet whether we would need such
>>>>>>>> knowledge because, when using the Processor API, most of the work and
>>>>>>>> decisions must be done by the user anyways.  It would matter though if
>>>>>>>> the concept of "triggers" were to bubble up into the DSL because in the
>>>>>>>> DSL the management of windowing, window stores, etc. must be done
>>>>>>>> automatically by Kafka Streams.
>>>>>>>>
>>>>>>>> [In any case, btw, we have the corner case where the user configured the
>>>>>>>> stream-time to be processing-time (e.g. via wall-clock timestamp
>>>>>>>> extractor), at which point both punctuate variants are based on the same
>>>>>>>> time semantics / timeline.]
>>>>>>>>
>>>>>>>> Again, I apologize for the wall of text.  Congratulations if you made it
>>>>>>>> this far. :-)
>>>>>>>>
>>>>>>>> More than happy to hear your thoughts!
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> On Tue, Apr 11, 2017 at 1:12 AM, Arun Mathew <arunmathe...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks Matthias.
>>>>>>>> Sure, will correct it right away.
>>>>>>>>
>>>>>>>> On 11-Apr-2017 8:07 AM, "Matthias J. Sax" <matth...@confluent.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Thanks for preparing this page!
>>>>>>>>
>>>>>>>> About terminology:
>>>>>>>>
>>>>>>>> You introduce the term "event time" -- but we should call this "stream
>>>>>>>> time" -- "stream time" is whatever TimestampExtractor returns and this
>>>>>>>> could be event time, ingestion time, or processing/wall-clock time.
>>>>>>>>
>>>>>>>> Does this make sense to you?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -Matthias
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/10/17 4:58 AM, Arun Mathew wrote:
>>>>>>>>
>>>>>>>> Thanks Ewen.
>>>>>>>>
>>>>>>>> @Michal, @all, I have created a child page to start the Use Cases
>>>>>>>> discussion
>>>>>>>> [https://cwiki.apache.org/confluence/display/KAFKA/Punctuate+Use+Cases].
>>>>>>>> Please go through it and give your comments.
>>>>>>>>
>>>>>>>>
>>>>>>>> @Tianji, Sorry for the delay. I am trying to make the patch
>>>>>>>> public.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Arun Mathew
>>>>>>>>
>>>>>>>> On 4/8/17, 02:00, "Ewen Cheslack-Postava" <e...@confluent.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     Arun,
>>>>>>>>
>>>>>>>>     I've given you permission to edit the wiki. Let me know if you run
>>>>>>>>     into any issues.
>>>>>>>>
>>>>>>>>     -Ewen
>>>>>>>>
>>>>>>>>     On Fri, Apr 7, 2017 at 1:21 AM, Arun Mathew <amat...@yahoo-corp.jp>
>>>>>>>>     wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>     > Thanks Michal. I don’t have the access yet [arunmathew88].
>>>>>>>>     > Should I be sending a separate mail for this?
>>>>>>>>     >
>>>>>>>>     > I thought one of the people following this thread would be
>>>>>>>>     > able to give me access.
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > *From: *Michal Borowiecki <michal.borowie...@openbet.com>
>>>>>>>>     > *Reply-To: *"dev@kafka.apache.org" <dev@kafka.apache.org>
>>>>>>>>     > *Date: *Friday, April 7, 2017 at 17:16
>>>>>>>>     > *To: *"dev@kafka.apache.org" <dev@kafka.apache.org>
>>>>>>>>     > *Subject: *Re: [DISCUSS] KIP-138: Change punctuate semantics
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > Hi Arun,
>>>>>>>>     >
>>>>>>>>     > I was thinking along the same lines as you, listing the use
>>>>>>>>     > cases on the wiki, but didn't find time to get around to doing
>>>>>>>>     > that yet.
>>>>>>>>     > Don't mind if you do it if you have access now.
>>>>>>>>     > I was thinking it would be nice if, once we have the use cases
>>>>>>>>     > listed, people could use likes to up-vote the use cases similar
>>>>>>>>     > to what they're working on.
>>>>>>>>     >
>>>>>>>>     > I should have a bit more time to action this in the next few
>>>>>>>>     > days, but happy for you to do it if you can beat me to it ;-)
>>>>>>>>     >
>>>>>>>>     > Cheers,
>>>>>>>>     > Michal
>>>>>>>>     >
>>>>>>>>     > On 07/04/17 04:39, Arun Mathew wrote:
>>>>>>>>     >
>>>>>>>>     > Sure, Thanks Matthias. My id is [arunmathew88].
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > Of course. I was thinking of a subpage where people can
>>>>>>>>
>>>>>>>> collaborate.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > Will do as per Michael’s suggestion.
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > Regards,
>>>>>>>>     >
>>>>>>>>     > Arun Mathew
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > On 4/7/17, 12:30, "Matthias J. Sax" <matth...@confluent.io> wrote:
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >     Please share your Wiki-ID and a committer can give you
>>>>>>>> write
>>>>>>>>
>>>>>>>> access.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >     Btw: as you did not initiate the KIP, you should not
>>>>>>>> change the
>>>>>>>>
>>>>>>>> KIP
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     without the permission of the original author -- in
>>>>>>>> this case
>>>>>>>>
>>>>>>>> Michael.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >     So you might also just share your thought over the
>>>>>>>> mailing list
>>>>>>>>
>>>>>>>> and
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     Michael can update the KIP page. Or, as an alternative,
>>>>>>>> just
>>>>>>>>
>>>>>>>> create a
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     subpage for the KIP page.
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >     @Michael: WDYT?
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >     -Matthias
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >     On 4/6/17 8:05 PM, Arun Mathew wrote:
>>>>>>>>     >
>>>>>>>>     >     > Hi Jay,
>>>>>>>>     >
>>>>>>>>     >     >           Thanks for the advice, I would like to list down
>>>>>>>>     >
>>>>>>>>     >     > the use cases as per your suggestion. But it seems I don't
>>>>>>>>     >
>>>>>>>>     >     > have write permission to the Apache Kafka Confluence space.
>>>>>>>>     >
>>>>>>>>     >     > Whom shall I request it from?
>>>>>>>>     >
>>>>>>>>     >     >
>>>>>>>>     >
>>>>>>>>     >     > Regarding your last question. We are using a patch in
>>>>>>>> our
>>>>>>>>
>>>>>>>> production system
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > which does exactly this.
>>>>>>>>     >
>>>>>>>>     >     > We window by the event time, but trigger punctuate in
>>>>>>>>
>>>>>>>> <punctuate interval>
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > duration of system time, in the absence of an event
>>>>>>>> crossing
>>>>>>>>
>>>>>>>> the punctuate
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > event time.
>>>>>>>>     >
>>>>>>>>     >     >
>>>>>>>>     >
>>>>>>>>     >     > We are using Kafka Stream for our Audit Trail, where
>>>>>>>> we need
>>>>>>>>
>>>>>>>> to output the
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > event counts on each topic on each cluster aggregated
>>>>>>>> over a
>>>>>>>>
>>>>>>>> 1 minute
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > window. We have to use event time to be able to cross
>>>>>>>> check
>>>>>>>>
>>>>>>>> the counts. But
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > we need to trigger punctuate [aggregate event pushes]
>>>>>>>> by
>>>>>>>>
>>>>>>>> system time in the
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > absence of events. Otherwise the event counts for
>>>>>>>> unexpired
>>>>>>>>
>>>>>>>> windows would
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > be 0 which is bad.
>>>>>>>>     >
>>>>>>>>     >     >
>>>>>>>>     >
>>>>>>>>     >     > "Maybe a hybrid solution works: I window by event
>>>>>>>> time but
>>>>>>>>
>>>>>>>> trigger results
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > by system time for windows that have updated? Not
>>>>>>>> really sure
>>>>>>>>
>>>>>>>> the details
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > of making that work. Does that work? Are there
>>>>>>>> concrete
>>>>>>>>
>>>>>>>> examples where you
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     > actually want the current behavior?"
>>>>>>>>     >
>>>>>>>>     >     >
>>>>>>>>     >
>>>>>>>>     >     > --
>>>>>>>>     >
>>>>>>>>     >     > With Regards,
>>>>>>>>     >
>>>>>>>>     >     >
>>>>>>>>     >
>>>>>>>>     >     > Arun Mathew
>>>>>>>>     >
>>>>>>>>     >     > Yahoo! JAPAN Corporation
>>>>>>>>     >
>>>>>>>>     >     >
>>>>>>>>     >
>>>>>>>>     >     > On Wed, Apr 5, 2017 at 8:48 PM, Tianji Li <skyah...@gmail.com> wrote:
>>>>>>>>     >
>>>>>>>>     >     >
>>>>>>>>     >
>>>>>>>>     >     >> Hi Jay,
>>>>>>>>     >
>>>>>>>>     >     >>
>>>>>>>>     >
>>>>>>>>     >     >> The hybrid solution is exactly what I expect and
>>>>>>>> need for
>>>>>>>>
>>>>>>>> our use cases
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> when dealing with telecom data.
>>>>>>>>     >
>>>>>>>>     >     >>
>>>>>>>>     >
>>>>>>>>     >     >> Thanks
>>>>>>>>     >
>>>>>>>>     >     >> Tianji
>>>>>>>>     >
>>>>>>>>     >     >>
>>>>>>>>     >
>>>>>>>>     >     >> On Wed, Apr 5, 2017 at 12:01 AM, Jay Kreps <j...@confluent.io> wrote:
>>>>>>>>     >
>>>>>>>>     >     >>
>>>>>>>>     >
>>>>>>>>     >     >>> Hey guys,
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>> One thing I've always found super important for
>>>>>>>> this kind
>>>>>>>>
>>>>>>>> of design work
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> is
>>>>>>>>     >
>>>>>>>>     >     >>> to do a really good job of cataloging the landscape
>>>>>>>> of use
>>>>>>>>
>>>>>>>> cases and how
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> prevalent each one is. By that I mean not just
>>>>>>>> listing lots
>>>>>>>>
>>>>>>>> of uses, but
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> also grouping them into categories that
>>>>>>>> functionally need
>>>>>>>>
>>>>>>>> the same thing.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> In the absence of this it is very hard to reason
>>>>>>>> about
>>>>>>>>
>>>>>>>> design proposals.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> From the proposals so far I think we have a lot of
>>>>>>>>
>>>>>>>> discussion around
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> possible apis, but less around what the user needs
>>>>>>>> for
>>>>>>>>
>>>>>>>> different use
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> cases
>>>>>>>>     >
>>>>>>>>     >     >>> and how they would implement that using the api.
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>> Here is an example:
>>>>>>>>     >
>>>>>>>>     >     >>> You aggregate click and impression data for a
>>>>>>>> reddit like
>>>>>>>>
>>>>>>>> site. Every ten
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> minutes you want to output a ranked list of the top
>>>>>>>> 10
>>>>>>>>
>>>>>>>> articles ranked by
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> clicks/impressions for each geographical area. I want to be able to run this
>>>>>>>>     >
>>>>>>>>     >     >>> in steady state as well as rerun to regenerate
>>>>>>>> results (or
>>>>>>>>
>>>>>>>> catch up if it
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> crashes).
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>> There are a couple of tricky things that seem to
>>>>>>>> make this
>>>>>>>>
>>>>>>>> hard with
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> either
>>>>>>>>     >
>>>>>>>>     >     >>> of the options proposed:
>>>>>>>>     >
>>>>>>>>     >     >>> 1. If I emit this data using event time I have the
>>>>>>>> problem
>>>>>>>>
>>>>>>>> described
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> where
>>>>>>>>     >
>>>>>>>>     >     >>> a geographical region with no new clicks or
>>>>>>>> impressions
>>>>>>>>
>>>>>>>> will fail to
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> output
>>>>>>>>     >
>>>>>>>>     >     >>> results.
>>>>>>>>     >
>>>>>>>>     >     >>> 2. If I emit this data using system time I have the
>>>>>>>> problem
>>>>>>>>
>>>>>>>> that when
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> reprocessing data my window may not be ten minutes
>>>>>>>> but 10
>>>>>>>>
>>>>>>>> hours if my
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> processing is very fast so it dramatically changes
>>>>>>>> the
>>>>>>>>
>>>>>>>> output.
>>>>>>>>
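
The arithmetic behind problem 2 can be written down in a few lines. This is only a back-of-the-envelope sketch: ReprocessingSpan and its replaySpeedup parameter are hypothetical names, and a constant replay speed is assumed purely for illustration.

```java
// Back-of-the-envelope sketch: how much event time does one system-time
// emit interval span when historical data is replayed faster than real time?
final class ReprocessingSpan {
    private ReprocessingSpan() { }

    /**
     * @param systemIntervalMs emit interval measured on the system clock
     * @param replaySpeedup    event-time milliseconds consumed per
     *                         system-time millisecond (1.0 = steady state)
     * @return event time covered by a single emit, in milliseconds
     */
    static long eventTimeCoveredMs(long systemIntervalMs, double replaySpeedup) {
        return Math.round(systemIntervalMs * replaySpeedup);
    }
}
```

For example, a ten-minute system-time interval (600,000 ms) replayed at an assumed 60x speedup covers 36,000,000 ms of event time, i.e. the ten hours mentioned above, whereas at steady state (speedup 1.0) it covers the intended ten minutes.
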
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>> Maybe a hybrid solution works: I window by event
>>>>>>>> time but
>>>>>>>>
>>>>>>>> trigger results
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> by system time for windows that have updated? Not
>>>>>>>> really
>>>>>>>>
>>>>>>>> sure the details
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> of making that work. Does that work? Are there
>>>>>>>> concrete
>>>>>>>>
>>>>>>>> examples where
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> you
>>>>>>>>     >
>>>>>>>>     >     >>> actually want the current behavior?
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>> -Jay
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>> On Tue, Apr 4, 2017 at 8:32 PM, Arun Mathew <arunmathe...@gmail.com> wrote:
>>>>>>>>     >
>>>>>>>>     >     >>>
>>>>>>>>     >
>>>>>>>>     >     >>>> Hi All,
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> Thanks for the KIP. We were also in need of a
>>>>>>>> mechanism to
>>>>>>>>
>>>>>>>> trigger
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> punctuate in the absence of events.
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> As I described in [
>>>>>>>>     >
>>>>>>>>     >     >>>> https://issues.apache.org/jira/browse/KAFKA-3514?focusedCommentId=15926036&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15926036
>>>>>>>>     >
>>>>>>>>     >     >>>> ],
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>    - Our approach involved using the event time by default.
>>>>>>>>     >
>>>>>>>>     >     >>>>    - The method to check if there is any punctuate ready in the
>>>>>>>>     >
>>>>>>>>     >     >>>>    PunctuationQueue is triggered via any event received by the
>>>>>>>>     >
>>>>>>>>     >     >>>>    stream thread, or at the polling intervals in the absence of
>>>>>>>>     >
>>>>>>>>     >     >>>>    any events.
>>>>>>>>     >
>>>>>>>>     >     >>>>    - When we create Punctuate objects (which
>>>>>>>> contains the
>>>>>>>>
>>>>>>>> next event
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> time
>>>>>>>>     >
>>>>>>>>     >     >>>>    for punctuation and interval), we also record
>>>>>>>> the
>>>>>>>>
>>>>>>>> creation time
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> (system
>>>>>>>>     >
>>>>>>>>     >     >>>>    time).
>>>>>>>>     >
>>>>>>>>     >     >>>>    - While checking for maturity of Punctuate
>>>>>>>> Schedule by
>>>>>>>>     >
>>>>>>>>     >     >> mayBePunctuate
>>>>>>>>     >
>>>>>>>>     >     >>>>    method, we also check if the system clock has
>>>>>>>> elapsed
>>>>>>>>
>>>>>>>> the punctuate
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>    interval since the schedule creation time.
>>>>>>>>     >
>>>>>>>>     >     >>>>    - In the absence of any event, or in the
>>>>>>>> absence of any
>>>>>>>>
>>>>>>>> event for
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> one
>>>>>>>>     >
>>>>>>>>     >     >>>>    topic in the partition group assigned to the
>>>>>>>> stream
>>>>>>>>
>>>>>>>> task, the system
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> time
>>>>>>>>     >
>>>>>>>>     >     >>>>    will elapse the interval and we trigger a
>>>>>>>> punctuate
>>>>>>>>
>>>>>>>> using the
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> expected
>>>>>>>>     >
>>>>>>>>     >     >>>>    punctuation event time.
>>>>>>>>     >
>>>>>>>>     >     >>>>    - we then create the next punctuation schedule
>>>>>>>> as
>>>>>>>>
>>>>>>>> punctuation event
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> time
>>>>>>>>     >
>>>>>>>>     >     >>>>    + punctuation interval, [again recording the
>>>>>>>> system
>>>>>>>>
>>>>>>>> time of creation
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> of
>>>>>>>>     >
>>>>>>>>     >     >>>> the
>>>>>>>>     >
>>>>>>>>     >     >>>>    schedule].
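
The steps listed above can be sketched roughly as follows. This is not the actual patch (which had not been published at this point in the thread) and it does not use Kafka Streams internals; HybridSchedule and its fields are hypothetical names chosen for this illustration. It assumes all times are in milliseconds and that -1 means "nothing due".

```java
// Rough sketch of one "hybrid" punctuation schedule: event time drives the
// schedule, system time acts as a fallback when no event crosses the boundary.
final class HybridSchedule {
    private final long intervalMs;
    private long nextEventTimeMs;     // next punctuation point on the event-time line
    private long createdSystemTimeMs; // system time at which this schedule was (re)created

    HybridSchedule(long firstEventTimeMs, long intervalMs, long nowSystemMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.nextEventTimeMs = firstEventTimeMs;
        this.intervalMs = intervalMs;
        this.createdSystemTimeMs = nowSystemMs;
    }

    /**
     * Called on every received event and on every idle poll.
     * Returns the expected punctuation event time if a punctuate is due
     * (because stream time reached it, or because a full interval of system
     * time has elapsed since the schedule was created), otherwise -1.
     */
    long maybePunctuate(long streamTimeMs, long nowSystemMs) {
        boolean eventTimeDue = streamTimeMs >= nextEventTimeMs;
        boolean systemTimeDue = nowSystemMs - createdSystemTimeMs >= intervalMs;
        if (!eventTimeDue && !systemTimeDue) {
            return -1L;
        }
        long firedAt = nextEventTimeMs;
        // Next schedule = punctuation event time + interval, again recording
        // the system time of creation.
        nextEventTimeMs += intervalMs;
        createdSystemTimeMs = nowSystemMs;
        return firedAt;
    }
}
```

Note that the system-time branch fires with the expected event-time timestamp even though stream time has not reached it, which is exactly where the "jump the gun" cons listed in this message come from.
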
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> We call this a Hybrid Punctuate. Of course, this
>>>>>>>> approach
>>>>>>>>
>>>>>>>> has pros and
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> cons.
>>>>>>>>     >
>>>>>>>>     >     >>>> Pros
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>    - Punctuates will happen in <punctuate
>>>>>>>> interval> time
>>>>>>>>
>>>>>>>> duration at
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> max
>>>>>>>>     >
>>>>>>>>     >     >>> in
>>>>>>>>     >
>>>>>>>>     >     >>>>    terms of system time.
>>>>>>>>     >
>>>>>>>>     >     >>>>    - The semantics as a whole continues to revolve
>>>>>>>> around
>>>>>>>>
>>>>>>>> event time.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>    - We can use the old data [old timestamps] to
>>>>>>>> rerun any
>>>>>>>>
>>>>>>>> experiments
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> or
>>>>>>>>     >
>>>>>>>>     >     >>>>    tests.
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> Cons
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>    - In case the  <punctuate interval> is not a
>>>>>>>> time
>>>>>>>>
>>>>>>>> duration [say
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> logical
>>>>>>>>     >
>>>>>>>>     >     >>>>    time/event count], then the approach might not
>>>>>>>> be
>>>>>>>>
>>>>>>>> meaningful.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>    - In case there is a case where we have to wait
>>>>>>>> for an
>>>>>>>>
>>>>>>>> actual event
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> from
>>>>>>>>     >
>>>>>>>>     >     >>>>    a low event rate partition in the partition
>>>>>>>> group, this
>>>>>>>>
>>>>>>>> approach
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> will
>>>>>>>>     >
>>>>>>>>     >     >>>> jump
>>>>>>>>     >
>>>>>>>>     >     >>>>    the gun.
>>>>>>>>     >
>>>>>>>>     >     >>>>    - in case the event processing cannot catch up
>>>>>>>> with the
>>>>>>>>
>>>>>>>> event rate
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> and
>>>>>>>>     >
>>>>>>>>     >     >>>>    the expected timestamp events gets queued for
>>>>>>>> long
>>>>>>>>
>>>>>>>> time, this
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> approach
>>>>>>>>     >
>>>>>>>>     >     >>>>    might jump the gun.
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> I believe the above approach and discussion goes
>>>>>>>> close to
>>>>>>>>
>>>>>>>> the approach
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> A.
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> -----------
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> I like the idea of having an event count based punctuate.
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> -----------
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> I agree with the discussion around approach C,
>>>>>>>> that we
>>>>>>>>
>>>>>>>> should provide
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> the
>>>>>>>>     >
>>>>>>>>     >     >>>> user with the option to choose system time or
>>>>>>>> event time
>>>>>>>>
>>>>>>>> based
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> punctuates.
>>>>>>>>     >
>>>>>>>>     >     >>>> But I believe that the user predominantly wants to
>>>>>>>> use
>>>>>>>>
>>>>>>>> event time while
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> not
>>>>>>>>     >
>>>>>>>>     >     >>>> missing out on regular punctuates due to event
>>>>>>>> delays or
>>>>>>>>
>>>>>>>> event
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> absences.
>>>>>>>>     >
>>>>>>>>     >     >>>> Hence a complex punctuate option as Matthias
>>>>>>>> mentioned
>>>>>>>>
>>>>>>>> (quoted below)
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> would
>>>>>>>>     >
>>>>>>>>     >     >>>> be most apt.
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> "- We might want to add "complex" schedules later
>>>>>>>> on
>>>>>>>>
>>>>>>>> (like, punctuate
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> on
>>>>>>>>     >
>>>>>>>>     >     >>>> every 10 seconds event-time or 60 seconds system-
>>>>>>>> time
>>>>>>>>
>>>>>>>> whatever comes
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> first)."
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> -----------
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> I think I read somewhere that Kafka Streams
>>>>>>>> started with
>>>>>>>>
>>>>>>>> System Time as
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> the
>>>>>>>>     >
>>>>>>>>     >     >>>> punctuation standard, but was later changed to
>>>>>>>> Event Time.
>>>>>>>>
>>>>>>>> I guess
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> there
>>>>>>>>     >
>>>>>>>>     >     >>>> would be some good reason behind it. As Kafka
>>>>>>>> Streams want
>>>>>>>>
>>>>>>>> to evolve
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> more
>>>>>>>>     >
>>>>>>>>     >     >>>> on the Stream Processing front, I believe the
>>>>>>>> emphasis on
>>>>>>>>
>>>>>>>> event time
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>> would
>>>>>>>>     >
>>>>>>>>     >     >>>> remain quite strong.
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> With Regards,
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> Arun Mathew
>>>>>>>>     >
>>>>>>>>     >     >>>> Yahoo! JAPAN Corporation, Tokyo
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>> On Wed, Apr 5, 2017 at 3:53 AM, Thomas Becker <tobec...@tivo.com> wrote:
>>>>>>>>     >
>>>>>>>>     >     >>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> Yeah I like PunctuationType much better; I just threw Time out there
>>>>>>>>     >
>>>>>>>>     >     >>>>> more as a strawman than an actual suggestion ;) I
>>>>>>>> still
>>>>>>>>
>>>>>>>> think it's
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> worth considering what this buys us over an
>>>>>>>> additional
>>>>>>>>
>>>>>>>> callback. I
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> foresee a number of punctuate implementations
>>>>>>>> following
>>>>>>>>
>>>>>>>> this pattern:
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> public void punctuate(PunctuationType type) {
>>>>>>>>     >     >>>>>     switch (type) {
>>>>>>>>     >     >>>>>         case EVENT_TIME:
>>>>>>>>     >     >>>>>             methodA();
>>>>>>>>     >     >>>>>             break;
>>>>>>>>     >     >>>>>         case SYSTEM_TIME:
>>>>>>>>     >     >>>>>             methodB();
>>>>>>>>     >     >>>>>             break;
>>>>>>>>     >     >>>>>     }
>>>>>>>>     >     >>>>> }
>>>>>>>>     >
>>>>>>>>     >     >>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> I guess one advantage of this approach is we
>>>>>>>> could add
>>>>>>>>
>>>>>>>> additional
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> punctuation types later in a backwards compatible
>>>>>>>> way
>>>>>>>>
>>>>>>>> (like event
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> count
>>>>>>>>     >
>>>>>>>>     >     >>>>> as you mentioned).
>>>>>>>>     >
>>>>>>>>     >     >>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> -Tommy
>>>>>>>>     >
>>>>>>>>     >     >>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>> On Tue, 2017-04-04 at 11:10 -0700, Matthias J.
>>>>>>>> Sax wrote:
>>>>>>>>     >
>>>>>>>>     >     >>>>>> That sounds promising.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>> I am just wondering if `Time` is the best name.
>>>>>>>> Maybe we
>>>>>>>>
>>>>>>>> want to
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> add
>>>>>>>>     >
>>>>>>>>     >     >>>>>> other non-time based punctuations at some point
>>>>>>>> later. I
>>>>>>>>
>>>>>>>> would
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>> suggest
>>>>>>>>     >
>>>>>>>>     >     >>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>> enum PunctuationType {
>>>>>>>>     >     >>>>>>   EVENT_TIME,
>>>>>>>>     >     >>>>>>   SYSTEM_TIME,
>>>>>>>>     >     >>>>>> }
>>>>>>>>     >
>>>>>>>>     >     >>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>> or similar. Just to keep the door open -- it's
>>>>>>>> easier to
>>>>>>>>
>>>>>>>> add new
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>> stuff
>>>>>>>>     >
>>>>>>>>     >     >>>>>> if the name is more generic.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>> -Matthias
>>>>>>>>     >
>>>>>>>>     >     >>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>> On 4/4/17 5:30 AM, Thomas Becker wrote:
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> I agree that the framework providing and
>>>>>>>> managing the
>>>>>>>>
>>>>>>>> notion of
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> stream
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> time is valuable and not something we would
>>>>>>>> want to
>>>>>>>>
>>>>>>>> delegate to
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> the
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> tasks. I'm not entirely convinced that a
>>>>>>>> separate
>>>>>>>>
>>>>>>>> callback
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> (option
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> C)
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> is that messy (it could just be a default
>>>>>>>> method with
>>>>>>>>
>>>>>>>> an empty
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> implementation), but if we wanted a single API
>>>>>>>> to
>>>>>>>>
>>>>>>>> handle both
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> cases,
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> how about something like the following?
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> enum Time {
>>>>>>>>     >     >>>>>>>    STREAM,
>>>>>>>>     >     >>>>>>>    CLOCK
>>>>>>>>     >     >>>>>>> }
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> Then on ProcessorContext:
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> context.schedule(Time time, long interval)  //
>>>>>>>> We could
>>>>>>>>
>>>>>>>> allow
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >> this
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> to
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> be called once for each value of time to mix
>>>>>>>>
>>>>>>>> approaches.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> Then the Processor API becomes:
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> punctuate(Time time) // time here denotes which
>>>>>>>>
>>>>>>>> schedule resulted
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> in
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> this call.
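
To see how the single-callback proposal above might hang together, here is a small self-contained sketch. It mirrors the proposed enum Time and the schedule(Time, long) / punctuate(Time) shapes, but everything else is an assumption made for illustration: PunctuatingProcessor, SketchContext and the advance() driver are hypothetical and are not part of ProcessorContext or any Kafka Streams API.

```java
import java.util.EnumMap;
import java.util.Map;

enum Time { STREAM, CLOCK }

// Hypothetical processor shape: one callback, told which schedule fired.
interface PunctuatingProcessor {
    void punctuate(Time time);
}

// Hypothetical context: at most one schedule per value of Time.
final class SketchContext {
    private final Map<Time, Long> intervals = new EnumMap<>(Time.class);
    private final Map<Time, Long> nextDue = new EnumMap<>(Time.class);

    void schedule(Time time, long interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        if (intervals.containsKey(time)) {
            throw new IllegalStateException("already scheduled for " + time);
        }
        intervals.put(time, interval);
        nextDue.put(time, interval);
    }

    /**
     * Advances one timeline to 'now' and calls punctuate once per elapsed
     * interval on that timeline. Returns the number of calls made.
     */
    int advance(Time time, long now, PunctuatingProcessor processor) {
        if (!nextDue.containsKey(time)) {
            return 0; // nothing scheduled on this timeline
        }
        long due = nextDue.get(time);
        long interval = intervals.get(time);
        int fired = 0;
        while (now >= due) {
            processor.punctuate(time);
            due += interval;
            fired++;
        }
        nextDue.put(time, due);
        return fired;
    }
}
```

A processor would call schedule() once per timeline it cares about and branch on the argument inside punctuate(), which is the switch pattern discussed elsewhere in this thread.
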
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> Thoughts?
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> On Mon, 2017-04-03 at 22:44 -0700, Matthias J.
>>>>>>>> Sax
>>>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> Thanks a lot for the KIP Michal,
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> I was thinking about the four options you proposed in more
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> detail and these are my thoughts:
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> (A) You argue that users can still "punctuate" on event-time
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> via process(), but I am not sure if this is possible. Note
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> that users only get record timestamps via context.timestamp().
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> Thus, users would need to track the time progress per
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> partition (based on the partitions they observe via
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> context.partition()). (This alone puts a huge burden on the
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> user by itself.) However, users are not notified at startup
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> what partitions are assigned, and users are not notified when
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> partitions get revoked. Because this information is not
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> available, it's not possible to "manually advance"
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> stream-time, and thus event-time punctuation within process()
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> seems not to be possible -- or do you see a way to get it
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> done? And even if so, it might still be too clumsy to use.
>>>>>>>>
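
The burden described in (A) can be illustrated with a small sketch of the bookkeeping a user would have to do by hand. ManualStreamTime is a hypothetical helper, not Kafka Streams code, and it assumes that stream-time is derived as the minimum over the per-partition timestamps; the point is only that it can reason about partitions it has already observed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical user-side bookkeeping: track the highest timestamp seen per
// partition and derive a "manual" stream time as the minimum across them.
final class ManualStreamTime {
    private final Map<Integer, Long> maxTimestampPerPartition = new HashMap<>();

    // Would be called from process() with context.partition() and
    // context.timestamp() of the current record.
    void observe(int partition, long timestampMs) {
        maxTimestampPerPartition.merge(partition, timestampMs, Long::max);
    }

    // Returns -1 until at least one record has been observed.
    long streamTime() {
        if (maxTimestampPerPartition.isEmpty()) {
            return -1L;
        }
        long min = Long.MAX_VALUE;
        for (long ts : maxTimestampPerPartition.values()) {
            min = Math.min(min, ts);
        }
        return min;
    }
}
```

If partition 1 is assigned but has not delivered a record yet, this tracker reports the time of partition 0 alone and so runs ahead; it also never forgets a revoked partition. Both gaps follow from the missing assignment and revocation notifications described above.
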
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> (B) This does not allow mixing both approaches, thus limiting
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> what users can do.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> (C) This should give all the flexibility we need. However,
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> just adding one more method seems to be a solution that is
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> too simple (cf my comments below).
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> (D) This might be hard to use. Also, I am not sure how a user
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> could enable system-time and event-time punctuation in
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> parallel.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> Overall, option (C) seems to be the most promising approach to
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> me. Because I also favor a clean API, we might keep the current
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> punctuate() as-is, but deprecate it -- so we can remove it at
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> some later point when people use the "new punctuate API".
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> Couple of follow-up questions:
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >     >>>>>>>> - I am wondering if we should have two callback
>>>>>>>>     >     >>>>>>>>   methods or just one (i.e., a unified one for
>>>>>>>>     >     >>>>>>>>   system- and event-time punctuation, or one for
>>>>>>>>     >     >>>>>>>>   each?).
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >     >>>>>>>> - If we have one, how can the user figure out
>>>>>>>>     >     >>>>>>>>   which condition triggered it?
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >     >>>>>>>> - What would the API look like for registering
>>>>>>>>     >     >>>>>>>>   different punctuate schedules? The "type" must
>>>>>>>>     >     >>>>>>>>   be defined somehow?
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >     >>>>>>>> - We might want to add "complex" schedules later
>>>>>>>>     >     >>>>>>>>   on (like, punctuate every 10 seconds event-time
>>>>>>>>     >     >>>>>>>>   or 60 seconds system-time, whatever comes
>>>>>>>>     >     >>>>>>>>   first). I don't say we should add this right
>>>>>>>>     >     >>>>>>>>   away, but we might want to define the API in a
>>>>>>>>     >     >>>>>>>>   way that allows extensions like this later on
>>>>>>>>     >     >>>>>>>>   without redesigning the API (i.e., the API
>>>>>>>>     >     >>>>>>>>   should be designed to be extensible).
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >     >>>>>>>> - Did you ever consider count-based punctuation?
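
To make the questions above concrete, here is a minimal sketch of one possible shape: a single callback that is told which kind of time triggered it, and a registration call that takes the type. Every name here (PunctuationSketch, TimeType, Punctuator, schedule, advance) is hypothetical and chosen for illustration only; this is not the Kafka Streams API. Firing at most once per advance is also just an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a typed punctuation schedule; not the Kafka Streams API. */
final class PunctuationSketch {

    /** Which notion of time drives a schedule. */
    enum TimeType { STREAM_TIME, SYSTEM_TIME }

    /** Single unified callback: the type argument tells the user which condition fired. */
    interface Punctuator {
        void punctuate(TimeType type, long timestampMs);
    }

    /** One registered schedule with its own next-due time. */
    private static final class Schedule {
        final TimeType type;
        final long intervalMs;
        final Punctuator callback;
        long nextDueMs;

        Schedule(TimeType type, long intervalMs, long startMs, Punctuator callback) {
            this.type = type;
            this.intervalMs = intervalMs;
            this.callback = callback;
            this.nextDueMs = startMs + intervalMs;
        }
    }

    private final List<Schedule> schedules = new ArrayList<>();

    /** Register a schedule; the "type" is an explicit argument rather than implied. */
    void schedule(TimeType type, long intervalMs, long startMs, Punctuator callback) {
        schedules.add(new Schedule(type, intervalMs, startMs, callback));
    }

    /** Advance one of the two clocks and fire every due schedule of that type once. */
    void advance(TimeType type, long nowMs) {
        for (Schedule s : schedules) {
            if (s.type == type && nowMs >= s.nextDueMs) {
                s.callback.punctuate(type, nowMs);
                s.nextDueMs = nowMs + s.intervalMs;
            }
        }
    }
}
```

Registering the same callback under both types gets close to the "10 seconds event-time or 60 seconds system-time" idea, although a true "whatever comes first" schedule would need the two timers to reset each other, which this sketch does not attempt.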
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> I understand that you would like to solve a
>>>>>>>>     >     >>>>>>>> simple problem, but we learned from the past
>>>>>>>>     >     >>>>>>>> that just "adding some API" quickly leads to a
>>>>>>>>     >     >>>>>>>> not very well defined API that needs
>>>>>>>>     >     >>>>>>>> time-consuming clean-up later on via other KIPs.
>>>>>>>>     >     >>>>>>>> Thus, I would prefer to get a holistic
>>>>>>>>     >     >>>>>>>> punctuation KIP with this from the beginning to
>>>>>>>>     >     >>>>>>>> avoid a painful redesign later.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> -Matthias
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>> On 4/3/17 11:58 AM, Michal Borowiecki wrote:
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> Thanks Thomas,
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> I'm also wary of changing the existing semantics
>>>>>>>>     >     >>>>>>>>> of punctuate, for backward compatibility
>>>>>>>>     >     >>>>>>>>> reasons, although I like the conceptual
>>>>>>>>     >     >>>>>>>>> simplicity of that option.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> Adding a new method to me feels safer but, in a
>>>>>>>>     >     >>>>>>>>> way, uglier. I added this to the KIP now as
>>>>>>>>     >     >>>>>>>>> option (C).
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> The TimestampExtractor mechanism is actually
>>>>>>>>     >     >>>>>>>>> more flexible, as it allows you to return any
>>>>>>>>     >     >>>>>>>>> value; you're not limited to event time or
>>>>>>>>     >     >>>>>>>>> system time (although I don't see an actual use
>>>>>>>>     >     >>>>>>>>> case where you might need anything other than
>>>>>>>>     >     >>>>>>>>> those two). Hence I also proposed the option to
>>>>>>>>     >     >>>>>>>>> allow users to, effectively, decide what "stream
>>>>>>>>     >     >>>>>>>>> time" is for them given the presence or absence
>>>>>>>>     >     >>>>>>>>> of messages, much like they can decide what msg
>>>>>>>>     >     >>>>>>>>> time means for them using the
>>>>>>>>     >     >>>>>>>>> TimestampExtractor. What do you think about
>>>>>>>>     >     >>>>>>>>> that? This is probably the most flexible but
>>>>>>>>     >     >>>>>>>>> also the most complicated option.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> All comments appreciated.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> Cheers,
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> Michal
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>> On 03/04/17 19:23, Thomas Becker wrote:
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>> Although I fully agree we need a way to trigger
>>>>>>>>     >     >>>>>>>>>> periodic processing that is independent from
>>>>>>>>     >     >>>>>>>>>> whether and when messages arrive, I'm not sure
>>>>>>>>     >     >>>>>>>>>> I like the idea of changing the existing
>>>>>>>>     >     >>>>>>>>>> semantics across the board. What if we added an
>>>>>>>>     >     >>>>>>>>>> additional callback to Processor that can be
>>>>>>>>     >     >>>>>>>>>> scheduled similarly to punctuate() but was
>>>>>>>>     >     >>>>>>>>>> always called at fixed, wall clock based
>>>>>>>>     >     >>>>>>>>>> intervals? This way you wouldn't have to give
>>>>>>>>     >     >>>>>>>>>> up the notion of stream time to be able to do
>>>>>>>>     >     >>>>>>>>>> periodic processing.
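
A rough sketch of the two-callback idea described above, under stated assumptions: the existing punctuate() stays driven by record timestamps, and a second callback is driven by the wall clock alone. The names TwoCallbackSketch, punctuateWallClock, onRecord and onTick are hypothetical and exist only for this illustration; the real Processor interface has no such wall-clock method in this discussion.

```java
/** Hypothetical model of an extra wall-clock callback; not the Kafka Streams API. */
final class TwoCallbackSketch {

    interface Processor {
        /** Existing callback: driven by stream time, i.e. timestamps of arriving records. */
        void punctuate(long streamTimeMs);

        /** Hypothetical additional callback: driven by the wall clock only. */
        void punctuateWallClock(long wallClockMs);
    }

    private final Processor processor;
    private final long intervalMs;
    private long nextStreamDueMs;
    private long nextWallDueMs;

    TwoCallbackSketch(Processor processor, long intervalMs, long startMs) {
        this.processor = processor;
        this.intervalMs = intervalMs;
        this.nextStreamDueMs = startMs + intervalMs;
        this.nextWallDueMs = startMs + intervalMs;
    }

    /** Called when a record arrives; only this path can advance stream time. */
    void onRecord(long recordTimestampMs) {
        if (recordTimestampMs >= nextStreamDueMs) {
            processor.punctuate(recordTimestampMs);
            nextStreamDueMs = recordTimestampMs + intervalMs;
        }
    }

    /** Called from the processing loop whether or not any records arrive. */
    void onTick(long wallClockMs) {
        if (wallClockMs >= nextWallDueMs) {
            processor.punctuateWallClock(wallClockMs);
            nextWallDueMs = wallClockMs + intervalMs;
        }
    }
}
```

With no records arriving, only onTick can fire, so periodic work still happens while stream-time punctuation keeps its current meaning.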
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>> On Mon, 2017-04-03 at 10:34 +0100, Michal Borowiecki wrote:
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>> Hi all,
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>> I have created a draft for KIP-138: Change punctuate semantics
>>>>>>>>     >     >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-138%3A+Change+punctuate+semantics>.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>> Appreciating there can be different views on
>>>>>>>>     >     >>>>>>>>>>> system-time vs event-time semantics for
>>>>>>>>     >     >>>>>>>>>>> punctuation depending on use-case and the
>>>>>>>>     >     >>>>>>>>>>> importance of backwards compatibility of any
>>>>>>>>     >     >>>>>>>>>>> such change, I've left it quite open and hope
>>>>>>>>     >     >>>>>>>>>>> to fill in more info as the discussion
>>>>>>>>     >     >>>>>>>>>>> progresses.
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>> Thanks,
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>>>>> Michal
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> --
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>     Tommy Becker
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>     Senior Software Engineer
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>     O +1 919.460.4747
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>     tivo.com
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> ________________________________
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     >     >>>>>>> This email and any attachments may contain
>>>>>>>>     >     >>>>>>> confidential and privileged material for the sole
>>>>>>>>     >     >>>>>>> use of the intended recipient. Any review,
>>>>>>>>     >     >>>>>>> copying, or distribution of this email (or any
>>>>>>>>     >     >>>>>>> attachments) by others is prohibited. If you are
>>>>>>>>     >     >>>>>>> not the intended recipient, please contact the
>>>>>>>>     >     >>>>>>> sender immediately and permanently delete this
>>>>>>>>     >     >>>>>>> email and any attachments. No employee or agent of
>>>>>>>>     >     >>>>>>> TiVo Inc. is authorized to conclude any binding
>>>>>>>>     >     >>>>>>> agreement on behalf of TiVo Inc. by email. Binding
>>>>>>>>     >     >>>>>>> agreements with TiVo Inc. may only be made by a
>>>>>>>>     >     >>>>>>> signed written agreement.
>>>>>>>>     >
>>>>>>>>     >     >>>>>>>
>>>>>>>>     >
>>>>>>>>     > --
>>>>>>>>     >
>>>>>>>>     > <http://www.openbet.com/>
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     > *Michal Borowiecki*
>>>>>>>>     >
>>>>>>>>     > *Senior Software Engineer L4*
>>>>>>>>     >
>>>>>>>>     > *T: *
>>>>>>>>     >
>>>>>>>>     > +44 208 742 1600
>>>>>>>>     >
>>>>>>>>     > +44 203 249 8448
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > *E: *
>>>>>>>>     >
>>>>>>>>     > michal.borowie...@openbet.com
>>>>>>>>     >
>>>>>>>>     > *W: *
>>>>>>>>     >
>>>>>>>>     > www.openbet.com
>>>>>>>>     >
>>>>>>>>     > *OpenBet Ltd*
>>>>>>>>     >
>>>>>>>>     > Chiswick Park Building 9
>>>>>>>>     >
>>>>>>>>     > 566 Chiswick High Rd
>>>>>>>>     >
>>>>>>>>     > London
>>>>>>>>     >
>>>>>>>>     > W4 5XT
>>>>>>>>     >
>>>>>>>>     > UK
>>>>>>>>     >
>>>>>>>>     > <https://www.openbet.com/email_promo>
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     > This message is confidential and intended only for the
>>>>>>>>     > addressee. If you have received this message in error, please
>>>>>>>>     > immediately notify the postmas...@openbet.com and delete it
>>>>>>>>     > from your system as well as any copies. The content of e-mails
>>>>>>>>     > as well as traffic data may be monitored by OpenBet for
>>>>>>>>     > employment and security purposes. To protect the environment
>>>>>>>>     > please do not print this e-mail unless necessary. OpenBet Ltd.
>>>>>>>>     > Registered Office: Chiswick Park Building 9, 566 Chiswick High
>>>>>>>>     > Road, London, W4 5XT, United Kingdom. A company registered in
>>>>>>>>     > England and Wales. Registered no. 3134634. VAT no. GB927523612
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>     >
>>>>>>>>
>>>>>>>>
>>>>>>>>
