Hi Richard,

Thanks for picking this up! I know of at least one large community member for which this feature is absolutely essential.

If I understand your two options, it seems like the proposal is to implement it as a behavior change regardless, and the question is whether to provide an opt-out config or not.

Given that any implementation of this feature would have some performance impact under some workloads, and also that we don't know if anyone really depends on emit-on-update time semantics, it seems like we should propose to add an opt-out config. Can you update the KIP to mention the exact config key and value(s) you'd propose? Just to move the discussion forward, maybe something like:
  emit.on := change|update
with the new default being "change"

Thanks for pointing out the timestamp issue in particular. I agree that if we discard the latter update as a no-op, then we also have to discard its timestamp (obviously, we don't forward the timestamp update, as that's the whole point, but we also can't update the timestamp in the store, as the store must remain consistent with what has been emitted).

I have to confess that I disagree with your implementation proposal, but it's also not necessary to discuss implementation in the KIP. Maybe it would be less controversial if you just drop that section for now, so that the KIP discussion can focus on the behavior change and config.

Just for reference, there is some research into this domain. For example, see the "Report" section (3.2.3) of the SECRET paper:
http://people.csail.mit.edu/tatbul/publications/maxstream_vldb10.pdf

It might help to round out the proposal if you take a brief survey of the behaviors of other systems, along with pros and cons if any are reported.

Thanks,
-John

On Fri, Jan 10, 2020, at 22:27, Richard Yu wrote:
> Hi everybody!
>
> I'd like to propose a change that we probably should've added for a long
> time now.
>
> The key benefit of this KIP would be reduced traffic in Kafka Streams since
> a lot of no-op results would no longer be sent downstream.
> Here is the KIP for reference.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-557%3A+Add+emit+on+change+support+for+Kafka+Streams
>
> Currently, I seek to formalize our approach for this KIP first before we
> determine concrete API additions / configurations.
> Some configs might warrant adding, while others are not necessary since
> adding them would only increase complexity of Kafka Streams.
>
> Cheers,
> Richard