Sure thing :) For the original issue to preserve partition-time within a task across restarts, we don't need this KIP.
And I agree that downstream time propagation via heartbeats is nothing critical atm but may require a big change. Hence, maybe not worth the effort right now. If you are fine with it, we can discard this KIP for now. Maybe we can revisit this idea at a later point when we really need it. Thanks a lot! -Matthias On 5/31/19 4:19 PM, Richard Yu wrote: > Hi Matthias, > > Thanks for responding. :) > > I suppose from the scope of the change that is needed to fix the timestamp > propagation bug is too complex (in other words, probably not worth it). > So should this KIP be closed since it might be a little excessive? > > There probably is no need for too big of a change like this one. > > On Fri, May 31, 2019 at 3:17 PM Matthias J. Sax <matth...@confluent.io> > wrote: > >> Thanks for the KIP. >> >> However, I have some doubts that this would work. >> >> In the example, you mention 4 records >> >>> r1(2), r2(3), r3(7), and r4(9) >> >> and that if r3 and r4 would be filtered, the downstream task would not >> advance partition time from 3 to 9. >> >> However, if r3 and r4 are filtered, no record will be sent downstream at >> all -- hence, there is no record to which partition-time could be >> piggy-bagged onto its header. >> >> When we sent r2, we don't know anything about r3 and r4 yet. And if >> there is an r5 that is not filtered, r5 would advance partition time >> anyway based on its own timestamp, hence no header is required. >> >> Also, I am not a big fan of adding headers in general, as they "leak" >> internal implementation details. >> >> >> Overall, my personal opinion is, that we should change Kafka's message >> format and allow for "heartbeat" messages, that don't carry any data, >> but only a timestamp. By default, those messages would not be exposed to >> an application but would be considered "internal" similar to transaction >> markers. However, changing the message format is a mayor change and >> hence, I am not sure if it worth doing at all atm. >> >> >> -Matthias >> >> >> >> On 5/20/19 7:20 PM, Richard Yu wrote: >>> Hello, >>> >>> I wish to introduce a minor addition present in RecordContext (a public >>> facing API). This addition works to both provide the user with >>> more information regarding the processing state of the partition, but >> also >>> help resolve a bug which Kafka is currently experiencing. >>> Here is the KIP Link: >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-472%3A+%5BSTREAMS%5D+Add+partition+time+field+to+RecordContext >>> >>> Cheers, >>> Richard Yu >>> >> >> >
signature.asc
Description: OpenPGP digital signature