Hi Eric,

Thank you for submitting this improvement suggestion.

Do you mind clarifying the use case for me? Looking at your gist:
https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46

If my consumer started reading all the CDC topics from the very beginning, from the point at which they were created, and never stopped, it is guaranteed to see every single consistent state of the database.

If my consumer joined late (let's say after Tq got clobbered by Tr), it will get a mixed state, but if it continues listening on those topics, always following the logs to their end, it is guaranteed to see a consistent state as soon as a new transaction commits. Am I missing anything?

Basically, I do not understand why you claim:

"However, to recover all the tables at the same checkpoint, with each independently compacting, one may need to move to an even more recent checkpoint when a different table had the same read issue with the new checkpoint. Thus one could never be assured of this process terminating."

I mean, it is true that you need to continuously read forward in order to get to a consistent state, but why can't you be assured of getting there?

We are doing something very similar in KafkaConnect, where we need a consistent view of our configuration. We make sure that if the current state is inconsistent (i.e., there is data that is not "committed" yet), we continue reading to the log end until we get to a consistent state.

I am not convinced the new functionality is necessary, or even helpful.

Gwen

On Mon, May 16, 2016 at 4:07 PM, Eric Wasserman <eric.wasser...@gmail.com> wrote:
> I would like to begin discussion on KIP-58
>
> The KIP is here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable
>
> Jira: https://issues.apache.org/jira/browse/KAFKA-1981
>
> Pull Request: https://github.com/apache/kafka/pull/1168
>
> Thanks,
>
> Eric
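
For illustration only, the "keep reading to the log end until the state is consistent" idea from the reply above can be sketched with a toy in-memory model. This is not Kafka Connect's actual code and it uses no Kafka client API. The `COMMIT` marker, the `View` class, and `read_to_end` are all hypothetical names, and the log is assumed to be a plain Python list standing in for a single topic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical marker key: a record with this key "commits" everything before it.
COMMIT = "__commit__"
Record = Tuple[str, str]  # (key, value)


@dataclass
class View:
    """Materialized state built by replaying the log."""
    committed: Dict[str, str] = field(default_factory=dict)
    pending: Dict[str, str] = field(default_factory=dict)

    def apply(self, record: Record) -> None:
        key, value = record
        if key == COMMIT:
            # Promote all pending writes to the committed state.
            self.committed.update(self.pending)
            self.pending.clear()
        else:
            self.pending[key] = value

    @property
    def consistent(self) -> bool:
        # Consistent means no uncommitted data has been read.
        return not self.pending


def read_to_end(log: List[Record], view: View, offset: int) -> int:
    """Apply every record from `offset` to the current log end; return the new offset."""
    end = len(log)
    for record in log[offset:end]:
        view.apply(record)
    return end


# Two writes have been appended, but no commit marker yet.
log: List[Record] = [("a", "1"), ("b", "2")]
view = View()
offset = read_to_end(log, view, 0)
first_pass_consistent = view.consistent   # uncommitted data is pending

# The writer commits; reading forward again reaches a consistent state.
log.append((COMMIT, ""))
offset = read_to_end(log, view, offset)
second_pass_consistent = view.consistent
```

In this toy model the reader never rewinds: it only resumes from its last offset, so each pass does a bounded amount of work and the view becomes consistent as soon as the next commit marker is read.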