Rather than leave this thread so open-ended, perhaps I can narrow it down to what I think is the best approach. These accumulations are really just additional information from the source that doesn’t get written to the normal topics. Instead, each change to the accumulated state can be emitted as a source record on a dedicated topic. That is very straightforward with the existing Kafka Connect.
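To make the idea concrete, here is a rough sketch in plain Java of the record-then-replay pattern I have in mind. It does not use any Kafka or Kafka Connect API: an in-memory list stands in for the dedicated topic, and every name in it (`SchemaHistory`, `Change`, `record`, `recover`) is made up for illustration. It also assumes changes are appended in non-decreasing source-offset order, as they would be when reading a transaction log sequentially.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: an append-only history of schema changes, tagged by source offset. */
final class SchemaHistory {

    /** One change to the accumulated state and the log offset at which it took effect. */
    static final class Change {
        final long sourceOffset;
        final String table;
        final List<String> columns; // null means the table was dropped

        Change(long sourceOffset, String table, List<String> columns) {
            this.sourceOffset = sourceOffset;
            this.table = table;
            this.columns = columns == null
                    ? null
                    : Collections.unmodifiableList(new ArrayList<>(columns));
        }
    }

    // Stand-in for the dedicated topic: an ordered, append-only log of changes.
    private final List<Change> log = new ArrayList<>();

    /** Called by the running task each time it sees a schema change in the source log. */
    void record(long sourceOffset, String table, List<String> columns) {
        log.add(new Change(sourceOffset, table, columns));
    }

    /**
     * Called on startup: replays every change at or before restartOffset to
     * rebuild the schema state the previous task instance had accumulated.
     */
    Map<String, List<String>> recover(long restartOffset) {
        Map<String, List<String>> tables = new TreeMap<>();
        for (Change c : log) {
            if (c.sourceOffset > restartOffset) {
                break; // assumes the log is ordered by source offset
            }
            if (c.columns == null) {
                tables.remove(c.table);
            } else {
                tables.put(c.table, c.columns);
            }
        }
        return tables;
    }
}
```

A restarted task would call `recover(offset)` with the source offset it is resuming from, and only then start converting row changes again. In the real connector the `record` side would be ordinary source records written to the dedicated topic; the `recover` side is the part I don't yet know how to do cleanly.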
The challenge I’m struggling with is how a task can/should, upon startup, *consume* that stream to rebuild its state. I can set up my own Kafka consumer for that topic, but IIUC my connector config would then have to include much of the same information that is already in the Kafka Connect worker’s configuration. Am I just missing how a connector can see the worker configuration properties? Or is there a way that Kafka Connect can help me create a Kafka consumer?

Best regards,

Randall Hauch

On January 28, 2016 at 12:11:07 AM, Randall Hauch (rha...@gmail.com) wrote:

I’m creating a custom Kafka Connect source connector, and I’m running into a situation for which Kafka Connect doesn’t seem to provide a solution out of the box. I thought I’d first post to the users list in case I’m just missing a feature that’s already there.

My connector’s SourceTask implementation is reading a relational database transaction log. That log contains schema changes and row changes, and the row changes include a reference to the table and the row values. Thus, as the task processes the log, it has to use any schema changes in the log to adjust how it converts subsequent row changes into Kafka source records. Should the task stop and be restarted elsewhere, the new task instance can continue reading the transaction log where the earlier one left off only if it can recover the schema state that the earlier task accumulated.

While I certainly can use a custom solution to store this state somewhere, it seems like other connectors might benefit from having Kafka Connect include something out of the box. And this accumulated state (and its history with respect to the source offset at which the state changes) seems like a perfect fit for storing in a Kafka topic.

Does Kafka Connect already have a mechanism for tasks to store and recover arbitrary state? If not, then is there interest in adding this capability to Kafka Connect? (If there is interest, then perhaps the dev list is a better venue.)

Best regards,

Randall Hauch