Copycat enables streaming data in and out of Kafka. Connector writers need to define the serde for the data, since it differs per system. Metadata should be entirely hidden by the Copycat framework; users and connector implementors don't need to serialize it differently, as long as we provide tools/REST APIs to access the metadata where required. Moreover, as you suggest, evolution, maintenance and configs are much simpler if it remains hidden.
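To make the split concrete, here is a rough sketch of what I have in mind. It is not Copycat code: the `data.serializer` config key, the function names, and the JSON envelope are all made up for illustration. Data goes through whichever serializer the user configures, while offsets always go through one fixed, versioned format that the framework owns.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry of user-selectable data serializers (names are illustrative).
DATA_SERIALIZERS: Dict[str, Callable[[Any], bytes]] = {
    "json": lambda value: json.dumps(value, sort_keys=True).encode("utf-8"),
    "string": lambda value: str(value).encode("utf-8"),
}

# Owned by the framework and deliberately not configurable (assumed value).
METADATA_FORMAT_VERSION = 1


def serialize_data(value: Any, config: Dict[str, str]) -> bytes:
    """Serialize record data with the serializer the user picked in config."""
    name = config.get("data.serializer", "json")  # hypothetical config key
    return DATA_SERIALIZERS[name](value)


def serialize_offset(connector: str, offset: Dict[str, Any]) -> bytes:
    """Serialize source offsets in a single fixed, versioned internal format."""
    envelope = {
        "version": METADATA_FORMAT_VERSION,
        "connector": connector,
        "offset": offset,
    }
    return json.dumps(envelope, sort_keys=True).encode("utf-8")


def read_offset(raw: bytes) -> Dict[str, Any]:
    """Decode stored offsets; tooling can rely on this because the format is fixed."""
    envelope = json.loads(raw.decode("utf-8"))
    if envelope["version"] != METADATA_FORMAT_VERSION:
        raise ValueError(f"unsupported metadata version: {envelope['version']}")
    return envelope["offset"]
```

Because only the framework writes the envelope, a monitoring or audit tool can call something like `read_offset` without knowing which data serializer a given user chose, and the version field leaves room to evolve the format later.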
+1 on keeping just the serializers for data configurable.

On Thu, Aug 13, 2015 at 9:59 PM, Gwen Shapira <g...@confluent.io> wrote:

> Hi Team Kafka,
>
> As you know from KIP-26 and PR-99, when you will use Copycat to move data
> from an external system to Kafka, in addition to storing the data itself,
> Copycat will also need to store some metadata.
>
> This metadata is currently offsets on the source system (say, SCN# from
> Oracle redo log), but I can imagine to store a bit more.
>
> When storing data, we obviously want pluggable serializers, so users will
> get the data in a format they like.
>
> But the metadata seems internal. i.e users shouldn't care about it and if
> we want them to read or change anything, we want to provide them with tools
> to do it.
>
> Moreover, by controlling the format we can do three important things:
> * Read the metadata for monitoring / audit purposes
> * Evolve / modify it. If users serialize it in their own format, and
> actually write clients to use this metadata, we don't know if its safe to
> evolve.
> * Keep configuration a bit simpler. This adds at least 4 new configuration
> items...
>
> What do you guys think?
>
> Gwen
>

--
Thanks,
Neha