Hi Ewen,

On the JMS point: the predefined/standardised JMS and JMSX headers have predefined types, so these can be serialised/deserialised accordingly.
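A rough sketch of how typed headers could be handled with a key-to-type mapping of the kind floated in this message. It is illustrative only: the class name HeaderTypeMapping, the "key=type" property format, the byte encodings (UTF-8 strings, big-endian longs, one byte per boolean) and the fallback to String are all assumptions, not part of KIP-145 or the Connect API.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper; not an existing Kafka Connect class. */
final class HeaderTypeMapping {
    private final Map<String, String> typesByKey = new HashMap<>();

    /** Parses "headerKey=type" lines, e.g. "AwesomeHeader2=long". */
    static HeaderTypeMapping parse(String config) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        HeaderTypeMapping mapping = new HeaderTypeMapping();
        for (String key : props.stringPropertyNames()) {
            mapping.typesByKey.put(key, props.getProperty(key).trim());
        }
        return mapping;
    }

    /** Undeclared headers fall back to String (assumption; byte[] is the other option). */
    String typeOf(String key) {
        return typesByKey.getOrDefault(key, "String");
    }

    /** Encodes a header value according to the type declared for its key. */
    byte[] serialize(String key, Object value) {
        switch (typeOf(key)) {
            case "boolean":
                boolean flag = (Boolean) value;
                return new byte[] {(byte) (flag ? 1 : 0)};
            case "long":
                return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
            case "byte[]":
                return (byte[]) value;
            case "String":
                return String.valueOf(value).getBytes(StandardCharsets.UTF_8);
            default:
                throw new IllegalArgumentException("Unsupported type for header " + key);
        }
    }

    /** Decodes header bytes back into the type declared for the key. */
    Object deserialize(String key, byte[] bytes) {
        switch (typeOf(key)) {
            case "boolean":
                return bytes[0] != 0;
            case "long":
                return ByteBuffer.wrap(bytes).getLong();
            case "byte[]":
                return bytes;
            case "String":
                return new String(bytes, StandardCharsets.UTF_8);
            default:
                throw new IllegalArgumentException("Unsupported type for header " + key);
        }
    }

    public static void main(String[] args) {
        HeaderTypeMapping mapping = HeaderTypeMapping.parse(
            "AwesomeHeader1=boolean\nAwesomeHeader2=long\nJMSCorrelationId=String\n");
        byte[] raw = mapping.serialize("AwesomeHeader2", 42L);
        System.out.println(mapping.deserialize("AwesomeHeader2", raw));
    }
}
```

A real default converter would presumably read the mapping from the worker or connector config rather than a string, and would need the remaining primitives; this only shows the lookup-then-convert shape.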
Agreed, custom JMS headers could be a bit more difficult, but on the 80/20 rule I would agree they're mostly string values, and since you can hold bytes as a string anyway, defaulting to that wouldn't cause any issue.

But I think we may easily be able to do one better. Obviously users can override/configure the headers converter, but could we supply a default converter that takes a config file with a key-to-type mapping? That would allow people to define/declare a header key with its expected type in some property file, supporting String, byte[] and primitives. Undefined headers would just default to either String or byte[].

We could also predefine known headers like the JMS ones mentioned above.

E.g.

AwesomeHeader1=boolean
AwesomeHeader2=long
JMSCorrelationId=String
JMSXGroupId=String

What do you think?

Cheers
Mike

Sent from my iPhone

> On 2 May 2017, at 18:45, Ewen Cheslack-Postava <e...@confluent.io> wrote:
>
> A couple of thoughts:
>
> First, agreed that we definitely want to expose header functionality. Thank
> you Mike for starting the conversation! Even if Connect doesn't do anything
> special with it, there's value in being able to access/set headers.
>
> On motivation -- I think there are much broader use cases. When thinking
> about exposing headers, I'd actually use Replicator as only a minor
> supporting case. The reason is that it is a very uncommon case where there
> is zero impedance mismatch between the source and sink of the data since
> they are both Kafka. This means you don't need to think much about data
> formats/serialization. I think the JMS use case is a better example since
> JMS headers and Kafka headers don't quite match up. Here's a quick list of
> use cases I can think of off the top of my head:
>
> 1. Include headers from other systems that support them: JMS (or really any
> MQ), HTTP
> 2. Other connector-specific headers.
> For example, from JDBC maybe the table
> the data comes from is a header; for a CDC connector you might include the
> binlog offset as a header.
> 3. Interceptor/SMT-style use cases for annotating things like provenance of
> data:
> 3a. Generically w/ user-supplied data like data center, host, app ID, etc.
> 3b. Kafka Connect framework level info, such as the connector/task
> generating the data
>
> On deviation from Connect's model -- to be honest, the KIP-82 also deviates
> quite substantially from how Kafka handles data already, so we may struggle
> a bit to rectify the two. (In particular, headers specify some structure
> and enforce strings specifically for header keys, but then require you to
> do serialization of header values yourself...).
>
> I think the use cases I mentioned above may also need different approaches
> to how the data in headers are handled. As Gwen mentions, if we expose the
> headers to Connectors, they need to have some idea of the format and the
> reason for byte[] values in KIP-82 is to leave that decision up to the
> organization using them. But without knowing the format, connectors can't
> really do anything with them -- if a source connector assumes a format,
> they may generate data incompatible with the format used by the rest of the
> organization. On the other hand, I have a feeling most people will just use
> <String, String> headers, so allowing connectors to embed arbitrarily
> complex data may not work out well in practice. Or maybe we leave it
> flexible, most people default to using StringConverter for the serializer
> and Connectors will end up defaulting to that just for compatibility...
>
> I'm not sure I have a real proposal yet, but I do think understanding the
> impact of using a Converter for headers would be useful, and we might want
> to think about how this KIP would fit in with transformations (or if that
> is something that can be deferred, handled separately from the existing
> transformations, etc).
>
> -Ewen
>
> On Mon, May 1, 2017 at 11:52 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
>> Hi Gwen,
>>
>> The intent here was to allow tools that perform a similar role to
>> MirrorMaker, replicating messages from one cluster to another. E.g. like
>> MirrorMaker, they should just be taking and transferring the headers as is.
>>
>> We don't actually use this inside our company, so not exposing this isn't
>> an issue for us. We just believe there are companies like Confluent who have
>> tools like Replicator that do.
>>
>> And as good citizens we think we should complete the work and expose the
>> headers, same as in the record, to at least allow them to replicate the
>> messages as is. Note Steph seems to want it.
>>
>> Cheers
>> Mike
>>
>> Sent using OWA for iPhone
>> ________________________________________
>> From: Gwen Shapira <g...@confluent.io>
>> Sent: Monday, May 1, 2017 2:36:34 PM
>> To: dev@kafka.apache.org
>> Subject: Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect
>>
>> Hi,
>>
>> I'm excited to see the community expanding Connect in this direction!
>> Headers + Transforms == Fun message routing.
>>
>> I like how clean the proposal is, but I'm concerned that it kinda deviates
>> from how Connect handles data elsewhere.
>> Unlike Kafka, Connect doesn't look at all data as byte-arrays; we have
>> converters that take data in specific formats (JSON, Avro) and turn it
>> into Connect data types (defined in the data API). I think it will be more
>> consistent for connector developers to also get headers as some kind of
>> structured or semi-structured data (and to expand the converters to handle
>> header conversions as well).
>> This will allow for Connect's separation of concerns - Connector developers
>> don't worry about data formats (because they get the internal Connect
>> objects) and Converters do all the data format work.
>>
>> Another thing, in my experience, APIs work better if they are put into use
>> almost immediately - so difficulties in using the APIs are immediately
>> surfaced. Are you planning any connectors that will use this feature (not
>> necessarily in Kafka, just in general)? Or perhaps we can think of a way to
>> expand Kafka's file connectors so they'll use headers somehow (can't think
>> of anything, but maybe?).
>>
>> Gwen
>>
>> On Sat, Apr 29, 2017 at 12:12 AM, Michael Pearce <michael.pea...@ig.com>
>> wrote:
>>
>>> Hi All,
>>>
>>> Now that KIP-82 is committed I would like to discuss extending the work to
>>> expose it in Kafka Connect, its primary focus being that connectors which
>>> may do similar tasks to MirrorMaker, either Kafka->Kafka or JMS->Kafka,
>>> would be able to replicate the headers.
>>> It would be ideal but not mandatory for this to go in the 0.11 release so
>>> it is available on day one of headers being available.
>>>
>>> Please find the KIP here:
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect
>>>
>>> Please find an initial implementation as a PR here:
>>> https://github.com/apache/kafka/pull/2942
>>>
>>> Kind Regards
>>> Mike
>>> The information contained in this email is strictly confidential and for
>>> the use of the addressee only, unless otherwise indicated. If you are not
>>> the intended recipient, please do not read, copy, use or disclose to others
>>> this message or any attachment. Please also notify the sender by replying
>>> to this email or by telephone (+44(020 7896 0011) and then delete the email
>>> and any copies of it. Opinions, conclusion (etc) that do not relate to the
>>> official business of this company shall be understood as neither given nor
>>> endorsed by it.
>>> IG is a trading name of IG Markets Limited (a company
>>> registered in England and Wales, company number 04008957) and IG Index
>>> Limited (a company registered in England and Wales, company number
>>> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>>> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>>> Index Limited (register number 114059) are authorised and regulated by the
>>> Financial Conduct Authority.
>>>
>>
>> --
>> *Gwen Shapira*
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
>> <http://www.confluent.io/blog>