[ https://issues.apache.org/jira/browse/KAFKA-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269148#comment-15269148 ]
Ewen Cheslack-Postava commented on KAFKA-3209:
----------------------------------------------
To help clarify [~Skandragon]'s comment a bit, the idea is that the records are
going to be small compared to the headers. This means that the approach we
might normally suggest -- doing the flatMap transformation with an application
or stream processor, storing that data back to Kafka, then using Connect to
store the data to another system -- will have very high overhead.

Whereas most of the message transforms we've discussed so far are either simple
map() or filter() transformations, this is a case where we might want to
generate multiple output messages from a single input message. The API for
supporting this is obviously straightforward -- just support returning a list
of messages from the transformation instead of a single message. However, I
think the main challenge is that message offsets either aren't unique anymore
or we'd need to extend the concept of offset to account for "sub-messages".
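
To make the offset problem concrete, here is a rough sketch of what a
list-returning transformation could look like. Everything in it is
hypothetical: FlatMapSketch, In, Out, split() and the subIndex field are
illustration-only names, not part of the Connect API, and the (offset, subIndex)
pair is just one possible way of extending offsets to cover "sub-messages".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types exist in Kafka Connect.
final class FlatMapSketch {

    /** One input message, identified by its Kafka offset. */
    static final class In {
        final long offset;
        final String value;

        In(long offset, String value) {
            this.offset = offset;
            this.value = value;
        }
    }

    /**
     * One output message. All outputs derived from the same input share its
     * offset, so an assumed subIndex is added to keep them distinguishable.
     */
    static final class Out {
        final long offset;
        final int subIndex;
        final String value;

        Out(long offset, int subIndex, String value) {
            this.offset = offset;
            this.subIndex = subIndex;
            this.value = value;
        }
    }

    /** flatMap-style transform: one comma-separated input becomes N outputs. */
    static List<Out> split(In in) {
        List<Out> out = new ArrayList<>();
        String[] parts = in.value.split(",");
        for (int i = 0; i < parts.length; i++) {
            out.add(new Out(in.offset, i, parts[i]));
        }
        return out;
    }
}
```

Without something like subIndex, every output of split() would carry the same
offset, so a sink that tracks progress or deduplicates by offset alone could not
tell the sub-messages apart.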

> Support single message transforms in Kafka Connect
> --------------------------------------------------
>
> Key: KAFKA-3209
> URL: https://issues.apache.org/jira/browse/KAFKA-3209
> Project: Kafka
> Issue Type: Improvement
> Components: KafkaConnect
> Reporter: Neha Narkhede
>
> Users should be able to perform light transformations on messages between a
> connector and Kafka. This is needed because some transformations must be
> performed before the data hits Kafka (e.g. filtering certain types of events
> or PII filtering). It's also useful for very light, single-message
> modifications that are easier to perform inline with the data import/export.