[ https://issues.apache.org/jira/browse/HUDI-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498756#comment-17498756 ]
Xianghu Wang commented on HUDI-3525:
------------------------------------

Hi [~shivnarayan], [~xushiyan], any ideas about this?

> Introduce JsonkafkaSourceProcessor to support data preprocess before it is
> transformed to DataSet
> -------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-3525
>                 URL: https://issues.apache.org/jira/browse/HUDI-3525
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: deltastreamer
>            Reporter: Xianghu Wang
>            Assignee: Xianghu Wang
>            Priority: Major
>
> Currently we have `Transform` to transform the source dataset into the target
> dataset before writing, but it operates on a DataSet.
> In some scenarios, our Kafka data is not in the format we need, for example
> binlog JSON.
> We need a way to extract/prepare the data we need from the original data
> before converting it into a DataSet.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
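
To illustrate the kind of preprocessing the issue asks for, here is a minimal sketch in plain Python. It is not Hudi code and not the proposed JsonkafkaSourceProcessor. It assumes a binlog-style JSON envelope whose row payloads sit under a "data" field (the field name and envelope layout are assumptions, as are the helper names extract_rows and preprocess), and it unwraps each raw Kafka message into plain row JSON strings before any DataSet would be built.

```python
import json
from typing import Iterable, Iterator, List


def extract_rows(raw_message: str, data_field: str = "data") -> List[str]:
    """Unwrap one raw message into zero or more row-level JSON strings.

    Assumption: the envelope is a JSON object and the rows live under
    `data_field`, either as a single object or as a list of objects.
    """
    try:
        envelope = json.loads(raw_message)
    except json.JSONDecodeError:
        # Malformed message: skip it rather than fail the whole batch.
        return []
    if not isinstance(envelope, dict):
        return []

    payload = envelope.get(data_field)
    if isinstance(payload, dict):
        payload = [payload]
    if not isinstance(payload, list):
        return []

    # Keep only object rows; serialize with sorted keys for stable output.
    return [json.dumps(row, sort_keys=True) for row in payload if isinstance(row, dict)]


def preprocess(messages: Iterable[str]) -> Iterator[str]:
    """Apply extract_rows to every message and flatten the results."""
    for message in messages:
        yield from extract_rows(message)
```

The point of the sketch is the ordering: the envelope is stripped while the records are still raw strings, so whatever builds the DataSet afterwards only sees the row payloads.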