[ 
https://issues.apache.org/jira/browse/HUDI-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pratyaksh Sharma reassigned HUDI-3264:
--------------------------------------

    Assignee: Pratyaksh Sharma

> Make schema registry configs more flexible with MultiTableDeltaStreamer
> -----------------------------------------------------------------------
>
>                 Key: HUDI-3264
>                 URL: https://issues.apache.org/jira/browse/HUDI-3264
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: deltastreamer
>            Reporter: sivabalan narayanan
>            Assignee: Pratyaksh Sharma
>            Priority: Major
>              Labels: sev:normal
>             Fix For: 0.11.0
>
>
> Ref issue: [https://github.com/apache/hudi/issues/4585]
> Hi guys,
> we ran into a problem setting the target schema of our Hudi table using the 
> MultiTableDeltaStreamer.
> Using a normal DeltaStreamer, we are able to set our source and target 
> schemas using the properties:
>  * hoodie.deltastreamer.schemaprovider.registry.url
>  * hoodie.deltastreamer.schemaprovider.registry.targetUrl
> We found that we cannot set these properties on a per-table basis with 
> the MultiTableDeltaStreamer (MTDS), since it builds the schema registry URLs 
> for the source and target schemas from the properties:
>  * hoodie.deltastreamer.schemaprovider.registry.baseUrl
>  * hoodie.deltastreamer.schemaprovider.registry.sourceUrlSuffix
>  * hoodie.deltastreamer.schemaprovider.registry.targetUrlSuffix
> The MultiTableDeltaStreamer then also uses the source Kafka topic name to 
> build the name of the target schema:
>  
> [hudi/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java|https://github.com/apache/hudi/blob/9fe28e56b49c7bf68ae2d83bfe89755314aa793b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java#L167]
> Line 167 in 
> [9fe28e5|https://github.com/apache/hudi/commit/9fe28e56b49c7bf68ae2d83bfe89755314aa793b]
> typedProperties.setProperty(Constants.TARGET_SCHEMA_REGISTRY_URL_PROP,
>     schemaRegistryBaseUrl + typedProperties.getString(Constants.KAFKA_TOPIC_PROP)
>     + targetSchemaRegistrySuffix);
>  
> We think that schema names should be more configurable, as they are in the 
> original DeltaStreamer. Currently, the names of the schemas used for 
> reading or writing the data are very tightly coupled to the name of the 
> Kafka topic the data is loaded from.
>  
>  
>  
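
The coupling described in the issue can be illustrated with a small, self-contained sketch. It is not the Hudi implementation: the class `SchemaRegistryUrlSketch`, the method `deriveUrls`, and the sample base URL, suffix, and topic values are assumptions made for illustration. Only the property keys and the `baseUrl + topic + suffix` concatenation come from the issue text.

```java
import java.util.Properties;

// Illustrative sketch only; class and method names are hypothetical.
final class SchemaRegistryUrlSketch {

    // Property keys quoted in the issue.
    static final String BASE_URL = "hoodie.deltastreamer.schemaprovider.registry.baseUrl";
    static final String SOURCE_SUFFIX = "hoodie.deltastreamer.schemaprovider.registry.sourceUrlSuffix";
    static final String TARGET_SUFFIX = "hoodie.deltastreamer.schemaprovider.registry.targetUrlSuffix";
    static final String SOURCE_URL = "hoodie.deltastreamer.schemaprovider.registry.url";
    static final String TARGET_URL = "hoodie.deltastreamer.schemaprovider.registry.targetUrl";

    // Mimics the concatenation shown in the issue: base URL + Kafka topic + suffix.
    // Any URL already present in tableProps is overwritten, which is the
    // inflexibility the reporter describes.
    static Properties deriveUrls(Properties tableProps, String kafkaTopic) {
        String baseUrl = tableProps.getProperty(BASE_URL);
        Properties out = new Properties();
        out.putAll(tableProps);
        out.setProperty(SOURCE_URL, baseUrl + kafkaTopic + tableProps.getProperty(SOURCE_SUFFIX));
        out.setProperty(TARGET_URL, baseUrl + kafkaTopic + tableProps.getProperty(TARGET_SUFFIX));
        return out;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Sample values below are assumptions, not taken from the issue.
        props.setProperty(BASE_URL, "http://localhost:8081/subjects/");
        props.setProperty(SOURCE_SUFFIX, "-value/versions/latest");
        props.setProperty(TARGET_SUFFIX, "-value/versions/latest");
        // A per-table target URL that the user would like to keep.
        props.setProperty(TARGET_URL, "http://localhost:8081/subjects/custom-target/versions/latest");

        Properties derived = deriveUrls(props, "orders");
        System.out.println(derived.getProperty(SOURCE_URL));
        System.out.println(derived.getProperty(TARGET_URL));
    }
}
```

In this sketch the explicitly configured target URL is replaced by one derived from the topic name, so the source and target schema names cannot differ from the topic-based pattern.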



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
