[ https://issues.apache.org/jira/browse/HUDI-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pratyaksh Sharma reassigned HUDI-3264:
--------------------------------------

    Assignee: Pratyaksh Sharma

> Make schema registry configs more flexible with MultiTableDeltaStreamer
> -----------------------------------------------------------------------
>
>                 Key: HUDI-3264
>                 URL: https://issues.apache.org/jira/browse/HUDI-3264
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: deltastreamer
>            Reporter: sivabalan narayanan
>            Assignee: Pratyaksh Sharma
>            Priority: Major
>              Labels: sev:normal
>             Fix For: 0.11.0
>
>
> Ref issue: [https://github.com/apache/hudi/issues/4585]
>
> Hi guys,
> we ran into a problem setting the target schema of our Hudi table using the
> MultiTableDeltaStreamer.
>
> Using a normal DeltaStreamer, we are able to set our source and target
> schemas using the properties:
> * hoodie.deltastreamer.schemaprovider.registry.url
> * hoodie.deltastreamer.schemaprovider.registry.targetUrl
>
> We found that we cannot set these properties on a per-table basis with the
> MultiTableDeltaStreamer, because the MTDS builds the schema registry URLs for
> the source and target schemas from the properties:
> * hoodie.deltastreamer.schemaprovider.registry.baseUrl
> * hoodie.deltastreamer.schemaprovider.registry.sourceUrlSuffix
> * hoodie.deltastreamer.schemaprovider.registry.targetUrlSuffix
>
> The MultiTableDeltaStreamer then also uses the source Kafka topic name to set
> the name of the target schema:
>
> [hudi/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java|https://github.com/apache/hudi/blob/9fe28e56b49c7bf68ae2d83bfe89755314aa793b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java#L167]
> Line 167 in
> [9fe28e5|https://github.com/apache/hudi/commit/9fe28e56b49c7bf68ae2d83bfe89755314aa793b]
>
>     typedProperties.setProperty(Constants.TARGET_SCHEMA_REGISTRY_URL_PROP,
>         schemaRegistryBaseUrl + typedProperties.getString(Constants.KAFKA_TOPIC_PROP)
>         + targetSchemaRegistrySuffix);
>
> We think that schema names should be more configurable, the way the normal
> DeltaStreamer handles them. Currently, the names of the schemas used for
> reading and writing the data are very tightly coupled to the name of the
> Kafka topic the data is loaded from.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
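
The URL construction described in the issue, and one possible shape for the requested per-table override, can be sketched as follows. This is a minimal standalone illustration, not Hudi's implementation: the class `SchemaRegistryUrlSketch`, its methods, and the example registry URLs are hypothetical, and the Kafka topic key `hoodie.deltastreamer.source.kafka.topic` is assumed to be the value behind `Constants.KAFKA_TOPIC_PROP`. The other property names are taken from the issue text.

```java
import java.util.Properties;

// Hypothetical sketch; not part of Hudi.
class SchemaRegistryUrlSketch {
    // Property names quoted in the issue.
    static final String BASE_URL = "hoodie.deltastreamer.schemaprovider.registry.baseUrl";
    static final String SOURCE_SUFFIX = "hoodie.deltastreamer.schemaprovider.registry.sourceUrlSuffix";
    static final String TARGET_SUFFIX = "hoodie.deltastreamer.schemaprovider.registry.targetUrlSuffix";
    static final String SOURCE_URL = "hoodie.deltastreamer.schemaprovider.registry.url";
    static final String TARGET_URL = "hoodie.deltastreamer.schemaprovider.registry.targetUrl";
    // Assumption: the key behind Constants.KAFKA_TOPIC_PROP.
    static final String KAFKA_TOPIC = "hoodie.deltastreamer.source.kafka.topic";

    // Behaviour described in the issue: base URL + Kafka topic + suffix.
    static String derivedUrl(Properties props, String suffixKey) {
        return props.getProperty(BASE_URL)
                + props.getProperty(KAFKA_TOPIC)
                + props.getProperty(suffixKey);
    }

    // Hypothetical override: an explicit per-table URL wins when present,
    // otherwise fall back to the topic-derived URL.
    static String resolveUrl(Properties props, String explicitKey, String suffixKey) {
        String explicit = props.getProperty(explicitKey);
        if (explicit != null && !explicit.isEmpty()) {
            return explicit;
        }
        return derivedUrl(props, suffixKey);
    }

    public static void main(String[] args) {
        Properties tableProps = new Properties();
        tableProps.setProperty(BASE_URL, "http://registry:8081/subjects/");
        tableProps.setProperty(SOURCE_SUFFIX, "-value/versions/latest");
        tableProps.setProperty(TARGET_SUFFIX, "-value/versions/latest");
        tableProps.setProperty(KAFKA_TOPIC, "orders");

        // Without an explicit target URL, the target schema is tied to the topic name.
        System.out.println(resolveUrl(tableProps, TARGET_URL, TARGET_SUFFIX));

        // With an explicit per-table target URL, the coupling is removed.
        tableProps.setProperty(TARGET_URL,
                "http://registry:8081/subjects/orders_hudi-value/versions/latest");
        System.out.println(resolveUrl(tableProps, TARGET_URL, TARGET_SUFFIX));
    }
}
```

Under this sketch, tables that set no explicit URL keep the existing topic-derived behaviour, so the override would be backward compatible.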