[
https://issues.apache.org/jira/browse/HUDI-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101998#comment-17101998
]
Pratyaksh Sharma edited comment on HUDI-603 at 5/7/20, 7:44 PM:
----------------------------------------------------------------
[~yx3zhu] I am a bit skeptical about restarting the program itself. IMHO our
solution should also be able to handle the worst case (frequent table schema
updates, in this instance). Also, how would you detect that the provider schema
has changed?
was (Author: pratyaksh):
[~yx3zhu] I am a bit skeptical about restarting the program itself. Also how
would you deduce the provider schema has changed?
> HoodieDeltaStreamer should periodically fetch table schema update
> -----------------------------------------------------------------
>
> Key: HUDI-603
> URL: https://issues.apache.org/jira/browse/HUDI-603
> Project: Apache Hudi (incubating)
> Issue Type: Bug
> Components: DeltaStreamer
> Reporter: Yixue Zhu
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: evolution, pull-request-available, schema
>
> HoodieDeltaStreamer creates a SchemaProvider instance and delegates to
> DeltaSync for periodic sync. However, the default implementation of
> SchemaProvider does not refresh the schema, which can change due to schema
> evolution. DeltaSync snapshots the schema when it creates the writeClient,
> using the SchemaProvider instance or picking it up from the source, and the
> writeClient's schema is not refreshed during the sync loop.
> I think this needs to be addressed to support schema evolution fully.
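
To make the behaviour under discussion concrete, here is a minimal sketch of a sync loop that re-fetches the schema on every round and rebuilds its writer only when the schema has actually changed, which avoids restarting the whole program. It is an illustration under stated assumptions, not Hudi code: RefreshingSchemaHolder, SyncLoopSketch, and runSyncRounds are hypothetical names, the schema is modelled as a plain string, and change detection is a simple equality check (a real implementation might compare parsed schema objects instead of raw text).

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names are Hudi APIs.
class RefreshingSchemaHolder {
    // Assumed source of the latest schema text, e.g. a schema registry lookup.
    private final Supplier<String> schemaFetcher;
    private String currentSchema;

    RefreshingSchemaHolder(Supplier<String> schemaFetcher) {
        this.schemaFetcher = schemaFetcher;
        this.currentSchema = schemaFetcher.get(); // initial snapshot
    }

    /** Re-fetches the schema; returns true if it differs from the cached one. */
    boolean refresh() {
        String latest = schemaFetcher.get();
        if (Objects.equals(latest, currentSchema)) {
            return false;
        }
        currentSchema = latest;
        return true;
    }

    String currentSchema() {
        return currentSchema;
    }
}

class SyncLoopSketch {
    /** Runs the given number of sync rounds; returns how often the writer was rebuilt. */
    static int runSyncRounds(RefreshingSchemaHolder holder, int rounds) {
        int rebuilds = 0;
        for (int i = 0; i < rounds; i++) {
            if (holder.refresh()) {
                // A real job would recreate its write client with the new schema here.
                rebuilds++;
            }
            // A real job would now read a batch and write it using holder.currentSchema().
        }
        return rebuilds;
    }
}
```

With this shape, frequent schema updates cost one extra fetch per round plus a writer rebuild only on the rounds where the schema changed.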
--
This message was sent by Atlassian Jira
(v8.3.4#803005)