bvaradar commented on pull request #4910: URL: https://github.com/apache/hudi/pull/4910#issuecomment-1079989342
> @xiarixiaoyao Because this PR is so large, please help me sort out the implementation.
>
> 1. Do the `SerDeHelper.LATESTSCHEMA` attribute of a commit file and the `SAVE_SCHEMA_ACTION` file store the same thing, or can they be converted into each other?
> 2. If `hoodie.schema.evolution.enable` is enabled, will every commit persist `SerDeHelper.LATESTSCHEMA` in the meta file?
> 3. When is the `SAVE_SCHEMA_ACTION` file committed? Only once the schema has changed?
> 4. How do we make a Hudi table from an older version, such as 0.10, compatible with this? If `hoodie.schema.evolution.enable` is enabled on an existing old-version Hudi table, what happens? Or, if we do not intend to make them compatible, how do we reject this?
> 5. Does this PR work when `hoodie.metadata.enable` is enabled?
> 6. Why do we need to separate Spark 3.1 and Spark 3.2? I see a lot of repeated code, so please try to consolidate it if Spark 3.1 and Spark 3.2 really do need to be handled separately.

@xiarixiaoyao: Can you please reply to each of @YannByron's questions here?

One specific question, for an existing table (0.10.1 or prior): `hoodie.metadata.enable` is a writer-side config, and readers (query engines) will only see the evolved schema when it exists as valid schema files. That is my understanding. Is it correct?
