prashantwason commented on issue #2331:
URL: https://github.com/apache/hudi/issues/2331#issuecomment-748228679


   That's correct.
   
   HUDI does not have a full schema management system. The schema is supplied 
at write time, and we validate that the schema used for the current write is 
compatible with the existing schema (from previous writes). Hence, HUDI's 
schema management is very simplistic compared to the documentation you 
referred to.
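
   As a rough illustration of the write-time validation described above, here 
is a minimal sketch. It is not HUDI's actual code: the function name 
`is_compatible`, the dict-based schema representation, and the exact rules are 
all assumptions made for the example. The real rules may be more permissive 
(for instance, they may allow certain type promotions).

```python
# Illustrative sketch only; not HUDI's actual validation logic.
# A schema is modelled as a dict: field name -> (type name, has_default).

def is_compatible(existing, incoming):
    """Return True if `incoming` can safely follow `existing` on a table."""
    for name, (ftype, _) in existing.items():
        if name not in incoming:
            return False  # dropping a field would break reads of older data
        if incoming[name][0] != ftype:
            return False  # changing a field's type is rejected in this sketch
    for name, (_, has_default) in incoming.items():
        if name not in existing and not has_default:
            return False  # a new field needs a default for older records
    return True


existing = {"id": ("long", False), "name": ("string", False)}
with_optional_email = {**existing, "email": ("string", True)}
without_name = {"id": ("long", False)}

print(is_compatible(existing, with_optional_email))
print(is_compatible(existing, without_name))
```

   Under these assumed rules, adding an optional field is accepted, while 
dropping or retyping an existing field is rejected.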
   
   In producer-consumer systems, schema compatibility is a simpler job: once 
the producer and consumer code are upgraded to a newer schema, all new data is 
generated with a schema both sides understand, and no historical data with an 
older schema version remains to be processed. Within HUDI, however, there are 
always versions of data saved with an older schema. To keep providing features 
like incremental read (which reads data over a time range) and updates (old 
data can be changed), we have to restrict schema modification.
   

