[ 
https://issues.apache.org/jira/browse/HUDI-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2839:
-----------------------------
    Component/s: Spark Integration

> Align configs across Spark datasource, write client, etc
> --------------------------------------------------------
>
>                 Key: HUDI-2839
>                 URL: https://issues.apache.org/jira/browse/HUDI-2839
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: configs, Spark Integration
>            Reporter: Ethan Guo
>            Priority: Major
>             Fix For: 0.11.0
>
>
> This arose while discussing HUDI-2818.  For the same logic, such as key 
> generation, compaction, and clustering, there are different configs in the 
> Spark datasource and the write client, and they may conflict.  This can 
> cause unexpected behavior on the write path.
>  
> Raymond: I encountered this NPE when trying to run 0.10 over a 0.8 table: 
> https://issues.apache.org/jira/browse/HUDI-2818.
> To align configs, do you think we should auto-set 
> {{hoodie.table.keygenerator.class}} when the user sets 
> {{hoodie.datasource.write.keygenerator.class}}, and also the other way around?
> Siva: I guess in the regular write path (HoodieSparkSqlWriter), this is what 
> happens, i.e., the user sets only 
> {{hoodie.datasource.write.keygenerator.class}}, but internally we set 
> {{hoodie.table.keygenerator.class}} from the datasource write config.
> Vinoth: {{HoodieConfig}} has an alternatives/fallback mechanism.  Something 
> to consider, but overall we should fix these.
> Ethan: When working on compaction/clustering, I also see different configs 
> around the same logic between the Spark datasource and the write client.  
> Maybe we can take a pass over all the configs later and make them consistent.
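
The alternatives/fallback idea mentioned above can be sketched roughly as follows. This is a minimal illustration only; the class and method names ({{ConfigFallback}}, {{resolve}}) are hypothetical and do not reflect Hudi's actual {{HoodieConfig}} API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of alternative-key fallback resolution: if the
// primary key is unset, the first set alternative key is used instead,
// so either spelling of a config works on the write path.
public class ConfigFallback {
    private final Map<String, String> props = new HashMap<>();

    public void set(String key, String value) {
        props.put(key, value);
    }

    // Return the value for `key`; if absent, fall back to the first
    // alternative key that has been set; null if none are set.
    public String resolve(String key, String... alternatives) {
        if (props.containsKey(key)) {
            return props.get(key);
        }
        for (String alt : alternatives) {
            if (props.containsKey(alt)) {
                return props.get(alt);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ConfigFallback cfg = new ConfigFallback();
        // User sets only the datasource-level key...
        cfg.set("hoodie.datasource.write.keygenerator.class",
                "org.apache.hudi.keygen.SimpleKeyGenerator");
        // ...but the table-level key still resolves via the fallback.
        System.out.println(cfg.resolve(
                "hoodie.table.keygenerator.class",
                "hoodie.datasource.write.keygenerator.class"));
    }
}
```

Auto-setting one key from the other (as discussed for HUDI-2818) would make the mapping bidirectional; the fallback shown here only reads through, without mutating the config.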



--
This message was sent by Atlassian Jira
(v8.20.1#820001)