[
https://issues.apache.org/jira/browse/HUDI-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17910360#comment-17910360
]
Lin Liu edited comment on HUDI-8758 at 1/6/25 6:46 PM:
-------------------------------------------------------
The file group (FG) reader supports a `no merge mode` when
`hoodie.datasource.merge.type` is set to `skip_merge`.
We need to set this config for `insert` tables.
However, `insert` tables are defined by the `hoodie.datasource.insert.dup.policy`
config, which is a datasource option, so we cannot check it in the FG reader.
Therefore, we have to set this property in the Spark SQL client. That is,
when `hoodie.datasource.insert.dup.policy` is set to `none`, we need to set
`hoodie.datasource.merge.type` to `skip_merge`.
But if a customer creates an `insert` table in one session and queries it in a
different session without setting `hoodie.datasource.insert.dup.policy`, how do
we set the merge mode config properly?
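
For illustration, the client-side rule described in this comment could be
sketched as follows. This is only a minimal sketch in Python, not Hudi code:
`resolve_merge_type` is a hypothetical helper, and the options are modeled as a
plain dict. Only the two config keys and the values `none` and `skip_merge`
come from the discussion above.

```python
from typing import Dict

# Config keys quoted in the comment above.
INSERT_DUP_POLICY_KEY = "hoodie.datasource.insert.dup.policy"
MERGE_TYPE_KEY = "hoodie.datasource.merge.type"


def resolve_merge_type(options: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: return a copy of the session options in which the
    merge type is forced to skip_merge when the dup policy is `none`."""
    resolved = dict(options)  # leave the caller's options untouched
    if resolved.get(INSERT_DUP_POLICY_KEY) == "none":
        resolved[MERGE_TYPE_KEY] = "skip_merge"
    return resolved


if __name__ == "__main__":
    # Session that sets the dup policy: the merge type is derived from it.
    print(resolve_merge_type({INSERT_DUP_POLICY_KEY: "none"}))
    # Session that never sets the dup policy: nothing can be derived.
    print(resolve_merge_type({}))
```

The second call shows the open question: when the querying session does not
carry `hoodie.datasource.insert.dup.policy`, a rule of this shape has nothing
to act on.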
was (Author: JIRAUSER301185):
FG reader supports `no merge mode` if `hoodie.datasource.merge.type` is set to
`skip_merge`.
Obviously, we need to set this config for `insert` tables.
> hoodie.datasource.insert.dup.policy interplay with file group reader
> --------------------------------------------------------------------
>
> Key: HUDI-8758
> URL: https://issues.apache.org/jira/browse/HUDI-8758
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Y Ethan Guo
> Priority: Blocker
> Fix For: 1.0.1
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> Check with hoodie.datasource.insert.dup.policy set to different values and make
> sure the FG reader can read tables generated by writes in both cases.