dongkelun commented on pull request #4083: URL: https://github.com/apache/hudi/pull/4083#issuecomment-1003327348
> > @dongkelun @xushiyan I offer another solution to discuss.

> > Incremental queries in Hive need to set `hoodie.%s.consume.start.timestamp`, which is read in `HoodieHiveUtils.readStartCommitTime`. Currently, we pass the `hoodie.table.name` value, named `tableName`, to this function. We can add the configs `hoodie.datasource.write.database.name` in `DataSourceWriteOptions` and `hoodie.database.name` in `HoodieTableConfig`. If `database.name` is provided, we join `database.name` and `table.name` and pass the result to `readStartCommitTime`. Users can then set `hoodie.dbName.tableName.consume.start.timestamp` in Hive and query.

> > Also, `hoodie.datasource.write.database.name` and `hoodie.database.name` can be reused in other scenarios.

> > @xushiyan what do you think?

> > @xushiyan @YannByron I think I understand the solution.

> > SQL will persist the database name to `hoodie.properties` by default, while DF persists it selectively, through an optional database parameter. Then, in an incremental query, if `databaseName.tableName` is set, we match on `databaseName.tableName`. If it is inconsistent or there is no databaseName, the incremental query is not performed. If it is consistent, the incremental query is performed. If the incremental query does not set a database name, we match on the table name only, not the database name.

> > So, which parameter should DF use to persist the database name?

@xushiyan Hello, do you think this idea is OK? If so, I'll submit a first version based on it.
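
To make the matching rule above concrete, here is a minimal sketch of how the key lookup could work. It is not the actual `HoodieHiveUtils.readStartCommitTime` code: the class `IncrementalKeySketch`, the method names, and the use of a plain `Map` in place of the Hadoop job configuration are all assumptions made for illustration. Only the key pattern `hoodie.%s.consume.start.timestamp` comes from the discussion.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, for illustration only; not part of Hudi.
public class IncrementalKeySketch {
    // Key pattern quoted in the discussion above.
    static final String START_TS_PATTERN = "hoodie.%s.consume.start.timestamp";

    // Builds the config key for a given name ("table" or "db.table").
    static String startTimestampKey(String name) {
        return String.format(START_TS_PATTERN, name);
    }

    /**
     * Resolves the start commit time for an incremental query.
     *
     * @param conf         query-side settings (stand-in for the Hive/Hadoop job conf)
     * @param databaseName database name persisted for the table, if any (assumed optional)
     * @param tableName    value of hoodie.table.name
     * @return the start timestamp, or empty if no matching key is set
     */
    static Optional<String> resolveStartCommitTime(
            Map<String, String> conf, Optional<String> databaseName, String tableName) {
        // 1. If a database name is persisted, try the qualified key first.
        if (databaseName.isPresent() && !databaseName.get().isEmpty()) {
            String qualified = databaseName.get() + "." + tableName;
            String value = conf.get(startTimestampKey(qualified));
            if (value != null) {
                return Optional.of(value);
            }
        }
        // 2. Fall back to the table-only key (query did not set a database name).
        // A key qualified with a different database never matches either lookup,
        // so no incremental query is performed in that case.
        return Optional.ofNullable(conf.get(startTimestampKey(tableName)));
    }
}
```

Under these assumptions, setting `hoodie.db1.t1.consume.start.timestamp` triggers an incremental query only for a table persisted as `db1`/`t1`, while `hoodie.t1.consume.start.timestamp` keeps matching on the table name alone.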