yihua commented on code in PR #13208:
URL: https://github.com/apache/hudi/pull/13208#discussion_r2060772368
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/SparkReaderContextFactory.java:
##########
@@ -88,6 +90,9 @@ class SparkReaderContextFactory implements ReaderContextFactory<InternalRow> {
// Spark parquet reader has to be instantiated on the driver and broadcast to the executors
SparkParquetReader parquetFileReader = sparkAdapter.createParquetFileReader(false, sqlConf, options, configs);
parquetReaderBroadcast = jsc.broadcast(parquetFileReader);
+ // Broadcast: TableConfig.
+ HoodieTableConfig tableConfig = metaClient.getTableConfig();
Review Comment:
As Tim mentioned, `SerializableConfiguration` wraps the Hadoop configs shared
across a Spark session. Those configs do not contain Hudi configs, so it's
unsafe to add table-specific configs there. Given that the table config object
is small, it's OK to have it this way for now. Let's remove the table config
from the reader context in a subsequent refactoring, once the Hudi semantics
are removed from the reader context class. @danny0405 wdyt?