Re: [PR] feat(variant): Add support to write shredded variants for HoodieRecordType.AVRO [hudi]

via GitHub Thu, 04 Jun 2026 01:14:49 -0700


voonhous commented on code in PR #18065:
URL: https://github.com/apache/hudi/pull/18065#discussion_r3354476787



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieHadoopFsRelationFactory.scala:
##########
@@ -62,6 +62,18 @@ abstract class HoodieBaseHadoopFsRelationFactory(val sqlContext: SQLContext,
                                                  val schemaSpec: Option[StructType],
                                                  val isBootstrap: Boolean
                                                 ) extends SparkAdapterSupport with HoodieHadoopFsRelationFactory with Logging {
+  // Propagate Hudi's variant allow-reading-shredded config to Spark's SQLConf.
+  // ParquetToSparkSchemaConverter reads this from SQLConf.get(), so it must be set
+  // before query execution starts here during table resolution
+  if (HoodieSparkUtils.gteqSpark4_0) {
+    val sqlConf = sqlContext.sparkSession.sessionState.conf
+    val hoodieParquetAllowReadingShreddedConfKey = "hoodie.parquet.variant.allow.reading.shredded"
+    val allowReadingShredded = options.getOrElse(
+      hoodieParquetAllowReadingShreddedConfKey,

Review Comment:
   The actual per-file read is already scoped on the Hadoop conf: 
`Spark40ParquetReader.build` sets `VARIANT_ALLOW_READING_SHREDDED` on the 
per-read `hadoopConf`, which is what the `ParquetReadSupport` converter uses. 
The session-level set here exists for two reasons: Spark's 
`spark.sql.variant.allowReadingShredded` is itself a session-scoped flag (there 
is no per-relation equivalent), and Hudi defaults shredded reading to `true`.
   
   I have reworked the precedence so that it no longer clobbers an explicitly 
set Spark conf. From highest to lowest priority: table option > hoodie session 
key > existing Spark conf > Hudi default.
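
For illustration, the precedence chain described above could be sketched roughly as follows. This is a minimal, Spark-free sketch and not the code in the PR: the object name `ShreddedReadPrecedence`, the `resolve` helper, and the use of plain `Map`s to stand in for the table options and the session conf are all hypothetical. Only the two config keys and the `true` default come from this thread. A key is treated as "explicitly set" simply when it is present in the map.

```scala
// Hypothetical sketch of the precedence described in the review comment:
// table option > hoodie session key > existing Spark conf > Hudi default.
object ShreddedReadPrecedence {
  // Hudi key, as quoted in the diff under review.
  val HoodieKey = "hoodie.parquet.variant.allow.reading.shredded"
  // Spark session flag, as named in the review comment.
  val SparkKey = "spark.sql.variant.allowReadingShredded"
  // The comment states that Hudi defaults shredded reading to true.
  val HudiDefault = true

  // tableOptions: per-table/relation options (assumed to be a plain Map here).
  // sessionConf: only the explicitly set session entries (assumed to be a Map here).
  def resolve(tableOptions: Map[String, String],
              sessionConf: Map[String, String]): Boolean =
    tableOptions.get(HoodieKey)               // 1. table option
      .orElse(sessionConf.get(HoodieKey))     // 2. hoodie key set on the session
      .orElse(sessionConf.get(SparkKey))      // 3. Spark conf the user already set
      .map(_.trim.toBoolean)
      .getOrElse(HudiDefault)                 // 4. Hudi default
}
```

With this shape, a user who has explicitly set the Spark flag to `false` keeps that value unless a Hudi-level option overrides it, which is the "no longer clobbers" behavior described above.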



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
