[I] Hudi table not queryable by SQL on Databricks Spark [hudi]

via GitHub Sat, 29 Nov 2025 22:22:41 -0800


hudi-bot opened a new issue, #15720:
URL: https://github.com/apache/hudi/issues/15720


   Customer: I’ve tried this with 0.12.2 and still receive the same error. Does 
the table format version also need to be updated? We’re writing with Hudi 
0.11.1 on EMR but reading from Databricks using Hudi 0.12.2 and Spark 3.3.
   
    
   
   What has been tried so far on 0.12.2:
   1. SparkSQL
   
   Just tried Spark SQL and it doesn’t work (different issue):
   SET hoodie.file.index.enable=false;
   select count(*) from validated_sales;
   Returns a count of 0 but no errors.
   2. When running via pyspark
   %python
   df = spark.read.format('hudi')\
   .load('s3://<bucket>/validated_sales/*/*/*')
   df.count()
   All is good with Hudi 0.12.2 and Databricks 11.3 (Spark 3.3).
   3. Without the wildcard in pyspark
   %python
   df = spark.read.format('hudi')\
   .load('s3://<bucket>/validated_sales')
   df.count()
   count = 0
   4. Without the wildcard but with the recursive option set in pyspark
   %python
   df = spark.read.format('hudi')\
   .option("recursiveFileLookup","true")\
   .load('s3://<bucket>/validated_sales')
   df.count()
   count = 250k 
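
To compare the three DataFrame reads from the list above side by side, a small helper along these lines may be useful. This is only a sketch: `partition_glob` and `count_by_read_mode` are hypothetical names, not part of Hudi or Spark, and the partition depth of 3 is inferred from the `/*/*/*` suffix used in the wildcard read. It assumes an existing `SparkSession` with the Hudi bundle available.

```python
def partition_glob(base_path: str, depth: int) -> str:
    """Append one '*' path segment per partition level to base_path."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return "/".join([base_path.rstrip("/")] + ["*"] * depth)


def count_by_read_mode(spark, base_path: str, depth: int) -> dict:
    """Run the three DataFrame reads described above and return their counts.

    `spark` is assumed to be an existing SparkSession with the Hudi bundle
    on the classpath; nothing here is specific to a Hudi version.
    """
    def reader():
        return spark.read.format("hudi")

    return {
        # Read with one wildcard per partition level.
        "glob": reader().load(partition_glob(base_path, depth)).count(),
        # Read the table base path directly.
        "base_path": reader().load(base_path).count(),
        # Read the base path with Spark's recursive file lookup enabled.
        "recursive": reader()
        .option("recursiveFileLookup", "true")
        .load(base_path)
        .count(),
    }
```

Calling `count_by_read_mode(spark, 's3://<bucket>/validated_sales', 3)` in a notebook would return the three counts in one dictionary, which makes it easier to see which read modes disagree.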
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5609
   - Type: Bug
   - Fix version(s):
     - 1.1.0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
