[PR] [SPARK-51479][SQL] Nullable in Row Level Operation Column is not correct [spark]

via GitHub Tue, 11 Mar 2025 23:43:29 -0700


huaxingao opened a new pull request, #50246:
URL: https://github.com/apache/spark/pull/50246


   
   
   ### What changes were proposed in this pull request?
   Fix the nullability computed for columns in row-level operations.
   
   
   ### Why are the changes needed?
   In the Iceberg/Spark 4.0 integration, a few tests fail because nullability 
is not computed correctly. 
   
   ```
   TestMergeOnReadUpdate > testUpdateWithMultiColumnInSubquery() > catalogName = spark_catalog, implementation = org.apache.iceberg.spark.SparkSessionCatalog, config = {type=hive, default-namespace=default, clients=1, parquet-enabled=false, cache-enabled=false}, format = AVRO, vectorized = false, distributionMode = range, fanout = false, branch = test, planningMode = DISTRIBUTED, formatVersion = 3 FAILED
       java.lang.IllegalArgumentException: Provided metadata schema is incompatible with expected schema:
       table {
         2147483643: _spec_id: required int (Spec ID used to track the file containing a row)
         2147483642: _partition: optional struct<> (Partition to which a row belongs to)
       }
       Provided schema:
       table {
         2147483643: _spec_id: optional int
         2147483642: _partition: optional struct<>
       }
       Problems:
       * _spec_id should be required, but is optional
           at org.apache.iceberg.types.TypeUtil.checkSchemaCompatibility(TypeUtil.java:493)
   ```
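
   The failure above comes down to a required/optional mismatch on a metadata 
column. The following is a minimal, self-contained sketch of that kind of 
check, not Iceberg's actual `TypeUtil.checkSchemaCompatibility` 
implementation. The `Field` class and `check_compatibility` function are 
hypothetical names used only for illustration; the field IDs, names, and the 
problem message are taken from the log above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field (not an Iceberg or Spark class)."""
    field_id: int
    name: str
    type_name: str
    required: bool


def check_compatibility(expected, provided):
    """Return a list of nullability problems; an empty list means compatible."""
    provided_by_id = {f.field_id: f for f in provided}
    problems = []
    for exp in expected:
        got = provided_by_id.get(exp.field_id)
        if got is None:
            # A missing field only matters when the reader requires it.
            if exp.required:
                problems.append(f"{exp.name} is required, but is missing")
            continue
        # A required field must not be provided as optional (nullable).
        if exp.required and not got.required:
            problems.append(f"{exp.name} should be required, but is optional")
    return problems


# Expected metadata schema, as printed in the error message.
expected = [
    Field(2147483643, "_spec_id", "int", required=True),
    Field(2147483642, "_partition", "struct<>", required=False),
]

# Schema derived with the wrong nullability: _spec_id became optional.
provided = [
    Field(2147483643, "_spec_id", "int", required=False),
    Field(2147483642, "_partition", "struct<>", required=False),
]

problems = check_compatibility(expected, provided)
for p in problems:
    print("*", p)
```

   In this sketch, a column that is marked nullable when it should not be is 
enough to fail the check, which is why the nullability computed for the 
row-level operation column has to match the table's declared schema.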
   
   ### Does this PR introduce _any_ user-facing change?
   no
   
   
   ### How was this patch tested?
   A new test was added.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   no
   


