amogh-jahagirdar commented on code in PR #50246:
URL: https://github.com/apache/spark/pull/50246#discussion_r2006563930


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteRowLevelCommand.scala:
##########
@@ -273,9 +273,8 @@ trait RewriteRowLevelCommand extends Rule[LogicalPlan] {
       outputs: Seq[Seq[Expression]],
       colOrdinals: Seq[Int],
       attrs: Seq[Attribute]): ProjectingInternalRow = {
-    val schema = StructType(attrs.zipWithIndex.map { case (attr, index) =>
-      val nullable = outputs.exists(output => output(colOrdinals(index)).nullable)
-      StructField(attr.name, attr.dataType, nullable, attr.metadata)
+    val schema = StructType(attrs.zipWithIndex.map { case (attr, _) =>
+      StructField(attr.name, attr.dataType, attr.nullable, attr.metadata)

Review Comment:
   https://github.com/apache/iceberg/blob/main/spark/v3.4/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteRowLevelIcebergCommand.scala#L111 is the older Spark 3.4 extension we had in Iceberg, before these plans moved into Spark:

   ```
   output attr is nullable if at least one output projection may produce null for that attr
   but row ID and metadata attrs are projected only for update/delete records
   and row attrs are projected only in insert/update records
   that's why the projection schema must rely only on relevant outputs instead of blindly inheriting the output attr nullability
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
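The distinction the comment draws (deriving nullability from the relevant output projections versus blindly inheriting the attribute's declared nullability) can be sketched in plain Scala. This is a minimal, self-contained illustration, not the Spark code: `Expr` and `Attr` are hypothetical stand-ins for Spark's `Expression` and `Attribute`, and the example names are invented.

```scala
// Hypothetical simplified stand-ins for Spark's Expression and Attribute.
case class Expr(nullable: Boolean)
case class Attr(name: String, nullable: Boolean)

object NullabilitySketch {
  // Removed behaviour (the `outputs.exists` logic): the projected column is
  // nullable only if at least one relevant output projection may produce
  // null at that ordinal.
  def fromOutputs(outputs: Seq[Seq[Expr]], ordinal: Int): Boolean =
    outputs.exists(output => output(ordinal).nullable)

  // New behaviour in the diff: inherit the attribute's declared nullability,
  // regardless of what the relevant projections actually produce.
  def fromAttr(attr: Attr): Boolean = attr.nullable

  def main(args: Array[String]): Unit = {
    // A nullable attr whose relevant projections are all non-null expressions:
    val attr = Attr("_row_id", nullable = true)
    val relevantOutputs = Seq(Seq(Expr(nullable = false)), Seq(Expr(nullable = false)))
    println(fromOutputs(relevantOutputs, 0)) // false: tighter than the attr says
    println(fromAttr(attr))                  // true: blindly inherited
  }
}
```

The sketch shows why the two can disagree: when only a subset of outputs projects a given attr (e.g. row ID attrs only in update/delete records), computing nullability over just those outputs can yield a tighter, more accurate schema than the attribute's own flag.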
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteRowLevelCommand.scala: ########## @@ -273,9 +273,8 @@ trait RewriteRowLevelCommand extends Rule[LogicalPlan] { outputs: Seq[Seq[Expression]], colOrdinals: Seq[Int], attrs: Seq[Attribute]): ProjectingInternalRow = { - val schema = StructType(attrs.zipWithIndex.map { case (attr, index) => - val nullable = outputs.exists(output => output(colOrdinals(index)).nullable) - StructField(attr.name, attr.dataType, nullable, attr.metadata) + val schema = StructType(attrs.zipWithIndex.map { case (attr, _) => + StructField(attr.name, attr.dataType, attr.nullable, attr.metadata) Review Comment: https://github.com/apache/iceberg/blob/main/spark/v3.4/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteRowLevelIcebergCommand.scala#L111 in the older Spark 3.4 extension we had in Iceberg before plans were in Spark. ``` output attr is nullable if at least one output projection may produce null for that attr but row ID and metadata attrs are projected only for update/delete records and row attrs are projected only in insert/update records that's why the projection schema must rely only on relevant outputs instead of blindly inheriting the output attr nullability ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org