LiaCastaneda commented on code in PR #16454:
URL: https://github.com/apache/datafusion/pull/16454#discussion_r2183300447


##########
datafusion/core/src/physical_planner.rs:
##########
@@ -1502,6 +1521,64 @@ fn get_null_physical_expr_pair(
     Ok((Arc::new(null_value), physical_name))
 }
 
+/// Qualifies the fields in a join schema with "left" and "right" qualifiers
+/// without mutating the original schema. This function should only be used 
when
+/// the join inputs have already been requalified earlier in 
`try_new_with_project_input`.
+///
+/// The purpose is to avoid ambiguity errors later in planning (e.g., in 
nullability or data type resolution)
+/// when converting expressions to fields.
+fn qualify_join_schema_sides(
+    join_schema: &DFSchema,
+    left: &LogicalPlan,
+    right: &LogicalPlan,
+) -> Result<DFSchema> {
+    let left_fields = left.schema().fields();
+    let right_fields = right.schema().fields();
+    let join_fields = join_schema.fields();
+
+    // Validate lengths
+    if join_fields.len() != left_fields.len() + right_fields.len() {
+        return internal_err!(
+            "Join schema field count must match left and right field count."
+        );
+    }
+
+    // Validate field names match
+    for (i, (field, expected)) in join_fields
+        .iter()
+        .zip(left_fields.iter().chain(right_fields.iter()))
+        .enumerate()
+    {
+        if field.name() != expected.name() {
+            return internal_err!(
+                "Field name mismatch at index {}: expected '{}', found '{}'",
+                i,
+                expected.name(),
+                field.name()
+            );
+        }
+    }

Review Comment:
   I don't expect these errors to occur in production, but they might be useful 
for devs working on the physical planning of Joins or modifying 
`requalify_sides_if_needed` prevent unwanted behaviour/bugs, as the comment 
says, it this function should only be used when `try_new_with_project_input` is 
used & if sides have been requalified.
   
   For instance, if this fucntion is applied to a semi join, planning later 
will fail because its schema is not the result of left + right (it wont fail 
now as this is only called for inner joins) but I think its still worth 
checking to prevent any regressions



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to