ethan-tyler commented on code in PR #20763:
URL: https://github.com/apache/datafusion/pull/20763#discussion_r2915600803
##########
datafusion/expr/src/logical_plan/dml.rs:
##########
@@ -291,6 +294,62 @@ impl Display for InsertOp {
}
}
+/// Describes a MERGE INTO operation's parameters.
+///
+/// This is carried inside `WriteOp::MergeInto` and contains
+/// the ON condition and WHEN clauses that the TableProvider
+/// needs to execute the merge.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoOp {
+ /// The join condition from `ON <expr>`.
+ /// Kept as a general logical Expr; downstream providers
+ /// (e.g., Iceberg) can decompose into column pairs if needed.
+ pub on: Expr,
+ /// The WHEN clauses, in the order they appeared in the SQL.
+ pub clauses: Vec<MergeIntoClause>,
+}
+
+/// A single WHEN clause within a MERGE INTO statement.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoClause {
+ /// Whether this fires on matched or unmatched rows.
+ pub kind: MergeIntoClauseKind,
+ /// Optional additional predicate (`AND <expr>`).
+ pub predicate: Option<Expr>,
+ /// The action to take.
Review Comment:
Once the planner lands, `apply_expressions` / `with_new_exprs` will need to
be wired through DML to pick up the Expr payloads stored here. Not a concern
for this PR since these are just type definitions.
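
For the follow-up, a rough sketch of what the wiring might look like. `Expr` here is a stand-in for the real DataFusion type, `exprs` and `with_new_exprs` on `MergeIntoOp` are hypothetical helpers rather than existing API, and `kind` / `action` are left out because `MergeIntoAction` is not shown in this hunk (it may carry expressions of its own):

```rust
// Stand-in for the DataFusion `Expr`; the real type is much richer.
#[derive(Debug, Clone, PartialEq)]
pub struct Expr(pub String);

// Trimmed copy of the PR's struct: only the Expr-bearing field is kept.
#[derive(Debug, Clone, PartialEq)]
pub struct MergeIntoClause {
    pub predicate: Option<Expr>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct MergeIntoOp {
    pub on: Expr,
    pub clauses: Vec<MergeIntoClause>,
}

impl MergeIntoOp {
    /// Hypothetical helper: the ON expression first, then each clause
    /// predicate that is present, in clause order.
    pub fn exprs(&self) -> Vec<&Expr> {
        std::iter::once(&self.on)
            .chain(self.clauses.iter().filter_map(|c| c.predicate.as_ref()))
            .collect()
    }

    /// Hypothetical inverse of `exprs`: rebuilds the op from expressions
    /// supplied in the same order, and rejects a wrong count.
    pub fn with_new_exprs(&self, exprs: Vec<Expr>) -> Result<Self, String> {
        let mut it = exprs.into_iter();
        let on = it.next().ok_or("missing ON expression")?;
        let mut clauses = Vec::with_capacity(self.clauses.len());
        for clause in &self.clauses {
            let predicate = match clause.predicate {
                Some(_) => Some(it.next().ok_or("missing clause predicate")?),
                None => None,
            };
            clauses.push(MergeIntoClause { predicate });
        }
        if it.next().is_some() {
            return Err("too many expressions".to_string());
        }
        Ok(MergeIntoOp { on, clauses })
    }
}
```

The point is only that the two directions must agree on ordering, so a rewrite pass can hand back the same number of expressions it was given.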
##########
datafusion/expr/src/logical_plan/dml.rs:
##########
@@ -239,6 +239,8 @@ pub enum WriteOp {
Ctas,
/// `TRUNCATE` operation
Truncate,
+ /// `MERGE INTO` operation
+ MergeInto(MergeIntoOp),
Review Comment:
This breaks datafusion-proto. With the new variant, the proto conversion for
`WriteOp` is non-exhaustive, and the proto schema can't carry `MergeIntoOp.on`
or `MergeIntoOp.clauses`. I would either land the proto changes together with
this PR or add an explicit serialization error path.
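
A minimal sketch of the error-path option, using stand-in types. `ProtoWriteType` and `write_op_to_proto` are hypothetical names for illustration, not the actual datafusion-proto items, and the `WriteOp` variants not visible in this hunk are elided:

```rust
// Payload elided; the real struct carries `on` and `clauses`.
#[derive(Debug, Clone, PartialEq)]
pub struct MergeIntoOp;

// Only the variants visible in this hunk; the others are elided.
#[derive(Debug, Clone, PartialEq)]
pub enum WriteOp {
    Ctas,
    Truncate,
    MergeInto(MergeIntoOp),
}

// Hypothetical stand-in for the proto-side enum, which has no MERGE variant.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProtoWriteType {
    Ctas,
    Truncate,
}

/// Fallible conversion: variants the proto schema cannot represent
/// return an explicit error instead of failing to compile or panicking.
pub fn write_op_to_proto(op: &WriteOp) -> Result<ProtoWriteType, String> {
    match op {
        WriteOp::Ctas => Ok(ProtoWriteType::Ctas),
        WriteOp::Truncate => Ok(ProtoWriteType::Truncate),
        WriteOp::MergeInto(_) => {
            Err("MERGE INTO is not yet supported by proto serialization".to_string())
        }
    }
}
```

Keeping the match exhaustive (no `_` arm) means the next new variant forces the same decision at compile time.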
##########
datafusion/expr/src/logical_plan/dml.rs:
##########
@@ -291,6 +294,62 @@ impl Display for InsertOp {
}
}
+/// Describes a MERGE INTO operation's parameters.
+///
+/// This is carried inside `WriteOp::MergeInto` and contains
+/// the ON condition and WHEN clauses that the TableProvider
+/// needs to execute the merge.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoOp {
+ /// The join condition from `ON <expr>`.
+ /// Kept as a general logical Expr; downstream providers
+ /// (e.g., Iceberg) can decompose into column pairs if needed.
+ pub on: Expr,
+ /// The WHEN clauses, in the order they appeared in the SQL.
+ pub clauses: Vec<MergeIntoClause>,
+}
+
+/// A single WHEN clause within a MERGE INTO statement.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoClause {
+ /// Whether this fires on matched or unmatched rows.
+ pub kind: MergeIntoClauseKind,
+ /// Optional additional predicate (`AND <expr>`).
+ pub predicate: Option<Expr>,
+ /// The action to take.
+ pub action: MergeIntoAction,
+}
+
+/// Which rows a MERGE WHEN clause applies to.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Hash)]
+pub enum MergeIntoClauseKind {
+ /// WHEN MATCHED
+ Matched,
+ /// WHEN NOT MATCHED (synonymous with NOT MATCHED BY TARGET)
+ NotMatched,
+ /// WHEN NOT MATCHED BY TARGET
+ NotMatchedByTarget,
+ /// WHEN NOT MATCHED BY SOURCE
+ NotMatchedBySource,
Review Comment:
`NotMatched` and `NotMatchedByTarget` look synonymous, and the doc comment on
`NotMatched` says as much. If both stay, document how downstream consumers
should handle them.
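
One possible way to document it, if both variants stay, is a normalizing helper that consumers call before matching. `canonical` is a hypothetical method name, not something in the PR:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Hash)]
pub enum MergeIntoClauseKind {
    /// WHEN MATCHED
    Matched,
    /// WHEN NOT MATCHED (synonymous with NOT MATCHED BY TARGET)
    NotMatched,
    /// WHEN NOT MATCHED BY TARGET
    NotMatchedByTarget,
    /// WHEN NOT MATCHED BY SOURCE
    NotMatchedBySource,
}

impl MergeIntoClauseKind {
    /// Hypothetical helper: folds the `NotMatched` spelling into
    /// `NotMatchedByTarget` so consumers handle three cases, not four.
    pub fn canonical(self) -> Self {
        match self {
            MergeIntoClauseKind::NotMatched => MergeIntoClauseKind::NotMatchedByTarget,
            other => other,
        }
    }
}
```

This keeps the original SQL spelling available for display while giving providers a single variant to match on.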
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]