wiedld commented on code in PR #13651:
URL: https://github.com/apache/datafusion/pull/13651#discussion_r1886199978
##########
datafusion/optimizer/src/analyzer/mod.rs:
##########
@@ -164,18 +172,38 @@ impl Analyzer {
log_plan(rule.name(), &new_plan);
observer(&new_plan, rule.as_ref());
}
- // for easier display in explain output
- check_plan(&new_plan).map_err(|e| {
+
+ // verify at the end, after the last LP analyzer pass.
+ assert_valid_semantic_plan(&new_plan).map_err(|e| {
DataFusionError::Context("check_analyzed_plan".to_string(),
Box::new(e))
})?;
+
log_plan("Final analyzed plan", &new_plan);
debug!("Analyzer took {} ms", start_time.elapsed().as_millis());
Ok(new_plan)
}
}
-/// Do necessary check and fail the invalid plan
-fn check_plan(plan: &LogicalPlan) -> Result<()> {
+/// These are invariants which should hold true before and after each analyzer.
+///
+/// This differs from [`LogicalPlan::assert_invariants`], which addresses if a
singular
+/// LogicalPlan is valid. Instead this address if the analyzer (before and
after)
+/// is valid based upon permitted changes.
+///
+/// Does not check elements which are mutated by the analyzers (e.g. the
schema).
+fn assert_valid_semantic_plan(plan: &LogicalPlan) -> Result<()> {
+ plan.assert_invariants()?;
+
+ // TODO: should this be moved to LogicalPlan::assert_invariants?
+ assert_subqueries_are_valid(plan)?;
Review Comment:
The `assert_subqueries_are_valid()` runs [these
checks](https://github.com/apache/datafusion/blob/bd2c975df88c9e95f2f6b5f2abed5638904e2ac0/datafusion/optimizer/src/analyzer/subquery.rs#L28-L34).
The existing (on main) code only performs these checks after all of the
analyzer passes have completed. If I run these checks before the first analyzer
pass => we get test failures.
My next step will be to figure out which of these test failures are due to
code bugs.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]