eliaperantoni commented on code in PR #13664:
URL: https://github.com/apache/datafusion/pull/13664#discussion_r1923512091


##########
datafusion/sql/src/expr/mod.rs:
##########
@@ -165,13 +177,126 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
         schema: &DFSchema,
         planner_context: &mut PlannerContext,
     ) -> Result<Expr> {
-        let mut expr = self.sql_expr_to_logical_expr(sql, schema, 
planner_context)?;
+        // The location of the original SQL expression in the source code
+        let mut expr =
+            self.sql_expr_to_logical_expr(sql.clone(), schema, 
planner_context)?;

Review Comment:
   That's a very interesting idea. I see a few issues with it though:
   
   1.
   
   What if `sql` is a compound expression like `a + b` but only `b` is 
unresolved? Ideally, I would want `sql_expr_to_logical_expr` to be able to 
return diagnostics that highlight `b` only. But does it do that if it only know 
the `Span` of the entire `sql` expression?
   
   2.
   
   What if a diagnostic needs to highlight various parts of a query. e.g. if a 
non-aggregated column is missing from the `GROUP BY` clause, a good diagnostic 
would highlight both the column and the `GROUP BY` clause. But planner context 
has only one `Span`, and which is the "current span" in this case?
   
   I thought that I could solve the above issues by putting a `HashMap<String, 
Vec<Span>>` in `PlannerContext`. The key would be to label different parts of 
the query, e.g. `group_by_clause`, `set_operator`, `left_side_of_expr`, etc.. 
So that I could do something like:
   
   ```
   error.with_diagnostic(Diagnostic::new().with_error(
       "column missing from GROUP BY",
       planner_context.span("expr").last(), // Highlight the column
   ).with_note(
       "GROUP BY is here",
       planner_context.span("group_by").last(), // Highlight the GROUP BY
   ))
   ```
   
   The `Vec<_>` is just to have stacks, so that in something like `SELECT ... 
UNION (SELECT ... UNION SELECT ...)` the `set_operator` key could point to the 
outer `UNION`, then the inner one, then restore the outer one when we're done 
planning the query in parentheses.
   
   But then I encountered another issue:
   
   3.
   
   Some functions where `PlannerContext` is threaded through (e.g. 
`sql_expr_to_logical_expr` and `sql_expr_to_logical_expr_internal`) take `&mut 
PlannerContext`. What if I need to add new spans? 
   
   For example, in the body of `sql_expr_to_logical_expr` I would like to do:
   
   <img width="874" alt="image" 
src="https://github.com/user-attachments/assets/82b4d8c8-2d78-4367-8340-50e1dcc067de";
 />
   
   But, what if `sql_expr_to_logical_expr_internal` needs to mutate the given 
`&mut PlannerContext`? I can't simply clone `PlannerContext` and add a new span 
because I then break the ability to mutate downstream.
   
   Then it seems like a pattern like:
   
   ```
   pattern_context.with_span("expr", sql_expr.span(), move |planner_context| {
       let expr = self.sql_expr_to_logical_expr_internal(
           *sql_expr,
           schema,
           planner_context,
       )?;
   });
   ```
   
   might be necessary, to add a `Span` to the given `PlannerContext` but also 
remove it. Idk if that's less invasive than adding `Span` to `Expr` though.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to