findepi commented on issue #12622: URL: https://github.com/apache/datafusion/issues/12622#issuecomment-2668262491
> > I am relying here that we can extract the necessary type information from the schema or record batches
>
> This is probably not true. Scalar is a free constant unlike column that has DataType in the defined table.

Which may involve a bit of computation, especially for large expressions like deep AND or OR trees (https://github.com/apache/datafusion/issues/12604) (in https://github.com/apache/datafusion/issues/9375 we avoided the stack overflow, not the computation).

> A question I have is: When is `LogicalScalar` actually helpful?
>
> The LogicalScalar is used in `Expr::Literal` and is later converted into a ScalarValue during the physical plan execution.

It's not helpful if logical planning remains tied to the Arrow `DataType`, which it still is today. I think we wanted to have `LogicalScalar` to decouple logical planning from the Arrow type system.

> Even if we can convert to `ScalarValue`, since we don't have DataType, `LogicalScalar::String` can only be converted to `ScalarValue::Utf8` but not `ScalarValue::Utf8View` or `ScalarValue::Diction(_, Utf8)`. Given `type coercion` resolve the DataType of the Expr not only the LogicalType, it is problematic if we can't convert the scalar string to the specific

This is a chicken-and-egg problem. Coercion resolves DataType because that is the (current) type system for LP. I believe the purpose of this issue (https://github.com/apache/datafusion/issues/12622) is "Logical operators during logical planning should unquestionably not have access to the physical type information".

So the end state would be: LP operating on "logical types" (without distinguishing Utf8 from Utf8View, from Dictionary(_, Utf8), from REE). The coercions operate on the same logical types. The constants in the plan are also expressed in terms of these logical types.
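To make that end state concrete, here is a minimal sketch in plain Rust. Every type and function in it (`PhysicalType`, `LogicalType`, `LogicalScalar`, `logical`, `comparable`) is hypothetical and simplified for illustration; none of them are the actual DataFusion or Arrow definitions. It only shows the idea that a type check during logical planning would compare logical types, so a string constant matches a string column regardless of its encoding.

```rust
// Hypothetical types for illustration only; NOT DataFusion's or Arrow's definitions.

/// Physical encodings, loosely modeled on Arrow's `DataType` (assumption).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum PhysicalType {
    Int64,
    Utf8,
    Utf8View,
    /// Dictionary-encoded values; the key type is omitted for brevity.
    Dictionary(Box<PhysicalType>),
    /// Run-end-encoded (REE) values; the run-end type is omitted for brevity.
    RunEndEncoded(Box<PhysicalType>),
}

/// Logical types as the planner would see them: a single `String`,
/// however the data happens to be encoded.
#[derive(Debug, Clone, PartialEq)]
enum LogicalType {
    Int64,
    String,
}

/// A constant in the logical plan, expressed purely in logical terms.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum LogicalScalar {
    Int64(i64),
    String(String),
}

impl PhysicalType {
    /// Erase the encoding: Utf8, Utf8View, Dictionary(_, Utf8) and REE(Utf8)
    /// all map to `LogicalType::String`.
    fn logical(&self) -> LogicalType {
        match self {
            PhysicalType::Int64 => LogicalType::Int64,
            PhysicalType::Utf8 | PhysicalType::Utf8View => LogicalType::String,
            PhysicalType::Dictionary(v) | PhysicalType::RunEndEncoded(v) => v.logical(),
        }
    }
}

impl LogicalScalar {
    fn logical_type(&self) -> LogicalType {
        match self {
            LogicalScalar::Int64(_) => LogicalType::Int64,
            LogicalScalar::String(_) => LogicalType::String,
        }
    }
}

/// Logical-level check for `column = literal`: only logical types are compared,
/// so no physical encoding has to be chosen for the literal at this stage.
fn comparable(column: &PhysicalType, literal: &LogicalScalar) -> bool {
    column.logical() == literal.logical_type()
}

fn main() {
    let lit = LogicalScalar::String("abc".to_string());
    let string_columns = [
        PhysicalType::Utf8,
        PhysicalType::Utf8View,
        PhysicalType::Dictionary(Box::new(PhysicalType::Utf8)),
        PhysicalType::RunEndEncoded(Box::new(PhysicalType::Utf8View)),
    ];
    for col in &string_columns {
        assert!(comparable(col, &lit));
    }
    assert!(!comparable(&PhysicalType::Int64, &lit));
}
```

In this sketch the choice between Utf8, Utf8View, or a dictionary encoding for the literal is simply not made during logical planning; it would be deferred to whichever later stage knows the physical type of the other operand.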
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org