crepererum commented on code in PR #11455:
URL: https://github.com/apache/datafusion/pull/11455#discussion_r1715234362


##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -41,20 +39,24 @@ use datafusion_expr::type_coercion::binary::get_result_type;
 use datafusion_expr::{ColumnarValue, Operator};
 use datafusion_physical_expr_common::datum::{apply, apply_cmp, 
apply_cmp_for_nested};
 
+use datafusion_physical_expr_common::expressions::Literal;
 use kernels::{
     bitwise_and_dyn, bitwise_and_dyn_scalar, bitwise_or_dyn, 
bitwise_or_dyn_scalar,
     bitwise_shift_left_dyn, bitwise_shift_left_dyn_scalar, 
bitwise_shift_right_dyn,
     bitwise_shift_right_dyn_scalar, bitwise_xor_dyn, bitwise_xor_dyn_scalar,
 };
 
 /// Binary expression
-#[derive(Debug, Hash, Clone)]
+#[derive(Clone)]
 pub struct BinaryExpr {
     left: Arc<dyn PhysicalExpr>,
     op: Operator,
     right: Arc<dyn PhysicalExpr>,
     /// Specifies whether an error is returned on overflow or not
     fail_on_overflow: bool,
+    /// Only used when evaluating literal regex expressions. Example regex 
expression: c1 ~ '^a'

Review Comment:
   This seems like a bit of a leaky abstraction or hack just for this one 
special case (= partial evaluation of regexes). If wonder if we could do 
something more elegant that would also work for other similar operators:
   
   1. introduce a new struct `CompiledRegex` that holds the complied regex. 
   2. implement `PhysicalExpr` for `CompiledRegex`
   3. change the regex evaluation of the binary expression to check if the RHS 
(= `BinaryExpr::right` is `CompiledExpr` and use this pre-compiled value
   4. change the optimizer pass to just replace the RHS instead attaching the 
cached data to `BinaryExpr`
   
   Another approach would be to lower/pre-compile the regex expressions to a 
custom data type, but currently arrow-rs doesn't support that, see 
https://github.com/apache/arrow-rs/issues/4472



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to