pepijnve commented on code in PR #17839:
URL: https://github.com/apache/datafusion/pull/17839#discussion_r2402499183


##########
datafusion/sqllogictest/test_files/string/string_view.slt:
##########
@@ -784,7 +784,7 @@ EXPLAIN SELECT
 FROM test;
 ----
 logical_plan
-01)Projection: regexp_like(test.column1_utf8view, Utf8("^https?://(?:www\.)?([^/]+)/.*$")) AS k
+01)Projection: test.column1_utf8view ~ Utf8View("^https?://(?:www\.)?([^/]+)/.*$") AS k

Review Comment:
   See https://github.com/apache/datafusion/issues/17838#issuecomment-3355083929
   
   The operator logic is in `physical_expr`, while `regexp_like` lives in 
`functions`. We would probably have to move the common logic to a separate 
crate. This PR was intended as a stopgap solution for common cases.
   
   We can only rewrite in some cases because of the optional `flags` argument. 
With the operators, all you can control is case sensitivity (i.e. the `i` flag).
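   
   As a rough illustration of that constraint, here is a minimal sketch of the decision. The `RegexOp` enum and `operator_for_flags` function are hypothetical names made up for this example and are not DataFusion's actual types or the code in this PR. It assumes only what is stated above: no flags maps to the case-sensitive operator, `i` alone maps to the case-insensitive one, and anything else cannot be expressed as an operator. An empty flags string is conservatively left alone.

```rust
/// Hypothetical stand-in for the regex match operators (not DataFusion's real enum).
#[derive(Debug, PartialEq)]
enum RegexOp {
    /// Case-sensitive match, written `~` in SQL.
    Match,
    /// Case-insensitive match, written `~*` in SQL.
    IMatch,
}

/// Decide whether a `regexp_like(value, pattern[, flags])` call could be
/// expressed with an operator. `flags` is the literal third argument, if any.
/// Returns `None` when the call has to stay a function call.
fn operator_for_flags(flags: Option<&str>) -> Option<RegexOp> {
    match flags {
        // No flags argument: plain case-sensitive matching.
        None => Some(RegexOp::Match),
        // Exactly the `i` flag: the only flag the operators can express.
        Some("i") => Some(RegexOp::IMatch),
        // Any other flag string (assumption: be conservative) is not rewritten.
        Some(_) => None,
    }
}
```

   With this sketch, `operator_for_flags(Some("im"))` yields `None`, so such a call would be left as `regexp_like`.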
   
   The operator is more efficient because it uses the 
`regexp_is_match_scalar` kernel when it can, while `regexp_like` always uses 
`regexp_is_match`. `regexp_is_match` does maintain a cache of compiled regexes, 
so at least the pattern isn't compiled over and over again, but it still 
executes quite a bit more code than `regexp_is_match_scalar`.
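   
   To make the difference concrete, here is a toy model of the two code paths. It is not the arrow kernels: `Compiled`, `match_scalar`, and `match_array` are hypothetical names, and "matching" is a plain substring test standing in for a real regex so the sketch needs only the standard library. The point is the per-row work: the scalar path compiles once up front, while the array path does a cache lookup keyed by pattern text for every row, even when the pattern never changes.

```rust
use std::collections::HashMap;

/// Toy stand-in for a compiled regex: "compiling" stores the needle and
/// matching is a substring test. A real kernel would compile a regex here.
struct Compiled(String);

impl Compiled {
    fn is_match(&self, haystack: &str) -> bool {
        haystack.contains(&self.0)
    }
}

/// Scalar-pattern path: compile once, then only match per row.
/// Returns the per-row results and the number of compilations.
fn match_scalar(values: &[&str], pattern: &str) -> (Vec<bool>, usize) {
    let compiled = Compiled(pattern.to_string());
    let results = values.iter().map(|v| compiled.is_match(v)).collect();
    (results, 1)
}

/// Array-pattern path: one pattern per row, with a cache keyed by pattern
/// text. Each row pays for building a key and a hash lookup before matching.
fn match_array(values: &[&str], patterns: &[&str]) -> (Vec<bool>, usize) {
    let mut cache: HashMap<String, Compiled> = HashMap::new();
    let mut compilations = 0;
    let results = values
        .iter()
        .zip(patterns)
        .map(|(v, p)| {
            let compiled = cache.entry(p.to_string()).or_insert_with(|| {
                compilations += 1;
                Compiled(p.to_string())
            });
            compiled.is_match(v)
        })
        .collect();
    (results, compilations)
}
```

   In this model both paths compile a repeated pattern only once, which mirrors the cache described above; the remaining cost in the array path is the extra bookkeeping on every row.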



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

