Omega359 commented on PR #14282: URL: https://github.com/apache/datafusion/pull/14282#issuecomment-2773275754
I spent some time today looking at this PR. Here are my thoughts:

- This UDF closely mirrors the [duckdb](https://duckdb.org/docs/stable/sql/functions/regular_expressions.html#regexp_extractstring-pattern-group--0-options) functionality, but it does not mirror the typical regex substring functionality found in [postgres](https://www.postgresql.org/docs/current/functions-matching.html), [mysql](https://dev.mysql.com/doc/refman/8.4/en/regexp.html#function_regexp-substr), [sql server](https://learn.microsoft.com/en-us/sql/t-sql/functions/regexp-substr-transact-sql?view=azuresqldb-current), or [oracle](https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/REGEXP_SUBSTR.html).
- I do not think that datafusion should have both regexp_extract and regexp_substr functionality, as the general consensus has been to mirror postgresql functionality where possible. Thus, if I were to choose which function of this type to include, it would be regexp_substr, not regexp_extract.

That is not to say that regexp_extract lacks merit, but it may be best to have it in a repo such as [datafusion-functions-extra](https://github.com/datafusion-contrib/datafusion-functions-extra) or similar.

As a note, the regexp_substr PR (https://github.com/apache/datafusion/pull/14323) is about to expire, and the author appears to have submitted it and then abandoned it. I think a number of the ideas you've implemented in this PR could be incorporated into a new PR, perhaps based on the above-mentioned PR. Just a thought, if you are interested. I really do appreciate the work you've put into this PR ❤️

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.