lincoln-lil commented on code in PR #25137:
URL: https://github.com/apache/flink/pull/25137#discussion_r1723025685


##########
docs/data/sql_functions_zh.yml:
##########
@@ -433,6 +433,14 @@ string:
       str <CHAR | VARCHAR>, regex <CHAR | VARCHAR>
       
       返回一个 INTEGER 表示匹配成功的子串索引。如果任何参数为 `NULL` 或 regex 非法,则返回 `NULL`。
+  - sql: REGEXP_SUBSTR(str, regex)
+    table: str.regexpSubStr(regex)
+    description: |
+      返回 str 中第一个匹配 regex 的子字符串。
+      
+      str <CHAR | VARCHAR>, regex <CHAR | VARCHAR>
+      
+      返回一个 STRING 表示匹配成功的子串。如果任何参数为 `NULL` 或 regex 非法或匹配失败,则返回 `NULL`。

Review Comment:
   返回一个 STRING 表示第一个匹配成功的子串 (in English: "Returns a STRING representing the first matched substring.")



##########
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/RegexpFunctionsITCase.java:
##########
@@ -307,4 +308,73 @@ private Stream<TestSetSpec> regexpInstrTestCases() {
                                 "Invalid input arguments. Expected signatures are:\n"
                                         + "REGEXP_INSTR(str <CHARACTER_STRING>, regex <CHARACTER_STRING>)"));
     }
+
+    private Stream<TestSetSpec> regexpSubstrTestCases() {
+        return Stream.of(
+                TestSetSpec.forFunction(BuiltInFunctionDefinitions.REGEXP_SUBSTR)
+                        .onFieldsWithData(null, "abcdeabde", "100-200, 300-400")
+                        .andDataTypes(DataTypes.STRING(), DataTypes.STRING(), DataTypes.STRING())
+                        // null input
+                        .testResult(
+                                $("f0").regexpSubstr($("f1")),
+                                "REGEXP_SUBSTR(f0, f1)",
+                                null,
+                                DataTypes.STRING())
+                        .testResult(
+                                $("f1").regexpSubstr($("f0")),
+                                "REGEXP_SUBSTR(f1, f0)",
+                                null,
+                                DataTypes.STRING())
+                        // invalid regexp
+                        .testResult(
+                                $("f1").regexpSubstr("("),
+                                "REGEXP_SUBSTR(f1, '(')",
+                                null,
+                                DataTypes.STRING())
+                        // not found
+                        .testResult(
+                                $("f2").regexpSubstr("[a-z]"),
+                                "REGEXP_SUBSTR(f2, '[a-z]')",
+                                null,
+                                DataTypes.STRING())
+                        // border chars
+                        .testResult(
+                                lit("Helloworld! Hello everyone!").regexpSubstr("\\bHello\\b"),
+                                "REGEXP_SUBSTR('Helloworld! Hello everyone!', '\\bHello\\b')",
+                                "Hello",
+                                DataTypes.STRING())
+                        .testResult(
+                                $("f2").regexpSubstr("(\\d+)-(\\d+)$"),
+                                "REGEXP_SUBSTR(f2, '(\\d+)-(\\d+)$')",
+                                "300-400",
+                                DataTypes.STRING())

Review Comment:
   Also add a case where the regex matches more than once, e.g.,
   ```java
                           .testResult(
                                   $("f2").regexpSubstr("(\\d+)-(\\d+)"),
                                   "REGEXP_SUBSTR(f2, '(\\d+)-(\\d+)')",
                                   "100-200",
                                   DataTypes.STRING())
   ```
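
   For reference, the semantics this case pins down can be sketched with plain `java.util.regex`. This is only an illustration of the rules stated in the docs (first match wins; `NULL` on null input, invalid regex, or no match), not Flink's actual implementation; the class and method names are hypothetical.

   ```java
   import java.util.regex.Matcher;
   import java.util.regex.Pattern;
   import java.util.regex.PatternSyntaxException;

   // Hypothetical stand-in for REGEXP_SUBSTR; not the code under review.
   class RegexpSubstrSketch {

       // Returns the first substring of str that matches regex, or null if
       // either argument is null, the regex is invalid, or nothing matches.
       static String regexpSubstr(String str, String regex) {
           if (str == null || regex == null) {
               return null;
           }
           try {
               Matcher matcher = Pattern.compile(regex).matcher(str);
               // find() scans left to right, so group() is the first match.
               return matcher.find() ? matcher.group() : null;
           } catch (PatternSyntaxException e) {
               // Invalid pattern such as "(" maps to null instead of an error.
               return null;
           }
       }

       public static void main(String[] args) {
           String f2 = "100-200, 300-400";
           System.out.println(regexpSubstr(f2, "(\\d+)-(\\d+)"));  // 100-200
           System.out.println(regexpSubstr(f2, "(\\d+)-(\\d+)$")); // 300-400
           System.out.println(regexpSubstr(f2, "[a-z]"));          // null
           System.out.println(regexpSubstr("abcdeabde", "("));     // null
       }
   }
   ```

   With `f2 = "100-200, 300-400"`, the unanchored pattern yields the first of two matches, whereas the `$`-anchored pattern in the existing case can only match the trailing `300-400`, so it does not exercise the "first match" behavior.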



##########
docs/data/sql_functions.yml:
##########
@@ -362,6 +362,14 @@ string:
       str <CHAR | VARCHAR>, regex <CHAR | VARCHAR>
       
       Returns an INTEGER representation of matched substring index. `NULL` if any of the arguments are `NULL` or regex is invalid.
+  - sql: REGEXP_SUBSTR(str, regex)
+    table: str.regexpSubstr(regex)
+    description: |
+      Returns the first substring in str that matches regex.
+      
+      str <CHAR | VARCHAR>, regex <CHAR | VARCHAR>
+      
+      Returns an STRING representation of matched substring. `NULL` if any of the arguments are `NULL` or regex if invalid or pattern is not found.

Review Comment:
   Returns a STRING representation of the first matched substring.



##########
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/BaseExpressions.java:
##########
@@ -1222,6 +1223,19 @@ public OutType regexpInstr(InType regex) {
                 unresolvedCall(REGEXP_INSTR, toExpr(), objectToExpression(regex)));
     }
 
+    /**
+     * Returns the first substring in {@code str} that matches {@code regex}.
+     *
+     * @param regex A STRING expression with a matching pattern.
+     * @return A STRING representation of matched substring. <br>

Review Comment:
   A STRING representation of the first matched substring.



##########
flink-python/pyflink/table/expression.py:
##########
@@ -1265,6 +1265,16 @@ def regexp_instr(self, regex) -> 'Expression':
         """
         return _binary_op("regexpInstr")(self, regex)
 
+    def regexp_substr(self, regex) -> 'Expression':
+        """
+        Returns the first substring in str that matches regex.
+        null if any of the arguments are null or regex is invalid or pattern is not found.
+
+        :param regex: A STRING expression with a matching pattern.
+        :return: A STRING representation of matched substring.

Review Comment:
   A STRING representation of the first matched substring.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
