walterddr commented on a change in pull request #7114:
URL: https://github.com/apache/pinot/pull/7114#discussion_r709287638



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java
##########
@@ -179,6 +180,48 @@ public static String rtrim(String input) {
     return RTRIM.matcher(input).replaceAll("");
   }
 
+  /**
+   * @see Pattern#matches(String, CharSequence)
+   * @param value input value
+   * @param regexp regular expression
+   * @return the matched result.
+   */
+  @ScalarFunction
+  public static String regexpExtract(String value, String regexp) {
+    return regexpExtract(value, regexp, 1, "");
+  }
+
+  /**
+   * Regular expression extract that accepts starting position as argument.
+   * @param value input value
+   * @param regexp regular expression
+   * @param occurrence the specified i-th occurrence to extract
+   * @return the matched result.
+   */
+  @ScalarFunction
+  public static String regexpExtract(String value, String regexp, int occurrence) {
+    return regexpExtract(value, regexp, occurrence, "");
+  }
+
+  /**
+   * Regular expression extract that accepts starting position and i-th occurrence as argument.
+   * @param value input value
+   * @param regexp regular expression
+   * @param occurrence the specified i-th occurrence to extract
+   * @param defaultValue the default value if no match found
+   * @return the matched result
+   */
+  @ScalarFunction
+  public static String regexpExtract(String value, String regexp, int occurrence, String defaultValue) {
+    Pattern p = Pattern.compile(regexp);
+    Matcher matcher = p.matcher(value);
+    if (matcher.find()) {
+      return matcher.group(occurrence - 1);

Review comment:
   Good catch. Let me search around and see what the convention is for
   handling this exception.

   But in SQL semantics, all indexes start from 1. Unless Pinot follows a
   different convention, I think we should stick with 1-based indexing.
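
   To make the two points above concrete, here is a minimal sketch of one
   possible way to keep the 1-based argument while avoiding the exception.
   It is not the code that was merged: the class name `RegexpExtractSketch`
   is hypothetical, and it assumes that an out-of-range `occurrence` should
   fall back to `defaultValue` rather than throw. It keeps the same
   `occurrence - 1` mapping to a group index as the diff above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of StringFunctions.java.
class RegexpExtractSketch {

  // Assumption: an invalid occurrence returns defaultValue instead of throwing.
  static String regexpExtract(String value, String regexp, int occurrence, String defaultValue) {
    Matcher matcher = Pattern.compile(regexp).matcher(value);
    // Same mapping as the diff: the 1-based argument becomes a 0-based group index.
    int group = occurrence - 1;
    // Matcher.group(int) throws IndexOutOfBoundsException outside [0, groupCount()],
    // so the range is checked before the group is read.
    if (group < 0 || group > matcher.groupCount() || !matcher.find()) {
      return defaultValue;
    }
    // group(int) returns null when the group did not take part in the match.
    String result = matcher.group(group);
    return result != null ? result : defaultValue;
  }

  public static void main(String[] args) {
    String pattern = "(\\w+)=(\\d+)";
    System.out.println(regexpExtract("key=42", pattern, 1, "n/a")); // whole match
    System.out.println(regexpExtract("key=42", pattern, 3, "n/a")); // second capture group
    System.out.println(regexpExtract("key=42", pattern, 4, "n/a")); // out of range, default
  }
}
```

   With this mapping, `occurrence = 1` selects the whole match and larger
   values select capture groups. Whether that is the right meaning for a
   1-based argument is exactly the open question in this thread.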




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


