ankitsultana commented on code in PR #12392:
URL: https://github.com/apache/pinot/pull/12392#discussion_r1501214269
##########
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java:
##########
@@ -570,6 +572,107 @@ public static String[] split(String input, String delimiter, int limit) {
return StringUtils.splitByWholeSeparator(input, delimiter, limit);
}
+ /**
+ * @param input an input string for prefix strings generations.
+ * @param maxlength the max length of the prefix strings for the string.
+ * @return generate an array of prefix strings of the string that are shorter than the specified length.
+ */
+ @ScalarFunction
+ public static String[] prefixes(String input, int maxlength) {
+ ObjectSet<String> prefixSet = new ObjectLinkedOpenHashSet<>();
+ for (int prefixLength = 1; prefixLength <= maxlength && prefixLength <= input.length(); prefixLength++) {
+ prefixSet.add(input.substring(0, prefixLength));
+ }
+ return prefixSet.toArray(new String[0]);
+ }
+
+ /**
+ * @param input an input string for prefix strings generations.
+ * @param maxlength the max length of the prefix strings for the string.
+ * @param regexChar the character for regex matching to be added to prefix strings generated. e.g. '^'
+ * @return generate an array of prefix matchers of the string that are shorter than the specified length.
+ */
+ @ScalarFunction
+ public static String[] prefixMatchers(String input, int maxlength, String regexChar) {
Review Comment:
Sorry, I didn't understand what `regexChar` is for. We are simply prepending
the character to each generated prefix, right?
Can we then name this method
`uniquePrefixesWithPrefix(String input, int maxLength, String prefix)` or similar?
My point is that the function itself is not tied to regex matching; that is
just how we intend to use it.
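
A minimal sketch of the suggested rename, assuming the method does nothing more than prepend the given string to each prefix. The body of `prefixMatchers` is not shown in the hunk above, so this is a guess at its behaviour, not the PR's code. The class name `PrefixSketch` is hypothetical, and `LinkedHashSet` stands in for the fastutil set only to keep the example self-contained.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical holder class for illustration; the PR places these
// methods in StringFunctions.
class PrefixSketch {
  /**
   * Returns the prefixes of {@code input} whose length is at most
   * {@code maxLength}, each with {@code prefix} prepended.
   * Assumption: prepending is the only thing the extra argument does.
   */
  static String[] uniquePrefixesWithPrefix(String input, int maxLength, String prefix) {
    // Insertion-ordered set, mirroring the ordered set used in prefixes().
    Set<String> result = new LinkedHashSet<>();
    for (int len = 1; len <= maxLength && len <= input.length(); len++) {
      result.add(prefix + input.substring(0, len));
    }
    return result.toArray(new String[0]);
  }
}
```

Under that assumption, `uniquePrefixesWithPrefix("abc", 2, "^")` yields `["^a", "^ab"]`, and nothing in the signature or name refers to regex matching.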