Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21825 )
Change subject: IMPALA-889: Add trim() function matching ANSI SQL definition ...................................................................... Patch Set 2: (19 comments) Thanks for this change, Mihály! http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@14 PS2, Line 14: MPALA Nit: IMPALA-... http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@16 PS2, Line 16: https://gerrit.sjc.cloudera.com/c/Impala/+/4579 This is an internal link, so it would be better to have a Github link to this commit: https://github.com/apache/impala/commit/543fa73f3a846f0e4527514c993cb0985912b06c http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@52 PS2, Line 52: a Nit: negative http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@64 PS2, Line 64: 2. Escaping "where". Since "leading", "trailing", and "both" are inherited Nit: line too long. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3805 PS2, Line 3805: unnest_expr:e Shouldn't "unnest_expr" also incuded in "string_expr"? Similarly to this: select trim(t.string_col FROM "0101") from alltypestiny t; this could also work: select trim(unnest(arr2) from "onetwothree") from complextypes_arrays; http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3829 PS2, Line 3829: We could add an EE test for each of these cases as the "charset" parameter of TRIM. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3873 PS2, Line 3873: compromizing syntax like Could you elaborate on it, I don't completely understand the example. Maybe a complete wrong example would be better, including the TRIM function. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3873 PS2, Line 3873: string_expr is needed here This is also true for the version above, isn't it? Then it would be better to put it there and indicate that it refers to both cases. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java File fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java: http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@45 PS2, Line 45: fnName_ This field shadows the field with the same name from the base class. We could use another name, e.g. "trimFromFnName_". http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@51 PS2, Line 51: private static final Set<String> TRIM_OPTIONS; This could instead be an enum. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@60 PS2, Line 60: B Nit: should start with lower-case letter. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@69 PS2, Line 69: // Error will be raised on analysis stage. For now set btrim as default. Can't we raise an exception here? http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@77 PS2, Line 77: this(fnName, where, null, srcexpr); I'm wondering whether it's a good idea to raise an error when the "where" parameter is explicitly given (so this constructor is called and not the one on L81) AND it is NULL. I don't think it's good practice to give NULL as the parameter and rely on it defaulting to "both". http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@98 PS2, Line 98: * Copy c'tor used in clone(). We should also override clone(), otherwise this copy constructor won't be used. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@111 PS2, Line 111: extract It should be "trim()", shouldn't it? http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@113 PS2, Line 113: fnName_.getFunction() This could be extracted into a variable. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@143 PS2, Line 143: if (errMsg.length() > 0) { errMsg.append(" "); } If the block body firs on one line, we omit the braces in Impala. http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@144 PS2, Line 144: errMsg.append("Expression '" + srcexpr_.toSql() + "' has a return type of "); : errMsg.append(srcexpr_.getType().toSql() + " but a STRING is required."); This could be extracted into a function that could also be used on L139. http://gerrit.cloudera.org:8080/#/c/21825/2/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test File testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test: http://gerrit.cloudera.org:8080/#/c/21825/2/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test@398 PS2, Line 398: ---- QUERY We could have some EE tests with utf8_trim() too. -- To view, visit http://gerrit.cloudera.org:8080/21825 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3c4fa6d0d8d0684c4b6d8dac8fd531d205e4f7b4 Gerrit-Change-Number: 21825 Gerrit-PatchSet: 2 Gerrit-Owner: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Comment-Date: Fri, 20 Sep 2024 13:00:15 +0000 Gerrit-HasComments: Yes
