Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22603 )

Change subject: WIP IMPALA-10349: Support constant folding for non ascii strings
......................................................................


Patch Set 6:

(3 comments)

The change is still WIP, I didn't want to finalize things before agreeing on 
storing with String vs byte[]

>Supprised that we don't need BE changes.
Thrift's binary and string are both std::string in c++, so luckily no change 
was needed

http://gerrit.cloudera.org:8080/#/c/22603/5/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
File fe/src/main/java/org/apache/impala/analysis/StringLiteral.java:

http://gerrit.cloudera.org:8080/#/c/22603/5/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java@69
PS5, Line 69:     binValue_ = StringUtils.toUtf8Array(value);
> Can we just generate this byte array in need? StringLiteral is also used in
Maybe we could keep the byte array, and generate the String when needed? This 
could decrease the memory footprint, as in most case 1 byte per char would be 
enough instead of 2. If I see correctly we keep the strings in unescaped format.
https://github.com/apache/impala/blob/32836eab4a42541e44d95435f815a5d0cf9d3b24/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L1280


http://gerrit.cloudera.org:8080/#/c/22603/5/fe/src/main/java/org/apache/impala/util/StringUtils.java
File fe/src/main/java/org/apache/impala/util/StringUtils.java:

http://gerrit.cloudera.org:8080/#/c/22603/5/fe/src/main/java/org/apache/impala/util/StringUtils.java@28
PS5, Line 28: class St
> Don't need "abstract".
Done


http://gerrit.cloudera.org:8080/#/c/22603/5/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test:

http://gerrit.cloudera.org:8080/#/c/22603/5/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test@488
PS5, Line 488: concat("a", "รก")
> Can we add tests for UTF-8 aware string functions in both UTF-8 and non-UTF
done, added a test that use a few utf8 aware functions



--
To view, visit http://gerrit.cloudera.org:8080/22603
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70663457a0b0a3443e586350f0a5996bb75ba64a
Gerrit-Change-Number: 22603
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Comment-Date: Sat, 05 Apr 2025 10:40:48 +0000
Gerrit-HasComments: Yes

Reply via email to