[ https://issues.apache.org/jira/browse/CALCITE-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680757#comment-16680757 ]
Ted Xu commented on CALCITE-2619:
---------------------------------
[~julianhyde] sorry for the late reply; I am already working on this issue.
However, the change is a bit larger than I expected. I'd like to raise
a few more points before I submit the patch:
1. I think the original charset verification can only tell whether a Unicode
string is LATIN1-encodable or not, since the 'value' of NlsString is a Java
String. I would change the type of 'value' from String to byte[].
2. Once the payload of NlsString is a byte[], we still need to cache the decoded
String to reduce decoding cost. I would also like to add a method
'getValueBytes() : byte[]' for callers who need to skip decoding entirely.
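
To make the two points above concrete, here is a rough sketch of the shape I have in mind. It is not the patch itself: the class name ByteBackedNlsString and the helper isLatin1Encodable are hypothetical names used only for illustration, and only getValueBytes() comes from the proposal above. The helper shows what a check on a Java String can actually answer (whether its characters are encodable in LATIN1), while the class keeps a byte[] payload and decodes it lazily, once.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch only; this is not Calcite's actual NlsString. */
final class ByteBackedNlsString {
  private final byte[] bytes;      // payload, assumed to be already encoded in 'charset'
  private final Charset charset;
  private volatile String decoded; // cached decoded value; null until first getValue()

  ByteBackedNlsString(byte[] bytes, Charset charset) {
    this.bytes = bytes.clone();    // defensive copy so callers cannot mutate the payload
    this.charset = charset;
  }

  /** Decodes the payload on first use and returns the cached String afterwards. */
  String getValue() {
    String s = decoded;
    if (s == null) {
      s = new String(bytes, charset);
      decoded = s;                 // benign race: at worst two threads decode once each
    }
    return s;
  }

  /** Returns a copy of the raw bytes without decoding anything. */
  byte[] getValueBytes() {
    return bytes.clone();
  }

  /** A check on a Java String can only say whether its chars fit in LATIN1. */
  static boolean isLatin1Encodable(String s) {
    return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
  }

  public static void main(String[] args) {
    byte[] raw = "h\u00e9llo".getBytes(StandardCharsets.ISO_8859_1);
    ByteBackedNlsString s = new ByteBackedNlsString(raw, StandardCharsets.ISO_8859_1);
    System.out.println(s.getValue());
    System.out.println(s.getValueBytes().length);
    System.out.println(isLatin1Encodable("\u4f60\u597d"));
  }
}
```

Whether getValueBytes() should return a copy or expose the internal array is an open design choice; the copy is safer, while exposing the array avoids an allocation on hot paths.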
> Reduce string literal creation cost by removing charset check
> -------------------------------------------------------------
>
> Key: CALCITE-2619
> URL: https://issues.apache.org/jira/browse/CALCITE-2619
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Ted Xu
> Assignee: Julian Hyde
> Priority: Major
>
> The cost of creating an NlsString is very high because of its charset check. In
> some cases, e.g., expression evaluation during partition pruning,
> NlsString creation accounts for more than 40% of the executor's total overhead.
>