[ https://issues.apache.org/jira/browse/KUDU-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987384#comment-16987384 ]
ASF subversion and git services commented on KUDU-1938:
-------------------------------------------------------
Commit d5ac2f2e6f1003ff2f5163631e2be2f88f5c97d5 in kudu's branch
refs/heads/master from Attila Bukor
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=d5ac2f2 ]
KUDU-1938 Make UTF-8 truncation faster pt 1
This commit adds a fast path for ASCII strings: if the most significant
bit (MSB) of every byte in a chunk of the string is 0, the counter and
the iterator both advance by the chunk size. A chunk that contains only
ASCII characters therefore never needs its characters counted one by one.
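The fast path described above can be sketched roughly as follows. This is
an illustration only, not the code from the commit: the function name
Utf8TruncatedLength, the 8-byte chunk size and the per-character fallback
are assumptions, and the input is assumed to be valid UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper (not Kudu's actual function): returns the number of
// bytes occupied by the first max_chars UTF-8 characters of s.
// Assumes s holds valid UTF-8.
size_t Utf8TruncatedLength(const std::string& s, size_t max_chars) {
  // One 0x80 per byte: selects the MSB of each of the 8 bytes in a chunk.
  constexpr uint64_t kHighBits = 0x8080808080808080ULL;
  constexpr size_t kChunk = sizeof(uint64_t);

  const char* data = s.data();
  const size_t size = s.size();
  size_t pos = 0;    // byte offset (the "iterator")
  size_t chars = 0;  // characters counted so far (the "counter")

  while (pos < size && chars < max_chars) {
    // Fast path: only taken when a whole chunk fits in both the remaining
    // bytes and the remaining character budget.
    if (size - pos >= kChunk && max_chars - chars >= kChunk) {
      uint64_t chunk;
      std::memcpy(&chunk, data + pos, kChunk);  // safe unaligned load
      if ((chunk & kHighBits) == 0) {
        // Every byte is ASCII, so 8 bytes are exactly 8 characters.
        pos += kChunk;
        chars += kChunk;
        continue;
      }
    }
    // Slow path: step over one lead byte and any continuation bytes
    // (those of the form 10xxxxxx).
    ++pos;
    while (pos < size &&
           (static_cast<unsigned char>(data[pos]) & 0xC0) == 0x80) {
      ++pos;
    }
    ++chars;
  }
  return pos;
}
```

Because the mask is the same in every byte position, the check does not
depend on the byte order of the machine. A chunk that contains even one
non-ASCII byte pays for the failed check and then falls back to the
per-character loop.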
Thanks to Todd Lipcon for the initial idea and Zoltan Chovan and Istvan
Farmosi for the brainstorming and the help in figuring out how this
should be done.
Before:
[ RUN ] CharUtilTest.StressTestUtf8
[ OK ] CharUtilTest.StressTestUtf8 (6698 ms)
[ RUN ] CharUtilTest.StressTestAscii
[ OK ] CharUtilTest.StressTestAscii (6161 ms)
After:
[ RUN ] CharUtilTest.StressTestUtf8
[ OK ] CharUtilTest.StressTestUtf8 (7746 ms)
[ RUN ] CharUtilTest.StressTestAscii
[ OK ] CharUtilTest.StressTestAscii (1028 ms)
Change-Id: Iebb98e18a3619029d9b0bc224c7dead89a3d7374
Reviewed-on: http://gerrit.cloudera.org:8080/14353
Reviewed-by: Adar Dembo <[email protected]>
Tested-by: Kudu Jenkins
> Support for VARCHAR type
> ------------------------
>
> Key: KUDU-1938
> URL: https://issues.apache.org/jira/browse/KUDU-1938
> Project: Kudu
> Issue Type: New Feature
> Components: client, tablet
> Reporter: Farzana Kader
> Assignee: Attila Bukor
> Priority: Major
> Labels: limitations, roadmap-candidate
>
> Kudu does not currently support VARCHAR, although Impala already does. Some
> client applications convert STRING to 32K bytes, which causes performance
> issues, so they need VARCHAR support in order to integrate well with Kudu.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)