[
https://issues.apache.org/jira/browse/HIVE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973730#comment-13973730
]
Szehon Ho commented on HIVE-6843:
---------------------------------
Thanks for the review. As I understand it, you are passing a string literal
to the Text constructor, so \uD801 is not being interpreted as one char, and
there are actually six chars there: '\', 'u', 'D', '8', '0', '1'.
I tried the following test and it seemed to work:
char[] chararray = new char[] {'1', '2', '3', '\uD801', '\uDC00', '4', '5', '6'};
String str = new String(chararray);
Assert.assertEquals(5, GenericUDFUtils.findText(new Text(str), new Text("4"), 0));
I guess the second check was supposed to be 5, not 4.
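To make the distinction concrete, here is a minimal sketch using only the JDK (no Hive or Hadoop classes). The class name SurrogateDemo and its members are hypothetical and not part of the patch. It shows that an escaped backslash yields six separate chars, and that the offset of "4" in the test string depends on whether you count UTF-16 chars, Unicode code points, or UTF-8 bytes. Which of these units findText is meant to report is not something this sketch establishes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of HIVE-6843.patch.
public class SurrogateDemo {
    // Escaped backslash: six separate chars (backslash, u, D, 8, 0, 1),
    // not a single surrogate code unit.
    static final String LITERAL_TEXT = "\\uD801";

    // Same content as the test above: "123", one surrogate pair, "456".
    static final String SAMPLE =
        new String(new char[] {'1', '2', '3', '\uD801', '\uDC00', '4', '5', '6'});

    // Offset of "4" counted in UTF-16 code units (Java chars).
    static int charOffset() {
        return SAMPLE.indexOf("4");
    }

    // Offset of "4" counted in Unicode code points (the pair counts once).
    static int codePointOffset() {
        return SAMPLE.codePointCount(0, charOffset());
    }

    // Offset of "4" counted in UTF-8 bytes (the pair encodes to four bytes).
    static int utf8ByteOffset() {
        return SAMPLE.substring(0, charOffset())
                     .getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("literal length:    " + LITERAL_TEXT.length());
        System.out.println("char offset:       " + charOffset());
        System.out.println("code point offset: " + codePointOffset());
        System.out.println("UTF-8 byte offset: " + utf8ByteOffset());
    }
}
```

The three offsets differ only because of the supplementary character in the middle of the string, so any expected value in a test like this has to be chosen with a specific unit in mind.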
> INSTR for UTF-8 returns incorrect position
> ------------------------------------------
>
> Key: HIVE-6843
> URL: https://issues.apache.org/jira/browse/HIVE-6843
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Affects Versions: 0.11.0, 0.12.0
> Reporter: Clif Kranish
> Assignee: Szehon Ho
> Priority: Minor
> Attachments: HIVE-6843.patch
>
>