[ 
https://issues.apache.org/jira/browse/HIVE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973730#comment-13973730
 ] 

Szehon Ho commented on HIVE-6843:
---------------------------------

Thanks for the review.  As I understand it, you are passing a string literal to 
the Text constructor in which \uD801 is not interpreted as one char, so there 
are actually six chars there: '\', 'u', 'D', '8', '0', '1'.

I tried the following test and it seemed to work:

    char[] chararray = new char[] {'1', '2', '3', '\uD801', '\uDC00', '4', '5', '6'};
    String str = new String(chararray);
    Assert.assertEquals(5, GenericUDFUtils.findText(new Text(str), new Text("4"), 0));

I guess the second check was supposed to be 5, not 4.
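
To make the distinction concrete, here is a rough sketch that uses only the JDK 
and does not call any Hive code. The class and method names are made up for 
illustration. It shows that an escaped-backslash literal holds six chars, and 
that for the test string above the position of "4" depends on whether you count 
UTF-16 chars, code points, or UTF-8 bytes. It does not settle which of these 
findText is meant to return.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Hive.
public class SurrogateIndexDemo {
    // Escaped backslash: the literal holds six separate chars, not one code unit.
    static final String ESCAPED = "\\uD801";

    // Same input as the test above: "123", one surrogate pair (U+10400), "456".
    static final String STR = new String(
        new char[] {'1', '2', '3', '\uD801', '\uDC00', '4', '5', '6'});

    // Position counted in UTF-16 chars (what String.indexOf reports).
    static int charIndex(String s, String sub) {
        return s.indexOf(sub);
    }

    // Position counted in Unicode code points.
    static int codePointIndex(String s, String sub) {
        int i = s.indexOf(sub);
        return i < 0 ? -1 : s.codePointCount(0, i);
    }

    // Position counted in UTF-8 encoded bytes.
    static int utf8ByteIndex(String s, String sub) {
        int i = s.indexOf(sub);
        return i < 0 ? -1 : s.substring(0, i).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("escaped literal length: " + ESCAPED.length());
        System.out.println("char index of 4:        " + charIndex(STR, "4"));
        System.out.println("code point index of 4:  " + codePointIndex(STR, "4"));
        System.out.println("UTF-8 byte index of 4:  " + utf8ByteIndex(STR, "4"));
    }
}
```

With this input the escaped literal has length 6, and "4" sits at char index 5, 
code point index 4, and UTF-8 byte offset 7.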

> INSTR for UTF-8 returns incorrect position
> ------------------------------------------
>
>                 Key: HIVE-6843
>                 URL: https://issues.apache.org/jira/browse/HIVE-6843
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 0.11.0, 0.12.0
>            Reporter: Clif Kranish
>            Assignee: Szehon Ho
>            Priority: Minor
>         Attachments: HIVE-6843.patch
>
>
