[ 
https://issues.apache.org/jira/browse/CAMEL-21199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882715#comment-17882715
 ] 

Radovan Netuka commented on CAMEL-21199:
----------------------------------------

Fixed in Jackson 2.18.0: 
[https://github.com/FasterXML/jackson-core/pull/1335]. Once we upgrade to that 
version, all that needs to be done is to enable the new Jackson option 
COMBINE_UNICODE_SURROGATES_IN_UTF8.
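
To illustrate what "combining surrogates" means for the character in this 
issue, here is a minimal sketch using only the Java standard library. It does 
not call Jackson, and the class and method names (SurrogateDemo, combine, utf8, 
hex, escapeSeparately) are hypothetical, chosen only for this example. It takes 
the two UTF-16 code units from the reported output, joins them into one code 
point, and shows the 4-byte UTF-8 sequence that one would expect to be written 
once the option is enabled.

```java
import java.nio.charset.StandardCharsets;

class SurrogateDemo {

    // Combine a UTF-16 high/low surrogate pair into a single Unicode code point.
    static int combine(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return Character.toCodePoint(high, low);
    }

    // Encode one code point as UTF-8 (4 bytes for anything above U+FFFF).
    static byte[] utf8(int codePoint) {
        return new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // The behaviour reported in the issue: each UTF-16 code unit is written
    // as its own six-character JSON escape instead of the character itself.
    static String escapeSeparately(char high, char low) {
        return String.format("\\u%04X\\u%04X", (int) high, (int) low);
    }

    public static void main(String[] args) {
        char high = (char) 0xD867;
        char low = (char) 0xDE3D;

        int codePoint = combine(high, low);
        System.out.println("escaped separately: " + escapeSeparately(high, low));
        System.out.println("combined code point: U+" + Integer.toHexString(codePoint).toUpperCase());
        System.out.println("UTF-8 bytes: " + hex(utf8(codePoint)));
    }
}
```

Both forms are valid JSON and decode to the same string; the difference is 
only in readability of the marshalled output. As far as I can tell, the option 
is exposed in Jackson 2.18 as JsonWriteFeature.COMBINE_UNICODE_SURROGATES_IN_UTF8; 
how camel-jackson should surface it is left open here.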

> Camel-jackson not properly marshalling 4-byte characters
> --------------------------------------------------------
>
>                 Key: CAMEL-21199
>                 URL: https://issues.apache.org/jira/browse/CAMEL-21199
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-jackson
>            Reporter: Radovan Netuka
>            Assignee: Radovan Netuka
>            Priority: Major
>             Fix For: 4.9.0
>
>
> Camel-jackson doesn't handle 4-byte characters well. Marshalling a 4-byte 
> Japanese kanji character results in two UTF-16 escapes being written instead 
> of the character itself. While this is OK for emoji and such, it's not for 
> natural languages.
> Jackson issue: 
> [FasterXML/jackson-core#223|https://github.com/FasterXML/jackson-core/issues/223]
>  
> Reproducer:
> {code:java}
> from("file:data?file-name=input.txt&noop=true")
>     .log("${body}")
>     .unmarshal().json(JsonLibrary.Jackson)
>     .log("${body[0]['name']}")
>     .marshal().json(JsonLibrary.Jackson, true)
>     .log("${body}"); {code}
>  
> with the file input.txt containing:
> {code:java}
> [{"name": "システム𩸽"}] {code}
>  
> Expected output seen in the log: *"システム𩸽"*
> Actual output seen in the log: *"システム\uD867\uDE3D"*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
