[ 
https://issues.apache.org/jira/browse/CXF-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Kulp resolved CXF-4533.
------------------------------

       Resolution: Fixed
    Fix Version/s: 2.7.0
                   2.6.3
                   2.5.6
                   2.4.10
         Assignee: Daniel Kulp
    
> Encoding error in CachedOutputStream when double-byte char is on 1024 byte 
> boundary
> -----------------------------------------------------------------------------------
>
>                 Key: CXF-4533
>                 URL: https://issues.apache.org/jira/browse/CXF-4533
>             Project: CXF
>          Issue Type: Bug
>    Affects Versions: 2.3, 2.6.2
>            Reporter: Lars Svensson
>            Assignee: Daniel Kulp
>             Fix For: 2.4.10, 2.5.6, 2.6.3, 2.7.0
>
>
> Hi,
> We experience occasional encoding errors where a small number of two-byte 
> chars are encoded incorrectly in an otherwise correctly encoded message. I 
> have traced the problem to the writeCacheTo method of CachedOutputStream, 
> where the temporary cache file is read 1024 bytes at a time and each chunk 
> is converted to a String before being appended to the StringBuilder. If a 
> 1024-byte boundary falls between the two bytes of a two-byte char, each 
> half is decoded on its own and the character is corrupted.
> public void writeCacheTo(StringBuilder out, String charsetName) throws 
> IOException {
>    flush();
>    if (inmem) {
>       if (currentStream instanceof ByteArrayOutputStream) {
>          byte[] bytes = ((ByteArrayOutputStream)currentStream).toByteArray();
>          out.append(IOUtils.newStringFromBytes(bytes, charsetName));
>       } else {
>          throw new IOException("Unknown format of currentStream");
>       }
>    } else {
>       // read the file
>       FileInputStream fin = new FileInputStream(tempFile);
>       byte bytes[] = new byte[1024];
>       int x = fin.read(bytes);
>       while (x != -1) {
>          out.append(IOUtils.newStringFromBytes(bytes, charsetName, 0, x));
>          x = fin.read(bytes);
>       }
>       fin.close();
>    }
> }
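
One way to avoid splitting a character is to let a Reader do the decoding, since it carries an incomplete byte sequence over to the next read. The sketch below only illustrates that idea and is not necessarily the change that was committed for this issue; the class ChunkSafeDecoder and the method appendDecoded are hypothetical names, not part of CXF.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Arrays;

public class ChunkSafeDecoder {
    // Hypothetical helper (not the CXF API): decodes the stream through a
    // Reader, so the bytes of one character are never decoded separately.
    public static void appendDecoded(InputStream in, StringBuilder out, String charsetName)
            throws IOException {
        try (Reader reader = new InputStreamReader(in, charsetName)) {
            char[] chars = new char[1024];
            int n;
            while ((n = reader.read(chars)) != -1) {
                out.append(chars, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // 1023 ASCII bytes followed by 0xC3 0xB8 (UTF-8 for o-slash), so the
        // two bytes of the character straddle the 1024-byte boundary.
        byte[] data = new byte[1025];
        Arrays.fill(data, 0, 1023, (byte) 'a');
        data[1023] = (byte) 0xC3;
        data[1024] = (byte) 0xB8;

        StringBuilder sb = new StringBuilder();
        appendDecoded(new ByteArrayInputStream(data), sb, "UTF-8");
        System.out.println(sb.length());                 // prints 1024
        System.out.println(sb.charAt(1023) == '\u00F8'); // prints true
    }
}
```

In writeCacheTo the same loop could presumably wrap the FileInputStream for tempFile; the try-with-resources block would also close the stream if a read fails.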
> Below are a couple of lines from the hex dump of the cache file, where you 
> can see that the second o-slash in the file falls on a 1024-byte boundary 
> and therefore gets corrupted in the outgoing message:
> 0001fbe0:  66 66 65 6e 74 6c 69 67 20 66 c3 b8 72 74 69 64 73 70 65 6e 73 69 6f 6e 2c 20 73 6f 6d 20 66 c3  ffentlig førtidspension, som f?
> 0001fc00:  b8 72 65 72 20 74 69 6c 2c 20 61 74 20 6d 65 64 6c 65 6d 3c 2f 70 67 66 3a 52 65 70 75 72 63 68  ?rer til, at medlem</pgf:Repurch
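
The effect in the dump can be reproduced with a few of its bytes. The sketch below takes the bytes on either side of offset 0x1fc00 and decodes them once chunk by chunk and once as a whole. It assumes that IOUtils.newStringFromBytes behaves like the plain String constructor; BoundarySplitDemo and its methods are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class BoundarySplitDemo {
    // Decodes each chunk on its own, as the quoted loop does for every
    // 1024-byte read.
    public static String decodePerChunk(byte[] first, byte[] second) {
        return new String(first, StandardCharsets.UTF_8)
             + new String(second, StandardCharsets.UTF_8);
    }

    // Decodes the concatenated bytes in a single pass.
    public static String decodeWhole(byte[] first, byte[] second) {
        byte[] all = new byte[first.length + second.length];
        System.arraycopy(first, 0, all, 0, first.length);
        System.arraycopy(second, 0, all, first.length, second.length);
        return new String(all, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // " som f" followed by 0xC3: the last bytes before offset 0x1fc00.
        byte[] tail = {0x20, 0x73, 0x6f, 0x6d, 0x20, 0x66, (byte) 0xc3};
        // 0xB8 followed by "rer til": the first bytes from offset 0x1fc00.
        byte[] head = {(byte) 0xb8, 0x72, 0x65, 0x72, 0x20, 0x74, 0x69, 0x6c};

        // Whole decode keeps the o-slash intact (U+00F8).
        System.out.println(decodeWhole(tail, head).indexOf('\u00F8'));     // prints 6
        // Per-chunk decode turns each orphaned byte into U+FFFD.
        System.out.println(decodePerChunk(tail, head).contains("\uFFFD")); // prints true
    }
}
```

The per-chunk result is one character longer than the whole decode, because the single o-slash is replaced by two replacement characters.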

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
